where ρ and ≈ are respectively a referent specialization operator and a coreference partition operator on G, and φ is a substitution for annotation variables in G. The answer is said to be correct iff φ is a solution for C_G and every annotation-variable-free instance of ρ≈φQ_G is a logical consequence of P. The following definition of FCG unification is modified from the one in [4] with the introduction of FCGP annotation terms and constraints on them. As defined in [4], a VAR generic marker is one whose concept occurs in an FCGP query, or in the body of an FCGP rule, or in the head of an FCGP rule and is coreferenced with a concept in the body of the rule; a NON-VAR generic marker is one whose concept occurs in an FCGP fact, or in the head of an FCGP rule but is not coreferenced with any concept in the body of the rule.
Definition 4.7 Let u and v be finite normal FCGs. An FCG unification from u to v is a mapping θ: u → v such that:
1. ∀c ∈ V_C: type(c) is matchable to type(θc) and referent-unified(c, θc), and
2. ∀r ∈ V_R: type(r) is matchable to type(θr), and ∀i ∈ {1, 2, ..., arity(r)}: neighbor(θr, i) = θneighbor(r, i), and
3. No VAR generic marker is unified with different individual markers or non-coreferenced NON-VAR generic markers.
The constraint produced by θ is denoted by C_θ and defined by the set {aval(c) ≤ aval(θc) | c ∈ V_C, and c is a fuzzy attribute concept} ∪ {type(c) ≤ type(θc) | c ∈ V_C} ∪ {type(r) ≤ type(θr) | r ∈ V_R}, where the inequalities constrain fuzzy attribute values and fuzzy types, respectively.
Note that, in Definition 4.7, if aval(c) is of the form lub{σ1, σ2, ..., σm}, where the σi's are simple FCGP annotation terms, then the constraint lub{σ1, σ2, ..., σm} ≤ aval(θc) is equivalent to the set of simple constraints {σ1 ≤ aval(θc), σ2 ≤ aval(θc), ..., σm ≤ aval(θc)}.
Definition 4.8 Let G be an FCGP goal Q_G ‖ C_G and C be an FCGP reductant if u then v (G and C have no variables in common). Suppose that there exists an FCG unification θ from a normalized sub-graph g of Q_G to v. Then, the corresponding resolvent of G and C is a new FCGP goal, denoted by R_θ(G, C) and defined to be ρ_θ≈_θδ_θ(Q_G ∪ u) ‖ C_θ & C_G, where δ_θ deletes g from Q_G.
Property 4.3 Let G and C be respectively an FCGP goal and an FCGP reductant. If G is a normal FCGP goal, then any resolvent of G and C is also a normal FCGP goal.
Proof The proof is similar to the proofs for the corresponding properties of annotated logic programs (Lemma 2 in [20]) and AFLPs (Property 4.3 in [7]). Note that the order of the inequalities in C_θ is not significant, but the order of C_θ and C_G in Definition 4.8 is.
Definition 4.9 Let P be an FCGP and G be an FCGP goal. A refutation of G and P is a finite sequence G = G_0, G_1, ..., G_n of FCGP goals such that, for each i, G_{i+1} is a resolvent of G_i and a reductant of P, and G_n has an empty graph part and a solvable constraint part.
Example 4.3 Let P be the program in Example 4.1 and G be the following FCGP goal, querying "Which product has a moderate price and what is its quality like?":
PRODUCT: *x →(ATTR2)→ PRICE: @moderate
→(ATTR3)→ QUALITY: @X
Assuming the relations between the linguistic labels in P and G as in Example 4.2, a refutation of G and P can be constructed as follows:
g0 = PRODUCT: *x →(ATTR2)→ PRICE: @moderate (a sub-graph of QG)
C1 = if PRODUCT: *y →(ATTR1)→ DEMAND: @lub{not high+ε, not low+η} (u)
then PRODUCT: *y →(ATTR2)→ PRICE: @lub{not expensive+ε, not cheap+η} (v)
θ1: g0 → v, ρθ1 = { }, ≈θ1 = { {PRODUCT: *x, PRODUCT: *y} }
G1 = PRODUCT: *y →(ATTR1)→ DEMAND: @lub{not high+ε, not low+η}
→(ATTR3)→ QUALITY: @X
‖ (moderate ≤ lub{not expensive+ε, not cheap+η})
Resolving the DEMAND sub-graph g1 of QG1 with the corresponding fact of P then yields
G2 = PRODUCT: *y →(ATTR3)→ QUALITY: @X
‖ (lub{not high+ε, not low+η} ≤ normal) & (moderate ≤ lub{not expensive+ε, not cheap+η})
C3 = C; θ3: QG2 → C3, ρθ3 = { }, ≈θ3 = { }
G3 = ‖ (X ≤ quite good) & (not high+ε ≤ normal) & (not low+η ≤ normal) & (moderate ≤ lub{not expensive+ε, not cheap+η})
As in Example 4.2, one has X = quite good and ε = η = 0 as a solution for the constraint above, whence the corresponding answer for G w.r.t. P is <{(PRODUCT: *x, #2)}, { }, {X/quite good}>. Note that, if a clause rather than a reductant of P were used to resolve g0, there would not be a refutation of G and P because, generally, neither moderate ≤ not expensive+ε nor moderate ≤ not cheap+η alone is solvable. Note also that ρθ≈θ is applied when g1 is deleted from QG1.
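The refutation above ends by solving a conjunction of simple lattice inequalities. The following is a minimal sketch of checking a candidate substitution against such constraints; the linear label order is a toy stand-in for the paper's lattice of fuzzy value terms, and all names are illustrative.

```python
# Toy linear lattice of labels standing in for the fuzzy-value lattice.
ORDER = ["low", "normal", "high"]

def leq(a, b):
    """a <= b in the toy lattice (here: position in a linear order)."""
    return ORDER.index(a) <= ORDER.index(b)

def satisfies(constraints, subst):
    """Check that a substitution for annotation variables is a solution of a
    conjunction of simple constraints (sigma <= tau), each side being either
    a variable (a key of subst) or a lattice constant."""
    val = lambda term: subst.get(term, term)
    return all(leq(val(s), val(t)) for s, t in constraints)
```

A substitution is a solution exactly when every inequality holds after replacing the variables, mirroring the role of φ in the definition of a correct answer.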
The following theorems state the soundness and the completeness of this FCGP proof procedure. Due to space limitation we omit the proofs for these theorems, which were presented in the submitted version of this paper.
Theorem 4.4 (Soundness) Let P be an FCGP and G be an FCGP goal. If there exists a refutation of G and P, then the answer computed by that refutation is a correct answer for G w.r.t. P.
Theorem 4.5 (Completeness) Let P be an FCGP and G be a normal FCGP goal. If there exists a correct answer for G w.r.t. P, then there exists a refutation of G and P.

4.4 Remarks

A crisp attribute-value is a special fuzzy one, defined by a fuzzy set whose membership function only has values 0 or 1. A basic type is a special fuzzy one whose fuzzy truth-value is absolutely true. So, a CG can be considered as a special FCG and a CGP as a special FCGP with all nodes in each of its rules being firm nodes. We now give remarks on proof procedures for both FCGPs and CGPs from the viewpoint of lattice-based reasoning. For discussion, let us consider the following CGP as a special FCGP:
if PERSON: *x →(WORK_FOR)→ PERSON: * then EMPLOYEE: *x
if PERSON: *y →(ATTEND)→ UNIVERSITY: * then STUDENT: *y
PERSON: John →(WORK_FOR)→ PERSON: *
→(ATTEND)→ UNIVERSITY: Queensland
PERSON: John →(BROTHER_OF)→ GIRL: Mary
where the first rule says "A person is an employee if working for another", the second rule says "A person is a student if attending a university", the first fact says "John works for some person and attends university Queensland", and the second fact says "John is a brother of Mary". Note the rules in this CGP, where two coreferenced concepts occur in the body
and the head of a rule with two different concept types. These rules realize a close coupling between the concept type hierarchy and the axiomatic part of a knowledge base [3, 4]. A backward chaining proof procedure based on clause selection, like the ones in [15, 28, 4], is not complete when such rules are present or facts are not in CG normal form. For example, it cannot satisfy the goal EMPLOYEE-STUDENT: John, where EMPLOYEE-STUDENT = lub{EMPLOYEE, STUDENT}, because it does not combine rule-defined concept types of an individual, which are EMPLOYEE and STUDENT in this case. Meanwhile, CG normal form is required to combine fact-defined concept types of an individual. Such combinations are inherent in a forward chaining method (cf. [28, 19]). Actually, the significance of FCGP reductants is to combine concept types as well as other lattice-based data for backward chaining. In this example, a reductant of the CGP is:
if PERSON: *x →(WORK_FOR)→ PERSON: * then EMPLOYEE-STUDENT: *x →(ATTEND)→ UNIVERSITY: *
which together with the first fact resolves the goal EMPLOYEE-STUDENT: John. Moreover, in the light of annotated logic programming [20, 7], types have been viewed as annotations which can also be queried about. This reveals an advantage of CG notation that makes this view possible, which has not been addressed in the previous works on CGPs/FCGPs [15, 28, 4, 19]. With classical first-order logic notation, this view is hindered because types are encoded in predicate symbols, whence queries about the sorts of objects or the relations among objects have not been thought about (cf. [1, 3]). For example, the following CG query asks "What is John and what is John's relation with Mary?":
X: John →(Y)→ PERSON: Mary
where X is a concept type variable and Y is a relation type variable.
Applying the presented FCGP proof procedure, one obtains X = EMPLOYEE-STUDENT and Y = BROTHER_OF, which say "John is an employee and a student, and John is a brother of Mary", as the most informative answer w.r.t. the given CGP.
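The combination of the rule-defined types EMPLOYEE and STUDENT into EMPLOYEE-STUDENT is a least-upper-bound computation in the concept type lattice, ordered here by information content (more specific = greater), so the lub of two types is their most general common specialization. A minimal sketch over a hypothetical four-element lattice; the pairs below are illustrative, not the paper's support:

```python
# Toy specificity lattice: more specific = greater in the order.
# EMPLOYEE-STUDENT is the least type at least as specific as both
# EMPLOYEE and STUDENT.
LEQ = {("PERSON", "EMPLOYEE"), ("PERSON", "STUDENT"),
       ("PERSON", "EMPLOYEE-STUDENT"),
       ("EMPLOYEE", "EMPLOYEE-STUDENT"), ("STUDENT", "EMPLOYEE-STUDENT")}

def leq(a, b):
    """a <= b: b carries at least as much information as a (reflexive)."""
    return a == b or (a, b) in LEQ

ELEMS = ["PERSON", "EMPLOYEE", "STUDENT", "EMPLOYEE-STUDENT"]

def lub(order, types, elements):
    """Least upper bound of a set of types in a finite lattice ordered by
    information content: the least element above every given type. This is
    the combination a reductant performs on concept types."""
    uppers = [e for e in elements if all(order(t, e) for t in types)]
    for u in uppers:
        if all(order(u, v) for v in uppers):
            return u   # unique in a lattice
    return None
```

In this toy lattice, lub(leq, ["EMPLOYEE", "STUDENT"], ELEMS) yields "EMPLOYEE-STUDENT", matching the combined type queried about in the example.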
5. Conclusions

The syntax of FCGPs with the introduction of fuzzy types has been presented, providing a unified structure and treatment for both FCGPs and CGPs. The general declarative semantics of FCGPs based on the notion of ideal FCGs has been defined, ensuring finite proofs of logical consequences of programs. The fixpoint semantics of FCGPs has been studied as the bridge between their declarative and procedural semantics. Then, a new SLD-style proof procedure for FCGPs has been developed and proved to be sound and complete w.r.t. their declarative semantics. The two main new points in the presented FCGP proof procedure are that it selects reductants rather than clauses of an FCGP in resolution steps, and that it involves solving constraints on fuzzy value terms. As has been analysed, a CGP/FCGP SLD-style proof procedure based on clause selection is not generally complete. The constraint solving supports more expressive queries, possibly about not only fuzzy attribute-values but also fuzzy types. Since a CGP can be considered as a special FCGP, the results obtained here for
FCGPs could be applied to CGP systems. They could also be useful for any extension that adds lattice-based annotations to CGs to enhance their knowledge representation and reasoning power. The presented FCGP system, on the one hand, extends CGPs to deal with the vague and imprecise information pervading the real world as reflected in natural languages. On the other hand, to our knowledge, it is the first fuzzy order-sorted logic programming system for handling uncertainty about the types of objects. When only fuzzy sets of special cases are involved, FCGPs could become possibilistic CGPs, where concept and relation nodes in a CG are weighted only by values in [0, 1] interpreted as necessity degrees. They are less expressive than general FCGPs but have simpler computation and are still very useful for CG-based systems dealing with uncertainty. Besides, FCGPs could be extended further to represent and reason with other kinds of uncertain knowledge, such as imprecise temporal information or vague generalized quantifiers. These are among the topics that are currently being investigated.

Acknowledgment. We would like to thank Marie-Laure Mugnier and the anonymous referees for the comments that helped us to revise the paper for readability.

References
1. Ait-Kaci, H. & Nasr, R. (1986), Login: A Logic Programming Language with Built-In Inheritance. J. of Logic Programming, 3: 185-215.
2. Baldwin, J.F. & Martin, T.P. & Pilsworth, B.W. (1995), Fril - Fuzzy and Evidential Reasoning in Artificial Intelligence. John Wiley & Sons, New York.
3. Beierle, C. & Hedtstuck, U. & Pletat, U. & Schmitt, P.H. & Siekmann, J. (1992), An Order-Sorted Logic for Knowledge Representation Systems. J. of Artificial Intelligence, 55: 149-191.
4. Cao, T.H. & Creasy, P.N. & Wuwongse, V. (1997), Fuzzy Unification and Resolution Proof Procedure for Fuzzy Conceptual Graph Programs. In Lukose, D. et al. (Eds.): Conceptual Structures - Fulfilling Peirce's Dream, LNAI No.
1257, Springer-Verlag, pp. 386-400.
5. Cao, T.H. & Creasy, P.N. & Wuwongse, V. (1997), Fuzzy Types and Their Lattices. In Proc. of the 6th IEEE International Conference on Fuzzy Systems, pp. 805-812.
6. Cao, T.H. & Creasy, P.N. (1997), Universal Marker and Functional Relation: Semantics and Operations. In Lukose, D. et al. (Eds.): Conceptual Structures - Fulfilling Peirce's Dream, LNAI No. 1257, Springer-Verlag, pp. 416-430.
7. Cao, T.H. (1997), Annotated Fuzzy Logic Programs. Int. J. of Fuzzy Sets and Systems. To appear.
8. Cao, T.H. & Creasy, P.N. (1997), Fuzzy Conceptual Graph Programs and Their Fixpoint Semantics. Tech. Report No. 424, Department of CS&EE, University of Queensland.
9. Cao, T.H. (1998), Annotated Fuzzy Logic Programs for Soft Computing. In Proc. of the 2nd International Conference on Computational Intelligence and Multimedia Applications, World Scientific, pp. 459-464.
10. Carpenter, B. (1992), The Logic of Typed Feature Structures with Applications to Unification Grammars, Logic Programs and Constraint Resolution. Cambridge University Press.
11. Chevallet, J-P. (1992), Un Modèle Logique de Recherche d'Informations Appliqué au Formalisme des Graphes Conceptuels. Le Prototype ELEN et Son Expérimentation sur un Corpus de Composants Logiciels. PhD Thesis, Université Joseph Fourier.
12. Dubois, D. & Lang, J. & Prade, H. (1994), Possibilistic Logic. In Gabbay, D.M. et al. (Eds.): Handbook of Logic in Artificial Intelligence and Logic Programming, Vol. 3, Oxford University Press, pp. 439-514.
13. Genest, D. & Chein, M. (1997), An Experiment in Document Retrieval Using Conceptual Graphs. In Lukose, D. et al. (Eds.): Conceptual Structures - Fulfilling Peirce's Dream, LNAI No. 1257, Springer-Verlag, pp. 489-504.
14. Ghosh, B.C. & Wuwongse, V. (1995), Conceptual Graph Programs and Their Declarative Semantics. IEICE Trans. on Information and Systems, Vol. E78-D, No. 9, pp. 1208-1217.
15. Ghosh, B.C. (1996), Conceptual Graph Language - A Language of Logic and Information in Conceptual Structures. PhD Thesis, Asian Institute of Technology.
16. Grätzer, G. (1978), General Lattice Theory. Academic Press, New York.
17. Ho, K.H.L. (1994), Learning Fuzzy Concepts By Examples with Fuzzy Conceptual Graphs. In Proc. of the 1st Australian Conceptual Structures Workshop.
18. Hopcroft, J.E. & Ullman, J.D. (1979), Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Massachusetts.
19. Kerdiles, G. & Salvat, E. (1997), A Sound and Complete CG Proof Procedure Combining Projections with Analytic Tableaux. In Lukose, D. et al. (Eds.): Conceptual Structures - Fulfilling Peirce's Dream, LNAI No. 1257, Springer-Verlag, pp. 371-385.
20. Kifer, M. & Subrahmanian, V.S. (1992), Theory of Generalized Annotated Logic Programming and Its Applications. J. of Logic Programming, 12: 335-367.
21. Klawonn, F. (1995), Prolog Extensions to Many-Valued Logics. In Höhle, U. & Klement, E.P. (Eds.): Non-Classical Logics and Their Applications to Fuzzy Subsets, Kluwer Academic Publishers, Dordrecht, pp. 271-289.
22. Lloyd, J.W. (1987), Foundations of Logic Programming. Springer-Verlag, Berlin.
23. Magrez, P. & Smets, P. (1989), Fuzzy Modus Ponens: A New Model Suitable for Applications in Knowledge-Based Systems. Int. J. of Intelligent Systems, 4: 181-200.
24. Mineau, G.W. (1994), Views, Mappings and Functions: Essential Definitions to the Conceptual Graph Theory. In Tepfenhart, W.M. & Dick, J.P. & Sowa, J.F. (Eds.): Conceptual Structures - Current Practices, LNAI No. 835, Springer-Verlag, pp. 160-174.
25. Morton, S. (1987), Conceptual Graphs and Fuzziness in Artificial Intelligence. PhD Thesis, University of Bristol.
26. Mukaidono, M. & Shen, Z. & Ding, L. (1989), Fundamentals of Fuzzy Prolog. Int. J. of Approximate Reasoning, 3: 179-194.
27. Myaeng, S.H. & Khoo, C. (1993), On Uncertainty Handling in Plausible Reasoning with Conceptual Graphs.
In Pfeiffer, H.D. & Nagle, T.E. (Eds.): Conceptual Structures - Theory and Implementation, LNAI No. 754, Springer-Verlag, pp. 137-147.
28. Salvat, E. & Mugnier, M.L. (1996), Sound and Complete Forward and Backward Chainings of Graph Rules. In Eklund, P.W. & Ellis, G. & Mann, G. (Eds.): Conceptual Structures - Knowledge Representation as Interlingua, LNAI No. 1115, Springer-Verlag, pp. 248-262.
29. Sowa, J.F. (1984), Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, Massachusetts.
30. Sowa, J.F. (1991), Towards the Expressive Power of Natural Languages. In Sowa, J.F. (Ed.): Principles of Semantic Networks - Explorations in the Representation of Knowledge, Morgan Kaufmann Publishers, San Mateo, CA, pp. 157-189.
31. Sowa, J.F. (1997), Matching Logical Structure to Linguistic Structure. In Houser, N. & Roberts, D.D. & Van Evra, J. (Eds.): Studies in the Logic of Charles Sanders Peirce, Indiana University Press, pp. 418-444.
32. Umano, M. (1987), Fuzzy Set Prolog. In Preprints of the 2nd International Fuzzy Systems Association Congress, pp. 750-753.
33. Wuwongse, V. & Manzano, M. (1993), Fuzzy Conceptual Graphs. In Mineau, G.W. & Moulin, B. & Sowa, J.F. (Eds.): Conceptual Graphs for Knowledge Representation, LNAI No. 699, Springer-Verlag, pp. 430-449.
34. Wuwongse, V. & Cao, T.H. (1996), Towards Fuzzy Conceptual Graph Programs. In Eklund, P.W. & Ellis, G. & Mann, G. (Eds.): Conceptual Structures - Knowledge Representation as Interlingua, LNAI No. 1115, Springer-Verlag, pp. 263-276.
35. Zadeh, L.A. (1965), Fuzzy Sets. J. of Information and Control, 8: 338-353.
36. Zadeh, L.A. (1978), PRUF - A Meaning Representation Language for Natural Languages. Int. J. of Man-Machine Studies, 10: 395-460.
37. Zadeh, L.A. (1990), The Birth and Evolution of Fuzzy Logic. Int. J. of General Systems, 17: 95-105.
38. Zadeh, L.A. (1996), Fuzzy Logic = Computing with Words. IEEE Trans. on Fuzzy Systems, 4: 103-111.
Knowledge Querying in the Conceptual Graph Model: The RAP Module

Olivier Guinaldo¹ and Ollivier Haemmerlé²

¹ LIMOS - U. d'Auvergne, IUT de Clermont-Ferrand, BP 86, F-63172 Aubière Cedex, France
guinaldo@volcans-ia.univ-bpclermont.fr
² INA-PG, Département OMIP, 16, rue Claude Bernard, F-75231 Paris Cedex 05, France
Ollivier.haemmerle@inapg.inra.fr
Abstract. The projection operation can be used to query a CG knowledge base by searching for all the specializations of a particular CG - the question. But in some cases it may not be enough, particularly when the knowledge allowing an answer is distributed among several graphs belonging to the fact base. We define two operating mechanisms of knowledge querying that work by means of graph operations. Both mechanisms are equivalent and logically based. The first one modifies the knowledge base, while the second modifies the question. The RAP module is an implementation of the latter algorithm on CoGITo.
1 Introduction

The specialization relation, which can be computed by means of the projection operation, is the basis for reasoning in the CG model. It expresses that one graph contains more specific knowledge than another. One of the strong points of the CG model is that there exist sound and complete logical semantics, which means that the graphical reasoning is equivalent to the logical deduction upon the logical formulae associated with the graphs [1, 2]. In this paper, we consider a knowledge based system in which the fact base and the query are represented in terms of CGs. An important step is to provide methods allowing one to query such a fact base. The projection operation can be used for such a search, but the problem is that the knowledge allowing one to answer a query can be distributed among several graphs of the knowledge base, in which case no projection can be made. We propose two algorithms designed to avoid that inconvenience. These algorithms are based on those of the ROCK system [3, 4]. In the ROCK system we used some heuristics in order to limit the combinatory. Following our work on the management of large knowledge bases, we implemented these algorithms and showed that they were equivalent to logical deduction [5, 6]. This paper results from these works, which were never published in an international event.
We propose a first reasoning algorithm which is sound and complete regarding the logical deduction. This algorithm works on the knowledge base by using a first step that merges the knowledge in order to allow the projection to run. Then we propose a second algorithm that uses a dual mechanism: the knowledge base is not modified, but the question is split and partial answers are searched for. We proved in [6] that this second mechanism is equivalent to the first one, and that it is sound and complete regarding the logical deduction. Then we show that this second algorithm presents several valuable points compared with the first one. The last section of this article is a presentation of the RAP module based on the second algorithm. RAP is implemented on the CoGITo platform [7].

2 Knowledge querying
In the following, we call the fact base a set of CGs under normal form¹ representing the facts of our system. Our goal is to provide CG answers to a CG question asked on a fact base. The projection operation is the ground operation used to make deductions.
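The projection operation itself can be sketched as a backtracking search for a label-decreasing graph homomorphism from the question into a fact. The following is a minimal sketch over a toy CG encoding (concepts as (type, marker) pairs keyed by id, relations as (type, [neighbour ids])); the types, hierarchy and encoding are illustrative assumptions, not the paper's definitions:

```python
# Hypothetical type hierarchy fragment: (subtype, supertype) pairs.
SUBTYPES = {("Man", "Person"), ("Car", "Vehicle")}

def is_subtype(t, s):
    """t <= s in the type hierarchy (reflexive; transitivity omitted)."""
    return t == s or (t, s) in SUBTYPES

def projections(q_concepts, q_relations, f_concepts, f_relations):
    """Enumerate projections of a question graph Q into a fact graph F:
    every concept of Q maps to a concept of F with a more specific type and
    a compatible marker (generic "*" matches anything), every relation of Q
    maps to a relation of F preserving type and neighbours. Assumes every
    concept of Q is attached to some relation."""
    found = []

    def extend(i, cmap):
        if i == len(q_relations):
            found.append(dict(cmap))
            return
        rtype, qargs = q_relations[i]
        for ftype, fargs in f_relations:
            if not is_subtype(ftype, rtype) or len(fargs) != len(qargs):
                continue
            new, ok = dict(cmap), True
            for qa, fa in zip(qargs, fargs):
                qt, qm = q_concepts[qa]
                ft, fm = f_concepts[fa]
                if not is_subtype(ft, qt) or not (qm == "*" or qm == fm):
                    ok = False
                    break
                if new.setdefault(qa, fa) != fa:   # image must be consistent
                    ok = False
                    break
            if ok:
                extend(i + 1, new)

    extend(0, {})
    return found
```

For instance, the question [Drive]→(agt)→[Person: *] projects onto the fact [Drive]→(agt)→[Man: Mike] because Man specializes Person and the generic marker matches the individual Mike.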
2.1 The reasoning
In this paper, we focus exclusively on reasoning in terms of graph operations. However, in order to clarify the notions of fact base, question and answers, we propose an intuitive definition of these notions. A formal definition in first order logic is proposed in [6].
Definition 1 Let FB = {F1, ..., Fk} be a set of CGs under normal form defined on the support S and Φ(F1), ..., Φ(Fk) the set of associated formulae. Let Φ(S) be the set of formulae associated with S. Given Q a CG defined on S and Φ(Q) its associated formula. We say that there exists an answer A to Q on FB iff Φ(S), Φ(F1), ..., Φ(Fk) ⊢ Φ(Q). The construction of such an answer is presented in the next section.

2.2 COMPOSITION: the first algorithm
Our goal is to propose a sound and complete algorithm computing CG answers according to the previous definition. In other words, we want to show that it is possible to give a CG answer A to the CG question Q without using a logical theorem prover. We could define the notion of answer as a "specialization of the CG question belonging to the CG fact base". Thus the algorithm could be "projecting the CG question upon each CG fact". But such a definition cannot solve the following
¹ According to [8], a CG is under normal form if it doesn't have two conceptual vertices with the same individual marker; the normal form of a graph is computed by merging the conceptual vertices with the same individual marker.
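The normalization described in the footnote can be sketched directly. The encoding (concepts as (type, marker) pairs keyed by id, relations as (type, [neighbour ids])) is an illustrative choice, and for simplicity merged vertices are assumed to agree on their type; in general the most specific common type would be kept.

```python
def normalize(concepts, relations):
    """Put a CG under normal form: merge concept vertices carrying the same
    individual marker, rewiring relation edges onto the surviving vertex.
    Generic vertices (marker "*") are never merged."""
    canon = {}          # individual marker -> surviving concept id
    remap = {}          # old concept id -> new concept id
    out_concepts = {}
    for cid, (ctype, marker) in concepts.items():
        if marker != "*" and marker in canon:
            remap[cid] = canon[marker]      # duplicate individual: merge
        else:
            remap[cid] = cid
            out_concepts[cid] = (ctype, marker)
            if marker != "*":
                canon[marker] = cid
    out_relations = [(rtype, [remap[a] for a in args])
                     for rtype, args in relations]
    return out_concepts, out_relations
```

As the footnote notes, this is linear in the size of the graph (one pass over the vertices, one over the relation edges).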
problem: the knowledge relative to an individual marker and allowing one to answer a question may be distributed among several CG facts of the base. In that case it is impossible to answer, because of the impossibility of finding a projection. For example, consider the question Q and the fact base FB presented in fig. 1. Q cannot be projected on a graph of FB. But the logical formula associated with Q can be deduced from the part of FB in bold lines. Graph F in fig. 1 is obtained by the disjunctive sum of F1 and F2 (a disjunctive sum is a specialization operation in [8] consisting in the juxtaposition of two CGs), and then by the join made on the individual vertices Mike. This graph obviously admits a projection of Q on its grey sub-graph. That sub-graph of F is an answer to Q.
" ~ ~ ~ - - - - ~ C a r
CLT63
2
obj
1
Drive
1
agt
2
Man : Mike
F1
FB
Fig. 1. Example of a CG fact base and a CG question. No graph of FB admits a projection of Q. But graph F, the logical interpretation of which is equivalent to the conjonction of formulae associated with the graphs in FB, admits a projection of Q. There exists an answer to Q in FB.
So the first algorithm consists in considering the fact base FB = {F1, ..., Fk} as a unique graph F resulting from the disjunctive sum of F1, ..., Fk, then normalizing that graph by merging all the vertices with the same individual marker, and finally projecting Q on F. The COMPOSITION algorithm is presented thoroughly in [6].
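The first phase of COMPOSITION (disjunctive sum followed by normalization) can be sketched as below, using an illustrative encoding of CGs (concepts as (type, marker) pairs keyed by id, relations as (type, [neighbour ids])); the question is then answered by projecting it into the resulting graph.

```python
def disjunctive_sum(graphs):
    """Juxtapose several CGs into a single one, renaming concept ids apart."""
    concepts, relations, offset = {}, [], 0
    for cs, rs in graphs:
        for cid, c in cs.items():
            concepts[cid + offset] = c
        relations += [(t, [a + offset for a in args]) for t, args in rs]
        offset += (max(cs) + 1) if cs else 0
    return concepts, relations

def compose_fact_base(fact_base):
    """COMPOSITION, first phase: disjunctive sum of all the facts, then
    normalization (vertices with the same individual marker are merged)."""
    cs, rs = disjunctive_sum(fact_base)
    canon, remap, merged = {}, {}, {}
    for cid, (ctype, marker) in cs.items():
        if marker != "*" and marker in canon:
            remap[cid] = canon[marker]      # merge duplicate individual
        else:
            remap[cid] = cid
            merged[cid] = (ctype, marker)
            if marker != "*":
                canon[marker] = cid
    return merged, [(t, [remap[a] for a in args]) for t, args in rs]
```

With the two facts of fig. 1, the two vertices Man: Mike are joined, producing the graph F into which Q projects.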
2.3 BLUES: the second algorithm
Two drawbacks of the COMPOSITION algorithm can be noted. Firstly, the knowledge base is modified by the use of disjunctive sum and normalization. This is not a problem in terms of the complexity of these operations, but the original form of the knowledge is modified. This can be a problem, for instance, if such knowledge represents sentences from a text and you want to know easily which sentences an answer comes from. Secondly, COMPOSITION involves a projection of the CG question into a graph of a size "equivalent to the size of the knowledge base". In terms of implementation, this can be prejudicial, the whole knowledge base having to be loaded into memory in order to compute the projection. That's why we propose the BLUES² algorithm, which does not modify the fact base: BLUES splits the question instead of merging the CG base. The main idea of this algorithm was proposed by Carbonneill and Haemmerlé [4]. It is an equivalent alternative to the COMPOSITION algorithm. In what follows, we consider that all the CG facts are individually in normal form: the combinatory increases significantly when we work with unspecified graphs, while putting a graph under normal form (which is an equivalent form in terms of logic) has a linear complexity in the size of the graph.
Presentation and definitions

In the COMPOSITION algorithm, we have seen that the answers to a question Q on a fact base FB are graphs resulting from the projection of Q into the graph produced by the disjunctive sum of the CG facts, then put under normal form. The BLUES algorithm simulates these operations in two stages:
1. It splits the question instead of merging the facts, then tries to match each generated sub-question into the base in order to obtain partial answers.
2. It expresses conditions on the recombination of these partial answers in order to generate exact answers.
More precisely, the splitting of a question Q gives us a set Q = {Q1, ..., Qi, ..., Qn} of CGs that we call sub-questions, and a set C = {C1, ..., Cj, ..., Cm} of conceptual vertices sets. Each sub-question Qi is a copy of a connected sub-graph of Q which has at most one relation vertex (and its concept vertices neighbours). Each Cj is composed of all the concept vertices in Q which result from the splitting of the same concept vertex of Q, which we call a cut vertex. The cut vertices of Q are the concept vertices that have at least two distinct neighbouring relation vertices. Moreover, we note ci the concept vertex of the sub-question Qi generated by the cut vertex c of Q (see fig. 2).
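The splitting stage can be sketched as follows, one sub-question per relation vertex; concepts are encoded as (type, marker) pairs keyed by id and relations as (type, [neighbour ids]) — an illustrative encoding, not the paper's data structure:

```python
from collections import defaultdict

def split_question(concepts, relations):
    """BLUES splitting: one sub-question per relation vertex (the relation
    plus copies of its concept neighbours). Cut vertices are the concepts
    with at least two distinct neighbouring relations; C groups, per cut
    vertex, the copies it generates across sub-questions."""
    degree = defaultdict(int)
    for _, args in relations:
        for a in set(args):
            degree[a] += 1
    cut = {c for c, d in degree.items() if d >= 2}
    subs, C = [], defaultdict(list)
    for i, (rtype, args) in enumerate(relations):
        sub_c = {a: concepts[a] for a in args}   # copies, keyed by original id
        subs.append((sub_c, [(rtype, list(args))]))
        for a in set(args):
            if a in cut:
                C[a].append((i, a))   # copy of cut vertex a in sub-question i
    return subs, dict(C)
```

A star-shaped question with three relations around two shared concepts thus yields three sub-questions and two C sets, mirroring the sets Q and C of the definition.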
Definition 2 We call partial answer to a CG question Q the graph that results from the projection Πi of the sub-question Qi into a graph of the base FB. We write it down Πi(Qi).
² Building alL the solUtions aftEr splitS.
Fig. 2. Split of a CG question Q. c and c' are two cut vertices of Q that generate four sub-questions. We have the following sets: Q = {Q1, Q2, Q3, Q4} and C = {{c1, c2}, {c'2, c'3, c'4}}.

Our goal is to know whether a partial answer set P = {Π1(Q1), ..., Πn(Qn)} can be recombined into an exact answer to Q. We note Π the sum of the projections Π1, ..., Πn such that ∀i, 1 ≤ i ≤ n, if v ∈ Qi then Π(v) = Πi(v).

Definition 3 Given Cj = {ci1, ..., cil} the set of concept vertices resulting from the splitting of the cut vertex c of Q. Given, ∀j, 1 ≤ j ≤ l, Πij(cij) the image by Πij of the vertex cij in the sub-question Qij. The set ΠCj = {Πi1(ci1), ..., Πil(cil)} is recombinable if each pair (Πu(ciu), Πv(civ)) satisfies one of the following conditions (the indices u and v are between 1 and l):
a) Πu(ciu) and Πv(civ) are the same vertex; the splitting was not necessary.
b) Πu(ciu) and Πv(civ) are distinct but they have the same individual marker.

Definition 4 Given Π1(Q1), ..., Πn(Qn) some partial answers obtained by projecting the sub-questions Q1, ..., Qn of Q by Π1, ..., Πn on the CG fact base FB. Given C = {C1, ..., Cj, ..., Cm} the set of concept vertices sets due to the splitting of the CG question Q. Given Π the sum of the projections Π1, ..., Πn such that ∀i, 1 ≤ i ≤ n, if v ∈ Qi then Π(v) = Πi(v). Given Q1,...,n the graph resulting from the disjunctive sum of Q1, ..., Qn. If, ∀j, 1 ≤ j ≤ m, ΠCj is recombinable, then we call answer to Q the normal form of the CG Π(Q1,...,n).

Figure 3 shows an example of the recombination of a CG answer by the BLUES algorithm. Note that BLUES computes all the CG answers to a CG question Q on a CG fact base. The BLUES algorithm is presented thoroughly in [6]. We also proved in that article that the BLUES algorithm is a sound and complete reasoning mechanism with regard to our logical definition of the reasoning (Definition 1) and that it is equivalent to the COMPOSITION algorithm. Moreover, it is important to say that these two algorithms are both based on the NP-complete problem of graph projection [9].
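Conditions a) and b) of Definition 3 can be sketched as a direct check. Partial answers are represented here by hypothetical maps from sub-question concept ids to (fact graph, vertex) images; the representation is illustrative, not RAP's actual data structures.

```python
def recombinable(C, partial, concepts_of):
    """Check Definition 3: for each cut vertex, all the images of its copies
    under the chosen partial answers must pairwise be the same vertex
    (condition a) or distinct vertices carrying the same individual marker
    (condition b). `partial[i]` maps concept ids of sub-question i to a
    (fact graph, vertex) image; `concepts_of(g, v)` returns (type, marker)."""
    for copies in C.values():
        images = [partial[i][v] for i, v in copies]
        first = images[0]
        for img in images[1:]:
            if img == first:
                continue                      # condition a: same vertex
            _, m1 = concepts_of(*first)
            _, m2 = concepts_of(*img)
            if m1 == "*" or m1 != m2:
                return False                  # neither condition holds
    return True
```

Comparing every image against the first one suffices, because "same individual marker" is transitive across the copies of a given cut vertex.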
Fig. 3. Recombination of answers. A is a CG answer to the CG question Q of figure 2 on the CG fact base FB = {F1, F2}. Each Πi(Qi) is the partial answer to sub-question Qi on FB. a and b symbolize the different cases of recombination on ΠC1 = {Π1(c1), Π2(c2)} and ΠC2 = {Π2(c'2), Π3(c'3), Π4(c'4)}.
3 The RAP Module

3.1 Presentation

The RAP module (seaRch for All Projections) is essentially an implementation of the BLUES algorithm; thus it proposes a sound and complete reasoning. The RAP module is implemented on the CoGITo platform [10, 7] and can therefore take advantage of its graph base management system [5, 11]. This management system is based on the hierarchy of the CGs induced by the specialization relation. Added to the usual techniques of classification and indexation [12, 13], the CoGITo platform implements hash-coding and filtering mechanisms that allow a reduction of the search domain without using any projection operation. In addition, the projection algorithms that are used are efficient [9, 11]. As far as we know [14], the RAP module is the only reasoning module working in a general framework (the projection is not reduced to an injective projection...) and being sound and complete regarding the logical deduction. Among the systems close to ours, we can cite the PEIRCE system [15]. PEIRCE ignores the possibility of "split" knowledge, and it uses injective projection. But it proposes "close match" CGs which are pertinent answers. ROCK (Reasoning On
Conceptual Knowledge) [3] is another system close to ours. It searches for exact and pertinent answers by modifying the question graph. It considers type definitions (RAP doesn't). But its major drawback is that it is not complete: ROCK can fail to find exact answers to a question. This drawback was at the source of the development of the BLUES algorithm by Carbonneill and Haemmerlé [4, 6].

3.2 Optimization of the BLUES algorithm
Two optimizations of the BLUES algorithm have been done in the RAP module. The first one consists in using the set of specialization graphs of each sub-question graph as the search domain, instead of the whole fact base. This is done by means of the specialization search functionalities of CoGITo, which use hierarchical structuration and indexation. The second optimization consists in recombining the answers only when it is certain that the sets of concept vertices of ΠC are recombinable, instead of trying a hypothetical recombination for every combination of partial answers. More precisely, this optimization is based on the following property: the choice of a partial answer to a sub-question Qi implies restrictions on the choice of the partial answers to the sub-questions that are localized in the immediate neighbourhood of Qi. If no partial answer is possible for one of these sub-questions, then the chosen partial answer of Qi cannot lead to an exact answer
to Q. Two kinds of restriction are used in this optimization. In the example of figures 2 and 3, when the algorithm chooses a partial answer of Q1 in F1, it must choose a partial answer also belonging to F1 as a partial answer of Q2, in order to make a recombination of type a. Then, for Q3 and Q4, the algorithm has to choose partial answers containing the individual vertex Man: Mike in order to make a recombination of type b. The indexed management of graphs in the CoGITo platform makes such a restriction of the search domain easy.
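The neighbourhood restriction can be sketched as a simple filter over candidate partial answers. Each candidate carries the image of the shared cut vertex as a (graph, vertex, marker) triple; this encoding is a hypothetical one for illustration, not RAP's internal representation.

```python
def restrict_candidates(neighbour_answers, shared_cut_image):
    """Keep, for a neighbouring sub-question, only the partial answers
    compatible with the already chosen image of the shared cut vertex:
    same fact graph and same vertex (recombination of type a), or an
    individual vertex with the same marker (recombination of type b)."""
    graph, vertex, marker = shared_cut_image
    keep = []
    for cand in neighbour_answers:
        g, v, m = cand["cut_image"]
        if (g, v) == (graph, vertex) or (m != "*" and m == marker):
            keep.append(cand)
    return keep
```

Pruning the candidate lists before any recombination is attempted is what avoids enumerating every combination of partial answers.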
4 Conclusion and perspectives
We have studied two reasoning algorithms based only on graph operations: the COMPOSITION algorithm, which modifies the CG fact base in order to "compose" knowledge that may be split across several distinct CGs, and the BLUES algorithm, which works by splitting the question graph, searching for partial answers, then recombining a complete answer. These algorithms are sound and complete with respect to logical deduction. This work was primarily theoretical, but we have implemented the BLUES algorithm in order to test it and to observe its behaviour on large CG bases. The first test concerns a base of 10000 graphs (generated by a random algorithm). The CPU times of BLUES are close to those of COMPOSITION. This is a valuable result, because it shows that using an algorithm that respects the original form of a CG fact base carries no penalty.
This first step in the development of a reasoning module for the CoGITo platform should lead to a more complete module taking extensions of the CG model into account (nested CGs, for instance [8]). Another direction of study is the exploration of techniques for providing pertinent answers, as was done in the ROCK system. The BLUES algorithm could easily be adapted to heuristic reasoning during its recombination phase. This would make it possible to combine the completeness of the RAP module with the ROCK system's ability to provide pertinent answers.
References
1. J.F. Sowa. Conceptual Structures - Information Processing in Mind and Machine. Addison-Wesley, Reading, Massachusetts, 1984.
2. M. Chein and M.L. Mugnier. Conceptual Graphs: fundamental notions. Revue d'Intelligence Artificielle, 6(4):365-406, 1992.
3. B. Carbonneill and O. Haemmerlé. ROCK : Un système de question/réponse fondé sur le formalisme des graphes conceptuels. In Actes du 9ème Congrès Reconnaissance des Formes et Intelligence Artificielle, Paris, pages 159-169, 1994.
4. B. Carbonneill. Vers un système de représentation de connaissances et de raisonnement fondé sur les graphes conceptuels. Thèse d'informatique, Université Montpellier 2, Janvier 1996.
5. O. Guinaldo. Etude d'un gestionnaire d'ensembles de graphes conceptuels. PhD thesis, Université Montpellier 2, Décembre 1996.
6. O. Guinaldo and O. Haemmerlé. Algorithmes de raisonnement dans le formalisme des graphes conceptuels. In Actes du XIe congrès RFIA, volume 3, pages 335-344, Clermont-Ferrand, 1998.
7. O. Guinaldo and O. Haemmerlé. CoGITo : une plate-forme logicielle pour raisonner avec des graphes conceptuels. In Actes du XVe congrès INFORSID, pages 287-306, Toulouse, juin 1997.
8. M.L. Mugnier and M. Chein. Représenter des connaissances et raisonner avec des graphes. Revue d'Intelligence Artificielle, 10(1):7-56, 1996.
9. M.L. Mugnier and M. Chein. Polynomial algorithms for projection and matching. In Heather D. Pfeiffer, editor, Proceedings of the 7th Annual Workshop on Conceptual Graphs, pages 49-58, New Mexico State University, 1992.
10. O. Haemmerlé. CoGITo : Une plate-forme de développement de logiciels sur les graphes conceptuels. Thèse d'informatique, Université Montpellier 2, Janvier 1995.
11. O. Guinaldo. Conceptual graphs isomorphism - algorithm and use. In Proceedings of the 4th Int. Conf. on Conceptual Structures, Lecture Notes in Artificial Intelligence, Springer-Verlag, pages 160-174, Sydney, Australia, August 1996.
12. R. Levinson. Pattern Associativity and the Retrieval of Semantic Networks. Computers Math. Applic., 23(6-9):573-600, 1992.
13. G. Ellis. Compiled hierarchical retrieval. In E. Way, editor, Proceedings of the 6th Annual Workshop on Conceptual Graphs, pages 187-207, Binghamton, 1991.
14. D. Lukose, editor. Proceedings of the First CGTOOLS Workshop, University of New South Wales, Sydney, N.S.W., Australia, August 1996.
15. G. Ellis, R. Levinson, and P. Robinson. Managing complex objects in PEIRCE. International Journal on Human-Computer Studies, 41:109-148, 1994.
Stepwise Construction of the Dedekind-MacNeille Completion

Bernhard Ganter¹ and Sergei O. Kuznetsov²

¹ Technische Universität Dresden, Institut für Algebra, D-01062 Dresden
² Department of Theoretical Foundations of Informatics, All-Russia Institute for Scientific and Technical Information (VINITI), ul. Usievicha 20 a, 125219 Moscow, Russia
Abstract. Lattices are mathematical structures which are frequently used for the representation of data. Several authors have considered the problem of incremental construction of lattices. We show that with a rather general approach, this problem becomes well-structured. We give simple algorithms with satisfactory complexity bounds.
For a subset A ⊆ P of an ordered set (P, ≤), let A↑ denote the set of all upper bounds of A, that is, A↑ := {p ∈ P | a ≤ p for all a ∈ A}. The set A↓ of lower bounds is defined dually. A cut of (P, ≤) is a pair (A, B) with A, B ⊆ P, A↑ = B, and A = B↓. It is well known that these cuts, ordered by

(A₁, B₁) ≤ (A₂, B₂) :⇔ A₁ ⊆ A₂ (⇔ B₂ ⊆ B₁),

form a complete lattice, the Dedekind-MacNeille completion (or, for short, completion) of (P, ≤). It is the smallest complete lattice containing a subset order-isomorphic with (P, ≤). The size of the completion may be exponential in |P|.

The completion can be computed in steps: first complete a small part of (P, ≤), then add another element, complete again, et cetera. Each such step increases the size of the completion only moderately and is moreover easy to perform. We shall demonstrate this by describing an elementary algorithm that, given a (finite) ordered set (P, ≤) and its completion (L, ≤), constructs the completion of any one-element extension of (P, ≤) in O(|L| · |P| · w(P)) steps, where w(P) denotes the width of (P, ≤). The special case where (P, ≤) is itself a complete lattice, and thus isomorphic to its completion, has been considered as the problem of minimal insertion of an element into a lattice, see e.g. Valtchev [4]. We obtain that the complexity of inserting an element into a lattice (L, ≤) and then forming its completion is bounded by O(|L| · w(L)).
The elementary considerations on the incidence matrix of (P, <), which we use in the proof, do not utilize any of the order properties. Our result therefore generalizes to arbitrary incidence matrices. In the language of Formal Concept Analysis this may be interpreted as inserting a preconcept into a concept lattice.
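As a concrete illustration of the definitions above (not the incremental algorithm of this paper, which follows in the next section), the cuts of a finite ordered set can be computed by brute-force closure of every subset; the function names and set representation below are our own:

```python
from itertools import combinations

def upper(P, leq, A):
    """A↑: the set of upper bounds of A in (P, ≤)."""
    return frozenset(p for p in P if all(leq(a, p) for a in A))

def lower(P, leq, B):
    """B↓: the set of lower bounds of B in (P, ≤)."""
    return frozenset(p for p in P if all(leq(p, b) for b in B))

def completion(P, leq):
    """All cuts (A, B) with A↑ = B and A = B↓ (brute force,
    exponential in |P| in the worst case, as noted in the text)."""
    P = frozenset(P)
    cuts = set()
    for r in range(len(P) + 1):
        for seed in combinations(sorted(P), r):
            # A := (seed↑)↓ is the smallest cut extent containing seed
            A = lower(P, leq, upper(P, leq, seed))
            cuts.add((A, upper(P, leq, A)))
    return cuts
```

A two-element antichain yields four cuts (bottom, the two elements, top), while a chain, being already a complete lattice, is its own completion.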
1 Computing the completion
Let us define a precut of an ordered set to be a pair (S, T), where S is an order ideal and T is an order filter such that S ⊆ T↓ and T ⊆ S↑. We consider the following construction problem:

INSTANCE: A finite ordered set (P, ≤), its completion, and a precut (S, T) of (P, ≤).
OUTPUT: The completion of (P ∪ {x}, ≤), where x ∉ P is some new element with

p ≤ x :⇔ p ∈ S  and  x ≤ p :⇔ p ∈ T  for all p ∈ P.¹
(P, ≤) may be given by its incidence matrix (of size O(|P|²)). The completion may be represented as a list of cuts, that is, of pairs of subsets of P. With a simple case analysis we show how the cuts of (P ∪ {x}, ≤) can be obtained from those of (P, ≤).
Proposition 1. Each cut of (P ∪ {x}, ≤), except (S ∪ {x}, T ∪ {x}), is of the form

(C, D),  (C ∪ {x}, D ∩ T),  or  (C ∩ S, D ∪ {x})

for some cut (C, D) of (P, ≤). If (C, D) is a cut of (P, ≤), then
1. (C ∪ {x}, D ∩ T) is a cut of (P ∪ {x}, ≤) iff S ⊆ C = (D ∩ T)↓,
2. (C ∩ S, D ∪ {x}) is a cut of (P ∪ {x}, ≤) iff T ⊆ D = (C ∩ S)↑,
3. (C, D) is a cut of (P ∪ {x}, ≤) iff C ⊈ S and D ⊈ T.
For a proof of this result and of the following, see the next section.
Proposition 2. The number of cuts of (P ∪ {x}, ≤) does not exceed twice the number of cuts of (P, ≤), plus two.

A natural embedding of the completion of (P, ≤) into that of (P ∪ {x}, ≤) is given by the next proposition:
Proposition 3. For each cut (C, D) of (P, ≤), exactly one of

(C, D),  (C ∪ {x}, D),  (C, D ∪ {x}),  (C ∪ {x}, D ∪ {x})

is a cut of (P ∪ {x}, ≤).

These cuts can be considered to be the "old" cuts, up to a modification. "New" cuts are obtained only from cuts (C, D) that satisfy 3) and simultaneously 1) or 2). An algorithm can now be given:

¹ For elements of P different from x, the order remains as it was.
Algorithm to construct the completion of (P ∪ {x}, ≤). Let L denote the set of all cuts of (P, ≤).
- Output (S ∪ {x}, T ∪ {x}).
- For each (C, D) ∈ L do:
  1. If C ⊆ S and D ⊈ T then output (C, D ∪ {x}).
  2. If C ⊈ S and D ⊆ T then output (C ∪ {x}, D).
  3. If C ⊈ S and D ⊈ T then
     (a) output (C, D),
     (b) if C = (D ∩ T)↓ then output (C ∪ {x}, D ∩ T),
     (c) if D = (C ∩ S)↑ then output (C ∩ S, D ∪ {x}).
- End.

It follows from the above propositions that this algorithm outputs every cut of (P ∪ {x}, ≤) exactly once. Each step of the algorithm involves operations on subsets of P. The most time-consuming ones are the computation of (D ∩ T)↓ and of (C ∩ S)↑. Note that (D ∩ T)↓ = (min(D ∩ T))↓, where min(D ∩ T) is the set of the minimal elements of D ∩ T and can be computed in O(|P| · w(P)) steps. Since |min(D ∩ T)| ≤ w(P) and, moreover,

(min(D ∩ T))↓ = ⋂ { {p}↓ | p ∈ min(D ∩ T) },

we conclude that (D ∩ T)↓ can be obtained with an effort of O(|P| · w(P)). The dual argument for (C ∩ S)↑ leads to the same result. So if L is the set of cuts of (P, ≤), then the algorithm can be completed in O(|L| · |P| · w(P)) steps. Let us mention that computing an incidence matrix of the completion can be done in O(|L|²) steps, once the completion has been computed; see Proposition 6.
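The enumeration above translates almost line by line into code. The following Python sketch is our own rendering (the frozenset representation and the helper signatures are assumptions); `down` and `up` compute A↓ and A↑ in the original (P, ≤):

```python
def update_completion(cuts, S, T, x, down, up):
    """One step of the incremental algorithm: given the cuts of (P, ≤),
    a precut (S, T) and a new element x, return the cuts of (P ∪ {x}, ≤).

    down(A) computes A↓ and up(A) computes A↑ in the original (P, ≤).
    """
    S, T = frozenset(S), frozenset(T)
    new = [(S | {x}, T | {x})]           # the exceptional cut
    for C, D in cuts:
        c_in_s, d_in_t = C <= S, D <= T
        if c_in_s and not d_in_t:        # step 1
            new.append((C, D | {x}))
        elif not c_in_s and d_in_t:      # step 2
            new.append((C | {x}, D))
        elif not c_in_s and not d_in_t:  # step 3
            new.append((C, D))           # 3a
            if C == down(D & T):         # 3b
                new.append((C | {x}, D & T))
            if D == up(C & S):           # 3c
                new.append((C & S, D | {x}))
    return new
```

Inserting an element x with S = {1}, T = {3} into the chain 1 < 2 < 3 produces the four cuts of the "diamond" lattice, as expected.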
2 Inserting a preconcept
A triple (G, M, I) is called a formal context if G and M are sets and I ⊆ G × M is a relation between G and M. For each subset A ⊆ G let

A^I := {m ∈ M | (g, m) ∈ I for all g ∈ A}.

Dually, we define for B ⊆ M

B^I := {g ∈ G | (g, m) ∈ I for all m ∈ B}.

A formal concept of (G, M, I) is a pair (A, B) with A ⊆ G, B ⊆ M, A^I = B, and A = B^I. The formal concepts, ordered by

(A₁, B₁) ≤ (A₂, B₂) :⇔ A₁ ⊆ A₂ (⇔ B₂ ⊆ B₁),

form a complete lattice, the concept lattice of (G, M, I). Most of the arguments given below become rather obvious if one visualizes a formal context as a G × M cross table, where the crosses indicate the incidence relation I. The concepts (we sometimes omit the word "formal") then correspond to maximal rectangles in such a table. Note that if A = B^I for some set B ⊆ M, then (A, A^I) automatically is a concept of (G, M, I).

A pair (A, B) with A ⊆ G, B ⊆ M, A ⊆ B^I, and B ⊆ A^I is called a preconcept of (G, M, I). In order to change a preconcept into a concept, one may extend each of the sets G and M by one element with the appropriate incidences. So, as a straightforward generalization of the above, we consider the following construction problem:

INSTANCE: A finite context (G, M, I), its concept lattice, and a preconcept (S, T) of (G, M, I).
OUTPUT: The concept lattice of (G ∪ {x}, M ∪ {x}, I⁺), where x ∉ G ∪ M is a new element and

I⁺ := I ∪ ((S ∪ {x}) × ({x} ∪ T)).

The special case of section 1 is obtained by letting G = M := P and (g, m) ∈ I :⇔ g ≤ m.
Proposition 4. Each formal concept of (G ∪ {x}, M ∪ {x}, I⁺), with the exception of (S ∪ {x}, T ∪ {x}), is of the form

(C, D),  (C ∪ {x}, D ∩ T),  or  (C ∩ S, D ∪ {x})

for some formal concept (C, D) of (G, M, I). With the obvious modifications, the conditions given in Proposition 1 hold.

Proof. Each formal concept (A, B) of (G ∪ {x}, M ∪ {x}, I⁺) belongs to one of the following cases:
1. x ∈ A, x ∈ B. Then A = S ∪ {x}, B = T ∪ {x}.
2. x ∈ A, x ∉ B. Then B ⊆ T and B^I = A \ {x}. Therefore (C, D) := (A \ {x}, (A \ {x})^I) is a formal concept of (G, M, I) satisfying

S ⊆ C = (D ∩ T)^I.  (1)

Conversely, if (C, D) is a formal concept of (G, M, I) satisfying (1), then (A, B) := (C ∪ {x}, D ∩ T) is a formal concept of (G ∪ {x}, M ∪ {x}, I⁺).
3. x ∉ A, x ∈ B, dual to 2. Then (C, D) := ((B \ {x})^I, B \ {x}) is a concept of (G, M, I) with

T ⊆ D = (C ∩ S)^I.  (2)

Conversely, each formal concept (C, D) with (2) yields a formal concept (A, B) := (C ∩ S, D ∪ {x}) of (G ∪ {x}, M ∪ {x}, I⁺).
4. x ∉ A, x ∉ B. Then (C, D) := (A, B) is a formal concept also of (G, M, I), satisfying

C ⊈ S,  D ⊈ T.  (3)

Conversely, each pair with (3) is also a concept of (G ∪ {x}, M ∪ {x}, I⁺).

If both (C ∪ {x}, D ∩ T) and (C ∩ S, D ∪ {x}) happen to be concepts, then S ⊆ C and T ⊆ D, which implies C ∪ {x} = T^I, D ∪ {x} = S^I. Thus, apart from perhaps one exceptional case, these two possibilities exclude each other. From each concept of (G, M, I) we therefore obtain at most two concepts of (G ∪ {x}, M ∪ {x}, I⁺), except in a single exceptional case, which may lead to three. On the other hand, each concept of (G ∪ {x}, M ∪ {x}, I⁺), except (S ∪ {x}, T ∪ {x}), is obtained in this manner. This proves Proposition 2.

To see that Proposition 3 holds in the general case, note that each formal concept (C, D) of (G, M, I) belongs to one of the following cases:
1. C = S, D = T. Then (C ∪ {x}, D ∪ {x}) is a concept of (G ∪ {x}, M ∪ {x}, I⁺).
2. C ⊆ S, T ⊆ D. Then D = C^I and condition (2) (from the proof of Proposition 4) is fulfilled. Thus (C, D ∪ {x}) is a concept of (G ∪ {x}, M ∪ {x}, I⁺).
3. S ⊆ C, D ⊆ T. Then C = D^I and condition (1) is satisfied. Therefore (C ∪ {x}, D) is a concept of (G ∪ {x}, M ∪ {x}, I⁺).
4. C ⊈ S, D ⊈ T. Then (C, D) is a concept of (G ∪ {x}, M ∪ {x}, I⁺).
It is clear that each of the possible outcomes determines (C, D), and that therefore the possibilities are mutually exclusive. It is a routine matter to check that these formal concepts are ordered in the same way as those of (G, M, I). The construction thus yields a canonical order embedding of the small concept lattice into that of the enlarged context. Since all details carry over to the more general case, we may summarize:

Proposition 5. The algorithm given in section 1, when applied to the concept lattice L of (G, M, I), computes the concept lattice of (G ∪ {x}, M ∪ {x}, I⁺).

The abovementioned complexity considerations apply as well, but it is helpful to introduce a parameter for contexts that corresponds to the width.
The incidence relation induces a quasiorder relation on G by

g₁ ≲ g₂ :⇔ {g₂}^I ⊆ {g₁}^I.

Let w(G) be the width of this quasiorder, and let w(M) denote the width of the corresponding quasiorder on M. Let

τ(G, M, I) := (w(G) + w(M)) · (|G| + |M|).

Of course, τ(G, M, I) ≤ (|G| + |M|)². Provided the induced quasiorders on G and M are given as incidence matrices (these can be obtained in O(|G| · |M| · (|G| + |M|)) steps), we have a better bound on the complexity of the derivation operators: the set A^I can be computed from A with complexity O(τ(G, M, I)).
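The width-based speedup rests on the observation that if g₁ ≲ g₂ then {g₂}^I ⊆ {g₁}^I, so A^I is already the intersection of the rows of the ≲-maximal elements of A, of which there are at most w(G). A small illustrative sketch (our own naming; each row is stored as an attribute set):

```python
def derive(A, rows, M):
    """Compute A^I from the ≲-maximal elements of A only.
    rows maps each object g to its attribute set {g}^I."""
    if not A:
        return frozenset(M)   # ∅^I = M
    # g1 ≲ g2  :⇔  rows[g2] ⊆ rows[g1]
    leq = lambda g1, g2: rows[g2] <= rows[g1]
    # keep only objects with no strictly ≲-greater element in A
    maximal = [g for g in A
               if not any(leq(g, h) and not leq(h, g) for h in A)]
    result = frozenset(M)
    for g in maximal:
        result &= rows[g]
    return result
```

Intersecting only the maximal rows gives the same result as intersecting all of them, since a ≲-smaller object's row is a superset and cannot shrink the intersection further.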
Computing A^I was the most time-consuming step in the algorithm of section 1. Thus computing the new concept lattice can be performed with O(|L| · τ(G, M, I)) bit operations. Each concept of (G ∪ {x}, M ∪ {x}, I⁺), except (S ∪ {x}, T ∪ {x}), is generated by exactly one of the steps 1, 2, 3a, 3b, 3c of the algorithm, and precisely 3b) and 3c) lead to "new" concepts (other than (S ∪ {x}, T ∪ {x})). When performing the algorithm, we may note down how the concepts were obtained. These data can be used later to construct an incidence matrix of the new lattice:
Proposition 6. The order relation of the new lattice can be computed in additional O(|L|²) steps.
Proof. (S ∪ {x}, T ∪ {x}) is the smallest concept containing x in its extent and the largest concept containing x in its intent. In other words, (S ∪ {x}, T ∪ {x}) is smaller than all concepts generated in steps 2) and 3b) and greater than all concepts generated by steps 1) and 3c). It is incomparable to the other elements. So we may exclude this concept from further considerations.

The order relation between the "old" concepts, i.e. between those generated in steps 1), 2), and 3a), is the same as before. For the remaining case, we consider w.l.o.g. a concept (C ∪ {x}, D ∩ T) which was generated in step 3b) from a concept (C, D) of (G, M, I). Now (C ∪ {x}, D ∩ T) ≤ (E, F) if and only if (E, F) has been generated in steps 2) or 3b) from some concept (E \ {x}, (E \ {x})^I) ≥ (C, D) of (G, M, I). If x ∈ E, then similarly (E, F) ≤ (C ∪ {x}, D ∩ T) is true if and only if (E, F) has been generated in steps 2) or 3b) from some concept (E \ {x}, (E \ {x})^I) ≤ (C, D) of (G, M, I). Suppose x ∉ E. If (E, F) was obtained in steps 1) or 3a) of the algorithm, then (E, E^I) is a concept of (G, M, I) and (E, F) ≤ (C ∪ {x}, D ∩ T) is equivalent to (E, E^I) ≤ (C, D). If (E, F) was obtained in step 3c), then S^I ⊆ F, which implies D ∩ T ⊆ S^I ⊆ F. So in this case (E, F) ≤ (C ∪ {x}, D ∩ T) always holds.

Summarizing these facts, we obtain all comparabilities of a concept (C ∪ {x}, D ∩ T) of (G ∪ {x}, M ∪ {x}, I⁺) which was derived from a concept (C, D) of (G, M, I) in step 3b): concepts greater than (C ∪ {x}, D ∩ T) are those obtained in steps 2) or 3b) from concepts greater than (C, D); concepts smaller than (C ∪ {x}, D ∩ T) are those obtained in steps 1), 2), 3a) or 3b) from those smaller than (C, D), and all those obtained in step 3c). Thus the comparabilities of (C ∪ {x}, D ∩ T) can be obtained from those of (C, D) using only a bounded number of elementary operations in each case. Filling the corresponding row of the incidence matrix is of complexity O(|L|). The argument for concepts obtained by 3c) is analogous.
The generalized algorithm may be applied to the context (P, P, ≱) obtained from an arbitrary ordered set (P, ≤). The concept lattice then is the lattice of maximal antichains of (P, ≤) (see Wille [5]). Our result therefore relates to that of Jard, Jourdan and Rampon [2].
3 A non-incremental procedure may be more convenient
In practice, a strategy suggests itself that may be more time-consuming, but is nevertheless simpler than the algorithm presented in section 1. Rather than pursuing an incremental algorithm, it may be easier to compute the lattice "from scratch" (i.e. from the formal context, or, in the special case, from the ordered set (P, ≤)) each time. For this task there is an algorithm that is remarkably simple (it can be programmed in a few lines) and at the same time is not dramatically slower than the incremental approach: it computes the concept lattice L of a formal context (G, M, I) in O(|L| · |G|² · |M|) steps. Using the parameter introduced above, we can improve this to O(|L| · |G| · τ(G, M, I)). This algorithm generates the formal concepts inductively and does not require a list of concepts to be stored.

Let us exemplify the advantage of this by a simple calculation: a formal context (G, M, I) with |G| = |M| = 50 may have as many as 2⁵⁰ formal concepts in the extreme. But even if the lattice is "small" and has only, say, 10¹⁰ elements, it would require almost a hundred Gigabytes of storage space. Generating such a lattice with the inductive algorithm appears to be time-consuming, but not out of reach; the storage space required would be less than one Kilobyte. Moreover, this algorithm admits modifications that allow searching specific parts of the lattice. For details and proofs we refer to the literature (see [1]), but the algorithm itself is so simple that it can be recalled here.

For simplicity assume G := {1, ..., n}, and define for subsets A, B ⊆ G

A < B :⇔ the minimal element of (A \ B) ∪ (B \ A) belongs to B.

This definition yields a strict linear order on the set of all subsets of G (a lexicographic or, for short, lectic order). If (A, B) is a formal concept of (G, M, I), then A is called its extent. Since B = A^I, the extents uniquely determine the concepts. To generate all concepts, it therefore suffices to generate these. This can be done in lectic order, starting with ∅^II. The step that constructs from a given set A the "next" extent is of the form

A ⊕ i := ((A ∩ {1, ..., i-1}) ∪ {i})^II.

The following theorem describes how the element i must be chosen:

Theorem 1 (see [1]). Let (G, M, I) be a formal context with G := {1, ..., n}. For given A ⊆ G, the smallest extent that is larger than A (with respect to the lectic order) is given by A⁺ := A ⊕ i, where i is maximal with the property that i is the minimal element of the symmetric difference of A and A ⊕ i.
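A direct Python transcription of A ⊕ i and Theorem 1 (a sketch under our own naming; G is taken as {0, ..., n-1} and `closure` is assumed to compute S ↦ S^II):

```python
def next_extent(A, closure, n):
    """Smallest extent lectically greater than A, or None if A is the
    last one. closure(S) must compute S'' for S a subset of {0..n-1}."""
    for i in range(n - 1, -1, -1):      # try i from largest to smallest
        if i in A:
            continue
        B = closure({j for j in A if j < i} | {i})   # B = A ⊕ i
        # accept iff the minimal element of the symmetric difference
        # of A and B is i itself (elements of A below i stay in B)
        if min(B - A) == i:
            return B
    return None

def all_extents(closure, n):
    """All extents, generated in lectic order starting from closure(∅)."""
    A = closure(set())
    while A is not None:
        yield frozenset(A)
        A = next_extent(A, closure, n)
```

Only the current extent is kept in memory, which is exactly the storage advantage discussed above.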
It is easy to see that computing A ⊕ i requires at most O(|G| · |M|) steps, and, using the induced quasiorders, only O(τ(G, M, I)) steps. The "next" extent therefore is found at an expense of O(|G|² · |M|), or even O(|G| · τ(G, M, I)). If a lattice diagram is to be generated, the inductive approach may even be faster than the incremental one. For a given extent A ≠ G, the extents of the upper covers are precisely the minimal sets of the form

(A ∪ {i})^II,  i ∉ A.
Computing these requires O(|G|² · |M|) steps. Localizing such an upper cover in a linear list of extents, using a binary search algorithm, can be done with O(log |L|) comparisons of subsets of G. The complexity thus is O(|G| · |M|), since |L| ≤ 2^|M|.

Every finite lattice (L, ≤) is isomorphic to some concept lattice. A natural choice is the formal context (J(L), M(L), ≤), where J(L) and M(L) denote the sets of join- and meet-irreducible elements of (L, ≤), respectively. If we denote the cardinalities of these sets by

j(L) := |J(L)|,  m(L) := |M(L)|,

we can summarize:

Corollary 1. The covering relation of a finite lattice (L, ≤) can be computed in O(j(L)² · m(L) · |L|) steps, provided the sets J(L) and M(L) of join- and of meet-irreducible elements are given.

This is considerably better than, e.g., the bound given by Skorsky [3]. Again, the bound can be refined using the width of the induced orders on G and on M.
References
1. Bernhard Ganter, Rudolf Wille: Formale Begriffsanalyse - Mathematische Grundlagen. Springer Verlag, 1996.
2. C. Jard, G.-V. Jourdan and J.-X. Rampon: Computing On-Line the Lattice of Maximal Antichains of Posets. Order 11 (1994).
3. Martin Skorsky: Endliche Verbände - Diagramme und Eigenschaften. Shaker, 1992.
4. Petko Valtchev: An Algorithm for Minimal Insertion in a Type Lattice. Second International KRUSE Symposium, Vancouver, 1997.
5. Rudolf Wille: Finite distributive lattices as concept lattices. Atti Inc. Logica Mathematica, 2 (1985).
PAC Learning Conceptual Graphs

Pascal Jappy and Richard Nock

Laboratoire d'Informatique, de Robotique et de Microélectronique de Montpellier, 161 rue Ada, 34392 Montpellier Cedex 5, France
{jappy, nock}@lirmm.fr
Abstract. This paper discusses the practical learnability of simple classes of conceptual graphs. We place ourselves in the well-studied PAC learnability framework and describe the proof technique we use. We first prove a negative learning result for general conceptual graphs. We then establish positive results by restricting ourselves to classes in which basic operations are polynomial. More precisely, we first state a sufficient condition for the learnability of graphs having a polynomial projection operation. We then extend this result to disjunctions of graphs of bounded size.
1 Introduction
The last decade has seen an explosive growth in database technology and the amount of data collected. Advances in data collection have flooded us with data. Data Mining is the efficient supervised or unsupervised discovery of interesting, useful, and previously unknown patterns from this data. Machine Learning, and in particular Inductive Learning, is the supervised extraction of an abstract concept which has to correctly explain classified observations. In both Data Mining and Machine Learning, the choice of the language used to represent the data is of utmost importance, and the study of the efficiency of learning for a given representation language, i.e. its learnability, has become an active field of research. It is common, in Data Analysis and Machine Learning, to represent data as attribute-value pair vectors, often real valued; and, in Computational Learning Theory, the complexity of learning has been studied mainly for Boolean (i.e. two valued) formula classes. However, alternative descriptions have been proposed in the fields of Artificial Intelligence and Knowledge Representation. Structural representations, for instance, provide a means of capturing a special type of information that others do not: not only are an object's properties described, but also the relations between its subcomponents. Furthermore, the readability of such representations is far greater. This makes them much more usable by experts, and an abundant literature has been devoted to them, from early semantic networks [18] to the more recent Description Logics [2] or Conceptual Graphs [23, 3]. In Machine Learning, the former are now being used in several applications such as DENDRAL [13].
To formally analyse the learning hardness associated with a given description language, several theoretical models have been proposed. Identifiability in the limit [8], for instance, focuses on isolating a formula capable of explaining a set of examples in finite time, but with no a priori bound on the number of examples available to the learner. But the most widely used learnability model is the Probably Approximately Correct (PAC) framework introduced by Valiant [24]. It has become a learnability benchmark in which the vast majority of studies of the past ten years have been undertaken. Early work cast a pessimistic shadow on the learnability of such classes: [9], for instance, showed that learning existential conjunctions in structural domains can be intractable. However, more recent studies have shown that compromises can be found between sheer representation power and complexity of learning. In the quest for more expressive representation languages for Machine Learning, two main trends of structural formalisms have emerged. On one side, Inductive Logic Programming (ILP) aims at learning concepts expressed as Horn clauses from examples and background knowledge [15]. Though the general problem is undecidable, many restricted classes have led to learnability in the limit [22] or Probably Approximately Correct (PAC) learnability results [6, 12, 4]. On the other, a family of structural languages called Description Logics has been developed in the past decade by the Knowledge Representation (KR) community. They provide an efficient means for expressing concept hierarchies and form the basis of KR systems such as KL-ONE [2]. Furthermore, they offer syntactic restrictions of First Order Logic (FOL) previously unexplored in ILP.

In this paper, we study the possibility of using Conceptual Graphs in learning applications. This formalism has been studied extensively and its algorithmic properties are well known [3], making it a natural candidate for learning applications. Besides, its expressive power is high, equivalent to large subsets of First Order Logic [23, 3]. Furthermore, document retrieval applications which employ search procedures close to Data Mining techniques have been developed using conceptual graphs [7]. We place ourselves in the PAC framework and use a constructive proof technique [1] to obtain our results. Our goal is to isolate which subsets of general Conceptual Graphs allow efficient learning. We first give a negative result for conceptual graphs, showing that their NP-complete projection operation makes it impossible to PAC learn them. We then show a sufficient condition for conceptual graph classes with polynomial projection to be learnable. Finally, we extend this result to disjunctions and decision lists of graphs of limited size. Both disjunctions and decision lists are widely employed in Machine Learning, and the classes we define by extending these to conceptual graphs are likely to prove very efficient.

The rest of this paper is organized as follows. In section 2, we present the formal learnability model we have chosen to place ourselves in. In particular, we describe both the techniques used in practice to prove positive results, and existing theories which limit our search for efficiently learnable classes. In section 3, we prove a first negative result for general conceptual graphs. Section 4 lists our positive results and their respective proofs. We finally conclude in section 5.
2 Formal Background
Our goal in this paper is to determine which conceptual graph classes are efficiently learnable. Obviously, this formal analysis requires an explicit model of what it means to be efficiently learnable. In this section, we describe our model of learnability, which is a slight modification of the PAC learnability model introduced by Valiant [24].
2.1 Hypothesis Space
In inductive learning, the goal is to find an unknown target concept, or some good approximation of it, from a set of labeled examples. Let X be a set called the domain, and let CG denote the set of conceptual graphs. A concept C over X is a subset of X. A concept class is a set of concepts. This will designate a constrained set of potential "target" concepts which could be labelling the training examples. A hypothesis is also a subset of X, which the learning algorithm produces as an approximation of the hidden target concept. A hypothesis class is a set of hypotheses. Associated with each concept or hypothesis class is a language L ⊆ CG used for writing down concepts or hypotheses in these classes. In this paper, both hypothesis and concept classes will be variously restricted conceptual graph classes. We will also assume the existence of an acceptable size measure (by acceptable, we mean somehow polynomially related to the actual number of bits required to write down a concept or hypothesis). Also, as it will always be clear from the context whether we are referring to a concept or its representation in L, we will let c denote both.

Usually, examples of the concept c are elements of the domain X, with x ∈ X labeled as positive if x ∈ c and negative otherwise. Here, we depart from this practice and consider examples described as conceptual graphs, since this is closer to real applications of the domain. Therefore, examples will be elements selected from CG, and x will be labeled positive if c subsumes x (i.e. c projects onto x) and negative otherwise.

Finally, the learning algorithm must be allowed greater resources if the concept class is vast. Thus, class complexity parameters are needed to express this richness. In this paper, we follow the accepted theoretical presentation of conceptual graphs proposed in [3]. Hence, the parameters which determine the richness of a given class of conceptual graphs are the number of relations in the support, the number of concepts in the type lattice, and the number of individual markers allowed. We will let α₁, α₂ and α₃ denote these, respectively. More generally, CG(αᵢ) will represent the subset of CG defined using parameters (αᵢ).
2.2 PAC learnability
Informally, to say that a class C of concepts is "learnable from examples" means that there exists a corresponding learning algorithm A which, for any function c in C, is capable of converging upon c as it is given more and more examples: A will have learnt c from the examples. This accomplishment by itself is not necessarily very interesting, as no bound on A's execution time or on the number of examples required has been stated. For instance, any Boolean function class defined over n variables is clearly learnable from examples, since all the algorithm has to do is to wait for an example corresponding to each of the 2ⁿ possible variable assignments to know the complete truth table for c. So the interesting question really is to know whether C is easily learnable from examples, that is, whether the learning algorithm requires both few examples and a reasonable computing time. Valiant [24] introduced a notion of learnability which explicitly specifies that the hypothesis acquired by the learning algorithm should be a close approximation of the concept being taught via its examples, in the sense that this hypothesis should perform well on new data. We now make these notions more precise. Assume a fixed but unknown probability distribution D(αᵢ), defined on CG(αᵢ), according to which the examples are drawn. When a hypothesis h is returned by the algorithm, its quality is evaluated by the probability that a new example drawn according to D(αᵢ) is incorrectly classified by h.

Definition 1. Let C denote a concept class and C(αᵢ) the subset of C containing all elements built with parameters (αᵢ). C is polynomially learnable iff there exists a learning algorithm A such that:
- for every probability distribution D(αᵢ) over C(αᵢ),
- for all richness parameters (αᵢ),
- for every target concept c in C(αᵢ),
- for every precision parameter ε,
- for every confidence parameter δ,
when given a set of examples of c, of size m polynomial in αᵢ, size(c), 1/ε and 1/δ, drawn according to D(αᵢ), A finds a hypothesis g in C(αᵢ) such that:

P_D(αᵢ)( P_D(αᵢ)(c Δ g) > ε ) < δ  (1)

in time polynomial in m and αᵢ.

In other words, this definition requires that A be able to produce an answer whose error is less than the prespecified precision parameter ε. However, it is theoretically possible to draw a learning sample of examples according to D(αᵢ) that is not "representative" of the target concept c. So the model also supplies a confidence parameter δ and only demands that the precision be exceeded with probability at most δ. Obviously, as ε and δ approach 0, the algorithm is allowed a larger learning sample and more computation time.
If a concept class C conforms to Definition 1, it is said to be PAC learnable by itself; that is, the hidden concept c belongs to it and the learning algorithm also searches in C for the hypotheses to approximate c. There are a number of variations on this definition. In particular, one which interests us is the relaxation which happens when the algorithm is allowed to look for the hypotheses in another (hypothesis) class H. In this case, the concept class C is said to be PAC learnable by H.
2.3 Obtaining positive results
Proving positive PAC results using Valiant's definition can be difficult. Blumer et al. [1] have developed easier techniques for doing so. We now repeat one of their key theorems, which allows the proof to be split into two simple ones.

Definition 2. A class C is said to be of polynomial (code) size if, for all (αi), size(C(αi)) = log(|C(αi)|) is polynomial in (αi).
Definition 3. A class C is said to be polynomial time identifiable if, when given parameters (αi) and a set of m examples, an identification algorithm can produce in time polynomial in m and (αi) a function g of C(αi) which is consistent with the examples, or detect its non-existence.

Theorem 1. Let C denote a concept class of polynomial size. If C is polynomial time identifiable then C is polynomially learnable.

Proof. Suppose f is a hypothesis in C(αi) whose error probability is greater than ε. The chance that f is consistent with a sample of m examples drawn according to D(αi) in C(αi) is at most (1 − ε)^m. So the chance that an arbitrary concept in C(αi) satisfies both conditions is at most |C(αi)|(1 − ε)^m. The PAC learning requirements demand that:

|C(αi)|(1 − ε)^m ≤ δ    (2)

Solving for m, we get:

m ≥ (1/ε)(log(|C(αi)|) + log(1/δ))    (3)
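The bound of equation (3) can be evaluated numerically. The following is an illustrative sketch (not from the paper), using natural logarithms, since the base only affects constants:

```python
import math

def sample_bound(class_size, epsilon, delta):
    """Smallest m with m >= (1/epsilon) * (ln|C| + ln(1/delta)),
    which guarantees |C| * (1 - epsilon)**m <= delta,
    using (1 - epsilon)**m <= exp(-epsilon * m)."""
    return math.ceil((math.log(class_size) + math.log(1.0 / delta)) / epsilon)

# e.g. a class of 2**20 concepts, 5% error, 1% failure probability
m = sample_bound(2**20, epsilon=0.05, delta=0.01)
```

For this hypothetical class of about a million concepts, a few hundred examples suffice, illustrating the logarithmic dependence on the class size.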
If m satisfies this equation, the probability that a hypothesis consistent with the sample turns out to have an error greater than ε is less than δ. In other words, the algorithm need only examine m examples and return a consistent hypothesis. The class being polynomial sized, m is bounded by a polynomial in (αi), 1/ε, 1/δ and the size of the largest example, as required by the PAC framework. The practical consequence of this result is that, in order to prove a positive PAC result for a concept class C, it is sufficient to produce an identification algorithm and prove C's polynomial size. Unfortunately, this also highlights one major constraint of this technique: if
testing whether an element x is an example of a concept c cannot be done in polynomial time, then no identification algorithm can be polynomial. This limits the use of the above results to classes for which this test is polynomial. In the following section we give a stronger result for conceptual graphs, similar in spirit to Theorem 1 of [5].

3 A Negative Result
Our first result concerns the general class of conceptual graphs CG, for which we prove the necessity of a polynomial projection test for the existence of a PAC learning algorithm. The proof of our theorem relies on a structural complexity hypothesis and requires the presentation of the complexity class P/Poly.

Definition 4. The class P/Poly is the set of all languages accepted by a (possibly nonuniform) family of polynomial-size deterministic circuits.

It is an accepted assumption that NP ⊄ P/Poly, which in particular implies that NP-complete languages do not belong to P/Poly. We now state our first result.

Theorem 2. Let C ⊆ CG denote a conceptual graph class. If the projection test between elements in C is either NP-complete or coNP-complete, then C is not PAC-learnable unless NP ⊆ P/Poly.
Proof. For any concept class C and concept c ∈ C, we denote by inf(c) the concept having the same representation as c, but which denotes the set inf(c) = {d ∈ C : c projects into d}. Similarly, we define inf(C) = {inf(c) : c ∈ C}. It is immediate that C is PAC learnable iff inf(C) is, and that testing membership for a concept inf(c) ∈ inf(C) is as hard as testing projection between elements in C. In Theorem 7 of [21], it is shown that if inf(C) is PAC learnable then inf(C) ∈ P/Poly. Thus, if projection in inf(C) is NP-complete or coNP-complete, then we have NP ⊆ P/Poly or coNP ⊆ P/Poly. This leads to a contradiction in both cases (because P/Poly is closed under complementation). A simple corollary of this theorem yields our first result regarding the learnability of conceptual graphs.

Corollary 1. Since the projection operation is NP-complete in CG, general conceptual graphs are not PAC learnable.

In the next section, we examine restrictions of this last class and determine which are learnable in the framework presented above.

4 Three positive results
In this section, we investigate restrictions on conceptual graphs which are sufficient to ensure PAC learnability. Our previous negative result imposes that we restrict ourselves to classes whose projection operation is polynomial.
Our first positive result is a general one. It is based on a transcription of Theorem 1 and highlights a size limitation on classes which is sufficient to ensure their PAC learnability. We then specialize this and prove that well-known boolean formula classes can be extended by substituting conceptual graphs of limited size for the boolean monomes more commonly used in machine learning. We prove that the classes thus obtained, which are far more expressive than their boolean counterparts, are PAC learnable.

4.1 PAC learning individual graphs
Theorem 3. Let C(αi) ⊆ CG denote a conceptual graph class. If C(αi) is of polynomial (code) size and the least upper bound operation is polynomial, then C(αi) is PAC learnable.

Proof. We use the technique based on Theorem 1. We prove that log(|C(αi)|) is polynomial in (αi), and provide an identification algorithm, that is, one capable of returning a conceptual graph consistent with an input sample. We give a large upper bound on the number of elements of C(αi) with a total number of n vertices (which we use as a crude but acceptable size measure) as follows. We suppose that we have n relation vertices and n concept vertices. In this bipartite graph, we suppose that any edge is either present or absent. We suppose that for any concept vertex we have α2α3 substitution possibilities, and for any relation vertex we have α1 possibilities of substitution. Given this procedure for generating graphs, it follows that we can generate any graph in C(αi) having a total number of n vertices. The total number of vertex substitutions is upper bounded by (α1α2α3)^n. The size of C(αi) is then calculated taking into account the possibility for any edge of being either absent or present; the total number of conceptual graphs having n vertices is therefore (largely) upper bounded by:

((α1α2α3)^n)^(n²) = (α1α2α3)^(n³)

It is obvious that taking the log of this quantity gives a polynomial in αi. The identification algorithm we propose is based on the least upper bound operation lub(·,·), which returns for any pair of conceptual graphs the most specific one which projects into both. Let S = S⁺ ∪ S⁻ denote the sample. Let c_pos be the graph defined as the lub of all positive examples and c_neg the lub of all negative examples. One of these two graphs is necessarily consistent with the sample. Indeed, if this were not the case, there would be a positive example which projects into c_neg and a negative one which projects into c_pos. Since, by definition of the lub, c_pos projects into all positive examples and c_neg into all negative ones, the transitivity of the projection operation would mean there would be a cycle between the positive examples (or the negative ones) and themselves, which is impossible.
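The identification step of this proof can be sketched concretely. The Python below is a hedged illustration: the function names are mine, and conceptual graphs are modelled by a toy stand-in (feature sets, where lub is intersection and g projects into x iff g ⊆ x), not by a real CG implementation.

```python
from functools import reduce

def identify_by_lub(positives, negatives, lub, projects_into):
    """Theorem 3's identification algorithm: one of lub(positives)
    and lub(negatives) is necessarily consistent with the sample."""
    c_pos = reduce(lub, positives)  # most specific graph projecting into every positive
    c_neg = reduce(lub, negatives)
    # c_pos is consistent iff it projects into no negative example
    if not any(projects_into(c_pos, x) for x in negatives):
        return c_pos
    return c_neg

# Toy stand-in domain: "graphs" are feature sets.
pos = [frozenset({"a", "b", "c"}), frozenset({"a", "b", "d"})]
neg = [frozenset({"a", "e"})]
h = identify_by_lub(pos, neg, lambda g, k: g & k, lambda g, x: g <= x)
```

Here `h` is the lub of the positives, which covers both of them and neither negative.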
We have thus produced an algorithm capable of returning a graph consistent with the input sample. And if the lub operation is polynomial, this algorithm is too. This concludes the proof. This result uses the lub(·,·) operation to construct hypothesis graphs. The theorem below links the more commonly studied projection operation to PAC learnability. Theorem 4. Let C(αi) ⊆ CG denote a conceptual graph class. If both the
complexity of the projection test between elements of C(αi) and |C(αi)| are polynomial in (αi), then C(αi) is PAC learnable.

Proof. Here, the size restriction imposed on C(αi) lets us construct an even simpler identification algorithm. Indeed, a simple enumeration of all the elements in C(αi) and a projection test between the current graph and all the examples in the sample are all that is needed to select a consistent graph or detect its non-existence. This algorithm is obviously polynomial if the theorem's assumptions are true. That the polynomial (code) size criterion is met is also trivial, since the class itself is of polynomial cardinality. This result is weaker than the previous one since it requires the class to be of smaller size. However, in the next section, we extend it by showing that under identical conditions on C(αi), more elaborate hypotheses (than individual graphs) can be constructed from elements of C(αi) and successfully PAC learnt.

4.2 Learning formulas based on size limited graphs
In this section, we examine the possibility of using conceptual graphs as the basic building blocks of more elaborate formulas. Two classes of boolean formulas have been at the center of numerous studies in Machine Learning. These are DNF (Disjunctive Normal Form) [25] and DL (Decision Lists) [19]. An element of DNF (which we will note dnf) is a disjunction of boolean monomes, that is, conjunctions of boolean literals. A dnf classifies an example x as positive if x satisfies at least one of its monomes. The learnability of DNF is an important open problem, but several results have been shown for subsets of this class. In particular, the class k-DNF, in which the monomes in a disjunction are limited in size to a maximum of k, is PAC learnable [25]. A decision list (dl) is an ordered list of conjunctive rules noted
(t1 → g1), (t2 → g2), ..., (tk → gk), gk+1, where for all 1 ≤ i ≤ k, ti is a boolean monome and gi ∈ {0, 1}, and the class gk+1 is called the default class. The class associated to any example x is the goal class corresponding to the first test passed by the example, i.e. the first monome satisfied by x. If none is passed, the example is assigned the default class. Again, a positive PAC learning result is obtained if the monomes in the list are limited in size to at most k.
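The decision list classification rule can be sketched in a few lines. This is an illustrative Python rendering (the names are mine), with monomes modelled as sets of required literals, an example satisfying a test when it contains them all:

```python
def classify_dl(rules, default_class, x, satisfies):
    """Return the goal class of the first test x passes in the ordered
    list (t1 -> g1), ..., (tk -> gk); otherwise the default class."""
    for test, goal in rules:
        if satisfies(x, test):
            return goal
    return default_class

# Monomes as sets of required literals; x satisfies t iff t is a subset of x.
rules = [(frozenset({"a"}), 1), (frozenset({"b"}), 0)]
label = classify_dl(rules, default_class=1, x=frozenset({"b", "c"}),
                    satisfies=lambda x, t: t <= x)
```

The example `x` fails the first test but passes the second, so the second rule's class is returned; an example passing no test would get the default class.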
These two classes are very much used in learning applications, and have good algorithmic properties. Thus, it is very interesting to extend them by replacing the monomes they contain by more expressive terms. This increases their representational power, yet retains the good algorithmic aspects. This approach has already been studied in detail in [10] for several Description Logics as well as other similar structured knowledge representation formalisms. Here, we propose to do the same with conceptual graphs. Our next result shows under which conditions on the graph sets this extension leads to learnable formula classes.

Theorem 5. Let C(αi) ⊆ CG denote a conceptual graph class. If both the complexity of the projection test between elements of C(αi) and |C(αi)| are polynomial in (αi), then k-C(αi)DNF and k-C(αi)DL are PAC learnable.

Proof. Again, our proof is constructive. Let k-C(αi) denote the subset of C(αi) containing only graphs of size no greater than k. Replacing n by k in the size of C(αi) yields:
|k-C(αi)| ≤ (α1α2α3)^(k³)    (4)
We first calculate the number of disjunctions and of decision lists of graphs in k-C(αi), then produce identification algorithms for k-C(αi)DNF and k-C(αi)DL. Any element of k-C(αi)DNF is built in the following way: all the graphs in k-C(αi) are examined one by one and each is either added to the formula or left out. Thus, the number of disjunctions is equal to 2^|k-C(αi)|. And consequently:

log(|k-C(αi)DNF|) = |k-C(αi)| ≤ (α1α2α3)^(k³)    (5)
Similarly, any element of k-C(αi)DL is built as follows: all the graphs in k-C(αi) are examined one by one and each is either added to the formula associated to class 0, or added to the formula associated to class 1, or left out. Also, the order of the elements in the list is important, so all graph selection orders must be examined. So the number of decision lists is equal to:

|k-C(αi)DL| = 3^|k-C(αi)| · (|k-C(αi)|)!    (6)

And consequently:

log(|k-C(αi)DL|) = O(|k-C(αi)| log|k-C(αi)|) = O(k³ (α1α2α3)^(k³))    (7)
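The bound of equation (7) follows from a standard counting step; writing N for the number of available graphs, a sketch of the derivation (using log N! ≤ N log N) is:

```latex
\log\bigl(3^{N} \cdot N!\bigr) = N\log 3 + \log N!
  \;\le\; N\log 3 + N\log N \;=\; O(N\log N),
\qquad N = |k\text{-}C_{(\alpha_i)}| \le (\alpha_1\alpha_2\alpha_3)^{k^3}.
```

Since log N is itself O(k³), the whole quantity remains of the same order as the class-size bound, which is what the polynomial (code) size criterion requires.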
The identification algorithms for the two classes we are interested in follow. These two greedy algorithms are based on the simple observation that any concept consistent with a sample set is necessarily consistent with any of its subsets. Therefore, both consist in selecting graphs which are consistent with a subset of the sample, adding each graph to the current formula and removing the examples correctly classified. When all the examples have been removed, a consistent formula has been constructed. If at any time this proves impossible, then the inconsistency of the formula has been detected.
Tables 1 and 2 describe the identification algorithm for disjunctions. Every time a graph consistent with a part of the sample is found, it is ORed into the current disjunction. If all possible graphs have been examined and some examples are left unexplained, then no disjunction is consistent with the sample. Indeed, the disjunction being commutative, the order of graph selection is unimportant and the concept class need only be searched once.
BuildDNF
  DNF := MakeEmptyDNF();
  LS := ExamplesSet;
  WHILE NotAllNegative(LS) DO
    CurrentGraph := SearchPositiveGraph(LS);
    LS := LS - ExamplesSatisfiedBy(CurrentGraph);
    DNF := DNF + CurrentGraph;
  END
  Return DNF;
END
Table 1. Pseudocode for building a CG disjunction.
SearchPositiveGraph(LS)
  CGraph := Make extensive search of a CG satisfied by positive examples;
  Return CGraph;
END
Table 2. Pseudocode for extensive search of a CG consistent with positive examples.
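A runnable rendering of Tables 1 and 2 is sketched below. It is a hedged illustration: the function names and the feature-set stand-in for graphs (g projects into x iff g ⊆ x) are mine, not the paper's.

```python
def build_dnf(candidate_graphs, positives, negatives, projects_into):
    """Greedy BuildDNF: add any candidate covering some remaining positives
    and no negatives; report failure (None) if positives stay uncovered."""
    remaining = list(positives)
    dnf = []
    for g in candidate_graphs:  # the class need only be searched once
        if not remaining:
            break
        covers_some = any(projects_into(g, x) for x in remaining)
        covers_negative = any(projects_into(g, x) for x in negatives)
        if covers_some and not covers_negative:
            dnf.append(g)
            remaining = [x for x in remaining if not projects_into(g, x)]
    return dnf if not remaining else None
```

Each selected graph plays the role of one disjunct: an example is classified positive when at least one graph in the returned list projects into it.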
It is obvious that for both classes, if |k-C(αi)| is polynomial in αi and the projection test is also polynomial in these parameters, then the two above algorithms are as well. Two classes to which this result applies are Conceptual Trees [17] and locally injective Conceptual Graphs [14]. Although their expressivity is not as good as unconstrained conceptual graphs, their combination in more complex formulas yields very interesting concept classes for learning. Disjunctive Normal Form formulas are thought to be natural knowledge representations for the human brain [25]. Decision Lists are also very easy to interpret because they are lists of easily understandable rules. Besides, they are among the most general known PAC learnable classes. The extensions defined above are likely to prove very efficient and interpretable in learning applications.
BuildDL
  DL := MakeEmptyDL();
  LS := ExamplesSet;
  WHILE NotOnlyOneClass(LS) DO
    CurrentGraph := SearchGraph(LS);
    LS := LS - ExamplesSatisfiedBy(CurrentGraph);
    DL := DL + CurrentGraph;
  END
  AddDefaultClass(LS);
  Return DL;
END

Table 3. Pseudocode for building a CG decision list.
SearchGraph(LS)
  CGraph := Make extensive search of a CG satisfied by examples from only one class;
  Return CGraph;
END
Table 4. Pseudocode for extensive search of a CG.
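Similarly, Tables 3 and 4 admit a small runnable sketch. Again the names and the feature-set stand-in for graphs are invented for illustration:

```python
def build_dl(candidate_graphs, sample, projects_into):
    """Greedy BuildDL: repeatedly find a graph whose covered remaining
    examples all share one class, emit it as a rule, and drop those
    examples; the class of the leftovers becomes the default."""
    remaining = list(sample)  # (example, label) pairs
    rules = []
    while len({label for _, label in remaining}) > 1:
        for g in candidate_graphs:
            covered = [y for x, y in remaining if projects_into(g, x)]
            if covered and len(set(covered)) == 1:
                rules.append((g, covered[0]))
                remaining = [(x, y) for x, y in remaining
                             if not projects_into(g, x)]
                break
        else:
            return None  # inconsistency detected: no suitable graph remains
    default = remaining[0][1] if remaining else 0
    return rules, default
```

The returned pair is an ordered rule list plus a default class, matching the decision-list form (t1 → g1), ..., (tk → gk), gk+1 defined earlier.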
5 Conclusion
In this paper, we have studied the practical learnability of classes of conceptual graphs. We first presented a formal learnability model derived from the famous PAC learning framework, which we adapted to make it suitable for the analysis of graphs. This is our main contribution because it paves the way for other studies. In particular, we have shown that conceptual graphs in their general form [3] do not allow PAC learning, under widely accepted complexity assumptions. This negative result being linked to the NP-completeness of the projection test, we then studied the possibility of learning classes of graphs for which basic algorithmic operations are polynomial, and showed that results become positive in this case. Finally, we have shown that graphs of limited size allow efficient learning not only as isolated concepts but as basic building blocks of more elaborate formulae which extend well-known boolean concept classes. This type of extension of existing boolean formula classes has already been formally studied in [10], where various Description Logics were shown to be adequate extension languages. Here the results differ in that restricted conceptual graphs play this role. It would be interesting to compare the expressivity of the resulting classes in both cases. Another interesting extension would be to examine learning graph rules, such as those presented in [20]. This could not be done in this paper because a (polynomial) rule generation mechanism from example graphs would be needed to build
an identification algorithm. However, the structure of these rules being an extension of function-free Horn clauses quite similar to the extension of boolean formulae defined above, it is quite likely that positive learnability results could also be obtained (with the same size limitation on graphs) by using some inference techniques of Inductive Logic Programming. Finally, the various PAC results presented in this work are just early learning results and are only meant to pave the way to more extensive studies. It should be noted that the PAC framework has become a benchmark model over the years, but is by no means the only interesting one. In fact, it makes severe assumptions which may bias learning results [11]. Thus, it would be most interesting to study the learnability of conceptual graphs in newer models. In particular, U-learnability [16] comes to mind. Indeed, two of PAC's drawbacks are its distributional assumptions and the worst-case analysis which makes a polynomial projection mandatory. This last point is a major handicap since in practice the NP-completeness of the projection test is never a handicap and efficient algorithms have been incorporated in most applied systems.
References

1. A. Blumer, A. Ehrenfeucht, D. Haussler, and M. K. Warmuth. Learnability and the Vapnik-Chervonenkis dimension. J. ACM, pages 929-965, 1989.
2. R.J. Brachman and J. Schmolze. An overview of the KL-ONE knowledge representation system. Cognitive Science, 9:171-216, 1985.
3. M. Chein and M.L. Mugnier. Conceptual graphs: Fundamental notions. Revue d'Intelligence Artificielle, pages 365-406, 1992.
4. W. W. Cohen. PAC-learning non-determinate clauses. In Proc. of AAAI-94, pages 676-681, 1994.
5. W. W. Cohen and H. Hirsh. The learnability of Description Logic with equality constraints. Machine Learning, pages 169-199, 1994.
6. S. Dzeroski, S. Muggleton, and S. Russell. PAC-learning of determinate logic programs. In Proc. of the 5th International Conference on Computational Learning Theory, pages 128-137, 1992.
7. D. Genest. Document retrieval: An approach based on conceptual graphs. Rapport de Recherche LIRMM No 97296, 1998.
8. E.M. Gold. Language identification in the limit. Information and Control, 10:447-474, 1967.
9. D. Haussler. Learning conjunctive concepts in structural domains. Machine Learning, 4:7-40, 1989.
10. P. Jappy and O. Gascuel. On the computational hardness of learning from structured symbolic data. In Proceedings of the 6th International Conference on Ordinal and Symbolic Data Analysis, OSDA95, pages 128-143, 1995.
11. P. Jappy, R. Nock, and O. Gascuel. Negative robust learning results for Horn clause programs. In Proc. of the 13th International Conference on Machine Learning, 1996.
12. J.U. Kietz. Some lower bounds for the computational complexity of inductive logic programming. In European Conference on Machine Learning, ECML'93, pages 115-123, 1993.
13. R.K. Lindsay, B.G. Buchanan, E.A. Feigenbaum, and J. Lederberg. DENDRAL: a case study of the first expert system for scientific hypothesis formation. Artificial Intelligence, 61:209-261, 1993.
14. M. Liquière. Apprentissage à partir d'objets structurés. Conception et Réalisation. PhD thesis, Université de Montpellier II, 1990.
15. S.H. Muggleton. Inductive Logic Programming. Academic Press, New York, 1992.
16. S.H. Muggleton. Bayesian inductive logic programming. In COLT'94, pages 3-11, 1994.
17. M.L. Mugnier and M. Chein. Polynomial algorithms for projection and matching. In Proc. of the 7th Workshop on Conceptual Structures, pages 68-76, 1992.
18. R.H. Richens. Preprogramming for mechanical translation. Mechanical Translation, 3, 1956.
19. R.L. Rivest. Learning decision lists. Machine Learning, pages 229-246, 1987.
20. E. Salvat and M.L. Mugnier. Sound and complete forward and backward chaining of graph rules. In Proceedings of the International Conference on Conceptual Structures, ICCS'96, pages 248-262, 1996.
21. R. Schapire. The strength of weak learning. Machine Learning, 5(2), 1990.
22. E.Y. Shapiro. Algorithmic Program Debugging. Academic Press, New York, 1983.
23. J.F. Sowa. Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, 1984.
24. L. G. Valiant. A theory of the learnable. Communications of the ACM, pages 1134-1142, 1984.
25. L. G. Valiant. Learning disjunctions of conjunctions. In Proc. of the 9th IJCAI, pages 560-566, 1985.
Procedural Renunciation and the Semi-automatic Trap

Graham A. Mann
Artificial Intelligence Laboratory, School of Computer Science & Engineering, University of New South Wales, Sydney, NSW 2052, Australia. [email protected]

Abstract. This paper addresses two contemporary issues which could threaten the usefulness of conceptual graphs and their widespread acceptance as a knowledge representation. The first concerns the recent debate over the place of actors in the formalism. After briefly summarising arguments on both sides, I take the position that actors should be retained, and marshal four supporting arguments. An example shows that (slightly enhanced) actor nodes can greatly simplify the delivery of external control signals, without excessively complicating the denotation of the graphs they contain. The second issue concerns an epistemological problem which I have called the semi-automatic trap. This is our tendency to continue constructing systems of logic that depend on human involvement beyond necessity, to the point at which such involvement is impractical, unscaleable and theoretically problematic. Two important escape routes from the semi-automatic trap are pointed out, involving more emphasis on automatic graph construction from primitive data, and on automatic interpretation of conceptual graphs. Practical methods for both are suggested as ways forward for the community.
1 Introduction
Recently, the conceptual graphs (CG) community briefly debated the role of, and suitability for inclusion of, actors, the angle-bracketed procedural nodes outlined in the original theory, within the newly emerging ANSI standard. The exchange took place over a few weeks in the electronic forum of the CG mailing list. On the pro-elimination side, it was argued that actors are unnecessary, in that existing elements of CG theory can effectively be used to do all that is claimed for actors without further complicating the standard; and that they are formally problematic, syntactically because their linear form symbols already have assignments, and semantically because their denotation is at best awkward and at worst impossible to specify. Putative mechanisms for enabling naturally stepwise processes to be handled by CG systems without actors were discussed. Those arguing for retention held that these alternatives lead to a different set of problems, and that explicit tokens of procedural knowledge are intuitive to use and strategically prudent, since industrial developers want them as connection points to their chosen programming language. Actors were claimed to provide needed ways of capturing concepts with extensions that are unbounded sets, changing the values of referents, and sending and receiving messages from outside the CG system. The outcome of this exchange was a kind of compromise, articulated by Sowa, in which actors are viewed as calls in a functional language, with a distinction drawn between "impure" and "pure" actors, meaning whether or not the function involved left side effects. Pure actors are much like relations and so could be expressed using the round-bracket or circle syntax, but would be defined by the named function, not the relational catalogue. Impure actors could pass control signals back and forth to concepts in the graphs to which they are connected, and may have irreversible effects outside the
system. Actors would be included in the forthcoming CGIF draft, but only the pure variety. In order to include impure actors, provision for actor-tokens in the referent field of a concept would need to be made, and a metalanguage statement amounting to a declaration specifying the possible inputs and outputs would need to be provided somehow; these were left as tasks for the future. In the first part of this paper, I will marshal a number of arguments which further substantiate the role of procedural tokens as an indispensable part of any knowledge formalism, and make the case for a stronger role for actors in both the theory and practice of CGs. The second part of this work is both a cautionary note about a risk to our efforts which is so pervasive that it is difficult to see clearly, and a response to what I see as our community's curious ambivalence about tools and implemented systems. On the one hand, we all recognise the need for standardised tools, and admire good applications of CG technology. Yet until only last year, papers about these subjects tended to be excluded from the main proceedings of our conferences, while discussions about such things were practically relegated to separate events. As funding for public sector research in many countries is being reduced, industrial sponsorship becomes more important than ever - yet for the most part our attention seems to be directed elsewhere, on minutiae. This is not a suggestion that pure research on this topic be abandoned, but only that striving toward a mathematical ideal not blind us to what will really make our efforts count in the long run - the widespread adoption of CGs in the knowledge-based societies of tomorrow. I will argue that these conflicts stem primarily from different views we hold about what CGs are to be used for. Some of us are interested in the notation itself and its formal properties.
Others are applying the formalism and the algorithms implementing its denotation to make new kinds of databases, retrieval mechanisms and other information technologies. Still others are chiefly interested in knowledge representation and automated reasoning for artificial intelligence. Often the differences between these groups can be ignored. For the purposes of this exposition I will exaggerate the differences between these three professional viewpoints - the logician, the information-system builder and the intelligent-system builder - and show how each is affected by what I call the semi-automatic trap. This is our tendency to continue constructing systems of logic that depend on human involvement, beyond necessity into realms where such involvement is impractical, inefficient and theoretically problematic. Partly following from traditional Platonic epistemology (Rationalism) and its modern extensions, and partly a consequence of the seduction of the computer, the semi-automatic trap has ensnared the CG and other symbolic approaches to the study of artificial reasoning. After identifying the main features and implications of the semi-automatic trap, I point out two important escape routes, both involving full, rather than partial, automation. First, the emerging interest in acquisition of conceptual structures from simpler symbolic constituents can be viewed as the beginnings of a reaction to the prevailing inward-orientation of CG work. A sketch is offered of an experiment in which conceptual knowledge is collected from more primitive grades of data. Second, the more difficult issue of interpretation of conceptual structures by automatic means will be briefly considered. It is suggested that the label fields of nodes be made switchable, so that the developer can toggle between mnemonic labels and meaningless but systematic labels.
A recommended principle to guide progress toward automatic interpretation - that we begin the transition from truth-preserving algorithms to plausibility-preserving heuristics - is illustrated with a simple example. Except for one instance, the practical demands imposed by the development of commercial systems will be seen to exert a healthy influence on these issues.
2 The Role of Actors
Let us begin with the role of actors in CG theory. In the debate mentioned above, writers on both sides commented that actors tended to be left out of many conceptual graphs appearing in published papers (except those used in commercial systems). The idea that actors are somehow unnecessary is a contemporary instance of the current ascendancy of declarative accounts of knowledge over procedural accounts. This orthodoxy holds that declarative formalisms (including programming languages, knowledge bases and models of brain function) are cleaner, more economical and more transparent than procedural versions. While many epistemologies pay lip service to the need for a balance between the two, the models they lead to rarely seem to intermix them. Procedures are often portrayed as outdated, of limited portability, and resistant to learning or explanation. They are sometimes associated with bad programming habits in old-fashioned languages, hacking, and scruffy methodology. Some attempts to incorporate procedural elements in knowledge models began to be seen in the 80s in the KL-ONE modifications KRYPTON [1] and BACK [12], OMEGA [5] and CycL [8]. Original CG theory [14], also a product of the 80s, showed foresight in permitting actors to influence and be influenced by concepts, even if the examples given were simple mathematical functions, which tended to foster the view that they are nothing more than a special kind of relation. At the end of this section, I will show how actors may be transformed into parameterised procedural calls in a way which has less to do with relations. Good reasons why procedures should be "tolerated" in CG epistemology were recently advanced by Rochowiak [13]. Beginning from a classical philosophical definition of knowledge as justified true belief, he argues that declarative and procedural forms can be distinguished in each of these three definitional terms.
The relationship between the two is complex, at least in the case of human knowledge: they share concepts, interact in complex ways (see [6]), and Rochowiak speaks of a "natural history" of (scientific) knowledge in which, if unarticulated skills are sufficiently important to continued scientific enterprise, they are promoted through the heuristic level to a "context-free" level which is distinct from, yet dependent on, the prior procedural form. As usual, Charles Peirce seems to have beaten us all to the conclusion that knowledge must combine the declarative and procedural. As Rochowiak points out, Peirce was strongly committed to a view of concept meaning which fused the formal and practical. This commitment can be seen in his famous pragmatic maxim, one expression of which was: "Pragmatism is the principle that every theoretical judgement expressible in a sentence in the indicative mood is a confused form of thought whose only meaning, if it has any, lies in its tendency to enforce a corresponding practical maxim expressible as a conditional having its apodosis in the imperative mood." [3: 5.18] Using the vocabulary available to him, Peirce was suggesting that conceptual knowledge be thought of as rule-like pairs, with an imperative apodosis (consequent), or what today would be a procedure in the rule's then-clause. This is the essence of today's expert systems. Of course, rules may have simple assertions or variable assignments in their then-clauses. But Peirce continually used the word "practical" in his definitions of the maxim, which means that he understood the centrality of active outcomes, in the broadest sense of that word. Furthermore, even a cursory examination of Peirce's work reveals the importance he attached to the idea of knowledge as a dynamic, evolving entity which emerges from and is critiqued by social processes in a community. His ambition of creating a process by
which knowledge can be refined by successive elaborations of a publicly-readable graphical expression by members of a scientific community would, in the modern context, be a call to negotiate knowledge between the contributions of multiple agents. His pragmatism was an attempt to overcome, or at least make tractable, the problems of logic theory, which in the words of the Peircean scholar Ernest Nagel "...derive almost entirely from isolating knowledge from the procedures leading up to it, so that it becomes logically impossible to obtain." [2]. These deliberations suggest that the procedural aspects of knowledge should not be factored out. Perhaps this idea has not been realised for historical reasons. In the late 1950s, computer science theory bifurcated into traditional stepwise control methods for a state machine, emerging directly from von Neumann's work [17], on the one hand, and mathematical function-based models, on which for example McCarthy based his Lisp [9], on the other. At each turn, as simple data structures and control metaphors gave way to more sophisticated schemes, the declarative was presented as the more refined, explicit, portable and formally backed. As notions of conceptual knowledge as logic developed, this backing became connected to the solidity of truth-conditional logic. Languages based on this idea could have very simple and elegant interpreters, since they essentially only needed to return truth values for the expressions they evaluated. (Of course, practical functional languages do concede a need for incremental changes of state to get things done, but only reluctantly, as signified by the disdainful term "side-effects".) This simplicity is appealing enough to help perpetuate the erroneous notion that what is important or interesting about an expression in a knowledge language can always be captured by mapping it to a Boolean value.
I have argued elsewhere [10] that for some purposes, truth-conditionality is simply not adequate as an answer to the question: "what does this expression mean?". Though conceptual graph theory per se requires no specific commitment to a particular semantics, much of the existing CG literature favours the truth-conditional variety, possibly in respect of its classical logic roots. But to the extent that conceptual graph theory has taken up the tradition, it has also inherited nagging questions about truth-conditionality. Is the truth of a statement always the only aspect of interest? Under what conditions, if any, is it meaningful to assign a truth value to a statement about the future? To an imperative statement corresponding to the sentence "Go down these stairs."? Will the truth value be the same before and after the interpreter complies with the act? What relationship do truth-conditions have with the intentions of the speaker, the intentions of the hearer, or the hearer's active response? Now conceptual graphs can be and have been used with other, more progressive semantics, including situation semantics [15] and four-pole semantics [10]. The point here is that as long as the meaning of a conceptual graph is thought of as only a single Boolean variable, it is easy to ignore the need, in many cases, for an ordered sequence of active steps - a procedure - to be the substance of the graph's interpretation. So if we wish to define the pragmatics of conceptual graphs in terms of conceptual graphs themselves, we need to be able to explicitly encode procedures in the CG formalism. In one sense, this is really no great imposition: procedures of one kind or another are an integral part of CG theory, e.g. the canonical formation rules. The real questions are, what formalisms should be used to encode these, and what expressions should they have inside graphs?
Why would we want to make the contents of procedures (as opposed to only explicit tokens of procedures) explicit within CGs? The principal reason is that changes to declarative aspects of graphs during reasoning can affect procedures, and changes in the status, or progress, of procedures can then influence reasoning, by virtue of their common elements. But this could also be viewed as a problem, for two reasons: because we do not have, and may not wish to introduce, a suitable language into the CG
formalism itself, and because it would clutter up otherwise clearly understandable graphs. Without actors betokening procedures, definitions of ACT concepts - at least, those expressed in conceptual graphs - will in a sense always be empty. Imagine by how far the expansion of the concept SAMBA would miss the mark if it could not at any point provide access to some kind of (simulated or actual) sequence of motions. One might expect the definition-graph to classify the concept, declaratively, as a subtype of LATIN-DANCE, and perhaps add, also declaratively, salient features such as modernity and tempo. But such a definition could never be wholly satisfactory as a surrogate without some representation which could replay the pattern of steps in time so as to demonstrate, or facilitate recognition of, the actual dance itself. Dictionaries attempt to capture the meaning of the word "samba" without recourse to such replays, but this is one of the reasons that machine-readable forms of conventional dictionaries are, by themselves, inadequate as a basis for commonsense knowledge. And movies or simulations of acts are one of the things we want to provide in a modern multimedia dictionary. An alternative to explicit actor nodes has been proposed in Sowa's object-oriented car-starting example [16]. External processes, in this case the ignition of a car's engine, could be provably initiated from a conceptual graph built on an object-oriented software model. Messages could be sent to and from objects that appeared on the graph as concept boxes labelled with process names, and to other process-objects, not shown on the graph, presumably more closely connected with the physical engine. Ellis [4] subsequently criticised this design as awkward, but his own simpler model still encounters the basic difficulty of trying to describe declaratively what is essentially a procedure, and it still seems to miss the point.
Concepts and relations should not be used as surrogates for active processes; they are statements of existence relative to an ontology. A more pragmatic question lurks beyond these complaints: even if we were satisfied with this kind of solution, do we really want our knowledge formalism to commit the user to a particular computational model? Given the popularity of the object-oriented programming approach, perhaps it is not a serious disadvantage - but it seems onerous to demand that a CG developer sign up with the object-oriented paradigm when all that might be needed is a simple escape to the code level. It is not difficult to imagine design scenarios in which this tips the balance against the adoption of CGs. Figure 1 shows how actors could be used to produce a natural model of a car without appealing to a particular computational paradigm or language. The referent of the concept MAX-DUR, *d, is functionally related to the referents of CAPACITY and FUEL-CONS by the "pure" actor DIVI, which divides the capacity of the tank, *c, by the rate of fuel consumption, *f. In the active interpretation of this, the output of DIVI is placed in the referent field of MAX-DUR. Execution of the procedure can be either data-directed, in which processing begins with values in the input nodes, or goal-directed, in which processing begins with a requested value at the output node and propagates backwards to the inputs. These processes are initiated by the assertion mark ! and the request mark ?, respectively. The "impure" actor START is not relation-like, but can be read as a state-changer. It is a procedure which can be initiated by a control mark in the ENGINE concept. The existence of a particular key is a precondition for the procedure, and only when the referent *k is bound to PCX999 will the precondition be satisfied. Before the signal to start arrives, the first placeholder acts like a concept which needs to be restricted to the value KEY:PCX999.
The ignition process creates side-effects, including rotation, heat and noise in the outside world; these may be measured by sensors and reported back to the CG system. At the end of the START process, a tachometer reports a number of
revolutions per minute, *r, back via the second placeholder, which is then data-driven to the referent field of RUNNING with the prefix @.

[Figure 1. Alternative model of a car. The graph links the concepts AUTOMOBILE, ENGINE, TANK, RUNNING:@*r, MAX-SPEED:*s and FUEL-CONS:*f through actor nodes.]

Syntactically, the actors may be represented using a LISP-like functional notation in which the list delimiters are the < and > angle brackets, the first element is a procedure name, the last element is a placeholder for the returned value, and all other elements are an ordered list of input parameters.
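The angle-bracket notation is concrete enough to sketch in code. The following Python fragment is a minimal illustration, not part of the original theory: the placeholder convention and the data-directed evaluation strategy come from the text, while the concrete expression `<DIVI *c *f *d>`, the parsing details and the registry of procedures are my own assumptions.

```python
# Hedged sketch: parse the LISP-like actor notation "<NAME in1 in2 ... out>"
# described in the text, and run a data-directed evaluation in which values
# supplied for the input placeholders propagate to the output placeholder.
# The procedure registry and the API are illustrative assumptions.

PROCEDURES = {
    "DIVI": lambda a, b: a / b,   # the "pure" divider actor from Figure 1
}

def parse_actor(expr):
    """'<DIVI *c *f *d>' -> ('DIVI', ['*c', '*f'], '*d')"""
    tokens = expr.strip("<> \t").split()
    name, *params = tokens
    return name, params[:-1], params[-1]

def fire(expr, referents):
    """Data-directed firing: if every input referent is bound,
    compute the output and place it in the output referent field."""
    name, inputs, output = parse_actor(expr)
    if all(referents.get(i) is not None for i in inputs):
        referents[output] = PROCEDURES[name](*(referents[i] for i in inputs))
    return referents

# CAPACITY *c = 60 litres, FUEL-CONS *f = 5 litres/hour
refs = fire("<DIVI *c *f *d>", {"*c": 60.0, "*f": 5.0})
print(refs["*d"])  # MAX-DUR *d = 12.0 hours
```

A goal-directed variant, initiated by the request mark ?, would run the same machinery backwards from a requested output to the inputs.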
since any variable must be universally quantified. This means that actors could appear in first-order logic expressions without interfering with their quantifications. Within the predicate for the concept connected to the output of the actor, φ places a second element: a function with the input variables as arguments. Thus the DIVI actor in Figure 1 would appear as ... ∧ MAX-DUR(d, DIVI(c, f)) ∧ ... If desired, the output variable *d could be equated to the function elsewhere in the expression. The above arrangement is simple to understand and allows the graph's declarative nodes to interact with the actors. Suppose it was learned that the engine of the car has been replaced with a different engine. When the graph is updated to reflect this, the new instance will have a new characteristic fuel-consumption. Because of the actor connections, the maximum duration and range will automatically alter to reflect this change; yet it is difficult to imagine natural relations which could be used to link FUEL-CONS with those two concepts. From a goal-oriented perspective, if an engine was stationary and the goal was a non-zero speed, this could trigger a search for a key, which also could not be done naturally with a relation. Notice, however, that only the actor tokens can influence and be influenced by the declarative nodes: the elements of the procedure itself are beyond reach. For this reason, we may still wish for code whose statements appear explicitly inside a graph, perhaps as chains of linked actors representing the subroutines of complex actions. In human beings, only our highest-level plans are available to our reflection; lower-level acts, those which have become automatised, as psychologists say, are not.
3 The Semi-automatic Trap

Let us turn now to another risk facing our community: the semi-automatic trap. This is our tendency to keep building systems of logical symbols that depend at least partially on human involvement for their creation, manipulation and interpretation. That is a serious problem for builders of intelligent systems, and to a lesser degree for information-system builders and logicians. The semi-automatic trap is best understood in its historical perspective. Logic has a long history of being done by humans on paper. The symbols which are used in a typical logic have a long aetiology. Perhaps they include Greek letters, betraying their Platonic ancestry. Essentially, these symbols are passive, mnemonic devices, designed to signify objects, classes or operations by triggering the associated ideas inside human brains. They are created by human beings for human manipulation and human interpretation. For most of their history, and in most of their applications, this role for logical symbols as manual tools has been uncontroversial and unproblematic. The decades-long revolution which has placed a computer on every desk offered rich opportunities to practitioners of logic, builders of information systems and artificial intelligence researchers, among countless others. Yet the advent of computers can now be seen, with the benefit of hindsight, to have loaded the semi-automatic trap, ready to be sprung. In their ubiquitous incarnation as semi-automatic machines - that is, devices which accept symbolic input from users via a keyboard, process it, and display symbolic output for human eyes via a screen - desktop (and laptop and palmtop) computers have human dependence built in as a basic affordance. For the greater part, the software which has been developed on such machines also tends to depend on human intervention via this interface. This too seems natural enough.
But this built-in, natural dependence can work against us when we try to use those machines and logic symbols to build symbolic structures which can behave like human conceptual knowledge, for the following reasons. First, the culture of logical mnemonics carries certain unstated assumptions with it. For example, although a symbol could in principle have an extensional grain size of anything from a highly specific and nameless microfeature to a broad metaphysical category, the symbols used in practice tend to cluster around a narrow range of possible grains associated with convenient words used for their labels. The fact that concepts have to be named could push hand-built ontologies away from the microfeatural to a coarser resolution. Second, mnemonic symbols can easily be parasitic for their true meaning on the human viewer; that is, they could merely draw on associations, inference mechanisms and the like in the head of a developer, or user, instead of supplying them itself. It is surprising how much this blinds us to the need for an interpretative component inside the system. Third, the tradition of logic still leads us to think of conceptual knowledge in terms of verifying propositions, and to suppose that truth-conditional semantics might be sufficient as a denotation for any conceptual structures. This again leads to the neglect of active procedural output as part of meaning, as discussed in Section 2. Let me make the problem clearer by taking it to extremes. Consider for a moment the fact that natural existence proofs of intelligence - humans and (more problematically) animals - have no screens and keyboards. If, as we wish to maintain, symbolic conceptual structures exist within their brains, then keyboards and screens cannot be a necessary condition for intelligence. But think of the disservice that would be done to most intelligent machines - and surely every CG program - if the screen and keyboard were removed.
This will be my basic test for a system caught in the semi-automatic trap - will it continue to function and be useful with the human-dependent I/O devices disconnected? The reader may well object that this test is uncharitable. After all, animals (including humans) and computers are really very different. Animals arrive with a genetic legacy, which might include innate conceptual structures and the machinery to support them. Computers are general-purpose machines to which any structure must be supplied - and this is ultimately a screen and keyboard job. To make a fair comparison, then, the machine does need a screen and keyboard. Very well - let the original test be modified as follows: the builder may stand in for Nature, supplying any programs and data that the system needs, but during the design phase only. Once the program is finished (born), the screen and keyboard must come off. The test still seems absurd because we have not yet exhausted the differences between animals and computers. Animals have sense organs and bodies with actuators, and these serve as I/O channels to the world. But - and this is important to full understanding of the issue - those channels do not depend on manipulation or scrutiny on the part of an external human, as screens and keyboards do. Even in a human being, they do not: they serve a human brain, and could be influenced by or influence an external human in various ways, but they do not depend on it. Note that if the computer has other I/O devices with this property, such as cameras, microphones, sound generators and motors, these must remain connected during our test. We only wish to disable the machine in a specific way, so that it reveals its autonomy, or lack of it. Am I serious when I demand that CG-based devices be useful without their screen and keyboard? Not entirely.
To disqualify screens and keyboards on the grounds that they permit a kind of epistemological cheating is to oversimplify, because it ignores the fact that these devices can also be used to stand in for more complicated, and possibly less useful, "natural" I/O such as speech and hearing. We may be very satisfied, for example, with a natural language program that communicates using ASCII text,
provided we stay vigilant against other uses of the screen and keyboard which commit humans to too much of the wrong kind of involvement. This is the kind that does not relate to natural communication with the functioning program and would not be possible if the program were a person.1 Furthermore, it oversimplifies because some purposes of logicians and information-system builders may correctly demand no more than semi-automation. A successful logic visualiser might be essentially a tool for human work, like a calculator or spreadsheet. In these cases, it could be argued, the semi-automatic trap is really no trap at all. Even here, though, much of the motivation of tool developers lies in the potential they see in computerising conceptual structures to free human users of the need to carry out the time-consuming, repetitive or difficult parts of logic work - and this means automation. Of course, individual systems might pass the test, with their designers successfully avoiding the seductive powers of the screen and keyboard, and using their computers to try to write truly automatic programs that collect their own data, form their own goals, build their own representations and draw their own conclusions for action. The trouble is that we can seem to be building systems capable of human-like conceptual processing, when what we are really doing is only building systems that help people to do so. Disguising this mistake is the fact that, at least in information-retrieval systems, even a semi-automatic program could be quite useful. In fact, some information-system builders argue that a semi-automatic "intelligence amplification" (IA) approach is the best we can hope for at present, or even that this is better than AI altogether, since it preserves a human role in these affairs.
But builders of autonomous agents, intelligent reasoning systems and true (that is, non-teleoperated) robots must avoid the semi-automatic trap to build automatic machines, because these devices must function independently. These builders cannot get away with thinking that representations are all there is to knowledge. Again, we can visualise this in terms of the ability of such devices to operate without a keyboard and screen plugged in, even if they were needed to start the programs. A final barb that may hold us in the semi-automatic trap is that of power. I mean this in a more specific sense than notions of technological dominance or "knowledge is power". If responsibility and power are opposite sides of the same coin, it follows that human involvement in the knowledge process grants us power over it. In the artificial realm of symbols on a blackboard, the logician presides as god, letting this or that assumption be true, assigning variables, defining axioms and applying stepwise procedures. In the methodology of conceptual graphs, we proclaim the existence of types which divide aspects of reality into meaningfully distinct atoms, forming the basis of an ontology. The authority for this act is that of the knowledge specialist, or domain expert. Such persons naturally occupy positions of power in our technocratic society. And if the ontologies that are derived from their proclamations were in turn to form an obligatory language to which all artistic, scientific, commercial or military participants in a knowledge-mediated discourse have to conform, the focus of power would become very tangible indeed.

1. One might argue that language is an exception, because it is essentially about communications with other human beings, and thus dependent on them. But human speech is based on vocal apparatus which can produce sounds for other purposes, and auditory sensors which are used to detect many kinds of sounds besides speech.
And of course, both can operate meaningfully even when there is no external human present. The same cannot be said of a keyboard and screen.
A number of critics have warned that society could find itself mired in a kind of ideological determinism instrumented by information technology (e.g. [7]). Particularly applicable here is the risk of opportunism on the part of an organisation or company that establishes a comprehensive ontology defining all the concepts, relations, contexts and actors for an industry as a bid for control of that industry. If one powerful organisation controlled these elements, it could be difficult or impossible for outsiders to get new ideas recognised, or to communicate ideas which fell outside those terms of reference. The concern is that the organisation could disguise its attempt to monopolise the discourse by claiming that it was simply establishing a useful knowledge standard. But engineering standards for knowledge interchange should not be permitted to lead towards standardised knowledge content. Our community should be on its guard about this, so that our innovations do not become yet another pillar of inequality. If, as we oversee the construction of large, shareable ontologies, we are to avoid the twin evils of hard labour and ideological monopoly, we must become willing to take our hands off the levers of power to some degree. Since humans and their institutions tend to seek to consolidate power, not relinquish it, this barb can be expected to be the sharpest of all.
4 Fully Automated Acquisition

In the CG community, we tend to neglect knowledge acquisition. In a set of 148 papers from ICCS meetings of the last five years, only 17 mentioned knowledge acquisition at all, and fewer discussed the topic in any detail. Concepts and relations appear in the catalogues of practically all CG systems as a matter of design-time proclamation, which is to say, human judgement. Neglecting other means of gathering knowledge tends to lock in proclamation as the only method of establishing knowledge using CGs. That may discourage experienced knowledge engineers looking for improved technologies for their craft, who know that manual knowledgebase creation and maintenance is potentially so labour-intensive that it may make their systems unacceptably costly. Perhaps CG developers are beginning to see the cruciality of this issue: whereas only 3 of the 56 papers in 1992 concerned this topic, the figure had risen to 9 out of 48 papers by 1997. Now consider what the semi-automatic trap has to teach us about knowledge acquisition. If, as in the above reductio ad absurdum, conceptual graph systems were not permitted to use a screen and keyboard for more than the initial system specification, would knowledge acquisition become impossible? Evidently not, because humans and animals can learn. But, for the sake of argument, how would it change under this restriction? The term "knowledge acquisition", as it is conventionally used, means encoding knowledge which has come from asking human beings (experts). Sometimes it takes on a slightly broader sense, in which the knowledge may come from other sources like models, or books. "Asking a human" sounds like a natural process, available to a person and not likely to be disabled by removing the screen and keyboard from a computer.
But of course, what almost always actually happens is that another person elicits the knowledge from the expert, casts the knowledge into the representation formalism by hand and then types it into the system by screen and keyboard. This would not be possible in a system in which these two devices were disconnected, that is, once the system was deployed. We could, of course, imagine a system in which deployment was delayed until all the knowledge it would ever need was built-in at design time. This would be a model of
totally innate knowledge. New knowledge could be had within the deployed system, but only by derivation from the innate supply. Perhaps, with great foresight on the part of the designer, such a knowledge system would be adequate. By analogy, most commercial programs are sold without their source code, with all the information they need sealed into the executable code. But this strategy seems inherently risky. A system closed off from external change seems to lack an essential flexibility. It is precisely this sort of inflexibility that makes us seek more advanced forms of software than conventional programs. The alternative would be to use I/O devices which are allowed to remain connected (or, in the liberal interpretation of the rule, to use the screen and keyboard for natural exchange only). To do that, the human knowledge elicitor must be replaced. This means creating not only a natural language interface, but also a program to conduct the elicitation process, automatically generating the graphs for expression as questions, and a method for automatically dealing with the conceptual graphs that result from the parsing of the expert's responses. Since I have written elsewhere about parsing and learning by asking [10], I will focus here on this method.

[Figure 2. A teachable CG knowledge machine. The diagram shows a syllabus feeding an interpreter, which maintains the knowledgebases SKB, EKB and PKB.]

Figure 2 describes a CG machine containing three knowledgebases:
• SKB, the semantic knowledgebase, consisting of conceptual, relational and actor hierarchies.
• EKB, the episodic knowledgebase, a sequential list of conceptual graphs, representing the conceptual history of the system.
• PKB, the procedural knowledgebase, containing the source for actors defined in the system. This could be as simple as a set of listings of Lisp functions.
The acquisition process depends on a special teaching language in which a range of operations on these databases may be expressed in simple fashion. Such a language would be something like the Structured English Interface for the Deakin Toolset [11], except that additions and changes to the conceptual hierarchies, as well as the construction of graphs based on them, would be allowed. The procedural knowledgebase would need a theoretically different set of techniques for skill learning, which will not be addressed here. For example, the following sequence of expressions sets up a
situation in which a hungry bear is inside a cave, beginning with no concept of bear or cave:
NEWCON  BEAR < ANIMAL
DEFCON  BEAR (ATTR FURRY ATTR COLOUR:Brown LOC PLACE EATS ANIMAL)
NEWCON  CAVE < LANDMARK
DEFCON  CAVE (CHRC DARK CONT ROCKS CONT BATS)
BEAR1   JOIN (BEAR EXPR HUNGER)
G105    JOIN (CAVE CONT BEAR1)
ASSERT  G105
The "interpreter" (in the specific technical sense of a parser-executor of strings in an artificial language) of Figure 2 chooses appropriate operators such as copy, restrict, or join and decides how to apply these to update the SKB and EKB. In most cases, the opcode of an instruction should be enough to select the correct knowledgebase, since the operations appropriate to hierarchy-building and episodic memory are different. Instructions concerning operations on individual graphs would use local variables to hold the graphs until a command sent the graph to a specified knowledgebase. The course of knowledge to be learned would be introduced in a specially prepared sequence of expressions in this language, called a syllabus. The syllabus would be written by hand, which might at first glance seem to defeat the notion of automating acquisition. However, until we are prepared to construct much more sophisticated perceptual devices (such as a camera which returns conceptual graphs describing objects and events in its field of view), the data which informs the learning process must come from such human-mediated sources. Since advanced raw-data-to-CG converters seem far off, our efforts to reduce human involvement are compromised. It might instead be hoped that the simple teaching language, beginning from a form like that described above, would evolve with experience, so that frequently recurring patterns of operations were eventually chunked up into powerful elements of a higher-level language. Ideally, the high-level form would decouple the content of the material from the operations, allowing syllabuses both to focus primarily on content and to be shorter and easier to write.
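The dispatch idea, where the opcode alone selects the target knowledgebase, can be sketched as follows. This is a speculative miniature, not the actual Deakin Toolset interface: the opcodes and the bear-in-cave syllabus come from the example above, while the data structures and instruction handling are my own assumptions.

```python
# Hedged sketch of the teaching-language interpreter: the opcode of each
# instruction is enough to select the knowledgebase to update (NEWCON and
# DEFCON build the SKB hierarchy; ASSERT sends a graph to the EKB), while
# graph-building instructions hold results in local variables. The data
# structures and instruction shapes are illustrative assumptions.

skb = {"types": {}, "defs": {}}   # semantic KB: type hierarchy + definitions
ekb = []                          # episodic KB: sequential list of graphs
graph_vars = {}                   # local variables holding working graphs

def execute(opcode, arg):
    if opcode == "NEWCON":                  # e.g. "BEAR < ANIMAL"
        sub, sup = (s.strip() for s in arg.split("<"))
        skb["types"][sub] = sup
    elif opcode == "DEFCON":                # e.g. "BEAR (ATTR FURRY ...)"
        name, body = arg.split(" ", 1)
        skb["defs"][name] = body
    elif opcode == "ASSERT":                # send a held graph to the EKB
        ekb.append(graph_vars[arg])
    else:                                   # opcode names a local variable;
        graph_vars[opcode] = arg            # a real system would evaluate the
                                            # JOIN rather than store the text

syllabus = [
    ("NEWCON", "BEAR < ANIMAL"),
    ("DEFCON", "BEAR (ATTR FURRY ATTR COLOUR:Brown LOC PLACE EATS ANIMAL)"),
    ("NEWCON", "CAVE < LANDMARK"),
    ("DEFCON", "CAVE (CHRC DARK CONT ROCKS CONT BATS)"),
    ("BEAR1",  "JOIN (BEAR EXPR HUNGER)"),
    ("G105",   "JOIN (CAVE CONT BEAR1)"),
    ("ASSERT", "G105"),
]
for op, arg in syllabus:
    execute(op, arg)

print(skb["types"])   # {'BEAR': 'ANIMAL', 'CAVE': 'LANDMARK'}
```

The point of the sketch is the routing, not the graph operations: hierarchy-building opcodes never touch the EKB, and episodic opcodes never touch the SKB.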
5 Fully Automated Interpretation

Given that the goal is creating a system for intelligent reasoning and not a tool for manipulating logic, one way to test that a system is not caught in the semi-automatic trap is to systematically replace all the mnemonics in the graphs - the type labels in the concepts, relations, actors and contexts - with "blind" labels such as random combinations of characters. If this disadvantages or disables the run-time system, it means that the symbols in the graphs are parasitic on the meanings in the user's head, and the system is not fully automated. In practice, it would be useful to be able to switch between the arbitrary labels and mnemonic ones, since the mnemonics are legitimate and useful for design and debugging. Being able to turn off the mnemonics at run-time would force attention away from the graphs and onto the human-read/writable forms with which users of the system will deal. Let us assume, for convenience, that the form for users of a hypothetical system is natural language, and use that to explain what is needed to automatically interpret graphs. Section 2 argued that a truth-conditional semantics is not enough. In [10], I
expressed this as the need to move beyond truth-preserving algorithms to plausibility-preserving heuristics. What does this mean? Imagine that the natural language system is asked two questions:
1. "Can a rabbit fly?"
2. "Can you arrange the names in file "customers.txt" alphabetically and send that to printer1?"
To answer 1, a truth-conditional, truth-preserving approach can be tried. Once the pragmatic component of the system has recognised the form as a question, an attempt to join the definitions of RABBIT and FLY can be made, and since selectional constraints in the definition graphs should reject an attempt to fit two incompatible graphs together, an answer of "no", based on whether the join algorithm was successful, can be returned. Perhaps, though, it would be more convivial to return either the successfully joined graph or an error message that reported what had blocked the join. This would avoid the embarrassment of a simple "yes" answer, in the event that the system had deduced that a rabbit was a suitable patient for transport by aeroplane. It is easy to show that to answer 2 properly, a plausibility-preserving, procedural heuristic approach must prevail. First, assume that the only procedure available which could possibly alphabetise the file is a generalised Sort function. The word "arrange" is insufficient to choose an operation, so the system searches all available acts for a suitable match. The way the match is performed is crucial to success here; it must be quite liberal if it is to cope with the many possible ways in which it might properly be summoned. Assuming the Sort function was found to have an optional parameter called "alphabetical", then it might be appropriate to use Sort(alphabetical, customers.txt). We could not know this kind of relationship with the certainty required for a truth-preserving algorithm. A heuristic enforcing only plausibility, on the other hand, could take such a liberty, and thus only it could succeed in this case. Second, in order to avoid the pragmatic howler of returning a yes-or-no answer to 2, the system must actively carry out first the sort and then the print operation. The "and" linking the two is not a Boolean conjunction, but a conditional link ordering two tasks.
Successfully recognising the two clauses as a pipelined print operation and performing it is the interpretation of the question. If unsuccessful, some kind of error message representing an explanation would then be appropriate, as in 1.
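The liberal, plausibility-preserving act matching described above can be sketched as follows. This is an illustrative reconstruction only: the act registry, the synonym lists, and the scoring scheme are hypothetical names introduced for the sketch, not part of the system discussed in the text.

```python
# Hypothetical sketch of plausibility-preserving act matching.
# ACTS, its synonym sets, and the scoring rule are illustrative assumptions.
ACTS = {
    "Sort": {"params": {"alphabetical", "numerical"},
             "synonyms": {"arrange", "order", "rank"}},
    "Print": {"params": {"printer"},
              "synonyms": {"send", "output"}},
}

def match_act(verb, modifiers):
    """Liberally match a verb to a registered act, scoring by synonym
    overlap and by how many modifiers name known parameters."""
    best, best_score = None, 0
    for name, act in ACTS.items():
        score = 0
        if verb == name.lower() or verb in act["synonyms"]:
            score += 2
        score += len(set(modifiers) & act["params"])
        if score > best_score:
            best, best_score = name, score
    return best  # a plausible choice, not a provably correct one

print(match_act("arrange", ["alphabetical"]))
```

The point of the sketch is that "arrange" never names Sort exactly; only a loose, plausibility-based score can bridge the gap, which is exactly the liberty a truth-preserving matcher could not take.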
5 Conclusions
By making provision for actors, CG theory has already prepared the way for progressing beyond the notion that description is all there is to representation. To address the shortcomings of formal knowledge representations which do not recognise the significance of procedural aspects of knowledge, more attention must be paid to both the tokens which mark their presence and to the active processes which substantiate those tokens. These processes cannot be completely divorced from the pragmatics of a CG system. It will not be enough to continue developing algorithms which manipulate CGs, without somehow recognising these explicitly within the graphs themselves. Ideally, the entire codification of an active process would appear in a conceptual graph, but this complicates the notation. At least actor nodes with the same names as coded procedures outside the system should be allowed for.
Those interested in the adoption of this formalism as a knowledge standard must also now progress beyond the notion that representations are all there is to knowledge. To create working knowledge for intelligent systems, it is important not to perpetuate passive symbols designed for human use. This engenders a kind of introspection which is compelling to system builders, and potentially aversive to commercial developers. More seriously, it carries the risk of building too much human involvement into the system. The semi-automatic trap is a gedankenexperiment designed to reveal this risk. By asking how knowledge systems would function without their screens and keyboards, it reminds us that these conduits can sometimes work against the development of true automated reasoning. Two escapes from the semi-automatic trap are briefly discussed: fully automated acquisition and fully automated interpretation. In the case of knowledge acquisition, too much human involvement is problematic because it commits the system builder to large amounts of collection and maintenance work on the knowledge base. It also risks granting a great deal of power to any highly-resourced organisation which is able to manually create a large ontology. It would be preferable to eliminate the human elicitor in knowledge acquisition. Although the suggested teaching experiment does not accomplish this, it may be a step in the right direction. In the case of interpretation, the ease with which humans take on the role of interpreter of symbols makes the human-readability of CGs a double-edged sword. Therefore I suggested that the mnemonic labels inside the nodes be able to be switched off, so that any parasitism of the system may be exposed. Our artificial reasoning systems will be better able to cope with the vagaries of real tasks when they use plausibility-preserving heuristics instead of truth-preserving algorithms.
Truth preservation is important for maintaining the canonicity of true graphs during arbitrary transformations, but it might block sensible but unsound steps in the reasoning process. Such steps could be ubiquitous in commonsense thinking.
References
1. Brachman, R.J. et al. Krypton: A Functional Approach to Knowledge Representation. IEEE Computer, 1983, 16, 10, 67-74.
2. Buchler, J. Charles Peirce's Empiricism. New York: Harcourt, Brace & Co., 1939.
3. Burks, A.W. (Ed.) The Collected Papers of Charles Sanders Peirce. Vol. 5.
4. Ellis, G. Object-oriented Conceptual Graphs. In G. Ellis, R. Levinson, W. Rich and J.F. Sowa (Eds.) Conceptual Structures: Applications, Implementation and Theory. Lecture Notes in AI 954, Springer-Verlag, Berlin, 1995, 114-157.
5. Hewitt, C. et al. Knowledge Embedding in the Description System Omega. Proceedings of the First National Conference on Artificial Intelligence, Stanford, CA, 1980, 157-164.
6. Hiebert, J. Conceptual and Procedural Knowledge in Mathematics: An Introductory Analysis. In J. Hiebert (Ed.) Conceptual and Procedural Knowledge: The Case of Mathematics. Hillsdale, NJ: Lawrence Erlbaum Assoc., 1986, pp. 1-27.
7. Lacroix, G. Technical Domination and Techniques of Domination in the New Bureaucratic Processes. In L. Yngstrom et al. (Eds.) Can Information Technology Result in Benevolent Bureaucracies? The Netherlands: Elsevier Science Publishing Co., 1985, 173-178.
8. Lenat, D. et al. CYC: Towards Programs with Common Sense. Communications of the ACM, 1990, 33, 30-49.
9. McCarthy, J. Recursive Functions of Symbolic Expressions and their Computation by Machine, Part I. Communications of the ACM, 1960, 3, 4.
10. Mann, G.A. Control of a Navigating Rational Agent by Natural Language. PhD thesis, University of New South Wales, 1996.
11. Munday, C., Sobora, F. & Lukose, D. UNE-CG-KEE: Next Generation Knowledge Engineering Environment. Proceedings of the 1st Australian Knowledge Structures Workshop. Armidale, Australia, 1994, 103-117.
12. Nebel, B. & von Luck, K. Hybrid Reasoning in BACK. In Z.W. Ras and L. Saitta (Eds.) Methodologies for Intelligent Systems, Vol. 3. North-Holland, Amsterdam, The Netherlands, 1988.
13. Rochowiak, D. A Pragmatic Understanding of "Knowing That" and "Knowing How": The Pivotal Role of Conceptual Structures. In D. Lukose, H. Delugach, M. Keeler, L. Searle & J.F. Sowa (Eds.) Conceptual Structures: Fulfilling Peirce's Dream. Lecture Notes in AI 1257, Springer-Verlag, Berlin, 1997, 25-40.
14. Sowa, J.F. Conceptual Structures: Information Processing in Mind and Machine. Menlo Park, CA: Addison-Wesley, 1984.
15. Sowa, J.F. Conceptual Graph Summary. In T.E. Nagle et al. (Eds.) Conceptual Structures: Current Research and Practice. Chichester: Ellis Horwood, 1992, 339-348.
16. Sowa, J.F. Logical Foundations for Representing Object-Oriented Systems. Journal of Theoretical and Experimental Artificial Intelligence, 1993, 5.
17. von Neumann, J. The Computer & the Brain. New Haven: Yale University Press, 1958.
Ontologies and Conceptual Structures
William M. Tepfenhart
AT&T Laboratories, 480 Red Hill Rd, Middletown, NJ 07748
William.Tepfenhart@att.com
Abstract. This paper addresses an issue associated with representing information using conceptual graphs: the great variability in the approaches that individuals take to the conceptual graph representation and the ontologies employed. This variability makes it difficult for individual authors to use the results of other authors. This paper lays out these differences and their consequences for the ontologies. It compares the ontologies and representations used in papers presented at the International Conference on Conceptual Structures in 1997. This comparison illustrates the diversity of approaches taken within the CG community.
1 Introduction
One of the problems with reading papers on conceptual structures is that there are almost as many different approaches to conceptual structures as there are authors. In the original book by Sowa [1], he described three basic representational elements: concepts, conceptual relations, and actors. Since then, other authors have modified concepts, conceptual relations, and actors in very different manners -- different in terms of how they are defined and used. In addition, there are at least four graph types: simple graphs, nested graphs, positive nested graphs, and actor graphs. These differences make comparison between papers difficult and at times impossible. There is, however, an even worse problem: these differences are fracturing the conceptual graph community along multiple lines. This paper does not attempt to unify all of the different approaches. Such a task is difficult and the effort involved tremendous. It is not even clear that the result would be of value to any but a few. Instead, this paper lays out certain fundamental differences in the various approaches to conceptual graphs. Using the results of this paper, the interested reader will understand how to interpret papers based on very different sets of premises and perhaps be more forgiving to those who have chosen a different approach.
The basic elements for which differences are identified in this paper are: the descriptive emphasis, the definitional information, the conceptual grounding, the processing approaches, the ontological structures, and the knowledge structures. As will be shown, there are many degrees of freedom in how one can combine all of these different elements. This paper will not argue which combinations of these elements are meaningful, although it might seem to some readers that some are not. The next six sections of this paper address the following topics:
• descriptive emphasis - what aspects about the world are stressed most in the ontology and how that affects where and how concepts are defined.
• definitional information - what information is captured in the definition of concepts and how that information is to be used.
• conceptual groundings - the semantic basis on which the meaning of the concept is founded.
• processing approaches - how information captured within conceptual graphs is processed and the implications in terms of how concepts are defined.
• ontological structures - how concepts are arranged in a type structure and the kinds of processing that can be performed over it.
• knowledge structures - the graph structures which individuals use to express information and how that structure influences the ontology.
Each section describes the element and gives examples of the approach. This is followed by a section that classifies individual papers according to the ontological assumptions on which they are based. The paper concludes by giving a summary of the results presented here.
2 Descriptive Emphasis One element contributing to an ontology is the descriptive emphasis. The descriptive emphasis is the part of the physical world that is stressed most within the ontology and knowledge structures. Some descriptions focus on the state while others focus on the act. The distinction between the two is significant in terms of the kind of information captured, the types of operations that are performed over them, and the kinds of inferences that can be achieved. In fact, the different emphasis controls what kinds of information must be derived from a given graph and knowledge base versus what information is trivially extracted.
2.1 State
An ontology that emphasizes state concentrates on things and the relationships among them. Actions are expressed as changes in state and are characterized by an initial state, a final state, and the act that links the two. The ontology, of course, supports this kind of treatment directly. An example of this for 'A Cat Sat On A Mat' is,
[Cat: *] -> (on) -> [Mat: *]
[Cat: *] -> (posture) -> [Standing: *]
        <Sit>
[Cat: *] -> (on) -> [Mat: *]
[Cat: *] -> (posture) -> [Sitting: *]

In this example, the initial state is one in which a cat is on a mat in a standing position; the final state is one in which a cat is on a mat in a sitting position; and the link between the two is an actor that represents the movement of the cat into a sitting position. The use of an actor <Sit> expresses the semantics of an active relation although, as will be discussed in a later section, actors are not the only (computational) mechanism to express the changes that are taking place. The ontology reflects this way of viewing the physical world by having states and the objects within them captured as concepts. The concepts are defined within the concept type structure. The relationships between objects within a state are captured as conceptual relations which are defined within the relation type structure. Relationships between states are captured as active relations which can be defined as relations within the relation type structure or as actors within an actor type structure.
2.2 Act
An ontology that emphasizes acts concentrates on the transitions and the roles that things play within them. Actions that take place are characterized by the subject of the act, the recipient of the act, the location in which it took place, and the manner in which the subject executed the act. An example of this is,

[Sat: *] -
    (agent) -> [Cat: *]
    (location) -> [Mat: *]

In this case, the act is expressed by [Sat: *], where the agent of the act is given by [Cat: *] and the location is given by [Mat: *]. The ontology, of course, supports this kind of treatment directly. Here the objects involved and the act are expressed as concepts which are defined within the concept type structure. The relationships between the act and the participants are expressed as conceptual relations which are defined within the relation type structure.
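The practical difference between the two emphases is what a knowledge base can answer trivially versus what it must derive. This can be sketched by encoding both example graphs as relation triples; the triple encoding is an illustrative device, not a notation used in the paper.

```python
# Illustrative triple encodings of the two descriptive emphases.
# The tuple format (subject, relation, object) is an assumption of this sketch.

# State emphasis: before/after states, with the <Sit> actor linking them.
state_view = [
    ("Cat", "on", "Mat"), ("Cat", "posture", "Standing"),  # initial state
    ("Cat", "on", "Mat"), ("Cat", "posture", "Sitting"),   # final state
]

# Act emphasis: the act itself is a concept with role relations.
act_view = [("Sat", "agent", "Cat"), ("Sat", "location", "Mat")]

# A state ontology answers "what is the cat's posture?" by direct lookup...
postures = [o for (s, r, o) in state_view if s == "Cat" and r == "posture"]
print(postures)

# ...while an act ontology answers "who performed the act?" by direct lookup.
agents = [o for (s, r, o) in act_view if r == "agent"]
print(agents)
```

Each query that is a lookup in one encoding requires inference (reconstructing states from roles, or roles from state changes) in the other, which is the point made in the section above.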
3 Definitional Information
Sowa states that a concept type is defined by: type t(x) is u, where t is the type label, the body u is the differentia of t, and type(x) is called the genus of t. An example of this is,

type CIRCUS-ELEPHANT(x) is
[ELEPHANT: *x] <- (AGNT) <- [PERFORM] -> (LOC) -> [CIRCUS]

While it would appear that this definition is rather clear, it is in the treatment of the differentia that authors differ tremendously. There are two approaches: the differentia is treated as a predicate or as a prototype. In some cases, authors do not use the definitional mechanism at all. There is another factor complicating understanding of the ontology behind many of the papers about conceptual graphs: many authors do not give definitions for the concepts and conceptual relations employed in their papers. One must assume that they have some sort of definition that is obvious to them. More significantly, these authors do not exploit definitions or describe how they are used when processing graphs.
3.1 None While some authors assume but do not give definitions for conceptual elements, there are others who do not employ the Sowa definitional mechanism in any form or fashion. For these authors, definitional information is given by the placement within some network of connected nodes. The definition is established by all nodes to which it is linked by way of relationships.
3.2 Logic
One approach treats the differentia as a predicate which is evaluated according to the rules of logic. In this approach, the definitional graph is the predicate. If the necessary information is available for the predicate to evaluate to true, then this is deemed sufficient to establish the type for an individual.
3.3 Prototype
In another approach, the definitional graph is treated as a prototype. Individuals are treated and described as instances. The attributes associated with an individual are those given within the graph that constitutes the differentia, together with those inherited from parent types. In a sense, this approach is compatible with the class systems used in Object Oriented systems.
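The contrast between the two treatments of a differentia can be sketched as follows, using the CIRCUS-ELEPHANT example. All names here (the fact triples, the prototype tables, the helper functions) are hypothetical illustrations of the two readings, not constructs from the paper.

```python
# Predicate reading: the differentia is a test that must evaluate to true
# before an individual is classified as a CIRCUS-ELEPHANT.
def is_circus_elephant(individual, facts):
    """The definitional graph read as a predicate over asserted facts."""
    return ((individual, "isa", "Elephant") in facts and
            (individual, "performs_in", "Circus") in facts)

# Prototype reading: the differentia supplies default attributes, merged
# with those inherited from the parent (genus) type.
PROTOTYPES = {
    "Elephant": {"legs": 4, "trunk": True},
    "CircusElephant": {"performs_in": "Circus"},
}

def make_instance(type_name, parent=None):
    """An instance gets the parent's attributes plus the differentia's."""
    attrs = dict(PROTOTYPES.get(parent, {}))
    attrs.update(PROTOTYPES.get(type_name, {}))
    return attrs

facts = {("clyde", "isa", "Elephant"), ("clyde", "performs_in", "Circus")}
print(is_circus_elephant("clyde", facts))
print(make_instance("CircusElephant", parent="Elephant"))
```

In the predicate reading the type is something an individual either satisfies or does not; in the prototype reading the type is a bundle of defaults an instance inherits, which is why the latter resembles Object Oriented class systems.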
4 Conceptual Grounding
Ogden and Richards, in [2], established the meaning triangle as a means for expressing the relationships among symbols, concepts, and referents. The meaning triangle is illustrated below. In the lower left corner is the symbol, which corresponds to the linguistic element of a word. In the lower right corner is the referent, which is related to the object. At the top of the triangle is the concept, which serves to link the symbol and the referent. The direct link between symbol and referent is actually a virtual link.

            Concept
           /       \
    Symbol ......... Referent
Certain relationships in this figure need to be explained. In particular, the link between the symbol and the concept is an invokes relationship: the symbol invokes the concept in the mind of an individual. Alternatively, one may view the link as the symbol expressing the concept. The relation between the referent and the concept is a little more complex: the referent is observed and expressed as a percept, and the percept is then interpreted as a concept.
4.1 Percept
A percept-based approach exploits the assertion that a concept is the interpretation of a percept, which is the result of some sensation of an object. As a result, developing an ontology and grounding the semantics of concepts in the physical world is a matter of studying objects and how we observe them. Abstract concepts are introduced as a result of computational operations over perceptually grounded percepts. In the process of grounding the semantics of a concept in percepts, the investigator must necessarily be concerned with the nature of sensors and actuators. That is, in order to ground the concept in the physical world, we have to understand how we perceive the physical world and can interact with it.
4.2 Linguistic
A linguistically based approach to grounding the semantics of a concept is based on the view that the concept is the result of an invocation by a symbol. Hence, for every symbol there is some meaningful concept. By studying natural languages, we can get a very detailed view of the concepts and the type relationships that exist among them.
5 Processing Approaches
Given a conceptual graph, what does one do with it? There are three major approaches to processing conceptual graphs. One approach has its roots in semantic networks, another in predicate logic, and a third in procedure. These three different approaches have a significant effect on how concepts are defined. In reading papers on conceptual graphs, it is clear that many of them do not describe how the graphs are to be processed. The papers concentrate on capturing some domain or natural language. There is little focus on what one does with the knowledge once captured.
5.1 Semantic Network In this approach, graphs containing referents are treated as activations within a semantic network. Processing the graph is a matter of activating connected nodes until some query is resolved.
5.2 Predicate Logic
In this approach, graphs containing referents are assertions made on sheets of assertion. Processing occurs as the result of asserting a query, which takes the form of a conceptual graph with existential quantifiers. The bindings that make the query graph true are the result returned. This approach is very much like a Prolog style of programming.
5.3 Procedure
The arrival of a graph in a working memory area is treated as an event. The event triggers processing in the form of an actor firing. An actor may be defined such that it invokes additional actors to fire as a result of a change in the input graph. Actors modify the input graph until some stop condition is reached.
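The predicate-logic style described in section 5.2 can be sketched as resolving a query graph, with variables, against a set of asserted triples. This is a minimal illustrative sketch (the `?`-variable convention and the triple store are assumptions of the sketch, not the paper's notation), not any particular CG system's projection algorithm.

```python
# Minimal sketch of Prolog-style query resolution over asserted triples.
# Query terms starting with "?" are existentially quantified variables.
facts = {("cat1", "isa", "Cat"), ("mat1", "isa", "Mat"), ("cat1", "on", "mat1")}

def resolve(query, facts):
    """Return every variable binding that makes all query triples true."""
    def unify(triple, fact, env):
        env = dict(env)
        for q, f in zip(triple, fact):
            if q.startswith("?"):
                if env.setdefault(q, f) != f:  # conflicting binding
                    return None
            elif q != f:                        # constant mismatch
                return None
        return env

    envs = [{}]
    for triple in query:  # conjunctively extend bindings, triple by triple
        envs = [e2 for e in envs for f in facts
                if (e2 := unify(triple, f, e)) is not None]
    return envs

# "Is there a cat on something?" -- returned bindings answer the query.
print(resolve([("?x", "isa", "Cat"), ("?x", "on", "?y")], facts))
```

A semantic-network processor would instead spread activation from the queried nodes, and a procedural processor would fire actors on the arrival of the query graph; this sketch covers only the logic-based style.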
6 Ontological Structures
Integral to the whole concept of an ontology is the type structure. In the various approaches the resulting structures have different characteristics: operations that make sense in one do not make sense in another. The choice of one approach over the other has significant consequences for the kinds of processing that must be performed in classification or in establishing co-reference. The type structure is a partial ordering of the concepts based on type-of relations. Where individual authors disagree is on the nature of the type-of relation. In some cases, individuals allow a concept to exist in type-of relations with more than one other concept; others restrict a concept to a type-of relation with a single concept. The result is that in some approaches the type structure is a hierarchy while in others it is a lattice.
6.1 Type Hierarchy
In a type hierarchy, the ontology type structure is captured in the form of a tree. Even within the community that employs type hierarchies there is some disagreement about how individuals stand in relation to the hierarchy. In some approaches, a particular individual can be classified only as a single leaf type of the hierarchy (this allows it to be classified as any of the parent types of that leaf). The other approach allows an individual to be classified as multiple subtypes of a single concept.
6.2 Type Lattice
In a type lattice, the ontology type structure is captured in the form of a lattice. At the top of the lattice is the universal type and at the base of the lattice is the absurd type. In this approach both natural types and role types appear in the lattice, with concepts able to inherit from both.

[Figure: fragment of a type lattice containing CAT: * and MAT: *; the drawing was lost in extraction]
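The consequence of allowing multiple parents can be sketched with a small subtype check. The type names and parent sets below are hypothetical examples chosen for illustration; only the top type mirrors the universal type mentioned above.

```python
# Sketch of a type lattice: a type may have several parents (e.g. a natural
# type and a role type), so a subtype check must walk all of them.
PARENTS = {
    "Cat": {"Animal", "Pet"},      # multiple parents -> lattice, not a tree
    "Animal": {"T"},
    "Pet": {"T"},
    "Mat": {"Artifact"},
    "Artifact": {"T"},
    "T": set(),                    # universal type at the top of the lattice
}

def is_subtype(t, u):
    """True iff t <= u in the partial order induced by PARENTS."""
    if t == u:
        return True
    return any(is_subtype(p, u) for p in PARENTS.get(t, ()))

print(is_subtype("Cat", "Pet"), is_subtype("Mat", "Animal"))
```

In a strict hierarchy `PARENTS` would map every type to at most one parent and the same walk degenerates to following a single chain; classification and co-reference checks are correspondingly cheaper but less expressive.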
7 Knowledge Structures Sowa's original book 1 described a basic conceptual graph which has become known as a simple graph and nested graphs for capturing logical expressions. Since then, additional graph types have been added. These additional graphs include positive nested graphs and actor graphs.
7.1 Simple Graphs
Simple graphs are bipartite directed graphs containing only concepts and conceptual relations. They are the most basic graph employed in the community. An example, often cited, is

[PERSON] <- (CHLD) <- [MOTHER]
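The bipartite property of simple graphs can be sketched directly: every edge must join a concept node to a relation node, never two of the same kind. The node sets and the checking helper below are illustrative assumptions, not part of any CG formalism's specification.

```python
# Sketch: a simple graph as a bipartite directed graph of concepts
# and conceptual relations (node names are illustrative).
concepts = {"PERSON", "MOTHER"}
relations = {"CHLD"}
edges = [("MOTHER", "CHLD"), ("CHLD", "PERSON")]  # [MOTHER]->(CHLD)->[PERSON]

def is_bipartite(edges, concepts):
    """Check that every edge joins a concept to a non-concept (relation)."""
    return all((a in concepts) != (b in concepts) for a, b in edges)

print(is_bipartite(edges, concepts))
```

An edge joining two concepts (or two relations) would violate the definition, which is why well-formedness checks of this kind precede any join or projection operation.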
7.2 Nested Graphs
Nested graphs are intended to express positive and negative propositions as needed in first order logic.

[Figure: a nested graph with negated contexts capturing the statement "Every person has a mother"; the drawing was lost in extraction]
7.3 Positive Nested Graphs
Positive nested graphs are similar in structure to nested graphs, with the exception that negations are not allowed. They also differ in the sense of the processing that is performed over them. In this case, the nested graph is part of the referent field of some concept which participates in the outer graph. The nested graph provides greater description of an individual without complicating the exterior graph with the additional details.

[Figure: an example of a positive nested graph (taken partially from [6]) in which [PERSON: Peter] thinks of a painting whose nested referent describes a bucolic scene; the drawing was lost in extraction]
7.4 Actor Graphs
Actor graphs are graphs that incorporate actors in addition to concepts and conceptual relations. The actors are treated as active relations which have the potential to modify a graph.

[Figure: an example of an actor graph for the action that a cat moved onto a mat, linking [CAT: x] and [MAT: y] states through an actor node; the drawing was lost in extraction]
342
8 A Sampling of Approaches
In this section of the paper, the approaches demonstrated in the papers presented at the International Conference on Conceptual Structures 1997 are mapped against the various ontological styles described in the previous sections. Individual authors may find that their work has been misclassified in this paper. The fact that misclassification of an author's approach is likely is mentioned for a specific reason: it makes the point that understanding the ontology employed by various authors is difficult, particularly since few authors bother to explain their ontology or the assumptions they make in constructing it. In some cases, it is exactly this information that is necessary for another author to use their results. In the tables that follow, care has been taken to characterize papers accurately. A number of papers which appear in the tables will not have entries associated with them in some or all of the main columns. After serious consideration, it was decided that the lack of information in some papers about the ontological basis and the concepts being processed constitutes a data point just as valid as when it was stated. This further underscores the need for such information to be stated.
[Table 1: papers [3]-[38] classified by Descriptive Emphasis (State / Act) and Definitional Emphasis (None / Predicate / Prototype); the alignment of authors and entries was lost in extraction]
In the table that follows, the conceptual grounding and processing approaches are compared. One will notice, in looking at this table, that there are a number of papers for which there are no entries. This is not a mistake; some papers don't include information about these topics.
[Table 2: papers [3]-[38] classified by Conceptual Grounding (Percept / Linguistic) and Processing Approach (Semantic / Predicate / Procedure); the alignment of authors and entries was lost in extraction]
In the table that follows, there are several unusual entries. In some cases, there is an ontological structure defined and not a knowledge structure. That is, some papers only describe the type structure and not how the concepts are to be used within graphs. In other cases, the knowledge structures are defined without any reference to the ontological structure.

[Table 3: papers [3]-[38] classified by Ontological Structure (Lattice / Hierarchy) and Knowledge Structure (Simple, Nested, Positive Nested, Actor, Gamma, and Fuzzy Graphs); the alignment of authors and entries was lost in extraction]
9 Summary This paper has attempted to outline some of the fundamental ideas that form the basis of research efforts in the area of ontologies and conceptual structures. It is hoped that the preceding sections have conveyed the difficulty in comparing and contrasting papers on conceptual structures, particularly with regard to the ontological foundations. The variety of approaches, processing styles, and assumptions make it difficult for one author to apply the results of another. One result is that much effort is being spent to solve the same problem several times because the language in which the problem is framed appears to be different. At the least, this paper gives an outline by which technical disagreements and discussion can be focused. That is, we can use the results presented in this paper as a means to argue about whether a predicate or procedural approach is most appropriate for conceptual graphs. If the different approaches are all appropriate, then when does one approach work better than another? If the community continues to use multiple approaches, then how can we use conceptual graphs as an exchange mechanism when the underlying ontologies have such different foundations?
10 References
1. Sowa, J.F., Conceptual Structures: Information Processing in Mind and Machine, Addison-Wesley, Reading, MA, 1984.
2. Ogden, C.K., and I.A. Richards, The Meaning of Meaning, Harcourt, Brace, and World, New York, NY, 1946.
3. Sowa, J.F., "Peircean Foundations for a Theory of Context," Conceptual Structures: Fulfilling Peirce's Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, pp. 41-64.
4. Chein, M., "The CORALI Project: From Conceptual Graphs to Conceptual Graphs via Labeled Graphs," Conceptual Structures: Fulfilling Peirce's Dream, eds. D. Lukose et al., Springer-Verlag, Berlin, 1998, pp. 65-79.
5. Mineau, G.W. and M.-L. Mugnier, "Contexts: A Formal Definition of Worlds of Assertions," ibid., pp. 80-94.
6. Chein, M. and M.-L. Mugnier, "Positive Nested Conceptual Graphs," ibid., pp. 95-109.
7. Wermelinger, M., "A Different Perspective on Canonicity," ibid., pp. 110-124.
8. Tepfenhart, W.M., "Aggregations in Conceptual Graphs," ibid., pp. 125-137.
9. Mineau, G.W. and R. Missaoui, "The Representation of Semantic Constraints in Conceptual Systems," ibid., pp. 138-152.
10. Faron, C. and J.-G. Ganascia, "Representation of Defaults and Exceptions in Conceptual Graphs Formalism," ibid., pp. 153-167.
11. Ribiere, M. and R. Dieng, "Introduction of Viewpoints in Conceptual Graph Formalism," ibid., pp. 168-182.
12. Angelova, G. and K. Bontcheva, "Task-Dependent Aspects of Knowledge Acquisition: A Case Study in a Technical Domain," ibid., pp. 183-197.
13. Richards, D. and P. Compton, "Uncovering the Conceptual Models in Ripple Down Rules," ibid., pp. 198-212.
14. Kremer, R., D. Lukose, and B. Gaines, "Knowledge Modeling Using Annotated Flow Chart," ibid., pp. 213-227.
15. Lukose, D., "Complex Modeling Constructs in MODEL-ECS," ibid., pp. 228-243.
16. Dick, J.P., "Modeling Cause and Effect in Legal Text," ibid., pp. 244-259.
17. Raban, R., "Information Systems Modeling with GCs Logic," ibid., pp. 260-274.
18. Bos, C., B. Botella, and P. Vanheeghe, "Modeling and Simulating Human Behaviors with Conceptual Graphs," ibid., pp. 275-289.
19. Wille, R., "Conceptual Graphs and Formal Concept Analysis," ibid., pp. 290-303.
20. Biedermann, K., "How Triadic Diagrams Represent Conceptual Structures," ibid., pp. 304-317.
21. Stumme, G., "Concept Exploration - A Tool for Creating and Exploring Conceptual Hierarchies," ibid., pp. 318-332.
22. Prediger, S., "Logical Scaling in Formal Concept Analysis," ibid., pp. 332-341.
23. Ellis, G. and S. Callaghan, "Organization of Knowledge Using Order Factors," ibid., pp. 342-356.
24. Ohrstrom, P., "C.S. Peirce and the Quest for Gamma Graphs," ibid., pp. 138-152.
25. Kerdiles, G. and E. Salvat, "A Sound and Complete CG Proof Procedure Combining Projections with Analytic Tableaux," ibid., pp. 357-370.
26. Cao, T.H., P.N. Creasy, and V. Wuwongse, "Fuzzy Unification and Resolution Proof Procedure for Fuzzy Conceptual Graph Programs," ibid., pp. 386-400.
27. Leclere, M., "Reasoning with Type Definitions," ibid., pp. 401-415.
28. Cao, T.H. and P.N. Creasy, "Universal Marker and Functional Relation: Semantics and Operations," ibid., pp. 416-430.
29. Raban, R. and H.S. Delugach, "Animating Conceptual Graphs," ibid., pp. 431-445.
30. Bournaud, I. and J.-G. Ganascia, "Accounting for Domain Knowledge in the Construction of a Generalization Space," ibid., pp. 446-459.
31. Mann, G.A., "Rational and Affective Linking Across Conceptual Cases - Without Rules," ibid., pp. 460-473.
32. Gerbe, O., "Conceptual Graphs for Corporate Knowledge Repositories," ibid., pp. 474-488.
33. Genest, D. and M. Chein, "An Experiment in Document Retrieval Using Conceptual Graphs," ibid., pp. 489-504.
34. Keeler, M.A., L.F. Searle, and C. Kloesel, "PORT: A Testbed Paradigm for Knowledge Processing in the Humanities," ibid., pp. 505-520.
35. Clark, P. and B. Porter, "Using Access Paths to Guide Inference with Conceptual Graphs," ibid., pp. 521-535.
36. de Moor, A., "Applying Conceptual Graph Theory to the User-Driven Specifications of Network Information Systems," ibid., pp. 536-550.
37. Puder, A. and K. Romer, "Generic Trading Service in Telecommunication Platforms," ibid., pp. 551-565.
38. Polovina, S., "Assessing Sowa's Conceptual Graphs for Effective Strategic Management Decisions, Based on a Comparative Study with Eden's Cognitive Mapping," ibid., pp. 566-580.
Manual Acquisition of Uncountable Types in Closed Worlds

Galia Angelova
Bulgarian Academy of Sciences, 25A Acad. G. Bonchev St., 1113 Sofia, Bulgaria
galja@lml.acad.bg
Abstract. The paper considers the problem of classifying countable-uncountable entities during the process of Knowledge Acquisition (KA) from texts. Since one of the main goals of KA is to identify types, means to distinguish new types, instances and individuals become particularly important. We briefly review related studies to show that the distinction countable-uncountable depends on the considered natural language, the context of usage and the domain; thus countability is a perspective from which to look at a closed world, since there is no universal general taxonomy. Finally we propose an internal ontological solution for mass objects which suits a project¹ for the generation of multilingual Natural Language (NL) explanations from Conceptual Graphs (CG).
1
Introduction
KA aims at (i) the identification of (task-specific) objects and relationships in the acquisition domain and (ii) the encoding of these entities in the most suitable structures of the chosen knowledge representation formalism. KA is usually performed from texts in a manual, semi-automatic or automatic manner. KA explicates the concept types, their instances and individuals as they are described in the acquisition texts. Obviously, the type fragmentariness is strongly influenced by the concrete natural language: the particular words have language-specific semantic granularity, and often this granularity is the KA hint for the semantic content of the correspondingly acquired type. Uncountable types raise a special interest. These types (often mass objects) should be recognised and encoded properly, since their instances behave in a specific manner. If KA does not identify the countable-uncountable types, whatever their language citation is, there is no 'later' adjustment where this distinction would be evaluated in the context of the acquisition texts and explicated in the Knowledge Base (KB). So, in our view, the precise acquisition of uncountable concepts is one of the obligatory tasks to be performed by KA. This paper discusses KA of uncountable types from noun phrases. We summarise our experience in manual KA from technical texts (generally discussed in [AB2]) and the way we treat the obtained instances after a precise study of
1 DBR-MAT "Intelligent Translation System", funded by Volkswagen Foundation (1996-98), http://nats-www.informatik.uni-hamburg.de/projects/dbr-mat.
their (i) linguistic behaviour in the texts and (ii) conceptual behaviour in the closed domain. Section 2 presents the distinction countable-uncountable from a linguistic perspective. Since we perform KA from NL texts to provide further NL generation, the linguistic outlook is the most natural initial standpoint. Section 3 summarises briefly other related approaches to mass nouns and objects. Section 4 shows how the observation of linguistic facts affects the conceptual representation within the KB of CG. Section 5 contains the conclusion.
2
Linguistic Perspective to Countable vs. Uncountable
It is difficult to say whether the distinction countable-uncountable is to be drawn (i) between linguistic entities in the NL texts (as is widely accepted, these are noun phrases), or (ii) between conceptual entities (objects in the world). Moreover, a clearly defined hierarchy of countable-uncountable entities does not exist. Below we consider the problems of such a classification. The contrastive study [Mo1] adopts a classification of nouns after the linguist Otto Jespersen (see Fig. 1); following them, we accept that the distinction countable-uncountable is one of the most important semantic characteristics of nouns. This fits very well with the KA goal to identify types; it is really important to classify the acquired concept types as types with enumerable instances and individuals vs. other types. We can use Fig. 1 as a guide for conceptual classification of types if we interpret properly the ontological content of the hierarchy. To discuss Fig. 1, let us consider nouns and world objects which are 'named' by these nouns:
[Fig. 1 (tree diagram): Noun partitions into Countable and Uncountable; each of these into Concrete and Abstract; the lower level distinguishes the classes Individual, Collective and Mass.]
Fig. 1. Taxonomy of nouns after Jespersen. All partitions are disjoint. The dotted lines indicate the only non-exhaustive partition.
1. Countable, Concrete, Individual: table, student. Material world objects, with a certain shape or precise limits, which exist as sets of separable concrete instances and individuals. This is nearly the only class of nouns whose grammatical behaviour coincides with the individual behaviour of the respective conceptual objects.
2. Countable, Concrete, Collective: flock, herd. Grouping material objects which exist as enumerable instances.
3. Countable, Abstract, Individual: idea, thought. Abstract objects named by nouns whose grammatical features allow counting. We say 'one, three, many ideas'. The language-influenced conceptualisation is that ideas can be counted (at least in English). That is why in the NL texts we meet references to different instances of the respective concept type.
4. Countable, Abstract, Collective: family, company. Grouping abstract objects which exist as enumerable instances.
5. Uncountable, Concrete, Mass: wine, water, sand, snow, copper, butter, sugar. The semantic structure of mass nouns excludes enumeration of instances, but mass nouns have the category of number because (i) they appear in singular, e.g. 'this is silver', and (ii) they express quantity. Plural forms usually indicate a modified meaning or a polysemy. Mass nouns correspond to material objects with the properties that (i) the object substance can be measured but not counted, and (ii) each separate part of this stuff has the quality and meaning of the whole.
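As a KA aid, the five classes above can be encoded as a small feature structure over nouns. The following Python sketch is purely illustrative: the feature names and the toy lexicon are our own, not part of Jespersen's taxonomy.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class NounClass:
    countable: bool
    concrete: bool
    grouping: str  # "individual", "collective", or "mass"

# A toy lexicon following the five classes of the taxonomy (Fig. 1).
LEXICON = {
    "table":  NounClass(countable=True,  concrete=True,  grouping="individual"),
    "flock":  NounClass(countable=True,  concrete=True,  grouping="collective"),
    "idea":   NounClass(countable=True,  concrete=False, grouping="individual"),
    "family": NounClass(countable=True,  concrete=False, grouping="collective"),
    "wine":   NounClass(countable=False, concrete=True,  grouping="mass"),
}

def allows_plural_instances(noun: str) -> bool:
    """A KA heuristic: only countable nouns yield enumerable instances."""
    return LEXICON[noun].countable

print(allows_plural_instances("idea"))  # True
print(allows_plural_instances("wine"))  # False
```

Such a lookup would support the KA decision of whether an acquired type may receive enumerable instances in the KB.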
We reviewed many works to justify this taxonomy; most of the authors would agree with the interpretation of the above five classes. To show the complications of the exact classification of uncountables, however, we compile different examples, mapping them into the hierarchy in Fig. 1:
1. Uncountable, Concrete, Collective: pottery, silverware.
2. Uncountable, Abstract, but not Mass and not Collective: love, liberty, democracy, dispersion, music, elegance, etc. Abstract objects which seem to exist 'in one instance only'.
3. Uncountable, Abstract, Mass: success, knowledge, software.
4. Uncountable, Abstract, Collective: mankind, gentry, police.
If software is something like abstract mass, then why not consider hardware as concrete mass? But, on the other hand, can hardware be measured like water? Moreover, hardware resembles pottery, because not every physical part of hardware is still hardware. But pottery is classified in [Mo1] as uncountable, concrete, collective. To move hardware to the same class as pottery would probably mean considering software as abstract, collective, which would contradict some other opinions on what collective is, and so on. It is difficult to distinguish abstract collective and abstract mass nouns in various cases, if we compare the opinions of different authors. Facing all these complications in building an exact taxonomy, we decided to avoid classifying each abstract object in the above categories. We hoped that the opposition countable-uncountable would provide enough conceptual evidence for the number of instances. Unfortunately, even this partition is not absolute in
case of multilinguality: e.g., news in English is uncountable and singular (the news is good), while in Bulgarian one says 'two news', similarly to 'two ideas' in English. So we can use the taxonomy in Fig. 1 as a basic framework supporting KA classifications, but only if we remember that it is somewhat relative, especially in particular multilingual cases. Below we continue discussing noun phrases and concept types, but always bearing in mind the duality of language unit vs. type acquired in the KB.
3
Other Approaches to Uncountability
AI usually studies the philosophical, cognitive and formal foundations of knowledge modelling. In [Ha1], objects from the Naive Physics world are considered. Properties like countable-uncountable seem to be treated as intrinsic ones. In [Gu1], countability is a category in a top-level ontology of universals. [WCH] develops an interesting taxonomy of part-whole relations to explain the English usage of part of. Two cases concern the distinction uncountable-countable: the relations portion-mass (slice-pie) and stuff-object (steel-car). Our KA goals, however, lie at the border between support of multilingual NL processing and conceptual modelling using CG; that is why we are mostly interested either in approaches which address both layers together or in CG-related approaches. In [So1] and [So2] we see that mass nouns have measured individuals which correspond to concrete amounts of stuff, representing in this way different individual quantities of the types WATER, TIME, etc. Plural referents are only used with countable things, while mass nouns are not normally used in the plural. To represent substances by CG, [Te1] proposes the type definition:
    type substance(x) is
        physical_object: {*}@++ (x)
            -> (has_components) -> Type_Of(physical_object) {*}
            -> (has_internal_structure) -> Structure

'A substance is a set containing an uncountable number of physical objects of various types and the members within the set have some internal structure'. [Te1] proposes to perform computations over sets; substance properties can then be interpreted as consequences of the relationships that exist among the instances. [Te1] explains compositions, states, viscosity, and reactivity of substances. This is a conceptualisation of substances at a micro-level, providing an extremely detailed description of the micro-changes that can take place over time. The NL-oriented approaches, however, always take into consideration (i) the linguistic facts, and (ii) (to some extent) the object denoted by the noun and the properties of this object. According to Lyons, the count-mass distinction is primarily a linguistic one, which is clearly seen in cases of multilinguality (remember the example of news, uncountable in English and countable in Bulgarian). R. Dale [Da1] investigates the domain of cooking and discusses conflicts between a naive ontology and linguistic facts. In the domain of cooking, rice and lentil are rather similar (small
objects of roughly the same size), whose individuals are not considered separately in recipes. The linguistic expressions of ingredients, however, represent rice as a mass noun, while lentil behaves like a count noun: e.g. four ounces of rice, four ounces of lentils. 'If the count/mass distinction was ontologically based, we would expect these descriptions to be either both count or both mass' [Da1]. We found particularly interesting the observation that 'physical objects are not inherently count or mass, but are viewed as being count or mass' in some domains. So [Da1] considers physical objects from either a mass or a count perspective. 'Thus, a specific physical object can be viewed one time as a mass, and another time as a countable object: when cooking, I will in all likelihood view a quantity of rice as a mass, but if I am a scientist examining rice grains for evidence of pesticide use, I may view the same quantity of rice as a countable set of individuals'. In [Da1], exactly one perspective at a time is allowed: each object in the closed domain of cooking is either mass, or count. Comparing [Te1] and [Da1], we see that at a micro-level all objects can be treated as sets of instances; but natural language does not work at the level of atomic components. In most realistic domains the basic objects have much bigger granularity, and they are denoted by words that make us treat them as count or mass. Unfortunately, the context-dependent NL usage provides flexible shifts of granularity (see [Ho1]): 'A road can be viewed as a line (planning a trip), as a surface (driving on it), and as a volume (hitting a pothole) ... Many concepts are inherently granularity-dependent.' Probably we could say that a closed domain is one where a pre-fixed number of perspectives on each object exists and the relevant kinds of granularity are pre-fixed as well.
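Dale's one-perspective-at-a-time policy for closed domains can be sketched as a simple table from domain to a fixed perspective per object. The domain and object names below are our own illustration, not part of Dale's implementation.

```python
# Each closed domain fixes exactly one perspective (count or mass)
# per object; the entries below are illustrative examples only.
PERSPECTIVES = {
    "cooking":       {"rice": "mass", "lentil": "count"},
    "pesticide-lab": {"rice": "count"},
}

def view(domain: str, obj: str) -> str:
    """Return the single perspective the closed domain fixes for obj."""
    return PERSPECTIVES[domain][obj]

print(view("cooking", "rice"))        # mass
print(view("pesticide-lab", "rice"))  # count
```

The same physical object (rice) thus receives different, but domain-internally fixed, count-mass views.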
To summarize, in NL we refer to the conceptual entities as follows: (i) when the stuff itself is referred to, an uncountable noun is typically used; in other cases the emphasis is on the object's form and shape, and then we refer to particular instances by means of countable nouns; (ii) in some domains the entities are treated as compositions at the level of micro-ingredients, while in other domains we see them as compositions of ingredients at a much higher level.
4
A Mixed Count-Mass Taxonomy in Closed Domains
We acquire CG in the domain of admixture separation from polluted water in order to generate domain explanations in several NLs from the underlying KB [AB1]. In the context of the present considerations, we try to satisfy the following requirements: (i) adequate internal conceptualisation providing easy surface verbalisation in different NLs with a proper usage of singular-plural and count-mass nouns; (ii) clear separation between conceptual data (in the single KB) and linguistic data (in the system lexicons, one lexicon per language); (iii) conceptual structures allowing for easy integration of the closed domain into more universal ontologies. In the type hierarchy we define important domain types and integrate them under an upper model. Figure 2 represents a simplified view of the taxonomy,
where countable and mass entities are classified as subtypes of PHYSICAL-OBJECT. We acquire as concept types two important objects in the domain: oil drop and oil particle. Note that this decision is domain-dependent, since in this domain the polluting oil exists as particles and drops in the polluted water. Furthermore, OIL-DROP and OIL-PARTICLE are subtypes of the substance OIL. These two types show the borderline where the domain taxonomy is integrated into a universal taxonomy of physical objects. The ISA-KIND relation, introduced in [AB2], defines the perspective of looking at particles and drops as OIL: they are typical 'quantities of stuff'. To conform to some standard, we adopted the keyword PACKAGER from [Da1]. The ISA-KIND relation indicates that the classifications OIL → OIL-PARTICLE and OIL → OIL-DROP are partitions into role subtypes, because these are subtypes that can change during the lifetime of the physical object (similarly to PROFESSION for PERSON), while the partitions OIL → MINERAL-OIL and OIL → SYNTHETIC-OIL are classifications into natural subtypes according to the usual type-of relation. Note that the PACKAGER perspective covers the two cases of part-whole relations mentioned in [WCH]: it denotes the relations portion-mass and stuff-object, which are not distinguished in the current domain and, consequently, are not treated as different ones in the conceptual model. Such a conceptual solution provides flexible links to the lexicons in the case of count-mass nouns. Imagine that in some natural language only the word for the stuff exists (e.g. grape is a mass noun in Bulgarian and Russian); then this word is linked to the stuff-concept. But in some other languages, the related words can name typical 'packaged' quantities (like grape in English); if we acquire such quantities as types, the respective words will be connected to these special concept types.
Note that it is not obligatory to have 'naming' lexicon elements in every language and for all domain types; the explanations are constructed for the existing concepts in the corresponding grammatical forms. Since the inheritance works for type-of relations, in DBR-MAT we can, for instance, generate the explanation: Each oil particle has dimension less than 0.05 mm. Here we use particle in the singular, since it is a countable object and inherits the characteristic features of PARTICLE. As a subtype of PARTICLE, OIL-PARTICLE has SHAPE and DIMENSION. From the perspective of OIL, however, we always talk about particles in the plural, and thus we connect countable and mass nouns in the generated explanations. For instance: Oil appears as oil particles and oil drops. Viewed as oil,
oil particles have density and relative weight. Additionally, in the KB, when instances of the types OIL-PARTICLE and OIL-DROP appear in conceptual graphs with an unspecified plural referent, i.e. as OIL-PARTICLE: {*} and OIL-DROP: {*}, we can make a generalisation and replace these types by OIL. Then they are verbalised as mass nouns. It is obvious that the natural language we generate is not as flexible as a human one, but this solution is an opportunity to mix the countable and uncountable perspectives in one utterance. Note that we cannot produce phrases like two oil particles, because in our domain-specific KB the unspecified plural referent sets
[Fig. 2 (tree diagram, simplified): PHYSICAL-OBJECT has subtypes COUNTABLE (with DROP and PARTICLE) and SUBSTANCE (with OIL, whose natural subtypes are MINERAL-OIL and SYNTHETIC-OIL); CONCEPT-TYPE splits into NATURAL and ROLE, and ROLE into PROFESSION and PACKAGER; OIL-DROP and OIL-PARTICLE are PACKAGER role subtypes of OIL.]
Fig. 2. Taxonomy mixing countable and mass objects by classification into role and natural subtypes.
from the type definition of SUBSTANCE would not be instantiated with counted sets. To conclude, such crosspoint types between countable and mass types require a very careful elaboration of: (i) the type hierarchy, (ii) the type definitions of the supertypes and the 'crosspoint' type, so as to assure correct generalisation and specialisation, and (iii) the characteristics of both supertypes, to assure correct inheritance.
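The generalisation step described above — lifting OIL-PARTICLE: {*} and OIL-DROP: {*} to their PACKAGER supertype OIL so that they are verbalised as mass nouns — can be sketched as follows. The encoding of the hierarchy is our own illustration, not DBR-MAT's implementation.

```python
# Illustrative encoding of the PACKAGER role-subtype links from Fig. 2.
PACKAGER_SUPERTYPE = {"OIL-PARTICLE": "OIL", "OIL-DROP": "OIL"}

def generalise(concept: str, referent: str) -> tuple:
    """If a packaged type carries an unspecified plural referent {*},
    lift it to its mass supertype so it is verbalised as a mass noun;
    otherwise keep the concept and referent unchanged."""
    if referent == "{*}" and concept in PACKAGER_SUPERTYPE:
        return PACKAGER_SUPERTYPE[concept], ""
    return concept, referent

print(generalise("OIL-PARTICLE", "{*}"))  # ('OIL', '')
print(generalise("OIL-PARTICLE", "#1"))   # ('OIL-PARTICLE', '#1')
```

Guarding on the unspecified plural referent {*} is what blocks phrases like two oil particles: a counted set never triggers the lift to the mass supertype.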
5
Conclusion
This paper discusses difficulties in classifying world objects as countable and uncountable and presents a (more or less) empirical solution applied in an ongoing project. We see that KA requires extremely detailed analysis of the source texts and deep understanding of the CG structures and idiosyncrasies. It is somewhat risky to consider only the NL level, since the language phenomena in the acquisition texts are often misleading (clearly seen in a multilingual paradigm). Actually, the distinction countable-mass should be defined at a deeper conceptual level. We try to keep the two perspectives closely related: the uncountable stuff and the countable individual objects made of this stuff. Fig. 2 shows that we allow both count-mass perspectives together, but only for specially acquired, domain-dependent types. In this sense, our approach addresses the closed world of one domain. To 'open' this closed world and to integrate another closed world for another domain will probably require the addition of new 'crosspoint' types between countable-mass objects. In our view, the classification countable-uncountable is no less meaningful than the partition abstract-material. Fig. 1 differs from many upper-level ontological classifications, where abstract-real is the highest partition of the top (see [FH1]). But a careful analysis of the taxonomy in Fig. 1 shows that it is easy to swap the two upper layers, i.e. the classification concrete-abstract can easily become the topmost partition. Despite the problems discussed in this paper, it seems worthwhile to consider at least the countable-mass distinction of PHYSICAL-OBJECT as one of the unifying principles for top-level ontologies.
Acknowledgements The author is grateful to the three anonymous referees for the fruitful discussion and the suggestions.
References
[AB1] Angelova, G., Boncheva, K.: DB-MAT: Knowledge Acquisition, Processing and NL Generation using Conceptual Graphs. ICCS-96, LNAI 1115 (1996) 115-129.
[AB2] Angelova, G., Boncheva, K.: Task-Dependent Aspects of Knowledge Acquisition: a Case Study in a Technical Domain. ICCS-97, LNAI 1257 (1997) 183-197.
[Da1] Dale, R.: Generating Referring Expressions in a Domain of Objects and Processes. Ph.D. Thesis, University of Edinburgh (1988).
[FH1] Fridman, N., Hafner, C.: The State of the Art in Ontology Design. A Survey and Comparative Review. AI Magazine Vol. 18(3) (1997) 53-74.
[Gu1] Guarino, N.: Some Organizing Principles for a Unified Top-Level Ontology. In: Working Notes, AAAI Spring Symp. on Ontological Engineering, Stanford (1997).
[Ha1] Hayes, P.: The Second Naive Physics Manifesto. In: Brachman, Levesque (eds.), Readings in Knowledge Representation, Morgan Kaufmann Publ. (1985) 468-485.
[Ho1] Hobbs, J.: Sketch of an Ontology Underlying the Way We Talk About the World. Int. J. Human-Computer Studies 43 (1995) 819-830.
[Mo1] Molhova, J.: The Noun: a Contrastive English-Bulgarian Study. Publ. House of the Sofia University "St. Kl. Ohridski", Sofia (1992).
[So1] Sowa, J.: Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, Reading, MA (1984).
[So2] Sowa, J.: Conceptual Graphs Summary. In: Nagle, Nagle, Gerholz, Eklund (eds.), Conceptual Structures: Current Research and Practice, Ellis Horwood (1992) 3-52.
[Te1] Tepfenhart, W.: Representing Knowledge about Substances. LNAI 754 (1992) 59-71.
[WCH] Winston, M., Chaffin, R., Herrmann, D.: A Taxonomy of Part-Whole Relations. Cognitive Science 11 (1987) 417-444.
A Logical Framework for Modeling a Discourse from the Point of View of the Agents Involved in It 1 Bernard Moulin, Professor Computer Science Department and Research Center in Geomatics Laval University, Pouliot Building, Ste Foy (QC) G1K 7P4, Canada Phone: (418) 656-5580, E-mail: [email protected]
Abstract. The way people interpret a discourse in real life goes well beyond the traditional semantic interpretation based on predicate calculus, as is currently done in approaches such as Sowa's Conceptual Graph Theory or Kamp's DRT. From a cognitive point of view, understanding a story is not a mere process of identifying truth conditions of a series of sentences, but is a construction process of building several partial models, such as a model of the environment in which the story takes place, a model of mental attitudes for each character, and a model of the verbal interactions taking place in the story. On this cognitive basis, we propose a logical framework differentiating three components in an agent's mental model: a temporal model, which simulates an agent's experience of the passing of time; the agent's memory model, which records the explicit mental attitudes the agent is aware of; and the agent's attentional model, containing the knowledge structures that the agent manipulates in its current situation.
1
Introduction
Conceptual Graphs (CG) [16] have been applied in several natural language research projects, and the CG notation can be used to represent fairly complex sentences, including several interesting linguistic phenomena such as attitude reports, anaphors, indexicals and subordinate sentences. However, most researchers have overlooked the importance of modeling the context in which sentences are uttered by locutors. Even modeling a simple sentence such as "Peter saw the girl who was playing in the park with a red ball" requires a proper representation of the context of utterance. For instance, the preterit "saw" cannot be modeled without referring to the time when the locutor uttered the sentence: hence, we need to represent the locutor and the context of utterance in addition to representing the sentence itself. To this end, we proposed to model the contents of whole discourses using an approach in which it is possible to explicitly represent the context of utterance of speech acts
1 An extended version of this paper can be found in reference [11]. This research is sponsored by the Natural Sciences and Engineering Research Council of Canada and FCAR. My apologies to the reviewers because I could not answer all their questions in such a short paper.
[8, 9, 10]. Any sentence is thought of as resulting from the action of a locutor performing a speech act, which determines the context of utterance of that sentence. A major contribution of that approach was the explicit introduction of temporal coordinate systems in the discourse representation using three kinds of constructs: the narrator's perspective, the temporal localizations and the agents' perspectives. As an example, let us consider the story displayed in Figure 1. It is told by an unidentified narrator using the past tense, which indicates that the reported events occurred in the past relative to the narrator's time, and hence to the moment when the reader reads the story. When the narrator reports the characters' words, the verb tense changes to the present or the future. These tenses are relative to the characters' temporal perspectives, which differ from the narrator's temporal perspective that is temporally located after that date. This example shows the necessity of explicitly introducing in the discourse representation the contexts of utterance of the different speech acts performed by the narrator and the characters. The complete representation of this story can be found in [11].

Monday October 20 1997, Quebec city (S1). Peter wanted to read Sowa's book (S2), but he did not have it (S3). He recalled that Mary bought it last year (S4). He phoned her (S5) and asked her (S6): "Can you lend me Sowa's book for a week?" (S7) Mary answered (S8): "Sure! (S9) Come and pick it! (S10)". John replied (S11): "Thanks! (S12) I will come tomorrow" (S13).
Figure 1: A sample story
Note: the numbers Si are used to identify the various sentences of the text.
However, even such a representation is not sufficient if we want to enable software agents to reason about the discourse content. We aim at creating a system which will be able to manipulate the mental models of the characters involved in the discourse and simulate certain mechanisms related to story understanding. We based our approach on cognitive studies which have shown that readers adopt a point of view "within the text or discourse" [2]. The Deictic Shift Theory (DST) argues that the metaphor of the reader "getting inside the story" is cognitively valid. The reader often takes a cognitive stance within the world of the narrative and interprets the text from that perspective [14]. Segal completes this view by discussing the mimesis mechanism: "A reader in a mimetic mode is led to experience the story phenomena as events happening around him or her, with people to identify with and to feel emotional about... The reader is often presented a view of the narrative world from the point of view of a character... We propose that this can occur by the reader cognitively situating him or herself in or near the mind of the character in order to interpret the text" ([15], pp. 67-68). Hence, from a cognitive point of view, understanding a story is not a mere process of identifying truth conditions of a series of sentences, but is a construction process of building several partial models, such as a model of the environment in which the story takes place, a model of mental attitudes for each character and a model of the verbal interactions taking place in the story. Hence the following assumption: When understanding a discourse, a reader creates several mental models that contain the mental attitudes (beliefs, desires, emotions, etc.) that she attributes to
each character as well as the communicative and non-communicative actions performed by those characters.
Hence, when using CGs to model the semantic content of a discourse, we need: 1) a way of representing the context of utterance of agents' speech acts; 2) the underlying temporal structure; 3) a way of representing the mental models of each character involved in the discourse. In the next sections we address point 3) and propose an approach based on a logical framework that differentiates three components in an agent's mental model: a temporal model which simulates an agent's experience of the passing of time (Section 2); the agent's memory model which records the explicit mental attitudes the agent is aware of and the attentional model containing the knowledge structures that the agent manipulates in its current situation (Section 3).
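To make the three-component mental model concrete, here is a toy data-structure sketch. All class and field names are our own illustration of the framework, not the paper's formal definitions; the memory partition by attitude kind follows Moore's "spaces" idea mentioned in Section 2.

```python
from dataclasses import dataclass, field

@dataclass
class TemporalModel:
    """Inner time: the agent's current world-position index."""
    now: int = 0

@dataclass
class MentalModel:
    temporal: TemporalModel = field(default_factory=TemporalModel)
    # Memory model: explicit mental attitudes the agent is aware of,
    # partitioned by kind (beliefs, desires, ...).
    memory: dict = field(default_factory=lambda: {"belief": set(), "desire": set()})
    # Attentional model: structures manipulated in the current situation.
    attention: set = field(default_factory=set)

# A reader's model of the character Peter from the sample story.
peter = MentalModel()
peter.memory["desire"].add("read(Sowa-book)")
peter.memory["belief"].add("bought(Mary, Sowa-book)")
peter.attention.add("read(Sowa-book)")
print("read(Sowa-book)" in peter.attention)  # True
```

One such structure would be built per character, matching the assumption that a reader maintains a separate mental model for each agent in the discourse.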
2. An Agent's Mental Model Based on Awareness
In order to find an appropriate approach to represent agents' mental models, we can consider different formalisms that have been proposed to model and reason about mental attitudes² [1], among which the so-called BDI approach [13, 5] is widely used to formalize agents' knowledge in multi-agent systems. These formalisms use a possible-worlds approach [6] for modeling the semantics of agents' attitudes. For example, in the BDI approach [13, 5], an agent's mental attitudes such as beliefs, goals and intentions are modeled as sets of accessible worlds associated with an agent and a time index, thanks to accessibility relations typical of each category of mental attitudes. However, such logical approaches are impaired by the problem of logical omniscience, according to which agents are supposed to know all the consequences of their beliefs. This ideal framework is impractical when dealing with discourses that reflect human behaviors, simply because people are not logically omniscient [7]. In addition, it is difficult to imagine a computer program that will practically and efficiently manipulate sets of possible worlds and accessibility relations. In order to overcome this theoretical problem, Fagin et al. [4] proposed to explicitly model an agent's knowledge by augmenting the possible-worlds approach with a syntactic notion of awareness, considering that an agent must be aware of a concept before being able to have beliefs about it. In a more radical approach, Moore suggested partitioning the agent's memory into different spaces, each corresponding to one kind of propositional attitude (one space for beliefs, another for desires, another for fears, etc.), "these spaces being functionally differentiated by the processes that operate on them and connect them to the agent's sensors and effectors" [7].
The approach that we propose in this paper tries to reconcile these various positions, while providing a practical framework for an agent² to manipulate knowledge extracted from a discourse. The proposed agent framework is composed of three layers: the agent's inner time model, which simulates its experience of the passing of time; the agent's memory model, which records the explicit mental attitudes the agent is aware of; and the attentional model, containing the knowledge structures that the agent manipulates in its current situation. In order to formalize the agent's inner time model, we use a first-order, branching-time logic, largely inspired by the logical language proposed in [5] and [13]. It is a first-order variant of CTL*, Emerson's Computation Tree Logic [3], extended to a possible-worlds framework [6]. In such a logic, formulae are evaluated in worlds modeled as time-trees having a single past and a branching future. A particular time index in a particular world is called a world-position. The agent's actions transform one world position into another. A primitive action is an action that is performable by the agent and uniquely determines the world position in the time tree. The branches of a time tree can be viewed as representing the choices available to the agent at each moment in time. CTL* provides all the necessary operators of a temporal logic. It is quite natural to model time using the possible-worlds approach because the future is naturally thought of as a branching structure and because the actions performed by the agent move its position within this branching structure. The agent's successive world positions correspond to the evolution of the agent's internal mental state through time as a result of its actions (reasoning, communicative and non-communicative acts). The agent does not need to be aware of all the possible futures reachable from a given world position: this is a simple way of modeling the limited knowledge of future courses of events that characterizes people.

² In the AI literature, elements such as beliefs, goals and intentions are usually called "mental states". However, we use the term "mental attitudes" to categorize those elements. An agent's mental model evolves through time and we use the term agent's mental state to characterize the current state of the agent's mental model. Hence, for us an agent's mental state is composed of several mental attitudes.
An agent's successive world positions specify a temporal path that implements the agent's experience of the passing of time: this characterizes its "inner time". This inner time must be distinguished from what we will call "calendric time", which corresponds to the official external measure of time that is available to agents and users (dates, hours, etc.).
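As an illustrative sketch (the Python structures and action strings are our own, not the authors' implementation), the branching time tree and the effect of a primitive action on a world position can be modeled as:

```python
# Illustrative sketch: a world position is a (world, time index) node in a
# time tree; a primitive action uniquely determines the next world position.
from dataclasses import dataclass, field

@dataclass
class WorldPosition:
    world: str                     # e.g. "W1"
    time_index: str                # e.g. "t1"
    successors: dict = field(default_factory=dict)  # action -> WorldPosition

def perform(pos, action):
    """Performing a primitive action moves the agent along one branch."""
    return pos.successors[action]

# Peter's move described later in the paper: creating goal P.g2
# takes him from (W1, t1) to (W2, t2).
w2t2 = WorldPosition("W2", "t2")
w1t1 = WorldPosition("W1", "t1",
                     successors={"Creates(Peter, P.g2, active)": w2t2})
```

The unexplored `successors` entries of a position model the futures the agent is not aware of: only branches the agent has recorded exist in its mental model.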
3 The Agent's Memory Model and Attentional Model
The agent's mental attitudes are recorded in what we call the agent's memory model. Following Fagin et al. [4], we consider that the definition of the various mental attitudes in terms of accessibility relations between possible worlds corresponds to the characterization of an implicit knowledge that cannot be reached directly by the agent. At each world position the agent can only use the instances of mental attitudes it is aware of. Following Moore's proposal of partitioning the agent's memory into different spaces, the awareness dimension is captured by projecting an agent's current world-position onto so-called knowledge domains. The projection of agent Ag's world-position w_t0 on the knowledge domain Attitude-D defines the agent's range of awareness relative to domain Attitude-D at time index t0 in world w. The agent's range of awareness is the subset of predicates contained in knowledge domain Attitude-D which characterize the particular instances of attitudes the agent is aware of at world-position w_t0. The
knowledge domains that we consider in this paper are the belief domain Belief-D and the goal domain Goal-D. But an agent can also use other knowledge domains, such as the emotion domain Emotion-D, which can be partitioned into sub-domains such as Fear-D, Hope-D and Regret-D. In addition to knowledge domains, which represent an agent's explicit recognition of mental attitudes, we use domains to represent an agent's explicit recognition of relevant elements in its environment, namely the situational domain Situational-D, the propositional domain Propositional-D, the calendric domain Calendric-D and the spatial domain Spatial-D. The situational domain contains the identifiers of any relevant situation that an agent can explicitly recognize in the environment. Situations are categorized into States, Processes, Events, and other sub-categories which are relevant for processing temporal information in discourse [8, 9]: these sub-categories characterize the way an agent perceives situations. A situation is specified by three elements: a propositional description found in the propositional domain Propositional-D, temporal information found in the calendric domain Calendric-D and spatial information found in the spatial domain Spatial-D. Hence, for each situation there is a corresponding proposition in Propositional-D, a temporal interval in Calendric-D and a spatial location in Spatial-D. Propositions are expressed in a predicative form which is equivalent to conceptual graphs. The elements contained in the calendric domain are time intervals which agree with a temporal topology. The elements contained in the spatial domain are points or areas which agree with a spatial topology. Figure 2 illustrates how worlds and domains are used to model agent Peter's mental attitudes obtained after reading sentences S1 to S4 in the text of Figure 1.
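A minimal sketch of this three-part specification of a situation (the Python encoding is our own illustration, not the authors' system):

```python
# Illustrative sketch: each situation identifier in Situational-D points to
# a proposition (Propositional-D), a time interval (Calendric-D) and a
# location (Spatial-D).
from dataclasses import dataclass

@dataclass(frozen=True)
class Situation:
    sid: str          # identifier in Situational-D
    category: str     # State, Process, Event, ...
    proposition: str  # key into Propositional-D
    interval: str     # element of Calendric-D
    location: str     # element of Spatial-D

propositional_d = {
    "p29": "Possess(AGNT-PERSON: Peter; OBJ-BOOK: Sowa's.Book)",
}

# Situation s1 from the Peter example
s1 = Situation("s1", "State", "p29", "[-, Now]", "Peter's.home")

def describe(s):
    """Resolve a situation's propositional description."""
    return f"{s.category} {s.sid}: {propositional_d[s.proposition]}"
```

Keeping propositions, intervals and locations in separate domain tables mirrors the paper's separation of Propositional-D, Calendric-D and Spatial-D: several situations can share one proposition while differing in time or place.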
Worlds are represented by rectangles embedding circles representing time indexes and related together by segments representing possible time paths. Ovals represent knowledge domains. Curved links represent relations between world positions and elements of domains (such as Spatial-D, Calendric-D, Peter's Belief-D, etc.) or relations between elements of different domains (such as Situational-D and Spatial-D, Calendric-D or Propositional-D). After reading sentences S1 to S3, we can assume that Peter is in a world-position represented by the left rectangle in Figure 2, at time index t1 in world W1. This world position is associated with a spatial localization Peter's.home in the domain Spatial-D and a date d1 in the domain Calendric-D. This information is not mentioned in the story but is necessary to structure Peter's knowledge. Time index t1 is related to beliefs P.b1 and P.b2 in Peter's Belief-D. P.b1 is related to situation s1 in Situational-D, which is in turn related to proposition p29, to location Peter's.home in Spatial-D and to a time interval [-, Now] in Calendric-D. Now is a variable which takes the value of the date associated with the current time index. Proposition p29 is expressed as a conceptual graph represented in a compact linear form: Possess(AGNT-PERSON: Peter; OBJ-BOOK: Sowa's.Book). Notice that in Calendric-D we symbolize the temporal topological properties using a time axis: only the dates d1 and d2 associated with time indexes t1 and t2 have been represented, as included in the time interval named October 20 1997.
Specification of propositions:
PROP(p29, Possess(AGNT-PERSON: Peter; OBJ-BOOK: Sowa's.Book))
PROP(p30, Read(AGNT-PERSON: Peter; OBJ-BOOK: Sowa's.Book))
PROP(p31, Buy(AGNT-PERSON: Mary; OBJ-BOOK: Sowa's.Book))
PROP(p32, Lend(AGNT-PERSON: Mary; PINT-PERSON: Peter; OBJ-BOOK: Sowa's.Book))
Figure 2: Worlds and Domains

From sentence S2 we know that Peter wants to read Sowa's book, which is represented by the link between time index t1 and the goal P.g1 in Peter's Goal-D. P.g1 is related to situation s4 in Situational-D, which is in turn related to proposition p30 in Propositional-D. P.g1 is related to a date in Calendric-D and a location in Spatial-D, but those links have not been represented completely in order to simplify the figure (only dotted segments indicate the existence of those links). In world W1, agent Peter can choose to move from time index t1 to various other time indexes, shown by different circles in the world rectangle in Figure 2. Moving from one time index to another is the result of performing an elementary operation. From our little story we can imagine that Peter wanted Mary to lend him the book: the corresponding elementary operation is the creation of a goal P.g2 with the status active. In Figure 2 this is represented by the large arrow linking the rectangles of worlds W1 and W2, on which appears the specification of the elementary operation Creates(Peter, P.g2, active). When this elementary operation is performed, agent Peter moves into a new world W2 at time index t2, associated with the spatial localization Peter's.home and date d2. Time index t2 is still related to beliefs P.b1 and P.b2 in Belief-D, but also to goals P.g1 and P.g2 in Peter's Goal-D. In an agent Ag1's mental model, certain domains may represent mental attitudes of another agent Ag2: they represent the mental attitudes that Ag1 attributes to Ag2. As an example, goal
P.g2 in Peter's Goal-D is related to goal mg6 in Mary's Goal-D, which is contained in Peter's mental model. Goal mg6 is associated with situation s2, which is itself related to proposition p32. Beliefs and goals are formally expressed using predicates which hold for an agent Ag, a world w and a time index t, as for example Peter's beliefs and goals at time index t2:

Peter, W2, t2 ⊨ BEL_P.b1(Peter, STATE_s1(NOT p29, [-, Now], Peter's.home))
Peter, W2, t2 ⊨ BEL_P.b2(Peter, EVENT(p31, dx, -))
Peter, W2, t2 ⊨ GOAL_P.g1(Peter, PROCESS_s4(p30, -, Quebec), active)
Peter, W2, t2 ⊨ GOAL_P.g2(Peter, GOAL_mg6(Mary, PROCESS_s2(p32, -, Quebec), active), active)

The agent's memory model gathers the elements composing the agent's successive mental states. The amount of information contained in the agent's memory model may increase considerably over time, resulting in efficiency problems. This is similar to what is observed with human beings: they record "on the fly" lots of visual, auditory and tactile information, but they usually do not consciously remember this detailed information over long periods of time. They remember information they pay attention to. Similarly, in our framework the agent's attentional model gathers some information extracted from the agent's memory model because of its importance or relevance for the agent's current activities. The attentional model is composed of a set of knowledge bases that structure the agent's knowledge and enable it to perform the appropriate reasoning, communicative and non-communicative actions. Among those knowledge bases, we consider the Belief-Space, Decision-Space, Conversational-Space and Action-Space. The Belief-Space contains a set of beliefs extracted from the memory model and a set of rules enabling the agent to reason about those beliefs. Each belief is marked by the world position that was the agent's current world position when the belief was acquired or inferred.
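Such predicates, which hold for an agent, a world and a time index, can be read as assertions indexed by a world position. A simple sketch (our own encoding, not the authors' implementation) stores them as records of the memory model:

```python
# Illustrative sketch: mental-attitude assertions "Ag, w, t |= ..." stored
# as records of the agent's memory model.
memory = []

def hold(agent, world, t, kind, att_id, content, status=None):
    """Record that an attitude holds at world position (world, t)."""
    memory.append((agent, world, t, kind, att_id, content, status))

def attitudes_at(agent, world, t, kind):
    """All attitudes of one kind holding at a given world position."""
    return [m for m in memory if m[:4] == (agent, world, t, kind)]

# Peter's attitudes at time index t2 in world W2
hold("Peter", "W2", "t2", "BEL", "P.b1",
     "STATE_s1(NOT p29, [-, Now], Peter's.home)")
hold("Peter", "W2", "t2", "GOAL", "P.g1",
     "PROCESS_s4(p30, -, Quebec)", "active")
hold("Peter", "W2", "t2", "GOAL", "P.g2",
     "GOAL_mg6(Mary, PROCESS_s2(p32, -, Quebec), active)", "active")
```

Tagging each record with its world position is what allows the attentional model, described next, to extract beliefs and goals marked by the position at which they were acquired.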
The Decision-Space contains a set of goals extracted from the memory model and a set of rules enabling the agent to reason about those goals. Each goal is marked by the world position that was the agent's current world position when the goal was acquired or inferred. The Conversational-Space models the agents' verbal interactions in terms of exchanges of mental attitudes and agents' positionings relative to these mental attitudes. In our approach a conversation is thought of as a negotiation game in which agents negotiate about the mental attitudes that they present to their interlocutors: they propose certain mental attitudes and other interlocutors react to those proposals, accepting or rejecting the proposed attitudes, asking for further information or justification, etc. [12]. The Action-Space records all the communicative, non-communicative and inference actions that are performed by the agent. Details and examples about the attentional model can be found in [11].
5 Conclusion
This paper is a contribution to the debate about the notion of context which has taken place for several years in the CG community. Whereas our temporal model [8, 9, 10] presented a static representation of the various contexts of utterance found in a
discourse, the present approach considers that the context is built up from the accumulation of knowledge in various knowledge bases (the various spaces of the attentional model) that compose the agent's mental model. The proposed logical framework provides the temporal model of discourse with semantics that can be practically implemented in an agent system. However, the comparison of this framework with other approaches to context modeling would deserve another entire paper.
References
1. Cohen P. R. & Levesque H. J. (1990), Rational Interaction as the Basis for Communication, in Cohen P. R., Morgan J. & Pollack M. E. (eds.), Intentions in Communication, MIT Press, 221-255.
2. Duchan J. F., Bruder G. A. & Hewitt L. E. (1995), Deixis in Narrative, Hillsdale: Lawrence Erlbaum Ass.
3. Emerson E. A. (1990), Temporal and modal logic, in van Leeuwen J. (ed.), Handbook of Theoretical Computer Science, North Holland, Amsterdam, NL.
4. Fagin R., Halpern J. Y., Moses Y. & Vardi M. Y. (1996), Reasoning about Knowledge, MIT Press.
5. Haddadi A. (1995), Communication and Cooperation in Agent Systems, Springer Verlag, Lecture Notes in AI n. 1056.
6. Kripke S. (1963), Semantical considerations on modal logic, Acta Philosophica Fennica, vol 16, 83-89.
7. Moore R. C. (1995), Logic and Representation, CSLI Lecture Notes, n. 39.
8. Moulin B. (1992), A conceptual graph approach for representing temporal information in discourse, Knowledge-Based Systems, vol 5, n. 3, 183-192.
9. Moulin B. (1993), The representation of linguistic information in an approach used for modelling temporal knowledge in discourses, in Mineau G. W., Moulin B. & Sowa J. F. (eds.), Conceptual Graphs for Knowledge Representation, Lecture Notes in Artificial Intelligence, Springer Verlag, 182-204.
10. Moulin B. (1997), Temporal contexts for discourse representation: an extension of the conceptual graph approach, Journal of Applied Intelligence, vol 7, n. 3, 227-255.
11. Moulin B. (1998), A logical framework for modeling a discourse from the point of view of the agents involved in it, Res. Rep. DIUL-RR 98-03, Laval Univ., 16 p.
12. Moulin B., Rousseau D. & Lapalme G. (1994), A Multi-Agent Approach for Modeling Conversations, Proc. of the International Conference on Artificial Intelligence and Natural Language, Paris, 35-50.
13. Rao A. S. & Georgeff M. P. (1991), Modeling rational agents within a BDI architecture, in Proceedings of the KR'91 Conference, Cambridge, Mass., 473-484.
14. Segal E. M. (1995a), Narrative comprehension and the role of Deictic Shift Theory, in [2], 3-17.
15. Segal E. M. (1995b), A cognitive-phenomenological theory of fictional narrative, in [2], 61-78.
16. Sowa J. F. (1984), Conceptual Structures, Reading, Mass.: Addison Wesley.
Computational Processing of Verbal Polysemy with Conceptual Structures
Karim Chibout and Anne Vilnat
Language and Cognition Group, LIMSI-CNRS, B.P. 133, 91403 Orsay cedex, France.
{chibout, vilnat}@limsi.fr
Abstract. Our work takes place in the general framework of lexico-semantic knowledge representation to be used by a Natural Language Understanding system. More specifically, we are interested in an adequate modelling of verb descriptions that allows interpreting semantic incoherence due to verbal polysemy. The main goal is to realise a module which is able to detect and deal with figurative meanings. Therefore, we first propose a lexico-semantic knowledge base; then we present the processes allowing us to determine the different meanings which may be associated with a given predicate, and to discriminate these meanings for a given sentence. Each verb is defined by a basic action (its supertype) specified by the case relations given in its definition graph, that is, the object, mean, manner, goal and/or result relations which distinguish the described verb meaning from the specified basic action. This description is recursive: the basic actions are in turn defined using more general actions. To help interpret the different meanings conveyed by a verb and its hyponyms, we have determined three major types of heuristics, consisting in searching the type lattice and/or examining the associated definitions.
1 Introduction
Our work takes place in the general framework of lexico-semantic knowledge representation to be used by a Natural Language Understanding system. More specifically, we are interested in an adequate modelling of verb descriptions that allows interpreting semantic incoherence due to verbal polysemy. The main goal is to realise a module which is able to detect and deal with figurative meanings. We studied two complementary directions to model polysemy: (a) lexico-semantic knowledge representation, (b) processes to interpret the different meanings which may be associated with a given predicate, and to discriminate these meanings for a given sentence. After an outline of some metaphor examples that justified our approach, we will present the links between verbs within a lexical network and the semantic structure associated with each of them. We will finally define the model of polysemy propounded from this representation formalism, in particular the selection rules within the network used to process figurative meanings of verbs.
2 Polysemy, metaphors and figurative meanings
Polysemy and homonymy must be clearly distinguished: two homonyms only share the same orthography, whereas two polysemes share semantic elements. In the case of homonymy, the solution is to create as many concepts as necessary. In the case of polysemy, on the other hand, the different senses of a word are mostly figurative meanings and derive from its core meaning. Thus, it is necessary to be able to determine this core meaning, and the derivation rules implied in the elaboration of these senses. The following examples illustrate this phenomenon. Lexical metaphors are created by selecting specific semantic features of words. In the following example, both nouns share the /spherical/ feature.
1) La terre est une orange. (Earth is an orange.)¹
The characteristic feature used in the comparison may also be taken in a figurative meaning, and the phrase is twice non-literal.
2) Sally is a block of ice.
The feature /cold/ of the ice (literal meaning) is assimilated to the feature /cold/ of human character (figurative meaning). This characteristic may be true or assumed (in the beliefs):
3) John is a gorilla, to signify a man with violent, brutal manners; whereas the ethologist will insist on the peaceful manners of this animal.
The same metaphor may convey multiple meanings depending on the context, by selecting different semantic features. In 3), gorilla may also mean /hairy/ or even /(physically) strong/. Figurative interpretation consists in selecting one of the semantic characteristics among those which define the metaphoric term. When multiple meanings are possible, they are semantically related as they come from a unique representation. Polysemy is also considered as an in-context (re-)building process from a unique representation. We pay most attention to the resolution of semantic incoherence due to verbal polysemy. Analysing multiple meanings of French verbs allows us to specify the semantic representations from which they are elaborated.
4) Les vagues couraient jusqu'aux rochers. (The waves run towards the rocks.)
The verbal metaphor in 4) may be interpreted by replacing to run with to move quickly, both of which are implied by the semantic description of this verb. Roughly speaking, the semantic features associated with to run are: /to move/ + /with feet/ + /quickly/ ... The hyperonym is not the only part of the meaning which is used to build a figurative meaning.
5) La robe étrangle la taille de la jeune fille. (The dress constricts the young girl's waist, with the French verb étrangler, which literally means to strangle, to translate the notion conveyed by to constrict.)
In this example to strangle is defined as "to suffocate by grasping the neck"; the semantic incoherence cannot be resolved by the hyperonym but by the selection of the feature /grasp/ (synonym of to constrict), which specifies the method used to do the action. To find

¹ In this paper, the examples illustrating polysemy will generally be given in French, with a translation in English, to maintain their polysemous nature, which would probably be lost in English. Not being native English speakers, it was difficult for us to always find analogous examples and to be sure that they convey the same senses.
the different meanings of a predicate, which are always semantically related, it is necessary to select its hyperonym, or to extract another part of its definition, or to combine both mechanisms (see for example the interpretation of to run in 4). These polysemous behaviours lead us to propose a hierarchical representation of the concepts, completed for each concept by a precise semantic description allowing us to express the semantic features that differentiate it from its father (direct hyperonym) and from its brothers in the hierarchy. This representation is implemented in the conceptual graph formalism ?: the hierarchy takes place in the type lattice, and the semantic descriptions are translated into type definitions.

3 Ontology, definitions and conceptual graphs
From the preceding study, it is clear that the best suited tool to build the ontology is a dictionary, from which the different meaning components can be obtained. The entries constitute rather precise descriptions, generally including the hyperonym and referring to verbs whose meanings are related to the one defined. The different relations (such as mean, manner, goal, ...) that specify the verb with respect to its hyperonym are also given. Yet, the definitions in a dictionary are not always homogeneous concerning their structure and their content ?, ?. Thus it is impossible to rely only on a dictionary to build our network. The method used to organise verbal concepts has been divided into two steps. First, we analysed about a hundred French verbs. These concepts have been categorised in detail following precise criteria, such as a systematic definition of the kind of relations between the verb and its semantic features. The representation of each lexical item is called a conceptual schema, and corresponds to an enhanced dictionary definition. Each verbal concept is given in terms of its nearest hyperonym (the central event) and some semantic cases which specify it. In addition to the classical case relations (agent, object, mean, ...), four cases are essential for a complete verb description: the manner in which an event is realised, the method used to realise it, the result of the event and its intrinsic goal. For example, the verbs to cut and to cover have the following semantic descriptions:
to cut: to divide (nearest hyperonym) a solid object (object) into several pieces (result) using an edge tool (mean) by going through the object (method)
to cover: to place (nearest hyperonym) something (object) over something else (support) in order to hide it (goal1) or protect it (goal2).
The verb hierarchy is organised following the case relations with which the verbs are associated.
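Assuming a plain dictionary encoding (our own, for illustration only), the two conceptual schemas above become:

```python
# Illustrative encoding of conceptual schemas: the nearest hyperonym plus
# the case relations that specify the verb.
to_cut = {
    "hyperonym": "to divide",
    "cases": {
        "object": "solid object",
        "result": "several pieces",
        "mean": "edge tool",
        "method": "going through the object",
    },
}

to_cover = {
    "hyperonym": "to place",
    "cases": {
        "object": "something",
        "support": "something else",
        "goal": ["to hide", "to protect"],  # a case with multiple values
    },
}
```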
A verbal concept is the hyperonym (respectively hyponym) of another one if they share a common hyperonym and if the case structure of this verb presents one of the following features: (a) lack (respectively presence) of a value defined for a given case; for example, to divide is the hyperonym of to cut, because the definition of to cut includes a mean and a method; (b) presence of a case with multiple values (respectively a unique value); for example, to cover is the hyperonym of to plate and to veil, which each include only one of the possible goals (respectively to protect and to hide); (c) presence of a case with a generic value (respectively a specific value); for example, to decapitate is the hyperonym of to guillotine: the mean case of the first
(edge tool) is a sub-type of that of the second (guillotine). This bottom-up building of the hierarchy allows us to define some large semantic classes. Thus, we determined about fifteen primitives appearing to be similar to the classical case relations, and mostly corresponding to state verbs: Owner (ex: to own), Location (ex: to live in), Container (ex: to contain), Support (ex: to support), Patient (ex: to suffer), Time (ex: to exist), Experiencer (ex: to know), ... At the
Fig. 1. Extract of the hierarchy (a) and description of to feed: (b) canonical graph and (c) type definition graph
higher level of the hierarchy, the process verbs and the action verbs derive from these state primitives. So processes are expressed as Devenir (become)/Cesser (cease) + primitive, and action verbs as FaireDevenir/FaireCesser + primitive:
- DevenirContenant (to fill, to eat, to drink), DevenirTemps (to be born), DevenirExpérienceur (to learn)
- CesserContenant (to (become) empty), CesserTemps (to die), CesserExpérienceur (to forget)
- FaireDevenirContenant (to fill, to water), FaireDevenirTemps (to create, to give birth to, to calve), FaireDevenirExpérienceur (to teach)
- FaireCesserContenant (to empty), FaireCesserTemps (to interrupt, to kill)
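This compositional naming scheme (operator + state primitive) can be sketched directly; the tables below are a small illustrative excerpt, not the full inventory of primitives:

```python
# Illustrative sketch: process and action classes are composed from state
# primitives with the Devenir/Cesser/FaireDevenir/FaireCesser operators.
OPERATORS = {"Devenir", "Cesser", "FaireDevenir", "FaireCesser"}
PRIMITIVES = {"Contenant", "Temps", "Experienceur", "Espace"}

def compose(operator, primitive):
    """Name of a process/action class built from a state primitive."""
    if operator not in OPERATORS or primitive not in PRIMITIVES:
        raise ValueError("unknown operator or primitive")
    return operator + primitive

classes = {
    compose("Devenir", "Contenant"): ["to fill", "to eat", "to drink"],
    compose("FaireCesser", "Temps"): ["to interrupt", "to kill"],
    compose("FaireDevenir", "Experienceur"): ["to teach"],
}
```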
Our study has concerned about 2000 French verbs. Thus we have constituted a large lexico-semantic network corresponding to more than 1000 verbal concepts,
organised inside a hierarchy (see Fig. 1). For each node, the description consists in: a sub-categorisation frame, a case structure (represented in a canonical graph) and a definition (represented in a definition graph) describing how it specifies its hyperonym (i.e. its super-type in the hierarchy) (see Fig. 1). The association word/concept is defined in a global table. These descriptions have allowed us to build the lexico-semantic knowledge base used by the Natural Language Processing system developed at LIMSI ?. The definition graphs have been built following the method we describe in the first part of this section. Thus we make sure that the definitions are not ad-hoc definitions following a top-down building of the hierarchy, but respect the meanings of the words they represent. However, at this time we cannot verify that the constraints due to the inheritance mechanism are satisfied: this constitutes the following step of our work.
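Ordering criterion (a) from Section 3 (a verb lacking case values defined for a more specific verb is its hyperonym) can be sketched as follows; the schema encoding is our own illustration:

```python
# Illustrative sketch of ordering criterion (a): v1 is a hyperonym of v2
# if v2's case structure adds cases that v1's lacks.
def hyperonym_of(v1_cases, v2_cases):
    """True if v1's cases are a strict subset of v2's (v1 is more general)."""
    return set(v1_cases) < set(v2_cases)

to_divide = {"object": "solid object"}
to_cut = {"object": "solid object", "result": "several pieces",
          "mean": "edge tool", "method": "going through the object"}

# to divide is a hyperonym of to cut: to cut adds mean and method cases
print(hyperonym_of(to_divide, to_cut))   # True
print(hyperonym_of(to_cut, to_divide))   # False
```

Criteria (b) and (c) would additionally compare the number of values of a shared case and the generic/specific ordering of its values in the type lattice.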
Verbal
polysemy
: elements
of interpretation
As we said above, we built the hierarchy presented in the previous section to be able to interpret the lexical polysemy of French verbs. We assume that there exists a (proto)typical meaning associated with each verb. The prototype is the most often attested meaning. The so-called figurative meanings derive from this prototypical meaning. These figurative meanings are related to the semantic structure of the verb by taking into account either the nearest hyperonym (as in example 4) or other parts of the conceptual schema (as in example 5). But other processes are necessary to interpret metaphoric senses, based on primitive substitutions. Let us consider the interpretation of the following example:
6) La fermière nourrit le feu avec des branchages. (The farmer's wife feeds the fire with lopped branches (to keep the fire going).)
The constraints expressed in the canonical graph of to feed are violated. But the definition graph (see Fig. 1) includes the particular meaning of this word (in example 6): it is given in the sub-graph corresponding to to maintain (in French entretenir, which means to keep going in this context). The semantic constraints expressed in the canonical graph of to maintain are then fulfilled (particularly for the type of the object fire). Selecting parts of the semantic representation of lexical items is not the only process used to build figurative meanings; nevertheless, as we show just below, it is a necessary condition to be able to elaborate these meanings. Some figurative meanings rely on conceptual metaphors ?. Thus, nourrir quelqu'un de ragots (to feed someone with ill-natured gossip), abreuver quelqu'un de connaissances (to swamp someone with knowledge) may be interpreted from the metaphor Mind is Container. Such conceptual metaphors are interesting because they are rather general: they apply not only to a given verb, but to a whole class of verbs.
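The resolution of example 6 can be sketched as a constraint check followed by a search in the definition graph; the data here are a hypothetical miniature, not the actual knowledge base:

```python
# Illustrative sketch: if a verb's canonical-graph constraints are violated,
# search the sub-graphs of its definition for a concept whose constraints
# the arguments do satisfy (nourrir -> entretenir for an object of type fire).
canonical = {                      # verb -> admissible object types
    "nourrir":    {"animate"},
    "entretenir": {"animate", "fire"},
}
definition_subgraphs = {"nourrir": ["entretenir"]}

def interpret(verb, object_type):
    if object_type in canonical[verb]:
        return verb                         # literal reading
    for sub in definition_subgraphs.get(verb, []):
        if object_type in canonical[sub]:
            return sub                      # figurative reading
    return None

print(interpret("nourrir", "fire"))     # entretenir
print(interpret("nourrir", "animate"))  # nourrir
```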
Thus, gaver de connaissances (to fill one's mind with knowledge), se nourrir de lectures (to improve the mind with reading), dévorer un livre (to devour a book) will be interpreted using the metaphor Mind is Container. These interpretation processes do not invalidate our model of polysemy. The semantic representation
we propose is based on a hierarchy; thus it is possible to go up to the semantic primitives on which verbal items depend. The extracts of the type lattice in Fig. 2 clarify the way the interpretation is done. Semantic primitives play a central role in interpreting figurative meanings due to conceptual metaphors. Thus, all the metaphoric uses, such as abreuver quelqu'un de connaissances (to swamp someone with knowledge), for which the analogy Mind is Container is relevant, will be interpreted by substituting the primitive EXPERIENCER for the primitive CONTAINER. From the metaphoric verb, the semantic class corresponding to the primitive is reached. Then the substitution rule (Container -> Experiencer) is applied, and the pertinent concept (FaireApprendre for Abreuver (to water) and Lire (to read) for Dévorer (to devour)) is searched for in this branch using canonical graphs. The searched concept must be described by a canonical graph expressing constraints that are satisfied by the arguments of the metaphoric sentence (in our examples, knowledge and book).

Fig. 2. Interpretation examples

The most difficult point is due to the fact that canonical graphs do not always discriminate the pertinent concept from the others belonging to the same branch (for example FaireApprendre, Inculquer (to inculcate), Enseigner (to teach), etc. have identical canonical graphs). This technical difficulty does not invalidate the model proposed for conventional metaphoric senses. Conceptual metaphors are not applied casually; it is the fact that a verbal item is related to a given primitive in its semantic structure which justifies the use of this or that metaphor. There is no "spontaneous generation" of meaning: verbs contain in their conceptual schemas the entry points towards other meanings. The following examples illustrate the classic SPACE -> TIME metaphor:
7) couper la parole (to interrupt someone): interrompre (to interrupt)
8) briser une conversation (to break off a conversation): interrompre brusquement (to suddenly interrupt)
9) entrecouper ses phrases de sanglots (to interrupt one's phrases with sobs): interrompre fréquemment (to frequently interrupt)
All the verbs couper, briser, entrecouper depend on the class FaireCesserEspace (expressing spatial discontinuity). Interpreting these metaphors consists in substituting the class FaireCesserTemps (temporal discontinuity) for this class. For the last two examples, the adverbs attached to the interpretations belong to the definitions of the verbs (briser: to break suddenly ...; entrecouper: to cut frequently ...). About ten substitution rules of this kind have been determined. They are probably not exhaustive, but have nevertheless been tested on 1000 verbs. Moreover, the fact that common meanings are shared by a class of verbs (regrouped because they belong to the same branch) partially validates our ontology. Such figurative meaning interpretations are rather poor. The following example illustrates this point.
10) Le paysan coupait souvent par le champ de blé. (The farmer often cuts through the wheatfield.)
Polysemy is solved by selecting the case method, i.e. to go through, associated with cut in its definition graph (see Sect. ??). But replacing to cut by to go through loses information: the implicit notion of spatial reduction (or more exactly moving distance reduction) conveyed by to cut in this sentence is lost. A more precise interpretation would need to determine which meaning parts of the metaphoric verb have to be transferred to the inferred meaning.
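Two of these substitution rules can be sketched as a mapping between primitive classes; the tables below are illustrative, not the authors' actual rule set:

```python
# Illustrative sketch: substitution rules map the metaphoric verb's
# primitive class to the class in which the pertinent concept is searched.
SUBSTITUTIONS = {
    "FaireDevenirContenant": "FaireDevenirExperienceur",  # Mind is Container
    "FaireCesserEspace":     "FaireCesserTemps",          # SPACE -> TIME
}

verb_class = {"abreuver": "FaireDevenirContenant",
              "couper":   "FaireCesserEspace"}

def target_class(verb):
    """Class in which to search for the figurative interpretation."""
    cls = verb_class[verb]
    return SUBSTITUTIONS.get(cls, cls)

print(target_class("couper"))    # FaireCesserTemps
print(target_class("abreuver"))  # FaireDevenirExperienceur
```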
5 Related Work
Lexical classifications based on semantic criteria are relatively numerous in artificial intelligence. Semantic nets are the main lexical knowledge representation modes, but they essentially concern nouns. However, Levin 3 proposes a summary classification of English verbs founded on syntactico-semantic criteria. The hierarchy lacks accuracy: Levin identifies main classes (occasionally sub-classes) in which verbs are listed, but no semantic relations are defined between verbs within a same class (or sub-class). Verbs within a same class are assimilated to syntactic synonyms; therefore, from a semantic point of view, these verbs may convey rather distant meanings. Some authors try to build precise semantic hierarchies of English verbs on a large scale 5. They emphasise the complexity of this task, notably due to the different semantic fields implied in relations between verbs having connected meanings. Miller and his collaborators determine a particularisation relation allowing the grouping of different semantic components that distinguish a verb from its hyperonym. This relation between two verbs V1 and V2 (V1 hyponym of V2) is named troponymy and is expressed by the formula "to accomplish V1 is to accomplish V2 in a particular manner". For example, battle, war, tourney, duel, ... are troponyms of the verbal predicate fight. Troponyms of communication verbs imply the speaker's intention or his motivation to communicate, as in examine, confess, preach, ..., or the communication
media used: fax, email, phone, telex, .... Relying on these principles, they implemented a large lexical network (Wordnet) that organises verbs and other lexical categories in terms of the signified. Our work is in part inspired by their approach, but we systematically specify the kind of relation between troponym concepts (case relations).
6 Conclusion
We have presented linguistic and computational aspects of verbal polysemy. Figurative meanings of a predicate represent as many shifts in meaning from a unique semantic structure that defines this verb. The resolution processes consist in a simple selection of the pertinent elements of this representation (i.e. the pertinent subgraph of the definition graph) or in the same selection plus an inference from one of these elements (conceptual metaphors). Our knowledge base has been implemented in conceptual graphs but still needs to be verified concerning possible incoherences due to the inheritance mechanism. The psychological validity of the representation and of the treatment we propose is also being tested in an experiment. We don't claim that our work is exhaustive, either in the semantic representation proposed or in the understanding processes for the different senses. We define a general framework in which the links between the multiple senses of a verb may be expressed. Polysemy is one of the major characteristics of natural language; it is also one of the most complex to apprehend. Moreover, like the other contextual phenomena (anaphora, the implicit, etc.), polysemy is one of the major difficulties in Natural Language Processing.
References
1. Chibout, K., Masson, N.: Un réseau lexico-sémantique de verbes construit à partir du dictionnaire pour le traitement informatique du français. Actes du colloque LTT AUPELF-UREF Lexicomatique et Dictionnairique, Lyon, Septembre 1995.
2. Lakoff, G., Johnson, M.: Les métaphores dans la vie quotidienne. Collection "Propositions", les éditions de Minuit (1985).
3. Levin, B.: English Verb Classes and Alternations. University of Chicago Press (1993).
4. Martin, R.: Pour une logique du sens. Linguistique nouvelle. Presses Universitaires de France (1983).
5. Miller, G. A., Fellbaum, C., Gross, D.: WORDNET: a Lexical Database Organised on Psycholinguistic Principles. In Zernik (Ed.): Proceedings of the First International Lexical Acquisition Workshop, I.J.C.A.I., Détroit (1989).
6. Searle, J. R.: Metaphor. In Andrew Ortony (Ed.): Metaphor and Thought. Cambridge University Press (1979), pp. 284-324.
7. Sowa, J.: Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, Reading, Massachusetts (1984).
8. Vapillon, J., Briffault, X., Sabah, G., Chibout, K.: An object oriented linguistic engineering environment using LFG and CG. ACL/EACL Workshop: Computational Environments for Grammar Development and Linguistic Engineering, Madrid (1997).
Word Graphs: The Second Set

C. Hoede 1 and X. Liu 2

1 University of Twente, Faculty of Mathematical Sciences, P.O. Box 217, 7500 AE Enschede, The Netherlands
c.hoede@math.utwente.nl
2 Department of Applied Mathematics, Northwestern Polytechnical University, 710072 Xi'an, P.R. China
Abstract. In continuation of the paper of Hoede and Li on word graphs for a set of prepositions, word graphs are given for adjectives, adverbs and Chinese classifier words. It is argued that these three classes of words belong to a general class of words that may be called adwords. These words express the fact that certain graphs may be brought into connection with graphs that describe the important classes of nouns and verbs. Some subclasses of adwords are discussed as well as some subclasses of Chinese classifier words.
Key words: Knowledge graphs, word graphs, adjectives, adverbs, classifiers. AMS Subject Classifications: 05C99, 68F99.
1 Introduction
We refer to the paper of Hoede and Li 1 for an introduction to knowledge graphs as far as needed for this paper. We only recall the following. Words are considered to be representable by directed labeled graphs. The vertices, or tokens, are indicated by squares and represent somethings. The arcs have certain types that are considered to represent the relationship between somethings as recognizable by the mind. The graphs that we will discuss are therefore considered to be subgraphs of a huge mind graph, representing the knowledge of a mind and therefore also called knowledge graph. These knowledge graphs are very similar to conceptual graphs, but are restricted as far as the number of types of relationship is concerned. There are two types of relationships. The binary relationships, the usual arcs, may have the following labels:
EQU : Identity
SUB : Inclusional part-ofness
ALI : Alikeness
DIS : Disparateness
CAU : Causality
ORD : Ordering
PAR : Attribution
SKO : Informational dependency.

The SKO-relationship is used as a loop to represent universal quantification. Next to the binary relationships there are the n-ary frame-relations. There are four of these:

FPAR : Relationship of constituting elements with a concept, being a subgraph of the mind graph.
NEGPAR : Negation of a certain subgraph.
POSPAR : Possibility of a certain subgraph.
NECPAR : Necessity of a certain subgraph.
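As an illustration, the two inventories can be encoded directly. The following minimal Python sketch is our own encoding, not the paper's; the class and token names are assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum

# The eight binary relationship types and the four frame relationships
# listed in the text, encoded as enums (the encoding is ours).
class Arc(Enum):
    EQU = "identity"
    SUB = "inclusional part-ofness"
    ALI = "alikeness"
    DIS = "disparateness"
    CAU = "causality"
    ORD = "ordering"
    PAR = "attribution"
    SKO = "informational dependency"

class Frame(Enum):
    FPAR = "constituting elements of a concept"
    NEGPAR = "negation of a subgraph"
    POSPAR = "possibility of a subgraph"
    NECPAR = "necessity of a subgraph"

@dataclass
class KnowledgeGraph:
    vertices: set = field(default_factory=set)  # tokens and words
    arcs: set = field(default_factory=set)      # (source, Arc, target)

    def add_arc(self, src, arc, dst):
        self.vertices |= {src, dst}
        self.arcs.add((src, arc, dst))

# "Something like a stone": the word STONE types an anonymous token t1
# via a directed ALI-arc from the word to the token.
g = KnowledgeGraph()
g.add_arc("STONE", Arc.ALI, "t1")
```

A subgraph framed with one of the `Frame` relationships would be represented by a new token standing for the framed subgraph; that step is not modelled here.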
These four frame relationships generalize the well-known logical operators. If a certain subgraph of the mind graph is the representation of a well-formed proposition p, this proposition is represented by the frame; ¬p is represented by the same subgraph framed with the NEGPAR relationship, and the modal propositions ◊p and □p are represented by the same subgraph framed with the POSPAR and the NECPAR relationship respectively. In this way logical systems can be represented by different types of frames of very specific subgraphs. We refer to Van den Berg 2 for a knowledge graph treatment of logical systems. So logic is described by frames of propositions. If a subgraph of the mind graph does not correspond to a proposition, the framing, and the representation of the frame by a token, may still take place. Any such frame may be baptized, i.e. labeled with a word. The directed ALI-relationship is used between a word and the token to type the token. Thus

□ <-ALI- STONE

is to be read as "something like a stone". Note that the token may represent a large subgraph of the mind graph. In particular verbs may have large frame contents. Verbs are represented in the same way. So

□ <-ALI- HIT
is the way the verb HIT is represented. The directed EQU-relationship is used between a word and a token to valuate or instantiate the token. So

PLUTO -EQU-> □ <-ALI- DOG
is to be read as "something like a dog equal to Pluto". The mind graph is considered to be a wordless representation of thought relationships between units of perceptions. The words come in when certain subgraphs are "framed and named". At the most elementary level
the frame contents may just be one relationship. These are the first word graphs to start with. It turned out that prepositions have such very simple structures, and for that reason they formed the first set of word graphs. The frame contents of frames representing nouns and verbs express the definitions of the concepts (note that frames do literally take other concepts together). A lexicon of word graphs is being constructed at the University of Twente. In order to make the theme of this paper, adwords, clear we recall the preposition OF. There were three word graphs for OF, given in 1 in total graph form, where arcs are also indicated by vertices with the type of relationship as label.
Suppose we have the word combination RED BALL. The word BALL can be represented, according to our formalism, by

□ <-ALI- BALL
Now RED is a word attributed to the word ball in the sense that its word graph is linked to the word graph of BALL. For RED we would take the following word graph into the lexicon:
RED -EQU-> □ <-ALI- COLOUR
analogous to the graph for PLUTO. Now we say "red is the colour of the ball" and the word graphs are to be linked in a way that we use the word
OF for. As we see colour as an exterior attribute of ball we choose the PAR-relationship to represent RED BALL by

□ <-ALI- BALL
|
| PAR
v
RED -EQU-> □ <-ALI- COLOUR.
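The RED BALL graph just described can be written down as data. The encoding below is our own (the token names `t_ball` and `t_colour` and the query function are assumptions for the example, not the paper's notation).

```python
# The RED BALL knowledge graph as a set of typed arcs. Arcs run from a
# word to the token it types (ALI) or instantiates (EQU); the PAR-arc
# attributes the colour token to the ball token from the outside.
red_ball = {
    ("BALL",   "ALI", "t_ball"),    # something like a ball ...
    ("t_ball", "PAR", "t_colour"),  # ... with an exterior attribute ...
    ("COLOUR", "ALI", "t_colour"),  # ... which is something like a colour
    ("RED",    "EQU", "t_colour"),  # ... instantiated as red
}

def words_attributed_to(token, graph):
    """Words reachable as exterior attributes (via PAR) of a token."""
    attrs = []
    for src, rel, dst in graph:
        if src == token and rel == "PAR":
            attrs += [w for w, r, t in graph if t == dst and r in ("ALI", "EQU")]
    return sorted(attrs)

print(words_attributed_to("t_ball", red_ball))  # ['COLOUR', 'RED']
```

Reading the PAR-arc back out yields exactly the utterance "the ball has colour red" discussed in the text.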
It is in this way that RED is a word linked to the word BALL. It is usually called an adjective. Note that without the word RED we would still have COLOURED BALL. Also note that we do not say RED COLOUR but do say THE COLOUR RED. It should be clarifying that HAVE was seen as BE WITH, BE being represented by the empty frame and WITH having its own word graph. So the graph might also be brought under words by "the ball has colour red". Note that people may differ in opinion on expressing RED BALL. We have just given our view. Quite a few adjectives are similar to RED and can be seen as adwords linked by the PAR-relationship. However, there are other ways how word graphs can be linked to the word graph of a noun and a verb. One particular way is by the FPAR-relationship that links the constituents of a definition to the defined concept. Suppose, for example, that a stone is defined as "a structure of molecules" (which would not be precise enough, but lexica contain scores of imprecise definitions). If the type of molecules is denoted in the graph, by means of an EQU-arc, as silicon, we may speak of a SILICON STONE and SILICON is now functioning as an adjective. This type of adjective is of another nature than the adjective RED, the difference being expressed by the way of linking, one time by a PAR-relationship and the other time by an FPAR-relationship. The essential linguistic phenomenon is that word graphs are linked to other word graphs. As this can be done in various ways, to nouns and to verbs, it is more natural to speak of adwords. We will discuss adjectives and adverbs from this view point. The interesting phenomenon of classifiers in Chinese is closely related to our way of viewing adwords. In Chinese it is not well-spoken to say "a spear". One should say "a stick spear", where stick is a word expressing a classifying aspect of spear.
A HORSE is not YI MA but should be expressed as YI PI MA, where PI is the classifier, the meaning of which seems to have been lost in the course of time. Yet the adword PI should be expressed in proper speaking. There are more than 400 of these classifiers. We will discuss several of them from the same view point as before and give word graphs for them.
In the word graph project we plan to start with studying structural parsing as soon as sufficient word graphs are available. This will take a third set of word graphs, containing the remaining word types.

2 Adwords
In this paper we cannot give an extensive treatment of the grammatical aspects of adjectives, adverbs and classifiers. We will restrict ourselves to some major subclasses of these adwords. The book of Quirk, Greenbaum, Leech and Svartvik 3 was used for reference, more specifically Chapter 5. Our examples are chosen from this book. We will not stress syntactic problems. A certain knowledge graph is brought under words by expressing by words certain of its subgraphs. The way this is done differs from language to language. In an extremely simple example we have that RED BALL in English is uttered as BALLON ROUGE in French. Any knowledge graph admits an utterance path. Usually there are several utterance paths, i.e. ways of uttering the words, having word graphs that cover the knowledge graph, in a linear order. As our example RED BALL or BALL WITH COLOUR RED shows, there are ways of bringing a knowledge graph under words that are more precise than others. In natural language often the less precise descriptions, like RED BALL, are used. That RED is a colour and that colour can be attributed to a BALL is background knowledge for these concepts RED and BALL that enables the short but incomplete utterance path. A lexicon may contain a word graph for RED that includes the colour concept, but a word graph for BALL that does not mention the possibility of linking to another concept by means of a PAR-relationship. A machine may then not be able to create a connected knowledge graph for RED BALL, unless it is instructed to interpret the syntactical fact that both words are uttered together as justifying the linking of their word graphs by some arc. For a more elaborate discussion of the interplay of semantics and syntax we refer to the thesis of Willems 4.
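The machine-linking step described in this paragraph can be sketched as follows. The lexicon entries mirror the RED BALL example from the text; the function `link`, the token names and the choice of a PAR-arc as default are our own hypothetical simplifications.

```python
# Minimal model of linking two lexicon word graphs when their words are
# uttered together: take the union of the graphs and add one arc from
# the head's token to the adword's token (here PAR, as for RED BALL).
LEXICON = {
    "BALL": {("BALL", "ALI", "t_ball")},
    "RED":  {("RED", "EQU", "t_red"), ("COLOUR", "ALI", "t_red")},
}

def link(adword, head, arc="PAR"):
    graph = LEXICON[head] | LEXICON[adword]
    # The head token is the one typed by the head word via ALI.
    head_tok = next(t for w, r, t in LEXICON[head] if w == head and r == "ALI")
    # The adword token is the one its word instantiates or types.
    ad_tok = next(t for w, r, t in LEXICON[adword] if r in ("EQU", "ALI"))
    graph.add((head_tok, arc, ad_tok))
    return graph

g = link("RED", "BALL")
```

The resulting graph is connected, which is exactly what the isolated lexicon entries for RED and BALL could not guarantee.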
2.1 Adjectives
As word graphs are supposed to grasp the semantics of a word, according to the slogan "the structure is the meaning", we focus on the Paragraphs 5.37 to 5.41 in Quirk et al., in which they give a semantic subclassification of adjectives.
They make the distinctions stative/dynamic, gradable/non-gradable and inherent/non-inherent. Their Table 5:2 looks as follows:

                    stative   gradable   inherent
BLACK (coat)           +          +          +
BRAVE (man)            -          +          +
BRITISH (citizen)      +          -          +
NEW (friend)           +          +          -
Adjectives are characteristically stative, most are gradable and most are inherent. The normal adjective type is all three, like BLACK. Quirk et al. give the imperative as a way to distinguish stative from dynamic adjectives. One can say BE CAREFUL but not BE TALL or BE BLACK, BE BRITISH respectively BE NEW. One can say BE BRAVE, explaining the minus sign in the first column. BLACKER, BRAVER and NEWER are gradings but BRITISHER is not possible, explaining the minus sign in the second column. Inherent adjectives characterize the referent of the noun directly. They consider BLACK, TRUE and BRITISH to be inherent adjectives. Before undertaking our own discussions we should reproduce their premodification examples on page 925, in which combinations of adwords are mentioned.

deter-  general      age  colour  participle    prove-   noun    denom-  head
miners                                          nance            inal
THE     HECTIC                                                   SOCIAL  LIFE
THE     EXTRAVAGANT                             LONDON           SOCIAL  LIFE
A                                CRUMBLING               CHURCH          TOWER
A                        GREY    CRUMBLING      GOTHIC   CHURCH          TOWER
SOME    INTRICATE    OLD         INTERLOCKING   CHINESE                  DESIGNS
A       SMALL            GREEN   CARVED         CHINESE  JADE            IDOL
HIS     HEAVY        NEW                                         MORAL   RESPONSIBILITIES
It is clear that proposals have been made for what Quirk et al. call semantic sets. We will do exactly that, but the basis of our proposal will be the types of relationships a noun, or verb, can have with other words.

The FPAR-adwords. We use the FPAR-relationship to represent the definitional contents of a concept. Any word used in the definition might
be called an FPAR-adword. However, usually some restrictions are made. If the definition contains a preposition like OF, this word is not considered an adword. Another remark that should be made is that a definition may contain the concept of colour. In that case the adjective, say GREY, is considered to be inherent. The definition of an elephant may contain the statement that it is a grey animal. In BLACK COAT, however, the adjective cannot be considered to be inherent, like in the table given. The point is that colour is attributed subjectively; a red ball in green light looks black. Even in white light objects may have different colours for colour blind persons. For this reason we used the PAR-relationship and consider RED to be a PAR-adword and colour to be non-inherent. Similarly BRAVE and BRITISH are disputable as inherent adjectives, for different reasons. Braveness is present according to a judgement. One is considered to be brave by others. BRAVE can be seen as an instantiation of judgement. Other adjectives of this type are KIND and UGLY. BRITISH does not describe an inherent aspect either. Anything that is part of Britain, seen as a frame, can be called BRITISH. In this way the frame name determines the adjective for its constituents. BRITISH may therefore be called an inverse FPAR-adword. Examples of this type of adjectives are LAMB in LAMB MEAT and CITY in CITY COUNCIL or CHURCH in CHURCH TOWER.
We conclude that the restriction inherent/non-inherent had better be replaced by the distinction FPAR/NON-FPAR. For material objects a typical FPAR-adword expresses the sort of material. So in JADE IDOL, JADE describes the material and is a typical FPAR-adword. If STEAM is HOT WATER VAPOUR then HOT is an FPAR-adword, as is WATER, instantiating relative temperature and material of the vapour. Note that we speak of relative temperature as HOT is not a temperature, like FAST is not a velocity. Within a frame concepts may occur that allow a measure, like temperature or length. In those cases the corresponding FPAR-adwords are gradable, like in hot, hotter or long, longer. Usually these adwords do not indicate absolute temperature or length but relative temperature and relative length. In "a two meter man", the precise length TWO METER may be interpreted as an FPAR-adword too.
The PAR-adwords. We use the PAR-relationship to represent exterior attribution. Judgements on a concept are typical exterior attributions.
We already classified BRAVE as a PAR-adword. BEAUTIFUL is another good example. An important class of words concerns space and time aspects. A ball may exist at some location at some moment in time. Its space-time coordinates will be seen as determined from the outside, i.e. by exterior attribution, and therefore represented by a PAR-relationship. Adjectives like EARLY and SUDDEN belong to this class and also OLD and NEW. The link to the main concept, or head as Quirk et al. call it, is by a PAR-link that connects the time aspect to the concept in the following way

□ <-ALI- SOMETHING
|
| PAR
v
□ <-ALI- TIME
The word graphs for the adjectives embed the time aspect in a more elaborate word graph in order to express the various meanings. In these graphs the time of the speech act may play an important role, like in the description of tense, a theme that we will not discuss here. A speech act may occur at time t0, whereas the intended description concerns something at time t1, often determined by the discourse of which the speech act is part. In a historical account one may read "the former king was tyrannical but the new king was a very kind person". The adjective FORMER refers to a time before a certain time t2, when the king was replaced; the adjective NEW refers to a time after that time t2. The fact that this took place before t0 is apparent from the past tense coming forward in WAS.
The word graphs for FORMER and NEW will contain an ORD-relationship. In EARLY reference to a time interval will have to be made, expressing that the first part of the time interval is meant. This also holds for OLD or ANCIENT. Differences between the three adjectives must come forward in the word graphs, but if these are left out by the speaker, in his choice of words, he might speak of "in the early days", "in the old days" or "in the ancient days" to express the same thing. The more elaborate word graphs may become quite large. This situation is similar for dictionaries, where bad dictionaries give short definitions, and good dictionaries try to give more precise definitions. We would like to recall here from 1 that in our theory the meaning of a word is in principle the whole graph considered, containing the word. This implies that different graphs, i.e. different contexts, give different meanings to a word.
The CAU-adwords. In the representation of language by knowledge graphs, verbs are represented by a token and CAU-relationships. Transitive verbs like WRITE are represented as

□ -CAU-> □ -CAU-> □
         ^
         | ALI
       WRITE

whereas intransitive verbs like SLEEP are represented as

□ -CAU-> □
         ^
         | ALI
       SLEEP

Consider the leftmost token. It represents "something writing" respectively "something sleeping". Thus WRITING and SLEEPING are adjectives of that something and are, for obvious reasons, classified as CAU-adwords. The rightmost token in the first knowledge graph might represent a letter and can be described as "written letter". So the adjective WRITTEN can also be classified as a CAU-adword. Again we should note, like for NEW, that the word graph should contain information referring to the time aspects. WINNING team, BOILING water and MARRIED couple give examples of CAU-adwords. The last example is somewhat tricky. Marrying usually involves two persons and "A marries B" and "B marries A". Then "A and B have got married", or both are "in the state of marriage". This brings us to a special class of adjectives, describing states. Suppose something is subject to a series of changes. At any time it is then in a certain state. In HAPPY girl the adjective describes a state the girl is in; "girl being happy" is doing the same. The verb BE was represented by the empty frame. Anything in the frame IS. HAVE, which was defined as BE WITH, CAN and MUST are verbs that likewise correspond to the frame relationships that we distinguished. When they are used, they usually express states, focusing on the process rather than on the subject or object related to the process by a CAU-relationship. Yet the adjectives describing states are discussed in this section, as we stress the relationship with verbs. States are exemplified by adjectives like ABLAZE, ASLEEP or ALIVE, where descriptions of processes stand central. The check on the distinction stative/dynamic, by trying out the imperative, like in BE CAREFUL, corresponds to checking whether the adjective can be seen as describing a state.
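The transitive/intransitive contrast can be read off mechanically from such graphs. The encoding below is our own sketch (token names are assumptions); it checks whether the act token has an outgoing CAU-arc to an object, which is what distinguishes WRITE from SLEEP above.

```python
# Verb graphs as sets of typed arcs. For transitive WRITE:
# agent -CAU-> act -CAU-> object, with WRITE typing the act token.
write = {
    ("t_agent", "CAU", "t_act"),
    ("t_act",   "CAU", "t_object"),
    ("WRITE",   "ALI", "t_act"),
}
# For intransitive SLEEP there is no second CAU-arc.
sleep = {
    ("t_agent", "CAU", "t_act"),
    ("SLEEP",   "ALI", "t_act"),
}

def is_transitive(verb_graph):
    """True iff the act token also has an outgoing CAU-arc."""
    acts = {t for w, r, t in verb_graph if r == "ALI"}
    return any(s in acts and r == "CAU" for s, r, t in verb_graph)

print(is_transitive(write), is_transitive(sleep))  # True False
```

The CAU-adwords WRITING (of the agent token) and WRITTEN (of the object token) correspond to the two ends of these CAU-arcs.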
It should be noted that predicative use of an adjective expresses a BE-frame. THE CAR IS HEAVY instead of THE HEAVY CAR expresses the "being heavy" of the car explicitly. In Turkish DOKTORDADIR literally means DOCTOR AT BE. The very frequently used agglutination DIR expresses the BE-frame. Most of the a-adjectives, like ABLAZE, are predicative only. THE HOUSE IS ABLAZE can be said, THE ABLAZE HOUSE cannot be said. In THE CAR IS HEAVY the use of the adjective HEAVY is predicative. The analogy is suggested by the word IS. However, in this case this word IS stems e.g. from A TRUCK IS (DEFINED AS) A HEAVY CAR, which is shortcut to A TRUCK IS HEAVY or even THE CAR IS HEAVY. This explains the predicative use of an adjective that is essentially an FPAR-adword. HEAVY is pushed into the role of a state describer like ABLAZE.

The ALI-adwords. The ALI-relationship between two concepts expresses the alikeness of the concepts. This relationship may be seen as primus inter pares, as the process of concept creation seems to depend heavily on the becoming aware of alikeness. Prototype definitions express what has been seen as common properties of a set of somethings. Adjectives that are expressing that a concept looks like another concept may have specific endings. DISASTROUS expresses a similarity with a disaster and INDUSTRIOUS expresses a similarity with certain aspects of industry. Other endings are -ISH as in FOOLISH or -LIKE as in CHILDLIKE, for example in combination with BEHAVIOUR. A special category of adjectives are ALI-adwords that are themselves nouns. THUNDER in THUNDER NOISE or TRAITOR in TRAITOR KNIGHT express that the noise is like heard when thunder occurs or that the knight acts like a traitor.
Especially for this category of adjectives, nouns acting as adjectives, it becomes clear that the classification that we give in different types of adwords makes sense.

2.2 Adverbs
In our theory nouns and verbs are both represented by word graphs. In this respect nouns and verbs are describing concepts basically in the same way. However, nouns do not necessarily include CAU-relationships in their definition, whereas verbs do. But for this difference a verb may be seen as a special type of noun. This means that there is no basic difference in the way other words may act as adwords of verbs in comparison to the way other words act as adwords of nouns. This is also underlined by the possibility of substantivation of verbs. Compare TO PLAY, PLAYING and A PLAY.
We will discuss only a few examples. In our view time and location of the act expressed by the verb are natural aspects to consider in relation to the verb. Many adverbs refer to these aspects and are classified as PAR-adwords. We mention OFTEN, OUTSIDE, BRIEFLY, EVER as examples. The explicit word graphs for these words may become quite large as rather complex aspects are expressed. Judgements and measurements are two aspects that are also often expressed by PAR-adwords. WELL, QUITE, EXTREMELY, ENOUGH, MUCH, ALMOST are reflecting judgements or measurements. A judgement may be interpreted as a subjective measurement, hence the treatment of these two aspects at the same time. CAU-adwords refer to influences or, in most cases, to consequences of the acts described by the verbs. Hence adverbs like AMAZINGLY or SURPRISINGLY may be mentioned. ALI-adwords include CLOCKWISE, or any of the many adverbs with ending -WISE, but also words like TOO or AS are used as adverbs and clearly should be classified as ALI-adwords. BRILLIANTLY, like a brilliant, is representative of another subclass of these adwords. FPAR-adwords are somewhat rare. In "the country is deteriorating economically" the adverb ECONOMICALLY indicates in what respect the country deteriorates. The country's economy must be seen as frame part and for this reason ECONOMICALLY may be classified as an FPAR-adword. Two remarks are to be made still. Firstly, there is a set of words, mentioned by Quirk et al., that are sometimes used as adverbs, but actually refer to the use of logic in language. NEVERTHELESS, HOWEVER, THOUGH, YET, SO, ELSE are such words. We would prefer to include them in a special third list of word graphs with NO and PROBABLY, to mention some other potential adverbs. Secondly, we would like to comment on an example of Quirk et al.: A FAR MORE EASILY INTELLIGIBLE EXPLANATION, showing compilation of adwords.
In traditional discussion we would have to decide whether we are dealing with adjectives or adverbs. The great advantage of our theory is that this discussion is avoided. Instead one may have the discussion about the classification that is to be given to each of the adwords. However, once a knowledge graph, expressing the text, is made out of the word graphs, the way these word graphs glue together, by which type of arc, immediately gives the answer in that discussion. An important aspect of this example is that the sheer possibility of compiling adwords in language is an argument for the modelling of language in terms of knowledge graphs, built from word graphs.
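The section's central idea, that an adword is classified by the arc type gluing its graph to the head word's graph rather than by a part-of-speech label, can be summarized in a toy lookup. The table restates cases from the text; the function and its return format are our own illustration.

```python
# Arc type by which each example adword links to its head, as
# discussed above (PAR = exterior attribution, FPAR = definitional
# constituent, CAU = causal role, ALI = alikeness).
EXAMPLES = {
    "RED":       "PAR",   # colour attributed to the ball from outside
    "BRAVE":     "PAR",   # judgement from outside
    "JADE":      "FPAR",  # material, part of the definition
    "WRITTEN":   "CAU",   # result of the act of writing
    "CHILDLIKE": "ALI",   # alikeness to a child
}

def adword_class(word):
    arc = EXAMPLES.get(word)
    return f"{arc}-adword" if arc else "unknown"

print(adword_class("JADE"))  # FPAR-adword
```

In the full theory the classification falls out of the knowledge graph itself, once the word graphs have been glued together; the table merely records that outcome for a few words.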
2.3 Classifiers in the Chinese Language
In Chinese, a special class of words are the quantity words, as Chinese prefer to call them, which are also called classifiers.

The FPAR-classifiers. One of the most frequent quantity words is KO, used in combination with a word like TUNG TSI, THING. For A THING in Chinese we should say YI KO TUNG TSI. There are many nouns for which the quantity word KO is used. As this example already shows, the quantity word may be seen as a word naming a subframe of a frame carrying the name of the noun. Hence here the relationship between noun and quantity word is an FPAR-relationship, the quantity word is an FPAR-adword and, in terms of knowledge graphs, we get the following graph.

□ <-ALI- NOUN
|
| FPAR
v
• <-ALI- QUANTITY WORD.

Here, • indicates the something described by the quantity word. Although a word like KO is not felt to have a meaning of its own by Chinese, it should refer to something felt as an essential property of the following noun. Many quantity words express the property of the noun that it describes a unit. Another example, YI FENG XIN for A LETTER, shows an adword FENG, that in the combination is not felt to have a meaning. However, there is a verb FENG for folding and it is clear that XIN (LETTER) is described in a way that expresses the property that letters usually consist of folded paper. For A WELL we have YI YAN JING, where the quantity word YAN has some meaning, namely HOLE. If a well is the configuration of water coming from a hole in the ground, the hole property of a well is clearly mentioned as property not to be deleted in the description. Quantity words like these should perhaps better be called property words or classifiers.

Other classifiers. Next to the FPAR-relationship to a noun of a property word, there are many other classifiers that do have a separate meaning, but are used in combination with a noun to express a certain feature. These quantity words can be divided into 13 subclasses. We will discuss this in more detail now and start with a typical example: YI HU SHUI, A POT OF WATER. In English too a class description of water is necessary. HU, or POT, is an adword for SHUI, or WATER. A pot typically contains a
liquid; as a container it has a specific shape and is made of some material like china, metal or glass. The important feature is that of containing a liquid. In knowledge graph representation we have in first instance
WATER -EQU-> □ <-ALI- LIQUID          □ <-ALI- POT
             |                        |
             | PAR                    | PAR
             v                        v
             □ <-ALI- SPACE  -SUB->   □ <-ALI- SPACE
The graph can be extended to include the other features of POT like shape or material, but the containment feature gives the direct link to the noun. In terms of adwords we cannot say that POT is a PAR-adword; the linking to WATER is more complex. In fact our example is a clear example of a relationship in knowledge graph theory. Two concepts are part of one graph and it is this graph that characterizes the relationship between the two concepts. If we have to baptize this relationship we would choose the word CONTAINER or CONTAINMENT. So HU is an adword of SHUI linked to it by a relationship of complex nature, that is however representable by the basic types of arcs. There are quite a few adwords of the type CONTAINER; PING, or BOTTLE, is just one other example. We now list the 13 subclasses by giving a characterization of the class, like CONTAINER, an example, like YI HU SHUI, A POT OF WATER, and the relevant structure relating the quantity word and the noun in one other
case. 1. CONTAINMENT E x a m p l e : YI HU SHUI (a POT of water). G r a p h : see above.
2. SIMILARITY. Example: YI DI SHUI (a DROP of water).
Graph: [Figure: knowledge graph for YI DI SHUI, linking a DROP token and a WATER token; recoverable arc labels: ALI (to LIQUID, WATER and SHAPE), EQU and PAR.]
Remark: This subclass too is quite numerous. Because of lack of space we just mention the 11 other subclasses with an example only.
3. SET. Example: YI CHUAN PUTAO (a CLUSTER of grapes).
4. TIME. Example: YI TIAN SHIJIAN (a DAY of time).
5. LENGTH. Example: YI MI CHANG (a METER of length).
6. AREA. Example: YI YINGMU TUDI (an ACRE of land).
7. WEIGHT. Example: YI KE YINZI (a GRAM of silver).
8. VOLUME. Example: YI SHENG SHUI (a LITER of water).
9. MONEY. Example: ZHE JIAZHI YI MEIYUAN (this costs a DOLLAR).
10. CHINESE CHARACTER. Example: "WANG" ZI BIHUA YOU SI GE ("WANG" word STROKE(S) has four element(s)).
11. NUMBER. Example: YI DA WAZI (a DOZEN of socks).
12. ACTION. Example: DA TA YI ZHANG (hit him a PALM).
13. COMPLEXES. Example: SAN JIACI FEIJI (three TIMES of flight).
In a more elaborate version of this paper word graphs and remarks are given for these other subclasses.
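The containment analysis given for YI HU SHUI can be sketched as a tiny labelled-arc structure. This is our own illustration, not the authors' notation: the arc names (ALI, PAR, SUB) follow knowledge graph theory, but the token names and the API are invented for the example.

```python
# Minimal sketch: a knowledge graph as a set of labelled arcs between
# tokens, used to record how a classifier such as HU (POT) links to the
# noun SHUI (WATER) via a CONTAINMENT subgraph.
class KnowledgeGraph:
    def __init__(self):
        self.arcs = set()          # (source, arc_type, target) triples

    def add(self, src, arc, dst):
        self.arcs.add((src, arc, dst))

    def related(self, a, b):
        """Return the arc types directly linking tokens a and b."""
        return {arc for (s, arc, d) in self.arcs if {s, d} == {a, b}}

g = KnowledgeGraph()
g.add("POT", "ALI", "token1")      # the classifier names one token...
g.add("WATER", "ALI", "token2")    # ...the noun names another
g.add("token2", "SUB", "token1")   # the water is contained in the pot
g.add("token1", "PAR", "SPACE")    # the pot occupies a spatial region

print(g.related("token1", "token2"))   # {'SUB'}
```

The complex CONTAINER relationship between HU and SHUI is thus not a single primitive arc but the subgraph connecting the two tokens.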
3 Discussion
In the first paper on word graphs, nouns, verbs and prepositions were discussed. In this paper a second set of word graphs has been presented, which we called adwords. By seeing adjectives, adverbs and classifiers in Chinese as instances of words that glue to other words like nouns or verbs in certain specific ways, a completely different view on these word classes has been developed. We do not have enough space here to show that our way of classifying, according to the way graphs are linked, indeed solves quite a few problems concerning them. In fact it should be stressed that the correctness of our method of representing words is still to be proven by showing that linguistic problems can indeed be solved. Preliminary results show that the approach is really quite promising. But first we have to construct a word graph lexicon. In principle we have covered most types of words already. There remain some other types, like the "logical" words or word compositions such as DISCONNECT, and we plan to consider them in a third paper. Chinese quantity words will be collected extensively in a special report.

References

1. Hoede, C., Li, X.: Word Graphs: The First Set. In: Conceptual Structures: Knowledge Representation as Interlingua, Auxiliary Proceedings of the Fourth International Conference on Conceptual Structures (ICCS '96), Bondi Beach, Sydney, Australia (P.W. Eklund, G. Ellis and G. Mann, eds.) (1996) 81-93
2. Berg, H. van den: Knowledge Graphs and Logic: One of Two Kinds. Dissertation, University of Twente, The Netherlands, ISBN 90-9006360-9 (1993)
3. Quirk, R., Greenbaum, S., Leech, G., Svartvik, J.: A Grammar of Contemporary English. Longman (1972)
4. Willems, M.: Chemistry of Language: A Graph-Theoretical Study of Linguistic Semantics. Dissertation, University of Twente, The Netherlands, ISBN 90-9005672-6 (1993)
Tuning Up Conceptual Graph Representation for Multilingual Natural Language Processing in Medicine

Anne-Marie Rassinoux, Robert H. Baud, Christian Lovis, Judith C. Wagner, Jean-Raoul Scherrer
Medical Informatics Division, University Hospital of Geneva, Switzerland
anne-marie.rassinoux@dim.hcuge.ch
Abstract. Multilingual natural language processing (NLP), whether it concerns analysis or generation of sentences, requires a sound language-independent representation for grasping the deep meaning of narratives. The formalism of conceptual graphs (CGs), especially designed to cope with natural language semantics, constitutes a good repository for dealing with the compositionality and intricacies of medical language. This paper describes our experiment, as part of the European GALEN project, for exploiting a conceptual graph representation of medical language, upon which multilingual medical language processing is performed.
1 Introduction
Health care, like any institution managing a huge amount of textual information, has to face an important counterbalancing challenge. On the one hand, the ability to rapidly and easily access and retrieve relevant information on a patient is a crucial need. Such a functionality is better achieved when information is encoded and structured in databases or electronic patient records, thus allowing the formulation of precise queries. On the other hand, effective communication is better performed through the expressiveness of natural language, whether it is among health care providers themselves, or to directly address the patient. It appears, therefore, that both textual documents and structured information must coexist in the same environment. The approaches developed to switch from one to another are at the heart of research and development performed in the domain of medical language understanding 1. This has led to substantial and ongoing results related to both the analysis 2 and generation 3 of medical texts. This article presents substantial issues resulting from our experience in handling a unique knowledge representation - being used as input for the generation task and as output for the analysis task - for grasping the deep meaning of medical sentences. The peculiarities of medical language and their implications for knowledge representation are first described. Then the adjustments achieved to mediate between the implicitness and expressiveness of natural language (NL) on the one hand, and the accuracy and granularity of knowledge representation (KR) on the other hand, are exposed.
2 Background

A domain knowledge model that represents medical information in a language-independent and structured way constitutes a major cornerstone upon which multilingual applications can be built. This requires the description of a consistent view of all the relevant entities of the domain and their associated attributes. Such a description must reflect an appropriate level of generality and granularity, useful for integrating knowledge from diverse sources as well as maintaining a standard representation that can be exchanged and reused in the future. These requirements are at the basis of the European GALEN project¹. This project led to the development of a common reference model (the CORE model) which is expressed in a language-independent manner through a descriptive logic language (the GRAIL Kernel) 4. It affords a high-level ontology that organizes concepts and attributes (also called semantic links or relationships) upon which multiple inheritance can be applied. Moreover, it allows composite concepts to be built from existing ones, provided that the constraints required for such compositions are available. This compositional modeling facilitates shifting between different levels of detail. This presents advantages for NLP, as such compositions allow conceptual structures to be formulated in natural language using more or less precise medical vocabulary. The importance of multilingual natural language processing was particularly emphasized during the first year's work in GALEN-IN-USE² 5. On the one hand, natural language generation is of paramount interest for information presentation, as it allows knotted and complex GRAIL expressions to be naturally displayed to the user with sentences formulated in his native (or at least known) natural language. On the other hand, natural language analysis is of significant help for data capture in so far as it produces a structured representation which can then be (semi-)automatically mapped to the GALEN CORE model.
Therefore, linguistic tools including generation 6, 7 as well as analysis 8 of medical texts have been developed as part of the GALEN project. Pursuing a previous in-house experiment 9 with the formalism of conceptual graphs (CGs) 10, the latter has been chosen as the modeling formalism for representing medical language that is handled by NLP tools. Due to the fact that the CG formalism shares many features with the GRAIL language in which the CORE model is expressed 4, the translation from GRAIL to CG representation was a straightforward process using Definite Clause Grammars (DCGs). The NLP tools have been mainly tested on a new French coding system for surgical procedures named NCAM³ 11. From this classification, more than 500

¹ GALEN stands for General Architecture for Language, Encyclopaedias and Nomenclatures in medicine, and is currently funded as part of Framework IV of the EC Healthcare Telematics research program.
² GALEN-IN-USE is the current phase of the GALEN project, whose aim is to apply tools and methods of GALEN to assist in the collaborative construction and maintenance of surgical procedure classifications.
³ NCAM stands for Nomenclature Commune des Actes Médicaux and is currently developed by the University of Saint-Etienne in France.
surgical procedures belonging to the urology domain were modeled, while respecting the multi-level sanctioning of the GALEN representation, and then regenerated into natural language phrases. Fig. 1 shows the CG representing the French surgical procedure «Pyélo-calicoscopie par voie percutanée».
[Figure: the CG for the rubric and the operations applied to it (prefixes cl_ and rel_ omitted):

Restoring the focus: SurgicalDeed: y-
  (isMainlyCharacterisedBy)->performance-
    (isEnactmentOf)->GeneralisedProcess: x.

The CG, from concepts to language annotations (en/fr):
SurgicalDeed-
  (isMainlyCharacterisedBy)->performance-
    (isEnactmentOf)->Inspecting-                      [en: scopy / fr: scopie]
      (playsClinicalRole)->SurgicalRole               [en: surgical / fr: chirurgical]
      (actsSpecificallyOn)->ArbitraryBodyConstruct-
        (hasArbitraryComponent)->RenalPelvis          [en: pyelo / fr: pyélo]
        (hasArbitraryComponent)->CalixOfKidney        [en: calico / fr: calico]
      (hasPhysicalMeans)->Endoscope                   [en: endoscopic / fr: endoscopique]
      (hasSpecificSubprocess)->SurgicalApproaching-   [en: by / fr: par]
        (hasPhysicalMeans)->Route-
          (passesThrough)->SkinAsOrgan                [en: percutaneous route / fr: percutanée]

Relational contraction: GeneralisedProcess: x (hasSpecificSubprocess)->SurgicalApproaching-(hasPhysicalMeans)->Route: y.
Type contraction: Route: x (passesThrough)->SkinAsOrgan.

Output of the generation tool for English and French:
en: 'endoscopic surgical pyelocalicoscopy by percutaneous route'
fr: 'pyélocalicoscopie chirurgicale endoscopique par voie percutanée']

Fig. 1. Operations applied during the generation task on the CG representing the French rubric «Pyélo-calicoscopie par voie percutanée»⁴
⁴ Internally, a concept is prefixed by cl_, a simple relationship by rel_ and a composite relationship by reld_. For the sake of simplicity, these prefixes are omitted in the above CG.
3 Modeling Medical Language
Modeling medical language requires taking into account the variations in the description of medical terms, while supporting a uniform representation expressing the medical concepts characterized by attributes and values. However, the way information is modeled does not always correspond to the way information is expressed in natural language. Different compromises must therefore be set up.
3.1 Annotation of Medical Information
In order to make available and operational the semantic content of the CORE model for NLP, the major task has consisted in annotating the model in the corresponding languages treated (mainly French, English, German, and Italian). These linguistic annotations are performed at two levels. First, conceptual entities are annotated with 'content words' that correspond mostly to the syntactic categories of nouns and adjectives. Either single words, parts of words like prefixes and suffixes, or multiword expressions are permitted as annotation (see the language annotation part in Fig. 1). Second, annotations of relationships are more frequently achieved through 'function words'. The latter are conveyed either through grammatical structures, such as the adjectival structure or the noun complement (as in the examples pyelic calculus or calculus of the renal pelvis for the relationship rel_hasSpecificLocation), or directly through grammatical words such as prepositions (as in the example urethroplasty for perineal hypospadias where the preposition 'for' denotes the relationship rel_hasSpecificGoal). The annotation process, which only occurs on named concepts, enables the creation of multilingual dictionaries that settle a direct bridge between concepts and language words. Every meaningful primitive concept belonging to the GALEN CORE model needs to be annotated. Besides, composite concepts, for which a definition is maintained at the conceptual level, may be annotated based on the availability and conciseness of words in the language treated. The verbosity of medical language and the complexity of the modeling style can then be respectively tuned by annotating composite concepts and composite relationships. For example, the concept cl_PercutaneousRoute is directly annotated with concise expressions, and the relationship reld_byRouteOf, especially created for NLP purposes, allows the nested concept cl_SurgicalApproaching to be masked during linguistic treatments (see Fig. 1). 
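The two-level annotation scheme can be pictured as a per-language lookup table over named entities. A minimal sketch; the dictionary entries and the `annotate` helper are assumptions for illustration, not the GALEN dictionaries:

```python
# Sketch: linguistic annotations attached to named concepts and
# relationships, indexed per language (entries are illustrative).
annotations = {
    "cl_PercutaneousRoute": {"en": "percutaneous route", "fr": "voie percutanée"},
    "cl_Endoscope":         {"en": "endoscope",          "fr": "endoscope"},
    "reld_byRouteOf":       {"en": "by",                 "fr": "par"},
}

def annotate(entity, lang):
    """Look up the content or function word for an entity in one language."""
    entry = annotations.get(entity)
    if entry is None:
        # Unannotated composites fall back to procedural (morphosemantic)
        # treatment, as described in the text.
        raise KeyError(f"{entity} is not annotated")
    return entry[lang]

print(annotate("cl_PercutaneousRoute", "fr"))   # voie percutanée
```

Concepts are annotated with content words, relationships mostly with function words such as prepositions.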
However, the combinatorial aspect of the compositional approach, as well as the continually growing creation of new medical terms, make the annotation task unbounded and time-consuming. This has led to the implementation of procedural treatments at the linguistic level that map syntactic structures upon the semantic representation but also include the management of the standard usage of prefixes and suffixes. The latter is especially important for surgical procedures, which are commonly expressed through compound word forms 12. This means that the word pyelocalicoscopy is never described as an entry in the English dictionary (as the description of the concept denoting a pyelocalicoscopy is not explicitly named in the model), but is automatically generated according to its corresponding semantic description. Automated morphosemantic treatment also implies that the linguistic
module be aware of abstract constructions used at the conceptual level to handle the enumeration of constituents. In Fig. 1, both the abstract concept cl_ArbitraryBodyConstruct and the relationship rel_hasArbitraryComponent are used to clarify the different body parts on which the inspection occurs.
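The morphosemantic assembly of a compound such as pyelocalicoscopy can be sketched as concatenating the affix annotations of the concepts in its semantic description. The affix tables and function below are hypothetical, loosely based on the annotations shown in Fig. 1:

```python
# Sketch: 'pyelocalicoscopy' is never stored in the dictionary but
# assembled from the annotations of the concepts in its semantic
# description (concept names and affixes are assumptions).
prefixes = {"cl_RenalPelvis": "pyelo", "cl_CalixOfKidney": "calico"}
suffixes = {"cl_Inspecting": "scopy"}

def compound(components, process):
    """Concatenate the prefixes of the enumerated body parts
    with the suffix of the inspecting process."""
    return "".join(prefixes[c] for c in components) + suffixes[process]

word = compound(["cl_RenalPelvis", "cl_CalixOfKidney"], "cl_Inspecting")
print(word)   # pyelocalicoscopy
```

The enumeration of components comes from the abstract cl_ArbitraryBodyConstruct and its rel_hasArbitraryComponent arcs mentioned above.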
3.2 The Relevance of the Focus for NLP
A recognized property of the CG formalism, over other formalisms such as frame systems, is its ability to easily turn over the representation, i.e. to draw the same graph from a different head concept. For example, the conceptual graph cl_Pain->(rel_hasLocation)->cl_Abdomen can be rewritten as cl_Abdomen->(rel_isLocationOf)->cl_Pain. Even if these two graphs appear equivalent at first sight from the conceptual viewpoint, some subtleties can be pointed out when shifting to NL. The former graph is naturally translated into abdominal pain or pain in the abdomen, whereas the second one tends to be translated into painful abdomen. In medical practice, the interpretation underlying these two clinical terms significantly differs. The key issue here is that, for NLP purposes, the head concept of a graph (such as cl_Pain in the first graph and cl_Abdomen in the second one) is precisely considered as the focus of the message to be communicated. The rest of the graph, therefore, is only there to characterize this main concept in more detail. Such an observation questions the focus-neutral property of the CG formalism in so far as linguistic tools add special significance to the head concept or focus of a graph. Indeed, the latter is interpreted as the central wording upon which the rest of the sentence is built.
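Turning a graph over can be sketched as inverting the relationship when the requested focus is not the current head. A simplified illustration for a single-arc graph; the inverse-relationship table is an assumption:

```python
# Sketch: refocusing a one-arc CG by inverting its relationship, as in
# cl_Pain->(rel_hasLocation)->cl_Abdomen versus
# cl_Abdomen->(rel_isLocationOf)->cl_Pain.
INVERSE = {"rel_hasLocation": "rel_isLocationOf",
           "rel_isLocationOf": "rel_hasLocation"}

def refocus(graph, new_head):
    """Rewrite a (head, rel, tail) triple so that new_head becomes the head."""
    head, rel, tail = graph
    if head == new_head:
        return graph
    return (tail, INVERSE[rel], head)

g = ("cl_Pain", "rel_hasLocation", "cl_Abdomen")
print(refocus(g, "cl_Abdomen"))
# ('cl_Abdomen', 'rel_isLocationOf', 'cl_Pain')
```

The two results are conceptually equivalent, yet, as argued above, they lead to different natural language renderings.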
3.3 Contexts or Nested Conceptual Graphs
Contexts constitute a major topic of interest for both the linguist and conceptual graph communities. Recent attempts to formally define the notion of context have come to light 13, 14, and the use of context for increasing the expressiveness of NL representation is clearly asserted 15. For GALEN modeling, contexts appear as a major way of avoiding ambiguity when representing medical language. In particular, associating a specific role with a concept (as for example, cl_SurgicalRole for identifying a surgical procedure, or cl_InfectiveRole or cl_AllergicRole for specifying a pathological role) allows for reasoning and then restricting the inference process to what is sensible to say about this concept. Such packaging of information is graphically represented through brackets delimiting the nested graph (see the CG displayed in Fig. 1 where two contexts are surrounded by bold brackets). Handling these contexts at the linguistic level results mainly in enclosing the scope of the nested graph in a single proposition, which can be expressed by a simple noun phrase or through a more complex sentence.
4 Formal Operations to Mediate between KR and NL
The previous section emphasized the gaps that exist between KR and medical language phrases. Indeed, as KR aims to describe, in an unambiguous way, the meaning carried by NL, such a structure is naturally more complete and accurate than what is simply expressed in NL. Mediating between these two means of communication implies setting up specific formal operations for readjusting KR and NL expressions.
4.1 Basic Operations on CGs
Balancing the degree of granularity, and thus the complexity, of a conceptual representation can be achieved in two different ways. On the one hand, a conceptual graph can be contracted in order to display information in a more concise manner. The contraction operation, which consists in replacing a connected portion of a graph by an explicit entity, is basically grounded on the projection operation (see 10, p. 99). On the other hand, a conceptual graph can be expanded in order to add, and thus make explicit, precise information on the semantic content of the graph. The expansion operation, which consists in replacing a composite entity by its full definition, is based on the join operation (see 10, p. 92). As the general guideline for the generation task in the GALEN project is to produce phrases 'as detailed as necessary but as concise as possible', the projection operation appears as the central means to mediate with the complexity of KR. In order to adjust this operation for particular usage, it has been necessary to provide, in addition to the projected graph, the hanging graph, the list of cut points and finally the specializations performed during the projection. The hanging graph only embeds the remaining portions of the original graph that were connected to formal parameters (i.e. it is composed of one or two parts depending on the number of formal parameters, x and y, present in the definition). All other hanging subgraphs are clearly considered as cut points. Each specialization performed in the original graph, whether it concerns a relationship or a concept, is recorded in a list. Each of these components is then checked, as explained in the following section, for its particular behavior in the three following situations: the setting-up of the focus, the contraction of conceptual definitions, and the management of contexts.
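Reducing graphs to flat triple sets, the contraction operation can be sketched as replacing a successfully projected definition by its composite concept. This is a strong simplification (real projection handles type subsumption, specializations and hanging graphs); the example reuses the percutaneous-route definition from Fig. 1:

```python
# Sketch of type contraction over triple sets: if the definition of a
# composite concept occurs in the graph, that portion is removed and
# replaced by the composite concept itself.
def contract(graph, definition, composite):
    """Replace the triples of `definition` by `composite` if they all occur."""
    if definition <= graph:                 # definition projects onto the graph
        return graph - definition, composite
    return graph, None

graph = {("cl_SurgicalApproaching", "rel_hasPhysicalMeans", "cl_Route"),
         ("cl_Route", "rel_passesThrough", "cl_SkinAsOrgan")}
definition = {("cl_Route", "rel_passesThrough", "cl_SkinAsOrgan")}

rest, found = contract(graph, definition, "cl_PercutaneousRoute")
print(found)   # cl_PercutaneousRoute
```

Expansion would be the inverse move: joining the full definition back in place of the composite concept.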
4.2 Refining Basic Operations for NLP
In the simplest case, the focus of a graph is defined as the head concept of the graph. However, in KR, this solution is frequently left for the benefit of a representation that allows the focus as well as other concepts mentioned at the same level to be represented uniformly. This is the case for the example shown in Fig. 1, where the general concept cl_SurgicalDeed, representative of the type of concept to be modeled, is taken as the head concept of the graph. Then specific relationships, such as rel_isMainlyCharacterisedBy and rel_isCharacterisedBy, are respectively used to clarify the 'primary' procedure and a number of, possibly optional, additional
procedures. Establishing the focus of the graph in this case consists in restoring the primary procedure as the head concept of the graph, by projecting the corresponding abstraction shown in Fig. 1 on the initial graph. Then, the projected graph is directly replaced by the specialization of the concept identified by the formal parameter x. In the example, the latter corresponds to the concept cl_Inspecting, which is a descendant of cl_GeneralisedProcess in the conceptual hierarchy. Moreover, this operation prohibits the presence of cut points as well as any specialization done on concepts other than the formal parameter x. The two hanging graphs (if not empty) are then appended to the new head of the graph.

In order to retain the level of detail of the conceptual representation, the type contraction does not allow specialization. Moreover, it normally prohibits the presence of cut points, which are signs of a resulting disconnected graph. However, for the generation task, such a rule can be bypassed. For example, let us consider the following graph:

cl_SurgicalExcising-
  (rel_actsSpecificallyOn)->cl_Adenoma-
    (rel_hasLocativeAttribute)->cl_ProstateGland.

Assuming that the composite concept cl_Adenomectomy exists in the model, the contraction of its corresponding definition would produce the cut point (rel_hasLocativeAttribute)->cl_ProstateGland. But, as the type contraction in the generation process is intended to ensure the conciseness of the produced NL expressions by translating a portion of graph into precise words, the cut point can be joined to the hanging graph. This contributes to the generation of the valid NL expression prostatic adenomectomy from the above graph.

For the relational contraction, the projection operation permits the specialization of concepts and relationships, as these relational definitions, specifically introduced for NLP purposes, are commonly expressed in the most general way possible. However, cut points are not permitted.

Finally, contexts are treated first of all by looking for the definition of a composite concept already described in the model that can be successfully projected on the nested graph. This is the case in Fig. 1 for the contextual information describing the percutaneous route, which is replaced by the concise concept cl_PercutaneousRoute. In all the other cases, the boundaries of the contexts are simply removed and the nested graph is merged into the main graph, as is done for the surgical role in Fig. 1.
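The bypass just described, joining the cut point to the hanging graph so that a concise phrase can still be produced, can be sketched as follows. The word table and rendering rule are illustrative assumptions, not the GALEN generator:

```python
# Sketch: when contracting cl_Adenomectomy leaves the cut point
# (rel_hasLocativeAttribute)->cl_ProstateGland, the cut point is joined
# to the hanging graph and rendered as a modifier of the composite.
words = {"cl_Adenomectomy": "adenomectomy",
         "cl_ProstateGland": "prostatic"}

def generate(composite, cut_points):
    """Render cut points as modifiers of the contracted composite concept."""
    modifiers = [words[target] for (_rel, target) in cut_points]
    return " ".join(modifiers + [words[composite]])

phrase = generate("cl_Adenomectomy",
                  [("rel_hasLocativeAttribute", "cl_ProstateGland")])
print(phrase)   # prostatic adenomectomy
```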
5 Conclusion
Our experience with managing the conceptual graph formalism for NLP has reinforced our belief that a logical, expressive, and tractable representation of medical concepts is a requisite for dealing with the intricacies of medical language. In spite of the effort undertaken to independently manage conceptual knowledge (which in this case is mainly modeled within the GALEN project) and linguistic knowledge (which is handled by linguistic tools), it clearly appears that fine-tuning of both sources of knowledge is a requisite towards building concrete multilingual applications. Such an adjustment affects both the KR and the multilingual NLP tools, and is realized through declarative as well as procedural processes. On the one hand, it has been
necessary to add declarative knowledge through the specification of both multilingual annotations and language-independent definitions. On the other hand, the procedural adjustment has been mainly achieved through the implementation of morphosemantic treatment at the linguistic level, and the refinement of conceptual operations for holding the modeling style at the KR level. All these compromises have proved to be adequate for smoothly and methodically counterbalancing the granularity and complexity of KR with the implicitness and expressiveness of NL.
References

1. McCray, A.T., Scherrer, J.-R., Safran, C., Chute, C.G. (eds.): Special Issue on Concepts, Knowledge, and Language in Health-Care Information Systems (IMIA). Meth Inform Med 34 (1995)
2. Spyns, P.: Natural Language Processing in Medicine: An Overview. Meth Inform Med 35(4/5) (1996) 285-301
3. Cawsey, A.J., Webber, B.L., Jones, R.B.: Natural Language Generation in Health Care. JAMIA 4 (1997) 473-482
4. Rector, A.L., Nowlan, W.A., Glowinski, A.: Goals for Concept Representation in the GALEN Project. In: Safran, C. (ed.): Proceedings of SCAMC'93. New York: McGraw-Hill, Inc. (1993) 414-418
5. Rogers, J.E., Rector, A.L.: Terminological Systems: Bridging the Generation Gap. In: Masys, D.R. (ed.): Proceedings of the 1997 AMIA Annual Fall Symposium. Philadelphia: Hanley & Belfus, Inc. (1997) 610-614
6. Wagner, J.C., Baud, R.H., Scherrer, J.-R.: Using the Conceptual Graphs Operations for Natural Language Generation in Medicine. In: Ellis, G. et al. (eds.): Proceedings of ICCS'95. Berlin: Springer-Verlag (1995) 115-128
7. Wagner, J.C., Solomon, W.D., Michel, P.-A. et al.: Multilingual Natural Language Generation as Part of a Medical Terminology Server. In: Greenes, R.A., Peterson, H.E., Protti, D.J. (eds.): Proceedings of MEDINFO'95. North-Holland: HC&CC, Inc. (1995) 100-104
8. Rassinoux, A.-M., Wagner, J.C., Lovis, C. et al.: Analysis of Medical Texts Based on a Sound Medical Model. In: Gardner, R.M. (ed.): Proceedings of SCAMC'95. Philadelphia: Hanley & Belfus, Inc. (1995) 27-31
9. Rassinoux, A.-M., Baud, R.H., Scherrer, J.-R.: A Multilingual Analyser of Medical Texts. In: Tepfenhart, W.M., Dick, J.P., Sowa, J.F. (eds.): Proceedings of ICCS'94. Berlin: Springer-Verlag (1994) 84-96
10. Sowa, J.F.: Conceptual Structures: Information Processing in Mind and Machine. Reading, MA: Addison-Wesley Publishing Company (1984)
11. Rodrigues, J.-M., Trombert-Paviot, B., Baud, R. et al.: Galen-In-Use: An EU Project Applied to the Development of a New National Coding System for Surgical Procedures: NCAM. In: Pappas, C., Maglaveras, N., Scherrer, J.-R. (eds.): Proceedings of MIE'97. Amsterdam: IOS Press (1997) 897-901
12. Norton, L.M., Pacak, M.G.: Morphosemantic Analysis of Compound Word Forms Denoting Surgical Procedures. Meth Inform Med 22(1) (1983) 29-36
13. Sowa, J.F.: Peircean Foundations for a Theory of Context. In: Lukose, D. et al. (eds.): Proceedings of ICCS'97. Berlin: Springer (1997) 41-64
14. Mineau, G.W., Gerbé, O.: Contexts: A Formal Definition of Worlds of Assertions. In: Lukose, D. et al. (eds.): Proceedings of ICCS'97. Berlin: Springer (1997) 80-94
15. Dick, J.P.: Using Contexts to Represent Text. In: Tepfenhart, W.M., Dick, J.P., Sowa, J.F. (eds.): Proceedings of ICCS'94. Berlin: Springer-Verlag (1994) 196-213
Conceptual Graphs for Representing Business Processes in Corporate Memories

Olivier Gerbé¹, Rudolf K. Keller², and Guy W. Mineau³

¹ DMR Consulting Group Inc., 1200 McGill College, Montréal, Québec, Canada H3B 4G7, Olivier.Gerbe@dmr.ca
² Université de Montréal, C.P. 6128, Succursale Centre-Ville, Montréal, Québec, Canada H3C 3J7, keller@IRO.UMontreal.ca
³ Université Laval, Québec, Québec, Canada G1K 7P4, mineau@ift.ulaval.ca
Abstract. This paper presents the second part of a study conducted at DMR Consulting Group during the development of a corporate memory. It presents a comparison of four major formalisms for the representation of business processes: UML (Unified Modeling Language), PIF (Process Interchange Format), the WfMC (Workflow Management Coalition) framework, and conceptual graphs. This comparison shows that conceptual graphs are the best-suited formalism for representing business processes in the given context. Our ongoing implementation of the DMR corporate memory - used by several hundred DMR consultants around the world - is based on conceptual graphs, and preliminary experience indicates that this formalism indeed offers the flexibility required for representing the intricacies of business processes.
1 Introduction
Charnel Havens, EDS (Electronic Data Systems) Chief Knowledge Officer, presents in 5 the issues of knowledge management.
With a huge portion of a company's worth residing in the knowledge of its employees, the time has come to get the most out of that valuable corporate resource - by applying management techniques. The challenge companies will have to meet is the memorization of knowledge as well as its storage and its dissemination to employees throughout the organization. Knowledge may be capitalized on and managed in corporate memories in order to ensure standardization, consistency and coherence. Knowledge management requires the acquisition, storage, evolution and dissemination of knowledge acquired by the organization 14, and computer systems are certainly the only way to realize corporate memories 15 which meet these objectives.
DMR Consulting Group Inc. has initiated the IT Macroscope project 7, a research project that aims to develop methodologies allowing organizations: i) to use IT (Information Technology) for increasing competitiveness and innovation in both the service and product sectors; ii) to organize and manage IT investments; iii) to implement information system solutions both practically and effectively; and iv) to ensure that IT investments are profitable. In parallel with methodology development, tools for designing and maintaining these methodologies, designing training courses, and for managing and promoting IT Macroscope products were designed. These tools implement the concept of a corporate memory. This corporate memory, called the Method Repository, plays a fundamental role. It captures, stores 3, retrieves and disseminates 4 throughout the organization all the consulting and software engineering processes and the corresponding knowledge produced by the experts in the IT domain. During the early stage of the development of the Method Repository, the choice of a knowledge representation formalism was identified as a key issue. This led us to define specific requirements for corporate memories, to identify suitable knowledge representation formalisms and to compare them in order to choose the most appropriate formalism. We identified two main aspects: knowledge structure and dynamics - business processes, together with activities, events, and participants. The first part of the study 2 led us to adopt the conceptual graph formalism for structural knowledge. Uniformity of the formalism used in the Method Repository was one issue, but not the all-decisive one, in adopting conceptual graphs for the dynamic aspect, too. Rather, our decision is based on the comparison framework presented in this paper.
In our comparison, we studied four major business modeling formalisms or exchange formats, UML (Unified Modeling Language), PIF (Process Interchange Format), WfMC (Workflow Management Coalition) framework, and conceptual graphs, against our specific requirements. Choosing these four formalisms for our study has been motivated by the requirement for building our solution on existing or de facto standards. Our study demonstrates that conceptual graphs are particularly well suited for representing business processes in corporate memories since they support: (i) shared activities, and (ii) management of instances. The paper is organized as follows. Section 2 introduces the basic notions of business processes as used in this paper. Section 3 defines specific requirements for the representation of business processes in corporate memories. Section 4 compares the four formalisms. Finally, Section 5 reports on the on-going implementation of the Method Repository and discusses future work.
2 Basic Notions
In this section, we present basic notions relevant to the representation of business processes. The main notions for representing the dynamics in an enterprise are processes, activities, participants (input, output, and agent), events (preconditions and postconditions), and the notions of sequence and parallelism of activity
executions. These notions build upon some commonly used definitions in enterprise modeling, as summarized in the following paragraph. A process is seen as a set of activities. An activity is a transformation of input entities into output entities by agents. An event marks the end of an activity; the event corresponds to the fulfilment of both the activity's postcondition and the precondition of its successor activity. An agent is a human or material resource that enables an activity. An input or output is a resource that is consumed or produced by an activity. The notions of sequence and parallelism define the possible order of activity executions. Sequence specifies an order of executions and parallelism specifies independence between executions. Figure 1 presents the notions of activity, agent, input, and output. Activities are represented by a circle and participants of activities are represented by rectangles and linked to their respective activities by arcs; the directions of the arcs define their participation: input, output or agent. There is no notational distinction between input and agent. Note that this simple process representation exclusively serves for introducing terminology and for illustrating our requirements.
Fig. 1. Activity with input, output and agent.
Figure 2 illustrates the notions of sequence and parallelism by a process composed of five activities. The activity Write Production Order is the first activity of the process, the activities Build Frame and Cut Panes are executed in parallel, Assemble Window follows the activities Build Frame and Cut Panes, and finally Deliver Window terminates the process. Note that we only have to consider the representation of parallel and sequential activities; all other cases can be reduced to these two by splitting activities into sub-activities.
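The ordering notions can be made concrete with a small sketch (ours, not part of any surveyed formalism): a process is a precedence relation over activities, and two activities may run in parallel exactly when neither transitively precedes the other. Activity names follow Fig. 2.

```python
# Illustrative sketch (our own, not part of any surveyed formalism):
# a process as a precedence relation over activities.

# Direct successors of each activity (names follow Fig. 2).
precedes = {
    "Write Production Order": {"Build Frame", "Cut Panes"},
    "Build Frame": {"Assemble Window"},
    "Cut Panes": {"Assemble Window"},
    "Assemble Window": {"Deliver Window"},
    "Deliver Window": set(),
}

def reachable(start, graph):
    """All activities that must execute after `start` (transitive closure)."""
    seen, stack = set(), [start]
    while stack:
        for nxt in graph[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def parallel(a, b, graph):
    """Two activities may run in parallel iff neither precedes the other."""
    return a not in reachable(b, graph) and b not in reachable(a, graph)

print(parallel("Build Frame", "Cut Panes", precedes))        # True
print(parallel("Build Frame", "Assemble Window", precedes))  # False
```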
3
Requirements
This section introduces the two main requirements underlying our study: the representation of processes sharing activities and the management of instances that are involved in a process. Obviously, there exist many other requirements for representing a business process in a corporate memory. Since these other requirements are mostly met by all the formalisms studied, we decided to focus on the two main requirements mentioned above.
Fig. 2. A process as a set of activities.
3.1
Sharing Activities
Let us consider the case of two processes that share the same activity. Figure 3 illustrates this setting.
Fig. 3. Sharing Activities.
The example depicted in Fig. 3 deals with the fabrication of a product made out of two components, a software component and a hardware component, as in a cellular phone, a microwave oven, or a computer. The first process describes the development of the software component. It is composed of the activities Design Software, Validate Specifications, and Write Software Code. The second process describes the development of the hardware component. It is composed of the activities Design Hardware, Validate Specifications, and Build Hardware. The activity Validate Specifications is a synchronization activity and is shared by the two processes. The problem in this example is the representation and identification of the two processes. Each process is composed of three activities, with one of them in common. Therefore the formalism must offer reuse of parts of process definitions or support some kind of shared variable mechanism.
To support the representation of business processes in corporate memory, a formalism must offer features to represent processes sharing the same activities.
3.2
Instance Management
To illustrate the problem of instance management, let us consider the example of a window manufacturer who has a special department for building non-standard-size windows. Figure 4 presents the window fabrication process.
Fig. 4. The Window Problem.
A fabrication order is established from a client order. A fabrication order defines the size and material of the frame and the size and thickness of the glasses to insert into the frame. The fabrication order is sent to the frame builder and glass cutter teams, which execute the order. Then the frame and glasses are transmitted to the window assembly team, which inserts the glasses into the frame. The problem of this team is to insert the right glasses (size and thickness) into the right frames (size and material). Some frames take more time to build than others, so the frames may be finished in a different order than the glasses. This problem can be solved by the assembly team by assembling the frame and glasses in conformity with the fabrication order. At the notational level, this requires the possibility of specifying instances of input and output participants. To support the representation of business processes in corporate memory, the formalism must offer features to represent and manage the related instances needed by different processes.
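The matching task of the assembly team can be sketched in a few lines (an illustration of the requirement only; the order identifiers and record layout are invented for the example): each finished frame must be paired with the panes cut for the same fabrication order, even when the two arrive in different orders.

```python
# Illustrative sketch of the instance-management requirement; the order
# identifiers and record layout below are invented for the example.

frames_done = [  # (fabrication order id, frame description)
    ("FO-2", "oak frame 60x40"),
    ("FO-1", "pine frame 120x80"),
]
panes_done = [   # (fabrication order id, pane description)
    ("FO-1", "pane 120x80 x 4mm"),
    ("FO-2", "pane 60x40 x 3mm"),
]

def assemble(frames, panes):
    """Pair each frame with the panes of the same fabrication-order instance."""
    panes_by_order = {order: pane for order, pane in panes}
    return {order: (frame, panes_by_order[order]) for order, frame in frames}

windows = assemble(frames_done, panes_done)
print(windows["FO-1"])  # ('pine frame 120x80', 'pane 120x80 x 4mm')
```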
4
Formalisms
This section presents the four business process modeling formalisms of our study. These formalisms offer representation features to describe, exchange, and execute business processes. Each of the studied formalisms supports the representation of the basic notions introduced in Section 2, so we concentrate on the specific requirements discussed above. Against these requirements we have evaluated the four formalisms: UML [1] (Unified Modeling Language), PIF [8]
(Process Interchange Format), the WfMC framework [6] (Workflow Management Coalition), and conceptual graphs. Other formalisms, Petri nets [16] and CML [11, 12], have been considered but not included in this study because they are either not well suited to representing business processes or not formal enough.
4.1
Unified Modeling Language
In [2] we presented how to represent static structure in UML [1] (Unified Modeling Language). Let us recall that UML, developed by Grady Booch, Jim Rumbaugh, and Ivar Jacobson from the unification of the Booch method, OMT, and OOSE, is considered a de facto standard. UML provides several kinds of diagrams that show different aspects of the dynamics of processes. Use Case diagrams show interrelations between functions provided by a system and the external agents that use these functions. Sequence diagrams and Collaboration diagrams present interactions between objects by specifying the messages exchanged among them. State diagrams describe the behavior of objects of a class, or the behavior of a method, in response to a request. A state diagram shows the sequence of states an object may have during its lifetime. It also shows the requests responsible for state transitions, and the corresponding responses and actions of objects. Activity diagrams have been recently introduced in UML. They are used to describe processes that involve several types of objects. An activity diagram is a special case of state diagram where states represent the completion of activities. In the context of corporate memory, activity diagrams are the most relevant, and we present their main concepts in what follows. In UML, there are two types of activity execution: execution of activities that represent atomic actions, called ActionStates, and execution of non-atomic sequences of actions, called ActivityStates. Exchanges of objects among actions are modeled by object flows, called ObjectFlowStates. ObjectFlowStates implement the notions of inputs and outputs. Agents are represented by Swimlanes in activity diagrams. However, it is also possible to define an agent as a participant in an activity and to establish an explicit relationship between agent and activity. Figure 5 shows how to model the cut window pane activity with participants.
Activity diagrams show possible scenarios; this means that activity diagrams
Fig. 5. UML - The cut window pane Activity.
show objects instead of classes. Dashed arrows link inputs and outputs to activities. Processes may be represented using activity diagrams in UML, and Fig. 6 shows an example of the window building process. Solid arrows between activities represent the control flow.
Fig. 6. UML - The whole Process.
Sharing Activities. As detailed in [1], UML does not support adequate representation features for sharing activities. However, activity diagrams are new in the definition of the language and not all cases have been addressed yet.

Instance Management. In contrast to the representation of structure [2], process representation is done at the instance level. Activity diagrams involve objects, not classes, and therefore it is possible to represent the window problem by using the object fabrication order, which specifies frame and panes. Figure 7 shows a representation for the window problem.
Fig. 7. UML - The Window Problem.
4.2
Process Interchange Format (PIF)
The PIF (Process Interchange Format) working group, composed of representatives from companies and universities, developed a format to exchange specifications of processes [8]. A PIF process description is a set of frame definitions. Each frame specifies an instance of one class of the PIF metamodel. Figure 8 shows the PIF metamodel. It is composed of a generic class ENTITY, from which all other classes are derived, and of four core classes: ACTIVITY, OBJECT, TIMEPOINT, and RELATION. Subclasses
Fig. 8. PIF - Metamodel.
of ACTIVITY and OBJECT are respectively DECISION and AGENT. The class RELATION has seven subclasses: the subclasses CREATES, MODIFIES, PERFORMS, and USES define relationships between ACTIVITY and OBJECT; the subclass BEFORE defines a predecessor relationship between two points in time; the subclass SUCCESSOR defines a successor relationship between two activities; and ACTIVITY-STATUS defines the status of an activity at a point in time. Figure 9 shows the representation of an activity using the PIF format.

(define-frame ACT1
  :own-slots
  ((Instance-Of ACTIVITY)
   (Name "Cut Window Panes")
   (End END-ACT1)))

(define-frame END-ACT1
  :own-slots
  ((Instance-Of TIMEPOINT)))

(define-frame AGT1
  :own-slots
  ((Instance-Of AGENT)
   (Name "Glazier")))

(define-frame PRFRMS1
  :own-slots
  ((Instance-Of PERFORMS)
   (Actor AGT1)
   (Activity ACT1)))

(define-frame INPUT1
  :own-slots
  ((Instance-Of OBJECT)
   (Name "Fabrication Order")))

(define-frame USES1
  :own-slots
  ((Instance-Of USES)
   (Activity ACT1)
   (Object INPUT1)))

(define-frame OUTPUT1
  :own-slots
  ((Instance-Of OBJECT)
   (Name "panes")))

(define-frame CRTS1
  :own-slots
  ((Instance-Of CREATES)
   (Activity ACT1)
   (Object OUTPUT1)))

Fig. 9. PIF - Activity with participants.

ACT1 defines the cut window panes activity as an instance of ACTIVITY with a name and a relation to END-ACT1. END-ACT1 represents the end of the activity and is defined as a point in time. Then come the definitions of the three participants; each participant is defined in two parts: the definition of the participant itself and the definition of the relationship between the activity and the participant. In the PIF process interchange format and framework, there is no explicit definition of a process; a process is the set of defined activities. The example in Fig. 10 shows how two activities ACT1 and ACT2 are linked by a BEFORE relationship.
(define-frame ACT1
  :own-slots
  ((Instance-Of ACTIVITY)
   (Name "Write Fabrication Order")
   (End END-ACT1)))

(define-frame ACT2
  :own-slots
  ((Instance-Of ACTIVITY)
   (Name "Build Frame")
   (End END-ACT2)))

(define-frame END-ACT1
  :own-slots
  ((Instance-Of TIMEPOINT)))

(define-frame END-ACT2
  :own-slots
  ((Instance-Of TIMEPOINT)))

(define-frame ACT1-ACT2
  :own-slots
  ((Instance-Of BEFORE)
   (Preceding-Timepoint END-ACT1)
   (Succeeding-Timepoint END-ACT2)))

Fig. 10. PIF - Process.
Sharing Activities. The PIF format supports the representation of several sequences of activities. It is possible to define more than one sequence of activities in one file by a set of frames that are instances of BEFORE. However, it is not possible to explicitly identify several processes.

Instance Management. With the PIF format, activities and the participants involved in the activities are described at the type level. Therefore, it is not possible to identify instances in PIF activity definitions.

4.3
Workflow Reference Model
The Workflow Management Coalition (WfMC) defines in the Workflow Reference Model [6] a basic metamodel that supports process definition. The Workflow Reference Model defines six basic object types to represent relatively simple processes. These types are: Workflow Type Definition, Activity, Role, Transition Conditions, Workflow Relevant Data, and Invoked Application. Figure 11 shows the basic process definition metamodel. The Workflow Management Coalition has also published a Process Definition Interchange, in version 1.0 beta [17], that describes a common interface for the exchange of process definitions between workflow engines. Figure 12 presents the definition of the activity Cut Window Panes using this exchange format. Participants (inputs or agents) in an activity are defined explicitly. Data that are created or modified by an activity are defined in the postconditions of the activity or as output parameters of applications invoked during activity execution. In the WfMC Process Definition Interchange format, a process is defined as a list of activities and a list of transitions that
Fig. 11. WfMC - Basic Process Definition MetaModel.
specify in which order activities are executed. In Fig. 13 of the following section, examples of definitions of activities in the WfMC interchange format are shown.
ACTIVITY Cut_Window_Panes
  PARTICIPANT Glazier, Fabrication_Order
  POST CONDITION Window_Panes exists
END_ACTIVITY

PARTICIPANT Glazier
  TYPE HUMAN
END_PARTICIPANT

DATA Fabrication_Order
  TYPE COMPLEX_DATA
END_DATA

DATA Window_Panes
  TYPE REFERENCE
END_DATA

Fig. 12. WfMC - Activity with Participants.
Sharing Activities. Processes are defined using the keywords WORKFLOW and END_WORKFLOW, which respectively begin and end a process definition. In a process definition, it is possible to use activities or participants that have been defined in another process definition. In the example shown in Fig. 13, two processes are defined with a common activity. The common activity is defined in process 1 and reused in process 2.

Instance Management. Process definitions are given at the type level. However, the conditions that fire an activity, or that are realized at its end, are expressed using Boolean expressions with variables. In theory, it is possible to represent the window problem, but version 1.0 beta of the Process Definition Interchange [17] gives little guidance on how to realize it.
WORKFLOW PROCESS1
  ACTIVITY Design_Software
  END_ACTIVITY
  ACTIVITY Validate_Specifications
  END_ACTIVITY
  ACTIVITY Write_Software_Code
  END_ACTIVITY
  TRANSITION FROM Design_Software TO Validate_Specifications
  END_TRANSITION
  TRANSITION FROM Validate_Specifications TO Write_Software_Code
  END_TRANSITION
END_WORKFLOW

WORKFLOW PROCESS2
  ACTIVITY Design_Hardware
  END_ACTIVITY
  ACTIVITY Build_Hardware
  END_ACTIVITY
  TRANSITION FROM Design_Hardware TO Validate_Specifications
  END_TRANSITION
  TRANSITION FROM Validate_Specifications TO Build_Hardware
  END_TRANSITION
END_WORKFLOW
Fig. 13. WfMC - Processes Sharing Activities.
4.4
Conceptual Graphs and Processes
In conceptual graph theory, there is no standard way to represent processes. Processes have not been extensively studied and only a few works are related to their representation. John Sowa [13] presents some directions for representing processes. Dickson Lukose [9] and Guy Mineau [10] have proposed executable conceptual structures. We present below a possible metamodel to represent processes that fulfills the corporate memory requirements expressed in Section 3. The metamodel (Fig. 14) is composed of three basic concepts: ACTIVITY, PROCESS, and EVENT. An activity
TYPE ACTIVITY(x) IS
  T:*x-
    (INPUT)<-T:*i
    (OUTPUT)<-T:*o
    (AGENT)<-T:*a
    (DEPENDS-ON)->PRECONDITION:*pre
    (REALIZES)->POSTCONDITION:*post.

TYPE EVENT(x) IS
  T:*x-
    (END)->ACTIVITY:*act
    (FOLLOWS)<-EVENT:*ev.

TYPE PROCESS(x) IS
  T:*x-
    (FIRST)<-EVENT:*ev.

Fig. 14. Conceptual Graphs - Metamodel.
is defined by its inputs and outputs, the agents that enable the activity, and by pre and post conditions. Preconditions define conditions or states that must be verified to fire the execution of the activity; postconditions define states or conditions that will result from the execution of the activity. An event is a point in time that marks the end of an activity; it marks the realization of the postcondition of the activity. A process is defined as a set of events that represent the execution of a set of activities.
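A minimal data-structure reading of this metamodel can be sketched in Python (the class and field names are our own assumptions, not CG notation):

```python
# Sketch of the metamodel of Fig. 14 as plain data structures; names
# are our own assumptions, not part of conceptual graph theory.
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Activity:
    """An activity: inputs, outputs, agents, pre- and postconditions."""
    name: str
    inputs: list = field(default_factory=list)
    outputs: list = field(default_factory=list)
    agents: list = field(default_factory=list)
    precondition: str | None = None
    postcondition: str | None = None

@dataclass
class Event:
    """A point in time marking the end of an activity."""
    ends: Activity
    follows: Event | None = None  # predecessor event, if any

@dataclass
class Process:
    """A process is a set of events, ordered by their `follows` links."""
    events: list

cut = Activity("Cut Window Panes", inputs=["order"], outputs=["panes"],
               agents=["glazier"], postcondition="panes conform to order")
e1 = Event(ends=cut)
p = Process(events=[e1])
print(p.events[0].ends.name)  # Cut Window Panes
```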
Using this metamodel, the Cut-Window-Panes activity is defined by the definition graph presented in Fig. 15, where two variables with the same name represent the same object.

TYPE CUT-WINDOW-PANES(x) IS
  ACTIVITY:*x-
    (AGENT)<-GLAZIER:*
    (INPUT)<-ORDER:*o
    (OUTPUT)<-PANES:*p
    (REALIZES)->POSTCONDITION: ORDER:*o-(CONFORMS)<-PANES:*p.
Fig. 15. Conceptual Graphs - The Cut Window Panes Activity.
The process to build a window is represented by the definition graph shown in Fig. 16.

TYPE BUILD-WINDOW(x) IS
  PROCESS:*x-
    (FIRST)<-EVENT:*ev1-(END)->WRITE-FABRICATION-ORDER:*
    (FOLLOWS)<-EVENT:*ev2a-(END)->BUILD-FRAME:*
    (FOLLOWS)<-EVENT:*ev2b-(END)->CUT-WINDOW-PANES:*
    (FOLLOWS)<-EVENT:*ev3-(END)->ASSEMBLE-WINDOW:*
    (FOLLOWS)<-EVENT:*ev4-(END)->DELIVER-WINDOW:*.

Fig. 16. Conceptual Graphs - Process.
Sharing Activities. The proposed model allows the representation of processes that share the same activity (as indicated by variables under a global coreference assumption1). Figure 17 shows two processes that share the same activity, VALIDATE-SPECIFICATIONS.

TYPE PROCESS1(x) IS
  PROCESS:*x-
    (FIRST)<-EVENT:*ev1a-(END)->DESIGN-SOFTWARE:*
    (FOLLOWS)<-EVENT:*ev2a-(END)->VALIDATE-SPECIFICATIONS:*vs
    (FOLLOWS)<-EVENT:*ev3a-(END)->WRITE-SOFTWARE-CODE:*.

TYPE PROCESS2(x) IS
  PROCESS:*x-
    (FIRST)<-EVENT:*ev1b-(END)->DESIGN-HARDWARE:*
    (FOLLOWS)<-EVENT:*ev2b-(END)->VALIDATE-SPECIFICATIONS:*vs
    (FOLLOWS)<-EVENT:*ev3b-(END)->BUILD-HARDWARE:*.

Fig. 17. Conceptual Graphs - Processes Sharing Activities.

Each process is defined by a sequence of events, and one event of each process marks the end of the shared activity.

1 The proposed model assumes global coreference: two variables with the same identifier represent the same concept.
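The global coreference assumption can be mimicked with a single shared binding table (the table is our own device for illustration, not CG notation): every occurrence of a variable such as *o must denote one and the same instance, and conflicting bindings are rejected.

```python
# Sketch of the global coreference assumption (our own device, not CG
# notation): all occurrences of a variable share one binding table.

bindings = {}

def bind(var, value):
    """Bind a coreference variable, rejecting conflicting re-bindings."""
    if var in bindings and bindings[var] != value:
        raise ValueError(f"{var} is already bound to {bindings[var]!r}")
    bindings[var] = value
    return value

# WRITE-FABRICATION-ORDER produces the order instance bound to *o ...
bind("*o", "order-42")
# ... and CUT-WINDOW-PANES / ASSEMBLE-WINDOW later consume the same one.
assert bind("*o", "order-42") == "order-42"  # consistent re-use is allowed
print(bindings["*o"])  # order-42
```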
Instance Management. Figure 18 shows that, with the use of variables and the global coreference assumption, conceptual graphs support the representation of the window problem.

TYPE WRITE-FABRICATION-ORDER(x) IS
  ACTIVITY:*x-
    (INPUT)<-CLIENT-ORDER:*c
    (OUTPUT)<-ORDER:*o.

TYPE BUILD-FRAME(x) IS
  ACTIVITY:*x-
    (INPUT)<-ORDER:*o
    (OUTPUT)<-FRAME:*f
    (REALIZES)->POSTCONDITION: ORDER:*o-(CONFORMS)<-FRAME:*f.

TYPE CUT-WINDOW-PANES(x) IS
  ACTIVITY:*x-
    (INPUT)<-ORDER:*o
    (OUTPUT)<-PANES:*p
    (REALIZES)->POSTCONDITION: ORDER:*o-(CONFORMS)<-PANES:*p.

TYPE ASSEMBLE-WINDOW(x) IS
  ACTIVITY:*x-
    (INPUT)<-ORDER:*o
    (INPUT)<-PANES:*p
    (INPUT)<-FRAME:*f
    (DEPENDS-ON)->PRECONDITION: ORDER:*o-(CONFORMS)<-PANES:*p, (CONFORMS)<-FRAME:*f
    (OUTPUT)<-WINDOW:*w.

Fig. 18. Conceptual Graphs - The Window Problem.

The concept type definitions of WRITE-FABRICATION-ORDER, BUILD-FRAME, CUT-WINDOW-PANES, and ASSEMBLE-WINDOW specify that the frame and panes involved in ASSEMBLE-WINDOW conform to the fabrication order.

4.5
Summary
Table 1 presents a summary of this survey on business process representation formalisms.

Table 1. Summary

                      UML  PIF  WfMC  CG
Sharing Activities    No   Yes  Yes   Yes
Instance Management   Yes  No   Yes   Yes

This summary shows that the framework proposed by the WfMC
and conceptual graphs fulfill our requirements for the representation of business processes in corporate memories. However, the first part of our study [2] identified conceptual graphs as the formalism best suited for representing knowledge structures. Therefore, for the sake of uniformity of formalism, we chose conceptual graphs.
5
Experience and Future Work
Using the conceptual graph formalism, a corporate memory has been developed at the Research & Development Department of DMR Consulting Group Inc. in order to memorize the methods, know-how and expertise of its consultants. This corporate memory, called the Method Repository, is a complete authoring environment used to edit, store and display the methods used by the consultants of DMR. The core of the environment is the CG Knowledge Base, a knowledge engineering system based on conceptual graphs. Four methods are commercially delivered: Information Systems Development, Architecture, Benefits Realization, and Strategy; their documentation in paper and in hypertext format is generated from conceptual graphs. About two hundred business processes have been modeled and, from about 80,000 conceptual graphs, we generated more than 100,000 HTML pages in both English and French that can be browsed using commercial Web browsers. This paper has described the research we have done to identify which formalism was the most suitable for the representation of business processes in corporate memories. We have compared four formalisms, and this comparison has shown, as did a previous study [2], that conceptual graphs respond well to the specific requirements involved in the development of corporate memories.
References

1. G. Booch, J. Rumbaugh, and I. Jacobson. Unified Modeling Language, Version 1.1. Rational Software Corporation, 1997.
2. O. Gerbé. Conceptual graphs for corporate knowledge repositories. In Proceedings of the 5th International Conference on Conceptual Structures, pages 474-488, 1997.
3. O. Gerbé, B. Guay, and M. Perron. Using conceptual graphs for methods modeling. In Proceedings of the 4th International Conference on Conceptual Structures, 1996.
4. O. Gerbé and M. Perron. Presentation definition language using conceptual graphs. In Peirce Workshop Proceedings, 1995.
5. C. Havens. Enter, the chief knowledge officer. CIO Canada, 4(10):36-42, 1996.
6. D. Hollingsworth. The Workflow Reference Model. Workflow Management Coalition, 1994.
7. DMR Consulting Group Inc. The IT Macroscope Project, 1996.
8. J. Lee, M. Gruninger, Y. Jin, T. Malone, A. Tate, G. Yost, and other members of the PIF Working Group. The PIF Process Interchange Format and Framework (May 24, 1996), 1996. Available at http://soa.cba.hawaii.edu/pif/.
9. D. Lukose. MODEL-ECS: Executable conceptual modelling language. In Proceedings of the Knowledge Acquisition Workshop (KAW'96), 1996.
10. D. Lukose and G. W. Mineau. A comparative study of dynamic conceptual graphs. Accepted for publication at the 11th KAW, 1998.
11. A. Schreiber, B. Wielenga, H. Akkermans, W. Van de Velde, and A. Anjewierden. CML: The CommonKADS conceptual modelling language. In L. Steels, A. Schreiber, and W. Van de Velde, editors, Proceedings of the 8th European Knowledge Acquisition Workshop (EKAW'94), pages 1-24. Springer-Verlag, 1994.
12. G. Schreiber, B. Wielenga, H. Akkermans, W. Van de Velde, and A. Anjewierden. CML: The CommonKADS conceptual modelling language. In Proceedings of the 8th European Knowledge Acquisition Workshop (EKAW'94), 1994.
13. J. Sowa. Processes and participants. In P. Eklund, G. Ellis, and G. Mann, editors, Proceedings of the 4th International Conference on Conceptual Structures (ICCS'96), pages 1-22. Springer, 1996.
14. E. W. Stein. Organizational memory: Review of concepts and recommendations for management. International Journal of Information Management, 15(1):17-32, 1995.
15. G. van Heijst, R. van der Spek, and E. Kruizinga. Organizing corporate memories. In Proceedings of the Knowledge Acquisition Workshop, 1996.
16. WG11. High-level Petri net standard, working draft, version 2.5, 1997.
17. Workflow Management Coalition. Interface 1: Process Definition Interchange, 1996.
Handling Specification Knowledge Evolution Using Context Lattices

Aldo de Moor 1 and Guy Mineau 2

1 Tilburg University, Infolab, P.O. Box 90153, 5000 LE Tilburg, The Netherlands, ademoor@kub.nl
2 Université Laval, Department of Computer Science, Quebec City, Canada, G1K 7P4, mineau@ift.ulaval.ca
Abstract. Internet-based information technologies have considerable potential for improving collaboration in professional communities. In this paper, we explain the concept of user-driven specification of the network information systems that these communities require, and we describe some problems related to finding the right focus for adequate user involvement. A methodological approach to the management of specification knowledge definitions, which involves composition norms, is summarized. Subsequently, an existing conceptual graph based definitional framework for contexts is presented. Conceptual graphs are a simple, general, and powerful way of representing and reasoning about complex knowledge structures. This definitional framework organizes such graphs in context lattices, allowing for their efficient handling. We show how context lattices can be used for structuring composition norms. The approach makes use of context lattices in order to automatically verify specification constraints. To enable structured specification discourse, this mechanism is used to automatically select the relevant users to be involved, as well as the information appropriate for building their discourse agendas. Consequently, this paper shows how conceptual graphs can play an important role in the development of this key Internet-based activity.
1 Introduction

More and more distributed professional communities, such as research networks, are discovering the potential of collaboration through electronic media such as the Internet. However, several factors make it hard to determine the optimal, or even just adequate, use of information technology to support these networks in their collaborative activities [1]. One reason is that most knowledge creation activities are complex, situated, and dynamic. Another complicating factor is that numerous networked information tools are available, from which it is often difficult to determine which ones to use for what task purposes. Furthermore, system specification becomes even harder as it must also be user-driven, meaning that the users themselves are to discover 'breakdowns' in their use of the system and negotiate specification changes with other users and implementors. Users must initiate their own specification processes, because they themselves are the task experts, and moreover are often only loosely organized, without extensive organizational support for taking care of system development. For example,
417
the publishing of electronic journals by networks of scholars instead of by commercial publishing houses is rapidly becoming popular. The support of the complex collaborative processes involved, such as the reviewing and editing of electronic publications, must not just be treated from a technical perspective. Rather, the new information tools must be designed to 'play an effective role within the social infrastructure of scholarship' 2. The involved scholars themselves are in a good position to define this role, as they best understand the subtleties of their requirements, and can provide volunteer specification labour in these mostly underfunded joint projects. To overcome the specification hurdles, structured methods for user-driven specification are needed. Already some approaches exist, for instance rapid application development, prototyping, and radically tailorable tools 3, 4, which more strongly involve users than traditional systems development methods. However, some drawbacks are that these approaches are based on traditional sequential instead of on evolutionary development models, focus too much on implementation rather than on conceptual issues, or support single user instead of group specification processes. 1.1
User-Driven Specification
True user-driven systems development means that each user can initiate and co-direct the specification process, based on concrete functionality problems he experiences when using the information system for his own purposes. Rather than doing a 'summative evaluation' of the information system in progress, in which users only approve of the overall specification process results, a user should be able to do a 'formative evaluation'. This entails that the users, rather than the developers, propose and decide upon specification suggestions which developers only help translate into actual modifications of the design of the system 5. One approach that could in potential deal with the mentioned issues is process composition 6. Its essence is that users of a system start with a rough definition of their work processes which are completely supported by the set of available tools. Over time, these specifications are gradually refined, always making sure that all processes are covered by available tool-enabled functionality. Such an approach takes into account the empirical findings that in general users initially only need to have an essential understanding of their business processes and tools to be able to initiate work 7, and that new technologies must be introduced gradually to prevent disruption of current work practices 8. One implementation in progress of (group) process composition is the RENISYS specification method for research network information systems 1. This method is discussed later on in this paper. 1.2
Finding a Focus
A major problem with process composition is that it is very difficult to determine the exact scope of a specification process aimed at resolving a functionality problem. Finding the proper scope is important in order to arrive at legitimate specifications, which are not only meaningful, but also acceptable to the professional community as a whole 1. However, this acceptability does not mean that all users should be consulted about
418
every change all the time. Of course, on the one hand, all users who have an interest in the system component to be changed need to be involved. On the other hand, however, as few users as possible should participate in the resolution of a particular specification problem, in order to prevent 'specification overload', as well as to ensure the assignment of clear specification responsibilities. Most current specification approaches intending to foster user participation do not systematically analyze how to achieve adequate user involvement in specification processes. For user participation in specification discourse (defined as rational discussion among users to reach agreement on the specifications of their network information system to become more satisfactory), it is at least necessary to precisely know: 1. 2. 3. 4.
When to consult users? Which users to consult? What to consult them about? How to consult them?
Question 1 has to do with how to recognize breakdowns, which are disruptions in work processes experienced by participants while using the information system. A breakdown should trigger specification discourse resulting in newly defined functionality that better matches the real information needs of the user community. Question 4 focuses on how such specification discourse is to be systematically supported. Users could be provided with semi-structured linguistic options (representing for instance requests, assertions, promises), which are tailored to the particular specification problem at hand. Answering these two questions does not fall within the scope of the current paper. Ideas being worked out in the RENISYS project are taken from the language/action perspective 9. This rather new paradigm for IS specification looks at the actions people carry out while communicating, and how this communication helps them to coordinate their activities. One of the key paradigmatic ideas is that people can make commitments as a result of speech acts. Such commitments in turn can be used to generate agendas of tasks to be carried out and evaluated by the various participating users. An agenda for a particular user thus consists of all things a user has to do, normally concerning the conduct and coordination of goal-oriented activities. In our case, however, agenda items refer to the specifications to be made or agreed upon of the network information system that supports the group work. In this article, we will concentrate on questions 2 and 3. The main issues we will address are: (1) selecting the relevant users to participate in system specification discourse, and (2) determining the possibly different agendas for a particular specification discourse for the various selected users. We will do this by developing a mechanism to efficiently handle user-driven specification knowledge evolution using context lattices. 
These were first presented in [10], and will be briefly reintroduced in Sect. 3.3. The context lattices are used to (1) organize specification knowledge, (2) check whether knowledge definitions are legitimate (i.e. both meaningful and acceptable), and (3) determine which participants should be involved, with what privileges, in specification discourse to resolve illegitimate knowledge definitions. In Sect. 2, the approach to knowledge handling in the user-driven specification method RENISYS is described. Sect. 3 introduces conceptual graph-based contexts and context lattices. In Sect. 4, context lattices are applied to structure what is called composition norm management, and in this way support the specification process.
2 Specification Knowledge Handling

First, the different categories of specification knowledge distinguished in the RENISYS method are presented. Then, the problem of how to ensure that specification changes are covered by what is called the composition norm closure is discussed.
2.1 Knowledge Categories

RENISYS distinguishes three types of specification knowledge: ontological (type) definitions, state definitions, and norm definitions. The ontologies contain functionality specifications (what are the entities, attributes, and relationships to be represented and supported by the IS). States define which entities are or should be actually present. Norms determine (1) who can use the system (determined by action norms) and (2) who should be involved in their specification (determined by composition norms). Conceptual graphs are used as the underlying knowledge representation formalism, because a formalism is needed that is sufficiently close to natural language to efficiently express complex specifications understandable to users, yet formal and constrained enough to allow for automated coordination of the specification process. CG theory is very well suited to this task, as argued in [1].

Type Definitions

In RENISYS, the type definitions are organized into an ontological framework consisting of three kinds of ontologies. The heart of this framework is the core process ontology, consisting of elementary network process concepts derived from workflow modelling theory. Built on top of these generic concepts, three domain ontologies are defined. A domain is a system of network entities that can be observed by analyzing the universe of discourse from a particular perspective. The problem domain is the UoD seen from the task perspective, the human network is the UoD observed from the organizational perspective, and the information system is the same seen from the functionality perspective. The domain ontologies can be customized by the user to express concepts specific to his situation, thus allowing for conceptual evolution. Finally, the framework ontology describes a set of mapping constructs that link entities from the various domains.
Type definitions represent functionality specifications, such as the structure of documents, or the inputs and outputs of workflows. For example, a simplified definition of the type MAILING_LIST could be:

TYPE: MAILING_LIST: *x -
    (def)  -> INFORMATION_TOOL: ?x
    (matr) -> RECEIVED_MAIL
    (rslt) -> RESENT_MAIL
    (poss) <- LIST_OWNER.
Note that we do not use the standard type definition format introduced by Sowa, as we want a uniform representation format that can be used for all three categories of knowledge (i.e. types, norms, and states). Furthermore, we want to be able to represent, and infer from, qualified type definitions, such as partial, proposed, and invalid type definitions. For instance, partial type definitions must be identified and represented as such. They are incomplete definitions of the necessary properties that a concept type should have. They are very important in guiding specification discourse, as often a group of users will initially agree that a concept at least necessarily has a set of properties, while also agreeing that the definition is not yet complete. A partial type definition is thus open to further debate.

State Definitions

State definitions represent states-of-affairs, which are first of all needed to determine which entities the information system implementation must support. For example, the following state definition indicates that John Doe is the list owner of the cg-mailing list. We thus know that all mailing list owner functions must be installed for at least this network participant:

STATE: MAILING_LIST: cg-list <- (poss) <- LIST_OWNER: John Doe.
Also, state knowledge plays two crucial roles in the specification process of the network information system. First, it can be used to detect incomplete or inconsistent functionality specifications. For instance, if the type definition of a mailing list says that there should be at least one list owner, but (unlike in the above state definition) no such list owner has been defined, then a specification process can be started to specify who currently plays this role. Alternatively, if no such person can be found, it may be that the type definition of mailing list must be revised so that this (currently mandatory) relation can be removed. Second, state definitions can be used as input objects in specification processes, for instance by allowing for the identification of subjects who can create new knowledge definitions. Such concrete assignments of specification responsibilities are essential for network information system development to be successful.

Norm Definitions

Norm definitions represent deontic knowledge, which includes such concepts as responsibilities, permissions and prohibitions. This knowledge can, among other things, help to define and manage workflow and specification commitments. Formal models for such commitment management in a language/action context are dealt with in speech-act based deontic logic [11]. A key concept is that of the actor, which is an interpreting entity capable of playing process-controlling roles. Actor concepts themselves are ultimately instantiated by subjects, who are the people using and developing the network information system. The basic pattern of a norm definition is an actor to which the norm applies, in combination with a control process (initiation, execution, or evaluation) and a transformation (a process in which a set of input objects is transformed into an output object) being controlled. Norm definitions can be subdivided into action norms and composition norms.
An action is a control process plus the controlled (operational-level) workflow; a composition is defined as a control process plus a (meta-level) specification process. Action norms regulate behaviour at the operational level, in which case the transformations are called workflow processes. An example of an action norm is the following permitted action, which says that a list owner is permitted to add a list member:

PERM_ACTION: LIST_OWNER <- (agnt) <- EXEC -> (obj) -> ADD_LIST_MEMBER.
Composition norms, on the other hand, define desired behaviour at the specification level: they allow users who are, through actor roles, involved in workflows to be identified as simultaneously having legitimate roles in the specification process. Three kinds of specification processes are distinguished: creation, modification, and termination. An example of a composition norm could be this mandatory composition:

MAND_COMP: LIST_OWNER <- (agnt) <- EVAL -> (obj) -> TERMINATE -> (rslt) -> TYPE: LIST_MEMBER.
The termination of a type means that a legitimate type is removed from the type hierarchy together with all its definitions, which may be required if a concept is no longer useful. This particular norm means that a list owner is required to evaluate (i.e. approve or reject) any list member type termination, which may have been proposed by another actor. Having a well-supported approach for dealing with composition norms is crucial for managing the change process of network information systems. These norms help to identify which actors are to be involved in a particular specification process. Furthermore, they can be used to set the agenda for specification discourse, since they indicate what knowledge definitions an actor can legitimately handle, and in what way. Thus, composition norms provide the key to answering the two questions we posed in Sect. 1.2.
2.2 Composition Norm Closure
Traditional information systems analysis can be characterized as taking a snapshot of "the" sum of information requirements of an organization by a monolithic external group of analysts. In network information systems development, however, many users are often only temporarily involved in specification processes, and then only with a very limited perspective and mandate: trying to resolve their own particular problem or that of others with whom they closely collaborate. If every specification is linked to others and every specification must be covered by the appropriate composition norms, a major problem arises in case of (partially) changing needs: how to guarantee that proposed specification changes remain part of the composition norm closure (defined as the sum of the explicitly asserted plus all derivable composition norms), i.e., how to make sure that a proposed specification is legitimate and also does not leave any other specification uncovered? To deal with this problem, it is often not enough to find just one applicable norm; completeness is very important. For instance, if one wants to know whether the current user, who plays a number of actor roles, is allowed to change the definition of a particular type, all composition norms applicable to this definition need to be identified. However, as the knowledge base of graphs grows large, checking every unorganized composition norm by standard projections can get very cumbersome. This is especially true when recursive operations on embedded parts must be carried out. Furthermore, such a straightforward approach does not easily generate related contextual information, such as the other definitions the actor specifying the current definition is allowed to make. Therefore, a more sophisticated norm querying and updating mechanism is needed. Such a query mechanism, which is optimized to handle particular contexts and the relations between different worlds of assertions, is formed by context lattices. Two major advantages of context lattices are that (1) they allow queries to be simplified, as embedded queries can be subdivided into their constituent parts, and (2) the structure of the knowledge base itself can be queried, allowing interesting relations to be easily discovered [10].
3 Contexts
Composition norms play a crucial role in the coordination of the user-driven specification process, as they put constraints on who is authorized to (re)define which particular knowledge definitions. Thus, the knowledge definitions are only true if the specification process conditions under which they are asserted are true as well. Such conditional sheets of assertion can be naturally represented as conceptual graph contexts [10]. Contexts are an essential building block of conceptual graph theory [12]. Building on these notions, Mineau and Gerbé (1997) presented a formal theory of context lattices, which is briefly summarized here. A context is a conceptual device that can be used for organizing information that originates from multiple worlds of assertion. It consists of an extension and an intention: the truth of a set of assertions (the extension) depends on a specific set of conditions (the intention). The intention is formed by those graphs which, if conjunctively satisfied, make the extension true. Thus, only if the intention graphs can all be made true do the extension graphs exist. A context Ci is defined as a tuple of two sets of conceptual graphs:

   Ci = < T, G >    (1)

where T is the intention, and G is the extension of Ci. Two functions I and E were defined so that for a context Ci its intention T equals I(Ci) and its extension G equals E(Ci). Contexts can directly be used to represent norms. The intention of a (composition) norm defines that some actor is capable of controlling a specification process of some kind of knowledge definition. The graph representation of this most generic composition norm intention is:

   ACTOR <- (agnt) <- CONTROL -> (obj) -> SPECIFY -> (rslt) -> DEFINITION: #
It will be used as the intention of some context, while the referent of the DEFINITION concept, representing the knowledge definition being specified, will be considered to be in the extension of the same context. The format of the extension graph depends on the type of this definition (i.e. TYPE, PERM_ACTION, or STATE). Examples of these definitions were given in the previous section.
3.1 Example: Mailing List

We will illustrate the ideas put forward in this paper with a short example of a specification process typically encountered in a research network. To clarify the ideas introduced in this paper, only three permitted compositions are given; in a realistic case, required and forbidden compositions would also be needed. The example is the following. Many research networks are supported by mailing lists. A mailing list comes installed with a default set of properties. Some public lists allow any member to control the change of all their properties, which is explicitly defined as a composition norm. Often, however, as the networks grow in scope, the mailing list comes to play new roles. For example, the purpose for which the mailing list is used could be changed from enabling general information exchange to supporting the preparation of a confidential report. In the case of such a private mailing list, the list owner, who is a special type of network actor, can explicitly be allowed to modify the settings of the list parameters. Finally, for any type of mailing list, a list owner can start the cancellation of the action norm which says that a list applicant can register himself as a list member. In this case, the following three (permitted) composition norms apply: (1) in a mailing list, any network participant is permitted to control (initiate, execute, and evaluate) modifications of mailing list properties, for example when the scope of the group needs to be changed; (2) in the case of a private mailing list, a list owner is permitted to make modifications of the properties of the mailing list, i.e. he may change the settings about whether the list has open or closed subscription, whether it is moderated or not, etc.; (3) a list owner is allowed to initiate the termination of the action (norm) that a list applicant can register himself as a list member. As contexts Ci, these composition norms can be represented as follows:
C1: Perm_Comp_1
   i1: ACTOR <- (agnt) <- CONTROL -> (obj) -> MODIFY -> (rslt) -> TYPE: #
   g1: MAILING_LIST

C2: Perm_Comp_2
   i2: LIST_OWNER <- (agnt) <- EXEC -> (obj) -> MODIFY -> (rslt) -> TYPE: #
   g2: PRIVATE_MAILING_LIST

C3: Perm_Comp_3
   i3: LIST_OWNER <- (agnt) <- INIT -> (obj) -> TERMINATE -> (rslt) -> PERM_ACTION: #
   g3: LIST_APPLICANT <- (agnt) <- EXEC -> (obj) -> REG_LIST_MEMBER
Hereby we assume that this (partial) type hierarchy has been defined:

T
   CONTROL
      EXEC
      INIT
   DEFINITION
      PERM_ACTION
      TYPE
   INFO_TOOL
      MAILING_LIST
         PRIVATE_MAILING_LIST
   ACTOR
      LIST_APPLICANT
      LIST_OWNER
   REG_LIST_MEMBER
   SPECIFY
      MODIFY
      TERMINATE
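The subtype checks used throughout the example (e.g. INIT ≤ CONTROL, TERMINATE ≤ SPECIFY) can be sketched in Python. The parent map below is a hypothetical encoding of the partial hierarchy above, not code from the paper:

```python
# Hypothetical sketch: the example's partial type hierarchy as a
# child -> parent map, with a reflexive-transitive subtype test.
HIERARCHY = {
    "CONTROL": "T", "DEFINITION": "T", "INFO_TOOL": "T",
    "ACTOR": "T", "REG_LIST_MEMBER": "T", "SPECIFY": "T",
    "EXEC": "CONTROL", "INIT": "CONTROL",
    "PERM_ACTION": "DEFINITION", "TYPE": "DEFINITION",
    "MAILING_LIST": "INFO_TOOL",
    "PRIVATE_MAILING_LIST": "MAILING_LIST",
    "LIST_APPLICANT": "ACTOR", "LIST_OWNER": "ACTOR",
    "MODIFY": "SPECIFY", "TERMINATE": "SPECIFY",
}

def is_subtype(t1: str, t2: str) -> bool:
    """True iff t1 <= t2 in the type hierarchy (walk up the parents)."""
    while True:
        if t1 == t2:
            return True
        if t1 == "T":      # reached the universal type without meeting t2
            return False
        t1 = HIERARCHY[t1]

print(is_subtype("PRIVATE_MAILING_LIST", "INFO_TOOL"))  # True
print(is_subtype("EXEC", "SPECIFY"))                    # False
```

This test is the single-inheritance special case; full CG projection also compares referents and relation arcs.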
The contexts thus allow for a clear separation between the knowledge being specified (gi) and the modality of the actual specification process (ii). Note that contexts, instead of non-nested CGs, are used not only in C3 but in C1 and C2 as well, because the intention (actor permitted to control a specification process) represents the conditions under which the extension (knowledge definition, e.g. mailing list) may be specified.

3.2 Basic Context Inferences
Contexts have some interesting properties that can be inferred from the previous definitions [10]. First, it is important to realize that a graph g can be in E(Ci) either because it has been explicitly asserted in that context, or because it is part of the transitive closure of the asserted graphs of that context. Thus, if gm is a generalization of some gn in E(Ci), then gm is also considered to be in E(Ci). In the example, E(C2) therefore contains both g2 and g1, as a MAILING_LIST is a generalization of a PRIVATE_MAILING_LIST. Second, if I(Cj) ≤ I(Ci) then E(Cj) ⊆ E(Ci). This means that if the intention of Cj is a specialization of the intention of Ci, then (at least) all the extension graphs of Cj are in the extension of Ci. Thus, as i2 ≤ i1, we can derive that E(C1) = {g1, g2}. Note that the contexts described in Sect. 3.1 only contain the extension graphs that have been explicitly asserted; later on in this paper we will also include the derived extension graphs. Now that we have made some basic context inferences, we will look at how they can be used in the construction of context lattices.
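The two inference rules can be sketched as a small fixpoint computation. All identifiers below are assumed for illustration: graphs are reduced to opaque ids, and generalization is given as lookup tables standing in for CG projection:

```python
# Illustrative sketch (identifiers assumed): recompute context
# extensions with the two inference rules of this section.
# g1 (MAILING_LIST) is a generalization of g2 (PRIVATE_MAILING_LIST),
# and intention i2 specializes i1.
GEN_OF = {"g1": {"g1"}, "g2": {"g1", "g2"}, "g3": {"g3"}}  # g -> its generalizations
INT_LEQ = {("i1", "i1"), ("i2", "i2"), ("i3", "i3"), ("i2", "i1")}  # (ij, ii): ij <= ii

contexts = {"C1": ("i1", {"g1"}),
            "C2": ("i2", {"g2"}),
            "C3": ("i3", {"g3"})}

def close_extensions(ctxs):
    # Rule 1: any generalization of an asserted extension graph is
    # also in the extension.
    closed = {k: (i, set().union(*(GEN_OF[g] for g in E)))
              for k, (i, E) in ctxs.items()}
    # Rule 2: if I(Cj) <= I(Ci), then E(Cj) is contained in E(Ci);
    # propagate until a fixpoint is reached.
    changed = True
    while changed:
        changed = False
        for ij, Ej in closed.values():
            for ii, Ei in closed.values():
                if (ij, ii) in INT_LEQ and not Ej <= Ei:
                    Ei |= Ej
                    changed = True
    return closed

full = close_extensions(contexts)
print(sorted(full["C1"][1]), sorted(full["C2"][1]))  # ['g1', 'g2'] ['g1', 'g2']
```

This reproduces the derivation in the text: E(C1) = E(C2) = {g1, g2}, while E(C3) stays {g3}.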
3.3 Context Lattices
A context lattice is a structure that can be used to organize a set of contexts, allowing associations between these contexts to be made. A context lattice L consists of a set of formal contexts C*i, which are structured in a partial order ≤. A formal context C*i is represented as:
   C*i = < I*(Gi), E*(Ti) >    (2)

where I*(Gi) = ∪j { I(Cj) | Gi ⊆ E(Cj) } and E*(Ti) = ∩j { E(Cj) | I(Cj) ⊆ Ti }.

Context lattices can be used to optimize query mechanisms concerning (1) particular contexts and (2) the relations between sets of intentions and extensions. In other words, they can be used to make explicit the relations between different worlds of assertions [10]. To create the context lattice for our example, we need to take the following steps:

1. Recalculate the contexts. As noted earlier, contexts can have explicitly asserted as well as derived extension graphs. The representations of C1, C2 and C3 only showed the explicitly asserted extension graphs:

   C1 = < {i1}, {g1} >
   C2 = < {i2}, {g2} >
   C3 = < {i3}, {g3} >

With completely calculated extensions (using the inferences of Sect. 3.2 to recalculate E(Ci)), the contexts can be represented as:

   C1 = < {i1}, {g1, g2} >
   C2 = < {i2}, {g1, g2} >
   C3 = < {i3}, {g3} >

Note that C1 and C2 now have the same set of extension graphs.

2. Calculate the formal contexts. For each context, we now calculate the formal context (using C*i = < I*(Gi), E*(Ti) >):

   C*1 = < {i1, i2}, {g1, g2} >
   C*2 = < {i1, i2}, {g1, g2} >
   C*3 = < {i3}, {g3} >

For C1 and C2 the same formal context is produced.

3. Calculate the context lattice. Each formal context should occur only once, so redundant formal contexts (i.e. the above C*2) must be removed. Furthermore, in order to create a lattice, we must also add a formal context including all extension graphs, as well as a context including all intention graphs. After renumbering, the formal contexts are C*1 = < {}, {g1, g2, g3} > (top), C*2 = < {i1, i2}, {g1, g2} >, C*3 = < {i3}, {g3} >, and C*4 = < {i1, i2, i3}, {} > (bottom); the resulting context lattice is as represented in Fig. 1.

Fig. 1. The Context Lattice for the Example
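The three steps above can be sketched as follows. This is a hedged illustration over opaque graph ids, assuming the recalculated contexts of step 1; it is not the paper's implementation:

```python
# Illustrative sketch (identifiers assumed): derive the formal
# contexts of the example, using
#   I*(G) = union of I(Cj) with G a subset of E(Cj), and
#   E*(T) = intersection of E(Cj) with I(Cj) a subset of T.
contexts = {"C1": (frozenset({"i1"}), frozenset({"g1", "g2"})),
            "C2": (frozenset({"i2"}), frozenset({"g1", "g2"})),
            "C3": (frozenset({"i3"}), frozenset({"g3"}))}

def I_star(G):
    return frozenset().union(*(I for I, E in contexts.values() if G <= E))

def E_star(T):
    Es = [E for I, E in contexts.values() if I <= T]
    return frozenset.intersection(*Es) if Es else frozenset()

def context_lattice():
    # one formal context per context; a set drops the duplicates
    formal = {(I_star(E), E_star(I)) for I, E in contexts.values()}
    # complete to a lattice: a top with all extension graphs and a
    # bottom with all intention graphs
    all_I = frozenset().union(*(I for I, _ in contexts.values()))
    all_E = frozenset().union(*(E for _, E in contexts.values()))
    formal |= {(frozenset(), all_E), (all_I, frozenset())}
    return formal

lattice = context_lattice()
print(len(lattice))  # 4: C1 and C2 collapse into one formal context
```

Running this yields exactly the four formal contexts of the figure: the top, < {i1, i2}, {g1, g2} >, < {i3}, {g3} >, and the bottom.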
4 Structuring Composition Norm Management
On top of the context lattice structure, a mechanism that allows for its efficient querying has been defined [10]. One of its main advantages is that complex queries consisting of sequences of steps, and involving both intentions and extensions, can be formulated. This mechanism can be used for a structured yet flexible approach to norm management in specification discourse, by automatically determining which users to involve in discussions about specifications and changes, and what they are to discuss (their agendas). There are two main ways in which a context lattice can be used in a user-driven specification process. First, it can be used to assess whether a new knowledge definition is legitimate, by checking if a particular specification process is covered by some composition norm. As this is a relatively simple task of projecting the specification process on the composition norm base, we do not work out this application here. The second application of context lattices in the specification process is applying (new) specification constraints (constraints on the relations that hold between different knowledge definitions) to the existing (type, norm, and state) knowledge bases. This differs from the first application in that, after a constraint has been applied, originally legitimate knowledge base definitions may become illegitimate, and would then need to be respecified. In this section we first give a brief summary of how context lattices can be queried. Then, it is illustrated how this query mechanism can play a role in composition norm management, by applying it to our example in the resolution of one realistic specification constraint.
4.1 Querying Context Lattices

In order to make a series of consecutive queries in which the result of one query is the input for the embedding query, which is needed for navigating a context lattice, we need two more constructs. First, we need to be able to query a particular context extension or intention. Second, we must be able to identify the context which matches the result of an extension- or intention-directed query. For the first purpose, two query functions have been defined that allow respectively extension or intention graphs to be retrieved from a specified formal context [10]:
   ξE*(C*, q) = { g ∈ E(C*) | g ≤ q }    (3)
   ξI*(C*, q) = { g ∈ I(C*) | g ≤ q }    (4)
Furthermore, Mineau and Gerbé have constructed two context-identifying functions:

   CE(G) = < I*(G), E*(I*(G)) >    (5)
   CI(T) = < I*(E*(T)), E*(T) >    (6)
Space does not permit describing the inner workings of these functions in detail (see [10] for further explanation). For now, it suffices to understand that these functions allow the most specific context related to a set of extension graphs G or a set of intention graphs T, respectively, to be found. Together, these functions can be used to produce embedded queries by alternately querying and identifying contexts, thus enabling navigation through the context lattice.
4.2 Supporting the Specification Process
In the applications of context lattices discussed in the previous section, the following general steps apply:
1) Check either the specification of a new knowledge definition against the composition norm base, or the specifications of an existing knowledge base against a specification constraint.
2) Identify the resulting illegitimate knowledge definition(s).
3) Identify appropriate 'remedial composition norms' (i.e. composition norms in which the illegitimate knowledge definition is in the extension).
4) Build discourse agendas (overviews of the specifications to discuss) for the users identified by those remedial composition norms, so that they can start resolving the illegitimate definitions.
These processes consist of sequences of queries that switch their focus between what is being defined and who is defining it. For this purpose, the functions provided by context lattice theory are concise and powerful, at least from a conceptual point of view. One way in which we can apply context lattices is by formulating specification constraints, which constrain possible specifications and can be expressed as (sequences of) composition norm queries. Note that the example of the resolution of a specification constraint presented next is simple, and the translation into context lattice queries is not yet very elegant. However, what we try to present here is the general idea: that flattening queries using context lattices is a powerful tool for simplifying and helping to understand queries with respect to the contexts where they apply. In future work, we aim to develop a more standardized approach that can be applied to different situations.
4.3 An Example
One specification constraint could be: "Only actors involved in the definition of permitted actions are to be involved in the definition of (the functionality of) information tools". The constraint guarantees that enabling technical functionality is defined only by those who are also involved in defining the use that is being made of at least some of these tools. This specification constraint, and much more complex ones, can help to realize more user-driven specification, tailored to the unique characteristics of a professional community. The power of the approach developed in this paper is that it allows such constraints to be easily checked against any existing norm base, identifying (now) illegitimate knowledge definitions and providing the contextual information necessary for their resolution. We will illustrate these rather abstract notions by translating the above-mentioned informal specification constraint into a concrete sequence of composition norm queries. Decomposing the specification constraint, we must answer the following questions:
1. Which actors control the specification of which information tools?
2. Are there illegitimate composition norms (because some of these norm actors are not also involved in the specification of any permitted actions)?
3. Which actors are to respecify these illegitimate norms, on the basis of what agendas?

Questions 1-3 can be decomposed into the following steps (this decomposition is not trivial; in future research we aim at providing guidelines to achieve it):

1a. Determine which specializations gj of information tools have been defined. The query s1 should start at the top of the context lattice, as this context contains all extension graphs:

   s1 = ξE*(C*1, q1) = {g1, g2} = { MAILING_LIST, PRIVATE_MAILING_LIST }

   where q1 = INFO_TOOL.

1b. For each of these information tools gj, determine which actors ai control its specification:

   s2 = ξI*(CE(g1), q2) = ξI*(C*2, q2) = {i1, i2}
   s3 = ξI*(CE(g2), q2) = ξI*(C*2, q2) = {i1, i2}

   where q2 = ACTOR: ? <- (agnt) <- CONTROL -> (obj) -> SPECIFY -> (rslt) -> TYPE

   and a2 = ACTOR: ? = { ACTOR, LIST_OWNER } and a3 = ACTOR: ? = { ACTOR, LIST_OWNER }.
2a. Determine which actors ai are involved in the specification of permitted actions. This query should be directed toward the bottom of the context lattice, as this context contains all intention graphs (which in turn include the desired actor concepts):

   s4 = ξI*(C*4, q4) = {i3}

   where q4 = ACTOR: ? <- (agnt) <- CONTROL -> (obj) -> SPECIFY -> (rslt) -> PERM_ACTION

   and a4 = ACTOR: ? = { LIST_OWNER }.
2b. Using a4, determine for each type of information tool gj (see 1a), with its corresponding si and actors ai (see 1b), which actors a'i currently illegitimately control its specification process:

   g1: MAILING_LIST           a'2 = a2 - (a2 ∩ a4) = { ACTOR }
   g2: PRIVATE_MAILING_LIST   a'3 = a3 - (a3 ∩ a4) = { ACTOR }
2c. For each tool identified by the gj having illegitimate controlling actors a'i, define the illegitimate composition norms c'k = < il, gj > by selecting from the si from 1b those il which contain a'i:

   c'1 = < i1, g1 >
   c'2 = < i1, g2 >
3. In the previous two steps we identified the illegitimate norms. Now we will prepare the stage for the specification discourse in which these norms are to be corrected. A composition norm does not just need to be seen as a context: it is itself a knowledge definition, which needs to be covered by the extension graph of at least one other composition norm, which in that case acts as a meta-norm. In order to correct the illegitimate norms we need to (a) identify which actors are permitted to do this, and (b) determine what items should be on their specification agenda. This step falls outside the scope of this paper but is presented here to provide the reader with the whole picture. A forthcoming paper will elaborate on meta-norms and contexts of contexts.
3a. For each illegitimate composition norm c k, select the actors ai from the permitted (meta) composition norms era which allow that c k to be modified: 1 /
CTn = < i r a , g i n > where~m
=
ACTOR:? ->
<- (agnt) <EXEC (rslt) -> PERM_COMP:
->
(obj)
->
MODIFY
-
#
t
and g m = c~
3b. For each of these actors ai, build an agenda Ai. Such an agenda could consist of (1) all illegitimate norms c'k that each actor is permitted to respecify, and (2) contextual information from the most specific context in which these norms are represented, or from other contexts which are related to this context in some significant way. The exact contextual graphs to be included in these agendas are determined by the way in which the specification discourse is being supported, which is not covered in this paper and needs considerable future research. However, we would like to give some idea of the direction we are investigating. In our example, we identified the illegitimate (derived) composition norm 'any actor is permitted to control (i.e. initiate, execute, and evaluate) the specification of a private mailing list' (< i1, g2 >). From its formal context it also appears that a list owner, on the other hand, is permitted to at least execute the modification of this type (< i2, g2 >). If another specification constraint were to say that one permitted composition per control process category per knowledge definition suffices, then only the initiation and evaluation of the modification would remain to be defined (as the execution of the modification of the private mailing list type is already covered by the norm referring to the list owner). Thus, the specification agendas Ai for the actors ai identified in 3a could include: 'you can be involved in the respecification of the initiation and the evaluation of the modification of the type private mailing list', as well as 'there is also actor-such-and-such (e.g. the list owner) who has the same (or more general/specific) specification rights, with whom you can negotiate or whom you can ask for advice.'
Of course, in a well-supported discourse these kinds of agendas would be translated into statements and queries much more readable to their human interpreters, but such issues are of a linguistic nature and are not dealt with here.
5 Conclusions
Rapid change in work practices and supporting information technology is becoming an ever more important aspect of life in many distributed professional communities. One of their critical success factors therefore is the continuous involvement of users in the (re)specification of their network information system. In this paper, the conceptual graph-based approach for the navigation of context lattices developed by Mineau and Gerbé (1997) was used to structure the handling of user-driven specification knowledge evolution. In virtual professional communities, the various kinds of norms and the knowledge definitions to which they apply, as well as the specification constraints that apply to these norms, are prone to change. The formal context lattice approach can be used to guarantee that specification processes result in legitimate knowledge definitions, which are both meaningful and acceptable to the user community. Extracting the context to which a query is applied provides simpler graphs that can more easily be understood by the user when he interacts with the CG base. It also provides a hierarchical path that guides the matching process between CGs, which would otherwise not be there to guide the search. Even though the computational cost of matching graphs would be the same, overall performance would be improved by these guidelines, as the search is more constrained. But the most interesting part about using a context lattice is that it provides a structuring of different contexts that helps conceptualize (and possibly visualize) how different contexts ('micro-worlds') relate to one another, adding to the conceptualization power of conceptual graphs. In future research, we plan to further formalize and standardize the still quite conceptual approach presented here, and also to look into issues regarding its implementation.

¹ For lack of space, we have not included such composition norms in our example, but since they are also represented in a context lattice, the same mechanisms apply. The only difference is that the extension graphs are themselves contexts (as defined in Sect. 3).
References

1. A. De Moor. Applying conceptual graph theory to the user-driven specification of network information systems. In Proceedings of the Fifth International Conference on Conceptual Structures, University of Washington, Seattle, August 3-8, 1997, pages 536-550. Springer-Verlag, 1997. Lecture Notes in Artificial Intelligence No. 1257.
2. B.R. Gaines. Dimensions of electronic journals. In T.M. Harrison and T. Stephen, editors, Computer Networking and Scholarly Communication in the Twenty-First Century, pages 315-339. State University of New York Press, 1996.
3. L.J. Arthur. Rapid Evolutionary Development - Requirements, Prototyping & Software Creation. John Wiley & Sons, 1992.
4. T.W. Malone, K.-Y. Lai, and C. Fry. Experiments with Oval: A radically tailorable tool for cooperative work. ACM Transactions on Information Systems, 13(2):177-205, 1995.
5. P. Holt. User-centred design and writing tools: Designing with writers, not for writers. Intelligent Tutoring Media, 3(2/3):53-63, 1992.
6. G. Fitzpatrick and J. Welsh. Process support: Inflexible imposition or chaotic composition? Interacting with Computers, 7(2):167-180, 1995.
7. L.J. Arthur. Quantum improvements in software system quality. Communications of the ACM, 40(6):46-52, 1997.
8. I. Hawryszkiewycz. A framework for strategic planning for communications support. In Proceedings of the Inaugural Conference of Informatics in Multinational Enterprises, Washington, October 1997.
9. F. Dignum, J. Dietz, E. Verharen, and H. Weigand, editors. Proceedings of the First International Workshop on Communication Modeling 'Communication Modeling - The Language/Action Perspective', Tilburg, The Netherlands, July 1-2, 1996. Springer eWiC series, 1996. http://www.springer.co.uk/eWiC/Workshops/CM96.html.
10. G. Mineau and O. Gerbé. Contexts: A formal definition of worlds of assertions. In Proceedings of the Fifth International Conference on Conceptual Structures, University of Washington, Seattle, August 3-8, 1997, pages 80-94. Springer-Verlag, 1997. Lecture Notes in Artificial Intelligence No. 1257.
11. F. Dignum and H. Weigand. Communication and deontic logic. In R. Wieringa and R. Feenstra, editors, Working Papers of the IS-CORE Workshop on Information Systems - Correctness and Reusability, Amsterdam, 26-30 September, 1994, pages 401-415, September 1994.
12. J.F. Sowa. Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, 1984.
Using CG Formal Contexts to Support Business System Interoperation

Hung Wing 1, Robert M. Colomb 1 and Guy Mineau 2

1 CRC for Distributed Systems Technology, Department of Computer Science, The University of Queensland, Brisbane, Qld 4072, Australia
2 Dept. of Computer Science, Université Laval, Canada
Abstract. This paper describes a standard interoperability model based on a knowledge representation language such as Conceptual Graphs (CGs). In particular, it describes how an Electronic Data Interchange (EDI) mapping facility can use CG contexts to integrate and compare different trade documents by combining and analysing different concept lattices derived from formal concept analysis theory. In doing this, we hope to provide a formal construct which will support the next generation of EDI trading concerned with corporate information.
1 Introduction
There have been several attempts to overcome the semantic heterogeneity existing between two or more business systems. Consider a simple paper-based system in which purchase orders generated from a purchasing program can be faxed (or communicated via telephone) to a human coordinator, whose job is to extract and transcribe the information from an order to the format required by an order entry program. In general, the coordinator has specific knowledge that is necessary to handle the various inconsistencies and missing information associated with exchanged messages. For example, the coordinator should know what to do when information is provided that was not requested (unused item) or when information that was requested is not provided (null item). The above interoperation technique is simple and relatively inexpensive to implement, since it does not require the support of EDI software. However, this approach is not flexible enough to support a complex and dynamic trade environment where time-critical trade transactions (e.g. a foreign exchange deal) may need to be interoperated on-the-fly, without prior negotiation about standardised trading terms, and without having to rely on the human ability to quickly and correctly transcribe complex trade information. To facilitate system interoperation, other more sophisticated systems include general service discovery tools like the trader service of Open Distributed Processing [5], schema integration tools in multidatabase systems [2], context-based interchange tools of heterogeneous systems [7, 1], and email message filtering
tools of Computer Systems for Collaborative Work (CSCW) [3], or EDI trade systems [6]. The above systems are similar in the sense that they all rely on commonly shared structures (ontologies) of some kind to compare and identify semantic heterogeneity associated with underlying components. However, what seems lacking in these systems is a formal construct which can be used to specify and compare the different contexts associated with trade messages. Detailed descriptions of these systems and their pros and cons can be found in [9]. In this paper, we describe an enhanced approach which will support business system interoperation by using Conceptual Graph Formal Contexts [4] derived from Formal Concept Analysis theory [8]. The paper is organised as follows: Section 2 overviews some of the relevant formal methods. Section 3 describes how we can overcome the so-called 1st trade problem (which refers to the initial high cost of establishing a collection of commonly agreed trading terms).
Fig. 1. EDI Mapping Facility (EMF)
2 Relevant formal methods
In designing the EDI Mapping Facility (EMF) (shown in Figure 1), we aim to facilitate the following: 1) Systematic interoperation: allow business systems to dynamically and systematically collaborate with each other with minimal human intervention; 2) Unilateral changes: allow business systems to change and extend trade messages with minimal consensus from other business systems; and 3) Minimising upfront coordination: eliminate the so-called one-to-one bilateral trade agreements imposed by traditional EDI systems. To support the above aims, we need to be able to express the various message concepts and the relationships among these concepts. In doing so we need a logical notation of some kind. In general, a formal notation such as first order logic, Object Z, or CGs is considered useful for the following reasons: 1) it is an unambiguous logical notation, 2) it is an expressive specification language, and 3) specification aspects can be demonstrated by using mathematical proof techniques.
However, we choose CGs to specify EDI messages due to the following added benefits. First, the graphical notation of CGs is designed for human readability. Second, the Canonical Formation Rules of CGs allow a collection of conceptual graph expressions to be composed (by using join and copy) and decomposed (by using restrict and simplify) to form new conceptual graph expressions. In this sense, the formation rules are a kind of graph grammar which can be used to specify EDI messages. In addition, they can also be used to enforce certain semantic constraints. Here, the canonical formation rules define the syntax of the trade expressions, but they do not necessarily guarantee that these expressions are true. To derive correct expressions from other correct expressions we need rules of inference. Third, aiming to support reasoning with graphs, Peirce defined a set of five inference rules (erasure, insertion, iteration, de-iteration, double negation) and an axiom (the empty set) based on primitive operations of copying and reasoning about graphs in various contexts. Thus, rules of inference allow a new EDI trade expression to be derived from an existing trade expression, allowing an Internet-based trade model to be reasoned about and analysed. Furthermore, to facilitate systematic interoperation, we need to be able to formalise the various trade contexts (assumptions and assertions) associated with EDI messages. According to Mineau and Gerbé, informally: 'A context is defined in two parts: an intention, a set of conceptual graphs which describe the conditions which make the asserted graphs true, and an extension, composed of all the graphs true under these conditions' [4]. Formally, a context C_i can be described as a tuple of two sets of CGs, T_i and G_i. T_i defines the conditions under which C_i exists, represented by a single intention graph; G_i is the set of CGs true in that context.
So, for a context C_i, C_i = <T_i, G_i> = <I(C_i), E(C_i)>, where I(C_i), a single CG, is the intention graph of C_i, and E(C_i), the set of graphs conjunctively true in C_i, are the extension graphs. Based on Formal Concept Analysis theory [8], Mineau and Gerbé further define the formal context, named C*, as a tuple <T_i, G_i> where G_i = E*(T_i) and T_i = I*(G_i) = I(C*). With these definitions, the context lattice, L, can be computed automatically by applying the algorithm given in the formal concept analysis theory described below. This lattice provides an explanation and access structure to the knowledge base, and relates different worlds of assertions to one another. Thus, L is defined as: L = <{C*}, ≤>. In the next section, we describe how these formal methods can be applied to solve the so-called 1st trade problem.
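The two derivation operators in this definition can be made concrete with a small sketch. Below is a minimal Python illustration of E* and I* over a cross-table, and of the closure condition G_i = E*(T_i), T_i = I*(G_i) that singles out a formal context; the intent and object names (i1..i3, c1..c5) are our own toy data, not the paper's tables.

```python
# Incidence relation of a toy formal context: object -> set of intents
# that hold for it. Names are illustrative only.
INCIDENCE = {
    "c1": {"i1"},
    "c2": {"i1"},
    "c3": {"i1", "i2"},
    "c4": {"i1", "i2", "i3"},
    "c5": set(),
}

def extension(intents):
    """E*: the objects for which every given intent holds."""
    return {o for o, attrs in INCIDENCE.items() if intents <= attrs}

def intension(objects):
    """I*: the intents that hold for every given object."""
    attr_sets = [INCIDENCE[o] for o in objects]
    return set.intersection(*attr_sets) if attr_sets else set()

# A pair (G, T) is a formal context C* when G = E*(T) and T = I*(G).
T = {"i1", "i2"}
G = extension(T)            # objects satisfying both i1 and i2
assert intension(G) == T    # closed: (G, T) is a formal concept
```

The closure condition is what distinguishes an arbitrary `<T_i, G_i>` pair from a node of the context lattice: applying E* and then I* returns exactly the intents we started with.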
3 An example: Overcoming the 1st trade problem
This example is based on the following trade scenario: in a foreign exchange deal, a broker A uses a foreign exchange standard provided by a major New York bank to compose a purchase order. Similarly, a broker B uses another standard provided by a major Tokyo bank to compose an order entry. Note that these two standards are both specialisations of the same EDI standard. The key idea here
is to deploy an approach in which we can specify the different assumptions and assertions relevant to the trade messages, so that we can use these formalised specifications to systematically identify the different mismatches (null, unused, and missing items). As an example, Figure 2 shows how we may use CG contexts to model the various trade assumptions (intents i1, ..., i7) and concept assertions (extents e1, ..., e12).
Fig. 2. Sample CG contexts relevant to a purchase order (left) and an order entry (right)
There are several steps involved in the systematic interoperation of EDI messages. The following steps are based on the EMF shown in Figure 1.

Step 1. Prepare and forward specs: Broker A and Broker B can interact with the Customer Application Program and Supplier Application Program, respectively, to compose a purchase order and an order entry based on the standardised vocabularies provided by a foreign exchange standard. Figure 2 shows two possible specifications: a purchase order and an order entry. Once formal specifications have been defined (by using CG formation rules and contexts), they can be forwarded to either a Purchase Order Handler or an Order Entry Handler for processing. Upon receiving an order request, the Purchase Order Handler checks its internal record stored in the Supplier Log to see whether or not this order spec has been processed before. If not, this '1st trade' spec is forwarded to the EMF Server for processing. Otherwise, based on the previously established trade information stored in the Supplier Log, the relevant profile can be retrieved and forwarded with the relevant order data to an appropriate Order Entry Handler for processing. In order to identify the discrepancy between a purchase order
and an order entry, the Order Entry Handler needs to forward an order entry spec to an EMF Server for processing.

Step 2. Integrate and compare specs: To effectively compare two specs from different sources the EMF server needs to do the following: 1) formalise the specs and organise their formal contexts into two separate type hierarchies known as context lattices (note that in order to compare two different specs, an agreement on an initial ontology must exist); 2) by navigating and comparing the structures of these context lattices, identify and integrate contexts of one source with contexts of another source, thereby forming an integrated lattice; and 3) access and navigate this integrated lattice to identify equivalent and/or conflicting intentions (or assumptions). From these identified and matched intentions it can compare the extents in order to identify the matched, unused, null, and conflicting assertions. The result of the comparison steps can then be used to generate the necessary mapping profiles. In the following, we describe how the above steps can be formally carried out.
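As a rough sketch of the comparison in Step 2, once two intents have been judged equivalent, the extents asserted under them can be classified as matched, unused (provided but not requested), or null (requested but not provided). The function name and the equivalence map below are our own illustration, not part of the EMF design.

```python
def compare_extents(order_items, entry_items, equivalent):
    """Classify asserted concepts under a pair of matched intents.

    order_items / entry_items: sets of concept labels from each spec.
    equivalent: assumed mapping from purchase-order labels to their
    order-entry counterparts (established when intents were matched).
    """
    # matched: order-side concepts whose counterpart appears on the entry side
    matched = {o for o in order_items if equivalent.get(o) in entry_items}
    # unused: provided by the purchase order but never requested
    unused = order_items - matched
    # null: requested by the order entry but never provided
    null = entry_items - {equivalent[o] for o in matched}
    return matched, unused, null

# e.g. under matched intents i1 ~ i4 (same foreign exchange standard):
matched, unused, null = compare_extents(
    {"c2", "c3", "c4"},
    {"c7", "c8", "c12"},
    {"c2": "c7", "c3": "c8", "c4": "c12"},
)
```

With a complete mapping, as here, all three order-side concepts match and the unused and null sets are empty; dropping an entry from the map moves its concept into the unused and null sets instead.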
Fig. 3. FCA Formal Contexts represent two different sets of assumptions (about standard, currency and scale factor)
Generating the Context Lattices: Based on FCA theory, the above formal CG contexts can be systematically re-arranged to form the corresponding FCA contexts of the purchase order and order entry (denoted as K_P and K_O, respectively). These contexts are illustrated as the cross-tables shown in Figure 3. The cross-table on the left depicts the formal context (K_P) of the purchase order spec, representing a query graph, while the cross-table on the right depicts the formal context (K_O) of the order entry spec, representing a type graph. To simplify our example, all asserted conceptual relations shown in Figure 2 have been ignored in the cross-tables. If the application is required to query and compare the asserted relations, they can easily be included in the cross-tables prior to generating the context lattice. Recall from FCA theory that for a given context K we can systematically find its formal concepts (X_i, B_i). By ordering these formal concepts based on the sub/superconcept relation (≤), we can systematically determine the concept lattice B(K) based on the join and meet operations of FCA theory. Thus, from our example, the contexts K_P and K_O shown in Figure 3 can be systematically processed to generate the concept lattices B(K_P) and B(K_O) shown in Figure 4,
Fig. 4. Context lattices generated from cross-tables shown in Figure 3
respectively. The context K_P has five formal concepts {C1, C2, C3, C4, C5} and the context K_O also has five formal concepts {C6, C7, C8, C9, C10}.

Integrating the Context Lattices: At this point, the context lattices B(K_P) and B(K_O) represent the important hierarchical conceptual clustering of the asserted concepts (via the extents) and a representation of all implications between the assumptions (via the intents). With these context lattices we can then proceed to query and compare the asserted concepts based on the previously specified assumptions. However, before we can compare the individual concepts, we need to combine the context lattices to form an integrated context lattice. In doing this we ensure that only those concepts that are based on the same type of assumptions (or intention type) can be compared with each other. Otherwise, the comparison would make no sense.
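The generation of a concept lattice from a cross-table can be sketched naively: close every attribute subset under the derivation operators, keep the distinct (extent, intent) pairs as formal concepts, and order them by extent inclusion. The cross-table below is our own toy example, not the paper's K_P, so its concept count differs from the five reported above.

```python
# Naive derivation of the formal concepts (X_i, B_i) of a small context
# and of the subconcept order used to build B(K). Illustrative data only.
from itertools import chain, combinations

INCIDENCE = {
    "c1": {"i1"},
    "c2": {"i1"},
    "c3": {"i1", "i2"},
    "c4": {"i1", "i2", "i3"},
    "c5": set(),
}
ATTRS = {"i1", "i2", "i3"}

def extent(intents):
    return frozenset(o for o, a in INCIDENCE.items() if intents <= a)

def intent(objects):
    sets = [INCIDENCE[o] for o in objects]
    return frozenset(set.intersection(*sets)) if sets else frozenset(ATTRS)

def concepts():
    """Close every attribute subset; keep distinct (extent, intent) pairs."""
    found = set()
    for bs in chain.from_iterable(
            combinations(ATTRS, r) for r in range(len(ATTRS) + 1)):
        X = extent(set(bs))
        found.add((X, intent(X)))
    return found

cs = concepts()
# subconcept order: (X1, B1) <= (X2, B2)  iff  X1 is a subset of X2
order = {(a, b) for a in cs for b in cs if a[0] <= b[0]}
print(len(cs))  # number of formal concepts of this toy context
```

Closing attribute subsets is exponential in the number of attributes; for cross-tables of realistic size, an incremental algorithm such as Ganter's NextClosure would be used instead, but the resulting lattice is the same.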
Fig. 5. Integrated Context Lattice
Based on the information given by the individual contexts shown in Figure 2 we can derive that i1 is equivalent to i4 (i.e. both context C2 and C7 are based on the same foreign exchange standard). Thus, we can integrate and compare context C2's individual concepts (c2, c3, c4) against context C7's individual concepts (c7, c8, c12). By comparing these individual concepts we find that c2 = c7, c3 = c8, and c4 = c12. Note that this comparison is true only when the above concepts are defined according to the convention specified by intents
i1 and i4. This integration step is illustrated in the top part of the integrated lattice shown in Figure 5. Similarly, we can integrate and compare contexts C3 and C8. In this case, we find that concept c8 = c3 (quantity) according to the assumption that c8 and c3 are based on i4 (foreign exchange standard) and i5 (factor = 1). This integration step is illustrated in the left part of the integrated lattice shown in Figure 5. While integrating and comparing C5 and C9 we find some discrepancies in the intentions i2 (factor = 1) and i6 (factor = 1000), and also in i3 (currency = USD) and i7 (currency = JPY). These discrepancies in the intent parts suggest that c4 and c12 (CostPerUnit) are based on conflicting assumptions. This integration and comparison step is illustrated in the right part of the integrated lattice shown in Figure 5. The results of the integration process can be used to form the result profiles which identify the null, unused and mismatched items of a purchase order and an order entry. This profile is then forwarded to the relevant handlers to control the data flow between business systems. In general, a mapping profile can systematically be generated by inference over the integrated context lattice.

Step 3. Forward relevant data: Upon receiving the mapping profiles from an EMF Server, the Purchase Order Handler and Order Entry Handler store these profiles in the Supplier Log and Customer Log, respectively. In doing so, subsequent order requests can use these profiles to coordinate and transport the purchase order data to the appropriate order entry programs without having to integrate and compare the Purchase Order's and Order Entry's specs. It is important to point out that by navigating the context lattice, analysis software would be able to identify the reasons behind the mismatching results. Some mismatches (e.g.
unknown concepts, those which cannot be identified by a particular standard) can be impossible to interpret by another system without the intervention of a human coordinator. However, some other mismatches (e.g. those exchanged concepts that were based on a shared ontology but were not provided or asked for) can be systematically appended to the original specs and forwarded back to the Purchase Order Handler or Order Entry Handler for re-processing.

An open research agenda: Previously, we have described an approach to identify discrepancies among different CG concepts. It is important to note that discrepancies may come from relations and not just from concepts. They may come in the way concepts are connected by relations, or they may come from within nested concepts (or nested relations). For example, the message 'Broker A delivers a quote to Broker B' may have different interpretations depending on whether A calls (or emails) B to deliver a quote on the spot (which may not be secured), or A requests a quote specialist (by using a server) to make a delivery (in which case a quote can be securely delivered by using encryption, certification, and non-repudiation techniques). If the application does not care how the quote is delivered, as long as B receives the quote, then it is not necessary to analyse or reason about the relevant nested concepts (or relations). However, if the security associated with the delivery is of concern, we need to find a way to compare and
identify the potential conflicts embedded in nested concepts. Discrepancies associated with relations can be solved by using the approach described above. For example, we can substitute relations (instead of concepts) as the extents of the cross-table shown in Figure 3. In doing so, we can then generate the necessary context lattices and integrated lattices based on relations and not concepts. Thus, we can then compare and identify discrepancies among relations. If we view the different ways in which concepts are connected by relations as a collection of 'super concepts', then to identify the discrepancies among these super concepts, a partial common ontology (which describes how concepts may be connected by relations) must be used. The level of matching between different ontologies will have a direct impact on the comparison heuristic. The problem here is to discover some heuristics to guide a search through two lattices in order to integrate them. In doing so, we can find enough similarity to discover dissimilarities. To conclude, by using formal concept analysis and the conceptual graph formalism we can systematically create context lattices to represent complex message specifications and their assumptions. In doing so, message specs can be effectively navigated and compared, and thus a formal EDI mapping approach is feasible.
References

1. C. Goh, S. Bressan, S. Madnick, and M. Siegel. Context interchange: Representing and reasoning about data semantics in heterogeneous systems. ACM Sigmod Record, 1997.
2. Vipul Kashyap and Amit Sheth. Semantic and schematic similarities between database objects: a context-based approach. The VLDB Journal, 5, 1996.
3. J. Lee and T. Malone. Partially Shared Views: A Scheme for Communication among Groups that Use Different Type Hierarchies. ACM Transactions on Information Systems, 8(1), 1990.
4. G. Mineau and O. Gerbé. Contexts: A formal definition of worlds of assertions. In Dickson Lukose et al., editors, International Conference on Conceptual Structures (ICCS'97), 1997.
5. A. Puder and K. Römer. Generic trading service in telecommunication platforms. In Dickson Lukose et al., editors, International Conference on Conceptual Structures (ICCS'97), 1997.
6. L. Raymond and F. Bergeron. EDI success in small and medium-sized enterprises: A field study. Journal of Organizational Computing and Electronic Commerce, 6(2), 1996.
7. G. Wiederhold. Mediators in the architecture of future information systems. IEEE Computer, 25(3), 1992.
8. R. Wille. Restructuring Lattice Theory: An Approach Based on Hierarchies of Concepts. In I. Rival, editor, Ordered Sets. Reidel, Dordrecht, Boston, 1982.
9. Hung Wing. Managing Complex and Open Web-deployable Trade Objects. PhD Thesis, University of Queensland, QLD 4072, Australia, 1998.
Ontologically, Yours

Daniel Kayser
L.I.P.N., Institut Galilée, Université Paris-Nord
Avenue Jean-Baptiste Clément, F-93430 Villetaneuse, France
[email protected]
Abstract. The word ‘ontology’ belongs to the vocabulary of Philosophy, and it is now commonly used in Artificial Intelligence with a rather different meaning. There is nothing wrong here, so long as what is meant remains clear. AI typically claims to use ontology for building ‘conceptual structures’. Now concepts are named by words, and in the relation concept-word, the goals of AI imply not to move too far away from words. Therefore, it is not at all certain that ‘conceptual structures’ should be handled in the way that logicians and philosophers consider to be adequate for concepts. Thinking about the actual use of words in reasoning may therefore open some new perspectives in the theory of ‘conceptual structures’.
1 Introduction

Knowledge Representation (KR) aims at expressing the knowledge concerning a domain in terms of its concepts, relations, individuals or whatever; however, the choice of these entities is not considered to be a genuine KR question, but rather an ontological one. Ontology is more and more frequently referred to in Artificial Intelligence (AI), and there is an active and scientifically very creative research community currently interested in ‘ontologies’. Now, the word ‘ontology’ belongs to the vocabulary of Philosophy, where it is deeply related to problems of existence. When AI considers some questions as being ontological, does it really mean that we are concerned with metaphysical problems? Most of us will certainly answer negatively. We are considering models of a reality; these models are designed for a given purpose, and the main problem we are interested in is their adequacy to this purpose. Being adequate is not grounded on being “real”; furthermore being realistic, i.e. aiming at the representation of “all” aspects of reality, goes in the opposite direction to being adequate, for obvious complexity reasons. What “AI ontology” wants is to find the basis for efficient models, whereas what “philosophical ontology” wants is to discuss the problem of real existence, whatever this may mean (see below). Their goals are not comparable, and their methods can be completely contradictory. Although it may distract me from my main points, I would like to digress on what “to exist” means for the layman.

M.-L. Mugnier and M. Chein (Eds.): ICCS’98, LNAI 1453, pp. 35-48, 1998. © Springer-Verlag Berlin Heidelberg 1998
1.1 Trivial Considerations about Existence

The most “natural” way of existing is precisely to exist in Nature. Trees exist. Ghosts do not exist. But we often forget that tree is a word that people use to refer to a category of objects, and the “existence” of trees is nothing more than an agreement that this oak, that pine, and so on belong to the same category, an agreement based on the fact that these objects share common properties that we consider important. The other “way of existence” is existence inside a model. In the standard propositional calculi, there exist tautologies; in euclidean geometry, for each point there exists a unique line including the point and having a given direction, etc. In some sense, the distinction between both “ways of existence” is ill-defined because, as we have noticed, rather than to Nature itself, the first one refers to an agreement about the “obvious model” of Nature; hence existence in Nature is also existence inside a model. But there is another deep problem to take into consideration, namely whether objects (e.g. this oak, etc.) exist in Nature or only in the “obvious model”, presumably innate, by which we perceive the external world. At a very different scale from ours, this oak is not an object at all: it is a collection of molecules or, alternatively, an indistinguishable part of the vegetal cover of the earth. This question may at first sight appear to be more relevant to “philosophical ontology” than to AI, but we will see that this is not the case.
1.2 Plan of the Paper

Section 2 explains the difference between “philosophical” ontology and AI in terms of their different views of what a concept actually is. Of course, only a small part of such a subject is covered, and we try to focus on what is relevant to the basic idea defended in this paper. We have already underlined that even if there are fundamental discrepancies between a concept and the lexical unit that names it, it is nonetheless impossible to ignore the behaviour of words in a discussion about concepts. Therefore, section 3 develops and exemplifies some linguistic considerations, in the domain of technical terms as in the general use of the language. We will then be ready to present, in section 4, the main thesis of the paper: the inadequacy of current AI ontologies — as soon as they concern a domain dealing with concepts which are not perfectly defined logically — originates in an idealistic view which does not meet AI needs. This view should be replaced by a much more flexible one, taking into account variability, i.e. the fact that ontological uniqueness is a wrong
requirement for AI, and borrowing from the linguistic processes by which the meaning of a word gets adapted to its context of use.
2 Philosophical vs. AI View of a Concept

Philosophy, as for that matter Mathematical Logic, has a rather strict notion of a concept. In its extensional version, a concept corresponds exactly to a unary predicate, i.e. to a function which, for every object, tells whether or not it "falls under" the concept. The other view of a concept, i.e. its intension, can either be seen as a kind of generalization (intension as extension in multiple worlds, cf. [10]), as a mere variant (intension as a set of properties, each property being nothing more than a unary predicate) or as ontologically distinct (properties "exist", their having an extension being a contingent, less important, feature). The intensional view has interesting applications in AI (equational calculus based on intensional normal forms has proven useful in Description Logics), but identifying an object with the set of its properties is counterintuitive, and when the properties are allowed to change with time, the identity of the object is hard to maintain. We therefore concentrate on the extensional view. There are a number of difficult questions lurking behind this seemingly simple notion, some of which are discussed below.
2.1 Vague Concepts

The first problem is well known. Whether an object "falls" or not under a concept does not always come as a yes / no decision. Under the designation of vagueness (Russell), fuzziness (Zadeh) or whatever, this issue has been investigated by various means. None of them appears to be completely satisfactory. Consider for instance the popular fuzzy-set representation of a vague concept: if a concept c is vague, it is hard to decide which real number µ (i.e. a number, each digit of which has a well-defined value!) corresponds to the degree of membership of every object o in c. The meaning of setting µ to 1 must also be clear, i.e. there must exist a criterion telling what a perfect instance of c is, and this seems somehow contradictory to the fact that c is only vaguely defined. Several other technical or mathematical proposals have been put forward in order to cope with vagueness, and we will not discuss them here, because we want to focus on a more difficult issue, which seems to have been often overlooked.
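The objection can be made concrete with a toy membership function; the concept name and the cutoff values below are arbitrary choices of ours, which is precisely the point: the vague concept itself fixes none of them.

```python
# A toy fuzzy membership function for the vague concept 'tall'.
# The 150/200 cm thresholds are arbitrary: nothing in the concept
# 'tall' dictates them, yet the representation forces us to pick.

def mu_tall(height_cm: float) -> float:
    """Degree of membership in 'tall', linear between two cutoffs."""
    lo, hi = 150.0, 200.0        # arbitrary: where does 'tall' begin and end?
    if height_cm <= lo:
        return 0.0
    if height_cm >= hi:
        return 1.0               # what makes 200 cm a 'perfect' instance?
    return (height_cm - lo) / (hi - lo)

print(mu_tall(175.0))  # every digit of this value is an artefact of lo/hi
```

Any change to the two cutoffs yields a different, equally defensible real number for the same person, illustrating why assigning a precise µ to a vague concept is problematic.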
2.2 Vague Objects?

In order to give either a yes / no answer, or a more finely shaded one, as to whether object o falls under concept c, it is not only important to know what we mean by c, but
also what o precisely is. All traditional logical approaches tend to consider that objects "are there", and to ignore the problem of deciding what counts as an object.¹ There are well known philosophical difficulties, e.g. whether a heap is an object or a set of objects, yielding to possible paradoxes (sorites), see also [13]. There may also be physical difficulties in defining objects, e.g. at the wave/corpuscle level. None of them is really relevant to AI: when we represent entities in knowledge bases, they usually behave as "classical" objects. Now even so, what we consider in the real world to be an object tends to be much more problematic than most of us are ready to admit. Language adds a lot of difficulties, that we cannot ignore — therefore, we will discuss them in the following section — but language is far from being responsible for all the problems.² Ostensive designation, for instance, also introduces ambiguity. If, pointing in some direction, you utter or make a gesture suggesting "this is awkward", you may mean that some object, or all objects of some category, or a situation in which some (set of) object(s) play(s) a role, or many other things, are unpleasant to you. A "conceptual" translation such as:

(∃x) (awkward(x))    (F)

requires to define over which domain D (physical objects, classes of physical objects, sets, situations, etc.) the variable x is quantified, or in other terms what counts as an object, independently of any "sorites" or "quantum physics" kind of consideration.
2.3 Variable Ontology

Even if we are able to make a reasonable choice for D under some given circumstances, what seems to have been completely neglected up to now is the fact that this choice must be a revisable one. As a matter of fact, the ability to modify an ontology during a reasoning process seems to be a fundamental property of intelligence. My favourite example here (but an amazing number of various situations exhibit the same pattern) consists in planning a journey: suppose you have to go by car from A to B; it is often enough to take A and B as points, and the route between them as a line. But the map on which you plan your route may give indications of narrow sections of the road, or of low bridges, in which case you temporarily have to take into account two- or three-dimensional clearance considerations. Moreover, if the place B you are heading to is located in a town T, it is convenient to reason as if T were a point, up to T's outskirts, where you might have to choose between directions T-West and T-East, a very strange choice if T continues to be handled as a point!
¹ Mereology [9] can be counted as different in this respect, but this approach adds specific difficulties which we cannot develop here.
² See also [11].
Ontologically, Yours
We will discuss in Section 4 some implications of this kind of phenomenon. The only conclusion that we draw for now is that the solution of an "ontological problem" should not be a structure in which the concepts of the application rigidly map to a unique entity of the structure. The degrees of freedom that are needed should address the problem of vagueness or fuzziness as well as the problem of the identity of objects.
3 Some Linguistic Considerations
3.1 General

Natural languages add many specificities to the problem of concept representation; to mention just a few: the distinction between homonymy and polysemy, the distinction between proper (literal) meaning and various metonymic or metaphorical extensions, the assignment of a syntactic category, the variation of the meaning of an expression when it is translated from one language to another, and so on. It may therefore seem wise for "knowledge engineers" to be careful not to tread on the dangerous ground of language. But this caution would lead us to evade the real problems. After all, it is not by chance that, in order to make clear what concept we are interested in, we always designate it by a natural-language word or phrase. The widespread belief that precautions (such as capitalizing concept names and writing words in lowercase) are sufficient to prevent concepts from inheriting the involved behaviour of words is groundless. Let us consider a couple of examples.
3.2 What Kind of Thing is a Unix Command?

"Command" is a word that has various meanings according to context. However, within the narrow scope of the Unix operating system, it seems that this word corresponds to a single, well-defined concept³. Looking through a User's Guide tells us how the word is used; because of some of the linguistic phenomena listed in §3.1, and especially metonymy, the uses may not be very helpful in determining exactly where to put the concept in an ontology. Still, they should provide useful hints. The first occurrence of the word, and the only one which has the form of a definition, is very misleading:
³ We refer in this section to "SunOS User's Guide: Getting Started ©1990 Sun Microsystems Inc." The numbers following the sentences refer to the page numbers in that Guide.
Most commands are executable files; that is, they are files that you can run.
(9)
Later on, it appears that the Guide tries to distinguish carefully between the command and the command line (i.e. command + options + arguments). But the tendency toward conciseness quickly seems to override this distinction: The mkdir command creates directory.
(23)
Is this sentence a metonymy, i.e. should it be read as "command lines having mkdir as their command create directories"? Or is it still the proper sense, i.e. the command does create the directory, but doing so requires that you type a command line including the command and some other material? A similar doubt arises with the very frequent occurrences of sentences "the syntax of xxx is …" where xxx has been introduced as a command. The (non-ambiguous) meaning of such sentences can be understood both as metonymic ("the syntax of a command line starting with xxx is …") and as proper (commands have a syntax, explaining how command lines using them should be formed). Sentences like the following are obviously inconsistent with the policy expressed in the Guide, since options are said to be part of command lines, not of commands: Use the ls -l command to determine what permissions files and directories have.
(37)
The above remarks only illustrate the very trivial fact that, despite its attempts to name the concepts consistently, the Guide, like every technical or non-technical piece of text, cannot completely avoid sloppiness in its wording. However, this leaves completely open our initial question: where should we put the concept command in an ontology? Under "string of characters"? Under "class of processes"? Under "implements"? Or in yet another part of the ontology? An even more important question is: what criterion allows us to choose among these possibilities? Suppose for a moment that we select (more or less arbitrarily) one of these options; for instance, we decide that "basically" a command is a kind of string of characters; then all the other concepts that may be called a command should be derived from this "basis". This decision being taken, a careful reading of the Guide allows us to endow this "basic" concept with roles, e.g. commands are strings of characters:
• typed by a user,
• in order to fulfill one of his/her goals,
• interpreted by a program,
• thereby creating a process,
• etc.
The Guide could then be rewritten, albeit in a clumsy style, in a way consistent with this ontological decision. As we have seen above, this rewriting may not be unique, but if the meaning of the sentence is clear, the various translations should be equivalent.
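As an illustration only, the ontological decision just described could be encoded as a frame-like structure. All the field names below are invented for the sketch; none of them comes from the Guide itself.

```python
# A sketch of the "basic" concept command-as-string-of-characters,
# endowed with the roles listed above.  All slot names are invented
# for illustration; nothing here is taken from the Guide.

from dataclasses import dataclass

@dataclass
class Command:
    text: str              # the string of characters itself (the "basis")
    typed_by: str          # the user who types it
    goal: str              # the goal it is meant to fulfill
    interpreted_by: str    # the program interpreting it
    creates_process: bool  # whether a process is thereby created

cmd = Command(text="mkdir notes", typed_by="user",
              goal="create a directory", interpreted_by="shell",
              creates_process=True)
print(cmd.text)  # mkdir notes
```

The point of the text, of course, is that many occurrences of "command" in the Guide do not fit this single frame without a metonymic translation.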
But some cases remain problematic. Consider the sentence: kill provides you with a direct way to stop commands that you no longer want.
(85)
The clear meaning of this sentence is: « if you type at some time t a command line starting with the command (string of characters) c, in order to fulfill goal g, thereby creating process p, if
at some later time t’, g is not fulfilled, p is still active, and you have the new goal g’ consisting in cancelling p,
then you should type at t’ a command line l obeying the syntax "kill PIDs", where … ». The least that can be said here is that the translation is not easy. Similar difficulties will be encountered with: running a command in the background.
(88)
restarting a command
(88)
These difficulties are related to the spatio-temporal dimension, which was not taken into account in the choice of our "basic" concept. They belong to the type/token kind of difficulty, which arises more or less everywhere. As a matter of fact, if a command is identified with a string of characters, two different users typing the same string will issue the same command, as will a single user typing the same string at two different times. The correspondence between command and process therefore becomes rather problematic. In technical domains, the story generally ends well: there is in principle one (or several) "ontologically correct" representation(s) of each occurrence of a term, expressible as formula(s) using the "basic concept", whatever it is, that has been selected for that term. However, the relationship can be very hard to find. In the (85) example above, commands stands for processes initiated by command lines starting with "commands" in the "basic" sense of the term; in a similar study [6] on the Hypercard User's Manual, an even more far-fetched solution was needed: it turned out that field was used for the way the content of the "basic" field concept was displayed on the printer!
3.3 What Kind of Thing is a Traffic Light?

In everyday life, it is far less obvious how to determine a "basic" concept, and, more radically, whether this idea of a "basic concept" makes sense at all. In a previous work [4], I argued that book should basically refer to an equivalence class of physical objects, the equivalence relation being under-determined (depending on the context, two different editions of a book count as equivalent or not). Therefore, even in a well-defined situation, there may remain, for ordinary language, an under-determination as to how a given occurrence of a word such as book should be translated by means of "basic" concepts. In a more recent paper [3], my colleagues and I tried to investigate how the concept of traffic light was to be classified ontologically, when car-crash reports had to be analyzed⁴. Two sensible choices appear possible: either a traffic light is a physical object, or it is a process. Of course, there is a relation between them: the process is the purpose of the object, and the object is the place where the process takes place. In several sentences, both concepts must coexist in a single occurrence of the term. Consider for example: (…) car d'ordinaire se trouve un feu à ce carrefour. (hors fonctionnement ce jour-là)
A6
(…) since usually there is a traffic light at this crossroads (out of order that very day). What is located at the crossroads is the physical object, and what is not functioning is the process. Perhaps more strikingly: Étant à l'arrêt au feu tricolore (rouge)
B4
being stopped at the three-coloured traffic light (red). The physical object is not three-coloured, nor is the process! The process allows the light emitted by the object to take different colours at different times; the parenthesis reflects the trouble the author of the report has in simultaneously assigning different types to a single word. An "ontologically correct" paraphrase of the text would be something like: being stopped in the vicinity of the location of a physical object sheltering a process emitting light that at different times takes three different colours, at a phase of the process where the colour was red!! A similar case can be made about: A Nanterre, arrêtée à un feu rouge, un automobiliste n'a pas vu le feu
B66
in Nanterre, while I was stopped at a red traffic light, a car-driver did not see the traffic light
⁴ We are grateful to M.A.I.F. Insurance Co. for having put a number of reports at our disposal. The numbers following the sentences correspond to a numbering of these texts.
the author is stopped in the vicinity of the object, at a specific phase of the process, and what happened is presumably not that the other car-driver did not see the object, but that he did not pay attention to the current phase of the process. The temporal interpretation is even sharper in: Alors que je redémarrais au feu vert, (…)
B6
While I was getting going again at the green traffic light, (…) If the "basic concept" is the object, we have to jump from it to the process, and then to the instant at which the process changes phase. The conclusion is that we may arbitrarily assign a "basic" concept to the term traffic light. But then nearly every occurrence of the term must be treated as a metonymy that our knowledge of usual driving situations allows us to decode, more or less accurately, in order to get an expression using the "basic" meaning. However, this view entails:
• a choice of the "basis", for which no solid ground seems available,
• a tedious, sometimes reckless, translation of the occurrences of the term,
• and, as a result, an overall impression of artificiality and inadequacy of the conceptual representation.
3.4 Pustejovsky's Generative Lexicon

In [12] and various other writings, J. Pustejovsky suggests dealing with the variability of semantic interpretation of a given lexical unit by a process called type coercion. Lexical units get a type, but when they are used in sentences where this type yields ill-formed expressions, the process operates and produces an entity of a new type. This process relies on a qualia structure associated with each lexical unit; four facets are defined: CONST, FORMAL, TELIC, AGENTIVE, but the method for filling these facets remains intuitive, and the examples provided show that arbitrariness cannot be eliminated. To take once more the example of book, FORMAL is filled with physobj (standing for physical object) and AGENTIVE with write, while it would have been possible to consider the "agent" of the physical-object book to be the printer, or alternatively the "formal" facet of the written book to be its semantic content rather than a physical object. To cope with the plurality of views on a single word, Pustejovsky more recently introduced "complex types": e.g. book has type physical-object.info, and according to the context, one, the other, or both members of this pair is/are activated. Unfortunately, this step goes in the direction of a finite enumeration of word senses, which is precisely what the "generative" approach is supposed to avoid.
3.5 (Provisional) Conclusion

The difficulties arising in the above examples can be blamed on language alone: using natural languages necessarily entails delicate transpositions into a neat conceptual structure, while the existence and importance of such structures remain unquestioned. Alternatively, they can be taken as a hint of the pointlessness of keeping concepts pure of any linguistic contamination.
4 Some Promising Tracks
4.1 Semantic Adaptation

The meanings that a single word can take in various contexts form a virtually unlimited set. Dictionaries keep, in an often peculiar organization⁵, the most typical semantic values. Even if this fact is seldom acknowledged, no text understanding is possible if the understander only has access to the dictionary meaning of each word of the text: every human unconsciously twists, in one way or another, the sense(s) found in the dictionary in order to find an adequate value for a word in its context. As an illustration among billions of others, the temporal value of feu in example (B6) of §3.3 is not given in any dictionary, and probably should not be, if dictionaries are to keep a size compatible with human use. There is therefore a process of semantic adaptation, about which surprisingly little is known, either psychologically or computationally⁶, that transforms the lexical information on a word into a reasonably adequate semantic value for that word in its context. This process is, to some degree, regular. The regularities have been listed in linguistics, and are usually considered as rules governing metonymy. However, linguists focus their study on "interesting" metonymies rather than on the most common ones; moreover, their rules correspond to a flat list of possible shifts, little being said about the actual applicability of a rule in a given situation. A blind application of the rules generates a huge number of interpretations, most of which would never come to the mind of a human understander. This describes the situation of words; but what is the relevance of the phenomenon of semantic adaptation, and of its regularity, when concepts are considered?
⁵ As no semantic organizational principle has been spelled out, most dictionaries organize the senses according to the (syntactic) constructions allowed by each meaning; as a result, intuitively very similar senses are remote in the dictionary (see [7]).
⁶ See however the special issue of Computational Intelligence devoted to non-literal meaning [2].
As we have already insisted, concepts are not that different from words: not so different, anyhow, that a treatment having interesting features for words would be completely irrelevant for concepts.
4.2 Conceptual Adaptation

The idea is thus to take advantage of the study of the process of semantic adaptation for words, in order to investigate to what extent it might be useful for concepts too. At first sight, such an idea seems both counterproductive and … absurd:
• counterproductive, because concepts were invented, as it were, to get rid of the peculiarities of words when speaking about content; imitating at the level of concepts the seemingly erratic behaviour of lexical semantics is a sort of going backwards with respect to the advances obtained by working with concepts instead of words;
• absurd, because semantic adaptation navigates from concept to concept in order to find which one meets the constraints imposed by the context on the occurrence of a word; it makes no sense at all to allow a concept to wander around in a structure of concepts which themselves wander around!
However, there might be a more reasonable picture, introducing different levels. At each level, the domain of interest is structured with stable concepts. Before going further, let us observe that the same concept can be found at different levels, but with no easy correspondence. As a metaphor, the black circle representing Montpellier on the 1/5,000,000 map in my notebook does not correspond to a huge black dot on the 1/50,000 map they sell you for cycling in the outskirts of the city. Should it correspond to a typical point in the city (e.g. the town hall), to the area delimited by the city limits, or by the limits of the conurbation, …? The answer depends on the situation, if any, in which the question makes sense. Consider now a concept occurring in a conceptual representation; even if it bears the same name as a "stable" concept appearing at some level, this does not necessarily entail a full coincidence with it. A process of conceptual adaptation may prove as necessary here as the process of semantic adaptation was inescapable for text understanding.
More precisely, I argue that any formula (F) representing some piece of knowledge (or some question to be answered, e.g. if the formula has free variables) in a conceptual language should not be matched against (for consistency checking) or fed into (for inference) a single conceptual structure, identifying the concept names appearing in the formula with those occurring in the pre-existing structure. Instead, the formula should be confronted with several such structures, without being misled by identity of concept names. In each case, a process of conceptual adaptation should be invoked, in order to discard the levels where (F) makes no sense, and to modify (F), following regular patterns, so as to get meaningful results from it. I know very well that this opinion should be backed up at least by an illustrative example. The trouble is that every simple example, say of a size compatible with this
paper, will more easily be solved with a one-level conceptual architecture, and if the agreement between the concept names of the structure and (F) is not good, the normal reaction is to blame the knowledge engineer for using his/her own predicate names inconsistently. Therefore, I will only hint at what such an example should look like; of course, it will always be possible to solve it in a «logically correct» way. Consider the (genuine) car-crash report below: J'ai vu l'arrière du véhicule B s'approcher. J'ai appuyé à fond sur le frein, le véhicule B a continué à s'approcher. J'ai cru sur le moment que mes freins n'avaient pas répondu et nous avons rempli ce constat. J'ai immediatement après essayé mes freins qui fonctionnaient très bien. J'ai maintenant la certitude, la rue étant légèrement en pente, que c'est le camion qui a reculé et est venu me percuter (…)
B68
I saw the back of vehicle B coming nearer. I pushed the brake to the floor, but vehicle B kept coming nearer. I believed, on the spot, that my brakes had failed, and we filled in the present report. Immediately afterwards, I tried my brakes, which worked quite well. I am now absolutely convinced that, as the street has a slight slope, it is the truck which moved back and struck me (…) Independently of the words used, the representation of what happened requires a rather complex description of some notions, e.g. beliefs and how to confirm them, in order to infer that the author now disagrees with what (s)he wrote in the report; on the other hand, the comprehension of this text requires only a very crude description of vehicles and their dynamics (there is a brake; when it works, it prevents you from coming nearer to the vehicle you see). In most other texts, the opposite holds: confirmation of beliefs is irrelevant, but considerations of the effect of speed and dampness on braking distances play an essential role. Having a single ontology to shelter all the knowledge that may come into play entails:
• reasoning in every case in the most complex setting (a very inefficient strategy), and
• handling a number of concepts presumably larger by an order of magnitude than the size of the vocabulary, hence the need for e.g. STREET#1 (a line), STREET#2 (a surface, the causeway), STREET#3 (a skew surface, the causeway plus the sidewalks), STREET#4 (a volume, including the buildings opening onto the sidewalks), and so on.
The alternative is to use a simple ontology (e.g. STREET defined as a line) and to refine it when and where the need arises. In the above example, "brake" can at first be defined as a part of a vehicle, but later we clearly need to distinguish among BRAKE#1 (the pedal), BRAKE#2 (the brake shoe), and BRAKE#3 (the whole device by which pressing on the pedal results in the shoe rubbing on the wheels).
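The refine-on-demand policy can be sketched as follows. This is a toy illustration only: the concept names echo the STREET and BRAKE examples from the text, while the data structure and the linear ordering of readings are invented assumptions (in reality the refinements need not form a simple chain).

```python
# Toy sketch of "use a simple ontology and refine it when the need arises".
# A concept starts with its simplest reading; refine() is only called when
# reasoning at the current level becomes insufficient or inconsistent.

REFINEMENTS = {
    "STREET": ["STREET#1 (a line)", "STREET#2 (the causeway surface)",
               "STREET#3 (causeway plus sidewalks)",
               "STREET#4 (a volume including the buildings)"],
    "BRAKE":  ["BRAKE (a part of a vehicle)", "BRAKE#1 (the pedal)",
               "BRAKE#2 (the brake shoe)", "BRAKE#3 (the whole device)"],
}

class Concept:
    def __init__(self, name: str):
        self.name, self.level = name, 0

    def reading(self) -> str:
        return REFINEMENTS[self.name][self.level]

    def refine(self) -> str:
        """Move to the next, more detailed reading -- only when needed."""
        if self.level + 1 < len(REFINEMENTS[self.name]):
            self.level += 1
        return self.reading()

street = Concept("STREET")
print(street.reading())   # STREET#1 (a line) -- enough for route planning
print(street.refine())    # STREET#2 -- needed once lane width matters
```

The point is that the cluttered alternative (all four STREETs and all the BRAKEs present at once) never has to be materialized.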
Having all sorts of brakes and streets cluttering the ontology of the domain of car-crashes is a very hard constraint to cope with, and a very useless one too!
Having ontologies at different levels, with bridges between them, and reasoning at the simplest level except when it is inconsistent to do so, seems much more attractive. But the difficulties must not be underestimated:
• designing a strategy to select the ontological level appropriate for a given situation is difficult,
• achieving efficient cooperation between levels which, by definition, do not share the same ontology is even harder,
• deciding when to start and stop a new level of reasoning cannot be grounded on any formal principle, but only on an empirical basis.
Even if these obstacles are fearsome, and hence this "promising track" not really inviting, I believe that the relatively easy way explored by philosophers and logicians has shown its intrinsic limitations, and that nothing really less fearsome is likely to solve the problem. More technical developments of this idea can be found in previous papers [8] [5].
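The "reason at the simplest level first" strategy can be pictured with a minimal sketch. Everything here is an invented assumption (the level names, the queries, the idea that a level simply "answers" or not); a real system would additionally need the bridges between levels that the text calls for.

```python
# Illustrative sketch (all names invented) of reasoning at the simplest
# ontological level first, and moving to a more detailed level only when
# the simpler ones prove insufficient for the question at hand.

LEVELS = [
    # ordered from simplest to most detailed
    {"name": "route-level",     "answers": {"distance(A,B)", "direction(T)"}},
    {"name": "street-level",    "answers": {"distance(A,B)", "direction(T)",
                                            "lane-width(street)"}},
    {"name": "clearance-level", "answers": {"lane-width(street)",
                                            "bridge-height(bridge)"}},
]

def reason(query: str) -> str:
    """Pick the simplest level able to handle the query.  A real system
    would also need bridges carrying partial results between levels."""
    for level in LEVELS:
        if query in level["answers"]:
            return level["name"]
    return "no level can handle this query"

print(reason("distance(A,B)"))          # route-level
print(reason("bridge-height(bridge)"))  # clearance-level
```

Even in this toy form, the three listed difficulties show up: the ordering of LEVELS is the selection strategy, the shared query strings stand in for inter-level cooperation, and the decision to escalate is purely empirical.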
5 Summary

The least controversial examples of conceptual structures use mathematically perfectly defined notions (without going back in history as far as Port Royal's definition of comprehension and extension on triangles, it is worth noticing that e.g. Chein & Mugnier's paper [1] uses an ontology of polygons!). This is so because only in that case is the quest for a unique ontology admissible and successful. Accepting ontological multiplicity, which necessarily goes with the need for a process of conceptual adaptation, does not mean setting the fox (I mean the scruffiness inherent to natural languages) to mind the geese (I mean the neatness of the conceptual level); it is essential if we want to develop tools truly adapted to real-life domains, and not to their idealisation, and it can be done to a high standard of rigour. This idea provides no methodology by itself; at the least, endorsing it means no longer looking for oversimplistic solutions where they obviously do not work.
References
1. Chein, M., Mugnier, M.-L.: Conceptual Graphs: Fundamental Notions. Revue d'Intelligence Artificielle vol. 6 n°4 (1992) 365-406
2. Computational Intelligence. Special Issue on Non-Literal Language (D. Fass, J. Martin, E. Hinckelmann, eds.) vol. 8 n°3 (Aug. 1992)
3. Gayral, F., Kayser, D., Lévy, F.: Quelle est la couleur du feu rouge du Boulevard Henri IV ? In: Référence et anaphore, Revue VERBUM tome XIX nos 1-2 (1997) 177-200
4. Kayser, D.: Une sémantique qui n'a pas de sens. Langages n°87 (Septembre 1987) 33-45
5. Kayser, D.: Le raisonnement à profondeur variable. Actes des journées nationales du GRECO-P.R.C. d'Intelligence Artificielle, Éditions Teknea, Toulouse (1988) 109-136
6. Kayser, D.: Terme et dénotation. La Banque des Mots, n° spécial 7-1995 (1996) 19-34
7. Kayser, D.: La sémantique lexicale est d'abord inférentielle. Langue Française n°113 "Aux sources de la polysémie nominale" (P. Cadiot et B. Habert, eds.) (Mars 1997) 92-106
8. Kayser, D., Coulon, D.: Variable-Depth Natural Language Understanding. Proc. 7th I.J.C.A.I., Vancouver (1981) 64-66
9. Lesniewski, S.: Sur les fondements de la mathématique (1927) (trad. fr. par G. Kalinowski, Hermès, Paris, 1989)
10. Montague, R.: Formal Philosophy (R. Thomason, ed.) Yale University Press (1974)
11. Nunberg, G.D.: The Pragmatics of Reference. Indiana University Linguistics Club, Bloomington (Indiana) (June 1978)
12. Pustejovsky, J.: The Generative Lexicon. The MIT Press (1995)
13. Unger, P.: There are no ordinary things. Synthese vol. 41 (1979) 117-154
Executing Conceptual Graphs

Walling R. Cyre
The Bradley Department of Electrical and Computer Engineering
Virginia Tech, Blacksburg, VA 24061-0111
[email protected]
Abstract. This paper addresses the issue of directly executing conceptual graphs by developing an execution model that simulates interactions among behavioral concepts and with attributes related to object concepts. Several researchers have proposed various mechanisms for computing or simulating conceptual graphs, but these usually rely on extensions to conceptual graphs. The simulation algorithm described in this paper is inspired by digital logic simulators and reactive-systems simulators. Behavior in conceptual graphs is described by action, event and state concept types along with all their subtypes. Activity in such concepts propagates over conceptual relations to invoke activity or changes in other behavioral concepts, or to affect the attributes related to object-type concepts. The challenging issues of orderly simulation of behavior recursively described by other graphs, and of combinational relations, are also addressed.
1 Introduction

Conceptual graphs have been used to represent a great variety of knowledge, including dynamic knowledge, i.e. knowledge which describes the behavior of people, animate beings, displays and mechanisms. In simple graphs it is not difficult to mentally trace possible activities and decide whether the situation is modeled correctly and the anticipated behavior is described. In complex and hierarchical graphs, however, this may not be practical, so that validation of the description must be carried out by simulation or reasoning. From another viewpoint, simulation is useful to predict the consequences of behavior described by a conceptual graph. In this paper, mechanics for simulation by executing conceptual graphs are presented. As an example of conceptual-graph execution, consider the graph of Figure 1, which describes the throwing of a ball. Suppose one wishes to execute or simulate this graph to observe the behavior it describes. As discussed in the next section, most researchers have suggested extending conceptual graphs with special nodes called actors or demons to implement behavior. Instead, consider what is necessary to execute this graph as it is. First, the throw action must be associated with some underlying procedure or type definition that knows what types of concepts may be related to it, and how the execution of the throw action affects these adjacent concepts.

M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 51-64, 1998. © Springer-Verlag Berlin Heidelberg 1998
In this case, executing a throw should only change the position attribute of the ball from matching that of Joe (100,350) to that of Hal (200,0), and change the tense of throw from future to past. Note that this situation has been modeled so that only attributes of concepts are modified. Had the position concept of the ball been coreferent with Joe's position, it would have been necessary for the throw procedure to modify the structure of the graph by moving the position link of the ball from Joe's position to Hal's position. Since the graph is a data structure, modifying it is not a problem. In the discussions that follow, it will be assumed that action procedures act only on the values of attributes, and that these values may be rewritten multiple times.
[Fig. 1. A Description of a Behavior: a conceptual graph in which [person: Joe]→(agnt)→[throw]→(obj)→[ball], and [throw]→(dest)→[person: Hal], with throw having the attribute future; Joe and the ball have position (100,350), and Hal has position (200,0).]
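What "executing" Figure 1 amounts to can be sketched directly. This is a minimal illustration only, not the simulator developed later in the paper; the dictionary encoding of concepts is an invented assumption.

```python
# Minimal sketch of executing the "throw" graph of Fig. 1: concepts are
# plain dicts of attributes, and the action procedure only rewrites
# attribute values, as the text assumes (it never restructures the graph).

joe  = {"type": "person", "name": "Joe", "position": (100, 350)}
hal  = {"type": "person", "name": "Hal", "position": (200, 0)}
ball = {"type": "ball", "position": (100, 350)}
throw = {"type": "throw", "tense": "future",
         "agnt": joe, "obj": ball, "dest": hal}

def execute_throw(action):
    """Copy the destination's position onto the object's position and
    mark the action as past -- the only effects execution should have."""
    action["obj"]["position"] = action["dest"]["position"]
    action["tense"] = "past"

execute_throw(throw)
print(ball["position"], throw["tense"])  # (200, 0) past
```

Note that only attribute values change; the agnt/obj/dest links of the graph are left untouched, matching the modeling assumption stated above.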
In this discussion, throw was assumed to have a procedure defined for it in a library of concept procedures. Such a library can easily get out of hand, so it is more desirable to have a type definition for throw in terms of a modest set of primitive actions which copy, replace, combine or operate only on values of attributes. Here, the type definition would have only a copy operation from the destination position concept to the object position concept. Figure 1 has only one isolated behavioral concept: throw. The conceptual graph of Figure 2 provides a more interesting example, in which the executions of actions and events have effects on one another. One would like to execute or simulate this description by specifying an initial condition, say that action 1 is active, the state is true, and the variable has the initial value of zero. Action 1 generates event 2, which in turn deactivates action 1 and initiates the increment action. The increment action reads its variable's value attribute, increments it and returns the sum as the result. Starting the increment also generates event 4, if it is enabled by the state being true. Firing event 4 terminates the increment and re-initiates action 1. The example of Figure 1 considered a single action, whose procedure or definition needed to know what to do with the possible adjacent concepts. Figure 2 has relations such as generator, terminator, initiator and enabler, which relate actions, events and states. As described later, to simplify the simulator, these and similar types of relations can be processed uniformly, regardless of the subtypes of the actions, events and states they are incident to. Only actions need to have 'personalized' procedures.
In the following sections, related work by other researchers is considered. Following that, a simulation algorithm is described, including its supporting mechanisms (data types), its simulation cycle, and how conceptual relations among behavioral concept types affect the execution of graphs.
[Fig. 2. An Example Graph: action #1 is linked by a generator relation to event #2; event #2 is linked by a terminator relation back to action #1 and by an initiator relation to the increment action #3; increment #3 is linked by a generator relation to event #4, which is enabled (enabler relation) by a state concept; event #4 is linked by a terminator relation to increment #3 and by an initiator relation back to action #1; variable #6, with value "0", is both operand and result of the increment. The state's referent, involving "7" and ">", is not fully recoverable from the extracted text.]
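The propagation described for the example of Figure 2 can be sketched as a tiny event-driven loop. This is illustrative only: the relation names follow the paper, while the data representation and the firing order are invented assumptions (the paper's actual simulation cycle is developed in later sections).

```python
# Tiny sketch of activity propagating over initiator/terminator relations.
# Firing an event deactivates or activates the actions it is linked to;
# the second event is gated by an enabling state, and starting the
# increment action runs its 'personalized' procedure on the variable.

state_true = True                       # the enabling state of Fig. 2
active = {"action1": True, "increment": False}
variable = 0                            # initial value of variable #6

# (event, relation, action) triples read off the example graph.
RELATIONS = [
    ("event2", "terminator", "action1"),
    ("event2", "initiator",  "increment"),
    ("event4", "terminator", "increment"),
    ("event4", "initiator",  "action1"),
]

def fire(event: str):
    """Apply every initiator/terminator relation incident to the event,
    uniformly, regardless of which action it touches."""
    global variable
    for ev, rel, act in RELATIONS:
        if ev == event:
            active[act] = (rel == "initiator")
            if act == "increment" and active[act]:
                variable += 1           # the increment action's procedure

fire("event2")                          # generated by action 1
if state_true:                          # event 4 is enabled by the state
    fire("event4")

print(active, variable)  # {'action1': True, 'increment': False} 1
```

After one round, action 1 is active again, the increment has terminated, and the variable has been incremented once, matching the trace described in the text.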
2 Related Work

In 1984, Sowa stated that "Conceptual Graphs represent declarative information," and then went on to describe "... a formalism for graphs of actors that are bound to conceptual graphs." These actor nodes form (bipartite) dataflow graphs [7] with the value-type concepts of conceptual graphs. The relations between the actors and concepts are input arcs and output arcs of the actors. Sowa described an execution model for computing with conflict-free, acyclic dataflow graphs using a system of tokens. Actors can be defined recursively or by other graphs with actors. This model allows only single assignments (of referents) to (generic) value concepts. While this view of executing conceptual graphs has the computational power of dataflow models, functional dataflow graphs are not the most popular model of computation, and they require the appendage of a new node type (actors) to conceptual graphs. In addition, dataflow models are not easily derived from most natural-language descriptions. In the present paper we show an execution model for general conceptual graphs without special actors. Other models of computation were considered by Delugach, including state-transition models [5,6]. Another type of node, called a 'demon', was added to conceptual graphs to account for the problem of states. The argument is that if a conceptual graph is to be true, and a system can be in only one state at a time, then (state) concepts must be created and destroyed when transitions occur. This approach was extended recently by the introduction of assertion types and assertion events [12].
54
W.R. Cyre
While demon nodes or assertion events offer the attractive capability of modifying the structure of a conceptual graph, they are external to conceptual graphs as actors are. Here, we avoid having to create and destroy states by simply marking them as being true or false (negated). The actor graphs and problem maps of Lukose [10] are object-oriented extensions of conceptual graphs to provide executability. An actor graph consists of the type definition of an act concept supplemented with an actor whose methods are invoked by messages. Execution of an actor graph terminates in a goal state which may be a condition for the execution of other actor graphs. Control of sequences and nesting of actor graphs is provided by problem maps. While this approach does associate executable procedures with act concept types, no mention is made of event and proposition types. The application of conceptual graphs to governing agents, such as robots, was introduced by Mann [11]. A conceptual database includes schemata of actions the agent can perform. Commands in the form of conceptual graphs derived from natural language expressions are unified with schemata to evoke behavior. The behavior itself is produced by programs associated with demon concepts in the schemata. At the same time, the present author considered the visual interpretation of conceptual graphs, including the pictorial depiction of their knowledge [3]. Comments were included on animating displays generated from conceptual graphs. Animation would be produced by animator procedures associated with action and event type concepts. In this case, a conceptual graph would govern a display engine rather than a robot. While both of these proposals execute conceptual graphs, each is rather specialized. Recently, simulation of conceptual graphs has been considered more generally [1]. This approach considers actions, states and events, where the underlying execution model is a state transition system. 
Action concepts are preconditioned by states and events. These actions can be recursively defined as sets of actions joined by temporal relations and conditioned by states and events. Events are changes in the truth of states. These state changes are described by links to transition concepts which, in turn, are invoked by actions through ‘effect’ relations. A simulation step consists of identifying the set of actions whose preconditions are true, and selecting one for execution to advance the simulation. Time apparently advances with each time step. The user is consulted to resolve indeterminacy due to multiple enabled actions in a step. Simulation also detects unreachable actions and inconsistencies. The execution model we describe treats states and events more generally and considers the interaction between behavior concepts (actions, events, states) and values or entities. Our set of general concept types and the simulation algorithm were developed through an examination of modeling notations for computer systems rather than from considering human behavior. As discussed later, behavioral models are not limited to computer system behavior. Since the present execution model is based on computer systems [2], it is appropriate to review here some approaches used in computer simulators. This will be limited to event-driven simulators. In computer logic simulators, the only possible events are changes in values (signals). At a given point in time, one or more signals may change. The behavior of each circuit having a signal change event on any input
Executing Conceptual Graphs
55
is simulated to determine output signal events. An output event consists of a value change and the time the event will occur due to delay in the circuitry. These events are posted to a queue. Once all output events due to current events have been determined, the next future event(s) is found from the event queue and simulation time is advanced to that time so that the event can be processed. If the next event occurs at the present time, the simulation time is not advanced. In a simple simulator, the code that simulates behaviors of circuits is contained in a library. In more general digital simulators [9], the user may write processes which respond to input events and produce output events. The procedures of all processes execute once upon initial startup of the simulator. This is appropriate since hardware is always processing its inputs as long as power is applied. After startup, procedures execute only when stimulated. In a simulation cycle, all stimulated processes complete before the next cycle begins. A very general modeling notation called Statecharts is supported by a more elaborate simulation procedure [8]. Our simulation algorithm was inspired by this approach. Statecharts are founded on finite state machines. The states may be hierarchical, consisting in turn of transition diagrams. Parallel compound machines are supported so the system can be in multiple sub-states at a time. Each transition can be triggered by an event and predicated on a condition. Both conditions and events may be Boolean combinations of other conditions and events, respectively. When a transition occurs, an action may be stimulated. Other actions may be stimulated by events as long as the system is in a particular (enabling) state. Such action invocations may be predicated on other conditions. In addition, actions and events may generate new events. The objective of the present paper is to show how conceptual graphs can be executed or simulated by an approach such as this.
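The event-queue discipline described above for logic simulators can be sketched with a priority queue ordered by event time. This is an illustrative Python stand-in, not any particular simulator's implementation; the signal names and the one-gate behavior function are ours:

```python
import heapq

def simulate(events, delay, behavior):
    """Minimal event-driven loop: pop the earliest signal-change event,
    compute the resulting output event with its circuit delay, and post it
    back to the queue.  'behavior' maps an input value to an output value."""
    queue = list(events)              # entries are (time, signal, value)
    heapq.heapify(queue)
    log = []
    while queue:
        t, sig, val = heapq.heappop(queue)
        log.append((t, sig, val))
        if sig == 'in':               # an input change drives one output event
            heapq.heappush(queue, (t + delay, 'out', behavior(val)))
    return log

# An inverter with a delay of 2 time units, stimulated at times 0 and 5.
log = simulate([(0, 'in', 0), (5, 'in', 1)], delay=2, behavior=lambda v: 1 - v)
```

Note that the output event at time 2 is processed before the next input event at time 5, exactly the queue ordering the text describes.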
3 Simulation Approach 3.1 Execution Semantics of Concept Types To begin to develop a general conceptual graph simulator, it is necessary to first consider the types of concepts that will participate in the simulation. In the present discussion, we consider the top-level type hierarchy of Figure 3. The behavior types actively participate in a simulation. Objects are passive elements whose attributes, such as value for variable and position for entity, can be affected by the execution of action types. State concepts describe the status of what is described by a conceptual graph, and may be pre-conditions on the execution of actions or the occurrence of events. States may be defined by the activity of actions or by relationships among attributes of objects, such as the values of variables and the positions of objects. Events are instantaneous and mark changes in activity of actions or the attributes of objects.
[Figure not reproduced: the type hierarchy rooted at T, with branches for object (entity, variable), behavior (action, event, state), and attribute (value, position, delay).]
Fig. 3. Top-Level Concept Type Hierarchy

Since conceptual graph theory allows concepts to be defined in terms of conceptual graphs, such recursively defined concepts must be accounted for in executing a graph. Consider an action that is defined by a graph that includes actions, events and states. When the defined action is executed, its graph must be executed completely before the action is completed and its consequences may be propagated on. In order to show that our approach is quite general, some traditional concept types [13] are interpreted in terms of the simulator type hierarchy of Figure 3, as shown in Table 1. Note that some concept type names conflict. For example, Sowa’s event type is classified as an action here because it has an agent and does not exclude a duration. Our events have no duration and are only discontinuities in actions. The type believe is an action (operation) whose operand is a belief or proposition (state). In this paper, we will only discuss the attributes value of variables and position of entity. Note, however, that all discussions extend to any attribute, such as the age or the color of an entity. Conceptual relations describe the interaction among these concept types and direct the simulator in assigning attributes and generating events during simulation. The interpretation of conceptual relations during simulation is described in detail in a later section. 3.2 Simulation Support Mechanisms The simulation algorithm described here is event-driven. Since event is a concept type in the type hierarchy, it is useful to define incident as the element processed by the simulation algorithm. Each incident is associated with a specific concept of the conceptual graph(s) being simulated, and has five attributes which specify: the concept
type, the concept identifier, the simulator time, the level of recursion, and the type of operation to be performed. The simulator time is measured with respect to the beginning of the simulation, which occurs at time zero. The simulator may perform many cycles at a given simulator time. That is, a cycle does not necessarily advance the simulation time, and simulation time cannot be reversed. Often, incidents generated at the present time must be processed in the present time (without delay). Such incidents are assigned a zero for simulation time. The types of incidents generated and processed by the simulator are listed in Table 2. The effects of these types of incidents are described later.

Table 1. Traditional Concept Types
Traditional Type    Simulator Type
act                 action
age                 attribute
animal              entity
believe             action
belief              state
message             variable
color               attribute
communicate         action
contain             state
event               action
proposition         state
teacher             entity
think               action
warm                attribute
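An incident, as characterized above (concept type, identifier, simulator time, recursion level, operation), maps naturally onto a small record type. The following Python sketch is illustrative only — the paper's own prototype is Prolog-based, and all field names here are ours:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Incident:
    """One unit of work processed by the simulator (the five attributes
    described in the text)."""
    ctype: str   # concept type: 'action', 'event', 'state', 'variable', 'entity'
    cid: int     # identifier of the concept in the graph
    time: int    # simulator time (0 = process at the present time)
    level: int   # level of recursion (deeper levels are processed first)
    op: str      # operation: 'start', 'stop', 'resume', 'enter', 'exit', ...

# A zero-time, top-level start incident for action concept #1.
start_gen = Incident('action', 1, 0, 0, 'start')
```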
The simulator uses the collection of lists shown in Table 3 to keep track of incidents and the state of the conceptual graph. The Queue contains incidents that have not yet been processed by the simulator. The Current list contains the incidents to be processed during the current simulation cycle, and the Future list contains incidents generated during the current cycle and to be processed in a future cycle. The Current incidents are selected from Queue at the beginning of a cycle, and the Future incidents will be added to the Queue at the end of the cycle and before the next cycle begins. The Activity list keeps track of which action concepts are active. Activity describes the status of an action; an action may be active, but may not do anything during a given cycle. Those actions which must be executed during the current cycle are listed in the Execute list. The True_states list keeps track of which states are true at the present time. A state concept may reflect the activity of an action, or may represent conditions defined by relationships among attributes of objects, such as values of variables or positions of entities. The Value and Position lists keep track of
the current values of variable concepts and the positions of entity concepts. In Prolog, lists are convenient structures for maintaining the status of the conceptual graph. In other implementation languages, lists of pointers might be used instead, and some lists might be eliminated entirely. For example, values and positions can be left in referent fields of concepts, but then, the graph would have to be scanned each cycle. Similarly, the status of action concepts and the truth of state concepts can be represented by the unary relations (active) and (not), respectively, incident to the concepts.
Table 2. Types of Simulation Incidents

Incident Type   Parameters                           Function
action          type, id, time, level, operation     Starts, stops or resumes an action.
event           type, id, time, level, none          Fires an event.
state           type, id, time, level, operation     Enters or exits a state (makes true or false).
variable        type, id, time, level, new_value     Assigns a new value.
entity          type, id, time, level, new_position  Assigns a new position.
Table 3. Working Lists used by the Simulator

List          Contents
Queue         Pending incidents
Current       Incidents to be processed in the current cycle
Future        Future incidents produced in the current cycle
Activity      Action concepts that are currently active
True_states   State concepts that are true
Values        Pairs of variable concepts and their current values
Positions     Pairs of entity concepts and their current positions
Execute       Action concepts to be executed in the current cycle
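The working lists of Table 3 can be bundled into a single state object. This is an illustrative Python sketch (the paper's implementation keeps these as Prolog lists; the field names and container choices here are ours):

```python
class SimulatorState:
    """The simulator's working lists, one field per row of Table 3."""
    def __init__(self):
        self.queue = []           # pending incidents
        self.current = []         # incidents to process in the current cycle
        self.future = []          # incidents produced for future cycles
        self.activity = set()     # action concepts currently active
        self.true_states = set()  # state concepts that are true
        self.values = {}          # variable concept -> current value
        self.positions = {}       # entity concept -> current position
        self.execute = []         # action concepts to execute this cycle

sim = SimulatorState()
sim.true_states.add('idle')       # e.g. the system starts in the idle state
```

Sets and dictionaries stand in for the membership tests and pair lists the text describes; a Prolog implementation would scan lists instead.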
3.3 The Simulation Cycle A simulation cycle consists of the following steps: 1) Get current incidents: The list of incidents, Current, to be processed during the current cycle is extracted from the incidents Queue. All incidents at the current level of recursion and having a zero time are extracted for processing in the current cycle. If no incidents with zero time exist at the current level, then the level is raised. If no incidents with zero time exist at the top level, simulation time must be advanced. Then, the incident(s) with the earliest time and the deepest level is extracted for processing during this cycle. The simulation level is set to that level and simulator time is advanced to that time. Current incidents are deleted from the Queue. 2) Update attributes: Process attribute (value and position) change incidents by changing these attributes of the affected concepts. This may result in an immediate change in whether some states (conditions) are true, so the True_states list may be affected. Incidents are deleted from the Current list as they are processed. 3) Process remaining incidents: Process state, action and event incidents from the Current list. The order of processing these incidents is immaterial since any consequences are placed in the Future list for processing during a later simulation cycle. 4) Execute actions. Finally, action concepts to be executed during the current cycle are executed in this step. This must be last, since event occurrences and state changes stimulate and precondition activities, respectively. 5) Update Queue: Append the Future list to the Queue. The manner of processing the various types of incidents is described in the following paragraphs. Note again that an incident is processed only if the simulator time has caught up with the incident time. Attribute (value and position) incidents specify changes in values of variables and positions of entities, and the time they are to occur. 
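Step 1 of the cycle — extracting the current incidents by time and level — can be sketched as follows. This Python sketch is our reading of the text: incidents are (time, level, payload) triples, and "raising the level" is rendered as falling back to the deepest level that still has zero-time incidents:

```python
def get_current(queue, sim_time, level):
    """Select the incidents to process this cycle (step 1, simplified).

    Prefer zero-time incidents at the current level; failing that, take the
    deepest level with zero-time incidents; failing that, advance simulator
    time to the earliest pending incident and take its deepest level."""
    zero_now = [i for i in queue if i[0] == 0 and i[1] == level]
    if zero_now:
        current = zero_now
    else:
        zero_any = [i for i in queue if i[0] == 0]
        if zero_any:
            level = max(i[1] for i in zero_any)     # deepest pending level
            current = [i for i in zero_any if i[1] == level]
        else:
            sim_time = min(i[0] for i in queue)     # advance simulation time
            at_t = [i for i in queue if i[0] == sim_time]
            level = max(i[1] for i in at_t)
            current = [i for i in at_t if i[1] == level]
    remaining = [i for i in queue if i not in current]  # delete from Queue
    return current, remaining, sim_time, level

current, remaining, t, lvl = get_current([(0, 1, 'a'), (5, 0, 'c')], 0, 1)
```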
In each case, the affected concept is located and its appropriate attribute is changed, both in the graph and in the Values and Positions lists. These incidents must be processed before action and event incidents, since value and position changes can affect the truth of preconditions for executing actions and firing events. Two types of state incidents are defined. An enter state incident causes the state concept to be added to the True_states list, and an exit incident removes the state concept from this list. State incidents are generated explicitly by actions, and so do not account for changes in status of state concepts due to changes in activity of actions and the attributes of variables or entities. For example, the state incidents
incident(service, #, _, exit) and incident(idle, #, _, enter) are indicated by the expression, "Reset changes the mode from service to idle.” An event incident fires the indicated event concept in a conceptual graph. The consequences of firing an event are determined by the conceptual relations incident with the event. Incidents generated through conceptual relations are described shortly. Action incidents change the activity of action concepts by adding them to or removing them from the Activity list. A stop incident removes the action. A start incident or resume incident places the action onto the Activity list. A start incident invokes an execution of the action with initial values and positions for associated variables and entities. A resume incident invokes an execution of the action using the last values or positions. This supports persistence in actions. Once incidents have been processed and changes in objects and states are completed, the action concepts stimulated into execution are processed. Action concepts may have operands they operate on to produce results. These will be value or position attributes of other concepts. If an action is executed by a start incident, it is reset to initial values/positions before execution begins. Otherwise, the current values/positions are used. Execution may not only generate value or position incidents with respect to result concepts, but may also generate event, state and other action incidents. To generate value and position events, a procedure for transforming inputs to outputs must be available. Rather than defining actor or demon nodes as part of an extended conceptual graph to account for these procedures, we follow the pattern of digital simulators that have libraries of procedures for simulating the behavior of action concepts. That is, there is a collection of primitive actions that the simulator knows how to process. 
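The handling of state and action incidents described above amounts to a small dispatch over the working lists. A hedged Python sketch (the triple encoding and all names are ours):

```python
def process_incident(inc, activity, true_states, execute):
    """Apply one state or action incident to the working lists.
    'inc' is a (ctype, cid, op) triple."""
    ctype, cid, op = inc
    if ctype == 'state':
        if op == 'enter':
            true_states.add(cid)        # state becomes true
        elif op == 'exit':
            true_states.discard(cid)    # state becomes false
    elif ctype == 'action':
        if op == 'stop':
            activity.discard(cid)       # action is no longer active
        elif op in ('start', 'resume'):
            activity.add(cid)           # action is (or stays) active
            execute.append((cid, op))   # schedule an execution this cycle

activity, true_states, execute = set(), set(), []
process_incident(('state', 'idle', 'enter'), activity, true_states, execute)
process_incident(('action', 'reset', 'start'), activity, true_states, execute)
```

Whether the execution uses initial or last values/positions is decided later from the recorded 'start' versus 'resume' operation.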
Other actions can be recursively defined in terms of schemata employing these primitive actions. So, to execute a complex action, its schema is executed. But, the schemata may take multiple cycles to execute. To satisfy this requirement, the simulator has levels of cycles. That is the function of the level parameter of the incidents. All current incidents at the deepest level of recursion must be completed before any higher-level incidents are processed. It is possible that some primitive actions may be invoked at different levels at the same simulation time. The present simulation strategy executes the action at the deepest level first. The simulation cycle is not completed until all action concepts have completed their execution, that is, suspended themselves. Self-suspension here means the computation is complete, and has nothing to do with terminating the activity of the action, unless the action generates a stop incident to itself. 3.4 Conceptual Relations and the Production of New Incidents As described thus far, the simulator consumes incidents but produces none, so a simulation would soon die out. New incidents are produced by the conceptual relations incident to firing events and executing actions. Table 4 shows a collection of relation types among behaviors and objects. Although their names may be unfamiliar, these relations account for most interactions among concepts, with the exception of attributes. Only binary relations are shown in the table; some ternary relations will be
considered later. The challenge in developing a model for executing conceptual graphs is to determine which incidents are generated by the various relations, and how combinations of relations incident with concepts interact.
Table 4. Signatures of Selected Conceptual Relations

Has       entity                 variable   action                event               state                  attribute
entity    part                   -          -                     -                   -                      position, color, age
variable  -                      -          -                     -                   -                      value, structure
action    part, agent, patient,  operand,   cause, deactivator,   generator           -                      status
          source, destination    result     temporal, part
event     -                      -          initiator, resumor,   temporal, trigger,  make_true, entrance,   -
                                            terminator            part                make_false, exit
state     -                      -          if, enabler           -                   part                   -
In Table 4, a relation is interpreted as the row concept ‘has relation’ with the column concept, such as an event has initiator action. This interpretation yields some unusual relation type names, but is traditional in conceptual graphs, and is necessary when considering combinations of relations incident with behavior concepts. First, consider relations actions have with behaviors (Row 3 in Table 4). When an event incident to an initiator relation fires, it will generate an action start incident for the related action, with zero time and the current level of recursion, e.g. incident(action, Id, 0, L, start). Similarly, the terminator and resumor relations will generate stop and resume action incidents for their related actions. A single action incident may not be sufficient to stimulate the execution of the action. If the action has one or more if relations with states and the states are not true (on the True_states list), then the action will not execute. In addition, an action must have an action incident on each of its initiator or resumor relations to execute, since conceptual graph theory interprets multiple relations of the same type incident to a concept as conjoined (ANDed). Disjunction of relations is not conveniently represented in conceptual graphs. For this purpose, we define new relations or and xor to synthesize artificial disjunctive concepts. The relation xor indicates exclusive-or. Since complex combinations cannot be expressed this way, introduction of artificial concepts is necessary, as in the example graph in Figure 4, which indicates that action A executes only if states S1 and S2 are true and if a start incident was produced from event E5 as well as event E3 or E4 or both E1 and E2. That is, the condition is (S1 and S2 and E5 and ((E1 and E2) or
E3 or E4)). Event [event: *1] was artificially introduced to represent the event that E3 or E4 or [event: *2] occurred. Event [event: *2] is the event that E1 and E2 occur simultaneously.
[action: A] (initiator) -> [event: *1] (or) -> [event: *2] (part) -> [event: E1] (part) -> [event: E2], (or) -> [event: E3] (or) -> [event: E4] (initiator) -> [event: E5] (if) -> [state: S1] (if) -> [state: S2],. Fig. 4. Complex Conditioning of Action Execution.
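Once the artificial events are expanded, the enabling condition of Figure 4 can be checked directly. A Python sketch (event and state names as in the figure; the fired set and the evaluation function are ours):

```python
def enabled(fired, true_states):
    """Check the condition (S1 and S2 and E5 and ((E1 and E2) or E3 or E4)).

    'fired' holds the events that produced start incidents this cycle;
    'true_states' is the True_states list.  Artificial event *2 is E1 and E2
    firing simultaneously; *1 is E3, E4 or *2."""
    star2 = 'E1' in fired and 'E2' in fired
    star1 = star2 or 'E3' in fired or 'E4' in fired
    return ('S1' in true_states and 'S2' in true_states
            and 'E5' in fired and star1)
```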
Table 5 shows which types of incident are generated by some of the relations identified in Table 4. Thus far the generation of the time parameter of incidents has not been addressed, so all incidents generated with the above relations have zero (present) time and the simulation time never advances. To introduce time, it is necessary to add a set of ternary relations comparable to the relations of Table 4. For example, the relation initiator_after has the signature shown in Figure 5.
[ action ] -> (initiator_after) [1] -> [event] [2] -> [delay],. Fig. 5. Signature for initiator_after relation.
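The effect of initiator_after when its event fires can be sketched as posting a future incident whose time is the current simulation time plus the delay (Python; the tuple encoding and names are ours):

```python
def fire_initiator_after(action_id, delay, now, level, future):
    """On firing the triggering event, schedule an action start incident for
    the related action at time 'now + delay' (a sketch of the ternary
    initiator_after relation)."""
    t = now + delay
    future.append(('action', action_id, t, level, 'start'))
    return future

future = []
fire_initiator_after('A7', 3, now=10, level=0, future=future)
```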
During firing of the event, then, incident(action,#,T,start) is added to the Future list, where the value of T is the current simulation time plus the delay. Table 4 also shows temporal relations among actions and events. These may include interval relations, endpoint relations and point relations [4]. Although temporal relations seem to imply causality, we interpret them here as constraints. For example, [action: A1] -> (starts_when_finishes) -> [action: A2]
does not cause a start action incident for action A1 to be generated when action A2 terminates. Instead, the simulator must check, when action A1 is initiated, that action A2 has terminated, and post an exception if this is not the case. Alternatively, temporal relations could be interpreted as causal, in which case the temporal and other behavioral relations can be checked statically for consistency. Similarly, duration relations incident to actions can be used as constraints to check if a stop incident occurs with the appropriate delay after a start incident, or the duration can be used when the action starts to generate a stop incident that terminates the action.
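The constraint reading of a duration relation reduces to a simple timing check. A minimal Python sketch (in the simulator described above an exception would be posted on violation; the function and parameter names are ours):

```python
def check_duration(start_time, stop_time, duration, tol=0):
    """Interpret a duration relation as a constraint: the stop incident must
    occur 'duration' time units after the start incident, within an optional
    tolerance."""
    return abs((stop_time - start_time) - duration) <= tol

ok = check_duration(3, 10, 7)    # stop arrived exactly 7 units after start
bad = check_duration(3, 10, 5)   # stop arrived 2 units too late
```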
Table 5. Incidents Produced by Conceptual Relations

Concept Activity   Incident Relation   Consequent Incident
Fire event         initiator           action start
                   resumor             action resume
                   terminator          action stop
                   trigger             event
                   entrance            state enter
                   exit                state exit
Execute action     cause               action start
                   deactivator         action stop
                   generator           event
                   make_true           state enter
                   make_false          state exit
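Table 5 amounts to a lookup from (concept activity, relation) to the consequent incident. A minimal Python rendering — the rows are transcribed directly from the table, while the dictionary encoding is ours:

```python
# Consequent incident (concept type, operation) for each relation,
# keyed by the producing activity.
CONSEQUENT = {
    'fire_event': {
        'initiator': ('action', 'start'),
        'resumor': ('action', 'resume'),
        'terminator': ('action', 'stop'),
        'trigger': ('event', None),       # fires the related event
        'entrance': ('state', 'enter'),
        'exit': ('state', 'exit'),
    },
    'execute_action': {
        'cause': ('action', 'start'),
        'deactivator': ('action', 'stop'),
        'generator': ('event', None),
        'make_true': ('state', 'enter'),
        'make_false': ('state', 'exit'),
    },
}
```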
4 Conclusions Mechanisms for simulating hierarchical conceptual graphs without introducing special nodes such as actors or demons have been described. To support execution of graphs, concept types are classified as behavior (action, event, state), object (entity, variable) and attribute. Execution is performed by procedures associated with action types that operate on object attributes, and by procedures associated with conceptual relations among behavior concepts. Although the simulation strategy was inspired by digital system simulators, the approach has been shown to be applicable to general concept types and relations.
5 Acknowledgments This work was funded in part by the National Science Foundation, Grant MIP-9707317.
References
1. C. Bos, B. Botella and P. Vanheeghe, “Modelling and Simulating Human Behaviours with Conceptual Graphs,” Proc. 5th Int’l Conf. on Conceptual Structures, Seattle, WA, 275-289, August 3-8, 1997.
2. Walling Cyre, "A Requirements Language for Automated Analysis," International Journal of Intelligent Systems, 10(7), 665-689, July 1995.
3. W. R. Cyre, S. Balachandar, and A. Thakar, “Knowledge Visualization from Conceptual Structures,” Proc. 2nd Int’l Conf. on Conceptual Structures, College Park, MD, 275-292, August 16-20, 1994.
4. W. R. Cyre, "Acquiring Temporal Knowledge from Schedules," in G. Mineau, B. Moulin, J. Sowa, eds., Conceptual Graphs for Knowledge Representation, Springer-Verlag, NY, 328-344, 1993. (ICCS'93)
5. H. Delugach, “Dynamic Assertion and Retraction of Conceptual Graphs,” Proc. 7th Workshop on Conceptual Structures, Binghamton, NY, July 11-13, 1991.
6. H. Delugach, “Using Conceptual Graphs to Analyze Multiple Views of Software Requirements,” Proc. 6th Workshop on Conceptual Structures, Boston, MA, July 29, 1990.
7. J. Dennis, “First Version of a Data Flow Procedure Language,” Lecture Notes in Computer Science, Springer-Verlag, NY, 362-376, 1974.
8. D. Harel and A. Naamad, The STATEMATE Semantics of Statecharts, i-Logix, Inc., Andover, MA, June 1995.
9. R. Lipsett, C.F. Schaefer and C. Ussery, VHDL: Hardware Description and Design, Kluwer Academic, Boston, 1989.
10. D. Lukose, “Executable Conceptual Structures,” Proc. 1st Int’l Conf. on Conceptual Structures, Quebec City, Canada, 223-237, August 4-7, 1993.
11. G. Mann, “A Rational Goal-Seeking Agent using Conceptual Graphs,” Proc. 2nd Int’l Conf. on Conceptual Structures, College Park, MD, 113-126, August 16-20, 1994.
12. R. Raban and H. S. Delugach, “Animating Conceptual Graphs,” Proc. 5th Int’l Conf. on Conceptual Structures, Seattle, WA, 431-445, August 3-8, 1997.
13. J. Sowa, Conceptual Structures, Addison-Wesley, Reading, MA, 1984.
From Actors to Processes: The Representation of Dynamic Knowledge Using Conceptual Graphs Guy W. Mineau Department of Computer Science Université Laval Quebec City, Canada tel.: (418) 656-5189 fax: (418) 656-2324 email: [email protected]
Abstract. The conceptual graph formalism provides all necessary representational primitives needed to model static knowledge. As such, it offers a complete set of knowledge modeling tools, covering a wide range of knowledge modeling requirements. However, the representation of dynamic knowledge falls outside the scope of the current theory. Dynamic knowledge supposes that transformations of objects are possible. Processes describe such transformations. To allow the representation of processes, we need a way to represent state changes in a conceptual graph based system. Consequently, the theory should be extended to include the description of processes based on the representation of assertions and retractions about the world. This paper extends the conceptual graph theory in that direction, taking into account the implementation considerations that such an extension entails.
1 Introduction This paper introduces a second-order knowledge description primitive into the conceptual graph theory, the process statement, needed to define dynamic processes. It explains how such processes can be described, implemented and executed in a conceptual graph based environment. To achieve this goal, it also introduces assert and retract operations. The need for the description of processes came from a major research and development project conducted at DMR Consulting Group in Montreal, where a corporate memory was being developed using conceptual graphs as its representation formalism. Among other things, the company’s processes needed to be represented. Although they are actually described in a static format using first-order conceptual graphs, advanced user support capabilities will eventually require them to be explained, taught, updated and validated. For that purpose, we need to provide for their execution, and thus, for their representation as dynamic knowledge. Dynamic knowledge supposes that transformations of objects are possible. Processes describe such transformations. To represent processes, we need a way to describe state changes in a conceptual graph based system. We decided to use assertion and retraction operations as a means to describe state changes. Therefore, the definition of a process that we put forth in this paper is based on such operations. Generally, processes can be described using algorithmic languages. These languages are mapped onto state transition machines, such as computers. So, a process M.-L. Mugnier and M. Chein (Eds.): ICCS’98, LNAI 1453, pp. 65-79, 1998. © Springer-Verlag Berlin Heidelberg 1998
66
G.W. Mineau
can be described as a sequence of state transitions. A transition transforms a system in such a way that its previous state gives way to a new state. These previous and new states can be described minimally by conditions, called respectively pre and postconditions, which characterize them. The preconditions of a transition form the smallest set of conditions that must conjunctively be true in order for the transition to occur; its postconditions can be described in terms of assertions to and retractions from the previous state. Thus, transitions can be represented by pairs of pre and postconditions. Processes can be defined as sequences of transitions, where the postconditions of a transition match the preconditions of the next transition. The triggering of a transition may be controlled by different mechanisms; usually, it depends solely on the truth value of the preconditions of the transition. This simplifies the control mechanism which needs to be implemented for the execution of processes; therefore, this is the approach that we advocate. Section 2 reviews the actual conceptual graph (cg) literature on processes. Section 3 presents an example that shows how a simple process can be translated into a set of transitions. Section 4 describes the process statement that this paper introduces. Finally, because of its application-oriented nature, this paper also addresses the implementation issues related to the engineering of such a representation framework; section 5 covers these issues.
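The transition semantics just described — preconditions that must hold conjunctively, and postconditions expressed as assertions and retractions against the previous state — can be sketched minimally in Python. The state is modeled as a set of facts; the encoding and names are ours, not the paper's conceptual graph form:

```python
def step(state, transitions):
    """Fire the first transition whose preconditions all hold.

    Each transition is a triple (pre, retract, assert_): the preconditions,
    the facts retracted from the previous state, and the facts asserted in
    the new state.  Returns the new state, or None if nothing is enabled."""
    for pre, retract, assert_ in transitions:
        if pre <= state:                       # all preconditions are true
            return (state - retract) | assert_
    return None

# A toy two-transition process: a light alternating between red and green.
transitions = [({'red'}, {'red'}, {'green'}),
               ({'green'}, {'green'}, {'red'})]
s = step({'red'}, transitions)
```

Triggering depends solely on the truth of the preconditions, which is exactly the simplification of the control mechanism advocated above.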
2 The CG Literature on Processes Delugach introduced a primitive form of demons in [1]. Demons are processes triggered by the evaluation of some preconditions. Delugach’s demons take concepts as input parameters, but assert or retract concepts as the result of their actions, contrary to actors, which only compute individuals of a predetermined type. Demons are thus a generalization of actors. They can be defined using other demons as well. Consequently, they allow the representation of a broader range of computation. We extended these ideas by allowing a demon to have any cg as input and output parameters. Consequently, our processes are a generalization of Delugach’s demons. We kept the same graphical representation as Delugach’s, using a labeled double-lined diamond box. However, we had to devise a new parameter passing mechanism. We chose context boxes, since what we present here is totally compatible with the definition of contexts as formalized in [2]. Similarly to [3], we chose state transitions as a basis for representing a process, except that we do not impose explicitly defining all execution paths; the execution of a process will create this path dynamically. This simplifies the process description activity. There is much work in the cg community on extending the cg formalism to include process-related primitives [4, 5, 6, 7, 8]. One of the main motivations behind these efforts is the development of an object-oriented architecture on top of a cg-based system [12]. As we advocate, [7] focuses on simple primitives to allow the modeling of processes. The process description language that we foresee could be extended to include high-level concepts as proposed in [7]. In [9], we explain how our two approaches complement each other. Also, [12] uses transitions as a basis for describing behaviour and [11] uses contexts as pre and postconditions for modeling behaviour. The work presented here is
¹ An execution path is defined as a possible sequence of operations, i.e., of state transitions, according to some algorithm.
The Representation of Dynamic Knowledge Using Conceptual Graphs
totally compatible with what is presented in these two papers; furthermore, 1) it adds a packaging facility, 2) it deals with the implementation details that yield a full definition and execution framework for processes, and 3) it is completely compatible with the definition of contexts as formalized in [2], providing a formal environment for using contexts as state descriptions. In what follows, we present a framework to describe processes in such a way that: 1) both dynamic and static knowledge can be defined using simple conceptual graphs (in a completely integrated manner), 2) they can be easily executed using a simple execution engine, and 3) inferences on processes are possible in order to validate them, produce explanations and support other knowledge-dependent tasks.
3 From Algorithms to Pre and Postcondition Pairs

We believe that a small example will be sufficient to show how a simple process, the iterative factorial algorithm, can be automatically translated into a set of pre/postcondition pairs. From this example, the definitions and explanations given in sections 4 and 5 below will become easier to present and justify. Figure 1 illustrates the process that we want to represent as a set of pre/postcondition pairs. Since a process relies on a synchronization mechanism to properly sequence the different transitions that it is composed of, and since we wish to represent transitions only in terms of pre and postconditions (for implementation simplicity), we decided to include the sequencing information in the pre/postconditions themselves². With an algorithmic language such as C, variable dependencies and boolean values determine the proper sequence of instructions. It is then rather easy to determine the additional conditions that must be inserted in the pre and postconditions for them to represent the proper sequence structure of the algorithm. Without giving a complete algorithm that extracts this sequence structure, Figure 2 provides the synchronization graph of the algorithm of Figure 1. The reader will find it easy to verify its validity, knowing that all non-labeled arcs represent variable dependencies between different instructions, that arcs marked with T and F indicate a dependency on a boolean value, and that arcs marked with L indicate a loop.

l0: int fact(int n)
l1: { int f;
l2:   int i;
l3:   f = 1;
l4:   i = 2;
l5:   while (i <= n)
l6:   { f = f * i;
l7:     i = i + 1;
l8:   }
l9:   return f; }

Fig. 1. The iterative version of the factorial algorithm written in C.³
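The variable-dependency part of this extraction can be sketched as follows. This is a rough sketch of our own (the paper deliberately omits the full algorithm): each instruction is described by the sets of variables it defines and uses, and an unlabeled arc a → b is emitted whenever b uses a variable last defined by a.

```python
# Hypothetical def/use description of the instructions of Fig. 1.
instrs = {
    "l3": ({"f"}, set()),        # f = 1
    "l4": ({"i"}, set()),        # i = 2
    "l5": (set(), {"i", "n"}),   # while (i <= n)
    "l6": ({"f"}, {"f", "i"}),   # f = f * i
    "l7": ({"i"}, {"i"}),        # i = i + 1
    "l9": (set(), {"f"}),        # return f
}
order = ["l3", "l4", "l5", "l6", "l7", "l9"]

def dependency_arcs(instrs, order):
    """Emit an arc a -> b when b uses a variable last defined by a."""
    arcs, last_def = set(), {}
    for line in order:
        defs, uses = instrs[line]
        for v in uses:
            if v in last_def:
                arcs.add((last_def[v], line))
        for v in defs:
            last_def[v] = line
    return arcs
```

The T, F and L arcs of the synchronization graph are not produced by this sketch; they come from the boolean test and the loop structure of the algorithm.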
² Any sequencing mechanism that can easily be predicated can be translated into pairs of pre and postconditions. The presented mechanism only implements the SWS and FBS relations of [13]; other temporal relations may be expressed in terms of these two relations whenever a process can be subdivided into smaller subprocesses.
³ For simplicity, n is assumed to be greater than or equal to 0.
G.W. Mineau
Each algorithmic instruction will be represented as a small set of transitions, called transition rules. The synchronization graph is useful to update the preconditions of
[Figure: a directed graph over the nodes l0, l3, l4, l5, l6, l7 and l9; T and F arcs leave the test node l5, and an L arc closes the loop back to l5.]
Fig. 2. The synchronization graph of the algorithm of Figure 1.
these rules, making sure that they include the information as to when they should fire. Also, since we do not wish to refire rules once they have been executed (except for loops), the corresponding postconditions will make sure that their preconditions will not be true again⁴. Synchronization is achieved through assertions and retractions.

Line l0: the entry point of the algorithm. The triggering of the process will be possible by the assertion of a graph stating that line l0 is to be executed. A process will be defined using variables (as done below), but it will be executed only after the instantiation of all concepts representing line numbers. The triggering of the process would then be done through the assertion of a graph whose concept refers to the first line of the process to be executed (see Figure 3). Having individuals unique to a process will ensure that only the transitions of this process will be triggered⁵. The transition rule associated with line l0 (see below) is composed of a single pair of pre/postconditions⁶. The preconditions include a graph that corresponds to the input parameter of the algorithm; section 5 below presents how parameter passing can be implemented. The postconditions include one assertion and one retraction (the first two graphs) which are meant to prevent this transition rule from firing again.
[Line: #l0]→(to_do)
Fig. 3. The assertion needed to trigger the process.
pre1:
  [Line: *l0]→(to_do)
  [Variable: *n]→(val)→[Integer]

post1:
  ¬ [Line: *l0]→(to_do)
  [Line: *l0]→(done)
  [Line: *l3]→(to_do)
  [Line: *l4]→(to_do)

⁴ In our example, variable dependencies are used to determine possible execution sequences; we chose to reference them through line numbers.
⁵ In this paper, we illustrate the definition of a process, not an actual individual process ready for execution. Consequently, all figures (but Figure 1) display generic concepts instead of individuals.
⁶ In order to associate different pre and postconditions together, indexes are used. Also, all graphs in a postcondition are asserted, except those preceded by the negation sign, which are retracted. Of course, assertions and retractions may be without effect, as asserted and negated graphs may already be known to be true or false respectively.
Lines l3: f = 1; and l4: i = 2;. An assignment operation is represented by two transition rules, because the assigned value may or may not differ from the actual value of the variable. Here, the assigned value can be determined directly from the algorithm. Note that line l6 will not be set to "to do" after l3, since it depends on some boolean value, as shown by the T-marked arrow in the synchronization graph of Figure 2. Because of space limitations, only the transition rules associated with line l3 are shown below; those of line l4 (pre4, post4, pre5 and post5) are not shown, since they are very similar to those of line l3.

pre2:
  [Line: *l3]→(to_do)
  [Line: *l0]→(done)
  [Variable: *f]→(val)→[Integer: *z2]
  ([Integer: *z2], [Integer: 1])→<!=>→[Boolean: True]⁷
⁷ Here, we defined one actor for each basic computation normally provided by a computer, such as !=, <=, +, *, etc. They are in fact procedures which are called when activated. Their activation occurs when their input concepts are all instantiated to individual concepts; the computation is then triggered and their output concepts are instantiated to the computed value.
post2:
  ¬ [Variable: *f]→(val)→[Integer: *z2]
  [Variable: *f]→(val)→[Integer: 1]
  ¬ [Line: *l3]→(to_do)
  [Line: *l3]→(done)
pre3:
  [Line: *l3]→(to_do)
  [Line: *l0]→(done)
  [Variable: *f]→(val)→[Integer: *z2]
  ([Integer: *z2], [Integer: 1])→<!=>→[Boolean: False]

post3:
  ¬ [Line: *l3]→(to_do)
  [Line: *l3]→(done)
Line l5: while (i <= n). The loop requires two transition rules, since its condition evaluates to a boolean value. Depending upon this evaluation, the set of instructions in the body of the loop will be set to "to do" or not. Also, since the loop will not be completed until the condition evaluates to false, we cannot set the status of line l5 to "done" until then. Please note that the last two graphs in post6 are meant to reset the transition rules that are part of the loop, allowing them to refire (for the next iteration).
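To make the to_do/done bookkeeping of these transition rules concrete, here is a small sketch in Python, a simplification of our own rather than part of the paper's cg machinery: the boolean actors become plain guards, and each transition rule is a guard/effect pair over an environment holding the token sets and variable values.

```python
def run(env, rules):
    """Fire enabled rules until none applies, as a rule engine would."""
    fired = True
    while fired:
        fired = False
        for guard, effect in rules:
            if guard(env):
                effect(env)
                fired = True
                break
    return env

def fact(n):
    # Tokens in "todo"/"done" play the role of the to_do/done graphs.
    env = {"todo": {"l0"}, "done": set(), "n": n}
    rules = [
        (lambda e: "l0" in e["todo"],                  # entry point
         lambda e: (e["todo"].discard("l0"), e["done"].add("l0"),
                    e["todo"].update({"l3", "l4"}))),
        (lambda e: "l3" in e["todo"],                  # f = 1
         lambda e: (e["todo"].discard("l3"), e["done"].add("l3"),
                    e.__setitem__("f", 1))),
        (lambda e: "l4" in e["todo"],                  # i = 2
         lambda e: (e["todo"].discard("l4"), e["done"].add("l4"),
                    e.__setitem__("i", 2), e["todo"].add("l5"))),
        (lambda e: "l5" in e["todo"] and e["i"] <= e["n"],   # loop: true
         lambda e: (e["todo"].discard("l5"), e["todo"].add("l6"),
                    e["done"].difference_update({"l6", "l7"}))),  # reset for refire
        (lambda e: "l5" in e["todo"] and e["i"] > e["n"],    # loop: false
         lambda e: (e["todo"].discard("l5"), e["done"].add("l5"),
                    e["todo"].add("l9"))),
        (lambda e: "l6" in e["todo"],                  # f = f * i
         lambda e: (e["todo"].discard("l6"), e["done"].add("l6"),
                    e.__setitem__("f", e["f"] * e["i"]),
                    e["todo"].add("l7"))),
        (lambda e: "l7" in e["todo"],                  # i = i + 1
         lambda e: (e["todo"].discard("l7"), e["done"].add("l7"),
                    e.__setitem__("i", e["i"] + 1),
                    e["todo"].add("l5"))),
        (lambda e: "l9" in e["todo"],                  # exit point
         lambda e: (e["todo"].discard("l9"), e["done"].add("l9"),
                    e.__setitem__("result", e["f"]))),
    ]
    return run(env, rules)["result"]
```

Note how the true branch of the loop test clears the done status of l6 and l7, mirroring the two resetting retractions of post6; the guards here only inspect the to_do tokens for brevity, whereas the paper's preconditions also test the done status of predecessor lines.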
pre6:
  [Line: *l5]→(to_do)
  [Line: *l4]→(done)
  ([Variable: *i]→(val)→[Integer], [Variable: *n]→(val)→[Integer])→<<=>→[Boolean: True]

post6:
  ¬ [Line: *l5]→(to_do)
  [Line: *l6]→(to_do)
  ¬ [Line: *l6]→(done)
  ¬ [Line: *l7]→(done)

pre7:
  [Line: *l5]→(to_do)
  [Line: *l4]→(done)
  ([Variable: *i]→(val)→[Integer], [Variable: *n]→(val)→[Integer])→<<=>→[Boolean: False]

post7:
  ¬ [Line: *l5]→(to_do)
  [Line: *l5]→(done)
  [Line: *l9]→(to_do)
Line l6: f = f * i;. This instruction is an assignment which is done after the expression on the right side is evaluated. Only the first transition rule associated with l6, pre8 and post8, is shown below, because of space limitations.

pre8:
  [Line: *l6]→(to_do)
  [Line: *l3]→(done)
  [Variable: *f]→(val)→[Integer: *z4]
  [Variable: *i]→(val)→[Integer]
  ([Integer: *z4], [Integer])→<*>→[Integer: *y4]
  ([Integer: *z4], [Integer: *y4])→<!=>→[Boolean: True]
post8:
  ¬ [Variable: *f]→(val)→[Integer: *z4]
  [Variable: *f]→(val)→[Integer: *y4]
  ¬ [Line: *l6]→(to_do)
  [Line: *l6]→(done)
  [Line: *l7]→(to_do)
Line l7: i = i + 1;. This instruction is similar to the one in line l6. Its postcondition, however, will assert that line l5 has to be executed next (it must be marked "to do" again), since at the end of the loop the condition of line l5 must be reevaluated. Again, because of space restrictions, we will not show pre10/post10 and pre11/post11, the transition rules associated with l7.

Line l9: the exit point of the algorithm. In post12, the first graph indicates the end of the process, while the last graph represents the output parameter of the algorithm. In order to instantiate this graph to the computed value (contained in variable f), a graph to that effect is inserted in pre12 (its last graph). Since coreference is global in a process definition, the value of f is passed on to post12, producing the appropriate computed value as the output parameter of the process.

pre12:
  [Line: *l5]→(done)
  [Line: *l9]→(to_do)
  [Variable: *f]→(val)→[Integer: *z5]

post12:
  ¬ [Line: *l9]→(to_do)
  [Line: *l9]→(done)
  [Integer: *z5]

4 The Process Statement

Let us define a transition rule ri as a pair of pre and postconditions <prei, posti>. Figure 4 shows how a process can be defined. A process statement is accompanied by a list of transition rules which describe the process itself. This list is called the rule set
of the process. This rule set can be produced automatically from the analysis of an algorithm, as illustrated in section 3. When parameters are used, they are defined as separate graphs before the rule set, but are automatically incorporated into some transition rule. The input parameters will appear in the preconditions of the first transition rule (the entry point of the process), while the output parameters will appear in the postconditions of the last transition rule (the exit point of the process)⁸. In our example, the first graph of pre1 would be labeled u1, while the last cg in post12 would be labeled u2. Figure 5 shows how the factorial algorithm could be defined as a cg process.

process name(in1 u1; ...; inn un; out1 un+1; ...; outm un+m) is:
  u1, ..., un, un+1, ..., un+m
  {ri, ∀i ∈ [a,b]}
Fig. 4. The process statement.
process fact(in1 u1; out1 u2) is:
  u1, u2
  {ri, ∀i ∈ [1,12]}
Fig. 5. The definition of the factorial process.
In graphical form, a process is represented by a double-lined diamond box labeled with the name of the process⁹. Its parameters each appear as a separate statement linked to the process symbol by an ingoing (for input) or outgoing (for output) arc. Each arc is indexed according to the process definition. Figure 6 shows how our factorial process is graphically represented.
[Figure: the process symbol <<fact>> with an ingoing arc 1 from [STATEMENT: [Variable: *n]→(val)→[Integer]] and an outgoing arc 1 to [STATEMENT: [Integer: *z5]].]
Fig. 6. The graphical representation of the factorial process.
Provided that there is an agreed-upon ontology of process definition terms, a process symbol could be expanded to this other representation. Figure 7 shows such an expansion with regard to the vocabulary of Figure 8. With this particular vocabulary, four conceptual relations are introduced: pre, states, asserts and negates, and four
⁸ For simplicity reasons, we suppose that a process has only one entry and one exit point.
⁹ In the linear form, a process can be represented by its name between double angled brackets; for example, we would have: <<fact>>.
concepts are defined: Process, Statement, Assertion (a true statement) and Negation (a false statement). In terms of this vocabulary, the graph of Figure 6 could be expanded to the one in Figure 9. This is given just as an example of process expansion with regard to a pre-established vocabulary; of course, other expansions could be defined.
[Figure: [PROCESS]→(pre)→[ASSERTION]→(asserts)→[STATEMENT], and [PROCESS]→(pre)→[NEGATION]→(negates)→[STATEMENT].]
Fig. 7. An alternative representation for processes.
[Figure: a type hierarchy with the concept types PROCESS, STATEMENT, ASSERTION and NEGATION under T, and the relation types pre, states, asserts and negates under Link.]
Fig. 8. A simple ontology (concepts and relations) for the description of processes.
[Figure: a [PROCESS] concept linked by (pre) relations to [STATEMENT] concepts: one statement asserts [Integer: *z6], another contains [Variable: *n]→(val)→[Integer].]
Fig. 9. A possible expansion of the graph of Figure 6.
5 Implementation Issues

This section deals with two main issues relevant to the implementation of the framework proposed in this paper: the parameter passing (section 5.1) and the process call (section 5.2) mechanisms.
5.1 The Evaluation of Parameters

The formal parameters of a process, both input and output, are identified with indexes in the process statement. They are the graphs which appear before the rule set of the process. They are automatically inserted in the preconditions of the first transition rule of the process (for input parameters), and in the postconditions of the last transition rule (for output parameters). As shown by the example of Figure 6, each effective parameter is given as a separate statement linked by a numbered arc to the process for which it is a parameter. Since the arcs are directed and numbered, there is no confusion between input and output parameters. That way, and because input and output boxes contain only one cg, the effective and formal parameters of a process can be matched to one another unambiguously.

Let us define KB, a knowledge base represented as a set of conceptual graphs, and S(g), the set of conceptual graphs belonging to KB which are specializations of some graph g of KB. Then S(g) = {gi ∈ KB | ∃πi, a projection, such that gi = πig}. The graphs of S(g) are said to be models of g. Let d: KB → P(KB), where P(KB) is the power set of KB, be a function that produces the models of any graph of KB, called its denotation set, such that dg = S(g). Let e and f be two cgs of KB, and de and df their denotation sets. Let us define the following three generalization relationships between e and f: if ∃π, a projection, such that e = πf, then e ≤g f; if ∃π′, a projection, such that f = π′e, then e ≥g f; if e ≤g f and e ≥g f, then e =g f.

Let e be a cg used as an effective input parameter of a process p, and f its formal counterpart (used in the definition of process p). Let us suppose that one of these relationships holds: 1) e ≤g f, or 2) e ≥g f. In the first case, since all graphs in de would be included in df if f were to be asserted in KB, we assign de to df: df ← de. In the second case, f covers only a portion of e.
We assign to df the subset of de that is covered by f: df ← {gi ∈ de | ∃πi, a projection, such that gi = πif}. In any case, if df is empty, then f is negated. Process p can be executed only after the denotation sets of all its formal input parameters are computed¹⁰. If f is negated while it should not be, this will prevent process p from being executed. Let us now define e, a cg used as an effective output parameter, and f, its formal counterpart¹¹. As the result of a process, e will be asserted with regard to its computed counterpart f. If e ≥g f, then de ← df. If e ≤g f, then de receives the subset of df which makes e true, i.e., de ← {gi ∈ df | ∃πi, a projection, such that gi = πie}. In any case, if de is empty, then e is negated. In the simplest case, the equality of effective and formal parameters should be sought as much as possible, as done in conventional programming. For example, let us suppose that g1 and g2 are the only two graphs in de, and that e ≤g f (where both are input parameters, effective and formal respectively). Then process p would use {g1, g2} as df. If there exists a transformation of f, say f*, done by p, which should be asserted as a result of p, then g1* and g2*
¹⁰ Of course, this computation may be costly. Different strategies to achieve this computation and to represent denotation sets may be adopted; they will not be discussed here.
¹¹ It could also be negated as the result of the process, in which case de must be empty.
should be computed from g1 and g2 respectively, and asserted as this result¹². The efficiency issues related to this mechanism fall outside the scope of this paper and will not be discussed here.

5.2 Process Calls

As seen previously, a process is described using generic concepts representing the variables of the process and the line numbers where the instructions are located in the algorithm. The scope of these concepts spans the whole process description; this governs the coreference mechanism. Two overall principles should be enforced in order to avoid confusion when executing a process. First, the variables of a process are local to that process: no global variables are allowed. Second, the line numbers of a process should be different from the line numbers of any other process. These two principles aim at enforcing the independence between processes; this helps to reduce the complexity associated with the implementation of the execution engine and consequently increases the reliability of the system. To make sure that these two principles are respected throughout the execution of a process, the following procedure must be applied when the execution of a process is required (i.e., when a process definition must be instantiated and executed). 1) Compute the ≤g / ≥g relations between corresponding effective and formal parameters. 2) Compute the denotation sets of the formal input parameters. 3) Create an individual process by instantiating all concepts representing line numbers and variables to values not previously existing in the system. 4) Incorporate the formal parameters into the rule set of the process. 5) Add this rule set to the execution engine¹³. 6) Assert the graph that triggers the process, called the trigger. After the process is completed, i.e., when the graph stating this fact is asserted by the process, the following procedure must be applied. 1) Compute the denotation set of the effective output parameters and assert (or retract) them.
2) Withdraw the rule set of the process from the execution engine. 3) Retract the trigger. 4) Delete all individuals (line numbers and variables) created specifically for the execution of this process, and retract all graphs which refer to them.
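The two mechanisms of this section can be sketched together in a few lines. This is a rough sketch under invented names, with graphs reduced to sets of triples and projection approximated by graph containment; a real cg engine would use true projection.

```python
import itertools

def specializes(g1, g):
    """Stand-in for projection: g1 specializes g if it contains g."""
    return g <= g1

def denotation(g, kb):
    """S(g): the graphs of KB that are specializations of g (section 5.1)."""
    return {g1 for g1 in kb if specializes(g1, g)}

def bind_input(e, f, kb):
    """Assign a denotation set to the formal input parameter f."""
    de = denotation(e, kb)
    if specializes(e, f):      # case 1: e <=g f, so df <- de
        return de
    return {g for g in de if specializes(g, f)}   # case 2: covered part only

_ids = itertools.count(1)

def instantiate(rule_set):
    """Step 3 of the call procedure: rename generic line markers to
    individuals unique to this call, so executions never interfere."""
    pid = next(_ids)
    rename = lambda ls: {f"{l}#{pid}" for l in ls}
    return [(rename(pre), rename(post)) for pre, post in rule_set]

# Hypothetical data mirroring the painter examples used later in the volume.
kb = {frozenset({("Paint", "agt", "Dali"), ("Paint", "obj", "P1")}),
      frozenset({("Paint", "agt", "Ernst")})}
e = frozenset({("Paint", "agt", "Dali"), ("Paint", "obj", "P1")})   # effective
f = frozenset({("Paint", "agt", "Dali")})                            # formal
df = bind_input(e, f, kb)

generic = [({"*l0"}, {"*l3", "*l4"})]
r1, r2 = instantiate(generic), instantiate(generic)   # two independent calls
```

An empty `df` would correspond to the negation of f, which prevents the process from being executed; the fresh identifiers produced by `instantiate` enforce the two locality principles above.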
6 Conclusion and Future Developments

This paper presents a simple representation framework for processes, based on pre and postconditions, that allows their integration into a cg-based system and their easy implementation for execution. In such a framework, the emphasis is put on the transformation of objects from their input to their output states; other representations put the emphasis on some other characteristic, such as the structure of the process itself
¹² In conventional programming, |de| = 1. Here, we allow multiple transformations to take place.
¹³ The framework that we propose in this paper is well adapted for execution in any rule-based system.
[7]. Our choice permits an easy implementation of an execution engine, while providing a basis for comparing processes (subprocesses) on their pre and postconditions (which will be useful in the near future). With this aim in mind, for our needs, a process is better characterized in terms of its input and output than in terms of its structure. However, we must acknowledge that the proposed implementation would suffer from the same completeness and soundness problems as any rule-based system: are two postconditions conflicting? Are two preconditions interdependent? These problems, of course, remain unsolved. Furthermore, even though not all process formalisms use variables and algorithmic primitives to synchronize their actions, the point that this paper makes is that automatic translation from most process representation formalisms to transition rules is possible. The knowledge modeler can use the formalism that he or she is most familiar with in order to describe processes; their representation using cgs can be automatically derived, offering full integration into a cg knowledge base. This should facilitate the associated knowledge modeling activities. Also, second-order knowledge on application domains, such as constraints on processes, could be represented as transition rules as well [10]. Using the same execution engine, the execution of processes could undergo constraint validation without requiring the implementation of any particular facility. For these reasons, our long-term goal, a Conceptual Programming Environment (CPE) where imperative and logic programming coexist and complement each other, could easily be developed. This would allow explanations on processes to be generated, and processes to be simulated and validated, providing tremendous support not only to the knowledge modeler, but to the end users as well.
Consequently, we think that the definition framework that we propose in this paper for the representation of dynamic processes is a necessary addition to the current cg theory, particularly with regard to its usefulness for corporate memories and task support systems. With the process statement that it introduces, this paper increases the expressivity of the formalism without impairing its simplicity and without making any special or ad hoc ontological commitment, leaving the theory as simple and as unbiased as it was originally intended to be. Finally, considering the cost of technology transfer from theory to practice, this paper proposes a definition framework which is extremely simple to implement. By doing so, we hope to facilitate its use in real cg systems, helping the dissemination of the theory.
References

1. Delugach, H. S. (1991). Dynamic Assertion and Retraction of Conceptual Graphs. In: Proc. of the 6th Ann. Workshop on Conceptual Structures, E. Way (Ed.), Binghamton, N.Y.: SUNY at Binghamton, pp. 15-24.
2. Mineau, G.W. & Gerbé, O. (1997). Contexts: A Formal Definition of Worlds of Assertions. In: Lecture Notes in AI #1257. Springer-Verlag. pp. 80-94.
3. Creasy, P. & Moulin, B. (1991). Approaches to Data Conceptual Modelling. In: Proc. of the 6th Annual Workshop on Conceptual Structures, E. Way (Ed.), Binghamton, N.Y.: SUNY at Binghamton, pp. 387-399.
4. Haemmerlé, O. (1995). Implementation of Multi Agent Systems using Conceptual Graphs for Knowledge and Message Representation: The CoGITo Platform. In: Proc. of the 3rd Int. Conf. on Conceptual Structures (ICCS-95), G. Ellis, R. Levinson & B. Rich (Eds.), Santa Cruz, CA: UCSC, pp. 13-24.
5. Hines, T. R., Oh, J. C. & Hines, M. L. A. (1990). Object-Oriented Conceptual Graphs. In: Proc. of the 5th Ann. Workshop on Conceptual Structures, L. Gerholz & P. Eklund (Eds.), Boston, Mass. Section A.08.
6. Kabbaj, A. & Frasson, C. (1995). Dynamic CG: Toward a General Model of Computation. In: Proc. of the 3rd Int. Conf. on Conceptual Structures (ICCS-95), G. Ellis, R. Levinson & B. Rich (Eds.), Santa Cruz, CA: UCSC, pp. 46-60.
7. Lukose, D., Cross, T., Munday, C. & Sobora, F. (1995). Operational KADS Conceptual Model using Conceptual Graphs and Executable Conceptual Structures. In: Proc. of the 3rd Int. Conf. on Conceptual Structures (ICCS-95), G. Ellis, R. Levinson & B. Rich (Eds.), Santa Cruz, CA: UCSC, pp. 71-85.
8. Wuwongse, V. & Ghosh, B. C. (1992). Towards Deductive Object-Oriented Databases Based on Conceptual Graphs. In: Lecture Notes in AI #754. Springer-Verlag. pp. 188-205.
9. Lukose, D. & Mineau, G.W. (1998). A Comparative Study of Dynamic Conceptual Graphs. In: Proc. of the 11th Knowledge Acquisition for Knowledge-Based Systems Workshop (KAW-98). Banff, Alberta, Canada. Section VKM-7. April 18-23.
10. Mineau, G.W. & Missaoui, R. (1996). Semantic Constraints in Conceptual Graph Systems. DMR Consulting Group, R&D Division. Report #960611A.
11. Bos, C., Botella, B. & VanHeeghe, P. (1997). Modeling and Simulating Human Behaviours with Conceptual Graphs. In: Lecture Notes in AI #1257. Springer-Verlag. pp. 275-289.
12. Ellis, G. (1995).
Object-Oriented Conceptual Graphs. In: Lecture Notes in AI #954. Springer-Verlag. pp. 144-157.
13. Allen, J.F. (1981). An Interval-Based Representation of Temporal Knowledge. In: Proc. of the 7th Int. Joint Conference on Artificial Intelligence. pp. 221-226.
A Semantic Validation of Conceptual Graphs

Juliette Dibie¹,², Ollivier Haemmerlé¹, and Stéphane Loiseau¹,³
¹ INA-PG, Département OMIP, 16, rue Claude Bernard, F-75231 Paris Cedex 05, France
² LAMSADE, Université de Paris IX - Dauphine, Place du Maréchal de Lattre de Tassigny, F-75775 Paris Cedex 16, France
³ LRI - CNRS, Bât. 490, Université Paris XI - Orsay, F-91405 Orsay Cedex, France
[email protected], [email protected], [email protected]
Abstract. Validation is an important part of the research work on knowledge based systems. Three kinds of validation have been studied: syntactic validation, logical validation and semantic validation. We are interested in these three kinds of validation for knowledge based systems based on conceptual graphs. This paper focuses on semantic validation. We present a method to check whether a knowledge base is semantically valid. This is done with respect to constraints. We present two types of constraints, minimal and maximal descriptive constraints. Each of them is associated with a conceptual graph G. It allows one to specify that if there exist specializations of G in the knowledge base, then these specializations must respect some conditions. For each kind of descriptive constraints we propose a way of checking if a knowledge base is valid and define their logical semantics. Finally we compare descriptive constraints with other extensions of the conceptual graph model.
M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 80-93, 1998. Springer-Verlag Berlin Heidelberg 1998

1 Introduction

Validation is an increasing preoccupation for knowledge based (KB) system developers, mainly for two reasons. Firstly, researchers have worked for a long time on knowledge representation and related algorithms. Some operational systems can now be designed using those approaches, so researchers focus on methods and tools to help ensure the quality of KB systems, which is a requirement for industrial systems. Secondly, the particular nature of knowledge requires specific approaches for KB systems; software engineering works provide only partial solutions. According to [1], validation is defined as "the set of activities of which the goal is to contribute in ensuring to a certain degree the quality and reliability of a KB system". Most of the works on the validation of KB systems concern rule based systems [2-5]. In other works, some study the validation of knowledge models [6], and in particular the validation of semantic networks. [7] proposes
solutions to exhibit incoherences in knowledge bases using production rules and semantic networks. [8] defines description logics as a language with good properties to represent and verify structured knowledge. As far as we know, no work has been done previously on the validation of conceptual graphs. In this paper, we are interested in the semantic validation of conceptual graphs. The semantic validation of a KB system aims at ensuring that the semantics of the factual knowledge of the KB conform to some constraints. The constraints are rules of the form "If . . . Then ⊥" which are reliable knowledge expressing contradictory conditions. Our work is based on the CG model [9] as formalized in [10]. A knowledge base (KB) is composed of a support (S) and of one (or several) conceptual graph(s) (G) representing the factual knowledge. The semantic validation leads us to extend the CG model by introducing constraints, called descriptive constraints. A descriptive constraint is knowledge given by the designer for validation purposes, associated with a conceptual graph G and built on the support S. A descriptive constraint allows one to specify that if there exist specializations of G in the KB, then these specializations must respect some conditions. A descriptive constraint is of the form G ⇒const C, where C characterizes the conditions associated with the specializations of the graph G in the KB. A descriptive constraint is either minimal or maximal. A minimal (resp. maximal) descriptive constraint specifies that if there exist specializations of G in the KB, then these specializations must at least (resp. at most) respect some conditions. The combination of minimal and maximal descriptive constraints provides an efficient way of setting constraints on conceptual graphs. This paper is organized as follows. The second section deals with the definitions of descriptive constraints.
The third section presents the semantic validation of a KB, with respect to some descriptive constraints. In the fourth section, we discuss the descriptive constraints with regard to what already exists in the CG model in terms of definition and constraint.
2 Descriptive Constraints

A descriptive constraint is either minimal or maximal. Firstly, we formalize the notion of descriptive constraint. Secondly, we present the minimal and maximal descriptive constraints themselves.

2.1 Definition of Descriptive Constraints
Intuitively, a descriptive constraint associated with a conceptual graph G allows one to specify that if there exists a specialization of G in the knowledge base, then this specialization must respect some conditions. It restricts the possible specializations of G in the base to specific specializations. For instance a descriptive constraint allows one to indicate that “if a painter paints, then he must paint a painting”. A coreference link [9] links two concept vertices of comparable labels. It is graphically represented by a dotted line between these concept vertices.
Definition 1 Let G1 and G2 be two connected conceptual graphs, C1 and C2 being their respective classes of concept vertices. A coreference link between two concept vertices of comparable labels of C = C1 × C2 is an equivalence relation on C.

Coreference links are used in the definition of descriptive constraints. The conditions that must be respected by all the specializations of G in the base are characterized by descriptions, the concept vertices of G being linked by coreference links to some concept vertices of each description.

Definition 2 A description, noted d, of a connected conceptual graph G is a connected conceptual graph, specialization of G, such that each concept vertex of G is linked by a coreference link to one of its concept vertices. We call head of the description the subgraph composed of the concept vertices linked by a coreference link to the concept vertices of G and of the relation vertices linking these concept vertices together. The head of a description d is unique and noted hd.

We admit that a descriptive constraint is associated with an irredundant conceptual graph. A conceptual graph G is irredundant if there is no projection from G into one of its strict subgraphs [11].

Definition 3 A descriptive constraint associated with an irredundant connected conceptual graph G, called the constraint graph, is noted G ⇒const d1 ∨ ... ∨ dm, where for all
i ∈ [1, m], di is a description of G.

Example 1 (followed throughout the paper): Let G and d be the following conceptual graphs:

G: [Painter: *]←(agt)←[Paint: *]
d: [Painter: *]←(agt)←[Paint: *]→(obj)→[Painting: *]

where the [Painter: *] and [Paint: *] vertices of d are linked by coreference links to those of G, and the head hd of d is its subgraph [Painter: *]←(agt)←[Paint: *].
Let KB be the knowledge base composed of the following factual knowledge Γ = {γ1, γ2, γ3}, a juxtaposition of the connected conceptual graphs γ1, γ2 and γ3:

γ1: [Painter: Dali c1]←(agt r1)←[Paint: * c2]→(obj r2)→[Painting: * c3]
γ2: [Painter: Ernst c4]←(agt r3)←[Paint: * c5]
γ3: [Painter: Miró c6]←(agt r4)←[Paint: * c7]→(obj r5)→[Painting: BlueI c8], with [Paint: * c7]→(obj r6)→[Painting: BlueII c9]

The descriptive constraint G ⇒const d means that if a painter paints, then he must paint a painting. Intuitively, it appears that the descriptive constraint is not satisfied by KB. On the one hand the painter Ernst paints, but he does not paint any painting, and on the other hand the painter Miró paints more than one painting.

2.2 Minimal Descriptive Constraints and Maximal Descriptive Constraints
There exist two types of descriptive constraints: the minimal descriptive constraints and the maximal descriptive constraints. They allow one to set restrictions on a conceptual graph as follows:
1. the minimal descriptive constraint is noted G ⇒min d1 ∨ · · · ∨ dm. It allows one to specify that each specialization of G in the base must at least satisfy a description di, i ∈ [1, m];
2. the maximal descriptive constraint is noted G ⇒max d1 ∨ · · · ∨ dm ∨ G. It allows one to specify that for each specialization of G in the base and for each description di, i ∈ [1, m], this specialization must at most satisfy it once. That is to say that if a specialization of G satisfies a description di, i ∈ [1, m], then it must satisfy this description only once. There can be specializations of G which do not satisfy any description di, i ∈ [1, m]; they then satisfy the conceptual graph G itself;
3. with the previous two descriptive constraints, we can express the fact that each specialization of G in the base must satisfy exactly one description di, i ∈ [1, m]. That is to say that each specialization of G must at least satisfy a description di, i ∈ [1, m], and this only once.

The intuitive notions at least satisfy and at most satisfy are formalized in the definitions of the satisfaction of minimal and maximal descriptive constraints.

Example 2: Let us consider the conceptual graph G and its description d presented in example 1. The intuitive interpretation of the minimal descriptive constraint G ⇒min d is that "if a painter paints, then he must at least paint a painting". The intuitive interpretation of the maximal descriptive constraint G ⇒max d ∨ G is that "if a painter paints, then he must at most paint a single painting; he can paint none".
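To make the projection operation concrete, the graphs of Example 1 can be encoded as small data structures and a projection searched for as a label-preserving homomorphism. The sketch below is only an illustration under simplifying assumptions (a flat type order, exact relation-type matching, and the vertex names c1, c2, c3 of γ1); it is not the authors' implementation.

```python
from itertools import product

# Toy type order: each type maps to the set of its supertypes (including itself).
SUPERTYPES = {"Painter": {"Painter"}, "Paint": {"Paint"}, "Painting": {"Painting"}}

def leq(t_sub, t_sup):                      # t_sub is a specialization of t_sup
    return t_sup in SUPERTYPES[t_sub]

# A conceptual graph: concepts maps a vertex id to (type, marker);
# relations is a list of (relation type, tuple of concept ids ordered by arc number).
G = {"concepts": {"p": ("Painter", "*"), "a": ("Paint", "*")},
     "relations": [("agt", ("a", "p"))]}

d = {"concepts": {"p": ("Painter", "*"), "a": ("Paint", "*"), "w": ("Painting", "*")},
     "relations": [("agt", ("a", "p")), ("obj", ("a", "w"))]}

gamma1 = {"concepts": {"c1": ("Painter", "Dali"), "c2": ("Paint", "*"),
                       "c3": ("Painting", "*")},
          "relations": [("agt", ("c2", "c1")), ("obj", ("c2", "c3"))]}

def projections(g, h):
    """Yield every projection (mapping of g's concept ids to h's) from g into h."""
    g_ids = list(g["concepts"])
    candidates = []
    for gid in g_ids:
        gt, gm = g["concepts"][gid]
        # The image vertex must carry a specialized type and a compatible marker:
        # a generic marker "*" projects onto anything, an individual onto itself.
        ok = [hid for hid, (ht, hm) in h["concepts"].items()
              if leq(ht, gt) and (gm == "*" or gm == hm)]
        candidates.append(ok)
    h_rels = set(h["relations"])
    for choice in product(*candidates):
        mapping = dict(zip(g_ids, choice))
        # Every relation edge of g must be preserved in h under the mapping.
        if all((rt, tuple(mapping[a] for a in args)) in h_rels
               for rt, args in g["relations"]):
            yield mapping

print(len(list(projections(G, gamma1))))   # 1: Dali paints
print(len(list(projections(d, gamma1))))   # 1: ... and he paints a painting
```

Both searches succeed on γ1: there is a projection from G and one from d, in line with the intuitive reading of the constraint of Example 1.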
3 Semantic Validation
The semantic validation aims at checking the coherence of the factual knowledge with respect to constraints, which represent some expert knowledge given for validation purposes. We say that a knowledge base is semantically valid if it respects all the constraints of the knowledge-based system.

Definition 4 A knowledge base KB is semantically valid iff KB satisfies each minimal descriptive constraint and each maximal descriptive constraint of the knowledge-based system.

We introduce the notion of image graph, used in the definitions of the satisfaction of minimal and maximal descriptive constraints.

Definition 5 Let H and G be conceptual graphs. We say that there exists a projection from G into H with h the image graph of G if h is a subgraph of H and there exists a surjective projection from G into h.

In the following, we consider that the knowledge base KB, being semantically validated, is composed of a support S and a conceptual graph under normal form Γ. A conceptual graph G is under normal form if each individual marker belonging to G appears exactly once [11].

3.1 Satisfaction of Minimal Descriptive Constraints
In terms of graph operations The satisfaction of a minimal descriptive constraint G ⇒min d1 ∨ · · · ∨ dm by a knowledge base KB consists in checking if each specialization of G in KB satisfies at least one description di, i ∈ [1, m]. In terms of graph operations, the constraint is satisfied iff for each specialization γ of G in KB (γ is an image graph of G in Γ), there exists at least one projection from a description di, i ∈ [1, m], into Γ with γ the image graph of the head of di.

Definition 6 Let KB be a knowledge base composed of a support S and a conceptual graph under normal form Γ. Let G ⇒min d1 ∨ · · · ∨ dm be a minimal descriptive constraint. The minimal descriptive constraint is satisfied by KB iff for each projection from G into Γ such that γ is the image graph of G, there exists at least one projection from a description di, i ∈ [1, m], into Γ with γ the image graph of the head of di.

Example 3: Let us consider the conceptual graph G, its description d and the conceptual graphs γ1 and γ2 presented in example 1.
a) Let KB be a knowledge base composed of the factual knowledge Γ = {γ1}. We check if KB satisfies the minimal descriptive constraint DC: G ⇒min d.
A Semantic Validation of Conceptual Graphs
There is a projection from G into γ1 with γ1′ the conceptual graph limited to the vertices c1, r1 and c2 and the image graph of G; there is a projection from d into γ1 with γ1′ the image graph of the head of d. KB satisfies the minimal descriptive constraint DC: the painter Dali paints and he paints a painting.
b) Let KB′ be a knowledge base composed of the factual knowledge Γ′ = {γ1, γ2}. We check if KB′ satisfies the minimal descriptive constraint DC. (1) There is a projection from G into γ1 with γ1′ the conceptual graph limited to the vertices c1, r1 and c2 and the image graph of G; there is a projection from d into γ1 with γ1′ the image graph of the head of d; (2) there is a projection from G into γ2 but there doesn't exist a projection from d into γ2. KB′ doesn't satisfy the minimal descriptive constraint DC: the painter Dali paints and he paints a painting, but the painter Ernst paints and he paints nothing.

Logical semantics The logical interpretation of definition 6 relies on the notion of S-substitution. An S-substitution is a mapping from a logical formula ϕ associated with a conceptual graph into another one ψ that associates with each term or atom of ϕ a term or an atom of ψ [11].

Property 1 Let KB be a knowledge base composed of a support S and a conceptual graph under normal form Γ. Let G ⇒min d1 ∨ · · · ∨ dm be a minimal descriptive constraint. The minimal descriptive constraint is satisfied by KB iff: for each S-substitution ρ from Φ(G) into Φ(Γ), there exists an S-substitution ρ′ from Φ(di) into Φ(Γ) with ρ′(Φ(hdi)) = ρ(Φ(G)).

We prove the equivalence of definition 6 and its logical interpretation (property 1) in proof 1 of the annex section.

Example 4: Let us consider example 3 b, with knowledge base KB′ composed of the factual knowledge Γ′ = {γ1, γ2}. The logical interpretations [9, 11] of the conceptual graphs G, d and Γ′ are:
Φ(G) = ∃x, y Painter(x) ∧ Paint(y) ∧ agt(y, x)
Φ(d) = ∃u, v, w Painter(u) ∧ Paint(v) ∧ Painting(w) ∧ agt(v, u) ∧ obj(v, w)
Φ(Γ′) = ∃l, m, n Painter(Dali) ∧ Paint(l) ∧ Painting(m) ∧ agt(l, Dali) ∧ obj(l, m) ∧ Painter(Ernst) ∧ Paint(n) ∧ agt(n, Ernst)
We obtain the same result from a logical point of view as from a graphic one. (1) Let ρ1 be an S-substitution from Φ(G) into Φ(Γ′), ρ1 = {(Painter, Painter), (x, Dali), (Paint, Paint), (y, l), (agt, agt)}; there exists an S-substitution ρ′ from Φ(d) into Φ(Γ′), ρ′ = {(Painter, Painter), (u, Dali), (Paint, Paint), (v, l), (Painting, Painting), (w, m), (agt, agt), (obj, obj)} and ρ′(Φ(hd)) = ρ1(Φ(G)).
(2) Let ρ2 be an S-substitution from Φ(G) into Φ(Γ′), ρ2 = {(Painter, Painter), (x, Ernst), (Paint, Paint), (y, n), (agt, agt)}; there doesn't exist an S-substitution ρ′ from Φ(d) into Φ(Γ′) with ρ′(Φ(hd)) = ρ2(Φ(G)). KB′ doesn't satisfy the minimal descriptive constraint DC because of case (2).

3.2 Satisfaction of Maximal Descriptive Constraints
Whereas redundancy is considered in the study of the satisfaction of minimal descriptive constraints, it makes no sense in the study of the satisfaction of maximal descriptive constraints. The maximal descriptive constraint contains the idea of at most whereas redundancy is logically interpreted by the idea of at least. Intuitively, we can see that having a maximal descriptive constraint composed of redundant conceptual graphs, or studying the satisfaction of a maximal descriptive constraint by a redundant knowledge base, would be irrelevant. Thus, we admit that the descriptions of maximal descriptive constraints and the conceptual graph(s) belonging to the knowledge base are irredundant.

In terms of graph operations The satisfaction of a maximal descriptive constraint G ⇒max d1 ∨ · · · ∨ dm ∨ G by a knowledge base KB consists in checking if, for each description di, i ∈ [1, m], each specialization of G in KB satisfies at most this description once. In terms of graph operations, the constraint is satisfied iff for each specialization γ of G in the base (γ is an image graph of G in Γ), there exists at most one projection from each description di, i ∈ [1, m], into Γ with γ the image graph of the head of di.

Definition 7 Let KB be a knowledge base composed of a support S and let Γ be an irredundant conceptual graph under normal form. Let G ⇒max d1 ∨ · · · ∨ dm ∨ G be a maximal descriptive constraint, the descriptions di, i ∈ [1, m], being irredundant. The maximal descriptive constraint is satisfied by KB iff for each projection from G into Γ such that γ is the image graph of G, there is at most one projection from each graph di, i ∈ [1, m], into Γ with γ the image graph of the head of di.

Example 5: Let us consider the conceptual graph G, its description d and the conceptual graphs γ2 and γ3 presented in example 1.
a) Let KB be a knowledge base composed of the factual knowledge Γ = {γ2}. We check if KB satisfies the maximal descriptive constraint DC: G ⇒max d ∨ G. There is a projection from G into γ2 and there does not exist a projection from d into Γ with γ2 the image graph of hd. KB satisfies the maximal descriptive constraint DC: the painter Ernst paints and, as he paints nothing, he does not paint more than one painting.
b) Let KB′ be a knowledge base composed of the factual knowledge Γ′ = {γ2, γ3}. We check if KB′ satisfies the maximal descriptive constraint DC.
(1) There is a projection from G into γ2 and there does not exist a projection from d into Γ′ with γ2 the image graph of hd; (2) there is a projection from G into γ3 and there are two projections from d into Γ′ with γ3′ the conceptual graph limited to the vertices c6, r4 and c7 and the image graph of hd. KB′ doesn't satisfy the maximal descriptive constraint DC, because of the two projections from d into Γ′ in case (2): the painter Ernst paints and he paints nothing, but the painter Miró paints more than one painting.

Logical semantics The logical interpretation of definition 7 is as follows.

Property 2 Let KB be a knowledge base composed of a support S and an irredundant conceptual graph under normal form Γ. Let G ⇒max d1 ∨ · · · ∨ dm ∨ G be a maximal descriptive constraint, the descriptions di, i ∈ [1, m], being irredundant. The maximal descriptive constraint is satisfied by KB iff: for each S-substitution ρ from Φ(G) into Φ(Γ), if there exists an S-substitution ρ′ from Φ(di) into Φ(Γ) with ρ′(Φ(hdi)) = ρ(Φ(G)), then ρ′ must be unique.

We prove the equivalence of definition 7 and its logical interpretation (property 2) in proof 2 of the annex section.

Example 6: Let us consider example 5 b, with knowledge base KB′ composed of the factual knowledge Γ′ = {γ2, γ3}. The logical interpretations of the conceptual graphs G, d and Γ′ are:
Φ(G) = ∃x, y Painter(x) ∧ Paint(y) ∧ agt(y, x)
Φ(d) = ∃u, v, w Painter(u) ∧ Paint(v) ∧ Painting(w) ∧ agt(v, u) ∧ obj(v, w)
Φ(Γ′) = ∃l, m Painter(Ernst) ∧ Paint(l) ∧ agt(l, Ernst) ∧ Painter(Miró) ∧ Paint(m) ∧ Painting(BlueI) ∧ Painting(BlueII) ∧ agt(m, Miró) ∧ obj(m, BlueI) ∧ obj(m, BlueII)
We obtain the same result from a logical point of view as from a graphic one.
(1) Let ρ1 be an S-substitution from Φ(G) into Φ(Γ′), ρ1 = {(Painter, Painter), (x, Ernst), (Paint, Paint), (y, l), (agt, agt)}; there doesn't exist an S-substitution ρ′ from Φ(d) into Φ(Γ′) with ρ′(Φ(hd)) = ρ1(Φ(G)).
(2) Let ρ2 be an S-substitution from Φ(G) into Φ(Γ′), ρ2 = {(Painter, Painter), (x, Miró), (Paint, Paint), (y, m), (agt, agt)}; there exist two S-substitutions ρ′ and ρ″ from Φ(d) into Φ(Γ′):
ρ′ = {(Painter, Painter), (u, Miró), (Paint, Paint), (v, m), (Painting, Painting), (w, BlueI), (agt, agt), (obj, obj)} and ρ′(Φ(hd)) = ρ2(Φ(G));
ρ″ = {(Painter, Painter), (u, Miró), (Paint, Paint), (v, m), (Painting, Painting), (w, BlueII), (agt, agt), (obj, obj)} and ρ″(Φ(hd)) = ρ2(Φ(G)).
KB′ doesn't satisfy the maximal descriptive constraint DC because of case (2).
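Definitions 6 and 7 both reduce satisfaction to counting, for each specialization γ of G, the projections of each description whose head-image is γ: at least one count must be ≥ 1 for a minimal constraint, and every count must be ≤ 1 for a maximal one. A minimal sketch of these two checks, with the counts of Examples 3b and 5b transcribed by hand rather than computed from the graphs:

```python
def satisfies_min(proj_counts):
    # Definition 6: each specialization of G (one row) must be the head-image
    # of at least one projection of some description d_i (one column).
    return all(any(c >= 1 for c in row) for row in proj_counts)

def satisfies_max(proj_counts):
    # Definition 7: for each specialization of G and each description d_i,
    # there is at most one such projection.
    return all(all(c <= 1 for c in row) for row in proj_counts)

# Example 3b (minimal, Γ' = {γ1, γ2}, one description d):
# the Dali specialization is matched once by d, the Ernst one never.
print(satisfies_min([[1], [0]]))   # False

# Example 5b (maximal, Γ' = {γ2, γ3}, one description d):
# the Ernst specialization is matched 0 times, the Miró one twice.
print(satisfies_max([[0], [2]]))   # False
```

Both results agree with the paper's conclusions: KB′ violates the minimal constraint because of Ernst and the maximal one because of Miró.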
4 Discussion

4.1 Descriptive Constraints versus Definitions
In the conceptual graph (CG) model, several authors have been interested in the definition of types [9, 12]. A type is defined (1) either by genus, the genus representing its over-type: in that case, there is implication between the type and its genus; (2) or by genus and difference: the definition then declares an equivalence between the type and its description. [12] also talks about the partial definition of a type t, which declares the implication of a description D by the type t: t(x) ⇒def D(x). In description logics, the definition is richer than in the CG model because it allows one to set restrictions on concepts. A description logic [13] contains concepts and roles, which can respectively be compared with concept types and relation types of the CG model. These concepts and roles can be either primitive or complex, complex concepts and roles being defined via descriptions built by a set of constructors. For instance, a complex concept can be defined by restrictions of roles: three constructors, all, atleast and atmost, are used to build concepts by restricting their relationships with other concepts. The definition by genus and difference of types in the CG model is not enough to establish such restrictions on types. The introduction of descriptive constraints in the CG model allows one to set restrictions on types and more generally on conceptual graphs. A descriptive constraint is associated with a conceptual graph, which can be limited to a concept vertex (constraint associated with a concept type) or to a relation vertex with its neighbours (constraint associated with a relation type). The notion of the constructors atleast and atmost of description logics is covered by the minimal descriptive constraints and the maximal descriptive constraints. The descriptive constraints thus allow one to extend the definition of types in the CG model by setting restrictions on types.
It is on purpose that we have used the word "descriptions" to characterize restrictions and the notation G ⇒const d1 ∨ · · · ∨ dm to represent the descriptive constraints, which are inspired by the partial definition of a type t. As in the definitions, we are interested in multiple descriptive constraints and nested descriptive constraints. We talk about multiple descriptive constraints if more than one descriptive constraint is associated with a constraint graph. We talk about nested descriptive constraints if a descriptive constraint is associated with the description of a constraint graph. In these two cases, the following definition and property are obvious.

Definition 8 If two distinct descriptive constraints DC1 and DC2 are associated with the same constraint graph G, DC1: G ⇒const d1 ∨ · · · ∨ dm and DC2: G ⇒const d′1 ∨ · · · ∨ d′l, then the descriptive constraint associated with G is G ⇒const d1 ∨ · · · ∨ dm ∨ d′1 ∨ · · · ∨ d′l.
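Operationally, Definition 8 amounts to taking the disjunction of the two constraints, i.e. concatenating their description lists. A hypothetical sketch (the list-of-names representation is ours, not the paper's):

```python
def merge_constraints(descs1, descs2):
    # Definition 8: the merged constraint on G is the disjunction of all
    # descriptions of DC1 and of DC2.
    return descs1 + descs2

print(merge_constraints(["d1", "d2"], ["e1"]))   # ['d1', 'd2', 'e1']
```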
Property 3 Let DC be a descriptive constraint of a constraint graph G, DC: G ⇒const d1 ∨ · · · ∨ dm. If a descriptive constraint DC′ is associated with one of the descriptions dj of G, DC′: dj ⇒const d′1 ∨ · · · ∨ d′l, then the descriptive constraint DC becomes G ⇒const d1 ∨ · · · ∨ dj−1 ∨ dj+1 ∨ · · · ∨ dm ∨ d′1 ∨ · · · ∨ d′l.

The proof of this property is almost immediate. It uses definition 8 and the transitivity property of the coreference links (definition 1).

4.2 Descriptive Constraints versus Constraints
The notion of constraint is not a new idea in the Conceptual Structures Theory. It is studied by [10] in the expression of knowledge of different types, it is associated with the notion of canonicity by [14–16], and [17] took an interest in the representation of semantic constraints in conceptual graph systems. We study these three cases in comparison with the descriptive constraints.

[10] propose to express knowledge of different types. They present in particular the notions of necessarily implicit knowledge and local constraint. These two types of knowledge can be expressed in terms of descriptive constraints. A necessarily implicit knowledge D associated with a type t expresses the fact that if in a conceptual graph there is a concept vertex of type t′ such that t′ ≤ t, then this vertex necessarily has the particular graph D in its environment. Such a necessarily implicit knowledge can be seen as a minimal descriptive constraint of the form: t(x) ⇒min D(x). A local constraint C associated with a type t expresses the fact that in every graph satisfying the constraint C, the environment of each concept vertex of type t′ such that t′ ≤ t conforms to this constraint. Such a local constraint can be seen as a descriptive constraint of the form: t(x) ⇒const C(x). More precisely, a local constraint allows one to represent functional constraints, for instance every painter who paints paints a single painting, which can be seen as a maximal descriptive constraint of the form G ⇒max d ∨ G with G and d the conceptual graphs of example 1, d being the local constraint.

[15] says that canonicity handles the violation of selectional constraints. The checking mechanism of canonicity consists in checking whether a conceptual graph G is a canonical graph with respect to a canonical basis. According to [14], a conceptual graph G is canonical iff there are projections of the graphs of the canonical basis into G that wholly cover G.
This checking mechanism of canonicity cannot be compared with the satisfaction of minimal descriptive constraints. Let Bcan be a canonical basis composed of n canonical graphs noted di, i ∈ [1, n]. Let us study the canonicity of a conceptual graph G with respect to Bcan. We would like to demonstrate that: "G is canonical with respect to Bcan iff the knowledge base composed of the conceptual graph G satisfies the minimal descriptive constraint G ⇒min d1 ∨ · · · ∨ dn". But G ⇒min d1 ∨ · · · ∨ dn is not a minimal
descriptive constraint because the graphs di, i ∈ [1, n], are not specializations of the graph G. Besides, [16] proposes a positive canonical model and negative canonical models, associated with each concept type and relation type in the ontology of the domain, to strengthen the semantic checking mechanism of the CG model. The checking mechanism of canonicity of a conceptual graph G is enforced by projecting positive and negative canonical models into G. There must exist a projection from the positive canonical model of each type into every occurrence of that type in G, and there must not exist a projection from the negative canonical models of each type into any occurrence of that type in G. This mechanism of semantic checking can be compared with the satisfaction of minimal descriptive constraints. Let us build the descriptive constraints corresponding to canonical models. The positive canonical model and the negative canonical models are conceptual graphs associated with a type t. We note them respectively G+t and G−ti, i ∈ [1, m]. Let Gt be the conceptual graph associated with the type t; Gt is limited to a concept vertex if t is a concept type, and to a relation vertex and its neighbours if t is a relation type. The checking mechanism of canonicity of a conceptual graph G can be compared with the minimal descriptive constraint Gt ⇒min G+t that must be satisfied and the minimal descriptive constraint Gt ⇒min G−t1 ∨ · · · ∨ G−tm that must not be satisfied.
[17] present topological constraints on conceptual graphs, which are based upon the identification of invalid graphs. The authors propose to use non-validity intervals to represent invalid graphs. A non-validity interval is a pair of conceptual graphs u and v, represented within brackets, such that v ≤ u. Any graph G which falls in a non-validity interval, for instance [u, v], is to be considered invalid, that is to say G is invalid if v ≤ G ≤ u; otherwise G is plausible according to this interval. Such a non-validity interval can be seen as a minimal descriptive constraint of the form: u ⇒min v. As a matter of fact, G is invalid (G does not satisfy the minimal descriptive constraint) if G is a specialization of u and is not more specific than v; otherwise it satisfies the minimal descriptive constraint. The non-validity intervals allow one not to consider the boundary values of the interval, which cannot be done with our descriptive constraints. The authors also present domain constraints. They want to extend the conceptual graph theory to cover a wider range of constraints. Yet, our goal is to propose relevant constraints for the validation of a knowledge base built on conceptual graphs, not new semantic constraints in the conceptual graph model.

4.3 Conclusion
Conceptual Structures Theory [9] talks about three levels of meaning: syntax, logic and ontology. We have been working on conceptual graph validation in three directions, according to these three levels: the syntactic validation, the logical validation and the semantic validation. The first level is a syntactic validation [18] that consists in checking a knowledge base structurally. Firstly, the KB must conform to the definitions of the
conceptual graph model. Secondly, the factual knowledge of the KB must conform to its terminological knowledge. The second level is a logical validation [19] leading to a KB that corresponds exactly to the designer's intuitive interpretation. An extension of the conceptual graph model (the LV-graph model) is proposed and allows one to define "distinction links" between concept vertices. The third level is a semantic validation that consists in confronting the KB with some expert knowledge specifications. The semantic validation of a KB using descriptive constraints ensures that the factual knowledge of the KB satisfies these constraints. Instead of being used for validation purposes, the descriptive constraints could be considered as terminological knowledge belonging to the knowledge base. We could imagine extending the conceptual graph model by introducing a descriptive base, a set of descriptive constraints, such as the definition base or the canonical basis. It would then be valuable to build conceptual graphs which would be well-formed with respect to a descriptive base. This work has been implemented and tested for several months on the CoGITo platform [20], which is a tool designed to implement applications based on the conceptual graph model [21, 22].
5 Annex: Proof of the Properties
The definition of the S-substitution verifies the following property [11], which we recall for proof 1:

Property 4 There exists a projection from a conceptual graph G into a conceptual graph H iff there exists an S-substitution from Φ(G) into Φ(H).

Proof 1 Let us show the equivalence of the two propositions:
– P1: "for each S-substitution ρ from Φ(G) into Φ(Γ), there exists an S-substitution ρ′ from Φ(di) into Φ(Γ) with ρ′(Φ(hdi)) = ρ(Φ(G))"
– P2: "for each projection from G into Γ such that γ is the image graph of G, there is at least one projection from a graph di into Γ with γ the image graph of the head of di".
According to property 4, there exists a projection π from G into Γ iff there exists an S-substitution ρ from Φ(G) into Φ(Γ), and there exists a projection π′ from di into Γ iff there exists an S-substitution ρ′ from Φ(di) into Φ(Γ). P1 is equivalent to "for each projection π from G into Γ, there exists a projection π′ from di into Γ such that the image graph of the head hdi of di in Γ corresponds to the image graph of G in Γ". The idea of "there exists a projection from di into Γ such that . . ." is equivalent to the idea of "there is at least one projection from a graph di into Γ with . . .". Thus P1 is equivalent to "for each projection π from G into Γ with γ the image graph of G in Γ, there is at least one projection from a graph di into Γ with γ the image graph of the head hdi of di", that is to say P1 ⇔ P2.
Proof 2 Let us show the equivalence of the two propositions:
– P1: "for each S-substitution ρ from Φ(G) into Φ(Γ), if there exists an S-substitution ρ′ from Φ(di) into Φ(Γ) with ρ′(Φ(hdi)) = ρ(Φ(G)), then ρ′ must be unique"
– P2: "for each projection from G into Γ such that γ is the image graph of G, there is at most one projection from each graph di into Γ with γ the image graph of the head of di".
According to property 4, there exists a projection π from G into Γ iff there exists an S-substitution ρ from Φ(G) into Φ(Γ), and there exists a projection π′ from di into Γ iff there exists an S-substitution ρ′ from Φ(di) into Φ(Γ). P1 is equivalent to "for each projection π from G into Γ, if there exists a projection π′ from di into Γ such that the image graph of the head hdi of di in Γ corresponds to the image graph of G in Γ, then the projection π′ must be unique". The idea of "if there exists a projection from di into Γ such that . . ., then it must be unique" is equivalent to the idea of "there is at most one projection from each graph di into Γ with . . .". Thus P1 is equivalent to "for each projection π from G into Γ with γ the image graph of G in Γ, there is at most one projection from each graph di into Γ with γ the image graph of the head hdi of di", that is to say P1 ⇔ P2.
References
1. J.P. Laurent. Proposals for a valid terminology in KBS validation. European Conference on Artificial Intelligence, pages 829–834, August 1992.
2. A. Ginsberg. Knowledge-base reduction: a new approach to checking knowledge bases for inconsistency and redundancy. AAAI, pages 585–589, 1988.
3. P. Meseguer. Verification of multi-level rule-based expert systems. AAAI, pages 323–328, 1991.
4. A. Preece and N. Zlatareva. A state of the art in automated validation of knowledge-based systems. Expert Systems with Applications, 7(2):151–167, 1994.
5. S. Loiseau. Checking and restoring the consistency of knowledge databases. Encyclopedia of Computer Science and Technology, 36:15–34, 1997.
6. C. Haouche and J. Charlet. KBS validation: a knowledge acquisition perspective. ECAI, pages 433–437, 1996.
7. M.C. Rousset. Knowledge formal specifications for formal verification: a proposal based on the integration of different logical formalisms. European Conference on Artificial Intelligence, pages 739–743, 1994.
8. P. Hors and M.C. Rousset. Modeling and verifying complex objects: a declarative approach based on description logics. European Conference on Artificial Intelligence, 1996.
9. J.F. Sowa. Conceptual structures: information processing in mind and machine. Addison Wesley Publishing Company, 1984.
10. M.L. Mugnier and M. Chein. Représenter des connaissances et raisonner avec des graphes. Revue d'Intelligence Artificielle, 10(1):7–56, 1996.
11. M. Chein and M.L. Mugnier. Conceptual graphs: fundamental notions. Revue d'Intelligence Artificielle, 6(4):365–406, 1992.
12. M. Leclère. Les connaissances du niveau terminologique du modèle des graphes conceptuels : construction et exploitation. PhD thesis, Université Montpellier II, Décembre 1995.
13. F. Bouali. Les systèmes terminologiques et la relation partie de. Rapport de stage de DEA, Université PARIS-SUD, centre d'Orsay, Septembre 1992.
14. M.L. Mugnier and M. Chein. Characterization and algorithmic recognition of canonical conceptual graphs. In Proceedings of the 1st International Conference on Conceptual Structures, ICCS'93, Quebec City, Canada, L.N.A.I. 699, pages 294–311. Springer-Verlag, August 1993.
15. M. Wermelinger. A different perspective on canonicity. In Proceedings of the 5th Int. Conf. on Conceptual Structures, Lecture Notes in Artificial Intelligence 1257, pages 110–124, Seattle, U.S.A., 1997. Springer Verlag.
16. P. Kocura. Conceptual graph canonicity and semantic constraints. In Peter W. Eklund, Gerard Ellis and Graham Mann, editors, Conceptual Structures: Knowledge Representation as Interlingua – Auxiliary Proceedings of the Fourth International Conference on Conceptual Structures, pages 133–145, Sydney, Australia, August 1996. Springer Verlag.
17. G. W. Mineau and R. Missaoui. The representation of semantic constraints in conceptual graph systems. In Proceedings of the 5th Int. Conf. on Conceptual Structures, Lecture Notes in Artificial Intelligence 1257, pages 138–152, Seattle, U.S.A., 1997. Springer Verlag.
18. J. Dibie. Validation des graphes conceptuels. Rapport de stage de DEA, Université Paris IX, Dauphine, Paris, Septembre 1997.
19. J. Dibie, O. Haemmerlé, and S. Loiseau. Une validation logique des graphes conceptuels. Rapport interne, Institut National Agronomique, Paris, Janvier 1998.
20. O. Haemmerlé. CoGITo : une plate-forme de développement de logiciels sur les graphes conceptuels. PhD thesis, Université Montpellier II, Janvier 1995.
21. P. Martin and L. Alpay. Conceptual structures and structured documents.
In Proceedings of the 4th Int. Conf. on Conceptual Structures, Lecture Notes in Artificial Intelligence 1115, Springer-Verlag, pages 145–159, Sydney, Australia, August 1996. 22. C. Bos, B. Botella, and P. Vanheeghe. Modelling and simulating human behaviours with conceptual graphs. In Proceedings of the 5th Int. Conf. on Conceptual Structures, Lecture Notes in Artificial Intelligence 1257, pages 275–289, Seattle, U.S.A., 1997. Springer Verlag.
Using Viewpoints and CG for the Representation and Management of a Corporate Memory in Concurrent Engineering
Myriam Ribière
INRIA (project ACACIA), 2004, route des Lucioles, BP.93, 06902 Sophia-Antipolis Cedex
E-mail: [email protected]

Abstract. In Concurrent Engineering, several participants (designers, managers, ...) stemming from different specialities collaborate in order to build a system. In this paper, we propose a representation with the Conceptual Graph formalism to keep a trace of the design process in Concurrent Engineering as a project memory. Such a design process is a cycle of individual design and collaborative evaluation. In order to facilitate knowledge extraction from this memory and to ensure the evolution of knowledge, we define a viewpoint management. This management is based on our study of viewpoints in Conceptual Graphs.
1 Introduction

In Concurrent Engineering (CE), several participants (designers, managers, ...) stemming from different specialities collaborate in order to construct a system (also called artefact). Such a design process is a cycle of individual design, in which each designer defines a design proposition corresponding to his speciality and to his task, and collaborative evaluation, in which the integration of propositions in the artefact is evaluated. Participants sometimes need to review previous decisions, so it is important to define a project memory and to represent the different states of the artefact. This memory must be accessible by designers in different specialities during the project process. Since the Conceptual Graph (CG) formalism offers a knowledge representation close to natural language, we propose to use this formalism to represent the states of the artefact. We also exploit the viewpoint notion to make knowledge accessible to different participants. We first present (Section 2) Concurrent Engineering (CE) through the description of the CE task and the corporate memory in CE. In the next section (Section 3), we clarify the viewpoint definition according to the CE context. Then, we propose (Section 4) methods and algorithms to manage a part of the corporate memory (the different states of the artefact). Those methods are related to the building, the consultation and the update of the artefact during the CE task.
M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 94-108, 1998. © Springer-Verlag Berlin Heidelberg 1998

2 Concurrent Engineering and Corporate Memory
2.1 CE Task
As we noted above, the CE process is a cycle of individual design and collaborative
evaluation. In fact, we distinguish three main tasks in CE:

[Fig. 1. The CE task: from requirements and private models, each designer Designs propositions; the group Evaluates their integration into the artefact using shared models, Argues over propositions, assumptions and arguments, and reaches Decisions that Modify the artefact.]
• Individual Design (Design task): relying on his private knowledge (private model), each designer generates propositions to satisfy the given requirements. This task is similar to a design task.
• Cooperative Evaluation (Evaluate task): the group evaluates the integration of the propositions into the artefact. Propositions may not satisfy all participants' needs, and conflicts can appear. To promote the acceptance of his propositions by the group, a participant justifies them with a number of arguments (Argue task). Assumptions made in the design task are used to determine and define these arguments. Argumentation tries to change the other participants' opinion by justifying the utility and the necessity of a proposition.
Note that different types of knowledge (private and shared) and different types of tasks (individual and cooperative) are manipulated in the CE task. Note also that several fields (domains) are involved and that designers may be at geographically distant sites. So different types of knowledge (related to profession, to experience, etc.) have to be capitalized on, and designers need to access them (from a related field, from collaborative decisions, ...) for each task.
2.2 Corporate Memory in CE
In [11] we study the complete elaboration of a corporate memory in CE, and distinguish the different types of memory needed during the different tasks of CE (Fig. 1.). We distinguish a memory dedicated to the individual design task and another dedicated to the cooperative evaluation task. States of the artefact are essential in the second memory: they represent the evolution of the design object during the CE process. Participants need to review previous decisions (in previous artefact states), and they need to examine the current state of the artefact in order to specify the requirements of each part of the artefact and the interactions between these parts. They also need to refer to previous projects.
So there is a direct interaction between experts and the description of the artefact. For this reason, we use the conceptual graph formalism to represent information and knowledge in the artefact. CGs are close to natural language thanks to their representation of knowledge with concepts and relations, and they offer a graphical display form of logic which is more readable than classical notations. We also exploit the arguments of Gerbé [5] about how conceptual graphs "are a response
to the specific requirements involved in the development of corporate knowledge repositories": the possibility to express knowledge at both the type and the instance level, partial knowledge, relationships between categories or instances, and categories or instances in a meta-model. In a second phase, we exploit the viewpoint notion to describe the artefact according to the different experts collaborating in its building. Furthermore, viewpoints help in the information retrieval that all participants need in the different tasks of the CE process. So in the next section we describe the use of viewpoints in such a process, and we reuse the integration of viewpoints that we presented in a previous paper [10] to formalize our representation of the artefact.
3 Viewpoints and Definition
"Viewpoint" is a polysemous word, i.e. its definition depends on the context of use. In TROPES, for example, a viewpoint "is a perspective of interest from which an expert examines the knowledge base". This is a general definition, which can take several interpretations in different applications.
3.1 Definition of a Viewpoint for CE
In CE, we must introduce a human dimension that corresponds to the different participants in a design project, and a knowledge dimension corresponding to the experience, competence and situation of the participants. Generally, we can say that a viewpoint is taken by a person according to his/her knowledge, domain of competence and objective in his/her activity. Finch [4] speaks of a vocational viewpoint for a viewpoint used in a particular work activity. So in CE, we characterize a viewpoint by three objects: person, domain, objective. Each object has a definition:
• person is described by the name of the person, his/her situation in the enterprise, and his/her competencies with a level of competence for each one,
• domain is described by the name of the domain and the current activity,
• objective is described by a focus of the activity on a design object.
We applied this definition in accidentology, collaborative design and conflict management between distributed data. We notice that person, domain and objective are always (or most often) present in the elaboration of a viewpoint. So we can constitute the beginning of an ontology for CE with those objects and define viewpoints in CE on this ontology, although the description of each object depends on the application. Moreover, we consider two important parts in a viewpoint: first, the "Objective" constitutes the focus; second, "Domain" and "Person" constitute the different angles of
view on a same focus (Fig. 2.).

[Fig. 2. Example of a description of a network: a common focus ("performance in number of users") seen from three angles — an electronics expert (domain: electronics), a network designer (domain: computing/network), and a project manager (domain: management).]
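The three-object characterization above (person, domain, objective as the focus plus the angles of view) could be modeled, for illustration, as a small data structure. This is a hypothetical sketch; the class and field names are ours, not the paper's implementation:

```python
from dataclasses import dataclass, field

@dataclass
class Person:
    name: str
    situation: str                                    # position in the enterprise
    competences: dict = field(default_factory=dict)   # competence -> level

@dataclass
class Domain:
    name: str
    activity: str

@dataclass
class Objective:
    design_object: str
    focus: str

@dataclass
class Viewpoint:
    person: Person        # angle of view
    domain: Domain        # angle of view
    objective: Objective  # the shared focus

# Example inspired by Fig. 2: the network designer's angle on a common focus.
vp = Viewpoint(
    person=Person("Alice", "network designer", {"networking": "expert"}),
    domain=Domain("computing", "network design"),
    objective=Objective("network", "performance in number of users"),
)
print(vp.objective.focus)
```

Two viewpoints sharing the same `objective` but differing in `person` or `domain` are exactly two angles of view on the same focus.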
According to this definition, we propose in the next section an adaptation of the integration of viewpoints in CG.
3.2 Viewpoints and CGs
First step in using viewpoints and conceptual graphs. To represent knowledge in a context involving several experts, it is interesting to model the multiple perspectives that different experts may have on the objects handled in their reasoning. Taking inspiration from research in object-oriented representation, we proposed in a previous work [10] an extension of the conceptual graph formalism to integrate viewpoints in the support and in the building of conceptual graphs. Viewpoints allow us to define the context of use and the origin (name of the expert or group of experts, speciality, degree of experience, ...) of the concept types introduced in a graph. As a reminder, we recall some definitions used in this previous work:
• Definition: viewpoint relation, basic and v-oriented concept type. Let C1, C2 be two concept types. If C1 < C2, then there may exist a "viewpoint relation" VPt such that C1 is a subtype of C2 according to VPt. VPt is a second-order conceptual relation of signature (TYPE, TYPE). Then C2 is a basic concept type and C1 a v-oriented concept type.
• Definitions: basic and v-oriented concept.
- Let Tc be a basic concept type, and r an individual or generic referent. [Tc:r] is an instantiation of the basic concept type Tc; it is called a "basic concept".
- Let To be a v-oriented concept type, and r an individual or generic referent. [To:r] is an instantiation of the v-oriented concept type To; it is called a "v-oriented concept".
• We also introduced a new relation, called "Repr", which links a concept to one of its representations. Repr is a first-order relation. It can be expressed only if the two concept types are linked through a viewpoint relation in the viewpoint knowledge base. Two concepts linked by a "representation relation" describe the same object. We also detailed different second-order relations (equivalence, inclusion, exclusion) between concept types, which can help to manage the representation relation or to make deductions. Those relations depend on the representation relation.
Our aim was to define viewpoints to help knowledge representation with conceptual graphs for multi-expert knowledge acquisition, and also to obtain an accessible and evolvable knowledge base of conceptual graphs through viewpoints. Our approach also aims at comparing expertises by defining second-order relations between concept types, such as equivalence, inclusion or exclusion. According to the definition in the previous section, we describe the viewpoint relation by a type definition and express sufficient conditions for the use of this relation.
Definition of a viewpoint relation:
necessary conditions for Vpt(TC1,TC2): [TYPE:TC1] -> (<) -> [TYPE:TC2]
sufficient conditions for Vpt(TC1,TC2): [ρVpt] -> (described_by) -> [Person:*x], [Objective:*y], [Domain:*z]
The different concept types person, objective and domain have a definition in the CE context, but their definition is not always the same. So we express necessary conditions for the definition of those objects:
necessary conditions for Person(x): [Person:*x] -> (attr) -> [Name]; -> (attr) -> [Situation]; -> (attr) -> [Competence/Level of competence]
necessary conditions for Domain(x): [Domain:*x] -> (attr) -> [Name]; -> (attr) -> [Activity]
necessary conditions for Objective(x): [Objective:*x] -> (attr) -> [Object]; -> (attr) -> [Focus]
In the context of a CE project, this proposition for viewpoints is not sufficient, so we have to clarify this integration of viewpoints and our aim in using viewpoints.
Second step in characterization of viewpoints and use with conceptual graphs. The integration of viewpoints proposed in [10] is based on a simple definition of viewpoints: "a viewpoint is the explicit expression of a particular subtype relation existing between two concept types". This definition takes into account the difficulty of elaborating a unique model using different terminologies. It allows us to describe a complex object under different perspectives, and offers the possibility of comparing two descriptions by verifying whether they use equivalent concept types. But our aim is to compare two descriptions at the same level of abstraction, written from different angles, but focusing on the same problem or task.
The determination of such graphs is not easy with viewpoints and terminology alone. In fact, it is very difficult to automatically determine the viewpoint of a description (represented by a conceptual graph) from the concept types used. So we must distinguish viewpoints used in descriptions of complex objects from viewpoints used in descriptions of an expertise, a proposition or a belief.

[Fig. 3. Example of description and expertise viewpoints. Description viewpoints relate [TYPE: Computer] to v-oriented subtypes under several viewpoint relations: (Vpt_exploitation_system) to PC, Computer_unix, Computer_VMS; (Vpt_protocol_network) to Computer_IPX, Computer_TCP-IP, Computer_PPP; (Vpt_usability_in_network) to Server, client_terminal, client_Computer. Expertise viewpoints relate [TYPE:topology_of_network] to [TYPE:cable_topology] via (Vpt_electronics_engineer) and to [TYPE:Machine_topology] via (Vpt_network_designer). Complex objects are described through Repr links, e.g. [Computer:primo] -> (Repr) -> [Computer_unix:primo], [Computer_TCP-IP:primo], [Server:primo]; [Computer:M2] -> (Repr) -> [Computer_unix:M2], [Computer_TCP-IP:*], [client_terminal:M2]. The topologic description of the network by the network designer gives the definition graph of Machine_topology(x): [Machine_topology:*x] -> (composed_by) -> [Computer_Unix], [Server:primo]; [Machine_topology:*x] -> (nb_Computer) -> [Number:10].]
We consider our first definition of viewpoint, dedicated to the description of complex objects, as "viewpoint of description", and we clarify in this paper the notion of "viewpoint of expertise". The distinction is essential: it determines the nature of the knowledge that we index with viewpoints, and the possibilities of each kind of viewpoint in terms of knowledge management.
Definition: projection operation (as a reminder). Let G and G' be two conceptual graphs. The viewpoint projection of G into G' is an application π: G → G', where πG is a subgraph of G', such that:
- For each concept c in G, πc is a concept in G' and type(πc) is a subtype of type(c).
- For each conceptual relation r in G, type(πr) = type(r). If the ith arc of r is linked to a concept c in G, the ith arc of πr must be linked to πc in G'.
Definition: viewpoints of expertise (see the example in Fig. 3.). Suppose the experts have a common objective, described by a basic concept type F1. Their expertises, according to this common objective F1, will be expressed by v-oriented subtypes of F1, according to the expertise viewpoints V1, ..., Vn. An expertise viewpoint relation V is a viewpoint relation where the concept type Objective is replaced by F1 in its definition:
sufficient conditions for expertise_Vpt(F1,TC): [ρVpt] -> (described_by) -> [Person:*x], [F1:*y], [Domain:*z]
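The projection operation recalled above can be sketched as a brute-force search for concept mappings. This is an illustrative toy, not the paper's implementation: the lattice, the graph encoding (concept type list plus relations as (type, argument-index) pairs) and all names are our assumptions:

```python
from itertools import product

# Toy concept type lattice (hypothetical): child -> set of supertypes.
LATTICE = {"Computer_unix": {"Computer"}, "Server": {"Computer"}}

def subtype(t1, t2):
    """t1 <= t2 in the lattice (reflexive; one level deep for the toy)."""
    return t1 == t2 or t2 in LATTICE.get(t1, set())

def projections(g, g_prime):
    """Enumerate projections pi: g -> g_prime per the definition above:
    each concept maps to a concept of a subtype, and each relation maps
    to a relation of the same type whose i-th arc links the image of the
    i-th argument. A graph is (concept_types, relations), each relation
    a (rel_type, list_of_concept_indices) pair."""
    cs, rs = g
    cs2, rs2 = g_prime
    for pi in product(range(len(cs2)), repeat=len(cs)):
        if all(subtype(cs2[pi[i]], cs[i]) for i in range(len(cs))) and \
           all(any(rt2 == rt and args2 == [pi[i] for i in args]
                   for rt2, args2 in rs2)
               for rt, args in rs):
            yield pi

query = (["Computer"], [])                 # [Computer:*]
base  = (["Server", "Computer_unix"], [])  # two candidate images
print(list(projections(query, base)))      # both concepts are subtypes of Computer
```

The exhaustive enumeration is exponential; real CG systems use backtracking over relation neighbourhoods instead, but the subtype and arc conditions checked are the ones in the definition.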
4 Methodology to Build and Manage a CE Project Memory with CG and Viewpoints
As we said in Section 2, a part of a project memory in CE must contain all the states of the artefact during the project. So our first interest is to represent an artefact. According to the definition of viewpoints in CE and the integration of such viewpoints in the conceptual graph formalism, we propose a method to build the artefact with a multi-view approach, and a method to facilitate the management (information retrieval and update) of this part of the corporate memory. We also propose algorithms for consistency checking.
4.1 Methodology to Build the Artefact
To describe a design object or product, we must follow several steps: a) decompose the object into several components, b) describe each component, c) describe the relations among components and their interactions. [12] proposes a multi-view product model based on the life cycle of the product, in which each object is described under several views, each view by only one expert. The different views or steps in the life cycle are "skeletal structure or topologic view", "geometric view", "manufacturing view", "material view", "thermal view", "mechanical view" and "dynamic/kinematic view". According to our definition of viewpoint, those views represent the different focuses or objectives taken by experts to describe objects. Each expert can also have his own description according to the focus. This model assumes only one expert for each view, but we can generalize it and apply it to several experts on one view. Indeed, two experts can be in the same domain and express their descriptions under the same focus, yet produce two different descriptions corresponding to their levels of competence and experience in the domain (see Fig. 5. for an example).
Method to represent the artefact with viewpoints and CG (Fig. 4.):
❶ The different views above constitute the different focuses of description that all participants can take, so we can characterize the objective of all "expertise viewpoints" by the different steps of the life cycle of the product. Declaration of concept types in the lattice:
• introduction of all basic concept types corresponding to all components of the object,
• introduction of all v-oriented concept types corresponding to the different expertises for each focus on the components.
❷ Construction of the CG for the decomposition of the product.
❸ Introduction of all expertise viewpoints (according to the life cycle and the participants).
❹ Description, through second-order conceptual graphs, of all viewpoint relations existing between basic concept types and v-oriented concept types.
➎ Definition via a CG of all v-oriented concept types.
➏ Instantiation of the concept types corresponding to the different objects and their different description(s), to express interactions or relations between components.

[Fig. 4. Steps ❶–➏ in the building of the artefact, situated between the support (concept type lattice, set of relations: ❶ ❸ ➎), the viewpoint management (viewpoint base: ❹; instantiation base: ➏) and the knowledge base of CG (❷ ➏).]

[Fig. 5. Example of a part of an artefact represented with viewpoints and CGs. Step ❹, description of viewpoint relations and expertise viewpoint relations — description viewpoints specialize [TYPE:structural_element] under (Vpt_mechanic), (Vpt_shape) and (Vpt_geometric) into [TYPE:arrow], [TYPE:Beam] and [TYPE:parallelepiped]; expertise viewpoints (Vpt_geometric&E1, Vpt_topologic&E2, Vpt_topologic&E3, Vpt_thermal&E4, Vpt_thermal&E1, Vpt_material&E3, Vpt_material&E1) relate the description types (Description_geometric_tree, Description_topologic_tree, Description_thermal_tree, Description_material_tree) to the v-oriented types (Tree_geometric, Tree_skeletal, Tree_skin, Tree_flux, Tree_surface, Tree_mechanic, Tree_thermal). Step ➎, definition via a CG of the v-oriented concept types, e.g. the definition graph of Tree_geometric(x): [Tree:*x] -> (Composed_of) -> [cylinder] -> (has_for) -> [size:*y]; [Tree:*x] -> (Composed_of) -> [parallelepiped] -> (has_for) -> [quotation:*z].]
Algorithms for consistency checking during the building of the artefact. In this section, we present two of several algorithms (the sub-processes of the principal programs are not described in this paper) for checking consistency during the building of the artefact. We cannot deal with all cases in this paper, so the assumption is that the concept type lattice and the set of relations (conceptual relations, viewpoint relations, expertise relations) are already built. We denote input variables with I: and output variables with O:.
• Creation of a viewpoint relation Vpt between two concept types T1 and T2:
Program create_viewpoint_relation(I: Vpt, T1, T2)
G := [TYPE:T2] -> (Vpt) -> [TYPE:T1]
if T2 < T1 in the concept type lattice
then add G to the viewpoint base
else the viewpoint relation between T2 and T1 cannot be created
EndProgram
• Instantiation of a concept type Tc by a referent ref:
Program instantiation(I: Tc, ref)
if [Tc:ref] is already present in the instantiation base
then ok for the instantiation
else if Tc is a basic concept type
then creation_description(I: Tc, ref, instantiation base)
else if Tc is a v-oriented concept type
then Tb := basic_concept_type_associated(I: viewpoint base)
     if [Tb:ref] is already present in the instantiation base
     then creation_viewpoint_instance(I: Tb, Tc, ref, instantiation base, O: List_types_instantiated)
          creation_other_instantiation(I: Tb, Tc, ref, List_types_instantiated, viewpoint base, instantiation base)
     else add [Tb:ref] <- (Repr) <- [Tc:ref] to the instantiation base
          creation_other_instantiation(I: Tb, Tc, ref, nil, viewpoint base, instantiation base)
else ok for the instantiation
EndProgram
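The two consistency checks above could be sketched as follows. This is an illustrative reading, not the paper's implementation: the storage of the three bases as flat Python collections, and all function names, are our assumptions, and the sub-processes (creation_description, etc.) are collapsed into direct updates:

```python
# Hypothetical stores standing in for the paper's bases.
lattice = {"Computer_unix": {"Computer"}}   # subtype -> set of supertypes
viewpoint_base = []                         # triples for [TYPE:T2]->(Vpt)->[TYPE:T1]
instantiation_base = set()                  # concepts [Tc:ref] and Repr links

def is_subtype(t2, t1):
    return t1 in lattice.get(t2, set())

def create_viewpoint_relation(vpt, t1, t2):
    """Add [TYPE:t2] -> (vpt) -> [TYPE:t1] iff t2 < t1 in the lattice."""
    if is_subtype(t2, t1):
        viewpoint_base.append((t2, vpt, t1))
        return True
    return False  # cannot create a viewpoint relation between t2 and t1

def basic_type_of(tc):
    """Basic concept type associated with v-oriented type tc, if any."""
    for sub, _vpt, sup in viewpoint_base:
        if sub == tc:
            return sup
    return None

def instantiate(tc, ref):
    """Instantiate [tc:ref]; for a v-oriented type, also record the basic
    concept [tb:ref] describing the same object and the Repr link."""
    if (tc, ref) in instantiation_base:
        return  # already instantiated: ok
    instantiation_base.add((tc, ref))
    tb = basic_type_of(tc)
    if tb is not None:  # tc is v-oriented
        instantiation_base.add((tb, ref))
        instantiation_base.add(("Repr", tc, tb, ref))

create_viewpoint_relation("Vpt_exploitation_system", "Computer", "Computer_unix")
instantiate("Computer_unix", "primo")
```

After these two calls, the instantiation base contains [Computer_unix:primo], the basic concept [Computer:primo], and the Repr link between them, which is the invariant the paper's algorithm maintains.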
4.2 Description of the Management of the Artefact in CE
[Fig. 6. Viewpoint management in the CE corporate memory: during individual design and collaborative evaluation, participants interact with the corporate memory through ➊ information retrieval through viewpoints (consultations), ➋ visualization of the artefact with CG, and ➌ update of the artefact (CG base) by integration of design solutions expressed in CG (modifications).]
Method for information retrieval in the artefact. ➊ Information retrieval through viewpoints (Fig. 6.).
In the first task of CE, designers need to consult the current state of the artefact. So we propose to help the designer retrieve information from the artefact representation by using viewpoints. Information retrieval uses the characterization of viewpoints by the objects domain, person and focus to filter information for the user (Fig. 6.). The principle is to use a navigation graph based on those objects in order to automatically filter information and to present only the CGs related to the requested viewpoints. For example, in (Fig. 7.) the characterization by "domain" of the artefact gives four viewpoints: mechanics, electronics, aerodynamics and hydraulics. Each viewpoint gives access to the related information (phase 2 of CG visualization), or to another characteristic existing in viewpoints (in this case, "hydraulics" gives access to three characteristics related to more specific viewpoints, allowing more specific information about the artefact). This navigation¹ helps the user to find appropriate information and prevents him from getting lost in the complex base of CGs.

[Fig. 7. Navigation through viewpoints: consultation of the hydraulics viewpoint of the artefact — the artefact node branches into the mechanics, electronics, aerodynamics and hydraulics viewpoints; hydraulics branches further into pump, hydraulics circuits and brakes.]
Note: when a participant chooses one level in the navigation graph, a concept is selected. It is composed of a concept type (domain, person or objective) and a referent (respectively a reference to the domain, to the person or to the objective).
➋ Visualization of the artefact with CG (Fig. 6.). After using the navigation graph, we can select the conceptual graphs related to the selected set of characteristics (which must correspond to one or several viewpoints). As we said in Section 2.2, the use of conceptual graphs makes direct consultation by the designer easier.
Algorithms (the sub-processes of the principal program are not described in this paper). This algorithm uses only expertise viewpoints:
Program selection_graphs
// Select the concepts characterizing the viewpoint asked for by the user
Selection_in_navigation_graph(O: L_concepts)
// Search, among the definitions of viewpoint relation types, all viewpoint relation types that contain all the selected concepts
Search_viewpoints_associated(I: L_concepts, O: viewpoint_list)
Search_list_of_v_oriented_concept_types_associated(I: viewpoint_list, O: list_v_oriented_concept_types)
Search_graphs_definitions_and_list_of_v_oriented_concepts(I: list_v_oriented_concept_types, O: list_answered_graphs_on_concept_type_definition, list_oriented_concepts)
Search_graphs_in_CG_Base(I: list_oriented_concepts, O: list_answered_graphs)
EndProgram

1. The user can ask for the information to be filtered with all the objects characterizing the viewpoints used in structuring the artefact.
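The selection pipeline above can be condensed, for illustration, into a single filter: keep the viewpoints whose definition contains every selected navigation concept, then collect the graphs they index. This is a hypothetical stand-in for the paper's sub-processes, with our own names and data layout:

```python
def selection_graphs(selected_concepts, viewpoint_defs, cg_base):
    """Return the CGs indexed by every expertise viewpoint whose
    definition contains all the concepts selected in the navigation
    graph (a condensed sketch of Program selection_graphs)."""
    viewpoints = [vp for vp, concepts in viewpoint_defs.items()
                  if set(selected_concepts) <= set(concepts)]
    return [g for vp in viewpoints for g in cg_base.get(vp, [])]

# Hypothetical viewpoint definitions and CG base.
viewpoint_defs = {
    "Vpt_topologic&E2": ["domain:hydraulics", "objective:topology"],
    "Vpt_thermal&E4":   ["domain:thermal", "objective:flux"],
}
cg_base = {"Vpt_topologic&E2": ["graph_of_skeletal_tree"]}

print(selection_graphs(["domain:hydraulics"], viewpoint_defs, cg_base))
```

Selecting a deeper level of the navigation graph simply adds a concept to `selected_concepts`, which narrows the set of matching viewpoints, mirroring the filtering behaviour of Fig. 7.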
Remark: this work is in progress, because the answers of this algorithm are not yet sufficiently organized for the designer, compared to the work realised in Rock [1].
Method for the update of the artefact by integration of design propositions. ➌ Update the artefact by integration of design propositions (Fig. 6.). In the third task of CE (the evaluate task), designers need to integrate the chosen design propositions into the artefact. This integration must be guided by the navigation through the objects characterizing a viewpoint, in order to detect in which part of the artefact the propositions must be integrated. Using description viewpoints, we can detect whether the concept types used to describe a design proposition have equivalents in other viewpoints: in [10] we defined inter-viewpoint relations, such as equivalence or inclusion, between two concept types. Thanks to these inter-viewpoint relations and to the projection operator, we can compare a conceptual graph of the artefact and the conceptual graph of the design proposition. Furthermore, with the different operations on conceptual graphs, we can detect whether the given design proposition graph is a specialization (resp. generalization) of a more generic (resp. specific) graph already present in the artefact. The system can also detect whether the given design proposition graph is the definition of a concept already used in a conceptual graph of the artefact. All those verifications can be done with classic conceptual graph operations and with the new operations created for viewpoint management in conceptual graphs [10]. The use of conceptual graphs and viewpoints facilitates the integration of a design solution into the artefact, by choosing the right viewpoints to integrate the solution and by using operations on conceptual graphs for consistency checking.
Algorithm for comparison of conceptual graphs. In this paper, we focus on the building of a corporate memory with the different states of the artefact.
When we want to integrate a solution into the artefact, to generate a new state of the artefact, we must take into account that the solution is a consensual decision of the participants: the solution is the result of the "evaluate task" in the CE cycle. So an algorithm for the comparison of graphs (such as the one proposed in [3]) is essential to detect whether the solution is more general or more specific than the descriptions in the artefact. Moreover, the extension for viewpoints proposed in this section could be much more efficient in the "evaluate task". So we propose an extension, for viewpoint management, of the algorithm proposed in [3] for the comparison and integration of two conceptual graphs. [3] proposes relations existing between concept types, and also relations existing between two "links", a link being an "elementary link of a conceptual graph rel(C1,...,Cn)". These relations make it possible to categorize the relation existing between two graphs (subgraph or supergraph, contraction and expansion, specialization, generalization, instantiation and conceptualization). Within each of these categories, there are more precise relations that can exist between two graphs. As in [3], we use the notation CG = (C, R, L) for a given conceptual graph, where L is the set of its "elementary links", denoted rel(C1,...,Cn), with rel ∈ R, arity(rel) = n, and, for i ∈ [1..n], Ci = adj(i,rel) ∈ C. In our approach of viewpoints, we propose new relations among the concept vertices of different conceptual graphs and, in the same way, several relations among the elementary links of those graphs.
New relations among concept vertices of different conceptual graphs. Let CG1 = (C1, R1, L1) and CG2 = (C2, R2, L2) be the conceptual graphs to be compared. We define the following additional binary relations on C1 × C2 and redefine some of the relations described in [3]:
• is_specialized_viewpoint(C1,C2) iff type(C1) < type(C2) ∧ referent(C1) = referent(C2) ∧ C1 is a v-oriented concept ∧ C2 is a basic concept ∧ ∃G in the instantiation base such that G: [C1]->(Repr)->[C2]
• is_generalized_viewpoint(C1,C2) iff is_specialized_viewpoint(C2,C1)
• is_equivalent_concept(C1,C2) iff type(C1) ∪ type(C2) ≠ ⊤ ∧ ∃G in the viewpoint base such that G: [TYPE:type(C1)]->(equiv)->[TYPE:type(C2)]
• is_inclusion_concept(C1,C2) iff type(C1) ∪ type(C2) ≠ ⊤ ∧ ∃G in the viewpoint base such that G: [TYPE:type(C1)]->(incl)->[TYPE:type(C2)]
• is_exclusion_concept(C1,C2) iff type(C1) ∪ type(C2) ≠ ⊤ ∧ ∃G in the viewpoint base such that G: [TYPE:type(C1)]->(excl)->[TYPE:type(C2)]
• is_more_generalized_concept(C1,C2) iff is_generalization(C1,C2) ∨ is_generalized_viewpoint(C1,C2) ∨ is_generalization&conceptualization(C1,C2)
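A few of the concept-vertex relations above could be sketched as predicates over the bases introduced earlier. This is an illustrative encoding under our own assumptions (flat tuple stores, toy data); the type-lub guard (type(C1) ∪ type(C2) ≠ ⊤) is omitted for brevity:

```python
# Hypothetical stores: the lattice gives v-oriented -> {basic supertypes};
# the instantiation base holds Repr links; the viewpoint base holds the
# second-order (equiv) graphs between concept types.
lattice = {"Tree_geometric": {"Tree"}}
instantiation_base = {("Repr", "Tree_geometric", "Tree", "oak1")}
viewpoint_base = {("equiv", "Tree_skeletal", "Tree_surface")}

def is_specialized_viewpoint(c1, c2):
    """c1 = (v-oriented type, referent), c2 = (basic type, referent):
    same referent, type(c1) < type(c2), and a Repr graph links them."""
    (t1, r1), (t2, r2) = c1, c2
    return r1 == r2 and t2 in lattice.get(t1, set()) \
        and ("Repr", t1, t2, r1) in instantiation_base

def is_generalized_viewpoint(c1, c2):
    return is_specialized_viewpoint(c2, c1)

def is_equivalent_concept(c1, c2):
    """An (equiv) second-order graph links the two types, either way."""
    return ("equiv", c1[0], c2[0]) in viewpoint_base \
        or ("equiv", c2[0], c1[0]) in viewpoint_base

print(is_specialized_viewpoint(("Tree_geometric", "oak1"), ("Tree", "oak1")))
print(is_equivalent_concept(("Tree_skeletal", "x"), ("Tree_surface", "y")))
```

Note how is_specialized_viewpoint requires an explicit Repr instance in the base: the subtype relation alone is not enough, the two concepts must have been recorded as describing the same object.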
New relations among elementary links of different conceptual graphs. Let CG1 = (C1, R1, L1) and CG2 = (C2, R2, L2) be the conceptual graphs to be compared. We define the additional kinds of relations possible² between the elementary links of CG1 and CG2, respectively denoted link1 = rel1(C11...C1n) and link2 = rel2(C21...C2n), where rel1 and rel2 have the same arity:
• is_concept_total_viewpoint_specialization(Link1,Link2) iff type(rel1) = type(rel2) ∧ ∀i∈[1..n], is_specialized_viewpoint(adj(i,rel1), adj(i,rel2))
• is_concept_partial_viewpoint_specialization(Link1,Link2) iff type(rel1) = type(rel2) ∧ ∀i∈[1..n], (is_specialized_viewpoint(adj(i,rel1), adj(i,rel2)) ∨ is_specialization(adj(i,rel1), adj(i,rel2)) ∨ is_same_concept(adj(i,rel1), adj(i,rel2))) ∧ ∃i∈[1..n], is_specialized_viewpoint(adj(i,rel1), adj(i,rel2))
• is_concept_total_viewpoint_generalization(Link1,Link2) iff is_concept_total_viewpoint_specialization(Link2,Link1)
• is_concept_partial_viewpoint_generalization(Link1,Link2) iff is_concept_partial_viewpoint_specialization(Link2,Link1)
• is_concept_total_equivalence(Link1,Link2) iff type(rel1) = type(rel2) ∧ ∀i∈[1..n], is_equivalent_concept(adj(i,rel1), adj(i,rel2))
• is_concept_total_inclusion(Link1,Link2) iff type(rel1) = type(rel2) ∧ ∀i∈[1..n], is_inclusion_concept(adj(i,rel1), adj(i,rel2))
• is_concept_exclusion(Link1,Link2) iff type(rel1) = type(rel2) ∧ ∃i∈[1..n], is_exclusion_concept(adj(i,rel1), adj(i,rel2))
• is_concept_partial_equivalence(Link1,Link2) iff type(rel1) = type(rel2) ∧ ∀i∈[1..n], (is_equivalent_concept(adj(i,rel1), adj(i,rel2)) ∨ is_same_concept(adj(i,rel1), adj(i,rel2))) ∧ ∃i∈[1..n], is_equivalent_concept(adj(i,rel1), adj(i,rel2))
• is_concept_partial_inclusion(Link1,Link2) iff type(rel1) = type(rel2) ∧ ∀i∈[1..n], (is_inclusion_concept(adj(i,rel1), adj(i,rel2)) ∨ is_same_concept(adj(i,rel1), adj(i,rel2))) ∧ ∃i∈[1..n], is_inclusion_concept(adj(i,rel1), adj(i,rel2))

2. In this article we do not define the same relations if type(rel1)
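All the "total" and "partial" link relations above share the same quantifier shape: the same relation type, the predicate on every argument pair (total), or the predicate-or-identity on every pair plus the predicate on at least one pair (partial). That shared shape could be factored out as follows; this is our own illustrative factoring with hypothetical names, not a definition from the paper:

```python
def total(pred, link1, link2):
    """Shape of the is_concept_total_* relations: equal relation types
    and pred holding on every pair of corresponding arguments."""
    (rel1, args1), (rel2, args2) = link1, link2
    return rel1 == rel2 and all(pred(a, b) for a, b in zip(args1, args2))

def partial(pred, link1, link2):
    """Shape of the is_concept_partial_* relations: pred or identity on
    every argument pair, and pred on at least one of them."""
    (rel1, args1), (rel2, args2) = link1, link2
    pairs = list(zip(args1, args2))
    return rel1 == rel2 \
        and all(pred(a, b) or a == b for a, b in pairs) \
        and any(pred(a, b) for a, b in pairs)

# Toy equivalence predicate between concept labels (hypothetical pair).
equiv = lambda a, b: {a, b} == {"Tree_skeletal", "Tree_surface"}

l1 = ("composed_by", ["Tree_skeletal", "Beam"])
l2 = ("composed_by", ["Tree_surface", "Beam"])
print(total(equiv, l1, l2))    # the identical "Beam" pair fails pred
print(partial(equiv, l1, l2))  # identity on "Beam", pred on the first pair
```

Instantiating `pred` with is_specialized_viewpoint, is_equivalent_concept or is_inclusion_concept yields the corresponding total/partial relation from the list above.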
New relations among conceptual graphs. From the definition of the relations between elementary links, we can define new relations between conceptual graphs:
• The equivalence relation can be exploited for graphs expressing the same propositions with different vocabularies.
• The inclusion relation can be exploited for expressing the inclusion of a proposition in another.
• The exclusion relation can help to indicate two incompatible graphs.
• The definition of specialization through viewpoints can be included in the relations "concept total specialization" and "concept partial specialization".
All those relations can help in the detection of conflicts between design propositions in the "evaluate task" of the CE process.
• CG2 is "a totally equivalent graph" of CG1 iff ∃ a graph morphism (hc: C2→C1, hr: R2→R1, ha: L2→L1) from CG2 to CG1 such that ∀link2∈L2, is_concept_total_equivalence(ha(link2), link2)
• CG2 is "a partially equivalent graph" of CG1 iff ∃ a graph morphism (hc: C2→C1, hr: R2→R1, ha: L2→L1) from CG2 to CG1 such that ∀link2∈L2, (is_concept_partial_equivalence(ha(link2), link2) ∨ is_same_link(ha(link2), link2)) ∧ ∃link2∈L2 such that is_concept_partial_equivalence(ha(link2), link2)
• CG2 is "a totally included graph" of CG1 iff ∃ a graph morphism (hc: C2→C1, hr: R2→R1, ha: L2→L1) from CG2 to CG1 such that ∀link2∈L2, is_concept_total_inclusion(ha(link2), link2)
• CG2 is "a partially included graph" of CG1 iff ∃ a graph morphism (hc: C2→C1, hr: R2→R1, ha: L2→L1) from CG2 to CG1 such that ∀link2∈L2, (is_concept_partial_inclusion(ha(link2), link2) ∨ is_same_link(ha(link2), link2)) ∧ ∃link2∈L2 such that is_concept_partial_inclusion(ha(link2), link2)
• CG2 is "an exclusion graph" of CG1 iff ∃ a graph morphism (hc: C2→C1, hr: R2→R1, ha: L2→L1) from CG2 to CG1 such that ∃link2∈L2, is_concept_exclusion(ha(link2), link2)
Strategies for the integration of a proposition in the artefact. When all the relations between elementary links are given, [3] proposes several strategies to integrate the two compared conceptual graphs. In the artefact, the information or knowledge must be as precise as possible. So we detail the different cases of relations that can exist between two conceptual graphs:
• If there is one of the different relations of specialization detailed in [3], then we apply the "strategy of the highest direct specialization": if the proposition is more precise than a description in the artefact and uses more precise expressions, we prefer to restrict what was expressed in the artefact.
• If there is one of the different relations of instantiation, then we apply the "strategy of the highest direct instantiation".
• If there is a relation of equivalence or partial equivalence, then we can keep the two graphs in the different viewpoints they belong to, or choose one of them.
• If there is a relation of inclusion or partial inclusion, we keep the graph that includes the other, because the information of the included graph is already present in it.
• If there is an exclusion relation, the proposition is not valid. This case cannot happen in this context of integration of a solution into the artefact, but it is useful in the "evaluate task".
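The strategy choices above amount to a dispatch on the detected relation between the proposition graph and an artefact graph. The following is an illustrative sketch with our own names; the relation labels and the convention that the artefact graph is the including one are assumptions, not the paper's specification:

```python
def integrate(relation, proposition, artefact_graph):
    """Choose which graph(s) to keep in the new artefact state, given
    the relation detected between the design-proposition graph and an
    artefact graph (hypothetical labels for the cases listed above)."""
    if relation in ("specialization", "viewpoint_specialization"):
        return [proposition]                  # highest direct specialization
    if relation == "instantiation":
        return [proposition]                  # highest direct instantiation
    if relation in ("equivalence", "partial_equivalence"):
        return [proposition, artefact_graph]  # keep both, one per viewpoint
    if relation in ("inclusion", "partial_inclusion"):
        return [artefact_graph]               # the including graph subsumes the other
    if relation == "exclusion":
        raise ValueError("incompatible graphs: proposition not valid")
    return [proposition, artefact_graph]      # unrelated: keep both

print(integrate("specialization", "g_new", "g_old"))
```

In the "evaluate task", the exclusion branch would be used to flag a conflict between two design propositions rather than raised as an error.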
5 Conclusion In this paper, we present an approach to the construction of a corporate memory in CE, in taking account the different steps of the artefact during the design process. We propose a representation of artefact with viewpoints and conceptual graphs based on previous work on CG and an adaptation of viewpoints and viewpoints management to this problem. We introduce the notion of description viewpoints, and expertise viewpoints. We propose several algorithms (not detailed in this paper)for the building and maintain of the CG base representing artefact. Related work are already done in Design Rationale: [9] propose to use case librairies to represent past experiences and [6] focused on the usability of design rationale documents, but they do not use the conceptual graph formalism. Our approach using viewpoints and CGs can be extend to the different elements of a corporate memory detailed in [6] and it take care of the variety of information sources and the context in which knowledge or information must be understood. The implementation of this work is in progress and realized with the COGITO platform. In further work we have to organize answers of algorithms according to the different levels of viewpoints. Our aim is to propose a support system allowing the management of a CE project memory, based first on the artefact, but also must be
extended to the proposition (for the Design Rationale part) of [11], i.e. take into account not only the history of the artefact but also the different possible choices during the CE process, represented by the different design propositions. In this way, we could use the comparison algorithm efficiently to see the differences between several propositions.
References
1. Carbonneill, B., Haemmerlé, O., Rock: Un système de question/réponse fondé sur le formalisme des graphes conceptuels, In actes du 9ième Congrès Reconnaissance des Formes et Intelligence Artificielle, Paris, p. 159-169, 1994.
2. Cointe, C., Matta, N., Ribière, M., Design Propositions Evaluation: Using Viewpoint to manage Conflicts in CREoPS2, In Proceedings of ISPE/CE, Concurrent Engineering: Research and Applications, Rochester, August 1997.
3. Dieng, R., Hug, S., MULTIKAT, a Tool for Comparing Knowledge of Multiple Experts, In Proceedings of ICCS'98, Ed. Springer Verlag, Montpellier, France, August 1998.
4. Finch, I., Viewpoints - Facilitating Expert Systems for Multiple Users, In Proceedings of the 4th International Conference on Database and Expert Systems Applications, DEXA'93, Ed. Springer Verlag, 1993.
5. Gerbé, O., Conceptual Graphs for Corporate Knowledge Repositories, In Proceedings of ICCS'97, Ed. Springer-Verlag, Seattle, Washington, USA, August 1997.
6. Karsenty, L., An empirical evaluation of design rationale documents, Electronic Proceedings of CHI'96, [http://www.acm.org/sigchi/chi96/proceedings/papers/Karsenty/lk_txt.htm], 1996.
7. Leite, J., Viewpoints on viewpoints, In Proceedings of Viewpoints 96: An International Workshop on Multiple Perspectives in Software Development, San Francisco, USA, 14-15 October 1996.
8. Marino, O., Rechenmann, F., Uvietta, P., Multiple perspectives and classification mechanism in Object-oriented Representation, In Proceedings of the 9th ECAI, Stockholm, Sweden, p. 425-430, Pitman Publishing, London, August 1990.
9. Prasad, M.V.N., Plaza, E., Corporate Memories as Distributed Case Libraries, In Proceedings of the 10th Knowledge Acquisition for Knowledge-based Systems Workshop, Banff, Alberta, Canada, November 9-14, p. 40-1 to 40-19, 1996.
10. Ribière, M., Dieng, R., Introduction of viewpoints in Conceptual Graph Formalism, In Proceedings of ICCS'97, Ed. Springer-Verlag, Seattle, USA, August 1997.
11. Ribière, M., Matta, N., Guide for the elaboration of a Corporate Memory in CE, submitted to the 5th European Concurrent Engineering Conference, Erlangen-Nuremberg, Germany, April 26-29, 1998.
12. Tichkiewitch, S., Un modèle multi-vues pour la conception intégrée, in Summer School on "Entreprises communicantes: Tendances et Enjeux", Modane, France, 1997.
13. Sowa, J.F., Conceptual Structures: Information Processing in Mind and Machine, Addison-Wesley, Reading, 1984.
WebKB-GE — A Visual Editor for Canonical Conceptual Graphs
S. Pollitt (1), A. Burrow (1), and P.W. Eklund (2)
(1) Department of Computer Science, University of Adelaide, Australia 5005, sepollitt/[email protected]
(2) School of Information Technology, Griffith University, Parklands Drive, Southport, Australia 9276, [email protected]

Abstract. This paper reports a CG editor implementation which uses canonical formation as the direct manipulation metaphor. The editor is written in Java and embedded within the WebKB indexation tool. The user's mental map is explicitly supported by a separate representation of a graph's visual layout. In addition, co-operative knowledge formulation is supported by network-aware work-sharing features. The layout language and its implementation are described, as well as the design and implementation features.
1
Introduction
Display form conceptual graphs (CGs) provide information additional to the graph itself. An editing tool should therefore preserve layout information. For aesthetic reasons a regular layout style is also preferred. However, one consideration of good CG layout (as opposed to a general graph layout) is that understandability is the primary goal rather than attractiveness [2]. The editor we describe (WebKB-GE) limits manipulation on the graph to the canonical formation rules [16] (copy, restrict, join, simplify). Atomic graphs are also canonical and therefore any CG constructed using WebKB-GE will be canonical. WebKB [12] is a public domain experimental knowledge annotation toolkit. It allows indices of any Document Elements (DEs) on the WWW to be built using annotations in CGs. This permits the semantic content, and relationships to other DEs, to be precisely described. Search is initiated remotely, via a WWW-browser and/or a knowledge engine. This enables construction of documents using inference within the knowledge engine to assemble DEs. Additionally, the knowledge base provides an alternate index through which both query and direct hyperlink navigation can occur. WebKB has been built using Javascript and Java for the WWW-based interface and C and C++ for the inference engines. One of the goals of the WebKB toolkit is to aid Computer Supported Co-operative Work (CSCW). WebKB-GE is integrated into WebKB and therefore multi-user/distributed features are implemented.
2
Design Goals
WebKB-GE is designed to be used by domain experts in a distributed cooperative environment. This means: (i) domain-dependent base languages must be distributed; (ii) co-operation depends on a shared understanding of a base language; (iii) domain experts are not necessarily experts in CG theory; (iv) large, collaborative domain knowledge bases are difficult to navigate; (v) a medium for collaborative communications must be provided. The design of WebKB-GE supports the construction of accurate well-formed CGs, allowing the user to experience a CG canon's expressiveness. This is achieved through a direct manipulation interface. The properties of a graph's depiction are explicitly stored between sessions. WebKB-GE is designed to operate as a client tool in a distributed environment.

M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 111-118, 1998. © Springer-Verlag Berlin Heidelberg 1998

2.1
Direct Manipulation
Direct manipulation (DM) allows complex commands to be activated by direct and intuitive user actions. DM is the visibility of the object of interest; rapid, reversible, incremental actions; and replacement of complex command language by direct manipulation of the object of interest [15]. It should allow the user to form a model of the object represented in the interface [4]. A well recognised subclass of DM interfaces is the graphical object editor [17] where the subject is edited through interaction with a graphical depiction. Unidraw [18] is a toolkit explicitly designed to support the construction of graphical object editors. WebKB-GE is an example of a graphical object editor handling CGs. Central to the DM interface is the Editing/Manipulation Pane. This contains a number of Objects manipulated by Tools. Relations between objects are also indicated. Objects provide visual representations of complex entities. The Concepts and Relations in CGs are graphical objects in WebKB-GE. A concept may contain additional information (an individual name for example) but only the Type is displayed in the visual representation. A palette of tools is provided to manipulate objects. The behaviour of a tool may be a function of the manipulated object, so each tool expresses an abstract operation. “Operation animation” is an essential feature of a DM interface. An operation like “move” allows the user to see the object dragged from the start to the finish point. Visual feedback about the success or failure of an operation must be provided. 2.2
Canonical Graphs
WebKB is aimed at the domain expert. It is important to restrict the graphs to those derivable from a canon. To ensure canonical graphs, the only operations allowed are canonical formation rules [16]; (i) Copy – a copy of a canonical graph is canonical; (ii) Restrict – a more general type may be restricted to a more specific type (as defined in the Type hierarchy). Also, a generic concept type may be replaced by an individual object of that type; (iii) Join – two canonical sub-graphs containing an identical concept are joined at that concept; (iv) Simplify – when a relation is duplicated between identical concepts the duplicates are redundant and removed. A distinction is made between operations that affect the graph and operations that affect the representation of the graph. Each of the four canonical operations operate on the underlying graph and the visual representation is updated accordingly. These are the only operations allowed on the graph itself. Operations on the representation of the graph, such as a “Move” operation, are also allowed.
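The restrict rule in particular depends only on the type hierarchy. A minimal sketch of that check is shown below; the class, the toy hierarchy, and the method names are illustrative assumptions, not WebKB-GE's actual classes:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the Restrict formation rule: a concept's type may only be
// replaced by a more specific type according to the type hierarchy.
// The hierarchy fragment and all names here are illustrative.
public class RestrictSketch {
    // child type -> parent type (a tiny fragment of a type hierarchy)
    static final Map<String, String> parent = new HashMap<>();
    static {
        parent.put("Cat", "Animal");
        parent.put("Animal", "Entity");
    }

    // true iff 'specific' equals 'general' or is a descendant of it
    static boolean isSubtype(String specific, String general) {
        for (String t = specific; t != null; t = parent.get(t)) {
            if (t.equals(general)) return true;
        }
        return false;
    }

    // Restrict returns the new type if the restriction is legal,
    // otherwise the concept keeps its current type.
    static String restrict(String currentType, String newType) {
        return isSubtype(newType, currentType) ? newType : currentType;
    }

    public static void main(String[] args) {
        System.out.println(restrict("Animal", "Cat"));   // legal restriction
        System.out.println(restrict("Cat", "Animal"));   // illegal: unchanged
    }
}
```

The same subtype test would also gate a join, since only identical concepts may be joined after restriction.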
Fig. 1. Visual feedback in a WebKB-GE join operation. The left hand side shows an unsuccessful join. The system starts in a stable state (a); a concept is dragged in (b). The join is invalid so the dragged object turns red. The mouse is released at this point. The operation is undone by snapping the concept back to the previous position (c). The right hand side shows a successful join. The system starts in (d); the lower left concept is moved towards an identical concept (e). The mouse is released and the two concepts snap together as the join is performed (f).
2.3
Distributed Multi-User Application
CSCW tools such as WebKB allow members of a group to access a shared canon. The canon is received each time the user starts the application to ensure changes are propagated from the central site. The server is not fixed; the user chooses from several. Graphs created by distributed users should be available to the other members of the workgroup. The ideal way is for clients to send graphs back to the central server. When users share information it is important that the original creator be acknowledged. These features are implemented in WebKB-GE. 2.4
Layout Retention
In preceding local implementations of CG editors [1,3] only the graph operations were considered important. No display or linear form editors in the literature have the capacity to maintain display form representations [11,13]. In these tools the formation rules are sometimes implemented for graph manipulation, but only the final linear form of the graph is stored between editing sessions. The visual representation of a conceptual graph contains additional meta-information significant to the graph's creator. Layout should be disturbed as little as possible by operations performed on the underlying CG. This allows the user's mental map [14] to be preserved. Additionally, the user will want to alter the visual representation while not altering the underlying graph. The only such operation permitted is moving objects to new spatial locations. These moving operations are constrained: a regularity is imposed by the layout language describing the visual representations. The method chosen for layout storage is implemented according to Burrow and Eklund [2] and described below in Section 3.4.
3
Implementation
3.1 Architecture
A general multiple-server/multiple-client architecture allows communication over a network such as the Internet. The network is not an essential part of the system: both server and client can run on a single machine. The server code controls distribution of a canon to clients. In addition, initial layouts for a canon are sent to a client. The server can handle database access for a shared CG canon. The client requests a copy of a canon from a server, as well as the layout of any new subgraphs the user adds to the editing pane.

3.2 Implementation Language: Java
One important issue is the difference between Java Applets and Java Applications. An Applet is a Java program [5] that runs within a restricted environment, normally a Web browser. This restricted environment does not allow certain operations for security reasons: (i) making a network connection to a machine other than that from which the applet was loaded. This restricts each instance of the client to contacting only a single server; changing servers requires the whole applet to be re-loaded from the new server; (ii) accessing the local disk. The client is unable to read/write to the local disk and unable to save/load CGs locally. This is not a large problem depending on how saving graphs is handled; if all graphs are saved via the server no local access is required. Each time an applet is loaded in a web browser all code for that applet is downloaded from the server to the client machine. This ensures the user is receiving continuously updated software but can be slow if the applet is large. With a Java application, the Java Run-time Environment (JRE) is downloaded separately for the appropriate platform and the application code is executed using the JRE. Applet restrictions do not apply to Java Applications, and for this reason both the client and server code are written as Java Applications. A number of DM toolkits are available for use with Java.
Most provide a layer on top of the Abstract Windowing Toolkit (AWT) to make interface creation straightforward. Some of the more widely known toolkits are subArctic [6], Sgraphics [9] and the Internet Foundation Classes [7]. The DM toolkit used for WebKB-GE is subArctic [6], a constraint-based layout toolkit. At the time WebKB-GE was written it was the most stable of the supported toolkits. It is also available free for both commercial and non-commercial use. Objectspace has created the "Generic Collection Library" (JGL) for Java [8]; this library was also used.

3.3 Communication
For communication between client and server a simple protocol was implemented. Currently only two active operations are implemented: (i) Canon Request, to which the server responds by returning a copy of the canon being served, read from its local disk; (ii) Layout Request, for which the server reads the relation name sent by the client and returns the layout specification for that relation. Additional operations, such as canon modifications and database access, can be added to the protocol if required. The client application contains two parsers to process information from a server. One parser reads the linear CGs and other information from the canon to build the internal CG data store. The second reads layout and builds the visual representation of each graph.
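The two-operation protocol can be pictured as a simple request dispatcher. The sketch below is an illustration only: the request keywords, the reply framing, and the stored data are all assumptions, since the paper does not specify the wire format:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the two-operation client/server protocol described above.
// Request keywords, error replies, and the canned data are illustrative.
public class ProtocolSketch {
    static final String CANON = "type Entity(x) is [!1:UNIVERSAL:*x].";
    static final Map<String, String> layouts = new HashMap<>();
    static {
        layouts.put("Attributive_binaryRel",
            "[HBOX:1] -> (HCONTAINS) -> [ELEMENT:!1].");
    }

    // Dispatch one request line and return the server's reply.
    static String handle(String request) {
        if (request.equals("CANON-REQUEST")) {
            return CANON;  // serve the whole canon
        } else if (request.startsWith("LAYOUT-REQUEST ")) {
            String relation = request.substring("LAYOUT-REQUEST ".length());
            return layouts.getOrDefault(relation, "ERROR unknown relation");
        }
        return "ERROR unknown operation";
    }

    public static void main(String[] args) {
        System.out.println(handle("CANON-REQUEST"));
        System.out.println(handle("LAYOUT-REQUEST Attributive_binaryRel"));
    }
}
```

In the real system the replies would of course be read from the server's disk and streamed over a socket rather than returned from an in-memory map.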
3.4 Layout Language
The layout language devised to display CGs is the feature that differentiates this editor from others [11,13,3]. The language originates in Kamada and Kawai [10], who developed a constraint-based layout language (COOL) to generate a pictorial representation of text. Graphical relations in COOL are either geometric relations (constraints between variables characterising the objects) or drawing relations (lines and labels on and between objects). Specifying layout is performed in two stages: (i) constraining the (defined) reference points to lie on a geometric curve (line, arc, spline); (ii) connecting the reference points of the object by a geometric curve with an arrowhead. Burrow and Eklund [2] devised a canon that represents the visual structure of conceptual graphs in the language of CGs. Following that work, the actual physical locations of objects are not stored in WebKB-GE. Instead, spatial relationships between the objects are captured in the language. A container-based approach is used: all objects must be stored in horizontal and vertical containers. The ordering of the objects within containers is preserved as objects are moved. If a container in either orientation does not exist at the final move location, a new container is created in that direction. Moving the final object out of a container dissolves the container. Horizontal containers are defined to extend to the width of the editing pane with the height of the tallest object contained. Vertical containers are defined to extend to the height of the editing pane with the width of the widest object. In WebKB, as in any co-operative work environment, it is important to record the original source of knowledge and data. Because layout information is saved separately from the linear form, a mapping from linear form to display objects is also required. This is achieved by generating a quasi-unique identifier for every graph component.
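One way such an identifier could be built is sketched below. The paper states only the ingredients (IPv4 address and time-stamp) and the total length of twenty-four digits; the 12+12 digit split used here is an assumption, as are all the names:

```java
// Sketch of a quasi-unique, twenty-four digit identifier built from an
// IPv4 address and a time-stamp. The 12+12 digit split is an assumption;
// the paper only states the ingredients and the total length.
public class IdSketch {
    static String makeId(String ipv4, long timestampMillis) {
        StringBuilder sb = new StringBuilder();
        // Four octets, each zero-padded to three digits: 12 digits total.
        for (String octet : ipv4.split("\\.")) {
            sb.append(String.format("%03d", Integer.parseInt(octet)));
        }
        // Keep the low 12 decimal digits of the time-stamp.
        sb.append(String.format("%012d", timestampMillis % 1_000_000_000_000L));
        return sb.toString();
    }

    public static void main(String[] args) {
        String id = makeId("192.168.0.17", 884217600000L);
        System.out.println(id);            // a 24-digit string
        System.out.println(id.length());   // 24
    }
}
```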
The identifier is created using the Internet Protocol (IP) address of the machine on which the graph was created along with a time-stamp. This results in a twenty-four digit identifier. Inside the client application the linear and display forms of the graph are stored and manipulated separately. The two forms are synchronised using the twenty-four digit identifier discussed above. The abstract (linear) graph information is stored in a series of data structures descending from the editing pane. The canon currently in use is also stored in the editing sheet and constructed from the Type and Relation hierarchies. Each entry in the Relation hierarchy contains a graph describing the concept types to be connected by the relation. Both the graphs in relation definitions and user-constructed graphs have their abstract information stored internally in a CG map, one for each graph. The graphical layout is stored in a series of containers, with the top level being the "Container Constrainer". This handles alignment of horizontal and vertical containers and creates and destroys containers as needed. Each container is responsible for managing the layout of the objects contained within it. When objects change their dimensions or are added and removed, the container resizes appropriately. The code ensures containers are not too close together. Each object in the graphical representation maintains a connection with the abstract graph object it displays. When graph operations are targeted on a graphical object the abstract object is
retrieved. Links between abstract graph objects (edges) are not stored in object containers but maintained within the user interface. Due to the simplicity of the layout language there are only a very small number of definitions that may appear.

[HBOX:hboxnum] -> (HCONTAINS) -> [ELEMENT:!uniqueid].
[VBOX:vboxnum] -> (VCONTAINS) -> [ELEMENT:!uniqueid].

hboxnum and vboxnum denote the box into which to insert the element. The box numbers do not have to indicate any sort of ordering. The uniqueid is the placeholder for the corresponding element in the linear description.

[ELEMENT:!uniqueid] -> (ELTRIGHT) -> [ELEMENT:!uniqueid].
[ELEMENT:!uniqueid] -> (ELTBELOW) -> [ELEMENT:!uniqueid].

The second uniqueid indicates that its element is to the right of (below) the element of the first uniqueid.

[HBOX:hboxnum] -> (BOXBELOW) -> [HBOX:hboxnum].
[VBOX:vboxnum] -> (BOXRIGHT) -> [VBOX:vboxnum].
The second box specified is below (to the right of) the first. With these definitions the relative layout is stored. When a relation is added to the editing pane by the user, or a graph is loaded from a file, the following process occurs:
1. the linear form is loaded into a temporary CG map. This restricts the search space for resolving unique IDs and allows rollback if an error occurs. If a relation from the canon is being added, this occurred when the canon was retrieved;
2. the layout script of the graph is processed and objects are placed in the containers. Container objects are reordered correctly;
3. dummy objects are resolved using the appropriate abstract objects from the linear form. Graphical representations of the links between objects are created and stored in the interface;
4. each graphical container is resized to fit the largest contained object. Containers are assigned an initial starting position which accounts for spacing between the containers. Positions of the graphical links are updated as the containers (and consequently, the objects) move;
5. if the previous stages occur successfully, the layout and abstract objects are merged in the editing pane.

[!1:User]{->(!2:Role)->[!3:WN_expert], ->(!4:Chrc)->[!5:Property]}.

Fig. 2. The augmented linear form of the CG.
Fig. 2 shows the linear form augmented with unique identifiers. The graphical layout script is shown in Fig. 3 and the corresponding screen-shot is shown in Fig. 4. The format of the canon used by the editor is a simple series of definitions giving the concept and relation hierarchies. The lattices containing the definitions must be defined first, with a separate lattice for every relation arity required. For example (from the default canon):
[HBOX:1] -> (HCONTAINS) -> [ELEMENT:!1].
[HBOX:1] -> (HCONTAINS) -> [ELEMENT:!4].
[HBOX:2] -> (HCONTAINS) -> [ELEMENT:!2].
[HBOX:2] -> (HCONTAINS) -> [ELEMENT:!5].
[HBOX:3] -> (HCONTAINS) -> [ELEMENT:!3].
[ELEMENT:!1] -> (ELTRIGHT) -> [ELEMENT:!4].
[ELEMENT:!2] -> (ELTRIGHT) -> [ELEMENT:!5].
[HBOX:1] -> (BOXBELOW) -> [HBOX:2].
[HBOX:2] -> (BOXBELOW) -> [HBOX:3].
[VBOX:1] -> (VCONTAINS) -> [ELEMENT:!1].
[VBOX:1] -> (VCONTAINS) -> [ELEMENT:!2].
[VBOX:2] -> (VCONTAINS) -> [ELEMENT:!4].
[VBOX:2] -> (VCONTAINS) -> [ELEMENT:!5].
[VBOX:2] -> (VCONTAINS) -> [ELEMENT:!3].
[ELEMENT:!1] -> (ELTBELOW) -> [ELEMENT:!2].
[ELEMENT:!4] -> (ELTBELOW) -> [ELEMENT:!5].
[ELEMENT:!5] -> (ELTBELOW) -> [ELEMENT:!3].
[VBOX:1] -> (BOXRIGHT) -> [VBOX:2].
Fig. 3. The layout description of the graph.
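Each statement in a script like Fig. 3 has the same source-relation-target shape, so statement-level parsing is straightforward. The sketch below is illustrative only; WebKB-GE's actual parser is not shown in the paper:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Sketch of a parser for one layout statement such as
//   [HBOX:1] -> (HCONTAINS) -> [ELEMENT:!1].
// Class and method names are illustrative, not WebKB-GE's real API.
public class LayoutParseSketch {
    static final Pattern STMT = Pattern.compile(
        "\\[(\\w+):(!?\\w+)\\]\\s*->\\s*\\((\\w+)\\)\\s*->\\s*\\[(\\w+):(!?\\w+)\\]\\.");

    // Returns {sourceType, sourceId, relation, targetType, targetId}.
    static String[] parse(String line) {
        Matcher m = STMT.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("bad statement: " + line);
        }
        return new String[] {
            m.group(1), m.group(2), m.group(3), m.group(4), m.group(5)
        };
    }

    public static void main(String[] args) {
        String[] p = parse("[HBOX:1] -> (HCONTAINS) -> [ELEMENT:!1].");
        System.out.println(p[2] + ": " + p[0] + ":" + p[1]
                           + " -> " + p[3] + ":" + p[4]);
    }
}
```

A full loader would feed each parsed triple into the container structures described in Section 3.4.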
lattice UNIVERSAL, ABSURD is type : *.
lattice TOP-T1-T1, BOT-T1-T1 is relation ( UNIVERSAL, UNIVERSAL ).

Fig. 4. The top level window with the example graph loaded.
The concept hierarchy is then defined:
type Entity(x) is [!1:UNIVERSAL:*x].
type Situation(x) is [!2:UNIVERSAL:*x].
type Something_playing_a_role(x) is [!3:UNIVERSAL:*x].
Finally, the relation hierarchies are defined:

relation Attributive_binaryRel(x, y) is [!1:UNIVERSAL:*x] -> (!2:TOP-T1-T1) -> [!3:UNIVERSAL:*y].
relation Component_binaryRel(x, y) is [!4:UNIVERSAL:*x] -> (!5:TOP-T1-T1) -> [!6:UNIVERSAL:*y].
relation Spatial_binaryRel(x, y) is [!7:UNIVERSAL:*x] -> (!8:TOP-T1-T1) -> [!9:UNIVERSAL:*y].
Once the linear sections of the canon have been defined, the initial layouts of the relations must be defined. Each relation layout is specified in a file with the naming form:
[HBOX:1] -> (HCONTAINS) -> [ELEMENT:!7].
[HBOX:1] -> (HCONTAINS) -> [ELEMENT:!8].
[HBOX:1] -> (HCONTAINS) -> [ELEMENT:!9].
[ELEMENT:!7] -> (ELTRIGHT) -> [ELEMENT:!8].
[ELEMENT:!8] -> (ELTRIGHT) -> [ELEMENT:!9].
[VBOX:1] -> (VCONTAINS) -> [ELEMENT:!7].
[VBOX:2] -> (VCONTAINS) -> [ELEMENT:!8].
[VBOX:3] -> (VCONTAINS) -> [ELEMENT:!9].
[VBOX:1] -> (BOXRIGHT) -> [VBOX:2].
[VBOX:2] -> (BOXRIGHT) -> [VBOX:3].

4
Conclusion
A visual-form CG editor has been designed and implemented. Editing operations are restricted to the four canonical formation rules. This guarantees well-formed CGs. The key feature of this editor is the use of a graphical scripting language to capture relevant details of the graph layout. This information is stored along with the linear information.
The application is written to be used for co-operative work by network-connected users, and in particular for use in the WebKB indexation toolkit. Only simple graphs are currently supported. The ability to extend the framework to nested graphs, along with extending the layout language with different containers, is inherent in the design of the editor. WebKB and WebKB-GE may be obtained from http://www.int.gu.edu.au/kvo.
References
1. A.L. Burrow. Meta tool support for a GUI for conceptual structures. Hons. thesis, Dept. of Computer Science, www.int.gu.edu.au/kvo/reports/andrew.ps.gz, 1994.
2. A.L. Burrow and P.W. Eklund. A visual structure representation language for conceptual structures. In Proceedings of the 3rd International Conference on Conceptual Structures (Supplement), pages 165-171, 1995.
3. Peter W. Eklund, Josh Leane, and Chris Nowak. GRIT: An Implementation of a Graphical User Interface for Conceptual Structures. Technical Report TR94-03, University of Adelaide, Dept. Computer Science, Feb. 1994.
4. P.W. Eklund, J. Leane, and C. Nowak. GRIT: A GUI for conceptual structures. In Proceedings of the 2nd International Workshop on PEIRCE, ICCS-93, 1993.
5. James Gosling and Henry McGilton. The Java Language Environment: A White Paper. Technical report, Sun Microsystems, 1996.
6. Scott Hudson and Ian Smith. subArctic User Manual. Technical report, GVU Center, Georgia Institute of Technology, 1997.
7. IFC Dev. Guide. Technical report, Netscape Communications Corp., 1997.
8. JGL User Manual. Technical report, Objectspace Inc., 1997.
9. Mike Jones. Sgraphics Design Documentation. Technical report, Mountain Alternative Systems, http://www.mass.com/software/sgraphics/index, 1997.
10. T. Kamada and S. Kawai. A general framework for visualizing abstract objects and relations. ACM Transactions on Graphics, 10(1):1-39, January 1991.
11. R.Y. Kamath and Walling Cyre. Automatic integration of digital system requirements using schemata. In 3rd International Conference on Conceptual Structures ICCS'95, number 954 in LNAI, pages 44-58, Berlin, 1995. Springer-Verlag.
12. P.H. Martin. The WebKB set of tools: a common scheme for shared WWW annotations, shared knowledge bases and information retrieval. In Proceedings of the CGTools Workshop at the 5th International Conference on Conceptual Structures ICCS'97, pages 588-595. Springer Verlag, LNAI 1257, 1997.
13. Jens-Uwe Möller and Detlev Wiesse. Editing conceptual graphs. In Proceedings of the 4th International Conference on Conceptual Structures ICCS'96, pages 175-187, Berlin, 1996. Springer-Verlag, LNAI 1115.
14. George G. Robertson, Stuart K. Card, and Jock D. Mackinlay. Information visualization using 3D interactive animation. CACM, 36(4):57-71, Apr 1993.
15. B. Shneiderman. Direct manipulation. Computer, 16(8):57-68, Aug 1983.
16. J. Sowa. Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, 1984.
17. J. Vlissides. Generalized graphical object editing. Technical Report CSL-TR-90-427, Dept. of Elec. Eng. and Computer Science, Stanford University, 1990.
18. John M. Vlissides and Mark A. Linton. Unidraw: A framework for building domain-specific graphical editors. ACM Trans. on Info. Systems, 8(3):237-268, Jul 1990.
Mapping of CGIF to operational interfaces
A. Puder
International Computer Science Institute, 1947 Center St., Suite 600, Berkeley, CA 94704-1198, USA, [email protected]

Abstract. The Conceptual Graph Interchange Format (CGIF) is a notation for conceptual graphs which is meant for communication between computers. CGIF is represented through a grammar that defines "on-the-wire representations". In this paper we argue that for interacting applications in an open distributed environment this is too inefficient, both in terms of the application creation process and runtime characteristics. We propose to employ the widespread middleware platform based on CORBA to allow interoperability within a heterogeneous environment. The major result of this paper is a specification of an operational interface written in CORBA's Interface Definition Language (IDL) that is equivalent to CGIF, yet better suited for the efficient implementation of applications in distributed systems. Keywords: CGIF, CORBA, IDL.
1
Introduction
Conceptual Graphs (CG) are abstract information structures that are independent of a notation (see [5]). Various notations have been developed for different purposes (see Figure 1). Among these are the display form (graphical notation) and the linear form (textual notation). These two notations are intended for human computer interaction. Another notation called Conceptual Graph Interchange Format (CGIF) is meant for communication between computers. CGIF is represented through a grammar that defines "on-the-wire representations" (i.e. the format of the data transmitted over the network). The reason for developing CGIF was to support interoperability for CG-based applications that need to communicate with other CG-based applications. We argue that for interacting applications in an open distributed environment this is too inefficient, both in terms of the application creation process and runtime characteristics. Applications that need to interoperate are written by different teams of programmers, in different programming languages, using different communication protocols. A generalization of this problem is addressed by so-called middleware platforms. As the name suggests, these platforms reside between the operating system and the application. One prominent middleware platform is defined through the Common Object Request Broker Architecture (CORBA), which allows interoperability within a heterogeneous environment (see [4]). In this paper we will show how to use CORBA for CG-based applications.

M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 119-126, 1998. © Springer-Verlag Berlin Heidelberg 1998
Fig. 1. Different notations (display form and linear form for human computer interaction; CGIF and CG-IDL for computer computer interaction) represent the intension of conceptual graphs.
The outline of this paper is as follows: in Section 2 we give a short overview of CORBA. In Section 3 we discuss some drawbacks of using CGIF for distributed applications. In Section 4 we present our mapping of CGIF to CORBA IDL, which is further explained in Section 5 through an example. It should be noted that we describe work-in-progress. The following explanations emphasize the potential of using CORBA technology for CG-based applications. A complete mapping of CGIF to CORBA IDL is subject to further research.
2 Overview of CORBA

Modern programming languages employ the object paradigm to structure computation within a single operating system process. The next logical step is to distribute a computation over multiple processes on a single machine or even on different machines. Because object orientation has proven to be an adequate means for developing and maintaining large scale applications, it seems reasonable to apply the object paradigm to distributed computation as well: objects are distributed over the machines within a networked environment and communicate with each other. As a fact of life, the computers within a networked environment differ in hardware architecture, operating system software, and the programming languages used to implement the objects. That is what we call a heterogeneous distributed environment. To allow communication between objects in such an environment, one needs a rather complex piece of software called a middleware platform. The Common Object Request Broker Architecture (CORBA) is a specification of such a middleware platform. The CORBA standard is issued by the Object Management Group (OMG), an international organization with over 750 information software vendors, software developers, and users. The goal of the OMG is the establishment of industry guidelines and object management specifications to provide a common framework for application development. CORBA addresses the following issues:
Object orientation: Objects are the basic building blocks of CORBA applications.
Distribution transparency: A caller uses the same mechanisms to invoke an object whether it is located in the same address space, on the same machine, or on a remote machine.
Hardware, OS, and language independence: CORBA components can be implemented using different programming languages on different hardware architectures running different operating systems.
Vendor independence: CORBA-compliant implementations from different vendors interoperate and applications are portable between different vendors.
One important aspect of CORBA is that it is a specification and not an implementation. CORBA just provides a framework allowing applications to interoperate in a distributed and heterogeneous environment; it does not prescribe any specific technology for implementing the CORBA standard. The standard is freely available via the World Wide Web at http://www.omg.org/. Currently there exist many implementations of CORBA focusing on different market segments.
Fig. 2. Basic building blocks of a CORBA-based middleware platform (client with stub and DII, server with skeleton and DSI, IDL compiler, Object Adapter, and ORB).
Figure 2 gives an overview of the components of a CORBA system (depicted in gray), as well as the embedding of an application in such a platform (white components). The Object Request Broker (ORB) is responsible for transferring operations from clients to servers. This requires the ORB to locate a server implementation (and possibly activate it), transmit the operation and its parameters, and finally return the results to the client.
A. Puder
An Object Adapter (OA) offers various services to a server, such as the management of object references, the activation of server implementations, and the instantiation of new server objects. Different OAs may be tailored for specific application domains and may offer different services to a server. The ORB is responsible for dispatching between different OAs. One mandatory OA is the so-called Basic Object Adapter (BOA). As its name implies, it offers only very basic services to a server. The interface between a client and a server is specified with an Interface Definition Language (IDL). According to the object paradigm, an IDL specification separates the interface of a server from its implementation. This way a client has access to a server's operational interface without being aware of the server's implementation details. An IDL compiler generates a stub for the client and a skeleton for the server, which are responsible for marshalling and unmarshalling the parameters of an operation. The Dynamic Invocation Interface (DII) and the Dynamic Skeleton Interface (DSI) allow the sending and receiving of operation invocations. They represent the marshalling and unmarshalling API offered to stubs and skeletons by the ORB. Different ORB implementations can interoperate through the Internet Inter-ORB Protocol (IIOP), which describes the on-the-wire representations of basic and constructed IDL data types as well as the message formats needed for the protocol. In that respect it defines a transfer syntax, just as CGIF does. The design of IIOP was driven by the goal to keep it simple, scalable, and general.
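The marshalling role of stubs and skeletons can be pictured with a small hand-rolled sketch. Python is used here purely for illustration, and the length-prefixed byte format is a toy invention; real generated stubs speak CDR-encoded IIOP, which this sketch does not attempt to reproduce:

```python
import struct

# Toy stub-side marshaller: packs an operation name and one string
# parameter into a length-prefixed byte message (big-endian lengths).
def marshal_request(operation: str, param: str) -> bytes:
    op, p = operation.encode(), param.encode()
    return struct.pack(f">I{len(op)}sI{len(p)}s", len(op), op, len(p), p)

# The matching skeleton-side unmarshaller.
def unmarshal_request(msg: bytes):
    (op_len,) = struct.unpack_from(">I", msg, 0)
    op = msg[4:4 + op_len].decode()
    (p_len,) = struct.unpack_from(">I", msg, 4 + op_len)
    p = msg[8 + op_len:8 + op_len + p_len].decode()
    return op, p

# A CGIF-style string as the parameter of a hypothetical "save" operation.
msg = marshal_request("save", "[Cat: *x] [Mat: *y] (On ?x ?y)")
assert unmarshal_request(msg) == ("save", "[Cat: *x] [Mat: *y] (On ?x ?y)")
```

The point of the sketch is only that both sides must agree on such an encoding; with CORBA this agreement is produced automatically by the IDL compiler.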
3 Using CGIF in a heterogeneous environment

In this section we explain how CGIF might be used in constructing distributed applications, and the disadvantages this has. Figure 3 depicts a typical configuration. The application consists of a client and a server, communicating via a transport layer over the network. The messages being exchanged between the client and the server contain CGs as defined by CGIF. First note that this is not sufficient for a distributed application: CGIF only allows the coding of parameters for operations, but the kind of operations to be invoked at the server is outside the scope of CGIF. The distinction between parameters and operations corresponds to the distinction between KIF and KQML (see [2]). In that respect there is no equivalent to KQML in the CG world. One of our premises is that the client and server can be written in different programming languages, running on different operating systems using different transport media. A programmer would most certainly define some data structures to represent a CG in his/her programming language. In order to transmit a CG, the internal data structure needs to be translated to CGIF. This is accomplished by a stub. On the server side the CG coded in CGIF needs to be translated back to an internal data structure again. This is done by a skeleton. The black nodes in Figure 3 show the important steps in the translation process. At step 1 the CG exists as a data structure in the programming language used to implement the client. The stub, which is also written in the same programming language as the client, translates this data structure to CGIF (step 2). After transporting the CG over the network, it arrives at the server (step 3). The skeleton translates the CG back into a data representation in the programming language used for the implementation of the server. At step 4 the CG can finally be processed by the server.

Fig. 3. Marshalling code is contained in stubs and skeletons.

CGIF does not prescribe an internal data structure of a programming language. That is, when using CGIF for transmitting CGs, a programmer must first make such a definition based on his/her programming language, followed by a manual coding of the stub and skeleton. This imposes a high overhead on the application creation process. The main benefit of using CORBA IDL is that stubs and skeletons are automatically generated by an IDL compiler, and there is a well-defined mapping from IDL to different programming languages. Furthermore, an IDL specification allows the specification not only of parameters but also of operations, which makes it suitable for the specification of operational interfaces between application components. Although an IDL specification induces a transfer syntax through IIOP similar to CGIF, an IDL specification is better suited for the design of distributed applications. An IDL specification hides the transfer syntax and focuses on the user-defined types, which are mapped to different programming languages by an IDL compiler. CGIF, on the other hand, exposes the complexity of the transfer syntax to an application programmer who is responsible for coding stubs and skeletons.

4
CG Interface through IDL
In this section we show how to translate some of the basic definitions of the proposed CG standard to CORBA IDL. The explanations presented here should
be seen as a proof of concept. A more thorough approach including all definitions of the CG standard is still a research topic. The mapping we explain in the following covers definitions 3.1 (Conceptual Graphs), 3.2 (Concept) and 3.3 (Conceptual Relation) of the proposed CG standard (see [6]). The basic design principle of the operational interface is to exploit some common features of the object paradigm and the CG definitions. A conceptual graph is a bipartite graph consisting of concept and conceptual relation nodes. This definition resembles an object graph, where objects represent the nodes of a CG and object references the arcs between the nodes. Therefore it seems feasible to model the nodes of a CG through CORBA objects and the links between the nodes through object references. This way of modelling a CG through CORBA IDL has several advantages. Since object references are used to connect relation nodes with concept nodes, one CG can be distributed over several hosts. The objects which denote the nodes of a CG are not required to remain in the same address space, since an object reference can span host boundaries in a heterogeneous environment. Furthermore, a CG does not necessarily need to be sent by value, but rather by reference: a node is transferred over the network only if the receiving side of an operation actually accesses it. This scheme enables a lazy evaluation strategy for CG operations. It is common to place all IDL definitions related to a particular service in a separate namespace to avoid name conflicts with other applications. Therefore, we assume that the following IDL definitions are embraced by an IDL module:

module CG {
  // Here come all definitions related to
  // the CG module
};
Using the inheritance mechanism of CORBA IDL we first define a common base interface for concepts and conceptual relations. This interface is called Node and is defined through the following specification:

typedef string Label;

interface Node {
  attribute Label type;
};
The interface Node contains all the definitions which are common to concept and relation nodes. The common property shared by those two kinds of nodes is the type label, represented through an attribute of type Label. Note that we made Label a synonym for string through a typedef declaration. Next, we define the interface for all concept nodes:

interface Concept : Node {
  attribute Label referent;
};

typedef sequence<Concept> ConceptSeq;
The interface Concept inherits all properties (i.e., the attribute type) from interface Node and adds a new property, namely an attribute for the referent. The following typedef defines an unbounded sequence of concepts. This is necessary for the definition of the relation node:

interface Relation : Node {
  attribute ConceptSeq links;
};
Just as the interface Concept, the interface Relation inherits all properties from Node. The new property added here is a list of neighbouring concept nodes, represented by the attribute links. Note that ConceptSeq is an ordered sequence. The length of this sequence corresponds to the arity of the relation, and the first item in the sequence refers to the concept node pointing towards the relation. The final definition gives the IDL abstraction of a CG:

typedef sequence<Node> Graph;
A CG is represented by a list of interface Node. For brevity, the IDL type is called Graph. The order of appearance of the individual nodes is of no importance. This data structure suffices to transmit a simple CG over the network. In the following section we give a small example of how this definition might be used in a real application context.
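The IDL definitions above can be mirrored one-to-one in an implementation language. The following hand-written Python sketch (not output of an IDL compiler; names follow the IDL) models a tiny CG as an object graph, with a relation node holding references to its neighbouring concepts:

```python
from dataclasses import dataclass, field
from typing import List

# Mirrors of the IDL interfaces Node, Concept, and Relation; a Graph is
# a plain list of nodes, matching "typedef sequence<Node> Graph".
@dataclass
class Node:
    type: str  # the type label (IDL: attribute Label type)

@dataclass
class Concept(Node):
    referent: str = ""  # IDL: attribute Label referent

@dataclass
class Relation(Node):
    # ordered neighbour list; its length is the arity of the relation
    links: List[Concept] = field(default_factory=list)

# The CG for "a cat is on a mat": [Cat] -> (On) -> [Mat].
cat, mat = Concept("Cat"), Concept("Mat")
on = Relation("On", links=[cat, mat])  # first item points towards the relation
graph = [cat, mat, on]                 # node order is of no importance
assert len(on.links) == 2              # arity of the On relation
```

In a real CORBA setting the list entries would be object references, so the concept nodes could live on different hosts, which is exactly what enables the lazy transfer described above.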
5 Example

Given the basic specifications from the previous section, how does a programmer develop an application using CGs? Using the CORBA framework, the programmer would have to design the interface of the application to be written, based on the definitions from the previous section. For example, consider a simple application which offers database functionality for CGs, such as save, restore, etc. The resulting IDL specification would look something like the following:

#include "cg.idl"

interface DB {
  typedef string Key;
  exception Duplicate {};
  exception NotFound {};
  exception IllegalKey {};

  Key save( in CG::Graph c ) raises( Duplicate );
  CG::Graph retrieve( in Key k ) raises( NotFound, IllegalKey );
  void delete( in Key k ) raises( NotFound, IllegalKey );
};
This example assumes that the basic definitions for CGs from Section 4 are stored in a file called "cg.idl". The definitions are made known through the #include directive. Access to the database is defined through interface DB. The database stores CGs and assigns a unique key to each CG. The database allows CGs to be saved, retrieved, and deleted. If a CG is saved, the database returns a unique key for this CG. The operations retrieve and delete need a key as an input parameter. Errors are reported through exceptions. The IDL definition of interface DB is all that a client program will need in order to access an implementation. As pointed out before, the language and precise technology that were used to implement the database are irrelevant to the client.

6
Conclusion
The construction of CG-based applications can benefit from the usage of middleware platforms. In this paper we have shown how to translate the basic definitions for a conceptual graph to CORBA IDL. A conceptual graph is represented through a set of CORBA objects which do not need to reside in the same address space. Besides language independence, this has the advantage of supporting lazy evaluation strategies for CG operations. Once a proper mapping from CGIF to CORBA IDL has been accomplished, the existing CG applications should be restructured using CORBA (see [1]). By doing so, those applications could more easily exploit specific services they offer among each other and to other applications. There are several CORBA implementations available, including free ones (see [3]). Applications which use CORBA as a middleware platform can be easily accessed in a standardized fashion from any CORBA implementation, such as the one included in the Netscape Communicator.

References
1. CGTOOLS: Conceptual Graphs Tools homepage. http://cs.une.edu.au/cgtools/, School of Mathematical and Computer Science, University of New England, Australia, 1997.
2. M.R. Genesereth and S.P. Ketchpel: Software agents. Communications of the ACM, 37(7):48-53, July 1994.
3. MICO: A free CORBA 2.0 compliant implementation. http://www.vsb.informatik.uni-frankfurt.de/mico, Computer Science Department, University of Frankfurt, 1997.
4. Object Management Group (OMG): The Common Object Request Broker: Architecture and Specification, Revision 2.2, February 1998.
5. J.F. Sowa: Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, 1984.
6. J.F. Sowa: Standardization of Conceptual Graphs. ANSI draft, 1998.
TOSCANA-Systems Based on Thesauri

Bernd Groh¹, Selma Strahringer², and Rudolf Wille²

¹ School of Information Technology, Griffith University, PMB 50 Gold Coast Mail Centre, QLD 9726, Australia
e-mail: [email protected]
² Fachbereich Mathematik, Technische Universität Darmstadt, Schloßgartenstr. 7, 64289 Darmstadt, Germany
e-mail: {strahringer, wille}@mathematik.tu-darmstadt.de
Abstract. TOSCANA is a computer program which allows an online interaction with databases to analyse and explore data conceptually. Such interaction uses conceptual data systems which are based on formal contexts consisting of relationships between objects and attributes. Those formal contexts often have attributes taken from a thesaurus, which may be understood as an ordered set and completed to a join-semilattice (if necessary). The join of thesaurus terms indicates the degree of resemblance of the terms and should therefore be included in the formal contexts containing those terms. Consequently, the formal contexts of a conceptual data system based on a thesaurus should have join-closed attribute sets. A problem arises for the TOSCANA-system implementing such a conceptual data system because the attributes in a nested line diagram produced by TOSCANA might not be join-closed, although its components have join-closed attribute sets. In this paper we offer a solution to this problem by developing a method for extending line diagrams to those whose attributes are join-closed. This method allows the implementation of TOSCANA-systems based on thesauri which respect the join-structure of the thesauri.
Keywords: Conceptual Knowledge Processing, Formal Concept Analysis, Drawing Line Diagrams, Thesauri
1
TOSCANA
TOSCANA is a computer program which allows an online interaction with databases to analyse and explore data conceptually. TOSCANA realizes conceptual data systems [17], [20], which are mathematically specified systems consisting of a data context and a collection of formal contexts, called conceptual scales, together with line diagrams of their concept lattices.

M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 127-138, 1998. © Springer-Verlag Berlin Heidelberg 1998

There is a connection between formal objects of the conceptual scales and the objects in the data context that
can be activated to conceptually represent the data objects within the line diagrams of the conceptual scales. This allows thematic views into the database (underlying the data context) via graphically presented concept lattices showing networks of conceptual relationships. The views may even be combined, interchanged, and refined, so that a flexible and informative navigation through a conceptual landscape derived from the database can be performed (cf. [21]). For an elementary understanding of conceptual data systems, it is best to assume that the data are given by a larger formal context K := (G, M, I). A conceptual scale derived from the data context K can then be specified as a subcontext (G, Mj, I ∩ (G × Mj)) with Mj ⊆ M. A basic proposition of Formal Concept Analysis states that the concept lattice of K can be represented as a ∨-subsemilattice within the direct product of the concept lattices of the subcontexts (G, Mj, I ∩ (G × Mj)) (j ∈ J) if M = ⋃_{j∈J} Mj [3, p. 77]. This explains how every concept lattice can be represented by a nested line diagram of smaller concept lattices. Figure 1 shows a nested line diagram produced with the assistance of TOSCANA from a database about environmental literature. One concept lattice is represented by the big circles with their connecting line segments, while the line diagram of the second is inserted into each big circle. In such a way the direct product of any two lattices can be diagramed, where a non-nested line diagram of the direct product may be obtained from a nested diagram by replacing each line segment between two big circles by line segments between corresponding elements of the two line diagrams inside the two big circles. In Fig. 1¹, the little black circles represent the combined concept lattice which has the two smaller concept lattices as ∨-homomorphic images.
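At the level of extents, the basic proposition says that every extent of (G, M, I) is an intersection of one extent from each subcontext whenever M = M1 ∪ M2. This can be checked mechanically on a toy context; the following Python sketch uses invented data, not the environmental database:

```python
# Toy formal context (illustrative data).
G = ["g1", "g2", "g3", "g4"]
M1, M2 = ["a", "b"], ["c", "d"]
I = {("g1", "a"), ("g2", "a"), ("g2", "b"), ("g4", "b"),
     ("g1", "c"), ("g3", "c"), ("g3", "d"), ("g4", "d")}

def extents(attrs):
    """All extents of the subcontext (G, attrs, ...): the closure of the
    attribute extents under intersection (plus G for the empty attribute set)."""
    exts = {frozenset(G)}
    changed = True
    while changed:
        changed = False
        for m in attrs:
            m_ext = frozenset(g for g in G if (g, m) in I)
            for e in list(exts):
                if e & m_ext not in exts:
                    exts.add(e & m_ext)
                    changed = True
    return exts

full, sub1, sub2 = extents(M1 + M2), extents(M1), extents(M2)
# Every extent of the full context is an intersection of one extent
# from each subcontext -- the extent-level content of the proposition.
assert all(any(e == e1 & e2 for e1 in sub1 for e2 in sub2) for e in full)
```

This is exactly the combinatorial fact that lets a nested line diagram place the combined lattice inside the direct product of the two component lattices.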
Since larger data contexts usually give rise to an extensive object labelling, TOSCANA first attaches to a node the number of objects which generate the concept represented by that node; after clicking on that number, TOSCANA presents all names of the counted objects. TOSCANA-systems have been successfully elaborated for many purposes in different research areas, but also on the commercial level. For example, TOSCANA-systems have been established: for analyzing data of children with diabetes [17], for investigating international cooperations [11], for exploring laws and regulations concerning building constructions [13], for retrieving books in a library [12], [15], for assisting engineers in designing pipings [19], for inquiring into flight movements at Frankfurt Airport [10], for inspecting the burner of an incinerating plant [9], for developing qualitative theories in music esthetics [14], for studying semantics of speech-act verbs [8], for examining the medical nomenclature system SNOMED [16], etc. In applying the TOSCANA program, the desire often arises to extend the program by additional functionalities, so that TOSCANA is still in a process of further development (cf. [18]).
¹ Translation of the labels in Fig. 1: Fluss/river, Oberflaechengewaesser/surface waters, Talsperre/impounded dam, Stausee/impounded lake, Staugewaesser/back water, Stauanlage/reservoir facilities, Staudamm/storage dam, Staustufe/barrage weir with locks, Seen/lakes, Teich/pond
Fig. 1. Nested line diagram of a data context derived from a database about environmental literature (node labels: Stauanlage, Staugewaesser, Staudamm, Stausee, Staustufe, Talsperre, Oberflaechengewaesser, Fluss, Seen, Teich; the numbers attached to the nodes are object counts).
2
Thesaurus Terms as Attributes
Data contexts which are supposed to be analysed and explored conceptually often have attributes that are terms of a thesaurus. Those terms are already hierarchically structured by relations between the terms. We discuss in this section how the hierarchical structure of the thesaurus terms should be respected in a TOSCANA-system with additional functionality (compare other approaches in [2], [5], [6]). Let us first describe a formalization of thesauri that meets our purpose. Mathematically, we understand a thesaurus as a set T of terms together with an order relation ≤, i.e. a reflexive, transitive, and anti-symmetric binary relation on T. Note that we do not assume that a thesaurus has a tree structure, i.e., we allow that, for an element x ∈ T, there are elements y and z in T with x ≤ y, x ≤ z and y ≰ z, z ≰ y. We assume that a thesaurus has a unique maximal element. If this is not made explicit, we add a new top element to the structure. For t1, t2 ∈ T, the element t1 ∨ t2 is defined as the unique minimal upper bound of t1 and t2. If there is no unique minimal upper bound of t1 and t2, we add a new element to T which is understood as the conjunction of the minimal upper bounds of t1 and t2. We interpret t1 ∨ t2 as the smallest common generalization of the terms t1 and t2. The extension of (T, ≤) obtained by completing the ∨-operation leads mathematically to a ∨-semilattice. This allows us to assume for the sequel that the considered thesaurus is formalized by a finite ∨-semilattice. A formal context (G, M, I) where M := (M, ≤) is a finite ∨-semilattice is called a context with compatible attribute-semilattice if m1 ≤ m2 and gIm1 imply gIm2 for all m1, m2 ∈ M and g ∈ G. Interesting examples of contexts with compatible attribute-semilattice can be derived from document databases where the documents have been indexed by thesaurus terms to describe their content.
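The completion just described can be sketched in a few lines. The term names below are borrowed from the labels of Fig. 1, but the order relations are an invented toy fragment, not the actual Umwelt-Thesaurus; the step of adding a global top element is shown only by making "T" explicit in the data:

```python
# Toy thesaurus: a finite ordered set of terms; join(t1, t2) returns the
# unique minimal upper bound, or creates a new conjunction element when
# there is no unique one (the completion to a ∨-semilattice).
class Thesaurus:
    def __init__(self, terms, leq_pairs):
        self.terms = set(terms)
        self.leq = set(leq_pairs) | {(t, t) for t in terms}
        self._close()  # reflexive-transitive closure of <=

    def _close(self):
        changed = True
        while changed:
            changed = False
            for (a, b) in list(self.leq):
                for (c, d) in list(self.leq):
                    if b == c and (a, d) not in self.leq:
                        self.leq.add((a, d)); changed = True

    def join(self, t1, t2):
        uppers = {u for u in self.terms
                  if (t1, u) in self.leq and (t2, u) in self.leq}
        minimal = {u for u in uppers
                   if not any(v != u and (v, u) in self.leq for v in uppers)}
        if len(minimal) == 1:
            return next(iter(minimal))
        # no unique minimal upper bound: add a new element below all of them
        new = t1 + "∨" + t2
        self.terms.add(new)
        self.leq |= {(new, new), (t1, new), (t2, new)}
        self.leq |= {(new, u) for u in minimal}
        self._close()
        return new

T = Thesaurus(
    ["Teich", "Seen", "Fluss", "Oberflaechengewaesser", "T"],
    [("Teich", "Seen"), ("Seen", "Oberflaechengewaesser"),
     ("Fluss", "Oberflaechengewaesser"), ("Oberflaechengewaesser", "T")])
assert T.join("Teich", "Fluss") == "Oberflaechengewaesser"
```

When two incomparable minimal upper bounds exist, `join` inserts the new conjunction element and returns it, so repeated joining yields the finite ∨-semilattice assumed in the sequel.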
For a data context K := (G, M, I) with compatible attribute-semilattice M := (M, ≤), the question arises how a conceptual data system based on K may respect the semilattice structure on the attributes of K. This question especially suggests discussing desirable properties of conceptual scales derived from K. As mentioned above, the join m1 ∨ m2 of the attributes m1 and m2 is interpreted as the smallest common generalization of m1 and m2 and therefore indicates the degree of resemblance of the attributes m1 and m2. Of course, for a conceptual scale Kj := (G, Mj, I ∩ (G × Mj)) with Mj ⊆ M, it is desirable to code this indication of resemblance into the scale; hence, the attribute set Mj of the scale Kj should be join-closed, i.e. m1 ∨ m2 ∈ Mj for all m1, m2 ∈ Mj. We say that a subcontext (conceptual scale) Kj is join-closed if its attribute set Mj is join-closed. Now, we conclude our consideration with the claim that the conceptual scales derived from a data context with compatible attribute-semilattice should be join-closed. Since TOSCANA-systems work with nested line diagrams, we are confronted with the question of how the resemblance of attributes is visible in a nested line diagram which represents conceptual scales derived from a data context with compatible attribute-semilattice. An investigation of this question uncovers the problem that, for join-closed conceptual scales Kj := (G, Mj, I ∩ (G × Mj))
(j = 1, 2) of the data context K := (G, M, I), the join m1 ∨ m2 for m1 ∈ M1 and m2 ∈ M2 may not have a corresponding concept node to be attached to within the nested line diagram which represents the union of the conceptual scales K1 and K2; in particular, this would yield m1 ∨ m2 ∉ M1 ∪ M2. Thus, we have to extend the union M1 ∪ M2 by those problematic attributes, which yields the join-closed attribute set M12 := M1 ∪ M2 ∪ (M1 ∨ M2) with M1 ∨ M2 := {m1 ∨ m2 | m1 ∈ M1 and m2 ∈ M2}. This attribute set determines the join-closed subcontext K1 ∨ K2 := (G, M12, I ∩ (G × M12)) which should be the contextual basis for the desired extension of the nested line diagram. The problem is how to derive graphically that extension from the already drawn diagram. For solving this problem, we propose to insert into the nested line diagram node after node for the problematic attributes and to complete the diagram after each positioning of a new node, to obtain a drawing for the concept lattice of the respective extended subcontext. This procedure, which finally reaches a line diagram for the concept lattice of K1 ∨ K2, reduces the problem to the case of extending a subcontext by one attribute. How this elementary case can be treated will be explained in the next section.
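Given any join operation, computing M12 := M1 ∪ M2 ∪ (M1 ∨ M2) is a one-liner. In the sketch below the terms are modeled as frozensets of invented "features", ordered by reverse inclusion, so that the join (smallest common generalization) becomes plain set intersection; this is a stand-in for a real thesaurus join, not part of the paper's method:

```python
# M12 := M1 ∪ M2 ∪ (M1 ∨ M2), as defined in the text; join_ is assumed
# to return the smallest common generalization of two terms.
def join_closure(M1, M2, join_):
    return M1 | M2 | {join_(a, b) for a in M1 for b in M2}

t = lambda *features: frozenset(features)  # a term as a set of features
M1 = {t("water"), t("water", "standing"), t("water", "flowing")}
M2 = {t("water", "standing", "dammed")}

# With feature sets, "more features" means "more specific", so the join
# of two terms is the intersection of their feature sets.
M12 = join_closure(M1, M2, lambda a, b: a & b)
assert t("water", "standing") in M12
assert len(M12) == 4
```

The set M1 ∨ M2 contributes only the genuinely new generalizations; joins that already lie in M1 ∪ M2 are absorbed by the union, just as "T" is absorbed (and later omitted) in the example of Section 3.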
3
Extending Lattice Diagrams
We consider a formal context K := (G, M, I) and a subcontext K0 := (G, M0, I ∩ (G × M0)) with M0 ⊆ M; furthermore, let m ∈ M \ M0 such that m^I is not an extent of K0, and let K0^m := (G, M0 ∪ {m}, I ∩ (G × (M0 ∪ {m}))). In this section we describe how to extend a line diagram of the concept lattice of K0 to a line diagram of the concept lattice of K0^m. Recall that U(K0^m) denotes the set of extents of the context K0^m and that (U(K0^m), ⊆) ≅ B(K0^m). This means that it suffices to consider only the closure system of extents together with the set-theoretic inclusion for our drawing problem. We start by defining the ordered set P := (U(K0) ∪ {m^I}, ⊆). In P the element m^I has exactly one upper cover, which we denote by C. Obviously, completing P by forming all intersections yields the closure system (U(K0^m), ⊆). To identify the new intersections, we consider the following partition of P into four classes (see Fig. 2):

(1) [C) := {X ∈ P | C ⊆ X}
(2) (m^I] := {X ∈ P | X ⊆ m^I}
(3) (C] \ ((m^I] ∪ {C}) with (C] := {X ∈ P | X ⊆ C}
(4) P \ ([C) ∪ (C])
We say that S ∈ U(K0) is a minimal ∩-generator if m^I ∩ S ∉ P and if R ⊆ S with m^I ∩ R = m^I ∩ S (R ∈ U(K0)) implies R = S. Since minimal ∩-generators have to be incomparable to m^I, they cannot be in class (1) or (2). For an element S of class (4), the equality (S ∩ C) ∩ m^I = S ∩ m^I shows that S cannot be a minimal ∩-generator. Thus, the minimal ∩-generators form a subset g_m of class (3). Obviously, we have U(K0^m) = P ∪ {m^I ∩ S | S ∈ g_m}, and each minimal ∩-generator S has S ∩ m^I as lower neighbour in (U(K0^m), ⊆).
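A direct transcription of this definition might look as follows (our own sketch, not code from the paper; extents are frozensets, and `mI` plays the role of m^I):

```python
# Given the set of extents of K0 and the new attribute extent mI, compute
# the minimal ∩-generators and the extended closure system
# U(K0^m) = P ∪ {mI ∩ S | S ∈ g_m}.
def minimal_generators(extents, mI):
    """S ∈ U(K0) is a minimal ∩-generator if mI ∩ S is not already in
    P = U(K0) ∪ {mI} and no smaller extent R ⊆ S yields the same intersection."""
    P = extents | {mI}
    gens = []
    for S in extents:
        if mI & S in P:
            continue  # produces no new intersection
        if any(R < S and mI & R == mI & S for R in extents):
            continue  # not minimal
        gens.append(S)
    return gens

# Toy ∩-closed system of extents; mI is not an extent of K0.
G = frozenset("abcd")
extents = {G, frozenset("abc"), frozenset("ab"), frozenset("bc"),
           frozenset("b"), frozenset()}
mI = frozenset("ac")  # its unique upper cover C is {a, b, c}
gens = minimal_generators(extents, mI)
new_extents = extents | {mI} | {mI & S for S in gens}
assert {mI & S for S in gens} == {frozenset("a"), frozenset("c")}
```

Here {a, b} and {b, c} are the minimal ∩-generators (both lie in class (3), strictly below C and incomparable to m^I), and they contribute the two new extents {a} and {c}.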
Fig. 2. Partition of the ordered set P into four classes.
Now, we are able to describe how to derive a line diagram for (U(K0^m), ⊆) from a line diagram of (U(K0), ⊆). First we insert a node for m^I and join this node with the node of C by a line segment. Then we move the subdiagram representing (C]_0 := {X ∈ U(K0) | X ⊆ C} along that line segment until the node of C coincides with the node of m^I, but keep a copy of the nodes representing (C]_0 \ ((m^I] ∪ {C}), i.e. class (3), still at its original place, so that the nodes representing elements in (C]_0 \ ((m^I] ∪ {C}) double. Each pair of doubled nodes is joined by a new line segment. The resulting diagram may have too many nodes because only the intersections of m^I with the minimal ∩-generators yield new elements. Therefore we replace all moved nodes of the elements in (C]_0 \ ((m^I] ∪ {C}) that are not in g_m by a dummy, but keep all line segments in place. In this way we obtain a line diagram for (U(K0^m), ⊆) where the nodes represent the lattice elements and the dummies guarantee a satisfying arrangement of the line segments.

Let us now illustrate our drawing procedure by an example. The documents in the database ULIDAT of the Umweltbundesamt (Federal Environmental Agency of Germany) are indexed by terms of the Umwelt-Thesaurus (Environmental Thesaurus) to describe their content for the purpose of document retrieval (cf. [1]). Hence we may understand the documents as objects and the thesaurus terms as attributes of a formal context K := (G, M, I) with compatible attribute-semilattice. We have chosen the scales K1 and K2 induced by the sets of terms M1 and M2, respectively, visualized in Fig. 3 together with the order relation constituted by the thesaurus. The ∨-closure M12 := M1 ∪ M2 ∪ (M1 ∨ M2) consists of the thesaurus terms shown in Fig. 4².

Fig. 3. Two ordered sets of thesaurus terms: (M1, ≤) with the terms Oberflaechengewaesser, Seen, Teich, Fluss, and (M2, ≤) with the terms Stauanlage, Staudamm, Staugewaesser, Stausee, Staustufe, Talsperre.

We can see that the term "Stehendes Gewaesser", which is a join of "Seen" and "Staugewaesser", and the term "T" (for the top element of the thesaurus), which is a join of "Stauanlage" and "Oberflaechengewaesser", are the new attributes. Since all objects have the attribute "T", it does not carry any information and can therefore be omitted in the line diagram. The line diagram in Fig. 5 will be extended to a line diagram of B(K1 ∨ K2). For reasons of comprehension we use a non-nested line diagram first and will later extend a nested line diagram. The line diagram of B(K1 ∨ K2) in Fig. 6 has been obtained from the line diagram in Fig. 5 by doubling six lattice elements, which yields the new elements in Fig. 6 represented by the black circles with a white dot. In Fig. 7, a line diagram of B(K1 ∨ K2) is shown that has been derived from the nested line diagram in Fig. 1. In the nested line diagram in Fig. 1, each of the three top elements in the two large circles labelled with "Oberflaechengewaesser" and "Fluss" has been doubled. The resulting lattice elements are encircled by an ellipse in Fig. 7.

² Translation of the new label in Fig. 4: Stehendes Gewaesser/standing waters
Fig. 4. The join-closure (M1 ∪ M2 ∪ (M1 ∨ M2), ≤) of the thesaurus terms shown in Fig. 3, with the new terms "Stehendes Gewaesser" and the top element "T".

4
Discussion

The study of TOSCANA-systems based on thesauri has been motivated by different investigations of thesauri and classification systems, from which the desire arose to develop methods for representing the resemblance of classification terms within line diagrams of concept lattices. The understanding of the join of two classification terms as the smallest generalization of those terms suggests an algebraic description method for the degree of resemblance of thesaurus terms. From this suggestion we derived the claim that a TOSCANA-system based on a thesaurus should have join-closed conceptual scales. Since TOSCANA represents combined conceptual scales by nested line diagrams, it is a consequence of our claim that the attributes in those diagrams should be join-closed too. This leads to the problem of how to extend a line diagram of a concept lattice with attributes taken from a ∨-semilattice to a line diagram with join-closed attributes. Our solution of this problem offers an incremental procedure of inserting into line diagrams, node by node, new ∧-irreducible attribute concepts. Each new attribute forces an extension of the actual concept lattice which doubles a convex subset of the lattice (cf. [4]). This extension need not be the new concept lattice generated by the old one and the new attribute, but it contains the new lattice. Thus the line diagram of the extension yields a graphical representation of the new concept lattice if we replace the elements of the extension which are not in that lattice by dummies. Of course, one might even erase those elements, but then one has to erase line segments too and may have to insert new line segments, which could cause serious problems. An advantage of keeping the superfluous nodes and only changing them to dummies lies in the possibility to automate the drawing procedure. Thus, with our method, we especially offer an incremental procedure for drawing concept lattices automatically (an incremental algorithm for determining concept lattices has already been published in [7]).
Fig. 5. A non-nested line diagram of the concept lattice visualized in Fig. 1.
References 1. W. Batschi: Environmental thesaurus and classification of the Umweltbundesamt (Federal Environmental Agency), Berlin. In: S. Stancikova and I. Dahlberg (eds.): Environmental Knowledge Organization and Information Management. Indeks, Frankfurt/Main 1994. 2. C. Carpineto and G. Romano: A Lattice Conceptual Clustering System and Its Application to Browsing Retrieval. Machine Learning 24 (1996), 95-122. 3. B. Ganter and R. Wille, Formale Begriffsanalyse: Mathematische Grundlagen. Springer, Berlin-Heidelberg 1996. (English translation to appear) 4. W. Geyer: The generalized doubling construction and formal concept analysis. Algebra Universalis 32 (1994), 341–367. 5. R. Godin and H. Mili: Building and Maintaining Analysis-Level Class Hierarchies Using Galois Lattices. In Proceedings of the ACM Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA’93), A. Paepcke (Ed.), Washington, DC, 1993, ACM Press, 394-410. 6. R. Godin, G.W. Mineau, and R. Missaoui: Incremental structuring of knowledge bases. In: Proceedings of the International Knowledge Retrieval, Use, and Storage
B. Groh, S. Strahringer, and R. Wille
[Diagram omitted: node labels include Stauanlage, Oberflaechengewaesser, Stehendes Gewaesser, Staudamm, Fluss, Seen, Staugewaesser, Staustufe, Teich, Talsperre and Stausee, annotated with object counts.]
Fig. 6. The line diagram in Fig. 5 extended by six new lattice elements
for Efficiency Symposium (KRUSE'95), Santa Cruz, Lecture Notes in Artificial Intelligence, Springer, 1995, 179-198.
7. R. Godin, R. Missaoui, and H. Alaoui: Incremental Concept Formation Algorithms Based on Galois (Concept) Lattices. Computational Intelligence 11(2) (1995), 246-267.
8. A. Grosskopf and G. Harras: A TOSCANA-system for speech-act verbs. FB4-Preprint, TU Darmstadt 1998.
9. E. Kalix: Entwicklung von Regelungskonzepten für thermische Abfallbehandlungsanlagen. Diplomarbeit, TU Darmstadt 1997.
10. U. Kaufmann: Begriffliche Analyse über Flugereignisse - Implementierung eines Erkundungs- und Analysesystems mit TOSCANA. Diplomarbeit, TH Darmstadt 1996.
11. B. Kohler-Koch and F. Vogt: Normen und regelgeleitete internationale Kooperationen. FB4-Preprint 1632, TU Darmstadt 1994.
12. W. Kollewe, C. Sander, R. Schmiede, and R. Wille: TOSCANA als Instrument der bibliothekarischen Sacherschließung. In: H. Havekost and H.-J. Wätjen (eds.): Aufbau und Erschließung begrifflicher Datenbanken. (BIS)-Verlag, Oldenburg 1995, 95-114.
[Diagram omitted: nested diagram with node labels Stauanlage, Staugewaesser, Staudamm, Stausee, Staustufe, Talsperre, Stehendes Gewaesser, Oberflaechengewaesser, Fluss, Seen and Teich, annotated with object counts.]
Fig. 7. The nested line diagram in Fig. 1 extended by six new lattice elements
13. W. Kollewe, M. Skorsky, F. Vogt, and R. Wille: TOSCANA - ein Werkzeug zur begrifflichen Analyse und Erkundung von Daten. In: R. Wille and M. Zickwolff (eds.): Begriffliche Wissensverarbeitung - Grundfragen und Aufgaben. B.I.-Wissenschaftsverlag, Mannheim 1994, 267-288.
14. K. Mackensen and U. Wille: Qualitative text analysis supported by conceptual data systems. Preprint, ZUMA, Mannheim 1997.
15. T. Rock and R. Wille: Ein TOSCANA-System zur Literatursuche. In: G. Stumme and R. Wille (eds.): Begriffliche Wissensverarbeitung: Methoden und Anwendungen. Springer, Berlin-Heidelberg (to appear)
16. M. Roth-Hintz, M. Mieth, T. Wetter, S. Strahringer, B. Groh, and R. Wille: Investigating SNOMED by Formal Concept Analysis. Submitted to: Artificial Intelligence in Medicine.
17. P. Scheich, M. Skorsky, F. Vogt, C. Wachter, and R. Wille: Conceptual data systems. In: O. Opitz, B. Lausen, and R. Klar (eds.): Information and Classification. Springer, Berlin-Heidelberg 1993, 72-84.
18. G. Stumme and K. E. Wolff: Computing in conceptual data systems with relational structures. In: G. Mineau and A. Fall (eds.): Proceedings of the Second International Symposium on Knowledge Retrieval, Use, Storage for Efficiency. Simon Fraser University, Vancouver 1997, 206-219.
19. N. Vogel: Ein begriffliches Erkundungssystem für Rohrleitungen. Diplomarbeit, TU Darmstadt 1995.
20. F. Vogt and R. Wille: TOSCANA - a graphical tool for analyzing and exploring data. In: R. Tamassia and I.G. Tollis (eds.): Graph Drawing '94. Lecture Notes in Computer Science 894. Springer, Berlin-Heidelberg-New York 1995, 226-233.
21. R. Wille: Conceptual landscapes of knowledge: a pragmatic paradigm for knowledge processing. In: G. Mineau and A. Fall (eds.): Proceedings of the Second International Symposium on Knowledge Retrieval, Use, Storage for Efficiency. Simon Fraser University, Vancouver 1997, 2-13.
M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 139-153, 1998. © Springer-Verlag Berlin Heidelberg 1998
R. Dieng and S. Hug
MULTIKAT, a Tool for Comparing Knowledge of Multiple Experts
A Platform Allowing Typed Nested Graphs: How CoGITo Became CoGITaNT
David Genest and Eric Salvat
LIRMM (CNRS and Université Montpellier II), 161 rue Ada, 34392 Montpellier Cedex 5, France.
Abstract. This paper presents CoGITaNT, a software development platform for applications based on conceptual graphs. CoGITaNT is a new version of the CoGITo platform, adding simple graph rules and typed nested graphs with coreference links.
1 Introduction
The goal of the CORALI project (Conceptual graphs at Lirmm) is to build a theoretical formal model, to search for algorithms for solving problems in this model, and to develop software tools implementing this theory. Our research group considers the CG model [16] as a declarative model where knowledge is solely represented by labeled graphs and reasoning can be done by labeled graph operations [3]. In a first stage, such a model, the "simple conceptual graph" (SCG) model, was defined [4,11]. This model has a sound and complete semantics in first order logic. It has been extended in several ways, such as rules [12,13] and positive nested graphs with coreference links [5]. As for the SCG model, a sound and complete semantics has been proposed for these extensions [6,15]. The software platform CoGITo (Conceptual Graphs Integrated Tools) [9,10] was developed on the SCG model. This paper presents the updated version of this platform, CoGITaNT (CoGITo allowing Nested Typed graphs), which is based on these extensions: graph rules and nested conceptual graphs with coreference links.
2 CoGITaNT
CoGITaNT is a software platform: it enables an application developer to manage graphs and to apply the operations of the model. For portability and maintenance reasons, the object-oriented programming paradigm was chosen and CoGITaNT was developed as a set of C++ classes. Each element of the theoretical model is represented by a class (conceptual graph, concept vertex, concept type, support, ...). Hence, the use of object programming techniques makes it possible to represent graphs in a "natural" way (close to the model): for example, a graph is a set of concept vertices plus a set of relation vertices having an ordered set of neighbors. The methods associated with each class correspond to the usual handling functions (graph creation, type deletion, edge addition, ...) and to specific operations of the model (projection, join, fusion, ...). CoGITaNT is compiled with the free C++ compiler GNU C++ on Unix systems. CoGITaNT is available for free on request (for further information, please send an e-mail to [email protected]). CoGITaNT is an extension of CoGITo: every functionality of the previous version is present in the new one. Hence, applications based upon CoGITo should run on CoGITaNT without significant modifications to their source files. CoGITo has been presented in [10,2]; we only describe here some distinctive characteristics. The platform manages SCGs, and implements algorithms that have been developed by the CORALI group, such as a backtracking projection algorithm, a polynomial projection algorithm for the case where the graph to be projected is a tree, and a maximal join (isojoin) algorithm. The new version extends the operations available on simple graphs by implementing graph rules. CoGITaNT also introduces a new set of classes that allows handling of typed nested graphs with coreference links, together with some other new features: graphs are no longer necessarily connected, and the sets of concept types and relation types need not be ordered by the "kind of" relation in a lattice but may be any partially ordered set (poset).
M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 154-161, 1998. © Springer-Verlag Berlin Heidelberg 1998
3 Graph Rules
Conceptual graph rules have been presented in [13] and [12]. These rules are of the kind "IF G1 THEN G2" (noted G1 ⇒ G2), where G1 and G2 are simple conceptual graphs with co-reference links between concepts of G1 and G2. Such vertices are called the connection points of the rule. Conceptual graph rules are implemented as a new CoGITaNT class. An instance of this class consists of two SCGs, the hypothesis and the conclusion, and a list of couples of concept vertices (c1, c2), where c1 is a vertex of the hypothesis and c2 is a vertex of the conclusion. This list represents the list of connection points of the rule.
[Diagram omitted: the rule G1 ⇒ G2 interpreted below, with connection points ∗x, ∗y and ∗z.]
Fig. 1. An example of a graph rule.
Figure 1 may be interpreted as the following sentence: "if a person X is the brother of a person Y, and Y is the father of a person Z, then X is the uncle of Z, and there exists a person who is the father of X and Y". In this figure, x, y, z
are used to represent coreference links. Vertices whose referents are ∗x, ∗y and ∗z are the connection points of this rule. CG rules can be used in forward chaining and backward chaining, both mechanisms being defined with graph operations. Furthermore, these two mechanisms are sound and complete with respect to deduction on the corresponding subset of FOL formulae. Forward chaining is used to explicitly enrich facts with knowledge that is implicitly present in the knowledge base by way of rules. When a fact fulfills the hypothesis of a rule (i.e. when there is a projection from the hypothesis to the fact), the conclusion of the rule can be "added" to the fact (i.e. joined on connection points). This basic operation of forward chaining is implemented in CoGITaNT. The corresponding method applies a rule R to a CG G, following a projection π from the hypothesis of R to G. The resulting graph is R[G, π], obtained by joining the conclusion of R on G. The join is made on the connection points of the conclusion and the vertices of G which are images of the corresponding connection points in the hypothesis. This is the basic method of forward chaining, since it only applies a given rule to a given graph following a given projection. But, since CoGITaNT can compute all projections between two graphs, it is easy to compute all applications of a rule on a graph. Backward chaining is used to prove a request (a goal) on a knowledge base without applying rules to the facts of the knowledge base. We search for a unification between the conclusion of a rule and the request. If such a unification exists, a new request is built by deleting the unified part of the previous request and joining the hypothesis of the rule to the new request.
We defined the backward chaining method in two steps: first, we compute all unifications between the conclusion of the rule and the request graph; then, given a rule R, a request graph Q and a unification u(Q, R), another procedure builds the new request. As in forward chaining, the implemented methods are the basic methods for a backward chaining mechanism. The user can manage the resolution procedure, for example by implementing heuristics that choose, at each step, which unifications to use first.
4 Typed Nested Graphs with Coreference Links
4.1 Untyped Nested Graphs and Typed Nested Graphs
In some applications, the SCG model does not allow a satisfactory representation of the knowledge involved. An application for the validation of stereotyped behavior models in human organizations [1] uses CGs to represent behaviors: such CGs represent situations, pre-conditions and post-conditions of actions, each of these elements being described with CGs. Another application uses CGs for document retrieval [7]: a document is described with a CG representing the author, the title, and the subject (which is itself described by a CG). In these applications, knowledge can be structured by hierarchical levels, but this structure cannot be represented easily in the SCG model. A satisfactory way of representing this structure is to put CGs within CGs.
An extension of the model, called the (untyped) "nested conceptual graph" (NCG) model [5], allows the representation of this hierarchical structure by adding in each concept vertex a "partial internal description" which may be either a set of NCGs or generic (i.e. an empty set, denoted ∗∗).
[Diagram omitted: the untyped nested graph described in the text, where the television vertex contains a nested graph with power-on button, screen, TV show, presenter and talk vertices.]
Fig. 2. An untyped nested conceptual graph.
The (untyped) nested graph of figure 2 may represent the knowledge "A person is watching television. This television has a power-on button and its screen displays an uninteresting TV show where a presenter talks about someone". In figure 2, the same notion of nesting is used to represent that the button and the screen are "components" of the TV, that the screen "represents" a TV show, and that the scene where the presenter talks about someone is a "description" of the show. Hence, untyped NCGs fail to represent these various nesting semantics. However, in some applications, specifying the nesting semantics is useful. Typed NCGs may represent semantics more specific than "the nested graph is the partial internal description" (see figure 3). The typed NCG model is not described precisely here; please refer to [5] for a more complete description. The typed NCG extension adds to the support a new partially ordered type set, called the nesting type set, which is disjoint from the other type sets and has a greatest element called "Description". A typed NCG G can be denoted G = (R, C, E, l) where R, C and E are respectively the relation, concept and edge sets of G, and l is the labeling function of R and C such that ∀r ∈ R, l(r) = type(r) is an element of the relation type set, and ∀c ∈ C, l(c) = (type(c), ref(c), desc(c)), where type(c) is an element of the concept type set, ref(c) is an individual marker or ∗, and desc(c) (called the "internal partial description") is either ∗∗, called the generic description, or a non-empty set of couples (ti, Gi), where ti is a nesting type and Gi is a typed NCG.

4.2 Typed Nested Graphs in CoGITaNT
One of the main new characteristics of CoGITaNT is that it allows handling typed NCGs. The data structures used by CoGITaNT to represent such graphs are a natural implementation of this structure.
[Diagram omitted: the graph of Fig. 2 with nesting types Component, Representation and Description, and a coreference link (dashed) between the two person vertices.]
Fig. 3. A typed nested conceptual graph and a coreference link (dashed line).
As described in figure 4, a typed NCG is composed of a list of connected components and a list of coreference classes (see further details). A connected component is constituted of a list of relation vertices and a list of concept vertices. A concept vertex is composed of a concept type, a referent (individual marker or ∗) and a list of nestings (∗∗ is represented by a NULL pointer). A nesting is constituted of a nesting type and a typed NCG. As for simple graphs, the available methods implement the usual handling functions and the operations of the extended model. The projection structure and the projection operation defined on SCGs have been adapted to the recursive structure of NCGs, and the projection operation follows the constraints induced by nesting types [5]. A "projection between typed NCGs" from H to G is represented by a list of "projections between connected components"¹. A "projection between connected components" from ccH to ccG is represented by a list of pairs (r1, r2) of relations² and a list of structures (c1, c2, n)³ where c1 is a concept vertex of ccH, c2 is a concept vertex of ccG and n is an empty list (if c1 has a generic description) or a list of structures (n1, n2, g) where n1 is a nesting of c1, n2 is a nesting of c2 and g is a "projection between typed NCGs" from the graph nested in n1 to the graph nested in n2. Thus, the representation of a projection between typed NCGs has a recursive structure which is comparable to the structure of typed NCGs. The projection algorithm between NCGs first computes every projection between level-0 graphs without considering nestings. The obtained projections are then filtered: let Π be one of these projections; if a concept vertex c of H has a nesting (t, G) such that Π(c) does not contain a nesting (t′, G′) such
¹ For each connected component in H there is one "projection between connected components" in the list.
² For each relation vertex r in ccH there is one pair (r, Π(r)) in the list, where Π(r) is the image of r.
³ For each concept vertex c in ccH there is one structure (c, Π(c), n) in the list, where Π(c) is the image of c.
[Diagram omitted: boxes showing the structure described in the text: a typed NCG contains a list of connected components and a list of coreference classes; a connected component contains lists of relation and concept vertices; a concept vertex contains a concept type, a referent and a list of nestings; a nesting contains a nesting type and a typed NCG.]
Fig. 4. Data structures: typed NCG.
that t ≥ t′ and there is a projection from G to G′, then Π is not an acceptable projection. Acceptable projections are then completed (the third part of the (c1, c2, n) structures: projections between nestings). SCGs are also typed NCGs (without any nesting), and untyped NCGs are also typed NCGs (the type of each nesting is "Description"); hence CoGITaNT can also handle SCGs and untyped NCGs. Even if the data structures and operations are not optimal when handling such graphs, there is no noticeable loss of performance compared with the SCG operations of CoGITo. Useless comparisons of (generic) descriptions for SCGs handled as typed NCGs (e.g. comparisons of NULL pointers) and useless comparisons of nesting types (equal to "Description") for untyped NCGs handled as typed NCGs do not affect the overall performance of the system.

4.3 Coreference Links
CoGITaNT also allows handling coreference links. A set of non-trivial coreference classes is associated with each graph. These classes are represented by lists of (pointers to) concept vertices. Trivial coreference classes (such as one-element classes, and classes constituted by the set of concept vertices having the same type and the same individual marker) are not represented, for memory saving reasons. The coreference link in figure 3 is represented by a (non-trivial) 2-element coreference class that contains the two "person" concept vertices. Of course, the projection method makes use of coreference classes: the images of all vertices of a given coreference class must belong to the same (trivial or non-trivial) coreference class. The projection algorithm from H to G currently first computes every projection from H to G without coreference constraints.
Let Π be one of these projections. In order to return only those conforming to the coreference constraints, projections are then filtered: for every non-trivial coreference class coH of H and all c1, c2 ∈ coH, if Π(c1) and Π(c2) are not in the same (trivial or non-trivial) coreference class, then Π is not an acceptable projection.
5 The BCGCT Format
A simple extension of the BCGCT format [9] is CoGITaNT's native file format: it is a textual format allowing supports, rules and graphs to be saved to permanent storage. Files in this format can easily be written and understood by a user; BCGCT is indeed a structured file format which represents every element of the model in a natural way. For example, the representation of a graph is constituted of three parts: the first describes concept vertices, the second describes relation vertices, and the third represents the edges between these vertices. The representation of a concept vertex is a 3-tuple constituted of a concept type, an individual marker or an optional "∗" (generic concept), and a set of couples (ti, Gi) where ti is a nesting type and Gi is a graph identifier, or an optional "∗∗" (generic description). Ex: c1=[television:*:(Component,gtvdescr)]; ("gtvdescr" is a graph identifier). Coreference links are represented using the same variable symbol for each concept vertex belonging to the same coreference class. Ex: c3=[person:$x1:**]; and c9=[person:$x1]; represent that these concept vertices belong to the same coreference class.
6 Conclusion
The platform has been provided to about ten research centers and firms. In particular, two collaborations of our research group, one with Dassault Electronique and the other with the Abes (Agency of French university libraries), have led to evolutions of the model and the platform. The research center of the firm Dassault Electronique uses the platform to build a software system that provides assistance for the acquisition and the validation of stereotyped behavior models in human organizations [1]. This application led our research group to the definition of the typed NCG model and to its implementation in CoGITaNT. The other project of our group concerns document retrieval: a first approach [8] convinced the Abes to continue our collaboration, to study the efficiency of a document retrieval system based on CGs, and to develop a prototype of such a system. These collaborations suggest many improvement perspectives for CoGITaNT. The efficiency of some operations can be improved by algorithmic studies. In particular, optimizing the unification procedure for graph rules will improve the efficiency of the backward chaining mechanism. Moreover, coreference link processing during projection computation can be improved by an algorithm that takes into account, as soon as possible, the restrictions induced by these links.
We plan to extend the expressiveness of CoGITaNT with nested graph rules. A first theoretical study has been done in [14,12], but the nested graphs considered in this study are without coreference links. The extension of this work with coreference links will be implemented in the platform. In the long term, negation (or the limited types of negation required by the applications we are involved in) will be introduced in CoGITaNT, as required by the users of the platform.
References
1. Corinne Bos, Bernard Botella, and Philippe Vanheeghe. Modelling and simulating human behaviours with conceptual graphs. In Proceedings of ICCS '97, volume 1257 of LNAI, pages 275-289. Springer, 1997.
2. Boris Carbonneill, Michel Chein, Olivier Cogis, Olivier Guinaldo, Ollivier Haemmerlé, Marie-Laure Mugnier, and Eric Salvat. The COnceptual gRAphs at LIrmm project. In Proceedings of the first CGTools Workshop, pages 5-8, 1996.
3. Michel Chein. The CORALI project: From conceptual graphs to conceptual graphs via labelled graphs. In Proceedings of ICCS '97, volume 1257 of LNAI, pages 65-79. Springer, 1997.
4. Michel Chein and Marie-Laure Mugnier. Conceptual graphs: Fundamental notions. RIA, 6(4):365-406, 1992.
5. Michel Chein and Marie-Laure Mugnier. Positive nested conceptual graphs. In Proceedings of ICCS '97, volume 1257 of LNAI, pages 95-109. Springer, 1997.
6. Michel Chein, Marie-Laure Mugnier, and Geneviève Simonet. Nested graphs: A graph-based knowledge representation model with FOL semantics. To be published in proceedings of KR'98, 1998.
7. David Genest. Une utilisation des graphes conceptuels pour la recherche documentaire. Mémoire de DEA, Université Montpellier II, 1996.
8. David Genest and Michel Chein. An experiment in document retrieval using conceptual graphs. In Proceedings of ICCS '97, volume 1257 of LNAI, pages 489-504. Springer, 1997.
9. Ollivier Haemmerlé. CoGITo : une plateforme de développement de logiciels sur les graphes conceptuels. PhD thesis, Université Montpellier II, France, 1995.
10. Ollivier Haemmerlé. Implementation of multi-agent systems using conceptual graphs for knowledge and message representation: the CoGITo platform. In Supplementary Proceedings of ICCS '95, pages 13-24, 1995.
11. Marie-Laure Mugnier and Michel Chein. Représenter des connaissances et raisonner avec des graphes. RIA, 10(1):7-56, 1996.
12. Eric Salvat. Raisonner avec des opérations de graphes : Graphes conceptuels et règles d'inférence. PhD thesis, Université Montpellier II, France, 1997.
13. Eric Salvat and Marie-Laure Mugnier. Sound and complete forward and backward chaining of graph rules. In Proceedings of ICCS '96, volume 1115 of LNAI, pages 248-262. Springer, 1996.
14. Eric Salvat and Geneviève Simonet. Règles d'inférence pour les graphes conceptuels emboîtés. RR LIRMM 97013, 1997.
15. Geneviève Simonet. Une sémantique logique pour les graphes emboîtés. RR LIRMM 96047, 1996.
16. John F. Sowa. Conceptual Structures: Information Processing in Mind and Machine. Addison Wesley, 1984.
Towards Correspondences Between Conceptual Graphs and Description Logics
P. Coupey and C. Faron
LIPN-CNRS UPRESA 7030, Université Paris 13, Av. J.B. Clément, 93430 Villetaneuse, France. [email protected]
LIFO, Université d'Orléans, rue Léonard de Vinci, BP 6759, 45067 Orléans cedex 2, France. [email protected], [email protected]
Abstract. We present a formal correspondence between Conceptual Graphs and Description Logics. More precisely, we consider the Simple Conceptual Graphs model provided with type definitions (which we call T SCG) and the ALEOI standard Description Logic. We prove an equivalence between a subset of T SCG and a subset of ALEOI. Based on this equivalence, we suggest extensions of both formalisms that preserve the equivalence. In particular, with respect to standard Description Logics, where a concept can be defined by the conjunction of any two concepts, we propose an extension of type definition in CGs allowing type definitions from the "conjunction" of any two types, and consequently partial type definitions. Symmetrically, with respect to generalization/specialization operations in Conceptual Graphs, we conclude by suggesting how Description Logics could take advantage of these correspondences to improve the explanation of subsumption computation.
1 Introduction
Conceptual Graphs (CGs) [21] and Description Logics (DLs) [3] are both knowledge representation formalisms descended from semantic networks. They are dedicated to the representation of assertional knowledge (i.e. facts) and terminological knowledge: hierarchies of concept types and relation types in CGs, hierarchies of concepts and roles in DLs. Subsumption is central in both formalisms: between graphs or between types in CGs (generalization), between concepts in DLs. Similarities between these formalisms have often been pointed out [1,11] but up to now, to our knowledge, no formal study has ever been carried out about correspondences between CGs and DLs. Beyond an interesting theoretical result, such work would offer the CG and DL communities the mutual benefit of about 15 years of research. More precisely, numerous formal results in DLs about semantics and subsumption complexity could easily be adapted to CGs. Symmetrically, the specialization/generalization and graph projection operations defined in CGs would help in explaining the computation of subsumption in DLs and thus contribute to current research in this community [14].
M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 165-178, 1998. © Springer-Verlag Berlin Heidelberg 1998
In this paper, we present a formal correspondence between Conceptual Graphs and Description Logics. More precisely, focusing on the terminological level, we consider the standard Description Logic ALEOI [18] and the Simple Conceptual Graphs model [7,8,5] provided with type definitions [6,12,13], which we call T SCG. We outline fundamental differences between the two formalisms and set up restrictions on them, thus defining two subsets: G and L. Then we show, with respect to their formal semantics, that G and L are two notational variants of the same formalism and that the subsumption definitions in G and L are equivalent. Based on this result, we make both formalisms vary while preserving the equivalence. In particular, since in standard DLs a concept can be defined by the conjunction of any two concepts, we propose an extension of type definition in CGs allowing concept type definitions from the "conjunction" of any two types, and consequently partial type definitions. Symmetrically, considering generalization/specialization operations in CGs, we suggest how DLs could take advantage of their correspondences with CGs to improve the explanation of subsumption computation. First we present ALEOI and T SCG in sections 2 and 3. Then we prove in section 4 the equivalence between the sub-formalism G of T SCG and the sub-formalism L of ALEOI. In section 5 we examine extensions of G and L that preserve the equivalence.
2 Description Logics
Description logics are a family of knowledge representation formalisms descended from the KL-ONE language [3]. They mostly formalize the idea of concept definition and reasoning about these definitions. A DL includes a terminological language and an assertional language. The assertional language is dedicated to the statement of facts and assertional rules, the terminological language to the construction of concepts and roles (binary relations). A concept definition states necessary and sufficient conditions for membership in the extension of that concept. Concepts in DLs are organized in a taxonomy. The two fundamental reasoning tasks are thus subsumption computation between concepts, and classification. The classifier automatically inserts a new defined concept in the taxonomy, linking it to its most specific subsumers and to the most general concepts it subsumes. Among standard Description Logics we have selected ALEOI [18] since it is the largest subset of the best known DL CLASSIC [2] and the necessary description language for which a correspondence with the Simple Conceptual Graphs model holds true. ALEOI is inductively defined from a set Pc of primitive concepts, a set Pr of primitive roles, a set I of individuals, the constant concept ⊤, and two abstract syntax rules; one for concepts (P is a primitive concept, R a role and the ai elements of I):

C, D → ⊤                most general concept
     | P                primitive concept
     | ∀R.C             universal restriction on role
     | ∃R.C             existential restriction on role
     | C ⊓ D            concept conjunction
     | {a1 ... an}      concept in extension

and one for roles (Q is a primitive role):

R → Q                   primitive role
  | Q⁻¹                 inverse primitive role
A concept may be fully defined1 (i.e. necessary and sufficient conditions) from a term C of ALEOI, this is noted A ≡ C, or partially defined (i.e. necessary but not sufficient conditions) from a term C of ALEOI, this is noted A < C. A terminological knowledge base (T-KB) is thus a set of concepts and their definitions (partial or full). Note that all partial definitions can be converted into full definitions by using new primitive concepts (cf. [17]). Let an atomic concept A be partially defined w.r.t. a term C: A < C. This can be converted into a full definition by adding to Pc a new primitive concept A0 : A ≡ A0 u C. A0 implicitly describes the remainder of the necessary additional conditions for a C to be an A. From a theoretical point of view this supposes that one can always consider a T-KB has no more partial definitions. Figure 2 presents the definition of concept RN ephew.
RNephew ≡ Boy ⊓ ∃child⁻¹.(Female ⊓ ∃sister.Researcher)
Fig. 2. Definition of concept RNephew

¹ We assume that concept definitions are non-recursive.
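The partial-to-full conversion described above can be sketched mechanically. The following is purely illustrative: the nested-tuple term encoding ("prim", "and") and the "A0" naming convention are our own choices, not the paper's.

```python
# A mechanical sketch of the conversion of partial definitions into full
# ones: each A < C becomes A ≡ A0 ⊓ C with a fresh primitive concept A0.

def complete(partial_defs):
    """partial_defs: dict mapping a concept name A to a term C (A < C).
    Returns full definitions A -> ("and", A0, C)."""
    full = {}
    for a, c in partial_defs.items():
        fresh = ("prim", a + "0")   # the new primitive concept A0 added to Pc
        full[a] = ("and", fresh, c)
    return full

print(complete({"Nephew": ("prim", "Boy")}))
# {'Nephew': ('and', ('prim', 'Nephew0'), ('prim', 'Boy'))}
```

The fresh primitive A0 stands for the unstated extra conditions, exactly as in the text.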
P. Coupey and C. Faron
The formal meaning of concept descriptions built according to the above rules is classically given as an extensional semantics by an interpretation I = (D, ‖·‖I). The domain D is an arbitrary non-empty set of individuals and ‖·‖I is an interpretation function mapping each concept onto a subset of D, each role onto a subset of D × D and each individual ai onto an element of D (if ai and aj are different individual names then ‖ai‖I ≠ ‖aj‖I). The denotation of a concept description is given by:

‖>‖I = D
‖C ⊓ D‖I = ‖C‖I ∩ ‖D‖I
‖∀R.C‖I = {a ∈ D | ∀b, if (a, b) ∈ ‖R‖I then b ∈ ‖C‖I}
‖∃R.C‖I = {a ∈ D | ∃b, (a, b) ∈ ‖R‖I and b ∈ ‖C‖I}
‖{a1 . . . an}‖I = ∪i=1..n {‖ai‖I}
‖Q⁻¹‖I = {(a, b) | (b, a) ∈ ‖Q‖I}

An interpretation I is a model for a concept C if ‖C‖I is non-empty. Based on this semantics, C is subsumed by D, noted C < D, iff ‖C‖I ⊆ ‖D‖I for every interpretation I. C is equivalent to D iff (C < D) and (D < C). The reader will find complete theoretical and practical developments in [17].
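These denotation equations can be run directly on a small finite interpretation. The following sketch is purely illustrative: the tuple encoding of terms ("prim", "and", "some", "all", "one_of", ("inv", ·)) and all names are our own, not the paper's.

```python
def role_ext(r, r_int):
    # A role is a primitive role name, or ("inv", name) for its inverse Q^-1.
    if isinstance(r, tuple) and r[0] == "inv":
        return {(b, a) for (a, b) in r_int[r[1]]}
    return r_int[r]

def ext(c, dom, c_int, r_int, ind):
    """Denotation of a concept term c, a subset of the domain dom."""
    op = c[0]
    if op == "top":                     # the most general concept
        return set(dom)
    if op == "prim":                    # primitive concept
        return set(c_int[c[1]])
    if op == "and":                     # concept conjunction
        return ext(c[1], dom, c_int, r_int, ind) & ext(c[2], dom, c_int, r_int, ind)
    if op == "some":                    # existential restriction on a role
        rel, ce = role_ext(c[1], r_int), ext(c[2], dom, c_int, r_int, ind)
        return {a for a in dom if any((a, b) in rel and b in ce for b in dom)}
    if op == "all":                     # universal restriction on a role
        rel, ce = role_ext(c[1], r_int), ext(c[2], dom, c_int, r_int, ind)
        return {a for a in dom if all(b in ce for b in dom if (a, b) in rel)}
    if op == "one_of":                  # concept in extension {a1 ... an}
        return {ind[a] for a in c[1]}
    raise ValueError(op)

# The RNephew definition of Fig. 2 over a toy interpretation.
rnephew = ("and", ("prim", "Boy"),
           ("some", ("inv", "child"),
            ("and", ("prim", "Female"),
             ("some", "sister", ("prim", "Researcher")))))
dom = {"tom", "mary", "sue"}
c_int = {"Boy": {"tom"}, "Female": {"mary", "sue"}, "Researcher": {"sue"}}
r_int = {"child": {("mary", "tom")}, "sister": {("mary", "sue")}}
print(ext(rnephew, dom, c_int, r_int, {}))   # {'tom'}
```

A single finite interpretation can only refute subsumption (subsumption must hold in every interpretation), but such toy models are handy for testing definitions and translations.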
3 Conceptual Graphs
Conceptual Graphs were first introduced by J. Sowa in [21]. In this paper, we consider the Simple Conceptual Graphs model SCG defined in [7,8,5] and extended with type definitions in [12,13]. We call it T SCG. It is a formal system appropriate for comparisons with Description Logics. We focus on the terminological level of T SCG, i.e. the canon. A canon mainly consists of two type hierarchies: the concept type hierarchy Tc and the relation type hierarchy Tr, each provided with a most general type, respectively >c and >r. A canon also contains a canonical base of conceptual graphs, called star graphs, expressing constraints on the maximal types of the concept vertices adjacent to relation vertices [16]. Conceptual graphs are built according to the canon, i.e. they are made of concept and relation nodes whose types belong to Tc and Tr and respect the constraints expressed in the canonical base. Both type hierarchies are composed of atomic and defined types. A concept type definition² is a monadic abstraction, i.e. a conceptual graph in which one generic concept is taken as a formal parameter. It is noted tc(x) ⇔ D(x). The formal parameter concept node of D(x) is called the head of tc, its type the genus of tc, and D(x) the differentia from tc to its genus [13]. A relation type definition is an n-ary abstraction, i.e. a conceptual graph with n generic concepts taken as formal parameters. It is noted tr(x1, . . . , xn) ⇔ D(x1, . . . , xn). Figure 3 presents the definition of concept type RNephew and its logical interpretation.

² We assume that type definitions are non-recursive.
RNephew(x) ⇔ a definition graph linking [Boy: *x] by relation (child) to [Female: *], itself linked by (sister) to [Researcher: *].

∀x (RNephew(x) ⇔ (∃y ∃z Boy(x) ∧ child(y, x) ∧ Female(y) ∧ sister(y, z) ∧ Researcher(z)))
Fig. 3. Definition of concept type RNephew
Tc and Tr are provided with the order relations ≤c and ≤r. Between two atomic concept types, the order relation is given by the user. In the extended model provided with type definitions, a defined concept type tc is by definition introduced as a sub-type of its genus; the ≤c relation is thus extended with tc ≤c genus(tc). By contrast, relation type definitions do not extend the ≤r relation: between any two relation types, the order relation is given by the user. Any conceptual graph can be converted into an equivalent conceptual graph by expanding its defined types, i.e. replacing each one by its own definition graph. The atomic form of a conceptual graph is an expanded graph all of whose types are atomic. In the following, for type definitions, we will only consider the expanded form of definition graphs. The reader will find a complete development of type definitions in [13].
4 Correspondences Between T SCG and ALEOI
Let us now consider the terminological parts of T SCG and ALEOI. Regarding the definition of concept RNephew (figure 2) and of type RNephew (figure 3), which obviously describe the same set of individuals, one may easily guess that a conceptual graph made of a relation R between two concepts C: *x and D: * is translated in ALEOI by the formula C ⊓ ∃R.D. In this section, our aim is to formalize this intuition and prove an equivalence between a subset G of T SCG and a subset L of ALEOI. In the following subsection we define G and L by restricting T SCG and ALEOI.

4.1 Restrictions of T SCG and ALEOI
To state an equivalence between T SCG and ALEOI, some restrictions are necessary because they correspond to real differences between CGs and DLs; others are only useful to avoid unnecessarily complex definitions and properties. In section 5 we show that most of these restrictions may be relaxed while preserving the equivalence.

Necessary Restrictions

1. In ALEOI, and more generally in all DLs, complex role descriptions are not allowed. In T SCG, consider two relations R and R′ from a concept vertex C: *x to a concept vertex D: *, whose logical formula is ∃y C(x) ∧ R(x, y) ∧ R′(x, y) ∧ D(y). This cannot be translated in ALEOI since it is impossible to describe properties on roles, and thus to describe the conjunction R(x, y) ∧ R′(x, y). This kind of description corresponds to cycles (cf. [21, page 78]). In consequence, we only consider in T SCG definition graphs which do not contain cycles, i.e. trees (as in [15]).
2. Since there is no constraint on roles in ALEOI, we consider that the canonical base in T SCG is empty. Concept vertices adjacent to a relation vertex can thus have any type.
3. Since there is no connective to define complex roles in ALEOI, we consider that there are no defined relation types in the set Tr of T SCG.
4. Since there are only binary relations in standard DLs, we do not consider n-ary relation types in T SCG. As outlined in section 5, this is not a real restriction since it is known that any n-ary relation can be translated into a set of binary relations.
5. Since CGs are existential graphs, we do not consider in ALEOI the universal quantification on roles, i.e. the ∀ connective which is standard in DLs.
6. Since there is no way in CGs to define³ a type as the conjunction of two types, we limit in ALEOI the scope of the ⊓ connective of standard DLs: the conjunction connective is restricted to C ⊓ ∃R.D.
We will show in section 5.2 how to relax all these restrictions, except the fifth, by extending T SCG and ALEOI.

Useful Restrictions

1. Even though the description of sets of individuals in ALEOI and individual referents in T SCG are both allowed, they would unnecessarily complicate the proof of the equivalence. In consequence, we consider no concept descriptions in extension in ALEOI ({a1 . . . an}) and no individual concepts in T SCG (only generic concepts).
2. Since any concept in ALEOI is defined, except primitive ones which cannot be compared, we consider type hierarchies made only of incomparable types, which we call primitive types: the most specialized common super-type of any two atomic types is type >c in Tc and type >r in Tr.
3. Since in DLs subsumption between concepts relative to a terminology T is equivalent to subsumption with an empty terminology (cf. section 2), we only consider unordered atomic types in the canon of T SCG (i.e. no defined types in the canon).

We will show in section 5 how to relax these useful but not necessary restrictions.

4.2 G and L
We call G and L the sub-formalisms of T SCG and ALEOI corresponding to the above restrictions.

³ We only consider "fully" defined types (cf. the second useful restriction above).
Towards Correspondences Between Conceptual Graphs and Description Logics
Concept definition in L may be defined as in ALEOI (cf. section 2), with no concept in extension, no universal restriction on roles and a limited version of conjunction: C ⊓ ∃R.D. Here we give an equivalent definition relying on an inductive definition of the set C of concepts:

C0 = Pc ∪ {>}
Cn = Cn−1 ∪ { C0 ⊓ ∃R1.(C1) ⊓ . . . ⊓ ∃Rp.(Cp) ⊓ ∃Rp+1⁻¹.(Cp+1) ⊓ . . . ⊓ ∃Rq⁻¹.(Cq), C0 ∈ C0, ∀i = 1 . . . q, Ci ∈ Cn−1, Ri ∈ Pr }

Concepts in L are then defined as follows: A ≡ C, C ∈ C.

Concept type definitions in G are inductively defined by considering the following inductive definition of the set CG of conceptual graphs:

CG0 = { [C: *], C ∈ Tc }
CGn = CGn−1 ∪ { the graphs built from a root concept vertex [C: *] linked, for i = 1 . . . p, by an outgoing relation (Ri) to the head [Ci: *] of a graph Gi, and, for i = p+1 . . . q, by an incoming relation (Ri) from the head [Ci: *] of a graph Gi, with C ∈ Tc and, ∀i = 1 . . . q, Ri ∈ Tr, Ci ∈ Tc, Gi ∈ CGn−1 }

Concept types in G are then defined as follows: t ⇔ D(x), D(x) ∈ CG, its head being the concept vertex [C: *x]. The canon of G is made of two unordered sets Tc and Tr of types which we call primitive types, since they correspond to primitive concepts and roles in L.

4.3 Equivalence Between G and L
Theorem 1. G and L are two notational variants of the same formalism, and the subsumption definitions in G and L are equivalent.

Proof. Let us first show that any concept type definition of G can be translated in L and that any concept definition of L can be translated in G. To do this, let us consider the two functions f : G −→ L and g : L −→ G, with Pc ∪ {>} = Tc and Pr = Tr. f is defined as follows:

if G ∈ CG0, f(G) = f([C: *x]) = C;
if G ∈ CGn and G is rooted at [C: *x], with outgoing relations R1 . . . Rp leading to the heads of sub-graphs G1 . . . Gp and incoming relations Rp+1 . . . Rq coming from the heads of sub-graphs Gp+1 . . . Gq, then f(G) = C ⊓ ∃R1.(C1) ⊓ . . . ⊓ ∃Rp.(Cp) ⊓ ∃Rp+1⁻¹.(Cp+1) ⊓ . . . ⊓ ∃Rq⁻¹.(Cq), with Ci = f(Gi), ∀i = 1 . . . q.

A concept [C: *] of G is translated into a concept C of L, a relation Ri, i = 1 . . . p, into a role Ri, and a relation Rj, j = p+1 . . . q, into a role Rj⁻¹. Symmetrically, g is defined as follows:

if D ∈ C0, g(D) = [D: *];
if D ∈ Cn, g(D) = g(C0 ⊓ ∃R1.(D1) ⊓ . . . ⊓ ∃Rp.(Dp) ⊓ ∃Rp+1⁻¹.(Dp+1) ⊓ . . . ⊓ ∃Rq⁻¹.(Dq)) is the graph rooted at [C0: *x], with outgoing relations R1 . . . Rp to the heads of G1 . . . Gp and incoming relations Rp+1 . . . Rq from the heads of Gp+1 . . . Gq, with Gi = g(Di) and Di = Ci ⊓ . . ., ∀i = 1 . . . q.
It is obvious that g = f⁻¹ and that f is bijective, meaning that any concept type definition of G can be translated into a concept definition of L and vice versa. Let us now consider the semantics of concept type definitions in G. Let t(x) ⇔ G(x), G ∈ CGn. Its logical interpretation is the following: ∀x(t(x) ↔ F(x)), F(x) being an existentially closed conjunction of unary and binary predicates interpreting the concepts and relations of G respectively. The extensional semantics of such a concept type definition G in G is given in [8]. It is exactly the same as that of the equivalent concept definition f(G) in L (cf. section 2). In other words, G and L have the same semantics; G and L are thus two notational variants of the same logical formalism. Moreover, since the definitions of subsumption between two concept types in G and subsumption between two concepts in L both rely on the inclusion of the extensions of concept types and concepts respectively, they are equivalent.

Remark: Our proof of the equivalence is based on the semantics. It could also have been obtained by examining the connection between the equational system of ALEOI (the properties of its connectives), which characterizes equivalence classes [10,9], and the notion of irredundant graphs introduced in [7,8].
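The translation f can be sketched on a tree encoding of definition graphs. Everything here (the pair encoding of trees, the nested-tuple concept terms, the left-nested binary "and") is our own illustrative choice; the paper defines f on actual conceptual graphs.

```python
# f: G -> L on an illustrative encoding. A definition tree is
# (concept_type, links), each link being (relation_type, incoming, subtree);
# incoming=True marks a relation pointing towards the root, which f turns
# into an inverse role ("inv", R). Concept terms are nested tuples.

def f(tree):
    c, links = tree
    term = ("prim", c)
    for relation, incoming, sub in links:
        role = ("inv", relation) if incoming else relation
        term = ("and", term, ("some", role, f(sub)))
    return term

# The RNephew definition graph of Fig. 3: a Boy with an incoming child
# relation from a Female who has a sister who is a Researcher.
rnephew_graph = ("Boy", [("child", True,
                          ("Female", [("sister", False,
                                       ("Researcher", []))]))])
rnephew_term = f(rnephew_graph)   # the term Boy ⊓ ∃child⁻¹.(Female ⊓ ∃sister.Researcher)
```

The inverse g is the symmetric traversal, rebuilding the tree from the term; f and g being inverse bijections is exactly what the proof exploits.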
5 Extensions
The equivalence between G and L is a first (but necessary) step towards broader correspondences between CGs and DLs. Research in DLs mostly consists in studying, formally and in practice, different Description Logics obtained by varying the description language. Our aim is to apply this principle to CGs by jointly extending G and L while preserving the equivalence, then taking advantage of these correspondences to transfer results from DLs to CGs and vice versa. In this section, we succinctly present some of these extensions. We do not give the equivalence proofs, but only the intuitive elements that support these conjectures. Some of these extensions are still subsets of T SCG (section 5.1) while others propose an enrichment of CGs (section 5.2).
5.1 Subsets of T SCG
Individuals. Whereas both T SCG and ALEOI allow the description of individuals, for the sake of simplicity we have defined G with no individual referents in conceptual graphs, and L with no concept definition in extension ({a1 . . . an}). Extending G with individual referents in conceptual graphs while preserving the equivalence with L raises no major problem. To do this, the terminological language of L must be extended with P(ai) and >(ai), which are respectively equivalent to P ⊓ {ai} and > ⊓ {ai}, {ai} being a concept in extension (cf. section 2) restricted to a single element. As a result, a limited form of cycles can be tolerated in CGs: cycles "closed" at an individual concept vertex can be translated in DLs. Figure 4 presents an example of such a conceptual graph of G and its translation in L.
The definition graph links [Male: *x] by (friend) to [Female: Claudia], itself linked by (own) to [Cat: *], and by (chief) to [BusDriver: *], itself linked by (sister) to [Female: Claudia]. Its translation:

Male ⊓ ∃friend.(Female(Claudia) ⊓ ∃own.Cat) ⊓ ∃chief.(BusDriver ⊓ ∃sister.Female(Claudia))

Fig. 4. A definition graph including individuals and its translation in DLs
Relation Type Definition. T SCG is provided with a relation type definition mechanism (cf. section 3). Some DLs are also provided with role connectives allowing the description of complex roles:

R, Q → R ⊓ Q           role conjunction
     | (domain R C)    domain restriction
     | (range R C)     co-domain restriction

For example, childR ≡ (domain child (Male ⊓ ∃own.Cat)) ⊓ (range child (Female ⊓ ∃member.FootballTeam)) defines a new role childR which describes a particular child relation between father and daughter such that the father owns a cat and the daughter is a member of a football team. The translation of this example in CGs is given in figure 5. The equivalence between L and G could thus be extended to role and relation type definitions.⁴
⁴ Even if intuition allows us to conjecture such an extension, an in-depth study still has to be done to make explicit a precise equivalence between relation type definitions in CGs and role definitions in DLs, which is outside the scope of this paper.
childR(x, y) ⇔ a definition graph where [Male: *x], linked by (own) to [Cat: *], is linked by (child) to [Female: *y], itself linked by (member) to [FootballTeam: *].

Fig. 5. Definition of relation type childR
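The role connectives just introduced also have a direct extensional reading, which can be checked on a finite interpretation. The sketch below uses our own encoding ("and", "domain", "range") for role terms, and, for brevity, restricts concept arguments to primitive names; the sample child relation and its restriction are hypothetical.

```python
# Extensional semantics of role conjunction, domain restriction and
# co-domain (range) restriction over a finite interpretation.

def rext(r, r_int, cext):
    """Denotation of role term r as a set of pairs; cext maps a
    primitive concept name to its extension."""
    if isinstance(r, str):
        return r_int[r]
    op = r[0]
    if op == "and":        # R ⊓ Q: role conjunction
        return rext(r[1], r_int, cext) & rext(r[2], r_int, cext)
    if op == "domain":     # (domain R C): keep pairs whose first element is in C
        return {(a, b) for (a, b) in rext(r[1], r_int, cext) if a in cext(r[2])}
    if op == "range":      # (range R C): keep pairs whose second element is in C
        return {(a, b) for (a, b) in rext(r[1], r_int, cext) if b in cext(r[2])}
    raise ValueError(op)

# A hypothetical restriction of child to father-daughter pairs:
# (domain child Male) ⊓ (range child Female)
r_int = {"child": {("tom", "ann"), ("tom", "bob"), ("eve", "ada")}}
cext = {"Male": {"tom", "bob"}, "Female": {"ann", "ada", "eve"}}.__getitem__
father_daughter = ("and", ("domain", "child", "Male"),
                          ("range", "child", "Female"))
print(rext(father_daughter, r_int, cext))   # {('tom', 'ann')}
```

Only the pair whose first element is Male and whose second is Female survives both restrictions, which is the intended reading of the childR-style definitions.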
Canonical Base. Translating in DLs the star graphs of the canonical base of T SCG also requires the addition of role connectives to L. From a theoretical point of view, a star graph associated to a relation type R of Tr defines constraints on the concept vertices adjacent to a relation vertex of type R in a conceptual graph (cf. section 3). In DLs, this is viewed as the definition of a new, more specific, relation type. Let [Female: *x]−(married)−[Male: *y] be the star graph associated to the married relation type. Its translation in DLs would be the definition of a new role marriedFM ≡ (domain married Female) ⊓ (range married Male), the role marriedFM being thus equivalent to the married relation type with its star graph. More generally, the constraints described by a star graph in T SCG are translated in DLs by using the range and domain restrictions in role definitions.

Cycles in CGs. Translating in DLs conceptual graphs not limited to trees also requires the addition of role connectives to L. As an example, let R and R′ be two relation vertices from a concept vertex [C: *x] to a concept vertex [D: *]. The translation in ALEOI provided with role connectives would be the following: C ⊓ ∃(R ⊓ R′).D.

N-ary Relation Types. In standard DLs roles are binary, while relation types in CGs are n-ary. However, it is known that any n-ary relation can be described with a set of binary relations [4,20]. From a theoretical point of view, adding n-ary relations to L and G can therefore be considered without any change.

5.2 Extending Type Definitions in T SCG
Concept Conjunction. The conjunction of any two concepts is standard in DLs, while impossible in CGs. As an example, let NonSmoker and BusDriver be two primitive concepts; NonSmoker ⊓ BusDriver describes all the individuals who are non-smokers and bus drivers. As shown in figure 6a, the problem that arises when translating such a conjunction into CGs relates to the non-connectedness of definition graphs. A "non-connected" graph being a set of connected graphs, from a logical point of view it is interpreted (as shown in [8]) as the conjunction of the formulae interpreting its connected components. Such a conjunction is then exactly the logical expression underlying the concept conjunction we want to translate in figure 6a. However, authorizing this kind of definition graph in CGs would induce technical graph-writing problems when expanding defined concept types: as shown in figure 6c, to which concept should the friend relation be connected when expanding concept C in the graph of figure 6b?

Fig. 6. Concept type conjunction: (a) C(x) ⇔ the non-connected graph made of [NonSmoker: *x] and [BusDriver: *x]; (b) a graph where [C: *] is linked by (friend) to [Researcher: *]; (c) the ambiguous expansion of (b), where the (friend) edge could attach to either [NonSmoker: *x] or [BusDriver: *x]; (d) C(x) ⇔ [NonSmoker: *x]−(id)−[BusDriver: *].
To this representational problem in CGs, we propose a solution drawing its source from DLs. Some DLs are provided with a specific role id (for identity) describing all the couples (x, x) [14,19]. Let id be an equivalent relation type in T SCG: it allows defining concept types as concept type conjunctions while preserving the connectedness of the definition graph. For instance, concept type C in figure 6d describes all the individuals which are NonSmoker and BusDriver. The logical interpretation of this definition graph is the following: ∀x (C(x) ⇔ ∃y (NonSmoker(x) ∧ BusDriver(y) ∧ id(x, y))). Since x is equal to y (thanks to the id relation type), x is a BusDriver and a NonSmoker, which is exactly the semantics of NonSmoker ⊓ BusDriver. In other words, concept type definitions in G may be extended while preserving the equivalence with L (extended with the id role).⁵

Uniformizing Type Definition. In DLs, one can always consider that a T-KB contains only fully defined concepts, by adding new primitive concepts (cf. section 2). This may be transposed to CGs by using the id relation. More precisely, a partial concept type definition may be converted into a full definition by adding a relation id from the head of the definition graph to a concept whose type is a new primitive concept type (cf. figures 7b and 7c). This improvement would give a homogeneous semantics for all types in the canon. In CGs there is a distinction between atomic types, which are sub-types of other type(s), and those which are fully defined with a conceptual graph. Regarding correspondences with DLs, one may envision an extension of CGs with partial definitions. Provided with partial definitions and the id relation, it would be possible to associate an equivalent full definition with any atomic or partially defined concept type, by building a conceptual graph using the id relation for the conjunction of super-types and a new type, as in DLs (cf. section 2).
For example, in figure 7a, C1 and C2 are partially defined by the conjunction of P1 and P2 for C1, and the conjunction of C1 and P3 for C2 (P1, P2 and P3 are primitive types). Figure 7b presents the partial definitions of C1 and C2, and figure 7c their full definitions (P4 and P5 are two new primitive types). The expanded form of the definition of C2 is presented in figure 7d.

⁵ Of course this extension would imply adapting the graph operations (for example, by taking into account the commutativity of id), which is outside the scope of this paper.
Fig. 7. Definition graphs for non-primitive concept types: (a) the type hierarchy, with C1 below the primitive types P1 and P2, and C2 below C1 and P3; (b) the partial definitions C1(x) ⇒ [P1: *x]−(id)−[P2: *] and C2(x) ⇒ [C1: *x]−(id)−[P3: *]; (c) the full definitions C1(x) ⇔ [P1: *x]−(id)−[P2: *]−(id)−[P4: *] and C2(x) ⇔ [C1: *x]−(id)−[P3: *]−(id)−[P5: *], P4 and P5 being two new primitive types; (d) the expanded form of the definition of C2, linking [P1: *x] by (id) to [P2: *], [P4: *], [P3: *] and [P5: *].
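The role of id in these definitions can be checked on a small finite interpretation. The encoding below is our own illustration, not the paper's: it verifies that the graph-style formula ∃y (NonSmoker(x) ∧ BusDriver(y) ∧ id(x, y)) has the same extension as the DL conjunction NonSmoker ⊓ BusDriver.

```python
# Finite-domain check of the id-based encoding of concept conjunction.

dom = {"al", "bo", "cy"}
non_smoker = {"al", "bo"}
bus_driver = {"bo", "cy"}
id_rel = {(a, a) for a in dom}          # id denotes all the couples (x, x)

# Extension of the id-linked definition graph:
via_id = {x for x in dom
          if any(x in non_smoker and y in bus_driver and (x, y) in id_rel
                 for y in dom)}
# Extension of the DL conjunction:
conjunction = non_smoker & bus_driver

print(via_id == conjunction, via_id)    # True {'bo'}
```

Because id only relates each individual to itself, the existential on y collapses to x itself, which is exactly the argument made in the text.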
As a result, from a theoretical point of view, one would now consider a terminological knowledge base for CGs as a set of (full) concept type definitions, the canon containing only primitive concept types, as is the case in DLs. This result should be adapted to relation type definitions. Such an extension would allow us to unify the formal semantics for types, definition graphs and subsumption. Thus, from a theoretical point of view, the whole terminological reasoning in CGs could be circumscribed to the graph specialization/generalization operations alone.⁶
6 Conclusion and Perspectives
In this paper, our aim was to provide formal correspondences between Conceptual Graphs and Description Logics. We have proved that a subset G of T SCG and a subset L of the ALEOI Description Logic are two notational variants of the same formalism at the terminological level, and that the subsumption definitions are equivalent. This theoretical work is a necessary prerequisite for taking mutual advantage of research in CGs and DLs. As a first step in this transposition process, we have proposed an extension of T SCG with a special relation type id. This extension allows us: to introduce partial type definitions without any change from a theoretical point of view, as in DLs; and to associate a definition graph with any "non-primitive" concept type (atomic or not), as in DLs. As a result, it syntactically and semantically unifies subsumption between any concept types, since subsumption can then be defined as a specialization between two expanded graphs.
⁶ Of course, we do not claim that, from a practical point of view, the canon and partial definitions are not useful, but that they need not be considered for a formal study of terminological reasoning.
In this paper, we have focused on correspondences between the terminological levels of DLs and CGs. A first perspective of this work consists in studying formal correspondences between their assertional levels. The most important perspective concerns adaptations of results from CGs to DLs. In particular, it would be fruitful to use graph operations to solve a crucial problem in DLs: how to explain subsumption? Indeed, in DLs, a terminological knowledge base is split into formulae, and the normalization process simplifies and transforms the initial descriptions before computing subsumption. The problem is that users are often confused, since they do not understand the connection between their initial descriptions and the results given by the system, or how the system has obtained these results. It has been observed many times in practice that many results are unexpected by users [14]: subsumption computation is a "black box". Thanks to a correspondence between CGs and DLs, graph operations, which are intuitive, easily comprehensible and readable, may be used in DLs to explain the subsumption computation step by step.
References

1. B. Biébow and G. Chaty. A comparison between conceptual graphs and KL-ONE. In ICCS'93, First International Conference on Conceptual Structures, LNAI 699, pages 75–89. Springer-Verlag, Berlin, Québec, Canada, 1993.
2. R.J. Brachman, D.L. McGuiness, P.F. Patel-Schneider, L.A. Resnick, and A. Borgida. Living with CLASSIC: When and how to use a KL-ONE-like language. In J. Sowa, editor, Principles of Semantic Networks, pages 401–456. Morgan Kaufmann, San Mateo, Cal., 1991.
3. R.J. Brachman and J.G. Schmolze. An overview of the KL-ONE knowledge representation system. Cognitive Science, 9(2):171–216, 1985.
4. B. Carbonneill and O. Haemmerlé. Standardizing and interfacing relational databases using conceptual graphs. In ICCS'94, Second International Conference on Conceptual Structures, LNAI 835, pages 311–330. Springer-Verlag, Berlin, Maryland, USA, 1994.
5. M. Chein. The CORALI project: from conceptual graphs to conceptual graphs via labelled graphs. In ICCS'97, Fifth International Conference on Conceptual Structures, LNAI 1257, pages 65–79. Springer-Verlag, Berlin, Seattle, USA, 1997.
6. M. Chein and M. Leclère. A cooperative program for the construction of a concept type lattice. In supplement proceedings of ICCS'94, Second International Conference on Conceptual Structures, pages 16–30, Maryland, USA, 1994.
7. M. Chein and M.L. Mugnier. Conceptual graphs: Fundamental notions. Revue d'Intelligence Artificielle, 6(4):365–406, 1992.
8. M. Chein and M.L. Mugnier. Représenter des connaissances et raisonner avec des graphes. Revue d'Intelligence Artificielle, 10(1):7–56, 1996.
9. P. Coupey and C. Fouqueré. Extending conceptual definitions with default knowledge. Computational Intelligence Journal, 13(2):258–299, 1997.
10. R. Dionne, E. Mays, and F.J. Oles. The equivalence of model-theoretic and structural subsumption in description logics. In 13th International Joint Conference on Artificial Intelligence, pages 710–716, Chambéry, France, 1993.
11. C. Faron and J.G. Ganascia. Representation of defaults and exceptions in conceptual graphs formalism. In ICCS'97, Fifth International Conference on Conceptual Structures, LNAI 1257, pages 153–167. Springer-Verlag, Berlin, Seattle, USA, 1997.
12. M. Leclère. Les connaissances du niveau terminologique du modèle des graphes conceptuels : construction et exploitation. University thesis, Université de Montpellier 2, France, 1995.
13. M. Leclère. Reasoning with type definitions. In ICCS'97, Fifth International Conference on Conceptual Structures, LNAI 1257, pages 401–415. Springer-Verlag, Berlin, Seattle, USA, 1997.
14. D.L. McGuinness and A.T. Borgida. Explaining subsumption in description logics. In 14th International Joint Conference on Artificial Intelligence, pages 816–821, Montreal, Canada, 1995.
15. M.L. Mugnier. On generalization/specialization for conceptual graphs. Journal of Experimental & Theoretical Artificial Intelligence, 7(3):325–344, 1995.
16. M.L. Mugnier and M. Chein. Characterization and algorithmic recognition of canonical conceptual graphs. In ICCS'93, First International Conference on Conceptual Structures, LNAI 699, pages 294–311. Springer-Verlag, Berlin, Québec, Canada, 1993.
17. B. Nebel. Reasoning and Revision in Hybrid Representation Systems. Number 422 in Lecture Notes in Computer Science. Springer-Verlag, 1990.
18. A. Schaerf. Query Answering in Concept-Based Knowledge Representation Systems: Algorithms, Complexity and Semantics Issues. PhD thesis, Università di Roma "La Sapienza", Roma, Italy, 1994.
19. K. Schild. A correspondence theory for terminological logics: Preliminary report. In 12th International Joint Conference on Artificial Intelligence, pages 466–471, Sydney, Australia, 1991.
20. J.G. Schmolze. Terminological knowledge representation systems supporting n-ary terms. In Principles of Knowledge Representation and Reasoning: 1st International Conference, pages 432–443, Toronto, Ont., 1989.
21. J.F. Sowa. Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, Reading, Massachusetts, 1984.
Piece Resolution: Towards Larger Perspectives

Stéphane Coulondre and Eric Salvat

L.I.R.M.M. (U.M.R. 9928 Université Montpellier II / C.N.R.S.)
161 rue Ada, 34392 Montpellier cedex 5, France
e-mail: {coulondre,salvat}@lirmm.fr
Abstract. This paper focuses on two aspects of piece resolution in backward chaining for conceptual graph rules [13]. First, as conceptual graphs admit a first-order logic interpretation, inferences can be proven by classical theorem provers. Nevertheless, these provers do not use the notion of piece, which is a graph notion. So we define piece resolution over a class of first-order logical formulae: the logical rules. This procedure has been implemented, and we compare the results with classical SLD-resolution (i.e. Prolog). We point out several interesting results; in particular, the number of backtracks is strongly reduced. Second, we point out the similarities between these rules and database data dependencies. The implication problem for dependencies is to decide whether a given dependency is logically implied by a given set of dependencies. A proof procedure for the implication problem, called the "chase", has already been studied. The chase is a bottom-up procedure: from hypothesis to conclusion. This paper introduces a new proof procedure which is top-down: from conclusion to hypothesis. Indeed, we show that the implication problem for dependencies can be reduced to the existence of a piece resolution.
1 Introduction
This paper focuses on two aspects of piece resolution in backward chaining for conceptual graph rules. This procedure was originally presented in [13] and is summarized in section 2. As conceptual graphs admit a first-order logic interpretation, inferences can be proven by classical theorem provers. Nevertheless, we point out that the clausal form of logic prevents using the notion of piece. So we define in section 3 piece resolution over a class of first-order logical formulae: the logical rules. We implemented this procedure, and in section 4 we compare its results with those of classical SLD-resolution (i.e. Prolog) on the same benchmarks. We show several interesting results; in particular, the number of backtracks is strongly reduced. Second, we point out in section 5 the similarities between logical rules and data dependencies. Data dependencies are well known in the context of relational databases. They specify constraints that the data must satisfy to model correctly the part of the world under consideration. The implication problem for dependencies is to decide whether a given dependency is logically implied by a given set of dependencies. A proof procedure for the implication problem, called the "chase", has already been studied in [2] in

M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 179–193, 1998. © Springer-Verlag Berlin Heidelberg 1998
the case of tuple-generating and equality-generating dependencies. This class of dependencies generalizes most cases of interest, including the well-known functional and multivalued dependencies. The chase is a bottom-up procedure: from hypotheses to conclusion. This paper introduces a new proof procedure which is top-down: from conclusion to hypothesis. To do that, we use the logical rules and show that the implication problem for dependencies can be reduced to the existence of a piece resolution. We end this paper with several concluding remarks pointing out some possible generalizations of our results and further work.
2 Piece Resolution within Conceptual Graphs
This section summarizes backward chaining of conceptual graph rules by way of graph operations. The framework we consider here is composed of simple conceptual graphs (non-nested), without negation. They are not necessarily connected.

2.1 The Rules
Rules are of the type "If G1 then G2", where G1 and G2 are conceptual graphs with possible coreference links between some concept vertices of G1 and G2. A coreference link indicates that two vertices denote the same individual; in other words, Φ associates to these vertices the same variable or constant. A rule is denoted R : G1 ⇒ G2, where G1 and G2 are called the hypothesis and the conclusion of R. A useful notation is that of lambda-abstraction. A lambda-abstraction λx1 . . . xn G, with n ≥ 0, is composed of a graph G and n special generic vertices of G.

Definition 1 (Conceptual graph rule [13]). A conceptual graph rule R : G1 ⇒ G2 is a couple of lambda-abstractions (λx1 . . . xn G1, λx1 . . . xn G2). x1, . . . , xn are called connection points. In the following, we will denote by x1i (resp. x2i) the vertex xi of G1 (resp. G2). For each i ∈ [1..n], x1i and x2i are coreferent.
R: the hypothesis G1 is the graph [Manager: *x]−(agt)−[Manage: *]−(obj)−[Office: *y] together with [Employee: *z]−(loc)−[Office: *y]; the conclusion G2 is the graph [Employee: *z]−(poss)−[Desk: *]−(loc)−[Office: *y] together with [Manager: *x]−(agt)−[Employ: *]−(obj)−[Employee: *z].

Fig. 1. A CG rule.
The rule in figure 1 informally means the following: "If an employee z is in an office y managed by a manager x, then z has got a desk which is inside the office y, and x employs z".

2.2 Logical Interpretation of a Rule
Φ associates to every lambda-abstraction λx1...xn G a first-order formula in which all the variables are existentially quantified, except the variables x1...xn, which are free. Let R : G1 ⇒ G2 be a CG rule. If → is the logical implication connector, then Ψ(R) = (Φ(λx1...xn G1)) → (Φ(λx1...xn G2)). To every couple of vertices x1i and x2i is associated the same variable. Φ(R) is the universal closure of Ψ(R): Φ(R) = ∀x1...∀xn Ψ(R). For example, consider the rule R of figure 1: Φ(R) = ∀x∀y∀z(∃w(Manager(x) ∧ Manage(w) ∧ Office(y) ∧ Employee(z) ∧ agt(x, w) ∧ obj(w, y) ∧ loc(z, y)) → ∃u∃v(Office(y) ∧ Desk(u) ∧ Employee(z) ∧ Employ(v) ∧ Manager(x) ∧ loc(u, y) ∧ poss(z, u) ∧ obj(v, z) ∧ agt(x, v))). A knowledge base is composed of a support S, a set of CGs (facts), and a set of rules.
2.3 Piece Resolution Procedure
Consider a goal Q to be proven. The piece resolution procedure allows us to determine, given a knowledge base KB, whether Φ(KB) |= Φ(Q). The basic operation of piece resolution is piece unification, which, given a goal Q and a rule, builds, if possible, a new goal Q′ such that if Q′ is proven on KB, then Q is proven on KB.

Definition 2 (Piece and cut points [13]). Let R : G1 ⇒ G2 be a rule. A cut point of G2 is either a connection point (i.e. a generic concept shared with G1) or a concept with an individual marker (which may be common with G1 or not). A cut point of G1 is either a connection point of G1 or a concept with an individual marker which is common with G2. The pieces of G2 are obtained as follows: remove from G2 all cut points; one obtains a set of connected components; some of them are not CGs since some edges have lost an extremity. Complete each incomplete edge with a concept having the same label as the former cut point. Each connected component is a piece. Equivalently, two vertices of G2 belong to the same piece if and only if there exists a path from one vertex to the other that does not go through a cut point.

In figure 1, G2 has two pieces. The first one includes all vertices from y to z and the second one includes all vertices from z to x. Indeed, x, y and z are cut points, and G2 is split at the vertex z. Instead of splitting the goal into subgoals to be treated separately [7] [8], as Prolog does for first-order logic [5], piece resolution processes as much of the graph as possible as a whole. For example, the request Q of figure 2 can unify with the rule of figure 1. Indeed, we can unify the subgraph of Q containing the vertices from Manager
S. Coulondre and E. Salvat

[Figure: the request Q — Manager: Tom —agt— Employ —obj— Employee: * —poss— Car: *.]
Fig. 2. A request

[Figure: the new request Q′ — Manager: Tom —agt— Manage —obj— Office: *; Employee: * —loc— Office: *; Employee: * —poss— Car: *.]
Fig. 3. A new request built by unification of the request and the rule
to Employee with the piece of G2 from the concept vertex with the marker x to the concept vertex with the marker z. We obtain a new request Q′. A piece resolution is a sequence of piece unifications. It ends successfully if the last produced goal is the empty graph. It requires the definition of a tree exploration strategy (breadth-first search, or depth-first search as in Prolog, for example). For more details on piece resolution, we refer the reader to [13] [12]. The piece resolution procedure is clearly a top-down (or backward chaining) procedure. Indeed, rules are applied to a conclusion (the goal is actually a rule with an empty hypothesis), thus generating new goals to be proven. This is done until we obtain the empty graph. The procedure thus uses the goal to guide the process.

Theorem 1 (Soundness and completeness of piece resolution [13]). Let KB be a knowledge base and Q be a CG request. Then:
– If a piece resolution of Q on KB ends successfully, then Φ(KB) |= Φ(Q).
– If Φ(KB) |= Φ(Q), then there is a piece resolution of Q on KB that ends successfully.
3
Piece Resolution within First-Order Logic
Resolution in backward chaining for conceptual graph rules has been defined in section 2. This mechanism is sound and complete relative to deduction in first-order logic. The underlying idea is to determine subgraphs as large as possible that can be processed as a whole. As already mentioned, conceptual graph rules can be expressed as first-order sentences. It is therefore legitimate to express graph resolution using first-order formulae. This will lead us to a new kind of resolution called piece resolution. A unification between a goal graph and a graph rule produces a new request which is also a conceptual graph. This new goal can be expressed as a first-order logic sentence, and can also contain existential quantifiers. Now, classical proof procedures for first-order logic generally use clausal form [10] or specific forms, obtained in every case by taking out
existential quantifiers. This is called Skolemisation [3]. This process modifies the knowledge base, so further inferences do not use the original one. On the contrary, graph resolution uses the original base and allows existential quantifiers to take part in the proof mechanism (in the associated logical interpretation of rules) by using the graph structure. The inference is thus more “natural”. Moreover, we will see that graph resolution provides ways to improve effectiveness both theoretically and practically. We will now clarify the idea of piece in first-order logic.
3.1 Pieces
Definition 3 (Logical rule). A logical rule is a formula of the form F = ∃x1...∃xs(A1 ∧ ... ∧ Ap) ← B1 ∧ ... ∧ Bn, p ≥ 1, universally closed. There are no functional symbols. We will omit universal quantifiers for the sake of readability. The hypothesis of F, ⋀{Bi | i ∈ [1..n]}, is denoted by hyp(F), and the conclusion of F, ∃x1...∃xs ⋀{Ai | i ∈ [1..p]}, is denoted by conc(F). Each Ai, i ∈ [1..p] and each Bi, i ∈ [1..n] is an atom, and the xi, i ∈ [1..s] are variables appearing only in the Ai, i ∈ [1..p]. All other variables in the Ai, i ∈ [1..p] also appear in the hypothesis and are universally quantified. If n = 0, F is also called a logical fact. When we want to know whether a logical fact is logically implied by a set of logical rules, we talk about a logical goal.

Example 1 (Logical rule). Consider the following logical rule: R = ∃u(t4(u) ∧ r4(x, u)) ← t1(x) ∧ t1(y) ∧ r1(x, y). We have hyp(R) = t1(x) ∧ t1(y) ∧ r1(x, y) and conc(R) = ∃u(t4(u) ∧ r4(x, u)).

Remark 1. If s = 0, then F is equivalent to a set of Horn clauses {Ai ← B1 ∧ ... ∧ Bn, i ∈ [1..p]}.

Definition 4 (Piece). Let C = A1 ∧ ... ∧ Ap be a conjunction of atoms and V = {x1, ..., xs} be a set of variables. Pieces of C in relation to V are defined in the following way: for all atoms A and A′ members of {A1, ..., Ap}, A and A′ belong to the same piece if and only if there is a sequence of atoms (P1, ..., Pm) of {A1, ..., Ap} such that P1 = A, Pm = A′, and ∀i = 1, . . . , m − 1, Pi and Pi+1 share at least one variable of V. By construction, the set of pieces is a partition of C.

Definition 5 (Logical rule pieces). Let R = ∃x1...∃xs(A1 ∧ ... ∧ Ap) ← B1 ∧ ... ∧ Bn be a logical rule. Pieces of R are pieces of conc(R) = A1 ∧ ... ∧ Ap in relation to {x1, ..., xs}.

Example 2.
Let R = ∀x∀y∀z(∃w(Manager(x) ∧ Manage(w) ∧ Office(y) ∧ Employee(z) ∧ agt(x, w) ∧ obj(w, y) ∧ loc(z, y)) → ∃u∃v(Office(y) ∧ Desk(u) ∧ Employee(z) ∧ Employ(v) ∧ Manager(x) ∧ loc(u, y) ∧ poss(z, u) ∧ obj(v, z) ∧ agt(x, v))) be a logical rule.
R has five pieces, which are:
– {Desk(u), loc(u, y), poss(z, u)}, which contains the atoms that share the existentially quantified variable u,
– {Employ(v), obj(v, z), agt(x, v)}, which contains the atoms that share the existentially quantified variable v,
– {Employee(z)}, {Manager(x)} and {Office(y)}, which contain atoms in which no existentially quantified variable appears, thus giving one atom in each piece.

Definition 6 (Piece splitting). Let R = ∃x1...∃xs(A1 ∧ ... ∧ Ap) ← B1 ∧ ... ∧ Bn be a logical rule, and t be the number of pieces of R. R can be split, giving several logical rules R1, ..., Rt. Each Ri, i ∈ [1..t] has the same hypothesis as R, and the piece number i as conclusion: Ri = ∃xi1...∃xisi(Ai1 ∧ ... ∧ Aipi) ← B1 ∧ ... ∧ Bn, i ∈ [1..t], where pi is the number of atoms in the piece number i, si is the number of existentially quantified variables concerning the piece, and Aij, j ∈ [1..pi] are the atoms of the piece number i. Each variable not among the xij is universally quantified. The construction of R1 ∧ ... ∧ Rt is called the piece splitting of R. We will denote in the same way the splitting operation and its result R1 ∧ ... ∧ Rt.

Remark 2. ⋃i∈[1..t] (⋃j∈[1..pi] Aij) = ⋃k∈[1..p] Ak. Indeed, the union of the atoms in the conclusions of the new rules is exactly the set of atoms in the conclusion of the initial rule. Moreover, an atom belongs to only one piece, so it can appear in only one of the newly generated rules. Hence Σi∈[1..t] pi = p. No atom is created and none is deleted. The set of pieces is a partition of the set of atoms in the conclusion of the initial rule.

Proposition 1. R is logically equivalent to R1 ∧ ... ∧ Rt.

Proof (Sketch). In R, we can group existential quantifiers together in front of the concerned sets of atoms. Each group of existential quantifiers together with the corresponding set of atoms is by definition a piece. We rewrite the obtained formula by eliminating implication.
Then, by distributivity of ∨ over ∧, we obtain as many formulae as pieces, connected by conjunction, each of them having the same hypothesis, which is the hypothesis of the initial logical rule. The variables common to these formulae are all universally quantified. As ∀x(F ∧ G) and (∀xF ∧ ∀xG) are equivalent, we can decompose R into R1 ∧ ... ∧ Rt. ⊓⊔

Definition 7 (Trivial logical rules). Let R be a logical rule. R is trivial if and only if every interpretation of R satisfies R (R is valid).

Example 3 (Trivial logical rules). The following logical rules are trivial:
t(x) ← t(x) ∧ r(x, y)
∃u(t(u)) ← t(y)
∃v(r(x, y, v)) ← t(z) ∧ r(x, y, z)

Piece splitting can generate trivial, and therefore useless, logical rules. This is the case, among others, for rules in which each atom of the conclusion also appears in the hypothesis. Let R1, ..., Ri, ..., Rt be the result of the piece splitting of R. We showed that R and R1, ..., Ri, ..., Rt are logically equivalent. Let Ri be a trivial logical rule. Then R and R1, ..., Ri−1, Ri+1, ..., Rt are logically equivalent. Hence Ri can be deleted.
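The piece computation of Definition 4 is nothing more than a connected-components computation over atoms, where two atoms are adjacent when they share a variable of V. The following is a minimal Python sketch under our own encoding of atoms as (predicate, argument-tuple) pairs; it is not taken from the paper, and the union-find helper names are ours.

```python
from itertools import combinations

def pieces(atoms, v):
    """Partition a conjunction of atoms into pieces in relation to the
    variable set v (Definition 4): two atoms lie in the same piece iff
    they are linked by a chain of atoms sharing variables of v."""
    parent = list(range(len(atoms)))          # union-find over atom indices
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]     # path halving
            i = parent[i]
        return i
    for i, j in combinations(range(len(atoms)), 2):
        if set(atoms[i][1]) & set(atoms[j][1]) & v:
            parent[find(i)] = find(j)         # merge the two components
    groups = {}
    for i, atom in enumerate(atoms):
        groups.setdefault(find(i), []).append(atom)
    return list(groups.values())

# Conclusion of the rule R of Example 2, in relation to {u, v}:
conc = [("Office", ("y",)), ("Desk", ("u",)), ("Employee", ("z",)),
        ("Employ", ("v",)), ("Manager", ("x",)), ("loc", ("u", "y")),
        ("poss", ("z", "u")), ("obj", ("v", "z")), ("agt", ("x", "v"))]
print(len(pieces(conc, {"u", "v"})))          # 5 pieces, as in Example 2
```

Running it on the conclusion of the rule of Example 2 yields the five pieces listed above: two pieces of three atoms (around u and v) and three singleton pieces.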
3.2 Basic Operations
Definition 8 (Piece unification). Let Q be a logical goal and R be a logical rule. There is a unification between Q and R, say σu, if and only if:
1. there are two substitutions σ1 and σ2, defined respectively on a subset of the variables of Q and on a subset of the universally quantified variables of R;
2. σu = σ1 ∪ σ2. Pieces of σu(Q) are defined as the pieces of conc(σu(Q)) in relation to the set of existentially quantified variables of σu(R). There must exist at least one piece of σu(Q) appearing entirely in the conclusion of σu(R).

Once a piece unification σu has been found, a new request Q′ is built from Q. Q′ becomes the new logical goal.

Definition 9 (Construction of a new logical goal). Let Q be a logical goal, R be a logical rule and σu be a piece unifier between Q and R. We obtain the new goal Q′ the following way:
1. Delete from σu(Q) the pieces that appear in σu(R) and add the atoms of hyp(σu(R)).
2. Update the existential quantifiers of σu(Q); more specifically:
a) Delete the existential quantifiers in σu(Q) that corresponded to deleted existentially quantified variables in σu(R).
b) Add existential quantifiers corresponding to the variables appearing in the atoms of hyp(σu(R)) added to σu(Q) (corresponding to universally quantified variables of σu(R)).

These steps may require variable renaming, to avoid variable capture. Indeed, an atom of hyp(σu(R)) added to σu(Q) must not contain a variable already quantified in σu(Q), because it would be “captured” by the wrong quantifier. Therefore, we must rename the variables common to Q and R.

Definition 10 (Logical piece resolution). A logical piece resolution of a logical goal Q is a sequence of piece unifications. It ends successfully if the last produced request is empty. In this case, the last used rule has an empty hypothesis (i.e. it is a logical fact).

Example 4 (Piece unification).
Let R be the following logical rule: R = ∀x∀y∀z(∃w(Manager(x) ∧ Manage(w) ∧ Office(y) ∧ Employee(z) ∧ agt(x, w) ∧ obj(w, y) ∧ loc(z, y)) → ∃u∃v(Office(y) ∧ Desk(u) ∧ Employee(z) ∧ Employ(v) ∧ Manager(x) ∧ loc(u, y) ∧ poss(z, u) ∧ obj(v, z) ∧ agt(x, v))). Let Q be the following request (note that Q and R must not have variables in common): Q = ∃i∃j∃k(Manager(tom) ∧ Employ(i) ∧ Employee(j) ∧ Car(k) ∧ agt(tom, i) ∧ obj(i, j) ∧ poss(j, k)). A unifier is σu = {(x, tom), (i, v), (j, z)}. After applying σu, we construct the new goal Q′: Q′ = ∃w∃z∃y∃k(Manager(tom) ∧ Manage(w) ∧ Office(y) ∧ Employee(z) ∧ agt(tom, w) ∧ obj(w, y) ∧ loc(z, y) ∧ Car(k) ∧ poss(z, k))
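Step 1 of Definition 9 can be replayed on Example 4 in a few lines of Python. This is our own encoding (atoms as (predicate, argument-tuple) pairs), and the unified piece {Employ(v), agt(tom, v), obj(v, z)} is identified by hand rather than searched for; the sketch only checks the containment condition and rebuilds the atom set of Q′.

```python
def subst(atoms, s):
    """Apply a substitution (a dict on variables) to a list of atoms."""
    return [(p, tuple(s.get(a, a) for a in args)) for p, args in atoms]

# Example 4: the goal Q and the rule R (hypothesis and conclusion); 'tom' is a constant.
Q = [("Manager", ("tom",)), ("Employ", ("i",)), ("Employee", ("j",)),
     ("Car", ("k",)), ("agt", ("tom", "i")), ("obj", ("i", "j")),
     ("poss", ("j", "k"))]
R_hyp = [("Manager", ("x",)), ("Manage", ("w",)), ("Office", ("y",)),
         ("Employee", ("z",)), ("agt", ("x", "w")), ("obj", ("w", "y")),
         ("loc", ("z", "y"))]
R_conc = [("Office", ("y",)), ("Desk", ("u",)), ("Employee", ("z",)),
          ("Employ", ("v",)), ("Manager", ("x",)), ("loc", ("u", "y")),
          ("poss", ("z", "u")), ("obj", ("v", "z")), ("agt", ("x", "v"))]
sigma = {"x": "tom", "i": "v", "j": "z"}      # the unifier of Example 4

# The piece of sigma(Q) unified with the rule, found by hand:
piece = subst([("Employ", ("i",)), ("agt", ("tom", "i")), ("obj", ("i", "j"))], sigma)
assert all(a in subst(R_conc, sigma) for a in piece)   # it lies in conc(sigma(R))

# Step 1 of Definition 9: delete the piece, add the atoms of hyp(sigma(R)).
Qp = sorted(set(a for a in subst(Q, sigma) if a not in piece)
            | set(subst(R_hyp, sigma)))
print(len(Qp))                                 # 9 atoms, matching Q' above
```

The nine resulting atoms are exactly those of the Q′ given in Example 4; the quantifier update of step 2 is not modeled here.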
Remark 3. It is possible to simulate a piece resolution in backward chaining with conceptual graphs by a piece resolution with logical rules, and vice versa. Thus these problems are equivalent. The first reduction (graphs → logic) is trivial. For the second reduction (logic → graphs), we need to construct a support and to add concept vertices with the universal type for terms that do not appear in a predicate of arity one. In the support, each concept type is covered by the universal type, and thus any two concept types are incomparable. The set of relation types is partitioned into subsets of relation types of the same arity, where each relation type is covered by the greatest element, and thus any two relation types are incomparable.
3.3 Soundness and Completeness of Logical Piece Resolution
Lemma 1. Let Q be a logical goal and R be a logical rule. If Q′ is a new goal built by a piece unification between Q and R, then Q′, R |= Q.

Theorem 2 (Soundness of logical piece resolution [6]). Let Γ be a set of logical rules, and Q be a logical goal. If a logical piece resolution of Q on Γ ends successfully, then Γ |= Q.

Proof (Sketch). The way we build Q′ allows us to prove Q′, R |= Q. To prove the theorem, we then proceed by induction on the number of piece unifications. ⊓⊔

Theorem 3 (Completeness of logical piece resolution [6]). Let Γ be a set of logical rules, and Q be a logical goal. If Γ |= Q then there is a logical piece resolution of Q that ends successfully.

Proof (Sketch). We first prove by refutation the existence of a piece unification between the logical goal Q and the logical rule R, giving a logical goal Q′ (assuming that Q′, R |= Q, but not Q′ |= Q). Secondly, we prove by induction that the implication of a logical goal by a set of logical rules can be “decomposed” into a sequence of logical implications, each involving the previous goal and only one rule to give the next goal. Then we proceed by induction on the number of piece unifications. ⊓⊔
3.4 From Fact Goals to Rule Goals
So far, goals were only rules without hypothesis. But it is useful to consider rules as goals. Indeed, it would allow us to decide whether a rule is logically implied by a set of rules, for example to know whether the rule is worth adding to the set. We could also want to compute the minimal cover of a set (that is, the minimal set that allows generation of the initial set), or the transitive closure of a set (see section 5).

Definition 11. Let S be a set of logical rules, and R be a logical rule. The operation ω, defined on R with respect to S and noted ωS(R), replaces every universally quantified variable of R by a new and unique constant (appearing neither in S nor in R).
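The freezing operation ω only has to replace the universally quantified variables, consistently in both sides of the rule, by constants used nowhere else, leaving the existential variables untouched. A small sketch, under the same (predicate, argument-tuple) encoding as before; the helper names are ours.

```python
import itertools

_fresh = itertools.count()

def omega(universals, hyp, conc):
    """Definition 11: replace every universally quantified variable of a
    rule by a new, unique constant. The same constant is used in the
    hypothesis and the conclusion; existential variables are kept."""
    s = {x: f"c{next(_fresh)}" for x in sorted(universals)}
    frozen = lambda atoms: [(p, tuple(s.get(a, a) for a in args))
                            for p, args in atoms]
    return frozen(hyp), frozen(conc)

# The rule R of Example 1: conc(R) = ∃u(t4(u) ∧ r4(x, u)), hyp(R) as stated.
hyp = [("t1", ("x",)), ("t1", ("y",)), ("r1", ("x", "y"))]
conc = [("t4", ("u",)), ("r4", ("x", "u"))]
h, c = omega({"x", "y"}, hyp, conc)
print(h, c)   # x and y frozen to fresh constants; the existential u is untouched
```

On the rule of Example 1 this yields hyp(R′) = t1(c0) ∧ t1(c1) ∧ r1(c0, c1) and conc(R′) = ∃u(t4(u) ∧ r4(c0, u)), which is exactly the shape Theorem 4 below operates on.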
Theorem 4 ([12]). Let Γ be a set of logical rules, and R be a logical rule of the form R = ∃x1...∃xs(A1 ∧ ... ∧ Ap) ← B1 ∧ ... ∧ Bn, p ≥ 1. Let R′ = ωΓ(R). Then Γ |= R if and only if Γ, hyp(R′) |= conc(R′).

Proof (Sketch). =⇒ Let I be a model of Γ and hyp(R′). We show that I is also a model of conc(R′). ⇐= We prove that Γ ∧ ¬R is inconsistent. To do that, we show that the set of ground instances of Γ ∧ hyp(R′) ∧ ¬conc(R′), which is inconsistent, is included in the set of ground instances of Γ ∧ ¬R. The compactness theorem allows us to conclude. ⊓⊔
4
Comparison and Statistical Analysis
In order to estimate how much piece resolution can reduce the resolution tree, we built a random generator of logical rules. We present in this section the results of this first series of tests, comparing the number of unitary backtracks, for the same rule base, in the case of piece resolution and SLD-resolution [9] [11] (used in Prolog)¹. The rule base is translated into Horn clauses in order to fit with Prolog, the effect of which is to multiply the base size. Therefore, the piece resolution algorithm deals with the original base, and Prolog deals with the translated base. To prevent the procedures from looping, there is no cycle in the dependency graph of logical rules. We made 3011 tests, varying each parameter (base size, maximal number of atoms in hypothesis and conclusion, ...). A quick analysis of the results shows that the number of backtracks in piece resolution is always lower than 310,000, whereas it is lower than 5,000,000 for Prolog. Table 4 gives the detailed distribution for each method. It shows that piece resolution reduces the number of backtracks. Indeed, the number of cases with more than 10,000 backtracks is 73 (2.42%) for piece resolution, whereas it is 277 (9.19%) for Prolog.

x = number of backtracks   | Prolog        | Piece resolution
x ≤ 100                    | 2047 (67.98%) | 2504 (83.16%)
101 ≤ x ≤ 1000             |  407 (13.52%) |  294 (9.76%)
1001 ≤ x ≤ 10000           |  280 (9.30%)  |  140 (4.65%)
10001 ≤ x ≤ 100000         |  153 (5.08%)  |   66 (2.19%)
100001 ≤ x ≤ 1000000       |   82 (2.72%)  |    7 (0.23%)
1000001 ≤ x                |   42 (1.39%)  |    0 (0%)

Fig. 4. Distribution of tests according to the number of backtracks.
Table 5 gives, for the same base, the distribution of the cases in which piece resolution reduces the number of backtracks (as a function of the number of backtracks for Prolog minus the number of backtracks for piece resolution). The percentage is given relative to the number of cases that show an improvement for piece resolution.

¹ We used SWI-Prolog (Copyright (C) 1990 Jan Wielemaker, University of Amsterdam).

x = improvement            | number of cases | percentage
x ≤ 100                    | 696             | 47.54%
101 ≤ x ≤ 1000             | 290             | 19.81%
1001 ≤ x ≤ 10000           | 232             | 15.85%
10001 ≤ x ≤ 100000         | 127             | 8.67%
100001 ≤ x ≤ 1000000       |  78             | 5.33%
1000001 ≤ x                |  41             | 2.80%

Fig. 5. Distribution of tests in which piece resolution is more efficient than Prolog, as a function of the decrease in the number of backtracks.
In the same way, Table 6 gives, for the same base, the distribution of the cases in which piece resolution increases the number of backtracks (as a function of the number of backtracks for piece resolution minus the number of backtracks for Prolog). The percentage is given relative to the number of cases that show a degradation for piece resolution.

x = degradation            | number of cases | percentage
x ≤ 100                    | 228             | 76.77%
101 ≤ x ≤ 1000             |  36             | 12.12%
1001 ≤ x ≤ 10000           |  21             | 7.07%
10001 ≤ x ≤ 100000         |  10             | 3.37%
100001 ≤ x                 |   2             | 0.67%

Fig. 6. Distribution of tests in which piece resolution is less efficient than Prolog, as a function of the increase in the number of backtracks.
Let us point out the 42 cases where the number of backtracks for Prolog is greater than 1 million (Table 4). In each of them, the number of backtracks is decreased by more than 1 million with piece resolution, except in one case (a decrease of 931,850), the maximum decrease being 4,878,882. Therefore, even considering the time parameter, better efficiency is shown by piece resolution in 33 cases out of 42, in spite of an inefficient algorithm. Indeed, we have implemented a “brute force” algorithm that computes every possible unification between the terms of a goal and a rule, keeping only piece unifications. We can certainly decrease execution time by studying more efficient algorithms. One may argue that the number of backtracks for piece resolution is lower because a part of the backtracks made by Prolog is moved inside the piece unification operation. Indeed, the decision problem associated with piece unification between two formulae is NP-complete. Several arguments run counter to this objection. First, practical results show that for 33 out of the 42 difficult Prolog cases, piece resolution takes less time, even though its algorithm is very rough.
In particular, in 4 cases, piece resolution terminates within 1 second and in fewer than 10 backtracks, while Prolog makes between 1.4 and 3.3 million backtracks, in 6.5 to 16 minutes. Thus the backtracks have not all been moved into the unification process. Theoretically speaking, the piece idea keeps part of the formula structure intact and allows failures to be detected sooner. Indeed, substitutions that are not piece unifications are rejected, whereas Prolog will use them to unify and develop its solution tree, and the failure will take place later (not sooner, anyway). As stated before, we can improve piece unification by studying efficient piece unification algorithms, using graph theory or Constraint Satisfaction Problem (CSP) results. Indeed, there is a strong correspondence between a CSP and projection checking on two conceptual graphs [4]. In particular, unification has the same complexity as a projection between two graphs. It is polynomial when the corresponding piece has a tree form.
5
An Application to Relational Database Theory
Data dependencies are well known in the context of relational databases. They aim to formalize constraints that the data must satisfy. A dependency is a statement to the effect that when certain tuples are present in the database, so are certain others, or some values of the first ones are equal. The formalism usually used to define TGDs and EGDs and to describe the chase procedure is either that of tableaux or that of first-order logic with identity [2] [1]. In this paper we use the latter formalism. In this section, we show how dependency theory can benefit from logical piece resolution.
5.1 Dependencies
For the sake of simplicity, we assume, as in [2], several restrictions on the model. These restrictions can be lifted with minor modifications. We assume that the database is under the universal relation assumption (only one relation symbol, noted R), that distinct attributes have disjoint domains (typed dependencies), and that dependencies do not contain constants.

Definition 12 (Tuple-Generating Dependency [1]). A TGD is a first-order logic sentence of the form ∀x1...∀xn[ϕ(x1, ..., xn) → ∃z1...∃zk ψ(y1, ..., ym)], where {z1, ..., zk} = {y1, ..., ym} − {x1, ..., xn}, where ϕ is a conjunction (possibly empty) of atoms of the form R(w1, ..., wl) (where each w1, ..., wl is a variable) using all the variables x1, ..., xn, and where ψ is a non-empty conjunction of atoms of the form R(w1, ..., wl) (where each w1, ..., wl is a variable) using all the variables z1, ..., zk.

Example 5 (TGD). The following first-order sentence is a TGD: F = ∀x∀y∀z∀t∀u∀v∀w[R(x, y, z, t) ∧ R(x, u, v, w) → ∃s R(s, y, v, t) ∧ R(s, u, z, w)]. F means that for every couple of tuples having the same value for the first attribute in the instance of the relation R, there are two other tuples in the instance with values taken from those two tuples, and a new value for the first attribute.
Definition 13 (Equality-Generating Dependency [1]). An EGD is a first-order logic sentence of the form ∀x1...∀xn[ϕ(x1, ..., xn) → ψ(x1, ..., xn)], where ϕ is a conjunction (possibly empty) of atoms of the form R(w1, ..., wl) (where each w1, ..., wl is a variable) using all the variables x1, ..., xn, and where ψ is of the form w = w′ (where w and w′ are variables of x1, ..., xn). As dependencies are typed, the equality atom involves a pair of variables assigned to the same position. Let A be the attribute associated with this position. Then this EGD is called an A-EGD. As dependencies are typed, we assume that variables in atoms occur only in their assigned positions.

Example 6 (EGD). The following first-order sentence is an EGD: G = ∀x∀y∀z∀t∀u∀v[R(x, y, z, t) ∧ R(u, y, z, v) → (x = u)]. G means that all couples of tuples that have the same value for the second and third attributes in the instance of the relation R also have the same value for the first attribute.

The Implication Problem for Dependencies and the Chase Procedure. Let D be a set of dependencies, and d be a dependency. The implication problem is to decide whether D |= d. The chase procedure [2] has been designed to solve the implication problem. We will not describe it in detail here. Informally speaking, the chase procedure takes the hypothesis of d and treats it as if it formed a set of tuples, thus replacing variables with symbols. Then it repeatedly applies the dependencies of D, following two distinct rules: a T-rule for a TGD, whose effect is to add tuples to the relation, and an E-rule for an EGD, whose effect is to “identify” two symbols. If we obtain the conclusion of d, then we are done. When d is a TGD, we stop when we obtain the tuples satisfying the conclusion of d. If d is an EGD, we stop when we obtain the identification of the two symbols involved in the equality in the conclusion of d. This mechanism has been shown sound and complete in [2].
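To make the T-rule concrete, here is a rough Python sketch of one application round on the TGD F of Example 5, with all atoms over the single relation R and a brute-force homomorphism search. The encoding, the instance, and the function names are ours, not the paper's; a real chase would iterate such rounds and also apply E-rules.

```python
from itertools import product, count

_new = count()

def homomorphisms(hyp, tuples):
    """Yield every substitution mapping each hypothesis atom (a tuple of
    variables over the single relation R) onto some tuple of the instance."""
    for choice in product(tuples, repeat=len(hyp)):
        s, ok = {}, True
        for args, tup in zip(hyp, choice):
            for var, val in zip(args, tup):
                if s.setdefault(var, val) != val:
                    ok = False
                    break
            if not ok:
                break
        if ok:
            yield dict(s)

def t_rule(hyp, exvars, conc, tuples):
    """One round of the T-rule: for every homomorphism of the hypothesis
    into the instance, add the conclusion tuples, inventing a fresh
    symbol for each existentially quantified variable."""
    out = set(tuples)
    for s in homomorphisms(hyp, tuples):
        s.update({z: f"n{next(_new)}" for z in exvars})
        out |= {tuple(s[v] for v in args) for args in conc}
    return out

# TGD F of Example 5 applied to a two-tuple instance:
hyp = [("x", "y", "z", "t"), ("x", "u", "v", "w")]
conc = [("s", "y", "v", "t"), ("s", "u", "z", "w")]
inst = {("a", "b", "c", "d"), ("a", "e", "f", "g")}
print(len(t_rule(hyp, {"s"}, conc, inst)) > len(inst))  # True: tuples were added
```

Since both tuples agree on the first attribute, the hypothesis maps into the instance and new tuples with fresh first-attribute symbols are added, exactly as the informal reading of F in Example 5 predicts.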
Note that the implication problem for TGDs and EGDs is semi-decidable [14]. Thus this procedure may not stop. Nevertheless, some cases have been shown decidable. The chase procedure is clearly a bottom-up (or forward chaining) procedure. Indeed, rules are applied to hypotheses, thus generating consequences (new tuples or identifications of symbols). This is executed until the desired conclusion is obtained. The initial goal (i.e. the conclusion of d) is not used to guide the process. We now show that we can solve the implication problem with the piece resolution method presented in section 3. In other words, we provide a top-down (or backward chaining) procedure.
5.2 From One Model to Another
In this section, we reduce the problem of implication for dependencies to the existence of a piece resolution within logical rules. Our model of logical piece resolution does not handle equality. This is not a problem for TGDs, which are included in the set of logical rules, but EGDs contain equality predicates. We
have two ways of dealing with this problem. First, we could add equality treatment to our model formalism, but we want to stay in the scope of the CG piece resolution. Second, we can use some results of [2] to eliminate equality in the treatment of the implication problem. That is the solution chosen here.

Suppressing Equality in Dependencies. In this section, we use several results previously shown in [2]. Informally, instead of dealing with equality and identifying symbols within an EGD, we can say that they “look the same” within the relation. Expressing the results in the first-order logic formalism, we obtain the following transformation. Let e = ∀x1...∀xn(JR → (xi = xj)) be an EGD. We associate to e two TGDs e1 and e2, as follows:
e1 = ∀x1...∀xn∀y1...∀yl−1(JR ∧ R(xj, y1, ..., yl−1) → R(xi, y1, ..., yl−1))
e2 = ∀x1...∀xn∀y1...∀yl−1(JR ∧ R(xi, y1, ..., yl−1) → R(xj, y1, ..., yl−1))
where l is the arity of the only relation R and the yk, k ∈ [1..l − 1], are new variables not appearing among x1, ..., xn.

Example 7. The two TGDs associated with the EGD of example 6 are:
F1 = ∀x∀y∀z∀t∀u∀v∀n∀r∀s[R(x, y, z, t) ∧ R(u, y, z, v) ∧ R(u, n, r, s) → R(x, n, r, s)]
F2 = ∀x∀y∀z∀t∀u∀v∀n∀r∀s[R(x, y, z, t) ∧ R(u, y, z, v) ∧ R(x, n, r, s) → R(u, n, r, s)]

In the following, we will denote by D∗ the set of dependencies D in which every EGD has been replaced by its associated TGDs.

Theorem 5 ([2]). Let D be a set of TGDs and EGDs, d be a TGD and e be a non-trivial A-EGD (i.e. e = ∀x1...∀xn(JR → (xi = xj)) and i ≠ j). D |= d if and only if D∗ |= d. D |= e if and only if D∗ |= e1 and there is a non-trivial A-EGD in D.

Reducing the Implication Problem. We now only deal with TGDs. The previous results showed that the implication problem for TGDs and EGDs is reducible to the implication problem for TGDs.

Theorem 6 ([6]). Let D be a set of TGDs and EGDs, d be a TGD and e be a non-trivial A-EGD. Let ω be the operation presented in definition 11.
D |= d if and only if D∗, hyp(ωD∗(d)) |= conc(ωD∗(d)). D |= e if and only if D∗, hyp(ωD∗(e1)) |= conc(ωD∗(e1)) and there is a non-trivial A-EGD in D.

Proof. TGDs are included in the set of logical rules. We simply apply theorems 4 and 5. ⊓⊔

Theorem 7 ([6]). Let D be a set of TGDs and EGDs, d be a TGD, and e be a non-trivial A-EGD. D |= d if and only if there is a logical piece resolution of conc(ωD∗(d)) on {D∗, hyp(ωD∗(d))} that ends successfully. D |= e if and only if there is a logical piece resolution of conc(ωD∗(e1)) on {D∗, hyp(ωD∗(e1))} that ends successfully and there is a non-trivial A-EGD in D.
Proof. By theorem 6 and the soundness and completeness of logical piece resolution (theorems 2 and 3). ⊓⊔

Therefore, we have shown that we can apply logical piece resolution in backward chaining to solve the implication problem for TGDs and EGDs. Since this problem is semi-decidable, this procedure may never terminate, as with the chase procedure. But the advantage is that the inference is guided by the goal. We hope to develop some heuristics to help explore the solution tree.
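The equality-elimination step used above (replacing an EGD by its two associated TGDs) is entirely mechanical. Below is a small sketch under our own encoding of R-atoms as argument tuples; it assumes, as in Examples 6 and 7, that the equated attribute is the first position of R, and the function name is ours.

```python
def egd_to_tgds(j_atoms, xi, xj, arity):
    """Build the two TGDs e1, e2 (as hypothesis/conclusion pairs of
    R-atoms) associated with the A-EGD  forall x (J_R -> xi = xj),
    where the equated attribute A is the first position of R."""
    ys = tuple(f"y{k}" for k in range(1, arity))      # fresh variables y1..y(l-1)
    e1 = (j_atoms + [(xj,) + ys], [(xi,) + ys])
    e2 = (j_atoms + [(xi,) + ys], [(xj,) + ys])
    return e1, e2

# The EGD G of Example 6: R(x,y,z,t) ∧ R(u,y,z,v) -> x = u, with arity 4.
J = [("x", "y", "z", "t"), ("u", "y", "z", "v")]
e1, e2 = egd_to_tgds(J, "x", "u", 4)
print(e1)  # hypothesis J ∧ R(u,y1,y2,y3), conclusion R(x,y1,y2,y3) — F1 up to renaming
```

Applied to the EGD G of Example 6, this produces exactly F1 and F2 of Example 7 up to variable renaming, after which the theorems above apply.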
6
Conclusion
In this paper we presented in three points the contributions of the piece resolution mechanism. The underlying idea was to adapt a graph notion to first-order logic. This resolution method was originally defined on conceptual graph rules [13]. The central notion is that of a piece, which allows subgraphs to be unified at once instead of splitting graphs into trivial subgraphs (restricted to a relation and its neighbors). In section 3, we translated the piece notion, which comes from graph theory, into first-order logic. As for conceptual graph rules, the piece resolution method is sound and complete on the set of FOL formulae defined by logical rules. In section 4, we compared SLD-resolution and logical piece resolution, considering the number of backtracks of each method. This comparison shows that pieces can considerably reduce the number of backtracks. We must keep in mind, nevertheless, that unification in Prolog is polynomial, whereas the decision problem of piece unification between two formulae is NP-complete. The efficiency of the whole piece resolution mechanism depends on the unification, which is the central operation. Thus, an algorithmic study of the unification should improve the whole procedure. Then, in section 5, we pointed out the similarities between logical rules and data dependencies and we presented a new proof procedure for the implication problem of dependencies, derived from piece resolution on logical rules. This procedure is top-down. We assumed several restrictions on the model. As already stated, these restrictions can be lifted with minor modifications. The logical rules model can deal with several predicates, thus lifting the universal relation assumption. It can also handle non-disjoint domains, thus unsorted (untyped) dependencies, and constants. That is, we can express a constraint like “Employee #23 can have only one desk”.
Our theorems still hold in these cases, provided that the reduction of EGDs to TGDs also works in the unrestricted model, which is stated in [2]. We think this procedure fills a gap in data dependency processing.

Acknowledgments. We are grateful to M. C. Rousset for showing us the similarity between the FOL formulae associated to CG rules and TGDs, to M. L. Mugnier, who read the manuscript very carefully, and to M. Y. Vardi for his useful additional information.
References
1. S. Abiteboul, R. Hull, and V. Vianu. Foundations of Databases. Addison-Wesley, Reading, Mass., 1995.
2. Catriel Beeri and Moshe Y. Vardi. A proof procedure for data dependencies. Journal of the ACM, 31(4):718–741, October 1984.
3. Chin-Liang Chang and Richard Char-Tung Lee. Symbolic Logic and Mechanical Theorem Proving. Academic Press, New York, 1973.
4. M. Chein and M.-L. Mugnier. Représenter des connaissances et raisonner avec des graphes. Revue d'Intelligence Artificielle, 10(1):7–56, 1996.
5. Alain Colmerauer. Prolog in 10 figures. Communications of the ACM, 28(12):1296–1310, December 1985.
6. Stéphane Coulondre. Exploitation de la notion de pièce des règles de graphes conceptuels en programmation logique et en bases de données. Master's thesis, Université Montpellier II, 1997.
7. Jean Fargues, Marie-Claude Landau, Anne Dugourd, and Laurent Catach. Conceptual graphs for semantics and knowledge processing. IBM Journal of Research and Development, 30(1):70–79, January 1986.
8. Bikash Chandra Ghosh and V. Wuwongse. A direct proof procedure for definite conceptual graph programs. Lecture Notes in Computer Science, 954, 1995.
9. R. A. Kowalski. Predicate logic as a programming language. Proc. IFIP 74, pages 569–574, 1974.
10. J. W. Lloyd. Foundations of Logic Programming, Second Edition. Springer-Verlag, 1987.
11. Donald W. Loveland. A simplified format for the model elimination procedure. Journal of the ACM, 16(3):233–248, July 1969.
12. Eric Salvat. Raisonner avec des opérations de graphes: graphes conceptuels et règles d'inférence. PhD thesis, Université Montpellier II, Montpellier, France, December 1997.
13. Eric Salvat and Marie-Laure Mugnier. Sound and complete forward and backward chainings of graph rules. In Proceedings of the Fourth International Conference on Conceptual Structures (ICCS-96), volume 1115 of LNAI, pages 248–262, Berlin, August 19–22 1996. Springer.
14. Moshe Y. Vardi. The implication and finite implication problems for typed template dependencies. Journal of Computer and System Sciences, 28(1):3–28, February 1984.
Triadic Concept Graphs
Rudolf Wille
Technische Universität Darmstadt, Fachbereich Mathematik, Schloßgartenstr. 7, D–64289 Darmstadt, [email protected]
Abstract. In the paper “Conceptual Graphs and Formal Concept Analysis”, the author has presented a first attempt at unifying the Theory of Conceptual Graphs and Formal Concept Analysis. This context-based approach, which is philosophically supported by Peirce’s pragmatic epistemology, is grounded on families of related formal contexts whose formal concepts allow a mathematical representation of the concepts and relations of conceptual graphs. Such a representation of a conceptual graph is called a “concept graph” of the context family from which it is derived. In this paper the theory of concept graphs is extended to allow a mathematical representation of nested conceptual graphs by “triadic concept graphs”. As in the preceding paper, our focus lies on the mathematical structure theory, which could later be used for extending the already developed logical theory of simple concept graphs. The overall aim of this research is to contribute to the development of a contextual logic as a basis of Conceptual Knowledge Processing.
1 Contextual Logic
“Contextual Logic” is understood as a mathematization of traditional philosophical logic which is based on “the three essential main functions of thinking: concept, judgment and conclusion” [Ka88; p. 6]. For mathematizing the doctrine of concepts we offer Formal Concept Analysis [GW96], which formalizes concepts on the basis of formal contexts. For mathematizing the doctrine of judgments and conclusions we use the Theory of Conceptual Graphs [So84],[So98]. Contextual Logic shall primarily be developed as a basis of Conceptual Knowledge Processing as discussed in [Wi94]. First ideas toward a contextual logic were presented in the paper “Restructuring mathematical logic: an approach based on Peirce’s pragmatism” [Wi96a], which proposes Formal Concept Analysis for establishing more connections between the logic-mathematical theory and reality. In the paper “Conceptual Graphs and Formal Concept Analysis” [Wi97b] a first attempt is made at unifying Formal Concept Analysis and the Theory of Conceptual Graphs for the foundation of a contextual logic. This context-based approach, which is philosophically supported by Peirce’s pragmatic epistemology, is grounded on families of related formal contexts whose formal concepts allow a mathematical representation of the concepts and relations of conceptual graphs. Such a representation of a conceptual graph is called a “concept graph” of the context family from which it is derived.
M.-L. Mugnier and M. Chein (Eds.): ICCS’98, LNAI 1453, pp. 194–208, 1998.
© Springer-Verlag Berlin Heidelberg 1998
While the approach to concept graphs in [Wi97b] concentrates on the foundation of a mathematical structure theory, the paper “Simple concept graphs: a logic approach” [Pr98b] presents a logical theory of concept graphs, whose syntax defines concept graphs as syntactical constructs over an alphabet of object names, concept names, and relation names, and whose semantics adapts the mathematical structure theory of [Wi97b] for the interpretation of those syntactical constructs (see also [Pr98a]). Up to now, the developed theory only allows conceptual graphs without nesting to be represented mathematically. Therefore the theory should be extended so that nested conceptual graphs can be treated within Contextual Logic too. In this paper we approach such an extended theory of concept graphs by using Triadic Concept Analysis (see [LW95]). Again, our focus lies on founding the mathematical structure theory, which could later be used for extending the logical theory of concept graphs. In Section 2 we discuss, through triadic contexts of tonal music, how subdivisions of concept graphs can be described using triadic concepts. This discussion motivates the foundation of a theory of “triadic concept graphs” in Section 3. In Section 4, we explain how positive nested conceptual graphs can be mathematically represented by triadic concept graphs. Finally, in Section 5, we sketch ideas of further research toward a comprehensive foundation of Contextual Logic.
2 Triadic Contexts of Tonal Music
The discussion of concept graphs in music theory offered in this section shall provide us with an alternative view of the potential application of concept graphs, which especially suggests allowing not only nestings, but also subdivisions with overlappings. Concept graphs can be used to represent the harmonic analysis of chord sequences. For example, the sequence of major triads C − F − G − D − E − A is analysed in [He49; p. 38] as a diatonic modulation from c major to a major via d major, which can be represented by the concept graph visualized in Figure 1: each small box represents the concept of a chroma and each disc represents the concept of a major triad, which is understood as a ternary relation between tones of specific chroma; the functional meaning of a major triad is determined by the containment of its disc in the large boxes, which represent major keys, respectively. As in the given example, objects and relationships might have different meanings in different surroundings. Therefore we propose a triadic approach for the extended theory of concept graphs which allows the different meanings of the considered relationships to be represented mathematically. To keep this paper (as much as possible) self-contained, we recall the basic notions of Triadic Concept Analysis from [Wi95] and [LW95]. A triadic context is defined as a quadruple (G, M, B, Y) where G, M, and B are sets and Y is a ternary relation between G, M, and B, i.e. Y ⊆ G × M × B; the elements of G, M, and B are called
Fig. 1. Harmonic analysis of a diatonic modulation
(formal) objects, attributes, and modalities, respectively, and (g, m, b) ∈ Y is read: the object g has the attribute m in the modality b. Formal objects, attributes, and modalities may formalize entities in a wide range, but in the triadic context they are understood in the role of the corresponding Peircean category. In particular, the formal modalities as abstract instances of the third category may formalize relations, mediations, representations, interpretations, evidences, evaluations, modalities, meanings, reasons, purposes, conditions etc. If real data are described by a triadic context, the names of the formal objects, attributes, and modalities yield the elementary bridges to reality which are basic for interpretations (cf. [Wi92]). A triadic concept of a triadic context (G, M, B, Y) is defined as a triple (A1, A2, A3) with A1 ⊆ G, A2 ⊆ M, and A3 ⊆ B such that the triple (A1, A2, A3) is maximal with respect to component-wise set inclusion in satisfying A1 × A2 × A3 ⊆ Y, i.e., for X1 ⊆ G, X2 ⊆ M, and X3 ⊆ B with X1 × X2 × X3 ⊆ Y, the containments A1 ⊆ X1, A2 ⊆ X2, and A3 ⊆ X3 always imply (A1, A2, A3) = (X1, X2, X3). If (G, M, B, Y) is described by a three-dimensional cross table, this means that, under suitable permutations of rows, columns, and layers of the cross table, the triadic concept (A1, A2, A3) is represented by a maximal rectangular box full of crosses. For a particular triadic concept c := (A1, A2, A3), the components A1, A2, and A3 are called the extent, the intent, and the modus of c, respectively; they are also denoted by Ext(c), Int(c), and Mod(c). For the description of derivation operators, it is convenient to denote the underlying triadic context alternatively by K := (K1, K2, K3, Y).
For {i, j, k} = {1, 2, 3} with j < k and for X ⊆ Ki and Z ⊆ Kj × Kk, the (i)-derivation operators are defined by
X ↦ X^(i) := {(aj, ak) ∈ Kj × Kk | ai, aj, ak are related by Y for all ai ∈ X},
Z ↦ Z^(i) := {ai ∈ Ki | ai, aj, ak are related by Y for all (aj, ak) ∈ Z}.
It can easily be seen that a triple (A1, A2, A3) with Ai ⊆ Ki for i = 1, 2, 3 is a triadic concept of K if and only if Ai = (Aj × Ak)^(i) for {i, j, k} = {1, 2, 3} with j < k.
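For readers who prefer an executable illustration, the derivation operators and the characterization above can be sketched on a tiny invented triadic context (the context and all names below are our own illustrative assumptions, not data from the paper):

```python
from itertools import product

# A tiny triadic context (G, M, B, Y): Y is a set of (object, attribute, modality)
# triples. Here every triple is incident except (g2, m2, b2).
G = {"g1", "g2"}
M = {"m1", "m2"}
B = {"b1", "b2"}
Y = set(product(G, M, B)) - {("g2", "m2", "b2")}

K = (G, M, B)  # the components K1, K2, K3 of the context

def derive(i, Z):
    """(i)-derivation: for Z a set of pairs over the two components other than i
    (taken in index order j < k), return every element of K_i that is Y-related
    to all pairs in Z."""
    j, k = [n for n in range(3) if n != i]
    def triple(ai, pair):
        parts = {i: ai, j: pair[0], k: pair[1]}
        return (parts[0], parts[1], parts[2])
    return {ai for ai in K[i] if all(triple(ai, z) in Y for z in Z)}

def is_triadic_concept(A):
    """(A1, A2, A3) is a triadic concept iff A_i = (A_j x A_k)^(i) for each i,
    i.e. it is a maximal box of crosses in the 3-dimensional cross table."""
    for i in range(3):
        j, k = [n for n in range(3) if n != i]
        if A[i] != derive(i, set(product(A[j], A[k]))):
            return False
    return True
```

For instance, ({g1}, M, B) and (G, M, {b1}) are triadic concepts of this context, while the full box (G, M, B) is not, since (g2, m2, b2) ∉ Y.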
The set T(K) of all triadic concepts of the triadic context K := (K1, K2, K3, Y) is structured by set inclusion considered in each of the three components of the triadic concepts. For each i ∈ {1, 2, 3}, one obtains a quasiorder ≲i and its corresponding equivalence relation ∼i defined by
(A1, A2, A3) ≲i (B1, B2, B3) :⇔ Ai ⊆ Bi and
(A1, A2, A3) ∼i (B1, B2, B3) :⇔ Ai = Bi (i = 1, 2, 3).
The relational structure T(K) := (T(K), ≲1, ≲2, ≲3) is called the concept trilattice of the triadic context K. As contextual and conceptual basis for concept graphs we define a power context family of triadic contexts K := (K1, . . . , Kn) (n ≥ 2) with Kk := (Gk, Mk, B, Yk) (k = 1, . . . , n) such that Gk ⊆ (G1)^k. How specific contextual logics may be grounded on such power context families shall be demonstrated by a mathematical model for tuning logics of tonal music. Since 1980 the computer organ MUTABOR has been developed within the research project on ”Mathematical Music Theory” at Darmstadt University of Technology (see [GHW85],[MW88],[ARW92],[Wi97a]). The special feature of MUTABOR is its real-time computation of pitches for the keys, which is performed immediately after a key is touched. This makes it possible to play on a regular keyboard with arbitrary pitches from the continuum of audible frequencies in almost arbitrary combinations. The computation of pitches is performed by specific computer programs called “tuning logics”. A challenging problem is to design tuning logics which allow music to be played on MUTABOR in just intonation. For solving this problem, mathematical models of tone systems for just intonation are needed. The most common system for this purpose is the Harmonic Tone System, which is generated by the intervals octave (2 : 1), perfect fifth (3 : 2), and perfect third (5 : 4). An unbounded mathematical model of the Harmonic Tone System is given by the integer set T := {2^p 3^q 5^r | p, q, r ∈ Z}, where Z is the set of all integers (cf. [Wi80]). As we have already seen with our preceding example of a diatonic modulation, a harmonic analysis (necessary for treating the tuning problem) is of triadic nature. Therefore we build from the integer set T a power context family of triadic contexts (Gk, Mk, B, Yk) (k = 1, . . . , 12) by defining Gk := T^k, M1 as the set of all keys, octave ranges, tone letters and chromas (with indices for syntonic commas), Mk as the set of all k-ary harmonies, chordal and harmonic forms, and B as the set of all tonalities, major and minor keys (for precise mathematical definitions of all those musical notions and of the relations Yk we refer to [Wi76],[Wi80],[NW90]). The musicologist M. Vogel has proposed a tuning logic for just intonation (see [Vo75], pp. 343–345) which has been implemented on MUTABOR by using the triadic mathematization of the Harmonic Tone System underlying the described power context family. In discussing Vogel’s tuning logic through an example concerned with the so-called Problem of the Harmony of Second Degree, we demonstrate the use of concept graphs derivable from the power context family of the Harmonic Tone System. Figure 2 shows at the top a concept graph representing a sequence of major and minor triads whose tone concepts are described by MIDI numbers of keys.
Fig. 2. A sequence of major and minor chords transformed into just intonation
Fig. 3. The chord sequence of Figure 2 represented in Euler’s speculum musicum
The question is: which pitches should MUTABOR compute for those keys? The concept graph in the middle of Figure 2 yields a partial answer to this question in showing chroma concepts instead of the key concepts (e.g. the keys 60, 52, and 55 of the c major triad are replaced by the general chromas c, e, and g, respectively; notice that the keys 52 and 55 are in the range of the small octave and that the key 60 is in the range of the one-line octave). At the bottom, Figure 2 shows a subdivided concept graph with precise chroma letters of the Harmonic Tone System which, together with the octave range, uniquely determine the pitch, respectively. How Vogel’s tuning logic reaches those pitches may be understood best with the aid of Euler’s “speculum musicum”, which graphically represents the chroma system of the Harmonic Tone System (cf. [Vo75], p. 102); a detail of the speculum musicum is presented in Figure 3 together with the harmonic analysis of our chord sequence obtained by Vogel’s tuning logic. Mathematically, Euler’s net represents the numbers 3^q 5^r (q, r ∈ Z), where c stands for 3^0 5^0 and 3^q 5^r leads to 3^(q+1) 5^r by one step to the right and to 3^q 5^(r+1) by one step to the top (3^q 5^r is the chroma of the tones 2^p 3^q 5^r with p ∈ Z). Our chord sequence is indicated in Figure 3 by the hatched triangles from right to left. The hexagon on the right represents the tonality c, which comprises all chromas contained in this hexagon. The other similar hexagons represent the tonalities f, a^-1, g^-1, and c^-1. The key idea of Vogel’s tuning logic is to determine the pitch of a touched chord by the (possibly new) tonality which has the same denotation as the chroma of the reference tone of the touched chord with respect to the actual tonality. For our example, this yields a surprising tonality modulation from c to c^-1 via f, a^-1, and g^-1. It is still an open problem in which circumstances Vogel’s solution is adequate and in which it is not.
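The arithmetic of Euler's net and the integer model T of the Harmonic Tone System can be sketched as follows (a minimal illustration under our own tuple encoding; none of the function names come from Vogel's implementation):

```python
from fractions import Fraction

def ratio(p, q, r):
    """Frequency ratio of the tone 2^p 3^q 5^r relative to the reference c = 2^0 3^0 5^0."""
    return Fraction(2) ** p * Fraction(3) ** q * Fraction(5) ** r

def chroma(p, q, r):
    """The chroma of 2^p 3^q 5^r is 3^q 5^r, i.e. the node (q, r) of Euler's net."""
    return (q, r)

def step_right(node):  # multiply by 3: move by a perfect fifth (octaves ignored)
    q, r = node
    return (q + 1, r)

def step_up(node):     # multiply by 5: move by a perfect third (octaves ignored)
    q, r = node
    return (q, r + 1)

# The just major triad on c has the ratios 4 : 5 : 6 within one octave:
c, e, g = ratio(0, 0, 0), ratio(-2, 0, 1), ratio(-1, 1, 0)
```

The syntonic comma distinguishing chromas such as c and c^-1 in the text corresponds to the ratio 81/80, obtained here as ratio(-4, 4, -1).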
Viewing a harmonic analysis as a judgment, which can be mathematized by a concept graph, helps in understanding how concept graphs of power context families of triadic contexts should be defined. Of course, the main purpose of such a definition is to allow an adequate mathematization of nested conceptual
graphs, which are understood as logical abstractions of verbal judgments. But the examples from music show that subdivisions with overlappings should be permitted. To cover such a broad understanding of formal judgments, we define in the next section triadic concept graphs as a generalization of the concept graphs introduced in [Wi97b].
3 Triadic Concept Graphs
For the definition of triadic concept graphs, we first define an abstract concept graph with subdivision as a structure G := (V, E, ν, C, κ, θ, σ) for which
1. V and E are sets and ν is a mapping of E to ⋃_{k=1}^n V^k (2 ≤ n ∈ N) so that (V, E, ν) can be considered as a directed multi-hypergraph with vertices from V and edges from E (we define |e| = k :⇔ ν(e) = (v1, . . . , vk)),
2. C is a set and κ is a mapping of V ∪ E to C such that κ(e1) = κ(e2) always implies |e1| = |e2| (the elements of C may be understood as abstract concepts),
3. θ is an equivalence relation on V,
4. σ is a partial mapping of the vertex set V to the power set of V × C such that, for the partial mappings σ1 and σ2 with σ1(w) := {v ∈ V | (v, c) ∈ σ(w) for some c ∈ C} and σ2(w) := {c ∈ C | (v, c) ∈ σ(w) for some v ∈ V} for w ∈ dom(σ), we have σ1(V) ≠ ∅ ≠ σ2(V) and v ∉ (σ1)^m(v) for all v ∈ V and m ∈ N.
Abstract concept graphs with subdivision are quite common. Figure 4 (taken from [Ka79]) shows examples of such graphs (without edges) representing the lexical fields “Gewässer” and “waters”. Now let K := (K1, . . . , Kn) be a power context family of triadic contexts with Kk := (Gk, Mk, B, Yk) (1 ≤ k ≤ n), and let C_K := ⋃_{k=1}^n T(Kk). Then an abstract concept graph G with subdivision, specified by G := (V, E, ν, C, κ, θ, σ), is called a triadic concept graph over the power context family K if
1. C = C_K,
2. κ(V) ⊆ T(K1),
3. κ(e) ∈ T(Kk) for all e ∈ E with |e| = k,
4. σ2(w) ⊆ G1 for all w ∈ dom(σ),
5. c1 ∈ Ext(c2), c2 ∈ Ext(c3), . . . , cm−1 ∈ Ext(cm) for c1, c2, . . . , cm ∈ σ2(V) imply c1 ≠ cm.
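Condition 4 of the abstract definition (no vertex lies in any iterate (σ1)^m of its own subdivision contents) is essentially an acyclicity requirement and can be checked mechanically. The sketch below uses our own dictionary encoding of σ and is not taken from the paper:

```python
def subdivision_acyclic(V, sigma):
    """Check that v is not in (sigma_1)^m(v) for any v in V and m >= 1, where
    sigma maps some vertices to sets of (vertex, concept) pairs and sigma_1(w)
    collects the vertices occurring in sigma(w)."""
    contents = {w: {v for (v, _c) in pairs} for w, pairs in sigma.items()}

    def reachable(w):
        # all vertices reachable from w via repeated subdivision containment
        seen, stack = set(), list(contents.get(w, ()))
        while stack:
            v = stack.pop()
            if v not in seen:
                seen.add(v)
                stack.extend(contents.get(v, ()))
        return seen

    return all(v not in reachable(v) for v in V)
```

For example, a subdivision vertex w containing v1 and v2 is admissible, while two vertices each nested into the other are not.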
A realization of such a triadic concept graph G over K is defined to be a mapping ρ of V to the power set of G1 satisfying, for all v ∈ V, w ∈ dom(σ), and e ∈ E with ν(e) = (v1, . . . , vk):
1. ∅ ≠ ρ(v) ⊆ Ext(κ(v)) if v ∉ σ1(w̃) for all w̃ ∈ dom(σ),
2. ρ(v1) × · · · × ρ(vk) ⊆ Ext(κ(e)) if {v1, . . . , vk} ⊈ σ1(w̃) for all w̃ ∈ dom(σ),
3. v1 θ v2 always implies ρ(v1) = ρ(v2),
4. ∅ ≠ ρ(v) ⊆ (Int(κ(v)) × Mod(c))^(1) if (v, c) ∈ σ(w),
Fig. 4. Abstract concept graphs with subdivision for two lexical fields
5. ρ(v1) × · · · × ρ(vk) ⊆ (Int(κ(e)) × Mod(c))^(1) if (v1, c), . . . , (vk, c) ∈ σ(w),
6. σ2(w) ⊆ ρ(w).
The pair G := (G, ρ) is called a realized triadic concept graph of the power context family K or, shortly, a triadic concept graph of K. In discussing the definition of triadic concept graphs, let us first remark that the triadic concept graphs of the power context family K with only one modality and with dom(σ) = ∅ can be considered as the concept graphs of K introduced in [Wi97b]; in this way, triadic concept graphs can be understood as a generalization of (dyadic) concept graphs. The basic idea of triadic concept graphs is to describe the parts of a subdivision by specific triadic concepts which are also elements of the object set G1 and are therefore called “general objects”. Condition 5 of the definition of triadic concept graphs over a power context family guarantees (under a suitable finite chain condition) that the general objects can be determined inductively. Conditions 4 and 5 of the definition of realizations explicate the idea that the general object c ∈ σ2(w) (w ∈ dom(σ)) represents the part of the subdivision that is constituted by the vertex set {v ∈ V | (v, c) ∈ σ(w)}; for a vertex v and an edge e in that part, the extents of κ(v) and κ(e) are modified to the extents (Int(κ(v)) × Mod(c))^(1) and (Int(κ(e)) × Mod(c))^(1) to respect
the modalities of c, while the intents of κ(v) and κ(e) keep the identity of the triadic concepts κ(v) and κ(e), respectively, through changing modalities.
4 Contextual Semantics of Nested Conceptual Graphs
In this section we show how a nested conceptual graph can be represented mathematically by a triadic concept graph. As an example we choose the nested conceptual graph discussed in [CM97]. This conceptual graph (with added individual markers) is shown in Figure 5. To obtain the representing triadic concept graph, we first derive the power context family K := (K1, K2) which represents the knowledge coded in the nested conceptual graph; K1 := (G1, M1, B, Y1) and K2 := (G2, M2, B, Y2) are described in Figure 6 by cross tables in which columns without crosses are omitted. The names of the formal objects, attributes, and modalities head the rows, the columns, and the vertical sections of the tables, respectively. The general objects u, a, b, and c, which represent the large boxes, are triadic concepts of K1 generated by their corresponding attributes and modalities u, a, b, and c, i.e.
u := ({Peter, #1, a}, {u}, {u}),
a := ({#2, b}, {a}, {a}),
b := ({#3, #4, c}, {b}, {b}),
c := ({#4, #5, #6, #7, #8}, {c}, {c}).
The concepts and the binary relations of the nested conceptual graph are represented by triadic concepts of K1 and K2 according to the following list:
Person ↔ ({Peter, #5, #7}, {Person}, {u, c}),
Think ↔ ({#1}, {Think, u}, {u}),
Painting ↔ ({a}, {Painting, u}, {u}),
Bucolic ↔ ({#2}, {Bucolic, a}, {a}),
Scene ↔ ({b}, {Scene, a}, {a}),
Boat ↔ ({#3}, {Boat, b}, {b}),
Lake ↔ ({#4}, {Lake, b, c}, {b, c}),
Couple ↔ ({c}, {Couple, b}, {b}),
Fish ↔ ({#6}, {Fish, c}, {c}),
Sleep ↔ ({#8}, {Sleep, c}, {c}),
agent ↔ ({(#1, Peter), (#6, #5), (#8, #7)}, {agent}, {u, c}),
object ↔ ({(#1, a)}, {object, u}, {u}),
attribute ↔ ({(b, #2)}, {attribute, a}, {a}),
in ↔ ({(#3, c), (#6, #4)}, {location, in}, {b, c}),
on ↔ ({(#4, #3)}, {location, on}, {b}).
All triadic concepts of K1 and their triadic relationships can be read off from the triadic diagram of the concept trilattice T(K1) depicted in Figure 7; the line diagram on the right, visualizing the containment of the extents, represents the hierarchy of concept types indicated by the conceptual graph in Figure 5 (see [LW95],[Bi97] for reading conventions of triadic diagrams). The listed triadic concepts form a triadic concept graph over K, and the assignment of individual markers to concept types in Figure 5 yields a realization and hence a triadic concept graph of K which mathematically represents the nested conceptual graph of Figure 5. The example makes clear that a triadic approach is necessary if the concepts and relations of nested conceptual graphs are to be represented by formal concepts. A dyadic representation is not flexible enough to capture precisely the knowledge coded in a nested conceptual graph.
Fig. 5. Example of a positive nested conceptual graph
The triadic approach presented in this section can be applied to positive nested conceptual graphs as they are mathematized in [CM96] and [CM97] by labelled bipartite graphs G := (R, C, E, l) for which R and C are disjoint sets of vertices labelled via l by relation types and concept types, respectively, and E is a set of edges between R and C labelled in such a way that adjacent edges of
Fig. 6. A power context family derived from the conceptual graph in Figure 5
Fig. 7. A triadic diagram of the concept trilattice of K1 in Figure 6
a vertex of R receive the consecutive numbers 1, 2, . . .; further labels of a vertex of C may be individual markers and also descriptions of subgraphs of G which can be considered to be nested into that vertex. Clearly, the graph G := (R, C, E, l) without its individual markers may be understood as an abstract concept graph G with subdivision. For establishing the power context family K := (K1, . . . , Kn) with Kk := (Gk, Mk, B, Yk) (k = 1, . . . , n) which can be associated with G, we complete the individual markers of G so that each of its concept nodes carries at least one individual marker; then we build, with those markers and with the descriptions of the distinguished subgraphs, the set G̃1 (which will later be changed to the set G1 by replacing the subgraph descriptions by suitable general objects). The subgraph descriptions form the modality set B and shall also be elements of the attribute sets Mk; the other elements of Mk are the relation types assigned to a relation vertex with k adjacent concept vertices and, in case
of k = 1, also the assigned concept types. Ỹ1 is then taken as the smallest ternary relation satisfying the following conditions:
1. (g, a, a) ∈ Ỹ1 for all subgraph descriptions a and for all individual markers and subgraph descriptions g assigned within the subgraph described by a,
2. the triadic context K̃1 := (G̃1, M1, B, Ỹ1) permits a mapping ρ of V to the power set of G̃1 for a suitably chosen κ : C → B(K̃1) satisfying conditions 1 and 4 of the definition of a realization in Section 3.
For each subgraph description a, a triadic concept a can be defined by
a := (({a} × {a})^(1), (({a} × {a})^(1) × {a})^(2), (({a} × {a})^(1) × (({a} × {a})^(1) × {a})^(2))^(3))
starting an inductive procedure from the innermost subgraphs and the triadic context (G̃1, M1, B, Ỹ1) and, after each step, replacing in the object set of the current context a subgraph description by the corresponding triadic concept. Finally we obtain the triadic context K1 := (G1, M1, B, Y1) whose object set G1 contains no more subgraph descriptions but, as general objects, the triadic concepts defined for the subgraph descriptions. The other triadic contexts of the power context family K may be defined similarly to K1, so that the abstract concept graph G can be concretized to a triadic concept graph of K which mathematically represents the positive nested conceptual graph described by the labelled bipartite graph G := (R, C, E, l).
5 Further Research
The theory of triadic concept graphs begun here shall be elaborated toward a comprehensive theory of formal judgments and conclusions as an essential part of Contextual Logic. This will be pursued in two directions: the elaboration of a mathematical structure theory which treats triadic concept graphs as realizations of abstract concept graphs with subdivision, and the development of a logic theory which understands triadic concept graphs within contextual models of a syntactical language. For the mathematical structure theory, a challenging problem is to find suitable structures and representations which combine concept lattices and concept graphs effectively and communicatively. For the logic theory, the first task consists in extending the existing theory for simple concept graphs (see [Pr98a],[Pr98b]) to a theory of syntax and semantics for triadic concept graphs which adapts the mathematical structure theory begun here. Then, of course, a main goal is to widen the expressiveness of the developed theory by activating larger parts of predicate logic which are still decidable and allow effective algorithms. The integration of modal logics should also be tackled. Overall, the pragmatic meaning of the theoretical developments has to be continuously reflected on and examined on the basis of concrete applications.
References
1. V. Abel, P. Reiss, R. Wille: MUTABOR II – Ein computergesteuertes Musikinstrument zum Experimentieren mit Stimmungslogiken und Mikrotönen. FB4-Preprint Nr. 1513, TU Darmstadt 1992.
2. K. Biedermann: How triadic diagrams represent triadic structures. In: D. Lukose, H. Delugach, M. Keeler, L. Searle, J. Sowa (eds.): Conceptual Structures: Fulfilling Peirce's Dream. Springer, Berlin–Heidelberg–New York 1997, 304–317.
3. M. Chein, M.-L. Mugnier: Représenter des connaissances et raisonner avec des graphes. Revue d'intelligence artificielle 10(1) (1996), 7–56.
4. M. Chein, M.-L. Mugnier: Positive nested conceptual graphs. In: D. Lukose, H. Delugach, M. Keeler, L. Searle, J. Sowa (eds.): Conceptual Structures: Fulfilling Peirce's Dream. Springer, Berlin–Heidelberg–New York 1997, 95–109.
5. B. Ganter, H. Hempel, R. Wille: MUTABOR – Ein rechnergesteuertes Musikinstrument zur Untersuchung von Stimmungen. ACUSTICA 57 (1985), 284–289.
6. B. Ganter, R. Wille: Formale Begriffsanalyse: Mathematische Grundlagen. Springer, Berlin–Heidelberg 1996. (English translation to appear)
7. R. Hernried: Systematische Modulation. de Gruyter, Berlin 1949.
8. I. Kant: Logic. Dover, New York 1988.
9. G. L. Karcher: Kontrastive Untersuchung von Wortfeldern im Deutschen und Englischen. Peter Lang, Frankfurt 1979.
10. F. Lehmann, R. Wille: A triadic approach to formal concept analysis. In: G. Ellis, R. Levinson, W. Rich, J. F. Sowa (eds.): Conceptual Structures: Applications, Implementations and Theory. Springer, Berlin–Heidelberg–New York 1995, 32–43.
11. C. Misch, R. Wille: Eine Programmiersprache für MUTABOR. In: F. Richter Herf (ed.): Mikrotöne II. Edition Helbing, Innsbruck 1988, 87–94.
12. W. Neumaier, R. Wille: Extensionale Standardsprache der Musiktheorie – eine Schnittstelle zwischen Musik und Informatik. In: H.-P. Hesse (ed.): Mikrotöne III. Edition Helbing, Innsbruck 1990, 149–167.
13. S. Prediger: Einfache Begriffsgraphen: Syntax und Semantik. FB4-Preprint Nr. 1962, TU Darmstadt 1998.
14. S. Prediger: Simple concept graphs: a logic approach. FB4-Preprint, TU Darmstadt 1998 (this volume).
15. M. Vogel: Die Lehre von den Tonbeziehungen. Verlag für systematische Musikwissenschaft, Bonn-Bad Godesberg 1975.
16. J. F. Sowa: Conceptual structures: information processing in mind and machine. Addison-Wesley, Reading 1984.
17. J. F. Sowa: Knowledge representation: logical, philosophical, and computational foundations. PWS Publishing Co., Boston (to appear).
18. R. Wille: Mathematik und Musiktheorie. In: G. Schnitzler (ed.): Musik und Zahl. Verlag für systematische Musikwissenschaft, Bonn-Bad Godesberg 1976, 233–264.
19. R. Wille: Mathematische Sprache in der Musiktheorie. Jahrbuch Überblicke Mathematik 1980. Bibl. Institut, Mannheim 1980, 167–184.
20. R. Wille: Begriffliche Datensysteme als Werkzeug der Wissenskommunikation. In: H. H. Zimmermann, H.-D. Luckhardt, A. Schulz (eds.): Mensch und Maschine – Informationelle Schnittstellen der Kommunikation. Universitätsverlag Konstanz, Konstanz 1992, 63–73.
21. R. Wille: Plädoyer für eine philosophische Grundlegung der Begrifflichen Wissensverarbeitung. In: R. Wille, M. Zickwolff (eds.): Begriffliche Wissensverarbeitung – Grundfragen und Methoden. B.I.-Wissenschaftsverlag, Mannheim 1994, 11–25.
208
R. Wille
22. R. Wille: The basic theorem of triadic concept analysis. Order 12 (1995), 149–158. 23. R. Wille: Restructuring mathematical logic: an approach based on Peirce’s pragmatism. In: A. Ursini, P. Agliano (eds.): Logic and Algebra. Marcel Dekker, New York 1996, 267–281. 24. R. Wille: Conceptual structures of multicontexts. In: P. W. Eklund, G. Ellis, G. Mann (eds.): Conceptual Structures: Knowledge Representation as Interlingua. Springer, Berlin-Heidelberg-New York 1996, 23–39. 25. R. Wille: MUTABOR - ein Medium f¨ ur musikalische Erfahrungen. In: M. Warnke, W. Coy, G. C. Tholen (eds.): Hyperkult: Geschichte, Theorie und Kontext digitaler Medien. Stroemfeld Verlag, Basel 1997, 383–391. 26. R. Wille: Conceptual Graphs and Formal Concept Analysis. In: D. Lukose, H. Delugach, M. Keeler, L. Searle, J. Sowa (eds.): Conceptual Structures: Fulfilling Peirce’s Dream. Springer, Berlin-Heidelberg-New York 1997, 290–303.
Powerset Trilattices

K. Biedermann

Technische Universität Darmstadt, Fachbereich Mathematik
Schloßgartenstr. 7, D–64289 Darmstadt, [email protected]
Abstract. The Boolean lattices are fundamental algebraic structures in Lattice Theory and Mathematical Logic. Since the triadic approach to Formal Concept Analysis gave rise to the triadic generalization of lattices, the trilattices, it is natural to ask for the triadic analogue of Boolean lattices, the Boolean trilattices. The first step in establishing Boolean trilattices is the study of powerset trilattices, which are the triadic generalization of powerset lattices. In this paper, an order-theoretic characterization of the powerset trilattices as certain B-trilattices is presented. In particular, the finite B-trilattices are (up to isomorphism) just the finite powerset trilattices; they have 3^n elements. Further topics are the triadic de Morgan laws, the cycles of triadic complements, which provide the triadic complementation, and the atomic cycles, which take over the role of the atoms in the theory of Boolean lattices.
1 Introduction
The idea of a formalization of traditional philosophical logic, which is based on concepts, judgements, and conclusions as the three essential main functions of thinking, led to a unification of the Theory of Conceptual Graphs and the Theory of Formal Concept Analysis (cf. [Wi97]). It has turned out that for simple conceptual graphs without nesting the dyadic setting of Formal Concept Analysis is appropriate, but for conceptual graphs with nesting the triadic approach to Formal Concept Analysis, Triadic Concept Analysis, is required. This is elaborated in the paper "Triadic Concept Graphs" by R. Wille in this volume. The triadic approach to Formal Concept Analysis, on the other hand, gave rise to a new class of algebraic structures, the so-called trilattices (cf. [Wi95] and [Bi98]), which are the triadic generalization of lattices. Since Boolean lattices are fundamental algebraic structures in Lattice Theory and Mathematical Logic, it is natural to ask for the triadic analogue of Boolean lattices, the Boolean trilattices, which play a similar role in the triadic case as Boolean lattices in the dyadic case. The first step in establishing Boolean trilattices, done in this paper, is the study of powerset trilattices as the triadic generalization of powerset lattices.

In the following section, the basic notions and results about trilattices are given. Some first properties of powerset trilattices will be presented in the third section. Most of them correspond to familiar results concerning powersets. In the fourth section, B-trilattices are defined as special (triadically) complemented trilattices, and it can be shown that, as for Boolean lattices, the triadic complementation in B-trilattices is unique. The atomic cycles, which correspond to the atoms in the theory of Boolean lattices, enable us to give a first characterization of the powerset trilattices as certain B-trilattices. In particular, the finite B-trilattices turn out to be the finite powerset trilattices, just as the finite Boolean lattices are (up to isomorphism) the finite powerset lattices.

Before motivating powerset trilattices, we must introduce trilattices as special triordered sets – just as lattices are special ordered sets. More about Formal Concept Analysis and Lattice Theory can be found in [GW96] and [DP90]. For basic results about Boolean lattices, the reader is for example referred to the article by R. W. Quackenbush in [Gr79] and to [MB89].

M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 209–221, 1998. © Springer-Verlag Berlin Heidelberg 1998
2 Trilattices
A triordered set is a relational structure (P, ≲1, ≲2, ≲3) for which the relations ≲i are quasiorders with ∼i := ≲i ∩ ≳i for i = 1, 2, 3 such that the following conditions hold for all x, y ∈ P and {i, j, k} = {1, 2, 3}: (1) x ∼i y, x ∼j y implies x = y (uniqueness condition), (2) x ≲i y, x ≲j y implies x ≳k y (antiordinality). A triordered set P := (P, ≲1, ≲2, ≲3) gives rise to the ordered sets (Pi, ≤i) with i = 1, 2, 3, where Pi := {[x]i | x ∈ P} with [x]i := {y ∈ P | x ∼i y} and [x1]i ≤i [x2]i :⇔ x1 ≲i x2. They will be called the ordered structures of the triordered set P. Triordered sets are represented by so-called triadic diagrams as depicted in Fig. 1. The elements of P correspond to the little circles in the interior triangular net, showing the so-called geometric structure (P, ∼1, ∼2, ∼3). Following the parallel lines to one of the three side diagrams, which represent the ordered structures (Pi, ≤i), i = 1, 2, 3, yields the corresponding classes of i-equivalent elements. So, from the three ordered structures, one can read off whether two elements are comparable with respect to some quasiorder, while from the interior net one can see whether two elements are i-equivalent to one another. More about triadic diagrams can be found in [LW95] and [Bi97]. A mapping ϕ : P → Q between triordered sets P := (P, ≲1, ≲2, ≲3) and Q := (Q, ≲1, ≲2, ≲3) is i-isotone for i ∈ {1, 2, 3} if x ≲i y ⇒ ϕ(x) ≲i ϕ(y) holds for all x, y ∈ P. If these and also the converse implications x ≲i y ⇐ ϕ(x) ≲i ϕ(y) are satisfied for all i ∈ {1, 2, 3}, then ϕ is a triorder-embedding, which is automatically injective (use the uniqueness condition). A triorder-isomorphism is a surjective triorder-embedding. According to the order-theoretic approach, trilattices are special triordered sets in which certain operations, the so-called ik-joins, exist.
These operations correspond to meet and join in lattice theory and are defined in the following way: Let (P, ≲1, ≲2, ≲3) be a triordered set, let X and Y be subsets of P, and let {i, j, k} = {1, 2, 3}. An element b ∈ P is called an ik-bound of (X, Y) if x ≲i b for
all x ∈ X and y ≲k b for all y ∈ Y. An ik-bound l of (X, Y) is called an ik-limit of (X, Y) if l ≳j b for all ik-bounds b of (X, Y). An ik-limit l̃ of (X, Y) is called the ik-join of (X, Y) and denoted by X∇ik Y if l̃ ≲k l for all ik-limits l of (X, Y). It is easy to see that there is at most one such ik-limit of (X, Y). A trilattice is a triordered set L := (L, ≲1, ≲2, ≲3) in which all the ik-joins {x1, x2}∇ik{y1, y2} exist, where x1, x2, y1, y2 ∈ L and {i, j, k} = {1, 2, 3}. If X = {x} and Y = {y} we will shortly write x∇ik y for {x}∇ik{y}. A triordered set (L, ≲1, ≲2, ≲3) is called a complete trilattice if, for arbitrary sets X, Y ⊆ L, all the ik-joins X∇ik Y exist. In particular, a complete trilattice is bounded by Oj := L∇ik L = L∇ki L where {i, j, k} = {1, 2, 3}. For a bounded trilattice L we define 0i := [Oi]i and 1i := [Oj]i = [Ok]i where {i, j, k} = {1, 2, 3}, and also the boundary of L by b(L) := {x ∈ L | x ∼i Oi for some i ∈ {1, 2, 3}}. The best way to motivate in which sense the powerset trilattices are the triadic generalization of powerset lattices is to present them within the framework of Triadic Concept Analysis. So we briefly recall some basic definitions and results, which can be found in [Wi95] and [WZ98]. For an immediate study of the powerset trilattices, the reader is directly referred to the first definition in the following section. A triadic context is a quadruple K := (K1, K2, K3, Y) consisting of a set K1 of objects, a set K2 of attributes, a set K3 of conditions and a ternary relation Y ⊆ K1 × K2 × K3 indicating under which condition b an object g has an attribute m, in symbols (g, m, b) ∈ Y. For two subsets X1 ⊆ K1 and X2 ⊆ K2 a derivation operator can be defined in the following way: ⟨X1, X2⟩^(3) := {a3 ∈ K3 | (x1, x2, a3) ∈ Y for all x1 ∈ X1 and x2 ∈ X2}, and similarly for ⟨X1, X3⟩^(2) and ⟨X2, X3⟩^(1).
A triple A := (A1, A2, A3) ∈ P(K1) × P(K2) × P(K3) is said to be a triadic concept of K if Aj = ⟨Ai, Ak⟩^(j) for all {i, j, k} = {1, 2, 3} with i < k. It follows that the triadic concepts (A1, A2, A3) of K are precisely those triples in P(K1) × P(K2) × P(K3) which are maximal with respect to the component-wise set inclusion. With respect to the triadic context K, which can be understood as a three-dimensional cross table, a triadic concept can be understood as a maximal, not necessarily connected box which is completely filled with entries of the relation Y. The set T(K) of all triadic concepts of K can be endowed with three quasiorders ≲i, i ∈ {1, 2, 3}, by (A1, A2, A3) ≲i (B1, B2, B3) :⇔ Ai ⊆ Bi. It is easy to see that (T(K), ≲1, ≲2, ≲3) is a triordered set. From the Basic Theorem of Triadic Concept Analysis it even follows that (T(K), ≲1, ≲2, ≲3) is a complete trilattice, which is called the concept trilattice of K. All ik-joins can be described analogously to the following 13-join for subsets 𝒳, 𝒴 ⊆ T(K):

𝒳∇13 𝒴 = (⟨⟨X, Y⟩^(2), Y⟩^(1), ⟨X, Y⟩^(2), ⟨⟨⟨X, Y⟩^(2), Y⟩^(1), ⟨X, Y⟩^(2)⟩^(3)),

where X := ∪{X1 | (X1, X2, X3) ∈ 𝒳} and Y := ∪{Y3 | (Y1, Y2, Y3) ∈ 𝒴}.
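The derivation operators and the triadic concepts are finite enough to compute by brute force. The following sketch (Python; all function names are ours, not the paper's) computes the triadic concepts of the triadic powerset context K_M = (M, M, M, M³ \ {(a,a,a)}) of Section 3 for |M| = 2, and confirms the characterization from [Wi95] quoted there:

```python
from itertools import product

def subsets(M):
    """All subsets of M as frozensets."""
    M = sorted(M)
    for mask in range(1 << len(M)):
        yield frozenset(m for i, m in enumerate(M) if (mask >> i) & 1)

# The triadic powerset context K_M = (M, M, M, Y_M) from Section 3.
M = frozenset({1, 2})
Y = {t for t in product(M, repeat=3) if len(set(t)) > 1}  # M^3 minus the diagonal

def derive(A, i, k, j):
    """<A_i, A_k>^(j): all a such that the triple with components x_i, x_k, a
    placed at positions i, k, j lies in Y for every pair (x_i, x_k)."""
    result = set()
    for a in M:
        t = [None, None, None]
        t[j - 1] = a
        ok = True
        for xi, xk in product(A[i - 1], A[k - 1]):
            t[i - 1], t[k - 1] = xi, xk
            if tuple(t) not in Y:
                ok = False
                break
        if ok:
            result.add(a)
    return frozenset(result)

# Triadic concepts: A_j = <A_i, A_k>^(j) for all {i,j,k} = {1,2,3} with i < k.
concepts = [
    A for A in product(subsets(M), repeat=3)
    if derive(A, 2, 3, 1) == A[0]
    and derive(A, 1, 3, 2) == A[1]
    and derive(A, 1, 2, 3) == A[2]
]
print(len(concepts))  # 9 = 3^|M|, cf. Proposition 3.3
```

Every concept found this way satisfies A1 ∩ A2 ∩ A3 = ∅ and Ai ∪ Aj = M for i ≠ j, exactly the conditions used to define the powertriset below.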
3 Powerset Trilattices
According to the ordinary (monadic) understanding, the powerset P(M) of a given set M consists of all subsets of M and is naturally ordered by set inclusion. We will call (P(M), ⊆) the powerset lattice of M. Within the dyadic setting of Formal Concept Analysis, the powerset P(M) is represented by the (dyadic) powerset context K_M := (M, M, ≠) and its complete concept lattice B(K_M) := (B(K_M), ≤), which has as dyadic concepts the pairs (X1, X2) with X1 ∪ X2 = M and X1 ∩ X2 = ∅ or, equivalently, X1 = X2^C, and which is ordered by (X1, X2) ≤ (Y1, Y2) :⇔ X1 ⊆ Y1 (⇔ X2 ⊇ Y2). Consequently, one component already determines the other, and (P(M), ⊆) is obviously isomorphic to B(K_M). Structurally, the monadic and the dyadic understanding of a powerset are closely linked. In Triadic Concept Analysis, the idea of a powerset is represented by the triadic powerset context K_M := (M, M, M, Y_M) with Y_M := M^3 \ {(a, a, a) | a ∈ M} and its complete concept trilattice T(K_M). In [Wi95], it has already been observed that the triadic concepts (X1, X2, X3) of K_M are characterized by X1 ∩ X2 ∩ X3 = ∅ and Xi ∪ Xj = M for distinct i, j ∈ {1, 2, 3}. We use these conditions to define the triadic generalization of powersets.

Definition 3.1: The powertriset of a set M is defined as T(M) := {(X1, X2, X3) ∈ P(M)^3 | X1 ∩ X2 ∩ X3 = ∅ and Xi ∪ Xj = M for i ≠ j ∈ {1, 2, 3}}. It can be equipped with three quasiorders by (X1, X2, X3) ≲i (Y1, Y2, Y3) :⇔ Xi ⊆ Yi where i = 1, 2, 3. The powerset trilattice is defined as T(M) := (T(M), ≲1, ≲2, ≲3) and denoted by T(n) if M is finite with |M| = n.

Fig. 1 shows the powerset trilattice T(4). The triple represented by a little circle in the triangular net can be obtained by following straight lines to circles in the three side diagrams. There the numbers labelled "below" these circles are collected, where "below" in a side diagram means the direction opposite to that indicated by the arrow.
Since obviously T(M ) = T(KM ), powerset trilattices are complete trilattices, which will also follow from Proposition 3.2 and Proposition 3.5. Moreover, the ordered structures (T(M )i , ≤i ), i ∈ {1, 2, 3}, are isomorphic to (P(M ), ⊆) (cf. Proposition 3.4). Note the three bounding elements O1 = (∅, M, M ), O2 = (M, ∅, M ), and O3 = (M, M, ∅). Note also that a triple (X1 , X2 , X3 ) ∈ T(M ) is not determined by one of its components.
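Definition 3.1 can be checked directly by brute force. The following Python sketch (function names are ours) enumerates T(M) for a small M straight from the two defining conditions:

```python
from itertools import product

def subsets(M):
    """All subsets of M as frozensets."""
    M = sorted(M)
    for mask in range(1 << len(M)):
        yield frozenset(m for i, m in enumerate(M) if (mask >> i) & 1)

def powertriset(M):
    """T(M) of Definition 3.1: X1 ∩ X2 ∩ X3 = ∅ and Xi ∪ Xj = M for i ≠ j."""
    M = frozenset(M)
    return [
        (X1, X2, X3)
        for X1, X2, X3 in product(subsets(M), repeat=3)
        if not (X1 & X2 & X3)
        and X1 | X2 == M and X1 | X3 == M and X2 | X3 == M
    ]

T2 = powertriset({1, 2})
print(len(T2))  # 9, in accordance with Proposition 3.3 below

# The bounding element O1 = (∅, M, M) is among them:
O1 = (frozenset(), frozenset({1, 2}), frozenset({1, 2}))
print(O1 in T2)  # True
```

The quasiorders ≲i are then simply set inclusion in the i-th coordinate, so all of the finite statements below can be tested on such an enumeration.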
Fig. 1. The powerset trilattice T(4) with the cycle c_X of complements of X = ({1, 2}, {1, 3, 4}, {2, 3, 4}).
Proposition 3.2: Let (X1, X2, X3) ∈ P(M)^3 and define the symmetric difference of subsets Xi, Xj ⊆ M by Xi∆Xj := (Xi \ Xj) ∪ (Xj \ Xi). Then (X1, X2, X3) ∈ T(M) if and only if Xk = (Xi ∩ Xj)^C = Xi∆Xj for all {i, j, k} = {1, 2, 3}.

Proof: If (X1, X2, X3) ∈ T(M) and {i, j, k} = {1, 2, 3}, then X1 ∩ X2 ∩ X3 = ∅ is equivalent to Xk = (Xi ∩ Xj)^C. Moreover, Xi ∪ Xj = M implies (Xi ∩ Xj)^C = (Xi \ Xj) ∪ (Xj \ Xi). Conversely, we obtain Xi ∪ Xk = Xi ∪ (Xi ∩ Xj)^C = M and also X1 ∩ X2 ∩ X3 = ∅. □

So a triple in T(M) is determined by two of its components. It immediately follows that T(M) satisfies the uniqueness condition and the condition of antiordinality and is therefore a triordered set. Next we consider the finite case:

Proposition 3.3: If M is finite with |M| = n, then |T(M)| = 3^n.

Proof: Assume X ⊆ M with |X| = k ≤ n as the first component of a triple in T(M). As the second component Y ⊆ M with X ∪ Y = M, we can choose |P(X)| = 2^k such sets to obtain the triple (X, Y, X∆Y) ∈ T(M). Altogether, there are ∑_{k=0}^{n} (n choose k) · 2^k · 1^{n−k} = (2 + 1)^n = 3^n triples in T(M). □
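The counting argument in the proof of Proposition 3.3 can be replayed mechanically: choose X, choose any Y ⊇ M \ X, and let Proposition 3.2 force the third component. A sketch (names ours):

```python
def subsets(M):
    """All subsets of M as frozensets."""
    M = sorted(M)
    for mask in range(1 << len(M)):
        yield frozenset(m for i, m in enumerate(M) if (mask >> i) & 1)

def triples(M):
    """Build T(M) as in the proof of Prop. 3.3: for each X there are 2^|X|
    admissible second components Y = (M \\ X) ∪ S with S ⊆ X, and the third
    component is forced to be X∆Y by Prop. 3.2."""
    M = frozenset(M)
    return [
        (X, (M - X) | S, X ^ ((M - X) | S))  # ^ on frozensets is ∆
        for X in subsets(M)
        for S in subsets(X)
    ]

M = frozenset({1, 2, 3})
T = triples(M)
print(len(T))  # 3^3 = 27

# Prop. 3.2: the third component is also the complement of X1 ∩ X2.
print(all(Z == M - (X & Y) for X, Y, Z in T))  # True
```

Since distinct choices of (X, S) give distinct pairs (X, Y), the 27 triples are pairwise distinct, mirroring the binomial sum in the proof.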
From the preceding proof it also becomes clear that, in general, any subset X ⊆ M can be enlarged to a triple in T(M) and can therefore occur in any of the three components. If we identify [(X1, X2, X3)]i with Xi for all (X1, X2, X3) ∈ T(M) and i ∈ {1, 2, 3}, it immediately follows:

Proposition 3.4: The ordered structures (T(M)i, ≤i), i ∈ {1, 2, 3}, of powerset trilattices are isomorphic to (P(M), ⊆).

The ik-joins in T(M) can be determined explicitly, which makes the triordered set T(M) a complete trilattice:

Proposition 3.5: Let 𝒳, 𝒴 ⊆ T(M) be arbitrary subsets. Define X := ∪{Xi | (X1, X2, X3) ∈ 𝒳} and Y := ∪{Yk | (Y1, Y2, Y3) ∈ 𝒴} for {i, j, k} = {1, 2, 3}. Then 𝒳∇ik 𝒴 = (Z1, Z2, Z3) where

Zi = X ∪ Y^C, Zj = (X ∩ Y)^C, Zk = Y.

In particular, the 13-join of ((X1, X2, X3), (Y1, Y2, Y3)) ∈ T(M)^2 is determined by (X1, X2, X3) ∇13 (Y1, Y2, Y3) = (X1 ∪ Y3^C, (X1 ∩ Y3)^C, Y3).

Proof: We first ensure that Z := (Z1, Z2, Z3) ∈ T(M). Obviously, Zi ∪ Zk = M and, since Zj = X^C ∪ Y^C, we also have Zi ∪ Zj = M and Zj ∪ Zk = M. Intersecting the three components yields Zi ∩ Zj ∩ Zk = (X ∪ Y^C) ∩ (X^C ∪ Y^C) ∩ Y = ((X ∩ X^C) ∪ Y^C) ∩ Y = Y^C ∩ Y = ∅ and hence Z ∈ T(M). But Z is also an ik-bound of (𝒳, 𝒴) because (X1, X2, X3) ≲i Z for all (X1, X2, X3) ∈ 𝒳 and (Y1, Y2, Y3) ≲k Z for all (Y1, Y2, Y3) ∈ 𝒴. Among the triples in T(M) having X in their i-th and Y in their k-th component, Z has the greatest possible j-th component Zj = (X ∩ Y)^C such that Z1 ∩ Z2 ∩ Z3 = ∅ is still satisfied. Therefore Z is also an ik-limit of (𝒳, 𝒴). Since Z also has the smallest possible k-th component, it is the ik-join of (𝒳, 𝒴). □

Many equations valid in all powerset trilattices can now be deduced, but we will restrict ourselves to the triadic de Morgan laws after introducing the triadic complementation.
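The closed form of Proposition 3.5 is easy to sanity-check for two elements. A sketch (names ours), using the element X of Fig. 1 together with a second, hand-picked element Y of T({1, 2, 3, 4}):

```python
M = frozenset({1, 2, 3, 4})

def in_T(A):
    """Membership test for T(M), straight from Definition 3.1."""
    X1, X2, X3 = A
    return (not (X1 & X2 & X3)
            and X1 | X2 == M and X1 | X3 == M and X2 | X3 == M)

def join13(X, Y):
    """(X1,X2,X3) ∇13 (Y1,Y2,Y3) = (X1 ∪ Y3^C, (X1 ∩ Y3)^C, Y3)  (Prop. 3.5)."""
    X1, Y3 = X[0], Y[2]
    return (X1 | (M - Y3), M - (X1 & Y3), Y3)

X = (frozenset({1, 2}), frozenset({1, 3, 4}), frozenset({2, 3, 4}))
Y = (frozenset({3}), frozenset({1, 2, 4}), frozenset({1, 2, 3, 4}))
assert in_T(X) and in_T(Y)

Z = join13(X, Y)
assert Z == (frozenset({1, 2}), frozenset({3, 4}), frozenset({1, 2, 3, 4}))
assert in_T(Z)       # the join stays in T(M)
assert X[0] <= Z[0]  # X ≲1 Z ...
assert Y[2] <= Z[2]  # ... and Y ≲3 Z, so Z is a 13-bound of (X, Y)
```

The first component of the join collects X1 together with the complement of Y3, the second is forced by Proposition 3.2, and the third is taken from Y unchanged.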
Definition 3.6: For an element X := (X1, X2, X3) ∈ T(M) and a permutation σ ∈ S3, the σ-complement of X is defined by X^σ := (X_{σ⁻¹(1)}, X_{σ⁻¹(2)}, X_{σ⁻¹(3)}) ∈ T(M). The set c_X := {X^σ | σ ∈ S3} will be called the cycle of (triadic) complements of X. Obviously, X^σ can be obtained from X by moving the i-th component Xi of X to the σ(i)-th position. Note that the cycle of complements of a boundary element
such as ({4}, {1, 2, 3, 4}, {1, 2, 3}) in Fig. 1 lies entirely on the boundary.

Proposition 3.7: In powerset trilattices T(M) the triadic de Morgan laws (X∇ik Y)^σ = X^σ ∇_{σ(i)σ(k)} Y^σ hold for all X, Y ∈ T(M), {i, j, k} = {1, 2, 3}, and σ ∈ S3.

Proof: Let X = (X1, X2, X3) and Y = (Y1, Y2, Y3). Then X∇ik Y = (Z1, Z2, Z3) with Zi := Xi ∪ Yk^C, Zj := (Xi ∩ Yk)^C, and Zk := Yk by Proposition 3.5. Since Xi is the σ(i)-th component of X^σ and Yk is the σ(k)-th component of Y^σ, it follows that X^σ ∇_{σ(i)σ(k)} Y^σ has Xi ∪ Yk^C as σ(i)-th component, (Xi ∩ Yk)^C as σ(j)-th component, and Yk as σ(k)-th component. This implies X^σ ∇_{σ(i)σ(k)} Y^σ = (Z1, Z2, Z3)^σ = (X∇ik Y)^σ. □

Note that the cycle of complements of a bounding element equals {O1, O2, O3} and has therefore three elements. Cycles of complements have the following properties:

Proposition 3.8: Let c_X be the cycle of complements of X ∈ T(M) and let {i, j, k} = {1, 2, 3}. Then all Y ∈ c_X satisfy
1. Y ∼k Y^(ij), and
2. [Y]i ∨ [Y^(ij)]i = 1i and [Y]i ∧ [Y^(ij)]i ∧ [Y^(ik)]i = 0i.

Proof: 1.: The k-th component of Y^(ij) = (Y_{(ij)(1)}, Y_{(ij)(2)}, Y_{(ij)(3)}) is unchanged. 2.: This follows immediately from Definition 3.1, Definition 3.6, and (the proof of) Proposition 3.4. □

In the following section it is shown that c_X has six elements if X ∉ {O1, O2, O3} (cf. Proposition 4.6). Note also that (Y^σ)^τ = Y^{σ∘τ} for all Y ∈ c_X and σ, τ ∈ S3.
Fig. 2. The smallest non-trivial trilattice B_3, isomorphic to T(1).
Powerset lattices are order-isomorphic to powers of the smallest non-trivial lattice having two elements. We finish this section with the corresponding triadic
result and define the smallest non-trivial trilattice B_3 := (B3, ≲1, ≲2, ≲3) by B3 := {O1, O2, O3} with Oi ∼i Oi and Oi ≲i Oj for all distinct i, j ∈ {1, 2, 3} (cf. Fig. 2).

Theorem 1: For any powerset trilattice T(M), the mapping ϕ : T(M) → B_3^M with ϕ(X1, X2, X3)(x) := Oi if x ∉ Xi defines a triorder-isomorphism, i.e., any powerset trilattice is (triorder-)isomorphic to a power of B_3.

Proof: Let (X1, X2, X3) ∈ T(M) and let x ∈ M. Since x ∉ Xi for some i ∈ {1, 2, 3} implies x ∈ Xj and x ∈ Xk for {j, k} = {1, 2, 3} \ {i}, it follows that ϕ is well-defined and ϕ(X1, X2, X3) ∈ B3^M. But ϕ is also a triorder-embedding: Let X, Y ∈ T(M) with X ≲i Y, i.e. Xi ⊆ Yi where i ∈ {1, 2, 3}. If x ∈ Xi then ϕ(X)(x) ∼i ϕ(Y)(x), if x ∈ Yi \ Xi then ϕ(X)(x) = Oi ≲i ϕ(Y)(x), and if x ∉ Yi then ϕ(X)(x) = Oi = ϕ(Y)(x). Thus ϕ(X) ≲i ϕ(Y). Conversely, let ϕ(X) ≲i ϕ(Y) and let x ∈ Xi. Then Oi ≠ ϕ(X)(x) and hence ϕ(Y)(x) ≠ Oi, such that x ∈ Yi. It follows that X ≲i Y. It remains to show that ϕ is surjective. So let f ∈ B3^M and define X := (X1, X2, X3) by Xi := {x ∈ M | f(x) ≠ Oi} for i ∈ {1, 2, 3}. Then obviously X ∈ T(M) with ϕ(X) = f, which completes the proof. □

Note that the powerset trilattice T(4) in Fig. 1 is isomorphic to B_3^4.
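Theorem 1 can also be confirmed computationally for a small M: each x ∈ M lies outside exactly one component, and reading off that index gives a bijection onto {O1, O2, O3}^M. A sketch (names ours), coding Oi simply as the integer i:

```python
from itertools import product

def subsets(M):
    """All subsets of M as frozensets."""
    M = sorted(M)
    for mask in range(1 << len(M)):
        yield frozenset(m for i, m in enumerate(M) if (mask >> i) & 1)

Mlist = [1, 2, 3]
M = frozenset(Mlist)
T = [
    A for A in product(subsets(M), repeat=3)
    if not (A[0] & A[1] & A[2])
    and A[0] | A[1] == M and A[0] | A[2] == M and A[1] | A[2] == M
]

def phi(A):
    """ϕ(A)(x) = O_i for the unique i with x ∉ A_i  (Theorem 1)."""
    image = []
    for x in Mlist:
        missing = [i for i in (1, 2, 3) if x not in A[i - 1]]
        assert len(missing) == 1  # well-definedness, as in the proof
        image.append(missing[0])
    return tuple(image)

images = {phi(A) for A in T}
print(images == set(product((1, 2, 3), repeat=3)))  # True: ϕ is onto B_3^M
print(len(images) == len(T))                        # True: ϕ is injective
```

Since |T| = 27 = 3^3 = |B_3^M|, surjectivity and injectivity together realize the triorder-isomorphism of Theorem 1 on the level of underlying sets.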
Fig. 3. A trilattice with isomorphic Boolean lattices as side diagrams which is not a powerset trilattice.
4 B-Trilattices
According to Proposition 3.4, powerset trilattices have isomorphic powerset lattices as ordered structures. The trilattice in Fig. 3 has this property but is different from T(3). If we add two further conditions, namely a property of the ik-joins in powerset trilattices (Lemma 4.1) and the triadic complementation (Definition 4.2), and in this way define B-trilattices, the powerset trilattices can be characterized as certain B-trilattices. In particular, the finite B-trilattices are (up to isomorphism) just the powerset trilattices T(n).

Lemma 4.1: Let L := (L, ≲1, ≲2, ≲3) be a bounded trilattice in which x∇ik y ∼k y holds for all i ≠ k in {1, 2, 3}. If z ∈ L, then z^{k,i} := Ok ∇ik z is the only boundary element b ∈ b(L) satisfying b ∼k z and b ≳i z. In particular, the boundary elements already determine the ordered structures (Lk, ≤k), k ∈ {1, 2, 3}.

Proof: Obviously z^{k,i} ∼i Ok ≳i z and hence z^{k,i} ∈ b(L). Moreover z^{k,i} ∼k z, such that [z]k = [z^{k,i}]k, showing that Lk = b(L)k := {[b]k | b ∈ b(L)}. □

Definition 4.2: A set c := {c1, c2, c3, c4, c5, c6} ⊆ L of a bounded trilattice L is called a cycle if c1 ∼3 c2 ∼2 c3 ∼1 c4 ∼3 c5 ∼2 c6 ∼1 c1, c1 ∼2 c4, c2 ∼1 c5, and c3 ∼3 c6 (cf. Fig. 4). A cycle of (triadic) complements is a cycle c in which for each x ∈ c there are elements y, z ∈ c such that for all i ∈ {1, 2, 3} the join condition [x]i ∨ [y]i = [y]i ∨ [z]i = [x]i ∨ [z]i = 1i and the meet condition [x]i ∧ [y]i ∧ [z]i = 0i are satisfied.
Fig. 4. A cycle with six elements.
Lemma 4.3: A cycle c of a trilattice L has one, three or six elements. A cycle of complements has one element if and only if |L| = 1.

Proof: Two elements ck, cl ∈ c of a cycle c := {c1, c2, c3, c4, c5, c6} with k ≠ l can either be i-equivalent for some i ∈ {1, 2, 3} or not. If ck = cl, it follows that |c| ≤ 3 in the first and |c| = 1 in the second case (apply the uniqueness condition). There are no cycles with two elements. If a cycle of complements has one element, then |L| = 1 because of the meet and the join condition. □
Definition 4.4: A bounded trilattice L := (L, ≲1, ≲2, ≲3) is (triadically) complemented if any x ∈ L belongs to a cycle of complements.

Definition 4.5: A complemented trilattice B := (B, ≲1, ≲2, ≲3) is called a B-trilattice if its ordered structures are isomorphic Boolean lattices and x∇ik y ∼k y holds for all {i, j, k} = {1, 2, 3}.

The powerset trilattices are obviously B-trilattices. By Lemma 4.1, the boundary elements already determine the Boolean side diagrams. Moreover, as for Boolean lattices, the triadic complementation in B-trilattices is unique:

Proposition 4.6: Let c be a cycle of a B-trilattice B. If the join and the meet condition hold for x, y, z ∈ c for an index i ∈ {1, 2, 3}, then the (ordinary) complement of [x]i in (Bi, ≤i) satisfies [x]i′ = [y]i ∧ [z]i. Moreover, x belongs to exactly one cycle of complements, denoted by c_x, and all cycles of complements except {O1, O2, O3} have six elements if |B| > 1.

Proof: By definition, the elements x, y, z satisfy [x]i ∧ ([y]i ∧ [z]i) = 0i. On the other hand, it follows that [x]i ∨ ([y]i ∧ [z]i) = ([x]i ∨ [y]i) ∧ ([x]i ∨ [z]i) = 1i ∧ 1i = 1i and thus [y]i ∧ [z]i is the (unique) complement of [x]i in (Bi, ≤i). To show the uniqueness of the triadic complementation for x ∈ B, we start with c1 := x ∈ c of a cycle c of complements, using the same notation as in Fig. 4, and determine the position of c2 ∈ c in the triangular net: Let x1 := O1 ∇31 c1. Then x1 ∼1 c1 ∼1 c6 and x1 ≳3 c1, c6, from which x1 ≲2 c1, c6 follows by antiordinality. Thus [x1]2 ≤2 [c1]2 ∧ [c6]2 = [c2]2′, which implies [x1]2′ ≥2 [c2]2. Thus, for x2 ∈ B with x2 ∼3 x1 and [x2]2 = [x1]2′ we obtain x2 ≳2 c2, and it easily follows that also [x2]1 = [x1]1′ holds. As for x1 we similarly deduce that x3 := O2 ∇32 c2 ≲1 c2, c3, such that [x3]1 ≤1 [c2]1 ∧ [c3]1 = [c1]1′ = [x1]1′ = [x2]1, i.e. x3 ≲1 x2. On the other hand, x3 ≳1 x2 since x3 ∼2 c2 ≲2 x2 and x3 ∼3 x2, and hence x3 ∼1 x2, which implies x3 = x2 because of the uniqueness condition. Therefore [x1]2′ = [c2]2, which fixes the position of c2 ∼3 c1. Now, repeating this argument for c2 instead of c1 yields [c4]3 = [O2 ∇12 c2]3′, fixing the positions of c4 ∼2 c1 and c5 ∼1 c2 and, because c is a cycle, also c3 and c6. In this way, we have determined the unique cycle of complements c_x of x. If a cycle c of complements has three elements, then it follows that c = {O1, O2, O3} because of the join and the meet condition of cycles of complements. □

Since two cycles of complements of a B-trilattice are either disjoint or identical, we can define the triadic complementation as a (unary) operation:

Definition 4.7: Let B be a B-trilattice, let {i, j, k} = {1, 2, 3} and let (ij) ∈ S3 be a transposition. Then we define Ok^(ij) := Ok and Oi^(ij) := Oj, and for all non-bounding elements x ∈ B we fix x^(ij) ∈ c_x by x^(ij) ≠ x and x^(ij) ∼k x, and define x^{σ∘τ} := (x^σ)^τ if σ, τ ∈ S3. The element x^σ is called the σ-complement of x.
According to the preceding proof, x^(ij) can be determined as the intersection of [x]k and [Oi ∇ki x]j′, because {x^(ij)} = [x]k ∩ [Oi ∇ki x]j′, and therefore as the intersection point of the corresponding lines in the triangular net of the triadic diagram. Crucial for the characterization of powerset lattices as atomic and complete Boolean lattices are the so-called atoms, which are the upper covers of the smallest element 0. (Recall that a lattice (L, ≤) with 0 is called atomic if for any x ∈ L \ {0} there is an atom a with a ≤ x.) Here special cycles of complements, the so-called atomic cycles, will be needed to characterize the powerset trilattices.

Definition 4.8: An element a ∈ L of a trilattice L is called an i-atom (i-coatom) for i ∈ {1, 2, 3} if [a]i is an atom (coatom) of (Li, ≤i), i.e., b ≲i a and b ≁i a implies b = Oi (a ≲i b and b ≁i a implies b ∈ 1i).

Proposition 4.9: Let a ∈ B be a k-atom of a B-trilattice B for some k ∈ {1, 2, 3}. Then a is a boundary element and, for all i ∈ {1, 2, 3} and σ ∈ S3, the σ-complements of a are either i-atoms, i-coatoms or in 1i. Moreover c_a ⊆ b(B).

Proof: Suppose there is a k-atom x with x ∉ b(B). With the notation as in Lemma 4.1, it follows that x^{j,i} ∼j x and x^{j,i} ≳i x where {i, j, k} = {1, 2, 3}. By antiordinality we obtain x^{j,i} ≲k x, which implies x^{j,i} = Ok because x is a k-atom. But then x ∼j Ok follows, which contradicts x ∉ b(B). Therefore a k-atom a is a boundary element, and hence there are just two k-atoms, namely a^{k,i} and a^{k,j} with {i, j, k} = {1, 2, 3}. Moreover, a^{k,j} is an i-coatom because y ≳i a^{k,j} implies y^{i,j} ≲k a^{k,j} by antiordinality, such that y^{i,j} = Ok and hence y ∈ 1i. Since a^{k,i} ∈ 1i, it follows from Proposition 4.6 that ([a^{k,i}]i ∧ [a^{k,j}]i)′ is an atom in (Bi, ≤i), i.e. the (ik)-complements of the k-atoms are i-atoms. Repeating this argument finally yields that for all i ∈ {1, 2, 3} the cycle of complements c_a consists of i-atoms, i-coatoms and elements in 1i, and that c_a ⊆ b(B). □

Definition 4.10: A cycle of complements as in the previous proposition is called an atomic cycle (of complements) and abbreviated by 𝔞. If i ∈ {1, 2, 3}, then 𝔞_i ⊆ 𝔞 denotes the subset of i-atoms in 𝔞, and A(B) is defined as the set of all atomic cycles of the B-trilattice B.

The Characterization Theorem: Let B be a B-trilattice with atomic and complete Boolean lattices as side diagrams. Then a triorder-isomorphism ϕ : B → T(A(B)) can be defined by ϕ(x) := (ϕ(x)1, ϕ(x)2, ϕ(x)3) with ϕ(x)i := {𝔞 ∈ A(B) | a ≲i x for some a ∈ 𝔞_i}, i.e. the B-trilattices with ordered structures isomorphic to powerset lattices are (up to isomorphism) exactly the powerset trilattices.

Proof: First of all we show that ϕ is well-defined, i.e. ϕ(x) ∈ T(A(B)). Obviously, ϕ({O1, O2, O3}) ⊆ T(A(B)). So, let x ∈ B \ {O1, O2, O3}, let {i, j, k} = {1, 2, 3}
and assume 𝔞 ∉ ϕ(x)k. Then, for a ∈ 𝔞_i with a = a^{i,j}, it follows that a ≳k x and, since a ≳j x, also a ≲i x. Thus 𝔞 ∈ ϕ(x)i, showing that ϕ(x)i ∪ ϕ(x)k = A(B). To ensure that also ϕ(x)1 ∩ ϕ(x)2 ∩ ϕ(x)3 = ∅, assume 𝔞 ∈ ϕ(x)i for some i ∈ {1, 2, 3}, i.e. a ≲i x for some a ∈ 𝔞_i. Because of the meet condition, there is an index k ∈ {1, 2, 3} \ {i} such that x^(ik) ≵i a. If we choose a ∈ 𝔞_i with a = a^{i,j} for j ∈ {1, 2, 3} \ {i, k}, then we get x^(ik) ≲i a^(ik). Moreover, since x^(ijk) = (x^(ik))^(jk) ∼i x^(ik), we also have x^(ijk) ≲i a^(ik) and, with x^(ik), x^(ijk) ≲j a^(ik), it follows that x^(ik), x^(ijk) ≳k a^(ik). Because x, x^(ik), and x^(ijk) satisfy the meet condition for k, we obtain x ≵k a^(ik), i.e. 𝔞 ∉ ϕ(x)k and ϕ(x)1 ∩ ϕ(x)2 ∩ ϕ(x)3 = ∅. (Note that 𝔞 ∈ ϕ(x)j because of ϕ(x)j ∪ ϕ(x)k = A(B).) Next we ensure that ϕ is a triorder-embedding. Let x, y ∈ B with x ≲i y and i ∈ {1, 2, 3}. Then it is obvious that ϕ(x) ≲i ϕ(y). Conversely, let ϕ(x) ≲i ϕ(y), i.e. ϕ(x)i ⊆ ϕ(y)i. Since (Bi, ≤i) is an atomic and complete Boolean lattice, it follows that [x]i = ∨{[a]i | a ∈ 𝔞_i, 𝔞 ∈ ϕ(x)i} ≤i ∨{[a]i | a ∈ 𝔞_i, 𝔞 ∈ ϕ(y)i} = [y]i and thus x ≲i y. Therefore it remains to show that ϕ is surjective. So let (A1, A2, A3) ∈ T(A(B)), let x1 ∈ ∨{[a]1 | a ∈ 𝔞_1, 𝔞 ∈ A1} and x3 ∈ ∨{[a]3 | a ∈ 𝔞_3, 𝔞 ∈ A3}, and define y := x1 ∇13 x3 and z := x3 ∇31 x1. Then y ∼2 z and z ≲1 y. We will show that y = z. Suppose y ≁1 z. Then there is a 1-atom a with a ≲1 y but a ≴1 z, and we can choose a such that a = a^{1,3}. The first inequality then implies 𝔞 := c_a ∈ ϕ(y)1 and the latter yields a^(12) ≳1 z. But since also a^(12) ≳3 z, we obtain a^(12) ≲2 z ∼2 y and hence 𝔞 ∈ ϕ(y)2. But we can also show 𝔞 ∈ ϕ(y)3 as follows. Since (B1, ≤1) is an atomic and complete Boolean lattice and z = x3 ∇31 x1 ∼1 x1, we conclude A1 = ϕ(x1)1 = ϕ(z)1 and, since a ≴1 z with a as the already chosen 1-atom, it follows that 𝔞 ∉ ϕ(z)1 = A1, which implies 𝔞 ∈ A3 because of A1 ∪ A3 = A(B). But, since also y = x1 ∇13 x3 ∼3 x3 implies A3 = ϕ(x3)3 = ϕ(y)3, we get 𝔞 ∈ ϕ(y)3. Therefore 𝔞 ∈ ϕ(y)1 ∩ ϕ(y)2 ∩ ϕ(y)3, which contradicts ϕ(y) ∈ T(A(B)). Consequently y ∼1 z and hence y = z. But then A1 = ϕ(y)1 and A3 = ϕ(y)3 and therefore ϕ(y) = (A1, A2, A3), showing that ϕ is a surjective triorder-embedding, i.e. an isomorphism. □

Theorem 1 and the Characterization Theorem immediately yield:

Corollary: Any finite B-trilattice B with n atomic cycles satisfies B ≅ T(n) ≅ B_3^n.
5 Discussion
Should we now identify the B-trilattices with the Boolean trilattices? In fact, the B-trilattices have properties similar to those of Boolean lattices, such as the unique complementation. They also play an analogous role in the Characterization Theorem. In this sense, the presented order-theoretic approach to B-trilattices corresponds to the definition of Boolean lattices as complemented and distributive lattices. From an algebraic point of view, Boolean lattices can equivalently be understood as algebras in which certain equations hold. There are the lattice equations, the
distributive laws and identities concerning the (ordinary) complement. Trilattices, on the other hand, can algebraically be characterized by certain trilattice equations (cf. [Bi98]). But there is not yet an algebraic way to understand the B-trilattices. In fact, it has not been possible to equivalently express the defining properties of B-trilattices by suitable triadic equations. It becomes, for example, quite complicated to describe the distributive laws of the ordered structures as terms with ik-joins. In general, there seems to be no immediate and simple connection between the triadic operations of B-trilattices – the ik-joins and the σ-complements – and the Boolean operations for their side diagrams. So the next step is to find such unifying and simplifying triadic equations as the distributive laws and also identities concerning the triadic complements.
References

Bi97. K. Biedermann: How Triadic Diagrams Represent Conceptual Structures. In: D. Lukose, H. Delugach, M. Keeler, L. Searle, J. Sowa (eds.): Conceptual Structures: Fulfilling Peirce's Dream. Lecture Notes in Artificial Intelligence 1257. Springer-Verlag, Berlin-Heidelberg-New York 1997, 304–317.
Bi98. K. Biedermann: An Equational Theory for Trilattices. FB4-Preprint, TU Darmstadt 1998.
DP90. B. A. Davey, H. A. Priestley: Introduction to Lattices and Order. Cambridge University Press, Cambridge 1990.
Gr79. G. Grätzer: Universal Algebra. Springer-Verlag, Berlin-Heidelberg-New York 1979.
GW96. B. Ganter, R. Wille: Formale Begriffsanalyse: Mathematische Grundlagen. Springer, Berlin-Heidelberg 1996.
LW95. F. Lehmann, R. Wille: A Triadic Approach to Formal Concept Analysis. In: G. Ellis, R. Levinson, W. Rich, J. F. Sowa (eds.): Conceptual Structures: Applications, Implementations and Theory. Lecture Notes in Artificial Intelligence 954. Springer-Verlag, Berlin-Heidelberg-New York 1995, 32–43.
MB89. J. D. Monk, R. Bonnet (eds.): Handbook of Boolean Algebras. Elsevier Science Publishers B.V., Amsterdam-New York 1989.
Wi95. R. Wille: The Basic Theorem of Triadic Concept Analysis. Order 12 (1995), 149–158.
Wi97. R. Wille: Conceptual Graphs and Formal Concept Analysis. In: D. Lukose, H. Delugach, M. Keeler, L. Searle (eds.): Conceptual Structures: Fulfilling Peirce's Dream. Springer, Berlin-Heidelberg-New York 1997, 290–303.
WZ98. R. Wille, M. Zickwolff: Grundlegung einer Triadischen Begriffsanalyse. In: G. Stumme, R. Wille: Wissensverarbeitung: Methoden und Anwendungen. Springer, Berlin-Heidelberg 1998.
Simple Concept Graphs: A Logic Approach

Susanne Prediger
Technische Universität Darmstadt, Fachbereich Mathematik
Schloßgartenstr. 7, D-64289 Darmstadt, [email protected]
Abstract. Conceptual Graphs and Formal Concept Analysis are combined by developing a logical theory for concept graphs of relational contexts. To this end, concept graphs are introduced as syntactical constructs, and their semantics is defined based on relational contexts. For this contextual logic, a sound and complete system of inference rules is presented, and a standard graph is introduced that entails all concept graphs that are valid in a given relational context. A possible use for conceptual knowledge representation and processing is suggested.
1 Introduction
The first approach combining the theory of Conceptual Graphs and Formal Concept Analysis was described by R. Wille in [12]. For connecting the conceptual structures in both theories, the concept types appearing in conceptual graphs were considered to be formal concepts of formal contexts, the constituents of Formal Concept Analysis. To facilitate this connection, concept graphs, appropriate mathematizations of conceptual graphs, were introduced. The theory of concept graphs of formal contexts was developed as a mathematical structure theory where concept graphs of formal contexts are realizations of abstract concept graphs. In this paper, a logic approach is presented by developing a logical theory for concept graphs. To this end, concept graphs are defined as syntactical constructs over an alphabet of object names, concept names and relation names (Section 2). Then, a contextual semantics is specified. We interpret the syntactical names by objects, concepts and relations of a relational context (Section 3). In this way, we can profit from all notions for concepts that have been developed in Formal Concept Analysis. The introduced contextual logic is carried on by the study of inferences (Section 4). Based on a model-theoretic notion for the entailment of concept graphs, a sound and complete set of inference rules is established and compared to the notion of projections between concept graphs. With the standard model, we can present another interesting tool for reasoning with concept graphs. In Section 5, we introduce a standard graph for a given relational context. It provides a basis for all concept graphs that are valid in the relational context. In the last section, we suggest how this approach can be used for knowledge representation and processing.

M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 225-239, 1998.
© Springer-Verlag Berlin Heidelberg 1998
2 Syntax for the Language of Concept Graphs
We want to introduce concept graphs as syntactical constructs. For this, we need an alphabet of concept names, relation names and object names. As the theory of conceptual graphs provides an order-sorted logic, we start with ordered sets of names that are not necessarily lattices. These orders are determined by the taxonomies of the domains in view; they formalize ontological background knowledge.

Definition 1. An alphabet of concept graphs is a triple (C, G, R), where (C, ≤C) is a finite ordered set whose elements are called concept names, G is a finite set whose elements are called object names, and (R, ≤R) is a set, partitioned into finite ordered sets (Rk, ≤Rk), k = 1, ..., n, whose elements are called relation names.

Now, we can introduce concept graphs as statements formulated with these syntactical names. That means, we consider the concept graphs to be the well-formed formulas of our formal language. In accordance with the first mathematization of conceptual graphs in [12], the structure of (simple) concept graphs is described by means of directed multi-hypergraphs and labeling functions.

Definition 2. A (simple) concept graph over the alphabet (C, G, R) is a structure G := (V, E, ν, κ, ρ), where
• (V, E, ν) is a (not necessarily connected) finite directed multi-hypergraph, i.e. a structure where V and E are finite sets whose elements are called vertices and edges, respectively, and ν: E → ⋃ₖ₌₁ⁿ V^k is a mapping (n ≥ 2),
• κ: V ∪ E → C ∪ R is a mapping such that κ(V) ⊆ C and κ(E) ⊆ R, and all e ∈ E with ν(e) = (v1, ..., vk) satisfy κ(e) ∈ Rk,
• ρ: V → P(G)\{∅} is a mapping.
For an edge e ∈ E with ν(e) = (v1, ..., vk), we define |e| := k, and we write ν(e)|i := vi and ρ(e) := ρ(v1) × ... × ρ(vk).

Apart from some minor differences, these concept graphs correspond to the simple conceptual graphs defined in [8] or [1]. We only use multi-hypergraphs instead of bipartite graphs in the mathematization.
The mapping ν assigns to every edge the tuple of all its incident vertices. The function κ labels the vertices and edges with concept and relation names, respectively, and the mapping ρ describes the references of every vertex. In contrast to Sowa, we allow references with more than one object name (i.e. individual marker), but no generic markers, i.e. existential quantifiers, yet. They can be introduced into the syntax easily (cf. [6]), but in this paper we want to put the emphasis on the elementary language. That is why we can omit coreference links here, which are only relevant in connection with generic markers.
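To make the definition concrete, the structure (V, E, ν, κ, ρ) can be sketched as a small data type. This is only an illustrative encoding; the class name ConceptGraph and all field names are our own choices, and the example data anticipates the fairy-tale alphabet of Fig. 1.

```python
from dataclasses import dataclass
from itertools import product

# Minimal sketch of a simple concept graph G = (V, E, nu, kappa, rho)
# as in Definition 2; names are illustrative, not from any existing tool.
@dataclass
class ConceptGraph:
    V: set       # vertices
    E: set       # edges
    nu: dict     # edge -> tuple of its incident vertices
    kappa: dict  # vertex or edge -> concept or relation name
    rho: dict    # vertex -> non-empty frozenset of object names

    def rho_edge(self, e):
        # rho(e) := rho(v1) x ... x rho(vk) for nu(e) = (v1, ..., vk)
        return set(product(*(self.rho[v] for v in self.nu[e])))

# An ADULT (the Witch) threatens a CHILD (Hansel), cf. Fig. 1
G = ConceptGraph(
    V={"v1", "v2"},
    E={"e"},
    nu={"e": ("v1", "v2")},
    kappa={"v1": "ADULT", "v2": "CHILD", "e": "THREATEN"},
    rho={"v1": frozenset({"Witch"}), "v2": frozenset({"Hansel"})},
)
```

Representing references as frozensets matches the requirement ρ(v) ∈ P(G)\{∅}, and the product in rho_edge mirrors ρ(e) := ρ(v1) × ... × ρ(vk).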
3 Semantics for Concept Graphs
We agree with J. F. Sowa when he writes about the importance of a semantics: “To make meaningful statements, the logic must have a theory of reference that
determines how the constants and variables are associated with things in the universe of discourse.” [9, p. 27] Usually, the semantics for conceptual graphs is given by the translation of conceptual graphs into first-order logic (cf. [8] or [1]). For some notions and proofs, a set-theoretic, extensional semantics was developed (cf. [4]), but it is rarely used. We define a semantics based on relational contexts. That means, we interpret the syntactical elements (concept, object and relation names) by concepts, objects and relations of a relational context. We prefer this contextual semantics for several reasons. As the basic elements of concept graphs are concepts, we want a semantics in which concepts are treated in a formal, but manifold way. Therefore, it is convenient to use Formal Concept Analysis, which is a mathematization of the philosophical understanding of concepts as units of thought constituted by their extension and intension (cf. [11]). Furthermore, it is essential for Formal Concept Analysis that these two components of a concept are unified on the basis of a specified context. This contextual view is supported by Peirce's pragmatism, which claims that we can only analyze and argue within restricted contexts where we always rely on preknowledge and common sense (cf. [12]). Experience has shown that formal contexts are a useful basis for knowledge representation and communication because, on the one hand, they are close enough to reality and, on the other hand, their formalizations allow an efficient formal treatment. As formal contexts do not formalize relations on the objects, the contexts must be enriched with relational structures. To this end, R. Wille introduced power context families in [12], where relations are described as concepts with extensions consisting of tuples of objects. Using relational contexts in this paper, we have chosen a slightly simpler formalism.
Nevertheless, this formalism can be transformed into power context families and vice versa. This is explained in detail in [5] and will not be discussed in this paper. Let us start with the formal definition of a relational context (originally introduced in [7]).

Definition 3. A formal context (G, M, I) is a triple where G and M are finite sets whose elements are called objects and attributes, respectively, and I is a binary relation between G and M which is called an incidence relation. A formal context, together with a set R := ⋃ₖ₌₁ⁿ Rk of sets of k-ary relations on G, is called a relational context and denoted by K := ((G, R), M, I). The concept lattice B(G, M, I) := (B(G, M, I), ≤) is also called the concept lattice of K and denoted by B(K).

For the basic notions of Formal Concept Analysis, like the definition of the concept lattice, please refer to [3]. We just mention the notation g^I := {m ∈ M | (g, m) ∈ I} for g ∈ G (and dually for m ∈ M), which will be used in the following paragraphs. Now, we can specify how the syntactical elements of (C, G, R) are interpreted in relational contexts by context-interpretations. The object names are interpreted by objects of the context, the concept names by its concepts and the relation
228
S. Prediger
names by its relations. In this way, we can embed the order given on C into the richer structure of the concept lattice. Order-preserving mappings are required because the interpretation shall respect the subsumptions given by the orders on C and R.

Definition 4. For an alphabet A := (C, G, R) and a relational context K := ((G, R), M, I), we call the union ι := ιC ∪̇ ιG ∪̇ ιR of the mappings ιC: (C, ≤C) → B(K), ιG: G → G and ιR: (R, ≤R) → (R, ⊆) a K-interpretation of A if ιC and ιR are order-preserving and ιR(Rk) ⊆ Rk for all k = 1, ..., n. The tuple (K, ι) is called a context-interpretation of A.

Having defined how the syntactical elements are related to elements of a relational context, we can explain formally how to distinguish valid statements from invalid ones. Due to our contextual view, the notion of validity also depends on the specified relational context. That means, a concept graph is called valid in a context-interpretation if the assigned objects belong to the extensions of the assigned concepts, and if the assigned relations conform with the labels of the edges. Let us make these conditions precise in a formal definition.

Definition 5. Let (K, ι) be a context-interpretation of A. The concept graph G := (V, E, ν, κ, ρ) over A is called valid in (K, ι) if
• ιG ρ(v) ⊆ Ext(ιC κ(v)) for all v ∈ V (vertex condition),
• ιG ρ(e) ⊆ ιR κ(e) for all e ∈ E (edge condition).
If G is valid in (K, ι), then (K, ι) is called a model for G, and G is called a concept graph of (K, ι).

Note that, theoretically, any formal context could be completed to a model if the relations and the interpretation were chosen in the right way. For a given concept graph G := (V, E, ν, κ, ρ), a formal context (G, M, I) and a given mapping ιG: G → G, we can always define an order-preserving mapping ιC: C → B(G, M, I) satisfying the vertex condition, for example the mapping with ιC(c) := ⋁{(ιG ρ(v)^II, ιG ρ(v)^I) | v ∈ V, κ(v) ≤C c}.
Thus, we can obtain a model by choosing appropriate relations and a mapping ιR. This shows that looking for an adequate model is not only a matter of formalism. It always depends on the specific purpose. There is one interesting model for every concept graph, namely its standard model. It codes exactly the information given by the concept graph.

Definition 6. Let G := (V, E, ν, κ, ρ) be a concept graph over the alphabet (C, G, R). We define the standard model of G to be the relational context K_G := ((G, R^G), C, I^G) together with the K_G-interpretation ι^G := ιC ∪̇ ιG ∪̇ ιR where ιC: C → B(K_G) with ιC(c) := (c^(I^G), c^(I^G I^G)), ιG := id_G, and R^G := ιR(R). The incidence relation I^G ⊆ G × C and the mapping ιR are defined in such a way that for all g ∈ G, c ∈ C, (g1, ..., gk) ∈ G^k and R ∈ R, we have the following conditions:

(g, c) ∈ I^G ⇐⇒ ∃ v ∈ V : κ(v) ≤C c and g ∈ ρ(v)
(g1, ..., gk) ∈ ιR(R) ⇐⇒ ∃ e ∈ E : κ(e) ≤R R and (g1, ..., gk) ∈ ρ(e).
We can read this definition as an instruction for how to construct the standard model. As objects of the context, we take all object names in G; as attributes of the context, we take all concept names in C; and we relate the object name g and the concept name κ(v) (i.e., set (g, κ(v)) ∈ I^G) if the object name g belongs to the reference ρ(v) of the vertex v. For preserving the order, we additionally relate g to every concept name c satisfying κ(v) ≤C c. Similarly for the relations. It is proved in [5] that this standard model is really a model for G. Constructing a standard model for a given concept graph is the easiest way to find a relational context that codes exactly the information formalized in the concept graph. Thus, it allows us to translate knowledge expressed on the graphical level into knowledge on the contextual level. In the following section, we will see how the standard model helps to characterize inferences of concept graphs on the contextual level.
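Read as an instruction, the construction of Definition 6 can be sketched in a few lines. The function and parameter names below (standard_model, leq_C, leq_R) are our own, and the two orders are passed in as predicates; the example graph and concept order are taken from Fig. 1.

```python
from itertools import product

def standard_model(V, E, nu, kappa, rho, concept_names, relation_names,
                   leq_C, leq_R):
    # (g, c) in I^G  iff  some v in V has kappa(v) <=_C c and g in rho(v)
    I = {(g, c)
         for v in V for g in rho[v]
         for c in concept_names if leq_C(kappa[v], c)}
    # (g1,...,gk) in iota_R(R)  iff  some e in E has kappa(e) <=_R R
    # and (g1,...,gk) in rho(e) = rho(v1) x ... x rho(vk)
    iota_R = {R: {t
                  for e in E if leq_R(kappa[e], R)
                  for t in product(*(rho[v] for v in nu[e]))}
              for R in relation_names}
    return I, iota_R

# Concept order of Fig. 1: CHILD, ADULT below HUMAN; WOMAN below ADULT
below = {("CHILD", "HUMAN"), ("ADULT", "HUMAN"),
         ("WOMAN", "ADULT"), ("WOMAN", "HUMAN")}
leq_C = lambda a, b: a == b or (a, b) in below

# The graph "ADULT: Witch --THREATEN--> CHILD: Hansel"
I, iota_R = standard_model(
    V={"v1", "v2"}, E={"e"}, nu={"e": ("v1", "v2")},
    kappa={"v1": "ADULT", "v2": "CHILD", "e": "THREATEN"},
    rho={"v1": {"Witch"}, "v2": {"Hansel"}},
    concept_names=["HUMAN", "ADULT", "CHILD", "WOMAN"],
    relation_names=["THREATEN"],
    leq_C=leq_C, leq_R=lambda a, b: a == b,
)
```

The incidence relation relates each object name upward along ≤C, exactly as the instruction above prescribes.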
4 Reasoning with Concept Graphs

4.1 Entailment and Validity in the Standard Model
Having specified a formal semantics, we can easily describe inferences on the semantical level by entailments. For this, we only consider concept graphs over the same alphabet in the whole section. We recall the usual definition. Definition 7. Let G1 and G2 be two concept graphs over the same alphabet. We say that G1 entails G2 if G2 is valid in every model for G1 . We denote this by G1 |= G2 . The following proposition explains how the entailment can be characterized by standard models (for the proof see appendix). Proposition 1. The concept graph G1 entails the concept graph G2 if and only if G2 is valid in the standard model (KG1 , ιG1 ) of G1 . That means, using the contextual language, we obtain an effective method for deciding whether a concept graph entails another one or not. Beyond this, we could theoretically concentrate completely on the context level and describe the relation |= by means of inclusion in the standard models as the following lemma shows. Lemma 1. Let G1 and G2 be two concept graphs over the same alphabet with standard models (KG1 , ιG1 ) and (KG2 , ιG2 ), respectively. They satisfy G1 |= G2
⇐⇒ I^G1 ⊇ I^G2 and ιR^G1(R) ⊇ ιR^G2(R) for all R ∈ R.
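Operationally, the lemma turns entailment into two inclusion tests on the standard models. A minimal sketch, assuming each standard model is given as a pair of an incidence set and a relation-interpretation dictionary (all names are ours):

```python
def entails(sm1, sm2):
    # G1 |= G2 iff I^G1 contains I^G2 and each
    # iota_R^G1(R) contains iota_R^G2(R)
    I1, r1 = sm1
    I2, r2 = sm2
    return I2 <= I1 and all(r2[R] <= r1.get(R, set()) for R in r2)

# G1 states that Hansel and Gretel are children; G2 only that Hansel is one
sm1 = ({("Hansel", "CHILD"), ("Gretel", "CHILD")}, {})
sm2 = ({("Hansel", "CHILD")}, {})
```

As expected, the larger standard model entails the smaller one but not conversely.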
Although the lemma is not very practical for reasoning in general, it has important consequences. Firstly, we can see easily that the relation |= is reflexive and transitive, i. e., it is a preorder. Secondly, it implies that equivalent concept graphs (i. e. concept graphs with G1 |= G2 and G2 |= G1 ) have identical standard models.
Finally, we can characterize the order that is induced by the preorder |= on the equivalence classes of concept graphs: the lemma shows how it can be characterized by the inclusions in the corresponding standard models. In particular, this allows us to describe the infimum and supremum of concept graphs by join and intersection in the standard model. The infimum of the two equivalence classes of the concept graphs G1 and G2 is the equivalence class of the juxtaposition G1 ⊕ G2 (cf. [2]). It is not difficult to see that the standard model of this juxtaposition is exactly the standard model one obtains by “joining the standard models”: for (K_{G1⊕G2}, ι^{G1⊕G2}), we have I^{G1⊕G2} = I^G1 ∪ I^G2 and ιR^{G1⊕G2}(R) = ιR^G1(R) ∪ ιR^G2(R) for all R ∈ R. Whereas it is a difficult task to construct the supremum of two equivalence classes (if it exists at all), we can deduce immediately from Lemma 1 that its standard model (K, ι) is the intersection of the standard models. That means, we have I = I^G1 ∩ I^G2 and ιR(R) = ιR^G1(R) ∩ ιR^G2(R) for all R ∈ R. We conclude that describing the order induced by |= is much easier on the context level than it is on the graph level. Especially suprema and infima can be characterized easily. This shows that for some purposes, it is convenient to translate the information given in concept graphs to the context level. For other purposes, it is interesting to do reasoning only on the syntactical level. But how can we characterize inferences on the syntactical level? It is usually done in two ways: by inference rules that were inspired by Peirce's inference rules for existential graphs, and by projections, i.e. graph morphisms that can be supported by graph-theoretical methods and algorithms (cf. [8], [4]).

4.2 Projections
Projections describe inferences from the perspective of graph morphisms. We recall the definition of projections as graph morphisms respecting the labeling functions (cf. [1]). It is slightly modified for concept graphs.

Definition 8. For the two concept graphs G1 := (V1, E1, ν1, κ1, ρ1) and G2 := (V2, E2, ν2, κ2, ρ2) over the same alphabet, a projection from G2 to G1 is defined as the union πV ∪̇ πE of mappings πV: V2 → V1 and πE: E2 → E1 such that |e| = |πE(e)|, πV(ν2(e)|i) = ν1(πE(e))|i and κ1(πE(e)) ≤R κ2(e) for all edges e ∈ E2 and all i = 1, ..., |e|; and κ1(πV(v)) ≤C κ2(v) and ρ1(πV(v)) ⊇ ρ2(v) for all vertices v ∈ V2. We write G1 ≲ G2 if there exists a projection from G2 to G1.

The relation ≲ defines a preorder on the class of all concept graphs, i.e., it is reflexive and transitive but not necessarily antisymmetric. It can be proved that this relation is characterized by the following inference rules (cf. [1]; for concept graphs see [5]):

1. Double a vertex.
2. Delete an isolated vertex.
3. Double an edge.
4. Delete an edge.
[Figure: The concept order C has HUMAN above ADULT and CHILD, and WOMAN below ADULT; the object set is G := {Hansel, Gretel, Witch} and the relation set is R := {THREATEN}. Four concept graphs are displayed: G1 with the vertex CHILD: Hansel, Gretel; G2 with the vertex HUMAN: Gretel, Witch; G3 with the vertices WOMAN: Witch and ADULT: Witch, each linked by a THREATEN edge (arguments 1, 2) to CHILD: Hansel; and G4 with ADULT: Witch linked by a THREATEN edge (arguments 1, 2) to CHILD: Hansel.]

Fig. 1. Counter-examples to the Completeness of the Projection
5. Generalize a concept name.
6. Generalize a relation name.
7. Restrict a reference.
8. Copy the concept graph.
That means, two concept graphs G1 and G2 satisfy G1 ≲ G2 if and only if G2 can be derived from G1 by applying these rules (which are elaborated more precisely in the appendix). Note that Rule 7 (restrict a reference) is different from the restriction rule defined in [1]: here, we cannot replace an individual marker by a generic marker, but delete an individual marker from the set of objects forming the reference. It can be proved that these rules are sound (cf. [5]). But as the examples in Figure 1 show, they are not complete. It is easy to see that G2 must be valid in every model for G1. On the other hand, it cannot be derived from G1 because, by these rules, references can only be restricted, not extended or joined. Even for concept graphs with references of only one element, the rules are not complete when we consider redundant graphs. This is shown by the concept graphs G3 and G4. We have G3 |= G4, but the concept name WOMAN cannot be replaced by ADULT with the given rules.
4.3 A Sound and Complete Calculus for all Concept Graphs
There are two ways to treat the incompleteness: restricting the class of considered concept graphs (e.g. to concept graphs of normal form like in [4]) or extending the rules. As it is convenient for conceptual knowledge processing to allow all concept graphs instead of restricting them to normal forms, we decided to modify and extend the rules. (Note that the introduced rules are usually needed to transform a concept graph into normal form.)

Definition 9. Let G1 and G2 be two concept graphs over the same alphabet. We call G2 derivable from G1 and denote G1 ⊢ G2 if it can be derived by the following inference rules (which are elaborated in the appendix):

1. Double a vertex. Double a vertex v and its incident edges (several times if v occurs more than once in ν(e)). Extend the mappings κ and ρ to the doubles.
2. Delete an isolated vertex. Delete a vertex v and restrict κ and ρ accordingly.
3. Double an edge. Double an edge e and extend the mappings κ and ρ to the double.
4. Delete an edge. Delete an edge e and restrict the mappings κ and ρ accordingly.
5.∗ Exchange a concept name. Replace the assignment v ↦ κ(v) by v ↦ c for such a concept name c ∈ C for which there is a vertex w ∈ V with κ(w) ≤C c and ρ(v) ⊆ ρ(w).
6.∗ Exchange a relation name. Replace the assignment e ↦ κ(e) by e ↦ R for such a relation name R ∈ R for which there is an edge f ∈ E with κ(f) ≤R R and ρ(e) ⊆ ρ(f).
7. Restrict a reference. Replace the assignment v ↦ ρ(v) by v ↦ A with a subset ∅ ≠ A ⊆ ρ(v).
8. Copy the concept graph. Construct a concept graph that is identical to the first concept graph up to the names of vertices and edges.
9.∗ Join vertices with equal references. Join two vertices v, w ∈ V satisfying ρ(v) = ρ(w) into a vertex v ∨ w with the same incident edges and references, and set κ(v ∨ w) = c for a c ∈ C with κ(v) ≤C c and κ(w) ≤C c.
10.∗ Join vertices with corresponding edges. Join two vertices v, w ∈ V which have corresponding, but uncommon edges (i.e., for every edge e ∈ E incident with v there exists an edge e′ incident with w, and vice versa, with equal label and equal references, whose incident vertices differ only in v and w) into a vertex v ∨ w with the same incident edges, κ(v ∨ w) = c for a c ∈ C with κ(v) ≤C c and κ(w) ≤C c, and ρ(v ∨ w) = ρ(v) ∪ ρ(w).
We will state these inference rules more precisely in the appendix and prove formally that they are sound and complete. Note that Rule 8 is redundant because it can be substituted by applying Rules 1 and 4.
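As an illustration of how mechanical these rules are, Rule 7 (restrict a reference) only rewrites the mapping ρ and leaves V, E, ν and κ untouched. A hedged sketch with our own function name:

```python
def restrict_reference(rho, v, A):
    # Rule 7: replace v -> rho(v) by v -> A for a non-empty subset A of rho(v)
    A = frozenset(A)
    if not A or not A <= rho[v]:
        raise ValueError("A must be a non-empty subset of rho(v)")
    new_rho = dict(rho)  # all other assignments are kept unchanged
    new_rho[v] = A
    return new_rho

rho = {"v": frozenset({"Hansel", "Gretel"})}
restricted = restrict_reference(rho, "v", {"Hansel"})
```

Returning a fresh dictionary keeps the original graph intact, matching the reading of the rules as derivations of a new concept graph G2 from G1.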
Proposition 2 (Soundness and Completeness). Let G1 and G2 be two concept graphs over the same alphabet. Then, we have G1 |= G2 ⇐⇒ G1 ⊢ G2.
Let us sum up what has been achieved. There are three possibilities for characterizing inferences on concept graphs. The usual model-theoretic way is the entailment (cf. Def. 7). On the syntactical level, we have a sound and complete set of inference rules (cf. Def. 9) whereas the projections cannot be used in the general case due to incompleteness. With their graphical character, the inference rules can visualize inferences and can be intuitively used to derive relatively similar concept graphs by hand. That is why they can support communication about reasoning to a certain degree. For implementation and general questions of decidability, it seems to be more convenient to use the third notion to characterize inferences, namely validity in the standard model (cf. Prop. 1).
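The last point can be made concrete: checking Definition 5 against a model amounts to two families of subset tests, so validity in the standard model is directly implementable. A sketch under assumed data structures (extents of interpreted concepts and interpreted relations as plain sets; all names are ours):

```python
from itertools import product

def is_valid(V, E, nu, kappa, rho, iota_G, extent, relation):
    # vertex condition: iota_G(rho(v)) is a subset of Ext(iota_C(kappa(v)))
    vertex_ok = all(
        {iota_G[g] for g in rho[v]} <= extent[kappa[v]] for v in V)
    # edge condition: iota_G(rho(e)) is a subset of iota_R(kappa(e))
    edge_ok = all(
        set(product(*({iota_G[g] for g in rho[v]} for v in nu[e])))
        <= relation[kappa[e]]
        for e in E)
    return vertex_ok and edge_ok

# "ADULT: Witch --THREATEN--> CHILD: Hansel" in a matching interpretation
valid = is_valid(
    V={"v1", "v2"}, E={"e"}, nu={"e": ("v1", "v2")},
    kappa={"v1": "ADULT", "v2": "CHILD", "e": "THREATEN"},
    rho={"v1": {"Witch"}, "v2": {"Hansel"}},
    iota_G={"Witch": "Witch", "Hansel": "Hansel"},
    extent={"ADULT": {"Witch"}, "CHILD": {"Hansel", "Gretel"}},
    relation={"THREATEN": {("Witch", "Hansel")}},
)
```

Combined with Prop. 1, running such a check against the standard model of G1 decides whether G1 entails a given concept graph.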
5 The Standard Graph of a Relational Context
The construction of a standard model for a given concept graph gives not only an efficient mathematical method for reasoning, but also a mechanism to translate the knowledge given in concept graphs into knowledge formalized in relational contexts. This possibility to translate from the graphical level to the contextual level is important for the development of conceptual knowledge systems. For such a conceptual knowledge system, the opposite direction is equally important. How can we translate knowledge given in relational contexts into the language of concept graphs? Obviously, we can state many different valid concept graphs for a given relational context. If we look for a so-called standard graph that codes the same information as the relational context, we have to look for a concept graph that entails all other valid concept graphs. For a similar purpose, R. Wille proposed a procedure to construct a canonical concept graph in [12]. We will modify this procedure for our purpose here. We start with a relational context, say K := ((G, R), M, I). For constructing a concept graph, we need an alphabet (C, G, R). We define C := B(K), G := G and R := R. For every index k = 1, ..., n, we determine for every relation R ∈ Rk all maximal k-tuples (A1, ..., Ak) of non-empty subsets of G whose product is included in R. All those (k + 1)-tuples (R, A1, ..., Ak) are collected in the set E_K. That means, we define for R ∈ Rk the set

Refmax(R) := {A1 × ... × Ak ⊆ R | B1 × ... × Bk ⊆ R implies B1 × ... × Bk ⊉ A1 × ... × Ak}

and obtain the set of edges

E_K := ⋃ₖ₌₁ⁿ {(R, A1, ..., Ak) | R ∈ Rk, A1 × ... × Ak ∈ Refmax(R)}.
Now, we define

V_K := {A ⊆ G | there exists (R, A1, ..., Ak) ∈ E_K with A = Ai for an i ≤ k} ∪ {g^II ⊆ G | g ∈ G},

and set ν_K: E_K → ⋃ₖ₌₁ⁿ V_K^k with ν_K(R, A1, ..., Ak) := (A1, ..., Ak), and κ_K: V_K ∪ E_K → B(K) ∪ R where κ_K(R, A1, ..., Ak) := R and κ_K(A) := (A^II, A^I). Finally, we can choose ρ_K(A) := A for all A ∈ V_K. In this way, we obtain a concept graph G(K) := (V_K, E_K, ν_K, κ_K, ρ_K) that is valid in (K, id) and is called the standard graph of K.

Proposition 3. The standard graph G(K) of a relational context K entails every concept graph G′ that is valid in (K, id).

This proposition (which is proved in the appendix) guarantees the demanded property of the standard graph. It is an irredundant graph that entails all concept graphs which are valid in its context. Thus, the standard graph is the counterpart to the standard model. With the standard model, we gather all the information given in the concept graph and have a tool to translate it from the graph level into the context level. Vice versa, we can translate information from the context level to the graph level by constructing the standard graph. The relationship between a context K and the context K_{G(K)}, belonging to the standard model (K_{G(K)}, ι^{G(K)}) of G(K), can also be described: the proof of Prop. 3 shows that the context K_{G(K)} differs from K only in that its set of attributes is not reduced and the attributes have different names. Their concept lattices are isomorphic. Vice versa, starting with a concept graph G and constructing the standard model (K_G, ι^G), we cannot say that the standard graph of K_G is isomorphic to G in the formal sense, because it is not a concept graph over the same alphabet. Nevertheless, it encodes the same information in an irredundant form.
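For small contexts, the set Refmax(R) from the construction above can be computed by brute force over all non-empty subsets of G. The sketch below uses our own function names, exploits the fact that A1 × ... × Ak ⊆ B1 × ... × Bk holds (for non-empty components) iff Ai ⊆ Bi for every i, and is only feasible for tiny object sets.

```python
from itertools import combinations, product

def refmax(R, k, G):
    # all non-empty subsets of G
    subs = [frozenset(c) for r in range(1, len(G) + 1)
            for c in combinations(sorted(G), r)]
    # all "boxes" A1 x ... x Ak whose product is contained in the relation R
    boxes = [box for box in product(subs, repeat=k)
             if set(product(*box)) <= R]
    # keep only the maximal boxes (componentwise strict containment)
    def strictly_below(a, b):
        return a != b and all(a[i] <= b[i] for i in range(k))
    return [b for b in boxes
            if not any(strictly_below(b, b2) for b2 in boxes)]

# THREATEN = {(Witch, Hansel), (Witch, Gretel)} on G = {Hansel, Gretel, Witch}
threaten = {("Witch", "Hansel"), ("Witch", "Gretel")}
maximal = refmax(threaten, 2, {"Hansel", "Gretel", "Witch"})
```

Here the only maximal box is {Witch} × {Hansel, Gretel}, so the standard graph gets a single THREATEN edge with these two references.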
6 Contextual Logic for Knowledge Representation
With the approach to contextual logic presented in this paper, we have proposed a logic for concept graphs that is equipped with a model-theoretic foundation and in which inferences can be characterized in multiple ways. From a computational point of view, an efficient method has been presented to do reasoning by checking validity in the corresponding standard models. The major domain of application we have in mind for this logic is conceptual knowledge representation and processing. In particular, the contextual semantics allows an integration of concept graphs into conceptual knowledge systems like TOSCANA that are based on Formal Concept Analysis. Vice versa, an integration of concept lattices and various methods of Formal Concept Analysis into tools for conceptual graphs is possible. For this purpose, the separation of syntax and semantics is less important than the possibility of expressing knowledge on two different levels, the graph level and the context level. With the
standard model and the standard graph, we have developed two notions that help to translate knowledge from one level to the other. With them, the foundation is laid for conceptual knowledge systems which combine the advantages of both languages. For example, we can imagine a system that codes knowledge in relational contexts and provides, with the concept graphs, a graphical language as interface and representation tool for knowledge. In such a system, the knowledge engineer could extend a given knowledge base by constructing new concept graphs over the existing alphabet. Then, implemented algorithms on the graph level or on the context level (whichever is more convenient in the specific situation) could check whether the new concept graph is already valid in the context (i.e., the information is redundant) or whether it represents additional information. Concept lattices could be used to find the conceptual hierarchy on the concepts and to determine the conceptual patterns and dependencies of concepts and objects. Obviously, we could profit from all the methods and algorithms already existing for conceptual graphs. The architecture of conceptual knowledge systems including relational contexts and concept graphs should be discussed, and the role of the different languages should be further explored. As the expressivity of the developed language is still quite limited, extensions by quantifiers and nested concept graphs are considered in current research.
7 Appendix: Formal Proofs
Proof of Proposition 1. We only have to prove that G2 is valid in an arbitrary model (K, λ) for G1 with K := ((G, R), M, J) and λ := λG ∪̇ λC ∪̇ λR if G2 is valid in the standard model (K_G1, ι^G1) of G1 with K_G1 = ((G, R^G1), C, I^G1). As a result of the vertex condition for G1 in the model (K, λ), we have λG ρ1(v) ⊆ Ext λC κ1(v) ⊆ Ext λC(c) for all concept names c ∈ C and all vertices v ∈ V1 with κ1(v) ≤C c (because λC is order-preserving). It follows that λG(⋃{ρ1(v) | v ∈ V1, κ1(v) ≤C c}) ⊆ Ext λC(c) for all c ∈ C. As a result of the vertex condition for G2 in the standard model (K_G1, ι^G1), we have ρ2(w) ⊆ Ext ιC^G1(κ2(w)) := ⋃{ρ1(v) | v ∈ V1, κ1(v) ≤C κ2(w)} for all vertices w ∈ V2. This implies for all w ∈ V2 the vertex condition λG(ρ2(w)) ⊆ λG(⋃{ρ1(v) | v ∈ V1, κ1(v) ≤C κ2(w)}) ⊆ Ext λC(κ2(w)). For the edge condition, one can proceed similarly. □

Proof of Soundness: G1 ⊢ G2 ⇒ G1 |= G2. Due to the transitivity of |=, it suffices to show soundness for each single inference rule. Therefore, we will give the exact definition of every inference rule by describing the derived concept graph G2. Then, we can prove the entailment by using Prop. 1 and checking that G2 is valid in the standard model (K_G1, ι^G1) of G1 := (V1, E1, ν1, κ1, ρ1). Because of ιG^G1 := id_G, Ext(ιC^G1 c) = ⋃{ρ1(v) | v ∈ V1, κ1(v) ≤C c} for all c ∈ C and ιR^G1 R = ⋃{ρ1(e) | e ∈ E1, κ1(e) ≤R R} for all R ∈ R (cf. Def. 6), we only have to convince ourselves that G2 satisfies the following vertex and edge conditions:
∀ w ∈ V2 : ρ2(w) ⊆ ⋃{ρ1(v) | v ∈ V1, κ1(v) ≤C κ2(w)}  (vertex condition)
∀ f ∈ E2 : ρ2(f) ⊆ ⋃{ρ1(e) | e ∈ E1, κ1(e) ≤R κ2(f)}  (edge condition).

1. Double a vertex. The concept graph derived by doubling the vertex v ∈ V1 is G2 := (V2, E2, ν2, κ2, ρ2), defined by
• V2 := V1\{v} ∪̇ {(v, 1), (v, 2)},
• E2 := E1\Ev ∪̇ E^v with Ev := {e ∈ E1 | ν1(e)|i = v for some i = 1, ..., |e|} and E^v := {(e, δ) | e ∈ Ev, δ ∈ {1, 2}^[e,v]} where [e, v] := {i | ν1(e)|i = v},
• ν2|E1\Ev := ν1|E1\Ev and, for all (e, δ) ∈ E^v, ν2(e, δ)|i := ν1(e)|i if i ∉ [e, v] and ν2(e, δ)|i := (v, δ(i)) if i ∈ [e, v],
• κ2: V2 ∪ E2 → C ∪ R with x ↦ κ1(x) for all x ∈ V1\{v} ∪ E1\Ev, (v, j) ↦ κ1(v) for j = 1, 2, and (e, δ) ↦ κ1(e) for all (e, δ) ∈ E^v,
• ρ2|V1\{v} := ρ1|V1\{v} and ρ2(v, j) := ρ1(v) for j = 1, 2.
For this derived concept graph G2, the vertex and edge conditions can be checked easily. This is left to the reader.
2. Delete an isolated vertex. If v ∈ V1 is an isolated vertex of G1 (i.e., there is no edge e ∈ E1 and no i = 1, ..., |e| with ν1(e)|i = v), the components of the concept graph G2 derived by deleting the isolated vertex v are defined as follows: V2 := V1\{v}, E2 := E1, ν2 := ν1, κ2 := κ1|V2 ∪ E1 and ρ2 := ρ1|V2. These components obviously satisfy the vertex and edge conditions.
3. Double an edge. The concept graph G2 derived by doubling the edge e ∈ E1 is defined by V2 := V1, E2 := E1\{e} ∪ {(e, 1), (e, 2)} where (e, 1), (e, 2) ∉ E1, ν2|E1\{e} := ν1|E1\{e} and ν2(e, j) := ν1(e) for j = 1, 2, κ2|V1∪(E1\{e}) := κ1|V1∪(E1\{e}) and κ2(e, j) := κ1(e) for j = 1, 2, and ρ2 := ρ1. It satisfies the vertex and edge conditions.
4. Delete an edge. Deleting the edge e ∈ E1, one obtains the concept graph G2 := (V1, E1\{e}, ν1|E1\{e}, κ1|V1∪(E1\{e}), ρ1), which satisfies the vertex and edge conditions.
5.∗ Exchange a concept name. The concept graph derived by replacing the concept name κ1(v) by a c ∈ C for which there is a vertex w ∈ V1 with κ1(w) ≤C c and ρ1(v) ⊆ ρ1(w) is defined by G2 := (V1, E1, ν1, κ2, ρ1) with κ2|(V1\{v})∪E1 := κ1|(V1\{v})∪E1 and κ2(v) := c. The edge condition is obviously satisfied, and the vertex condition is satisfied because κ1(w) ≤C c implies ρ1(v) ⊆ ρ1(w) ⊆ Ext ιC^G1 c.
6.∗ Exchange a relation name. The concept graph derived by replacing the relation name κ1(e) by such an R ∈ R for which there is an edge f ∈ E1 with κ1(f) ≤R R and ρ1(e) ⊆ ρ1(f) is defined by G2 := (V1, E1, ν1, κ2, ρ1) with κ2|V1∪(E1\{e}) := κ1|V1∪(E1\{e}) and κ2(e) := R. It satisfies the edge condition because κ1(f) ≤R R implies ρ1(e) ⊆ ρ1(f) ⊆ ιR^G1 R.
Simple Concept Graphs: A Logic Approach
7. Restrict references. The concept graph derived by restricting the reference ρ1(v) of the vertex v ∈ V1 to the reference A with ∅ ≠ A ⊆ ρ1(v) is defined by G2 := (V1, E1, ν1, κ1, ρ2) with ρ2|V1\{v} := ρ1|V1\{v} and ρ2(v) := A. From A ⊆ ρ1(v) we deduce the vertex condition.
8. Copy the concept graph. For a copied concept graph G2, there exist two bijections ϕV : V1 → V2 and ϕE : E1 → E2 such that κ1(v) = κ2(ϕV(v)) and ρ1(v) = ρ2(ϕV(v)) for all v ∈ V1, and ϕV(ν1(e)) = ν2(ϕE(e)) and κ1(e) = κ2(ϕE(e)) for all e ∈ E1. It trivially satisfies the vertex and edge conditions.
9∗. Join vertices with equal references. The concept graph derived from G1 by joining the two vertices v and w with equal references (i.e. with ρ1(v) = ρ1(w)) is G2 := (V2, E1, ν2, κ2, ρ2) with
• V2 := (V1 \ {v, w}) ∪̇ {v ∨ w},
• ν2(e)|i := v ∨ w if ν1(e)|i = v or ν1(e)|i = w, and ν2(e)|i := ν1(e)|i otherwise, for all e ∈ E1 and i = 1, . . . , |e|,
• κ2|(V1\{v,w})∪E1 := κ1|(V1\{v,w})∪E1 and κ2(v ∨ w) := c for a c ∈ C with κ1(v) ≤C c and κ1(w) ≤C c,
• ρ2|V1\{v,w} := ρ1|V1\{v,w} and ρ2(v ∨ w) := ρ1(v).
The vertex and edge conditions are satisfied: from ρ1(v) = ρ2(v ∨ w), κ1(v) ≤C c and κ1(w) ≤C c, we deduce Ext(κ1(v)) ∪ Ext(κ1(w)) ⊆ Ext(κ2(v ∨ w)).
10∗. Join vertices with corresponding edges. Let us assume that the vertices v, w ∈ V1 have corresponding, but uncommon, edges; that means for every edge e ∈ Ev (i.e., incident with v) there exists an edge e′ ∈ Ew, and vice versa, with κ1(e) = κ1(e′), ν1(e)|i = v for exactly one i ∈ {1, . . . , |e|} and ν1(e′)|i = w, ν1(e)|j ≠ w and ν1(e′)|j ≠ v for all j = 1, . . . , |e|, and ρ1(ν1(e)|j) = ρ1(ν1(e′)|j) if ν1(e)|j ≠ v. Then the concept graph derived from G1 by joining the two vertices v and w is G2 := (V2, E1, ν2, κ2, ρ2), where V2 and κ2 are defined as in Rule 9∗, and ρ2 is defined by ρ2|V1\{v,w} := ρ1|V1\{v,w} and ρ2(v ∨ w) := ρ1(v) ∪ ρ1(w).
The vertex and edge conditions are satisfied because κ1(v) ≤C c and κ1(w) ≤C c imply Ext(κ1(v)) ∪ Ext(κ1(w)) ⊆ Ext(κ2(v ∨ w)). □

Proof of Completeness: G1 |= G2 ⇒ G1 ⊢ G2. We will prove completeness by using so-called stars, which are concept graphs with only one edge and its incident vertices. For a given concept graph G := (V, E, ν, κ, ρ), the stars of G are all those stars which are subgraphs of G, i.e. all concept graphs G′ := (V′, E′, ν|V′∪E′, κ|V′∪E′, ρ|V′∪E′) where E′ := {e} for an edge e ∈ E and V′ := {ν(e)|i | i = 1, . . . , |e|}. The stars are interesting because we can derive a concept graph from the set of all its stars and its isolated vertices using Rule 9∗ (join vertices with equal references). Consequently, it suffices to prove that every star A of G2 can be derived from G1 if G1 entails G2. Using
S. Prediger
Rule 8 (copy concept graph), we obtain enough copies to derive all stars of G2, from which we can derive G2. Let G1 and G2 be two concept graphs with G1 |= G2 and let A be a star of G2 with edge f and vertices w1, w2, . . . , wk. For deriving A from G1, we proceed in three steps. (i.) First, we derive stars from G1 such that, for every tuple (g1, . . . , gk) of objects in ρ2(f), there is a star Ag1,...,gk with edge eg1,...,gk and ρ(eg1,...,gk) = {g1} × . . . × {gk}. (ii.) Then, we join these stars in several steps by joining the corresponding vertices. We obtain a star B with an edge f′ that has the same references as the star A of G2, but it does not necessarily have the same concept and relation names. (iii.) In order to adapt the concept and relation names by Rules 5∗ and 6∗, we first have to derive isolated vertices vi for every vertex wi of A with κ1(vi) = κ2(wi) and ρ1(vi) ⊇ ρ2(wi). Then we can finally deduce a copy of A from B.
i) As A is valid in the standard model (K_G1, ι_G1) and κ2(f) ∈ R, there exists a set T ⊆ κ1(E1) of relations such that ι_R^{G_1}(κ2(f)) = ⋃{ι_R^{G_1}(R) | R ∈ T}. Consequently, for all (g1, . . . , gk) ∈ ρ2(f), there exists an R ∈ T such that ι_G^{G_1}(g1, . . . , gk) = (g1, . . . , gk) ∈ ι_R^{G_1}(R). Because of R ∈ κ1(E1), we can find an edge eg1,...,gk ∈ E1 with (g1, . . . , gk) ∈ ρ1(eg1,...,gk). By means of Rules 2 and 4 (delete vertices and edges), we can derive, for all tuples (g1, . . . , gk) ∈ ρ2(f), the corresponding star of G1 with the edge eg1,...,gk. Using Rule 7 (restrict references), we restrict the references to g1, . . . , gk. In this way, we derive stars denoted by Ag1,...,gk with vertices denoted by vg1, . . . , vgk.
ii) In the first substep, we join the kth vertices of all stars Ag1,...,gk where the first k − 1 references are identical. For every tuple (g1, . . . , gk−1) ∈ ρ2(w1) × . . . × ρ2(wk−1), we consider all stars Ag1,...,gk−1,gk with gk ∈ ρ2(wk) and unify the relation names κ(eg1,...,gk−1,gk) by Rule 6∗ (exchange relation names) into a common relation name Rg1,...,gk−1. As all gk belong to ρ2(wk), they satisfy κ(eg1,...,gk−1,gk) ≤R κ(f). Thus, we find a common relation name Rg1,...,gk−1 ≤R κ(f). Thereafter, we join the kth vertices of all changed concept graphs Ag1,...,gk−1,gk by Rule 10∗ (join vertices with corresponding edges). Then, we join their first, then second, and finally their (k − 1)th vertices. After deleting the double edges (Rule 4), we obtain a star with k vertices that we denote by Ag1,...,gk−1. It has an edge eg1,...,gk−1, and we have ρ(eg1,...,gk−1) = {g1} × . . . × {gk−1} × ρ2(wk) and κ(eg1,...,gk−1) ≤R κ(f). In the second substep, we join the vertices of all those stars Ag1,...,gk−1 (which all have the same kth reference set) that correspond in the (k − 1)th reference. Applying Rules 6∗, 10∗ and 4, we obtain concept graphs Ag1,...,gk−2 with the edge eg1,...,gk−2 satisfying ρ(eg1,...,gk−2) = {g1} × . . . × {gk−2} × ρ2(wk−1) × ρ2(wk). After k steps of joining, we obtain a star B with edge f′ that has the same references as the edge f of A.
iii) As A is valid in the standard model (K_G1, ι_G1), every vertex wi of A satisfies ρ2(wi) ⊆ Ext(ι_C^{G_1}(κ2(wi))) = ⋃{ρ1(v) | v ∈ V1, κ1(v) ≤C κ2(wi)}.
Thus, for every vertex wi of A, we can use Rule 4 (delete edges) and derive all isolated vertices v ∈ V1 with κ1(v) ≤C κ2(wi). By means of Rule 10∗ (join vertices with corresponding edges), they can be joined into an isolated vertex vi with κ1(vi) = κ2(wi) and ρ1(vi) ⊇ ρ2(wi). Finally, we can exchange the concept and relation names (Rules 5∗ and 6∗) and, by means of Rule 2 (delete all isolated vertices), we obtain a concept graph that is isomorphic to A. Taken as a whole, this proves G1 ⊢ A. □
Proof of Proposition 3. Let G0 be valid in (K, id). We prove the assertion by showing that G0 is valid in the standard model of the standard graph G(K) := G and by using Prop. 1. The standard model (K_G, ι_G) of the concept graph G with K_G = ((G, R), B(K), I_G) satisfies ι_G^G = id_G, and ι_R^G(R) = ⋃{ρ_K(e) | e ∈ E, κ_K(e) ≤ R} = R for all R ∈ R. This implies ι_R^G = id_R. Consequently, the edge condition for G0 in (K_G, ι_G) is satisfied. Furthermore, we have ι_C^G(c) := (c^{I_G}, c^{I_G I_G}) and, according to the definition of the incidence relation in the standard model, we have for every concept name c ∈ C the equations c^{I_G} = ⋃{ρ_K(A) | κ_K(A) ≤ c} = ⋃{g^{II} | (g^{II}, g^{I}) ≤ c} = {g^{II} | g ∈ c^{I}} = c^{I}. Thus, the vertex condition is also satisfied. □
Two FOL Semantics for Simple and Nested Conceptual Graphs

G. Simonet

LIRMM (CNRS and Université Montpellier II), 161, rue Ada, 34392 Montpellier Cedex 5, France, tel: (33)0467418543, fax: (33)0467418500, email: [email protected]
Abstract. J.F. Sowa has defined a FOL semantics for Simple Conceptual Graphs and proved the soundness of the graph operation called projection with respect to this semantics. M. Chein and M.L. Mugnier have proved the completeness result, with a restriction on the form of the target graph of the projection. I propose here another FOL semantics for Simple Conceptual Graphs corresponding to a slightly different interpretation of a Conceptual Graph. Soundness and completeness of the projection with respect to this semantics hold without any restriction. I extend the definitions and results on both semantics to Conceptual Graphs containing co-reference links and to Nested Conceptual Graphs.
1 Introduction
The Conceptual Graphs model has been proposed by J.F. Sowa [11] as a Semantic Network model for Knowledge Representation. J.F. Sowa has defined a FOL (First Order classical Logic) semantics, denoted by Φ, for Simple Conceptual Graphs and proved the soundness of the graph operation called projection with respect to this semantics. M. Chein and M.L. Mugnier [1,5] have studied the basic model of Conceptual Graphs and several of its extensions. Among other results, they have proved the completeness of the projection with respect to the semantics Φ in Simple Graphs, with a restriction on the form of the target graph of the projection. This result shows that reasoning on Conceptual Graphs may be performed using graph operations instead of logical provers. The semantics Φ has already been extended to Nested Conceptual Graphs in non-classical logics with “nested” formulas, i.e. formalisms in which a formula may appear as an argument of a predicate [11]. Thus the structure of a Nested Graph is directly translated into the structure of a nested formula. These logics are similar to the logics of contexts of [4] and [3]. A. Preller et al. [6] have defined a sequent formal system with nested formulas and proved the soundness and completeness of this formal system with respect to the projection in the Simple and Nested Graphs models. More details about related works may be found in [2].

M.-L. Mugnier and M. Chein (Eds.): ICCS’98, LNAI 1453, pp. 240–254, 1998. © Springer-Verlag Berlin Heidelberg 1998

I propose here a slightly different interpretation of a Conceptual Graph which leads to the need for co-reference links in Simple Graphs and to another FOL semantics denoted by Ψ. Projection is shown sound and complete with respect to
the semantics Ψ without any restriction. Then the definitions and results about the semantics Φ and Ψ are extended to Nested Graphs. More details about these extensions may be found in [10,9].
2 Simple Graphs

2.1 Basic Notions
I briefly recall the basic (simplified) definitions: support, SG, projection, semantics Φ, normal graph, normal form of a graph, as well as the soundness and completeness result. For more details on these definitions, see [11,1,5]. I first specify some notations. Any partial order is denoted by ≤; |= is the entailment relation symbol in FOL. Basic ontological knowledge is encoded in a support S = (TC, TR, I). TC and TR are type sets, respectively a set of concept types and a set of relation types, partially ordered by an A-Kind-Of relation. Relation types may have any arity greater than or equal to 1, and two comparable relation types must have the same arity. I is a set of individual markers. All supports also possess a generic marker denoted by ∗. The following partial order is defined on the set of markers I ∪ {∗}: ∗ is the greatest element, and elements of I are pairwise noncomparable. Asserted facts are encoded by Simple Graphs. A SG (Simple Graph) on a support S is a labelled bipartite graph G = (R, C, E, l). R and C are the node sets, respectively the relation node set and the concept node set. E is the set of edges. The edges incident on a relation node are totally ordered; they are numbered from 1 to the degree of the relation node. The ith neighbor of relation r is denoted by Gi(r). Each node has a label given by the mapping l. A relation node r is labelled by a relation type type(r), and the degree of r is equal to the arity of type(r). A concept node c is labelled by a pair (type(c), ref(c)), where type(c) is a concept type and ref(c), the referent of c, either belongs to I — then c is an individual node — or is the generic marker ∗ — then c is a generic node. The SG of Figure 1 represents the information: a person looks at the photograph A (the generic referent ∗ is omitted).
Fig. 1. A Simple Graph
A projection from a SG G = (RG, CG, EG, lG) to a SG H = (RH, CH, EH, lH) is a mapping Π from RG to RH and from CG to CH which
1. preserves adjacency and order on edges: ∀rc ∈ EG, Π(r)Π(c) ∈ EH, and if c = Gi(r) then Π(c) = Hi(Π(r));
2. may decrease labels: ∀x ∈ RG ∪ CG, lH(Π(x)) ≤ lG(x).
The partial order on relation labels is that of TR. The partial order on concept labels is the product of the partial orders on TC and on I ∪ {∗}. SGs are given a semantics in FOL, denoted by Φ [11]. Given a support S, a constant is assigned to each individual marker and an n-adic (resp. unary) predicate is assigned to each n-adic relation (resp. concept) type. For simplicity, we consider that each constant or predicate has the same name as the associated element of the support. To S is assigned a set of formulas Φ(S) which corresponds to the interpretation of the partial orderings of TR and TC: Φ(S) is the set of formulas ∀x1...xp (t(x1, ..., xp) → t′(x1, ..., xp)), where t and t′ are types such that t ≤ t′ and p is the arity of t and t′. Φ maps any graph G on S into a formula Φ(G) in the following way. First assign to each concept node c a term which is a variable if c is generic, and otherwise the constant corresponding to ref(c). Two distinct generic nodes receive distinct variables. Then assign an atom to each node of G: the atom tc(e) to a concept node c, where tc stands for the type of c and e is the term associated with c; the atom tr(e1, ..., ep) to a relation node r, where tr stands for the type of r and ei is the term associated with the ith neighbor of r. Φ(G) is the existential closure of the conjunction of these atoms. E.g. the formula associated with the graph of Figure 1 is Φ(G) = ∃y1y2 (Person(y1) ∧ Look(y2) ∧ photograph(A) ∧ agent(y2, y1) ∧ object(y2, A)). Projection is sound and complete with respect to the semantics Φ, i.e.: given two SGs G and H defined on S, if there is a projection from G to H then Φ(S), Φ(H) |= Φ(G) (soundness [11]); conversely, if H is normal and Φ(S), Φ(H) |= Φ(G) then there is a projection from G to H (completeness [5]). A graph is normal iff each individual marker appears at most once in concept node labels.
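The mapping Φ is a direct graph-to-formula translation. The following Python sketch (using a hypothetical dict encoding of a SG; all names are ours, not the paper's) builds the variable list and the atom set of Φ(G); the existential closure over the variables is left implicit.

```python
from itertools import count

def phi(G):
    """Sketch of Sowa's semantics Phi for a Simple Graph G, encoded as a dict:
    G["concepts"]: node -> (concept type, referent), referent "*" for generic;
    G["relations"]: node -> (relation type, ordered list of concept neighbors).
    Returns (variables, atoms), atoms being (predicate, argument-tuple) pairs."""
    fresh = count(1)
    term = {}
    for c, (t, ref) in G["concepts"].items():
        # a generic node gets a fresh variable, an individual node its constant
        term[c] = f"y{next(fresh)}" if ref == "*" else ref
    atoms = [(t, (term[c],)) for c, (t, _) in G["concepts"].items()]
    for r, (t, neighbors) in G["relations"].items():
        atoms.append((t, tuple(term[c] for c in neighbors)))
    # variables are the fresh "y..." terms (constants never use that prefix here)
    variables = sorted({e for e in term.values() if e.startswith("y")})
    return variables, atoms
```

Running it on an encoding of the SG of Figure 1 reproduces the atoms of the formula Φ(G) given above.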
A graph G can be transformed into a normal graph G′ (called the normal form of G) by merging all concept nodes having the same individual referent, provided that these nodes have the same type. The formulas Φ(G) and Φ(G′) are trivially equivalent. Figure 2 is a counterexample to the completeness result when H is not normal: Φ(G) = t(a) ∧ r(a, a) and Φ(H) = t(a) ∧ t(a) ∧ r(a, a).
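The normalization step is itself a simple computation. Here is a sketch in the same hypothetical dict encoding as above (names ours): all concept nodes carrying the same individual referent are merged, and relation edges are redirected to the surviving node.

```python
def normal_form(G):
    """Merge all concept nodes with the same individual referent (sketch).
    The paper additionally requires merged nodes to share the same type,
    which we enforce with an assertion."""
    rep = {}       # individual referent -> representative concept node
    rename = {}    # old node name -> surviving node name
    for c, (t, ref) in G["concepts"].items():
        if ref == "*":
            rename[c] = c                 # generic nodes are never merged
        else:
            rep.setdefault(ref, c)
            rename[c] = rep[ref]
            assert G["concepts"][rep[ref]][0] == t, "merged nodes must share a type"
    concepts = {rename[c]: lab for c, lab in G["concepts"].items()}
    relations = {r: (t, [rename[c] for c in nb])
                 for r, (t, nb) in G["relations"].items()}
    return {"concepts": concepts, "relations": relations}
```

On an encoding of the graph H of Figure 2, this merges the two nodes labelled (t, a), turning the edge r into a loop: the normal form of H is G.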
Fig. 2. Counterexample to the completeness result in SGs with Φ
Φ(S), Φ(H) |= Φ(G) (as Φ(H) and Φ(G) are equivalent formulas), but there is no projection from G to H. The normal form of H is G.

2.2 Co-Reference Links
According to the semantics Φ, merging several individual nodes having the same referent (i.e. representing the same entity) does not change the meaning of the graph. However, in some applications, it may be useful to represent the same entity with different concept nodes corresponding to different aspects or viewpoints
on this entity, so that merging these nodes would destroy some information. Moreover, the different concept nodes representing the same entity may have different types (and therefore different labels in the graph) according to the different aspects of the entity that they represent. When a specified individual is represented by several nodes, the individual marker suffices for expressing that these nodes represent the same entity. But in the case of an unspecified entity represented by several generic nodes, an additional structure, called a co-reference link, is needed. A SGref (SG with co-reference links) is a graph G = (R, C, E, l, co-ref), where co-ref is an equivalence relation on the set of generic nodes of G (the co-reference relation). The intuitive semantics of co-ref is “represents the same entity as”. Any SG may be considered as a SGref in which the co-reference relation is reduced to the identity one. The relation co-ref is naturally extended to an equivalence relation co-ident on the set C of all concept nodes of G (the co-identity relation). Every equivalence class of the co-identity relation is a set of concept nodes representing the same entity which are either co-referent generic nodes (in that case they are explicitly linked by a co-reference link in a graphical representation of G) or individual nodes having the same referent: ∀c, c′ ∈ C, co-ident(c, c′) iff (co-ref(c, c′) or ref(c) = ref(c′) ∈ I). The definitions and results on SGs are modified as follows by the introduction of co-reference links. A projection from a SGref G = (RG, CG, EG, lG, co-refG) to a SGref H = (RH, CH, EH, lH, co-refH) is a projection Π from the SG (RG, CG, EG, lG) to the SG (RH, CH, EH, lH) that preserves co-identity, i.e. ∀c, c′ ∈ CG, if co-refG(c, c′) then co-identH(Π(c), Π(c′)). Note that the co-identity of individual nodes is already preserved: for any c, c′ ∈ CG, if ref(c) = ref(c′) ∈ I then ref(Π(c)) = ref(Π(c′)) = ref(c) ∈ I.
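Since co-identity is the smallest equivalence relation containing the co-reference links and identifying individual nodes with equal markers, its classes can be computed with a standard union-find pass. A minimal Python sketch (hypothetical encoding and names, not the paper's):

```python
def co_identity_classes(nodes, coref_pairs):
    """Compute the co-identity classes of a SGref (sketch).
    nodes: concept node -> referent ("*" for a generic node);
    coref_pairs: co-reference links between generic nodes.
    Generic nodes are grouped along co-ref; individual nodes are grouped
    by equal individual markers.  Plain union-find with path halving."""
    parent = {c: c for c in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):
        parent[find(x)] = find(y)

    for c, c2 in coref_pairs:
        union(c, c2)
    seen = {}                       # individual marker -> first node carrying it
    for c, ref in nodes.items():
        if ref != "*":
            seen.setdefault(ref, c)
            union(c, seen[ref])
    classes = {}
    for c in nodes:
        classes.setdefault(find(c), set()).add(c)
    return {frozenset(s) for s in classes.values()}
```

Each returned class is one co-identity class, i.e. one represented entity.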
Φ may be extended to SGrefs by assigning the same variable to co-referent generic nodes. A SGref is normal iff it contains no co-reference links and each individual marker appears at most once in concept node labels (i.e. iff the co-identity relation is reduced to the identity one). The normal form of a SGref G is obtained from G by merging its co-identical nodes (provided that they have the same type). Therefore co-reference links in a SG G have no real interest when G is interpreted according to the semantics Φ. Their interest will appear with the semantics Ψ and in Nested Graphs for both semantics.

2.3 Another FOL Semantics
The goal here is to modify the semantics Φ into a semantics Ψ which would express the fact that the same entity is represented by several concept nodes. The structure of a graph G should be fully represented in Ψ(G), which is not the case in Φ(G). Ψ(G) is defined as follows. Two terms are assigned to each concept node c of G. The first term ēc represents the co-identity class of c: it is the constant corresponding to ref(c) if c is an individual node, and otherwise the variable assigned to the co-reference class of c (ēc is the term assigned to c in Φ(G)). The second term ec is a variable representing the node
itself. All these variables are distinct. Assign the atom tc(ēc, ec) to each concept node c and the atom tr(ec1, ..., ecp) to each relation node r of G, where ci is the ith neighbor of r. Ψ(G) is the existential closure of the conjunction of these atoms. E.g. in Figure 2, Ψ(G) = ∃x1 (t(a, x1) ∧ r(x1, x1)) and Ψ(H) = ∃x1x2 (t(a, x1) ∧ t(a, x2) ∧ r(x1, x2)). Ψ(S) is defined as Φ(S), except that the predicates associated with concept types become binary. Projection is sound and complete with respect to this semantics without any restriction.

Theorem 1. Let G and H be two SGrefs. Ψ(S), Ψ(H) |= Ψ(G) iff there is a projection from G to H.

E.g. in Figure 2, Ψ(S), Ψ(H) ⊭ Ψ(G) and there is no projection from G to H. Proof: the proof is similar to that with the semantics Φ. However, it is given here because the complete proof with Φ (for SGs) is only available in French [5] and because Lemmas 1 and 2 will be used for NGrefs. For any formula F, let C(F) denote the clausal form of F. C(Ψ(H)) is the set of atomic clauses obtained from the atoms of Ψ(H) by substituting Skolem constants for the variables. Let ρ be this substitution. C(¬Ψ(G)) contains a unique clause whose literals are the negations of the atoms of Ψ(G). The Herbrand universe UH of the formula Ψ(S) ∧ Ψ(H) ∧ ¬Ψ(G) is the set of constants appearing in C(Ψ(H)) or in Ψ(G). A Ψ-substitution from G to H w.r.t. ρ (in short, a Ψ-substitution from G to H) is a substitution σ of the variables of Ψ(G) by UH constants such that for any atom t(e1, ..., en) of Ψ(G), there is t′ ≤ t such that σ(t′(e1, ..., en)) is an atom of C(Ψ(H)) (i.e. for any atom t(e1, ..., en) of Ψ(G), there is an atom t′(e′1, ..., e′n) of Ψ(H) such that t′ ≤ t and, for any i in {1, ..., n}, σ(ei) = ρ(e′i)). Theorem 1 immediately follows from Lemmas 1 and 2. □

Lemma 1: Ψ(S), Ψ(H) |= Ψ(G) iff there is a Ψ-substitution from G to H. Proof: let C = C(Ψ(S) ∧ Ψ(H) ∧ ¬Ψ(G)). Ψ(S), Ψ(H) |= Ψ(G) iff C is unsatisfiable.
If there is a Ψ -substitution from G to H then the empty clause can be obtained from C by the resolution method, so C is unsatisfiable. Conversely, let us suppose that C is unsatisfiable. Let v be the Herbrand interpretation defined by: for any predicate t of arity n and any constants a1 , ..., an of UH , v(t)(a1 , ..., an ) is true iff there is t0 ≤ t such that t0 (a1 , ..., an ) is an atom of C(Ψ (H)). v is a model of C(Ψ (S) ∧ Ψ (H)). Then v is not a model of C(¬Ψ (G)), which provides a Ψ -substitution from G to H. 2 Lemma 2: If there is a projection Π from G to H then there is a Ψ substitution σ from G to H such that for any concept node c of G, σ(ec ) = ρ(eΠ(c) ) and σ(ec ) = ρ(eΠ(c) ). Conversely, if there is a Ψ -substitution σ from G to H then there is a projection Π from G to H such that for any concept node c of G, σ(ec ) = ρ(eΠ(c) ) and σ(ec ) = ρ(eΠ(c) ). Proof: let us suppose that there is a projection Π from G to H. For any variable x of Ψ (G), let c be the concept node of G such that x = ec (resp. ec ) and let σ(x) be the UH constant ρ(eΠ(c) ) (resp. ρ(eΠ(c) )). Then for any concept node c of G, σ(ec ) = ρ(eΠ(c) ) and σ(ec ) = ρ(eΠ(c) ). For any atom t(e1 , ..., en ) of Ψ (G), let s be a node of G associated with this atom. The atom of Ψ (H) associated with Π(s) is in the form t0 (e01 , ..., e0n ), with t0 ≤ t and for any i in {1, ..., n}, σ(ei ) = ρ(e0i ). σ is a Ψ -substitution from G to H.
Conversely, let us suppose that there is a Ψ-substitution σ from G to H. For any node s of G, let t(e1, ..., en) be the atom of Ψ(G) associated with s and let Π(s) be a node of H associated with an atom of the form t′(e′1, ..., e′n), with t′ ≤ t and, for any i in {1, ..., n}, σ(ei) = ρ(e′i). Then for any node s of G, type(Π(s)) ≤ type(s), and for any concept node c of G, σ(ēc) = ρ(ēΠ(c)) and σ(ec) = ρ(eΠ(c)). Π fulfills the label-decreasing condition. Let us show that it preserves adjacency and order on edges. If c = Gi(r) then σ(ec) = ρ(eΠ(c)) = ρ(eHi(Π(r))), hence Π(c) = Hi(Π(r)). Note that with the semantics Φ, we only have σ(ēc) = ρ(ēΠ(c)) = ρ(ēHi(Π(r))), which only says that Π(c) and Hi(Π(r)) are co-identical, and we need the normality condition to ensure that Π(c) = Hi(Π(r)). Let us show that Π preserves co-identity. If co-refG(c, c′) then ēc = ēc′, so σ(ēc) = σ(ēc′) = ρ(ēΠ(c)) = ρ(ēΠ(c′)), hence co-identH(Π(c), Π(c′)). Π is a projection from G to H. □
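The difference between Φ and Ψ on the graphs of Figure 2 can be made concrete. The sketch below builds the Ψ atoms of a SG in the same hypothetical dict encoding used earlier (names ours); for simplicity it omits co-reference links, so each generic node is its own co-identity class.

```python
from itertools import count

def psi(G):
    """Sketch of the semantics Psi for a SG without co-reference links.
    Each concept node gets a class term (its constant if individual, else a
    fresh class variable) and a node variable of its own; concept atoms are
    t(class term, node variable), relation atoms use the node variables.
    With co-reference links, co-referent generic nodes would share one
    class variable; that case is omitted here."""
    node_fresh, class_fresh = count(1), count(1)
    node_var = {c: f"x{next(node_fresh)}" for c in G["concepts"]}
    atoms = []
    for c, (t, ref) in G["concepts"].items():
        class_term = ref if ref != "*" else f"y{next(class_fresh)}"
        atoms.append((t, (class_term, node_var[c])))
    for r, (t, nb) in G["relations"].items():
        atoms.append((t, tuple(node_var[c] for c in nb)))
    return atoms
```

On the graphs of Figure 2 this yields the atom sets of Ψ(G) and Ψ(H) given above: the loop r(x1, x1) appears only for G, so Ψ, unlike Φ, separates the two graphs.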
3 Nested Graphs

3.1 The Model
The Nested Graphs model makes it possible to associate with any concept node a partial internal description in the form of a Nested Graph. E.g. the Nested Graph (with co-reference links) of Figure 3 is obtained from the SG of Figure 1 by adding a partial internal description to the concept node labelled (photograph, A): the photograph represents a person on a boat (and adding a co-reference link to express that the person on the boat is the person looking at the photograph). A NG (Nested Graph) G, its depth depth(G) and the set UCG of concept nodes
Fig. 3. A Nested Graph with co-reference links
appearing in G are defined by structural induction. 1. A basic NG G′ is obtained from a SG G by adding to the label of each concept node c a third field, denoted by Desc(c), equal to ∗∗ (the empty description); depth(G′) = 0 and UCG′ = CG′ = CG. 2. Let G be a basic NG, c1, c2, ..., ck concept nodes of G, and G1, G2, ..., Gk NGs. The graph G′ obtained by substituting Gi for the description ∗∗ of ci, for i = 1, ..., k, is a NG; depth(G′) = 1 + max{depth(Gi), 1 ≤ i ≤ k} and UCG′ = CG ∪ (∪1≤i≤k UCGi).
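The inductive definition of depth(G) and UC_G translates directly into a recursive data structure. A minimal Python sketch (our own hypothetical encoding, not the paper's):

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

@dataclass
class NG:
    """Sketch of a Nested Graph: concepts maps a node name to
    (type, referent, description), where description is None for the
    empty description ** or another NG."""
    concepts: Dict[str, Tuple[str, str, Optional["NG"]]]
    relations: Dict[str, Tuple[str, List[str]]] = field(default_factory=dict)

def depth(g: NG) -> int:
    """depth(G) = 0 for a basic NG, else 1 + max depth of its descriptions."""
    nested = [d for (_, _, d) in g.concepts.values() if d is not None]
    return 0 if not nested else 1 + max(depth(d) for d in nested)

def all_concepts(g: NG) -> set:
    """The set UC_G of concept nodes appearing anywhere in G."""
    out = set(g.concepts)
    for (_, _, d) in g.concepts.values():
        if d is not None:
            out |= all_concepts(d)
    return out
```

For a NG shaped like Figure 3 (one description graph nested in the photograph node), depth returns 1 and all_concepts collects the concept nodes of both levels.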
It is important to note (for the following definition of A(G)) that if a SG or NG H is used several times in the construction of a NG G, we consider that several copies of H (and not several times the graph H itself) are used in the construction of G. A NG can be denoted by G = (R, C, E, l) as a SG, except that the label l(c) of any concept node c has a third field Desc(c) (in addition to the fields type(c) and ref(c)) which is either ∗∗ or a NG. A complex node is a concept node c such that Desc(c) is a NG. The set of complex nodes of G is denoted by D(G). A NGref (NG with co-reference links) is a graph G = (R, C, E, l, co-ref), where co-ref is an equivalence relation on the set of generic nodes appearing in G. NGrefs are defined by structural induction from SGrefs as NGs from SGs, with the following definition of co-ref. If G′ is the basic NGref obtained from a SGref G, then co-refG′ = co-refG. If G′ is the NGref obtained from a basic NGref G, concept nodes ci of G and NGrefs Gi, then co-refG′ is an equivalence relation on the set of generic nodes of UCG′ such that its restriction to the set of generic nodes of CG (resp. UCGi) is co-refG (resp. co-refGi). The relation co-ref is extended to an equivalence relation co-ident on the set UCG as in the SGref model. Any NGref G has an associated rooted tree A(G) whose nodes are the SGrefs used in the construction of G and whose edges are of the form (J, c)K, where J and K are nodes of A(G) and c is a concept node of J. Its root is denoted by R(G). E.g. the rooted tree associated with the NGref G of Figure 3 is represented in Figure 4: it has two nodes R(G) and K1 and one edge (R(G), c1)K1 (co-reference links are not edges of A(G)). A(G) is defined by structural induction. If G′ is the
Fig. 4. Rooted tree associated with the NGref of Figure 3
basic NGref obtained from a SGref G, then A(G′) is reduced to its root G. If G′ is obtained from a basic NGref G, concept nodes ci of G and NGrefs Gi, then A(G′) is obtained from the rooted trees A(Gi) by adding R(G) as the root and the edges (R(G), ci)R(Gi). The sets CG and CR(G) will be identified, as well as the set UCG and the union of the sets CK, K node of A(G). The relation co-refG is translated in A(G) into an equivalence relation co-refA(G) on the union of the generic node sets of the nodes of A(G). A projection Π from a NGref G to a NGref H and the image by Π of any node c of UCG, denoted by Π(c), are defined by induction on the depth of G.
A projection from a NGref G to a NGref H is a family Π = (Π0, (Πc)c∈D(G)) which satisfies:
1. Π0 is a projection from the SGref R(G) to the SGref R(H);
2. ∀c ∈ D(G), Π0(c) ∈ D(H) and Πc is a projection from Desc(c) to Desc(Π0(c));
3. Π preserves co-identity: ∀c, c′ ∈ UCG, if co-refG(c, c′) then co-identH(Π(c), Π(c′)), where Π(c) = Π0(c) if c ∈ CG and otherwise, letting c1 be the node of D(G) such that c ∈ UCDesc(c1), Π(c) = Πc1(c).
Projection can be formulated in terms of the rooted trees A(G) and A(H) as follows. Let A1 = (X1, U1) and A2 = (X2, U2) be two rooted trees of SGrefs with co-reference links and let R1 and R2 be their respective roots. An A-projection from A1 to A2 is a family ϕ = (ϕ0, (ϕK)K∈X1) which satisfies:
1. ϕ0 is a mapping from X1 to X2 that maps R1 to R2;
2. for any node K of A1, ϕK is a projection from the SGref K to the SGref ϕ0(K);
3. ϕ preserves adjacency in rooted trees: ∀(J, c)K ∈ U1, (ϕ0(J), ϕJ(c))ϕ0(K) ∈ U2;
4. ϕ preserves co-identity: ∀K, K′ ∈ X1, ∀(c, c′) ∈ CK × CK′, if co-refA1(c, c′) then co-identA2(ϕK(c), ϕK′(c′)).
Condition 3 on the image of an edge (J, c)K by an A-projection ϕ is represented in Figure 5. It can be shown by induction on the depth of G that, for any NGrefs
Fig. 5. Edge (J, c)K and its image by an A-projection ϕ
G and H, there is a projection from G to H iff there is an A-projection from A(G) to A(H).

3.2 The Semantics Φ
In this section I extend the semantics Φ to NGrefs. I mentioned earlier that co-reference links are of little interest in a SG interpreted in the
semantics Φ because of the possibility of merging nodes representing the same entity. Note that such nodes appearing in a NG G may be merged if they appear in the same SG of A(G) (the resulting node having as description the union of the descriptions of the original nodes), but not otherwise. Thus the introduction of co-reference links in NGs interpreted in any semantics does improve their expressive power. In order to extend Φ to NGrefs, we add a first argument to each predicate, called the context argument. Thus a binary predicate is associated with each concept type, and an (n+1)-adic predicate is associated with each n-adic relation type. For instance, if z (resp. x, y) is the variable assigned to a generic node of type Photograph (resp. Person, Boat), then the interpretation of the atom Person(z, x) is “x is a person in the context of the photograph z” and that of the atom on(z, x, y) is “the person x is on the boat y in the context of the photograph z”. With S is associated the set Φ(S) of formulas ∀zx1...xp (t(z, x1, ..., xp) → t′(z, x1, ..., xp)), where t and t′ are types such that t ≤ t′ and p is the arity of t and t′ in the semantics Φ for SGs. For any NGref G, let r(G) be the number of equivalence classes of the relation co-refG. Assign r(G) distinct variables y1, ..., yr(G) to the co-refG classes. For any node K of A(G), if K contains generic nodes then the variables of Φ(K) are some of the variables y1, ..., yr(G). A subgraph of G is a NGref that is either equal to G or to the description graph of a concept node appearing in G. A formula Φ′(e, K) (resp. Φ′(e, G′)) is associated with any term e and any node K of A(G) (resp. subgraph G′ of G). Φ′(e, K) (resp. Φ′(e, G′)) is the formula associated with K (resp. G′) when it is in the context represented by the term e, or more specifically, when K is the root of the description graph (resp. G′ is the description graph) of a concept node associated with the term e.
Φ′(e, K) is defined as the conjunction of the atoms obtained from those of Φ(K) by adding the first argument e. Φ′(e, G′) is defined by induction on the depth of G′. For any node K of A(G) and any concept node c of K, ec denotes the term assigned to c in Φ(K) (which is in fact the term assigned to the co-identity class of c).

Φ′(e, G′) = Φ′(e, R(G′)) ∧ (∧c∈D(G′) Φ′(ec, Desc(c)))

The formula Φ(G) = ∃y1 ... yr(G) Φ′(a0, G) is associated with any NGref G defined on the support S, where a0 is a constant representing the general context induced by the support S, so that the same constant a0 is used for all NGrefs defined on the support S. E.g. the formula associated with the graph G of Figure 3 is

Φ(G) = ∃y1 y2 y3 (Person(a0, y1) ∧ Look(a0, y2) ∧ Photograph(a0, A) ∧ agent(a0, y2, y1) ∧ object(a0, y2, A) ∧ Φ′(A, Desc(c))),

where c is the concept node with referent A and Φ′(A, Desc(c)) = Person(A, y1) ∧ Boat(A, y3) ∧ on(A, y1, y3). Φ(G) may be defined from A(G):

Φ(G) = ∃y1 ... yr(G) (∧K node of A(G) Φ′(eK, K))

where eK = a0 if K = R(G) and otherwise, (J, c)K being the edge of A(G) into K, eK = ec. The normality notion and the soundness and completeness results extend to NGrefs. Let us first consider two counterexamples to the completeness result
Two FOL Semantics for Simple and Nested Conceptual Graphs
249
(presented in Figure 6). Φ(G1) = t(a0, a) ∧ r(a0, a, a) and Φ(H1) = t(a0, a) ∧ t(a0, a) ∧ r(a0, a, a). Φ(G1) and Φ(H1) are equivalent formulas, but there is no projection from G1 to H1. The problem here is that a node of A(H1) is not a normal SGref. Φ(G2) = t(a0, a) ∧ t(a, a) ∧ t(a, a) and Φ(H2) = t(a0, a) ∧ t(a, a). Φ(G2) and Φ(H2) are equivalent formulas, but there is no projection from G2 to H2. The problem here is that there are two co-identical concept nodes appearing in H2 such that one of them is a complex node and the other one is not.

Fig. 6. Counterexample to the completeness result in NGrefs with Φ

We have seen that the semantics Φ does not distinguish co-identical nodes in a SGref. In the same way, it does not distinguish which one(s) of co-identical nodes appearing in a NGref contain(s) a given piece of information in its (their) description graph(s), as the description graphs of these nodes have the same context argument (the term assigned to the co-identity class of the nodes). A strong definition of normality could be the following. A NGref G is strongly normal iff (1) every node of A(G) is a normal SGref and (2) for any co-identical concept nodes c and c′ appearing in two distinct nodes of A(G), if c is a complex node then c′ is a complex node and Desc(c′) is an exact copy of Desc(c), i.e. Desc(c′) is a copy of Desc(c) such that each generic node appearing in Desc(c) is co-referent to its copy appearing in Desc(c′). The fact that Desc(c′) is an exact copy of Desc(c), and not only a copy of Desc(c), is important. E.g. consider the NGrefs G, H and K in Figure 7.

Fig. 7. Strong normality

G and K are strongly normal, but H is not. Φ(G), Φ(H) and Φ(K) are equivalent formulas, but there is no projection from
G or K to H. Condition (2) of strong normality may be weakened as follows. Let G be a NGref and G′ and H′ be two SGrefs nodes of A(G) (resp. two NGrefs subgraphs of G). An exact projection from G′ to H′ is a projection from G′ to H′ mapping each generic node c of CG′ (resp. UCG′) to a generic node of CH′ (resp. UCH′) co-referent to c in G. G′ and H′ are exactly equivalent iff there is an exact projection from G′ to H′ and from H′ to G′.

A NGref G is normal iff
1. every node of A(G) is a normal SGref,
2. for any co-identical concept nodes c and c′ appearing in two distinct nodes of A(G), if c is a complex node then c′ is a complex node and R(Desc(c′)) is exactly equivalent to R(Desc(c)).

It can be shown by induction on the depth of Desc(c) that for any co-identical complex nodes c and c′ appearing in a normal NGref, not only the SGrefs R(Desc(c)) and R(Desc(c′)) but the whole NGrefs Desc(c) and Desc(c′) are exactly equivalent. Moreover, Desc(c) and Desc(c′) are exact copies of each other except for redundant relation nodes (a relation node r appearing in a node K of A(G) is redundant iff there is another relation node r′ of K with the same respective neighbors as r and type(r′) ≤ type(r)). However, putting a NGref G into normal form (i.e. transforming G into a normal NGref G′ such that Φ(G) ≡ Φ(G′)) is not always possible. E.g. consider the graph H2 of Figure 6. If H2′ were a normal NGref such that Φ(H2) ≡ Φ(H2′) then Φ(H2′) would contain the atom t(a, a), then any concept node of UCH2′ with referent a would have a concept node with referent a in the root of its description graph, and there would be an infinite chain in A(H2′), which is impossible. The definition of normality is weakened into that of k-normality in such a way that any NGref G may be put into k-normal form for any k ≥ depth(G). The level in G of a node c of UCG is the level in A(G) of the node K of A(G) containing c (i.e.
the number of edges of the path in A(G) from R(G) to K). A NGref G is k-normal iff
1. every node of A(G) is a normal SGref,
2. for any co-identical concept nodes c and c′ appearing in two distinct nodes of A(G) such that c is a complex node, if the level of c′ in G is less than k then c′ is a complex node, and if c′ is a complex node then R(Desc(c′)) is exactly equivalent to R(Desc(c)).

Note that a normal NGref is k-normal for any natural number k.

Theorem 2. Let G and H be two NGrefs and k ≥ depth(G). If there is a projection from G to H then Φ(S), Φ(H) |= Φ(G). Conversely, if H is k-normal and Φ(S), Φ(H) |= Φ(G) then there is a projection from G to H.

Proof: We use the same notations as in the proof of Theorem 1. A Φ-substitution from a NGref G to a NGref H is defined in a similar way as a Ψ-substitution from a SGref G to a SGref H. Lemma 1 is true for NGrefs instead of SGrefs and the semantics Φ instead of Ψ. Lemma 2 is true for SGrefs and for the semantics Φ instead of Ψ provided that H is a normal SGref (the condition σ(ec) = ρ(eΠ(c)) disappears as the term ec does not exist in the semantics Φ).
Let us suppose that there is a projection from G to H. Then there is an A-projection ϕ = (ϕ0, (ϕK)K node of A(G)) from A(G) to A(H). From Lemma 2, for any node K of A(G), there is a Φ-substitution σK from K to ϕ0(K) such that for any concept node c of K, σK(ec) = ρ(eϕK(c)). If a variable x is assigned to two co-referent nodes c and c′ appearing in different nodes K and K′ of A(G) then σK(x) = σK(ec) = ρ(eϕK(c)) and σK′(x) = σK′(ec′) = ρ(eϕK′(c′)). As ϕ preserves co-identity, ϕK(c) = ϕK′(c′), then σK(x) = σK′(x). Therefore there is a substitution σ of the variables of Φ(G) by UH constants such that for any node K of A(G), the restriction of σ to the variables of Φ(K) is σK. The atoms of Φ(G) are obtained from those of the formulas Φ(K), K node of A(G), by adding the context argument eK. To show that σ is a Φ-substitution from G to H, it remains to show that for any node K of A(G), σ(eK) = ρ(eϕ0(K)). If K = R(G) then ϕ0(K) = R(H) and σ(eK) = ρ(eϕ0(K)) = a0; otherwise let (J, c)K be the edge of A(G) into K, eK = ec. From the definition of an A-projection, (ϕ0(J), ϕJ(c))ϕ0(K) is an edge of A(H), then eϕ0(K) = eϕJ(c). It follows that σ(eK) = σ(ec) = ρ(eϕJ(c)) = ρ(eϕ0(K)). Then σ is a Φ-substitution from G to H and we conclude with Lemma 1.

Conversely, let us suppose that H is k-normal with k ≥ depth(G) and Φ(S), Φ(H) |= Φ(G). From Lemma 1, there is a Φ-substitution σ from G to H. We construct an A-projection ϕ from A(G) to A(H) such that for any node K of A(G) and any concept node c of K, σ(ec) = ρ(eϕK(c)). We define ϕ0(K) and ϕK for any node K of A(G) and prove the preceding property by induction on the level l of K in A(G).

For l = 0, K is R(G). The atoms of Φ(G) (resp. Φ(H)) associated with the nodes of R(G) (resp. R(H)) are those with a0 as first argument. Then the restriction of σ to the variables of Φ(R(G)) is a Φ-substitution from R(G) to R(H). From Lemma 2, there is a projection Π0 from R(G) to R(H) such that for any concept node c of R(G), σ(ec) = ρ(eΠ0(c)). We define ϕ0(R(G)) = R(H) and ϕR(G) = Π0.

Suppose ϕ is defined up to level l. Let (J, c)K be an edge of A(G) with K at level l + 1. Let s be a node of K and t(e1, ..., en) be the atom of Φ(G) associated with s. Let t′(e′1, ..., e′n) be an atom of Φ(H) such that t′ ≤ t and for any i in {1, ..., n}, σ(ei) = ρ(e′i).
Let K′ be a node of A(H) and s′ a concept node of K′ such that the atom t′(e′1, ..., e′n) is associated with s′. e1 = eK = ec and e′1 = eK′, then σ(ec) = ρ(eK′), then eK′ ≠ a0, then there is an edge (J′, c′)K′ into K′ and eK′ = ec′. By induction hypothesis, σ(ec) = ρ(eϕJ(c)). We have ρ(ec′) = ρ(eK′) = σ(ec) = ρ(eϕJ(c)), then c′ and ϕJ(c) have the same co-identity class, i.e. c′ and ϕJ(c) are co-identical nodes. H is k-normal, c′ and ϕJ(c) are co-identical, c′ is a complex node and levelH(ϕJ(c)) = levelG(c) < depth(G) ≤ k, then ϕJ(c) is a complex node and K′ is exactly equivalent to R(Desc(ϕJ(c))). Let s″ be the image of s′ by an exact projection from K′ to R(Desc(ϕJ(c))). The atom of Φ(R(Desc(ϕJ(c)))) associated with s″ is of the form t″(e′2, ..., e′n) with t″ ≤ t′ ≤ t. It follows that the restriction of σ to the variables of Φ(K) is a Φ-substitution from K to R(Desc(ϕJ(c))). From Lemma 2, there is a projection ΠK from K to R(Desc(ϕJ(c))) such that for any concept node c″ of K, σ(ec″) = ρ(eΠK(c″)). We define ϕ0(K) = R(Desc(ϕJ(c))) and ϕK = ΠK. ϕ0(K) = R(Desc(ϕJ(c)))
may be written as: (ϕ0(J), ϕJ(c))ϕ0(K) is an edge of A(H), then ϕ preserves adjacency in rooted trees. Let us show that ϕ preserves co-identity. For any nodes K and K′ of A(G) and any concept nodes c and c′ of K and K′ respectively, if co-refA(G)(c, c′) then c and c′ have the same co-identity class, then σ(ec) = σ(ec′) = ρ(eϕK(c)) = ρ(eϕK′(c′)), then ϕK(c) = ϕK′(c′), i.e. co-identA(H)(ϕK(c), ϕK′(c′)). ϕ is an A-projection from A(G) to A(H), then there is a projection from G to H. □

The k-normal form G′k of G is defined for any k ≥ depth(G) as follows. A rooted tree Ak of SGrefs is built level by level from its root to level k. Its root is a copy of R(G). Let J be a node of Ak at level l < k and c a concept node of J (c is the copy of a node c′ of UCG). If there is at least one complex node of UCG co-identical to c′ then add the edge (J, c)K to Ak, where K is a copy of the union of the SGrefs R(Desc(c″)), c″ complex node of UCG co-identical to c′. Two generic nodes appearing in nodes of Ak are co-referent iff they are copies of co-referent nodes of UCG. Let G′k be the NGref such that A(G′k) is obtained from Ak by replacing each node of Ak by its normal form (when merging several co-identical concept nodes, only one of the subtrees issued from these nodes is kept, as these subtrees are exact copies of each other in Ak). G′k is k-normal and Φ(G) ≡ Φ(G′k) for any k ≥ depth(G). E.g. the 2-normal form of the graph H2 of Figure 6 is H2 and its 3-normal form is G2. In Figure 7, for any k ≥ 2, G and K are their own k-normal form and the k-normal form of H is K.

Corollary 1 of Theorem 2 shows that reasoning on NGrefs may be performed using graph operations without any restriction on the NGrefs.

Corollary 1: Let G and H be two NGrefs and H′k be the k-normal form of H, with k ≥ max(depth(G), depth(H)). Φ(S), Φ(H) |= Φ(G) iff there is a projection from G to H′k.

Note that from a practical point of view, it is sufficient to construct H′k up to level depth(G) since projection preserves levels. The semantics Φ is unable to express not only that an entity is represented by several concept nodes in a node of A(G), but also that several concept nodes representing the same entity have distinct descriptions.
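This limitation can be made concrete in a few lines of Python (a sketch only; the atom encoding, node names and predicates are all illustrative, not from the paper). With the co-identity class term as context argument, as in Φ, exchanging the descriptions of two co-identical nodes leaves the set of atoms unchanged; with a per-node context term, as in the semantics Ψ introduced below, it does not:

```python
# Each description atom is (predicate, argument tuple); the translation
# prefixes the context term of the holding node.  All names illustrative.

def atoms(desc_by_node, context_term):
    """Context-prefixed atoms of all description graphs."""
    out = set()
    for node, desc in desc_by_node.items():
        ctx = context_term[node]
        out |= {(pred, ctx) + args for pred, args in desc}
    return out

bio  = [('Trout', ('x1',)), ('lives_in', ('x1', 'y1'))]   # biological study
tour = [('Bathing', ('x2',)), ('at', ('x2', 'y1'))]       # touristic study

# Φ: c and c' are co-identical, so both descriptions get the class term y1.
phi = {'c': 'y1', "c'": 'y1'}
G = atoms({'c': bio,  "c'": tour}, phi)
H = atoms({'c': tour, "c'": bio},  phi)   # descriptions exchanged
print(G == H)          # True: Φ cannot tell G and H apart

# Ψ: each node carries its own context variable, so the exchange is visible.
psi = {'c': 'xc', "c'": 'xc2'}
print(atoms({'c': bio, "c'": tour}, psi) ==
      atoms({'c': tour, "c'": bio}, psi))   # False
```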
It is pertinent in applications in which the meaning of a graph is not changed when merging co-identical concept nodes of a SGref or replacing the description of co-identical concept nodes by the union of their descriptions (in particular in applications where graphs are naturally normal). But in some applications, each concept node has a specific situation in its SGref and a specific description; merging these nodes or mixing their descriptions would destroy some information. For instance, let c and c′ be two concept nodes appearing in a NGref G and representing a given lake. c appears in the context of a biological study (i.e. is related in a SGref to nodes, and appears in the description graphs of nodes, concerning a biological study) and contains biological information about the lake in its description graph: animals and plants living in the lake. c′ appears in the context of a touristic study and contains touristic information about the lake in its description graph: possibilities of bathing, sailing and walking at the lake. The formulas Φ(G), Φ(H) and Φ(K) are equivalent, where H is obtained from G by exchanging the description graphs of c and c′ and K is the k-normal form of G and H (with k = max(depth(G), depth(H))), in which the description of c and c′
is the union of the biological and touristic descriptions. Such an equivalence is obviously undesirable. In those applications, the semantics to be used is the semantics Ψ.

3.3 The Semantics Ψ
The semantics Ψ is extended from SGrefs to NGrefs in the same way as the semantics Φ, except that the context argument is the variable ec assigned to the concept node c representing the context instead of the term ec. Thus a description is specific to a concept node and not to a co-identity class. We add a context argument to each predicate (concept type predicates become 3-adic). For any NGref G and any node K of A(G), let nc(K) be the number of K concept nodes. Assign r(G) variables y1, ..., yr(G) to the r(G) co-refG classes and, for any node K of A(G), nc(K) variables x^K_1, ..., x^K_nc(K) to K concept nodes. All variables yi and x^K_j are distinct. The variables of Ψ(K) are x^K_1, ..., x^K_nc(K) and, if K contains generic nodes, some of the variables y1, ..., yr(G). For any node K of A(G), Ψ′(e, K) is the conjunction of the atoms obtained from those of Ψ(K) by adding the first argument e. Ψ′(e, G′) is defined by induction on the depth of G′. For any node K of A(G) and any concept node c of K, ec denotes the variable assigned to c in Ψ(K).

Ψ′(e, G′) = ∃x^R(G′)_1 ... x^R(G′)_nc(R(G′)) (Ψ′(e, R(G′)) ∧ (∧c∈D(G′) Ψ′(ec, Desc(c))))

The formula Ψ(G) = ∃y1 ... yr(G) Ψ′(a0, G) is associated with any NGref G defined on the support S. E.g. the formula associated with the graph G2 of Figure 6 is Ψ(G2) = ∃x1 (t(a0, a, x1) ∧ ∃x2 (t(x1, a, x2) ∧ ∃x3 t(x2, a, x3))). Ψ(G) may be defined from A(G) as the existential closure of the conjunction of the formulas Ψ′(eK, K), K node of A(G), where eK is a0 if K is R(G) and otherwise, (J, c)K being the edge of A(G) into K, eK is ec. E.g. the formula associated with the graph G2 of Figure 6 may be written as Ψ(G2) = ∃x1 x2 x3 (t(a0, a, x1) ∧ t(x1, a, x2) ∧ t(x2, a, x3)). Projection is sound and complete with respect to Ψ.

Theorem 3. Let G and H be two NGrefs. Ψ(S), Ψ(H) |= Ψ(G) iff there is a projection from G to H.

E.g.
in Figure 6, for any i in {1, 2}, Ψ(S), Ψ(Hi) ⊭ Ψ(Gi) and there is no projection from Gi to Hi.

Sketch of proof: A technical proof is given in [9], similar to that concerning Φ. A more intuitive one is given here, which would not be available for Φ. Let S0 be the support obtained from S by adding the individual marker a0, the universal concept type ⊤ (if it does not already exist) and a binary relation type tcontext (context relation). For any NGref G on S, let G0 be the NGref on S0 reduced to a concept node labelled (⊤, a0, G) and Simple(G) be the SGref on S0 obtained from the union of the nodes of A(G0) by adding for each edge
(J, c)K of A(G0) nc(K) relation nodes of type tcontext relating c to each concept node of K. It can be shown that (1) there is a projection from G to H iff there is a projection from Simple(G) to Simple(H), and (2) Ψ(S), Ψ(H) |= Ψ(G) iff Ψ(S0), Ψ(Simple(H)) |= Ψ(Simple(G)) (any substitution of the variables of Ψ(G) leading to the empty clause by the resolution method is available for Ψ(Simple(G)), and conversely). We conclude with the soundness and completeness result on SGrefs. This proof would not be available for the semantics Φ because the SGref Simple(H) obtained from a k-normal NGref H containing distinct co-identical nodes is not normal. □
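The flattening Simple(G) used in this proof sketch can be illustrated as follows (a simplified Python sketch; the tree encoding is my own, and relation nodes of the nested SGs are omitted for brevity):

```python
# A context is (concept_names, children), where children pairs a complex
# concept node with its nested context.  For each nesting edge, tcontext
# relation nodes relate the complex node to each concept node of the
# nested SG, as in the construction of Simple(G).

def simple(context):
    concepts, children = context
    all_concepts = list(concepts)
    relations = []
    for holder, sub in children:
        # one tcontext relation per concept node of the nested SG
        relations += [('tcontext', holder, c) for c in sub[0]]
        sub_concepts, sub_relations = simple(sub)
        all_concepts += sub_concepts
        relations += sub_relations
    return all_concepts, relations

# G0: a top node a0 whose description is the whole graph (a photo node
# carrying a nested description of a person and a boat).
g  = (['photo'], [('photo', (['person', 'boat'], []))])
g0 = (['a0'], [('a0', g)])
print(simple(g0))
```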
4 Conclusion
I have presented two FOL semantics for Simple and Nested Graphs. Projection is sound and complete with respect to both semantics. This result shows that reasoning on Conceptual Graphs may be performed using graph operations instead of logical provers (e.g. reasoning with Graph Rules [8,7]). Non-FOL formalisms, with "nested" formulas as in the logics of contexts, have been proposed for Nested Graphs. It remains to compare these formalisms, and the semantics that may be associated with them, to the FOL semantics presented here.
References

1. M. Chein and M.L. Mugnier. Conceptual Graphs: fundamental notions. Revue d'Intelligence Artificielle, 6(4):365–406, 1992. Hermès, Paris.
2. M. Chein, M.L. Mugnier, and G. Simonet. Nested graphs: A graph-based knowledge representation model with FOL semantics. In Proceedings of the Sixth International Conference on Principles of Knowledge Representation and Reasoning (KR'98), Trento, Italy, June 1998.
3. R.V. Guha. Contexts: a formalization and some applications. Technical Report ACT-CYC-42391, MCC, December 1991. PhD Thesis, Stanford University.
4. J. McCarthy. Notes on Formalizing Context. In Proc. IJCAI'93, pages 555–560, 1993.
5. M.L. Mugnier and M. Chein. Représenter des connaissances et raisonner avec des graphes. Revue d'Intelligence Artificielle, 10(1):7–56, 1996. Hermès, Paris.
6. A. Preller, M.L. Mugnier, and M. Chein. Logic for Nested Graphs. Computational Intelligence Journal, (CI 95-02-558), 1996.
7. E. Salvat. Raisonner avec des opérations de graphes : graphes conceptuels et règles d'inférence. PhD thesis, Montpellier II University, France, December 1997.
8. E. Salvat and M.L. Mugnier. Sound and Complete Forward and Backward Chainings of Graph Rules. In ICCS'96, Lecture Notes in A.I. Springer Verlag, 1996.
9. G. Simonet. Une autre sémantique logique pour les graphes conceptuels simples ou emboîtés. Research Report 96-048, L.I.R.M.M., 1996.
10. G. Simonet. Une sémantique logique pour les graphes conceptuels emboîtés. Research Report 96-047, L.I.R.M.M., 1996.
11. J.F. Sowa. Conceptual Structures: Information Processing in Mind and Machine. Addison Wesley, 1984.
Peircean Graphs for the Modal Logic S5

Torben Braüner*
Centre for Philosophy and Science-Theory, Aalborg University, Langagervej 6, 9220 Aalborg East, Denmark
[email protected]
Abstract. Peirce completed his work on graphical methods for reasoning within propositional and predicate logic, but left unfinished similar systems for various modal logics. In the present paper, we put forward a system of Peircean graphs for reasoning within the modal logic S5. It is proved that our graph-based formulation of S5 is indeed equivalent to the traditional Hilbert-Frege formulation. Our choice of proof-rules for the system is proof-theoretically well motivated, as the rules are graph-based analogues of Gentzen style rules as appropriate for S5. Compared to the system of Peircean graphs for S5 suggested in [17], our system has fewer rules (two instead of five), and moreover, the new rules seem more in line with the Peircean graph-rules for propositional logic.
1 Introduction
It was a major event in the history of diagrammatic reasoning when Charles Sanders Peirce (1839–1914) developed graphical methods for reasoning within propositional and predicate logic [18]. This line of work was taken up again in 1984, when conceptual graphs, which are generalisations of Peircean graphs, were introduced in [15]. Since then, conceptual graphs have gained widespread use within Artificial Intelligence. The recent book [1] witnesses a general interest in logical reasoning with diagrams within the areas of logic, philosophy and linguistics. Furthermore, the book witnesses a practically motivated interest in diagrammatic reasoning related to the increasing use of visual displays within such diverse areas as hardware design, computer aided learning and multimedia. Peirce completed his work on graphical methods for reasoning within propositional and predicate logic but left unfinished similar systems for various modal logics; see the account given in [12]. In the present paper, we put forward a system of Peircean graphs for reasoning within the modal logic S5. The importance of this logic is recognised within many areas, notably philosophy, mathematical logic, Artificial Intelligence and computer science. It is proved that our graph-based formulation of the modal logic S5 is indeed equivalent to the traditional Hilbert-Frege formulation. Our choice of proof-rules for the system is proof-theoretically well motivated as the rules are graph-based analogues of Gentzen
* The author is supported by the Danish Natural Science Research Council.
M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 255–269, 1998. © Springer-Verlag Berlin Heidelberg 1998
256
T. Braüner
style rules as appropriate for S5.¹ Gentzen style is one way of formulating a logic which is characterised by particularly appealing proof-theoretic properties. It should be mentioned that a system of Peircean graphs for S5 was also suggested in [17]. However, our system has fewer rules (two instead of five), and moreover, the new rules seem more in line with the Peircean graph-rules for propositional logic.

In the second section of this paper we give an account of propositional and modal logic in Hilbert-Frege style. Graph-based systems for reasoning within propositional logic and the modal logic S5 are given in respectively the third and the fourth section. In section five it is proved that our graph-based formulation of S5 is equivalent to the Hilbert-Frege formulation. In section six we discuss possible further work.
2 Propositional and Modal Logic
In this section we shall give an account of propositional logic and the modal logic S5 along the lines of [14]. See also [8, 9]. We formulate the logics in the traditional Hilbert-Frege style. Formulae for propositional logic are defined by the grammar

s ::= p | s ∧ ... ∧ s | ¬(s)

where p is a propositional letter. Parentheses are left out when appropriate. Given formulae φ and ψ, we abbreviate ¬(φ ∧ ¬ψ) and ¬(¬φ ∧ ¬ψ) as respectively φ ⇒ ψ and φ ∨ ψ.

Definition 1. The axioms and proof-rules for propositional logic are as follows:

A1 ⊢ φ ⇒ (ψ ⇒ φ).
A2 ⊢ (φ ⇒ (ψ ⇒ θ)) ⇒ ((φ ⇒ ψ) ⇒ (φ ⇒ θ)).
A3 ⊢ (¬φ ⇒ ¬ψ) ⇒ (ψ ⇒ φ).
Modus Ponens If ⊢ φ and ⊢ φ ⇒ ψ then ⊢ ψ.

Given the definition of derivability for propositional logic, it is folklore that one can prove soundness and completeness in the sense that a formula is derivable using the axioms and rules if and only if it is valid with respect to the standard truth-functional semantics. Formulae for modal logic are defined by extending the grammar for propositional logic with the additional clause

s ::= ... | □(s)

The connective □ symbolises "it is necessary that". Given a formula φ, we abbreviate ¬□¬φ as ♦φ. It follows that ♦ symbolises "it is possible that". It is a
¹ This gives rise to a highly interesting discussion on whether graphs should be considered as a generalisation of Gentzen or/and Natural Deduction style, or perhaps Hilbert-Frege style. (Note that there is danger of confusion here, as Gentzen discovered Natural Deduction style as well as what is usually called Gentzen style, [6].)
notable property of the new connective that it is not truth-functional. Whereas the truth-value of, for example, the proposition ¬φ is determined by the truth-value of the proposition φ, it is not the case that the truth-value of □φ is determined by the truth-value of φ. For example, the truth of φ does not necessarily imply the truth of □φ. This can be illustrated by taking φ to symbolise either "all bachelors are unmarried" or "Bill Clinton is president of USA". The proposition φ is true in both cases, but □φ is true in the first case whereas it is false in the second.

Definition 2. The axioms and proof-rules for S5 are constituted by all axioms and proof-rules for propositional logic together with the following:

K ⊢ □(φ ⇒ ψ) ⇒ (□φ ⇒ □ψ).
T ⊢ □φ ⇒ φ.
S5 ⊢ ♦φ ⇒ □♦φ.
Necessitation If ⊢ φ then ⊢ □φ.

It may be worth pointing out the relation between S5 and other well known modal logics. We get the modal logic S4 if axiom S5 is replaced by the axiom ⊢ □φ ⇒ □□φ, called S4. It is straightforward to show that the logic S5 is stronger than the logic S4 in the sense that any formula provable in S4 is provable in S5 also. We get the modal logic T if axiom S5 is left out, and the modal logic K is obtained by leaving out S5 as well as T. The various modal logics described correspond to different notions of necessity. We shall here concentrate on S5, where □ can be considered as symbolising "it is under all possible circumstances the case that". Thus, the notion of necessity in question here is concerned with possible ways in which things might have been; it is not concerned with what is known or believed. This constitutes a philosophical motivation of the modal logic S5. From a mathematical point of view, S5 is interesting as it is a modal version of monadic predicate logic (which is the usual predicate logic equipped with the restriction that every predicate has exactly one argument). Also, the modal logic S5 is used within the areas of Artificial Intelligence and computer science.
In what follows, we shall give an account of the traditional possible-worlds semantics for S5. Formally, we need a non-empty set W together with a function V which assigns a truth-value V(w, p) to each pair consisting of an element w of W and a propositional letter p. The elements of W are to be thought of as possible worlds or possible circumstances. By induction, we extend the range of the valuation operator V to arbitrary formulas as follows:

V(w, φ ∧ ψ) iff V(w, φ) and V(w, ψ)
V(w, ¬φ) iff not V(w, φ)
V(w, □φ) iff for all w′ in W, V(w′, φ)

So V(w, φ) is to be thought of as φ being true in the world w. A formula φ is said to be S5-valid if and only if φ is S5-true in any model (W, V), that is, we have V(w, φ) for any world w in W.
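These truth clauses are directly executable. Below is a minimal Python sketch of an S5 model checker (the tuple encoding of formulas and all names are mine, not the paper's), used to confirm instances of axioms T and S5 on a small two-world model:

```python
# Formulas as nested tuples: ('p',) a letter, ('not', f), ('and', f, g),
# ('box', f).  An S5 model is a set of worlds W and a valuation
# V: (world, letter) -> bool; 'box' quantifies over all worlds (S5 clause).

def holds(W, V, w, f):
    op = f[0]
    if op == 'not':
        return not holds(W, V, w, f[1])
    if op == 'and':
        return holds(W, V, w, f[1]) and holds(W, V, w, f[2])
    if op == 'box':
        return all(holds(W, V, u, f[1]) for u in W)
    return V[(w, op)]                              # propositional letter

def valid(W, V, f):
    return all(holds(W, V, w, f) for w in W)

def implies(f, g):                                 # f => g is ¬(f ∧ ¬g)
    return ('not', ('and', f, ('not', g)))

def dia(f):                                        # ♦f is ¬□¬f
    return ('not', ('box', ('not', f)))

W = {0, 1}
V = {(0, 'p'): True, (1, 'p'): False}
p = ('p',)

print(valid(W, V, implies(('box', p), p)))              # axiom T:  True
print(valid(W, V, implies(dia(p), ('box', dia(p)))))    # axiom S5: True
```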
Given the definitions of derivability and validity in S5, it is well known that one can prove soundness and completeness, that is, a formula is derivable using the Hilbert-Frege axioms and rules for S5 if and only if it is S5-valid.
3 Graphs for Propositional Logic
In this section we shall give an account of propositional logic using Peircean graphs in linear notation. Graphs for propositional logic are defined by the grammar

s ::= p | s...s | ¬(s)

where p is a propositional letter. The number of occurrences of s in a string s...s might be zero, in which case we call the resulting graph empty. A graph can be rewritten into a formula of propositional logic by adding conjunction symbols as appropriate. We shall often blur the distinction between graphs and formulas when no confusion can occur. Given graphs φ and ψ together with a propositional letter p occurring exactly once in φ, the expression φ[ψ] denotes φ where ψ has been substituted for the occurrence of p. This notation will be used to point out one single occurrence of a graph in an enclosing graph. The ¬(...) part of a graph ¬(φ) is called a negation context. We say that a graph ψ is positively enclosed in φ[ψ] if and only if it occurs within an even number of negation contexts. Negative enclosure is defined in an analogous fashion. A graph can be written in non-linear style by drawing a box around φ instead of writing ¬(φ). A notion of derivation for propositional graphs is introduced in the definition below.

Definition 3. A list of graphs ψ1, ..., ψn constitutes a derivation of ψn from ψ1 if and only if each ψi+1 can be obtained from ψi by using one of the following rules:

Insertion Any graph may be drawn anywhere which is negatively enclosed.
Erasure Any positively enclosed graph may be erased.
Iteration A copy of any graph φ may be drawn anywhere which is not within φ provided that the only contexts crossed are negation contexts which do not enclose φ.
Deiteration Any graph which could be the result of iteration may be erased.
Double Negation A double negation context may be drawn around any graph, and a double negation context around any graph may be erased.
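To make the linear notation concrete, here is a small Python sketch (the nested-list graph encoding and the restriction of Deiteration to top-level duplicates are mine, not the paper's) of two of the rules:

```python
# A graph is a list of items; an item is a propositional letter (a string)
# or a negation context ('neg', [subgraph]).  The empty list is the empty
# graph.  Only Double Negation and a top-level case of Deiteration are
# implemented; the full rules also apply inside contexts.

def is_neg(item):
    return isinstance(item, tuple) and item[0] == 'neg'

def double_neg_elim(graph):
    """Erase outermost double negation contexts: ¬(¬(φ)) ... -> φ ..."""
    out = []
    for item in graph:
        if is_neg(item) and len(item[1]) == 1 and is_neg(item[1][0]):
            out.extend(item[1][0][1])      # splice φ in place of ¬(¬(φ))
        else:
            out.append(item)
    return out

def deiterate_top(graph, sub):
    """Erase one top-level copy of sub if another top-level copy remains."""
    if graph.count(sub) >= 2:
        out = graph[:]
        out.remove(sub)
        return out
    return graph

print(double_neg_elim([('neg', [('neg', ['p'])]), 'q']))    # ['p', 'q']
print(deiterate_top(['p', 'p', ('neg', ['q'])], 'p'))       # ['p', ('neg', ['q'])]
```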
The rules of the graph-based formulation of propositional logic have the notable property that they can be applied at the top level as well as inside graphs. This is contrary to other formulations of propositional logic, which allow only top-level applications of rules. The point here is that the only global conditions on
applying the inference rules for graphs concern the notion of sign of a graph; all the other conditions are purely local. This gives rise to the theorem below.

Theorem 1 (Cut-and-Paste). Let a list of graphs ψ1, ..., ψn be given which constitutes a derivation of ψn from ψ1. Also, assume that a graph φ[ψ1] is given in which ψ1 is positively enclosed. Then the list of graphs φ[ψ1], ..., φ[ψn] constitutes a derivation of φ[ψn] from φ[ψ1].

Proof. Induction on n.
□
This theorem and its name are taken from [16]. The justification for the name is that a derivation from the empty graph can be "cut out" and "pasted into" anywhere which is positively enclosed. In what follows, we shall give a proof that the graph-based formulation of propositional logic is equivalent to the Hilbert-Frege formulation in the sense that the same graphs/formulae are derivable.

Theorem 2. A graph is derivable from the empty graph if and only if the corresponding formula is derivable using the Hilbert-Frege style axioms and rules.

Proof. To see that a Hilbert-Frege derivable formula corresponds to a derivable graph, the following two observations suffice: Firstly, the axioms A1, A2 and A3 all correspond to derivable graphs. Secondly, given derivations of graphs φ and φ ⇒ ψ, a derivation of ψ can be constructed as follows. The Cut-and-Paste Theorem enables us to combine the derivations of the graphs φ and ¬(φ ¬(ψ)) into a derivation of the graph φ ¬(φ ¬(ψ)), on which we apply Deiteration (erasing the inner copy of φ) to get φ ¬(¬(ψ)), which by Erasure yields ¬(¬(ψ)), which by Double Negation finally yields the graph ψ. This corresponds to the Modus Ponens rule.

To see that the rules for graphs do not prove too much, the following argument suffices: The empty graph corresponds to the unit for conjunction, which is obviously valid with respect to the standard truth-functional semantics, and furthermore, the rules for derivability of graphs correspond to validity-preserving operations on formulae. Hence, the formula corresponding to a graph derivable from the empty graph is valid and therefore Hilbert-Frege derivable according to the previously mentioned completeness result. □
4 Graphs for the Modal Logic S5
In this section we shall give a graph-based version of the modal logic S5. Graphs for modal logic are defined by extending the grammar for defining graphs of propositional logic with the additional clause

s ::= ... | □(s)

The □(...) part of a graph □(φ) is called a modal context². Note that modal contexts do not matter for whether a graph is positively or negatively enclosed. We say that a graph is modally enclosed if it occurs within a modal context. In non-linear style we draw a box around φ instead of writing □(φ). A notion of derivation for modal graphs is introduced in the definition below.
² A historical remark should be made here: Peirce's modal contexts correspond to ¬((...)) in our system. This definition of modal contexts is adopted by some authors, for example those of the papers [12] and [17]. Our choice of definition for modal contexts deviates from Peirce's because we want to keep the notions of negation and necessity distinct. This is in accordance with Sowa's definition of modal contexts for conceptual graphs, [15]. It is straightforward to restate our graph-rules in terms of Peirce's modal contexts by adding negation contexts as appropriate and by defining positive and negative enclosure such that negation contexts as well as modal contexts are taken into account.
Peircean Graphs for the Modal Logic S5
261
Definition 4. A list of graphs ψ1, ..., ψn constitutes a derivation of ψn from ψ1 if and only if each ψi+1 can be obtained from ψi by using either one of the rules for propositional logic or one of the following:

Negative □-Introduction A modal context may be drawn around any negatively enclosed graph.

Positive □-Introduction A modal context may be drawn around any positively enclosed graph ψ which is not modally enclosed, provided that each propositional letter which is within a context enclosing ψ, but which is not within ψ, is modally enclosed.

Note that the condition regarding propositional letters in the rule Positive □-Introduction is vacuous if the graph ψ is not enclosed by any contexts at all. Also, note that in the rule for Iteration only negation contexts can be crossed when copying a graph (modal contexts cannot be crossed). Our choice of rules is proof-theoretically well motivated, as the rules are graph-based analogues of the Gentzen rules for S5 originally given in [11] (and also considered in [2] and elsewhere). Gentzen rules for classical logic were introduced in [6]. In Gentzen style, proof-rules are used to derive sequents

φ1, ..., φn ⊢ ψ1, ..., ψm

Derivability of such a sequent corresponds to derivability of

⊢ (φ1 ∧ ... ∧ φn) ⇒ (ψ1 ∨ ... ∨ ψm)

Note that the left hand side formulae φ1, ..., φn are negatively enclosed whereas the right hand side formulae ψ1, ..., ψm are positively enclosed. Generally, a rule in Gentzen style either introduces a formula on the left of the turnstile or it introduces a formula on the right of the turnstile (that is, a formula is introduced either as negatively or positively occurring). In the S5 case, the Gentzen rules are as follows:

□-Left If Γ, φ ⊢ ∆ then Γ, □φ ⊢ ∆.

□-Right If Γ ⊢ φ, ∆ then Γ ⊢ □φ, ∆ provided that any propositional letter occurring in Γ or ∆ is modally enclosed.
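The enclosure notions used in Definition 4 can be computed directly. A minimal sketch (the nested-tuple encoding is an assumption for illustration): polarity counts only negation contexts, while modal contexts merely record modal enclosure, exactly as stipulated above.

```python
# Sketch (not from the paper): enclosure properties of atom occurrences in
# a modal Peircean graph.  A string is a propositional letter, ("cut", [...])
# a negation context, and ("modal", [...]) a modal context.

def enclosures(graph, cuts=0, modal=False):
    """Yield (letter, polarity, modally_enclosed) for every atom occurrence."""
    if isinstance(graph, str):
        yield graph, ("positive" if cuts % 2 == 0 else "negative"), modal
        return
    kind, parts = graph
    for part in parts:
        yield from enclosures(part,
                              cuts + (1 if kind == "cut" else 0),
                              modal or kind == "modal")

# Example: the graph for "box p implies p", i.e. ( [p] (p) ) in linear cut
# notation, with [p] standing for the modal context around p:
g = ("cut", [("modal", ["p"]), ("cut", ["p"])])
print(list(enclosures(g)))
```

On this example the antecedent occurrence of p is negatively and modally enclosed, while the consequent occurrence is positively enclosed and not modally enclosed.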
Clearly, the □-Left rule corresponds to the rule Negative □-Introduction for graphs and the □-Right rule corresponds to the rule Positive □-Introduction. Note that in the rule Positive □-Introduction, the restriction that ψ must not be modally enclosed cannot be left out. The two graphs ♦q ⇒ ♦q and ♦q ⇒ ♦□q constitute a counter-example, as the first graph is derivable (as it is obviously valid) whereas the second graph is not derivable at all (as it is invalid). Here, the modal context enclosing the graph in question is negatively enclosed. Similarly, the two graphs □(q ∨ p) ⇒ □(q ∨ p) and □(q ∨ p) ⇒ □(□q ∨ p) constitute a counter-example where the modal context enclosing the graph in question is positively enclosed. It should be mentioned that an equivalent system can be obtained by replacing the Negative □-Introduction rule by the following:
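The semantic side of such counter-examples can be checked mechanically, since S5 can be evaluated over universal frames. A brute-force sketch (the encoding and the `falsifiable` helper are illustrative assumptions; searching models with up to three worlds demonstrates invalidity, e.g. of ♦q ⇒ ♦□q, but is of course only consistent with, not a proof of, validity):

```python
# Sketch (not from the paper): brute-force falsifiability over small
# S5 models.  Formulas are nested tuples; box/diamond quantify over all
# worlds, reflecting the universal accessibility relation of S5.

from itertools import product

def holds(w, f, worlds, val):
    op = f[0]
    if op == "atom":
        return f[1] in val[w]
    if op == "not":
        return not holds(w, f[1], worlds, val)
    if op == "imp":
        return (not holds(w, f[1], worlds, val)) or holds(w, f[2], worlds, val)
    if op == "box":          # true iff true in every world
        return all(holds(v, f[1], worlds, val) for v in worlds)
    if op == "dia":          # true iff true in some world
        return any(holds(v, f[1], worlds, val) for v in worlds)
    raise ValueError(op)

def falsifiable(f, letters, max_worlds=3):
    """Search small S5 models for a world falsifying f."""
    for n in range(1, max_worlds + 1):
        worlds = range(n)
        for rows in product(*(list(product([False, True], repeat=len(letters)))
                              for _ in worlds)):
            val = {w: {p for p, b in zip(letters, rows[w]) if b} for w in worlds}
            for w in worlds:
                if not holds(w, f, worlds, val):
                    return True
    return False

q = ("atom", "q")
g1 = ("imp", ("dia", q), ("dia", q))            # diamond q => diamond q
g2 = ("imp", ("dia", q), ("dia", ("box", q)))   # diamond q => diamond box q
print(falsifiable(g1, ["q"]), falsifiable(g2, ["q"]))  # False True
```

A two-world model with q true in exactly one world already falsifies the second graph.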
Positive □-Elimination A modal context around any positively enclosed graph may be erased.

The equivalence is straightforward to prove using the second modal Cut-and-Paste theorem, Theorem 4, given below.³ We clearly have to compare our system to the system for S5 given in the paper [17]. Our rule Negative □-Introduction is also a rule of this paper, but our rule Positive □-Introduction replaces three rules given there, namely graph-based analogues of the Hilbert-Frege axioms and proof-rules K, S5 and Necessitation. So, compared to the rules of [17] we have fewer rules (two instead of five). Furthermore, our rules seem more in line with the graph-rules for propositional logic⁴ and they are proof-theoretically well motivated, as is made clear in the discussion above.
³ The fact that the two rules give rise to equivalent systems leaves us with a highly interesting question: Which of the rules should we take, Negative □-Introduction or Positive □-Elimination? The first is suggested if we consider graphs to be a generalisation of Gentzen style. But the second rule is suggested if we consider graphs to be a generalisation of Natural Deduction style, where in general a rule either introduces or eliminates a formula on the right of the turnstile (that is, a formula is either introduced or eliminated as positively occurring). Note, however, that Natural Deduction lends itself towards intuitionistic logic (as the asymmetry between assumptions and conclusion in Natural Deduction proofs is reflected in the asymmetry between input and output in the Brouwer-Heyting-Kolmogorov interpretation) rather than classical logic (as the asymmetry between assumptions and conclusion in Natural Deduction is not reflected in the standard truth-functional interpretation where the truth-values, true and false, are perfectly symmetric), whereas the converse seems to be the case with graphs. On the other hand, Natural Deduction proofs correspond in a certain sense to intuitive, informal reasoning. To quote from Prawitz's classic on Natural Deduction: "The inference rules of systems of natural deduction correspond closely to procedures common in intuitive reasoning, and when informal proofs - such as are encountered in mathematics for example - are formalised within these systems, the main structure of the informal proofs can often be preserved." ([13], p. 7) Also Peircean graphs can be said to correspond to intuitive and informal reasoning. But it is not clear in which sense this can be said about Gentzen's calculus of sequents. Rather: "The calculi of sequents can be understood as meta-calculi for the deducibility relation in the corresponding systems of natural deduction." ([13], p. 90)
The answer to the question concerning whether graphs generalise Gentzen or/and Natural Deduction style may be hidden within the problem of finding an appropriate notion of reduction for graph-derivations, analogous to cut-elimination in Gentzen systems or/and normalisation in Natural Deduction systems; see [7]. We shall leave this issue to future work.
⁴ This point is clear if we consider graphs as a generalisation of Gentzen or/and Natural Deduction style rather than Hilbert-Frege style (the latter seems unnatural compared to the former).
We do not have the full Cut-and-Paste Theorem, Theorem 1, when the rules for the modal logic S5 are taken into account.⁵ However, it is possible to add restrictions in two different ways such that the theorem holds also in the modal case. In the first case we have added the restriction that ψ1 is not enclosed by any contexts at all.

Theorem 3 (Cut-and-Paste). Let a list of graphs ψ1, ..., ψn be given which constitutes a derivation of ψn from ψ1. Also, assume that a graph φ[ψ1] is given in which ψ1 is not enclosed by any contexts. Then the list of graphs φ[ψ1], ..., φ[ψn] constitutes a derivation of φ[ψn] from φ[ψ1].

Proof. Induction on n. □

In the second case we have added the restriction that the rule Positive □-Introduction is not applied in the derivation of ψn from ψ1.

Theorem 4 (Cut-and-Paste). Let a list of graphs ψ1, ..., ψn be given which constitutes a derivation of ψn from ψ1 where the rule Positive □-Introduction is not applied. Also, assume that a graph φ[ψ1] is given in which ψ1 is positively enclosed. Then the list of graphs φ[ψ1], ..., φ[ψn] constitutes a derivation of φ[ψn] from φ[ψ1].

Proof. Induction on n. □

5 The Equivalence
In this section, we shall prove that the graph-based formulation of S5 is equivalent to the traditional Hilbert-Frege formulation. The lemma below says that a graph is S5-derivable from the empty graph if the corresponding formula is Hilbert-Frege derivable.

Lemma 1. A graph is S5-derivable from the empty graph if the corresponding formula is derivable using the Hilbert-Frege style axioms and rules for S5.
⁵ In passing, we shall mention another theorem which holds for propositional logic but which does not hold for the modal logic S5 (nor for other modal logics). This is the so-called Deduction Theorem, which says that the derivability of ψ from φ implies the derivability of φ ⇒ ψ from the empty graph. Using the Cut-and-Paste Theorem, it is straightforward to prove it for propositional logic. But here is a counter-example for S5: the graph □p is derivable from the graph p, but p ⇒ □p is not derivable from the empty graph (as it is not valid). In the paper [17] it is claimed that the Deduction Theorem holds for the graph-based system for S5 proposed there (and for graph-based systems proposed for other modal logics as well). But this is not true; in fact, the counter-example just given works also for the systems of that paper. However, in the mentioned paper the Deduction Theorem does hold in the cases where it is used.
Proof. To see that a Hilbert-Frege derivable formula corresponds to a derivable graph, the following four observations suffice.

Firstly, the axioms A1, A2 and A3 all correspond to derivable graphs. This is analogous to the propositional case, Theorem 2.

Secondly, given derivations of graphs φ and φ ⇒ ψ, a derivation of ψ can be constructed. This is straightforward to show in a way analogous to the propositional case, Theorem 2, by using the first modal Cut-and-Paste theorem, Theorem 3. This corresponds to the rule Modus Ponens.

Thirdly, the axioms K, T and S5 all correspond to derivable graphs. With the aim of making clear the role of our modal rules, we shall give the derivations. By Double Negation, Insertion and Iteration, followed by Negative □-Introduction, then Positive □-Introduction, and finally Double Negation, we obtain the graph that corresponds to □(φ ⇒ ψ) ⇒ (□φ ⇒ □ψ), that is, the axiom K. By Double Negation, Insertion and Iteration, followed by Negative □-Introduction, we obtain the graph that corresponds to □φ ⇒ φ, that is, the axiom T. By Double Negation, Insertion and Iteration, followed by Positive □-Introduction, we obtain the graph that corresponds to ♦φ ⇒ □♦φ, that is, the axiom S5.

Fourthly, given a derivation of a graph φ, a derivation of □φ can be constructed by using the rule Positive □-Introduction. This corresponds to the rule Necessitation. □

The following lemma says that Negative □-Introduction preserves validity.

Lemma 2. Let a graph φ[ψ] be given. For every S5-model (W, V) and any w in W it is the case that
1. if ψ is negatively enclosed then V(w, φ[ψ]) implies V(w, φ[□ψ]), and
2. if ψ is positively enclosed then V(w, φ[□ψ]) implies V(w, φ[ψ]).

Proof. Induction on the structure of φ[ψ]. We proceed on a case by case basis where symmetric cases are omitted.
– The case where φ[ψ] = α ∧ β[ψ] for some α and β. If V(w, α ∧ β[ψ]) then V(w, α) and V(w, β[ψ]). But V(w, β[ψ]) implies V(w, β[□ψ]) by induction. Hence, V(w, α ∧ β[□ψ]).
– The case where φ[ψ] = ¬β[ψ] for some β. If V(w, ¬β[ψ]) then V(w, β[ψ]) is false. But this implies the falsity of V(w, β[□ψ]) by contraposition of the induction hypothesis. Hence, V(w, ¬β[□ψ]).
– The case where φ[ψ] = □β[ψ] for some β. If V(w, □β[ψ]) then V(w′, β[ψ]) for any w′ in W. Thus V(w′, β[□ψ]) for any w′ in W by induction. Hence, V(w, □β[□ψ]).
– The case where φ[ψ] = ψ. In this case, ψ cannot be negatively enclosed. Clearly, V(w, □ψ) implies V(w, ψ). □

With the aim of proving that Positive □-Introduction preserves validity, we shall prove a small proposition. Recall that truth in a model amounts to truth in every world of the model.

Proposition 1. Let a graph φ be given in which any propositional letter is modally enclosed. In a given S5-model (W, V), either φ is true or ¬φ is true.

Proof. Induction on the structure of φ. We proceed on a case by case basis.
– The case where φ = α ∧ β for some α and β. If ¬(α ∧ β) is not true in the model then V(w, α ∧ β) for some world w in W, and hence, also V(w, α) and V(w, β). By induction, this implies the truth of α and β in the model, and hence, the truth of α ∧ β in the model.
– The case where φ = ¬β for some β. By induction, either β is true or ¬β is true in the model.
– The case where φ = □β for some β. Clearly ok. □

The following lemma essentially says that the rule Positive □-Introduction preserves validity.

Lemma 3. Let a graph φ[ψ] be given in which ψ is positively enclosed but not modally enclosed. Furthermore, assume that any propositional letter not within ψ is modally enclosed. If φ[ψ] is true in a given S5-model then φ[□ψ] is also true in this model.

Proof. Induction on the structure of φ[ψ]. We proceed on a case by case basis.
– The case where φ[ψ] = α ∧ β[ψ] for some α and β. If α ∧ β[ψ] is true in the model then also α and β[ψ] are true. But the truth of β[ψ] implies the truth of β[□ψ] by induction. Hence, α ∧ β[□ψ] is true.
– The case where φ[ψ] = ¬(α ∧ ¬β[ψ]) for some α and β. If ¬(α ∧ ¬β[ψ]) is true in the model then either ¬α is true or β[ψ] is true according to Proposition 1. But the truth of β[ψ] implies the truth of β[□ψ] by induction. Hence, ¬(α ∧ ¬β[□ψ]) is true.
– The case where φ[ψ] = ψ. Clearly, the truth of ψ in the model implies the truth of □ψ. □

The lemma above can be generalised in the following sense.
Lemma 4. Let a graph φ[ψ] be given in which ψ is positively enclosed but not modally enclosed. Furthermore, assume that any propositional letter within a context enclosing ψ, but not within ψ, is modally enclosed. If φ[ψ] is valid then φ[□ψ] is also valid.

Proof. There are two cases. If ψ is not enclosed by any negation contexts in φ[ψ] then φ[ψ] = α ∧ ψ for some α. If α ∧ ψ is valid then α and ψ are also valid. The validity of ψ implies the validity of □ψ. Hence, α ∧ □ψ is valid. If ψ is enclosed by a non-zero number of negation contexts in φ[ψ] then φ[ψ] = α ∧ ¬β[ψ] for some α and β. If α ∧ ¬β[ψ] is valid then also α and ¬β[ψ] are valid. By Lemma 3 the validity of ¬β[ψ] implies the validity of ¬β[□ψ]. Hence, α ∧ ¬β[□ψ] is valid. □

The following lemma says that a graph is S5-derivable from the empty graph only if the corresponding formula is Hilbert-Frege derivable.

Lemma 5. A graph is S5-derivable from the empty graph only if the corresponding formula is derivable using the axioms and rules for S5.

Proof. The formula corresponding to the empty graph is the unit for conjunction, which is valid as it is true in any world of any model. The rules for derivability of graphs in propositional logic correspond to validity-preserving operations on formulae. It follows from Lemma 2 and Lemma 4 that the two rules for derivability of graphs in S5 correspond to validity-preserving operations on formulae. We conclude that the formula corresponding to any graph derivable from the empty graph is valid and therefore derivable using axioms and rules according to the previously mentioned completeness result. □

The theorem below says that the graph-based formulation of S5 is equivalent to the traditional Hilbert-Frege formulation.

Theorem 5. A graph is S5-derivable from the empty graph if and only if the corresponding formula is derivable using the axioms and rules for S5.

Proof. Lemma 1 and Lemma 5. □
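Claim 1 of Lemma 2 can be spot-checked by exhaustive search over small S5 models. A sketch (the context ¬(ψ ∧ r), in which the hole is negatively enclosed, and the evaluator are illustrative assumptions; the search is evidence for, not a proof of, the lemma):

```python
# Sketch (not from the paper): check, over all S5 models with up to three
# worlds and letters p, r, that the truth of ctx(psi) at a world implies
# the truth of ctx(box psi) there, for a negatively enclosed hole.

from itertools import product

def holds(w, f, worlds, val):
    op = f[0]
    if op == "atom":
        return f[1] in val[w]
    if op == "not":
        return not holds(w, f[1], worlds, val)
    if op == "and":
        return holds(w, f[1], worlds, val) and holds(w, f[2], worlds, val)
    if op == "box":  # S5: true iff true in every world
        return all(holds(v, f[1], worlds, val) for v in worlds)
    raise ValueError(op)

# Context with a negatively enclosed hole: not (hole and r).
ctx = lambda hole: ("not", ("and", hole, ("atom", "r")))
psi = ("atom", "p")

ok = True
for n in (1, 2, 3):
    worlds = range(n)
    for rows in product(product([False, True], repeat=2), repeat=n):
        val = {w: {p for p, b in zip(["p", "r"], rows[w]) if b} for w in worlds}
        for w in worlds:
            if holds(w, ctx(psi), worlds, val) and \
               not holds(w, ctx(("box", psi)), worlds, val):
                ok = False
print(ok)
```

No counter-model turns up, as the lemma predicts for this context.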
6 Further Work
It is natural to ask whether what we have done in this paper is also possible with S5 replaced by S4, T or K. It is obviously possible to use graph-based analogues of the Hilbert-Frege axioms and proof-rules for each of the mentioned modal logics. This is what is done in [17]. But is it possible to find proof-theoretically well motivated graph-based formulations of S4, T and K? Clearly, this is related to the possibility of finding Gentzen and Natural Deduction systems for the logics in question. Gentzen proof-rules for S4 and S5 were introduced in [10, 11] and Natural Deduction rules for these logics were given in [13]. But such formulations
have not been found for other modal logics.⁶ In Handbook of Philosophical Logic, the following remark is made on Prawitz's Natural Deduction systems for S4 and S5:

"However, it has proved difficult to extend this sort of analysis to the great multitude of other systems of modal logic. It seems fair to say that a deductive treatment congenial to modal logic is yet to be found, for Hilbert systems are not suited for actual deduction, ...." ([3], p. 27–28)

The problem of finding proof-theoretically well motivated graph-based formulations of modal logics is analogous. This suggests that S4 is amenable to a graph-based formulation along the same lines as the one for S5. But it also suggests that this is not the case for other modal logics. This deficiency calls for explanation. The handbook continues:

"The situation has given rise to various suggestions. One is that the Gentzen format, which works so well for truth-functional operators, should not be expected to work for intensional operators, which are far from truth-functional. (But then Gentzen works well for intuitionistic logic which is not truth-functional either.) Another suggestion is that the great proliferation of modal logics is an epidemy from which modal logic ought to be cured: Gentzen methods work for the important systems, and the other should be abolished. 'No wonder natural deduction does not work for unnatural systems!'" ([3], p. 28)

It is not clear to the author of this paper whether one of these suggestions provides a way out of the trouble. We shall leave it to further work.

Acknowledgements: Thanks to Peter Øhrstrøm for comments at various stages of writing this paper.
References
[1] G. Allwein and J. Barwise, editors. Logical Reasoning with Diagrams. Oxford University Press, 1996.
[2] T. Braüner. A cut-free Gentzen formulation of the modal logic S5. 12 pages. Manuscript, 1998.
[3] R. Bull and K. Segerberg. Basic modal logic. In D. Gabbay and F. Guenthner, editors, Handbook of Philosophical Logic, Vol. II, Extensions of Classical Logic, pages 1–88. D. Reidel Publishing Company, 1984.
[4] M. Fitting. Tableau methods of proof for modal logics. Notre Dame Journal of Formal Logic, 13:237–247, 1972.
⁶ It should be mentioned that many modal logics can be given formulations which more or less diverge from ordinary Gentzen systems. Notable here are the formulations of T, S4 and S5 given in [9]. Rather than sequents in the usual sense, they are based on indexed sequents, that is, sequents where each formula is indexed by a string of natural numbers. Also the Prefixed Tableau Calculus of [4, 5] should be mentioned. See the discussion in [2].
[5] M. Fitting. Basic modal logic. In D. Gabbay et al., editors, Handbook of Logic in Artificial Intelligence and Logic Programming, Vol. 1, Logical Foundations, pages 365–448. Oxford University Press, Oxford, 1993.
[6] G. Gentzen. Untersuchungen über das logische Schliessen. Mathematische Zeitschrift, 39, 1934.
[7] J.-Y. Girard, Y. Lafont, and P. Taylor. Proofs and Types. Cambridge University Press, 1989.
[8] G. E. Hughes and M. J. Cresswell. An Introduction to Modal Logic. Methuen, 1968.
[9] G. Mints. A Short Introduction to Modal Logic. CSLI, 1992.
[10] M. Ohnishi and K. Matsumoto. Gentzen method in modal calculi. Osaka Mathematical Journal, 9:113–130, 1957.
[11] M. Ohnishi and K. Matsumoto. Gentzen method in modal calculi, II. Osaka Mathematical Journal, 11:115–120, 1959.
[12] P. Øhrstrøm. C. S. Peirce and the quest for gamma graphs. In Proceedings of Fifth International Conference on Conceptual Structures, volume 1257 of LNCS. Springer-Verlag, 1997.
[13] D. Prawitz. Natural Deduction. A Proof-Theoretical Study. Almqvist and Wiksell, 1965.
[14] D. Scott, editor. Notes on the Formalisation of Logic. Sub-faculty of Philosophy, University of Oxford, 1981.
[15] J. F. Sowa. Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, Reading, 1984.
[16] J. F. Sowa. Knowledge Representation: Logical, Philosophical, and Computational Foundations. PWS Publishing Company, Boston, 1998.
[17] H. van den Berg. Modal logics for conceptual graphs. In Proceedings of First International Conference on Conceptual Structures, volume 699 of LNCS. Springer-Verlag, 1993.
[18] J. Zeman. Peirce's graphs. In Proceedings of Fifth International Conference on Conceptual Structures, volume 1257 of LNCS. Springer-Verlag, 1997.
Fuzzy Order-Sorted Logic Programming in Conceptual Graphs with a Sound and Complete Proof Procedure

Tru H. Cao and Peter N. Creasy

Department of Computer Science and Electrical Engineering, University of Queensland, Australia 4072
{tru, peter}@csee.uq.edu.au
Abstract. This paper presents fuzzy conceptual graph programs (FCGPs) as a fuzzy order-sorted logic programming system based on the structure of conceptual graphs and the approximate reasoning methodology of fuzzy logic. On one hand, it refines and completes a currently developed FCGP system that extends CGPs to deal with the pervasive vagueness and imprecision reflected in natural languages of the real world. On the other hand, it overcomes a limitation of previous wide-sense fuzzy logic programming systems by dealing with uncertainty about types of objects. FCGs are reformulated with the introduction of fuzzy concept and relation types. The syntax of FCGPs based on the new formulation of FCGs and their general declarative semantics based on the notion of ideal FCGs are defined. Then, an SLD-style proof procedure for FCGPs is developed and proved to be sound and complete with respect to their declarative semantics. The procedure selects reductants rather than clauses of an FCGP in resolution steps and involves lattice-based constraint solving, which supports more expressive queries than the previous FCGP proof procedure did. The results could also be applied to CGPs as special FCGPs and are useful for extensions adding lattice-based annotations to CGs to enhance their knowledge representation and reasoning power.
1. Introduction

It is a matter of fact that uncertainty is frequently encountered in the real world. Uncertain knowledge representation and reasoning have therefore gained growing importance in artificial intelligence research. So far, research on extensions of CGs ([29]) for dealing with uncertainty has mainly clustered into two groups. One is on application of CGs to information retrieval (e.g., [11, 27, 13]) and the other is on FCGs (e.g., [25, 33, 17]), with the latter being our current research interest. Among several theories and methodologies dealing with different kinds of uncertainty, fuzzy logic ([37]), which originated from the theory of fuzzy sets ([35]), is an essential one for representing and reasoning with vague and imprecise information, which is pervasive in the real world as reflected in natural languages. It is significant that, whilst a smooth mapping between logic and natural language has been regarded as the main motivation of CG ([31]), a methodology for computing with words has been regarded as the main contribution of fuzzy logic ([38]). Interestingly, for example, whilst quantifying words in natural languages such as many, few or most can be represented in CGs ([30]), the vagueness and imprecision of
M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 270-284, 1998. Springer-Verlag Berlin Heidelberg 1998
these words can be handled by fuzzy logic ([36]). It shows that these two logic systems, although so far developed quite separately, have a common target of natural language. Their merger then promises a powerful knowledge representation language where CG offers a structure for placing words in and fuzzy logic offers a methodology for approximate reasoning with them. Fuzzy logic programming systems can be roughly classified into two groups with respect to (w.r.t.) the narrow and the wide senses of fuzzy logic. Systems of the first group have formulas associated with real numbers in the interval [0,1] (e.g. [26, 21]). Those of the second involve computations with fuzzy sets as data in programs (e.g. [32, 2]). There are two common shortcomings of the previous systems of the second group. First, model-theoretic semantics and theorem-proving fundamentals were not established, whence the soundness and the completeness of the systems could not be proved. Second, they did not deal with type hierarchies as classical order-sorted logic programming systems did. In the fuzzy case, there is the problem of uncertainty about types of objects. Overcoming the first shortcoming, annotated fuzzy logic programs (AFLPs) have been developed as an essential formalism for fuzzy logic programming systems computing with fuzzy sets as soft data ([9, 7]). The purpose of our current work on FCGPs is two-fold. On one hand, it extends CGPs ([14, 28, 19]) to deal with vague and imprecise knowledge. On the other hand, it provides a wide-sense fuzzy logic programming system that can handle uncertainty about types of objects. The system is in the spirit of possibility theory and possibilistic logic ([12]), where fuzzy set membership functions are interpreted as possibility distributions, in contrast to probability distributions in a probability framework. For that purpose, this paper refines and completes the previous work [34, 4] on FCGPs on the following issues:

1. FCGs are reformulated with the introduction of fuzzy concept and relation types, providing unified structure and treatment for both FCGs and CGs. The syntax of FCGPs is refined on the new formulation of FCGs.
2. The general declarative semantics of FCGPs is defined on the notion of ideal FCGs, ensuring finite proofs of logical consequences of programs. The fixpoint semantics of FCGPs as the bridge between their declarative and procedural semantics are studied.
3. A sound and complete SLD-style proof procedure for FCGPs is developed. For the completeness, the procedure selects reductants rather than clauses of an FCGP in resolution steps. Also, it involves solving constraints on lattice-based fuzzy value terms, supporting more expressive queries which are possibly about not only fuzzy attribute-values but also fuzzy types.

In fact, FCGPs and CGPs can be studied in the lattice-based reasoning framework of annotated logic programs ([20, 7]), as CGPs compute with concept types and FCGPs compute with fuzzy types and fuzzy attribute-values as lattice-based data. From this point of view, the results obtained in this paper could also be applied to CGPs as special FCGPs and are useful for extensions adding lattice-based annotations to CGs to enhance their knowledge representation and reasoning power.
The paper is organized as follows. Section 2 presents a framework of fuzzy types, the new formulation of FCGs and the notion of ideal FCGs. Section 3 defines the syntax and general declarative semantics of FCGPs and studies their fixpoint semantics. More details for these two sections can be found in [8]. In Section 4, the definitions of FCGP reductants and FCGP constraints are presented. Then, the new FCGP proof procedure is developed and proved to be sound and complete w.r.t. FCGP declarative semantics. Finally, Section 5 is for conclusions and suggestions for future research.
2. FCG Formulation with Fuzzy Types
Throughout this paper, the conventional notations ∩ and ∪ are respectively used for the ordinary/fuzzy set intersection and union operators, and lub stands for the least upper bound operator of a lattice. Especially, we use ≤ι as the common notation for all orderings used in the current work, under the same roof of information ordering, whereby A ≤ι B means B is more informative, or more specific, than A. In particular, we write A ≤ι B if B is a fuzzy sub-set of A, or B is a sub-type of A. It will be clear in a specific context which ordering this common notation denotes.

2.1. Fuzzy Types

In the previous formulation of FCGs ([34, 4]) fuzzy truth-values, defined by fuzzy sets on [0,1], are used to represent the compatibility of a referent to a concept type, or referents to a relation type. The formulation of a fuzzy type as a pair of a basic type and a fuzzy truth-value was first proposed in [5], providing unified structure and treatment for both FCGs and CGs. The intended meaning of an assertion "x is of fuzzy type (t, v)" is "(x is of t) is v". For example, an assertion "John is of fuzzy type (AMERICAN-MAN, fairly true)" says "It is fairly true that John is an AMERICAN-MAN", where AMERICAN-MAN is a basic type and fairly true is the linguistic label of a fuzzy truth-value. The intuitive idea of the fuzzy sub-type ordering is similar to that of the ordinary one. That is, if τ2 is a fuzzy sub-type of τ1, then an assertion "x is of τ2" entails an assertion "x is of τ1". For example, given BIRD ≤ι EAGLE and true ≤ι very true, one has (BIRD, true) ≤ι (EAGLE, very true), on the basis that "It is very true that x is an EAGLE" entails "It is true that x is a BIRD". In [8], the notion of matchability with a mismatching degree of a fuzzy type to another is introduced, on the basis of the fuzzy sub-type partial ordering and the mismatching degree of a fuzzy set to another.
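The componentwise reading of the fuzzy sub-type ordering suggested by the BIRD/EAGLE example can be sketched as follows (the two toy order relations below are illustrative assumptions, not taken from [5] or [8]):

```python
# Toy sketch (illustrative assumptions, not the paper's lattices):
# reflexive "less informative than" relations given as pair sets.
TYPE_LE = {("BIRD", "BIRD"), ("EAGLE", "EAGLE"), ("BIRD", "EAGLE")}
TRUTH_LE = {("true", "true"), ("very true", "very true"),
            ("true", "very true")}

def fuzzy_subtype_le(a, b):
    """(t1, v1) <= (t2, v2): an assertion with b entails one with a."""
    (t1, v1), (t2, v2) = a, b
    return (t1, t2) in TYPE_LE and (v1, v2) in TRUTH_LE

print(fuzzy_subtype_le(("BIRD", "true"), ("EAGLE", "very true")))  # True
```

This reproduces the example: (BIRD, true) ≤ι (EAGLE, very true), but not conversely.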
Given two fuzzy types τ1 and τ2, the mismatching degree of τ1 to τ2, where τ1 is matchable to τ2, is a value in [0,1] and denoted by md(τ1/τ2). Then, τ1 ≤ι τ2 if and only if (iff) md(τ1/τ2) = 0. When md(τ1/τ2) ≠ 0, an assertion "x is of τ2" does not fully entail an assertion "x is of τ1", but rather 1−md(τ1/τ2) measures the relative necessity degree of "x is of τ1" given "x is of τ2". Further, to account for the fact that an object may belong to more than one (fuzzy) type, we apply the conjunctive type construction technique of [1, 10] to define conjunctive fuzzy types. A conjunctive fuzzy type is defined to be a finite set of pairwise incomparable fuzzy types. For example, {(BIRD, very true), (EAGLE, fairly
false)} is a conjunctive fuzzy type. An assertion "Object #1 is of type {(BIRD, very true), (EAGLE, fairly false)}" says "It is very true that Object #1 is a BIRD and it is fairly false that it is an EAGLE". Given two conjunctive fuzzy types T1 and T2, T1 is said to be matchable to T2 iff ∀τ1∈T1 ∃τ2∈T2: τ1 is matchable to τ2. The mismatching degree of T1 to T2 is then defined by

md(T1/T2) = Max_{τ1∈T1} Min_{τ2∈T2} {md(τ1/τ2) | τ1 is matchable to τ2}.

When md(T1/T2) = 0, T2 is said to be a conjunctive fuzzy sub-type of T1 and one writes T1 ≤ι T2. As proved in [8], the set of all conjunctive fuzzy types, defined over a basic type lattice and a fuzzy truth-value lattice, forms an upper semi-lattice under the conjunctive fuzzy sub-type partial ordering. Note that a basic type t and a fuzzy type (t, absolutely true) are conceptually equivalent. So, for the sake of expressive simplicity, we write, for instance, APPLE instead of (APPLE, absolutely true). Also, one may view a fuzzy type as a conjunctive one that contains only one element, and vice versa. For the following formulation of FCGs, we assume basic concept and relation type lattices, on which fuzzy concept and fuzzy relation types are defined. However, for simplicity, we use the term fuzzy type to mean either fuzzy concept or fuzzy relation type, when a distinction is not necessary.

2.2. FCGs and Ideal FCGs

The formulation of FCGs is refined accordingly with the introduction of fuzzy types. An FCG is defined as a conceptual graph (not necessarily connected) the nodes of which are fuzzy concepts and fuzzy relations, and the directed edges of which link the relation nodes to their neighbor concept nodes. Concept nodes are possibly joined by coreference links indicating that the concepts refer to the same individual. A fuzzy concept is either (1) a fuzzy entity concept, which consists of a conjunctive fuzzy concept type and a referent, or (2) a fuzzy attribute concept, which consists of a conjunctive fuzzy concept type, a referent and a fuzzy attribute-value defined by a fuzzy set.
A fuzzy relation consists of a conjunctive fuzzy relation type. For an FCG g, we denote the set of all concept nodes and the set of all relation nodes in g respectively by VC g and VR g. For a fuzzy attribute concept c, we denote the fuzzy attribute-value in c by aval(c). FCG projection defined in [34] is also modified with the introduction of fuzzy types. As in [34], given a projection π from an FCG u to an FCG v, 1−επ measures the relative necessity degree of u given v, where επ∈[0,1] is the mismatching degree of π. When επ = 0, the necessity degree of u given v is 1, that is, v fully entails u. We now present the notion of ideal FCGs, first introduced in [8], that are based on the notion of ideals in lattice theory ([16]) and used to define the general FCGP declarative semantics. An ideal of an upper semi-lattice L is any sub-set S of L such that (1) S is downward closed, i.e., if a∈S, b∈L and b ≤ι a then b∈S, and (2) S is closed under finite least upper bounds, i.e., if a, b∈S then lub{a, b}∈S. The set of all ideals of an upper semi-lattice forms a complete lattice under the ordinary sub-set ordering, that is, given two ideals s and t, s ≤ι t iff s is a sub-set of t. For each element a∈L, the set {x∈L | x ≤ι a} is called a principal ideal. It is an important property that, given a principal ideal p and a set of ideals J, if p ≤ι lub(J) then p ≤ι lub(F) with F being a finite sub-set of J.
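The two defining conditions of an ideal can be checked directly on a finite upper semi-lattice. A minimal sketch, using the divisors of 12 under divisibility (with lub = lcm) as a stand-in lattice; `is_ideal` and the example values are illustrative, not from the paper:

```python
from math import gcd

def lcm(a, b):
    return a * b // gcd(a, b)

def is_ideal(S, L, leq, lub):
    """Check (1) downward closure and (2) closure under binary least upper
    bounds, the two ideal conditions on a finite upper semi-lattice L."""
    S = set(S)
    downward = all(b in S for a in S for b in L if leq(b, a))
    lub_closed = all(lub(a, b) in S for a in S for b in S)
    return downward and lub_closed

# stand-in lattice: divisors of 12 ordered by divisibility
L = {1, 2, 3, 4, 6, 12}
leq = lambda a, b: b % a == 0                 # a divides b, i.e. a <= b
principal_6 = {x for x in L if 6 % x == 0}    # principal ideal of 6
print(is_ideal(principal_6, L, leq, lcm))     # True
print(is_ideal({1, 2, 3}, L, leq, lcm))       # False: lub{2, 3} = 6 is missing
```

The second set fails precisely because it is not closed under finite least upper bounds, mirroring condition (2) above.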
274
T.H. Cao and P.N. Creasy
An ideal FCG is defined like an FCG except that, fuzzy values in it are ideals of fuzzy attribute-value lattices or conjunctive fuzzy type upper semi-lattices. Given an ideal FCG g, a principal instance of g is an FCG derived from g by replacing each fuzzy value ideal in g by an element in the ideal. For a concept node c in a principal instance of g, we denote the corresponding concept node in g from which c is derived by origin(c). We write norm(g), which is called a normal ideal FCG, to denote g after being normalized, whereby no individual marker occurs in more than one concept node in norm(g) (cf. CG normal form in [14, 28]). Ideal FCG projection as the subsumption ordering over a set of ideal FCGs is defined similarly as CG projection. The difference is only that CG projection is based on basic type lattices, whilst ideal FCG projection is based on conjunctive fuzzy type ideal lattices. Given ideal FCGs u and v, we write u ≤ι v iff there exists an ideal FCG projection from u to v. Since each principal ideal can be represented by the greatest element in it, one may view an FCG as an ideal FCG whose fuzzy values are principal ideals, and vice versa. Also, a single ideal FCG can be viewed as a set of separate ideal FCGs, and vice versa.
3. Syntax and Declarative Semantics of FCGPs

3.1. FCGP Syntax

Definition 3.1 An FCGP clause is defined to be of the form if u then v, where u and v are finite FCGs; v is called the head and u the body (possibly empty) of the clause, and there may be coreference links between u and v. Some concept and relation nodes may be defined to be the firm nodes in the clause. An FCGP is a finite set of FCGP clauses.

The intended meaning of firm nodes in FCGP rules is that the firm nodes in the body of a rule require full matching for the rule to be fired, whilst the firm nodes in the head of a rule are not subject to change by mismatching degrees when the rule is fired. For the examples in this paper, we have a convention that concept and relation nodes without linguistic labels of fuzzy sets are firm nodes.

Example 3.1 The FCGP in Figure 3.1 consists of one fact saying “Apple #1 is fairly red”, and one rule saying “If an apple is red, then it is ripe”. Here, [APPLE: #1], [APPLE: *], (ATTR1) and (ATTR2) are firm nodes, and the others are not. Note that ATTR1 and ATTR2 are functional relation types, where the concept nodes linked to a functional relation node by double-lined arcs are the dependent concepts and the others are the determining concepts of the relation ([24, 6]).

[APPLE: #1]→(ATTR1)⇒[COLOR:@fairly red]

if [APPLE: *]→(ATTR1)⇒[COLOR:@red] then [APPLE: *]→(ATTR2)⇒[RIPENESS:@ripe]

Fig. 3.1. An FCGP
3.2. FCGP Interpretations and Models

There are two notions of FCGP declarative semantics: restricted and general. For the restricted semantics, an FCGP interpretation is a normal FCG, whose fuzzy values are elements of fuzzy attribute-value lattices or conjunctive fuzzy type upper semi-lattices. For the general semantics, an FCGP interpretation is a normal ideal FCG, whose fuzzy values are ideals of these lattices or upper semi-lattices. As shown in [20, 9], for logic programs computing with data based on lattices that may be infinite, the general semantics has to be used to guarantee finite proofs of logical consequences of programs.

Definition 3.2 An FCGP interpretation is a normal ideal FCG (possibly infinite).

As in [34, 9], the satisfaction relation between an FCGP interpretation and an FCGP is based on the fuzzy modus ponens model of [23]. The model is consistent with classical modus ponens, that is, when the body of a rule fully matches a fact, then the head of the rule can be derived. When the body mismatches the fact by some degree, one has a degree of indetermination in reasoning, and the conclusion should be more ambiguous and less informative than it is when there is no mismatching. It is obtained by adding the mismatching degree to fuzzy values in the head. For a fuzzy value A, A+ε represents A being overall pervaded with an indetermination degree ε∈[0,1]. If A is a fuzzy set on a domain U, the membership function of A+ε is defined by µA+ε(u) = Min{µA(u) + ε, 1}, for every u∈U. If A is a fuzzy type (t, v), then A+ε = (t, v+ε). If A is a conjunctive fuzzy type T, then A+ε = {τ+ε | τ∈T}.

Definition 3.3 Let P be an FCGP and I be an FCGP interpretation. The satisfaction relation is defined as follows:
1. I |= P iff I |= C, for every clause C in P,
2. I |= if u then v iff the existence of an FCG projection π from u to a principal instance g of I implies the existence of an ideal FCG projection π* from v+επ to I such that (1) the mismatching degree of each mapping from a firm node in u to a node in g is 0, and (2) v+επ is derived from v by adding επ to fuzzy values in all concept and relation nodes that are not firm nodes in v, and (3) for every c∈VC u, c*∈VC v, if coref{c, c*} then π*c* = origin(πc).

I is a model of P iff I |= P. A program Q is said to be a logical consequence of a program P iff, for every FCGP interpretation I, if I |= P then I |= Q.

3.3. FCGP Fixpoint Semantics

As in [22, 20, 14, 9], each FCGP P is associated with an interpretation mapping TP, which provides the link between the declarative and procedural semantics of P. The upward iteration of TP is then defined with TP↑0 being the empty ideal FCG.

Definition 3.4 Let P be an FCGP and I be an FCGP interpretation. Then TP(I) is defined to be norm(SP(I)∪I) where SP(I) = {v+επ | if u then v is a clause in P and π is an FCG projection from u to a principal instance g of I such that (1) the mismatching
degree of each mapping from a firm node in u to a node in g is 0, and (2) v+επ is derived from v by adding επ to fuzzy values in all concept and relation nodes that are not firm nodes in v, and (3) for every c∈VC u , c*∈VC v , if coref{c, c*} then coref{c*, origin(πc)}}. Example 3.2 Let P be the program in Example 3.1 and, for calculation illustration, suppose the following relations between fuzzy sets denoted by linguistic labels: fairly red = red+ε ≤ι red, fairly ripe = ripe+ε ≤ι ripe which imply md(red / fairly red) = md(ripe / fairly ripe) = ε. Here, given two fuzzy sets A and A* on a domain U, md(A / A*) = SupU{Max{µA*(u) − µA(u), 0}} denotes the mismatching degree of A to A* ([34]). Then one has: TP↑ω = lub{TP↑n | n∈N} = TP↑2 = [APPLE: #1]→(ATTR1)⇒[COLOR:@fairly red] →(ATTR2)⇒[RIPENESS:@fairly ripe]. Theorem 3.1 ([8]) Let P be an FCGP. Then TP↑ω is the least model of P. The significance of Theorem 3.1 is that it ensures a finite sound and complete mechanical proof procedure for FCGPs. Indeed, if g is a finite FCG and g ≤ι TP↑n for some n∈N, then g ≤ι TP↑ω ≤ι I for every model I of P, which means g is a logical consequence of P. On the other hand, if g is a logical consequence of P, then g must be satisfied by TP↑ω as a model of P, i.e., g ≤ι TP↑ω = lub{TP↑n | n∈N}, whence g ≤ι TP↑n for some n∈N, due to g being finite and fuzzy values in g being principal ideals (cf. [22, 20, 14, 9]).
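The pervasion operation µA+ε(u) = Min{µA(u) + ε, 1} and the mismatching degree md(A/A*) used in Example 3.2 can be sketched on discretized fuzzy sets; the membership values below for 'red' are hypothetical:

```python
def pervade(A, eps):
    """A+eps: pervade fuzzy set A (a dict u -> membership) with an
    indetermination degree eps, capping memberships at 1."""
    return {u: min(m + eps, 1.0) for u, m in A.items()}

def md(A, A_star):
    """md(A / A*) = Sup over u of max(mu_A*(u) - mu_A(u), 0)."""
    return max(max(A_star[u] - A[u], 0.0) for u in A)

# toy discretization of 'red' over four domain points (hypothetical values)
red = {0: 1.0, 1: 0.8, 2: 0.3, 3: 0.0}
fairly_red = pervade(red, 0.2)     # 'fairly red' = red + 0.2
print(md(red, fairly_red))         # 0.2: md(red / fairly red) = eps
print(md(fairly_red, red))         # 0.0: fairly red <= red in the entailment order
```

This reproduces the relation fairly red = red+ε ≤ι red assumed in Example 3.2, with ε = 0.2 in this toy discretization.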
4. Procedural Semantics of FCGPs

4.1. FCGP Reductants

Definition 4.1 An FCGP annotation term is recursively defined to be of one of the following forms:
1. A fuzzy value constant, which is a fuzzy attribute-value or a conjunctive fuzzy type, or
2. A+ξ, where A is a fuzzy value constant and ξ is a variable whose value is a real number in [0,1], or
3. A fuzzy value variable, whose value is a fuzzy attribute-value or a conjunctive fuzzy type, or
4. f(τ1, τ2, ..., τm), where each τi (1 ≤ i ≤ m) is an FCGP annotation term and f is a computable ([18]) and monotonic (f(τ1, τ2, ..., τm) ≤ι f(τ'1, τ'2, ..., τ'm) if τi ≤ι τ'i for every i from 1 to m) function from L1 × L2 × ... × Lm to L, with L and the Li's being lattices of fuzzy attribute-values or upper semi-lattices of conjunctive fuzzy types.

FCGP annotation terms of the first three forms are called simple FCGP annotation terms.
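A possible concrete encoding of these annotation terms, assuming fuzzy values are collapsed to numbers in [0,1] so that lub is simply max; the tuple tags and `eval_term` are illustrative names, not the paper's notation:

```python
# term forms: ('const', value), ('pervade', value, var) for A+xi,
# ('var', name), and ('fun', f, [subterms]) with f computable and monotonic
def eval_term(term, subst):
    """Evaluate an annotation term under a substitution for its
    annotation variables."""
    tag = term[0]
    if tag == 'const':
        return term[1]
    if tag == 'pervade':                           # A + xi, capped at 1
        return min(term[1] + subst[term[2]], 1.0)
    if tag == 'var':
        return subst[term[1]]
    if tag == 'fun':
        return term[1](*(eval_term(t, subst) for t in term[2]))
    raise ValueError(tag)

lub = max    # lub on the chain [0,1] is max: computable and monotonic
t = ('fun', lub, [('pervade', 0.25, 'xi'), ('var', 'X')])
print(eval_term(t, {'xi': 0.25, 'X': 0.2}))        # 0.5 = lub{0.25+0.25, 0.2}
```

In the paper proper, the values would be fuzzy sets or conjunctive fuzzy types rather than reals; only the recursive term structure is the point here.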
We denote fuzzy value variables by X, Y, ..., and real number variables by ξ, ψ, ..., which all are called annotation variables to be distinguished from individual variables denoted by x, y, ... . An expression without annotation variables is called annotation variable-free.

Definition 4.2 Let P be an FCGP and C1, C2, ..., Cm be different clauses in P, where each Ck (1 ≤ k ≤ m) is of the form if uk then vk. Suppose that some concept nodes in v1, v2, ..., vm can be joined by a coreference partition operator ϖ. Then the clause: if ϖ[u1+ξ1 u2+ξ2 ... um+ξm] then norm(ϖ[v1+ξ1 v2+ξ2 ... vm+ξm]) is called a reductant of P, where each uk+ξk (or vk+ξk) is derived from uk (or vk) by adding ξk to fuzzy values in all concept and relation nodes that are not firm nodes in uk (or vk).

Note that, in Definition 4.2, each real number variable ξk represents an unknown mismatching degree of the body of clause Ck to some fact, for Ck taking part in the reductant; ξk = 0 if the body of Ck is empty. This, on the other hand, corresponds to a tolerance degree for the head of Ck in backward chaining in [4]. Moreover, if ϖ(uk+ξk) then norm(ϖ(vk+ξk)) is also an FCGP reductant, which is constructed from only Ck; in this case, ϖ corresponds to a coreference partitioning in [28] on cut-point concept nodes of vk in a unification with a goal.

Example 4.1 Figure 4.1 illustrates a reductant constructed from the two rules of the FCGP P. The first rule says “If the demand on a product is not high, then its price is not expensive”. The second rule says “If the demand on a product is not low, then its price is not cheap”. The fact says “The demand on product #2 is normal and its quality is quite good”.

program P:
if [PRODUCT: *]→(ATTR1)⇒[DEMAND:@not high] then [PRODUCT: *]→(ATTR2)⇒[PRICE:@not expensive]
if [PRODUCT: *]→(ATTR1)⇒[DEMAND:@not low] then [PRODUCT: *]→(ATTR2)⇒[PRICE:@not cheap]
[PRODUCT: #2]→(ATTR1)⇒[DEMAND:@normal] →(ATTR3)⇒[QUALITY:@quite good]

a reductant of P:
if [PRODUCT: *]→(ATTR1)⇒[DEMAND:@lub{not high+ξ, not low+ψ}] then [PRODUCT: *]→(ATTR2)⇒[PRICE:@lub{not expensive+ξ, not cheap+ψ}]

Fig. 4.1. A reductant of an FCGP
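At an abstract level, the reductant construction joins the clauses' bodies and heads while attaching a fresh tolerance variable to each clause and combining joined annotations under lub{...}. A schematic sketch reproducing the annotations of Example 4.1; the dict encoding and node names are hypothetical, not the graph operations themselves:

```python
def reductant(rules):
    """rules: list of (body, head) pairs, each a dict mapping a joined
    (non-firm) node name to its annotation string. Adds a fresh tolerance
    variable xi_k per clause and combines joined annotations under lub."""
    body, head = {}, {}
    for k, (u, v) in enumerate(rules, start=1):
        xi = f"xi{k}"
        for node, a in u.items():
            body.setdefault(node, []).append(f"{a}+{xi}")
        for node, a in v.items():
            head.setdefault(node, []).append(f"{a}+{xi}")
    fmt = lambda ts: ts[0] if len(ts) == 1 else "lub{" + ", ".join(ts) + "}"
    return ({n: fmt(ts) for n, ts in body.items()},
            {n: fmt(ts) for n, ts in head.items()})

r1 = ({"DEMAND": "not high"}, {"PRICE": "not expensive"})
r2 = ({"DEMAND": "not low"}, {"PRICE": "not cheap"})
body, head = reductant([r1, r2])
print(body["DEMAND"])   # lub{not high+xi1, not low+xi2}
print(head["PRICE"])    # lub{not expensive+xi1, not cheap+xi2}
```

With a single rule the construction degenerates to pervading that rule with one tolerance variable, matching the single-clause case noted after Definition 4.2.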
Property 4.1 Let P be an FCGP. Any annotation variable-free instance, obtained by a substitution for real number variables, of a reductant of P is a logical consequence of P.

Proof. The proof is on the basis of FCGP declarative semantics and is similar to the proof for the analogous property of AFLP reductants (Property 4.1 in [7]).

4.2. FCGP Constraints

Definition 4.3 An FCGP constraint is defined to be of the form: σ1 ≤ι φ1 & σ2 ≤ι φ2 & ... & σm ≤ι φm where, for each i from 1 to m, σi and φi are two FCGP annotation terms evaluated to fuzzy values of the same domain. The constraint is said to be normal iff (1) for each i from 1 to m, σi is a simple FCGP annotation term and, if σi contains a variable, then this variable does not occur in φ1, φ2, ..., φi, and (2) for every pair σi and σj (i ≠ j), σi and σj are not the same conjunctive fuzzy type variable.

Definition 4.4 A solution for an FCGP constraint C is a substitution ϕ for annotation variables in C such that every annotation variable-free instance of ϕC holds. An FCGP constraint is said to be solvable iff there is an algorithm to decide whether the constraint has a solution or not, and to identify a solution if it exists.

Note that, in Definition 4.4, ϕ does not necessarily contain a binding for every annotation variable. Also, a constraint having a solution is not necessarily solvable, because there may not exist an algorithm to identify what a solution is. On the other hand, a solvable constraint may not have a solution.

Property 4.2 Any normal FCGP constraint is solvable.

Proof. The proof is similar to the proof for the solvability of normal AFLP constraints (Property 4.2 in [7]). The algorithm for testing satisfiability of a normal FCGP/AFLP constraint is adapted from [20] for constraints on fuzzy value terms.

Example 4.2 The constraint: (X ≤ι quite good) & (not high+ξ ≤ι normal) & (not low+ψ ≤ι normal) & (moderate ≤ι lub{not expensive+ξ, not cheap+ψ}) is a normal FCGP constraint.
The function lub on fuzzy sets is their intersection, which is computable and monotonic. Suppose the relations between the fuzzy sets denoted by the linguistic labels in the constraint are as follows: normal = not high∩not low = lub{not high, not low}, moderate = not expensive∩not cheap = lub{not expensive, not cheap}. Applying the algorithm presented in [7], one obtains X = quite good, ξ = md(not high / normal) = 0 and ψ = md(not low / normal) = 0, which satisfy the constraint. So, a solution for the constraint is {X/quite good, ξ/0, ψ/0}.
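The solving of this constraint can be sketched on discretized fuzzy sets: each tolerance variable is bound to the least value satisfying its inequality, which is the mismatching degree of its left-hand side to its right-hand side. The `solve` helper and the membership values below are illustrative assumptions, not the algorithm of [7]:

```python
def md(A, A_star):
    """md(A / A*) = Sup over u of max(mu_A*(u) - mu_A(u), 0)
    on a discretized domain (dicts u -> membership)."""
    return max(max(A_star[u] - A[u], 0.0) for u in A)

def solve(constraints, sets):
    """Bind each tolerance variable in an inequality 'lhs+var <= rhs' to the
    least value satisfying it, namely md(lhs / rhs); an inequality without
    a variable must already hold (md = 0)."""
    subst = {}
    for lhs, var, rhs in constraints:
        eps = md(sets[lhs], sets[rhs])
        if var is None and eps > 0:
            return None                     # unsatisfiable inequality
        if var is not None:
            subst[var] = eps
    return subst

# toy discretizations on a 4-point demand axis (hypothetical values)
sets = {
    "not high": {0: 1.0, 1: 1.0, 2: 0.5, 3: 0.0},
    "not low":  {0: 0.0, 1: 0.5, 2: 1.0, 3: 1.0},
}
# normal = not high AND not low = lub{not high, not low} (pointwise min)
sets["normal"] = {u: min(sets["not high"][u], sets["not low"][u])
                  for u in sets["not high"]}

cons = [("not high", "xi", "normal"), ("not low", "psi", "normal")]
print(solve(cons, sets))    # {'xi': 0.0, 'psi': 0.0}
```

This reproduces the bindings ξ = ψ = 0 obtained in Example 4.2 under the assumed relation normal = lub{not high, not low}.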
4.3. FCGP Proof Procedure

Definition 4.5 An FCGP goal G is defined to be of the form QG || CG, where QG is the query part, which is a finite FCG whose fuzzy values are simple FCGP annotation terms or of the form lub{σ1, σ2, ..., σm} where the σi's are simple FCGP annotation terms, and CG is the constraint part, which is an FCGP constraint. The goal is said to be normal iff (1) CG is a normal FCGP constraint and (2) for a conjunctive fuzzy type variable X, there is at most one occurrence of X in QG or in an inequality X ≤ι φ in CG (there is no such restriction on occurrences of X in the right-hand sides of the inequalities in CG).

Definition 4.6 Let P be an FCGP and G be an FCGP goal. An answer for G w.r.t. P is a triple <ρ, ϖ, ϕ> where ρ and ϖ are respectively a referent specialization operator and a coreference partition operator on G, and ϕ is a substitution for annotation variables in G. The answer is said to be correct iff ϕ is a solution for CG and every annotation variable-free instance of ρϖϕQG is a logical consequence of P.

The following definition of FCG unification is modified from the one in [4] with the introduction of FCGP annotation terms and constraints on them. As defined in [4], a VAR generic marker is one whose concept occurs in an FCGP query, or in the body of an FCGP rule, or in the head of an FCGP rule and coreferenced with a concept in the body of the rule; a NON-VAR generic marker is one whose concept occurs in an FCGP fact, or in the head of an FCGP rule but is not coreferenced with any concept in the body of the rule.

Definition 4.7 Let u and v be finite normal FCGs. An FCG unification from u to v is a mapping θ: u → v such that:
1. ∀c∈VC u: type(c) is matchable to type(θc) and referent-unified(c, θc), and
2. ∀r∈VR u: type(r) is matchable to type(θr), and ∀i∈{1, 2, ..., arity(r)}: neighbor(θr, i) = θneighbor(r, i), and
3. No VAR generic marker is unified with different individual markers or non-coreferenced NON-VAR generic markers.
The constraint produced by θ is denoted by Cθ and defined by the set {aval(c) ≤ι aval(θc) | c∈VC u and c is a fuzzy attribute concept}∪{type(c) ≤ι type(θc) | c∈VC u }∪{type(r) ≤ι type(θr) | r∈VR u }. The referent specialization operator, the coreference partition operator and the resolution operator defined by θ ([4]) are respectively denoted by ρθ, ϖθ and δθ. Note that, in Definition 4.7, if aval(c) is of the form lub{σ1, σ2, ..., σm}, where σi’s are simple FCGP annotation terms, then the constraint lub{σ1, σ2, ..., σm} ≤ι aval(θc) is equivalent to the constraint σ1 ≤ι φ & σ2 ≤ι φ & ... & σm ≤ι φ where φ = aval(θc). This also applies to type(c) and type(r).
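The production of Cθ, including this expansion of a lub{...} left-hand side into a conjunction of inequalities, can be sketched as follows; the data encoding is hypothetical:

```python
def constraint_of(theta, aval):
    """Collect the pairs (sigma, phi), each read as sigma <= phi, produced
    by a unification theta (a dict: query node -> target node); an annotation
    is a string or ('lub', [terms]), and a lub on the left expands into one
    inequality per term."""
    C = []
    for c, tc in theta.items():
        left, right = aval[c], aval[tc]
        terms = left[1] if isinstance(left, tuple) and left[0] == 'lub' else [left]
        C.extend((t, right) for t in terms)
    return C

theta = {"g0": "v0"}
aval = {"g0": ('lub', ["not high+xi", "not low+psi"]), "v0": "normal"}
print(constraint_of(theta, aval))
# [('not high+xi', 'normal'), ('not low+psi', 'normal')]
```

The expanded pairs correspond to the two middle inequalities of the constraint in Example 4.2.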
Definition 4.8 Let G be an FCGP goal QG || CG and C be an FCGP reductant if u then v (G and C have no variables in common). Suppose that there exists an FCG unification θ from a normalized sub-graph g of QG to v. Then, the corresponding resolvent of G and C is a new FCGP goal denoted by Rθ(G,C) and defined to be ρθϖθ[δθQG ∪ u] || Cθ & CG, where δθ deletes g from QG.

Property 4.3 Let G and C be respectively an FCGP goal and an FCGP reductant. If G is a normal FCGP goal, then any resolvent of G and C is also a normal FCGP goal.

Proof. The proof is similar to the proofs for the analogous properties of annotated logic programs (Lemma 2 in [20]) and AFLPs (Property 4.3 in [7]). Note that the order of the inequalities in Cθ is not significant, but the order of Cθ and CG in Definition 4.8 is.

Definition 4.9 Let P be an FCGP and G be an FCGP goal. A refutation of G and P is a finite sequence G
C3 = C2 θ3: QG2 → C3, ρθ3 = {}, ϖθ3 = {} G3 = || (X ≤ι quite good) & (lub{not high+ξ, not low+ψ} ≤ι normal) & (moderate ≤ι lub{not expensive+ξ, not cheap+ψ}) = || (X ≤ι quite good) & (not high+ξ ≤ι normal) & (not low+ψ ≤ι normal) & (moderate ≤ι lub{not expensive+ξ, not cheap+ψ}) As in Example 4.2, one has X = quite good and ξ = ψ = 0 as a solution for the constraint above, whence the corresponding answer for G w.r.t. P is <{([PRODUCT:*x], #2)}, {}, {X/quite good}>. Note that, if a clause rather than a reductant of P were used to resolve g0, there would not be a refutation of G and P because, generally, neither moderate ≤ι not expensive nor moderate ≤ι not cheap holds. Also, early type resolution ([4]) is applied when g1 is deleted from QG1. The following theorems state the soundness and the completeness of this FCGP proof procedure. Due to space limitation we omit the proofs for these theorems, which were presented in the submitted version of this paper. Theorem 4.4 (Soundness) Let P be an FCGP and G be an FCGP goal. If G
and the head of a rule with two different concept types. These rules realize a close coupling between the concept type hierarchy and the axiomatic part of a knowledge base ([3, 4]). A backward chaining proof procedure based on clause selection like the ones in [15, 28, 4] is not complete when such rules are present or facts are not in CG normal form. For example, it cannot satisfy the goal [EMPLOYEE-STUDENT: John] where EMPLOYEE-STUDENT = lub{EMPLOYEE, STUDENT}, because it does not combine rule-defined concept types of an individual, which are EMPLOYEE and STUDENT in this case. Meanwhile, CG normal form is required to combine fact-defined concept types of an individual. Such combinations are inherent in a forward chaining method (cf. [28, 19]). Actually, the significance of FCGP reductants is to combine concept types as well as other lattice-based data for backward chaining. In this example, a reductant of the CGP is: if [PERSON: *x]→(WORK_FOR)→[PERSON: *] then [EMPLOYEE-STUDENT: *x] →(ATTEND)→[UNIVERSITY: *] which together with the first fact resolve the goal [EMPLOYEE-STUDENT: John]. Moreover, in the light of annotated logic programming ([20, 7]), types have been viewed as annotations which can also be queried about. This then reveals an advantage of CG notation that makes this view possible, which has not been addressed in the previous works on CGPs/FCGPs ([15, 28, 4, 19]). With classical first-order logic notation, this view is hindered due to types being encoded in predicate symbols, whence queries about sorts of objects or relations among objects have not been thought about (cf. [1, 3]). For example, the following CG query asks “What is John and what is John’s relation with Mary?”: [X: John]→(Y)→[PERSON: Mary] where X is a concept type variable and Y is a relation type variable. 
Applying the presented FCGP proof procedure, one obtains X = EMPLOYEE-STUDENT and Y = BROTHER_OF, which say “John is an employee and a student, and John is a brother of Mary”, as the most informative answer w.r.t. the given CGP.
5. Conclusions

The syntax of FCGPs with the introduction of fuzzy types has been presented, providing unified structure and treatment for both FCGPs and CGPs. The general declarative semantics of FCGPs based on the notion of ideal FCGs has been defined, ensuring finite proofs of logical consequences of programs. The fixpoint semantics of FCGPs has been studied as the bridge between their declarative and procedural semantics. Then, a new SLD-style proof procedure for FCGPs has been developed and proved to be sound and complete w.r.t. their declarative semantics. The two main new points in the presented FCGP proof procedure are that it selects reductants rather than clauses of an FCGP in resolution steps and involves solving constraints on fuzzy value terms. As has been analysed, a CGP/FCGP SLD-style proof procedure based on clause selection is not generally complete. The constraint solving supports more expressive queries, which are possibly about not only fuzzy attribute-values but also fuzzy types. Since a CGP can be considered as a special FCGP, the results obtained here for
FCGPs could be applied to CGP systems. They could also be useful for any extension that adds to CGs lattice-based annotations to enhance their knowledge representation and reasoning power. The presented FCGP system, on one hand, extends CGPs to deal with vague and imprecise information pervading the real world as reflected in natural languages. On the other hand, to our knowledge, it is the first fuzzy order-sorted logic programming system for handling uncertainty about types of objects. When only fuzzy sets of special cases are involved, FCGPs could become possibilistic CGPs, where concept and relation nodes in a CG are weighted by only values in [0,1] interpreted as necessity degrees. They are less expressive than general FCGPs but have simpler computation and are still very useful for CG-based systems dealing with uncertainty. Besides, FCGPs could be extended further to represent and reason with other kinds of uncertain knowledge, such as imprecise temporal information or vague generalized quantifiers. These are among the topics that are currently being investigated.

Acknowledgment. We would like to thank Marie-Laure Mugnier and the anonymous referees for the comments that helped us to revise the paper for its readability.

References

1. Aït-Kaci, H. & Nasr, R. (1986), Login: A Logic Programming Language with Built-In Inheritance. J. of Logic Programming, 3: 185-215.
2. Baldwin, J.F. & Martin, T.P. & Pilsworth, B.W. (1995), Fril - Fuzzy and Evidential Reasoning in Artificial Intelligence. John Wiley & Sons, New York.
3. Beierle, C. & Hedtstuck, U. & Pletat, U. & Schmitt, P.H. & Siekmann, J. (1992), An Order-Sorted Logic for Knowledge Representation Systems. J. of Artificial Intelligence, 55: 149-191.
4. Cao, T.H. & Creasy, P.N. & Wuwongse, V. (1997), Fuzzy Unification and Resolution Proof Procedure for Fuzzy Conceptual Graph Programs. In Lukose, D. et al. (Eds.): Conceptual Structures - Fulfilling Peirce's Dream, LNAI No. 1257, Springer-Verlag, pp. 386-400.
5. Cao, T.H. & Creasy, P.N. & Wuwongse, V. (1997), Fuzzy Types and Their Lattices. In Proc. of the 6th IEEE International Conference on Fuzzy Systems, pp. 805-812.
6. Cao, T.H. & Creasy, P.N. (1997), Universal Marker and Functional Relation: Semantics and Operations. In Lukose, D. et al. (Eds.): Conceptual Structures - Fulfilling Peirce's Dream, LNAI No. 1257, Springer-Verlag, pp. 416-430.
7. Cao, T.H. (1997), Annotated Fuzzy Logic Programs. Int. J. of Fuzzy Sets and Systems. To appear.
8. Cao, T.H. & Creasy, P.N. (1997), Fuzzy Conceptual Graph Programs and Their Fixpoint Semantics. Tech. Report No. 424, Department of CS&EE, University of Queensland.
9. Cao, T.H. (1998), Annotated Fuzzy Logic Programs for Soft Computing. In Proc. of the 2nd International Conference on Computational Intelligence and Multimedia Applications, World Scientific, pp. 459-464.
10. Carpenter, B. (1992), The Logic of Typed Feature Structures with Applications to Unification Grammars, Logic Programs and Constraint Resolution. Cambridge University Press.
11. Chevallet, J-P. (1992), Un Modèle Logique de Recherche d'Informations Appliqué au Formalisme des Graphes Conceptuels. Le Prototype ELEN et Son Expérimentation sur un Corpus de Composants Logiciels. PhD Thesis, Université Joseph Fourier.
12. Dubois, D. & Lang, J. & Prade, H. (1994), Possibilistic Logic. In Gabbay, D.M. et al. (Eds.): Handbook of Logic in Artificial Intelligence and Logic Programming, Vol. 3, Oxford University Press, pp. 439-514.
13. Genest, D. & Chein, M. (1997), An Experiment in Document Retrieval Using Conceptual Graphs. In Lukose, D. et al. (Eds.): Conceptual Structures - Fulfilling Peirce's Dream, LNAI No. 1257, Springer-Verlag, pp. 489-504.
14. Ghosh, B.C. & Wuwongse, V. (1995), Conceptual Graph Programs and Their Declarative Semantics. IEICE Trans. on Information and Systems, Vol. E78-D, No. 9, pp. 1208-1217.
15. Ghosh, B.C. (1996), Conceptual Graph Language - A Language of Logic and Information in Conceptual Structures. PhD Thesis, Asian Institute of Technology.
16. Grätzer, G. (1978), General Lattice Theory. Academic Press, New York.
17. Ho, K.H.L. (1994), Learning Fuzzy Concepts By Examples with Fuzzy Conceptual Graphs. In Proc. of the 1st Australian Conceptual Structures Workshop.
18. Hopcroft, J.E. & Ullman, J.D. (1979), Introduction to Automata Theory, Languages, and Computation. Addison-Wesley, Massachusetts.
19. Kerdiles, G. & Salvat, E. (1997), A Sound and Complete CG Proof Procedure Combining Projections with Analytic Tableaux. In Lukose, D. et al. (Eds.): Conceptual Structures - Fulfilling Peirce's Dream, LNAI No. 1257, Springer-Verlag, pp. 371-385.
20. Kifer, M. & Subrahmanian, V.S. (1992), Theory of Generalized Annotated Logic Programming and Its Applications. J. of Logic Programming, 12: 335-367.
21. Klawonn, F. (1995), Prolog Extensions to Many-Valued Logics. In Höhle, U. & Klement, E.P. (Eds.): Non-Classical Logics and Their Applications to Fuzzy Subsets, Kluwer Academic Publishers, Dordrecht, pp. 271-289.
22. Lloyd, J.W. (1987), Foundations of Logic Programming. Springer-Verlag, Berlin.
23. Magrez, P. & Smets, P. (1989), Fuzzy Modus Ponens: A New Model Suitable for Applications in Knowledge-Based Systems. Int. J. of Intelligent Systems, 4: 181-200.
24. Mineau, G.W. (1994), Views, Mappings and Functions: Essential Definitions to the Conceptual Graph Theory. In Tepfenhart, W.M. & Dick, J.P. & Sowa, J.F. (Eds.): Conceptual Structures - Current Practices, LNAI No. 835, Springer-Verlag, pp. 160-174.
25. Morton, S. (1987), Conceptual Graphs and Fuzziness in Artificial Intelligence. PhD Thesis, University of Bristol.
26. Mukaidono, M. & Shen, Z. & Ding, L. (1989), Fundamentals of Fuzzy Prolog. Int. J. of Approximate Reasoning, 3: 179-194.
27. Myaeng, S.H. & Khoo, C. (1993), On Uncertainty Handling in Plausible Reasoning with Conceptual Graphs. In Pfeiffer, H.D. & Nagle, T.E. (Eds.): Conceptual Structures - Theory and Implementation, LNAI No. 754, Springer-Verlag, pp. 137-147.
28. Salvat, E. & Mugnier, M.L. (1996), Sound and Complete Forward and Backward Chainings of Graph Rules. In Eklund, P.W. & Ellis, G. & Mann, G. (Eds.): Conceptual Structures - Knowledge Representation as Interlingua, LNAI No. 1115, Springer-Verlag, pp. 248-262.
29. Sowa, J.F. (1984), Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, Massachusetts.
30. Sowa, J.F. (1991), Towards the Expressive Power of Natural Languages. In Sowa, J.F. (Ed.): Principles of Semantic Networks - Explorations in the Representation of Knowledge, Morgan Kaufmann Publishers, San Mateo, CA, pp. 157-189.
31. Sowa, J.F. (1997), Matching Logical Structure to Linguistic Structure. In Houser, N. & Roberts, D.D. & Van Evra, J. (Eds.): Studies in the Logic of Charles Sanders Peirce, Indiana University Press, pp. 418-444.
32. Umano, M. (1987), Fuzzy Set Prolog. In Preprints of the 2nd International Fuzzy Systems Association Congress, pp. 750-753.
33. Wuwongse, V. & Manzano, M. (1993), Fuzzy Conceptual Graphs. In Mineau, G.W. & Moulin, B. & Sowa, J.F. (Eds.): Conceptual Graphs for Knowledge Representation, LNAI No. 699, Springer-Verlag, pp. 430-449.
34. Wuwongse, V. & Cao, T.H. (1996), Towards Fuzzy Conceptual Graph Programs. In Eklund, P.W. & Ellis, G. & Mann, G. (Eds.): Conceptual Structures - Knowledge Representation as Interlingua, LNAI No. 1115, Springer-Verlag, pp. 263-276.
35. Zadeh, L.A. (1965), Fuzzy Sets. J. of Information and Control, 8: 338-353.
36. Zadeh, L.A. (1978), PRUF - A Meaning Representation Language for Natural Languages. Int. J. of Man-Machine Studies, 10: 395-460.
37. Zadeh, L.A. (1990), The Birth and Evolution of Fuzzy Logic. Int. J. of General Systems, 17: 95-105.
38. Zadeh, L.A. (1996), Fuzzy Logic = Computing with Words. IEEE Trans. on Fuzzy Systems, 4: 103-111.
Knowledge Querying in the Conceptual Graph Model: The RAP Module

Olivier Guinaldo (1) and Ollivier Haemmerlé (2)

(1) LIMOS – U. d'Auvergne, IUT de Clermont-Ferrand, BP 86, F-63172 Aubière Cedex, France. [email protected]
(2) INA-PG, Département OMIP, 16, rue Claude Bernard, F-75231 Paris Cedex 05, France. [email protected]
Abstract. The projection operation can be used to query a CG knowledge base by searching all the specializations of a particular CG – the question. But in some cases it may not be enough, particularly when the knowledge allowing an answer is distributed among several graphs belonging to the fact base. We define two operating mechanisms of knowledge querying that work by means of graph operations. Both these mechanisms are equivalent and logically based. The first one modifies the knowledge base, while the second modifies the question. The Rap module is an implementation of the latter algorithm on CoGITo.
1 Introduction
The specialization relation, which can be computed by means of the projection operation, is the basis for the reasoning in the CG model. It expresses that one graph contains more specific knowledge than another. One of the strong points of the CG model is that there exist sound and complete logical semantics, which means that the graphical reasoning is equivalent to the logical deduction upon the logical formulae associated with the graphs [1, 2]. In this paper, we consider a knowledge-based system in which the fact base and the query are represented in terms of CGs. An important step is to provide methods allowing one to query such a fact base. The projection operation can be used for such a search, but the problem is that the knowledge allowing one to answer a query can be distributed among several graphs of the knowledge base, in which case no projection can be made. We propose two algorithms designed to avoid that inconvenience. These algorithms are based on those of the Rock system [3, 4]. In the Rock system we used some heuristics in order to limit the combinatorial explosion. Following our work on the management of large knowledge bases, we implemented these algorithms and showed that they were equivalent to logical deduction [5, 6]. This paper results from these works, which were never published in an international venue.

M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 287-294, 1998. Springer-Verlag Berlin Heidelberg 1998
O. Guinaldo and O. Haemmerlé
We first propose a reasoning algorithm which is sound and complete with respect to logical deduction. This algorithm works on the knowledge base, using a first step that merges the knowledge in order to allow the projection to run. We then propose a second algorithm that uses a dual mechanism: the knowledge base is not modified, but the question is split and partial answers are searched for. We proved in [6] that this second mechanism is equivalent to the first one, and that it is sound and complete with respect to logical deduction. We then show that this second algorithm has several advantages over the first one. The last section of this article presents the Rap module, which is based on the second algorithm. Rap is implemented on the CoGITo platform [7].
2 Knowledge Querying
In the following, we call a set of CGs under normal form1 representing the facts of our system a fact base. Our goal is to provide CG answers to a CG question asked on a fact base. The projection operation is the basic operation used to make deductions.

2.1 The Reasoning
In this paper, we focus exclusively on reasoning in terms of graph operations. However, in order to clarify the notions of fact base, question and answers, we propose an intuitive definition of these notions. A formal definition in first order logic is proposed in [6].

Definition 1 Let FB = {F1, . . . , Fk} be a set of CGs under normal form defined on the support S and Φ(F1), . . . , Φ(Fk) the set of associated formulae. Let Φ(S) be the set of formulae associated with S. Let Q be a CG defined on S and Φ(Q) its associated formula. We say that there exists an answer A to Q on FB iff Φ(S), Φ(F1), . . . , Φ(Fk) ⊢ Φ(Q). The construction of such an answer is presented in the next section.

2.2 Composition: The First Algorithm
Our goal is to propose a sound and complete algorithm computing CG answers according to the previous definition. In other words, we want to show that it is possible to give a CG answer A to the CG question Q without using a logical theorem prover. We could define the notion of answer as a "specialization of the CG question belonging to the CG fact base". Thus the algorithm could be "projecting the CG question upon each CG fact". But such a definition cannot solve the following
1 According to [8], a CG is under normal form if it does not have two conceptual vertices with the same individual marker; the normal form of a graph is computed by merging the conceptual vertices with the same individual marker.
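The normalization step of this footnote can be sketched in a few lines. This is our own illustration, not the authors' code; it assumes a hypothetical toy encoding in which `concepts` maps a vertex id to a (type, marker) pair (marker `None` for generic vertices) and `relations` is a list of (relation type, argument-id tuple) pairs.

```python
# Hypothetical toy encoding of a CG (ours, not from the paper):
#   concepts:  id -> (concept type, individual marker or None)
#   relations: list of (relation type, tuple of concept ids)
def normalize(concepts, relations):
    rep = {}    # first concept id seen for each individual marker
    remap = {}  # old concept id -> id of the merged vertex
    for cid, (_ctype, marker) in concepts.items():
        remap[cid] = cid if marker is None else rep.setdefault(marker, cid)
    merged = {cid: v for cid, v in concepts.items() if remap[cid] == cid}
    new_rels = [(r, tuple(remap[a] for a in args)) for r, args in relations]
    return merged, new_rels
```

The merging is a single linear pass over the vertices, which matches the later remark (section 2.3) that putting a graph under normal form has linear complexity.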
Knowledge Querying in the Conceptual Graph Model: The RAP Module
problem: the knowledge relative to an individual marker and allowing one to answer a question may be distributed among several CG facts of the base. In that case it is impossible to answer, because of the impossibility of finding a projection. For example, consider the question Q and the fact base FB presented in fig. 1. Q cannot be projected on a graph of FB. But the logical formula associated with Q can be deduced from the part of FB in bold lines. Graph F in fig. 1 is obtained by the disjunctive sum of F1 and F2 (a disjunctive sum is a specialization operation in [8] consisting of the juxtaposition of two CGs), followed by the join made on the individual vertices Mike. This graph obviously admits a projection of Q on its grey subgraph. That sub-graph of F is an answer to Q.
Fig. 1. Example of a CG fact base and a CG question. No graph of FB admits a projection of Q. But graph F, the logical interpretation of which is equivalent to the conjunction of the formulae associated with the graphs in FB, admits a projection of Q. There exists an answer to Q in FB.
So the first algorithm consists in considering the fact base FB = {F1, . . . , Fk} as a unique graph F resulting from the disjunctive sum of F1, . . . , Fk, then normalizing that graph by merging all the vertices with the same individual marker, and finally projecting Q on F. The Composition algorithm is presented thoroughly in [6].
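As a rough illustration, the Composition algorithm can be sketched as follows. The encoding and names are ours (hypothetical, not the authors' implementation), type matching is simplified to type equality (no type hierarchy), and the projection search is naive brute force, projection being NP-complete in general.

```python
from itertools import product

# Toy CG encoding (ours): a graph is (concepts, relations) with
# concepts: id -> (type, marker or None) and relations: list of
# (relation type, tuple of concept ids).
def compose(facts):
    """Disjunctive sum of all facts, then normalization by marker."""
    concepts, relations, off = {}, [], 0
    for fc, fr in facts:                          # disjoint union with offsets
        concepts.update({cid + off: v for cid, v in fc.items()})
        relations += [(r, tuple(a + off for a in args)) for r, args in fr]
        off += max(fc, default=0) + 1
    rep, remap = {}, {}
    for cid, (_t, m) in concepts.items():         # merge equal markers
        remap[cid] = cid if m is None else rep.setdefault(m, cid)
    concepts = {c: v for c, v in concepts.items() if remap[c] == c}
    relations = [(r, tuple(remap[a] for a in args)) for r, args in relations]
    return concepts, relations

def project(question, fact):
    """Naive projection: types must be equal; a generic question vertex
    (marker None) may map onto any vertex of the same type."""
    (qc, qr), (fc, fr) = question, fact
    qids, frset = list(qc), set(fr)
    for images in product(fc, repeat=len(qids)):
        pi = dict(zip(qids, images))
        ok_c = all(qc[q][0] == fc[pi[q]][0] and qc[q][1] in (None, fc[pi[q]][1])
                   for q in qids)
        ok_r = all((r, tuple(pi[a] for a in args)) in frset for r, args in qr)
        if ok_c and ok_r:
            return pi
    return None
```

On a fact base in the spirit of fig. 1, a question that projects into no single fact does project into the composed graph once the two occurrences of the same individual marker have been merged.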
2.3 Blues: The Second Algorithm
Two drawbacks of the Composition algorithm can be noted. Firstly, the knowledge base is modified by the use of disjunctive sum and normalization. This is not a problem in terms of the complexity of these operations, but the original form of the knowledge is modified. This can be a problem, for instance, if such knowledge represents sentences from a text and you want to know easily which sentences an answer comes from. Secondly, Composition involves a projection of the CG question into a graph of a size "equivalent to the size of the knowledge base". In terms of implementation, this can be detrimental, the whole knowledge base having to be loaded into memory in order to compute the projection. That is why we propose the Blues2 algorithm, which does not modify the fact base: Blues splits the question instead of merging the CG base. The main idea of this algorithm was proposed by Carbonneill and Haemmerlé [4]. It is an equivalent alternative to the Composition algorithm. In what follows, we assume that all the CG facts are individually in normal form: the combinatorial cost increases significantly when we work with unspecified graphs, while putting a graph under normal form (which is an equivalent form in terms of logic) has a complexity linear in the size of the graph.

Presentation and definitions. In the Composition algorithm, we have seen that the answers to a question Q on a fact base FB are graphs resulting from the projection of Q into the graph produced by the disjunctive sum of the CG facts, then put under normal form. The Blues algorithm simulates these operations in two stages:
1. It splits the question instead of merging the facts, then tries to match each generated sub-question into the base in order to obtain partial answers.
2. It expresses conditions on the recombination of these partial answers in order to generate exact answers.
More precisely, the splitting of a question Q gives us a set Q = {Q1, . . . , Qi, . . . , Qn} of CGs that we call sub-questions, and a set C = {C1, . . . , Cj, . . . , Cm} of sets of conceptual vertices. Each sub-question Qi is a copy of a connected sub-graph of Q which has at most one relation vertex (and its neighbouring concept vertices). Each Cj is composed of all the concept vertices that result from the splitting of the same concept vertex of Q, which we call a cut vertex. The cut vertices of Q are the concept vertices that have at least two distinct neighbouring relation vertices. Moreover, we denote by ci the concept vertex of the sub-question Qi generated by the cut vertex c of Q (see fig. 2).

Definition 2 We call a partial answer to a CG question Q the graph that results from the projection Πi of the sub-question Qi into a graph of the base FB. We write it Πi(Qi).
2 Building alL the solUtions aftEr splitS.
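The splitting step can be sketched as follows. This is our illustration under the same hypothetical toy encoding as before (`concepts`: id -> (type, marker), `relations`: list of (relation type, argument-id tuple)); on the question of fig. 2 it yields four sub-questions and two copy sets of sizes 2 and 3.

```python
def split(concepts, relations):
    """Return the sub-questions (one star per relation vertex) and the
    sets C_j of copies generated by each cut vertex."""
    degree = {}
    for _r, args in relations:     # count distinct neighbouring relations
        for a in set(args):
            degree[a] = degree.get(a, 0) + 1
    cut_vertices = {c for c, d in degree.items() if d >= 2}
    subqs, copies = [], {}
    for i, (r, args) in enumerate(relations):
        # sub-question Q_i: the relation vertex plus copies of its neighbours
        sub_concepts = {(i, a): concepts[a] for a in args}
        subqs.append((sub_concepts, [(r, tuple((i, a) for a in args))]))
        for a in set(args):
            if a in cut_vertices:  # record the copy c_i of the cut vertex c
                copies.setdefault(a, []).append((i, a))
    return subqs, list(copies.values())
```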
Fig. 2. Split of a CG question Q. c and c′ are two cut vertices of Q that generate four sub-questions. We have the following sets: Q = {Q1, Q2, Q3, Q4} and C = {{c1, c2}, {c′2, c′3, c′4}}.
Our goal is to know whether a partial answer set P = {Π1(Q1), · · · , Πn(Qn)} can be recombined into an exact answer to Q. We denote by Π the sum of the projections Π1, · · · , Πn such that ∀i, 1 ≤ i ≤ n, if v ∈ Qi then Π(v) = Πi(v).

Definition 3 Let Cj = {ci1, · · · , cil} be the set of concept vertices resulting from the splitting of the cut vertex c of Q, and, for all j, 1 ≤ j ≤ l, let Πij(cij) be the image by Πij of the vertex cij of the sub-question Qij. The set ΠCj = {Πi1(ci1), · · · , Πil(cil)} is recombinable if each pair (Πu(ciu), Πv(civ)) satisfies one of the following conditions (the indices u and v are between 1 and l):
a) Πu(ciu) and Πv(civ) are the same vertex; the splitting was not necessary.
b) Πu(ciu) and Πv(civ) are distinct but have the same individual marker.

Definition 4 Let Π1(Q1), · · · , Πn(Qn) be partial answers obtained by projecting the sub-questions Q1, · · · , Qn of Q by Π1, · · · , Πn on the CG fact base FB. Let C = {C1, . . . , Cj, . . . , Cm} be the set of sets of concept vertices due to the splitting of the CG question Q. Let Π be the sum of the projections Π1, · · · , Πn such that ∀i, 1 ≤ i ≤ n, if v ∈ Qi then Π(v) = Πi(v). Let Q1,···,n be the graph resulting from the disjunctive sum of Q1, · · · , Qn. If ∀j, 1 ≤ j ≤ m, ΠCj is recombinable, then we call the normal form of the CG Π(Q1,···,n) an answer to Q.

Figure 3 shows an example of the recombination of a CG answer by the Blues algorithm. Note that Blues computes all the CG answers to a CG question Q on a CG fact base. The Blues algorithm is presented thoroughly in [6]. We also proved in that article that the Blues algorithm is a sound and complete reasoning mechanism with regard to our logical definition of the reasoning (definition 1) and that it is equivalent to the Composition algorithm. Moreover, it is important to note that both algorithms rely on the graph projection problem, which is NP-complete [9].
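The recombination test of Definition 3 reduces to a pairwise check on the images of the copies of one cut vertex. A minimal sketch, with an encoding of our own: `images` maps each copy to its image fact vertex under the chosen projections, and `marker` gives each fact vertex's individual marker (`None` for generic vertices).

```python
def recombinable(copies, images, marker):
    """Check conditions a) and b) of Definition 3 for one set C_j."""
    imgs = [images[c] for c in copies]
    for u in imgs:
        for v in imgs:
            same_vertex = u == v                                # case a)
            same_marker = (marker.get(u) is not None
                           and marker.get(u) == marker.get(v))  # case b)
            if not (same_vertex or same_marker):
                return False
    return True
```

Two copies mapped to distinct vertices recombine only if both vertices carry the same individual marker, as with [Man:Mike] in fig. 3.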
Fig. 3. Recombination of answers. A is a CG answer to the CG question Q of figure 2 on the CG fact base FB = {F1, F2}. Each Πi(Qi) is the partial answer to sub-question Qi on FB. a and b symbolize the different cases of recombination on ΠC1 = {Π1(c1), Π2(c2)} and ΠC2 = {Π2(c′2), Π3(c′3), Π4(c′4)}.
3 The RAP Module

3.1 Presentation
The Rap module (seaRch for All Projections) is essentially an implementation of the Blues algorithm; thus it provides sound and complete reasoning. The Rap module is implemented on the CoGITo platform [10, 7] and can therefore take advantage of its graph base management system [5, 11]. This management system is based on the hierarchy of CGs induced by the specialization relation. In addition to the usual techniques of classification and indexing [12, 13], the CoGITo platform implements hash-coding and filtering mechanisms that allow a reduction of the search domain without using any projection operation. Moreover, the projection algorithms that are used are efficient [9, 11]. As far as we know [14], the Rap module is the only reasoning module working in a general framework (the projection is not reduced to an injective projection. . . ) while being sound and complete with respect to logical deduction. Among the systems close to ours, we can cite the PEIRCE system [15]. PEIRCE ignores the possibility of "split" knowledge, and it uses injective projection. But it proposes "close match" CGs, which are pertinent answers. Rock (Reasoning On
Conceptual Knowledge) [3] is another system close to ours. It searches for exact and pertinent answers by modifying the question graph. It considers type definitions (Rap does not). But its major drawback is that it is not complete: Rock can fail to find exact answers to a question. This drawback motivated the development of the Blues algorithm by Carbonneill and Haemmerlé [4, 6].

3.2 Optimization of the Blues Algorithm
Two optimizations of the Blues algorithm have been made in the Rap module. The first one consists in using the set of specialization graphs of each sub-question graph as the search domain, instead of the whole fact base. This is done by means of the specialization search functionalities of CoGITo, which use hierarchical structuring and indexing. The second optimization consists in recomposing the answers only when it is certain that the sets of concept vertices of ΠC are recombinable, instead of trying a hypothetical recombination for every combination of partial answers. More precisely, this optimization is based on the following property: the choice of a partial answer to a sub-question Qi implies restrictions on the choice of the partial answers to the sub-questions located in the immediate neighbourhood of Qi. If no partial answer is possible for one of these sub-questions, then the chosen partial answer to Qi cannot lead to an exact answer to Q. Two kinds of restriction are used in this optimization. In the example of figures 2 and 3, when the algorithm chooses a partial answer to Q1 in F1, it must choose a partial answer also belonging to F1 as a partial answer to Q2, in order to make a recombination of type a. Then, for Q3 and Q4, the algorithm has to choose partial answers containing the individual vertex [Man:Mike] in order to make a recombination of type b. The indexed management of graphs in the CoGITo platform makes such a restriction of the search domain easy.
4 Conclusion and Perspectives
We have studied two reasoning algorithms based only on graph operations: the Composition algorithm, which modifies the CG fact base in order to "compose" knowledge possibly split over several distinct CGs, and the Blues algorithm, which works by splitting the question graph, searching for partial answers, then recombining a complete answer. These algorithms are sound and complete with respect to logical deduction. This work was primarily theoretical, but we have implemented the Blues algorithm in order to test it and to observe its behaviour on large CG bases. The first test concerns a base of 10000 graphs (generated by a random algorithm). The CPU times of Blues are close to those of Composition. This is a valuable result, because it shows that using an algorithm that respects the original form of a CG fact base is not detrimental.
This first step in the development of a reasoning module for the CoGITo platform should lead us to a more complete module taking extensions of the CG model into account (nested CGs for instance [8]). Another direction of study is the exploration of techniques allowing one to provide pertinent answers, as was done by the Rock system. The Blues algorithm could easily be adapted to heuristic reasoning during its recombination phase. This would allow us to take advantage of both the completeness of the Rap module and the Rock system's ability to provide pertinent answers.
References
1. J.F. Sowa. Conceptual Structures - Information Processing in Mind and Machine. Addison-Wesley, Reading, Massachusetts, 1984.
2. M. Chein and M.L. Mugnier. Conceptual Graphs: fundamental notions. Revue d'Intelligence Artificielle, 6(4):365–406, 1992.
3. B. Carbonneill and O. Haemmerlé. ROCK : Un système de question/réponse fondé sur le formalisme des graphes conceptuels. In Actes du 9ème Congrès Reconnaissance des Formes et Intelligence Artificielle, Paris, pages 159–169, 1994.
4. B. Carbonneill. Vers un système de représentation de connaissances et de raisonnement fondé sur les graphes conceptuels. Thèse d'informatique, Université Montpellier 2, Janvier 1996.
5. O. Guinaldo. Étude d'un gestionnaire d'ensembles de graphes conceptuels. PhD thesis, Université Montpellier 2, Décembre 1996.
6. O. Guinaldo and O. Haemmerlé. Algorithmes de raisonnement dans le formalisme des graphes conceptuels. In Actes du XIe congrès RFIA, volume 3, pages 335–344, Clermont-Ferrand, 1998.
7. O. Guinaldo and O. Haemmerlé. CoGITo : une plate-forme logicielle pour raisonner avec des graphes conceptuels. In Actes du XVe congrès INFORSID, pages 287–306, Toulouse, juin 1997.
8. M.L. Mugnier and M. Chein. Représenter des connaissances et raisonner avec des graphes. Revue d'Intelligence Artificielle, 10(1):7–56, 1996.
9. M.L. Mugnier and M. Chein. Polynomial algorithms for projection and matching. In Heather D. Pfeiffer, editor, Proceedings of the 7th Annual Workshop on Conceptual Graphs, pages 49–58, New Mexico State University, 1992.
10. O. Haemmerlé. CoGITo : Une plate-forme de développement de logiciels sur les graphes conceptuels. Thèse d'informatique, Université Montpellier 2, Janvier 1995.
11. O. Guinaldo. Conceptual graphs isomorphism - algorithm and use. In Proceedings of the 4th Int. Conf. on Conceptual Structures, Lecture Notes in Artificial Intelligence, Springer-Verlag, pages 160–174, Sydney, Australia, August 1996.
12. R. Levinson. Pattern Associativity and the Retrieval of Semantic Networks. Computers Math. Applic., 23(6-9):573–600, 1992.
13. G. Ellis. Compiled hierarchical retrieval. In E. Way, editor, Proceedings of the 6th Annual Workshop on Conceptual Graphs, pages 187–207, Binghamton, 1991.
14. D. Lukose, editor. Proceedings of the First CGTOOLS Workshop, University of New South Wales, Sydney, N.S.W., Australia, August 1996.
15. G. Ellis, R. Levinson, and P. Robinson. Managing complex objects in PEIRCE. International Journal on Human-Computer Studies, 41:109–148, 1994.
Stepwise construction of the Dedekind-MacNeille Completion

Bernhard Ganter1 and Sergei O. Kuznetsov2

1 Technische Universität Dresden, Institut für Algebra, D-01062 Dresden
2 Department of Theoretical Foundations of Informatics, All-Russia Institute for Scientific and Technical Information (VINITI), ul. Usievicha 20 a, 125219 Moscow, Russia
Abstract. Lattices are mathematical structures which are frequently used for the representation of data. Several authors have considered the problem of incremental construction of lattices. We show that with a rather general approach, this problem becomes well-structured. We give simple algorithms with satisfactory complexity bounds.
For a subset A ⊆ P of an ordered set (P, ≤), let A↑ denote the set of all upper bounds, that is, A↑ := {p ∈ P | a ≤ p for all a ∈ A}. The set A↓ of lower bounds is defined dually. A cut of (P, ≤) is a pair (A, B) with A, B ⊆ P, A↑ = B, and A = B↓. It is well known that these cuts, ordered by

(A1, B1) ≤ (A2, B2) :⇐⇒ A1 ⊆ A2 (⇐⇒ B2 ⊆ B1),

form a complete lattice, the Dedekind-MacNeille completion (or, for short, completion) of (P, ≤). It is the smallest complete lattice containing a subset order-isomorphic with (P, ≤). The size of the completion may be exponential in |P|. The completion can be computed in steps: first complete a small part of (P, ≤), then add another element, complete again, et cetera. Each such step increases the size of the completion only moderately and is moreover easy to perform. We shall demonstrate this by describing an elementary algorithm that, given a (finite) ordered set (P, ≤) and its completion (L, ≤), constructs the completion of any one-element extension of (P, ≤) in O(|L| · |P| · ω(P)) steps, where ω(P) denotes the width of (P, ≤). The special case where (P, ≤) is itself a complete lattice, and thus isomorphic to its completion, has been considered as the problem of minimal insertion of an element into a lattice; see e.g. Valtchev [4]. We obtain that the complexity of inserting an element into a lattice (L, ≤) and then forming its completion is bounded by O(|L|² · ω(L)).

M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 295–302, 1998. © Springer-Verlag Berlin Heidelberg 1998
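For small examples, the cuts just defined can be enumerated directly from the definition. The following brute-force sketch is ours (exponential, for illustration only, not the paper's incremental algorithm); it represents (P, ≤) as a list of elements together with a `leq` predicate.

```python
from itertools import chain, combinations

def up(P, leq, A):      # the set of common upper bounds of A
    return frozenset(p for p in P if all(leq(a, p) for a in A))

def down(P, leq, A):    # the set of common lower bounds of A
    return frozenset(p for p in P if all(leq(p, a) for a in A))

def completion(P, leq):
    """All cuts (A, B) with up(A) = B and down(B) = A."""
    cuts = set()
    for A in chain.from_iterable(combinations(P, r) for r in range(len(P) + 1)):
        B = up(P, leq, A)
        cuts.add((down(P, leq, B), B))
    return cuts
```

A two-element antichain yields the four-element completion (bottom, the two elements, top), while a two-element chain is already complete.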
B. Ganter and S.O. Kuznetsov
The elementary considerations on the incidence matrix of (P, ≤), which we use in the proof, do not utilize any of the order properties. Our result therefore generalizes to arbitrary incidence matrices. In the language of Formal Concept Analysis this may be interpreted as inserting a preconcept into a concept lattice.
1 Computing the Completion
Let us define a precut of an ordered set to be a pair (S, T), where S is an order filter and T is an order ideal such that S ⊆ T↓, T ⊆ S↑. We consider the following construction problem:

Instance: A finite ordered set (P, ≤), its completion, and a precut (S, T) of (P, ≤).
Output: The completion of (P ∪ {x}, ≤), where x ∉ P is some new element with p ≤ x ⇐⇒ p ∈ S and x ≤ p ⇐⇒ p ∈ T for all p ∈ P.1

(P, ≤) may be given by its incidence matrix (of size O(|P|²)). The completion may be represented as a list of cuts, that is, of pairs of subsets of P. With a simple case analysis we show how the cuts of (P ∪ {x}, ≤) can be obtained from those of (P, ≤).

Proposition 1. Each cut of (P ∪ {x}, ≤), except (S ∪ {x}, T ∪ {x}), is of the form (C, D), (C ∪ {x}, D ∩ T), or (C ∩ S, D ∪ {x}) for some cut (C, D) of (P, ≤). If (C, D) is a cut of (P, ≤) then
1. (C ∪ {x}, D ∩ T) is a cut of (P ∪ {x}, ≤) iff S ⊂ C = (D ∩ T)↓,
2. (C ∩ S, D ∪ {x}) is a cut of (P ∪ {x}, ≤) iff T ⊂ D = (C ∩ S)↑,
3. (C, D) is a cut of (P ∪ {x}, ≤) iff C ⊈ S and D ⊈ T.

For a proof of this result and of the following see the next section.

Proposition 2. The number of cuts of (P ∪ {x}, ≤) does not exceed twice the number of cuts of (P, ≤), plus two.

A natural embedding of the completion of (P, ≤) into that of (P ∪ {x}, ≤) is given by the next proposition:

Proposition 3. For each cut (C, D) of (P, ≤) exactly one of (C, D),
(C ∪ {x}, D), (C, D ∪ {x}), (C ∪ {x}, D ∪ {x}) is a cut of (P ∪ {x}, ≤).

These cuts can be considered to be the "old" cuts, up to a modification. "New" cuts are obtained only from cuts (C, D) that satisfy 3) and simultaneously 1) or 2). An algorithm can now be given:
1 For elements of P different from x, the order remains as it was.
Algorithm to construct the completion of (P ∪ {x}, ≤). Let L denote the set of all cuts of (P, ≤).
– Output (S ∪ {x}, T ∪ {x}).
– For each (C, D) ∈ L do:
1. If C ⊆ S and D ⊈ T then output (C, D ∪ {x}).
2. If C ⊈ S and D ⊆ T then output (C ∪ {x}, D).
3. If C ⊈ S and D ⊈ T then
a) output (C, D),
b) if C = (D ∩ T)↓ then output (C ∪ {x}, D ∩ T),
c) if D = (C ∩ S)↑ then output (C ∩ S, D ∪ {x}).
– End.

It follows from the above propositions that this algorithm outputs every cut of (P ∪ {x}, ≤) exactly once. Each step of the algorithm involves operations on subsets of P. The most time consuming one is the computation of (D ∩ T)↓ and of (C ∩ S)↑. Note that (D ∩ T)↓ = (min(D ∩ T))↓, where min(D ∩ T) is the set of the minimal elements of D ∩ T and can be computed in O(|P| · ω(P)) steps. Since |min(D ∩ T)| ≤ ω(P) and, moreover,

(min(D ∩ T))↓ = ⋂ {p↓ | p ∈ min(D ∩ T)},

we conclude that (D ∩ T)↓ can be obtained with an effort of O(|P| · ω(P)). The dual argument for (C ∩ S)↑ leads to the same result. So if L is the set of cuts of (P, ≤), then the algorithm can be completed in O(|L| · |P| · ω(P)) steps. Let us mention that computing an incidence matrix of the completion can be done in O(|L|²) steps, once the completion has been computed; see Proposition 6.
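The algorithm above can be transcribed almost line by line. This is our Python sketch (not the authors' implementation): the ordered set is a list `P` with predicate `leq`, the precut is `(S, T)`, `L` is the list of cuts of (P, ≤) given as pairs of sets, and `"x"` is the fresh element.

```python
def insert_element(P, leq, L, S, T, x="x"):
    S, T = frozenset(S), frozenset(T)
    def up(A):    # common upper bounds, computed in (P, <=)
        return frozenset(p for p in P if all(leq(a, p) for a in A))
    def down(A):  # common lower bounds, computed in (P, <=)
        return frozenset(p for p in P if all(leq(p, a) for a in A))
    out = [(S | {x}, T | {x})]
    for C, D in L:
        C, D = frozenset(C), frozenset(D)
        if C <= S and not D <= T:            # step 1
            out.append((C, D | {x}))
        elif not C <= S and D <= T:          # step 2
            out.append((C | {x}, D))
        elif not C <= S and not D <= T:      # step 3
            out.append((C, D))               # 3a
            if C == down(D & T):             # 3b
                out.append((C | {x}, D & T))
            if D == up(C & S):               # 3c
                out.append((C & S, D | {x}))
    return out
```

For instance, extending a two-element antichain {a, b} by an element x above a turns the four old cuts into the five cuts of the new order.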
2 Inserting a Preconcept
A triple (G, M, I) is called a formal context if G and M are sets and I ⊆ G × M is a relation between G and M. For each subset A ⊆ G let

AI := {m ∈ M | (g, m) ∈ I for all g ∈ A}.

Dually, we define for B ⊆ M

BI := {g ∈ G | (g, m) ∈ I for all m ∈ B}.

A formal concept of (G, M, I) is a pair (A, B) with A ⊆ G, B ⊆ M, AI = B, and A = BI. The formal concepts, ordered by

(A1, B1) ≤ (A2, B2) :⇐⇒ A1 ⊆ A2 (⇐⇒ B2 ⊆ B1),

form a complete lattice, the concept lattice of (G, M, I). Most of the arguments given below become rather obvious if one visualizes a formal context as a G × M cross table, where the crosses indicate the incidence
relation I. The concepts (we sometimes omit the word "formal") then correspond to maximal rectangles in such a table. Note that if A = BI for some set B ⊆ M, then (A, AI) automatically is a concept of (G, M, I). A pair (A, B) with A ⊆ G, B ⊆ M, A ⊆ BI, and B ⊆ AI is called a preconcept of (G, M, I). In order to change a preconcept into a concept, one may extend each of the sets G and M by one element with the appropriate incidences. So, as a straightforward generalization of the above, we consider the following construction problem:

Instance: A finite context (G, M, I), its concept lattice, and a preconcept (S, T) of (G, M, I).
Output: The concept lattice of (G ∪ {x}, M ∪ {x}, I+), where x ∉ G ∪ M is a new element and I+ := I ∪ ((S ∪ {x}) × ({x} ∪ T)).

The special case of section 1 is obtained by letting G = M := P and (g, m) ∈ I :⇐⇒ g ≤ m.
Proposition 4. Each formal concept of (G ∪ {x}, M ∪ {x}, I+), with the exception of (S ∪ {x}, T ∪ {x}), is of the form (C, D), (C ∪ {x}, D ∩ T), or (C ∩ S, D ∪ {x}) for some formal concept (C, D) of (G, M, I). With the obvious modifications, the conditions given in Proposition 1 hold.

Proof. Each formal concept (A, B) of (G ∪ {x}, M ∪ {x}, I+) belongs to one of the following cases:
1. x ∈ A, x ∈ B. Then A = S ∪ {x}, B = T ∪ {x}.
2. x ∈ A, x ∉ B. Then B ⊆ T and BI = A \ {x}. Therefore (C, D) := (A \ {x}, (A \ {x})I) is a formal concept of (G, M, I) satisfying

S ⊂ C = (D ∩ T)I. (1)

Conversely, if (C, D) is a formal concept of (G, M, I) satisfying (1), then (A, B) := (C ∪ {x}, D ∩ T) is a formal concept of (G ∪ {x}, M ∪ {x}, I+).
3. x ∉ A, x ∈ B, dual to 2. Then (C, D) := ((B \ {x})I, B \ {x}) is a concept of (G, M, I) with

T ⊂ D = (C ∩ S)I. (2)

Conversely, each formal concept (C, D) with (2) yields a formal concept (A, B) := (C ∩ S, D ∪ {x}) of (G ∪ {x}, M ∪ {x}, I+).
4. x ∉ A, x ∉ B. Then (C, D) := (A, B) is a formal concept also of (G, M, I), satisfying

C ⊈ S, D ⊈ T. (3)

Conversely, each pair with (3) is also a concept of (G ∪ {x}, M ∪ {x}, I+).

If both (C ∪ {x}, D ∩ T) and (C ∩ S, D ∪ {x}) happen to be concepts, then S ⊆ C and T ⊆ D, which implies C ∪ {x} = TI, D ∪ {x} = SI. Thus, apart from perhaps one exceptional case, these two possibilities exclude each other. From each concept of (G, M, I), we therefore obtain at most two concepts of (G ∪ {x}, M ∪ {x}, I+), except in a single exceptional case, which may lead to three solutions. On the other hand, each concept of (G ∪ {x}, M ∪ {x}, I+), except (S ∪ {x}, T ∪ {x}), is obtained in this manner. This proves Proposition 2. To see that Proposition 3 holds in the general case, note that each formal concept (C, D) of (G, M, I) belongs to one of the following cases:
1. C = S, D = T. Then (C ∪ {x}, D ∪ {x}) is a concept of (G ∪ {x}, M ∪ {x}, I+).
2. C ⊆ S, T ⊂ D. Then D = CI and condition (2) (from the proof of Proposition 4) is fulfilled. Thus (C, D ∪ {x}) is a concept of (G ∪ {x}, M ∪ {x}, I+).
3. S ⊂ C, D ⊆ T. Then C = DI and condition (1) is satisfied. Therefore (C ∪ {x}, D) is a concept of (G ∪ {x}, M ∪ {x}, I+).
4. C ⊈ S, D ⊈ T. Then (C, D) is a concept of (G ∪ {x}, M ∪ {x}, I+).
It is clear that each of the possible outcomes determines (C, D), and that therefore the possibilities are mutually exclusive. It is a routine matter to check that these formal concepts are ordered in the same way as those of (G, M, I). The construction thus yields a canonical order embedding of the small concept lattice into that of the enlarged context. Since all details have carried over to the more general case, we may summarize:

Proposition 5. The algorithm given in section 1, when applied to the concept lattice L of (G, M, I), computes the concept lattice of (G ∪ {x}, M ∪ {x}, I+).

The abovementioned complexity considerations apply as well, but it is helpful to introduce a parameter for contexts that corresponds to the width.
The incidence relation induces a quasiorder on G by g1 ≤ g2 :⇐⇒ {g2}I ⊆ {g1}I. Let ω(G) be the width of this quasiorder, and let ω(M) denote the width of the corresponding quasiorder on M. Let τ(G, M, I) := (ω(G) + ω(M)) · (|G| + |M|). Of course, τ(G, M, I) ≤ (|G| + |M|)². Provided the induced quasiorders on G and M are given as incidence matrices (these can be obtained in O(|G| · |M| · (|G| + |M|)) steps), we have a better bound on the complexity of the derivation operators: the set AI can be computed from A with complexity O(τ(G, M, I)).
Computing AI was the most time consuming step in the algorithm of section 1. Thus, computing the new concept lattice can be performed with O(|L| · τ(G, M, I)) bit operations. Each concept of (G ∪ {x}, M ∪ {x}, I+), except (S ∪ {x}, T ∪ {x}), is generated by exactly one of the steps 1, 2, 3a, 3b, 3c of the algorithm, and precisely 3b) and 3c) lead to "new" concepts (other than (S ∪ {x}, T ∪ {x})). When performing the algorithm, we may note down how the concepts were obtained. These data can be used later to construct an incidence matrix of the new lattice:

Proposition 6. The order relation of the new lattice can be computed in additional O(|L|²) steps.

Proof. (S ∪ {x}, T ∪ {x}) is the largest concept containing x in its extent and the smallest concept containing x in its intent. In other words, (S ∪ {x}, T ∪ {x}) is greater than all concepts generated in steps 2) and 3b) and smaller than all concepts generated in steps 1) and 3c). It is incomparable to the other elements. So we may exclude this concept from further considerations. The order relation between the "old" concepts, i.e. between those generated in steps 1), 2), and 3a), is the same as before. For the remaining case, we consider w.l.o.g. a concept (C ∪ {x}, D ∩ T), which was generated in step 3b) from a concept (C, D) of (G, M, I). Now (C ∪ {x}, D ∩ T) ≤ (E, F) if and only if (E, F) has been generated in steps 2) or 3b) from some concept (E \ {x}, (E \ {x})I) ≥ (C, D) of (G, M, I). If x ∈ E, then similarly (E, F) ≤ (C ∪ {x}, D ∩ T) is true if and only if (E, F) has been generated in steps 2) or 3b) from some concept (E \ {x}, (E \ {x})I) ≤ (C, D) of (G, M, I). Suppose x ∉ E. If (E, F) was obtained in steps 1) or 3a) of the algorithm, then (E, EI) is a concept of (G, M, I) and (E, F) ≤ (C ∪ {x}, D ∩ T) is equivalent to (E, EI) ≤ (C, D). If (E, F) was obtained in step 3c), then SI ⊆ F, which implies D ∩ T ⊆ SI ⊆ F. So in this case (E, F) ≤ (C ∪ {x}, D ∩ T) always holds.
Summarizing these facts, we obtain all comparabilities of a concept (C ∪ {x}, D ∩ T) of (G ∪ {x}, M ∪ {x}, I+) which was derived from a concept (C, D) of (G, M, I) in step 3b): concepts greater than (C ∪ {x}, D ∩ T) are those obtained in steps 2) or 3b) from concepts greater than (C, D); concepts smaller than (C ∪ {x}, D ∩ T) are those obtained in steps 1), 2), 3a) or 3b) from those smaller than (C, D), and all those obtained in step 3c). Thus the comparabilities of (C ∪ {x}, D ∩ T) can be obtained from those of (C, D) using only a bounded number of elementary operations in each case. Filling the corresponding row of the incidence matrix is of complexity O(|L|). The argument for concepts obtained by 3c) is analogous. The generalized algorithm may be applied to the context (P, P, ≯), obtained from an arbitrary ordered set (P, ≤). The concept lattice is then the lattice of maximal antichains of (P, ≤) (see Wille [5]). Our result therefore relates to that of Jard, Jourdan and Rampon [2].
3 A Non-Incremental Procedure may be more Convenient
In practice, a strategy suggests itself that may be more time-consuming, but is nevertheless simpler than the algorithm presented in section 1. Rather than pursuing an incremental algorithm, it may be easier to compute the lattice "from scratch" (i.e. from the formal context, or, in the special case, from the ordered set (P, ≤)) each time. For this task there is an algorithm that is remarkably simple (it can be programmed in a few lines) and at the same time is not dramatically slower than the incremental approach: it computes the concept lattice L of a formal context (G, M, I) in O(|L| · |G|² · |M|) steps. Using the parameter introduced above, we can improve this to O(|L| · |G| · τ(G, M, I)). This algorithm generates the formal concepts inductively and does not require a list of concepts to be stored. Let us exemplify the advantage of this by a simple calculation: a formal context (G, M, I) with |G| = |M| = 50 may have as many as 2⁵⁰ formal concepts in the extreme. But even if the lattice is "small" and has only, say, 10¹⁰ elements, it would require almost a hundred gigabytes of storage space. Generating such a lattice with the inductive algorithm appears to be time-consuming, but not out of reach; the storage space required would be less than one kilobyte. Moreover, this algorithm admits modifications that allow searching specific parts of the lattice. For details and proofs we refer to the literature (see [1]), but the algorithm itself is so simple that it can be recalled here. For simplicity assume G := {1, . . . , n}, and define for subsets A, B ⊆ G the lectic order A <ᵢ B :⇐⇒ i ∈ B \ A and A ∩ {1, . . . , i − 1} = B ∩ {1, . . . , i − 1}, as well as, for i ∉ A, the set A ⊕ i := ((A ∩ {1, . . . , i − 1}) ∪ {i})II.
B. Ganter and S.O. Kuznetsov
It is easy to see that computing A ⊕ i requires at most O(|G| · |M|) steps, and, using the induced quasiorders, only O(τ(G, M, I)) steps. The “next” extent therefore is found at an expense of O(|G|^2 · |M|), or even O(|G| · τ(G, M, I)). If a lattice diagram is to be generated, the inductive approach may even be faster than the incremental one. For a given extent A ≠ G, the extents of the upper covers are precisely the minimal sets of the form (A ∪ {i})^II, i ∉ A.
Computing these requires O(|G|^2 · |M|) steps. Localizing such an upper cover in a linear list of extents, using a binary search algorithm, can be done with O(log |L|) comparisons of subsets of G. The complexity thus is O(|G| · |M|), since |L| ≤ 2^|M|. Every finite lattice (L, ≤) is isomorphic to some concept lattice. A natural choice is the formal context (J(L), M(L), ≤), where J(L) and M(L) denote the sets of join- and meet-irreducible elements of (L, ≤), respectively. If we denote the cardinalities of these sets by j(L) := |J(L)| and m(L) := |M(L)|, we can summarize:

Corollary 1. The covering relation of a finite lattice (L, ≤) can be computed in O(j(L)^2 · m(L) · |L|) steps, provided the sets J(L) and M(L) of join- and meet-irreducible elements are given.

This is considerably better than, e.g., the bound given by Skorsky [3]. Again, the bound can be refined using the width of the induced orders on G and on M.
References

1. Bernhard Ganter, Rudolf Wille: Formale Begriffsanalyse – Mathematische Grundlagen. Springer-Verlag, 1996.
2. C. Jard, G.-V. Jourdan and J.-X. Rampon: Computing On-Line the Lattice of Maximal Antichains of Posets. Order 11 (1994).
3. Martin Skorsky: Endliche Verbände – Diagramme und Eigenschaften. Shaker, 1992.
4. Petko Valtchev: An Algorithm for Minimal Insertion in a Type Lattice. Second International KRUSE Symposium, Vancouver, 1997.
5. Rudolf Wille: Finite distributive lattices as concept lattices. Atti Inc. Logica Mathematica, 2 (1985).
PAC Learning Conceptual Graphs

Pascal Jappy and Richard Nock
Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier
161 rue Ada, 34392 Montpellier Cedex 5, France
{jappy, nock}@lirmm.fr
Abstract. This paper discusses the practical learnability of simple classes of conceptual graphs. We place ourselves in the well-studied PAC learnability framework and describe the proof technique we use. We first prove a negative learning result for general conceptual graphs. We then establish positive results by restricting ourselves to classes in which basic operations are polynomial. More precisely, we first state a sufficient condition for the learnability of graphs having a polynomial projection operation. We then extend this result to disjunctions of graphs of bounded size.
1 Introduction
The last decade has seen an explosive growth in database technology and in the amount of data collected. Advances in data collection have flooded us with data. Data Mining is the efficient supervised or unsupervised discovery of interesting, useful, and previously unknown patterns in this data. Machine Learning, and in particular Inductive Learning, is the supervised extraction of an abstract concept which has to correctly explain classified observations. In both Data Mining and Machine Learning, the choice of the language used to represent the data is of the utmost importance, and the study of the efficiency of learning for a given representation language, i.e. its learnability, has become an active field of research. It is common in Data Analysis and Machine Learning to represent data as attribute-value pair vectors, often real valued, and in Computational Learning Theory the complexity of learning has been studied mainly for Boolean (i.e. two-valued) formula classes. However, alternative descriptions have been proposed in the fields of Artificial Intelligence and Knowledge Representation. Structural representations, for instance, provide a means of capturing a special type of information that others do not: not only are an object’s properties described, but also the relations between its subcomponents. Furthermore, the readability of such representations is far greater. This makes them much more usable by experts, and an abundant literature has been devoted to them, from early semantic networks [18] to the more recent Description Logics [2] or Conceptual Graphs [23], [3]. In Machine Learning, such representations are now being used in several applications such as DENDRAL [13].

M.-L. Mugnier and M. Chein (Eds.): ICCS’98, LNAI 1453, pp. 303–315, 1998. © Springer-Verlag Berlin Heidelberg 1998
P. Jappy and R. Nock
To formally analyse the learning hardness associated with a given description language, several theoretical models have been proposed. Identifiability in the limit [8], for instance, focuses on isolating a formula capable of explaining a set of examples in finite time, but with no a priori bound on the number of examples available to the learner. The most widely used learnability model, however, is the Probably Approximately Correct (PAC) framework introduced by Valiant [24]. It has become a learnability benchmark in which the vast majority of studies of the past ten years have been undertaken. Early work cast a pessimistic shadow on the learnability of such classes. [9], for instance, showed that learning existential conjunctions in structural domains can be intractable. However, more recent studies have shown that compromises can be found between sheer representation power and complexity of learning. In the quest for more expressive representation languages for Machine Learning, two main trends of structural formalisms have emerged. On one side, Inductive Logic Programming (ILP) aims at learning concepts expressed as Horn clauses from examples and background knowledge [15]. Though the general problem is undecidable, many restricted classes have led to learnability in the limit [22] or Probably Approximately Correct (PAC) learnability results [6]; [12]; [4]. On the other side, a family of structural languages called Description Logics has been developed in the past decade by the Knowledge Representation (KR) community. They provide an efficient means for expressing concept hierarchies and form the basis of KR systems such as KL-ONE [2]. Furthermore, they offer syntactic restrictions of First Order Logic (FOL) previously unexplored in ILP. In this paper, we study the possibility of using Conceptual Graphs in learning applications. This formalism has been studied extensively and its algorithmic properties are well known [3], making it a natural candidate for learning applications.
Besides, their expressive power is high, equivalent to large subsets of First Order Logic [23], [3]. Furthermore, document retrieval applications which employ search procedures close to Data Mining techniques have been developed using conceptual graphs [7]. We place ourselves in the PAC framework and use a constructive proof technique [1] to obtain our results. Our goal is to isolate which subsets of general Conceptual Graphs allow efficient learning. We first give a negative result for conceptual graphs, showing that their NP-Complete projection operation makes it impossible to PAC learn them. We then show a sufficient condition for Conceptual Graph classes with polynomial projection to be learnable. Finally, we extend this result to disjunctions and decision lists of graphs of limited size. Both disjunctions and decision lists are widely employed in Machine Learning, and the classes we define by extending these to conceptual graphs are likely to prove very efficient. The rest of this paper is organized as follows. In Section 2, we present the formal learnability model we have chosen to place ourselves in. In particular, we describe both the techniques used in practice to prove positive results, and existing theories which limit our search for efficiently learnable classes. In Section 3,
we prove a first negative result for general conceptual graphs. Section 4 then lists our positive results and their respective proofs. We finally conclude in Section 5.
2 Formal Background
Our goal in this paper is to determine which conceptual graph classes are efficiently learnable. Obviously, this formal analysis requires an explicit model of what it means to be efficiently learnable. In this section, we describe our model of learnability, which is a slight modification of the PAC learnability model introduced by Valiant [24].

2.1 Hypothesis Space
In inductive learning, the goal is to find an unknown target concept, or some good approximation of it, from a set of labeled examples. Let X be a set called the domain, and let CG be the set of conceptual graphs. A concept C over X is a subset of X. A concept class is a set of concepts; it designates a constrained set of potential “target” concepts which could be labelling the training examples. A hypothesis is also a subset of X, which the learning algorithm produces as an approximation of the hidden target concept. A hypothesis class is a set of hypotheses. Associated with each concept or hypothesis class is a language L ⊂ CG used for writing down concepts or hypotheses in these classes. In this paper, both hypothesis and concept classes will be variously restricted classes of Conceptual Graphs. We will also assume the existence of an acceptable size measure (by acceptable, we mean polynomially related to the actual number of bits required to write down a concept or hypothesis). As it will always be clear from the context whether we are referring to a concept or its representation in L, we will let c denote both. Usually, examples of the concept c are elements of the domain X, with x ∈ X labeled as positive if x ∈ c and negative otherwise. Here, we will depart from this practice and consider examples described as conceptual graphs, since this is closer to real applications of the domain. Therefore, examples will be elements selected from CG, and x will be labeled positive if c subsumes x (i.e. c projects onto x) and negative otherwise. Finally, the learning algorithm must be allowed greater resources if the concept class is vast; class complexity parameters are thus needed to express this richness. In this paper, we will follow the accepted theoretical presentation of conceptual graphs proposed in [3].
Hence, the parameters which determine the richness of a given class of conceptual graphs are the number of relations in the support, the number of concepts in the type lattice and the number of individual markers allowed. We will let α1, α2 and α3 denote these, respectively. More generally, CG(αi) will represent the subset of CG defined using parameters (αi).
2.2 PAC Learnability
Informally, to say that a class C of concepts is “learnable from examples” means that there exists a corresponding learning algorithm A which, for any function c in C, is capable of converging upon c as it is given more and more examples: A will have learnt c from the examples. This accomplishment by itself is not necessarily very interesting, as no bound on A’s execution time or on the number of examples required has been stated. For instance, any boolean function class defined over n variables is clearly learnable from examples, since all the algorithm has to do is to wait for an example corresponding to each of the 2^n possible variable assignments to know the complete truth table for c. So the interesting question really is whether C is easily learnable from examples, that is, whether the learning algorithm requires both few examples and a reasonable computing time. Valiant [24] introduced a notion of learnability which explicitly specifies that the hypothesis acquired by the learning algorithm should be a close approximation of the concept being taught via its examples, in the sense that this hypothesis should perform well on new data. We now make these notions more precise. Assume a fixed but unknown probability distribution D(αi) to be defined on CG(αi), according to which the examples are drawn. When a hypothesis h is returned by the algorithm, its quality is evaluated by the probability that a new example drawn according to D(αi) is incorrectly classified by h.

Definition 1. Let C denote a concept class and C(αi) the subset of C containing all elements built with parameters (αi). C is polynomially learnable by H iff there exists a learning algorithm A such that:

– for all probability distributions D(αi) over C(αi),
– for all richness parameters (αi),
– for all target concepts c in C(αi),
– for all precision parameters ε,
– for all confidence parameters δ,

when given a set of m examples of c drawn according to D(αi), with m polynomial in αi, size(c), 1/ε and 1/δ, A finds a hypothesis g in C(αi) such that

P_D(αi)( P_D(αi)(c ≠ g) > ε ) < δ    (1)

in time polynomial in m and in the size of the largest example.

In other words, this definition requires that A be able to produce an answer whose error is less than the prespecified precision parameter ε. However, it is theoretically possible to draw a learning sample of examples according to D(αi) that is not “representative” of the target concept c. So the model also supplies a confidence parameter δ and only demands that the precision bound fail with probability at most δ. Obviously, as ε and δ approach 0, the algorithm is allowed a larger learning sample and more computation time.
If a concept class C conforms to Definition 1, it is said to be PAC learnable by itself: the hidden concept c belongs to it, and the learning algorithm also searches in C for the hypotheses approximating c. There are a number of variations on this definition. The one which interests us here is the relaxation in which the algorithm is allowed to look for the hypotheses in another (hypothesis) class H. In this case, the concept class C is said to be PAC learnable by H.

2.3 Obtaining Positive Results
Proving positive PAC results directly from Valiant’s definition can be difficult. Blumer et al. [1] have developed easier techniques for doing so. We now recall one of their key theorems, which allows the proof to be split into two simpler ones.

Definition 2. A class C is said to be of polynomial (code) size if, for all (αi), λ(C(αi)) = log(|C(αi)|) is polynomial in (αi).

Definition 3. A class C is said to be polynomial time identifiable if, when given parameters (αi) and a set of m examples, an identification algorithm can produce in time polynomial in m and (αi) a function g of C(αi) which is consistent with the examples, or detect its non-existence.

Theorem 1. Let C denote a concept class of polynomial size. If C is polynomial time identifiable then C is polynomially learnable.

Proof. Suppose f is a hypothesis in C(αi) whose error probability is greater than ε. The chance that f be consistent with a sample of m examples drawn according to D(αi) is at most (1 − ε)^m. So the chance that some concept of C(αi) both has error greater than ε and is consistent with the sample is at most |C(αi)|(1 − ε)^m. The PAC learning requirements demand that:

|C(αi)|(1 − ε)^m ≤ δ.    (2)

Solving for m, it suffices that:

m > (1/ε)(ln(|C(αi)|) + ln(1/δ)).    (3)

If m satisfies this inequality, the probability that a hypothesis consistent with the sample turns out to have an error greater than ε is less than δ. In other words, the algorithm needs only examine m examples and return a consistent hypothesis. The class being of polynomial size, m is bounded by a polynomial in (αi), 1/ε and 1/δ, as required by the PAC framework. The practical consequence of this result is that, in order to prove a positive PAC result for a concept class C, it is sufficient to produce an identification algorithm and prove C’s polynomial size.
Unfortunately, this also highlights one major constraint of this technique: if testing whether an element x is an example of a concept c cannot be done in polynomial time, then no identification algorithm can be polynomial. This limits the use of the above results to classes for which this test is polynomial. In the following section we give a stronger result for conceptual graphs, similar in spirit to Theorem 1 in [5].
3 A Negative Result
Our first result concerns the general class CG of conceptual graphs, for which we prove that a polynomial projection test is necessary for the existence of a PAC learning algorithm. The proof of our theorem relies on a structural complexity hypothesis and requires the complexity class P/Poly.

Definition 4. The class P/Poly is the set of all languages accepted by a (possibly nonuniform) family of polynomial-size deterministic circuits.

It is an accepted assumption that NP ⊄ P/Poly, which in particular implies that NP-complete languages do not belong to P/Poly. We now state our first result.

Theorem 2. Let C ⊆ CG denote a conceptual graph class. If the projection test between elements of C is either NP-Complete or coNP-Complete, then C is not PAC-learnable unless NP ⊆ P/Poly.

Proof. For any concept class C and concept c ∈ C, we denote by inf(c) the concept having the same representation as c, but which denotes the set inf(c) = {d ∈ C : c projects into d}. Similarly, we define inf(C) = {inf(c) : c ∈ C}. It is immediate that C is PAC learnable iff inf(C) is, and that testing membership for a concept inf(c) ∈ inf(C) is as hard as testing projection between elements of C. In Theorem 7 of [21], it is shown that if inf(C) is PAC learnable then inf(C) ∈ P/Poly. Thus, if projection in inf(C) is NP-Complete or coNP-Complete, then NP ⊆ P/Poly or coNP ⊆ P/Poly. This leads to a contradiction in both cases (because P/Poly is closed under complementation).

A simple corollary of this theorem yields our first result regarding the learnability of conceptual graphs.

Corollary 1. Since the projection operation is NP-complete in CG, general conceptual graphs are not PAC learnable.

In the next section, we examine restrictions of this last class and determine which are learnable in the framework presented above.
4 Three Positive Results
In this section, we investigate restrictions on conceptual graphs which are sufficient to ensure PAC learnability. Our previous negative result imposes that we
restrict ourselves to classes whose projection operation is polynomial. Our first positive result is a general one. It is based on a transcription of Theorem 1 and identifies a size limitation on classes which is sufficient to ensure PAC learnability. We then specialize this and prove that well-known boolean formula classes can be extended by substituting conceptual graphs of limited size for the boolean monomials more commonly used in machine learning. We prove that the classes thus obtained, which are far more expressive than their boolean counterparts, are PAC learnable.

4.1 PAC Learning Individual Graphs
Theorem 3. Let C(αi) ⊂ CG denote a conceptual graph class. If C(αi) is of polynomial (code) size and the least upper bound operation is polynomial, then C(αi) is PAC learnable.

Proof. We use the technique based on Theorem 1: we prove that log(|C(αi)|) is polynomial in αi, and provide an identification algorithm, that is, one capable of returning a conceptual graph consistent with an input sample. We give a large upper bound on the number of elements of C(αi) with a total number of n vertices (which we use as a crude but acceptable size measure) as follows. We suppose that we have n relation vertices and n concept vertices. In this bipartite graph, any edge is either present or absent. For any concept vertex, we have α2 · α3 substitution possibilities, and for any relation vertex, we have α1 possibilities of substitution. Given this procedure for generating graphs, it follows that we can generate any graph in C(αi) having a total number of n vertices. The number of total substitutions of vertices is upper bounded by (α1 · α2 · α3)^n. Taking into account the possibility for any edge of being either absent or present, the total number of conceptual graphs having n vertices is (largely) upper bounded by

((α1 · α2 · α3)^n)^(n^2) = (α1 · α2 · α3)^(n^3).
It is obvious that taking the log of this quantity gives a polynomial in αi. The identification algorithm we propose is based on the least upper bound operation lub(., .), which returns, for any pair of conceptual graphs, the most specific one which projects into both. Let S = S⁺ ∪ S⁻ denote the sample. Let cpos be the graph defined as the lub of all positive examples, and cneg the lub of all negative examples. One of these two graphs is necessarily consistent with the sample. Indeed, if this were not the case, there would be a positive example which projects into cneg and a negative one which projects into cpos. Since, by definition of the lub, cpos projects into all positive examples and cneg into all negative ones, the transitivity of the projection operation would mean there would be a cycle between the positive
examples (or the negative ones) and themselves, which is impossible. We have thus produced an algorithm capable of returning a graph consistent with the input sample. And if the lub operation is polynomial, this algorithm is too. This concludes the proof.

This result uses the lub(., .) operation to construct hypothesis graphs. The theorem below links the more commonly studied projection operation to PAC learnability.

Theorem 4. Let C(αi) ⊆ CG denote a conceptual graph class. If both the complexity of the projection test between elements of C(αi) and |C(αi)| are polynomial in (αi), then C(αi) is PAC learnable.

Proof. Here, the size restriction imposed on C(αi) lets us construct an even simpler identification algorithm. Indeed, a simple enumeration of all the elements of C(αi), with a projection test between the current graph and all the examples in the sample, is all that is needed to select a consistent graph or detect its non-existence. This algorithm is obviously polynomial if the theorem’s assumptions hold. That the polynomial (code) size criterion is met is also trivial, since the class itself is of polynomial cardinality.

This result is weaker than the previous one since it requires the class to be of smaller size. However, in the next section, we extend it by showing that under identical conditions on C(αi), more elaborate hypotheses (than individual graphs) can be constructed from elements of C(αi) and successfully PAC learnt.
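The two identification strategies (lub-based for Theorem 3, enumeration for Theorem 4) can be sketched side by side. This is a hedged toy sketch: `lub` and `projects` are assumed oracles supplied by the graph class (in the test they are modeled on sets, where g projects into x iff g ⊆ x), and all names are ours.

```python
from functools import reduce

def identify_by_lub(positives, negatives, lub, projects):
    """Theorem 3 sketch: one of lub(positives), lub(negatives) works.

    Returns whichever of the two candidate graphs is consistent with
    the sample, following the argument in the proof.
    """
    c_pos = reduce(lub, positives)
    c_neg = reduce(lub, negatives)
    if not any(projects(c_pos, x) for x in negatives):
        return c_pos
    if not any(projects(c_neg, x) for x in positives):
        return c_neg
    return None  # excluded by the cycle argument in the proof


def identify_by_enumeration(concept_class, sample, projects):
    """Theorem 4 sketch: brute-force search over a polynomial-size class.

    sample is a list of (example, label) pairs; returns a consistent
    graph or None when none exists.
    """
    for g in concept_class:
        if all(projects(g, x) == label for x, label in sample):
            return g
    return None
```

Both routines are polynomial as soon as the oracles are, which is exactly the content of the two theorems.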
Learning Formulas Based on size Limited Graphs
In this section, we examine the possibility of using Conceptual Graphs as the basic buidling blocs of more elaborate formulas. Two classes of boolean formulas have been at the center of numerous studies in Machine Learning. These are DNF (Disjunctive Normal Form) [25] and DL (Decision Lists) [19]. An element of DNF (which we will note dnf )is the disjunction of boolean monomes, that is the conjunction of boolean litterals. A dnf classifies an example x as positive if x satisfies at least one of its monomes. The learnability of DNF is an important open problem, but several results have been shown for subsets of this class. In particular, the class k-DNF, in which the monomes in a disjunction are limited in size to a maximum of k, is PAC learnable[25]. A decision list (dl) is an ordered list of conjunctive rules noted (t1 → g1 ), (t2 → g2 ), ..., (tk → gk ), gk+1 where ∀1 ≤ i ≤ k, ti is a boolean monome and gi ∈ 0, 1), and the class gk+1 is called the default class. The class associated to any example x is the goal class corresponding to the first test passed by the example, ie the first monome satisfied by x. If none is passed, the example is assigned the default class. Again, a positive PAC learning result is obtained if the monomes in the list are limited
in size to at most k. These two classes are widely used in learning applications and have good algorithmic properties. Thus, it is very interesting to extend them by replacing the monomials they contain with more expressive terms. This increases their representation power, yet retains the good algorithmic aspects. This approach has already been studied in detail in [10] for several Description Logics as well as other similar structured knowledge representation formalisms. Here, we propose to do the same with conceptual graphs. Our next result shows under which conditions on the graph sets this extension leads to learnable formula classes.

Theorem 5. Let C(αi) ⊆ CG denote a conceptual graph class. If both the complexity of the projection test between elements of C(αi) and |C(αi)| are polynomial in (αi), then k-C(αi)DNF and k-C(αi)DL are PAC learnable.

Proof. Again, our proof is constructive. Let k-C(αi) denote the subset of C(αi) containing only graphs of size no greater than k. Replacing n by k in the size bound for C(αi) yields:

|k-C(αi)| ≤ (α1 · α2 · α3)^(k^3).    (4)

We first calculate the number of disjunctions and of decision lists of graphs in k-C(αi), then produce identification algorithms for k-C(αi)DNF and k-C(αi)DL. Any element of k-C(αi)DNF is built in the following way: all the graphs in k-C(αi) are examined one by one, and each is either added to the formula or left out. Thus, the number of disjunctions is equal to 2^|k-C(αi)|, and consequently:

log(|k-C(αi)DNF|) = |k-C(αi)| ≤ (α1 · α2 · α3)^(k^3).    (5)

Similarly, any element of k-C(αi)DL is built as follows: all the graphs in k-C(αi) are examined one by one, and each is either added to the formula associated to class 0, or added to the formula associated to class 1, or left out. Also, the order of the elements in the list is important, so all graph selection orders must be examined. So the number of decision lists is equal to:

|k-C(αi)DL| = 3^|k-C(αi)| · (|k-C(αi)|)!    (6)

And consequently:

log(|k-C(αi)DL|) = O(|k-C(αi)| · log |k-C(αi)|) = O((α1 · α2 · α3)^(k^3) · k^3).    (7)
The identification algorithms for the two classes we are interested in follow. These two greedy algorithms are based on the simple observation that any concept consistent with a sample set is necessarily consistent with any of its subsets. Therefore, both consist in selecting graphs which are consistent with a subset of the sample, adding each such graph to the current formula and removing the examples it classifies correctly. When all the examples have been removed, a consistent formula has been constructed. If at any time this proves impossible, the non-existence of a consistent formula has been detected.
Tables 1 and 2 describe the identification algorithm for disjunctions. Every time a graph consistent with a part of the sample is found, it is ORed to the current disjunction. If all possible graphs have been examined and some examples are left unexplained, then no disjunction is consistent with the sample. Indeed, the disjunction being commutative, the order of graph selection is unimportant and the concept class need only be searched once.

    BuildDNF
      DNF := MakeEmptyDNF();
      LS  := ExamplesSet;
      WHILE NotAllNegative(LS) DO
        CurrentGraph := SearchPositiveGraph(LS);
        LS  := LS − ExamplesSatisfiedBy(CurrentGraph);
        DNF := DNF + CurrentGraph;
      END
      Return DNF;
    END

Table 1. Pseudocode for building a CG disjunction.
    SearchPositiveGraph(LS)
      CGraph = Make extensive search of a CG satisfied by positive examples;
      Return CGraph;
    END

Table 2. Pseudocode for extensive search of a CG consistent with positive examples.
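The same greedy loop can be phrased in Python. This is a hedged toy sketch, not the authors' code: `candidates` stands for the graphs of k-C(αi), `projects` for the projection test (modeled on sets in the test below), and the names are ours.

```python
def build_dnf(candidates, sample, projects):
    """Greedy analogue of BuildDNF/SearchPositiveGraph (Tables 1-2).

    sample is a list of (example, label) pairs. Returns a list of graphs
    whose disjunction is consistent with the sample, or None.
    """
    dnf, remaining = [], list(sample)
    while any(label for _, label in remaining):      # positives left to cover
        pick = None
        for g in candidates:                         # extensive search
            covered = [label for x, label in remaining if projects(g, x)]
            if covered and all(covered):             # covers only positives
                pick = g
                break
        if pick is None:
            return None                              # no consistent disjunction
        remaining = [(x, label) for x, label in remaining
                     if not projects(pick, x)]
        dnf.append(pick)
    return dnf
```

As in the pseudocode, commutativity of disjunction means a single pass over the candidate class per selected graph suffices.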
It is obvious that for both classes, if |k-C(αi)| is polynomial in αi and the projection test is also polynomial in these parameters, then the two above algorithms are as well. Two classes to which this result applies are Conceptual Trees [17] and locally injective Conceptual Graphs [14]. Although their expressivity is not as great as that of unconstrained conceptual graphs, their combination in more complex formulas yields very interesting concept classes for learning. Disjunctive Normal Form formulas are thought to be natural knowledge representations for the human brain [25]. Decision Lists are also very easy to interpret, because they are lists of easily understandable rules. Besides, they are among the most general known PAC learnable classes. The extensions defined above are likely to prove very efficient and interpretable in learning applications.
    BuildDL
      DL := MakeEmptyDL();
      LS := ExamplesSet;
      WHILE NotOnlyOneClass(LS) DO
        CurrentGraph := SearchGraph(LS);
        LS := LS − ExamplesSatisfiedBy(CurrentGraph);
        DL := DL + CurrentGraph;
      END
      AddDefaultClass(LS);
      Return DL;
    END

Table 3. Pseudocode for building a CG decision list.

    SearchGraph(LS)
      CGraph = Make extensive search of a CG satisfied by examples from only one class;
      Return CGraph;
    END

Table 4. Pseudocode for extensive search of a CG.
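A hedged Python analogue of BuildDL/SearchGraph can be written the same way (again, `projects` is an assumed oracle, modeled on sets in the test, and the names are ours):

```python
def build_dl(candidates, sample, projects):
    """Greedy analogue of BuildDL/SearchGraph (Tables 3-4).

    Returns (rules, default), where rules is an ordered list of
    (graph, class) pairs, or None if no consistent list exists.
    """
    rules, remaining = [], list(sample)
    while len({label for _, label in remaining}) > 1:
        pick = None
        for g in candidates:            # extensive search for a "pure" test
            covered = [label for x, label in remaining if projects(g, x)]
            if covered and len(set(covered)) == 1:
                pick = (g, covered[0])
                break
        if pick is None:
            return None
        rules.append(pick)
        remaining = [(x, label) for x, label in remaining
                     if not projects(pick[0], x)]
    default = remaining[0][1] if remaining else None
    return rules, default
```

The only difference from the disjunction case is that a selected graph may cover examples of either class, as long as it covers only one of them.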
5 Conclusion
In this paper, we have studied the practical learnability of classes of conceptual graphs. We have first presented a formal learnability model derived from the famous PAC learning framework, which we have adapted to make it suitable for the analysis of graphs. This is our main contribution, because it paves the way to other studies. In particular, we have shown that conceptual graphs in their general form [3] do not allow PAC learning, under widely accepted complexity assumptions. This negative result being linked to the NP-Completeness of the projection test, we have then studied the possibility of learning classes of graphs for which basic algorithmic operations are polynomial, and shown that the results become positive in this case. Finally, we have shown that graphs of limited size allow efficient learning not only as isolated concepts but as basic building blocks of more elaborate formulae which extend well-known boolean concept classes. This type of extension of existing boolean formula classes has already been formally studied in [10], where various Description Logics were shown to be adequate extension languages. Here the results go further, since restricted conceptual graphs play this role. It would be interesting to compare the expressivity of the resulting classes in both cases. Another interesting extension would be to examine learning graph rules, such as those presented in [20]. This could not be done in this paper because a (polynomial) rule generation mechanism from example graphs would be needed to
build an identification algorithm. However, the structure of these rules being an extension of function-free Horn clauses quite similar to the extension of boolean formulae defined above, it is quite likely that positive learnability results could also be obtained (with the same size limitation on graphs) by using some inference techniques of Inductive Logic Programming. Finally, the various PAC results presented in this work are only early learning results, meant to pave the way to more extensive studies. It should be noted that the PAC framework has become a benchmark model over the years, but it is by no means the only interesting one. In fact it makes severe assumptions which may bias learning results [11]. Thus, it would be most interesting to study the learnability of conceptual graphs in newer models; in particular, U-learnability [16] comes to mind. Indeed, two of PAC’s drawbacks are its distributional assumptions and its worst-case analysis, which makes a polynomial projection mandatory. This last point is a severe restriction, since in practice the NP-Completeness of the projection test is rarely an obstacle, and efficient algorithms have been incorporated in most applied systems.
References

1. A. Blumer, A. Ehrenfeucht, D. Haussler, and M. K. Warmuth. Learnability and the Vapnik-Chervonenkis dimension. J. ACM, pages 929–965, 1989.
2. R.J. Brachman and J. Schmolze. An overview of the KL-ONE knowledge representation system. Cognitive Science, 9:171–216, 1985.
3. M. Chein and M.L. Mugnier. Conceptual graphs: Fundamental notions. Revue d’Intelligence Artificielle, pages 365–406, 1992.
4. W. W. Cohen. PAC-learning non-determinate clauses. In Proc. of AAAI-94, pages 676–681, 1994.
5. W. W. Cohen and H. Hirsh. The learnability of Description Logic with equality constraints. Machine Learning, pages 169–199, 1994.
6. S. Dzeroski, S. Muggleton, and S. Russel. PAC-learning of determinate logic programs. In Proc. of the 5th International Conference on Computational Learning Theory, pages 128–137, 1992.
7. D. Genest. Document retrieval: An approach based on conceptual graphs. Rapport de Recherche LIRMM No 97296, 1998.
8. E.M. Gold. Language identification in the limit. Information and Control, 10:447–474, 1967.
9. D. Haussler. Learning conjunctive concepts in structural domains. Machine Learning, 4:7–40, 1989.
10. P. Jappy and O. Gascuel. On the computational hardness of learning from structured symbolic data. In Proceedings of the 6th International Conference on Ordinal and Symbolic Data Analysis, OSDA95, pages 128–143, 1995.
11. P. Jappy, R. Nock, and O. Gascuel. Negative robust learning results for Horn clause programs. In Proc. of the 13th International Conference on Machine Learning, 1996.
12. J.U. Kietz. Some lower bounds for the computational complexity of inductive logic programming. In European Conference on Machine Learning, ECML’93, pages 115–123, 1993.
PAC Learning Conceptual Graphs
13. R.K. Lindsay, B.G. Buchanan, E.A. Feigenbaum, and J. Lederberg. DENDRAL: a case study of the first expert system for scientific hypothesis formation. Artificial Intelligence, 61:209–261, 1993.
14. M. Liquière. Apprentissage à partir d'objets structurés. Conception et Réalisation. PhD thesis, Université de Montpellier II, 1990.
15. S.H. Muggleton. Inductive Logic Programming. Academic Press, New York, 1992.
16. S.H. Muggleton. Bayesian inductive logic programming. In COLT94, pages 3–11, 1994.
17. M.L. Mugnier and M. Chein. Polynomial algorithms for projection and matching. In Proc. of the 7th Workshop on Conceptual Structures, pages 68–76, 1992.
18. R.H. Richens. Preprogramming for mechanical translation. Mechanical Translation, 3, 1956.
19. R.L. Rivest. Learning decision lists. Machine Learning, pages 229–246, 1987.
20. E. Salvat and M.L. Mugnier. Sound and complete forward and backward chaining of graph rules. In Proceedings of the International Conference on Conceptual Structures, ICCS96, pages 248–262, 1996.
21. R. Schapire. The strength of weak learnability. Machine Learning, 5(2), 1990.
22. E.Y. Shapiro. Algorithmic Program Debugging. Academic Press, New York, 1983.
23. J.F. Sowa. Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, 1984.
24. L. G. Valiant. A theory of the learnable. Communications of the ACM, pages 1134–1142, 1984.
25. L. G. Valiant. Learning disjunctions of conjunctions. In Proc. of the 9th IJCAI, pages 560–566, 1985.
Procedural Renunciation and the Semi-Automatic Trap

Graham A. Mann
Artificial Intelligence Laboratory
School of Computer Science & Engineering
University of New South Wales
Sydney, NSW 2052, Australia.
[email protected]

Abstract. This paper addresses two contemporary issues which could threaten the usefulness of conceptual graphs and their widespread acceptance as a knowledge representation. The first concerns the recent debate over the place of actors in the formalism. After briefly summarising arguments on both sides, I take the position that actors should be retained, and marshal four supporting arguments. An example shows that (slightly enhanced) actor nodes can greatly simplify the delivery of external control signals, without excessively complicating the denotation of the graphs that contain them. The second issue concerns an epistemological problem which I have called the semi-automatic trap. This is our tendency to continue constructing systems of logic that depend on human involvement beyond necessity, to the point at which such involvement is impractical, unscalable and theoretically problematic. Two important escape routes from the semi-automatic trap are pointed out, involving more emphasis on automatic graph construction from primitive data, and emphasis on automatic interpretation of conceptual graphs. Practical methods for both are suggested as ways forward for the community.
M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 319-333, 1998. © Springer-Verlag Berlin Heidelberg 1998

1 Introduction

Recently, the conceptual graphs (CG) community briefly debated the role of, and suitability for inclusion of, actors, the angle-bracketed procedural nodes outlined in the original theory, within the newly emerging ANSI standard. The exchange took place over a few weeks in the electronic forum of the CG mailing list. On the pro-elimination side, it was argued that actors are unnecessary, in that existing elements of CG theory can effectively be used to do all that is claimed for actors without further complicating the standard; and that they are formally problematic, syntactically because their linear-form symbols already have assignments, and semantically because their denotation is at best awkward and at worst impossible to specify. Putative mechanisms for enabling naturally stepwise processes to be handled by CG systems without actors were discussed. Those arguing for retention held that these alternatives lead to a different set of problems, and that explicit tokens of procedural knowledge are intuitive to use and strategically prudent, since industrial developers want them as connection points to their chosen programming language. Actors were claimed to provide needed ways of capturing concepts with extensions that are unbounded sets, changing the values of referents, and sending and receiving messages from outside the CG system. The outcome of this exchange was a kind of compromise, articulated by Sowa, in which actors are viewed as calls in a functional language, with a distinction drawn between "impure" and "pure" actors, according to whether or not the function involved has side effects. Pure actors are much like relations and so could be expressed using the round-bracket or circle syntax, but would be defined by the named function, not the relational catalogue. Impure actors could pass control signals back and forth to concepts in the graphs to which they are connected, and may have irreversible effects outside the
system. Actors would be included in the forthcoming CGIF draft, but only the pure variety. In order to include impure actors, provision for actor-tokens in the referent field of a concept would need to be made, and a metalanguage statement amounting to a declaration specifying the possible inputs and outputs would need to be provided somehow; these were left as tasks for the future. In the first part of this paper, I will marshal a number of arguments which further substantiate the role of procedural tokens as an indispensable part of any knowledge formalism, and make the case for a stronger role for actors in both the theory and practice of CGs. The second part of this work is both a cautionary note about a risk to our efforts which is so pervasive that it is difficult to see clearly, and a response to what I see as our community's curious ambivalence about tools and implemented systems. On the one hand, we all recognise the need for standardised tools, and admire good applications of CG technology. Yet until only last year, papers about these subjects tended to be excluded from the main proceedings of our conferences, while discussions about such things were practically relegated to separate events. As funding for public sector research in many countries is being reduced, industrial sponsorship becomes more important than ever - yet for the most part our attention seems to be directed elsewhere, on minutiae. This is not a suggestion that pure research on this topic be abandoned, but only that striving toward a mathematical ideal must not blind us to what will really make our efforts count in the long run - the widespread adoption of CGs in the knowledge-based societies of tomorrow. I will argue that these conflicts stem primarily from different views we hold about what CGs are to be used for. Some of us are interested in the notation itself and its formal properties.
Others are applying the formalism and the algorithms implementing its denotation to make new kinds of databases, retrieval mechanisms and other information technologies. Still others are chiefly interested in knowledge representation and automated reasoning for artificial intelligence. Often the differences between these groups can be ignored. For the purposes of this exposition I will exaggerate the differences between these three professional viewpoints - the logician, the information-system builder and the intelligent-system builder - and show how each is affected by what I call the semi-automatic trap. This is our tendency to continue constructing systems of logic that depend on human involvement, beyond necessity into realms where such involvement is impractical, inefficient and theoretically problematic. Partly following from traditional Platonic epistemology (Rationalism) and its modern extensions, and partly a consequence of the seduction of the computer, the semi-automatic trap has ensnared the CG and other symbolic approaches to the study of artificial reasoning. After identifying the main features and implications of the semi-automatic trap, I point out two important escape routes, both involving full, rather than partial, automation. First, the emerging interest in acquisition of conceptual structures from simpler symbolic constituents can be viewed as the beginnings of a reaction to the prevailing inward orientation of CG work. A sketch is offered of an experiment in which conceptual knowledge is collected from more primitive grades of data. Second, the more difficult issue of interpretation of conceptual structures by automatic means will be briefly considered. It is suggested that the label fields of nodes be made switchable, so that the developer can toggle between mnemonic labels and meaningless but systematic labels.
A recommended principle to guide progress toward automatic interpretation - that we begin the transition from truth-preserving algorithms to plausibility-preserving heuristics - is illustrated with a simple example. Except for one instance, the practical demands imposed by the development of commercial systems will be seen to exert a healthy influence on these issues.
2 The Role of Actors

Let us begin with the role of actors in CG theory. In the debate mentioned above, writers on both sides commented that actors tended to be left out of many conceptual graphs appearing in published papers (except those used in commercial systems). The idea that actors are somehow unnecessary is an instance of the current ascendancy of declarative accounts of knowledge over procedural accounts. This orthodoxy holds that declarative formalisms (including programming languages, knowledge bases and models of brain function) are cleaner, more economical and more transparent than procedural versions. While many epistemologies pay lip service to the need for a balance between the two, the models they lead to rarely seem to intermix them. Procedures are often portrayed as outdated, of limited portability, and resistant to learning or explanation. They are sometimes associated with bad programming habits in old-fashioned languages, hacking, and scruffy methodology. Some attempts to incorporate procedural elements in knowledge models began to be seen in the 80s in the KL-ONE modifications, KRYPTON [1] and BACK [12], OMEGA [5] and CycL [8]. Original CG theory [14], also a product of the 80s, showed foresight in permitting actors to influence and be influenced by concepts, even if the examples given were simple mathematical functions, which tended to foster the view that they are nothing more than a special kind of relation. At the end of this section, I will show how actors may be transformed into parameterised procedural calls in a way which has less to do with relations. Good reasons why procedures should be "tolerated" in CG epistemology were recently advanced by Rochowiak [13]. Beginning from a classical philosophical definition of knowledge as justified true belief, he argues that declarative and procedural forms can be distinguished in each of these three definitional terms.
The relationship between the two is complex, at least in the case of human knowledge: they share concepts and interact in complex ways [see 6], and Rochowiak speaks of a "natural history" of (scientific) knowledge in which, if unarticulated skills are sufficiently important to the continued scientific enterprise, they are promoted through the heuristic level to a "context-free" level which is distinct from, yet dependent on, the prior procedural form. As usual, Charles Peirce seems to have beaten us all to the conclusion that knowledge must combine the declarative and the procedural. As Rochowiak points out, Peirce was strongly committed to a view of concept meaning which fused the formal and practical. This commitment can be seen in his famous pragmatic maxim, one expression of which was: "Pragmatism is the principle that every theoretical judgement expressible in a sentence in the indicative mood is a confused form of thought whose only meaning, if it has any, lies in its tendency to enforce a corresponding practical maxim expressible as a conditional sentence having its apodosis in the imperative mood." [3: 5.18]
Using the vocabulary available to him, Peirce was suggesting that conceptual knowledge be thought of as rule-like pairs, with an imperative apodosis (consequent), or what today would be a procedure in the rule's then-clause. This is the essence of today's expert systems. Of course, rules may have simple assertions or variable assignments in their then-clauses. But Peirce continually used the word "practical" in his definitions of the maxim, which means that he understood the centrality of active outcomes, in the broadest sense of that word. Furthermore, even a cursory examination of Peirce's work reveals the importance he attached to the idea of knowledge as a dynamic, evolving entity which emerges from and is
critiqued by social processes in a community. His ambition of creating a process by which knowledge can be refined by successive elaborations of a publicly-readable graphical expression by members of a scientific community would, in the modern context, be a call to negotiate knowledge between the contributions of multiple agents. His pragmatism was an attempt to overcome, or at least make tractable, the problems of logic theory, which in the words of Peircean scholar Ernest Nagel "...derive almost entirely from isolating knowledge from the procedures leading up to it, so that it becomes logically impossible to obtain." [2].
These deliberations suggest that the procedural aspects of knowledge should not be factored out. Perhaps this idea has not been realised for historical reasons. In the late 1950s, computer science theory bifurcated into traditional stepwise control methods for a state machine, emerging directly from von Neumann's work, on the one hand [17], and mathematical function-based models, on which for example McCarthy based his Lisp [9], on the other. At each turn, as simple data structures and control metaphors gave way to more sophisticated schemes, the declarative was presented as the more refined, explicit, portable and formally backed. As notions of conceptual knowledge as logic developed, this backing became connected to the solidity of truth-conditional logic. Languages based on this idea could have very simple and elegant interpreters, since they essentially only needed to return truth values for expressions they evaluated. (Of course, practical functional languages do concede a need for incremental changes of state to get things done, but only reluctantly, as signified by the disdainful term "side-effects".) This simplicity is appealing enough to help perpetuate the erroneous notion that what is important or interesting about an expression in a knowledge language can always be captured by mapping it to a Boolean value. I have argued elsewhere [10] that for some purposes, truth-conditionality is simply not adequate as an answer to the question: "what does this expression mean?". Though conceptual graph theory per se requires no specific commitment to a particular semantics, much of the existing CG literature favours the truth-conditional variety, possibly in respect of its classical logic roots. But to the extent that conceptual graph theory has taken up the tradition, it has also inherited nagging questions about truth-conditionality. Is the truth of a statement always the only aspect of interest?
Under what conditions, if any, is it meaningful to assign a truth value to a statement about the future? To an imperative statement corresponding to the sentence "Go down these stairs."? Will the truth value be the same before and after the interpreter complies with the act? What relationship do truth-conditions have with the intentions of the speaker, the intentions of the hearer, or the hearer's active response? Now conceptual graphs can be and have been used with other, more progressive semantics, including situation semantics [15] and four-pole semantics [10]. The point here is that as long as the meaning of a conceptual graph is thought of as only a single Boolean variable, it is easy to ignore the need, in many cases, for an ordered sequence of active steps - a procedure - to be the substance of the graph's interpretation. So if we wish to define the pragmatics of conceptual graphs in terms of conceptual graphs themselves, we need to be able to explicitly encode procedures in the CG formalism. In one sense, this is really no great imposition: procedures of one kind or another are an integral part of CG theory, e.g. the canonical formation rules. The real questions are, what formalisms should be used to encode these, and what expressions should they have inside graphs? Why would we want to make the contents of procedures (as opposed to only explicit tokens of procedures) explicit within CGs? The principal reason is that changes to declarative aspects of graphs during reasoning can affect procedures, and changes in the status, or progress, of procedures can then influence reasoning, by virtue of their common elements. But this could also be viewed as a problem for two reasons: because
we do not have, and may not wish to introduce, a suitable language in the CG formalism itself, and because it would clutter up otherwise clearly understandable graphs. Without actors betokening procedures, definitions of ACT concepts - at least, those expressed in conceptual graphs - will in a sense always be empty. Imagine by how far the expansion of the concept [SAMBA] would miss the mark if it could not at any point provide access to some kind of (simulated or actual) sequence of motions. One might expect the definition-graph to classify the concept, declaratively, as a subtype of [LATIN-DANCE], and perhaps add, also declaratively, salient features such as modernity and tempo. But such a definition could never be wholly satisfactory as a surrogate without some representation which could replay the pattern of steps in time so as to demonstrate, or facilitate recognition of, the actual dance itself. Dictionaries attempt to capture the meaning of the word "samba" without recourse to such replays, but this is one of the reasons that machine-readable forms of conventional dictionaries are, by themselves, inadequate as a basis for commonsense knowledge. And movies or simulations of acts are one of the things we want to provide in a modern multimedia dictionary. An alternative to explicit actor nodes has been proposed in Sowa's object-oriented car-starting example [16]. External processes, in this case the ignition of a car's engine, could be provably initiated from a conceptual graph built on an object-oriented software model. Messages could be sent from objects that appeared to be concept boxes labelled with process names to other process-objects, not shown on the graph, and presumably more closely connected with the physical engine. Ellis [4] subsequently criticised this design as awkward, but his own simpler model still encounters the basic difficulty of trying to describe declaratively what is essentially a procedure, and it still seems to miss the point.
Concepts and relations should not be used as surrogates for active processes; they are statements of existence relative to an ontology. A more pragmatic question lurks beyond these complaints: even if we were satisfied with this kind of solution, do we really want our knowledge formalism to commit the user to a particular computational model? Given the popularity of the object-oriented programming approach, perhaps it is not a serious disadvantage - but it seems onerous to demand that a CG developer sign up with the object-oriented paradigm, when all that might be needed is a simple escape to the code level. It is not difficult to imagine design scenarios in which this tips the balance against the adoption of CGs. Figure 1 shows how actors would be used to produce a natural model of a car, without appealing to a particular computational paradigm or language. The referent of the concept MAX-DUR, *d, is functionally related to the referents of CAPACITY and FUEL-CONS by the "pure" actor DIVI, which divides the capacity of the tank, *c, by the rate of fuel consumption, *f. In the active interpretation of this, the output of DIVI is placed in the referent field of MAX-DUR. Execution of the procedure can be either data-directed, in which processing begins with values in the input nodes, or goal-directed, in which processing begins with a requested value at the output node and propagates backwards to the inputs. These processes are initiated by the assertion mark ! and the request mark ?, respectively. The "impure" actor START is not relation-like, but can be read as a state-changer. It is a procedure which can be initiated by a control mark in the ENGINE concept. The existence of a particular key is a precondition for the procedure, and only when the referent *k is bound to PCX999 will the precondition be satisfied. Before the signal to start arrives, the first placeholder acts like a concept which needs to be restricted to the value KEY:PCX999.
The ignition process creates side-effects, including rotation, heat and noise in the outside world; these may be measured by sensors and reported back to the CG system. At the end of the START process, a tachometer reports a number of
revolutions per minute, *r, back via the second placeholder, which is then data-driven to the referent field of RUNNING with the prefix @. Syntactically, the actors may be represented using a LISP-like functional notation in which the list delimiters are the < and > angle brackets, the first element is a procedure name, the last element is a placeholder for the returned value, and all other elements are an ordered list of input parameters.

Fig. 1. Alternative model of a car. (The figure shows an AUTOMOBILE with PART links to TANK and ENGINE; CHRC links to CAPACITY:*c, FUEL-CONS:*f, MAX-DUR:*d, MAX-SPEED:*s and MAX-RANGE:*t; the pure actors DIVI and MULT computing derived referents; and the impure actor START, with a STAT relation, connecting KEY:*k to RUNNING:@*r.)
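The actor machinery just described can be sketched in ordinary code. The following Python fragment is a minimal illustration of the idea only, not part of the paper: the names Concept, Actor and fire, and the simulated tachometer value 800, are my assumptions. It shows a "pure" actor fired data-directed (the ! mark) and an "impure" actor gated by a precondition on its input referent, as the START example requires.

```python
# Minimal sketch of CG actor nodes (hypothetical names, not from the paper).
# A Concept holds a referent; an Actor links input concepts to an output
# concept through a named function, as the <DIVI> and <START> examples do.

class Concept:
    def __init__(self, type_label, referent=None):
        self.type_label = type_label
        self.referent = referent          # None means "not yet bound"

class Actor:
    def __init__(self, name, fn, inputs, output, precondition=None):
        self.name = name
        self.fn = fn                      # the defining function
        self.inputs = inputs              # ordered list of input Concepts
        self.output = output              # output Concept
        self.precondition = precondition  # optional gate on input referents

    def fire(self):
        """Data-directed execution (the '!' assertion mark): compute the
        output referent from the current input referents, if permitted."""
        args = [c.referent for c in self.inputs]
        if any(a is None for a in args):
            return False                  # some input not yet bound
        if self.precondition and not self.precondition(args):
            return False                  # e.g. wrong key for <START>
        self.output.referent = self.fn(*args)
        return True

# Pure actor <DIVI>: MAX-DUR:*d = CAPACITY:*c / FUEL-CONS:*f
capacity = Concept("CAPACITY", 60.0)      # *c (hypothetical value)
fuel_cons = Concept("FUEL-CONS", 5.0)     # *f (hypothetical value)
max_dur = Concept("MAX-DUR")              # *d, to be computed
divi = Actor("DIVI", lambda c, f: c / f, [capacity, fuel_cons], max_dur)
divi.fire()
print(max_dur.referent)                   # prints 12.0

# Impure actor <START>: gated on the key referent being PCX999,
# returning a simulated tachometer reading *r for RUNNING:@*r.
key = Concept("KEY", "PCX999")
running = Concept("RUNNING")
start = Actor("START", lambda k: 800, [key], running,
              precondition=lambda args: args[0] == "PCX999")
print(start.fire(), running.referent)     # prints True 800
```

Goal-directed execution (the ? mark) would run the same machinery in reverse, propagating a request at the output node back to the inputs; it is omitted here for brevity.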
since any variable must be universally quantified. This means that actors could appear in first-order logic expressions without interfering with their quantifications. Within the predicate for the concept connected to the output of the actor, the translation places a second element: a function with the input variables as arguments. Thus the DIVI actor in Figure 1 would appear as ... ∧ (MAX-DUR (DIVI (a, b))) ∧ ... If desired, the output variable *d could be equated to the function elsewhere in the expression. The above arrangement is simple to understand and allows the graph's declarative nodes to interact with the actors. Suppose it was learned that the engine of the car has been replaced with a different engine. When the graph is updated to reflect this, the new instance will have a new characteristic fuel consumption. Because of the actor connections, the maximum duration and range will automatically alter to reflect this change; yet it is difficult to imagine natural relations which could be used to link [FUEL-CONS] with those two concepts. From a goal-oriented perspective, if an engine was stationary and the goal was a non-zero speed, this could trigger a search for a key, which also could not be done naturally with a relation. Notice, however, that only the actor tokens can influence and be influenced by the declarative nodes: the elements of the procedure itself are beyond reach. For this reason, we may still wish for code which has an explicit statement inside a graph, perhaps as chains of linked actors representing the subroutines of complex actions. In human beings, only our highest-level plans are available to our reflection; lower-level acts, those which have become automatised, as psychologists say, are not available.
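To spell this out, the actor's first-order rendering can be written in two equivalent ways. The following LaTeX is my reconstruction under the assumption that the pure actor translates as a function symbol rather than a relation; it is not the paper's verbatim formula.

```latex
% Hedged reconstruction of the actor's first-order rendering.
% Form 1: the function term appears directly inside the output predicate.
\cdots \land \mbox{MAX-DUR}\bigl(\mathit{divi}(a, b)\bigr) \land \cdots
% Form 2: the output variable is equated to the function elsewhere.
\cdots \land \mbox{MAX-DUR}(d) \land d = \mathit{divi}(a, b) \land \cdots
```

The second form is what the text means by equating the output variable to the function elsewhere in the expression.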
3 The Semi-Automatic Trap

Let us turn now to another risk facing our community: the semi-automatic trap. This is our tendency to keep building systems of logical symbols that depend at least partially on human involvement for their creation, manipulation and interpretation. That is a serious problem for builders of intelligent systems, and to a lesser degree for information-system builders and logicians. The semi-automatic trap is best understood in its historical perspective. Logic has a long history of being done by humans on paper. The symbols which are used in a typical logic have a long aetiology. Perhaps they include Greek letters, betraying their Platonic ancestry. Essentially, these symbols are passive, mnemonic devices, designed to signify objects, classes or operations by triggering the associated ideas inside human brains. They are created by human beings for human manipulation and human interpretation. For most of their history, and in most of their applications, this role for logical symbols as manual tools has been uncontroversial and unproblematic. The decades-long revolution which has placed a computer on every desk offered rich opportunities to practitioners of logic, builders of information systems and artificial intelligence researchers, among countless others. Yet the advent of computers can now be seen, with the benefit of hindsight, to have loaded the semi-automatic trap, ready to be sprung. In their ubiquitous incarnation as semi-automatic machines - that is, devices which accept symbolic input from users via a keyboard, process it, and display symbolic output for human eyes via a screen - desktop (and laptop and palmtop) computers have human dependence built in as a basic affordance. For the greater part, the software which has been developed on such machines also tends to depend on human intervention via this interface. This too seems natural enough.
But this built-in, natural dependence can work against us when we try to use those machines and logic symbols to build symbolic structures which can behave like human conceptual knowledge, for the following reasons. First, the culture of logical mnemonics carries certain unstated assumptions with it. For example, although a symbol could in principle have an extensional grain size of anything from a highly specific and nameless microfeature to a broad metaphysical category, the symbols used in practice tend to cluster around a narrow range of possible grains associated with convenient words used for their labels. The fact that concepts have to be named could push hand-built ontologies away from the microfeatural to a coarser resolution. Second, mnemonic symbols can easily be parasitic for their true meaning on the human viewer; that is, they could merely draw on associations, inference mechanisms and the like in the head of a developer, or user, instead of supplying them themselves. It is surprising how much this blinds us to the need for an interpretative component inside the system. Third, the tradition of logic still leads us to think of conceptual knowledge in terms of verifying propositions, and to assume that truth-conditional semantics might be sufficient as a denotation for any conceptual structure. This again leads to the neglect of active procedural output as part of meaning, as discussed in Section 2. Let me make the problem clearer by taking it to extremes. Consider for a moment the fact that natural existence proofs of intelligence - humans and (more problematically) animals - have no screens and keyboards. If, as we wish to maintain, symbolic conceptual structures exist within their brains, then keyboards and screens cannot be a necessary condition for intelligence. But think of the disservice that would be done to most intelligent machines - and surely every CG program - if the screen and keyboard were removed.
This will be my basic test for a system caught in the semi-automatic trap - will it continue to function and be useful with the human-dependent I/O devices disconnected? The reader may well object that this test is uncharitable. After all, animals (including humans) and computers are really very different. Animals arrive with a genetic legacy, which might include innate conceptual structures and the machinery to support them. Computers are general-purpose machines to which any structure must be supplied - and this is ultimately a screen-and-keyboard job. To make a fair comparison, then, the machine does need a screen and keyboard. Very well - let the original test be modified as follows: the builder may stand in for Nature, supplying any programs and data that the system needs, but during the design phase only. Once the program is finished (born), the screen and keyboard must come off. The test still seems absurd because we have not yet exhausted the differences between animals and computers. Animals have sense organs and bodies with actuators, and these serve as I/O channels to the world. But - and this is important to a full understanding of the issue - those channels do not depend on manipulation or scrutiny on the part of an external human, as screens and keyboards do. Even in a human being, they do not: they serve a human brain, and could be influenced by or influence an external human in various ways, but they do not depend on it. Note that if the computer had other I/O devices with this property, such as cameras, microphones, sound generators and motors, they must remain connected during our test. We only wish to disable the machine in a specific way, so that it reveals its autonomy or lack of autonomy. Am I serious when I demand that CG-based devices be useful without their screen and keyboard? Not entirely.
To disqualify screens and keyboards on the grounds that they permit a kind of epistemological cheating is to oversimplify, because it ignores the fact that these devices can also be used to stand in for more complicated, and possibly less useful, “natural” I/O such as speech and hearing. We may be very satisfied, for example, with a natural language program that communicates using ASCII text, provided we stay vigilant against other uses of the screen and keyboard which commit
humans to too much of the wrong kind of involvement. This is the kind that does not relate to natural communication with the functioning program and would not be possible if the program were a person.1 Furthermore, it oversimplifies because some purposes of logicians and information-system builders may correctly demand no more than semi-automation. A successful logic visualiser might be essentially a tool for human work, like a calculator or spreadsheet. In these cases, it could be argued, the semi-automatic trap is really no trap at all. Even here, though, much of the motivation of tool developers lies in the potential they see in computerising conceptual structures to free human users of the need to carry out the time-consuming, repetitive or difficult parts of logic work - and this means automation. Of course, individual systems might pass the test, with their designers successfully avoiding the seductive powers of the screen and keyboard, and using their computers to try to write truly automatic programs that collect their own data, form their own goals, build their own representations and draw their own conclusions for action. The trouble is that we can seem to be building systems capable of human-like conceptual processing, when what we are really doing is only building systems that help people to do so. Disguising this mistake is the fact that, at least in information-retrieval systems, even a semi-automatic program could be quite useful. In fact, some information-system builders argue that a semi-automatic "intelligence amplification" (IA) approach is the best we can hope for at present, or even that this is better than AI altogether, since it preserves a human role in these affairs. But builders of autonomous agents, intelligent reasoning systems and true (that is, non-teleoperated) robots must avoid the semi-automatic trap to build automatic machines, because these devices must function independently.
These builders cannot get away with thinking representations are all there is to knowledge. Again, we can visualise this in terms of the ability of such devices to operate without a keyboard and screen plugged in, even if they were needed to start the programs. A final barb that may hold us in the semi-automatic trap is that of power. I mean this in a more specific sense than notions of technological dominance or “knowledge is power”. If responsibility and power are opposite sides of the same coin, it follows that human involvement in the knowledge process grants us power over it. In the artificial realm of symbols on a blackboard, the logician presides as god, letting this or that assumption be true, assigning variables, defining axioms and applying stepwise procedures. In the methodology of conceptual graphs, we proclaim the existence of types which divide aspects of reality into meaningfully distinct atoms, forming the basis of an ontology. The authority for this act is that of the knowledge specialist, or domain expert. Such persons naturally occupy positions of power in our technocratic society. And if the ontologies that are derived from their proclamations were in turn to form an obligatory language to which all artistic, scientific, commercial or military participants in a knowledge-mediated discourse have to conform, the focus of power would become very tangible indeed.

1. One might argue that language is an exception, because it is essentially about communications with other human beings, and thus dependent on them. But human speech is based on vocal apparatus which can produce sounds for other purposes, and auditory sensors which are used to detect many kinds of sounds besides speech. And of course, both can operate meaningfully even when there is no external human present. The same cannot be said of a keyboard and screen.
G.A. Mann
A number of critics have warned that society could find itself mired in a kind of ideological determinism instrumented by information technology [e.g. 7]. Particularly applicable here is the risk of opportunism on the part of an organisation or company that establishes a comprehensive ontology defining all the concepts, relations, contexts and actors for an industry as a bid for control of that industry. If one powerful organisation controlled these elements, it could be difficult or impossible for outsiders to get new ideas recognised, or to communicate ideas which fell outside those terms of reference. The concern is that the organisation could disguise its attempt to monopolise the discourse by claiming that it was simply establishing a useful knowledge standard. But engineering standards for knowledge interchange should not be permitted to lead towards standardised knowledge content. Our community should be on its guard about this, so that our innovations do not become yet another pillar of inequality. If as we oversee the construction of large, shareable ontologies we are to avoid the twin evils of hard labour and ideological monopoly, we must become willing to take our hands off the levers of power to some degree. Since humans and their institutions tend to seek to consolidate power, not relinquish it, this barb can be expected to be the sharpest of all.
3 Fully Automated Acquisition

In the CG community, we tend to neglect knowledge acquisition. In a set of 148 papers from ICCS meetings of the last five years, only 17 mentioned knowledge acquisition at all, and fewer discussed the topic in any detail. Concepts and relations appear in the catalogues of practically all CG systems as a matter of design-time proclamation, which is to say, human judgement. Neglecting other means of gathering knowledge tends to lock in proclamation as the only method of establishing knowledge using CGs. That may discourage experienced knowledge engineers looking for improved technologies for their craft, who know that manual knowledgebase creation and maintenance is potentially so labour-intensive that it may make their systems unacceptably costly. Perhaps CG developers are beginning to see how crucial this issue is: whereas only 3 of the 56 papers in 1992 concerned this topic, this had risen to 9 out of 48 papers by 1997.

Now consider what the semi-automatic trap has to teach us about knowledge acquisition. If, as in the above reductio ad absurdum, conceptual graph systems were not permitted to use a screen and keyboard for more than the initial system specification, would knowledge acquisition become impossible? Evidently not, because humans and animals can learn. But, for the sake of argument, how would it change under this restriction? The term “knowledge acquisition”, as it is conventionally used, means encoding knowledge which has come from asking human beings (experts). Sometimes it takes on a slightly broader sense, in which the knowledge may come from other sources like models, or books. “Asking a human” sounds like a natural process, available to a person and not likely to be disabled by removing the screen and keyboard from a computer.
But of course, what almost always actually happens is that another person elicits the knowledge from the expert, casts the knowledge into the representation formalism by hand and then types it into the system by screen and keyboard. This would not be possible in a system in which these two devices were disconnected, that is, once the system was deployed.
We could, of course, imagine a system in which deployment was delayed until all the knowledge it would ever need was built-in at design time. This would be a model of totally innate knowledge. New knowledge could be had within the deployed system, but only by derivation from the innate supply. Perhaps, with great foresight on the part of the designer, such a knowledge system would be adequate. By analogy, most commercial programs are sold without their source code, with all information they need sealed into the executable code. But this strategy seems inherently risky. A system closed off from external change seems to lack an essential flexibility. It is precisely this sort of inflexibility that makes us seek more advanced forms of software than conventional programs. The alternative would be to use I/O devices which are allowed to remain connected (or, in the liberal interpretation of the rule, to use the screen and keyboard for natural exchange only). To do that, the human knowledge elicitor must be replaced. This means creating not only a natural language interface, but also a program to conduct the elicitation process, automatically generating the graphs for expression as questions, and a method for automatically dealing with the conceptual graphs that result from the parsing of the expert’s responses. Since I have written elsewhere about parsing and learning by asking [10], I will focus here on this method.
[Figure: a syllabus of teaching-language expressions feeds an interpreter and a skill-learning module, which update the SKB, EKB and PKB knowledgebases.]
Fig. 2. A teachable CG knowledge machine.
Figure 2 describes a CG machine containing three knowledgebases:
SKB, the semantic knowledgebase, consisting of conceptual, relational and actor hierarchies.
EKB, the episodic knowledgebase, a sequential list of conceptual graphs, representing the conceptual history of the system.
PKB, the procedural knowledgebase, containing the source for actors defined in the system. This could be as simple as a set of listings of Lisp functions.
The acquisition process depends on a special teaching language in which a range of operations on these databases may be expressed in simple fashion. Such a language would be something like the Structured English Interface for the Deakin Toolset [11], except that additions and changes to the conceptual hierarchies as well as the construction of graphs based on them would be allowed. The procedural knowledgebase
would need a theoretically different set of techniques for skill learning, which will not be addressed here. For example, the following sequence of expressions sets up a situation in which a hungry bear is inside a cave, beginning with no concept of bear or cave:

NEWCON  BEAR < ANIMAL
DEFCON  BEAR (ATTR FURRY ATTR COLOUR:Brown LOC PLACE EATS ANIMAL)
NEWCON  CAVE < LANDMARK
DEFCON  CAVE (CHRC DARK CONT ROCKS CONT BATS)
BEAR1   JOIN (BEAR EXPR HUNGER)
G105    JOIN (CAVE CONT BEAR1)
ASSERT  G105
The "interpreter" (in the specific technical sense of a parser-executor of strings in an artificial language) of Figure 2 chooses appropriate operators such as copy, restrict, or join and decides how to apply these to update the SKB and EKB. In most cases, the opcode of an instruction should be enough to select the correct knowledgebase, since the operations appropriate to hierarchy-building and episodic memory are different. Instructions concerning operations on individual graphs would use local variables to hold the graphs until a command sent the graph to a specified knowledgebase. The course of knowledge to be learned will be introduced as a specially prepared sequence of expressions in this language, called a syllabus. The syllabus would be written by hand, which might at first glance seem to defeat the notion of automating acquisition. However, until we are prepared to construct much more sophisticated perceptual devices (such as a camera which returns conceptual graphs describing objects and events in its field of view), the data which informs the learning process must come from such human-mediated sources. Since advanced raw-data-to-CG converters seem far off, our efforts to reduce human involvement are compromised. It might instead be hoped that the simple teaching language, beginning from a form like that described above, would evolve with experience, so that frequently recurring patterns of operations were eventually chunked up into powerful elements of a higher-level language. Ideally, the high-level form would both decouple the content of the material from the operations, allowing syllabuses to focus primarily on content, and be shorter and easier to write.
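The dispatch described above - the opcode alone selects the target knowledgebase, and labels name local variables that hold graphs until an ASSERT commits them - could be sketched as follows. The parsing and the data shapes here are assumptions made for illustration; a real interpreter would manipulate genuine conceptual graphs rather than strings:

```python
# Sketch of the teaching-language interpreter (data shapes are assumptions).
OPCODES = {"NEWCON", "DEFCON", "JOIN", "ASSERT"}

skb = {"concepts": {}, "definitions": {}}   # semantic knowledgebase
ekb = []                                    # episodic knowledgebase (graph history)
local_vars = {}                             # graphs held until an ASSERT

def execute(line):
    head, rest = line.split(maxsplit=1)
    if head in OPCODES:                     # no label: e.g. "NEWCON BEAR < ANIMAL"
        op, label = head, None
    else:                                   # labelled: e.g. "G105 JOIN (...)"
        label = head
        op, rest = rest.split(maxsplit=1)
    if op == "NEWCON":                      # hierarchy change -> SKB
        child, _, parent = rest.split()
        skb["concepts"][child] = parent
    elif op == "DEFCON":                    # type definition -> SKB
        name, body = rest.split(maxsplit=1)
        skb["definitions"][name] = body
    elif op == "JOIN":                      # build a graph in a local variable
        local_vars[label] = rest
    elif op == "ASSERT":                    # commit a graph -> EKB
        ekb.append(local_vars.pop(rest))

syllabus = [                                # abbreviated from the example above
    "NEWCON BEAR < ANIMAL",
    "DEFCON BEAR (ATTR FURRY)",
    "BEAR1 JOIN (BEAR EXPR HUNGER)",
    "G105 JOIN (CAVE CONT BEAR1)",
    "ASSERT G105",
]
for line in syllabus:
    execute(line)
```

The point of the sketch is the routing: hierarchy-building opcodes touch only the SKB, while ASSERT moves a locally held graph into episodic memory.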
4 Fully Automated Interpretation

Given that the goal is creating a system for intelligent reasoning and not a tool for manipulating logic, one way to test that a system is not semi-automatic-trapped is to systematically replace all the mnemonics in the graphs - the type labels in the concepts, relations, actors and contexts - with “blind” labels such as random combinations of characters. If this disadvantages or disables the run-time system, it means that the symbols in the graphs are parasitic on the meanings in the user’s head, and so the system is not fully automated. In practice, it would be useful to be able to switch between the arbitrary labels and mnemonic ones, since the mnemonics are legitimate and useful for design and debugging. Being able to turn off the mnemonics at run-time would force attention away from the graphs and onto the human-read/writable forms with which users of the system will deal.
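Such a label-blinding experiment could be sketched as follows. The simplified graph syntax and the assumption that type labels are runs of uppercase letters are illustrative simplifications, not a CG-faithful encoding:

```python
# Sketch of the "blind label" test: replace every mnemonic type label with
# an arbitrary string, keeping the replacement consistent across the whole
# knowledgebase so that only the mnemonics, not the structure, are lost.
import random
import re
import string

def blind_labels(graphs, seed=0):
    rng = random.Random(seed)          # seeded so the blinding is repeatable
    mapping = {}
    def blind(match):
        label = match.group(0)
        if label not in mapping:       # same mnemonic -> same blind label
            mapping[label] = "".join(rng.choice(string.ascii_uppercase)
                                     for _ in range(6))
        return mapping[label]
    # labels are assumed to be runs of uppercase letters, e.g. BEAR, ATTR
    blinded = [re.sub(r"[A-Z]+", blind, g) for g in graphs]
    return blinded, mapping

graphs = ["(CAVE CONT BEAR)", "(BEAR EXPR HUNGER)"]
blinded, mapping = blind_labels(graphs)
# BEAR receives the same arbitrary label in both graphs, so any purely
# mechanical processing over the graphs is unaffected by the blinding.
```

A system that still functions over the blinded graphs passes the test; one that needs the mnemonics restored was relying on meanings in the user's head.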
Let us assume, for convenience, that the form for users of a hypothetical system is natural language, and use that to explain what is needed to automatically interpret graphs. Section 2 argued that a truth-conditional semantics is not enough. In [10], I expressed this as the need to move beyond truth-preserving algorithms to plausibility-preserving heuristics. What does this mean? Imagine that the natural language system is asked two questions:

1. “Can a rabbit fly?”
2. “Can you arrange the names in file “customers.txt” alphabetically and send that to printer1?”
To answer 1, a truth-conditional, truth-preserving approach can be tried. Once the pragmatic component of the system has recognised the form as a question, an attempt to join the definitions of [RABBIT] and [FLY] can be made, and since selectional constraints in the definition graphs should reject an attempt to fit two incompatible graphs together, an answer of “no”, based on whether the join algorithm was successful, can be returned. Perhaps, though, it would be more convivial to return either the successfully joined graph or an error message that reported what had blocked the join. This would avoid the embarrassment of a simple “yes” answer, in the event that the system had deduced that a rabbit was a suitable patient for transport by aeroplane. Answering 2 properly, by contrast, requires a plausibility-preserving, procedural heuristic approach. First, assume that the only procedure available which could possibly alphabetise the file is a generalised Sort function. The word “arrange” is insufficient to choose an operation, so the system searches all available acts for a suitable match. The way the match is performed is crucial to success here; it must be quite liberal if it is to cope with the many possible ways in which it might properly be summoned. Assuming the Sort function was found to have an optional parameter called “alphabetical”, then it might be appropriate to use Sort(alphabetical, customers.txt). We could not know this kind of relationship with the certainty required for a truth-preserving algorithm. A heuristic enforcing only plausibility, on the other hand, could take such a liberty, and thus only it could succeed in this case. Second, in order to avoid the pragmatic howler of returning a yes-or-no answer to 2, the system must actively carry out first the sort and then the print operation. The “and” linking the two is not a Boolean conjunction, but a conditional link ordering two tasks.
Successfully recognising the two clauses as a pipelined print operation and performing it is the interpretation of the question. If unsuccessful, some kind of error message representing an explanation would then be appropriate, as in 1.
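The liberal matching step described for question 2 might be sketched like this. The Sort act and its "alphabetical" parameter follow the example in the text, but the act catalogue and the scoring rule are assumptions made for illustration:

```python
# Sketch of a plausibility-preserving match from a verb phrase to an
# available act: overlap between the request's words and an act's known
# synonyms and parameters stands in for a genuine liberal graph match.

AVAILABLE_ACTS = {
    "Sort":  {"params": ["alphabetical", "numeric"],
              "synonyms": {"arrange", "order", "sort"}},
    "Print": {"params": ["printer"],
              "synonyms": {"print", "send"}},
}

def match_act(verb, modifiers):
    """Pick the act whose synonyms and parameters best overlap the request."""
    best, best_score = None, 0
    for name, act in AVAILABLE_ACTS.items():
        score = (verb in act["synonyms"]) + sum(m in act["params"]
                                                for m in modifiers)
        if score > best_score:
            best, best_score = name, score
    return best     # None means no plausible candidate was found

# "arrange ... alphabetically" selects Sort even though the words differ
act = match_act("arrange", ["alphabetical"])
```

No truth-preserving algorithm could license the step from "arrange" to Sort; a scoring heuristic like this one only claims the match is plausible.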
5 Conclusions

By making provision for actors, CG theory has already prepared the way for progressing beyond the notion that description is all there is to representation. To address the shortcomings of formal knowledge representations which do not recognise the significance of procedural aspects of knowledge, more attention must be paid both to the tokens which mark their presence and to the active processes which substantiate those tokens. These processes cannot be completely divorced from the pragmatics of a CG system. It will not be enough to continue developing algorithms which manipulate CGs, without somehow recognising these explicitly within the graphs
themselves. Ideally, the entire codification of an active process would appear in a conceptual graph, but this complicates the notation. At least actor nodes with the same names as coded procedures outside the system should be allowed for. Those interested in the adoption of this formalism as a knowledge standard must also now progress beyond the notion that representations are all there is to knowledge. To create working knowledge for intelligent systems, it is important not to perpetuate passive symbols designed for human use. This engenders a kind of introspection which is compelling to system builders, and potentially aversive to commercial developers. More seriously, it carries the risk of building too much human involvement into the system. The semi-automatic trap is a gedankenexperiment designed to reveal this risk. By asking how knowledge systems would function without their screens and keyboards, it reminds us that these conduits can sometimes work against the development of true automated reasoning. Two escapes from the semi-automatic trap are briefly discussed: fully automated acquisition and fully automated interpretation. In the case of knowledge acquisition, too much human involvement is problematic because it commits the system builder to large amounts of collection and maintenance work on the knowledgebase. It also risks granting a great deal of power to any highly-resourced organisation which is able to manually create a large ontology. It would be preferable to eliminate the human elicitor in knowledge acquisition. Although the suggested teaching experiment does not accomplish this, it may be a step in the right direction. In the case of interpretation, the ease with which humans take on the role of interpreter of symbols makes the human-readability of CGs a double-edged sword. Therefore I suggested that the mnemonic labels inside the nodes be able to be switched off, so that any parasitism of the system may be exposed.
Our artificial reasoning systems will be better able to cope with the vagaries of real tasks when they use plausibility-preserving heuristics instead of truth-preserving algorithms. Truth preservation is important for maintaining the canonicity of true graphs during arbitrary transformations, but it might block sensible but unsound steps in the reasoning process. Such steps could be ubiquitous in commonsense thinking.
References

1. Brachman, J. et al. Krypton: A Functional Approach to Knowledge Representation. IEEE Computer, 1983, 16, 10, 67-74.
2. Buchler, J. Charles Peirce’s Empiricism. New York: Harcourt, Brace & Co., 1939.
3. Burk, A.W. The Collected Papers of Charles Sanders Peirce. Vol. 5.
4. Ellis, G. Object-oriented Conceptual Graphs. In G. Ellis, R. Levinson, W. Rich and J.F. Sowa (Eds.) Conceptual Structures: Applications, Implementation and Theory. Lecture Notes in AI 954, Springer-Verlag, Berlin, 1995, 114-157.
5. Hewitt, C. et al. Knowledge Embedding in the Description System Omega. Proceedings of the First National Conference on Artificial Intelligence, Stanford, CA, 1980, 157-164.
6. Hiebert, J. Conceptual and Procedural Knowledge in Mathematics: An Introductory Analysis. In J. Hiebert (Ed.) Conceptual and Procedural Knowledge: The Case of Mathematics. Hillsdale, NJ: Lawrence Erlbaum Assoc., 1986, pp. 1-27.
7. Lacroix, G. Technical Domination and Techniques of Domination in the New Bureaucratic Processes. In L. Yngstrom et al. (Eds.) Can Information Technology Result in Benevolent Bureaucracies? The Netherlands: Elsevier Science Publishing Co., 1985, 173-178.
8. Lenat, D. et al. CYC: Towards Programs with Common Sense. Communications of the ACM, 1990, 33, 30-49.
9. McCarthy, J. Recursive Functions of Symbolic Expressions and their Computation by Machine, Part 1. Communications of the ACM, 1960, 3, 4.
10. Mann, G.A. Control of a Navigating Rational Agent by Natural Language. PhD thesis, University of New South Wales, 1996.
11. Munday, C., Sobora, F. & Lukose, D. UNE-CG-KEE: Next Generation Knowledge Engineering Environment. Proceedings of the 1st Australian Knowledge Structures Workshop. Armidale, Australia, 1994, 103-117.
12. Nebel, B. & von Luck, K. Hybrid Reasoning in BACK. In Z.W. Ras and L. Saitta (Eds.) Methodologies for Intelligent Systems, Vol. 3. North-Holland, Amsterdam, The Netherlands, 1988.
13. Rochowiak, D. A Pragmatic Understanding of “Knowing That” and “Knowing How”: The Pivotal Role of Conceptual Structures. In D. Lukose, H. Delugach, M. Keeler, L. Searle & J.F. Sowa (Eds.) Conceptual Structures: Fulfilling Peirce’s Dream. Lecture Notes in AI 1257, Springer-Verlag, Berlin, 1997, 25-40.
14. Sowa, J.F. Conceptual Structures. Menlo Park, CA: Addison-Wesley Publishing Company, 1984.
15. Sowa, J.F. Conceptual Graph Summary. In T.E. Nagle et al. (Eds.) Conceptual Structures: Current Research and Practice. Chichester: Ellis Horwood, 1992, 339-348.
16. Sowa, J.F. Logical Foundations for Representing Object-Oriented Systems. Journal of Theoretical and Experimental Artificial Intelligence, 1993, 5.
17. von Neumann, J. The Computer & the Brain. New Haven: Yale University Press, 1958.
Ontologies and Conceptual Structures William M. Tepfenhart AT&T Laboratories 480 Red Hill Rd Middletown, NJ 07748 [email protected]
Abstract. This paper addresses an issue associated with representing information using conceptual graphs: the great variability in the approaches that individuals take to the conceptual graph representation and the ontologies employed. This variability makes it difficult for individual authors to use the results of other authors. This paper lays out these differences and their consequences for the ontologies. It compares the ontologies and representations used in papers presented at the International Conference on Conceptual Structures in 1997. This comparison illustrates the diversity of approaches taken within the CG community.
1 Introduction

One of the problems about reading papers on conceptual structures is that there are almost as many different approaches to conceptual structures as there are authors. In the original book by Sowa [1], he described three basic representational elements: concepts, conceptual relations, and actors. Since then, other authors have modified concepts, conceptual relations, and actors in very different manners -- different in terms of how they are defined and used. In addition, there are at least four graph types: simple graphs, nested graphs, positive nested graphs, and actor graphs. These differences make comparison between papers difficult and at times impossible. However, there is an even worse problem: these differences are fracturing the conceptual graph community along multiple lines. This paper does not attempt to unify all of the different approaches. Such a task is difficult and the effort involved tremendous. It is not even clear that the result would be of value to any but a few. Instead, this paper lays out certain fundamental differences in the various approaches to conceptual graphs. Using the results of this paper, the interested reader will understand how to interpret papers based on very different sets of premises and perhaps be more forgiving to those who have chosen a different approach.

M.-L. Mugnier and M. Chein (Eds.): ICCS’98, LNAI 1453, pp. 334-348, 1998. © Springer-Verlag Berlin Heidelberg 1998
The basic elements for which differences are identified in this paper are: the descriptive emphasis, the definitional information, the conceptual grounding, the processing approaches, the ontological structures, and the knowledge structures. As will be shown, there are many degrees of freedom in how one can combine all of these different elements. This paper will not argue which combinations of these elements are meaningful, although it might seem to some readers that some are not. The next six sections of this paper address the following topics:

• descriptive emphasis - what aspects about the world are stressed most in the ontology and how that affects where and how concepts are defined.
• definitional information - what information is captured in the definition of concepts and how that information is to be used.
• conceptual groundings - the semantic basis on which the meaning of the concept is founded.
• processing approaches - how information captured within conceptual graphs is processed and the implications in terms of how concepts are defined.
• ontological structures - how concepts are arranged in a type structure and the kinds of processing that can be performed over it.
• knowledge structures - the graph structures which individuals use to express information and how that structure influences the ontology.

Each section describes the element and gives examples of the approach. This is followed by a section that classifies individual papers according to the ontological assumptions on which they are based. The paper concludes by giving a summary of the results presented here.
2 Descriptive Emphasis

One element contributing to an ontology is the descriptive emphasis. The descriptive emphasis is the part of the physical world that is stressed most within the ontology and knowledge structures. Some descriptions focus on the state while others focus on the act. The distinction between the two is significant in terms of the kind of information captured, the types of operations that are performed over them, and the kinds of inferences that can be achieved. In fact, the different emphasis controls what kinds of information must be derived from a given graph and knowledge base versus what information is trivially extracted.
2.1 State

An ontology that emphasizes state concentrates on things and the relationships among them. Actions are expressed as changes in state and are characterized by an initial state, a final state, and the act that links the two. The ontology, of course, supports this kind of treatment directly. An example of this for ‘A Cat Sat On A Mat’ is,
[[Cat: *] -> (on) -> [Mat: *]
          -> (posture) -> [Standing: *]]
    -> <Sit> ->
[[Cat: *] -> (on) -> [Mat: *]
          -> (posture) -> [Sitting: *]]

In this example, the initial state is one in which a cat is on a mat in a standing position; the final state is one in which a cat is on a mat in a sitting position; and the link between the two is an actor that represents the movement of the cat into a sitting position. The use of an actor <Sit> expresses the semantics of an active relation although, as will be discussed in a later section, actors are not the only (computational) mechanism to express the changes that are taking place. The ontology reflects this way of viewing the physical world by having states and the objects within them captured as concepts. The concepts are defined within the concept type structure. The relationships between objects within a state are captured as conceptual relations which are defined within the relation type structure. Relationships between states are captured as active relations which can be defined as a relation within the relation type structure or as actors within an actor type structure.

2.2 Act

An ontology that emphasizes acts concentrates on the transitions and the roles that things play within them. Actions that take place are characterized by the subject of the act, the recipient of the act, the location it took place, and the manner in which the subject executed the act. An example of this is,

[Sat: *] -> (agent) -> [Cat: *]
         -> (location) -> [Mat: *]

In this case, the act is expressed by [Sat: *], where the agent of the act is given by [Cat: *] and the location is given by [Mat: *]. The ontology, of course, supports this kind of treatment directly. Here the objects involved and the act are expressed as concepts which are defined within the concept type structure. The relationships between the act and the participants are expressed as conceptual relationships which are defined within the relation type structure.
3 Definitional Information

Sowa [1] states that a concept type is defined by

type a(x) is u

where a is the type label, the body u is the differentia of a, and type(x) is called the genus of a. An example of this is,

type CIRCUS-ELEPHANT(x) is
    [ELEPHANT: *x] <- (AGNT) <- [PERFORM] -> (LOC) -> [CIRCUS]
While it would appear that this definition is rather clear, it is in the treatment of the differentia that authors differ tremendously. There are two approaches: the differentia is treated as a predicate or as a prototype. In some cases, authors do not use the definitional mechanism at all. Another factor complicates understanding the ontology behind many of the papers about conceptual graphs: many authors do not give definitions for the concepts and conceptual relations employed in their papers. One must assume that they have some sort of definition that is obvious to them. More significantly, these authors do not exploit definitions or describe how they are used when processing graphs.
3.1 None

While some authors assume but do not give definitions for conceptual elements, there are others who do not employ the Sowa definitional mechanism in any form or fashion. For these authors, definitional information is given by the placement within some network of connected nodes. The definition is established by all nodes to which it is linked by way of relationships.
3.2 Logic

One approach treats the differentia as a predicate which is evaluated according to the rules of logic. In this approach, the definitional graph is the predicate. If the necessary information is available for the predicate to evaluate to true, then this is deemed sufficient to establish the type for an individual.
3.3 Prototype

In another approach, the definitional graph is treated as a prototype. Individuals are treated and described as instances. The attributes associated with an individual are those given within the graph that constitutes the differentia plus those inherited from parent types. In a sense, this approach is compatible with the class systems used in object-oriented systems.
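A sketch of this prototype reading, in which an instance's attributes are its own type's differentia plus everything inherited up the genus chain, might look as follows. The type names and attribute dictionaries are illustrative assumptions standing in for genuine definitional graphs:

```python
# Sketch of prototype-style inheritance of definitional attributes.
TYPES = {
    "ANIMAL":          {"parent": None,       "attrs": {"animate": True}},
    "ELEPHANT":        {"parent": "ANIMAL",   "attrs": {"trunk": True}},
    "CIRCUS-ELEPHANT": {"parent": "ELEPHANT", "attrs": {"performs": True}},
}

def attributes(type_name):
    """Walk up the genus chain; attributes nearer the leaf override parents'."""
    chain = []
    while type_name is not None:
        chain.append(TYPES[type_name]["attrs"])
        type_name = TYPES[type_name]["parent"]
    merged = {}
    for attrs in reversed(chain):   # apply root first, leaf last
        merged.update(attrs)
    return merged
```

This mirrors the class systems mentioned above: a CIRCUS-ELEPHANT instance carries its own differentia plus everything ELEPHANT and ANIMAL contribute.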
4 Conceptual Grounding

Ogden and Richards, in [2], established the meaning triangle as a means for expressing the relationships among symbols, concepts, and referents. The meaning triangle is illustrated in the figure below. In the lower left corner is the symbol which corresponds to the linguistic element of a word. In the lower right corner is the
referent which is related to the object. The top of the triangle is the concept which serves to link the symbol and the referent. The direct link between symbol and referent is actually a virtual link.

[Figure: the meaning triangle, with Concept at the apex, Symbol at the lower left corner and Referent at the lower right corner]
Certain relationships need to be explained in this figure. In particular, the link between the symbol and the concept is an invokes relationship. The idea here is that the symbol invokes in the mind of an individual the concept. Alternatively, one may view the link as the symbol expresses the concept. The relation between the referent and the concept is a little more complex. The referent is observed and expressed as a percept. The percept is then interpreted as a concept.
4.1 Percept

A percept based approach exploits the assertion that a concept is the interpretation of a percept which is the result of some sensation of an object. As a result, developing an ontology and grounding the semantics of concepts in the physical world is a matter of studying objects and how we observe them. Abstract concepts are introduced as a result of computational operations over perceptually grounded percepts. In the process of grounding the semantics of a concept in percepts, the investigator must necessarily concern themselves with the nature of sensors and actuators. That is, in order to ground the concept in the physical we have to understand how we perceive the physical and can interact with it.
4.2 Linguistic

A linguistic based approach to grounding the semantics of a concept is based on the view that the concept is the result of an invocation by a symbol. Hence, for every symbol there is some meaningful concept. By studying natural languages, we can get a very detailed view of the concepts and the type relationships that exist among them.
5 Processing Approaches

Given a conceptual graph, what does one do with it? There are three major approaches to processing conceptual graphs. One approach has its roots in semantic networks,
another in predicate logic, and a third in procedure. These three different approaches have significant effects on how concepts are defined. In reading papers on conceptual graphs, it is clear that many of them do not describe how the graphs are to be processed. The papers concentrate on capturing some domain or natural language. There is little focus on what one does with the knowledge once captured.
5.1 Semantic Network

In this approach, graphs containing referents are treated as activations within a semantic network. Processing the graph is a matter of activating connected nodes until some query is resolved.
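A minimal sketch of this style of processing, spreading activation outward from a start node until a target node lights up, might look as follows. The toy network and its links are illustrative assumptions, not a CG-faithful encoding:

```python
# Sketch of spreading activation over a toy semantic network.
LINKS = {
    "CAT":    {"ANIMAL", "MAT"},
    "MAT":    {"CAT"},
    "ANIMAL": {"CAT", "BEAR"},
    "BEAR":   {"ANIMAL"},
}

def activate(start, target, limit=5):
    """Spread activation from `start`; succeed if `target` becomes active."""
    frontier, seen = {start}, {start}
    for _ in range(limit):              # bound the number of activation waves
        if target in seen:
            return True
        frontier = {n for f in frontier for n in LINKS[f]} - seen
        seen |= frontier
        if not frontier:                # nothing new activated: give up early
            break
    return target in seen
```

A query is "resolved" when activation from its constituent nodes reaches the queried node within the allowed number of waves.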
5.2 Predicate Logic

In this approach, graphs containing referents are assertions that are made on sheets of assertion. Processing occurs as the result of asserting a query, which takes the form of a conceptual graph with existential quantifiers. The resolution of the query graph, together with the bindings that make the graph true, is the result returned. This approach is very much like a Prolog style of programming.
5.3 Procedure The arrival of a graph in a working memory area is treated as an event. The event triggers processing in the form of an actor firing. The actor may be defined such that it invokes additional actors to fire as a result of a change in the input graph. Actors modify the input graph until some stop condition is reached.
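The actor-firing cycle can be sketched as an event loop. This is a hypothetical illustration under simplifying assumptions (the graph is a mutable dictionary, an actor is a guard/effect pair), not the design of any cited system.

```python
# Illustrative event loop for the procedural style: the arrival of a graph
# in working memory triggers actors; each firing may modify the graph and
# thereby enable further actors, until no actor fires (the stop condition).
def run_actors(graph, actors, max_rounds=100):
    """graph: mutable dict of attributes; actors: list of (guard, effect)."""
    for _ in range(max_rounds):
        fired = False
        for guard, effect in actors:
            if guard(graph):
                effect(graph)   # actor fires and modifies the input graph
                fired = True
        if not fired:
            return graph        # quiescence: the stop condition is reached
    raise RuntimeError("no quiescence reached")

# A single hypothetical actor: when a move is requested, put the cat on the mat.
actors = [
    (lambda g: not g["cat_on_mat"] and g["move_requested"],
     lambda g: g.update(cat_on_mat=True, move_requested=False)),
]
state = {"cat_on_mat": False, "move_requested": True}
print(run_actors(state, actors))  # {'cat_on_mat': True, 'move_requested': False}
```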
6 Ontological Structures Integral to the whole concept of an ontology is the type structure. In the various approaches, the resulting structures have different characteristics: operations that make sense in one do not make sense in another. The choice of one approach over another has significant consequences for the kinds of processing that must be performed in classification or in establishing co-reference. The type structure is a partial ordering of the concepts based on type-of relations. Where individual authors disagree is on the nature of the type-of relation. In some cases, authors allow a concept to exist in more than one type-of relation with others; others restrict a concept to a type-of relation with exactly one other concept. The result is that in some approaches the type structure is a hierarchy, while in others it is a lattice.
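The distinction just drawn can be made concrete with a small sketch: representing type-of as a mapping from each type to its supertypes, a hierarchy allows at most one parent per type, while a lattice allows several; subtype checking is reachability in the partial order. The example types are hypothetical.

```python
# Sketch of the type structure as a partial order induced by type-of.
def is_hierarchy(parents):
    """parents: dict mapping each type to the list of its supertypes.
    A hierarchy (tree) allows at most one parent per type."""
    return all(len(ps) <= 1 for ps in parents.values())

def is_subtype(parents, t, ancestor):
    """Check t <= ancestor in the partial order induced by type-of."""
    if t == ancestor:
        return True
    return any(is_subtype(parents, p, ancestor) for p in parents.get(t, []))

tree    = {"CAT": ["ANIMAL"], "ANIMAL": ["ENTITY"], "ENTITY": []}
lattice = {"PET-CAT": ["CAT", "PET"], "CAT": ["ANIMAL"], "PET": ["ROLE"]}
print(is_hierarchy(tree))                      # True
print(is_hierarchy(lattice))                   # False: PET-CAT has two parents
print(is_subtype(lattice, "PET-CAT", "ROLE"))  # True, via PET
```

Operations such as least-upper-bound computation are well defined on a lattice but not, in general, on an arbitrary multiple-parent structure, which is one reason the choice matters for processing.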
340
W.M. Tepfenhart
6.1 Type Hierarchy In a type hierarchy, the ontology type structure is captured in the form of a tree. Even within the community that employs type hierarchies there is some disagreement about how individuals stand in relation to the hierarchy. In some approaches, a particular individual can be classified as only a single leaf type of the hierarchy (this allows it to be classified as any of the parent types of the leaf). The other approach allows an individual to be classified as multiple subtypes of a single concept.
6.2 Type Lattice In a type lattice, the ontology type structure is captured in the form of a lattice. At the top of the lattice is the universal type and at the base of the lattice is the absurd type. In this approach both natural types and role types appear in the lattice, with concepts able to inherit from both.

[CAT: *] -> (ON) -> [MAT: *]
7 Knowledge Structures Sowa’s original book [1] described the basic conceptual graph, which has become known as the simple graph, as well as nested graphs for capturing logical expressions. Since then, additional graph types have been added. These additional graphs include positive nested graphs and actor graphs.
7.1 Simple Graphs Simple graphs are bipartite directed graphs containing only concepts and conceptual relations. They are the most basic graph employed in the community. An often-cited example is

[PERSON] <- (CHLD) <- [MOTHER]
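One possible in-memory encoding of a simple graph makes the bipartite property explicit: concept nodes and relation nodes are kept in separate tables, and arcs may only join a node of one kind to a node of the other. The identifiers and representation below are assumptions for illustration, building the often-cited example above.

```python
# Illustrative bipartite encoding of a simple graph: concepts and relations
# are disjoint node sets, and every arc joins a concept to a relation.
concepts  = {"c1": "PERSON", "c2": "MOTHER"}
relations = {"r1": "CHLD"}
# Arcs run concept -> relation or relation -> concept, never concept -> concept:
# this encodes [MOTHER] -> (CHLD) -> [PERSON].
arcs = [("c2", "r1"), ("r1", "c1")]

def is_bipartite(arcs, concepts, relations):
    """Every arc endpoint is exactly one kind, and the two endpoints differ."""
    return all((a in concepts) != (a in relations) and
               (b in concepts) != (b in relations) and
               (a in concepts) != (b in concepts)
               for a, b in arcs)

print(is_bipartite(arcs, concepts, relations))  # True
```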
7.2 Nested Graphs Nested graphs are intended to express the positive and negative propositions needed in first-order logic. The following illustrates how to capture the statement, “Every person has a mother”, using two nested negative contexts:

¬[ [PERSON: *x]  ¬[ [*x] <- (CHLD) <- [MOTHER] ] ]
7.3 Positive Nested Graphs Positive nested graphs are similar in structure to nested graphs, with the exception that negations are not allowed. They also differ in the sense of the processing that is performed over them. In this case, the nested graph is part of the referent field of some concept which participates in the outer graph. The nested graph provides greater description of an individual without complicating the exterior graph with the additional details. An example of a positive nested graph (taken partially from [6]):

[PERSON: Peter: **] <- (AGN) <- [THINK: *: **] -> (SUBJ) -> [PAINTING: A: [SCENE: *: **] -> (ATT) -> [BUCOLIC: *: **]]
7.4 Actor Graphs Actor graphs are graphs that incorporate actors in addition to concepts and conceptual relations. The actors are treated as active relations which have the potential to modify a graph. An example of an actor graph for the action that a cat moved onto a mat:

[CAT: x] -> (ON: F) -> [MAT: y]
          <MOVE>
[CAT: x] -> (ON: T) -> [MAT: y]
8 A Sampling of Approaches In this section of the paper, the approaches demonstrated in the papers presented at the International Conference on Conceptual Structures 1997 are mapped against the various ontological styles described in the previous sections. Individual authors may find that their work has been misclassified in this paper. The fact that misclassification of an author's approach is likely is mentioned for a specific reason: it makes the point that understanding the ontology employed by various authors is difficult, particularly since few authors bother to explain their ontology or the assumptions they make in constructing it. In some cases, it is exactly this information that is necessary for a different author to use their results. In the tables that follow, care has been taken to characterize papers accurately. A number of papers which appear in the tables will not have entries associated with them in some or all of the main columns. After serious consideration, it was decided that the lack of information in some papers about the ontological basis and the concepts being processed constitutes a data point just as valid as when it was stated. This further underscores the need for such information to be stated.
[Table: papers [3]-[38] classified by Descriptive Emphasis (State, Act) and Definitional Emphasis (None, Predicate, Prototype); the individual row entries could not be recovered from the extracted layout.]
In the table that follows, the conceptual grounding and processing approaches are compared. One will notice in looking at this table that there are a number of papers for which there are no entries. This is not a mistake; some papers do not include information about these topics.
[Table: papers [3]-[38] classified by Conceptual Grounding (Percept, Linguistic) and Processing Approach (Semantic, Predicate, Procedure); the individual row entries could not be recovered from the extracted layout.]
In the table that follows, there are several unusual entries. In some cases, there is an ontological structure defined but not a knowledge structure. That is, some papers only describe the type structure and not how the concepts are to be used within graphs. In other cases, the knowledge structures are defined without any reference to the ontological structure.
[Table: papers [3]-[38] classified by Ontological Structure (Hierarchy, Lattice) and Knowledge Structure (Simple Graphs, Nested Graphs, Positive Nested Graphs, Actor Graphs, Gamma Graphs, Fuzzy Graphs); the row-by-row assignments could not be recovered from the extracted layout.]
9 Summary This paper has attempted to outline some of the fundamental ideas that form the basis of research efforts in the area of ontologies and conceptual structures. It is hoped that the preceding sections have conveyed the difficulty of comparing and contrasting papers on conceptual structures, particularly with regard to their ontological foundations. The variety of approaches, processing styles, and assumptions makes it difficult for one author to apply the results of another. One result is that much effort is being spent solving the same problem several times because the language in which the problem is framed appears to be different. At the least, this paper gives an outline by which technical disagreements and discussion can be focused. That is, we can use the results presented in this paper as a means to argue about whether a predicate or procedural approach is most appropriate for conceptual graphs. If the different approaches are all appropriate, then when does one approach work better than another? If the community continues to use multiple approaches, then how can we use conceptual graphs as an exchange mechanism when the underlying ontologies have such different foundations?
10 References
1. Sowa, J.F., Conceptual Structures: Information Processing In Mind and Machine, Addison Wesley, Reading, MA, 1984.
2. Ogden, C.K., and I.A. Richards, The Meaning Of Meaning, Harcourt, Brace, and World, New York, NY, 1946.
3. Sowa, J.F., “Peircean Foundations for the Theory of Context,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 41-64).
4. Chein, M., “The CORALI Project: From Conceptual Graphs to Conceptual Graphs via Labeled Graphs,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 65-79).
5. Mineau, G.W. and M-L Mugnier, “Contexts: A Formal Definition Of Worlds Of Assertions,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 80-94).
6. Chein, M. and M-L Mugnier, “Positive Nested Conceptual Graphs,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 95-109).
7. Wermelinger, M., “A Different Perspective On Canonicity,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 110-124).
8. Tepfenhart, W.M., “Aggregations In Conceptual Graphs,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 125-137).
9. Mineau, G.W. and R. Missaoui, “The Representation Of Semantic Constraints in Conceptual Systems,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 138-152).
10. Faron, C. and J-G.
Ganascia, “Representation of Defaults and Exceptions in Conceptual Graphs Formalism,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 153-167).
11. Ribiere, M. and R. Dieng, “Introduction of Viewpoints in Conceptual Graph Formalism,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 168-182).
12. Angelova, G. and K. Bontcheva, “Task-Dependent Aspects of Knowledge Acquisition: A Case Study in a Technical Domain,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 183-197).
13. Richards, D. and P. Compton, “Uncovering the Conceptual Models in Ripple Down Rules,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 198-212).
14. Kremer, R., D. Lukose, and B. Gaines, “Knowledge Modeling Using Annotated Flow Chart,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 213-227).
15. Lukose, D., “Complex Modeling Constructs in MODEL-ECS,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 228-243).
16. Dick, J.P., “Modeling Cause and Effect in Legal Text,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 244-259).
17. Raban, R., “Information Systems Modeling with GCs Logic,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 260-274).
18. Bos, C., B. Botella, and P. Vanheeghe, “Modeling and Simulating Human Behaviors with Conceptual Graphs,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 275-289).
19. Wille, R., “Conceptual Graphs and Formal Concept Analysis,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 290-303).
20. Biedermann, K., “How Triadic Diagrams Represent Conceptual Structures,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 304-317).
21. Stumme, G., “Concept Exploration - A Tool for Creating and Exploring Conceptual Hierarchies,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 318-332).
22. Prediger, S., “Logical Scaling in Formal Concept Analysis,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 332-341).
23. Ellis, G. and S. Callaghan, “Organization of Knowledge Using Order Factors,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 342-356).
24. Ohrstrom, P., “C.S. Peirce and the Quest for Gamma Graphs,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 138-152).
25. Kerdiles, G. and E.
Salvat, “A Sound and Complete CG Proof Procedure Combining Projections with Analytic Tableaux,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 357-370).
26. Cao, T.H., P.N. Creasy, and V. Wuwongse, “Fuzzy Unification and Resolution Proof Procedure for Fuzzy Conceptual Graph Programs,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 386-400).
27. Leclere, M., “Reasoning with Type Definitions,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 401-415).
28. Cao, T.H. and P.N. Creasy, “Universal Marker and Functional Relation: Semantics and Operations,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 416-430).
29. Raban, R. and H.S. Delugach, “Animating Conceptual Graphs,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 431-445).
30. Bournaud, I. and J-G. Ganascia, “Accounting for Domain Knowledge in the Construction of a Generalization Space,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 446-459).
31. Mann, G.A., “Rational and Affective Linking Across Conceptual Cases - Without Rules,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 460-473).
32. Gerbe, O., “Conceptual Graphs for Corporate Knowledge Repositories,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 474-488).
33. Genest, D. and M. Chein, “An Experiment in Document Retrieval Using Conceptual Graphs,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 489-504).
34. Keeler, M.A., L.F. Searle, and C. Kloesel, “PORT: A Testbed Paradigm for Knowledge Processing in the Humanities,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 505-520).
35. Clark, P. and B. Porter, “Using Access Paths to Guide Inference with Conceptual Graphs,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 521-535).
36. de Moor, A., “Applying Conceptual Graph Theory to the User-Driven Specifications of Network Information Systems,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 536-550).
37. Puder, A. and K. Romer, “Generic Trading Service in Telecommunication Platforms,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 551-565).
38. Polovina, S., “Assessing Sowa’s Conceptual Graphs for Effective Strategic Management Decisions, Based on a Comparative Study with Eden’s Cognitive Mapping,” Conceptual Structures: Fulfilling Peirce’s Dream, eds. D. Lukose, H. Delugach, M. Keeler, L. Searle, and J. Sowa, Springer-Verlag, Berlin, 1998, (pp. 566-580).
Manual Acquisition of Uncountable Types in Closed Worlds Galia Angelova Bulgarian Academy of Sciences, 25A Acad. G. Bonchev St., 1113 Sofia, Bulgaria [email protected]
Abstract. The paper considers the problem of classifying countable-uncountable entities during the process of Knowledge Acquisition (KA) from texts. Since one of the main goals of KA is to identify types, means to distinguish new types, instances and individuals become particularly important. We review briefly related studies to show that the distinction countable-uncountable depends on the considered natural language, context usage and the domain; thus countability is a perspective from which to look at a closed world, since there is no universal general taxonomy. Finally we propose an internal ontological solution for mass objects which suits a project¹ for generation of multilingual Natural Language (NL) explanations from Conceptual Graphs (CG).
1 Introduction
KA aims at (i) the identification of (task-specific) objects and relationships in the acquisition domain and (ii) the encoding of these entities in the most suitable structures of the chosen knowledge representation formalism. KA is usually performed from texts in a manual, semi-automatic or automatic manner. KA explicates the concept types, their instances and individuals as they are described in the acquisition texts. Obviously, the type fragmentariness is strongly influenced by the concrete natural language: the particular words have language-specific semantic granularity, and often this granularity is the KA hint for the semantic content of the correspondingly acquired type. Uncountable types raise a special interest. These types (often mass objects) should be recognised and encoded properly, since their instances behave in a specific manner. If KA does not identify the countable-uncountable types, whatever their language citation is, there is no ’later’ adjustment where this distinction would be evaluated in the context of the acquisition texts and explicated in the Knowledge Base (KB). So, in our view, the precise acquisition of uncountable concepts is one of the obligatory tasks to be performed by KA. This paper discusses KA of uncountable types from noun phrases. We summarise our experience in manual KA from technical texts (generally discussed in [AB2]) and the way we treat the obtained instances after a precise study of
¹ DBR-MAT “Intelligent Translation System”, funded by Volkswagen Foundation (1996-98), http://nats-www.informatik.uni-hamburg.de/projects/dbr-mat.
M.-L. Mugnier and M. Chein (Eds.): ICCS’98, LNAI 1453, pp. 351–358, 1998. c Springer-Verlag Berlin Heidelberg 1998
their (i) linguistic behaviour in the texts and (ii) conceptual behaviour in the closed domain. Section 2 presents the distinction countable-uncountable from a linguistic perspective. Since we perform KA from NL texts to provide further NL generation, the linguistic outlook is the most natural initial standpoint. Section 3 summarises briefly other related approaches to mass nouns and objects. Section 4 shows how the observation of linguistic facts affects the conceptual representation within the KB of CG. Section 5 contains the conclusion.
2 Linguistic Perspective to Countable vs. Uncountable
It is difficult to say whether the distinction countable-uncountable is to be drawn (i) between linguistic entities in the NL texts (as is widely accepted, these are noun phrases), or (ii) between conceptual entities (objects in the world). Moreover, a clearly defined hierarchy of countable-uncountable entities does not exist. Below we consider the problems of such a classification. The contrastive study [Mo1] adopts a classification of nouns after the linguist Otto Jespersen (see Fig. 1); following it, we accept that the distinction countable-uncountable is one of the most important semantic characteristics of nouns. This fits very well with the KA goal to identify types; it is really important to classify the acquired concept types as types with enumerable instances and individuals vs. other types. We can use Fig. 1 as a guide for the conceptual classification of types if we interpret properly the ontological content of the hierarchy. To discuss Fig. 1, let us consider nouns and the world objects which are ’named’ by these nouns:
Noun
├── Countable
│   ├── Concrete: Individual, Collective
│   └── Abstract: Individual, Collective
└── Uncountable
    ├── Concrete: Mass
    └── Abstract: Collective

Fig. 1. Taxonomy of nouns after Jespersen. All partitions are disjoint. The dotted lines indicate the only non-exhaustive partition.
1. Countable, Concrete, Individual: table, student. Material world objects, with certain shape or precise limits, which exist as sets of separable concrete instances and individuals. This is nearly the only class of nouns whose grammatical behaviour coincides with the individual behaviour of the respective conceptual objects.
2. Countable, Concrete, Collective: flock, herd. Grouping material objects which exist as enumerable instances.
3. Countable, Abstract, Individual: idea, thought. Abstract objects named by nouns whose grammatical features allow counting. We say ’one, three, many ideas’. The language-influenced conceptualisation is that ideas can be counted (at least in English). That is why in the NL texts we meet references to different instances of the respective concept type.
4. Countable, Abstract, Collective: family, company. Grouping abstract objects which exist as enumerable instances.
5. Uncountable, Concrete, Mass: wine, water, sand, snow, copper, butter, sugar. The semantic structure of mass nouns excludes enumeration of instances, but mass nouns have the category of number because (i) they appear in singular, e.g. ’this is silver’, and (ii) they express quantity. Plural forms usually indicate a modified meaning or a polysemy. Mass nouns correspond to material objects with the properties that (i) the object substance can be measured but not counted, and (ii) each separate part of this stuff has the quality and meaning of the whole.
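The taxonomy of Fig. 1 can be encoded as nested feature partitions. The following is an illustrative data structure (not from the paper) of the kind a KA tool might use to record the countability classification of an acquired type; the lookup also shows how a class absent from the figure, such as uncountable-concrete-collective, falls outside the base taxonomy.

```python
# Illustrative encoding of Jespersen's taxonomy (Fig. 1) as a mapping from
# (countability, concreteness) to the admissible subclasses. The uncountable
# partitions are the non-exhaustive ones in the figure.
JESPERSEN = {
    ("countable", "concrete"):   ["individual", "collective"],
    ("countable", "abstract"):   ["individual", "collective"],
    ("uncountable", "concrete"): ["mass"],        # non-exhaustive partition
    ("uncountable", "abstract"): ["collective"],  # non-exhaustive partition
}

def classify(countability, concreteness, subclass):
    """Return True iff the triple is one of the classes shown in Fig. 1."""
    return subclass in JESPERSEN.get((countability, concreteness), [])

print(classify("countable", "concrete", "individual"))    # True  (table, student)
print(classify("uncountable", "concrete", "mass"))        # True  (wine, sand)
print(classify("uncountable", "concrete", "collective"))  # False (pottery is not in Fig. 1)
```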
We reviewed many works to justify this taxonomy; most of the authors would agree with the interpretation of the above five classes. To show the complications of the exact classification of uncountables, however, we compile different examples, mapping them into the hierarchy in Fig. 1:
1. Uncountable, Concrete, Collective: pottery, silverware.
2. Uncountable, Abstract, but not Mass and not Collective: love, liberty, democracy, dispersion, music, elegance, etc. Abstract objects which seem to exist ’in one instance only’.
3. Uncountable, Abstract, Mass: success, knowledge, software.
4. Uncountable, Abstract, Collective: mankind, gentry, police.
If software is something like abstract mass, then why not consider hardware as concrete mass? But on the other hand, can hardware be measured like water? Moreover, hardware resembles pottery, because not every physical part of hardware is still hardware. But pottery is classified in [Mo1] as uncountable, concrete, collective. To move hardware to the same class as pottery would probably mean considering software as abstract, collective, which would contradict some other opinions on what collective is, and so on. It is difficult to distinguish abstract collective and abstract mass nouns in various cases, if we compare the opinions of different authors. Facing all these complications in building an exact taxonomy, we decided to avoid classifying each abstract object in the above categories. We hoped that the opposition countable-uncountable would provide enough conceptual evidence for the number of instances. Unfortunately, even this partition is not absolute
in case of multilinguality: e.g. news in English is uncountable and singular (’the news is good’), while in Bulgarian one says ’two news’, similarly to ’two ideas’ in English. So we can use the taxonomy in Fig. 1 as a basic framework supporting KA classifications, but only if we remember that it is somewhat relative, especially in particular multilingual cases. Below we continue discussing noun phrases and concept types, but always bearing in mind the duality between the language unit and the type acquired in the KB.
3 Other Approaches to Uncountability
AI usually studies the philosophical, cognitive and formal foundations of knowledge modelling. In [Ha1] objects from the Naive Physics world are considered; properties like countable-uncountable seem to be treated as intrinsic ones. In [Gu1] countability is a category in a top-level ontology of universals. [WCH] develops an interesting taxonomy of part-whole relations to explain the English usage of part of. Two cases concern the distinction uncountable-countable: the relations portion-mass (slice-pie) and stuff-object (steel-car). Our KA goals, however, lie at the border between support of multilingual NL processing and conceptual modelling using CG; that is why we are mostly interested either in approaches which address both layers together or in CG-related approaches. In [So1] and [So2] we see that mass nouns have measured individuals which correspond to concrete amounts of stuff, representing in this way different individual quantities of types WATER, TIME etc. Plural referents are only used with countable things, while mass nouns are not normally used in the plural. To represent substances by CG, [Te1] proposes the type definition:

type substance(x) is
  [ [ physical_object: {*}@++ ] (x) ]
     -> (has_components) -> [ Type_Of(physical_object) {*} ]
     -> (has_internal_structure) -> [ Structure ]
     ......
  ]

’A substance is a set containing an uncountable number of physical objects of various types and the members within the set have some internal structure’. [Te1] proposes to perform computations over sets, so that substance properties can be interpreted as consequences of the relationships that exist among the instances. [Te1] explains compositions, states, viscosity, and reactivity of substances. This is a conceptualisation of substances at a micro-level, providing an extremely detailed description of the micro-changes that can take place over time.
The NL-oriented approaches, however, always take into consideration (i) the linguistic facts, and (ii) (to some extent) the object denoted by the noun and the properties of this object. According to Lyons, the count-mass distinction is primarily a linguistic one, which is clearly seen in cases of multilinguality (remember the example of news, uncountable in English and countable in Bulgarian). R. Dale [Da1] investigates the domain of cooking and discusses conflicts between a naive ontology and linguistic facts. In the domain of cooking, rice and lentil are rather similar (small
objects of roughly the same size), whose individuals are not considered separately in recipes. The linguistic expressions of ingredients, however, represent rice as a mass noun while lentil behaves like a count noun: e.g. four ounces of rice, four ounces of lentils. ’If the count/mass distinction was ontologically based, we would expect these descriptions to be either both count or both mass’ [Da1]. We found particularly interesting the observation that ’physical objects are not inherently count or mass, but are viewed as being count or mass’ in some domains. So [Da1] considers physical objects from either a mass or a count perspective. ’Thus, a specific physical object can be viewed one time as a mass, and another time as a countable object: when cooking, I will in all likelihood view a quantity of rice as a mass, but if I am a scientist examining rice grains for evidence of pesticide use, I may view the same quantity of rice as a countable set of individuals’. In [Da1], exactly one perspective at a time is allowed: each object in the closed domain of cooking is either mass, or count. Comparing [Te1] and [Da1], we see that at a micro-level all objects can be treated as sets of instances; but natural language does not work at the level of atomic components. In most realistic domains the basic objects have much bigger granularity and they are denoted by words that make us treat them as count or mass. Unfortunately, the context-dependent NL usage provides flexible shifts of granularity (see [Ho1]). ’A road can be viewed as a line (planning a trip), as a surface (driving on it), and as a volume (hitting a pothole). ... Many concepts are inherently granularity-dependent.’ Probably we could say that closed domain is such a domain, where a pre-fixed number of perspectives to each object exist and the relevant kinds of granularity are pre-fixed as well. 
To summarize, in NL we refer to the conceptual entities as follows: (i) when the stuff itself is referred to, an uncountable noun is typically used; in other cases the emphasis is on the object form and shape and then we refer to particular instances by means of countable nouns; (ii) in some domains the entities are treated as compositions at the level of micro-ingredients,while in other domains we see them as compositions of ingredients at a much higher level.
4 A Mixed Count-Mass Taxonomy in Closed Domains
We acquire CG in the domain of admixture separation from polluted water in order to generate domain explanations in several NLs from the underlying KB [AB1]. In the context of the present considerations, we try to satisfy the following requirements: (i) adequate internal conceptualisation providing easy surface verbalisation in different NLs with a proper usage of singular-plural and count-mass nouns; (ii) clear separation between conceptual data (in the single KB) and linguistic data (in the system lexicons, one lexicon per language); (iii) conceptual structures allowing for easy integration of the closed domain into more universal ontologies. In the type hierarchy we define important domain types and integrate them under an upper model. Figure 2 represents a simplified view of the taxonomy
where countable and mass entities are classified as subtypes of PHYSICAL-OBJECT. We acquire as concept types two important objects in the domain: oil drop and oil particle. Note that this decision is domain-dependent, since in this domain the polluting oil exists as particles and drops in the polluted water. Furthermore, OIL-DROP and OIL-PARTICLE are subtypes of the substance OIL. These two types show the borderline where the domain taxonomy is integrated into a universal taxonomy of physical objects. The ISA-KIND relation, introduced in [AB2], defines the perspective of looking at particles and drops as OIL: they are typical ’quantities of stuff’. To conform to some standard, we adopted the keyword PACKAGER from [Da1]. The ISA-KIND relation indicates that the classifications OIL → OIL-PARTICLE and OIL → OIL-DROP are partitions into role subtypes, because these are subtypes that can be changed during the lifetime of the physical object (similarly to PROFESSION for PERSONS), while the partitions OIL → MINERAL-OIL and OIL → SYNTHETIC-OIL are classifications into natural subtypes according to the usual type-of relation. Note that the PACKAGER perspective covers the two cases of part-whole relations mentioned in [WCH]: it denotes the relations portion-mass and stuff-object, which are not distinguished in the current domain and, consequently, are not treated as different ones in the conceptual model. Such a conceptual solution provides flexible links to the lexicons in the case of count-mass nouns. Imagine that in some natural language only the word for the stuff exists (e.g. grape is a mass noun in Bulgarian and Russian); then this word is linked to the stuff-concept. But in some other languages, the related words can name typical ’packaged’ quantities (like grape in English); if we acquire such quantities as types, the respective words will be connected to these special concept types.
Note that it is not obligatory to have 'naming' lexicon elements in every language and for every domain type; the explanations are constructed for the existing concepts in the corresponding grammatical forms. Since inheritance works along type-of relations, in DBR-MAT we can, for instance, generate the explanation: Each oil particle has dimension less than 0.05 mm. Here we use particle in the singular, since it is a countable object and inherits the characteristic features of PARTICLE. As a subtype of PARTICLE, OIL-PARTICLE has SHAPE and DIMENSION. From the perspective of OIL, however, we always talk about particles in the plural, and thus we connect countable and mass nouns in the generated explanations. For instance: Oil appears as oil particles and oil drops. Viewed as oil, oil particles have density and relative weight. Additionally, in the KB, when instances of the types [OIL-PARTICLE] and [OIL-DROP] appear in conceptual graphs with unspecified plural referent, i.e. as [OIL-PARTICLE: {*}] and [OIL-DROP: {*}], we can make a generalisation and replace these types by OIL. Then they are verbalised as mass nouns. It is obvious that the natural language we generate is not as flexible as a human one, but this solution is an opportunity to mix countable and uncountable perspectives in one utterance. Note that we cannot produce phrases like two oil particles, because in our domain-specific KB the unspecified plural referent sets
Manual Acquisition of Uncountable Types in Closed Worlds
[Figure 2 diagram (text residue of the original image): PHYSICAL-OBJECT is partitioned into COUNTABLE and SUBSTANCE; DROP and PARTICLE are subtypes of COUNTABLE; OIL is a SUBSTANCE with natural subtypes SYNTHETIC-OIL and MINERAL-OIL; OIL-DROP and OIL-PARTICLE are linked to OIL by ISA-KIND relations labelled with the PACKAGER keyword; a legend partitions CONCEPT-TYPE into NATURAL and ROLE subtypes (e.g. PROFESSION).]
Fig. 2. Taxonomy mixing countable and mass objects by classification into role and natural subtypes.
from the type definition of SUBSTANCE would not be instantiated with counted sets. To conclude, such crosspoint types between countable and mass types require very careful elaboration of: (i) the type hierarchy, (ii) the type definitions of the supertypes and the 'crosspoint' type, so as to ensure correct generalisation and specialisation, and (iii) the characteristics of both supertypes, to ensure correct inheritance.
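The generalisation step described above — replacing a type carrying the unspecified plural referent {*} by its ISA-KIND supertype so that it can be verbalised as a mass noun — can be sketched as follows (illustrative names, not the actual KB code):

```python
# Role subtypes ('packaged' quantities) mapped to the stuff they are made of.
ISA_KIND = {"OIL-PARTICLE": "OIL", "OIL-DROP": "OIL"}

def generalise(concept_type: str, referent: str) -> str:
    """Replace a packager subtype with the unspecified plural referent {*}
    by its substance supertype, to be verbalised as a mass noun."""
    if referent == "{*}" and concept_type in ISA_KIND:
        return ISA_KIND[concept_type]
    return concept_type

assert generalise("OIL-PARTICLE", "{*}") == "OIL"          # mass-noun reading
assert generalise("OIL-PARTICLE", "#1") == "OIL-PARTICLE"  # an individual stays countable
```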
5 Conclusion
This paper discusses difficulties in classifying world objects as countable and uncountable and presents a (more or less) empirical solution applied in an ongoing project. We see that KA requires extremely detailed analysis of the source texts and a deep understanding of the CG structures and their idiosyncrasies. It is somewhat risky to consider only the NL level, since the language phenomena in the acquisition texts are often misleading (clearly seen in a multilingual paradigm). Actually, the distinction countable-mass should be defined at a deeper conceptual level. We try to keep the two perspectives closely related: the uncountable stuff and the countable individual objects made of this stuff. Fig. 2 shows that we allow both count-mass perspectives together, but only for specially acquired, domain-dependent types. In this sense, our approach addresses the closed world of one domain. To 'open' this closed world and to integrate another closed world for another domain will probably require the addition of new 'crosspoint' types between countable and mass objects. In our view, the classification countable-uncountable is no less meaningful than the partition abstract-material. Fig. 1 differs from many upper-level ontological classifications, where abstract-real is the highest partition of the top (see [FH1]). But a careful analysis of the taxonomy in Fig. 1 shows that it is easy to exchange the two upper layers, i.e. the classification concrete-abstract can easily become the topmost partition. Despite the problems discussed in this paper, it seems worthwhile to consider at least the countable-mass distinction of physical objects as one of the unifying principles for top-level ontologies.

Acknowledgements

The author is grateful to the three anonymous referees for the fruitful discussion and the suggestions.
References

AB1. Angelova, G., Boncheva, K.: DB-MAT: Knowledge Acquisition, Processing and NL Generation Using Conceptual Graphs. ICCS-96, LNAI 1115 (1996) 115–129.
AB2. Angelova, G., Boncheva, K.: Task-Dependent Aspects of Knowledge Acquisition: A Case Study in a Technical Domain. ICCS-97, LNAI 1257 (1997) 183–197.
Da1. Dale, R.: Generating Referring Expressions in a Domain of Objects and Processes. Ph.D. Thesis, University of Edinburgh (1988).
FH1. Fridman, N., Hafner, C.: The State of the Art in Ontology Design: A Survey and Comparative Review. AI Magazine 18(3) (1997) 53–74.
Gu1. Guarino, N.: Some Organizing Principles for a Unified Top-Level Ontology. In: Working Notes, AAAI Spring Symp. on Ontological Engineering, Stanford (1997).
Ha1. Hayes, P.: The Second Naive Physics Manifesto. In: Brachman, Levesque (eds.): Readings in Knowledge Representation, Morgan Kaufmann (1985) 468–485.
Ho1. Hobbs, J.: Sketch of an Ontology Underlying the Way We Talk About the World. Int. J. Human-Computer Studies 43 (1995) 819–830.
Mo1. Molhova, J.: The Noun: A Contrastive English-Bulgarian Study. Publ. House of the Sofia University "St. Kl. Ohridski", Sofia (1992).
So1. Sowa, J.: Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, Reading, MA (1984).
So2. Sowa, J.: Conceptual Graphs Summary. In: Nagle, Nagle, Gerholz, Eklund (eds.): Conceptual Structures: Current Research and Practice, Ellis Horwood (1992) 3–52.
Te1. Tepfenhart, W.: Representing Knowledge About Substances. LNAI 754 (1992) 59–71.
WCH. Winston, M., Chaffin, R., Herrmann, D.: A Taxonomy of Part-Whole Relations. Cognitive Science 11 (1987) 417–444.
A Logical Framework for Modeling a Discourse from the Point of View of the Agents Involved in It¹

Bernard Moulin, Professor
Computer Science Department and Research Center in Geomatics Laval University, Pouliot Building, Ste Foy (QC) G1K 7P4, Canada Phone: (418) 656-5580, E-mail: [email protected]
Abstract. The way people interpret a discourse in real life goes well beyond the traditional semantic interpretation based on predicate calculus, as is currently done in approaches such as Sowa's Conceptual Graph Theory or Kamp's DRT. From a cognitive point of view, understanding a story is not a mere process of identifying truth conditions of a series of sentences, but is a construction process of building several partial models, such as a model of the environment in which the story takes place, a model of mental attitudes for each character and a model of the verbal interactions taking place in the story. On this cognitive basis, we propose a logical framework differentiating three components in an agent's mental model: a temporal model which simulates an agent's experience of the passing of time; the agent's memory model which records the explicit mental attitudes the agent is aware of; and the agent's attentional model containing the knowledge structures that the agent manipulates in its current situation.
1 Introduction
Conceptual Graphs (CG) [16] have been applied in several natural language research projects, and the CG notation can be used to represent fairly complex sentences, including several interesting linguistic phenomena such as attitude reports, anaphors, indexicals and subordinate sentences. However, most researchers have overlooked the importance of modeling the context in which sentences are uttered by locutors. Even modeling a simple sentence such as "Peter saw the girl who was playing in the park with a red ball" requires a proper representation of the context of utterance. For instance, the preterit "saw" cannot be modeled without referring to the time when the locutor uttered the sentence: hence, we need to represent the locutor and the context of utterance in addition to representing the sentence itself. To this end, we proposed to model the contents of whole discourses using an approach in which it is possible to explicitly represent the context of utterance of speech acts¹

¹ An extended version of this paper can be found in reference [11]. This research is sponsored by the Natural Sciences and Engineering Research Council of Canada and FCAR. My apologies to the reviewers because I could not answer all their questions in such a short paper.

M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 359-366, 1998. © Springer-Verlag Berlin Heidelberg 1998
[8], [9], [10]. Any sentence is thought of as resulting from the action of a locutor performing a speech act, which determines the context of utterance of that sentence. A major contribution of that approach was the explicit introduction of temporal coordinate systems in the discourse representation, using three kinds of constructs: the narrator's perspective, the temporal localizations and the agents' perspectives. As an example, let us consider the story displayed in Figure 1. It is told by an unidentified narrator using the past tense, which indicates that the reported events occurred in the past relative to the narrator's time, and hence to the moment when the reader reads the story. When the narrator reports the characters' words, the verb tense changes to the present or the future. These tenses are relative to the characters' temporal perspectives, which differ from the narrator's temporal perspective, which is temporally located after that date. This example shows the necessity of explicitly introducing into the discourse representation the contexts of utterance of the different speech acts performed by the narrator and the characters. The complete representation of this story can be found in [11].

Monday October 20 1997, Quebec city (S1). Peter wanted to read Sowa's book (S2), but he did not have it (S3). He recalled that Mary bought it last year (S4). He phoned her (S5) and asked her (S6): "Can you lend me Sowa's book for a week?" (S7) Mary answered (S8): "Sure! (S9) Come and pick it up! (S10)". Peter replied (S11): "Thanks! (S12) I will come tomorrow" (S13).

Figure 1: A sample story. Note: the numbers Si are used to identify the various sentences of the text.
However, even such a representation is not sufficient if we want to enable software agents to reason about the discourse content. We aim at creating a system which will be able to manipulate the mental models of the characters involved in the discourse and simulate certain mechanisms related to story understanding. We based our approach on cognitive studies which have shown that readers adopt a point of view "within the text or discourse" [2]. The Deictic Shift Theory (DST) argues that the metaphor of the reader "getting inside the story" is cognitively valid. The reader often takes a cognitive stance within the world of the narrative and interprets the text from that perspective [14]. Segal completes this view by discussing the mimesis mechanism: "A reader in a mimetic mode is led to experience the story phenomena as events happening around him or her, with people to identify with and to feel emotional about... The reader is often presented a view of the narrative world from the point of view of a character... We propose that this can occur by the reader cognitively situating him or herself in or near the mind of the character in order to interpret the text" ([15], pp. 67-68). Hence, from a cognitive point of view, understanding a story is not a mere process of identifying truth conditions of a series of sentences, but is a construction process of building several partial models, such as a model of the environment in which the story takes place, a model of mental attitudes for each character and a model of the verbal interactions taking place in the story. Hence the following assumption: When understanding a discourse, a reader creates several mental models that contain the mental attitudes (beliefs, desires, emotions, etc.) that she attributes to
each character as well as the communicative and non-communicative actions performed by those characters. Hence, when using CGs to model the semantic content of a discourse, we need: 1) a way of representing the context of utterance of agents’ speech acts; 2) the underlying temporal structure; 3) a way of representing the mental models of each character involved in the discourse. In the next sections we address point 3) and propose an approach based on a logical framework that differentiates three components in an agent’s mental model: a temporal model which simulates an agent’s experience of the passing of time (Section 2); the agent’s memory model which records the explicit mental attitudes the agent is aware of and the attentional model containing the knowledge structures that the agent manipulates in its current situation (Section 3).
2 An Agent's Mental Model Based on Awareness

In order to find an appropriate approach to represent agents' mental models, we can consider different formalisms that have been proposed to model and reason about mental attitudes² [1], among which the so-called BDI approach [13] [5] is widely used to formalize agents' knowledge in multi-agent systems. These formalisms use a possible-worlds approach [6] for modeling the semantics of agents' attitudes. For example, in the BDI approach [13] [5], an agent's mental attitudes such as beliefs, goals and intentions are modeled as sets of accessible worlds associated with an agent and a time index, thanks to accessibility relations typical of each category of mental attitudes. However, such logical approaches are impaired by the problem of logical omniscience, according to which agents are supposed to know all the consequences of their beliefs. This ideal framework is impractical when dealing with discourses that reflect human behaviors, simply because people are not logically omniscient [7]. In addition, it is difficult to imagine a computer program that will practically and efficiently manipulate sets of possible worlds and accessibility relations. In order to overcome this theoretical problem, Fagin et al. [4] proposed to explicitly model an agent's knowledge by augmenting the possible-worlds approach with a syntactic notion of awareness, considering that an agent must be aware of a concept before being able to have beliefs about it. In a more radical approach, Moore suggested partitioning the agent's memory into different spaces, each corresponding to one kind of propositional attitude (one space for beliefs, another for desires, another for fears, etc.), "these spaces being functionally differentiated by the processes that operate on them and connect them to the agent's sensors and effectors" [7].
The approach that we propose in this paper tries to reconcile these various positions, while providing a practical framework for an agent²

² In AI literature, elements such as beliefs, goals and intentions are usually called "mental states". However, we use the term "mental attitudes" to categorize those elements. An agent's mental model evolves through time and we use the term agent's mental state to characterize the current state of the agent's mental model. Hence, for us an agent's mental state is composed of several mental attitudes.
in order to manipulate knowledge extracted from a discourse. The proposed agent framework is composed of three layers: the agent's inner time model, which simulates its experience of the passing of time; the agent's memory model, which records the explicit mental attitudes the agent is aware of; and the attentional model, containing the knowledge structures that the agent manipulates in its current situation. In order to formalize the agent's inner time model, we use a first-order, branching-time logic, largely inspired by the logical language proposed in [5] and [13]. It is a first-order variant of CTL*, Emerson's Computation Tree Logic [3], extended to a possible-worlds framework [6]. In such a logic, formulae are evaluated in worlds modeled as time-trees having a single past and a branching future. A particular time index in a particular world is called a world-position. The agent's actions transform one world position into another. A primitive action is an action that is performable by the agent and uniquely determines the world position in the time tree. The branches of a time tree can be viewed as representing the choices available to the agent at each moment in time. CTL* provides all the necessary operators of a temporal logic. It is quite natural to model time using the possible-worlds approach, because the future is naturally thought of as a branching structure and because the actions performed by the agent move its position within this branching structure. The agent's successive world positions correspond to the evolution of the agent's internal mental state through time as a result of its actions (reasoning, communicative and non-communicative acts). The agent does not need to be aware of all the possible futures reachable from a given world position: this is a simple way of modeling the limited knowledge of future courses of events that characterizes people.
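A minimal sketch of this branching-time machinery — worlds as time-trees with a single past, and primitive actions determining unique successor world positions — might look as follows (the class and method names are our own, not part of the formal framework):

```python
class WorldPosition:
    """A time index in a world: one parent (single past), many successors
    (branching future), each reached by a unique primitive action."""

    def __init__(self, world, time_index, parent=None):
        self.world, self.time_index, self.parent = world, time_index, parent
        self.successors = {}  # primitive action -> unique successor position

    def perform(self, action, new_world=None):
        """Performing a primitive action moves the agent to the successor
        world position that the action uniquely determines."""
        nxt = WorldPosition(new_world or self.world, self.time_index + 1, self)
        self.successors[action] = nxt
        return nxt

# Peter's move from (W1, t1) to (W2, t2), as in the example discussed later:
w1_t1 = WorldPosition("W1", 1)
w2_t2 = w1_t1.perform("Creates(Peter, P.g2, active)", new_world="W2")
assert w2_t2.parent is w1_t1 and w2_t2.world == "W2"
```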
An agent's successive world positions specify a temporal path that implements the agent's experience of the passing of time: this characterizes its "inner time". This inner time must be distinguished from what we will call the "calendric time", which corresponds to the official, external measures of time available to agents and users (dates, hours, etc.).
3 The Agent's Memory Model and Attentional Model
The agent's mental attitudes are recorded in what we call the agent's memory model. Following Fagin et al. [4], we consider that the definition of the various mental attitudes in terms of accessibility relations between possible worlds corresponds to the characterization of an implicit knowledge that cannot be reached directly by the agent. At each world position the agent can only use the instances of mental attitudes it is aware of. Following Moore's proposal of partitioning the agent's memory into different spaces, the awareness dimension is captured by projecting an agent's current world-position onto so-called knowledge domains. The projection of agent Ag's world-position w_t0 onto the knowledge domain Attitude-D defines the agent's range of awareness relative to domain Attitude-D at time index t0 in world w. The agent's range of awareness is the subset of predicates contained in knowledge domain Attitude-D which characterize the particular instances of attitudes the agent is aware of at world-position w_t0. The
knowledge domains that we consider in this paper are the belief domain Belief-D and the goal domain Goal-D. But an agent can also use other knowledge domains such as the emotion domain Emotion-D, which can be partitioned into sub-domains such as Fear-D, Hope-D and Regret-D. In addition to knowledge domains which represent an agent's explicit recognition of mental attitudes, we use domains to represent an agent's explicit recognition of relevant elements in its environment, namely the situational domain Situational-D, the propositional domain Propositional-D, the calendric domain Calendric-D and the spatial domain Spatial-D. The situational domain contains the identifiers of any relevant situation that an agent can explicitly recognize in the environment. Situations are categorized into States, Processes, Events, and other sub-categories which are relevant for processing temporal information in discourse [8] [9]: these sub-categories characterize the way an agent perceives situations. A situation is specified by three elements: a propositional description found in the propositional domain Propositional-D, temporal information found in the calendric domain Calendric-D and spatial information found in the spatial domain Spatial-D. Hence, for each situation there is a corresponding proposition in Propositional-D, a temporal interval in Calendric-D and a spatial location in Spatial-D. Propositions are expressed in a predicative form which is equivalent to conceptual graphs. The elements contained in the calendric domain are time intervals which agree with a temporal topology. The elements contained in the spatial domain are points or areas which agree with a spatial topology. Figure 2 illustrates how worlds and domains are used to model agent Peter's mental attitudes obtained after reading sentences S1 to S4 in the text of Figure 1.
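The three-way specification of a situation can be sketched as a record linking the four environment domains (an illustration under assumed names, not the authors' implementation):

```python
from dataclasses import dataclass

@dataclass
class Situation:
    sid: str          # identifier in Situational-D
    category: str     # State, Process, Event, ...
    proposition: str  # corresponding element of Propositional-D
    interval: tuple   # time interval in Calendric-D
    location: str     # point or area in Spatial-D

# Situation s1 from the example: Peter does not yet possess the book,
# holding over [-, Now] at Peter's.home, described by proposition p29.
s1 = Situation("s1", "State", "p29", ("-", "Now"), "Peter's.home")
assert (s1.proposition, s1.interval, s1.location) == ("p29", ("-", "Now"), "Peter's.home")
```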
Worlds are represented by rectangles embedding circles representing time indexes, related together by segments representing possible time paths. Ovals represent knowledge domains. Curved links represent relations between world positions and elements of domains (such as Spatial-D, Calendric-D, Peter's Belief-D, etc.) or relations between elements of different domains (such as Situational-D and Spatial-D, Calendric-D or Propositional-D). After reading sentences S1 to S3, we can assume that Peter is in a world-position represented by the left rectangle in Figure 2, at time index t1 in world W1. This world position is associated with a spatial localization Peter's.home in the domain Spatial-D and a date d1 in the domain Calendric-D. This information is not mentioned in the story but is necessary to structure Peter's knowledge. Time index t1 is related to beliefs P.b1 and P.b2 in Peter's Belief-D. P.b1 is related to situation s1 in Situational-D, which is in turn related to proposition p29, to location Peter's.home in Spatial-D and to a time interval [-, Now] in Calendric-D. Now is a variable which takes the value of the date associated with the current time index. Proposition p29 is expressed as a conceptual graph represented in a compact linear form:

Possess (AGNT- PERSON:Peter; OBJ- BOOK: Sowa's.Book)

Notice that in Calendric-D we symbolize the temporal topological properties using a time axis: only the dates d1 and d2 associated with time indexes t1 and t2 have been represented, as included in the time interval named October 20 1997.
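The projection of a world position onto a knowledge domain, which yields the agent's range of awareness, can be sketched as a simple lookup over the example data (the t2 entries reflect the goal creation discussed in the sequel; all names are illustrative):

```python
# (world, time index) -> {knowledge domain -> instances the agent is aware of}
MEMORY = {
    ("W1", "t1"): {"Belief-D": {"P.b1", "P.b2"}, "Goal-D": {"P.g1"}},
    ("W2", "t2"): {"Belief-D": {"P.b1", "P.b2"}, "Goal-D": {"P.g1", "P.g2"}},
}

def range_of_awareness(world, t, domain):
    """Project the agent's world position onto a knowledge domain:
    the subset of attitude instances the agent is aware of there."""
    return MEMORY.get((world, t), {}).get(domain, set())

assert range_of_awareness("W1", "t1", "Belief-D") == {"P.b1", "P.b2"}
assert range_of_awareness("W2", "t2", "Goal-D") == {"P.g1", "P.g2"}
```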
[Figure 2 diagram (text residue of the original image): worlds W1 and W2 are shown as rectangles with time indexes t1 and t2, linked by the elementary operation Creates (Peter, P.g2, active); the domains shown are Spatial-D (Quebec, Peter's.home, Mary's.home), Calendric-D (d1, d2 within Oct. 20 1997), Situational-D (s1, s2, s3, s4), Propositional-D (p29-p32), Peter's Belief-D (P.b1, P.b2), Peter's Goal-D (P.g1, P.g2) and Mary's Goal-D (mg6).]

Specification of propositions:
PROP(p29, Possess (AGNT- PERSON:Peter; OBJ- BOOK: Sowa's.Book))
PROP(p30, Read (AGNT- PERSON:Peter; OBJ- BOOK: Sowa's.Book))
PROP(p31, Buy (AGNT- PERSON:Mary; OBJ- BOOK: Sowa's.Book))
PROP(p32, Lend (AGNT- PERSON:Mary; PTNT- PERSON: Peter; OBJ- BOOK: Sowa's.Book))
Figure 2: Worlds and Domains

From sentence S2 we know that Peter wants to read Sowa's book, which is represented by the link between time index t1 and the goal P.g1 in Peter's Goal-D. P.g1 is related to situation s4 in Situational-D, which is in turn related to proposition p30 in Propositional-D. P.g1 is related to a date in Calendric-D and a location in Spatial-D, but those links have not been represented completely in order to simplify the figure (only dotted segments indicate the existence of those links). In world W1, agent Peter can choose to move from time index t1 to various other time indexes shown by different circles in the world rectangle in Figure 2. Moving from one time index to another is the result of performing an elementary operation. From our little story we can imagine that Peter wanted Mary to lend him the book: the corresponding elementary operation is the creation of a goal P.g2 with the status active. In Figure 2 this is represented by the large arrow linking the rectangles of worlds W1 and W2, on which appears the specification of the elementary operation Creates (Peter, P.g2, active). When this elementary operation is performed, agent Peter moves into a new world W2 at time index t2, associated with the spatial localization Peter's.home and date d2. Time index t2 is still related to beliefs P.b1 and P.b2 in Belief-D, but also to goals P.g1 and P.g2 in Peter's Goal-D. In an agent Ag1's mental model, certain domains may represent mental attitudes of another agent Ag2: they represent the mental attitudes that Ag1 attributes to Ag2. As an example, Goal
A Logical Framework for Modeling a Discourse
365
P.g2 in Peter's Goal-D is related to Goal mg6 in Mary's Goal-D, which is contained in Peter's mental model. Goal mg6 is associated with situation s2, which is itself related to proposition p32. Beliefs and goals are formally expressed using predicates which hold for an agent Ag, a world w and a time index t, as for example Peter's beliefs and goals at time index t2:

Peter, W2, t2 |= BEL_P.b1(Peter, STATE_s1(NOT p29, [-, Now], Peter's.home))
Peter, W2, t2 |= BEL_P.b2(Peter, EVENT_s3(p31, dx, -))
Peter, W2, t2 |= GOAL_P.g1(Peter, PROCESS_s4(p30, -, Quebec), active)
Peter, W2, t2 |= GOAL_P.g2(Peter, GOAL_mg6(Mary, PROCESS_s2(p32, -, Quebec), active), active)
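These four formulas can be encoded directly as data, with a small query predicate over them (a sketch; the tuple encoding is ours, not the paper's formal semantics):

```python
# Peter's mental state at world W2, time index t2, as nested tuples.
mental_state = {
    ("Peter", "W2", "t2"): [
        ("BEL", "P.b1", ("STATE", "s1", "NOT p29", ("-", "Now"), "Peter's.home")),
        ("BEL", "P.b2", ("EVENT", "s3", "p31", "dx", "-")),
        ("GOAL", "P.g1", ("PROCESS", "s4", "p30", "-", "Quebec"), "active"),
        ("GOAL", "P.g2",
         ("GOAL", "mg6", "Mary", ("PROCESS", "s2", "p32", "-", "Quebec"), "active"),
         "active"),
    ],
}

def holds(agent, world, t, kind, label):
    """Does a mental attitude of the given kind and label hold at (world, t)?"""
    return any(f[0] == kind and f[1] == label
               for f in mental_state.get((agent, world, t), []))

assert holds("Peter", "W2", "t2", "GOAL", "P.g2")
assert not holds("Peter", "W2", "t2", "GOAL", "P.g9")
```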
The agent's memory model gathers the elements composing the agent's successive mental states. The amount of information contained in the agent's memory model may increase considerably over time, resulting in efficiency problems. This is similar to what is observed with human beings: they record "on the fly" lots of visual, auditory and tactile information, but they usually do not consciously remember this detailed information over long periods of time. They remember information they pay attention to. Similarly, in our framework, the agent's attentional model gathers some information extracted from the agent's memory model because of its importance or relevance for the agent's current activities. The attentional model is composed of a set of knowledge bases that structure the agent's knowledge and enable it to perform the appropriate reasoning, communicative and non-communicative actions. Among those knowledge bases, we consider the Belief-Space, Decision-Space, Conversational-Space and Action-Space. The Belief-Space contains a set of beliefs extracted from the memory model and a set of rules enabling the agent to reason about those beliefs. Each belief is marked by the world position that was the agent's current world position when the belief was acquired or inferred. The Decision-Space contains a set of goals extracted from the memory model and a set of rules enabling the agent to reason about those goals. Each goal is marked by the world position that was the agent's current world position when the goal was acquired or inferred. The Conversational-Space models the agents' verbal interactions in terms of exchanges of mental attitudes and agents' positionings relative to these mental attitudes.
In our approach a conversation is thought of as a negotiation game in which agents negotiate about the mental attitudes that they present to their interlocutors: they propose certain mental attitudes and other locutors react to those proposals, accepting or rejecting the proposed attitudes, asking for further information or justification, etc. [12]. The Action-Space records all the communicative, noncommunicative and inference actions that are performed by the agent. Details and examples about the attentional model can be found in [11].
5 Conclusion
This paper is a contribution to the debate about the notion of context which has taken place for several years in the CG community. Whereas our temporal model [8], [9], [10] presented a static representation of the various contexts of utterance found in a
discourse, the present approach considers that the context is built up from the accumulation of knowledge in various knowledge bases (the various spaces of the attentional model) that compose the agent’s mental model. The proposed logical framework provides the temporal model of discourse with semantics that can be practically implemented in an agent’s system. However, the comparison of this framework with other approaches of context modeling would deserve another entire paper.
References

1. Cohen, P. R. & Levesque, H. J. (1990), Rational Interaction as the Basis for Communication, in Cohen, P. R., Morgan, J. & Pollack, M. E. (eds.), Intentions in Communication, MIT Press, 221-255.
2. Duchan, J. F., Bruder, G. A. & Hewitt, L. E. (1995), Deixis in Narrative, Hillsdale: Lawrence Erlbaum Ass.
3. Emerson, E. A. (1990), Temporal and modal logic, in van Leeuwen, J. (ed.), Handbook of Theoretical Computer Science, North Holland, Amsterdam, NL.
4. Fagin, R., Halpern, J. Y., Moses, Y. & Vardi, M. Y. (1996), Reasoning about Knowledge, MIT Press.
5. Haddadi, A. (1995), Communication and Cooperation in Agent Systems, Springer Verlag, Lecture Notes in AI n. 1056.
6. Kripke, S. (1963), Semantical considerations on modal logic, Acta Philosophica Fennica, vol. 16, 83-89.
7. Moore, R. C. (1995), Logic and Representation, CSLI Lecture Notes n. 39.
8. Moulin, B. (1992), A conceptual graph approach for representing temporal information in discourse, Knowledge-Based Systems, vol. 5, n. 3, 183-192.
9. Moulin, B. (1993), The representation of linguistic information in an approach used for modelling temporal knowledge in discourses, in Mineau, G. W., Moulin, B. & Sowa, J. F. (eds.), Conceptual Graphs for Knowledge Representation, Lecture Notes in Artificial Intelligence, Springer Verlag, 182-204.
10. Moulin, B. (1997), Temporal contexts for discourse representation: an extension of the conceptual graph approach, Journal of Applied Intelligence, vol. 7, n. 3, 227-255.
11. Moulin, B. (1998), A logical framework for modeling a discourse from the point of view of the agents involved in it, Res. Rep. DIUL-RR 98-03, Laval Univ., 16 p.
12. Moulin, B., Rousseau, D. & Lapalme, G. (1994), A Multi-Agent Approach for Modeling Conversations, Proc. of the International Conference on Artificial Intelligence and Natural Language, Paris, 35-50.
13. Rao, A. S. & Georgeff, M. P. (1991), Modeling rational agents within a BDI architecture, in Proceedings of the KR'91 Conference, Cambridge, Mass., 473-484.
14. Segal, E. M. (1995a), Narrative comprehension and the role of Deictic Shift Theory, in [2], 3-17.
15. Segal, E. M. (1995b), A cognitive-phenomenological theory of fictional narrative, in [2], 61-78.
16. Sowa, J. F. (1984), Conceptual Structures, Reading, Mass.: Addison-Wesley.
Computational Processing of Verbal Polysemy with Conceptual Structures

Karim Chibout and Anne Vilnat

Language and Cognition Group, LIMSI-CNRS, B.P. 133, 91403 Orsay cedex, France. chibout,[email protected]
Abstract. Our work takes place in the general framework of lexico-semantic knowledge representation to be used by a Natural Language Understanding system. More specifically, we were interested in an adequate modelling of verb descriptions that allows the interpretation of semantic incoherences due to verbal polysemy. The main goal is to realise a module which is able to detect and to deal with figurative meanings. Therefore, we first propose a lexico-semantic knowledge base; then we present the processes that determine the different meanings which may be associated with a given predicate, and that discriminate these meanings for a given sentence. Each verb is defined by a basic action (its supertype) specified by the case relations that distinguish it (its definition graph), that is, the object, mean, manner, goal and/or result relations which distinguish the described verb meaning from the specified basic action. This description is recursive: the basic actions are in turn defined using more general actions. To help interpret the different meanings conveyed by a verb and its hyponyms, we have determined three major types of heuristics, consisting in searching the type lattice and/or examining the associated definitions.
1 Introduction
Our work takes place in the general framework of lexico-semantic knowledge representation to be used by a Natural Language Understanding system. More specifically, we were interested in an adequate modelling of verb descriptions that allows the interpretation of semantic incoherences due to verbal polysemy. The main goal is to realise a module which is able to detect and to deal with figurative meanings. We studied two complementary thrusts to model polysemy: (a) lexico-semantic knowledge representation, (b) processes to interpret the different meanings which may be associated with a given predicate, and to discriminate these meanings for a given sentence. After an outline of some metaphor examples that justified our approach, we will present the links between verbs within a lexical network and the semantic structure associated with each of them. We will finally define the model of polysemy propounded from this representation formalism, in particular the selection rules within the network used to process figurative meanings of verbs.

M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 367–374, 1998. © Springer-Verlag Berlin Heidelberg 1998
368
K. Chibout and A. Vilnat

2 Polysemy, Metaphors and Figurative Meanings
Polysemy must be clearly distinguished from homonymy: two homonyms only share the same orthography, whereas two polysemes share semantic elements. In the case of homonymy, the solution is to create as many concepts as necessary. In the case of polysemy, by contrast, the different senses of a word are mostly figurative meanings derived from its core meaning. It is therefore necessary to be able to determine this core meaning, and the derivation rules involved in the elaboration of these senses. The following examples illustrate this phenomenon. Lexical metaphors are created by selecting specific semantic features of words. In the following example, both nouns share the feature /spherical/:

1) La terre est une orange. (Earth is an orange.)¹

The characteristic feature used in the comparison may itself be taken in a figurative meaning, in which case the phrase is doubly non-literal:

2) Sally is a block of ice.

The feature /cold/ of the ice (literal meaning) is assimilated to the feature /cold/ of a human character (figurative meaning). This characteristic may be true or merely assumed (in the beliefs):

3) John is a gorilla,

to signify a man with violent, brutal manners, whereas an ethologist would insist on the peaceful manners of this animal. The same metaphor may convey multiple meanings depending on the context, by selecting different semantic features: in 3), gorilla may also mean /hairy/ or even /(physically) strong/. Interpreting a figure consists in selecting one of the semantic characteristics among those which define the metaphoric term. When multiple meanings are possible, they are semantically related, as they come from a unique representation. Polysemy is thus considered an in-context (re-)building process from a unique representation. We pay most attention to the resolution of semantic incoherence due to verbal polysemy. Analysing the multiple meanings of French verbs allows us to make precise the semantic representations from which they are elaborated.

4) Les vagues couraient jusqu'aux rochers.
(The waves ran towards the rocks.)

The verbal metaphor in 4) may be interpreted by replacing to run with to move quickly, both of which are implied by the semantic description of this verb. Roughly speaking, the semantic features associated with to run are: /to move/ + /with feet/ + /quickly/... The hyperonym is not the only part of the meaning used to build a figurative meaning:

5) La robe étrangle la taille de la jeune fille. (The dress constricts the young girl's waist, with the French verb étrangler, which literally means to strangle, used to translate the notion conveyed by to constrict.)

In this example to strangle is defined as "to suffocate by grasping the neck"; the semantic incoherence cannot be resolved by the hyperonym but by the selection of the feature /grasp/
¹ In this paper, the examples illustrating polysemy will generally be given in French, with a translation into English, so as to preserve their polysemous nature, which would probably be lost in English. As we are not native English speakers, it was difficult to always find analogous examples and to be sure that they convey the same senses.
Computational Processing of Verbal Polysemy with Conceptual Structures
369
(synonym of to constrict), which specifies the method used to perform the action. To find the different meanings of a predicate, which are always semantically related, it is necessary to select its hyperonym, or to extract another part of its definition, or to combine both mechanisms (see for example the interpretation of to run in 4). These polysemous behaviours lead us to propose a hierarchical representation of the concepts, completed for each concept by a precise semantic description expressing the semantic features that differentiate it from its father (direct hyperonym) and from its brothers in the hierarchy. This representation is implemented in the conceptual graph formalism [7]: the hierarchy takes place in the type lattice, and the semantic descriptions are translated into type definitions.
3 Ontology, Definitions and Conceptual Graphs
From the preceding study, it is clear that the best suited tool to build the ontology is a dictionary, from which the different meaning components can be obtained. The entries constitute rather precise descriptions, generally including the hyperonym and referring to verbs whose meanings are related to the one defined. The different relations (such as mean, manner, goal, ...) that specify the verb with respect to its hyperonym are also given. Yet the definitions in a dictionary are not always homogeneous in their structure and content [4], [1]. Thus it is impossible to rely only on a dictionary to build our network. The method used to organise verbal concepts has been divided into two steps. First, we analysed about a hundred French verbs. These concepts were categorised in detail following precise criteria, such as a systematic definition of the kind of relations between the verb and its semantic features. The representation of each lexical item is called a conceptual schema, and corresponds to an enhanced dictionary definition. Each verbal concept is given in terms of its nearest hyperonym (the central event) and some semantic cases which specify it. In addition to the classical case relations (agent, object, mean, ...), four cases are essential for a complete verb description: the manner in which an event is realised, the method used to realise it, the result of the event, and its intrinsic goal. For example, the verbs to cut and to cover have the following semantic descriptions:

to cut: to divide (nearest hyperonym) a solid object (object) into several pieces (result) using an edge tool (mean) by going through the object (method)
to cover: to place (nearest hyperonym) something (object) over something else (support) in order to hide it (goal1) or protect it (goal2).

The verb hierarchy is organised following the associated case relations.
A verbal concept is the hyperonym (respectively hyponym) of another one if they share a common hyperonym and if its case structure presents one of the following features: (a) absence (respectively presence) of a value defined for a given case; for example, to divide is a hyperonym of to cut, because the definition of to cut includes a mean and a method; (b) presence of a case with multiple values (respectively a unique value); for example, to cover is a hyperonym of to plate and to veil, which each include only one of the possible goals (respectively to protect and to hide); (c) presence of a case with a generic value (respectively a specific value); for example, to decapitate is a hyperonym of to guillotine, since the mean case of the first (edge tool) is a super-type of the one of the second (guillotine). This bottom-up building of the hierarchy allows us to define some large semantic classes. Thus we determine about fifteen primitives, which appear to be similar to the classical case relations and mostly correspond to state verbs: Owner (ex: to own), Location (ex: to live in), Container (ex: to contain), Support (ex: to support), Patient (ex: to suffer), Time (ex: to exist), Experiencer (ex: to know), ... At the
[Fig. 1. Extract of the hierarchy (a) and description of to feed: (b) canonical graph and (c) type definition graph]
higher level of the hierarchy, the process verbs and the action verbs derive from these state primitives. Processes are expressed as Devenir (become)/Cesser (cease) + primitive, and action verbs as FaireDevenir/FaireCesser + primitive:
– DevenirContenant (to fill, to eat, to drink), DevenirTemps (to be born), DevenirExpérienceur (to learn)
– CesserContenant (to (become) empty), CesserTemps (to die), CesserExpérienceur (to forget)
– FaireDevenirContenant (to fill, to water), FaireDevenirTemps (to create, to give birth to, to calve), FaireDevenirExpérienceur (to teach)
– FaireCesserContenant (to empty), FaireCesserTemps (to interrupt, to kill)
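The hyperonymy criteria (a)–(c) above lend themselves to a mechanical test. The following sketch is purely illustrative and not the authors' implementation: the toy type table, the case names, and the encoding of a verbal concept as a mapping from case names to sets of filler types are all our assumptions.

```python
# Illustrative sketch: hyperonymy between verbal concepts decided from
# their case structures, following criteria (a)-(c) of the text.

SUPERTYPE = {"guillotine": "edge_tool"}  # assumed toy type hierarchy

def subsumes(general, specific):
    """True if `general` equals `specific` or is one of its ancestors."""
    while specific is not None:
        if specific == general:
            return True
        specific = SUPERTYPE.get(specific)
    return False

def is_hyperonym(u, v):
    """True if u is a hyperonym of v, i.e. v is strictly more specific."""
    strictly_more_specific = False
    for case, u_fillers in u.items():
        v_fillers = v.get(case)
        if v_fillers is None:
            return False              # v drops a case of u: not more specific
        # criterion (c): every filler of v specializes some filler of u
        if not all(any(subsumes(g, s) for g in u_fillers) for s in v_fillers):
            return False
        if v_fillers != u_fillers:    # (b)/(c): fewer values or sub-types
            strictly_more_specific = True
    if set(v) - set(u):               # criterion (a): v defines extra cases
        strictly_more_specific = True
    return strictly_more_specific

# to divide vs. to cut: cut adds the mean and method cases (criterion a)
divide = {"object": {"solid"}, "result": {"pieces"}}
cut = {"object": {"solid"}, "result": {"pieces"},
       "mean": {"edge_tool"}, "method": {"go_through"}}
# to decapitate vs. to guillotine: generic vs. specific mean (criterion c)
decapitate = {"object": {"person"}, "mean": {"edge_tool"}}
guillotine = {"object": {"person"}, "mean": {"guillotine"}}

print(is_hyperonym(divide, cut), is_hyperonym(decapitate, guillotine))
# prints: True True
```

Criterion (b) is subsumed here by the subset test on filler sets; a full implementation would of course compare complete canonical graphs rather than flat case dictionaries.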
Our study has covered about 2000 French verbs. We have thus constituted a large lexico-semantic network corresponding to more than 1000 verbal concepts, organised in a hierarchy (see Fig. 1). For each node, the description consists of: a sub-categorisation frame, a case structure (represented in a canonical graph) and a definition (represented in a definition graph) describing how it specialises its hyperonym, i.e. its super-type in the hierarchy (see Fig. 1). The association word/concept is defined in a global table. These descriptions have allowed us to build the lexico-semantic knowledge base used by the Natural Language Processing system developed at LIMSI [8]. The definition graphs have been built following the method described in the first part of this section. We thereby make sure that the definitions are not ad hoc definitions following a top-down building of the hierarchy, but respect the meanings of the words they represent. However, we cannot yet verify that the constraints due to the inheritance mechanism are satisfied: this constitutes the next step of our work.
4 Verbal Polysemy: Elements of Interpretation
As stated above, we built this hierarchy in order to interpret the lexical polysemy of French verbs. We assume that there exists a (proto)typical meaning associated with each verb: the prototype is the most often attested meaning. The so-called figurative meanings derive from this prototypical meaning. These figurative meanings are related to the semantic structure of the verb by taking into account either the nearest hyperonym (as in example 4) or other parts of the conceptual schema (as in example 5). But other processes, based on primitive substitutions, are necessary to interpret metaphoric senses. Let us consider the interpretation of the following example:

6) La fermière nourrit le feu avec des branchages. (The farmer's wife feeds the fire with lopped branches, to keep the fire going.)

The constraints expressed in the canonical graph of to feed are violated. But the definition graph (see Fig. 1) includes the particular meaning of this word in example 6: it is given in the sub-graph corresponding to to maintain (in French entretenir, which means to keep going in this context). The semantic constraints expressed in the canonical graph of to maintain are then fulfilled (particularly for the type of the object fire). Selecting parts of the semantic representation of lexical items is not the only process used to build figurative meanings; nevertheless, as we show below, it is a necessary condition for elaborating these meanings. Some figurative meanings rely on conceptual metaphors [2]. Thus, nourrir quelqu'un de ragots (to feed someone with ill-natured gossip) and abreuver quelqu'un de connaissances (to swamp someone with knowledge) may be interpreted from the metaphor Mind is Container. Such conceptual metaphors are interesting because they are rather general: they apply not only to a given verb, but to a whole class of verbs.
Thus, gaver de connaissances (to fill one's mind with knowledge), se nourrir de lectures (to improve the mind with reading), and dévorer un livre (to devour a book)
will be interpreted using the metaphor Mind is Container. These interpretation processes do not invalidate our model of polysemy. The semantic representation we propose is based on a hierarchy, so it is possible to go up to the semantic primitives on which verbal items depend. The extracts of the type lattice in Fig. 2 clarify the way the interpretation is done. Semantic primitives play a central role in interpreting figurative meanings due to conceptual metaphor. Thus all metaphoric uses, such as abreuver quelqu'un de connaissances (to swamp someone with knowledge), for which the analogy Mind is Container is relevant, will be interpreted by substituting the primitive EXPERIENCER for the primitive CONTAINER. From the metaphoric verb, the semantic class
[Fig. 2. Interpretation examples]
corresponding to the primitive is reached. Then the substitution rule (Container -> Experiencer) is applied, and the pertinent concept (FaireApprendre for Abreuver (to water), Lire (to read) for Dévorer (to devour)) is searched for in this branch using canonical graphs. The target concept must be described by a canonical graph expressing constraints that are satisfied by the arguments of the metaphoric sentence (in our examples knowledge and book). The most difficult point is that canonical graphs do not always discriminate the pertinent concept from the others belonging to the same branch (for example FaireApprendre, Inculquer (to inculcate), Enseigner (to teach), etc. have identical canonical graphs). This technical difficulty does not invalidate the model proposed for conventional metaphoric senses. Conceptual metaphors are not applied arbitrarily; it is the fact that a verbal item is related to a given primitive in its semantic structure which justifies the use of a particular metaphor. There is no "spontaneous generation" of meaning: verbs contain in their conceptual schemas the entry points towards other meanings. The following examples illustrate the classic SPACE -> TIME metaphor:
7) couper la parole (to interrupt someone) [interrompre (to interrupt)]
8) briser une conversation (to break off a conversation) [interrompre brusquement (to suddenly interrupt)]
9) entrecouper ses phrases de sanglots (to interrupt one's phrases with sobs) [interrompre fréquemment (to frequently interrupt)]

All the verbs couper, briser, entrecouper depend on the class FaireCesserEspace (expressing spatial discontinuity). Interpreting these metaphors consists in substituting the class FaireCesserTemps (temporal discontinuity) for this class. For the two last examples, the adverbs attached in the interpretation belong to the definitions of the verbs (briser (to smash): to break suddenly...; entrecouper: to cut frequently...). About ten substitution rules of this kind have been determined. They are probably not exhaustive, but have nevertheless been tested on 1000 verbs. Moreover, the fact that common meanings are shared by a class of verbs (grouped because they belong to the same branch) partially validates our ontology. Such figurative meaning interpretations remain rather poor, however. The following example illustrates this point.

10) Le paysan coupait souvent par le champ de blé. (The farmer often cut through the wheatfield.)

Polysemy is resolved by selecting the case method, i.e. to go through, associated with cut in its definition graph (see Sect. 3). But replacing to cut by to go through loses information: the implicit notion of spatial reduction (or more exactly of reduction of the distance moved) conveyed by to cut in this sentence is lost. A more precise interpretation would require determining which meaning parts of the metaphoric verb have to be transferred to the inferred meaning.
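The substitution mechanism described in this section can be sketched as follows. This is a simplified illustration, not the authors' system: the class names, the rule table, and the reduction of a canonical graph to a single object-type constraint are all our assumptions.

```python
# Illustrative sketch of metaphor interpretation by primitive substitution.

PRIMITIVE_OF = {                       # assumed class membership
    "abreuver": "FaireDevenirContenant",
    "faire_apprendre": "FaireDevenirExperienceur",
    "enseigner": "FaireDevenirExperienceur",
}
SUBSTITUTION_RULES = {                 # e.g. Container -> Experiencer
    "FaireDevenirContenant": "FaireDevenirExperienceur",
}
CANONICAL_OBJECT = {                   # canonical graph reduced to one constraint
    "abreuver": "liquid",
    "faire_apprendre": "information",
    "enseigner": "information",
}

def interpret(verb, object_type):
    """Return the concepts under which the sentence can be read.
    If the literal constraint holds, keep the verb; otherwise apply a
    substitution rule and search the target branch with the constraints."""
    if CANONICAL_OBJECT[verb] == object_type:
        return [verb]                              # literal reading holds
    target = SUBSTITUTION_RULES.get(PRIMITIVE_OF[verb])
    if target is None:
        return []
    # Several concepts of the branch may satisfy the constraints: this is
    # exactly the discrimination difficulty pointed out in the text.
    return [v for v, c in PRIMITIVE_OF.items()
            if c == target and CANONICAL_OBJECT[v] == object_type]

# "abreuver quelqu'un de connaissances": knowledge is not a liquid,
# so the Container branch is mapped onto the Experiencer branch.
print(interpret("abreuver", "information"))
```

The list returned for the metaphoric case contains both faire_apprendre and enseigner, mirroring the observation that identical canonical graphs leave the pertinent concept underdetermined.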
5 Related Works
Lexical classifications based on semantic criteria are relatively numerous in artificial intelligence. Semantic nets are the main mode of lexical knowledge representation, but they essentially concern nouns. Levin [3], however, proposes a summary classification of English verbs founded on syntactico-semantic criteria. The hierarchy lacks accuracy: Levin defines main classes (occasionally sub-classes) in which verbs are listed, but no semantic relations are defined between verbs within a same class (or sub-class). Verbs within a same class are assimilated to syntactic synonyms; from a semantic point of view, these verbs may therefore convey rather distant meanings. Some authors try to build precise semantic hierarchies of English verbs on a large scale [5]. They emphasise the complexity of this task, notably due to the different semantic fields implied in relations between verbs having connected meanings. Miller and his collaborators determine a particularisation relation grouping the different semantic components that distinguish a verb from its hyperonym. This relation between two verbs V1 and V2 (V1 hyponym of V2) is named troponymy and is expressed by the formula "to accomplish V1 is to accomplish V2 in a particular manner". For example, battle, war, tourney, duel, ... are troponyms of the verbal predicate fight. Troponyms of communication verbs imply the speaker's intention or his motivation to communicate, as in examine, confess, preach, ..., or the communication medium used: fax, email, phone, telex, .... Relying on these principles, they implemented a large lexical network (WordNet) that organises verbs and other lexical categories in terms of signified. Our work is in part inspired by their approach, but we systematically make precise the kind of relation between troponym concepts (case relations).
6 Conclusion
We have presented linguistic and computational aspects of verbal polysemy. The figurative meanings of a predicate represent as many shifts in meaning from the unique semantic structure that defines this verb. The resolution processes consist either in a simple selection of the pertinent elements of this representation (i.e. the pertinent subgraph of the definition graph) or in the same selection plus an inference from one of these elements (conceptual metaphors). Our knowledge base has been implemented in conceptual graphs but still needs to be verified for possible incoherence due to the inheritance mechanism. The psychological validity of the representation and of the treatment we propose is also being tested in an experiment. We do not claim that our work is exhaustive, either in the semantic representation proposed or in the processes for understanding the different senses. We define a general framework in which the links between the multiple senses of a verb may be expressed. Polysemy is one of the major characteristics of natural language; it is also one of the most complex to apprehend. Moreover, like the other contextual phenomena (anaphora, the implicit, etc.), polysemy is one of the major difficulties in Natural Language Processing.
References
1. Chibout, K., Masson, N.: Un réseau lexico-sémantique de verbes construit à partir du dictionnaire pour le traitement informatique du français. Actes du colloque LTT-AUPELF-UREF Lexicomatique et Dictionnairique, Lyon, Septembre 1995.
2. Lakoff, G., Johnson, M.: Les métaphores dans la vie quotidienne. Collection "Propositions", les éditions de Minuit (1985).
3. Levin, B.: English Verb Classes and Alternations. University of Chicago Press, 1993.
4. Martin, R.: Pour une logique du sens. Linguistique nouvelle. Presses Universitaires de France (1983).
5. Miller, G. A., Fellbaum, C., Gross, D.: WORDNET: a Lexical Database Organised on Psycholinguistic Principles. In Zernik (Ed.): Proceedings of the First International Lexical Acquisition Workshop, I.J.C.A.I., Détroit (1989).
6. Searle, J. R.: Metaphor. In Andrew Ortony (Ed.): Metaphor and Thought. Cambridge University Press (1979), pp. 284-324.
7. Sowa, J.: Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, Reading, Massachusetts (1984).
8. Vapillon, J., Briffault, X., Sabah, G., Chibout, K.: An object oriented linguistic engineering using LFG and CG. ACL/EACL Workshop: Computational Environments for Grammar Development and Linguistic Engineering, Madrid (1997).
Word Graphs: The Second Set

C. Hoede (1) and X. Liu (2)

(1) University of Twente, Faculty of Mathematical Sciences, P.O. Box 217, 7500 AE Enschede, The Netherlands, [email protected]
(2) Department of Applied Mathematics, Northwestern Polytechnical University, 710072 Xi'an, P.R. China
Abstract. In continuation of the paper of Hoede and Li on word graphs for a set of prepositions, word graphs are given for adjectives, adverbs and Chinese classifier words. It is argued that these three classes of words belong to a general class of words that may be called adwords. These words express the fact that certain graphs may be brought into connection with the graphs that describe the important classes of nouns and verbs. Some subclasses of adwords are discussed, as well as some subclasses of Chinese classifier words.

Key words: Knowledge graphs, word graphs, adjectives, adverbs, classifiers.

AMS Subject Classifications: 05C99, 68F99.
1 Introduction
We refer to the paper of Hoede and Li [1] for an introduction to knowledge graphs as far as needed for this paper. We only recall the following. Words are considered to be representable by directed labeled graphs. The vertices, or tokens, are indicated by squares and represent somethings. The arcs have certain types that are considered to represent the relationships between somethings as recognizable by the mind. The graphs that we will discuss are therefore considered to be subgraphs of a huge mind graph, representing the knowledge of a mind, and therefore also called a knowledge graph. These knowledge graphs are very similar to conceptual graphs, but are restricted as far as the number of types of relationship is concerned. There are two types of relationships. The binary relationships, the usual arcs, may have the following labels:

M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 375-389, 1998. © Springer-Verlag Berlin Heidelberg 1998
equ : Identity
sub : Inclusional part-ofness
ali : Alikeness
dis : Disparateness
cau : Causality
ord : Ordering
par : Attribution
sko : Informational dependency
The sko-relationship is used as a loop to represent universal quantification. Next to the binary relationships there are the n-ary frame-relations. There are four of these:

fpar   : Relationship of constituting elements with a concept, being a subgraph of the mind graph.
negpar : Negation of a certain subgraph.
pospar : Possibility of a certain subgraph.
necpar : Necessity of a certain subgraph.
These four frame relationships generalize the well-known logical operators. If a certain subgraph of the mind graph is the representation of a well-formed proposition p, this proposition is represented by the frame; ¬p is represented by the same subgraph framed with the negpar relationship, and the modal propositions ♦p and □p are represented by the same subgraph framed with the pospar and the necpar relationship respectively. In this way logical systems can be represented by different types of frames of very specific subgraphs. We refer to Van den Berg [2] for a knowledge graph treatment of logical systems. So logic is described by frames of propositions. If a subgraph of the mind graph does not correspond to a proposition, the framing, and the representation of the frame by a token, may still take place. Any such frame may be baptized, i.e. labeled with a word. The directed ali-relationship is used between a word and the token to type the token. Thus

stone —ali→ □
is to be read as "something like a stone". Note that the token may represent a large subgraph of the mind graph. In particular verbs may have large frame contents. Verbs are represented in the same way. So

hit —ali→ □
is the way the verb hit is represented. The directed equ-relationship is used between a word and a token to valuate or instantiate the token. So equ pluto /
ali
2
dog o
is to be read as "something like a dog equal to Pluto". The mind graph is considered to be a wordless representation of thought relationships between units of perception. The words come in when certain subgraphs are "framed and named". At the most elementary level the frame contents may be just one relationship. These are the first word graphs to start with. It turned out that prepositions have such very simple structures, and for that reason they formed the first set of word graphs. The frame contents of frames representing nouns and verbs express the definitions of the concepts (note that frames do literally take other concepts together). A lexicon of word graphs is being constructed at the University of Twente. In order to make the theme of this paper, adwords, clear we recall the preposition of. Three word graphs were given in total graph form, where arcs are also indicated by vertices with the type of relationship as label. These three word graphs were:

□ —sub→ □        □ —par→ □        □ —fpar→ □
Suppose we have the word combination red ball. The word ball can be represented, according to our formalism, by

ball —ali→ □ .
Now red is a word attributed to the word ball in the sense that its word graph is linked to the word graph of ball. For red we would take the following word graph into the lexicon:

colour —ali→ □ ←equ— red
analogous to the graph for Pluto. Now we say "red is the colour of the ball" and the word graphs are to be linked in the way for which we use the word of. As we see colour as an exterior attribute of ball, we choose the par-relationship and represent red ball by
ball —ali→ □
             ↑ par
colour —ali→ □ ←equ— red
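The linking of word graphs just drawn can be mimicked with a minimal token-and-arc structure. This encoding (the Graph class, the token naming scheme) is our own illustration, not the format of the Twente lexicon:

```python
# Minimal sketch of a knowledge graph: tokens are anonymous nodes,
# typed by ali-arcs to words and linked by relation arcs (par, equ, ...).

class Graph:
    def __init__(self):
        self.count = 0
        self.arcs = []                 # triples (source, label, target)

    def token(self):
        """Create a fresh anonymous token (a 'something')."""
        self.count += 1
        return f"t{self.count}"

    def arc(self, src, label, dst):
        self.arcs.append((src, label, dst))

# "red ball": something like a ball, with as attribute
# something like a colour equal to red.
g = Graph()
ball = g.token()                       # t1
g.arc("ball", "ali", ball)             # type the token as a ball
col = g.token()                        # t2
g.arc("colour", "ali", col)            # type the token as a colour
g.arc("red", "equ", col)               # instantiate the colour as red
g.arc(col, "par", ball)                # colour is an attribute of the ball

for arc in g.arcs:
    print(arc)
```

Adding background knowledge (e.g. that colour can be attributed to a ball) would then amount to licensing the par-arc even when only red and ball are uttered.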
It is in this way that red is a word linked to the word ball. It is usually called an adjective. Note that without the word red we would still have a coloured ball. Also note that we do not say red colour but do say the colour red. It should be clarifying that have was seen as be with, be being represented by the empty frame and with having the word graph □ —par→ □. So the graph might also be brought under words by "the ball has colour red". Note that people may differ in opinion on how to express red ball. We have just given our view. Quite a few adjectives are similar to red and can be seen as adwords linked by the par-relationship. However, there are other ways in which word graphs can be linked to the word graph of a noun or a verb. One particular way is by the fpar-relationship that links the constituents of a definition to the defined concept. Suppose, for example, that a stone is defined as "a structure of molecules" (which would not be precise enough, but lexica contain scores of imprecise definitions). If the type of the molecules is denoted in the graph, by means of an equ-arc, as silicon, we may speak of a silicon stone, and silicon is now functioning as an adjective. This type of adjective is of another nature than the adjective red, the difference being expressed by the way of linking, one time by a par-relationship and the other time by an fpar-relationship. The essential linguistic phenomenon is that word graphs are linked to other word graphs. As this can be done in various ways, to nouns and to verbs, it is more natural to speak of adwords. We will discuss adjectives and adverbs from this view point. The interesting phenomenon of classifiers in Chinese is closely related to our way of viewing adwords. In Chinese it is not well-spoken to say "a spear". One should say "a stick spear", where stick is a word expressing a classifying aspect of spear.
Similarly, a horse is not i ma but should be expressed as i pi ma, where pi is the classifier, the meaning of which seems to have been lost in the course of time. Yet the adword pi should be expressed in proper speaking. There are more than 400 of these classifiers. We will discuss several of them from the same view point as before and give word graphs for them.
In the word graph project we plan to start studying structural parsing as soon as sufficient word graphs are available. This will require a third set of word graphs, containing the remaining word types.
2 Adwords
In this paper we cannot give an extensive treatment of the grammatical aspects of adjectives, adverbs and classifiers. We will restrict ourselves to some major subclasses of these adwords. The book of Quirk, Greenbaum, Leech and Svartvik [3] was used for reference, more specifically Chapter 5. Our examples are chosen from this book. We will not stress syntactic problems. A certain knowledge graph is brought under words by expressing certain of its subgraphs by words. The way this is done differs from language to language. In an extremely simple example we have that red ball in English is uttered as ballon rouge in French. Any knowledge graph admits an utterance path. Usually there are several utterance paths, i.e. ways of uttering, in a linear order, words whose word graphs cover the knowledge graph. As our example red ball or ball with colour red shows, there are ways of bringing a knowledge graph under words that are more precise than others. In natural language often the less precise descriptions, like red ball, are used. That red is a colour and that colour can be attributed to a ball is background knowledge for the concepts red and ball that enables the short but incomplete utterance path. A lexicon may contain a word graph for red that includes the colour concept, but a word graph for ball that does not mention the possibility of linking to another concept by means of a par-relationship. A machine may then not be able to create a connected knowledge graph for red ball, unless it is instructed to interpret the syntactic fact that both words are uttered together as justifying the linking of their word graphs by some arc. For a more elaborate discussion of the interplay of semantics and syntax we refer to the thesis of Willems [4].

2.1 Adjectives
As word graphs are supposed to grasp the semantics of a word, according to the slogan "the structure is the meaning", we focus on Paragraphs 5.37 to 5.41 of Quirk et al., in which they give a semantic subclassification of adjectives.
They make the distinctions stative/dynamic, gradable/non-gradable and inherent/non-inherent. Their Table 5:2 looks as follows:

                    stative   gradable   inherent
black (coat)          +          +          +
brave (man)           –          +          +
british (citizen)     +          –          +
new (friend)          +          +          –

Adjectives are characteristically stative, most are gradable and most are inherent. The normal adjective type is all three, like black. Quirk et al. give the imperative as a way to distinguish stative from dynamic adjectives. One can say be careful but not be tall or be black, be british respectively be new. One can say be brave, explaining the minus sign in the first column. blacker, braver and newer are gradings but britisher is not possible, explaining the minus sign in the second column. Inherent adjectives characterize the referent of the noun directly. They consider black, true and british to be inherent adjectives. Before undertaking our own discussions we should reproduce their premodification examples on page 925, in which combinations of adwords are mentioned.

determiner | general     | age | colour | participle   | provenance | noun   | denominal | head
the        | hectic      |     |        |              |            |        | social    | life
the        | extravagant |     |        |              | london     |        | social    | life
a          |             |     |        | crumbling    |            | church |           | tower
a          |             |     | grey   | crumbling    | gothic     | church |           | tower
some       | intricate   | old |        | interlocking | chinese    |        |           | designs
a          | small       |     | green  | carved       | chinese    | jade   |           | idol
his        | heavy       | new |        |              |            |        | moral     | responsibilities
It is clear that proposals have been made for what Quirk et al. call semantic sets. We will do exactly that, but the basis of our proposal will be the types of relationships a noun, or verb, can have with other words.

The FPAR-adwords

We use the fpar-relationship to represent the definitional contents of a concept. Any word used in the definition might
Word Graphs: The Second Set
381
be called an fpar-adword. However, usually some restrictions are made. If the definition contains a preposition like of, this word is not considered an adword. Another remark that should be made is that a definition may contain the concept of colour. In that case the adjective, say grey, is considered to be inherent. The definition of an elephant may contain the statement that it is a grey animal. In black coat, however, the adjective cannot be considered to be inherent, like in the table given. The point is that colour is attributed subjectively; a red ball in green light looks black. Even in white light objects may have different colours for colour blind persons. For this reason we used the par-relationship and consider red to be a par-adword and colour to be non-inherent. Similarly brave and british are disputable as inherent adjectives, for different reasons. Braveness is present according to a judgement: one is considered to be brave by others. brave can be seen as an instantiation of judgement. Other adjectives of this type are kind and ugly. british does not describe an inherent aspect either. Anything that is part of Britain, seen as a frame, can be called british. In this way the frame name determines the adjective for its constituents. british may therefore be called an inverse fpar-adword. Examples of this type of adjectives are lamb in lamb meat and city in city council or church in church tower. We conclude that the distinction inherent/non-inherent had better be replaced by the distinction fpar/non-fpar. For material objects a typical fpar-adword expresses the sort of material. So in jade idol, jade describes the material and is a typical fpar-adword. If steam is hot water vapour then hot is an fpar-adword, as is water, instantiating relative temperature and material of the vapour. Note that we speak of relative temperature as hot is not a temperature, like fast is not a velocity.
Within a frame concepts may occur that allow a measure, like temperature or length. In those cases the corresponding fpar-adwords are gradable, like in hot, hotter or long, longer. Usually these adwords do not indicate absolute temperature or length but relative temperature and relative length. In “a two meter man”, the precise length two meter may be interpreted as an fpar-adword too.

The PAR-adwords

We use the par-relationship to represent exterior attribution. Judgements on a concept are typical exterior attributions.
We already classified brave as a par-adword. beautiful is another good example. An important class of words concerns space and time aspects. A ball may exist at some location at some moment in time. Its space-time coordinates will be seen as determined from the outside, i.e. by exterior attribution, and therefore represented by a par-relationship. Adjectives like early and sudden belong to this class and also old and new. The link to the main concept, or head as Quirk et al. call it, is by a par-link that connects the time aspect to the concept in the following way:

[Figure: a word graph in which a token with an ali-arc to something is linked by a par-arc to a token with an ali-arc to time]
The word graphs for the adjectives embed the time aspect in a more elaborate word graph in order to express the various meanings. In these graphs the time of the speech act may play an important role, like in the description of tense, a theme that we will not discuss here. A speech act may occur at time t0, whereas the intended description concerns something at time t1, often determined by the discourse of which the speech act is part. In a historical account one may read “the former king was tyrannical but the new king was a very kind person”. The adjective former refers to a time before a certain time t2, when the king was replaced; the adjective new refers to a time after that time t2. The fact that this took place before t0 is apparent from the past tense coming forward in was. The word graphs for former and new will contain an ord-relationship. In early, reference to a time interval will have to be made, expressing that the first part of the time interval is meant. This also holds for old or ancient. Differences between the three adjectives must come forward in the word graphs, but if these are left out by the speaker, in his choice of words, he might speak of “in the early days”, “in the old days” or “in the ancient days” to express the same thing. The more elaborate word graphs may become quite large. This situation is similar for dictionaries, where bad dictionaries give short definitions and good dictionaries try to give more precise definitions. We would like to recall here from [1] that in our theory the meaning of a word is in principle the whole graph considered, containing the word. This implies that different graphs, i.e. different contexts, give different meanings to a word.
The CAU-adwords

In the representation of language by knowledge graphs verbs are represented by a token and cau-relationships. Transitive verbs like write are represented as

[Figure: word graph for write: a token with an ali-arc to write, connected by cau-arcs to a subject token and an object token]
whereas intransitive verbs like sleep are represented as

[Figure: word graph for sleep: a token with an ali-arc to sleep, connected by a cau-arc to a subject token only]

Consider the leftmost token. It represents “something writing” respectively “something sleeping”. Thus writing and sleeping are adjectives of that something and are, for obvious reasons, classified as cau-adwords. The rightmost token in the first knowledge graph might represent a letter and can be described as “written letter”. So the adjective written can also be classified as a cau-adword. Again we should note, like for new, that the word graph should contain information referring to the time aspects. winning team, boiling water and married couple give examples of cau-adwords. The last example is somewhat tricky. Marrying usually involves two persons, and “A marries B” and “B marries A”. Then “A and B have got married”, or both are “in the state of marriage”. This brings us to a special class of adjectives, describing states. Suppose something is subject to a series of changes. At any time it is then in a certain state. In happy girl the adjective describes a state the girl is in; “girl being happy” is doing the same. The verb be was represented by the empty frame. Anything in the frame is. have, which was defined as be with, can and must are verbs that likewise correspond to the frame relationships that we distinguished. When they are used, they usually express states, focusing on the process rather than on the subject or object related to the process by a cau-relationship. Yet the adjectives describing states are discussed in this section, as we stress the relationship with verbs. States are exemplified by adjectives like ablaze, asleep or alive, where descriptions of processes stand central. The check on the distinction stative/dynamic, by trying out the imperative, like in be careful, corresponds to checking whether the adjective can be seen as describing a state.
It should be noted that predicative use of an adjective expresses a be-frame. the car is heavy instead of the heavy car expresses the “being heavy” of the car explicitly. In Turkish doktordadir literally means doctor at be. The very frequently used agglutination dir expresses the be-frame. Most of the a-adjectives, like ablaze, are predicative only. the house is ablaze can be said, the ablaze house cannot be said. In the car is heavy the use of the adjective heavy is predicative. The analogy is suggested by the word is. However, in this case this word is stems, e.g., from a truck is (defined as) a heavy car, which is shortened to a truck is heavy or even the car is heavy. This explains the predicative use of an adjective that is essentially an fpar-adword. heavy is pushed into the role of a state describer like ablaze.

The ALI-adwords

The ali-relationship between two concepts expresses the alikeness of the concepts. This relationship may be seen as primus inter pares, as the process of concept creation seems to depend heavily on becoming aware of alikeness. Prototype definitions express what has been seen as common properties of a set of somethings. Adjectives expressing that a concept looks like another concept may have specific endings. disastrous expresses a similarity with a disaster and industrious expresses a similarity with certain aspects of industry. Other endings are -ish as in foolish or -like as in childlike, for example in combination with behaviour. A special category of adjectives are ali-adwords that are themselves nouns. thunder in thunder noise or traitor in traitor knight express that the noise is like that heard when thunder occurs or that the knight acts like a traitor. Especially for this category of adjectives, nouns acting as adjectives, it becomes clear that the classification that we give in different types of adwords makes sense.

2.2 Adverbs
In our theory nouns and verbs are both represented by word graphs. In this respect nouns and verbs are describing concepts basically in the same way. However, nouns do not necessarily include cau-relationships in their definition, whereas verbs do. But for this difference a verb may be seen as a special type of noun. This means that there is no basic difference in the way other words may act as adwords of verbs in comparison to the way other words act as adwords of nouns. This is also underlined by the possibility of substantivation of verbs. Compare to play, playing and a play.
We will discuss only a few examples. In our view time and location of the act expressed by the verb are natural aspects to consider in relation to the verb. Many adverbs refer to these aspects and are classified as par-adwords. We mention often, outside, briefly, ever as examples. The explicit word graphs for these words may become quite large as rather complex aspects are expressed. Judgements and measurements are two aspects that are also often expressed by par-adwords. well, quite, extremely, enough, much, almost are reflecting judgements or measurements. A judgement may be interpreted as a subjective measurement, hence the treatment of these two aspects at the same time. cau-adwords refer to influences or, in most cases, to consequences of the acts described by the verbs. Hence adverbs like amazingly or surprisingly may be mentioned. ali-adwords include clockwise, or any of the many adverbs with ending -wise, but also words like too or as are used as adverbs and clearly should be classified as ali-adwords. brilliantly, like a brilliant, is representative of another subclass of these adwords. fpar-adwords are somewhat rare. In “the country is deteriorating economically” the adverb economically indicates in what respect the country deteriorates. The country’s economy must be seen as frame part and for this reason economically may be classified as an fpar-adword. Two remarks are still to be made. Firstly, there is a set of words, mentioned by Quirk et al., that are sometimes used as adverbs, but actually refer to the use of logic in language. nevertheless, however, though, yet, so, else are such words. We would prefer to include them in a special third list of word graphs, together with no and probably, to mention some other potential adverbs. Secondly, we would like to comment on an example of Quirk et al.: a far more easily intelligible explanation, showing compilation of adwords.
In traditional discussion we would have to decide whether we are dealing with adjectives or adverbs. The great advantage of our theory is that this discussion is avoided. Instead one may have the discussion about the classification that is to be given to each of the adwords. However, once a knowledge graph, expressing the text, is made out of the word graphs, the way these word graphs glue together, by which type of arc, immediately gives the answer in that discussion. An important aspect of this example is that the sheer possibility of compiling adwords in language is an argument for the modelling of language in terms of knowledge graphs, built from word graphs.
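The classification developed in this section can be made concrete with a small data-structure sketch. This is our own toy illustration, not software from the paper: word graphs are reduced to triples of tokens and typed arcs, and an adword is classified by the type of the arc that glues its word graph to the head noun or verb.

```python
# Toy illustration (not software from the paper): a word graph reduced to
# triples of tokens and typed arcs. An adword is named after the arc type
# that links its word graph to the head, as proposed above.

ARC_TYPES = {"fpar", "par", "cau", "ali", "equ", "sub", "ord"}

class WordGraph:
    def __init__(self):
        self.arcs = []  # (source_token, arc_type, target_token)

    def link(self, source, arc_type, target):
        assert arc_type in ARC_TYPES
        self.arcs.append((source, arc_type, target))

    def adword_class(self, adword, head):
        """Name the adword class after the arc connecting adword and head."""
        for src, typ, tgt in self.arcs:
            if {src, tgt} == {adword, head}:
                return f"{typ}-adword"
        return None

g = WordGraph()
g.link("jade", "fpar", "idol")      # material: part of the definitional frame
g.link("red", "par", "ball")        # colour: exterior attribution
g.link("winning", "cau", "team")    # causal role of the verb win
g.link("thunder", "ali", "noise")   # alikeness: noise like thunder

print(g.adword_class("jade", "idol"))  # fpar-adword
print(g.adword_class("red", "ball"))   # par-adword
```

In this sketch the adjective/adverb distinction indeed plays no role: only the arc type matters, mirroring the argument of the paragraph above.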
2.3 Classifiers in the Chinese language
In Chinese, a special class of words is formed by the quantity words, as the Chinese prefer to call them, which are also called classifiers.

FPAR-classifiers

One of the most frequent quantity words is ko, used in combination with a word like tung tsi, thing. For a thing in Chinese we should say yi ko tung tsi. There are many nouns for which the quantity word ko is used. As this example already shows, the quantity word may be seen as a word naming a subframe of a frame carrying the name of the noun. Hence here the relationship between noun and quantity word is an fpar-relationship, the quantity word is an fpar-adword and, in terms of knowledge graphs, we get the following graph:

[Figure: a token with an ali-arc to noun and an fpar-arc to a second token, which has an ali-arc to quantity word]
Here, the second token indicates the something described by the quantity word. Although a word like ko is not felt to have a meaning of its own by Chinese, it should refer to something felt as an essential property of the following noun. Many quantity words express the property of the noun that it describes a unit. Another example, yi feng xin for a letter, shows an adword feng that in the combination is not felt to have a meaning. However, there is a verb feng for folding and it is clear that xin (letter) is described in a way that expresses the property that letters usually consist of folded paper. For a well we have yi yan jing, where the quantity word yan has some meaning, namely hole. If a well is the configuration of water coming from a hole in the ground, the hole property of a well is clearly mentioned as property not to be deleted in the description. Quantity words like these should perhaps better be called property words or classifiers.

Other classifiers

Next to the property words, which have an fpar-relationship to the noun, there are many other classifiers that do have a separate meaning, but are used in combination with a noun to express a certain feature. These quantity words can be divided into 13 subclasses. We will discuss this in more detail now and start with a typical example: yi hu shui, a pot of water. In English too a class description of water is necessary. hu, or pot, is an adword for shui, or water. A pot typically contains a
liquid; as a container it has a specific shape and is made of some material like china, metal or glass. The important feature is that of containing a liquid. In knowledge graph representation we have in first instance:
[Figure: knowledge graph for a pot of water: a token with an ali-arc to pot and a token with an ali-arc to liquid and an equ-arc to water; par- and sub-arcs to tokens with ali-arcs to space express that the space taken by the liquid is a subspace of the space taken by the pot]
The graph can be extended to include the other features of pot like shape or material, but the containment feature gives the direct link to the noun. In terms of adwords we cannot say that pot is a par-adword; the linking to water is more complex. In fact our example is a clear example of a relationship in knowledge graph theory: two concepts are part of one graph and it is this graph that characterizes the relationship between the two concepts. If we have to baptize this relationship we would choose the word container or containment. So hu is an adword of shui linked to it by a relationship of complex nature, that is however representable by the basic types of arcs. There are quite a few adwords of the type container; ping or bottle is just one other example. We now list the 13 subclasses by giving a characterization of the class, like containment, an example, like yi hu shui, a pot of water, and, in one other case, the relevant structure relating the quantity word and the noun.
1. containment
   Example: yi hu shui (a pot of water). Graph: see above.
2. similarity
   Example: yi di shui (a drop of water)
   Graph: [Figure: knowledge graph for a drop of water, analogous to the pot graph: tokens with ali-arcs to drop and liquid, an equ-arc to water, and par-arcs to tokens with ali-arcs to shape]
Remark: This subclass too is quite numerous. Because of lack of space we just mention the 11 other subclasses with an example only.

3. set
   Example: yi chuan putao (a cluster of grapes)
4. time
   Example: yi tian shidian (a day of time)
5. length
   Example: yi mi chang (a meter of length)
6. area
   Example: yi yingmu tudi (an acre of land)
7. weight
   Example: yi ke yinzi (a gram of silver)
8. volume
   Example: yi sheng shui (a liter of water)
9. money
   Example: zhe jiazhi yi meiyuan (this costs a dollar)
10. chinese character
   Example: “wang” zi bihua you si ge (“wang” word stroke(s) has four element(s))
11. number
   Example: yi da wazi (a dozen of socks)
12. action
   Example: da ta yi zhang (hit him a palm)
13. complexes
   Example: san jiaci feiji (three times of flight)

In a more elaborate version of this paper word graphs and remarks are given for these other subclasses.
3 Discussion
In the first paper on word graphs nouns, verbs and prepositions were discussed. In this paper a second set of word graphs has been presented, that we called adwords. Seeing adjectives, adverbs and classifiers in Chinese as instances of words that glue to other words like nouns or verbs in certain specific ways, a completely different view on these word classes has been developed. We do not have enough space here to show that our way of classifying, according to the way graphs are linked, indeed solves quite a few problems concerning them. In fact it should be stressed that the correctness of our method of representing words is still to be proven by showing that linguistic problems can indeed be solved. Preliminary results show that the approach is really quite promising. But first we have to construct a word graph lexicon. In principle we have covered most types of words already. There remain some other types, like the “logical” words or word compositions like e.g. disconnect, and we plan to consider them in a third paper. Chinese quantity words will be collected extensively in a special report.
References

1. Hoede, C., Li, X.: Word Graphs: The First Set. In: Conceptual Structures: Knowledge Representation as Interlingua, Auxiliary Proceedings of the Fourth International Conference on Conceptual Structures (ICCS '96), Bondi Beach, Sydney, Australia (P.W. Eklund, G. Ellis and G. Mann, eds.) (1996) 81-93
2. Berg, H. van den: Knowledge Graphs and Logic: One of Two Kinds. Dissertation, University of Twente, The Netherlands, ISBN 90-9006360-9 (1993)
3. Quirk, R., Greenbaum, S., Leech, G., Svartvik, J.: A Grammar of Contemporary English. Longman (1972)
4. Willems, M.: Chemistry of Language: A Graph-Theoretical Study of Linguistic Semantics. Dissertation, University of Twente, The Netherlands, ISBN 90-9005672-6 (1993)
Tuning Up Conceptual Graph Representation for Multilingual Natural Language Processing in Medicine

Anne-Marie Rassinoux, Robert H. Baud, Christian Lovis, Judith C. Wagner, and Jean-Raoul Scherrer

Medical Informatics Division, University Hospital of Geneva, Switzerland
[email protected]
Abstract. Multilingual natural language processing (NLP), whether it concerns analysis or generation of sentences, requires a sound language-independent representation for grasping the deep meaning of narratives. The formalism of conceptual graphs (CGs), especially designed to cope with natural language semantics, constitutes a good repository for dealing with the compositionality and intricacies of medical language. This paper describes our experiment, as part of the European GALEN project, for exploiting a conceptual graph representation of medical language, upon which multilingual medical language processing is performed.
1 Introduction
Health care, like any institution managing a huge amount of textual information, has to face an important counterbalancing challenge. On the one hand, the ability to rapidly and easily access and retrieve relevant information on a patient is a crucial need. Such a functionality is better achieved when information is encoded and structured in databases or electronic patient records, thus allowing the formulation of precise queries. On the other hand, effective communication is better performed through the expressiveness of natural language, whether it is among health care providers themselves, or to directly address the patient. It appears, therefore, that both textual documents and structured information must coexist in the same environment. The approaches developed to switch from one to another are at the heart of research and development performed in the domain of medical language understanding [1]. This has led to substantial and ongoing results related to both the analysis [2] and generation [3] of medical texts. This article presents substantial issues resulting from our experience in handling a unique knowledge representation - being used as input for the generation task and as output for the analysis task - for grasping the deep meaning of medical sentences. The peculiarities of medical language and their implications for knowledge representation are first described. Then the adjustments achieved to mediate between the implicitness and expressiveness of natural language (NL) on the one hand, and the accuracy and granularity of knowledge representation (KR) on the other hand, are exposed.

M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 390-397, 1998. © Springer-Verlag Berlin Heidelberg 1998
2 Background
A domain knowledge model that represents medical information in a language-independent and structured way constitutes a major cornerstone upon which multilingual applications can be built. This requires the description of a consistent view of all the relevant entities of the domain and their associated attributes. Such a description must reflect an appropriate level of generality and granularity, useful for integrating knowledge from diverse sources as well as maintaining a standard representation that can be exchanged and reused in the future. These requirements are at the basis of the European GALEN project1. This project led to the development of a common reference model (the CORE model) which is expressed in a language-independent manner through a descriptive logic language (the GRAIL Kernel) [4]. It affords a high-level ontology that organizes concepts and attributes (also called semantic links or relationships) upon which multiple inheritance can be applied. Moreover, it allows composite concepts to be built from existing ones, provided that the constraints required for such compositions are available. This compositional modeling facilitates shifting between different levels of detail. This presents advantages for NLP, as such compositions allow conceptual structures to be formulated in natural language using more or less precise medical vocabulary. The importance of multilingual natural language processing was particularly emphasized during the first year's work in GALEN-IN-USE2 [5]. On the one hand, natural language generation is of paramount interest for information presentation as it allows knotted and complex GRAIL expressions to be naturally displayed to the user with sentences formulated in his native (or at least known) natural language. On the other hand, natural language analysis is of significant help for data capture in so far as it produces a structured representation which can then be (semi) automatically mapped to the GALEN CORE model.
Therefore, linguistic tools including generation [6, 7] as well as analysis [8] of medical texts, have been developed as part of the GALEN project. Pursuing a previous in-house experiment [9] with the formalism of conceptual graphs (CG) [10], the latter has been chosen as the modeling formalism for representing medical language that is handled by NLP tools. Due to the fact that the CG formalism shares many features with the GRAIL language in which the CORE model is expressed [4], the translation from GRAIL to CG representation was a straightforward process using Definite Clause Grammar (DCG). The NLP tools have been mainly tested on a new French coding system for surgical procedures named NCAM3 [11]. From this classification, more than 500 surgical procedures belonging to the urology domain were modeled, while respecting

1 GALEN stands for General Architecture for Language, Encyclopaedias and Nomenclatures in medicine, and is currently funded as part of Framework IV of the EC Healthcare Telematics research program.
2 GALEN-IN-USE is the current phase of the GALEN project, whose aim is to apply tools and methods of GALEN to assist in the collaborative construction and maintenance of surgical procedure classifications.
3 NCAM stands for Nomenclature Commune des Actes Médicaux and is currently developed by the University of Saint Etienne in France.
the multi-level sanctioning of the GALEN representation, and then regenerated into natural language phrases. Fig. 1 shows the CG representing the French surgical procedure „Pyélo-calicoscopie par voie percutanée“ and its subsequent translations in English and French. The various implemented operations are further depicted according to their use in the generation task. They can also be applied during the analysis of surgical procedures in order to adjust the level of granularity of the conceptual representation produced.

Establishing the focus: cl_Inspecting
[cl_SurgicalDeed: y](rel_isMainlyCharacterisedBy)->[cl_performance](rel_isEnactmentOf)->[cl_GeneralisedProcess: x]\\.

CG:
[ [SurgicalDeed](isMainlyCharacterisedBy)->[performance](isEnactmentOf)->[[Inspecting](playsClinicalRole)->[SurgicalRole]\](actsSpecificallyOn)->[ArbitraryBodyConstruct](hasArbitraryComponent)->[RenalPelvis] (hasArbitraryComponent)->[CalixOfKidney]\ (hasPhysicalMeans)->[Endoscope] (hasSpecificSubprocess)->[SurgicalApproaching](hasPhysicalMeans)->[[Route](passesThrough)->[SkinAsOrgan]\]\\\\].

Language annotations (from concepts to words):
en: scopy; fr: scopie
en: surgical; fr: chirurgical
en: pyelo; fr: pyélo
en: calico; fr: calico
en: endoscopic; fr: endoscopique
en: by; fr: par
en: percutaneous route; fr: voie percutanée

Relational contraction: reld_byRouteOf
[cl_GeneralisedProcess: x](rel_hasSpecificSubprocess)->[cl_SurgicalApproaching](rel_hasPhysicalMeans)->[cl_Route: y]\\.

Type contraction: cl_PercutaneousRoute
[cl_Route: x](rel_passesThrough)->[cl_SkinAsOrgan]\.

Output of the generation tool for English and French:
en: ’endoscopic surgical pyelocalicoscopy by percutaneous route’
fr: 'pyélocalicoscopie chirurgicale endoscopique par voie percutanée'

Fig. 1. Operations applied during the generation task on the CG representing the French rubric „Pyélo-calicoscopie par voie percutanée“4

4 Internally, a concept is prefixed by cl_, a simple relationship by rel_ and a composite relationship by reld_. For the sake of simplicity, these prefixes are omitted in the above CG.
3 Modeling Medical Language
Modeling medical language requires taking into account the variations in the description of medical terms, while supporting a uniform representation expressing the medical concepts characterized by attributes and values. However, the way information is modeled does not always correspond to the way information is expressed in natural language. Different compromises must therefore be set up.

3.1 Annotation of Medical Information
In order to make available and operational the semantic content of the CORE model for NLP, the major task has consisted in annotating the model in the corresponding languages treated (mainly French, English, German, and Italian). These linguistic annotations are performed at two levels. First, conceptual entities are annotated with ’content words’ that correspond mostly to the syntactic categories of nouns and adjectives. Either single words, parts of words like prefixes and suffixes, or multiword expressions are permitted as annotation (see the language annotation part in Fig. 1). Second, annotations of relationships are more frequently achieved through ’function words’. The latter are conveyed either through grammatical structures, such as the adjectival structure or the noun complement (as in the examples pyelic calculus or calculus of the renal pelvis for the relationship rel_hasSpecificLocation), or directly through grammatical words such as prepositions (as in the example urethroplasty for perineal hypospadias where the preposition ’for’ denotes the relationship rel_hasSpecificGoal). The annotation process, which only occurs on named concepts, enables the creation of multilingual dictionaries that settle a direct bridge between concepts and language words. Every meaningful primitive concept belonging to the GALEN CORE model needs to be annotated. Besides, composite concepts, for which a definition is maintained at the conceptual level, may be annotated based on the availability and conciseness of words in the language treated. The verbosity of medical language and the complexity of the modeling style can then be respectively tuned by annotating composite concepts and composite relationships. For example, the concept cl_PercutaneousRoute is directly annotated with concise expressions, and the relationship reld_byRouteOf, especially created for NLP purposes, allows the nested concept cl_SurgicalApproaching to be masked during linguistic treatments (see Fig. 1). 
However, the combinatorial aspect of the compositional approach as well as the continually growing creation of new medical terms, make the annotation task unbounded and time-consuming. This has led to the implementation of procedural treatments at the linguistic level that map syntactic structures upon semantic representation but also include the management of standard usage of prefixes and suffixes. The latter is especially important for surgical procedures that are commonly expressed through compound word forms [12]. This means that the word pyelocalicoscopy is never described as an entry in the English dictionary (as the description of the concept denoting a pyelocalicoscopy is not explicitly named in the model), but is automatically generated according to its corresponding semantic description. Automated morphosemantic treatment also implies that the linguistic module be aware of abstract constructions used at the conceptual level to handle the
enumeration of constituents. In Fig. 1, both the abstract concept cl_ArbitraryBodyConstruct and the relationship rel_hasArbitraryComponent are used to clarify the different body parts on which the inspection occurs.
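The automated composition of compound words such as pyelocalicoscopy, described above, can be sketched as follows. The per-language annotations are those shown in Fig. 1; the purely concatenative composition rule and the ordering of the parts are our simplifying assumptions, not the actual GALEN morphosemantic module.

```python
# Sketch of the morphosemantic treatment described above, assuming simple
# concatenation of prefix/suffix annotations. The annotation table follows
# Fig. 1; the compose() rule is our own illustration, not GALEN code.

ANNOTATIONS = {
    "cl_RenalPelvis":   {"en": "pyelo", "fr": "pyélo"},
    "cl_CalixOfKidney": {"en": "calico", "fr": "calico"},
    "cl_Inspecting":    {"en": "scopy", "fr": "scopie"},
}

def compose(concepts, lang):
    """Generate a compound word never stored as a dictionary entry."""
    return "".join(ANNOTATIONS[c][lang] for c in concepts)

# Assumed ordering: body-part components first, process suffix last.
parts = ["cl_RenalPelvis", "cl_CalixOfKidney", "cl_Inspecting"]
print(compose(parts, "en"))  # pyelocalicoscopy
print(compose(parts, "fr"))  # pyélocalicoscopie
```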
3.2 The Relevance of the Focus for NLP
A recognized property of CG formalism, over other formalisms such as the frame system, is its ability to easily turn over the representation, i.e. to draw the same graph from a different head concept. For example, the following conceptual graph [cl_Pain]->(rel_hasLocation)->[cl_Abdomen] can be rewritten into [cl_Abdomen]->(rel_isLocationOf)->[cl_Pain]. Even if these two graphs appear equivalent at first sight, from the conceptual viewpoint, some subtleties can be pointed out when shifting to NL. The former graph is naturally translated into abdominal pain or pain in the abdomen, whereas the second one tends to be translated into painful abdomen. In medical practice, the interpretation underlying these two clinical terms significantly differs. The key issue here is that, for NLP purposes, the head concept of a graph (such as cl_Pain in the first graph and cl_Abdomen in the second one) is precisely considered as the focus of the message to be communicated. The rest of the graph, therefore, is only there to characterize this main concept in more detail. Such an observation questions the focus-neutral property of CG formalism in so far as linguistic tools add special significance to the head concept or focus of a graph. Indeed, the latter is interpreted as the central wording upon which the rest of the sentence is built.
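The turning over of a graph can be sketched with triples and a table of inverse relationships. This is an illustrative reconstruction, under the assumption that every relationship has a named inverse, as in the rel_hasLocation/rel_isLocationOf pair above.

```python
# Sketch of "turning over" a CG from a different head concept. The
# pain/abdomen example is the one discussed above; the inverse table and
# triple representation are our own simplifying assumptions.

INVERSE = {"rel_hasLocation": "rel_isLocationOf",
           "rel_isLocationOf": "rel_hasLocation"}

def turn_over(graph, new_head):
    """Rewrite (head, rel, tail) triples so that new_head comes first."""
    return [(new_head, INVERSE[rel], head) if tail == new_head
            else (head, rel, tail)
            for head, rel, tail in graph]

g = [("cl_Pain", "rel_hasLocation", "cl_Abdomen")]  # "abdominal pain"
print(turn_over(g, "cl_Abdomen"))  # head is now cl_Abdomen: "painful abdomen"
```

Turning the graph over twice restores the original, which is exactly the conceptual equivalence that the linguistic focus breaks.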
3.3 Contexts or Nested Conceptual Graphs
Contexts constitute a major topic of interest for both the linguist and conceptual graph communities. Recent attempts to formally define the notion of context have come to light [13, 14], and the use of context for increasing the expressiveness of NL representation is clearly asserted [15]. For GALEN modeling, contexts appear as a major way of avoiding ambiguity when representing medical language. In particular, associating a specific role with a concept (as for example, cl_SurgicalRole for identifying a surgical procedure, or cl_InfectiveRole or cl_AllergicRole for specifying a pathological role) allows for reasoning and then restricting the inference process to what is sensible to say about this concept. Such packaging of information is graphically represented through brackets delimiting the nested graph (see the CG displayed in Fig. 1 where two contexts are surrounded by bold brackets). Handling these contexts at the linguistic level results mainly in enclosing the scope of the nested graph in a single proposition, which can be expressed by a simple noun phrase or through a more complex sentence.
Tuning Up Conceptual Graph Representation

4 Formal Operations to Mediate between KR and NL
The previous section emphasized the gaps that exist between KR and medical language phrases. Indeed, as KR aims to describe, in an unambiguous way, the meaning conveyed by NL, such a structure is naturally more complete and accurate than what is simply expressed in NL. Mediating between these two means of communication implies setting up specific formal operations for readjusting KR and NL expressions.

4.1 Basic Operations on CGs
Balancing the degree of granularity, and thus the complexity, of a conceptual representation can be achieved in two different ways. On the one hand, a conceptual graph can be contracted in order to display information in a more concise manner. The contraction operation, which consists in replacing a connected portion of a graph by an explicit entity, is basically grounded on the projection operation (see [10], p. 99). On the other hand, a conceptual graph can be expanded in order to add, and thus make explicit, precise information on the semantic content of the graph. The expansion operation, which consists in replacing a composite entity by its full definition, is based on the join operation (see [10], p. 92). As the general guideline for the generation task in the GALEN project is to produce phrases 'as detailed as necessary but as concise as possible', the projection operation appears as the central means to mediate with the complexity of KR. In order to adjust this operation for particular usage, it has been necessary to provide, in addition to the projected graph, the hanging graph, the list of cut points and, finally, the specializations performed during the projection. The hanging graph only embeds the remaining portions of the original graph that were connected to formal parameters (i.e. it is composed of one or two parts depending on the number of formal parameters, x and y, present in the definition). All other hanging subgraphs are clearly considered as cut points. Each specialization performed in the original graph, whether it concerns a relationship or a concept, is recorded in a list. Each of these components is then checked, as explained in the following section, for its particular behavior in the three following situations: the setting-up of the focus, the contraction of conceptual definitions, and the management of contexts.

4.2 Refining Basic Operations for NLP
In the simplest case, the focus of a graph is defined as the head concept of the graph. However, in KR, this solution is frequently abandoned in favor of a representation that allows the focus, as well as other concepts mentioned at the same level, to be represented uniformly. This is the case for the example shown in Fig. 1, where the general concept cl_SurgicalDeed, representative of the type of concept to be modeled, is taken as the head concept of the graph. Then specific relationships, such as rel_isMainlyCharacterisedBy and rel_isCharacterisedBy, are respectively used to clarify the 'primary' procedure and a number of, possibly optional, additional procedures. Establishing the focus of the graph in this case consists in restoring the
primary procedure as the head concept of the graph, by projecting the corresponding abstraction shown in Fig. 1 on the initial graph. Then, the projected graph is directly replaced by the specialization of the concept identified by the formal parameter x. In the example, the latter corresponds to the concept cl_Inspecting, which is a descendant of cl_GeneralisedProcess in the conceptual hierarchy. Moreover, this operation prohibits the presence of cut points as well as any specialization done on concepts other than the formal parameter x. The two hanging graphs (if not empty) are then appended to the new head of the graph. In order to retain the level of detail of the conceptual representation, the type contraction does not allow specialization. Moreover, it normally prohibits the presence of cut points, which are signs of a resulting disconnected graph. However, for the generation task, such a rule can be bypassed. For example, let us consider the following graph: [cl_SurgicalExcising]->(rel_actsSpecificallyOn)->[cl_Adenoma]->(rel_hasLocativeAttribute)->[cl_ProstateGland]. Assuming that the composite concept cl_Adenomectomy exists in the model, the contraction of its corresponding definition would produce the cut point (rel_hasLocativeAttribute)->[cl_ProstateGland]. But, as the type contraction in the generation process is intended to ensure the conciseness of the produced NL expressions, by translating a portion of a graph into precise words, the cut point can be joined to the hanging graph. This contributes to the generation of the valid NL expression prostatic adenomectomy from the above graph. For the relational contraction, the projection operation permits specialization on concepts and relationships, as these relational definitions, specifically introduced for NLP purposes, are commonly expressed in the most general way possible. However, cut points are not permitted.
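The bypassed rule can be illustrated with a small sketch (hypothetical helper names, not the GALEN implementation): the subgraph matched by the composite concept's definition is replaced by that concept, and a remaining cut point is re-anchored on it rather than rejected.

```python
# Sketch (hypothetical, not the GALEN code) of type contraction for
# generation: edges covered by a composite concept's definition are
# replaced by the composite concept, and cut points left dangling are
# re-attached (joined) to it instead of causing a rejection.

def contract(edges, definition_edges, composite):
    """Replace the edges matched by the definition with the composite
    concept; remaining edges touching matched concepts become cut
    points that are re-anchored on the composite."""
    matched_nodes = {n for e in definition_edges for n in (e[0], e[2])}
    result = []
    for head, rel, tail in edges:
        if (head, rel, tail) in definition_edges:
            continue  # this edge is now covered by the composite concept
        # re-anchor cut points on the new composite concept
        head = composite if head in matched_nodes else head
        tail = composite if tail in matched_nodes else tail
        result.append((head, rel, tail))
    return result

graph = [
    ("cl_SurgicalExcising", "rel_actsSpecificallyOn", "cl_Adenoma"),
    ("cl_Adenoma", "rel_hasLocativeAttribute", "cl_ProstateGland"),
]
definition = [graph[0]]  # assumed definition subgraph of cl_Adenomectomy
print(contract(graph, definition, "cl_Adenomectomy"))
# [('cl_Adenomectomy', 'rel_hasLocativeAttribute', 'cl_ProstateGland')]
```

The resulting single edge is what a generator could render as prostatic adenomectomy.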
Finally, contexts are treated first of all by looking for the definition of a composite concept, already described in the model, that can be successfully projected on the nested graph. This is the case in Fig. 1 for the contextual information describing the percutaneous route, which is replaced by the concise concept cl_PercutaneousRoute. In all other cases, the boundaries of the contexts are simply removed and the nested graph is merged into the main graph, as performed for the surgical role in Fig. 1.
5 Conclusion
Our experience with managing the conceptual graph formalism for NLP has reinforced our belief that a logical, expressive, and tractable representation of medical concepts is a requisite for dealing with the intricacies of medical language. In spite of the effort undertaken to independently manage conceptual knowledge (which in this case is mainly modeled within the GALEN project) and linguistic knowledge (which is handled by linguistic tools), it clearly appears that fine-tuning of both sources of knowledge is a prerequisite for building concrete multilingual applications. Such an adjustment affects both the KR and the multilingual NLP tools, and is realized through declarative as well as procedural processes. On the one hand, it has been necessary to add declarative knowledge through the specification of both multilingual annotations and language-independent definitions. On the other hand, the procedural adjustment has been mainly achieved through the implementation of morphosemantic
treatment at the linguistic level, and the refinement of conceptual operations for handling the modeling style at the KR level. All these compromises have proved adequate for smoothly and methodically counterbalancing the granularity and complexity of KR with the implicitness and expressiveness of NL.
References

1. McCray, A.T., Scherrer, J.-R., Safran, C., Chute, C.G. (eds.): Special Issue on Concepts, Knowledge, and Language in Health-Care Information Systems (IMIA). Meth Inform Med 34 (1995)
2. Spyns, P.: Natural Language Processing in Medicine: An Overview. Meth Inform Med 35(4/5) (1996) 285-301
3. Cawsey, A.J., Webber, B.L., Jones, R.B.: Natural Language Generation in Health Care. JAMIA 4 (1997) 473-482
4. Rector, A.L., Nowlan, W.A., Glowinski, A.: Goals for Concept Representation in the GALEN Project. In: Safran, C. (ed.): Proceedings of SCAMC'93. New York: McGraw-Hill, Inc. (1993) 414-418
5. Rogers, J.E., Rector, A.L.: Terminological Systems: Bridging the Generation Gap. In: Masys, D.R. (ed.): Proceedings of the 1997 AMIA Annual Fall Symposium. Philadelphia: Hanley & Belfus, Inc. (1997) 610-614
6. Wagner, J.C., Baud, R.H., Scherrer, J.-R.: Using the Conceptual Graphs Operations for Natural Language Generation in Medicine. In: Ellis, G. et al. (eds.): Proceedings of ICCS'95. Berlin: Springer-Verlag (1995) 115-128
7. Wagner, J.C., Solomon, W.D., Michel, P.-A. et al.: Multilingual Natural Language Generation as Part of a Medical Terminology Server. In: Greenes, R.A., Peterson, H.E., Protti, D.J. (eds.): Proceedings of MEDINFO'95. North-Holland: HC&CC, Inc. (1995) 100-104
8. Rassinoux, A.-M., Wagner, J.C., Lovis, C. et al.: Analysis of Medical Texts Based on a Sound Medical Model. In: Gardner, R.M. (ed.): Proceedings of SCAMC'95. Philadelphia: Hanley & Belfus, Inc. (1995) 27-31
9. Rassinoux, A.-M., Baud, R.H., Scherrer, J.-R.: A Multilingual Analyser of Medical Texts. In: Tepfenhart, W.M., Dick, J.P., Sowa, J.F. (eds.): Proceedings of ICCS'94. Berlin: Springer-Verlag (1994) 84-96
10. Sowa, J.F.: Conceptual Structures: Information Processing in Mind and Machine. Reading, MA: Addison-Wesley Publishing Company (1984)
11. Rodrigues, J.-M., Trombert-Paviot, B., Baud, R. et al.: Galen-In-Use: An EU Project Applied to the Development of a New National Coding System for Surgical Procedures: NCAM. In: Pappas, C., Maglaveras, N., Scherrer, J.-R. (eds.): Proceedings of MIE'97. Amsterdam: IOS Press (1997) 897-901
12. Norton, L.M., Pacak, M.G.: Morphosemantic Analysis of Compound Word Forms Denoting Surgical Procedures. Meth Inform Med 22(1) (1983) 29-36
13. Sowa, J.F.: Peircean Foundations for a Theory of Context. In: Lukose, D. et al. (eds.): Proceedings of ICCS'97. Berlin: Springer (1997) 41-64
14. Mineau, G.W., Gerbé, O.: Contexts: A Formal Definition of Worlds of Assertions. In: Lukose, D. et al. (eds.): Proceedings of ICCS'97. Berlin: Springer (1997) 80-94
15. Dick, J.P.: Using Contexts to Represent Text. In: Tepfenhart, W.M., Dick, J.P., Sowa, J.F. (eds.): Proceedings of ICCS'94. Berlin: Springer-Verlag (1994) 196-213
Conceptual Graphs for Representing Business Processes in Corporate Memories

Olivier Gerbé¹, Rudolf K. Keller², and Guy W. Mineau³

¹ DMR Consulting Group Inc., 1200 McGill College, Montréal, Québec, Canada H3B 4G7, [email protected]
² Université de Montréal, C.P. 6128, Succursale Centre-Ville, Montréal, Québec, Canada H3C 3J7, [email protected]
³ Université Laval, Québec, Québec, Canada G1K 7P4, [email protected]
Abstract. This paper presents the second part of a study conducted at DMR Consulting Group during the development of a corporate memory. It presents a comparison of four major formalisms for the representation of business processes: UML (Unified Modeling Language), PIF (Process Interchange Format), WfMC (Workflow Management Coalition) framework and conceptual graphs. This comparison shows that conceptual graphs are the best suited formalism for representing business processes in the given context. Our ongoing implementation of the DMR corporate memory – used by several hundred DMR consultants around the world – is based on conceptual graphs, and preliminary experience indicates that this formalism indeed offers the flexibility required for representing the intricacies of business processes.
1 Introduction
Charnel Havens, EDS (Electronic Data Systems) Chief Knowledge Officer, presents in [5] the issues of knowledge management. With a huge portion of a company's worth residing in the knowledge of its employees, the time has come to get the most out of that valuable corporate resource by applying management techniques. The challenge companies will have to meet is the memorization of knowledge as well as its storage and its dissemination to employees throughout the organization. Knowledge may be capitalized on and managed in corporate memories in order to ensure standardization, consistency and coherence. Knowledge management requires the acquisition, storage, evolution and dissemination of knowledge acquired by the organization [14], and computer systems are certainly the only way to realize corporate memories [15] which meet these objectives.

M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 401-415, 1998. © Springer-Verlag Berlin Heidelberg 1998
O. Gerbé, R.K. Keller, and G.W. Mineau
DMR Consulting Group Inc. has initiated the IT Macroscope project [7], a research project that aims to develop methodologies allowing organizations: i) to use IT (Information Technology) to increase competitiveness and innovation in both the service and product sectors; ii) to organize and manage IT investments; iii) to implement information system solutions both practically and effectively; and iv) to ensure that IT investments are profitable. In parallel with methodology development, tools for designing and maintaining these methodologies, designing training courses, and managing and promoting IT Macroscope products were designed. These tools implement the concept of a corporate memory. This corporate memory, called the Method Repository, plays a fundamental role. It captures, stores [3], retrieves and disseminates [4] throughout the organization all the consulting and software engineering processes and the corresponding knowledge produced by experts in the IT domain. During the early stage of the development of the Method Repository, the choice of a knowledge representation formalism was identified as a key issue. That led us to define specific requirements for corporate memories, to identify suitable knowledge representation formalisms, and to compare them in order to choose the most appropriate one. We identified two main aspects: knowledge structure and dynamics, i.e. business processes, together with activities, events, and participants. The first part of the study [2] led us to adopt the conceptual graph formalism for structural knowledge. Uniformity of the formalism used in the Method Repository was one consideration, but not the decisive one, in adopting conceptual graphs for the dynamic aspect, too. Rather, our decision is based on the comparison framework presented in this paper.
In our comparison, we studied four major business modeling formalisms or exchange formats, UML (Unified Modeling Language), PIF (Process Interchange Format), the WfMC (Workflow Management Coalition) framework, and conceptual graphs, against our specific requirements. The choice of these four formalisms was motivated by the requirement to build our solution on existing or de facto standards. Our study demonstrates that conceptual graphs are particularly well suited for representing business processes in corporate memories since they support: (i) shared activities, and (ii) management of instances. The paper is organized as follows. Section 2 introduces the basic notions of business processes as used in this paper. Section 3 defines specific requirements for the representation of business processes in corporate memories. Section 4 compares the four formalisms. Finally, Section 5 reports on the ongoing implementation of the Method Repository and discusses future work.
2 Basic Notions
In this section, we present basic notions relevant to the representation of business processes. The main notions for representing the dynamics of an enterprise are processes, activities, participants (input, output, and agent), events (preconditions and postconditions), and the notions of sequence and parallelism of activity
executions. These notions build upon some commonly used definitions in enterprise modeling, as summarized in the following paragraph. A process is seen as a set of activities. An activity is a transformation of input entities into output entities by agents. An event marks the end of an activity; the event corresponds to the fulfilment of both the activity's postcondition and the precondition of its successor activity. An agent is a human or material resource that enables an activity. An input or output is a resource that is consumed or produced by an activity. The notions of sequence and parallelism define the possible order of activity executions. Sequence specifies an order of executions and parallelism specifies independence between executions. Figure 1 presents the notions of activity, agent, input, and output. Activities are represented by a circle and participants of activities are represented by rectangles, linked to their respective activities by arcs; the direction of an arc defines the type of participation: input, output or agent. There is no notational distinction between input and agent. Note that this simple process representation exclusively serves for introducing terminology and for illustrating our requirements.
Fig. 1. Activity with input, output and agent.
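These basic notions can be sketched as a simple data structure; the class and field names below are our assumptions, not part of the Method Repository.

```python
# A minimal sketch (assumed names) of the basic notions: an activity
# transforms inputs into outputs and is enabled by agents.
from dataclasses import dataclass, field

@dataclass
class Activity:
    name: str
    inputs: list = field(default_factory=list)   # consumed resources
    outputs: list = field(default_factory=list)  # produced resources
    agents: list = field(default_factory=list)   # human/material resources

# the activity of Fig. 1
cut = Activity("Cut Window Panes",
               inputs=["Fabrication Order"],
               agents=["Glazier"],
               outputs=["Window Panes"])
print(cut.outputs)  # ['Window Panes']
```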
Figure 2 illustrates the notions of sequence and parallelism by a process composed of five activities. The activity Write Production Order is the first activity of the process, the activities Build Frame and Cut Panes are executed in parallel, Assemble Window follows the activities Build Frame and Cut Panes, and finally Deliver Window terminates the process. Note that we only have to consider the representation of parallel activities or sequential activities; all other cases can be represented by these two cases by splitting activities into sub-activities.
3 Requirements

This section introduces the two main requirements underlying our study: the representation of processes sharing activities and the management of instances that are involved in a process. Obviously, there exist many other requirements for representing a business process in a corporate memory. Since these other requirements are mostly met by all the formalisms studied, we decided to focus on the two main requirements mentioned above.
Fig. 2. A process as a set of activities.
3.1 Sharing Activities
Let us consider the case of two processes that share the same activity. Figure 3 illustrates this setting.
Fig. 3. Sharing Activities.
The example depicted in Fig. 3 deals with the fabrication of a product which is made out of two components, a software component and a hardware component, as in a cellular phone, a microwave oven or a computer. The first process describes the development of the software component. It is composed of the activities Design Software, Validate Specifications, and Write Software Code. The second process describes the development of the hardware component. It is composed of the activities Design Hardware, Validate Specifications, and Build Hardware. The activity Validate Specifications is a synchronization activity and is shared by the two processes. The problem in this example is the representation and identification of the two processes. Each process is composed of three activities, with one of them being in common. Therefore the formalism must offer reuse of parts of process definitions or support some kind of shared-variable mechanism. To support the representation of business processes in a corporate memory, a formalism must offer features to represent processes sharing the same activities.
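A minimal sketch of this requirement, assuming a representation where an activity is a plain object: both process definitions hold a reference to one shared activity rather than two copies.

```python
# Sketch of the sharing requirement: two processes referencing the
# same activity object, so the shared activity is represented once.
validate = {"name": "Validate Specifications"}

software_process = [{"name": "Design Software"},
                    validate,
                    {"name": "Write Software Code"}]
hardware_process = [{"name": "Design Hardware"},
                    validate,
                    {"name": "Build Hardware"}]

# both processes contain the *same* activity, not two copies
print(software_process[1] is hardware_process[1])  # True
```

A formalism that only allows activities to be declared inside a single process definition cannot express this identity and would force a duplicate.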
3.2 Instance Management
To illustrate the problem of instance management, let us consider the example of a window manufacturer who has a special department for building non-standard size windows. Figure 4 presents the window fabrication process.
Fig. 4. The Window Problem.
A fabrication order is established from a client order. A fabrication order defines the size and material of the frame and the size and thickness of the panes to insert into the frame. The fabrication order is sent to the frame builder and glass cutter teams, which execute the order. Then the frame and panes are transmitted to the window assembly team, which inserts the panes into the frame. The problem of this team is to insert the right panes (size and thickness) into the right frames (size and material). Some frames take more time to build than others, so the frames may be finished in a different order than the panes. This problem can be solved by the assembly team by assembling frames and panes in conformity with the fabrication order. At the notational level, this requires the possibility of specifying instances of input and output participants. To support the representation of business processes in a corporate memory, the formalism must offer features to represent and manage the related instances needed by different processes.
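The matching the assembly team performs can be sketched as follows; the dictionaries and field names are illustrative assumptions.

```python
# Sketch of instance management: the assembly team matches each frame
# and pane set to the fabrication order instance that produced them,
# regardless of the order in which they were finished.
frames = [{"order": 2, "material": "oak"},   # order 2 finished first
          {"order": 1, "material": "pvc"}]
panes = [{"order": 1, "thickness": 4},
         {"order": 2, "thickness": 6}]

def assemble(frames, panes):
    """Pair every frame with the panes of the same fabrication order."""
    panes_by_order = {p["order"]: p for p in panes}
    return [(f, panes_by_order[f["order"]]) for f in frames]

windows = assemble(frames, panes)
print([(f["material"], p["thickness"]) for f, p in windows])
# [('oak', 6), ('pvc', 4)]
```

Without the ability to name the fabrication order instance shared by the inputs and outputs, the notation cannot express this constraint.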
4 Formalisms

This section presents the four business process modeling formalisms of our study. These formalisms offer representation features for describing, exchanging, and executing business processes. Each of the studied formalisms supports the representation of the basic notions introduced in Section 2, so we concentrate on the specific requirements discussed above. Against these requirements we have evaluated the four formalisms: UML [1] (Unified Modeling Language), PIF [8] (Process Interchange Format), the WfMC framework [6] (Workflow Management Coalition), and conceptual graphs. Other formalisms, Petri nets [16] and CML [11, 12], have been considered but not included in this study because they are not well suited to representing business processes or are not formal enough.

4.1 Unified Modeling Language
In [2] we presented how to represent the static structure in UML [1] (Unified Modeling Language). Let us recall that UML, developed by Grady Booch, Jim Rumbaugh and Ivar Jacobson from the unification of the Booch method, OMT and OOSE, is considered a de facto standard. UML provides several kinds of diagrams that show different aspects of the dynamics of processes. Use Case diagrams show interrelations between the functions provided by a system and the external agents that use these functions. Sequence diagrams and Collaboration diagrams present interactions between objects by specifying the messages exchanged among them. State diagrams describe the behavior of objects of a class or the behavior of a method in response to a request. A state diagram shows the sequence of states an object may have during its lifetime. It also shows the requests responsible for state transitions, and the responses and actions of objects corresponding to requests. Activity diagrams have been recently introduced in UML. They are used to describe processes that involve several types of objects. An activity diagram is a special case of state diagram where states represent the completion of activities. In the context of a corporate memory, activity diagrams are the most relevant and we present their main concepts in what follows. In UML, there are two types of execution of activities: executions of activities that represent atomic actions, called ActionStates, and executions of non-atomic sequences of actions, called ActivityStates. Exchanges of objects among actions are modeled by object flows, called ObjectFlowStates. ObjectFlowStates implement the notions of inputs and outputs. Agents are represented by Swimlanes in activity diagrams. However, it is also possible to define an agent as a participant of an activity and to establish explicitly a relationship between the agent and the activity. Figure 5 shows how to model the Cut Window Panes activity with participants.

Fig. 5. UML - The cut window pane Activity.

Activity diagrams show possible scenarios; this means that activity diagrams show objects instead of classes. Dashed arrows link inputs and outputs to activities. Processes may be represented using activity diagrams in UML, and Fig. 6 shows an example of the window building process. Solid arrows between activities represent the control flow.
Fig. 6. UML - The whole Process.
Sharing Activities As detailed in [1], UML does not support adequate representation features for sharing activities. However, activity diagrams are new in the definition of the language and all cases have not yet been presented.

Instance Management In contrast to the representation of structure [2], the process representation is done at the instance level. Activity diagrams involve objects, not classes, and therefore it is possible to represent the window problem by using the object fabrication order, which specifies frame and panes. Figure 7 shows a representation of the window problem.
Fig. 7. UML - The Window Problem.
4.2 Process Interchange Format (PIF)
The PIF (Process Interchange Format) workgroup, composed of representatives from companies and universities, developed a format to exchange the specifications of processes [8]. A PIF process description is a set of frame definitions. Each frame specifies an instance of one class of the PIF metamodel. Figure 8 shows the PIF metamodel. It is composed of a generic class ENTITY, from which all other classes are derived, and of four core classes: ACTIVITY, OBJECT, TIMEPOINT, and RELATION. Subclasses of ACTIVITY and OBJECT are respectively DECISION and AGENT. Class RELATION has seven subclasses: the subclasses CREATES, MODIFIES, PERFORMS, and USES define relationships between ACTIVITY and OBJECT; the subclass BEFORE defines a predecessor relationship between two points in time; the subclass SUCCESSOR defines a successor relationship between two activities; and ACTIVITY-STATUS defines the status of an activity at a point in time.

Fig. 8. PIF - Metamodel.

Figure 9 shows the representation of an activity using the PIF format.

(define-frame ACT1 :own-slots
  ((Instance-Of ACTIVITY) (Name "Cut Window Panes") (End END-ACT1)))
(define-frame END-ACT1 :own-slots ((Instance-Of TIMEPOINT)))
(define-frame AGT1 :own-slots ((Instance-Of AGENT) (Name "Glazier")))
(define-frame PRFRMS1 :own-slots
  ((Instance-Of PERFORMS) (Actor AGT1) (Activity ACT1)))
(define-frame INPUT1 :own-slots ((Instance-Of OBJECT) (Name "Fabrication Order")))
(define-frame USES1 :own-slots ((Instance-Of USES) (Activity ACT1) (Object INPUT1)))
(define-frame OUTPUT1 :own-slots ((Instance-Of OBJECT) (Name "panes")))
(define-frame CRTS1 :own-slots ((Instance-Of CREATES) (Activity ACT1) (Object OUTPUT1)))

Fig. 9. PIF - Activity with participants.

ACT1 defines the cut window panes activity as an instance of ACTIVITY with a name and a relation to END-ACT1. END-ACT1 represents the end of the activity and is defined as a point in time. Then come the definitions of the three participants; each participant is defined in two parts: the definition of the participant itself and the definition of the relationship between the activity and the participant. With the PIF process interchange format and framework, there is no explicit definition of a process. A process is the set of defined activities. The example shown in Fig. 10 shows how two activities ACT1 and ACT2 are linked by a BEFORE relationship.

(define-frame ACT1 :own-slots
  ((Instance-Of ACTIVITY) (Name "Write Fabrication Order") (End END-ACT1)))
(define-frame ACT2 :own-slots
  ((Instance-Of ACTIVITY) (Name "Build Frame") (End END-ACT2)))
(define-frame END-ACT1 :own-slots ((Instance-Of TIMEPOINT)))
(define-frame END-ACT2 :own-slots ((Instance-Of TIMEPOINT)))
(define-frame ACT1-ACT2 :own-slots
  ((Instance-Of BEFORE) (Preceding-Timepoint END-ACT1) (Succeeding-Timepoint END-ACT2)))

Fig. 10. PIF - Process.
Sharing Activities The PIF format supports the representation of several sequences of activities. It is possible to define in one file more than one sequence of activities by a set of frames that are instances of BEFORE. However, it is not possible to explicitly identify several processes.

Instance Management With the PIF format, activities and the participants involved in the activities are described at the type level. Therefore, it is not possible to identify instances in PIF activity definitions.

4.3 Workflow Reference Model
The Workflow Management Coalition (WfMC) defines in the Workflow Reference Model [6] a basic metamodel that supports process definition. The Workflow Reference Model defines six basic object types to represent relatively simple processes. These types are: Workflow Type Definition, Activity, Role, Transition Conditions, Workflow Relevant Data, and Invoked Application. Figure 11 shows the basic process definition metamodel.

Fig. 11. WfMC - Basic Process Definition MetaModel.

The Workflow Management Coalition has also published a Process Definition Interchange in version 1.0 beta [17] that describes a common interface for the exchange of process definitions between workflow engines. Figure 12 presents the definition of the activity Cut Window Panes using this exchange format.

ACTIVITY Cut_Window_Panes
  PARTICIPANT Glazier, Fabrication_Order
  POST CONDITION Window_Panes exists
END ACTIVITY

PARTICIPANT Glazier
  TYPE HUMAN
END PARTICIPANT

DATA Fabrication_Order
  TYPE COMPLEX DATA
END DATA

DATA Window_Panes
  TYPE REFERENCE
END DATA

Fig. 12. WfMC - Activity with Participants.

Participants (inputs or agents) of an activity are defined explicitly. Data that are created or modified by an activity are defined in the postconditions of the activity or as output parameters of applications invoked during activity execution. In the WfMC Process Definition Interchange format, a process is defined as a list of activities and a list of transitions that specify in which order activities are executed. In Fig. 13 of the following section, examples of definitions of activities in the WfMC interchange format are shown.
Sharing Activities Processes are defined using the keywords WORKFLOW and END_WORKFLOW, which respectively begin and end a process definition. In a process definition, it is possible to use activities or participants that have been defined in another process definition. In the example shown in Fig. 13, two processes are defined with a common activity. The common activity is defined in process 1 and reused in process 2.

Instance Management Process definitions are given at the type level. However, the conditions that fire an activity or that are realized at the end of an activity are expressed using Boolean expressions with variables. In theory, it is possible to represent the window problem, but version 1.0 beta of the Process Definition Interchange [17] gives few indications on how to realize it.
Conceptual Graphs for Representing Business Processes

WORKFLOW PROCESS1
  ACTIVITY Design_Software ... END_ACTIVITY
  ACTIVITY Validate_Specifications ... END_ACTIVITY
  ACTIVITY Write_Software_Code ... END_ACTIVITY
  TRANSITION FROM Design_Software TO Validate_Specifications END_TRANSITION
  TRANSITION FROM Validate_Specifications TO Write_Software_Code END_TRANSITION
END_WORKFLOW

WORKFLOW PROCESS2
  ACTIVITY Design_Hardware ... END_ACTIVITY
  ACTIVITY Build_Hardware ... END_ACTIVITY
  TRANSITION FROM Design_Hardware TO Validate_Specifications END_TRANSITION
  TRANSITION FROM Validate_Specifications TO Build_Hardware END_TRANSITION
END_WORKFLOW

Fig. 13. WfMC - Processes Sharing Activities.
4.4 Conceptual Graphs and Processes
In conceptual graph theory, there is no standard way to represent processes. Processes have not been extensively studied and only a few works are related to the representation of processes. John Sowa in [13] presents some directions to represent processes. Dickson Lukose [9] and Guy Mineau [10] have proposed executable conceptual structures. We present below a possible metamodel to represent processes that fulfills corporate memory requirements as expressed in Section 3. The metamodel (Fig. 14) is composed of three basic concepts: ACTIVITY, PROCESS, and EVENT. An activity
TYPE ACTIVITY(x) IS
  [T:*x](INPUT)<-[T:*i]
        (OUTPUT)<-[T:*e]
        (AGENT)<-[T:*a]
        (DEPENDS-ON)->[PRECONDITION:*pre]
        (REALIZES)->[POSTCONDITION:*post].

TYPE EVENT(x) IS
  [T:*x](END)->[EVENT:*ev1]
        (FOLLOWS)<-[EVENT:*ev2].

TYPE PROCESS(x) IS
  [T:*x](FIRST)<-[EVENT:*].
Fig. 14. Conceptual Graphs - Metamodel.
is defined by its inputs and outputs, by the agents that enable the activity, and by preconditions and postconditions. Preconditions define conditions or states that must be verified to fire the execution of the activity; postconditions define states or conditions that will result from the execution of the activity. An event is a point in time that marks the end of an activity; it marks the realization of the postcondition of the activity. A process is defined as a set of events that represent the execution of a set of activities.
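This three-concept metamodel can be sketched as plain records. The following is an illustrative Python sketch, not from the paper; the field names are assumptions mirroring the relation names in Fig. 14:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Activity:
    # Defined by inputs, outputs, enabling agents, and pre/postconditions
    name: str
    inputs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)
    agents: List[str] = field(default_factory=list)
    preconditions: List[str] = field(default_factory=list)
    postconditions: List[str] = field(default_factory=list)

@dataclass
class Event:
    # A point in time marking the end of an activity (END relation);
    # FOLLOWS chains events into an execution order
    ends: Activity
    follows: Optional["Event"] = None

@dataclass
class Process:
    # A set of events, reached from a FIRST event, representing the
    # execution of a set of activities
    first: Event

cut = Activity("CUT-WINDOW-PANES", inputs=["ORDER"], outputs=["PANES"],
               agents=["GLAZIER"], postconditions=["ORDER conforms to PANES"])
proc = Process(first=Event(ends=cut))
print(proc.first.ends.name)  # → CUT-WINDOW-PANES
```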
Using this metamodel, the Cut-Window-Panes activity is defined by the definition graph presented in Fig. 15, where two variables with the same name represent the same object.

TYPE CUT-WINDOW-PANES(x) IS
  [ACTIVITY:*x](AGENT)<-[GLAZIER:*]
               (INPUT)<-[ORDER:*o]
               (OUTPUT)<-[PANES:*v]
               (REALIZES)->[POSTCONDITION: [ORDER:*o](CONFORMS)<-[PANES:*v]].
Fig. 15. Conceptual Graphs - The Cut Window Panes Activity.
The process to build a window is represented by the definition graph shown in Fig. 16.

TYPE BUILD-WINDOW(x) IS
  [PROCESS:*x](FIRST)<-[EVENT:*ev1](END)->[WRITE-FABRICATION-ORDER:*]
              (FOLLOWS)<-[EVENT:*ev2a](END)->[BUILD-FRAME:*],
              (FOLLOWS)<-[EVENT:*ev2b](END)->[CUT-WINDOW-PANES:*]
              (FOLLOWS)<-[EVENT:*ev3](END)->[ASSEMBLE-WINDOW:*]
              (FOLLOWS)<-[EVENT:*ev4](END)->[DELIVER-WINDOW:*].
Fig. 16. Conceptual Graphs - Process.
Sharing Activities. The proposed model allows the representation of processes that share a same activity (as indicated by variables under a global coreference assumption¹). Figure 17 shows two processes that share the same activity VALIDATE-SPECIFICATIONS. Each process is defined by a sequence of events, and one event of each process marks the end of the shared activity.

TYPE PROCESS1(x) IS
  [PROCESS:*x](FIRST)<-[EVENT:*ev1a](END)->[DESIGN-HARDWARE:*]
              (FOLLOWS)<-[EVENT:*ev2a](END)->[VALIDATE-SPECIFICATIONS:*vs]
              (FOLLOWS)<-[EVENT:*ev3a](END)->[BUILD-HARDWARE:*].

TYPE PROCESS2(x) IS
  [PROCESS:*x](FIRST)<-[EVENT:*ev1b](END)->[DESIGN-SOFTWARE:*]
              (FOLLOWS)<-[EVENT:*ev2b](END)->[VALIDATE-SPECIFICATIONS:*vs]
              (FOLLOWS)<-[EVENT:*ev3b](END)->[WRITE-SOFTWARE-CODE:*].

Fig. 17. Conceptual Graphs - Processes Sharing Activities.

¹ The proposed model assumes global coreference: two variables with the same identifier represent the same concept.
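Under the global coreference assumption, a shared variable such as *vs in Fig. 17 denotes one activity object referenced from both processes. In object terms, this can be sketched as follows (illustrative Python, not the authors' implementation):

```python
class Activity:
    def __init__(self, name):
        self.name = name

# One shared instance, like the coreferenced variable *vs in Fig. 17
validate_specs = Activity("VALIDATE-SPECIFICATIONS")

process1 = [Activity("DESIGN-HARDWARE"), validate_specs, Activity("BUILD-HARDWARE")]
process2 = [Activity("DESIGN-SOFTWARE"), validate_specs, Activity("WRITE-SOFTWARE-CODE")]

# Both processes refer to the very same object, not two copies
print(process1[1] is process2[1])  # → True
```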
Instance Management. Figure 18 shows that, with the use of variables and the global coreference assumption, conceptual graphs support the representation of the window problem.

TYPE WRITE-FABRICATION-ORDER(x) IS
  [ACTIVITY:*x](INPUT)<-[CLIENT-ORDER:*c]
               (OUTPUT)<-[ORDER:*o].

TYPE BUILD-FRAME(x) IS
  [ACTIVITY:*x](INPUT)<-[ORDER:*o]
               (OUTPUT)<-[FRAME:*f]
               (REALIZES)->[POSTCONDITION: [ORDER:*o](CONFORMS)<-[FRAME:*f]].

TYPE ASSEMBLE-WINDOW(x) IS
  [ACTIVITY:*x](INPUT)<-[ORDER:*o]
               (INPUT)<-[PANES:*p]
               (INPUT)<-[FRAME:*f]
               (DEPENDS-ON)->[PRECONDITION: [ORDER:*o](CONFORMS)<-[PANES:*p]
                                                      (CONFORMS)<-[FRAME:*f]]
               (OUTPUT)<-[WINDOW:*w].

TYPE CUT-WINDOW-PANES(x) IS
  [ACTIVITY:*x](INPUT)<-[ORDER:*o]
               (OUTPUT)<-[PANES:*p]
               (REALIZES)->[POSTCONDITION: [ORDER:*o](CONFORMS)<-[PANES:*p]].

Fig. 18. Conceptual Graphs - The Window Problem.

The concept type definitions of WRITE-FABRICATION-ORDER, BUILD-FRAME, CUT-WINDOW-PANES, and ASSEMBLE-WINDOW specify that the frame and panes involved in ASSEMBLE-WINDOW must conform to the fabrication order.

4.5 Summary
Table 1 presents a summary of this survey on business process representation formalisms. This summary shows that the framework proposed by the WfMC

Table 1. Summary

        Sharing Activities   Instance Management
  UML   No                   Yes
  PIF   Yes                  No
  WfMC  Yes                  Yes
  CG    Yes                  Yes
and conceptual graphs fulfill our requirements for the representation of business processes in corporate memories. However, the first part of our study [2] identified conceptual graphs as the formalism best suited for representing knowledge structures. Therefore, for the sake of uniformity of formalism, we chose conceptual graphs.
5 Experience and Future Work
Using the conceptual graph formalism, a corporate memory has been developed at the Research & Development Department of DMR Consulting Group Inc. in order to memorize the methods, know-how and expertise of its consultants. This corporate memory, called Method Repository, is a complete authoring environment used to edit, store and display the methods used by the consultants of DMR. The core of the environment is the CG Knowledge Base; it is a knowledge engineering system based on conceptual graphs. Four methods are commercially delivered: Information Systems Development, Architecture, Benefits Realization, and Strategy; their documentation in paper and in hypertext format is generated from conceptual graphs. About two hundred business processes have been modeled, and from about 80,000 conceptual graphs we generated more than 100,000 HTML pages in both English and French that can be browsed using commercial Web browsers. This paper has described the research we have done to identify which formalism is the most suitable for the representation of business processes in corporate memories. We have compared four formalisms, and this comparison has shown, as in a previous study [2], that conceptual graphs are a good response to the specific requirements involved in the development of corporate memories.
References

[1] G. Booch, J. Rumbaugh, and I. Jacobson. Unified Modeling Language, Version 1.1. Rational Software Corporation, 1997.
[2] O. Gerbé. Conceptual graphs for corporate knowledge repositories. In Proceedings of the 5th International Conference on Conceptual Structures, pages 474–488, 1997.
[3] O. Gerbé, B. Guay, and M. Perron. Using conceptual graphs for methods modeling. In Proceedings of the 4th International Conference on Conceptual Structures, 1996.
[4] O. Gerbé and M. Perron. Presentation definition language using conceptual graphs. In Peirce Workshop Proceedings, 1995.
[5] C. Havens. Enter, the chief knowledge officer. CIO Canada, 4(10):36–42, 1996.
[6] D. Hollingsworth. The Workflow Reference Model. Workflow Management Coalition, 1994.
[7] DMR Consulting Group Inc. The IT Macroscope Project, 1996.
[8] J. Lee, M. Gruninger, Y. Jin, T. Malone, A. Tate, G. Yost, and other members of the PIF Working Group. The PIF Process Interchange Format and Framework (May 24, 1996), 1996. Available at http://soa.cba.hawaii.edu/pif/.
[9] D. Lukose. MODEL-ECS: Executable conceptual modelling language. In Proceedings of the Knowledge Acquisition Workshop (KAW'96), 1996.
[10] D. Lukose and G.W. Mineau. A comparative study of dynamic conceptual graphs. Accepted for publication at the 11th KAW, 1998.
[11] A. Schreiber, B. Wielenga, H. Akkermans, W. Van de Velde, and A. Anjewierden. CML: The CommonKADS conceptual modelling language. In L. Steels, A. Schreiber, and W. Van de Velde, editors, Proceedings of the 8th European Knowledge Acquisition Workshop (EKAW'94), pages 1–24. Springer-Verlag, 1994.
[12] G. Schreiber, B. Wielenga, H. Akkermans, W. Van de Velde, and A. Anjewierden. CML: The CommonKADS conceptual modelling language. In Proceedings of the 8th European Knowledge Acquisition Workshop (EKAW'94), 1994.
[13] J. Sowa. Processes and participants. In P. Eklund, G. Ellis, and G. Mann, editors, Proceedings of the 4th International Conference on Conceptual Structures, ICCS'96, pages 1–22. Springer, 1996.
[14] E.W. Stein. Organizational memory: Review of concepts and recommendations for management. International Journal of Information Management, 15(1):17–32, 1995.
[15] G. van Heijst, R. van der Spek, and E. Kruizinga. Organizing corporate memories. In Proceedings of the Knowledge Acquisition Workshop, 1996.
[16] WG11. High-Level Petri Net Standard, Working Draft, Version 2.5, 1997.
[17] Workflow Management Coalition. Interface 1: Process Definition Interchange, 1996.
Handling Specification Knowledge Evolution Using Context Lattices

Aldo de Moor¹ and Guy Mineau²

¹ Infolab, Tilburg University, P.O. Box 90153, 5000 LE Tilburg, The Netherlands, [email protected]
² Université Laval, Department of Computer Science, Quebec City, Canada, G1K 7P4, [email protected]
Abstract. Internet-based information technologies have considerable potential for improving collaboration in professional communities. In this paper, we explain the concept of user-driven specification of network information systems that these communities require, and we describe some problems related to finding the right focus for adequate user involvement. A methodological approach to the management of specification knowledge definitions, which involves composition norms, is summarized. Subsequently, an existing conceptual graph based definitional framework for contexts is presented. Conceptual graphs are a simple, general, and powerful way of representing and reasoning about complex knowledge structures. This definitional framework organizes such graphs in context lattices, allowing for their efficient handling. We show how context lattices can be used for structuring composition norms. The approach makes use of context lattices in order to automatically verify specification constraints. To enable structured specification discourse, this mechanism is used to automatically select relevant users to be involved, as well as the information appropriate for building their discourse agendas. Consequently, this paper shows how conceptual graphs can play an important role in the development of this key Internet-based activity.
1 Introduction
More and more distributed professional communities, such as research networks, are discovering the potential of collaboration through electronic media such as the Internet. However, several factors make it hard to determine the optimal, or even just adequate, use of information technology to support these networks in their collaborative activities [1]. One reason is that most knowledge creation activities are complex, situated, and dynamic. Another complicating factor is that numerous networked information tools are available, and it is often difficult to determine which ones to use for which task purposes. Furthermore, system specification becomes even harder as it must also be user-driven, meaning that the users themselves are to discover 'breakdowns' in their use of the system and negotiate specification changes with other users and implementors. Users must initiate their own specification processes, because they themselves are the task experts, and moreover are often only loosely organized, without extensive organizational support for taking care of system development.

M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 416–430, 1998. © Springer-Verlag Berlin Heidelberg 1998

For example, the publishing of electronic
journals by networks of scholars instead of by commercial publishing houses is rapidly becoming popular. The support of the complex collaborative processes involved, such as the reviewing and editing of electronic publications, must not be treated from a technical perspective alone. Rather, the new information tools must be designed to 'play an effective role within the social infrastructure of scholarship' [2]. The involved scholars themselves are in a good position to define this role, as they best understand the subtleties of their requirements, and can provide volunteer specification labour in these mostly underfunded joint projects. To overcome the specification hurdles, structured methods for user-driven specification are needed. Some approaches already exist, for instance rapid application development, prototyping, and radically tailorable tools [3,4], which involve users more strongly than traditional systems development methods. However, these approaches have drawbacks: they are based on traditional sequential rather than evolutionary development models, focus too much on implementation rather than on conceptual issues, or support single-user instead of group specification processes.

1.1 User-Driven Specification
True user-driven systems development means that each user can initiate and co-direct the specification process, based on concrete functionality problems he experiences when using the information system for his own purposes. Rather than doing a 'summative evaluation' of the information system in progress, in which users only approve of the overall specification process results, a user should be able to do a 'formative evaluation'. This entails that the users, rather than the developers, propose and decide upon specification suggestions, which developers only help translate into actual modifications of the design of the system [5]. One approach that could potentially deal with these issues is process composition [6]. Its essence is that users of a system start with a rough definition of their work processes which are completely supported by the set of available tools. Over time, these specifications are gradually refined, always making sure that all processes are covered by available tool-enabled functionality. Such an approach takes into account the empirical findings that in general users initially only need an essential understanding of their business processes and tools to be able to initiate work [7], and that new technologies must be introduced gradually to prevent disruption of current work practices [8]. One implementation in progress of (group) process composition is the RENISYS specification method for research network information systems [1]. This method is discussed later on in this paper.

1.2 Finding a Focus
A major problem with process composition is that it is very difficult to determine the exact scope of a specification process aimed at resolving a functionality problem. Finding the proper scope is important in order to arrive at legitimate specifications, which are not only meaningful, but also acceptable to the professional community as a whole [1]. However, this acceptability does not mean that all users should be consulted about every change all the time. On the one hand, all users who have an interest in the system component to be changed need to be involved. On the other hand, as few users as possible should participate in the resolution of a particular specification problem, in order to prevent 'specification overload', as well as to ensure the assignment of clear specification responsibilities. Most current specification approaches intending to foster user participation do not systematically analyze how to achieve adequate user involvement in specification processes. For user participation in specification discourse (defined as rational discussion among users aimed at reaching agreement on how to make the specifications of their network information system more satisfactory), it is at least necessary to know precisely:

1. When to consult users?
2. Which users to consult?
3. What to consult them about?
4. How to consult them?
Question 1 has to do with how to recognize breakdowns, which are disruptions in work processes experienced by participants while using the information system. A breakdown should trigger specification discourse resulting in newly defined functionality that better matches the real information needs of the user community. Question 4 focuses on how such specification discourse is to be systematically supported. Users could be provided with semi-structured linguistic options (representing for instance requests, assertions, promises), which are tailored to the particular specification problem at hand. Answering these two questions does not fall within the scope of the current paper. Ideas being worked out in the RENISYS project are taken from the language/action perspective [9]. This rather new paradigm for IS specification looks at the actions people carry out while communicating, and how this communication helps them to coordinate their activities. One of the key paradigmatic ideas is that people can make commitments as a result of speech acts. Such commitments in turn can be used to generate agendas of tasks to be carried out and evaluated by the various participating users. An agenda for a particular user thus consists of all things a user has to do, normally concerning the conduct and coordination of goal-oriented activities. In our case, however, agenda items refer to the specifications to be made or agreed upon of the network information system that supports the group work. In this article, we will concentrate on questions 2 and 3. The main issues we will address are: (1) selecting the relevant users to participate in system specification discourse, and (2) determining the possibly different agendas for a particular specification discourse for the various selected users. We will do this by developing a mechanism to efficiently handle user-driven specification knowledge evolution using context lattices. 
These were first presented in [10], and will be briefly reintroduced in Sect. 3.3. The context lattices are used to (1) organize specification knowledge, (2) check whether knowledge definitions are legitimate (i.e. both meaningful and acceptable), and (3) determine which participants should be involved with what privileges in specification discourse to resolve illegitimate knowledge definitions. In Sect. 2, the approach to knowledge handling in the user-driven specification method RENISYS is described. Sect. 3 introduces conceptual graph-based contexts and context lattices. In Sect. 4, context lattices are applied
to structure what is called composition norm management, and in this way support the specification process.
2 Specification Knowledge Handling
First, the different categories of specification knowledge distinguished in the RENISYS method are presented. Then, the problem of how to ensure that specification changes are covered by what is called the composition norm closure is discussed.

2.1 Knowledge Categories
RENISYS distinguishes three types of specification knowledge: ontological (type) definitions, state definitions, and norm definitions. The ontologies contain functionality specifications (what are the entities, attributes, and relationships to be represented and supported by the IS). States define which entities are or should be actually present. Norms determine (1) who can use the system (determined by action norms) and (2) who should be involved in its specification (determined by composition norms). Conceptual graphs are used as the underlying knowledge representation formalism, because a formalism is needed that is sufficiently close to natural language to efficiently express complex specifications understandable to users, yet that is formal and constrained enough to allow for automated coordination of the specification process. CG theory is very well suited to this task, as argued in [1].

Type Definitions. In RENISYS, the type definitions are organized into an ontological framework consisting of three kinds of ontologies. The heart of this framework is the core process ontology, consisting of elementary network process concepts derived from workflow modelling theory. Built on top of these generic concepts, three domain ontologies are defined. A domain is a system of network entities that can be observed by analyzing the universe of discourse (UoD) from a particular perspective. The problem domain is the UoD seen from the task perspective, the human network is the UoD observed from the organizational perspective, and the information system is the same seen from the functionality perspective. The domain ontologies can be customized by the user to express concepts specific to his situation, thus allowing for conceptual evolution. Finally, the framework ontology describes a set of mapping constructs that link entities from the various domains.
Type definitions represent functionality specifications, such as the structure of documents, or the inputs and outputs of workflows. For example, a simplified definition of type MAILING_LIST could be:

[TYPE: [MAILING_LIST:*x] -> (def) -> [INFORMATION_TOOL:?x]
                            (matr) -> [RECEIVED_MAIL]
                            (rslt) -> [RESENT_MAIL]
                            (poss) <- [LIST_OWNER]].
Note that we do not use the standard type definition format introduced by Sowa, as we want a uniform representation format that can be used for all three categories of
knowledge (i.e. types, norms, and states). Furthermore, we want to be able to represent and infer from qualified type definitions, such as partial, proposed, and invalid type definitions. For instance, partial type definitions must be identified and represented as such. They are incomplete definitions of the necessary properties that a concept type should have. They are very important in guiding specification discourse, as a group of users will often initially agree that a concept necessarily has at least a certain set of properties, while also agreeing that the definition is not yet complete. A partial type definition is thus open to further debate.

State Definitions. State definitions represent states-of-affairs, which are first of all needed to determine which entities the information system implementation must support. For example, the following state definition indicates that John Doe is the list owner of the cg-list mailing list. We thus know that all mailing list owner functions must be installed for at least this network participant.

[STATE: [MAILING_LIST: cg-list] <- (poss) <- [LIST_OWNER: John Doe]].
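Such state definitions can be held in a simple store and queried, for instance to find the owners of a given list, or to detect lists for which no owner has been defined. An illustrative Python sketch, not the RENISYS implementation (the triple layout and the second list "wf-list" are assumptions for illustration):

```python
# (relation, subject, object) triples standing in for state graphs;
# "wf-list" is a hypothetical second list with no owner state
states = [("poss", "LIST_OWNER:John Doe", "MAILING_LIST:cg-list")]

def owners_of(list_name):
    # Collect the referents of LIST_OWNER concepts possessing the given list
    return [subj.split(":", 1)[1] for (rel, subj, obj) in states
            if rel == "poss" and obj == "MAILING_LIST:" + list_name]

print(owners_of("cg-list"))  # → ['John Doe']
print(owners_of("wf-list"))  # → [] (no owner: a specification process is needed)
```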
Also, state knowledge plays two crucial roles in the specification process of the network information system. First, it can be used to detect incomplete or inconsistent functionality specifications. For instance, if the type definition of a mailing list says that there should be at least one list owner, but (unlike in the above state definition) no such list owner has been defined, then a specification process can be started to specify who currently plays this role. Alternatively, if no such person can be defined, it may be that the type definition of mailing list must be revised so that this (currently mandatory) relation can be removed. Second, state definitions can be used as input objects into specification processes, for instance by allowing for the identification of subjects who can create new knowledge definitions. Such concrete assignments of specification responsibilities are essential for network information system development to be successful.

Norm Definitions. Norm definitions represent deontic knowledge, which includes such concepts as responsibilities, permissions and prohibitions. This knowledge can, among other things, help to define and manage workflow and specification commitments. Formal models for such commitment management in a language/action context are dealt with in speech-act based deontic logic [11]. A key concept is that of actor, which is an interpreting entity capable of playing process controlling roles. Actor concepts themselves are ultimately instantiated by subjects, who are the people using and developing the network information system. The basic pattern of a norm definition is an actor to which the norm applies, in combination with a control process (initiation, execution, or evaluation) and a transformation (a process in which a set of input objects is transformed into an output object) being controlled. Norm definitions can be subdivided into action norms and composition norms.
An action is a control process plus the controlled (operational-level) workflow; a composition is defined as a control process plus a (meta-level) specification process. Action norms regulate behaviour at the operational level, in which case the transformations are called workflow processes. An example of an action norm is the following permitted action, which says that a list owner is permitted to add a list member:

[PERM_ACTION: [LIST_OWNER] <- (agnt) <- [EXEC] -> (obj) -> [ADD_LIST_MEMBER]].

Composition norms, on the other hand, define desired behaviour at the specification level: they allow users who are, through actor roles, involved in workflows to be identified as simultaneously having legitimate roles in the specification process. Three kinds of specification processes are distinguished: creation, modification, and termination. An example of a composition norm could be this mandatory composition:

[MAND_COMP: [LIST_OWNER] <- (agnt) <- [EVAL] -> (obj) -> [TERMINATE] -> (rslt) -> [TYPE: [LIST_MEMBER]]].
The termination of a type means that a legitimate type is removed from the type hierarchy together with all its definitions, which may be required if a concept is no longer useful. This particular norm means that a list owner is required to evaluate (i.e. approve or reject) any list member type termination that may have been proposed by another actor. Having a well-supported approach for dealing with composition norms is crucial for managing the change process of network information systems. These norms help to identify which actors are to be involved in a particular specification process. Furthermore, they can be used to set the agenda for specification discourse, since they indicate what knowledge definitions an actor can legitimately handle and in what way. Thus, composition norms provide the key to answering the two questions we posed in Sect. 1.2.

2.2 Composition Norm Closure
Traditional information systems analysis can be characterized as taking a snapshot of "the" sum of information requirements of an organization by a monolithic external group of analysts. In network information systems development, however, many users are only temporarily involved in specification processes, and this only with a very limited perspective and mandate: trying to resolve their own particular problem or that of others with whom they closely collaborate. However, if every specification is linked to others and every specification must be covered by the appropriate composition norms, a major problem arises in case of (partially) changing needs: how to guarantee that proposed specification changes remain part of the composition norm closure (defined as the sum of the explicitly asserted plus all derivable composition norms)? That is, how to make sure that a proposed specification is legitimate and also does not leave any other specification uncovered? To deal with this problem, it is often not enough to find just one applicable norm. Completeness is very important. For instance, if one wants to know whether the current user, who plays a number of actor roles, is allowed to change the definition of a particular type, all composition norms applicable to this definition need to be identified. However, as the knowledge base of graphs grows large, checking every unorganized composition norm by standard projections can get very cumbersome. This is especially true when recursive operations on embedded parts must be carried out. Furthermore, such a straightforward approach does not easily generate related contextual information, such as the other definitions that the actor specifying the current definition is allowed to make.
Therefore, a more sophisticated norm querying and updating mechanism is needed. Such a query mechanism, which is optimized to handle particular contexts and the relations between different worlds of assertions, is formed by context lattices. Two of the major advantages of context lattices are that they (1) allow queries to be simplified, as embedded queries can be subdivided into their constituting parts and (2) the structure of the knowledge base can be queried, allowing for interesting relations to be easily discovered [10].
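For contrast, the naive approach argued against above — scanning a flat, unorganized store of norms for each query — can be sketched as follows. This is an illustrative Python sketch under simplifying assumptions (norms reduced to exact-match tuples, no generalization matching; the record names echo the examples of Sect. 2.1):

```python
# (modality, actor, control, specification process, target definition)
norms = [
    ("PERM_COMP", "LIST_OWNER", "EXEC", "MODIFY", "TYPE PRIVATE_MAILING_LIST"),
    ("PERM_COMP", "LIST_OWNER", "INIT", "TERMINATE", "PERM_ACTION REG_LIST_MEMBER"),
]

def applicable(actor_roles, spec_process, target):
    # Scan every record: O(n) per query, no generalization matching, and
    # no contextual information about related definitions comes for free
    return [n for n in norms
            if n[1] in actor_roles and n[3] == spec_process and n[4] == target]

hits = applicable({"LIST_OWNER"}, "MODIFY", "TYPE PRIVATE_MAILING_LIST")
print(len(hits))  # → 1
```

Context lattices organize the same knowledge so that both the applicable norms and their structural relations can be retrieved efficiently.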
3 Contexts
Composition norms play a crucial role in the coordination of the user-driven specification process, as they put constraints on who is authorized to (re)define which particular knowledge definitions. Thus, the knowledge definitions are only true if the specification process conditions under which they are asserted are true as well. Such conditional sheets of assertion can be naturally represented as conceptual graph contexts [10]. Contexts are an essential building block of conceptual graph theory [12]. Building on these notions, Mineau and Gerbé [1997] presented a formal theory of context lattices, which is briefly summarized here. A context is a conceptual device that can be used for organizing information that originates from multiple worlds of assertion. It consists of an extension and an intention. In a context, the truth of a set of assertions (the extension) depends on a specific set of conditions (the intention). Thus, the intention is formed by those graphs which, if conjunctively satisfied, make the extension true: only if the intention graphs can all be made true do the extension graphs exist. A context Ci is defined as a tuple of two sets of conceptual graphs:

    Ci = <T, G>    (1)

where T is the intention, and G is the extension of Ci. Two functions I and E were defined so that for a context Ci, its intention T equals I(Ci) and its extension G equals E(Ci). Contexts can directly be used to represent norms. The intention of a (composition) norm defines that some actor is capable of controlling a specification process of some kind of knowledge definition. The graph representation of this most generic composition norm intention is:

[ACTOR] <- (agnt) <- [CONTROL] -> (obj) -> [SPECIFY]
                                  (rslt) -> [DEFINITION: #]
It will be used as the intention of some context, while the referent of the DEFINITION concept, representing the knowledge definition being specified, will be considered as being in the extension of the same context. The format of the extension graph depends on the type of this definition (i.e. TYPE, PERM ACTION, or STATE). Examples of these definitions were given in the previous section.
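Definition (1), together with the accessor functions I and E, translates into a small class in a straightforward way. An illustrative Python sketch, with conceptual graphs stood in for by strings (not a real CG implementation):

```python
class Context:
    """A context Ci = <T, G>: the extension G holds only if intention T holds."""
    def __init__(self, intention, extension):
        self.intention = set(intention)  # T
        self.extension = set(extension)  # G

def I(ctx):
    # I(Ci) = T, the intention of the context
    return ctx.intention

def E(ctx):
    # E(Ci) = G, the extension of the context
    return ctx.extension

# A norm context: any actor controlling a TYPE modification may assert
# the MAILING_LIST definition (cf. C1 in Sect. 3.1)
c1 = Context(
    intention={"[ACTOR]<-(agnt)<-[CONTROL]->(obj)->[MODIFY]->(rslt)->[TYPE:#]"},
    extension={"[MAILING_LIST]"},
)
print(E(c1))  # → {'[MAILING_LIST]'}
```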
3.1 Example: Mailing List
We will illustrate the ideas put forward in this paper with a short example of a specification process typically encountered in a research network. To clarify the ideas introduced in this paper, only three permitted compositions are given. In a realistic case, required and forbidden compositions will also be needed. The example is the following. Many research networks are supported by mailing lists. A mailing list comes installed with a default set of properties. Some public lists allow any member to control the change of all their properties, which is explicitly defined as a composition norm. Often, however, as the networks grow in scope, the mailing list is to play new roles. For example, the purpose for which the mailing list is used could be changed from enabling general information exchange to supporting the preparation of a confidential report. In case of such a private mailing list, the list owner, who is a special type of network actor, can explicitly be allowed to modify the settings of the list parameters. Finally, for any type of mailing list, a list owner can start the cancellation of the action norm which says that a list applicant can register himself as a list member. In this case, the following three (permitted) composition norms apply:

1. In a mailing list, any network participant is permitted to control (initiate, execute, and evaluate) modifications of mailing list properties, for example when the scope of the group needs to be changed.
2. In case of a private mailing list, a list owner is permitted to make modifications of the properties of the mailing list, i.e. he may change the settings about whether the list has open or closed subscription, whether it is moderated or not, etc.
3. A list owner is allowed to initiate the termination of the action (norm) that a list applicant can register himself as a list member.
As contexts Ci , these composition norms could be represented as follows: --------------------------------------------------------------------------| C1: Perm_Comp_1 | | i1: [ACTOR] <- (agnt) <- [CONTROL] -> (obj) -> [MODIFY] | | (rslt) -> [TYPE: #] | --------------------------------------------------------------------------| g1: [MAILING_LIST] | ----------------------------------------------------------------------------------------------------------------------------------------------------| C2: Perm_Comp_2 | | i2: [LIST_OWNER] <- (agnt) <- [EXEC] -> (obj) -> [MODIFY] | | (rslt) -> [TYPE: #] | --------------------------------------------------------------------------| g2: [PRIVATE_MAILING_LIST] | ----------------------------------------------------------------------------------------------------------------------------------------------------| C3: Perm_Comp_3 | | i3: [LIST_OWNER] <- (agnt) <- [INIT] -> (obj) -> [TERMINATE] | | (rslt) -> [PERM_ACTION: #] | --------------------------------------------------------------------------| g3: [LIST_APPLICANT] <- (agnt) <- [EXEC] -> (obj) -> [REG_LIST_MEMBER] | ---------------------------------------------------------------------------
A. de Moor and G. Mineau
Hereby we assume that this (partial) type hierarchy has been defined:

[T] > [CONTROL] > [EXEC], [INIT]
      [DEFINITION] > [PERM_ACTION], [TYPE]
      [INFO_TOOL] > [MAILING_LIST] > [PRIVATE_MAILING_LIST]
      [ACTOR] > [LIST_APPLICANT], [LIST_OWNER]
      [REG_LIST_MEMBER]
      [SPECIFY] > [MODIFY], [TERMINATE]
The contexts thus allow for a clear separation between the knowledge being specified (gi), and the modality of the actual specification process (ii). Note that contexts (instead of non-nested CGs) are used not only in C3, but in C1 and C2 as well, because the intention (actor permitted to control specification process) represents the conditions under which the extension (knowledge definition, e.g. mailing list) may be specified.
3.2 Basic Context Inferences
Contexts have some interesting properties that can be inferred from the previous definitions [10]. First, it is important to realize that a graph g can be in E(Ci) either because it has been explicitly asserted in that context, or because it is part of the transitive closure of the asserted graphs of that context. Thus, if gm is a generalization of some gn in E(Ci), then gm is also considered to be in E(Ci). In the example, E(C2) therefore contains both g2 and g1, as a MAILING_LIST is a generalization of a PRIVATE_MAILING_LIST. Second, if I(Cj) < I(Ci) then E(Cj) ⊆ E(Ci). This means that if the intention of Cj is a specialization of the intention of Ci, then (at least) all the extension graphs of Cj are in the extension of Ci. Thus, as i2 < i1, we can derive that E(C1) = {g1, g2}. Note that the contexts described in Sect. 3.1 only contain the extension graphs that have been explicitly asserted. Later on in this paper we will also include the derived extension graphs. Now that we have made some basic context inferences, we will look at how they can be used in the construction of context lattices.
3.3 Context Lattices
A context lattice is a structure that can be used to organize a set of contexts, allowing associations between these contexts to be made. A context lattice L consists of a set of formal contexts Ci∗ , which are structured in a partial order ≤. A formal context Ci∗ is represented as:
Ci* = < I*(Gi), E*(Ti) >   (2)
where I*(Gi) = ∪j I(Cj) | Gi ⊆ E(Cj), and E*(Ti) = ∩j E(Cj) | I(Cj) ⊆ Ti.

Context lattices can be used to optimize query mechanisms concerning (1) particular contexts and (2) the relations between sets of intentions and extensions. In other words, they can be used to make explicit the relations between different worlds of assertions [10]. To create the context lattice for our example, we need to take the following steps:
1. Recalculate the contexts. As noted earlier, contexts can have explicitly asserted as well as derived extension graphs. The representations of C1, C2 and C3 only showed the explicitly asserted extension graphs:

C1 = < {i1}, {g1} >
C2 = < {i2}, {g2} >
C3 = < {i3}, {g3} >
With completely calculated extensions (using the inferences of Sect. 3.2 to recalculate E(Ci)), the contexts can be represented as:

C1 = < {i1}, {g1, g2} >
C2 = < {i2}, {g1, g2} >
C3 = < {i3}, {g3} >
Note that C1 and C2 now have the same set of extension graphs.
2. Calculate the formal contexts. For each context, we now calculate the formal context (using Ci* = < I*(Gi), E*(Ti) >):

C1* = < {i1, i2}, {g1, g2} >
C2* = < {i1, i2}, {g1, g2} >
C3* = < {i3}, {g3} >
For C1 and C2 the same formal context is produced.
3. Calculate the context lattice. Each formal context should occur only once, so redundant formal contexts (i.e. the above C2*) must be removed. Furthermore, in order to create a lattice, we must also add a formal context including all extension graphs, as well as a context including all intention graphs. After renumbering the formal contexts, the resulting context lattice is as represented in Fig. 1:

C1* = < {}, {g1, g2, g3} >
C2* = < {i1, i2}, {g1, g2} >
C3* = < {i3}, {g3} >
C4* = < {i1, i2, i3}, {} >

Fig. 1. The Context Lattice for the Example
4 Structuring Composition Norm Management
On top of the context lattice structure, a mechanism that allows for its efficient querying has been defined [10]. One of its main advantages is that complex queries consisting of sequences of steps and involving both intentions and extensions can be formulated. This mechanism can be used for a structured yet flexible approach to norm management in specification discourse, by automatically determining which users to involve in discussions about specifications and changes, and what they are to discuss (their agendas). There are two main ways in which a context lattice can be used in a user-driven specification process. First, it can be used to assess whether a new knowledge definition is legitimate by checking if a particular specification process is covered by some composition norm. As this is a relatively simple task of projecting the specification process on the composition norm base, we do not work out this application here. The second application of context lattices in the specification process is applying (new) specification constraints (constraints on the relations that hold between different knowledge definitions) to the existing (type, norm, and state) knowledge bases. This differs from the first application in that, after a constraint has been applied, originally legitimate knowledge base definitions may become illegitimate, and would then need to be respecified. In this section we will first give a brief summary of how context lattices can be queried. Then, it will be illustrated how this query mechanism can play a role in composition norm management, by applying it to our example in the resolution of one realistic specification constraint.
4.1 Querying Context Lattices
In order to make a series of consecutive queries where the result of one query is the input for the embedding query, which is needed for navigating a context lattice, we need two more constructs. First, we need to be able to query a particular context extension or intention. Second, we must be able to identify the context which matches the result of an extension- or intention-directed query. For the first purpose, two query functions have been defined that allow respectively extension or intention graphs to be retrieved from a specified formal context [10]:

δE*(C*, q) = {g ∈ E(C*) | g ≤ q}   (3)
δI*(C*, q) = {g ∈ I(C*) | g ≤ q}   (4)

Furthermore, Mineau and Gerbé have constructed two context-identifying functions:

CE(G) = < I*(G), E*(I*(G)) >   (5)
CI(T) = < I*(E*(T)), E*(T) >   (6)
Space does not permit describing the inner workings of these functions in detail (see [10] for further explanation). For now, it suffices to understand that these functions allow the most specific context related to, respectively, a set of extension graphs G or a set of intention graphs T to be found. Together, these functions can be used to produce embedded queries by alternately querying and identifying contexts, thus enabling navigation through the context lattice.
4.2 Supporting the Specification Process
In the applications of context lattices discussed in the previous section, the following general steps apply:
1) Check either the specification of a new knowledge definition against the composition norm base, or the specifications of an existing knowledge base against a specification constraint.
2) Identify the resulting illegitimate knowledge definition(s).
3) Identify appropriate ‘remedial composition norms’ (i.e. composition norms in which the illegitimate knowledge definition is in the extension).
4) Build discourse agendas (overviews of the specifications to discuss) for the users identified by those remedial composition norms, so that they can start resolving the illegitimate definitions.
These processes consist of sequences of queries that switch their focus between what is being defined and who is defining it. For this purpose, the functions provided by context lattice theory are concise and powerful, at least from a conceptual point of view. One way in which we can apply context lattices is by formulating specification constraints, which constrain possible specifications and can be expressed as (sequences of) composition norm queries. Note that the example of the resolution of a specification constraint presented next is simple, and the translation into context lattice queries is not yet very elegant. However, what we try to present here is the general idea that formulating queries over context lattices is a powerful tool for simplifying and helping to understand queries with respect to the contexts where they apply. In future work, we aim to develop a more standardized approach that can apply to different situations.

4.3 An Example

One specification constraint could be: “Only actors involved in the definition of permitted actions are to be involved in the definition of (the functionality of) information tools".
The constraint guarantees that enabling technical functionality is defined only by those who are also involved in defining the use that is being made of at least some of these tools. This specification constraint, and much more complex ones, can help to realize more user-driven specification, tailored to the unique characteristics of a professional community. The power of the approach developed in this paper is that it allows such constraints to be easily checked against any existing norm base, identifying (now) illegitimate knowledge definitions, and providing the contextual information necessary for their resolution. We will illustrate these rather abstract notions by translating the above-mentioned informal specification constraint into a concrete sequence of composition norm queries. Decomposing the specification constraint, we must answer the following questions:
1. Which actors control the specification of which information tools?
2. Are there illegitimate composition norms (because some of these norm actors are not also involved in the specification of any permitted actions)?
3. Which actors are to respecify these illegitimate norms, on the basis of what agendas?

Questions 1-3 can be decomposed into the following steps (this decomposition is not trivial; in future research we aim at providing guidelines to achieve it):

1a. Determine which specializations gj of information tools have been defined. The query s1 should start at the top of the context lattice, as this context contains all extension graphs:

s1 = δE*(C1*, q1) = {g1, g2} = {[MAILING_LIST], [PRIVATE_MAILING_LIST]}
where q1 = [INFO_TOOL]
1b. For each of these information tools gj, determine which actors ai control its specification:

s2 = δI*(CE(g1), q2) = δI*(C2*, q2) = {i1, i2}
s3 = δI*(CE(g2), q2) = δI*(C2*, q2) = {i1, i2}
where q2 = [ACTOR:?] <- (agnt) <- [CONTROL] -> (obj) -> [SPECIFY] -> (rslt) -> [TYPE]
a2 = [ACTOR:?] = {[ACTOR], [LIST_OWNER]} and a3 = [ACTOR:?] = {[ACTOR], [LIST_OWNER]}
2a. Determine which actors ai are involved in the specification of permitted actions. This query should be directed toward the bottom of the context lattice, as this context contains all intention graphs (which in turn include the desired actor concepts): s4 = δI ∗ (C4∗ , q4 ) = {i3 } where q4 = [ACTOR:?] <- (agnt) <- [CONTROL] -> (obj) -> [SPECIFY] (rslt) -> [PERM_ACTION] a4 = [ACTOR:?] = {[LIST_OWNER]}
2b. Using a4, determine for each type of information tool gj (see 1a), with its corresponding si and actors ai (see 1b), which actors ai′ currently illegitimately control its specification process.

g1: [MAILING_LIST]
a2′ = a2 − (a2 ∩ a4) = {[ACTOR]}

g2: [PRIVATE_MAILING_LIST]
a3′ = a3 − (a3 ∩ a4) = {[ACTOR]}

2c. For each tool identified by the gj having illegitimate controlling actors ai′, define the illegitimate composition norms ck′ = < il, gj > by selecting from the si from 1b those il which contain ai′:

c1′ = < i1, g1 >
c2′ = < i1, g2 >
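Steps 2b and 2c reduce to plain set arithmetic once the actor sets have been retrieved. A small Python sketch (the `mentions` table is a hypothetical stand-in for checking which actor concept an intention graph contains):

```python
# Sketch of steps 2b and 2c. Actors and intentions are strings; the actor
# sets come from steps 1b and 2a of the example.

a2 = {"ACTOR", "LIST_OWNER"}      # controllers of MAILING_LIST (step 1b)
a3 = {"ACTOR", "LIST_OWNER"}      # controllers of PRIVATE_MAILING_LIST
a4 = {"LIST_OWNER"}               # actors defining permitted actions (step 2a)

# Step 2b: remove the legitimately involved actors.
a2p = a2 - (a2 & a4)
a3p = a3 - (a3 & a4)
assert a2p == {"ACTOR"} and a3p == {"ACTOR"}

# Step 2c: an intention il belongs to an illegitimate norm <il, gj> iff it
# mentions one of the remaining actors.
mentions = {"i1": {"ACTOR"}, "i2": {"LIST_OWNER"}}

def illegitimate_norms(bad_actors, intentions, g):
    return [(i, g) for i in intentions if mentions[i] & bad_actors]

assert illegitimate_norms(a2p, ["i1", "i2"], "g1") == [("i1", "g1")]
assert illegitimate_norms(a3p, ["i1", "i2"], "g2") == [("i1", "g2")]
```

The two final assertions reproduce c1′ = < i1, g1 > and c2′ = < i1, g2 > above.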
3. In the previous two steps we identified the illegitimate norms. Now we prepare the stage for the specification discourse in which these norms are to be corrected. A composition norm does not just need to be seen as a context: it is itself a knowledge definition which needs to be covered by the extension graph of at least one other composition norm, which in that case acts as a meta-norm. In order to correct the illegitimate norms, we need to identify (a) which actors are permitted to do this and (b) what items should be on their specification agenda. This step falls outside the scope of this paper but is presented here to provide the reader with the whole picture. A forthcoming paper will elaborate on meta-norms and contexts of contexts.
3a. For each illegitimate composition norm ck′, select the actors ai from the permitted (meta) composition norms cm which allow that ck′ to be modified:¹

cm = < im, gm >
where im = [ACTOR:?] <- (agnt) <- [EXEC] -> (obj) -> [MODIFY] -> (rslt) -> [PERM_COMP: #]
and gm = ck′
3b. For each of these actors ai, build an agenda Ai. Such an agenda could consist of (1) all illegitimate norms ck′ that each actor is permitted to respecify and (2) contextual information from the most specific context in which these norms are represented, or other contexts which are related to this context in some significant way. The exact contextual graphs to be included in these agendas are determined by the way in which the specification discourse is being supported, which is not covered in this paper and needs considerable future research. However, we would like to give some idea of the direction we are investigating. In our example, we identified the illegitimate (derived) composition norm ‘any actor is permitted to control (i.e. initiate, execute, and evaluate) the specification of a private mailing list’ (< i1, g2 >). From its formal context C2* it also appears that a list owner, on the other hand, is permitted to at least execute the modification of this type (< i2, g2 >). If another specification constraint said that one permitted composition for each control process category per knowledge definition suffices, then only the initiation and evaluation of the modification would remain to be defined (as the execution of the modification of the private mailing list type is already covered by the norm referring to the list owner). Thus, the specification agendas Ai for the actors ai identified in 3a could include: ‘you can be involved in the respecification of the initiation and the evaluation of the modification of the type private mailing list’, as well as ‘there is also actor-such-and-such (e.g. the list owner) who has the same (or more general/specific) specification rights, with whom you can negotiate or whom you can ask for advice’.
Of course, in a well-supported discourse these kinds of agendas would be translated into statements and queries much more readable to their human interpreters, but such issues are of a linguistic nature and are not dealt with here.
5 Conclusions
Rapid change in work practices and supporting information technology is becoming an ever more important aspect of life in many distributed professional communities. One of their critical success factors therefore is the continuous involvement of users in the (re)specification of their network information system. In this paper, the conceptual graph-based approach for the navigation of context lattices developed by Mineau and Gerbé [10] was used to structure the handling of user-driven specification knowledge evolution. In virtual professional communities, the various kinds of norms and the knowledge definitions to which they apply, as well as the specification constraints that apply to these norms, are prone to change. The formal context lattice approach can be used to guarantee that specification processes result in legitimate knowledge definitions, which are both meaningful and acceptable to the user community. Extracting the context to which a query is applied provides simpler graphs that can more easily be understood by the user when he interacts with the CG base. It also provides a hierarchical path that guides the matching process between CGs, which would otherwise not be there to guide the search. Even though the computational cost of matching graphs would be the same, overall performance would be improved by these guidelines as the search is more constrained. But the most interesting part about using a context lattice is that it provides a structuring of different contexts that helps conceptualize (and possibly visualize) how different contexts (‘micro-worlds’) relate to one another, adding to the conceptualization power of conceptual graphs. In future research, we plan to further formalize and standardize the still quite conceptual approach presented here, and also look into issues regarding its implementation.

¹ For lack of space, we have not included such composition norms in our example, but since they are also represented in a context lattice, the same mechanisms apply. The only difference is that the extension graphs are themselves contexts (as defined in Sect. 3).
References

1. A. De Moor. Applying conceptual graph theory to the user-driven specification of network information systems. In Proceedings of the Fifth International Conference on Conceptual Structures, University of Washington, Seattle, August 3–8, 1997, pages 536–550. Springer-Verlag, 1997. Lecture Notes in Artificial Intelligence No. 1257.
2. B.R. Gaines. Dimensions of electronic journals. In T.M. Harrison and T. Stephen, editors, Computer Networking and Scholarly Communication in the Twenty-First Century, pages 315–339. State University of New York Press, 1996.
3. L.J. Arthur. Rapid Evolutionary Development - Requirements, Prototyping & Software Creation. John Wiley & Sons, 1992.
4. T.W. Malone, K.-Y. Lai, and C. Fry. Experiments with Oval: A radically tailorable tool for cooperative work. ACM Transactions on Information Systems, 13(2):177–205, 1995.
5. P. Holt. User-centred design and writing tools: Designing with writers, not for writers. Intelligent Tutoring Media, 3(2/3):53–63, 1992.
6. G. Fitzpatrick and J. Welsh. Process support: Inflexible imposition or chaotic composition? Interacting with Computers, 7(2):167–180, 1995.
7. L.J. Arthur. Quantum improvements in software system quality. Communications of the ACM, 40(6):46–52, 1997.
8. I. Hawryszkiewycz. A framework for strategic planning for communications support. In Proceedings of the Inaugural Conference of Informatics in Multinational Enterprises, Washington, October 1997, 1997.
9. F. Dignum, J. Dietz, E. Verharen, and H. Weigand, editors. Proceedings of the First International Workshop on Communication Modeling 'Communication Modeling - The Language/Action Perspective', Tilburg, The Netherlands, July 1-2, 1996. Springer eWiC series, 1996. http://www.springer.co.uk/eWiC/Workshops/CM96.html.
10. G. Mineau and O. Gerbé. Contexts: A formal definition of worlds of assertions. In Proceedings of the Fifth International Conference on Conceptual Structures, University of Washington, Seattle, August 3–8, 1997, pages 80–94. Springer-Verlag, 1997. Lecture Notes in Artificial Intelligence No. 1257.
11. F. Dignum and H. Weigand. Communication and deontic logic. In R. Wieringa and R. Feenstra, editors, Working Papers of the IS-CORE Workshop on Information Systems - Correctness and Reusability, Amsterdam, 26-30 September, 1994, pages 401–415, September 1994.
12. J.F. Sowa. Conceptual Structures: Information Processing in Mind and Machine. Addison-Wesley, 1984.
Using CG Formal Contexts to Support Business System Interoperation

Hung Wing¹, Robert M. Colomb¹, and Guy Mineau²

¹ CRC for Distributed Systems Technology, Department of Computer Science, The University of Queensland, Brisbane, Qld 4072, Australia
² Dept. of Computer Science, Université Laval, Canada
Abstract. This paper describes a standard interoperability model based on a knowledge representation language such as Conceptual Graphs (CGs). In particular, it describes how an Electronic Data Interchange (EDI) mapping facility can use CG contexts to integrate and compare different trade documents by combining and analysing different concept lattices derived from formal concept analysis theory. In doing this, we hope to provide a formal construct which will support the next generation of EDI trading concerned with corporate information.
1 Introduction
[M.-L. Mugnier and M. Chein (Eds.): ICCS'98, LNAI 1453, pp. 431–438, 1998. © Springer-Verlag Berlin Heidelberg 1998]

There have been several attempts to overcome the semantic heterogeneity existing between two or more business systems. A solution could be a simple paper-based system in which purchase orders generated from a purchasing program are faxed (or communicated via telephone) to a human coordinator, whose job is to extract and transcribe the information from an order into the format required by an order entry program. In general, the coordinator has specific knowledge that is necessary to handle the various inconsistencies and missing information associated with exchanged messages. For example, the coordinator should know what to do when information is provided that was not requested (unused item) or when information that was requested is not provided (null item). The above interoperation technique is considered simple and relatively inexpensive to implement since it does not require the support of EDI software. However, this approach is not flexible enough to really support a complex and dynamic trade environment where time-critical trade transactions (e.g. a foreign exchange deal) may need to be interoperated on-the-fly without prior negotiation (about standardised trading terms), having to rely on the human ability to quickly and correctly transcribe complex trade information. To facilitate system interoperation, other more sophisticated systems include general service discovery tools like the trader service of Open Distributed Processing [5], schema integration tools in multidatabase systems [2], context-based interchange tools of heterogeneous systems [7,1], email message filtering
tools of Computer Systems for Collaborative Work (CSCW) [3], or EDI trade systems [6]. The above systems are similar in the sense that they all rely on commonly shared structures (ontologies) of some kind to compare and identify semantic heterogeneity associated with underlying components. However, what seems lacking in these systems is a formal construct which can be used to specify and compare the different contexts associated with trade messages. Detailed descriptions of these systems and their pros and cons can be found in [9]. In this paper, we describe an enhanced approach which supports business system interoperation by using CG Formal Contexts [4] derived from Formal Concept Analysis theory [8]. The paper is organised as follows: Section 2 overviews some of the relevant formal methods. Section 3 describes how we can overcome the so-called 1st trade problem (which refers to the initial high cost of establishing a collection of commonly agreed trading terms).
[Figure: the EDI Mapping Facility mediates between Customer Application Programs and Supplier Application Programs. Formal specs and data pass from the customer side through a Purchase Order Handler to the EMF Server, and from there through an Order Entry Handler to the supplier side; Specification Analysis Tools and an EMF Human Coordinator produce revised specs.]

Fig. 1. EDI Mapping Facility (EMF)
2 Relevant Formal Methods
In designing the EDI Mapping Facility (EMF) (shown in Figure 1), we aim to facilitate the following: 1) systematic interoperation: allow business systems to dynamically and systematically collaborate with each other with minimal human intervention; 2) unilateral changes: allow business systems to change and extend trade messages with minimal consensus from other business systems; and 3) minimising up-front coordination: eliminate the so-called one-to-one bilateral trade agreements imposed by traditional EDI systems. To support the above aims, we need to be able to express the various message concepts and the relationships among these concepts. In doing so we need a logical notation of some kind. In general, a formal notation such as first order logic, Object Z, or CGs is considered useful due to the following: 1) it is an unambiguous logical notation, 2) it is an expressive specification language, and 3) the specification aspects can be demonstrated by using mathematical proof techniques.
However, we choose CGs to specify EDI messages due to the following added benefits. First, the graphical notation of CGs is designed for human readability. Second, the canonical formation rules of CGs allow a collection of conceptual graph expressions to be composed (by using join, copy) and decomposed (by using restrict, simplify) to form new conceptual graph expressions. In this sense, the formation rules are a kind of graph grammar which can be used to specify EDI messages. In addition, they can also be used to enforce certain semantic constraints. Here, the canonical formation rules define the syntax of the trade expressions, but they do not necessarily guarantee that these expressions are true. To derive correct expressions from other correct expressions we need rules of inference. Third, aiming to support reasoning with graphs, Peirce defined a set of five inference rules (erasure, insertion, iteration, de-iteration, double negation) and an axiom (the empty sheet) based on primitive operations of copying and reasoning about graphs in various contexts. Thus, rules of inference allow a new EDI trade expression to be derived from an existing trade expression, allowing an Internet-based trade model to be reasoned about and analysed. Furthermore, to facilitate systematic interoperation we need to be able to formalise the various trade contexts (assumptions and assertions) associated with EDI messages. According to Mineau and Gerbé, informally: ‘A context is defined in two parts: an intention, a set of conceptual graphs which describe the conditions which make the asserted graphs true, and an extension, composed of all the graphs true under these conditions’ [4]. Formally, a context Ci can be described as a tuple of two sets of CGs, Ti and Gi. Ti defines the conditions under which Ci exists, represented by a single intention graph; Gi is the set of CGs true in that context.
So, for a context Ci, Ci = < Ti, Gi > = < I(Ci), E(Ci) >, where I(Ci), a single CG, is the intention graph of Ci, and E(Ci), the set of graphs conjunctively true in Ci, are the extension graphs. Based on Formal Concept Analysis theory [8], Mineau and Gerbé further define the formal context, named Ci*, as a tuple < Ti, Gi > where Gi = E*(Ti) and Ti = I*(Gi) = I(Ci*). With these definitions, the context lattice, L, can be computed automatically by applying the algorithm given in the formal concept analysis theory described below. This lattice provides an explanation and access structure to the knowledge base, and relates different worlds of assertions to one another. Thus, L is defined as L = < {Ci*}, ≤ >. In the next section, we describe how these formal methods can be applied to solve the so-called 1st trade problem.
3 An Example: Overcoming the 1st Trade Problem
This example is based on the following trade scenario: in a foreign exchange deal, a broker A uses a foreign exchange standard provided by a major New York bank to compose a purchase order. Similarly, a broker B uses another standard provided by a major Tokyo bank to compose an order entry. Note that these two standards are both specialisations of the same EDI standard. The key idea here
is to deploy an approach in which we can specify the different assumptions and assertions relevant to the trade messages, so that these formalised specifications can be used to systematically identify the different mismatches (null, unused, and missing items). As an example, Figure 2 shows how we may use CG contexts to model the various trade assumptions (intents i1, ..., i7) and concept assertions (extents c1, ..., c12).
[Figure: two columns of nested CG contexts. The left column models the purchase-order spec: under CONTEXT: Foreign exchange, intentions (e.g. i1) assert the use of the foreign-exchange standard, with sub-contexts for the assumptions Factor = 1 and Currency = USD, under which concepts such as c1 [P/Order], c2 [Product#], c3 [Quantity], c4 [CostPerUnit], and c5 [DiscountRqst] are asserted. The right column models the order-entry spec analogously: intentions (e.g. i5-i7) assert a foreign-exchange standard with the assumptions Factor = 1000 and Currency = JPY, under which concepts c6-c13 (e.g. [O/Entry], [Part], [CostPerUnit]) are asserted.]

Fig. 2. Sample CG contexts relevant to a purchase order (left) and an order entry (right)
Several steps are involved in the systematic interoperation of EDI messages. The following steps are based on the EMF shown in Figure 1.
• Step 1. Prepare and forward specs: Brokers A and B can interact with the Customer Application Program and Supplier Application Program, respectively, to compose a purchase order and an order entry based on the standardised vocabularies provided by a foreign exchange standard. Figure 2 shows two possible specifications: a purchase order and an order entry. Once the formal specifications have been defined (by using CG formation rules and contexts), they can be forwarded to either a Purchase Order Handler or an Order Entry Handler for processing. Upon receiving an order request, the Purchase Order Handler checks its internal record stored in the Supplier Log to see whether or not this order spec has been processed before. If not, this ‘1st trade’ spec is forwarded to the EMF Server for processing. Otherwise, based on the previously established trade information stored in the Supplier Log, the relevant profile can be retrieved and forwarded with the relevant order data to an appropriate Order Entry Handler for processing. In order to identify the discrepancy between a purchase order and an order entry, the Order Entry Handler needs to forward an order entry spec to an EMF Server for processing.
Using CG Formal Contexts to Support Business System Interoperation
• Step 2. Integrate and compare specs: To compare two specs from different sources effectively, the EMF Server needs to: (1) formalise the specs and organise their formal contexts into two separate type hierarchies known as context lattices (note that an agreement on an initial ontology must exist before two different specs can be compared); (2) navigate and compare the structures of these context lattices to identify and integrate contexts of one source with contexts of the other, thereby forming an integrated lattice; and (3) access and navigate this integrated lattice to identify equivalent and/or conflicting intentions (or assumptions). From the identified and matched intentions, the extents can then be compared in order to identify matched, unused, null and conflicting assertions. The results of these comparison steps can then be used to generate the necessary mapping profiles. In the following, we describe how the above steps can be carried out formally.
Kp (purchase order; left cross-table, objects Gpo = {c1,...,c5}, attributes Mpo = {i1, i2, i3}):

        i1   i2   i3
  c1    .    .    .
  c2    x    .    .
  c3    x    x    .
  c4    x    x    x
  c5    .    .    .

Ko (order entry; right cross-table, objects Goe = {c6,...,c13}, attributes Moe = {i4, i5, i6, i7}):

        i4   i5   i6   i7
  c6    .    .    .    .
  c7    x    .    .    .
  c8    x    x    .    .
  c9    .    .    .    .
  c10   .    .    .    .
  c11   .    .    .    .
  c12   x    .    x    x
  c13   .    .    .    .
Fig. 3. FCA formal contexts representing two different sets of assumptions (about standard, currency and scale factor)
Generating the Context Lattices: Based on FCA theory, the above formal CG contexts can be systematically re-arranged to form the corresponding FCA contexts of the purchase order and the order entry (denoted KP and KO, respectively). These contexts are illustrated as the cross-tables shown in Figure 3. The cross-table on the left depicts the formal context (KP) of the purchase-order spec, representing a query graph, while the cross-table on the right depicts the formal context (KO) of the order-entry spec, representing a type graph. To simplify our example, all asserted conceptual relations shown in Figure 2 have been omitted from the cross-tables. If the application is required to query and compare the asserted relations, they can easily be included in the cross-tables prior to generating the context lattice. Recall from FCA theory that for a given context K we can systematically find its formal concepts (Xi, Bi). By ordering these formal concepts under the sub-/superconcept relation (≤), we can systematically determine the concept lattice B(K) using the join and meet operations of FCA theory. Thus, from our example, the contexts KP and KO shown in Figure 3 can be systematically processed to generate the concept lattices B(KP) and B(KO) shown in Figure 4, respectively. The context KP has five formal concepts {C1, C2, C3, C4, C5}, and the context KO also has five formal concepts {C6, C7, C8, C9, C10}.
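The derivation of formal concepts described above can be sketched in a few lines of Python. The incidence relation below is a hypothetical reading of the purchase-order cross-table, reconstructed from the lattice in Figure 4 (the printed table is only partly legible here), so the result reproduces concepts C1, C2, C3 and C5:

```python
from itertools import combinations

def formal_concepts(objects, attributes, incidence):
    """Enumerate all formal concepts of a context by closing every
    subset of objects; brute force is adequate at this scale.
    Returns (extent, intent) pairs."""
    def intent(xs):   # attributes shared by all objects in xs
        return frozenset(a for a in attributes
                         if all((x, a) in incidence for x in xs))
    def extent(bs):   # objects carrying every attribute in bs
        return frozenset(x for x in objects
                         if all((x, a) in incidence for a in bs))
    concepts = set()
    for r in range(len(objects) + 1):
        for xs in combinations(objects, r):
            b = intent(xs)
            concepts.add((extent(b), b))
    return concepts

# Hypothetical incidence for Kp: object ci carries attribute ij when
# ci was asserted under assumption ij (read off the lattice in Fig. 4).
Kp = {("c2", "i1"),
      ("c3", "i1"), ("c3", "i2"),
      ("c4", "i1"), ("c4", "i2"), ("c4", "i3")}
concepts = formal_concepts(["c1", "c2", "c3", "c4"],
                           ["i1", "i2", "i3"], Kp)
```

Ordering these concepts by extent inclusion then yields the concept lattice B(KP) directly.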
H. Wing, R.M. Colomb, and G. Mineau

P/Order context lattice:
  C1 = <{}, {c1,...,c4}>
  C2 = <{i1}, {c2, c3, c4}>
  C3 = <{i1, i2}, {c3, c4}>
  C4 = <{i1, i3}, {c4}>
  C5 = <{i1, i2, i3}, {c4}>

O/Entry context lattice:
  C6 = <{}, {c5,...,c13}>
  C7 = <{i4}, {c7, c8, c12}>
  C8 = <{i4, i5}, {c8}>
  C9 = <{i4, i6, i7}, {c12}>
  C10 = <{i4, i5, i6, i7}, {}>
Fig. 4. Context lattices generated from cross-tables shown in Figure 3
Integrating the Context Lattices: At this point, the context lattices B(KP) and B(KO) represent the important hierarchical conceptual clustering of the asserted concepts (via the extents) and a representation of all implications between the assumptions (via the intents). With these context lattices we can then proceed to query and compare the asserted concepts in terms of the previously specified assumptions. However, before we can compare individual concepts, we need to combine the context lattices into an integrated context lattice. This ensures that only those concepts based on the same type of assumptions (or intention type) are compared with each other; otherwise the comparison would make no sense.
[Figure 5 is not reproduced here. It shows the integrated context lattice: sub-lattices S1 and S2 are joined at the matching pairs C2 = <{i1}, {c2, c3, c4}> / C7 = <{i4}, {c7, c8, c12}> and C3 = <{i1, i2}, {c3, c4}> / C8 = <{i4, i5}, {c8}>, and at the conflicting pair C5 = <{i1, i2, i3}, {c4}> / C9 = <{i4, i6, i7}, {c12}>, with C5 and C10 = <{i4, i5, i6, i7}, {}> at the bottom.]
Fig. 5. Integrated Context Lattice
Based on the information given by the individual contexts shown in Figure 2, we can derive that i1 is equivalent to i4 (i.e. both contexts C2 and C7 are based on the same foreign-exchange standard). Thus, we can integrate and compare context C2's individual concepts (c2, c3, c4) against context C7's individual concepts (c7, c8, c12). By comparing these individual concepts we find that c2 = c7, c3 = c8, and c4 = c12. Note that this comparison holds only when the above concepts are defined according to the conventions specified by the intents i1 and i4. This integration step is illustrated in the top part of the integrated lattice shown in Figure 5.
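The pairing of individual concepts under equivalent intents, as carried out for C2 and C7 above, can be sketched as follows. The equivalence map is supplied by hand here, whereas in the paper it follows from the shared standard:

```python
def pair_extents(extent_a, extent_b, equiv):
    """Pair members of two extents under an equivalence relation on
    contexts. Leftovers on either side are candidates for the
    'unused' and 'null' entries of a result profile."""
    matched = [(a, equiv[a]) for a in extent_a
               if equiv.get(a) in extent_b]
    hit = {b for _, b in matched}
    unmatched_a = [a for a in extent_a if equiv.get(a) not in extent_b]
    unmatched_b = [b for b in extent_b if b not in hit]
    return matched, unmatched_a, unmatched_b

# Equivalences c2=c7, c3=c8, c4=c12 follow from i1 being equivalent
# to i4 (same foreign-exchange standard).
equiv = {"c2": "c7", "c3": "c8", "c4": "c12"}
matched, only_po, only_oe = pair_extents(
    ["c2", "c3", "c4"], ["c7", "c8", "c12"], equiv)
```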
Similarly, we can integrate and compare contexts C3 and C8. In this case we find that concept c8 = c3 (Quantity), on the assumption that c8 and c3 are based on i4 (foreign-exchange standard) and i5 (Factor = 1). This integration step is illustrated in the left part of the integrated lattice shown in Figure 5. While integrating and comparing C5 and C9, we find discrepancies between the intentions i2 (Factor = 1) and i6 (Factor = 1000), and also between i3 (Currency = USD) and i7 (Currency = JPY). These discrepancies in the intents indicate that c4 and c12 (CostPerUnit) rest on conflicting assumptions. This integration and comparison step is illustrated in the right part of the integrated lattice shown in Figure 5. The results of the integration process can be used to form the result profiles, which identify the null, unused and mismatched items of a purchase order and an order entry. These profiles are then forwarded to the relevant handlers to control the data flow between business systems. In general, a mapping profile can be generated systematically by inference over the integrated context lattice.

• Step 3. Forward relevant data: Upon receiving the mapping profiles from an EMF Server, the Purchase Order Handler and Order Entry Handler store these profiles in the Supplier Log and Customer Log, respectively. Subsequent order requests can then use these profiles to coordinate and transport the purchase-order data to the appropriate order-entry programs without having to integrate and compare the Purchase Order's and Order Entry's specs. It is important to point out that, by navigating the context lattice, analysis software would be able to identify the reasons behind mismatching results. Some mismatches (e.g. unknown concepts, i.e. those which cannot be identified by a particular standard) can be impossible for another system to interpret without the intervention of a human coordinator. However, some other mismatches (e.g. exchanged concepts that were based on a shared ontology but were not provided or asked for) can be systematically appended to the original specs and forwarded back to the Purchase Order Handler or Order Entry Handler for re-processing.

An open research agenda: So far we have described an approach to identifying discrepancies among different CG concepts. It is important to note that discrepancies may come from relations and not just from concepts: they may arise from the way concepts are connected by relations, or from within nested concepts (or nested relations). For example, the message ‘Broker A delivers a quote to Broker B’ may have different interpretations depending on whether A calls (or emails) B to deliver a quote on the spot (which may not be secure), or A requests a quote specialist (via a server) to make the delivery (in which case the quote can be delivered securely using encryption, certification and non-repudiation techniques). If the application does not care how the quote is delivered, as long as B receives it, then it is not necessary to analyse or reason about the relevant nested concepts (or relations). However, if the security associated with the delivery is of concern, we need a way to compare and identify the potential conflicts embedded in nested concepts.
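Returning to Step 3: once a mapping profile is on record, applying it to a later purchase order reduces to a table lookup. The profile structure, field names and conflict handling below are illustrative assumptions (the paper does not fix a profile format); conflicting items such as the CostPerUnit clash are held back for a human coordinator rather than forwarded:

```python
# Sketch of applying a stored mapping profile (hypothetical format):
# each purchase-order context maps to a target order-entry context,
# with a status recorded by the EMF Server's comparison step.

def apply_profile(profile, order_fields):
    """Map purchase-order fields to order-entry fields; route
    conflicting or unknown items to a human coordinator."""
    forwarded, held = {}, {}
    for field, value in order_fields.items():
        entry = profile.get(field)
        if entry is None or entry.get("status") == "conflict":
            held[field] = value        # e.g. the c4/c12 CostPerUnit clash
        else:
            forwarded[entry["target"]] = value
    return forwarded, held

profile = {"c2": {"target": "c7", "status": "match"},
           "c3": {"target": "c8", "status": "match"},
           "c4": {"target": "c12", "status": "conflict"}}
fwd, held = apply_profile(profile, {"c2": "P123", "c3": 40, "c4": 9.95})
```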
Discrepancies associated with relations can be handled by the approach described above. For example, we can substitute relations (instead of concepts) as the extents of the cross-table shown in Figure 3. In doing so, we can generate the necessary context lattices and integrated lattices based on relations rather than concepts, and then compare and identify discrepancies among relations. If we view the different ways in which concepts are connected by relations as a collection of ‘super concepts’, then to identify the discrepancies among these super concepts, a partial common ontology (which describes how concepts may be connected by relations) must be used. The level of matching between the different ontologies will have a direct impact on the comparison heuristic. The problem here is to discover heuristics that guide a search through two lattices in order to integrate them; in doing so, we can find enough similarity to discover dissimilarities.

To conclude, by using formal concept analysis and the conceptual graph formalism we can systematically create context lattices to represent complex message specifications and their assumptions. Message specs can then be effectively navigated and compared, making a formal approach to EDI mapping feasible.