Lecture Notes in Artificial Intelligence Subseries of Lecture Notes in Computer Science Edited by J. G. Carbonell and J. Siekmann
Lecture Notes in Computer Science Edited by G. Goos, J. Hartmanis, and J. van Leeuwen
1831
Berlin Heidelberg New York Barcelona Hong Kong London Milan Paris Singapore Tokyo
David McAllester (Ed.)
Automated Deduction – CADE-17 17th International Conference on Automated Deduction Pittsburgh, PA, USA, June 17-20, 2000 Proceedings
Series Editors
Jaime G. Carbonell, Carnegie Mellon University, Pittsburgh, PA, USA
Jörg Siekmann, University of Saarland, Saarbrücken, Germany

Volume Editor
David McAllester
AT&T Labs Research
180 Park Avenue, Florham Park, NJ 07932-0971, USA
E-mail:
[email protected]
Cataloging-in-Publication Data applied for Die Deutsche Bibliothek - CIP-Einheitsaufnahme Automated deduction : proceedings / CADE-17, 17th International Conference on Automated Deduction, Pittsburgh, PA, USA, June 17 - 20, 2000. David McAllester (ed.). - Berlin ; Heidelberg ; New York ; Barcelona ; Hong Kong ; London ; Milan ; Paris ; Singapore ; Tokyo : Springer, 2000 (Lecture notes in computer science ; Vol. 1831 : Lecture notes in artificial intelligence) ISBN 3-540-67664-3
CR Subject Classification (1998): I.2.3, F.4.1, F.3.1
ISBN 3-540-67664-3 Springer-Verlag Berlin Heidelberg New York This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law. Springer is a company in the BertelsmannSpringer publishing group. c Springer-Verlag Berlin Heidelberg 2000 Printed in Germany Typesetting: Camera-ready by author, data conversion by Christian Grosche, Hamburg Printed on acid-free paper SPIN: 10721959 06/3142 543210
Preface

For the past 25 years the CADE conference has been the major forum for the presentation of new results in automated deduction. This volume contains the papers and system descriptions selected for the 17th International Conference on Automated Deduction, CADE-17, held June 17-20, 2000, at Carnegie Mellon University, Pittsburgh, Pennsylvania (USA). Fifty-three research papers and twenty system descriptions were submitted by researchers from fifteen countries. Each submission was reviewed by at least three reviewers. Twenty-four research papers and fifteen system descriptions were accepted. The accepted papers cover a variety of topics related to theorem proving and its applications, such as proof-carrying code, cryptographic protocol verification, model checking, cooperating decision procedures, program verification, and resolution theorem proving. The program also included three invited lectures: "High-level verification using theorem proving and formalized mathematics" by John Harrison, "Scalable Knowledge Representation and Reasoning Systems" by Henry Kautz, and "Connecting Bits with Floating-Point Numbers: Model Checking and Theorem Proving in Practice" by Carl Seger. Abstracts or full papers of these talks are included in this volume. In addition to the accepted papers, system descriptions, and invited talks, this volume contains one-page summaries of four tutorials and five workshops held in conjunction with CADE-17. The CADE-17 ATP System Competition (CASC-17), held in conjunction with CADE-17, selected a winning system in each of four different automated theorem proving divisions. The competition was organized by Geoff Sutcliffe and Christian Suttner and was overseen by a panel consisting of Claude Kirchner, Don Loveland, and Jeff Pelletier. This was the fifth such competition held in conjunction with CADE. Since the contest was held during the conference, the winners were unknown as of this printing and the results are not described here.
I would like to thank the members of the program committee and all the referees for their care and time in selecting the submitted papers. I would also like to give a special thanks to Bill McCune for setting up and maintaining the web site for the electronic program committee meeting.
April 2000
David McAllester
Conference Organization
Program Chair
David McAllester, AT&T Labs-Research
Conference Chair
Frank Pfenning, Carnegie Mellon University
Workshop Chair
Michael Kohlhase, Universität des Saarlandes
System Competition Chair
Geoff Sutcliffe, James Cook University
Program Committee
Hubert Comon (Cachan)
Paliath Narendran (Albany)
David Dill (Palo Alto)
Robert Nieuwenhuis (Barcelona)
Ulrich Furbach (Koblenz)
Tobias Nipkow (Munich)
Harald Ganzinger (Saarbrücken)
Hans de Nivelle (Saarbrücken)
Didier Galmiche (Nancy)
Larry Paulson (Cambridge)
Mike Gordon (Cambridge)
David Plaisted (Chapel Hill)
Tom Henzinger (Berkeley)
Amir Pnueli (Rehovot)
Deepak Kapur (Albuquerque)
Mark Stickel (Menlo Park)
Ursula Martin (St Andrews)
Moshe Y. Vardi (Houston)
Ken McMillan (Berkeley)
Andrei Voronkov (Manchester)
CADE Trustees
Maria Paola Bonacina (Secretary), University of Iowa
Uli Furbach (President), University of Koblenz
Harald Ganzinger, Max Planck Institute, Saarbrücken
Claude Kirchner, LORIA, INRIA
David McAllester, AT&T Labs-Research
William McCune, Argonne National Laboratory
Neil Murray (Treasurer), SUNY at Albany
Tobias Nipkow, Technische Universität München
Frank Pfenning (Vice President), Carnegie Mellon University
David Plaisted, University of North Carolina, Chapel Hill
John Slaney, Australian National University
Referees: Tamarah Arons, Clark Barrett, Peter Baumgartner, Gertrud Bauer, Adam Cichon, Witold Charatonik, Ingo Dahn, Anatoli Degtyarev, Grit Denker, Dan Dougherty, Alain Finkel, Dana Fisman, Laurent Fribourg, Juergen Giesl, Andy Gordon, Enrico Giunchiglia, Andrew Haas, Ian Horrocks, Joe Hurd, Paul Jackson, Robert B. Jones, Yonit Kesten, Gerwin Klein, Konstantin Korovin, Keiichirou Kusakari, Raya Leviatan, Ian Mackie, Rupak Majumdar, Fred Mang, Claude Marché, R. C. McDowell, D. Méry, Robin Milner, Marius Minea, Michael Norrish, Jens Otten, Nicolas Peltier, Leonor Prensa-Nieto, Alexandre Riazanov, C. Ringeissen, Sitvanit Ruah, Axel Schairer, Renate Schmidt, Philippe Schnoebelen, Elad Shahar, Ofer Shtrichman, G. Sivakumar, Viorica Sofronie-Stokkermans, Dick Stearns, Frieder Stolzenburg, Jürgen Stuber, Yoshihito Toyama, Rakesh Verma, Bob Veroff, David von Oheimb, Uwe Waldmann, Christoph Weidenbach, Markus Wenzel, Benjamin Werner
Previous CADEs CADE-1, Argonne National Laboratory, USA, 1974 CADE-2, Oberwolfach, Germany, 1976 CADE-3, MIT, USA, 1977 CADE-4, University of Texas at Austin, 1979 CADE-5, Les Arcs, France, 1980 (Springer LNCS 87) CADE-6, Courant Institute, New York, 1982 (Springer LNCS 138) CADE-7, Napa, California, 1984 (Springer LNCS 170) CADE-8, University of Oxford, UK, 1986 (Springer LNCS 230) CADE-9, Argonne National Laboratory, USA, 1988 (Springer LNCS 310) CADE-10, Kaiserslautern, Germany, 1990 (Springer LNAI 449) CADE-11, Saratoga Springs, New York, 1992 (Springer LNAI 607) CADE-12, Nancy, France, 1994 (Springer LNAI 814) CADE-13, Rutgers University, USA, 1996 (Springer LNAI 1104) CADE-14, James Cook University, Australia, 1997 (Springer LNAI 1249) CADE-15, Lindau, Germany, July 6-10, 1998. (Springer LNAI 1421) CADE-16, Trento, Italy, July, 1999. (Springer LNAI 1632)
Table of Contents

Invited Talk:
High-Level Verification Using Theorem Proving and Formalized Mathematics (Extended Abstract) . . . . 1
John Harrison

Session 1:
Machine Instruction Syntax and Semantics in Higher Order Logic . . . . 7
Neophytos G. Michael and Andrew W. Appel

Proof Generation in the Touchstone Theorem Prover . . . . 25
George C. Necula and Peter Lee

Wellfounded Schematic Definitions . . . . 45
Konrad Slind

Session 2:
Abstract Congruence Closure and Specializations . . . . 64
Leo Bachmair and Ashish Tiwari

A Framework for Cooperating Decision Procedures . . . . 79
Clark W. Barrett, David L. Dill, and Aaron Stump

Modular Reasoning in Isabelle . . . . 99
Florian Kammüller

An Infrastructure for Intertheory Reasoning . . . . 115
William M. Farmer

Session 3:
Gödel's Algorithm for Class Formation . . . . 132
Johan Gijsbertus Frederik Belinfante

Automated Proof Construction in Type Theory Using Resolution . . . . 148
Marc Bezem, Dimitri Hendriks, and Hans de Nivelle

System Description: TPS: A Theorem Proving System for Type Theory . . . . 164
Peter B. Andrews, Matthew Bishop, and Chad E. Brown
The Nuprl Open Logical Environment . . . . 170
Stuart F. Allen, Robert L. Constable, Rich Eaton, Christoph Kreitz, and Lori Lorigo

System Description: aRa - An Automatic Theorem Prover for Relation Algebras . . . . 177
Carsten Sinz

Invited Talk:
Scalable Knowledge Representation and Reasoning Systems . . . . 183
Henry Kautz

Session 4:
Efficient Minimal Model Generation Using Branching Lemmas . . . . 184
Ryuzo Hasegawa, Hiroshi Fujita, and Miyuki Koshimura

FDPLL — A First Order Davis–Putnam–Logemann–Loveland Procedure . . . . 200
Peter Baumgartner

Rigid E-Unification Revisited . . . . 220
Ashish Tiwari, Leo Bachmair, and Harald Ruess

Invited Talk:
Connecting Bits with Floating-Point Numbers: Model Checking and Theorem Proving in Practice . . . . 235
Carl-Johan Seger

Session 5:
Reducing Model Checking of the Many to the Few . . . . 236
E. Allen Emerson and Vineet Kahlon

Simulation Based Minimization . . . . 255
Doran Bustan and Orna Grumberg

Rewriting for Cryptographic Protocol Verification . . . . 271
Thomas Genet and Francis Klay

System Description: *sat: A Platform for the Development of Modal Decision Procedures . . . . 291
Enrico Giunchiglia and Armando Tacchella
System Description: DLP . . . . 297
Peter Patel-Schneider

Two Techniques to Improve Finite Model Search . . . . 302
Gilles Audemard, Belaid Benhamou, and Laurent Henocque

Session 6:
Eliminating Dummy Elimination . . . . 309
Jürgen Giesl and Aart Middeldorp

Extending Decision Procedures with Induction Schemes . . . . 324
Deepak Kapur and Mahadevan Subramaniam

Complete Monotonic Semantic Path Orderings . . . . 346
Cristina Borralleras, Maria Ferreira, and Albert Rubio

Session 7:
Stratified Resolution . . . . 365
Anatoli Degtyarev and Andrei Voronkov

Support Ordered Resolution . . . . 385
Bruce Spencer and Joseph D. Horton

System Description: IVY . . . . 401
William McCune and Olga Shumsky

System Description: SystemOnTPTP . . . . 406
Geoff Sutcliffe

System Description: PTTP+GLiDes: Semantically Guided PTTP . . . . 411
Marianne Brown and Geoff Sutcliffe

Session 8:
A Formalization of a Concurrent Object Calculus up to α-Conversion . . . . 417
Guillaume Gillard

A Resolution Decision Procedure for Fluted Logic . . . . 433
Renate A. Schmidt and Ullrich Hustadt

ZRes: The Old Davis–Putnam Procedure Meets ZBDD . . . . 449
Philippe Chatalic and Laurent Simon
System Description: MBase, an Open Mathematical Knowledge Base . . . . 455
Andreas Franke and Michael Kohlhase

System Description: Tramp: Transformation of Machine-Found Proofs into ND-Proofs at the Assertion Level . . . . 460
Andreas Meier

Session 9:
On Unification for Bounded Distributive Lattices . . . . 465
Viorica Sofronie-Stokkermans

Reasoning with Individuals for the Description Logic SHIQ . . . . 482
Ian Horrocks, Ulrike Sattler, and Stephan Tobies

System Description: Embedding Verification into Microsoft Excel . . . . 497
Graham Collins and Louise A. Dennis

System Description: Interactive Proof Critics in XBarnacle . . . . 502
Mike Jackson and Helen Lowe

Tutorials:
Tutorial: Meta-logical Frameworks . . . . 507
Carsten Schürmann

Tutorial: Automated Deduction and Natural Language Understanding . . . . 509
Stephen Pulman

Tutorial: Using TPS for Higher-Order Theorem Proving and ETPS for Teaching Logic . . . . 511
Peter B. Andrews and Chad E. Brown

Workshops:
Workshop: Model Computation - Principles, Algorithms, Applications . . . . 513
Peter Baumgartner, Chris Fermueller, Nicolas Peltier, and Hantao Zhang

Workshop: Automation of Proofs by Mathematical Induction . . . . 514
Carsten Schürmann

Workshop: Type-Theoretic Languages: Proof-Search and Semantics . . . . 515
Didier Galmiche
Workshop: Automated Deduction in Education . . . . 516
Erica Melis

Workshop: The Role of Automated Deduction in Mathematics . . . . 517
Simon Colton, Volker Sorge, and Ursula Martin

Author Index . . . . 519
High-Level Verification Using Theorem Proving and Formalized Mathematics (Extended Abstract) John Harrison Intel Corporation, EY2-03 5200 NE Elam Young Parkway Hillsboro, OR 97124, USA
[email protected] Abstract. Quite concrete problems in verification can throw up the need for a nontrivial body of formalized mathematics and draw on several special automated proof methods which can be soundly integrated into a general LCF-style theorem prover. We emphasize this point based on our own work on the formal verification in the HOL Light theorem prover of floating point algorithms.
1 Formalized Mathematics in Verification
Much of our PhD research [11] was devoted to developing formalized mathematics, in particular real analysis, with a view to its practical application in verification, and our current work in formally verifying floating point algorithms shows that this direction of research is quite justified. First of all, it almost goes without saying that some basic facts about real numbers are useful. Admittedly, floating point verification has been successfully done in systems that do not support real numbers at all [16,17,19]. After all, floating point numbers in conventional formats are all rational (with denominators always a power of 2). Nevertheless, the whole point of floating point numbers is that they are approximations to reals, and the main standard governing floating point correctness [13] defines behavior in terms of real numbers. Without using real numbers it is already necessary to specify the square root function in an unnatural way, and for more complicated functions such as sin it seems hardly feasible to make good progress in specification or verification without using real numbers explicitly. In fact, one needs a lot more than simple algebraic properties of the reals. Even to define the common transcendental functions and derive useful properties of them requires a reasonable body of analytical results about limits, power series, derivatives etc. In short, one needs a formalized version of a lot of elementary real analysis, an unusual mixture of the general and the special. A typical general result that is useful in verification is the following:

If a function f is differentiable with derivative f' in an interval [a, b], then a sufficient condition for f(x) ≤ K throughout the interval is that f(x) ≤ K at the endpoints a, b and at all points of zero derivative.
This theorem is used, for example, in finding a bound for the error incurred in approximating a transcendental function by a truncated power series. The formal HOL version of this theorem looks like this:

|- (!x. a <= x /\ x <= b ==> (f diffl (f' x)) x) /\
   f(a) <= K /\ f(b) <= K /\
   (!x. a <= x /\ x <= b /\ (f'(x) = &0) ==> f(x) <= K)
   ==> (!x. a <= x /\ x <= b ==> f(x) <= K)
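Numerically, the principle this theorem captures can be sketched as follows (our own illustration, not taken from the paper; the function and interval are arbitrary choices):

```python
import math

def bound_on_interval(f, a, b, critical_points):
    """Upper bound for f on [a, b] by the quoted principle: it suffices to
    check the endpoints and the points where the derivative vanishes."""
    return max(f(a), f(b), *(f(c) for c in critical_points))

# Example: f(x) = x * exp(-x) on [0, 2]. Its derivative (1 - x) * exp(-x)
# vanishes only at x = 1, so K = max(f(0), f(2), f(1)) bounds f on [0, 2].
f = lambda x: x * math.exp(-x)
K = bound_on_interval(f, 0.0, 2.0, [1.0])
```

Of course, this check is only numerical sampling; the point of the HOL theorem is that the bound is established by proof, not by evaluation.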
A typical concrete result is a series expansion for π [1]:

\[ \pi = \sum_{n=0}^{\infty} \frac{1}{16^n}\left(\frac{4}{8n+1} - \frac{2}{8n+4} - \frac{1}{8n+5} - \frac{1}{8n+6}\right) \]

This allows us to approximate π arbitrarily closely by rational numbers. Doing so is important both for detailed analysis of trigonometric range reduction (reducing an argument x to a trigonometric function to r where x = r + Nπ/2) and to dispose of trivial side-conditions. For example, an algorithm might rely on the fact that sin(x) is positive for some particular x, and we can verify this by confirming that 0 < x < π using an approximation of π. In HOL, the formal theorem is as follows:

|- (\n. inv(&16 pow n) *
        (&4 / &(8 * n + 1) - &2 / &(8 * n + 4) -
         &1 / &(8 * n + 5) - &1 / &(8 * n + 6)))
   sums pi
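The series converges quickly, adding more than a decimal digit of accuracy per term, which is easy to confirm numerically; a sketch (the function name is our own):

```python
import math

def bbp_partial_sum(terms):
    """Partial sum of the Bailey-Borwein-Plouffe series for pi quoted above."""
    total = 0.0
    for n in range(terms):
        total += (1 / 16 ** n) * (4 / (8 * n + 1) - 2 / (8 * n + 4)
                                  - 1 / (8 * n + 5) - 1 / (8 * n + 6))
    return total

# Ten terms already agree with pi to roughly ten decimal places.
approx = bbp_partial_sum(10)
```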
The mathematics needed in floating-point verification is an unusual mixture of these general and special facts, and it's sometimes the kind that isn't widely found in textbooks. For example, an important result we use is the power series expansion for the cotangent function (for x ≠ 0):

\[ \cot(x) = \frac{1}{x} - \frac{x}{3} - \frac{x^3}{45} - \frac{2x^5}{945} - \cdots \]

To derive this straightforward-looking theorem, both getting a simple recurrence relation for the coefficients and a reasonably sharp bound on their size, is fairly non-trivial. A typical mathematics book either doesn't mention such a concrete result at all, or gives it without proof as part of a "cookbook" of well-known useful results. After some time browsing in a library, we eventually settled on formalizing a proof in Knopp's classic book on infinite series [14]. Formalizing this took several days of work, drawing extensively on existing analytical lemmas in HOL. A side-effect is that we derived a general result on harmonic sums, the simplest special cases of which are the well-known:
\[ 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots = \frac{\pi^2}{6} \qquad\text{and}\qquad 1 + \frac{1}{2^4} + \frac{1}{3^4} + \frac{1}{4^4} + \cdots = \frac{\pi^4}{90} \]
Knopp remarks: "It is not superfluous to realize all that was needed to obtain even the first of these elegant formulae." We may add that it is even more surprising that such extensive mathematical developments are used simply to verify that a floating point tangent function satisfies a certain error bound. Of course, one also needs plenty of specialized facts about floating point arithmetic, e.g. important properties of rounding. These theories have also been developed in HOL Light [12] but we will not go into more detail here.
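A quick numerical sanity check of the truncated cotangent series is instructive (our own sketch; the HOL development proves a rigorous bound on the omitted terms, which mere evaluation cannot):

```python
import math

def cot_series(x):
    """First four terms of the cotangent expansion quoted above (x != 0)."""
    return 1 / x - x / 3 - x ** 3 / 45 - 2 * x ** 5 / 945

# For small nonzero x the truncation error is tiny: at x = 0.1 the four
# terms agree with cos(x)/sin(x) to better than 1e-10.
err = abs(cot_series(0.1) - math.cos(0.1) / math.sin(0.1))
```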
2 Proof in HOL Light
The theorem prover we are using in our work is HOL Light [8],¹ a version of the HOL prover [5]. HOL is a descendant of Edinburgh LCF [6], which first defined the 'LCF approach' that these systems take to formal proof. LCF provers explicitly generate proofs in terms of extremely low-level primitive inferences, in order to provide a high level of assurance that the proofs are valid. In HOL Light, as in most other LCF-style provers, the proofs (which can be very large) are not usually stored permanently, but the strict reduction to primitive inferences is maintained by the abstract type system of the interaction and implementation language, which for HOL Light is CAML Light [4,23]. The primitive inference rules of HOL Light, which implements a simply typed classical higher order logic, are very simple, and are summarized below.

REFL:      ⊢ t = t

TRANS:     from Γ ⊢ s = t and Δ ⊢ t = u infer Γ ∪ Δ ⊢ s = u

MK_COMB:   from Γ ⊢ s = t and Δ ⊢ u = v infer Γ ∪ Δ ⊢ s(u) = t(v)

ABS:       from Γ ⊢ s = t infer Γ ⊢ (λx. s) = (λx. t)

BETA:      ⊢ (λx. t)x = t

ASSUME:    {p} ⊢ p

EQ_MP:     from Γ ⊢ p = q and Δ ⊢ p infer Γ ∪ Δ ⊢ q
¹ See http://www.cl.cam.ac.uk/users/jrh/hol-light/index.html
DEDUCT_ANTISYM_RULE:  from Γ ⊢ p and Δ ⊢ q infer (Γ − {q}) ∪ (Δ − {p}) ⊢ p = q

INST:       from Γ[x1, ..., xn] ⊢ p[x1, ..., xn] infer Γ[t1, ..., tn] ⊢ p[t1, ..., tn]

INST_TYPE:  from Γ[α1, ..., αn] ⊢ p[α1, ..., αn] infer Γ[γ1, ..., γn] ⊢ p[γ1, ..., γn]

In MK_COMB, the types must agree, e.g. s : σ → τ, t : σ → τ, u : σ and v : σ. In ABS, we require that x is not a free variable in any of the assumptions Γ. In ASSUME, p must be of Boolean type, i.e. a proposition. All theorems in HOL are deduced using just the above rules, starting from three axioms: Extensionality, Choice and Infinity. There are also definitional mechanisms allowing the introduction of new constants and types, but these are easily seen to be logically conservative and thus avoidable in principle. CAML Light also serves as a programming medium allowing higher-level derived rules (e.g. to automate linear arithmetic, first order logic or reasoning in other special domains) to be programmed as reductions to primitive inferences, so that proofs can be partially automated. This is very useful in practice. In floating point proofs we make extensive use of quite intricate facts of linear arithmetic, such as:
Proving these by low-level primitive inferences can be tedious in the extreme, so it is immensely valuable to have the process automated. Similarly, we often use first order automation to avoid tedious low-level reasoning (e.g. chaining together many inequalities) or exploit symmetries via lemmas such as: |- (!x y. P x y = P y x) /\ (!x y. Q x ==> P x y) ==> !x y. Q x \/ Q y ==> P x y
Because these are all programmed as reductions to primitive inferences, we have the security of knowing that any errors in the derived rule cannot result in false “theorems” as long as the few primitive rules are sound. This can be especially important in verification of real industrial systems, since an error in a ‘proof’ can invalidate the entire result. The basic LCF approach of exploiting traditional automated techniques [3,15] or high-level methods of proof description [9] by reducing them to primitive
inferences in a single core logic seems to us a very fruitful one. Of course, it has an efficiency penalty, but as we argue in [7], it is not usually too severe except in a few special cases. Nevertheless, there is still much more work to be done to make systems like HOL Light really usable by a nonspecialist. In our opinion, the most impressive system for formalizing abstract mathematics is Mizar [18,22], and importing the strengths of that system into LCF-style provers is a popular topic of research [10,21,24,26]. The first sustained attempt to actually formalize a body of mathematics (concepts and proofs) was Principia Mathematica [25]. This successfully derived a body of fundamental mathematics from a small logical system. However, the task of doing so was extraordinarily painstaking, and indeed Russell [20] remarked that his own intellect ‘never quite recovered from the strain of writing it’. The correctness theorems we are producing in our work often involve tens or hundreds of millions of applications of primitive inference rules, and build from foundational results about the natural numbers up to nontrivial and highly concrete applied mathematics. Yet using HOL Light, which can bridge the abyss between simple primitive inferences and the demands of real applications, doing so is quite feasible.
References

1. D. Bailey, P. Borwein, and S. Plouffe. On the rapid computation of various polylogarithmic constants. Mathematics of Computation, 66:903–913, 1997.
2. Yves Bertot, Gilles Dowek, André Hirschowitz, Christine Paulin, and Laurent Théry, editors. Theorem Proving in Higher Order Logics: 12th International Conference, TPHOLs'99, volume 1690 of Lecture Notes in Computer Science, Nice, France, 1999. Springer-Verlag.
3. Richard John Boulton. Efficiency in a fully-expansive theorem prover. Technical Report 337, University of Cambridge Computer Laboratory, New Museums Site, Pembroke Street, Cambridge, CB2 3QG, UK, 1993. Author's PhD thesis.
4. Guy Cousineau and Michel Mauny. The Functional Approach to Programming. Cambridge University Press, 1998.
5. Michael J. C. Gordon and Thomas F. Melham. Introduction to HOL: a theorem proving environment for higher order logic. Cambridge University Press, 1993.
6. Michael J. C. Gordon, Robin Milner, and Christopher P. Wadsworth. Edinburgh LCF: A Mechanised Logic of Computation, volume 78 of Lecture Notes in Computer Science. Springer-Verlag, 1979.
7. John Harrison. Metatheory and reflection in theorem proving: A survey and critique. Technical Report CRC-053, SRI Cambridge, Millers Yard, Cambridge, UK, 1995. Available on the Web as http://www.cl.cam.ac.uk/users/jrh/papers/reflect.dvi.gz.
8. John Harrison. HOL Light: A tutorial introduction. In Mandayam Srivas and Albert Camilleri, editors, Proceedings of the First International Conference on Formal Methods in Computer-Aided Design (FMCAD'96), volume 1166 of Lecture Notes in Computer Science, pages 265–269. Springer-Verlag, 1996.
9. John Harrison. A Mizar mode for HOL. In Joakim von Wright, Jim Grundy, and John Harrison, editors, Theorem Proving in Higher Order Logics: 9th International Conference, TPHOLs'96, volume 1125 of Lecture Notes in Computer Science, pages 203–220, Turku, Finland, 1996. Springer-Verlag.
10. John Harrison. Proof style. In Eduardo Giménez and Christine Paulin-Mohring, editors, Types for Proofs and Programs: International Workshop TYPES'96, volume 1512 of Lecture Notes in Computer Science, pages 154–172, Aussois, France, 1996. Springer-Verlag.
11. John Harrison. Theorem Proving with the Real Numbers. Springer-Verlag, 1998. Revised version of author's PhD thesis.
12. John Harrison. A machine-checked theory of floating point arithmetic. In Bertot et al. [2], pages 113–130.
13. IEEE. Standard for binary floating point arithmetic. ANSI/IEEE Standard 754-1985, The Institute of Electrical and Electronic Engineers, Inc., 345 East 47th Street, New York, NY 10017, USA, 1985.
14. Konrad Knopp. Theory and Application of Infinite Series. Blackie and Son Ltd., 2nd edition, 1951.
15. Ramaya Kumar, Thomas Kropf, and Klaus Schneider. Integrating a first-order automatic prover in the HOL environment. In Myla Archer, Jeffrey J. Joyce, Karl N. Levitt, and Phillip J. Windley, editors, Proceedings of the 1991 International Workshop on the HOL theorem proving system and its Applications, pages 170–176, University of California at Davis, Davis CA, USA, 1991. IEEE Computer Society Press.
16. J Strother Moore, Tom Lynch, and Matt Kaufmann. A mechanically checked proof of the correctness of the kernel of the AMD5K86 floating-point division program. IEEE Transactions on Computers, 47:913–926, 1998.
17. John O'Leary, Xudong Zhao, Rob Gerth, and Carl-Johan H. Seger. Formally verifying IEEE compliance of floating-point hardware. Intel Technology Journal, 1999-Q1:1–14, 1999. Available on the Web as http://developer.intel.com/technology/itj/q11999/articles/art_5.htm.
18. Piotr Rudnicki. An overview of the MIZAR project. Available on the Web as http://web.cs.ualberta.ca/~piotr/Mizar/MizarOverview.ps, 1992.
19. David Rusinoff. A mechanically checked proof of IEEE compliance of a register-transfer-level specification of the AMD-K7 floating-point multiplication, division, and square root instructions. LMS Journal of Computation and Mathematics, 1:148–200, 1998. Available on the Web via http://www.onr.com/user/russ/david/k7-div-sqrt.html.
20. Bertrand Russell. The autobiography of Bertrand Russell. Allen & Unwin, 1968.
21. Don Syme. Three tactic theorem proving. In Bertot et al. [2], pages 203–220.
22. Andrzej Trybulec. The Mizar-QC/6000 logic information language. ALLC Bulletin (Association for Literary and Linguistic Computing), 6:136–140, 1978.
23. Pierre Weis and Xavier Leroy. Le langage Caml. InterEditions, 1993. See also the CAML Web page: http://pauillac.inria.fr/caml/.
24. Markus Wenzel. Isar - a generic interpretive approach to readable formal proof documents. In Bertot et al. [2], pages 167–183.
25. Alfred North Whitehead and Bertrand Russell. Principia Mathematica (3 vols). Cambridge University Press, 1910.
26. Vincent Zammit. On the implementation of an extensible declarative proof language. In Bertot et al. [2], pages 185–202.
Machine Instruction Syntax and Semantics in Higher Order Logic Neophytos G. Michael and Andrew W. Appel Computer Science Department, Princeton University, 35 Olden Street Princeton, NJ 08544, USA
[email protected] [email protected]
Abstract. Proof-carrying code and other applications in computer security require machine-checkable proofs of properties of machine-language programs. These in turn require axioms about the opcode/operand encoding of machine instructions and the semantics of the encoded instructions. We show how to specify instruction encodings and semantics in higher-order logic, in a way that preserves the factoring of similar instructions in real machine architectures. We show how to automatically generate proofs of instruction decodings, global invariants from local invariants, Floyd-Hoare rules and predicate transformers, all from the specification of the instruction semantics. Our work is implemented in ML and Twelf, and all the theorems are checked in Twelf.
1 Introduction
The security problem for mobile code or for component software is this: an untrusted program (or program fragment) is to execute in a host environment (the code consumer), and we want to ensure that it will do no harm. Proof Carrying Code (PCC) [1] is a framework for solving this problem by providing such assurances to the host. In the PCC framework the code consumer advertises a safety policy which specifies the logic in which it will accept proofs, the regions of readable or writable addresses, and so on. The code producer must construct a proof that the machine-language program satisfies the safety policy; the proof might be generated using hints from the compiler that generated the code. This proof along with the code is communicated to the host environment, and the host verifies it before executing the code. PCC has significant advantages over other approaches that address the same problem (such as software fault isolation [6] or byte code interpretation [7]): no performance penalty is taken since the code is run at native speeds, and the proofs are performed on native machine code, so no unsoundness can be introduced in the translation (or compilation) from the proved program to the one that will actually execute. For well-chosen safety policies, the proofs can be generated completely automatically. In Appel and Felty [5] we gave an overview of our PCC system and described how it differs from the approach taken by Necula [2]. Instead of building type-inference rules into the safety policy, we model types as defined predicates using
Neophytos G. Michael and Andrew W. Appel
the primitives of ordinary logic; we prove typing rules as lemmas, and show how to model a wide variety of type constructors. This way the PCC safety policy is independent of the code producer's programming language and type system. The machine-description semantics are moved from the verification-condition generator to the safety policy. More specifically, our safety policy consists of the following:

1. The logic: a fairly standard higher-order logic¹ (L) consisting of eight inference rules for the logic and twenty-nine for arithmetic (with addition and multiplication taken as primitives).
2. The machine-code syntax and semantics: this is encoded as the definition of the step relation (↦) that describes the syntax and semantics of the machine. Step formally captures the notion of a single instruction execution. These axioms also define the decode relation that completely specifies instruction opcodes and operands (machine syntax) for all legal machine-code instructions.
3. Safety constraints: these are statements² in L that describe general properties of the runtime system (such as readable and safe-to-jump memory locations). They may also contain typing judgments for the initial contents of the register bank.

The small size of the logic is one of the major advantages of our approach. It contains no inference rules on types and no Hoare-logic rules for instructions (thus avoiding all complications due to substitution). Since the logic is so small, the proof checker can be likewise small. Thus the trusted computing base (TCB) can be verified easily (either by hand or through other means). A small TCB is the essence of PCC. To simplify the presentation of the following sections we will use the toy machine (from [5]), a word-addressed 16-bit CPU. Its instruction set is presented in figure 1. Our system currently works with two other machine architectures (Sparc and Mips), and when appropriate we will also use examples from these.
2   Overview
Our focus in this paper is twofold: concise axioms modeling machine architectures, and efficient proofs using those axioms.

¹ Our logic L is a sublogic of the Calculus of Constructions [11] and of the logic used in the HOL theorem prover [12], so our proofs can be checked in either Coq or HOL. Our current implementation uses Twelf [4].
² We offer a brief introduction to the syntax of our object logic: A metalogic (Twelf) type is a type, and an object-logic type is a tp. Object-logic types are constructed from num (the type of rationals), form (the type of formulas), and the arrow constructor. Object-level terms of type T have type (tm T) in the metalogic. Terms of type (pf A) are terms representing proofs of object formula A. The term lam [x] F(x) is the object-logic function that maps x to F(x), and @ is the application operator for λ-terms. See Appel and Felty [5] for more details.
Machine Instruction Syntax and Semantics in Higher Order Logic

Instruction  Fields       Effect
add          0 d s1 s2    rd := rs1 + rs2
addi         1 d s c      rd := rs + sign_ext(c)
load         2 d s c      rd := m[rs + sign_ext(c)]
store        3 s1 s2 c    m[rs2 + sign_ext(c)] := rs1
jump         4 d s c      rd := rpc ; rpc := rs + sign_ext(c)
bgt          5 s1 s2 c    if rs1 > rs2 then rpc := rpc + sign_ext(c)
beq          6 s1 s2 c    if rs1 = rs2 then rpc := rpc + sign_ext(c)

Fig. 1. The toy machine instruction set.
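To make Fig. 1 concrete, its effects column can be read as a small-step interpreter. The following Python fragment is our own illustrative sketch, not part of the paper's formal development (the register/memory representation and helper names are ours); it returns None exactly when the machine is stuck, matching the safety policy's notion of an illegal fetch, unreadable load, or illegal opcode:

```python
MASK = 0xFFFF
PC = 16  # a register number outside r0..r15, used for the program counter

def sign_ext(c, width=4):
    """Sign-extend a width-bit field to 16 bits."""
    if c & (1 << (width - 1)):
        c -= 1 << width
    return c & MASK

def step(r, m):
    """One execution step: returns (r', m'), or None if the machine is stuck."""
    w = m.get(r[PC])
    if w is None:
        return None                                   # illegal fetch
    op = w >> 12
    f1, f2, f3 = (w >> 8) & 0xF, (w >> 4) & 0xF, w & 0xF
    r2, m2 = dict(r), dict(m)
    r2[PC] = (r[PC] + 1) & MASK                       # pc := pc + size
    if op == 0:                                       # add: rd := rs1 + rs2
        r2[f1] = (r[f2] + r[f3]) & MASK
    elif op == 1:                                     # addi: rd := rs + sign_ext(c)
        r2[f1] = (r[f2] + sign_ext(f3)) & MASK
    elif op == 2:                                     # load: rd := m[rs + sign_ext(c)]
        a = (r[f2] + sign_ext(f3)) & MASK
        if a not in m:
            return None                               # unreadable address
        r2[f1] = m[a]
    elif op == 3:                                     # store: m[rs2 + sign_ext(c)] := rs1
        m2[(r[f2] + sign_ext(f3)) & MASK] = r[f1]
    elif op == 4:                                     # jump: rd := pc ; pc := rs + sign_ext(c)
        r2[f1] = r2[PC]
        r2[PC] = (r[f2] + sign_ext(f3)) & MASK
    elif op == 5 and r[f1] > r[f2]:                   # bgt (taken)
        r2[PC] = (r2[PC] + sign_ext(f3)) & MASK
    elif op == 6 and r[f1] == r[f2]:                  # beq (taken)
        r2[PC] = (r2[PC] + sign_ext(f3)) & MASK
    elif op > 6:
        return None                                   # illegal opcode
    return r2, m2
```

Note that, as in the step relation of section 4.4, the pc is incremented before the instruction's own effect is applied.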
We will describe in detail our step relation and show how it succinctly captures the syntax and semantics of real machines. Since it is by far the largest piece of our safety policy, we are of course concerned about its correctness. To this end we will show how parts of it can be automatically generated from existing systems. Here we tackle the syntax of machine instructions using machine descriptions from the New Jersey Machine Code Toolkit [8]. We also show how to automatically generate proofs of correspondence between machine-code integers and statements involving the decode relation. We will describe the engineering aspects of generating small proofs of safety. Program safety is proved using a coinduction theorem based on progress and preservation of an invariant. We construct invariant expressions whose size is linear in the number of program instructions, and structure the progress and preservation proofs so that – modulo the parts that will have to be built by our tactical theorem prover – they are linear in size. In building these invariants we need the weakest preconditions of instructions, and we will show how to automatically generate lemmas for a Hoare logic of machine language from the step relation. Our safety proofs will be linear-sized trees of applications of these Hoare lemmas.

Figure 2 shows our system operating on a small program that computes the sum of a linked list of integers. The goal of the system is to prove that the initial machine configuration (IMC) is safe; in symbols, the theorem IMC(r₀, m₀) → safe(r₀, m₀), where

  IMC(r, m)  := m(100) = 8976 ∧ ⋯ ∧ m(105) = 24859
  safe(r, m) := ∀r′, m′. (r, m ↦* r′, m′) → ∃r″, m″. (r′, m′ ↦ r″, m″).

The IMC describes parts of memory at the moment the program will run (in this case only the part containing the program itself). The step relation r, m ↦ r′, m′ formally describes a single instruction execution, i.e. given a machine at state (r, m), after execution of the instruction found at r(pc), the machine will be at state (r′, m′). The safe property states that no matter how far the execution
[Fig. 2 depicts the pipeline: the machine code (100: 8976, 101: 25413, 102: 8978, 103: 547, 104: 8465, 105: 24859) is fed through the Decode Prover, yielding the L predicates load r3, r1, 0; beq r3, r4, 5; load r3, r1, 2; add r2, r2, r3; load r1, r1, 1; beq r1, r1, −5. The Invariant Generator builds the global invariant Inv; the Prover produces pf(Inv(r₀, m₀)), pf(progress(Inv)), and pf(preservation(Inv)); and the Coinduction Theorem yields pf(safe(r₀, m₀)).]

Fig. 2. Generating safety proofs.

proceeds, it never gets stuck, i.e. executes an illegal instruction or performs an illegal fetch. The PCC system is presented with a list of machine-code instructions (i.e. integers). The instruction stream is fed through the decode-prover, whose job is to discover the instruction each integer represents, and to produce the symbolic representation of each instruction – a predicate that describes the instruction's semantics. The decode-prover also produces proofs of this correspondence. Following this, the predicates are fed into the invariant-generator, which builds the global invariant to be used in the coinduction proof. Constructing invariants is not computable in general, so the prover requires hints in the form of local loop invariants decorating the targets of backward branches. Once the global invariant is built we must prove the three preconditions of the coinduction theorem³ (see figure 3) in order to apply it. This is done by the prover, and given the three proofs we apply the rule to finally establish safe(r₀, m₀).
progress(Inv) := ∀r, m. Inv(r, m) → ∃r′, m′. (r, m ↦ r′, m′)
preservation(Inv) := ∀r, m, r′, m′. Inv(r, m) ∧ (r, m ↦ r′, m′) → Inv(r′, m′)

   Inv(r, m)    progress(Inv)    preservation(Inv)
   ───────────────────────────────────────────────
                    safe(r, m)

Fig. 3. The coinduction theorem.
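The force of the theorem is that two local checks yield a global guarantee. The following Python fragment is a bounded, purely illustrative reading of that fact (the harness and the toy transition system are our own): if Inv holds initially, progress and preservation together mean the machine never gets stuck, however far it runs.

```python
def never_stuck(state, step, inv, fuel=1000):
    """Check, up to `fuel` steps, the conclusion of the coinduction theorem:
    from any Inv-state, progress gives a next state and preservation keeps
    execution inside Inv, so the machine never sticks."""
    assert inv(state)
    for _ in range(fuel):
        nxt = step(state)
        assert nxt is not None   # progress(Inv)
        assert inv(nxt)          # preservation(Inv)
        state = nxt
    return True

# A trivial machine: a counter mod 5, with Inv = "state is in range".
assert never_stuck(0, lambda s: (s + 1) % 5, lambda s: 0 <= s < 5, fuel=100)
```

The real theorem is coinductive and needs no fuel bound; the bound here only makes the illustration terminate.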
³ A note on notation: in the interest of brevity we will sometimes use mathematical notation when presenting Twelf terms.
upd(f, d, x, f′) := ∀z. if (z = d) then f′(z) = x else f′(z) = f(z)

i_add(d, s1, s2)(r, m, r′, m′) :=
  ∃sum. plus_mod16(r(s1), r(s2), sum) ∧ upd(r, d, sum, r′) ∧ no_mem_change(m, m′)

i_load(d, s1, c)(r, m, r′, m′) :=
  ∃cext, addr. sign_ext(3, c, cext) ∧ plus_mod16(r(s1), cext, addr)
    ∧ upd(r, d, m(addr), r′) ∧ readable(addr) ∧ no_mem_change(m, m′)

Fig. 4. Semantics of the add and load instructions of the toy machine.
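The definitions of Fig. 4 are relations, not functions, and can be tested as such. As a hedged sketch (our own rendering, with relations checked over finite domains rather than all of num), upd and i_add might be prototyped:

```python
def upd(f, d, x, f2, dom):
    """upd(f, d, x, f'): f' agrees with f everywhere on dom except at d,
    where it holds x."""
    return all((f2[z] == x) if z == d else (f2[z] == f[z]) for z in dom)

def i_add(d, s1, s2):
    """The i_add relation of Fig. 4: holds when the add can safely take
    state (r, m) to (r2, m2)."""
    def rel(r, m, r2, m2):
        s = (r[s1] + r[s2]) % (1 << 16)             # plus_mod16
        return (upd(r, d, s, r2, r.keys())          # only register d changes
                and all(m2[x] == m[x] for x in m))  # no_mem_change
    return rel
```

For example, i_add(2, 0, 1) relates register banks that differ only in r2 holding r0 + r1 mod 2¹⁶.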
3   Machine Semantics
In this section we show that the semantics of machine instructions can be easily and concisely expressed in higher-order logic. We begin by explaining the idea using the toy machine, and then explore the problems in defining a semantic description of a real CPU. Each instruction defines a relation between the machine state (registers, memory) before and after its execution. We treat both the memory and the register bank as functions from integers to integers. Each instruction then becomes a predicate which takes (r, m, r′, m′) as input, and holds when the instruction can safely take state (r, m) to (r′, m′). In figure 4 we show the terms expressing the semantics of the "add" and "load" instructions of the toy machine. The Twelf term i_add (what we will call a constructor in section 4) expects three arguments (d, s1, s2) and returns a predicate of type instr, defined as:

  instr = regs → mem → regs → mem → form.

It is this predicate that we view as the semantics of the instruction. Thus for the add instruction, i_add(d, s1, s2) holds when for some integer sum, the following three equations hold:

  sum = (r(s1) + r(s2)) mod 2¹⁶
  ∀x. if (x = d) then r′(x) = sum else r′(x) = r(x)
  ∀x. m(x) = m′(x).

The situation is similar for the semantics of the "load" instruction. But we wish to consider a program safe only if all of its memory accesses are within a specified region. Therefore our step relation admits only a subset of executable load instructions: those that load from readable addresses. The designer of the safety policy must provide axioms that define the readable predicate. In general the semantics of each instruction must enforce the proper conditions under which
the instruction can be executed. For "add" there are no such conditions; we can always add two numbers.

Real hardware can be a lot more complex than our simplistic toy machine. On a modern CPU one has to deal with the issues of delayed branches, address alignment, stores and loads of different sizes, condition registers, sign extension, instructions with multiple effects, and ALU operations not directly expressible in our arithmetic, to mention just a few. We claim that all of these can be handled relatively easily with the right set of abstractions and definitions. Space restrictions only allow us to deal with a representative subset here. We will use the Sparc CPU in the presentation.

– Condition Registers: We model condition registers exactly as we model physical registers. We assign a number to each of them that is outside the range of representable register numbers and refer to them exactly the same way we refer to regular registers. Instructions that need to modify individual bits do so by the use of appropriate definitions (see the bits predicate below).

– Delayed Branches: In order to keep their deep pipelines filled, some modern CPUs have introduced the notion of a delayed branch. On such CPUs one (or more) of the instructions following a branch will be executed even if the branch is taken, before the CPU starts executing instructions from the target address. We will assume a single-instruction delay slot (the solution can be easily generalized to a delay slot of n instructions). We introduce another register called the next program counter⁴ (npc) which holds the address at which the pc will be next. In the semantics of a branch instruction, if the branch is to be taken we simply set r(npc) = target, and the step relation takes care of updating r(pc) to r(npc) at the appropriate time.

– Address Alignment: Machine addresses have to be properly aligned depending on the instruction that uses them. Using the bits(r, l, v, w) predicate (which holds when the value in the binary representation of w between bits r and l equals v; in symbols bits(r, l, v, w) ⇔ v = ⌊w/2^r⌋ mod 2^(l−r+1)) we can easily express such constraints. In a load-word instruction for the Sparc, for instance, we would insist that bits(0, 1, 0, address) holds.

– Stores/Loads: We chose to model memory by a function m : num → num that we define only on word-aligned addresses. This way we avoid the complications of modifying individual bytes in a word. When we wish to store a byte quantity, the entire word must be fetched from memory, the byte spliced into it, and the result stored back in memory. For load we have a similar situation. With the appropriate definitions all these operations can be specified painlessly. With careful selection of predicates most of them can be shared between the load and store instructions. One such example is the predicate form_address below. It computes a word-aligned address and offset from an unaligned one, and ensures that the original address was well aligned with

⁴ This is in fact how the hardware manages delayed branches. Some machines make the npc register explicit in the specifications [10].
respect to the size of the value we are trying to load/store.

  form_address(u_addr, alignment_bit, addr, offset, size) :=
    bits(0, alignment_bit, offset, u_addr)
    ∧ minus_mod32(u_addr, offset, addr)
    ∧ modulo(offset, size, 0)
– Arithmetic Operations: Some of the arithmetic operations performed by modern CPUs are not directly expressible as functions in our logic. We cannot, for example, write the function that computes the bitwise "exclusive or" of two integers, since our arithmetic primitives include only addition and multiplication and we have no recursion at the object level. Such operations are, however, trivially expressed as relations (predicates). Here, for instance, is the xor predicate:

  xor(a, b, c) := ∀i. ∃x, y, r. bits(i, i, x, a) ∧ bits(i, i, y, b)
                  ∧ (if x = y then r = 0 else r = 1) ∧ bits(i, i, r, c)

Factoring via Higher-order Predicates. Machine instruction sets are highly factored, both in syntax and semantics. Consider for instance the ALU operations of any modern RISC chip. The ALU takes its input from two registers (or a register and a constant) and produces the result in another. The only difference between instructions is the operation performed. Our use of higher-order logic allows us to exploit such factoring very effectively. We find the commonalities in families of instructions (even between families, as in the load/store case above), factor those out, and reuse well-chosen definitions. Here is an example from the Sparc. The definition of i_aluxcc is reused to define 23 different instructions. Argument with_carry specifies whether the instruction operates with a "carry", modifies_icc specifies whether it modifies the integer condition codes, and func is the predicate describing the operation performed by the instruction.

  alu_fun = num arrow num arrow num arrow form.
  i_aluxcc : tm (form arrow form arrow alu_fun arrow alu_typ) =
    lam3 [with_carry : tm form][modifies_icc : tm form][func : tm alu_fun]
    lam3 [rs1][reg_imm][rd]
    lam4 [r][m][r'][m']
      (exists3 [v][v'][r'']
        (load_reg_imm @ r @ reg_imm @ v) and
        (compute_with_carry @ with_carry @ func @ r @ rs1 @ v @ v') and
        (compute_cc @ modifies_icc @ r @ r'' @ v') and
        (upd_reg @ r'' @ rd @ v' @ r') and
        (no_memory_change m m')).

  i_AND   = i_aluxcc @ false @ false @ and_oper.
  i_ANDcc = i_aluxcc @ false @ true  @ and_oper.
    ...                                  -- 21 cases omitted.
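In a functional-language reading, i_aluxcc is simply a function that builds instruction semantics from an operation. The sketch below is our own (carry and condition-code handling are elided, unlike in the real term) and shows the same factoring with closures:

```python
def make_alu(with_carry, modifies_icc, func):
    """Factored ALU semantics in the style of i_aluxcc: one definition,
    many instructions. `func` is the operation performed. The with_carry
    and modifies_icc behavior of the real term is elided here."""
    def instr(rs1, reg_imm, rd):
        def rel(r, m, r2, m2):
            v = reg_imm(r)                        # load_reg_imm: register or immediate
            result = func(r[rs1], v) & 0xFFFFFFFF
            return r2 == {**r, rd: result} and m2 == m
        return rel
    return instr

i_AND   = make_alu(False, False, lambda a, b: a & b)
i_ANDcc = make_alu(False, True,  lambda a, b: a & b)
# ... the remaining 21 instances reuse make_alu in the same way
```

As with the Twelf term, the operation is the only thing that varies; everything else is written once.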
Moreover we exploit commonality between machines. Many of our definitions that deal with the mechanics of splicing values into words, sign extension, and
arithmetic operations are shared between semantic descriptions of different machines. Higher-order predicates are useful in expressing this kind of sharing; note that the i_aluxcc predicate above is higher order.
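The shared splicing machinery bottoms out in the bits predicate of section 3. As a hedged sketch of how these shared definitions behave (our own Python names; form_address follows the definition above), both can be rendered executably:

```python
def bits(r, l, v, w):
    """bits(r, l, v, w): bits r..l of w equal v."""
    return v == (w >> r) % (1 << (l - r + 1))

def form_address(u_addr, alignment_bit, addr, offset, size):
    """Split an unaligned address into a word-aligned address plus offset,
    and insist the offset is aligned for an access of the given size."""
    return (bits(0, alignment_bit, offset, u_addr)
            and addr == (u_addr - offset) % (1 << 32)   # minus_mod32
            and offset % size == 0)                     # modulo(offset, size, 0)

# A byte access at 0x1003 splits into word 0x1000 plus offset 3 ...
assert form_address(0x1003, 1, 0x1000, 3, 1)
# ... but a 4-byte access at 0x1003 is rejected as misaligned.
assert not form_address(0x1003, 1, 0x1000, 3, 4)
```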
4   The Decode Relation
On a von Neumann machine, each instruction is represented in memory by an integer. The decode relation makes this notion precise. It is a predicate of four arguments (m, w, i, s) stating that address w in memory m contains the encoding of instruction i that has size s. Modern microprocessors have hundreds of instructions and to construct this relation manually would be a daunting task. The observation that the information we wish to encode is very similar to the information used by an assembler/disassembler led us to look for an automatic way to generate the relation. The New Jersey Machine Code Toolkit [8] helps programmers write applications that process machine code – assemblers, disassemblers, code generators, and so on. The toolkit lets programmers encode and decode machine instructions symbolically. It transforms symbolic manipulations into bit manipulations, guided by a specification that defines mappings between symbolic and binary representations of instructions. Of interest to us here is the specification language (called SLED) for encoding and decoding assembly-language representations of machine instructions [9]. It is a concise, elegant, and semantically well-founded language, a fact that has made the translation into logic fairly painless. In fact our translation into L can be viewed as a semantics for the language. Before describing our encoding of SLED into L we offer a brief introduction to the language. In order to accommodate machines with non-uniform instruction sizes the toolkit works with streams of tokens instead of instructions. Each instruction consists of one or more tokens. Tokens are further partitioned into fields which are sequences of contiguous bits within a token. Patterns in SLED serve two purposes: firstly they are used to constrain the division of streams into tokens, and secondly to constrain the values of fields in those tokens. Patterns can be combined with various operators to produce new patterns. 
The toolkit is concerned with two representations of machine instructions: machine code and assembly language. Constructors are used to connect the two representations. Figure 5 presents a SLED specification of the toy machine architecture. The first two lines specify the 16-bit token instr and its fields: op which occupies bits 12 to 15, rd which occupies bits 8 to 11, and so on. The next line specifies a list of patterns (add, . . . , beq,) and for each one, it constrains the op field to have the value 0, . . . , 6 respectively. Finally the constructors clause specifies the toy machine instructions. A special toolkit shortcut is used here: if no pattern is specified in the constructor definition then all the names used in the constructor must be either patterns or fields and their conjunction is taken to be the pattern that will be generated by the constructor. In the next subsections we show how to map fields, patterns, and constructors into higher-order logic.
fields of instr (16) op 12:15 rd 8:11 rs1 4:7 rs2 0:3 c 0:3

patterns [add addi load store jump bgt beq] is op = 0 to 6

constructors
  add   rd, rs1, rs2
  addi  rd, rs1, c
  load  rd, rs1, c
  store rd, rs1, c
  jump  rd, rs1, c
  bgt   rd, rs1, c
  beq   rd, rs1, c

Fig. 5. The SLED specification for the toy machine.
4.1   Mapping Fields into L
The definition of the bits predicate (from section 3) makes it straightforward to map fields into L. All that it takes is to supply the right and left bit specifiers of each field to this predicate. Since our definitions are curried, defining fields in L becomes very convenient and almost as terse as it is in SLED. For the toy machine the first two fields are translated as follows:

  op = bits @ (const 12) @ (const 15).
  rd = bits @ (const 8) @ (const 11).

The op predicate expects two integers as arguments (v, word), and it holds when v is equal to the integer between the 12th and 15th bits of word.

4.2   Mapping Patterns into L
Patterns in SLED constrain both the division of streams into tokens and the values of the fields in those tokens. They are composed of constraints on fields. Patterns can be combined using various operators to form other patterns. The RISC machine descriptions we have considered so far contain only conjunction and disjunction operators, and those are the ones we currently translate. We expect no problems in translating the rest when we choose to deal with CISC machines. Conjunction is used to constrain multiple fields within a single token. When p and q are patterns, the pattern "p & q" matches if both p and q match. For example, in the SLED description for Sparc [8] we find:⁵

⁵ This is another example of the terseness of SLED. In the definitions of these patterns Ramsey [9] makes use of a SLED feature called generating expressions, which describe ranges of lists either explicitly or implicitly, as shown in the example.
patterns
  [ TABLE_F2 CALL TABLE_F3 TABLE_F4 ] is op = {0 to 3}
  [ UNIMP Bicc SETHI FBfcc CBccc ]    is TABLE_F2 & op2 = [0 2 4 6 7]
  NOP is SETHI & rd = 0 & imm22 = 0
In the first line TABLE_F2 is defined as the pattern that constrains the op field to equal zero; in the second line TABLE_F2 is used in the definition of SETHI, which is the conjunction of the pattern TABLE_F2 and the constraint op2 = 4. Finally, in the last line the pattern SETHI is used in the definition of the NOP pattern.⁶ Patterns of this kind are very easy to translate into L. We make use of a higher-level infix "and" operator defined as:

  num_pred = num arrow form.
  && : tm num_pred -> tm num_pred -> tm num_pred =
    [p1][p2] lam [w] (p1 @ w) and (p2 @ w).
Given && it is now easy to deal with conjunctive patterns by simply "anding" together the different conjuncts after mapping each of them to an L predicate. The example above then becomes:

  p_TABLE_F2 = op @ (const 0).
  p_SETHI    = p_TABLE_F2 && (op2 @ (const 4)).
  p_NOP      = p_SETHI && (rd @ (const 0)) && (imm22 @ (const 0)).
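As a sanity check on this translation, the combinators can be mirrored in any functional language. Here is an unofficial Python sketch using the Sparc format-2 field positions (op in bits 30–31, rd in 25–29, op2 in 22–24, imm22 in 0–21); predicate names track the SLED ones:

```python
def field(right, left):
    """A SLED field as a curried predicate, via the bits idea of section 3."""
    def eq(v):
        return lambda w: (w >> right) % (1 << (left - right + 1)) == v
    return eq

def conj(p1, p2):                     # the && operator
    return lambda w: p1(w) and p2(w)

op    = field(30, 31)
rd    = field(25, 29)
op2   = field(22, 24)
imm22 = field(0, 21)

p_TABLE_F2 = op(0)
p_SETHI    = conj(p_TABLE_F2, op2(4))
p_NOP      = conj(conj(p_SETHI, rd(0)), imm22(0))

assert p_NOP(0x01000000)   # the canonical Sparc NOP: sethi 0, %g0
```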
Disjunction in patterns is usually used to group patterns for related instructions. In the following example from the Sparc SLED we use disjunction to group the logical, shift, and arithmetic instructions into three groups, which are then disjunctively combined into a pattern that matches any ALU instruction.

patterns
  logical is AND | ANDcc | ANDN | ANDNcc | OR | ORcc | ORN | ORNcc | ...
  arith   is ADD | ADDcc | ADDX | ADDXcc | TADDcc | TADDccTV | ...
  shift   is SLL | SRL | SRA
  alu     is logical | arith | shift
Disjunction patterns are mostly used as opcodes to constructors, and we show how we deal with them in the next subsection.

4.3   Mapping Constructors into L
A constructor maps a list of operands to a pattern which stands for the binary representation of an operand or an instruction. There are two kinds of constructors, typed and untyped. Typed constructors generate instruction operands and untyped constructors generate instructions. The following definition from the Sparc specification is an example of a typed constructor:

constructors
  imode simm13! : reg_or_imm is i = 1 & simm13
  rmode rs2     : reg_or_imm is i = 0 & rs2

⁶ A NOP on the Sparc is a SETHI on r0 with value 0, and since r0 is hardwired to zero it has no effect.
Each line in the definition of a constructor specifies the opcode, the operands, the constructor type, and the matching pattern. Usually the opcode is the constructor's name (as in this case). Constructors generate disjoint sum types. In the above, imode : num → reg_or_imm is the canonical injection from num into the reg_or_imm type – likewise for rmode : num → reg_or_imm. The type is defined implicitly at first use. Each constructor is applicable when the pattern following the is keyword is satisfied. The above constructor definition captures the following idiom: many Sparc instructions (such as add r1, reg_or_imm, r2) take either a register or a constant as one of their arguments. The hardware differentiates between the two instances by the value of bit 13 (field i) in the representation of the instruction. Depending on the value of i, either imode or rmode can be applied, giving in each case a reg_or_imm. We translate a typed constructor into L as follows. We first create a new object-logic type for the constructor type. For each of the injective arrows (imode and rmode above) we create an injective Twelf term (c_imode and c_rmode), as well as a discriminator term (p_imode and p_rmode). Finally we generate a predicate that decides the type itself (p_reg_or_imm), i.e. a term that, when given an object of that type and a word, decides whether that word contains the given object. We show these terms for the example below:

  reg_or_imm : tp
  c_imode : num → reg_or_imm
  c_rmode : num → reg_or_imm
  p_imode(simm) := i(1) && simm13(simm)
  p_rmode(s2)   := i(0) && rs2(s2)
  p_reg_or_imm(regimm, word) :=
      (∃simm. p_imode(simm, word) ∧ regimm = c_imode(simm))
    ∨ (∃s2. p_rmode(s2, word) ∧ regimm = c_rmode(s2))
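The same translation can be mimicked with a disjoint-sum datatype. This Python sketch is ours (field positions follow the Sparc format-3 layout, with i at bit 13, simm13 in bits 0–12, and rs2 in bits 0–4) and mirrors the injections c_imode/c_rmode and the deciding predicate p_reg_or_imm:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Imode:          # c_imode : num -> reg_or_imm
    simm13: int

@dataclass(frozen=True)
class Rmode:          # c_rmode : num -> reg_or_imm
    rs2: int

def p_reg_or_imm(regimm, word):
    """Holds when `word` encodes the given reg_or_imm operand."""
    if (word >> 13) & 1:                        # field i = 1: immediate mode
        return regimm == Imode(word & 0x1FFF)   # simm13
    return regimm == Rmode(word & 0x1F)         # rs2
```

The two dataclass constructors play the role of the injective arrows; equality on the dataclasses plays the role of the discriminators.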
Untyped constructors represent the instructions themselves. Their translation into L is not much different from the typed case, so we omit it.

Factoring via Higher-order Predicates. The extensive factoring present in the SLED specifications (through the wide use of "or" patterns) carries over to the translated higher-order logic terms. When translating a constructor that uses an "or" pattern as an opcode, we do not generate a unique term for each instruction but instead build just a single term that describes all of them. This way we preserve SLED's economy of syntax. Here is an example for the ALU instructions of the Sparc shown earlier. The constructor in the spec is the following:

constructors
  alu rs1, reg_or_imm, rd
p_instr(word, i) := (p_add ||₂ p_addi ||₂ p_load ||₂ p_store ||₂ p_jump ||₂ p_bgt ||₂ p_beq)(word, i)

decode(m, w, i, s) := (s = 1) ∧ p_instr(m(w), i)

step(r, m, r′, m′) := ∃i, r″, size. decode(m, r(pc), i, size)
                      ∧ upd(r, pc, r(pc) + size, r″) ∧ i(r″, m, r′, m′)

Fig. 6. The decode and step relations for the toy machine.
and we generate the following two terms for it:

  p_alu_aux(p_i, i_cons, s1, regimm, s2, word, i) :=
    (p_i && rs1(s1) && p_reg_or_imm(regimm) && rs2(s2))(word)
    ∧ i = i_cons(s1, regimm, s2)

  p_alu(word, i) := ∃s1, rimm, s2.
    (p_alu_aux(p_AND, i_AND) ||₅ p_alu_aux(p_ANDcc, i_ANDcc) ||₅
       ...                                  -- 35 cases omitted
     p_alu_aux(p_SRA, i_SRA))(s1, rimm, s2, word, i)

where p_AND is the opcode pattern, i_AND is the instruction constructor, and likewise for the rest of them. Here again we make use of a higher-level "or" (||₅) operator to factor out the common arguments to the auxiliary predicate.

Our decode-generator is a 3200-line ML program that operates directly on SLED specifications. Since it generates a large portion of our safety policy, it ought to be considered trusted code (along with the SLED specifications). We feel that this is a small enough program that it can be thoroughly and convincingly debugged into correctness. Furthermore, its output is human-readable and only a constant factor larger (between 2x and 3x) than the original SLED specification, so the output can easily be inspected and debugged directly. The program currently does not share any code with the New Jersey Machine Code Toolkit, although the front-end code and some of the analysis that the two programs perform could be shared. We plan to investigate an integration of the two tools in the future.

4.4   The Decode and Step Relations
We are finally in a position to present the decode relation for the toy machine (see figure 6). After all the instruction predicates have been emitted, the decode-generator creates a predicate for the top-level token (i.e. instr in the
case of the toy spec). This predicate is the disjunction of all the instruction predicates (modulo factoring as described above). Decode is then defined in terms of this predicate. Figure 6 also shows the step relation for the toy machine. It is a predicate mapping the machine state (r, m) ↦ (r′, m′) by requiring the existence of an instruction i, a register bank r″, and an integer size such that location r(pc) in memory m decodes to i, updating the register bank r with the next pc produces r″, and finally instruction i safely maps (r″, m) to (r′, m′). Step models the meaning of a single instruction execution.
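Operationally, decode for the toy machine is a straightforward function from an address to a symbolic instruction. The Python sketch below is our own illustration of Fig. 6 (names and the tuple representation are ours); it fails, and thereby makes step stuck, exactly when no instruction pattern matches:

```python
NAMES = ['add', 'addi', 'load', 'store', 'jump', 'bgt', 'beq']

def decode(m, w):
    """Return ((opcode, f1, f2, f3), size) for the word at address w of
    memory m, or None when no pattern matches."""
    word = m.get(w)
    if word is None:
        return None                        # illegal fetch
    op = word >> 12
    if op >= len(NAMES):
        return None                        # no instruction pattern matches
    f1, f2, f3 = (word >> 8) & 0xF, (word >> 4) & 0xF, word & 0xF
    return (NAMES[op], f1, f2, f3), 1      # every toy instruction has size 1

assert decode({100: 0x1103}, 100) == (('addi', 1, 0, 3), 1)
```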
5   Machine Code Proofs
In this section we discuss some of the issues in generating the proofs used in the coinduction theorem (figure 3).

5.1   Hoare-Logic Predicates for Local Invariants
In Floyd-Hoare logic one tries to establish statements of the form P {S} Q, where S is a program statement and P, Q are logical formulae. P {S} Q means that if P holds, and S executes to completion, then Q holds. The logic specifies a set of axioms and inference rules that allow the deduction of statements of this form. The assignment axiom, for instance, states: ⊢ P[E/V] {V := E} P. In our framework we have no such axioms or rules; nevertheless, our preservation statement (in figure 3) bears a striking resemblance to a Hoare judgment. What is stated there is in essence equivalent to:

  Inv(r, m) {(r, m) ↦ (r′, m′)} Inv(r′, m′)                          (1)

i.e. if the invariant holds at (r, m), then it must hold at the new state (r′, m′) to which we were taken by the execution of some instruction (a single step). This similarity is of course no accident; we wish to exploit the well-understood theory of Hoare logic in order to construct the weakest preconditions that will allow us to prove preservation. Our invariant (as described in detail in previous work [5]) is in essence a disjunction of statements⁷ of the form

  r(pc) = n ∧ decode(m, n, i, 1) ∧ Iₙ(r, m)

where i is the instruction found at m(n) and Iₙ is the local invariant at n. To make the situation more concrete, assume that at r(pc) we find instruction add(r1, r2, r3) (r1 := r2 + r3), and that after completion of this instruction, we

⁷ The invariant presented in Appel and Felty [5] could grow exponentially large for certain kinds of programs. By the use of appropriate higher-order definitions we have remedied this problem and now produce invariants that are always linear in the number of program instructions and in the size of the compiler-inserted loop invariants (see subsection 5.2). The structure of the new invariant is beyond the scope of this paper. The discussion in this section is equally applicable to either kind of invariant.
wish predicate Q(r, m) to hold at the new state. The question now is what Iₙ should be in order for us to be able to prove equation 1, or equivalently the statement:

  r(pc) = n ∧ decode(m, n, i, 1) ∧ i = add(r1, r2, r3) ∧ Iₙ(r, m)
    ∧ (r, m ↦ r′, m′) → Q(r′, m′).                                   (2)

It is not difficult to see that one such Iₙ is Q(r, m)[(r2 + r3)/r1], i.e. the formula we get after applying the assignment axiom of Hoare logic to the postcondition Q(r, m). In building the invariant, though, we do not wish to perform substitution of terms, for two main reasons. Firstly, if we are not careful during substitution the local invariants could grow exponentially large.⁸ The goal is to end up with small proofs of safety; an exponentially large theorem is unlikely to have a small proof. Secondly, our logic does not contain axioms that express term substitution; such axioms would render the proof checker more complex and would defeat our efforts for a small TCB. Instead we view substitution as a relation between terms and express the notion concisely by higher-order definitions. These definitions allow us to express Iₙ(r, m) in terms of Q(r, m) in such a way that the size of local invariants stays constant, and substitution is completely avoided (at this stage). We define predicate let_upd in terms of upd (introduced in figure 4) as follows:

  let_upd(r, a, v, f) := ∀r′. upd(r, a, v, r′) → f(r′).
Predicate let_upd specifies that for any function r′ that updates r at a with value v, f(r′) must hold (we note that there is exactly one such r′; upd is deterministic). Using this predicate we can succinctly express the weakest precondition for each of our instructions. Below we show the term for the add instruction; compare hx_add with the semantics of add shown in figure 4.

  hx_add(d, s1, s2, post)(r, m) :=
    ∃sum. plus_mod16(r(s1), r(s2), sum) ∧ let_upd(r, d, sum, λr′. post(r′, m))
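Read functionally, hx_add is an ordinary predicate transformer. The sketch below is our own Python rendering (predicates are functions of (r, m)); it shows how the let_upd formulation pulls a postcondition back through an add without ever substituting into its text:

```python
def hx_add(d, s1, s2, post):
    """Weakest precondition of `add d, s1, s2` for postcondition `post`."""
    def pre(r, m):
        s = (r[s1] + r[s2]) % (1 << 16)    # plus_mod16
        r2 = {**r, d: s}                   # the unique r' with upd(r, d, s, r')
        return post(r2, m)                 # apply post to it: no substitution
    return pre

# Pull the postcondition "r1 = 5" back through add r1, r2, r3:
pre = hx_add(1, 2, 3, lambda r, m: r[1] == 5)
assert pre({1: 0, 2: 2, 3: 3}, {})        # 2 + 3 = 5: precondition holds
assert not pre({1: 5, 2: 2, 3: 4}, {})    # 2 + 4 = 6: it does not
```

Because post is applied to the updated state rather than textually rewritten, its size never grows, which is the point of the let_upd definition.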
The last argument to hx_add is the postcondition, and the return value is a predicate on (r, m) expressing the weakest precondition for the add. Our system currently generates all the predicate transformers (such as hx_add above) automatically for each instruction from the step relation of each machine. The program performing the translation is not part of the TCB; if there is a bug in it then we will simply fail to prove preservation. In proving preservation we will have to prove a statement very similar to that in equation 2 for each instruction in our program (but see section 5.2). Such statements can be proved once and for all as lemmas and applied each time the corresponding instruction is encountered. The extensive use of such
⁸ Consider for example the program (r2 := r1 + r1; r3 := r2 + r2; r4 := r3 + r3) with postcondition Q(r4). Its weakest precondition is Q(((r1 + r1) + (r1 + r1)) + ((r1 + r1) + (r1 + r1))). The size of the argument to Q grows by a factor of two for each assignment.
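The doubling in the footnote's example can be watched directly by computing term sizes under naive substitution; a quick sketch (the tuple term representation and helper names are ours):

```python
# Count term sizes under naive substitution for the footnote's
# program r2 := r1 + r1; r3 := r2 + r2; r4 := r3 + r3, processed
# backwards as in a weakest-precondition computation.
def subst(term, var, repl):
    if term == var:
        return repl
    if isinstance(term, tuple):               # a term ('+', a, b)
        return tuple(subst(t, var, repl) for t in term)
    return term

def size(term):
    return 1 if not isinstance(term, tuple) else sum(size(t) for t in term)

q = 'r4'                                      # the argument of Q(r4)
for var, rhs in [('r4', ('+', 'r3', 'r3')),
                 ('r3', ('+', 'r2', 'r2')),
                 ('r2', ('+', 'r1', 'r1'))]:
    q = subst(q, var, rhs)
    print(size(q))                            # 3, 7, 15: doubling
```

The final term has eight occurrences of r1, exactly the blow-up the higher-order let_upd definition avoids.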
Machine Instruction Syntax and Semantics in Higher Order Logic
21
lemmas will have a profound effect on the size of our safety proofs. We have currently proven such lemmas for all the instructions of the toy machine by hand. It is our intention to generate them and their proofs automatically from the step relation of each machine.

5.2 Domain Specific Proofs
Precondition strengthening (shown below) is another rule of Hoare logic.

P′ → P    P {S} Q
─────────────────    (3)
    P′ {S} Q
It states that if P {S} Q then one may replace P by a stronger predicate. This scenario occurs when we deal with program loops, as we explain next. Safety proofs for programs with loops require the use of loop invariants. Construction of loop invariants is not computable in general, so our theorem prover requires hints in the form of typing judgments at every location that is the target of a backward jump. At such locations though, our invariant-generator would have computed a local invariant In (this is the weakest precondition of the instruction; see subsection 5.1). We wish to replace In by Hn (the typing hint at that location) as the precondition of that instruction, but in order to do that we must establish that Hn → In. After that, a lemma application similar to rule 3 allows us to conclude Hn {S} Q. We are building a tactical theorem prover that understands the structure of types and is able to produce such proofs. The "linear size of proofs" discussed in this paper excludes the size of the strengthening proofs. These are not necessarily large, but a description of their structure is beyond the scope of this paper.

5.3 Decode Proof-Generation
Proofs involving the decode relation can be hard to generate since the definition itself is quite involved. Our decode-prover (see figure 2) is a Twelf logic program that analyzes the machine-code stream and not only discovers which instruction each integer represents but also produces a proof of this fact. More concretely, if integer n represents instruction i, we get a proof of statement instruction(n, i) from which a proof of decode(m, w, i, s) follows trivially (given a proof that n = m(w)). The decode-prover for the toy machine is about 600 lines of Twelf, currently hand written. We plan to generate the decode-prover itself from the SLED specification of each machine. Note that the decode-prover is not part of the TCB; any bug in it will simply produce an invariant from which it will be impossible to show preservation.
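The flavor of the decode-prover's output can be illustrated with a toy disassembler that returns, along with each instruction, a re-encoding that a checker can compare against the original word. The 16-bit field layout below is invented for illustration and is not the toy machine's actual encoding:

```python
# Toy decode: split a 16-bit word into opcode and register fields,
# returning the instruction plus a trivially checkable certificate.
def decode(n):
    op = (n >> 12) & 0xF
    r1, r2, r3 = (n >> 8) & 0xF, (n >> 4) & 0xF, n & 0xF
    instr = ('add', r1, r2, r3) if op == 1 else ('unknown', n)
    # re-encode: a checker verifies instruction(n, instr) by equality
    cert = (op << 12) | (r1 << 8) | (r2 << 4) | r3
    return instr, cert == n

print(decode(0x1123))   # (('add', 1, 2, 3), True)
```

The real decode-prover emits an LF proof term rather than a Boolean, but the structure is the same: a fact instruction(n, i) plus evidence a simple checker can replay.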
6 Related Work
There has been a large amount of work in the area of proofs of machine language programs using both first order [14] and higher order logics [15][16]. Some of this
work was focused on proving the correctness of the compiler or the code generator (see for instance [13]). For a historical survey see Calvert [18]. The practice of proving the Hoare rules as lemmas (see subsections 5.1 and 5.2) in an underlying logic is widespread in the program-verification community [15][16][17]. Two pieces of work are most closely related to ours.

Wahab [15] is concerned with correctness (not just safety) of assembly language programs. He defines a flow-graph language expressive enough to describe sequential machine-code programs (he deals with the Alpha AXP processor). Substitution is a primitive operator and the logic contains rules detailing term equality under substitution. He proves the Hoare-logic rules as theorems and uses abstraction in order to massage the code stream and get shorter correctness proofs. The translation from machine code to the flow-graph language does not go through a "decode" relation. Also, the use of substitution as a primitive makes this approach unsuitable for our purposes since it complicates the TCB.

Boyer and Yu [14] formally specify a subset of the MC68020 microprocessor within the logic of the Boyer-Moore Theorem Prover [19], a quantifier-free first-order logic with equality. Their specification of the step relation is similar to ours (they also include a decode relation) but in their approach these relations are functions. The theorem prover they use allows them to "run" the step function on concrete data (i.e. once the step function is specified they automatically have a simulator for the CPU). Their logic, albeit first-order, appears to be larger than ours, mainly because of its wealth of arithmetic operators (decoding can be done directly from the specification). Also their machine descriptions are larger than ours; their description of the 68020 subset is about 128K bytes while our description of the Sparc is less than half that size.
Admittedly, the Motorola chip is much more complex than the Sparc, but we suspect that most of the size difference is attributable to our extensive use of factoring, facilitated by higher-order logic.
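The remark about "running" the step function is worth spelling out: when step is a function on states rather than a relation, iterating it is already a simulator. A sketch under our own toy encoding (instructions stored pre-decoded, mod-16 arithmetic):

```python
# A functional step relation doubles as a CPU simulator.
def step(state):
    r, m, pc = state
    op, d, s1, s2 = m[pc]                 # instructions are pre-decoded
    if op == 'add':
        r = dict(r, **{d: (r[s1] + r[s2]) % 16})
    return r, m, pc + 1

def run(state, n):
    for _ in range(n):
        state = step(state)
    return state

prog = {0: ('add', 'r1', 'r2', 'r3'), 1: ('add', 'r4', 'r1', 'r1')}
r, _, _ = run(({'r1': 0, 'r2': 3, 'r3': 4, 'r4': 0}, prog, 0), 2)
print(r['r1'], r['r4'])   # 7 14
```

A relational step, by contrast, supports this only through a separate execution mechanism, which is part of the trade-off discussed above.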
7 Conclusion and Future Work
We have shown how higher-order logic can be used to succinctly describe the syntax and semantics of machine instructions, in a manner that preserves the natural factoring of each architecture. Our step relation formally captures the notion of a single instruction execution. It consists mainly of two pieces: (1) the decode relation that specifies the syntax of machine instructions, and (2) axioms describing the semantics of each instruction by predicates mapping machine states to machine states. The decode relation is generated automatically from existing compiler tools. Large parts of the safety proof involving decode can be generated completely automatically. We explained how to build Hoare-logic predicate transformers from our step relation in order to simplify the construction of the global invariant, and how lemmas can be used to minimize the size of safety proofs involving this invariant. The system is implemented in Twelf [4] and all theorems have been mechanically checked.
We are building a PCC system that will be used to generate safety proofs for many different architectures. Building all the pieces of figure 2 for each machine would be a daunting and unrewarding task. We instead intend to generate most of the prover components shown in figure 2 completely automatically. Since the decode-prover is in essence a machine-code disassembler, we intend to generate it directly from the decode relation of each machine, or alternatively from each machine's SLED specification. Note that the decode-prover not only disassembles but also builds proofs involving decode. The invariant-generator is again machine-instruction dependent and can also be generated directly from decode (we already generate the predicate transformers expressing the weakest precondition for each instruction automatically from step). It is our intention to automatically generate the Hoare-logic lemmas (of subsection 5.1) along with their proofs from step, since there will be a large number of them and their proofs tend to be rather long. The proof of preservation (see figure 3) requires an inversion lemma for decode. We have not proved this lemma for any machine yet, but we expect the proof to be mundane and long (linear in the size of the instruction set). Our plan is to generate these proofs from decode. Finally, we are working on a tactical theorem prover that will fill in parts of the proofs involving compiler-inserted invariants at locations of backward branches (see subsection 5.2).
References

1. George Necula. Proof Carrying Code. In The 24th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 106-119, New York, January 1997. ACM Press.
2. George Ciprian Necula. Compiling with Proofs. PhD thesis, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, September 1998.
3. Frank Pfenning. Logic Programming in the LF Logical Framework. In Gérard Huet and Gordon Plotkin, editors, Logical Frameworks, pages 149-181. Cambridge University Press, 1991.
4. Frank Pfenning and Carsten Schürmann. System Description: Twelf - A Meta-Logical Framework for Deductive Systems. In the 16th International Conference on Automated Deduction. Springer-Verlag, July 1999.
5. Andrew Appel and Amy Felty. A Semantic Model for Types and Machine Instructions for Proof-Carrying Code. In the 27th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL '00), January 2000.
6. R. Wahbe, S. Lucco, T. Anderson, and S. Graham. Efficient Software-Based Fault Isolation. In Proc. 14th ACM Symposium on Operating System Principles, pages 203-216, New York, 1993. ACM Press.
7. Tim Lindholm and Frank Yellin. The Java Virtual Machine Specification. Addison Wesley, 1997.
8. Norman Ramsey and Mary Fernandez. The New Jersey Machine-Code Toolkit. In Proceedings of the 1995 USENIX Technical Conference, pages 289-302, New Orleans, LA, Jan. 1995.
9. Norman Ramsey and Mary Fernandez. Specifying Representations of Machine Instructions. ACM Transactions on Programming Languages and Systems, 19(3):492-524, May 1997.
10. SPARC International, Inc. The SPARC Architecture Manual, Version 8. Prentice-Hall, Inc., 1992.
11. Thierry Coquand and Gérard Huet. The Calculus of Constructions. Information and Computation, 76(2/3):95-120, February/March 1988.
12. M. J. C. Gordon and T. F. Melham (editors). Introduction to HOL: A Theorem Proving Environment for Higher Order Logic. Cambridge University Press, 1993.
13. R. Milner and R. Weyhrauch. Proving Compiler Correctness in a Mechanized Logic. Machine Intelligence, 7:51-70, 1972.
14. Robert S. Boyer and Yuan Yu. Automated Correctness Proofs of Machine Code Programs for a Commercial Microprocessor. In the 11th International Conference on Automated Deduction, pages 416-430. Springer-Verlag, 1992.
15. M. Wahab. Verification and Abstraction of Flow-Graph Programs with Pointers and Computed Jumps. Technical Report, University of Warwick, Coventry, UK.
16. M. Gordon. A Mechanized Hoare Logic of State Transitions. In A Classical Mind: Essays in Honour of C. A. R. Hoare, pages 143-159. Edited by A. W. Roscoe. Prentice-Hall, 1994.
17. M. Gordon. Mechanizing Programming Logics in Higher Order Logic. In Current Trends in Hardware Verification and Automated Theorem Proving, pages 387-439. Edited by G. Birtwistle and P. A. Subrahmanyam. Springer-Verlag, 1989.
18. David William John Stringer-Calvert. Mechanical Verification of Compiler Correctness. Ph.D. thesis, University of York, 1998.
19. Robert S. Boyer and J Strother Moore. A Computational Logic Handbook. Academic Press, 1988.
Proof Generation in the Touchstone Theorem Prover

George C. Necula¹ and Peter Lee²

¹ University of California, Electrical Engineering and Computer Science Department, Berkeley, CA 94720, USA
[email protected]
² Carnegie Mellon University, School of Computer Science, Pittsburgh, PA 15213, USA
[email protected]
Abstract. The ability of a theorem prover to generate explicit derivations for the theorems it proves has major benefits for the testing and maintenance of the prover. It also eliminates the need to trust the correctness of the prover, at the expense of trusting a much simpler proof checker. However, it is not always obvious how to generate explicit proofs in a theorem prover that uses decision procedures whose operation does not directly model the axiomatization of the underlying theories. In this paper we describe the modifications that are necessary to support proof generation in a congruence-closure decision procedure for equality and in a Simplex-based decision procedure for linear arithmetic. Both of these decision procedures have been integrated using a modified Nelson-Oppen cooperation mechanism in the Touchstone theorem prover, which we use to produce proof-carrying code. Our experience with designing and implementing Touchstone is that proof generation has a relatively low cost in terms of design complexity and proving time, and we conclude that the software-engineering benefits of proof generation clearly outweigh these costs.
1
Introduction
There are several reasons why a theorem prover ought to produce easily checkable derivations of the formulas it proves. First, that way the soundness of the theorem prover does not have to be trusted, since it is reduced to the soundness of a much simpler proof checker. This allows theorem-proving tasks to be delegated to anonymous or even untrusted parties, such as remote proving servers, without loss of confidence in the result. On the software-engineering side, the testing and maintenance of a proof-generating theorem prover can be simplified considerably at the cost of implementing a simple proof checker. Our initial motivation for developing such a theorem prover was to assist with the generation of proof-carrying code [Nec97], in which an explicit proof of safety is attached to mobile code to allow a code receiver to verify easily the compliance of the code with a safety policy.

D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 25-44, 2000. © Springer-Verlag Berlin Heidelberg 2000
26
George C. Necula and Peter Lee
The complexity of proof generation in a theorem prover depends on the prover design. For example, a simple theorem prover can be written as an interpreter for a logic program consisting of a transcription of axioms and inference rules. In fact, the first implementation of proof-carrying code used the Elf [Pfe94] system to search for proofs when the logic was expressed as an LF signature, in the style described in [Pfe91]. For such a theorem prover it is a simple bookkeeping task to record the proof as the sequence of the inference rules used on the successful search path. The problem is complicated somewhat in theorem provers based on decision procedures, such as PVS [ORS92] or Simplify [DLNS98], because of the indirect relationship between the decision algorithm, sometimes described in terms of graphs [Sho81] or matrices [Nel81], and the axiomatization of the theories involved.

In this paper we describe an extension of the Nelson-Oppen cooperating-decision-procedures model to support proof generation, and then we show how to implement proof generation in a congruence closure decision procedure for equality and in a Simplex-based decision procedure for linear arithmetic. We implemented these decision procedures along with a few others in the Touchstone theorem prover that we use in our proof-carrying code experiments. One noteworthy feature of our implementation is that proofs of intermediate subgoals are generated lazily and only if they turn out to be on the successful proof search path. With this optimization the overhead of proof generation is a 30% increase in the size of the prover source code and a 15% increase in proving time.

Proof generation or logging appears in various forms in other theorem provers as well. In LCF-style tactic-based provers (e.g. Isabelle [Pau94] and HOL [Gor85]) the lack of decision procedures allows a simple implementation of proof logging in the form of a trace of the successful proof search path.
In theorem provers that do use decision procedures (e.g., PC-Nqthm [BM79], PVS [ORS92], Simplify [DLNS98]) most often the prover records only the user input and the invocations of the decision procedures, to allow batch-mode proof playback. This means that the implementations of the decision procedures must be trusted since they are also part of the proof checker.

In addition to Touchstone, a select number of other theorem provers combine decision procedures for efficiency and proof generation for assurance. One of them is the Stanford Validity Checker [SD99], which uses a different set of decision procedures and the Shostak method for integrating decision procedures instead of the Nelson-Oppen method discussed here. A more closely related result is Boulton's integration [Bou93,Bou95] of a fully-expansive implementation of the Nelson-Oppen method in the HOL theorem prover [Gor85]. While we used some of the same techniques as Boulton (such as the lazy generation of proof objects [Bou92]), our work is different from Boulton's in two respects. Boulton chooses to use a version of Fourier-Motzkin elimination for deciding linear arithmetic formulas, in order to simplify the task of generating proofs ([Bou93], page 80). We have opted for a more complex but apparently more efficient decision procedure based on the Simplex algorithm. A second difference is that Boulton uses a functional programming style in order to
implement easily the required undo feature of decision procedures. We decided to use an imperative style and program the undo feature explicitly in order to have better control over memory usage. As a result, we were able to implement the undo feature for the Simplex algorithm at very small cost by not actually reverting the data structures to their original form but to another one that is equivalent. This, coupled with the modifications that are required to the linear-programming version of Simplex to make it usable in a Nelson-Oppen prover, complicates the proof-generation problem for Simplex. We show a solution to this problem in Section 3.2.

In addition to presenting the particular techniques that we use to generate proofs from decision procedures, a substantial part of this paper summarizes our experience in building and using the Touchstone theorem prover for producing proof-carrying code. We discuss both the additional programming-complexity cost of proof generation and the benefits of proof generation for debugging and maintaining the prover. In fact, we show that the ability to generate proofs, and thus to check easily each run of the prover, allowed us to use aggressive implementation techniques in order to gain efficiency. Our measurements show that Touchstone appears to be faster than Boulton's implementation of the Nelson-Oppen strategy in the HOL theorem prover. In order to achieve this we had to adopt a more aggressive imperative programming style, which led to a number of subtle design and programming errors that could have been avoided in a purely functional implementation. However, this did not turn out to be a reliability problem because the proof checker quickly pointed out our programming errors.
Considering, on one hand, the complexity of implementing proof generation and the run-time cost of synthesizing proofs, and, on the other hand, the number of design and implementation errors that were uncovered by proof checking during testing and maintenance, along with the added value of the theorem prover as a proof-carrying-code generator, we strongly advocate that theorem provers ought to generate easily checkable proofs.
2 Overview of the Touchstone Theorem Prover
Touchstone has a modular design based on a strategy for combining decision procedures first described by Nelson and Oppen [NO79]. The innovation in Touchstone lies in a modification of the Nelson-Oppen strategy to allow for proof-generating decision procedures and also in the techniques used to generate proofs in individual decision procedures. In this paper we discuss such techniques for the congruence closure decision procedure for equality and a Simplex-based decision procedure for linear arithmetic. Touchstone handles the fragment of first-order logic shown in Figure 1, where the languages of literals L and expressions E can be extended with additional operators or function symbols. There are two motivations for restricting ourselves to such a small subset of first-order logic formulas. First, this fragment is
Goals        G ::= L | ⊤ | G1 ∧ G2 | H ⊃ G | ∀x.G
Hypotheses   H ::= L | ⊤ | H1 ∧ H2 | H1 ∨ H2 | ∃x.H
Literals     L ::= E1 = E2 | E1 ≠ E2 | p(E1, . . . , En)
Expressions  E ::= n | E1 + E2 | f(E1, . . . , En) | · · ·

Fig. 1. The syntax of formulas handled by Touchstone.
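The Fig. 1 grammar transcribes directly into algebraic datatypes. A sketch in Python follows; the constructor names are ours, not Touchstone's, and the hypothesis language H would add the ∨ and ∃ cases analogously:

```python
# Transcription of the Fig. 1 grammar as immutable dataclasses.
from dataclasses import dataclass
from typing import Union

# Expressions  E ::= n | E1 + E2 | f(E1, ..., En)
@dataclass(frozen=True)
class Num:  n: int
@dataclass(frozen=True)
class Add:  e1: 'Expr'; e2: 'Expr'
@dataclass(frozen=True)
class App:  f: str; args: tuple
Expr = Union[Num, Add, App]

# Literals  L ::= E1 = E2 | E1 != E2 | p(E1, ..., En)
@dataclass(frozen=True)
class Eq:   e1: Expr; e2: Expr
@dataclass(frozen=True)
class Neq:  e1: Expr; e2: Expr
@dataclass(frozen=True)
class Pred: p: str; args: tuple

# Goals  G ::= L | true | G1 /\ G2 | H > G | forall x. G
@dataclass(frozen=True)
class Top:    pass
@dataclass(frozen=True)
class Conj:   g1: 'Goal'; g2: 'Goal'
@dataclass(frozen=True)
class Imp:    h: 'Goal'; g: 'Goal'
@dataclass(frozen=True)
class Forall: x: str; g: 'Goal'

# the goal  x = y /\ (x != y > true)
g = Conj(Eq(App('x', ()), App('y', ())),
         Imp(Neq(App('x', ()), App('y', ())), Top()))
print(isinstance(g, Conj))   # True
```

An inversion procedure over such a datatype needs only one case per constructor, which is what makes the "Inversion" module described below so simple.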
sufficient for expressing verification conditions for programs whose loop invariants and function pre/postconditions are themselves restricted to the language H of hypotheses. This is true in all applications to date of proof-carrying code where, in fact, we currently use only conjunctions of literals as hypotheses. Secondly, this fragment of intuitionistic logic has the convenient property that all inference rules are invertible and thus we can use a very simple yet complete inversion proof-search procedure without any disjunctive or existential choices. In essence, the hard part of the proving task in this fragment of logic lies with the decision procedures that handle goal literals. The prover can be extended to handle more logical connectives all the way to higher-order hereditary Harrop formulas following, for example, the strategies described in [Mil91] or [MNPS91]. A decision procedure for a given theory T in Touchstone knows how to decide whether a set of literals entails another literal, in the case when all of the literals involved contain only function symbols from T . Most decision procedures in Touchstone are implemented in terms of satisfiability procedures that can detect when a set of literals is unsatisfiable. In practice, goal formulas contain literals from multiple theories and, although necessary, it is not sufficient to have decision procedures for these isolated theories. Furthermore, combining decision procedures is not as straightforward as it might seem. To illustrate this point consider the theory Q of rational numbers with the free symbols +, −, ≥ and the numerals along with the usual axioms of rational arithmetic. Consider also the theory E with one uninterpreted unary function symbol “f”. The satisfiability problems for each of these theories considered separately were solved long ago by Fourier for Q and by Ackermann for E [Ack54]. 
Consider now the following goal from the combined theory Q + E:¹

f(f(x) − f(y)) ≠ f(z) ∧ y ≥ x ∧ x ≥ y + z ∧ z ≥ 0    (1)
Informally, to demonstrate that the above set of literals is not satisfiable, we would first use the two literals in the middle to infer in Q that "0 ≥ z" and then the last literal to demonstrate that "z = 0" and hence also that "x = y". Then, we use the congruence rule of E to infer that "f(x) = f(y)". Then we move again in Q to prove that "f(x) − f(y) = z" and then back to E to prove that "f(f(x) − f(y)) = f(z)". This allows E to detect the contradiction with the first literal and to declare that the set of literals is not satisfiable. This example demonstrates that, in general, the decision procedures must interact in a non-obvious way to detect unsatisfiability.

¹ This example is taken from [Nel81].
[Figure 2 shows the prover's block structure: a control core containing the Inversion, Dispatch, and Subgoal modules takes a goal and produces a proof, exchanging assertions, commands, contradictions, and subgoals with the attached convex and tactic-based satisfiability procedures.]

Fig. 2. The overall structure of the Touchstone theorem prover.
Nelson and Oppen show that it is enough for each decision procedure to broadcast only the contradictions and the equalities between variables that it discovers. This simple cooperation mechanism is shown in [Nel81] to be complete when the theories involved are convex. A theory is not convex if it has a set of literals that entails a proper disjunction of equalities between variables without entailing any single equality. For example, the theory Z of integer linear arithmetic is not convex since y = z + 1 ∧ y ≥ x ∧ x ≥ z entails x = y ∨ x = z.

The Nelson-Oppen architecture can be adapted to deal with non-convex theories by performing a case-split whenever a conjunction of literals entails a disjunction. Informally, the prover tries to guess which one of the disjuncts holds and asserts it to all decision procedures. If this does not lead to unsatisfiability then the next disjunct is tried. For this procedure to be correct there are additional technical requirements that the theories must satisfy, as explained in [Nel81].

The structure and operation of Touchstone are shown in Figure 2. Input goals are first broken into literal goals and assertions by the "Inversion" module. Hypothesis literals and negated goal literals are asserted along with their proofs to the "Dispatch" module, which essentially implements a broadcast medium between decision procedures. Each decision procedure receives either proved asserted literals or proved equalities of variables discovered by other decision procedures. Decision procedures can also discover a contradiction, which is then propagated to the "Inversion" module. The "Subgoal" module is discussed later.

As an optimization, equalities discovered and broadcast by decision procedures are not accompanied by an actual proof but only by a token that identifies the originator decision procedure. Proofs are produced on demand only if the equality is
actually used in generating a contradiction. This optimization is similar to the one described by Boulton [Bou92].

The "Inversion" module is fairly simple, mostly due to the limited fragment of first-order logic that it has to handle. Informally, goals are first broken using the appropriate introduction rules. When the goal is a literal its negation is asserted to the "Dispatch" module. The other sources of assertions are the left-hand sides of implications, which are broken using elimination rules into a sequence of literals to be asserted. As discussed before, the "Dispatch" module broadcasts the assertions received from the "Inversion" module to all decision procedures and expects them to return either a contradiction or a set of entailed equalities between variables. This proof procedure is complete for the fragment of logic that we consider here. The completeness at the level of literals is ensured by the correctness theorem for the Nelson-Oppen strategy for convex theories.

In order to generate proofs several changes have to be made:

– The "Inversion" module must keep track of the introduction rules that it uses while breaking up the goal and the elimination rules that it uses while breaking up the assertion.
– The "Dispatch" module and consequently all decision procedures must receive the asserted literals accompanied by a proof.
– A decision procedure that discovers an equality must also be able to produce a proof of that equality, possibly in terms of the proofs accompanying the assertions that it received previously.
– Furthermore, a decision procedure that discovers a contradiction must be able to exhibit a proof of falsehood.

All proof objects maintained in the system are represented as terms in the Edinburgh Logical Framework (LF) [HHP93], which conveniently allows higher-order proof representations, so we can use the simple LF type checker as a proof checker.
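The token-plus-on-demand-proof scheme can be sketched as a thunk that is forced only when the equality is actually used; the class and field names below are ours:

```python
# Lazy proof objects: a discovered equality carries a token naming
# the originating decision procedure plus a thunk; the proof term is
# built only if the equality ends up in a contradiction.
class LazyProof:
    built = 0                        # how many proofs were ever built

    def __init__(self, origin, build):
        self.origin = origin         # token, e.g. 'simplex'
        self._build = build
        self._proof = None

    def force(self):
        if self._proof is None:
            LazyProof.built += 1
            self._proof = self._build()
        return self._proof

eqs = [LazyProof('simplex', lambda: 'proof of x = y'),
       LazyProof('congruence', lambda: 'proof of u = v')]
# Only the first equality ends up on the successful path:
print(eqs[0].force(), LazyProof.built)   # proof of x = y 1
```

The second thunk is never forced, which is where the measured saving in proving time comes from.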
We decided to use an imperative implementation of the system so that each decision procedure can maintain state without having to pass it around, as would be necessary in a purely functional implementation. To ensure proper scoping of assertions and thus to maintain the soundness of the prover, the inversion module announces when decision procedures must "forget" certain assertions they have received. This is implemented by programming the "Inversion" module to issue pairs of snapshot and undo commands. These commands are broadcast to decision procedures by the "Dispatch" module. The intended semantics of the undo operation is that all decision procedures should adjust their internal data structures so that they do not reflect assertions that were made after the matching snapshot operation. This ensures that assertions are properly retracted from the system at the time of the undo.
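The simplest realization of these semantics, the assertion stack with markers used by Touchstone's backward-chaining procedures (Section 3), can be sketched as follows (interface names are ours):

```python
# Snapshot/undo via a stack of assertions with markers.
MARK = object()

class StackProcedure:
    def __init__(self):
        self.stack = []

    def assert_lit(self, lit, proof):
        # A real procedure would compare `lit` against the memorized
        # assertions here to detect equalities or a contradiction.
        self.stack.append((lit, proof))

    def snapshot(self):
        self.stack.append(MARK)

    def undo(self):
        # Pop everything asserted since the matching snapshot.
        while self.stack.pop() is not MARK:
            pass

dp = StackProcedure()
dp.assert_lit('x = y', 'p1')
dp.snapshot()
dp.assert_lit('y = z', 'p2')
dp.undo()
print(len(dp.stack))   # 1: only 'x = y' remains
```

Forward-chaining procedures instead log their destructive updates so that undo can replay them in reverse, as described in Section 3.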
2.1 The Subgoal Module
Touchstone deviates from the Nelson-Oppen architecture described in [Nel81] by allowing the use of decision procedures based on tactics. A tactic-based decision
procedure can, in addition to detecting contradictions and equalities between variables, announce subgoals that, if proved, would allow the decision procedure to detect a contradiction. Each such subgoal is announced to the "Subgoal" module along with a function that, given a proof of the subgoal, generates a proof of falsehood. We refer to this function as the proof transformer. The "Subgoal" module is used extensively in Touchstone for the implementation of various decision procedures for type checking, as explained in Section 4.

Touchstone attempts to prove tactic-generated subgoals only when the current proof is about to fail. If the "Inversion" module notices that no contradiction is announced after it asserts the negation of a goal literal, then it queries the "Subgoal" module for a list of subgoals that were announced by tactic-based decision procedures. The "Inversion" module considers these subgoals in turn, and if any one of them is proved it can use the associated proof transformer to generate the desired contradiction.

An added benefit of the "Subgoal" module is that non-convex decision procedures can be incorporated quite naturally. If a decision procedure cannot discover an equality but does discover a proper disjunction of equalities, it is in a position to announce a subgoal consisting of a proper conjunction of disequalities. This will lead the "Inversion" module to perform a case-split and to try to prove all the disequalities independently, which is exactly the behavior desired in the presence of non-convex theories.

This completes the description of the modules responsible for the control of the decision procedures. In the next section we describe some general principles for implementing proof-generating decision procedures and then we examine in more detail two of the decision procedures of Touchstone.
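The earlier non-convexity example for the theory Z can be checked by brute force over a small range: the constraints entail the disjunction x = y ∨ x = z but neither disjunct alone, which is exactly why a case-split is needed.

```python
# Brute-force check of the non-convexity example:
# y = z+1 /\ y >= x /\ x >= z entails x = y \/ x = z,
# but entails neither disjunct by itself.
from itertools import product

R = range(-5, 6)
models = [(x, y, z) for x, y, z in product(R, R, R)
          if y == z + 1 and y >= x and x >= z]

print(all(x == y or x == z for x, y, z in models))   # True
print(all(x == y for x, y, z in models))             # False
print(all(x == z for x, y, z in models))             # False
```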
3 Proof-Generating Decision Procedures
All decision procedures in Touchstone use the same internal representation of literals in the form of a global expression directed acyclic graph, or the E-DAG. The E-DAG contains a node for each unique subexpression of the goal formula. In addition to the global E-DAG, each decision procedure is free to maintain its own internal state. Each decision procedure is required to implement at least three functions:

– The assert function that given a literal and its proof asserts the literal to the decision procedure. This function should return a list of equalities that were discovered along with their proofs, or a contradiction along with a proof of falsehood. As a side-effect this function can also announce subgoals along with their proof transformers to the "Subgoal" module.
– The snapshot and undo functions, as described above.

There are at least a couple of ways to implement the snapshot and undo operations. The simplest way is for each decision procedure to maintain a stack of the input assertions. Each time a new assertion arrives it is considered with
eqid:    ⊢ E = E
eqsym:   E1 = E2 ⊢ E2 = E1
eqtr:    E1 = E2, E2 = E3 ⊢ E1 = E3
falsei:  E1 = E2, E1 ≠ E2 ⊢ ⊥
congr:   E1 = E1′, · · ·, En = En′ ⊢ f(E1, . . . , En) = f(E1′, . . . , En′)
Fig. 3. The axioms of the theory E of equality. respect to the assertions already memorized on the stack for the purpose of detecting equalities or announcing a contradiction. Each snapshot places a marker on the stack and each undo pops the stack up to the nearest marker. This simple strategy is typical of the decision procedures that use backward chaining. In Touchstone, the modular arithmetic and the typing decision procedures use this simple strategy. Another strategy is used by the forward-chaining decision procedures. These typically maintain internal data structures that reflect the current assertions in an internal form. As new assertions arrive they are internalized in the data structure and equalities are propagated. To be usable in Touchstone, such decision procedures must be able to revert their data structures to the state at the matching snapshot. Decision procedures can implement the undo operation by using non-mutable data structures or by maintaining a list of the destructive operations performed on the state so that they can be undone. The latter strategy is used in Touchstone by the congruence closure and the Simplex decision procedures. 3.1
Proof Generation in the Congruence Closure Algorithm
A central theory in any implementation of the Nelson-Oppen architecture is the theory of equality. The symbols of the theory E are "=" and "≠", along with any uninterpreted function symbols. The axioms of the theory are those shown in Figure 3. There is one congruence rule for each function symbol in the system. The theory E was first shown decidable by Ackermann [Ack54] by reducing the problem to that of constructing the congruence closure of a relation on a graph. If R is an equivalence relation over a set of terms, we say that two terms f(t1, ..., tn) and f(t′1, ..., t′n) are congruent if ti is related to t′i by R for all i = 1, ..., n. The congruence closure of a relation R is the smallest extension of R that is both an equivalence relation and relates all of its congruent terms. To see if a given equality "t = u" follows from a set of equalities R, we first construct the congruence closure R′ of R and then check whether t = u ∈ R′. This is the sense in which an algorithm for computing the congruence closure of a set of equalities can be at the base of a decision procedure for E. The implementation of congruence closure in Touchstone is a proof-generating extension of that described in [Nel81]. In addition to the E-DAG, the congruence closure algorithm uses its own internal data structures to represent the congruence closure of the current equality
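The decision method just described (compute the congruence closure, then test membership) can be sketched as follows. This is a Python illustration, not the prover's SML code; terms are encoded as nested tuples, and congruent pairs are found by a naive quadratic fixpoint.

```python
from itertools import combinations

def subterms(t, acc):
    """Collect t and all of its subterms into acc."""
    acc.add(t)
    if isinstance(t, tuple):                 # ('f', arg1, ...) is an application
        for arg in t[1:]:
            subterms(arg, acc)
    return acc

def congruence_closure(equalities, extra_terms):
    """Return a find() function whose classes are the congruence closure."""
    univ = set()
    for s, t in equalities:
        subterms(s, univ)
        subterms(t, univ)
    for t in extra_terms:
        subterms(t, univ)
    parent = {t: t for t in univ}            # union-find forest

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]    # path halving
            t = parent[t]
        return t

    def union(s, t):
        rs, rt = find(s), find(t)
        if rs != rt:
            parent[rs] = rt

    for s, t in equalities:
        union(s, t)
    changed = True
    while changed:                           # naive O(n^2) congruence propagation
        changed = False
        for s, t in combinations(univ, 2):
            if (isinstance(s, tuple) and isinstance(t, tuple)
                    and s[0] == t[0] and len(s) == len(t)
                    and find(s) != find(t)
                    and all(find(x) == find(y) for x, y in zip(s[1:], t[1:]))):
                union(s, t)                  # congruent, so merge the classes
                changed = True
    return find

# Does f(b) = c follow from R = {a = b, f(a) = c}?
find = congruence_closure([('a', 'b'), (('f', 'a'), 'c')],
                          [('f', 'b'), 'c'])
```

Here `find(('f', 'b')) == find('c')` holds, while `find('a') != find('c')`, so the procedure answers that f(b) = c follows but a = c does not.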
Proof Generation in the Touchstone Theorem Prover
assertions. Thus, a mapping root is maintained to map each subexpression to a representative for its equivalence class. A mapping forbid maps a class representative C to a set of nodes that are known to be distinct from C. Finally, the incoming equality and disequality assertions are stored on an undoStack along with their proofs. Additionally, whenever a congruence is discovered, the corresponding equality along with its proof is pushed on the undoStack. For the purposes of discussing the proof generation strategy we do not need to see the whole implementation, but just the invariant that it maintains. This invariant is shown below, with the notation class(a) denoting the set of nodes with the same root as a:

C1. root(a) = root(b) if and only if a ≡ b or there exist a1, ..., an+1 such that
  – a ≡ a1 and b ≡ an+1, and
  – (ai = ai+1, pfi) ∈ undoStack or (ai+1 = ai, pfi) ∈ undoStack, for all i = 1, ..., n.

C2. class(a) ∩ forbid(root(b)) ≠ ∅ if and only if there exist a′ ∈ class(a) and b′ ∈ class(b) such that (a′ ≠ b′, neqab) ∈ undoStack.

Based on this invariant we can define a function prfEq(a, b) that, given two expressions with the same root, produces a proof of their equality, and also a function mkEqContra(a, b, eqab) that, given two expressions as in Invariant C2 along with a proof eqab of their equality, produces a proof of falsehood, as shown below.

prfEq(a : node, b : node) =            /* root(a) ≡ root(b) */
  if a ≡ b then return eqid(a)
  let a1, ..., an+1 be as in Invariant C1
  pf′i = pfi          if (ai = ai+1, pfi) ∈ undoStack
  pf′i = eqsym(pfi)   if (ai+1 = ai, pfi) ∈ undoStack
  return eqtr(pf′1, eqtr(pf′2, ..., eqtr(pf′n−1, pf′n) ...))

mkEqContra(a, b, eqab) =               /* class(a) ∩ forbid(root(b)) ≠ ∅ */
  let a′ ∈ class(a), b′ ∈ class(b) such that (a′ ≠ b′, neqab) ∈ undoStack
  return falsei(eqtr(prfEq(a′, a), eqtr(eqab, prfEq(b, b′))), neqab)

The decision procedure operates as follows.
When an equality a = b is asserted, we first check the condition of Invariant C2 to see whether we have detected a contradiction, in which case we use mkEqContra to generate the proof of falsehood required for the Contradiction exception. Otherwise we push the asserted equality along with its proof on the undoStack and we merge the classes of a and b, updating the forbid sets accordingly to preserve the invariant. Finally, we check for newly introduced congruences. Any such congruence is both an equality to be announced to the other decision procedures and an input to a recursive invocation of the merge procedure. For this latter step we use the congr rule to generate the appropriate proof. When a disequality assertion a ≠ b along with its proof neqab is encountered, we first check whether a and b are equivalent. If they are then we announce
a Contradiction with the proof falsei(prfEq(a, b), neqab). Otherwise we just add (a ≠ b, neqab) to the undoStack and update the forbid sets of both a and b. Note that congruence closure is a convex decision procedure, and thus it does not need to create case splits. Also, as proven in [NO80] and [Nel81] for similar implementations, the congruence closure algorithm is a sound and complete decision procedure for E. The complexity of the algorithm is determined by the method used to discover the congruent pairs of nodes. In Touchstone this is done with a simple algorithm of complexity O(n²), where n is the number of nodes in the E-DAG. The complexity can be reduced to O(n log n) using the more complex strategy described in [DST80]. The proof generation extensions do not affect the algorithmic complexity. For more implementation details the reader is invited to consult [Nel81,Nec98].

3.2 Proof Generation in the Simplex Algorithm
Now we turn our attention to the decision procedure for the theory Z of integer numerals along with the operators {+, −, ≥, >}. As a notational convenience we also consider multiplication by integer numerals, which we write using the infix · operator. The decision problem for Z is essentially the problem of deciding whether one linear inequality is a consequence of several other inequalities. There are several decision procedures for Z in the literature. Some cover only special cases [AS80,Pra77,Sho81] while others attempt to solve the general case [Ble74,Nel81]. Here we consider a general decision procedure based on the Simplex algorithm for linear programming, as described in [Nel81]. Like most decision procedures for Z, Simplex is not complete, since it works essentially with rational numbers. However, Simplex is powerful enough to handle the kinds of inequalities that typically arise in program verification. For space reasons, we discuss here only the very basic properties of the Simplex algorithm used as a decision procedure, and we focus on the modifications necessary for proof generation. The reader is invited to consult [Nec98,Nel81] for more details and a complete running example.

The internal data structure used by the Simplex algorithm is the tableau, which consists of a matrix of rational numbers qij, with rows i ∈ 1..r and columns j ∈ 1..c, along with a vector of rational numbers qi0. All rows and columns in the tableau are owned by an expression in the E-DAG. We write R(i) to denote the owner of row i and C(j) to denote the owner of column j. The main property of the tableau is that each row owner can be expressed as a linear combination of column owners, as follows:

    R(i) ≃ qi0 + Σj=1..c qij · C(j)        (i = 1..r)        (2)

We use the ≃ notation to denote equality between symbolic expressions modulo the rules of the commutative group of addition.
The Simplex tableau as described so far encodes only the linear relationships between the owning expressions. The actual encoding of the input inequality assertions is by means of restrictions on the values that owning expressions can take. There are two kinds of restrictions that Simplex must maintain. A row or a column can be either +-restricted, which means that the owning expression can take only values greater than or equal to zero, or ∗-restricted, in which case the owning expression can only be equal to zero. To illustrate the operation of Simplex as a decision procedure for arithmetic, consider the task of detecting the contradiction in the following set of literals:

{1 − x ≥ 0, 1 − y ≥ 0, −2 + x + y ≥ 0, −1 + x − y ≥ 0}

When Simplex asserts an inequality, it rewrites the inequality as e ≥ 0 and introduces a new row in the tableau owned by e. To simplify the notation we use the names s1, ..., s4 for the left-hand sides of our four inequalities. After adding each row, Simplex attempts to mark it as +-restricted (since each input assertion comes with a proof that the owning expression is nonnegative). But in doing so the tableau might become unsatisfiable, so Simplex first performs Gaussian elimination (called pivoting in Simplex terminology) to increase the entry in column 0 to a nonnegative value. If it is possible to bring the tableau into a form where each +-restricted row has a nonnegative entry in column 0, then the tableau is satisfiable. One particular assignment that satisfies all +-restrictions is obtained by setting all variables or expressions owning the columns to zero. In the process of adding the third inequality from above, the tableau is as shown in position (a) below. The row owned by s3 cannot be marked as +-restricted because q30 = −2 < 0. To increase the value of q30 Simplex performs two pivot operations (eliminating x from s1 and y from s2) and brings the tableau into the state shown in position (b).
(a)        qi0    x    y
   s1+       1   −1    0
   s2+       1    0   −1
   s3       −2    1    1

(b)        qi0   s1+  s2+
   x         1   −1    0
   y         1    0   −1
   s3        0   −1   −1

(c)        qi0   s1∗  s2∗
   x         1   −1    0
   y         1    0   −1
   s3∗       0   −1   −1

(d)        qi0   s1∗  s2∗
   x         1   −1    0
   y         1    0   −1
   s3∗       0   −1   −1
   s4       −1   −1    1
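The pivoting arithmetic behind tableaux (b) and (d) can be checked mechanically. The following Python sketch (the encoding is ours; exact arithmetic via the fractions module) eliminates x and y by substitution and recovers the tableau rows for s3 and s4.

```python
from fractions import Fraction as F

# The four left-hand sides as affine expressions over x and y; the key 1 holds
# the constant term.
s = {
    's1': {1: F(1),  'x': F(-1)},              # 1 - x
    's2': {1: F(1),  'y': F(-1)},              # 1 - y
    's3': {1: F(-2), 'x': F(1), 'y': F(1)},    # -2 + x + y
    's4': {1: F(-1), 'x': F(1), 'y': F(-1)},   # -1 + x - y
}

def substitute(expr, var, repl):
    """Replace `var` in `expr` by the affine expression `repl`."""
    out = {k: v for k, v in expr.items() if k != var}
    c = expr.get(var, F(0))
    for k, v in repl.items():
        out[k] = out.get(k, F(0)) + c * v
    return {k: v for k, v in out.items() if v != 0}

# The two pivots: from s1 = 1 - x take x = 1 - s1; from s2 = 1 - y take y = 1 - s2.
x_repl = {1: F(1), 's1': F(-1)}
y_repl = {1: F(1), 's2': F(-1)}

s3_row = substitute(substitute(s['s3'], 'x', x_repl), 'y', y_repl)
s4_row = substitute(substitute(s['s4'], 'x', x_repl), 'y', y_repl)
# s3_row is {'s1': -1, 's2': -1}: row s3 of tableau (b) is (0, -1, -1).
# s4_row is {1: -1, 's1': -1, 's2': 1}: row s4 of tableau (d) is (-1, -1, 1).
```

Both computed rows agree with the tableaux shown above.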
Now Simplex can safely mark the row s3 as +-restricted and, if it were not for the need to detect all equalities between variables, it could proceed to process the fourth inequality. In order to easily detect all equalities Simplex must maintain the tableau in a form in which all owners of rows and columns that are constrained to be equal to zero are marked as ∗-restricted. It turns out that in order to detect rows that must be made ∗-restricted it is sufficient to look for +-restricted rows that become maximized at 0. A row i is said to be maximized at qi0 when all its non-zero entries are either in ∗-restricted columns or are negative and are in +-restricted columns. One such row is s3 in the tableau (b). Since
s1 and s2 are known to be nonnegative and s3 is a negative linear combination of them, it follows that s3 ≤ q30 = 0. On the other hand, after processing the third assertion we know that s3 ≥ 0, which leads Simplex to decide that s3 is both +-restricted and maximized at 0; hence it must be equal to zero. Furthermore, all non-zero entries in the row of s3 must now be in ∗-restricted columns. Thus Simplex marks all of s1, s2, and s3 as ∗-restricted, bringing the tableau into the state shown above in position (c). In this state Simplex notices that the rows of x and y differ only in ∗-restricted columns (which are known to be zero), hence it announces that x is equal to y. Finally, when the fourth assertion is added, the tableau becomes as shown in position (d) above. Now Simplex notices that s4 is maximized at −1 and consequently that it is impossible to increase the value of q40 by pivoting. Since Simplex also knows that s4 ≥ 0 (from the fourth assertion), it has discovered a contradiction.

In the rest of this section we discuss how one can extract proofs of variable equalities and contradictions from the Simplex tableau. Before that, let us point out that the implementation of the undo feature in Simplex does not have to revert the tableau to the original state at the time of the matching snapshot. Instead, all it has to do is to remove the rows, columns, and restrictions that were added since then. This is much less expensive than a full undo.

The operation of the Simplex algorithm maintains the following invariants:

S1. If R(i) is restricted (either +-restricted or ∗-restricted) then there is a proof of R(i) ≥ 0. We refer to this proof as Proof(R(i)). Similarly for columns.
S2. If R(i) is +-restricted then qi0 ≥ 0.
S3. If R(i) is ∗-restricted then qi0 = 0, and qij ≠ 0 implies that C(j) is ∗-restricted.
S4. If C(j) is ∗-restricted then there exists a ∗-restricted row R(i) such that qij < 0 and qik ≤ 0 for all k > j. We say that row i restricts column j.
Only Invariant S1 is introduced solely for the purpose of proof generation; the others are necessary for the correct operation of Simplex. Simplex detects contradictions when it tries to add a +-restriction to a row i that is maximized at a negative value qi0. The key ingredient of a proof of falsehood in this case is a proof that R(i) ≤ qi0. Simplex constructs this proof indirectly, by first constructing a proof that R(i) = qi0 + E and then a proof that E ≤ 0. Furthermore, Simplex chooses E to be the expression constructed as the linear combination of column owners specified by the entries in row i. If we ignore the ∗-restricted columns, then all of the non-zero entries in the maximized row i are negative and are in columns j that are +-restricted. For each such column, according to Invariant S1, there is a proof that C(j) ≥ 0. To construct these proofs Simplex uses the inference rules shown below:

  arith:   E1 = E2, provided E1 ≃ E2
  sfalse:  Ri ≥ 0, Ri = q + E, and E ≤ 0 entail ⊥, provided q < 0
  geqadd:  E1 ≤ 0 and E2 ≥ 0 entail E1 + q · E2 ≤ 0, provided q ≥ 0
mapRow(i) =
  Φ ← ∅
  foreach k = 1..c such that qik < 0 do Φ(C(k)) ←+ qik
  foreach k = 1..c such that qik > 0 do Φ ← mapCol(k, qik, Φ)
  return Φ

mapCol(j, q, Φ) =
  let row i be the restrictor of j as in Invariant S4 (qij < 0)
  Φ(R(i)) ←+ q/qij
  foreach k ≠ j such that qik < 0 do Φ(C(k)) ←+ −q · qik/qij
  foreach k such that qik > 0 do Φ ← mapCol(k, −q · qik/qij, Φ)
  return Φ
Fig. 4. Extracting coefficients from the Simplex tableau.
The sfalse rule is used to generate contradictions as explained above. Its first hypothesis is obtained directly from the proof of the incoming inequality R(i) ≥ 0. The second hypothesis is constructed directly using the rule arith, whose side condition holds because of the main tableau representation invariant. Finally, the third hypothesis is constructed by repeated use of the geqadd rule, once for each element of the negative linear combination of restricted column owners, as read from row i of the tableau. It is a known fact from linear algebra that a contradiction is entailed by a set of linear inequalities if and only if a false inequality involving only numerals can be constructed as a positive linear combination of the original inequalities. Thus these rules are not only necessary for Simplex but also sufficient for any linear arithmetic proof procedure. The situation is somewhat more complicated due to the presence of ∗-restricted columns, which might contain strictly positive entries in a maximized row (as is the case in the row s4 of tableau (d) shown before). To solve this complication we must also be able to express every ∗-restricted column as a negative linear combination of restricted owners. Simplex uses the two functions mapRow and mapCol shown in Figure 4 to construct negative linear combinations for a maximized row or a ∗-restricted column. In Figure 4 the notation Φ denotes a map from expressions to negative rational factors. All integer numerals n are represented as the numeral 1 with coefficient n. We use ∅ to denote the empty map. The operation Φ(E) ←+ q updates the map Φ, increasing the coefficient of E by q. The reader is invited to verify, using the Simplex invariants, that if mapRow is invoked on a maximized row, and if mapCol is invoked on a ∗-restricted column with a positive factor q, then the resulting map Φ contains only restricted expressions with negative coefficients. The termination of mapCol is ensured by Invariant S4.
The reader can verify that by running mapRow(4) on the tableau (d) we obtain that s4 ≃ −1 − 2 · s1 − 1 · s3, which in turn says that the negation of the fourth inequality can be verified by multiplying the first inequality by 2 and adding the third inequality.
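The coefficient extraction of Figure 4 is easy to reproduce. The following Python sketch (the dictionary encoding and names are ours) runs it on tableau (d) of the example:

```python
from fractions import Fraction as F

# Tableau (d) of the running example: rows owned by x, y, s3, s4; columns owned
# by the *-restricted s1 and s2.  rows[owner] = (q_i0, {column owner: q_ij}),
# with zero entries omitted.
rows = {
    'x':  (F(1),  {'s1': F(-1)}),
    'y':  (F(1),  {'s2': F(-1)}),
    's3': (F(0),  {'s1': F(-1), 's2': F(-1)}),
    's4': (F(-1), {'s1': F(-1), 's2': F(1)}),
}
# Invariant S4: the *-restricted row that restricts each *-restricted column.
restrictor = {'s1': 's3', 's2': 's3'}

def add(phi, owner, q):
    phi[owner] = phi.get(owner, F(0)) + q       # the update written Φ(E) <-+ q

def map_col(j, q, phi):
    i = restrictor[j]
    entries = rows[i][1]
    add(phi, i, q / entries[j])                 # entries[j] < 0: negative factor
    for k, qik in entries.items():
        if k == j:
            continue
        if qik < 0:
            add(phi, k, -q * qik / entries[j])
        elif qik > 0:
            map_col(k, -q * qik / entries[j], phi)
    return phi

def map_row(i):
    phi = {}
    for k, qik in rows[i][1].items():
        if qik < 0:
            add(phi, k, qik)
        elif qik > 0:
            map_col(k, qik, phi)
    return phi
```

Running `map_row('s4')` returns `{'s1': -2, 's3': -1}`, i.e., s4 ≃ −1 − 2 · s1 − 1 · s3.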
The Simplex equality proofs are generated using the inference rule seq shown below:

  seq:  E1 − E2 = E, E ≤ 0, E2 − E1 = E′, and E′ ≤ 0 entail E1 = E2

Consider for example the case of two rows i1 and i2 whose entries are distinct only in ∗-restricted columns. We temporarily add to the tableau a new row r whose entries are qi1j − qi2j. Since row r is maximized at 0 we can use mapProof to produce the first two hypotheses of seq, just as we did for sfalse. Then we negate the entries in r and use mapProof again to produce the last two hypotheses.
4 Lessons Learned while Building and Using Touchstone
In this section we describe our initial experience with building and using the Touchstone theorem prover. We programmed the theorem prover in the Standard ML of New Jersey dialect of ML. The whole project, including the control core along with the congruence closure and Simplex decision procedures, as well as decision procedures for modular arithmetic and type checking, consists of around 11,000 lines of source code. Of these, about 3,000 are dedicated solely to proof representation, proof generation, proof optimization (such as local reduction and turning proofs by contradiction into simpler direct proofs), and proof checking. The relatively small size of the proof-generating component in Touchstone can be explained by the fact that most heuristics and optimizations that are required during proving are irrelevant to proof generation. Take for example the Simplex decision procedure. Its implementation has over 2000 lines of code, a large part of which encodes heuristics for selecting the best sequence of pivots. The proof-generating part of Simplex is a fairly straightforward reading of the tableau once a contradiction has been found, as shown in Figure 4. One important design feature of proof generation in Touchstone is that proofs are produced lazily, only when a contradiction is found. This is similar in spirit to the lazy techniques described by Boulton [Bou92]. We do not generate explicit proofs immediately when we discover and propagate an equality. Instead we only record which decision procedure discovered the equality and later, if needed, we ask that decision procedure to generate an explicit proof of the equality. This lazy approach to proof generation means that the time consumed for generating proofs is small when compared to the time required for proving, since in most large proving tasks a large part of the time is spent exploring unsuccessful paths.
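The lazy strategy can be illustrated with a small Python sketch (all names are ours): each discovered equality carries a thunk that can rebuild its proof, and only the thunks on the path to a contradiction are ever forced.

```python
# Sketch of lazy proof generation: a thunk is forced only if the equality it
# belongs to ends up participating in a contradiction.
class Lazy:
    def __init__(self, build):
        self.build, self.cache = build, None

    def force(self):
        if self.cache is None:
            self.cache = self.build()        # build the explicit proof on demand
        return self.cache


built = []                                   # records which proofs were actually built
def expensive_proof(name):
    built.append(name)                       # stands in for real proof construction
    return ('proof', name)

equalities = [('x', 'y', Lazy(lambda: expensive_proof('eq-x-y'))),
              ('u', 'v', Lazy(lambda: expensive_proof('eq-u-v')))]

# Only the first equality lies on the path to the contradiction, so only its
# proof is ever constructed; the second thunk is never forced.
contradiction_proof = ('falsei', equalities[0][2].force())
```

After this runs, `built` contains only `'eq-x-y'`: the work of generating the second proof was never performed.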
Our experiments with producing type safety proofs for assembly language show that only about 15% of the time is spent on proof generation. It was obvious from the beginning of the project that there would be a design and coding complexity cost to be paid for the proof-generating capability of the prover. We accepted this cost initially because we needed a mechanical way to build the proofs required for our proof-carrying code experiments. We did not anticipate how much this feature would actually simplify the building, testing, and maintenance of the theorem prover. Indeed, we estimate that the ability to
cross-check the operation of the prover during proof generation saved us many weeks or maybe months of testing and debugging. One particular aspect of the theorem prover that led to many design and programming errors was that all decision procedures and the control core must be incremental and undoable. This is complicated by decision procedures that perform a pseudo-undo operation in the interest of efficiency. For example, the Simplex decision procedure does not revert the tableau to the exact state it was in at the last snapshot operation, but only to an equivalent state obtained simply by deleting some rows and columns. In the presence of such decision procedures the exact order of intermediate subgoals and discovered equalities depends on all previously processed subgoals. This defeats a common technique for isolating a bug by reducing the size of the goal in which it is manifested. It often happens that by eliminating seemingly unrelated subgoals the error disappears, because the order in which equalities are entailed is changed. Proof generation as a debugging mechanism continues to be valuable even as the prover matures. While we observe a decrease in the number of errors, we also observe a sharp increase in the average size of the proving goals that trigger an error. Indeed, the size of these goals is now such that it would have been impractical to debug the prover just by manual inspection of a trace. Secondly, proof generation gave us significant assistance with the upgrade and maintenance of the prover, as a broken invariant is promptly pointed out by the proof-generation infrastructure. We should also point out that for maximum assurance we checked proofs using the same small proof checker that we also use for proof-carrying code. However, we noticed that most prover errors surfaced as failures by the proof-generating infrastructure to produce a proof, and only a very small number of bugs resulted in invalid proofs.
A lesson that can be drawn here is that the software engineering advantages of proof generation in theorem provers can be obtained just by going through the process of generating a proof, without actually having to record the proof.

4.1 Using Touchstone to Build Proof-Carrying Code
The main motivation for building Touchstone was for use in a proof-carrying code system. A typical arrangement for the generation of proof-carrying code is shown in Figure 5. Note that on the right-hand side we have the untrusted components used to produce PCC, while on the left-hand side we have the trusted infrastructure for checking PCC. This figure applies to the particular case in which the safety policy consists of type safety in the context of a simple first-order type system with pointers and arrays. The process starts with a source program written in a type-safe subset of the C programming language. The source program is given to a certifying compiler that, in addition to producing optimized machine code, also generates function specifications and loop invariants based on types. We will return to the issue of modeling types in first-order logic shortly. The code augmented with loop invariants is passed through a verification condition generator (VcGen) that produces a verification condition (VC).
[Fig. 5 block diagram. Components shown: Source, Certifying Compiler, Code, Invar, Spec, Safety policy, Logic, VC Generator, VC, Theorem Prover (Proof Producer), Proof, and Proof Checker, arranged on a Server side and a Client side; one side is trusted, simple, and fast, the other untrusted, complex, and slow.]
Fig. 5. A typical arrangement for building proof-carrying code.

The VC is provable only if the code satisfies the loop invariants and the specifications, and only if all memory operations are safe. To encode proof obligations for memory safety in a general way, VcGen emits formulas of the form "saferd(E)" to say that the memory address denoted by the symbolic expression E is readable, and "safewr(E,E')" to say that the value denoted by E' is writable at the address denoted by E. The verification condition is then passed to Touchstone, which proves it and returns a proof encoded in a variant of the Edinburgh LF language. This allows an LF type checker on the receiver side to fully validate the proof with respect to the verification condition. The key idea behind proof-carrying code is that the whole process of producing a safe executable can be split into a complex and slow untrusted component on one side and a simple and fast trusted safety checker on the other. It was therefore a key requirement that the proving and proof-checking tasks be separated as shown in the picture. For more details on the system described here the reader is invited to consult [Nec98].

What remains to be discussed are the details of the logical theory that is used to model types and to derive memory safety. We use a theory of first-order types with constructors for types and a typing predicate. A few of the terms and formulas used, along with three of the inference rules, are shown below:

  A : array(T, L), I ≥ 0, and I < L entail A + 4 · I : ptr(T)
  A : ptr(pair(T1, T2)) entails A + 4 : ptr(T1)
  A : ptr(T) entails saferd(A)
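For illustration only, here is a small Python sketch of backward chaining over these three rules, in the style of the type-checking tactic discussed below. The tuple encoding of terms, the rule names, and the discharge of the arithmetic side conditions by lookup among the assumptions (instead of calling Simplex) are all our simplifications.

```python
def walk(t, sub):
    """Chase variable bindings."""
    while isinstance(t, str) and t in sub:
        t = sub[t]
    return t

def unify(a, b, sub):
    """Return an extended substitution unifying a and b, or None."""
    a, b = walk(a, sub), walk(b, sub)
    if isinstance(a, str) and a.startswith('?'):
        s = dict(sub); s[a] = b; return s
    if isinstance(b, str) and b.startswith('?'):
        s = dict(sub); s[b] = a; return s
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            sub = unify(x, y, sub)
            if sub is None:
                return None
        return sub
    return sub if a == b else None

# Each rule: (conclusion, premises).  Strings starting with '?' are variables;
# a real implementation would rename them freshly at every application.
RULES = [
    (('saferd', '?a'), [('hastype', '?a', ('ptr', '?t'))]),
    (('hastype', ('add', '?b', ('mul', 4, '?i')), ('ptr', '?u')),
     [('hastype', '?b', ('array', '?u', '?l')),
      ('geq', '?i', 0), ('less', '?i', '?l')]),
    (('hastype', ('add', '?c', 4), ('ptr', '?v')),
     [('hastype', '?c', ('ptr', ('pair', '?v', '?w')))]),
]

def solve(goals, facts, sub, depth):
    if not goals:
        yield sub
        return
    if depth == 0:
        return
    first, rest = goals[0], goals[1:]
    for fact in facts:                       # close the goal with an assumption
        s = unify(first, fact, sub)
        if s is not None:
            yield from solve(rest, facts, s, depth)
    for concl, prems in RULES:               # or chain backward through a rule
        s = unify(concl, first, sub)
        if s is not None:
            yield from solve(list(prems) + rest, facts, s, depth - 1)

def provable(goal, facts):
    return next(solve([goal], facts, {}, 6), None) is not None

facts = [('hastype', 'A', ('array', 'int', 'L')),   # A : array(int, L)
         ('geq', 'I', 0), ('less', 'I', 'L')]       # I >= 0, I < L
ok = provable(('saferd', ('add', 'A', ('mul', 4, 'I'))), facts)
```

Here `ok` is true: saferd(A + 4 · I) is derived via the pointer rule and the array-indexing rule, while an unrelated goal such as saferd(B) is not provable from these assumptions.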
The first rule says that a pointer to an element of type T can be obtained by indexing into an array whose element type is T. In this rule the array type is
dependent on the length of the array, and the element size is considered to be 4 bytes. The second rule is used to reason about tuple destructors. Finally, the last rule is the only rule that introduces the saferd predicate, basically saying that in this safety policy the readability of memory locations is dictated by types. To handle this theory of types in Touchstone we make heavy use of the "Subgoal" module. We implemented a tactic that does backward chaining on the rules of the theory. A central element of the tactic is a heuristic that finds likely valid formulas of the form "A : ptr(T)" by looking at the form of A and at the current assertions (originating from function preconditions and loop invariants in this case). While the other elements of Touchstone have some completeness properties, the tactic for the theory of types need only be powerful enough to "understand" the code produced by our compiler. And since the compiler starts with an obviously well-typed source program, the only difficulties can be introduced by optimizations. In fact, our compiler is very aggressive in optimizing array-bounds checks, and hence the theorem prover must itself be able to prove all the arithmetic facts that the compiler has discovered and proved. As a result the Simplex satisfiability procedure is exercised quite heavily in this setting. Our experiments with Touchstone in this setting have shown several interesting facts. The separation of tasks does indeed achieve a separation of complexity and running cost. In terms of code size the untrusted components are about four times larger than the trusted ones. In particular the Touchstone prover is four times larger than the proof checker. Furthermore, while Touchstone grows continuously as we incorporate more heuristics and better tactics, we found that the proof-checking component has remained largely unchanged over a couple of years. A sample of the experimental data that we collected is shown in Figure 6.
This table shows, for a few programs, the sizes of the verification condition generated and of the associated proofs, along with the time required for theorem proving and proof checking. The sizes are expressed in numbers of AST nodes and the timings in milliseconds. The measurements were performed on a DEC Alpha with a 21064 processor running at 175MHz. Notice that in these experiments proof checking is about an order of magnitude faster than theorem proving. At this point we would like to point out that our imperative implementation of the Nelson-Oppen prover appears to be much faster than the functional implementation of Boulton in the HOL prover, as described in [Bou93,Bou95]. We ran Touchstone on the 11 examples shown on page 94 of [Bou93]. All of these examples are very small, ranging from 9 to 43 AST nodes. While Boulton ran his measurements on a Sparcstation 2 with a 40MHz processor, we ran ours on an Alpha with a 175MHz processor. But even after multiplying the Touchstone running times by a factor of 5 to compensate for this difference, Touchstone is still faster by a factor ranging from 5 to 80 on these examples. There are several reasons behind the better performance of Touchstone. One of them is that Touchstone handles only a small fragment of first-order logic and can thus use a very fast goal-directed proof procedure. HOL extended with the
Program   Lines   VC size (AST nodes)   Proving time (ms)   Proof size (AST nodes)   Check time (ms)
bcopy        16          82                    25                    64                      4
edge         88         224                   143                   528                     15
kmp          67         483                   108                   344                      9
qsort       142        1444                   127                  1770                     16
sharpen     153         420                   257                   477                     23
simplex     303        7055                  1272                  3912                    120
unpack      259        5759                  1912                  1750                     92

Fig. 6. Experimental data collected using the Touchstone prover in the context of generating PCC for type safety.
Nelson-Oppen cooperative decision procedure, on the other hand, uses a more general proof procedure based on conversion to disjunctive normal form. Another possible reason for the disparity in performance is that while HOL uses a satisfiability procedure for arithmetic based on Fourier-Motzkin variable elimination, Touchstone uses an efficient implementation of Simplex. Boulton explains that Fourier-Motzkin elimination was chosen in HOL because of the simplicity of proof generation. We show that even an efficient version of Simplex can be made fully expansive, although almost surely at a larger programming cost. Finally, we suspect that another reason that makes Touchstone faster than Boulton's implementation of the Nelson-Oppen strategy is the use of an imperative implementation style. By making very judicious use of memory we observe very little garbage collection during proving. In contrast, Boulton's implementation uses a functional programming style, leading to a very elegant implementation of a crucial part of the prover: the undo mechanism. Our implementation of undo is instead quite a bit more complex and is responsible for many of the bugs that we discovered. In quite a few cases we forgot to undo certain changes, thus leading to unsoundness. This did not turn out to be a big problem because the proof checking mechanisms quickly pointed out our errors. At other times the undo procedure mistakenly removed too many assertions, leading to situations in which the prover was not able to prove predicates that it was intended to prove. We were helped in this situation by the fact that we were using Touchstone essentially to verify that a number of optimizations performed by our compiler preserve type safety. By design of both the compiler and the prover, every failed proof attempt points to either a compilation bug or a completeness bug in the prover. This is how we found a very large number of bugs in the compiler and some in the prover.
5 Conclusion
We describe in this paper an implementation of a Nelson-Oppen theorem prover enhanced with the ability to generate easily checkable proof objects for all the predicates it proves. Our implementation, just like the one described by Nelson, was designed with efficiency in mind, in order to handle verification conditions of non-trivial programs. This led us to use more complex algorithms and implementation techniques than a related functional implementation in the context of the HOL theorem prover. The added complexity of our design seems to pay off in terms of efficiency. But it also led us to make many subtle design and programming errors that threatened both the soundness and the completeness of our prover. Fortunately, soundness was never in real danger because of Touchstone's proof-generating ability, which enables us to use a simple proof checker to validate the correctness of each run. As a general conclusion, we feel that the benefits of proof generation in theorem provers clearly outweigh the additional cost of designing and implementing proof-generating decision procedures.
Acknowledgments We would like to thank the anonymous referees for making many valuable suggestions that have improved this paper substantially.
References

Ack54. Wilhelm Ackermann. Solvable Cases of the Decision Problem. Studies in Logic and the Foundations of Mathematics. North-Holland, Amsterdam, 1954.
AS80. Bengt Aspvall and Yossi Shiloach. A polynomial time algorithm for solving systems of linear inequalities with two variables per inequality. SIAM Journal on Computing, 9(4):827–845, 1980.
Ble74. W.W. Bledsoe. The Sup-Inf method in Presburger arithmetic. Technical report, University of Texas Math Dept., December 1974.
BM79. Robert Boyer and J. Strother Moore. A Computational Logic. Academic Press, 1979.
Bou92. Richard J. Boulton. A lazy approach to fully-expansive theorem proving. In International Workshop on Higher Order Logic Theorem Proving and its Applications, pages 19–38, Leuven, Belgium, September 1992. North-Holland. IFIP Transactions.
Bou93. Richard John Boulton. Efficiency in a Fully-Expansive Theorem Prover. PhD thesis, University of Cambridge, December 1993.
Bou95. Richard J. Boulton. Combining decision procedures in the HOL system. In 8th International Workshop on Higher Order Logic Theorem Proving and its Applications, volume 971 of Lecture Notes in Computer Science, pages 75–89. Springer-Verlag, September 1995.
George C. Necula and Peter Lee
DLNS98. David L. Detlefs, K. Rustan M. Leino, Greg Nelson, and James B. Saxe. Extended static checking. SRC Research Report 159, Compaq Systems Research Center, 130 Lytton Ave., Palo Alto, December 1998.
DST80. Peter J. Downey, Ravi Sethi, and Robert E. Tarjan. Variations on the common subexpression problem. Journal of the ACM, 27(4):758–771, 1980.
Gor85. Michael Gordon. HOL: A machine oriented formulation of higher-order logic. Technical Report 85, University of Cambridge, Computer Laboratory, July 1985.
HHP93. Robert Harper, Furio Honsell, and Gordon Plotkin. A framework for defining logics. Journal of the Association for Computing Machinery, 40(1):143–184, January 1993.
Mil91. Dale Miller. A logic programming language with lambda-abstraction, function variables, and simple unification. Journal of Logic and Computation, 1(4):497–536, September 1991.
MNPS91. Dale Miller, Gopalan Nadathur, Frank Pfenning, and Andre Scedrov. Uniform proofs as a foundation for logic programming. Annals of Pure and Applied Logic, 51:125–157, 1991.
Nec97. George C. Necula. Proof-carrying code. In The 24th Annual ACM Symposium on Principles of Programming Languages, pages 106–119. ACM, January 1997.
Nec98. George C. Necula. Compiling with Proofs. PhD thesis, Carnegie Mellon University, September 1998. Also available as CMU-CS-98-154.
Nel81. Greg Nelson. Techniques for program verification. Technical Report CSL-81-10, Xerox Palo Alto Research Center, 1981.
NO79. Greg Nelson and Derek Oppen. Simplification by cooperating decision procedures. ACM Transactions on Programming Languages and Systems, 1(2):245–257, October 1979.
NO80. Greg Nelson and Derek C. Oppen. Fast decision procedures based on congruence closure. Journal of the Association for Computing Machinery, 27(2):356–364, April 1980.
ORS92. S. Owre, J. M. Rushby, and N. Shankar. PVS: A prototype verification system. In Deepak Kapur, editor, 11th International Conference on Automated Deduction (CADE), volume 607 of Lecture Notes in Artificial Intelligence, pages 748–752, Saratoga, NY, June 1992. Springer-Verlag.
Pau94. L. C. Paulson. Isabelle: A generic theorem prover. Lecture Notes in Computer Science, 828:xvii + 321, 1994.
Pfe91. Frank Pfenning. Logic programming in the LF logical framework. In Gérard Huet and Gordon Plotkin, editors, Logical Frameworks, pages 149–181. Cambridge University Press, 1991.
Pfe94. Frank Pfenning. Elf: A meta-language for deductive systems (system description). In Alan Bundy, editor, 12th International Conference on Automated Deduction, LNAI 814, pages 811–815, Nancy, France, June 26–July 1, 1994. Springer-Verlag.
Pra77. Vaughan R. Pratt. Two easy theories whose combination is hard. Unpublished manuscript, 1977.
SD99. Aaron Stump and David L. Dill. Generating proofs from a decision procedure. In A. Pnueli and P. Traverso, editors, Proceedings of the FLoC Workshop on Run-Time Result Verification, Trento, Italy, July 1999.
Sho81. Robert Shostak. Deciding linear inequalities by computing loop residues. Journal of the ACM, 28(4):769–779, October 1981.
Wellfounded Schematic Definitions

Konrad Slind

Cambridge University Computer Laboratory
Abstract. A program scheme looks like a recursive function definition, except that it has free variables ‘on the right hand side’. As is well-known, equalities between schemes can capture powerful program transformations, e.g., translation to tail-recursive form. In this paper, we present a simple and general way to define program schemes, based on a particular form of the wellfounded recursion theorem. Each program scheme specifies a schematic induction theorem, which is automatically derived by formal proof from the wellfounded induction theorem. We present a few examples of how formal program transformations are expressed and proved in our approach. The mechanization reported here has been incorporated into both the HOL and Isabelle/HOL systems.
Program schemes form the foundation of an interesting class of program development methodologies which advocate the incremental instantiation of abstract programs, preserving important properties all the while, until a suitable concrete program results. There has been a great deal of work on program transformation; for background see [6,18,33,23,26]. Although program transformation theories are widely applied informally, work on program transformation in mechanized proof assistants is less abundant, in spite of the evident interest in using such systems as platforms for program development and transformation. One reason for this may be that, currently, such environments (e.g., [13,25,22,3]) tend to be based on logics of total functions, and it is not clear how a program scheme can be regarded as a total function, since many schemes allow instantiations for which the resulting function is not total. In spite of this, we will describe a simple and general technique by which schemes may be defined such that totality is enforced.
1 Formal Basis
We work in a higher order logic commonly called HOL [13]; a description of the logic may be found in the Appendix. We adopt the common approach of using the native functions of the logic to represent programs; recursive programs are modelled with the use of a wellfounded recursion theorem. There are several equivalent definitions of wellfoundedness [28]; the following asserts that the relation R : α → α → bool is wellfounded iff every non-empty set has an R-minimal element.

D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 45–63, 2000.
© Springer-Verlag Berlin Heidelberg 2000
Definition 1 (Wellfoundedness).

WF(R) ≡ ∀P. (∃w. P w) ⊃ ∃min. P min ∧ ∀b. R b min ⊃ ¬P b.

From this definition, the following general induction and recursion theorems can be proved (the interested reader can find details in [30]):

Theorem 2 (Wellfounded Induction).

WF(R) ⊃ (∀x. (∀y. R y x ⊃ P y) ⊃ P x) ⊃ ∀x. P x.

Theorem 3 (Wellfounded Recursion).

∀f R M. (f = WFREC R M) ⊃ WF(R) ⊃ ∀x. f(x) = M (f | R, x) x.

WFREC : (α → α → bool) → ((α → β) → (α → β)) → α → β can be thought of as a 'controlled' fixpoint operator; since it is only used to prove Theorem 3, we omit its quite obfuscatory definition. Also used in the statement of Theorem 3 is a ternary operator that restricts a function to a certain set of values.¹

Definition 4 (Restriction). (f | R, y) ≡ λx. if R x y then f x else Arb.

Theorem 5. R x y ⊃ (f | R, y) x = f x.
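Theorem 3 and the restriction operator have a direct computational reading: the functional M only ever gets access to the recursive function on arguments that are R-smaller than the current one. This is a sketch of that reading in Python (our own rendering, not part of the HOL development): Arb is modelled by an exception, and `wfrec` checks the restriction at run time rather than discharging it by proof.

```python
# wfrec(r, m, x) computes the fixpoint of the functional m, but the
# recursive calls handed to m are restricted as in Definition 4: a call
# at y is only permitted when r(y, x) holds; otherwise the 'arbitrary'
# value Arb is modelled by raising an exception.
class Arb(Exception):
    pass

def wfrec(r, m, x):
    def restricted(y):
        if r(y, x):
            return wfrec(r, m, y)
        raise Arb
    return m(restricted, x)

# Factorial via wfrec: the relation "a < b on the naturals" is
# wellfounded, and every recursive call at n-1 is smaller than n, so
# the restriction is never violated.
def fact(n):
    return wfrec(lambda a, b: 0 <= a < b,
                 lambda self, k: 1 if k == 0 else k * self(k - 1),
                 n)
```

A functional whose recursive calls are not R-decreasing would hit the `Arb` branch here; in the logic, the corresponding application simply denotes the fixed but arbitrary value.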
2 The Technique
We shall present our approach with the hand derivation of an example; the automation of the technique will be taken up in Section 3. Consider the following description of the 'while' construct familiar from imperative programming:

While s = if B s then While (C s) else s.

This is a syntactic specification of a class of functions determined by the parameters B and C. To start the derivation, the description is translated into a functional:

λWhile s. if B s then While (C s) else s.    (1)

¹ In set theory, or logics of partial functions, function restriction may result in a partial function. In a logic of total functions, such as HOL, a restriction of a function is still a total function, giving a fixed but arbitrary value when applied outside of the restriction.
Instantiating M in the recursion theorem with (1) yields

WF R, f = WFREC R (λWhile s. if B s then While (C s) else s)
` ∀x. f(x) = if B x then (f | R, x) (C x) else x.    (2)
By assuming ∀s. B s ⊃ R (C s) s (we discuss the origin of this assumption in Section 3), it is possible to derive
WF R, ∀s. B s ⊃ R (C s) s,
f = WFREC R (λWhile s. if B s then While (C s) else s)
` f x = if B x then f (C x) else x.    (3)
The assumptions WF R and ∀s. B s ⊃ R (C s) s are the 'termination conditions' of (3). Now we apply the Principle of Constant Definition to define While. This is the central step in our method. The indefinite description operator (ε) is applied to choose a wellfounded relation R meeting the termination conditions. Notice also that the distinction between parameters (B and C) and arguments (s) is supported by the different binding sites in the definition: parameters are arguments to the definition itself, while the original argument s is a bound variable in the functional.

While ≡ λB C. WFREC (εR. WF R ∧ ∀s. B s ⊃ R (C s) s)
              (λWhile s. if B s then While (C s) else s).    (4)
Eliminating (4) from the hypotheses of (3) yields

WF (εR. WF R ∧ ∀s. B s ⊃ R (C s) s),
∀s. B s ⊃ (εR. WF R ∧ ∀s. B s ⊃ R (C s) s) (C s) s
` While B C s = if B s then While B C (C s) else s.

Finally, assuming WF(R) and ∀s. B s ⊃ R (C s) s and then applying the Select Axiom allows the conclusion

WF R, ∀s. B s ⊃ R (C s) s
` While B C s = if B s then While B C (C s) else s.    (5)

Remark 6. The derived equation (5) looks like a normal higher-order function; however, had we tried to define While as a higher order function in the standard manner, i.e., with no parameters, then B, C and s would be treated as arguments, and thus bound in the functional, and the termination conditions would be equivalent to the proposition

∃R. WF(R) ∧ ∀B C s. R (B, C, C s) (B, C, s),
which is not provable since C could be taken to be the identity function, but there is no wellfounded relation R such that R x x.

Remark 7. Our treatment of parameters is not specific to wellfounded recursion: it works for any fixpoint operator. In particular, for any fix satisfying the well-known equation fix(M) = M(fix(M)), it is merely a common subexpression elimination to get

∀g. (g = fix(M)) ⊃ ∀x. g x = M g x.

By abstracting free variables P1, . . . , Pk of M, this can be transformed to

` ∀g. (g P1 . . . Pk = fix(M)) ⊃ ∀x. g P1 . . . Pk x = M (g P1 . . . Pk) x.

With hindsight, the treatment of parameters in inductive definition packages such as those reported in [21,24,15] can be seen as concrete applications of this theorem.

2.1 Induction
It is well known that the wellfounded relation used to prove termination for a function can also be used to derive an induction theorem, in which the induction predicate is assumed to hold for the arguments to recursive calls. For ML-style pattern-matching recursion equations of the form

f(pat1) ≡ rhs1[f(a11), . . . , f(a1k1)]
  ...                                      (6)
f(patn) ≡ rhsn[f(an1), . . . , f(ankn)],

an induction theorem of the following form (where Γ(aij) is the context of recursive call f(aij)) can be derived by formal proof from Theorem 2:

(∀(Γ(a11) ⊃ P a11) ∧ . . . ∧ ∀(Γ(a1k1) ⊃ P a1k1) ⊃ P(pat1))
∧ . . . ∧
(∀(Γ(an1) ⊃ P an1) ∧ . . . ∧ ∀(Γ(ankn) ⊃ P ankn) ⊃ P(patn))
⊃ ∀v. P v.
The assumptions to this theorem will be the termination conditions of f, as explained in [31], where the automatic derivation of such induction theorems is described. An earlier treatment of the derivation of induction for functions in a simpler object language is described in [5].

It might seem to be problematic to derive induction for program schemes since the termination relation is not known; however, an appropriate induction theorem can still be derived: all that is required is to assume that a suitable termination relation exists. We demonstrate the idea by deriving the following induction theorem for the While function:

WF R, ∀s. B s ⊃ R (C s) s ` ∀P. (∀s. (B s ⊃ P (C s)) ⊃ P s) ⊃ ∀v. P v.    (7)

The derivation begins by assuming the antecedent of (7) and the termination conditions of the definition.

1.  ∀s. (B s ⊃ P (C s)) ⊃ P s                       Assume
2.  ∀s. B s ⊃ R (C s) s                             Assume
3.  [2] ` B s ⊃ R (C s) s                           ∀-elim (2)
4.  ∀y. R y s ⊃ P y                                 Assume
5.  [4] ` R (C s) s ⊃ P (C s)                       ∀-elim (4)
6.  [2, B s] ` R (C s) s                            Undisch (3)
7.  [4, 2, B s] ` P (C s)                           ⊃-elim (5) (6)
8.  [4, 2] ` B s ⊃ P (C s)                          ⊃-intro (7)
9.  [1, 4, 2] ` P s                                 ⊃-elim (1) (8)
10. [1, 2] ` (∀y. R y s ⊃ P y) ⊃ P s                ⊃-intro (9)
11. [1, 2] ` ∀s. (∀y. R y s ⊃ P y) ⊃ P s            ∀-intro (10)
In step 11, the antecedent of the wellfounded induction theorem (Theorem 2) has been derived, and a few further obvious steps deliver (7), as desired.

Remark 8. If the semantics of a Hoare triple {P} C {Q} are defined by

Hoare P C Q ≡ ∀s. P s ⊃ Q (C s)

then the following While rule for total correctness has an easy² proof by induction with (7):

WF R, ∀s. B s ⊃ R (C s) s `
Hoare (λs. P s ∧ B s) C P ⊃ Hoare P (While B C) (λs. P s ∧ ¬B s).
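Once its parameters are instantiated, the While of (5) computes like the ordinary while-loop of imperative programming. This is a sketch in Python of such an instantiation (our own illustration): unlike HOL, Python performs no totality check, so an instantiation violating the termination conditions simply diverges, which is precisely what the hypotheses WF R and ∀s. B s ⊃ R (C s) s rule out in the logic.

```python
# The While scheme: b and c play the role of the parameters B and C of
# the definition itself; s is the genuine argument. The scheme is tail
# recursive, so a loop renders it faithfully.
def while_(b, c, s):
    while b(s):
        s = c(s)
    return s

# Instantiation: repeatedly halve until the state drops to 1. The
# relation R n m = (n < m) on the naturals is wellfounded and B s
# implies R (C s) s, so the termination conditions would be met.
result = while_(lambda n: n > 1, lambda n: n // 2, 40)
```

Taking C to be the identity function, as in Remark 6, gives the expected non-terminating instance here.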
3 Automation
A useful level of support for deriving program schemes can be supplied by generalizing and automating the steps taken in the While example. The particular interface we have implemented takes as input recursion equations of the form given in (6) and performs the following steps:

1. Translates the equations into a functional F, using a pattern-matching translation based on those used in functional programming language implementations [2,20].

² The proof takes four tactic applications in a current version [14] of the Hol98 proof assistant.
2. Instantiates M in the recursion theorem with F.
3. Extracts termination conditions TC1(R), . . . , TCk(R) from the results of step 2, where R is a variable representing the wellfounded relation.
4. Computes the free variables P1 . . . Pi of F, then defines the constant denoting the desired function:

   f ≡ λP1 . . . Pi. WFREC (εR. WF(R) ∧ TC1(R) ∧ . . . ∧ TCk(R)) F.

   Two things are important here: (1) using the description operator to choose a suitable wellfounded relation meeting the results of step 3; and (2) making sure to separate parameters from arguments in the definition.
5. Eliminates the result of step 4 from the hypotheses of the result of step 2 (the instantiated recursion theorem).
6. Assumes each of WF(R), TC1(R), . . . , TCk(R) and then eliminates the description operator terms, via application of the Select Axiom. Now the desired termination constraints have been derived.
7. Derives the induction theorem from the termination conditions.
8. Returns the recursion equations and the induction theorem.
Fortunately, the algorithms of [30,31] generalize naturally to support steps 1 to 8. The key to automation is step 3, in which termination conditions are automatically extracted. This is accomplished by use of a special contextual rewriter, which attempts to rewrite the instantiated recursion theorem (coming from step 2) with the conditional rewrite rule for function restriction (Theorem 5). In searching for matches for this rule, the rewriter is essentially searching for every recursive call site in the original equations. The rewriter uses its stock of contextual rules to gather and discard context Γ as it makes its search; when a recursive call site (f | R, pati) (aij ) is found (in context Γ (aij )), the termination condition Γ (aij ) ⊃ R aij pati is captured by performing a small proof which stores the termination condition on the assumptions. After the rewriting process terminates, one is left with a theorem, the conclusion of which is the desired recursion equations, and the assumptions of which are the termination conditions (from which the induction theorem can be derived). An important point about these manipulations is that they all take place by deductive steps in the object logic, so the results are sound. Detailed descriptions of the algorithms, including their extension to mutual recursion, nested recursion, and higher order recursion can be found in [32]. To extend these algorithms to program schemes is particularly simple: all that need be done is to take care never to quantify scheme variables in any of the derivations. If there are no scheme variables, the algorithms perform exactly as in [30,31], so schemes are a smooth extension to the existing apparatus.
4 Formal Program Transformations
Program schemes are helpful for giving suitably abstract descriptions of classes of functions. A further application of schemes comes from using them as a basis
for program transformation: instances of schemes may be identified, provided applicability conditions are satisfied. There are various ways of representing program transformations formally; we choose to represent them simply as theorems (specifically, as constrained recursion equations). Proving a program transformation typically involves an application of the induction theorem for one of the program schemes being equated.

Example 9. The following scheme expresses a class of linear recursive programs:

linRec(x) ≡ if Atomic x then A x else Join (linRec (Dest x)) (D x).

Under certain conditions, instances of linRec are equal to corresponding instances of the following tail-recursive scheme, which uses an accumulating parameter:

accRec(x, u) ≡ if Atomic x then Join (A x) u
               else accRec (Dest x, Join (D x) u).

Intuitively, the recursive calls of linRec must get 'stacked up' somehow, waiting for deeper recursive calls to return. In contrast, calls to accRec need not be stacked. If the combination function Join is associative, then the implicit bracketing of the stacked recursive calls can be replaced with a single data value that gets modified and passed at each recursive call. We now formalize this intuition. The result of defining linRec is (we omit the induction theorem):

WF R, ∀x. ¬Atomic x ⊃ R (Dest x) x
` linRec D Dest Join A Atomic x =
    if Atomic x then A x
    else Join (linRec D Dest Join A Atomic (Dest x)) (D x),

and that for accRec is (we conjoin the induction theorem to the recursion equation):

WF R, ∀x. ¬Atomic x ⊃ R (Dest x) x
` (accRec D Dest A Join Atomic (x, u) =
     if Atomic x then Join (A x) u
     else accRec D Dest A Join Atomic (Dest x, Join (D x) u))
  ∧ ∀P. (∀x u. (¬Atomic x ⊃ P (Dest x, Join (D x) u)) ⊃ P (x, u)) ⊃ ∀v v1. P (v, v1).

The formal program transformation is then captured in the following theorem:

WF R,
∀x. ¬Atomic x ⊃ R (Dest x) x,
∀p q r. Join p (Join q r) = Join (Join p q) r
` ∀x u. Join (linRec D Dest Join A Atomic x) u = accRec D Dest A Join Atomic (x, u).
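The equivalence can be exercised on a concrete instantiation. This is a sketch in Python with parameter choices of our own: the scheme parameters become ordinary function arguments, Join is list append (which is associative, as the theorem requires), and the resulting instance reverses a list.

```python
# linRec: the linear recursion scheme of Example 9.
def lin_rec(atomic, a, dest, join, d, x):
    if atomic(x):
        return a(x)
    return join(lin_rec(atomic, a, dest, join, d, dest(x)), d(x))

# accRec: the tail-recursive scheme with an accumulating parameter u.
# Being tail recursive, it runs as a loop.
def acc_rec(atomic, a, dest, join, d, x, u):
    while not atomic(x):
        # RHS is evaluated before assignment, so d(x) sees the old x.
        x, u = dest(x), join(d(x), u)
    return join(a(x), u)

# Instantiation: Atomic = emptiness, Dest = tail, D = singleton head,
# Join = append. The instance computes list reversal.
atomic = lambda l: not l
a = lambda _l: []
dest = lambda l: l[1:]
d = lambda l: [l[0]]
join = lambda p, q: p + q

r_lin = lin_rec(atomic, a, dest, join, d, [1, 2, 3, 4])
r_acc = acc_rec(atomic, a, dest, join, d, [1, 2, 3, 4], [])
```

With u instantiated to the empty list, the theorem specializes to `r_lin == r_acc`, since Join of the empty list is the identity.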
Proof. Apply the induction theorem for accRec, then expand the definitions of linRec and accRec.

Example 10. The following scheme for binary recursion uses the parameters Left and Right to break the input into two parts on which to recurse:

binRec(x) ≡ if Atomic x then A x
            else Join (binRec (Left x)) (binRec (Right x)).

The result of the definition is (omitting the induction theorem)

WF R, ∀x. ¬Atomic x ⊃ R (Right x) x, ∀x. ¬Atomic x ⊃ R (Left x) x
` binRec Right Left Join A Atomic x =
    if Atomic x then A x
    else Join (binRec Right Left Join A Atomic (Left x))
              (binRec Right Left Join A Atomic (Right x)).

The example comes from Wand [33], who used paper and pencil, and has been treated in PVS [29]. In his development, Wand was interested in explaining how continuations give the programmer a representation of the runtime stack, and thus can act as a bridge in the transformation of non-tail-recursive functions to tail-recursive ones. In our development, we will avoid the continuation-passing intermediate representation (although it is simple for us to handle) and transform to tail recursion in one step.

Now we present a general tail recursion scheme for lists. In the definition, the parameter Dest : α → α list breaks the head h of the work list h :: t into a list of new work, which it prepends to t before continuing; hence, the tailRec scheme is quite general because the argument to the second tail call may increase in length by any finite amount. (Wand and Shankar only consider tail recursions in which the Dest parameter can produce two new pieces of work.)

tailRec ([ ], v) ≡ v
tailRec (h :: t, v) ≡ if Atomic h then tailRec (t, Join v (A h))
                      else tailRec (Dest h @ t, v).

The result of this definition is (including the induction theorem)

WF R,
∀v t h. ¬Atomic h ⊃ R (Dest h @ t, v) (h :: t, v),
∀v t h. Atomic h ⊃ R (t, Join v (A h)) (h :: t, v)
`
(tailRec Dest A Join Atomic ([ ], v) = v) ∧
(tailRec Dest A Join Atomic (h :: t, v) =
   if Atomic h then tailRec Dest A Join Atomic (t, Join v (A h))
   else tailRec Dest A Join Atomic (Dest h @ t, v)) ∧
∀P. (∀v. P ([ ], v)) ∧
    (∀h t v. (¬Atomic h ⊃ P (Dest h @ t, v)) ∧
             (Atomic h ⊃ P (t, Join v (A h))) ⊃ P (h :: t, v))
    ⊃ ∀v v1. P (v, v1).
We intend to prove an equivalence between binRec and tailRec, but the transformation seems to require the termination constraints for both binRec and tailRec to be satisfied. However, a bit of thought reveals that a useful fact about finite multisets can simplify matters, by allowing one constraint to be expressed in terms of the other.

Definition 11 (msetPred). Let m be a finite multiset and R : α → α → bool a relation on elements of m. The relation msetPred R is built by removing an x from m and replacing it with a finite multiset of elements, each of which is R-smaller than x.

Theorem 12. WF(R) ⊃ WF(msetPred R)

Proof. The classic(al) proof can be found in [8]; a recent constructive proof is described in [27, Chapter II].

Now we show how the termination condition of tailRec can be reduced to the (simpler) one of binRec:

WF R ∧ (∀h y. ¬Atomic h ∧ mem y (Dest h) ⊃ R y h) ⊃
∃R0. WF R0 ∧
     (∀h t v. ¬Atomic h ⊃ R0 (Dest h @ t, v) (h :: t, v)) ∧
     (∀h t v. Atomic h ⊃ R0 (t, Join v (A h)) (h :: t, v))

Proof. Assume WF R and ∀h y. ¬Atomic h ∧ mem y (Dest h) ⊃ R y h. R0 is a relation on pairs. The witness for R0 operates over the first projection of the pair, i.e., over lists, and maps a list into a multiset of the list elements. Since R is wellfounded, msetPred over the multiset is wellfounded and thus the witness is wellfounded. The remaining two conjuncts are both true, the first by assumption and the definition of msetPred, and the second by the definition of msetPred, since no elements are being put back into the multiset.

With this reduction, one can state and prove the following general theorem relating binary recursion and tail recursion. The essential insight is that the work list l of tailRec represents a linearization of the binary tree of calls of binRec. Thus going from left to right through the work list, invoking binRec and accumulating the results, should deliver the same answer as executing tailRec on the work list.
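The linearization intuition can be checked on a small instance before the formal proof. This is a sketch in Python with an instantiation of our own devising: trees are either an int (atomic) or a pair of subtrees, Join is addition, and Dest h produces the two-element work list [Left h, Right h].

```python
# binRec of Example 10: recurse on the two halves and join the results.
def bin_rec(atomic, a, left, right, join, x):
    if atomic(x):
        return a(x)
    return join(bin_rec(atomic, a, left, right, join, left(x)),
                bin_rec(atomic, a, left, right, join, right(x)))

# tailRec: a worklist loop; atomic items are folded into the
# accumulator v, non-atomic items are expanded by dest onto the list.
def tail_rec(atomic, a, dest, join, work, v):
    while work:
        h, *t = work
        if atomic(h):
            work, v = t, join(v, a(h))
        else:
            work, v = dest(h) + t, v
    return v

# Instantiation: sum the leaves of a binary tree.
atomic = lambda x: isinstance(x, int)
a = lambda x: x
left = lambda x: x[0]
right = lambda x: x[1]
dest = lambda x: [left(x), right(x)]   # linearize one binRec call
join = lambda p, q: p + q

t = ((1, 2), 3)
r_bin = join(0, bin_rec(atomic, a, left, right, join, t))
r_tail = tail_rec(atomic, a, dest, join, [t], 0)
```

The worklist passes through [((1,2),3)], [(1,2), 3], [1, 2, 3], accumulating 1, then 3, then 6, matching the binRec result joined with the initial accumulator 0.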
We formalize this left-to-right pass by the auxiliary function rev_itlist.³ Note how the Dest parameter of tailRec has been specialized with λx. [Left x, Right x].

WF R,
∀x. ¬Atomic x ⊃ R (Left x) x ∧ R (Right x) x,
∀p q r. Join (Join p q) r = Join p (Join q r)
` ∀l v0. rev_itlist (λtr v. Join v (binRec Right Left Join A Atomic tr)) l v0 =
         tailRec (λx. [Left x, Right x]) A Join Atomic (l, v0)

Proof. Induct with the induction theorem for tailRec. The base case is straightforward; the step case is also essentially trivial, since it only involves using the induction hypotheses and rewriting with the definitions of rev_itlist, tailRec, and binRec.

Finally, the desired program transformation

WF R,
∀x. ¬Atomic x ⊃ R (Left x) x ∧ R (Right x) x,
∀p q r. Join (Join p q) r = Join p (Join q r)
` ∀x v0. Join v0 (binRec Right Left Join A Atomic x) =
         tailRec (λx. [Left x, Right x]) A Join Atomic ([x], v0)

can be obtained by instantiating the work list l to comprise the initial item of work [x], and then reducing the definition of rev_itlist away.

Example 13. Now we derive a program transformation originally presented by Bird [4], and later mechanized by Shankar [29]. Consider a datatype btree of binary trees with constructors

LEAF : α btree
NODE : α btree → α → α btree → α btree.

The so-called catamorphism (iterator) for this type is

btreeRec LEAF v f ≡ v
btreeRec (NODE t1 M t2) v f ≡ f (btreeRec t1 v f) M (btreeRec t2 v f).

³ rev_itlist, also known as foldl to functional programmers, is defined as rev_itlist f [ ] v ≡ v and rev_itlist f (h :: t) v ≡ rev_itlist f t (f h v).
Most mechanizations of higher order logic automate such definitions; however, the so-called anamorphism (or unfold, or co-recursor) for this type has not been straightforward to define in these systems. Understanding the following definition of unfold : α → β btree may be eased by considering it as operating over an abstract datatype α which supports operations More : α → bool and Dest : α → α ∗ β ∗ α.

unfold x ≡ if More x then
             let (y1, b, y2) = Dest x in NODE (unfold y1) b (unfold y2)
           else LEAF.

The automatically computed constraints attached to the definition are the following:

WF R,
∀x y1 b y2. More x ∧ ((y1, b, y2) = Dest x) ⊃ R y2 x,
∀x y1 b y2. More x ∧ ((y1, b, y2) = Dest x) ⊃ R y1 x.

Notice that the mechanization is not currently smart enough to know that the two termination conditions share the same context. After some trivial manipulation to join the two termination conditions, the induction theorem for unfold is the following (omitting the hypotheses):

∀P. (∀x. (∀y1 b y2. More x ∧ ((y1, b, y2) = Dest x) ⊃ P y1 ∧ P y2) ⊃ P x) ⊃ ∀v. P v.    (8)

It is easy to generalize unfold to an arbitrary range type by replacing NODE and LEAF with parameters G and C:

fuse x ≡ if More x then
           let (y1, b, y2) = Dest x in G (fuse y1) b (fuse y2)
         else C.

The fusion theorem states that unfolding into a btree and then applying a structural recursive function to the result is equivalent to interweaving unfolding steps with the steps taken in the structural recursion. Thus two recursive passes over the data can be replaced by one:
WF R, ∀x y1 b y2. More x ∧ ((y1, b, y2) = Dest x) ⊃ R y1 x ∧ R y2 x
` ∀x C G. btreeRec (unfold Dest More x) C G = fuse C Dest G More x.
Proof. The proof is by induction using (8), followed by expanding the definitions of btreeRec, unfold, and fuse.
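The fusion theorem can likewise be run on a small instance. This is a sketch in Python with an instantiation of our own: More n = n > 0, Dest n = (n-1, n, n-1), the leaf value C = 0, and G summing a node's label with the results from its subtrees.

```python
# The catamorphism btreeRec, the anamorphism unfold, and their fusion.
LEAF = None  # represent LEAF as None, NODE as a triple (t1, m, t2)

def btree_rec(t, v, f):
    if t is LEAF:
        return v
    t1, m, t2 = t
    return f(btree_rec(t1, v, f), m, btree_rec(t2, v, f))

def unfold(more, dest, x):
    if more(x):
        y1, b, y2 = dest(x)
        return (unfold(more, dest, y1), b, unfold(more, dest, y2))
    return LEAF

# fuse interweaves one unfolding step with one step of the structural
# recursion, so the intermediate btree is never built.
def fuse(c, g, more, dest, x):
    if more(x):
        y1, b, y2 = dest(x)
        return g(fuse(c, g, more, dest, y1), b, fuse(c, g, more, dest, y2))
    return c

# Instantiation: from n, unfold a tree labelled with naturals and sum
# the labels; the fused version computes the sum in a single pass.
more = lambda n: n > 0
dest = lambda n: (n - 1, n, n - 1)
g = lambda l, m, r: l + m + r

two_passes = btree_rec(unfold(more, dest, 3), 0, g)
one_pass = fuse(0, g, more, dest, 3)
```

The termination conditions hold here because both components returned by Dest are smaller than the argument under the usual order on the naturals.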
5 Related Work
The paper by Huet and Lang [18] is an important early milestone in the field of program transformation. They worked in the LCF system, using fixpoint induction to derive program transformations. Program schemes were not defined; instead, transformations were represented via applications of the Y combinator, i.e., had the form

applicability conditions ⊃ Y F = Y G,

for functionals F and G. An influential aspect of the work was the use of second order matching to automate the application of program transformations.

Work using PVS has represented program schemes and transformations by theories parameterized over the parameters of the scheme and having as proof obligations the applicability conditions of the transformation [29,9,10]. To apply the program transformation, the theory must be instantiated, and the corresponding concrete proof obligations proved. In our technique, in contrast, the parameters of a scheme are arguments to the defined constant, and the proof obligations are constraints on the recursion equations and the induction theorem. Thus, theorems are used to represent both program schemes and program transformations. Instantiating a program transformation in our setting merely requires one to instantiate type variables and/or free term variables in a theorem. It remains to be seen if one representation is preferable to the other. In other ways, however, our approach seems to offer improved functionality:

1. Currently, our technique produces more general schemes, since termination conditions are phrased in terms of an arbitrary wellfounded relation, whereas termination relations in PVS are restricted to measure functions [22]. Similarly, a general induction theorem is automatically derived for each scheme in our setting, whereas the PVS user is limited to measure induction (or may alternatively derive a more general induction theorem 'by hand' from wellfounded induction).
2.
Our technique is more convenient because it automatically generates—by deductive steps—termination conditions for schemes. Taking the example of unfold, one doesn’t have to ponder the right constraints in our setting: they are delivered as part of the returned definition. In contrast, the definition of unfold in [29] requires expert knowledge of the PVS type system in order to phrase the right constraints on the Dest parameter. Since the termination conditions of a scheme constrain any program transformation that mentions the scheme, our approach should also ease the correct formulation of program transformations. 3. Our approach also works for mutually recursive schemes, which are not currently available in PVS. The paper of Basin and Anderson [1] has much in common with our work: for example, both approaches represent schemes and transformations by HOL theorems (Basin and Anderson call these rules). Their work differs from ours by focusing on relations (they are interested in modelling logic programs) rather
than recursive functions. They present two techniques: in the first, program schemes are not defined; instead, transformations are derived by wellfounded induction on the arguments of the specified recursive relations (the relations themselves are left as variables). In the second, a program scheme is represented by an inductively defined relation. The first approach suffers from lack of automation: termination constraints are not synthesized and induction theorems are not automatically derived. In contrast, their second approach requires no mention of wellfoundedness, and induction is automatically derived by the inductive definition package of Isabelle/HOL. In [11], Farmer treats the definition of recursive functions in a logic of partial functions. Schematic functions are represented in a similar manner to our approach, but the automation issues we tackle have not been explored. In the context of language design, Lewis et al. [19] use schemes to implement a degree of dynamic scoping in a statically scoped functional programming language. Their approach allows occurrences of a free variable, e.g., P, in the body of a program to be marked with special syntax, e.g., ?P. The program is then treated as being parameterized by all such variables. To instantiate P occurring in a program f by a ground value val, they employ a notation ‘f with ?P = val ’. Although their work is phrased using operational semantics and ours is denotationally based, there are many similarities. Finally, our approach gives a higher-order and fully formal account of the steadfast transformation idea of Flener et al. [12]. In contrast to their work, we need give no soundness proof since our transformations are generated by deductive steps in a sound logic.
6 Conclusions
We have shown how a very simple technique allows a smooth treatment of program schemes, their induction theorems, and program transformations. Although the ideas are presented in the HOL logic, they are broadly applicable: the only notable requirements are a recursion theorem of the right form, a basic definition principle for introducing abbreviations, and an indefinite description operator. We have also sketched how higher levels of automation may be achieved, based on the automatic extraction of termination conditions by contextual rewriting. A few standard examples have been covered and, in some cases, generalized. We emphasize that transformations derived using our technique are sound. For any instantiation of the parameters of a scheme or transformation, the rules of deduction force the applicability constraints to be likewise instantiated, and those instantiations persist in the hypotheses until eliminated by deduction. An instantiated scheme or transformation with invalid constraints can of course be trivialized. Future work should focus on the difficult problems involved in automating the application of program transformations. One potential benefit of our practice of always deriving induction theorems may be that, if the scheme and the induction theorem are treated as a unit during instantiation, the instantiated induction
58
Konrad Slind
scheme will be available for reasoning about the instantiated program at each step in the instantiation chain. The schematic definition facility we have presented has been implemented via simple extensions to the TFL [32] package: as a result, program schemes as described in this paper have been available in the public releases of both the Hol98 and Isabelle/HOL systems since summer 1999.
Acknowledgements This research was carried out on EPSRC grant GR/L03071, and written up while the author was employed on ESPRIT Framework IV LTR 26241. Larry Paulson helped finalize the Isabelle/HOL instantiation.
References

1. Penny Anderson and David Basin. Program development schemata as derived rules. Journal of Symbolic Computation, 2000. To appear.
2. Lennart Augustsson. Compiling pattern matching. In J.P. Jouannaud, editor, Conference on Functional Programming Languages and Computer Architecture (LNCS 201), pages 368–381, Nancy, France, 1985.
3. Bruno Barras, Samuel Boutin, Cristina Cornes, Judicael Courant, Yann Coscoy, David Delahaye, Daniel de Rauglaudre, Jean-Christophe Filliatre, Eduardo Gimenez, Hugo Herbelin, Gerard Huet, Henri Laulhere, Cesar Munoz, Chetan Murthy, Catherine Parent-Vigouroux, Patrick Loiseleur, Christine Paulin-Mohring, Amokrane Saibi, and Benjamin Werner. The Coq Proof Assistant Reference Manual. INRIA, 6.3.1 edition, December 1999. Accessible at http://pauillac.inria.fr/coq/doc/main.html.
4. Richard Bird. Functional algorithm design. In B. Moeller, editor, Mathematics of Program Construction, Third International Conference (MPC'95), volume 947 of LNCS, pages 2–17, Kloster Irsee, Germany, July 17–21, 1995.
5. Robert S. Boyer and J Strother Moore. A Computational Logic. Academic Press, 1979.
6. Rod Burstall and John Darlington. A transformation system for developing recursive programs. Journal of the Association for Computing Machinery, 24(1):44–67, January 1977.
7. Alonzo Church. A formulation of the Simple Theory of Types. Journal of Symbolic Logic, 5:56–68, 1940.
8. Nachum Dershowitz and Zohar Manna. Proving termination with multiset orderings. CACM, 22(8):465–476, 1979.
9. Axel Dold. Representing, verifying and applying software development steps using the PVS system. In V.S. Alagar and Maurice Nivat, editors, Proceedings of the Fourth International Conference on Algebraic Methodology and Software Technology, AMAST'95, Montreal, volume 936 of Lecture Notes in Computer Science, pages 431–435. Springer-Verlag, 1995.
10. Axel Dold. Software development in PVS using generic development steps. To appear in Springer LNCS, Proceedings of a Seminar on Generic Programming, April 1998.
Wellfounded Schematic Definitions
59
11. William Farmer. Recursive definitions in IMPS. Available by anonymous FTP at ftp.harvard.edu, in directory imps/doc, file name recursivedefinitions.dvi.gz, 1997.
12. P. Flener, K.-K. Lau, and M. Ornaghi. On correct program schemas. In N.E. Fuchs, editor, Proceedings of LOPSTR'97 (LNCS 1463), pages 124–143. Springer-Verlag, 1998.
13. Mike Gordon and Tom Melham. Introduction to HOL, a theorem proving environment for higher order logic. Cambridge University Press, 1993.
14. Hardware Verification Group. Hol98 User's Manual. University of Cambridge, December 1999. Accessible at http://www.ftp.cl.cam.ac.uk/ftp/hvg/hol98.
15. John Harrison. Inductive definitions: automation and application. In E. Thomas Schubert, Phillip J. Windley, and James Alves-Foss, editors, Proceedings of the 1995 International Workshop on Higher Order Logic theorem proving and its applications, number 971 in LNCS, pages 200–213, Aspen Grove, Utah, 1995. Springer-Verlag.
16. John Harrison. HOL-Light: A tutorial introduction. In Proceedings of the First International Conference on Formal Methods in Computer-Aided Design (FMCAD'96), volume 1166 of LNCS, pages 265–269. Springer-Verlag, 1996.
17. John Harrison. Theorem Proving with the Real Numbers. CPHC/BCS Distinguished Dissertations. Springer, 1998.
18. Gerard Huet and Bernhard Lang. Proving and applying program transformations expressed with second-order patterns. Acta Informatica, 11:31–55, 1978.
19. Jeffery R. Lewis, Mark B. Shields, Erik Meijer, and John Launchbury. Implicit parameters: Dynamic scoping with static types. In Tom Reps, editor, ACM Symposium on Principles of Programming Languages, Boston, Massachusetts, USA, January 2000. ACM Press.
20. Luc Maranget. Two techniques for compiling lazy pattern matching. Technical Report 2385, INRIA, October 1994.
21. Tom Melham. A package for inductive relation definitions in HOL. In M. Archer, J. J. Joyce, K. N. Levitt, and P. J. Windley, editors, Proceedings of the 1991 International Workshop on the HOL Theorem Proving System and its Applications, pages 350–357. IEEE Computer Society Press, Davis, California, USA, August 1991.
22. S. Owre, J. M. Rushby, N. Shankar, and D.J. Stringer-Calvert. PVS System Guide. SRI Computer Science Laboratory, September 1998. Available at http://pvs.csl.sri.com/manuals.html.
23. Helmut A. Partsch. Specification and Transformation of Programs: A Formal Approach to Software Development. Texts and Monographs in Computer Science. Springer-Verlag, 1990.
24. Lawrence Paulson. A fixedpoint approach to implementing (co)inductive definitions. In Alan Bundy, editor, 12th International Conference on Automated Deduction (CADE), volume LNAI 814, pages 148–161. Springer-Verlag, 1994. Revised version available at http://www.cl.cam.ac.uk/users/lcp/papers/recur.html under the title 'A Fixedpoint Approach to (Co)inductive and (Co)datatype Definitions'.
25. Lawrence Paulson. Isabelle: A Generic Theorem Prover. Number 828 in LNCS. Springer-Verlag, 1994. Up-to-date reference manual can be found at http://www.cl.cam.ac.uk/Research/HVG/Isabelle/dist/.
60
Konrad Slind
26. Peter Pepper and Douglas R. Smith. A high-level derivation of global search algorithms (with constraint propagation). Science of Computer Programming, 28(2–3):247–271, April 1997.
27. Henrik Persson. Type Theory and the Integrated Logic of Programs. PhD thesis, Chalmers University of Technology, June 1999.
28. Piotr Rudnicki and Andrzej Trybulec. On equivalents of well-foundedness. Journal of Automated Reasoning, 23(3):197–234, 1999.
29. Natarajan Shankar. Steps towards mechanizing program transformations using PVS. In B. Moeller, editor, Mathematics of Program Construction, Third International Conference (MPC'95), number 947 in Lecture Notes in Computer Science, pages 50–66, Kloster Irsee, Germany, July 17–21, 1995.
30. Konrad Slind. Function definition in higher order logic. In Theorem Proving in Higher Order Logics, number 1125 in Lecture Notes in Computer Science, Abo, Finland, August 1996. Springer-Verlag.
31. Konrad Slind. Derivation and use of induction schemes in higher order logic. In Theorem Proving in Higher Order Logics, number 1275 in Lecture Notes in Computer Science, Murray Hill, New Jersey, USA, August 1997. Springer-Verlag.
32. Konrad Slind. Reasoning about Terminating Functional Programs. PhD thesis, Institut für Informatik, Technische Universität München, 1999. Accessible at http://www.cl.cam.ac.uk/users/kxs/papers.
33. Mitchell Wand. Continuation-based program transformation strategies. Journal of the ACM, 27(1):164–180, January 1980.
Appendix

The HOL logic is a typed higher-order predicate calculus [13], derived from Church's Simple Theory of Types [7]. The HOL logic is classical and has a set-theoretic semantics, in which types denote non-empty sets and the function space denotes total functions. Several mature mechanizations exist [14,16,25]. The HOL logic is built on the syntax of a lambda calculus with an ML-style polymorphic type system. The syntax is based on signatures for types (Ω) and terms (ΣΩ). The type signature assigns arities to type operators, while the term signature delivers the types of constants.

Definition 14 (HOL Types). The set of types is the least set closed under the following rules:
type variable. There is a countable set of type variables, which are represented with Greek letters, e.g., α, β, etc.
compound type. If c in Ω has arity n, and each of ty1, ..., tyn is a type, then c(ty1, ..., tyn) is a type.

A type constant is represented by a 0-ary compound type. A large collection of types can be definitionally constructed in HOL, building on the initial types found in Ω: truth values (bool), function space (written α → β), and an infinite set of individuals (ind). Terms are typed λ-calculus expressions built with respect to ΣΩ. When we wish to show that a term M has type τ, the notation M : τ is used.
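As an illustration only (not from the paper), the two formation rules of Definition 14 can be transcribed into a small Python sketch, with a dictionary standing in for the arity assignment of Ω; the names OMEGA, TyVar and TyApp are our own:

```python
# Illustrative sketch (ours): HOL types in Python, with the type signature
# Omega assigning an arity to each type operator, as in Definition 14.

class TyVar:
    """A type variable such as alpha or beta."""
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return self.name

class TyApp:
    """A compound type c(ty1, ..., tyn); a 0-ary application is a type constant."""
    def __init__(self, op, args, omega):
        assert omega[op] == len(args), "arity mismatch"
        self.op, self.args = op, args
    def __repr__(self):
        if not self.args:
            return self.op
        return "%s(%s)" % (self.op, ", ".join(map(repr, self.args)))

# The initial type signature: bool and ind are 0-ary; the function-space
# operator (written `fun` here) is binary.
OMEGA = {"bool": 0, "ind": 0, "fun": 2}

alpha, beta = TyVar("alpha"), TyVar("beta")
fn = TyApp("fun", [alpha, beta], OMEGA)   # the type alpha -> beta
print(fn)                                  # fun(alpha, beta)
```

The arity check in the constructor is exactly the side condition of the compound-type rule: an operator of arity n may only be applied to n argument types.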
Definition 15 (HOL Terms). The set of terms is the least set closed under the following rules:
Variable. If v is a string and ty is a type built from Ω, then v : ty is a term.
Constant. (c : ty) is a term if c : τ is in ΣΩ and ty is an instance of τ, i.e., there exists a substitution θ for type variables such that each element of the range of θ is a type in Ω and θ(τ) = ty.
Combination. (M N) is a term of type β if M is a term of type α → β and N is a term of type α.
Abstraction. (λv. M) is a term of type α → β if v is a variable of type α and M is a term of type β.

Initially, ΣΩ contains constants denoting equality (=), implication (⊃), and an indefinite description operator (ε). Types and terms form the basis of the prelogic, in which basic algorithmic manipulations on types and terms are defined: e.g., the free variables of a type or term, α-convertibility, substitution, and β-conversion. For describing substitution, the notation [M1 ↦ M2] N is used to represent the term N where all free occurrences of M1 have been replaced by M2. Of course, M1 and M2 must have the same type in this operation. During substitution, every binding occurrence of a variable in N that would capture a free variable in M2 is renamed to avoid the capture taking place.

Deductive system. In Figure 1, a useful set of inference rules is outlined, along with the axioms of the HOL logic. The derivable theorems are just those that can be generated by using the axioms and inference rules of Figure 1. More parsimonious presentations of this deductive system can be found in [13] or Appendix A of [17]. A theorem with hypotheses P1, ..., Pk and conclusion Q (all of type bool) is written [P1, ..., Pk] ⊢ Q. In the presentation of some rules, e.g., ∨-elim, the following idiom is used: Γ, P ⊢ Q. This denotes a theorem where P occurs as a hypothesis. A later reference to Γ then actually means Γ − {P}, i.e., had P already been among the elements of Γ, it would now be removed.
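The capture-avoiding substitution just described can be sketched in Python (our own illustration, on untyped terms only; the tuple encoding and helper names are hypothetical):

```python
# Sketch (ours): the substitution [M1 |-> M2] N on untyped lambda terms,
# renaming binders to avoid variable capture, as described above.

import itertools

def free_vars(t):
    kind = t[0]
    if kind == "var":
        return {t[1]}
    if kind == "app":
        return free_vars(t[1]) | free_vars(t[2])
    return free_vars(t[2]) - {t[1]}          # abstraction: binder is bound

fresh = ("v%d" % i for i in itertools.count())

def subst(m1, m2, n):
    """Replace all free occurrences of variable m1 in n by the term m2."""
    kind = n[0]
    if kind == "var":
        return m2 if n[1] == m1 else n
    if kind == "app":
        return ("app", subst(m1, m2, n[1]), subst(m1, m2, n[2]))
    v, body = n[1], n[2]
    if v == m1:                               # binder shadows m1: nothing to do
        return n
    if v in free_vars(m2):                    # binder would capture a free
        w = next(fresh)                       # variable of m2: rename it first
        body = subst(v, ("var", w), body)
        v = w
    return ("abs", v, subst(m1, m2, body))

# [y |-> x] (\x. y) must rename the binder rather than return \x. x
t = subst("y", ("var", "x"), ("abs", "x", ("var", "y")))
print(t)  # ('abs', 'v0', ('var', 'x'))
```

The renaming branch is the point of the definition: without it, the free x of M2 would be captured by the binder.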
Some rules, noted by an asterisk in Figure 1, have restrictions on their use or require special comment:
– ∀-intro. The rule application fails if x occurs free in Γ.
– ∃-intro. The rule application fails if N does not occur free in P. Moreover, only some designated occurrences of N need be replaced by x. The details of how occurrences are designated vary from implementation to implementation.
– ∃-elim. The rule application fails if the variable v occurs free in Γ ∪ ∆ ∪ {P, Q}.
– Abs. The rule application fails if v occurs free in Γ.
– tyInst. A substitution θ mapping type variables to types is applied to each hypothesis, and also to the conclusion.

An important feature of the HOL logic is ε : (α → bool) → α, Hilbert's indefinite description operator. A description term εx : τ. P x is interpreted as follows: it delivers an arbitrary element e of type τ such that P e holds. If there
⊃-intro: from Γ ⊢ Q infer Γ − {P} ⊢ P ⊃ Q
⊃-elim: from Γ ⊢ P ⊃ Q and ∆ ⊢ P infer Γ ∪ ∆ ⊢ Q
∧-intro: from Γ ⊢ P and ∆ ⊢ Q infer Γ ∪ ∆ ⊢ P ∧ Q
∧-elim: from Γ ⊢ P ∧ Q infer Γ ⊢ P and Γ ⊢ Q
∨-intro: from Γ ⊢ P infer Γ ⊢ P ∨ Q and Γ ⊢ Q ∨ P
∨-elim: from Γ1 ⊢ P ∨ Q, Γ2, P ⊢ M and Γ3, Q ⊢ M infer Γ1 ∪ Γ2 ∪ Γ3 ⊢ M
∀-intro∗: from Γ ⊢ P infer Γ ⊢ ∀x. P
∀-elim: from Γ ⊢ ∀x. P infer Γ ⊢ [x ↦ N]P
∃-intro∗: from Γ ⊢ P infer Γ ⊢ ∃x. [N ↦ x]P
∃-elim∗: from Γ ⊢ ∃x. P and ∆, [x ↦ v]P ⊢ Q infer Γ ∪ ∆ ⊢ Q
Assume: P ⊢ P
Refl: ⊢ M = M
Sym: from Γ ⊢ M = N infer Γ ⊢ N = M
Trans: from Γ ⊢ M = N and ∆ ⊢ N = P infer Γ ∪ ∆ ⊢ M = P
Comb: from Γ ⊢ M = N and ∆ ⊢ P = Q infer Γ ∪ ∆ ⊢ M P = N Q
Abs∗: from Γ ⊢ M = N infer Γ ⊢ (λv. M) = (λv. N)
tyInst∗: from Γ ⊢ M infer θ(Γ) ⊢ θ(M)
β-conv: ⊢ (λv. M)N = [v ↦ N]M
Bool: ⊢ P ∨ ¬P
Eta: ⊢ (λv. M v) = M
Select: ⊢ P x ⊃ P(εx. P x)
Infinity: ⊢ ∃f : ind → ind. (∀x y. (f x = f y) ⊃ (x = y)) ∧ ∃y. ∀x. ¬(y = f x)
Fig. 1. HOL deductive system

is no object that P holds of, then εx : τ. P x denotes an arbitrary element of τ. This is summarized in the axiom ⊢ ∀P x. P x ⊃ P(εx. P x).

Definition 16 (Arb). Arb ≡ εz : α. T

The definition of Arb uses the Hilbert choice operator to denote an arbitrary but fixed value, for each type τ. Arb is fixed because T has no free variables; it is arbitrary because λv. T holds for every element of τ.
F and T are the two constants of type bool denoting truth values in HOL.
One of the most influential methodological developments in verification has been the adoption of principles of definition as logical prophylaxis, and implementations of HOL therefore tend to eschew the assertion of axioms.

Definition 17 (Principle of Constant Definition). Given terms x : τ and M : τ in signature ΣΩ, check that
1. x is a variable and the name of x is not the name of a constant in ΣΩ;
2. τ is a type in ΣΩ;
3. M is a term in ΣΩ with no free variables; and
4. every type variable occurring in M occurs in τ.
If all these checks are passed, add a constant x : τ to ΣΩ and introduce an axiom ⊢ x = M. Thus an invocation of the Principle of Constant Definition, for suitable x and M, introduces x as an abbreviation for M. It is shown in [13] to be a sound means of extending the HOL logic. The notation c ≡ M is often used to show that a definition is being made. Derived definition principles, such as the one presented in this paper, reduce via deduction to applications of the primitive Principle.
Abstract Congruence Closure and Specializations

Leo Bachmair and Ashish Tiwari
Department of Computer Science, State University of New York, Stony Brook, NY 11794-4400, U.S.A.
{leo,astiwari}@cs.sunysb.edu
Abstract. We use the uniform framework of abstract congruence closure to study the congruence closure algorithms described by Nelson and Oppen [9], Downey, Sethi and Tarjan [7] and Shostak [11]. The descriptions thus obtained abstract from certain implementation details while still allowing for comparison between these different algorithms. Experimental results are presented to illustrate the relative efficiency and explain differences in performance of these three algorithms. The transition rules for computation of abstract congruence closure are obtained from rules for standard completion enhanced with an extension rule that enlarges a given signature by new constants.
1 Introduction
Algorithms to compute "congruence closure" have typically been described in terms of directed acyclic graphs (dags) representing a set of terms, and a union-find data structure storing an equivalence relation on the vertices of this graph. In this paper, we abstractly describe some of these algorithms while still maintaining the "sharing" and "efficiency" offered by the data structures. This is achieved through the concept of an abstract congruence closure, cf. [2, 3]. A key idea of abstract congruence closure is the use of new constants as names for subterms, which yields a concise and simplified term representation. Consequently, complicated term orderings are no longer necessary or even applicable. There is usually a trade-off between the simplicity of terms thus obtained and the loss of term structure. In this paper, we take a middle ground where we keep the term structure as much as possible while still using extensions to obtain a simplified term representation. The paper also illustrates the use of an extended signature as a formalism to model, and subsequently reason about, data structures like term dags, which are based on the idea of structure sharing. In Section 2 we review the description of abstract congruence closure as a set of transition rules [2, 3]. The transition rules are derived from standard completion [1], enhanced with extension and suitably modified for the ground
The research described in this paper was supported in part by the National Science Foundation under grant CCR-9902031.
D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 64–78, 2000. © Springer-Verlag Berlin Heidelberg 2000
case. Taking such an abstract view allows for a better understanding of the various graph-based congruence closure algorithms (Section 3), and also suggests new efficient procedures for constructing congruence closures (Section 4).

Preliminaries. Given a set Σ = ∪n Σn of function symbols and constants, called a signature, the set of (ground) terms T(Σ) over Σ is the smallest set containing Σ0 and such that f(t1, ..., tn) ∈ T(Σ) whenever f ∈ Σn and ti ∈ T(Σ). The index n of the set Σn to which a function symbol f belongs is called the arity of the symbol f. Elements of arity 0 are called constants. A symbol f ∈ Σk of arity k is also said to be a k-ary function symbol. The symbols s, t, u, ... are used to denote terms in T(Σ); f, g, ..., function symbols. We write t[s] to indicate that a term t contains s as a subterm and (ambiguously) denote by t[u] the result of replacing a particular occurrence of s by u. A subterm of a term t is called proper if it is distinct from t. An equation is a pair of terms, written as s ≈ t. The replacement, or single-step rewrite, relation →E induced by a set of ground (or variable-free) equations E is defined by: u[l] →E u[r] if, and only if, l ≈ r is in E. If → is a binary relation, then ← denotes its inverse, ↔ its symmetric closure, →+ its transitive closure, and →∗ its reflexive-transitive closure. Thus, ↔∗E denotes the congruence relation (which is the same as the equational theory when E is ground) induced by a set E of ground equations. Equations are often called rewrite rules, and a set E a rewrite system, if one is particularly interested in the rewrite relation →∗E rather than the equational theory ↔∗E. If E is a set of equations, we write E[s] to denote that the term s occurs as a subterm of some equation in E, and (ambiguously) use E[t] to denote the set of equations obtained by replacing an occurrence of s in E by t.
A term t is in normal form with respect to a rewrite system R if there is no term t′ such that t →R t′. We write s →!R t to indicate that t is an R-normal form of s. A rewrite system R is said to be (ground) confluent if every (ground) term t has at most one normal form, i.e., if there exist s, s′ such that s ←∗R t →∗R s′, then s →∗R ◦ ←∗R s′. It is terminating if there exists no infinite sequence s0 →R s1 →R s2 · · · of terms. Rewrite systems that are (ground) confluent and terminating are called (ground) convergent.
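For illustration (our own sketch, not code from the paper), normal forms under a ground rewrite system can be computed by exhaustively rewriting subterms; we use the convergent congruence closure R1 that is constructed in Section 2:

```python
# Sketch (ours): computing R-normal forms for a ground rewrite system over
# tuple-encoded terms, e.g. ("f", ("f", "a")) for ffa.

def rewrite_step(t, R):
    """Return a term u with t ->R u, or None if t is in R-normal form."""
    for lhs, rhs in R:
        if t == lhs:
            return rhs
    if isinstance(t, tuple):                  # try to rewrite a subterm
        f, *args = t
        for i, a in enumerate(args):
            u = rewrite_step(a, R)
            if u is not None:
                return (f, *args[:i], u, *args[i+1:])
    return None

def normal_form(t, R):
    # terminates because R is terminating in our examples
    while (u := rewrite_step(t, R)) is not None:
        t = u
    return t

# The congruence closure R1 from Section 2: a -> c1, b -> c1, fc1 -> c3, ...
R1 = [("a", "c1"), ("b", "c1"), (("f", "c1"), "c3"), (("f", "c3"), "c3"),
      ("c0", "c1"), ("c2", "c3")]
print(normal_form(("f", ("f", "a")), R1))  # c3
print(normal_form(("f", "b"), R1))         # c3
```

Since R1 is convergent, the equal terms ffa and fb reach the same normal form c3, in accordance with the equation ffa ≈ fb.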
2 Abstract Congruence Closure
We first review the concept of an abstract congruence closure [2, 3]. Let Σ be a signature and K be a set of constants disjoint from Σ. A D-rule (with respect to Σ and K) is a rewrite rule of the form t → c, where t is a term from the set
(Footnotes: 1. There is no difference between the replacement relation and the rewrite relation in the ground case. 2. A congruence relation is a reflexive, symmetric, and transitive relation on terms that is also a replacement relation.)
T(Σ ∪ K) − K and c is a constant in K. A C-rule (with respect to K) is a rule c → d, where c and d are constants in K. For example, if Σ0 = {a, b, f} and E0 = {a ≈ b, ffa ≈ fb}, then D0 = {a → c0, b → c1, ffa → c2, fb → c3} is a set of D-rules over Σ0 and K0 = {c0, c1, c2, c3}. The original equations in E0 can now be simplified using D0 to give C0 = {c0 ≈ c1, c2 ≈ c3}. The set D0 ∪ C0 may be viewed as an alternative representation of E0 over an extended signature. The equational theory presented by D0 ∪ C0 is a conservative extension of the theory E0. This reformulation of the equations E0 in terms of an extended signature is (implicitly) present in all congruence closure algorithms; see Section 3. A constant c in K is said to represent a term t in T(Σ ∪ K) (via the rewrite system R) if t ↔∗R c. A term t is represented by R if it is represented by some constant in K via R. For example, the constant c2 represents the term ffa via D0.

Definition 1. Let Σ be a signature and K be a set of constants disjoint from Σ. A ground rewrite system R = D ∪ C of D-rules and C-rules over the signature Σ ∪ K is said to be an (abstract) congruence closure (with respect to Σ and K) if (i) each constant c ∈ K that is in normal form with respect to R represents some term t ∈ T(Σ) via R, and (ii) R is ground convergent. If E is a set of ground equations over T(Σ ∪ K) and in addition R is such that (iii) for all terms s and t in T(Σ), s ↔∗E t if, and only if, s →∗R ◦ ←∗R t, then R will be called an (abstract) congruence closure for E.

Condition (i) essentially states that no superfluous constants are introduced; condition (ii) ensures that equivalent terms have the same representative; and condition (iii) implies that R is a conservative extension of the equational theory induced by E over T(Σ). The rewrite system R0 = D0 ∪ {c0 → c1, c2 → c3} above is not a congruence closure for E0, as it is not ground convergent.
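The representation relation can be illustrated concretely. The following toy sketch (ours; it treats terms as strings, which is adequate only for this flat example) reduces terms by the D-rules of D0, trying the longest left-hand side first:

```python
# Sketch (ours): with the D-rules of D0, every term of the running example
# reduces by repeated replacement to the constant representing it.

D0 = {"ffa": "c2", "fb": "c3", "a": "c0", "b": "c1"}

def represent(term):
    changed = True
    while changed:
        changed = False
        for lhs in sorted(D0, key=len, reverse=True):  # longest match first
            if lhs in term:
                term = term.replace(lhs, D0[lhs])
                changed = True
    return term

print(represent("ffa"))  # c2 -- the constant c2 represents ffa via D0
print(represent("fb"))   # c3
```

Longest-match-first ordering matters here: otherwise the rule a → c0 would fire inside ffa before the rule ffa → c2 could apply.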
But we can transform R0 into a suitable rewrite system, using a completion-like process described in more detail below, to obtain a congruence closure R1 = {a → c1, b → c1, fc1 → c3, fc3 → c3, c0 → c1, c2 → c3}.

Construction of Congruence Closures. We next present a general method for the construction of congruence closures. Our description is fairly abstract, in terms of transition rules that manipulate triples (K, E, R), where K is the set of constants that extend the original fixed signature Σ, E is the set of ground equations (over Σ ∪ K) yet to be processed, and R is the set of C-rules and D-rules that have been derived so far. Triples represent states in the process of constructing a congruence closure. Construction starts from an initial state (∅, E, ∅), where E is a given set of ground equations.
The definition of a D-rule is more general than the definition presented in [2, 3] as it allows for arbitrary non-constant terms on the left-hand side.
The transition rules can be derived from those for standard completion as described in [1], with some differences. In particular, (i) application of the transition rules is guaranteed to terminate, and (ii) a convergent system is constructed over an extended signature. The transition rules do not require any reduction ordering on terms in T(Σ), but only a simple ordering ≻ on terms in T(Σ ∪ U), where U is an infinite set of constants from which the new constants K ⊂ U are chosen. In particular, if ≻U is any ordering on the set U, then ≻ is defined as: c ≻ d if c ≻U d, and t ≻ c if t → c is a D-rule. In this paper, the set U = {c0, c1, c2, ...}, and we will assume ci ≻U cj iff i < j. A key transition rule introduces new constants as names for subterms. Extension:
(K, E[t], R) ⊢ (K ∪ {c}, E[c], R ∪ {t → c})
where t → c is a D-rule, t is a term occurring in (some equation in) E, and c ∉ Σ ∪ K. The following three rules are identical to the corresponding rules for standard completion. Simplification:
(K, E[t], R ∪ {t → c}) ⊢ (K, E[c], R ∪ {t → c})
where t occurs in some equation in E. It is fairly easy to see that, by repeated application of extension and simplification, any equation in E can be reduced to an equation that can be oriented by the ordering ≻. Orientation:
(K ∪ {c}, E ∪ {t ≈ c}, R) ⊢ (K ∪ {c}, E, R ∪ {t → c})
if t ≻ c. Trivial equations may be deleted. Deletion:
(K, E ∪ {t ≈ t}, R) ⊢ (K, E, R)
In the case of completion of ground equations, deduction steps can all be replaced by suitable simplification steps. In particular, most of the deduction steps can be described by collapse, and hence the deduction rule considers only simple forms of overlap. Deduction:
(K, E, R ∪ {t → c, t → d}) ⊢ (K, E ∪ {c ≈ d}, R ∪ {t → d})
(Footnotes: 4. An ordering is any irreflexive and transitive relation on terms. A reduction ordering is an ordering that is also a well-founded replacement relation. 5. Terms in T(Σ) are incomparable by ≻.)
In our case the usual side condition in the collapse rule, which refers to the encompassment ordering, can easily be stated in terms of the subterm relation. Collapse:
(K, E, R ∪ {s[t] → d, t → c}) ⊢ (K, E, R ∪ {s[c] → d, t → c})
if t is a proper subterm of s. As in standard completion the simplification of right-hand sides of rules in R by other rules is optional and not necessary for correctness. The right-hand side term in any rule in R is always a constant. Composition:
(K, E, R ∪ {t → c, c → d}) ⊢ (K, E, R ∪ {t → d, c → d})
We use the symbol ⊢ to denote the one-step transition relation on states induced by the above transition rules. A derivation is a sequence of states (K0, E0, R0) ⊢ (K1, E1, R1) ⊢ · · ·.

Example 1. Consider the set of equations E0 = {a ≈ b, ffa ≈ fb}. An abstract congruence closure for E0 can be derived from (K0, E0, R0) = (∅, E0, ∅) as follows:

i   Constants Ki   Equations Ei           Rules Ri              Transition Rule
0   ∅              E0                     ∅
1   {c0}           {c0 ≈ b, ffa ≈ fb}     {a → c0}              Ext
2   {c0}           {ffa ≈ fb}             {a → c0, b → c0}      Ori
3   {c0}           {ffc0 ≈ fc0}           {a → c0, b → c0}      Sim²
4   {c0, c1}       {fc1 ≈ fc0}            R3 ∪ {fc0 → c1}       Ext
5   {c0, c1}       {fc1 ≈ c1}             R3 ∪ {fc0 → c1}       Sim
6   K5             {}                     R5 ∪ {fc1 → c1}       Ori

The rewrite system R6 is the required congruence closure. The correctness of the transition rules presented here can be established in a way similar to the correctness of the transition rules for computing a congruence closure modulo associativity and commutativity [3]. The differences arise from the more general definition of D-rules, and the lack of any associative and commutative functions here. The transition rules presented above are sound in the following sense: if (K0, E0, R0) ⊢ (K1, E1, R1), then, for all terms s and t in T(Σ ∪ K0), s ↔∗E1∪R1 t if and only if s ↔∗E0∪R0 t. Additionally, let K0 be a finite set of constants (disjoint from Σ), E0 be a finite set of equations (over Σ ∪ K0), and R0 be a finite set of D-rules and C-rules such that for every C-rule c → d ∈ R0 we have c ≻U d. Then any derivation starting from (K0, E0, R0) is finite. If (K0, E0, R0) ⊢∗ (Km, Em, Rm), then Rm is terminating. We call a state (K, E, R) final if no transition rule (except possibly composition) is applicable.
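The transition rules and the flattening strategy of Example 1 can be prototyped in a few lines. The following sketch is our own illustration (not any of the implementations discussed later): it flattens each equation with Ext and Sim, orients the resulting constant equation on the fly, and then applies Col and Ded exhaustively; Σ-constants are assumed not to begin with the letter c:

```python
# Prototype (ours) of the transition rules for abstract congruence closure.
# Terms are strings or tuples such as ("f", ("f", "a")) for ffa; the rule set
# R maps each left-hand side (a flat tuple or a constant) to a constant in K.

import itertools

fresh = ("c%d" % i for i in itertools.count())
idx = lambda c: int(c[1:])                  # index of a constant c_i in U

def name(t, R):
    """Sim/Ext: reduce t to the constant representing it, extending R."""
    if isinstance(t, tuple):                # flatten the arguments first
        t = (t[0],) + tuple(name(a, R) for a in t[1:])
    while t in R:                           # Sim: apply D- and C-rules
        t = R[t]
    if isinstance(t, tuple) or not t.startswith("c"):
        c = next(fresh)                     # Ext: a new name for this term
        R[t] = c
        return c
    return t

def close(eqns):
    R, E = {}, list(eqns)
    while E:
        s, t = E.pop()
        c, d = name(s, R), name(t, R)
        if c == d:
            continue                        # Del: trivial equation
        if idx(c) < idx(d):                 # Ori, with an on-the-fly ordering
            c, d = d, c
        R[c] = d                            # the new C-rule c -> d
        for lhs in [l for l in R if isinstance(l, tuple) and c in l]:
            new = tuple(d if x == c else x for x in lhs)   # Col
            if new in R:
                E.append((R[new], R.pop(lhs)))             # Ded
            else:
                R[new] = R.pop(lhs)
    return R

R = close([(("f", ("f", "a")), ("f", "b")), ("a", "b")])
print(name(("f", ("f", "a")), R) == name(("f", "b"), R))   # True
```

After closure, two terms are congruent exactly when name reduces them to the same constant, which is how the final assertion checks ffa ≈ fb.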
Theorem 1. Let Σ be a signature and K1 a finite set of constants disjoint from Σ. Let E1 be a finite set of equations over Σ ∪ K1 and R1 a finite set of D-rules and C-rules such that every c ∈ K1 represents some term t ∈ T(Σ) via E1 ∪ R1, and c ≻U d for every C-rule c → d in R1. If (Kn, En, Rn) is a final state such that (K1, E1, R1) ⊢∗ (Kn, En, Rn), then En = ∅ and Rn is an abstract congruence closure for E1 ∪ R1 (over Σ and K1).
3 Congruence Closure Strategies
The literature abounds with various implementations of congruence closure algorithms. We next describe the algorithms in [7], [9] and [11] as specific variants of our general abstract description. That is, we provide a description of these algorithms (modulo some implementation details) using the abstract congruence closure transition rules. Term directed acyclic graphs (dags) are a common data structure used to implement algorithms that work with terms over some signature, such as the congruence closure algorithm. In fact, many algorithms that have been described for congruence closure assume that the input is an equivalence relation on vertices of a given dag, and the desired output is an equivalence on the same dag that is defined by the congruence relation. Figure 1 illustrates how a given term dag is (abstractly) represented using D-rules. The solid lines represent subterm edges, and the dashed lines represent a binary relation on the vertices. We have a D-rule corresponding to each vertex, and a C-rule for each dashed edge. Note that the D-rules corresponding to a conventional term dag representation are all of a special form f(c1, . . . , ck) → c, where f ∈ Σ is a k-ary function symbol, and c1, . . . , ck, c are all new constants. Such rules will be called simple D-rules. The definition of D-rules given in Section 2 is more general, and allows for arbitrary terms on the left-hand sides. In a sense this corresponds to storing contexts, rather than just symbols from Σ, in each node (of the term dag). This is an attempt to keep as much of the term structure information as possible while still getting the advantages offered by a simplified term representation via extensions. We need to specify a set U and an ordering ≻U on this set. Since elements of U serve only as names, we can choose U to be any countable set of symbols. The ordering ≻U need not be specified a priori but can be defined on the fly as the derivation proceeds.
(The ordering has to be extended so that the irreflexivity and transitivity properties are preserved.) Traditional congruence closure algorithms also employ other data structures, such as the following: (i) Input dag: Starting from the state (∅, E0, ∅), if we apply extension and simplification using the strategy (Ext ◦ Sim∗)∗, making sure we create only simple D-rules, we finally reach a state (K1, E1, D1) where all equations in E1 are of the form c ≈ d, for c, d ∈ K1. The set D1, then, represents the input dag and E1 represents the (input) equivalence on the vertices of this dag. Note that due to eager simplification, we obtain a representation of a dag with maximum possible
sharing. For example, if E0 = {a ≈ b, ffa ≈ fb}, then K1 = {c0, c1, c2, c3, c4}, E1 = {c0 ≈ c1, c3 ≈ c4} and R1 = {a → c0, b → c1, fc0 → c2, fc2 → c3, fc1 → c4}.

[Fig. 1. A term dag and a relation on its vertices. The term dag is represented by the D-rules a → c1, b → c4, c → c6, d → c7, gc1c1 → c2, fc1c2 → c3, hc4 → c5, hc7 → c8, gc6c8 → c9, fc5c9 → c10; the relation on the vertices is represented by the C-rules c1 ≈ c5, c4 ≈ c7, c2 ≈ c9, c6 ≈ c5, c3 ≈ c10, c5 ≈ c8.]

(ii) Signature table: The signature table (indexed by vertices of the input dag) stores a signature for some or all vertices. Clearly, the signatures are fully left-reduced D-rules. (iii) Use table: The use table (also called the predecessor list) is a mapping from a constant c to the set of all vertices whose signature contains c. This translates, in our presentation, to a method of indexing the set of D-rules. (iv) Union-Find: The union-find data structure that maintains equivalence classes on the set of vertices is represented by the set of C-rules. If we apply orientation and simplification to the state (K1, E1, D1) described above, using the strategy (Ori ◦ Sim∗)∗, we obtain a state (K1, ∅, D1 ∪ C1). The set C1 is a representation of the union-find structure capturing the input equivalence on vertices. Continuing with the same example, C1 would be the set {c0 → c1, c3 → c4}. We note that D-rules serve a two-fold purpose: they represent the input term dag, and also a signature table. We also note that Composition is used only implicitly in the various algorithms, via path compression on the union-find structure.
Shostak's Method. Shostak's congruence closure procedure was first described using simple D-rules and C-rules by Kapur [8]. We show here that Shostak's congruence closure procedure is a specific strategy over the general transition rules for abstract congruence closure presented here. Shostak's congruence closure is dynamic: it can accept new equations after it has processed some equations, and can incrementally take care of the new
(Footnote: The signature of a term f(t1, . . . , tk) is defined as f(c1, . . . , ck), where ci is the name of the equivalence class containing the term ti.)
equation. Its input state is (∅, E0, ∅). Shostak’s procedure can be described (at a fairly abstract level) as:

Shos = ((Sim* ∘ Ext*)* ∘ (Del ∪ Ori) ∘ (Col ∘ Ded*)*)*

which is implemented as follows: (i) pick an equation s ≈ t from the E-component; (ii) use simplification to normalize the term s to a term s′; (iii) use extension to create simple D-rules for subterms of s′ until s′ reduces to a constant, say c, whence extension is no longer applicable; perform steps (ii) and (iii) on the other term t as well to get a constant d; (iv) if c and d are identical, apply deletion (and continue with (i)); if not, create a C-rule using orientation; (v) once we have a new C-rule, perform all possible collapse steps with this new rule, where each collapse step is followed by all the resulting deduction steps arising out of that collapse. The whole process is then repeated starting from step (i). Shostak’s procedure uses indexing based on the idea of the use() list; this use()-based indexing is used to identify all possible collapse applications. If the E-component of the state is empty when attempting to apply step (i), Shostak’s procedure halts. It is fairly easy to observe that Shostak’s procedure halts in a final state. Hence, Theorem 1 establishes that the R-component of Shostak’s halting state contains a convergent system and is an abstract congruence closure.

Example 2. We use the set E0 from Example 1 of Section 2 to illustrate Shostak’s method. We show some of the important intermediate steps of a Shostak derivation.

i  Constants Ki    Equations Ei    Rules Ri                      Transition
0  ∅               E0              ∅
1  {c0, c1}        {ffa ≈ fb}      {a → c0, b → c1, c0 → c1}     Ext² ∘ Ori
2  {c0, c1}        {ffc1 ≈ fb}     {a → c0, b → c1, c0 → c1}     Sim
3  {c0, ..., c3}   {c3 ≈ fb}       R2 ∪ {fc1 → c2, fc2 → c3}     Ext²
4  {c0, ..., c3}   {c3 ≈ c2}       R3                            Sim²
5  {c0, ..., c3}   ∅               R4 ∪ {c3 → c2}                Ori
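The strategy above can be made concrete. The following Python sketch is our own illustration (not the authors' implementation; all names are ours): extension flattens each subterm to a fresh constant recorded in a signature table (the D-rules), orientation is realized through a union-find over constants (the C-rules), and use-lists drive the collapse and deduction steps.

```python
# Illustrative sketch of congruence closure in the style of the abstract
# transition rules; not code from the paper.  Terms are strings (constants)
# or tuples ('f', t1, ..., tk).

class CC:
    def __init__(self):
        self.parent = {}   # union-find over introduced constants (C-rules)
        self.sig = {}      # signature (f, c1, ..., ck) -> constant (D-rules)
        self.uses = {}     # constant -> signatures mentioning it (use-lists)

    def find(self, c):                       # representative of c's class
        while self.parent.get(c, c) != c:
            c = self.parent[c]
        return c

    def flatten(self, t):
        """Extension: name the term t by a constant, reusing D-rules."""
        args = () if isinstance(t, str) else tuple(self.flatten(s) for s in t[1:])
        head = t if isinstance(t, str) else t[0]
        key = (head,) + tuple(self.find(c) for c in args)
        if key not in self.sig:
            self.sig[key] = f"k{len(self.sig)}"   # fresh constant
            for c in key[1:]:
                self.uses.setdefault(c, []).append(key)
        return self.find(self.sig[key])

    def merge(self, s, t):
        pending = [(self.flatten(s), self.flatten(t))]
        while pending:
            a, b = (self.find(x) for x in pending.pop())
            if a == b:
                continue
            self.parent[a] = b               # orientation: a -> b
            for key in self.uses.pop(a, []): # collapse D-rules containing a
                c = self.sig.pop(key, None)
                if c is None:                # stale entry, already rewritten
                    continue
                new = (key[0],) + tuple(self.find(x) for x in key[1:])
                if new in self.sig:          # deduction: identical signatures
                    pending.append((c, self.sig[new]))
                else:
                    self.sig[new] = c
                    for x in new[1:]:
                        self.uses.setdefault(x, []).append(new)

    def equal(self, s, t):
        return self.flatten(s) == self.flatten(t)

# The running E0: a ≈ b and ffa ≈ fb.
cc = CC()
cc.merge('a', 'b')
cc.merge(('f', ('f', 'a')), ('f', 'b'))
assert cc.equal(('f', 'a'), ('f', ('f', 'b')))   # fa and ffb are congruent
```

The sketch processes merges one at a time, mirroring the (Sim* ∘ Ext*)* ∘ (Del ∪ Ori) ∘ (Col ∘ Ded*)* cycle: flattening plays the role of simplification and extension, the union-find assignment is orientation, and the use-list traversal interleaves collapse with the deductions it triggers.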
The Downey–Sethi–Tarjan Algorithm

The Downey, Sethi and Tarjan [7] procedure assumes that the input is a dag and an equivalence relation on its vertices, which, in our language, means that the starting state for this procedure is (K1, ∅, D1 ∪ C1), where D1 represents the input dag and C1 represents the initial equivalence. It can be succinctly abstracted as:

DST = ((Col ∘ (Ded ∪ {ε}))* ∘ (Sim* ∘ (Del ∪ Ori))*)*

where ε is the null transition rule. This strategy is implemented as follows: (i) if any collapse rule is applicable, it is applied, and if, as a result, any new deduction step is possible, it is performed. This is repeated until no more collapse steps are
Leo Bachmair and Ashish Tiwari
possible. (ii) If no collapse steps are possible, then each C-equation in the E-component is picked up sequentially, fully simplified (simplification), and then either deleted (deletion) or oriented (orientation).

Although the above description captures the essence of the Downey, Sethi and Tarjan procedure, a few implementation details need to be pointed out. Firstly, the Downey, Sethi and Tarjan procedure keeps the original dag (represented by D1) intact, but changes signatures in a signature table. Hence, in the actual implementation described in [7], the (Col ∘ (Ded ∪ {ε}))* strategy is applied by: (i) deleting all signatures that will be changed, i.e., deleting all D-rules which can be collapsed; (ii) computing new signatures using the original copy of the signatures stored in the form of the dag D1; and, finally, (iii) inserting the newly computed signatures into the signature table and checking for possible deduction steps. Our description achieves the same end result, but with fewer inferences. Secondly, in the Downey, Sethi and Tarjan procedure, for efficiency, an equation c ≈ d is oriented to c → d if c occurs fewer times than d in the signature table. This is done to minimize the number of collapse steps. Additionally, indexing based on the use() tables is used to implement the specific strategy efficiently.

Let (K1, ∅, D1 ∪ C1) ⊢! (Kn, En, Dn ∪ Cn) be a derivation using the DST strategy. Then it is easily seen that the state (Kn, En, Dn ∪ Cn) is a final state, and hence the set Dn ∪ Cn is convergent, and also an abstract congruence closure. We remark here that Dn holds the information that is contained in the signature table, and Cn holds the information in the union-find structure. The set Cn is usually considered the output of the Downey, Sethi and Tarjan procedure.

Example 3. We illustrate the Downey–Sethi–Tarjan algorithm using the same set of equations E0 as in Example 1 of Section 2.
The start state is (K1, ∅, D1 ∪ C1), where K1 = {c0, ..., c4}, D1 = {a → c0, b → c1, fc0 → c2, fc2 → c3, fc1 → c4}, and C1 = {c0 → c1, c3 → c4}.

i  Consts Ki  Eqns Ei     Rules Ri                                              Transition
1  K1         ∅           D1 ∪ C1
2  K1         ∅           {a → c0, b → c1, fc1 → c2, fc2 → c3, fc1 → c4} ∪ C1   Col
3  K1         {c2 ≈ c4}   R2                                                    Ded
4  K1         ∅           R3 − {fc1 → c2} ∪ {c4 → c2}                           Ori

Note that c4 ≈ c2 was oriented in such a way that no further collapses were needed thereafter. (In the implementation, one could also make a copy of the original D1 rules and not change them, while keeping a separate copy as the signatures.)

The Nelson–Oppen Procedure

The Nelson–Oppen procedure is not exactly a completion procedure, and it does not generate a congruence closure in our sense. The initial state of the Nelson–Oppen procedure is given by the tuple (K1, E1, D1), where D1 is the input dag, and E1 represents an equivalence on the vertices of this dag. The sets K1 and D1 remain unchanged in the Nelson–Oppen procedure. In particular, the inference rule used for deduction differs from the conventional deduction rule:

NODeduction:
(K, E, D ∪ C) (K, E ∪ {c ≈ d}, D ∪ C)
if there exist two D-rules f(c1, ..., ck) → c and f(d1, ..., dk) → d in the set D, and ci →!C ∘ ←!C di for i = 1, ..., k. (This rule performs deduction modulo C-equations, i.e., we compute critical pairs between D-rules modulo the congruence induced by the C-equations. Hence, the Nelson–Oppen procedure can be described as an extended completion [6] (or completion modulo C-equations) method over an extended signature.)

The Nelson–Oppen procedure can now be (at a certain abstract level) represented as:

NO = (Sim* ∘ (Ori ∪ Del) ∘ NODed*)*

which is applied in the following sense: (i) select a C-equation c ≈ d from the E-component; (ii) simplify the terms c and d using simplification steps until they cannot be simplified any further; (iii) either delete or orient the simplified C-equation; (iv) apply the NODeduction rule until there are no more non-redundant applications of this rule; (v) if the E-component is empty, then stop; otherwise continue with step (i). Certain details, such as the fact that equations newly added to the set E are chosen before older ones in an application of orientation, and the indexing based on the use() table, are abstracted away in this description.

Using the Nelson–Oppen strategy, assume we get a derivation (K1, E1, D1) ⊢*NO (Kn, En, Dn ∪ Cn). One consequence of using a non-standard deduction rule, NODeduction, is that the resulting set Dn ∪ Cn = D1 ∪ Cn need not necessarily be convergent, although the rewrite relation Dn/Cn [6] is convergent.

Example 4. Using the same set E0 of equations, we illustrate the Nelson–Oppen procedure. The initial state is given by (K1, E1, D1), where K1 = {c0, c1, c2, c3, c4}; E1 = {c0 ≈ c1, c3 ≈ c4}; and D1 = {a → c0, b → c1, fc0 → c2, fc2 → c3, fc1 → c4}.

i  Constants Ki  Equations Ei         Rules Ri          Transition
1  K1            E1                   D1
2  K1            {c3 ≈ c4}            D1 ∪ {c0 → c1}    Ori
3  K1            {c2 ≈ c4, c3 ≈ c4}   R2                NODed
4  K1            {c3 ≈ c4}            R2 ∪ {c2 → c4}    Ori
5  K1            ∅                    R4 ∪ {c3 → c4}    Ori

Consider deciding the equality fa ≈ ffb. Even though fa ↔*E0 ffb, the terms fa and ffb have distinct normal forms with respect to R5. But terms in the original term universe have identical normal forms.
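The non-convergence just noted can be checked mechanically. The sketch below is our own illustration (not from the paper): it normalizes flattened terms by innermost rewriting with the final rule set R5 = D1 ∪ {c0 → c1, c2 → c4, c3 → c4}; the term fa rewrites to c4, while ffb rewrites to f(c4), so the two congruent terms have distinct normal forms.

```python
# Ground rewriting over flattened terms: constants are strings, and a term
# f(t) is a tuple ('f', t).  R maps each left-hand side to its right-hand
# side.  Illustrative sketch only, not code from the paper.

def nf(t, R):
    """Normal form of t under the rule map R (innermost rewriting)."""
    while True:
        if isinstance(t, tuple):
            # Normalize arguments first, then try to rewrite at the root.
            t = (t[0],) + tuple(nf(s, R) for s in t[1:])
        if t in R:
            t = R[t]
        else:
            return t

# R5 from Example 4: D1 together with the derived C-rules.
R5 = {'a': 'c0', 'b': 'c1',
      ('f', 'c0'): 'c2', ('f', 'c2'): 'c3', ('f', 'c1'): 'c4',
      'c0': 'c1', 'c2': 'c4', 'c3': 'c4'}

fa  = ('f', 'a')
ffb = ('f', ('f', 'b'))
assert nf(fa, R5) == 'c4'
assert nf(ffb, R5) == ('f', 'c4')   # distinct normal forms: R5 is not convergent
```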
4 Experimental Results
We have implemented five congruence closure algorithms, including those proposed by Nelson and Oppen (NO) [9], Downey, Sethi and Tarjan (DST) [7], and Shostak [11], and two algorithms based on completion: one with an indexing mechanism (IND) and the other without (COM). The implementations of the first three procedures are based on the representation of terms by directed acyclic graphs and the representation of equivalence classes by a union-find data structure. The completion procedure COM uses the following strategy: ((Sim* ∘ Ext*)* ∘ (Del ∪ Ori) ∘ (Com ∘ Col)* ∘ Ded*)*. The indexed variant IND uses a slightly different strategy: ((Sim* ∘ Ext*)* ∘ (Del ∪ Ori) ∘ (Col ∘ Com ∘ Ded)*)*. Indexing in the case of completion refers to the use of suitable data structures to efficiently identify which D-rules contain specified constants.

In a first set of experiments, we assume that the input is a set of equations presented as pairs of trees (representing terms). We added a preprocessing step to the NO and DST algorithms to convert the given input terms into a dag and to initialize the other required data structures. The other three algorithms interleave construction of a dag with deduction steps. The published descriptions of DST and NO do not address the construction of a dag. Our implementation maintains the list of terms that have been represented in the dag in a hash table and creates a new node for each term not yet represented.

We present below a sample of our results to illustrate some of the differences between the various algorithms. An input set of equations E can be classified based on: (i) the size of the input and the number of equations; (ii) the number of equivalence classes on terms and subterms of E; and (iii) the size of the use lists. The first set of examples is relatively simple and was developed by hand to highlight strengths and weaknesses of the various algorithms. Example (a), the equation set {f^2(a) ≈ a, f^10(a) ≈ f^15(b), b ≈ f^5(b), a ≈ f^3(a), f^5(b) ≈ b}, contains five equations that induce a single equivalence class. Example (b) is the same as (a), except that it contains five copies of all the equations. Example (c), the set {g(a, a, b) ≈ f(a, b), gabb ≈ fba, gaab ≈ gbaa, gbab ≈ gabb, gbba ≈ gbab, gaaa ≈ faa, a ≈ c, c ≈ d, d ≈ e, b ≈ c1, c1 ≈ d1, d1 ≈ e1}, requires slightly larger use lists. Finally, example (d), the set {g(f^i(a), h^10(b)) ≈ g(a, b) for i = 1, ..., 25, h^47(b) ≈ b, b ≈ h^29(b), h(b) ≈ c0, c0 ≈ c1, c1 ≈ c2, c2 ≈ c3, c3 ≈ c4, c4 ≈ a, a ≈ f(a)}, consists of equations that are oriented in the “wrong” way.

In Table 1 we compare the different algorithms by their total running time, including the preprocessing time. The times shown are averages over several runs on a Sun Ultra workstation under similar load conditions. The time was measured using the gettimeofday system call.
      Eqns  Vert  Class  DST     NO      SHO    COM     IND
Ex.a  5     27    1      1.286   1.640   0.281  0.606   0.409
Ex.b  20    27    1      2.912   2.772   0.794  1.858   0.901
Ex.c  12    20    6      1.255   0.733   0.515  0.325   0.323
Ex.d  34    105   2      10.556  22.488  7.275  12.077  4.416

Table 1. Total running time (in milliseconds) for Examples (a)–(d). Eqns refers to the number of equations; Vert to the number of vertices in the initial dag; and Class to the number of equivalence classes induced on the dag.
Table 2 contains similar comparisons for considerably larger examples consisting of randomly generated equations over a specified signature. Again we show total running time, including preprocessing time. (Times for COM are not included, as indexing is indispensable for larger examples.)

      Eqns   Vert   Σ0  Σ1  Σ2  d   Class  DST     NO        SHO     IND
Ex.1  10000  17604  2   0   2   3   7472   11.087  3.187     10.206  13.037
Ex.2  5000   4163   2   1   1   3   3      2.276   306.194   3.092   0.774
Ex.3  5000   7869   3   0   1   3   2745   2.439   1.357     3.521   3.989
Ex.4  6000   8885   3   0   1   3   9      3.551   1152.652  52.353  7.069
Ex.5  7000   9818   3   0   1   3   1      4.633   1682.815  47.755  5.471
Ex.6  5000   645    4   2   0   23  77     1.224   1.580     0.371   0.363
Ex.7  5000   1438   10  2   0   23  290    1.452   3.670     0.392   0.374

Table 2. Total running time (in seconds) for randomly generated equations. The columns Σi denote the number of function symbols of arity i in the signature, and d denotes the maximum term depth.

In Table 3 we show the time for computing a congruence closure assuming terms are already represented by a dag. In other words, we do not include the time it takes to create a dag. Note that we include no comparison with Shostak’s method, as the dynamic construction of a dag from given term equations is inherent in that procedure. However, a comparison with a suitable strategy of IND (in which all extension steps are applied before any deduction steps) is possible. We denote by IND* indexed completion based on a strategy that first constructs a dag. The examples are the same as in Table 2.

Several observations can be drawn from these results. First, the Nelson–Oppen procedure NO is competitive only when few deduction steps are performed and thus the number of equivalence classes is large. This is because it uses a non-standard deduction rule, which forces the procedure to unnecessarily repeat the same deductions many times over in a single execution. Not surprisingly, straightforward completion without indexing is also inefficient when
      DST    NO        IND*
Ex.1  0.919  0.296     0.076
Ex.2  0.309  319.112   1.971
Ex.3  0.241  0.166     0.030
Ex.4  0.776  1117.239  7.301
Ex.5  0.958  1614.961  9.770
Ex.6  0.026  0.781     0.060
Ex.7  0.048  2.470     0.176

Table 3. Running time (in seconds) when the input is in dag form.
many deduction steps are necessary. Indexing is of course a standard technique employed in all practical implementations of completion.

The running time of the DST procedure depends critically on the size of the hash table that contains the signatures of all vertices. If the hash table is large enough, potential deductions can be detected in (almost) constant time. If the hash table size is reduced to, say, 100, the running time increases by a factor of up to 50. A hash table with 1000 entries was sufficient for our examples (which contained fewer than 10000 vertices); larger tables did not improve the running times.

Indexed completion, DST and Shostak’s method are roughly comparable in performance, though Shostak’s algorithm has some drawbacks. For instance, equations are always oriented from left to right. In contrast, indexed completion always orients equations so as to minimize the number of applications of the collapse rule, an idea that is implicit in Downey, Sethi and Tarjan’s algorithm. Example (b) illustrates this fact. More crucially, the manipulation of the use lists in Shostak’s method is done in a convoluted manner, due to which redundant inferences may be performed while searching for the correct non-redundant ones. As a consequence, Shostak’s algorithm performs poorly on instances where use lists are large and deduction steps are many, such as Examples (c), 4 and 5.

Finally, we note that the indexing used in our implementation of completion is simple: with every constant c we associate a list of D-rules that contain c as a subterm. DST, on the other hand, maintains at least two different ways of indexing the signatures, which makes it more efficient when the examples are large and deduction steps are plentiful. On small examples, the overhead of maintaining the data structures dominates. This also suggests that the use of more sophisticated indexing schemes for indexed completion might improve its performance.
5 Related Work and Conclusion
Kapur [8] considered the problem of casting Shostak’s congruence closure algorithm [11] in the framework of ground completion on rewrite rules. Our work has been motivated by the goal of formalizing not just one, but several congruence closure algorithms, so as to be able to better compare and analyze them. (The description in Section 3 accurately reflects the logical aspects of Shostak’s algorithm, but does not provide details on data structures such as the use lists.)
We suggest that, abstractly, congruence closure can be defined as a ground convergent system, and that this definition does not restrict the applicability of congruence closure. The rule-based abstract description of the logical aspects of the various published congruence closure algorithms leads to a better understanding of these methods. It explains the observed behaviour of implementations and also allows one to identify weaknesses in specific algorithms. Additionally, using the abstract rules, we can also obtain an efficient implementation of a completion-based congruence closure procedure: one can effectively use the theory of redundancy to identify and eliminate unnecessary inferences, and moreover use knowledge about efficient indexing mechanisms.

The concept of an abstract congruence closure is also relevant for describing applications that use congruence closure algorithms. Some of these applications include efficient normalization by rewrite systems [4, 2], computing a complete set of rigid E-unifiers [13], and the combination of decision procedures [11]. The notion of an abstract congruence closure extends naturally to handle the presence of associative-commutative operators; this application is described in [3]. We believe that theories other than associativity and commutativity can also be incorporated into the inference rules for abstract congruence closure.

Congruence closure has also been used to construct a convergent set of ground rewrite rules in polynomial time by Snyder [12], among other works. Plaisted and Sattler-Klein [10] gave a direct method, not based on congruence closure, for completing a ground rewrite system in polynomial time. Hence our work completes the missing link, by showing that congruence closure is nothing but ground completion.
In fact, the process of transforming a set of rewrite rules over an extended signature (representing an abstract congruence closure) into a convergent set of rewrite rules over the original signature can easily be described by additional transition rules [3]. Our approach differs from that of Snyder, and can be used to obtain a more efficient implementation, partly because Snyder’s algorithm needs two passes of the congruence closure algorithm, whereas we would need to compute the abstract congruence closure just once.

The concept of an abstract congruence closure as detailed here, and the rules for its computation, open up new frontiers as well. For example, the transition rules presented in Section 2 can be naturally implemented in MAUDE [5]. Moreover, specific strategies, such as the ones presented in Section 3, can be encoded easily too. This might provide a basis for automatically verifying the correctness of congruence closure algorithms (personal communication with Manuel Clavel).

Acknowledgements. We would like to thank the anonymous reviewers for their helpful comments.
References

[1] L. Bachmair and N. Dershowitz. Equational inference, canonical proofs, and proof orderings. J. ACM, 41:236–276, 1994.
[2] L. Bachmair, C. Ramakrishnan, I. Ramakrishnan, and A. Tiwari. Normalization via rewrite closures. In P. Narendran and M. Rusinowitch, editors, 10th Int. Conf. on Rewriting Techniques and Applications, pages 190–204, 1999. LNCS 1631.
[3] L. Bachmair, I. Ramakrishnan, A. Tiwari, and L. Vigneron. Congruence closure modulo associativity and commutativity. In H. Kirchner and C. Ringeissen, editors, Frontiers of Combining Systems, 3rd Intl Workshop FroCoS 2000, pages 245–259, 2000. LNAI 1794.
[4] L. P. Chew. Normal forms in term rewriting systems. PhD thesis, Purdue University, 1981.
[5] M. Clavel et al. Maude: Specification and Programming in Rewriting Logic. http://maude.csl.sri.com/manual/, SRI International, Menlo Park, CA, 1999.
[6] N. Dershowitz and J. P. Jouannaud. Rewrite systems. In J. van Leeuwen, editor, Handbook of Theoretical Computer Science (Vol. B: Formal Models and Semantics), Amsterdam, 1990. North-Holland.
[7] P. J. Downey, R. Sethi, and R. E. Tarjan. Variations on the common subexpressions problem. J. ACM, 27(4):758–771, 1980.
[8] D. Kapur. Shostak’s congruence closure as completion. In H. Comon, editor, Proceedings of the 8th International Conference on Rewriting Techniques and Applications, pages 23–37, 1997. Vol. 1232 of Lecture Notes in Computer Science, Springer, Berlin.
[9] G. Nelson and D. Oppen. Fast decision procedures based on congruence closure. J. ACM, 27(2):356–364, Apr. 1980.
[10] D. Plaisted and A. Sattler-Klein. Proof lengths for equational completion. Information and Computation, 125:154–170, 1996.
[11] R. E. Shostak. Deciding combinations of theories. J. ACM, 21(7):583–585, 1984.
[12] W. Snyder. A fast algorithm for generating reduced ground rewriting systems from a set of ground equations. Journal of Symbolic Computation, 15(7), 1993.
[13] A. Tiwari, L. Bachmair, and H. Ruess. Rigid E-unification revisited. In D. McAllester, editor, 17th Intl Conf on Automated Deduction, CADE-17, 2000.
A Framework for Cooperating Decision Procedures Clark W. Barrett, David L. Dill, and Aaron Stump Stanford University, Stanford, CA 94305, USA, http://verify.stanford.edu
Abstract. We present a flexible framework for cooperating decision procedures. We describe the properties needed to ensure correctness and show how it can be applied to implement an efficient version of Nelson and Oppen’s algorithm for combining decision procedures. We also show how a Shostak style decision procedure can be implemented in the framework in such a way that it can be integrated with the Nelson–Oppen method.
1 Introduction
Decision procedures for fragments of first-order or higher-order logic are potentially of great interest because of their versatility. Many practical problems can be reduced to problems in some decidable theory. The availability of robust decision procedures that can solve these problems within reasonable time and memory could save a great deal of effort that would otherwise go into implementing special cases of these procedures. Indeed, there are several publicly distributed prototype implementations of decision procedures, such as Presburger arithmetic [15], and decidable combinations of quantifier-free first-order theories [2]. These and similar procedures have been used as components in applications, including interactive theorem provers [13,9], infinite-state model checkers [7,10,4], symbolic simulators [18], software specification checkers [14], and static program analyzers [8].

Nelson and Oppen [12] showed that satisfiability procedures for several theories that satisfy certain conditions can be combined into a single satisfiability procedure by propagating equalities. Many others have built upon this work, offering new proofs and applications [19,1]. Shostak [17,6,16] gave an alternative method for combining decision procedures. His method is applicable to a more restricted set of theories, but is reported to be more efficient and is the basis for combination methods found in SVC [2], PVS [13], and STeP [9,3]. An understanding of his algorithm has proven to be elusive. Both STeP and PVS have at least some ability to combine the methods of Nelson and Oppen and Shostak [5,3], but not much detail has been given, and the methods used in PVS have never been published. As a result, there is still significant confusion about the relationship between these two methods and how to implement them efficiently and correctly.

D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 79–98, 2000. © Springer-Verlag Berlin Heidelberg 2000
Our experience with SVC, a decision procedure for quantifier-free first-order logic based loosely on Shostak’s method for combining cooperating decision procedures, has been both positive and negative. On the one hand, it has been implemented and is efficient and reliable enough to enable new capabilities in our research group and at a surprisingly large number of other sites. However, efforts to extend and modify SVC have revealed unnecessary constraints in the underlying theory, as well as gaps in our understanding of it. This paper is an outcome of ongoing attempts to re-architect SVC to resolve these difficulties. We present an architecture for cooperating decision procedures that is simple yet flexible and show how the soundness, completeness, and termination of the combined decision procedure can be proved from a small list of clearly stated assumptions about the constituent theories. As an example of the application of this framework, we show how it can be used to implement and integrate the methods of Nelson and Oppen and Shostak. In so doing, we also describe an optimization applicable to the original Nelson and Oppen procedure and show how our framework simplifies the proof of correctness of Shostak’s method. Due to the scope of this paper and space restrictions, many of the proofs have been abbreviated or omitted.
2 Definitions and Notation
Expressions in the framework are represented using the logical symbols true, false, and ‘=’, an arbitrary number of variables, and non-logical symbols consisting of constants, and function and predicate symbols. We call true and false constant formulas. An atomic formula is either a constant formula, an equality between terms, or a predicate applied to terms. A literal is either an atomic formula or an equality between a non-constant atomic formula and false. Equality with false is used to represent negation. Formulas include atomic formulas, and are closed under the application of equality, conjunction and quantifiers. An expression is either a term or a formula. An expression is a leaf if it is a variable or constant. Otherwise, it is a compound expression, containing an operator applied to one or more children. A theory is a set of first-order sentences. For the purposes of this paper, we assume that all theories include the axioms of equality. The signature of a theory is the set of function, predicate, and constant symbols appearing in those sentences. The language of a signature Σ is the set of all expressions whose function, predicate, and constant symbols come from Σ. Given a theory T with signature Σ, if φ is a sentence in the language of Σ, then we write T |= φ to mean that every model of T is also a model of φ. For a given model, M , an interpretation is a function which assigns an element of the domain of M to each variable. If Γ is a set of formulas and φ is a formula, then we write Γ |= φ to mean that for every model and interpretation satisfying each formula in Γ , the same model and interpretation satisfy φ. Finally, if Φ is a set of formulas, then Γ |= Φ indicates that Γ |= φ for each φ in Φ. Expressions are represented using a directed acyclic graph (DAG) data structure such that any two expressions which are syntactically identical are uniquely
represented by a single DAG. The following operations on expressions are supported:

Op(e)   the operator of e (just e itself if e is a leaf).
e[i]    the ith child of e, where e[1] is the first child.
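The unique-representation property can be realized by hash-consing. The following Python sketch is our own illustration (names are hypothetical, not from the paper): constructing a syntactically identical expression returns the already-existing node, so syntactic identity ≡ becomes pointer identity.

```python
# Illustrative hash-consed expression DAG; not the authors' code.
# Children are themselves interned nodes, so a key built from the operator
# and the child nodes identifies an expression up to syntactic identity.

class Expr:
    _table = {}                      # (op, child nodes...) -> unique node

    def __new__(cls, op, *children):
        key = (op,) + children
        node = cls._table.get(key)
        if node is None:
            node = super().__new__(cls)
            node.op = op
            node.children = children
            node.find = None         # the find attribute, with ⊥ as None
            cls._table[key] = node
        return node

    def __getitem__(self, i):        # e[i]: the ith child, 1-indexed
        return self.children[i - 1]

def Op(e):
    """The operator of e (just e itself if e is a leaf)."""
    return e.op if e.children else e

x = Expr('x'); y = Expr('y')
e1 = Expr('=', x, y)
e2 = Expr('=', Expr('x'), Expr('y'))
assert e1 is e2                      # e1 ≡ e2: one shared DAG node
assert Op(e1) == '=' and e1[1] is x
```

With this representation, the ≡ test in the pseudocode below is a constant-time pointer comparison, while `e1 = e2` would be built as the new node `Expr('=', e1, e2)`.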
If e1 and e2 are expressions, then we write e1 ≡ e2 to indicate that e1 and e2 are the same expression (syntactically identical). In contrast, e1 = e2 is simply intended to represent the expression formed by applying the equality operator to e1 and e2. Expressions can be annotated with various attributes. If a is an attribute, e.a is the value of that attribute for expression e. Initially, e.a = ⊥ for each e and a, where ⊥ is a special undefined value. The following simple operations make use of an expression attribute called find to maintain equivalence classes of expressions. We assume that these are the only functions that reference the attribute. Note that when presenting pseudocode here and below, some required preconditions may be given next to the name and parameters of the function.

HasFind(a)
  RETURN a.find ≠ ⊥;

SetFind(a) { a.find = ⊥ }
  a.find := a;
Find(a) { HasFind(a) }
  IF a.find ≡ a THEN RETURN a;
  ELSE RETURN Find(a.find);

Union(a,b) { a.find ≡ a ∧ b.find ≡ b }
  a.find := b.find;

In some similar algorithms, e.find is initially set to e, rather than ⊥. The reason we do not do this is that it turns out to be convenient to use an initialized find attribute as a marker that the expression has been seen before. This not only simplifies the algorithm, but also makes it easier to describe certain invariants about expressions.

The find attribute induces a relation ∼ on expressions: a ∼ b if and only if HasFind(a) ∧ HasFind(b) ∧ [Find(a) ≡ Find(b)]. For the set of all expressions whose find attributes have been set, this relation is an equivalence relation. The find database, denoted by F, is defined as follows: a = b ∈ F iff a ∼ b. The following facts will be used below.

Find Database Monotonicity. If the preconditions for SetFind and Union are met, then if F is the find database at some previous time and F′ is the find database now, then F ⊆ F′.

Find Lemma. If the preconditions for Find, SetFind, and Union hold, then Find always terminates.
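The four operations translate directly into Python; the sketch below is ours (for illustration only), modeling ⊥ as None.

```python
# Illustrative translation of the paper's find operations; not the authors'
# code.  Preconditions from the pseudocode become assertions.

class E:
    """Minimal expression node carrying only the find attribute (⊥ = None)."""
    def __init__(self):
        self.find = None

def has_find(a):
    return a.find is not None

def set_find(a):                 # precondition: a.find = ⊥
    assert a.find is None
    a.find = a

def find(a):                     # precondition: HasFind(a)
    assert has_find(a)
    return a if a.find is a else find(a.find)

def union(a, b):                 # precondition: a.find ≡ a and b.find ≡ b
    assert a.find is a and b.find is b
    a.find = b.find

def related(a, b):               # the induced relation ~
    return has_find(a) and has_find(b) and find(a) is find(b)

x, y, z = E(), E(), E()
for e in (x, y, z):
    set_find(e)
union(x, y); union(y, z)
assert related(x, z) and find(x) is z
assert not related(x, E())       # uninitialized find: not related
```

Note how the ⊥-initialized find attribute doubles as the "seen before" marker the text describes: an expression with `find is None` has never passed through SetFind.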
3 The Basic Framework
As mentioned above, the purpose of the framework presented in this paper is to combine satisfiability procedures for several first-order theories into a satisfiability procedure for their union. Suppose that T1, ..., Tn are n first-order theories, with signatures Σ1, ..., Σn. Let T = ∪Ti and Σ = ∪Σi. The goal is to provide a framework for a satisfiability procedure which determines the satisfiability in T of a set of formulas in the language of Σ. Our approach follows that of Nelson and Oppen [12]. We assume that the intersection of any two signatures is empty and that each theory is stably-infinite. A theory T with signature Σ is called stably-infinite if any quantifier-free formula in the language of Σ is satisfiable in T only if it is satisfiable in an infinite model of T. We also assume that the theories are convex. A theory is convex if there is no conjunction of literals in the language of the theory which implies a disjunction of equalities without implying one of the equalities itself.

The interface to the framework from a client program consists of three methods: AddFormula, Satisfiable, and Simplify. Conceptually, AddFormula adds its argument (which must be a literal) to a set A, called the assumption history. Simplify transforms an expression into a new expression which is equivalent modulo T ∪ A, and Satisfiable returns false if and only if T ∪ A |= false. Since any quantifier-free formula can be converted to disjunctive normal form, after which each conjunction of literals can be checked separately for satisfiability, the restriction that the arguments to AddFormula be literals does not restrict the power of the framework.

The framework includes sets of functions which are parameterized by theory. For example, if f is such a function, we denote by fi the instance of f associated with theory Ti. If for some f and Ti we do not explicitly define the instance fi, it is assumed that a call to fi does nothing. It is convenient to be able to call these functions based on the theory associated with some expression e. Expressions are associated with theories as follows.
First, variables are partitioned among the theories arbitrarily. In some cases, one choice may be better than another, as discussed in Sec. 5.1 below. An expression in the language of Σ is associated with theory Ti if and only if it is a variable associated with Ti, its operator is a symbol in Σi, or it is an equality whose left side is associated with theory Ti. If an expression is associated with theory Ti, we call it an i-expression. We denote by T(e) the index i, where e is an i-expression.

Figure 1 shows pseudocode for the basic framework. An input formula is first simplified, because it might already be known or might reduce to something easier to handle. Simplification involves the recursive application of Find as well as certain rewrite rules. Assert calls Merge, which merges two ∼-equivalence classes. Merge first calls Setup, which ensures that the expressions are in an equivalence class. There are four places in the framework in which theory-specific functionality can be introduced. TheorySetup, TheoryRewrite and PropagateEqualities are theory-parameterized functions. Also, each expression has a notify attribute containing a set of pairs ⟨f, d⟩, where f is a function and d is some data. Whenever Merge is called on an expression a = b, the find attribute of a changes to b, and f(a = b, d) is called for each ⟨f, d⟩ ∈ a.notify. Typically, TheorySetup adds callback functions to the notify attribute of various expressions to guarantee that the theory’s satisfiability procedure will be notified if one of those
AddFormula(e) { e is a literal }
  Assert(e);
  REPEAT
    done := true;
    FOREACH theory Ti DO
      IF PropagateEqualitiesi() THEN done := false;
  UNTIL done;

Assert(e) { e is a literal; T ∪ A |= e }
  IF ¬Satisfiable() THEN RETURN;
  e’ := Simplify(e);
  IF e’ ≡ true THEN RETURN;
  IF Op(e’) ≠ ‘=’ THEN e’ := (e’ = true);
  Merge(e’);

Merge(e) { Op(e) = ‘=’; T ∪ A |= e; see text for others }
  Setup(e[1]); Setup(e[2]);
  IF e[1] and e[2] are terms THEN TheorySetupT(e)(e);
  Union(e[1], e[2]);
  FOREACH ⟨f,d⟩ ∈ e[1].notify DO f(e,d);

Setup(e)
  IF HasFind(e) THEN RETURN;
  FOREACH child c of e DO Setup(c);
  TheorySetupT(e)(e);
  SetFind(e);

Simplify(e)
  IF HasFind(e) THEN RETURN Find(e);
  Replace each child c of e with Simplify(c);
  RETURN Rewrite(e);

Rewrite(e)
  IF HasFind(e) THEN RETURN Find(e);
  IF Op(e) = ‘=’ THEN e’ := RewriteEquality(e);
  ELSE e’ := TheoryRewriteT(e)(e);
  IF e ≢ e’ THEN e’ := Rewrite(e’);
  RETURN e’;

RewriteEquality(e)
  IF e[1] ≡ e[2] THEN RETURN true;
  IF one child of e is true THEN RETURN the other child;
  IF e[1] ≡ false THEN RETURN (e[2] = e[1]);
  RETURN e;

Satisfiable()
  RETURN true ≁ false;

Fig. 1. Basic Framework
Clark W. Barrett, David L. Dill, and Aaron Stump
expressions is merged with another expression. Finally, before returning from AddFormula, each theory may notify the framework of additional equalities it has deduced, until each theory reports that there are no more equalities to propagate. Theory-specific code is distinguished from the framework code shown in Fig. 1 and from user code, which is the rest of the program. Theory-specific code may call functions in the framework, provided any required preconditions are met. Examples of theory-specific code for both Nelson–Oppen and Shostak style theories are given below, following a discussion of the abstract requirements which must be fulfilled by theory-specific code to ensure correctness.
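The Merge/Find/notify machinery just described can be sketched in a few lines of Python. This is our own illustrative reconstruction, not the authors' implementation; the names mirror the pseudocode of Fig. 1, and an equation a = b is represented simply as the pair (a, b).

```python
# Minimal sketch (not the authors' code) of the find/notify machinery:
# each expression carries a `find` pointer and a list of (callback, data)
# pairs that fire when its equivalence class is merged into another.

class Expr:
    def __init__(self, label):
        self.label = label
        self.find = None       # None until Setup/SetFind initializes it
        self.notify = []       # list of (callback, data) pairs

def set_find(e):
    assert e.find is None
    e.find = e                 # a fresh singleton equivalence class

def find(e):
    while e.find is not e:     # follow find pointers to the representative
        e = e.find
    return e

def merge(a, b):
    # Precondition (cf. Merge): a and b are find-reduced and distinct.
    assert find(a) is a and find(b) is b and a is not b
    a.find = b                 # the find attribute of a changes to b
    for callback, data in a.notify:
        callback((a, b), data) # notify interested theories of a = b

# Example: a theory registers interest in x and learns when x is merged.
x, y = Expr("x"), Expr("y")
set_find(x); set_find(y)
heard = []
x.notify.append((lambda eq, d: heard.append(eq), None))
merge(x, y)
print(find(x).label)           # representative of x's class is now y
```

A theory's TheorySetup would append such callbacks during Setup, exactly as the paper describes.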
4 Correctness of the Basic Framework
In order to prove correctness, we give a specification in terms of preconditions and postconditions and show that the framework meets the specification. Sometimes it is necessary to talk about the state of the program. Each run of a program is considered to be a sequence of states, where a state includes a value for each variable in the program and a location in the code.

4.1 Preconditions and Postconditions
The preconditions for each function in the framework except for Merge are shown in the pseudocode. In order to give the precondition for Merge, a few definitions are required. A path from an expression e to a sub-expression s of e is a sequence of expressions e0, e1, ..., en such that e0 ≡ e, ei+1 is a child of ei, and s is a child of en. A sub-expression s of an expression e is called a highest find-initialized sub-expression of e if HasFind(s) and there is a path from e to s such that for each expression e' on the path, ¬HasFind(e'). An expression e is called find-reduced if Find(s) ≡ s for each highest find-initialized sub-expression s of e. An expression e is called merge-acceptable if e is an equation and one of the following holds: e is a literal; e[1] is false or an atomic predicate and e[2] ≡ true; or e[1] ≡ true and e[2] ≡ false.

Merge Precondition. Whenever Merge(e) is called, the following must hold:
1. e is merge-acceptable,
2. e[1] and e[2] are find-reduced,
3. e[1] ≢ e[2], and
4. T ∪ A |= e.
In addition to the preconditions, the following postconditions must be satisfied by the parameterized functions.
TheoryRewrite Postcondition. After e' := TheoryRewrite(e) or e' := RewriteEquality(e) is executed, the following must hold:
1. F is unchanged by the call,
2. if e is a literal, then e' is a literal,
3. if e is find-reduced, then HasFind(e') or e' is find-reduced, and
4. T ∪ F |= e = e'.
TheorySetup Postcondition. After TheorySetup is executed, the find database is unchanged.

If all preconditions and postconditions hold for all functions called so far, we say that the program is in an uncorrupted state. Also, if true ≁ false, we say the program is in a consistent state. A few lemmas are required before proving that the preconditions and postconditions hold for the framework code.

Lemma 1. If the program is in an uncorrupted state and Union(a,b) has been called, then since that call there have been no calls to Union where either argument was a.

Proof. Once Union(a,b) is called, a.find ≢ a, and this remains true since a can never again be an argument to SetFind or Union. □

Lemma 2 (Equality Find Lemma). If e ≡ a = b and the program is in an uncorrupted and consistent state whose location is not between the call to SetFind(e) and the next call to Union, and HasFind(e), then a and b are terms and Find(e) ≡ false.

Proof. Suppose HasFind(e). Then Setup(e) was called. But by the definition of merge-acceptable, this can only happen if e[1] and e[2] are terms and Merge(e = false) was called, in which case Union(e,false) is called immediately afterwards. It is clear from the definition of merge-acceptable that Union is never called with first argument false unless the second argument is true. Thus, if true ≁ false, it follows from Lemma 1 that Find(e) ≡ false. □

Lemma 3 (Literal Find Lemma). If the program is in an uncorrupted state and e is a literal, then Find(e) is either e, true, or false.

Proof. From the previous lemma, it follows that if e is an equality, then Find(e) is either e, true, or false. A similar argument shows that the same is true for a predicate. □

Lemma 4 (Simplify Lemma). If the program is in an uncorrupted state after e' := Simplify(e) is executed, then the following are true:
1. F is unchanged by the call,
2. if e is a literal, then e' is a literal,
3. if e is a literal or term, then e' is find-reduced, and
4. T ∪ F |= e = e'.
We must prove the following theorem. A similar theorem is required every time we introduce theory-specific code.

Theorem 1. If the program is in an uncorrupted state located in the framework code, then the next state is also uncorrupted.

Proof.
Find Precondition: Find is called in two places by the framework. In each case, we check the precondition before calling it.
SetFind Precondition: SetFind(e) is only called from Setup(e), which returns if HasFind(e). Otherwise, Setup performs a depth-first traversal of the expression and calls SetFind. It follows from the TheorySetup Postcondition and the fact that expressions are acyclic that the precondition is satisfied.
Union Precondition: Union(a,b) is only called if Merge(a = b) is called first. By the Merge Precondition, a and b are find-reduced. It is easy to see that after Setup(a) and Setup(b) are called, Find(a) ≡ a and Find(b) ≡ b.
AddFormula Precondition: We assume that AddFormula is only called with literals.
Assert Precondition: Assert(e) is only called from AddFormula. In this case, e ∈ A, so it follows that T ∪ A |= e.
Merge Precondition: Merge(e') is called from Assert(e). We know that e is a literal, so by the Simplify Lemma, Simplify(e) is a literal and is find-reduced. It follows that e' is merge-acceptable and that e'[1] and e'[2] are find-reduced and unequal. From the Simplify Lemma, we can conclude that T ∪ F |= e = e'. It follows from the soundness property (described next) that T ∪ A |= e = e'. We know that T ∪ A |= e, so it follows that T ∪ A |= e'.
TheoryRewrite Postcondition: It is straightforward to check that each of the requirements holds for RewriteEquality. □

4.2 Soundness
The satisfiability procedure is sound if whenever the program state is inconsistent, T ∪ A |= false. Soundness depends on the invariance of the following property.

Soundness Property. T ∪ A |= F.

Lemma 5. If the program is in an uncorrupted state, then the soundness property holds.
Proof. Initially, the find database is empty. New formulas are added in two places. The first is in Setup, when SetFind is called. This preserves the soundness property since it only adds a reflexive formula to F. The other is in Merge(e), when Union(e[1],e[2]) is called. This adds the formula e to F, but we know that T ∪ A |= e by the Merge Precondition. It also results in the addition of any formulas which can be deduced using transitivity and symmetry, but these are also entailed because T includes equality. □

Theorem 2. If the program is in an uncorrupted state, then the satisfiability procedure is sound.

Proof. Suppose Satisfiable returns false. This means that true ∼ false. It follows from the previous lemma that T ∪ A |= true = false, so T ∪ A |= false. □

4.3 Completeness
The satisfiability procedure is complete if T ∪ A is satisfiable whenever the program is in a consistent state in the user code. We define the merge database, denoted M, as the set of all expressions e such that there has been a call to Merge(e). In order to describe the property which must hold for completeness, we first introduce a few definitions, adapted from [19]. Recall that an expression in the language of Σ is an i-expression if it is a variable associated with Ti, its operator is a symbol in Σi, or it is an equality and its left side is an i-expression. A sub-expression of e is called an i-leaf if it is a variable or a j-expression, with j ≠ i, and every expression along some path from e is an i-expression. An i-leaf is an i-alien if it is not an i-expression. An i-expression in which every i-leaf is a variable is called pure (or i-pure).

With each term t which is not a variable, we associate a fresh variable v(t). We define v(t) to be t when t is a variable. For some expression or set of expressions S, we define γi(S) by replacing all of the i-alien terms t in S by v(t)¹ so that every expression in γi(S) is i-pure. We denote by γ0(S) the set obtained from S by replacing all maximal terms (i.e. terms without any superterms) t by v(t). Let Θ be the set of all equations t = v(t), where t is a sub-term of some formula in M. It is easy to see that T ∪ M is satisfiable iff T ∪ M ∪ Θ is satisfiable. Let Mi = {e | e ∈ M ∧ e is an i-expression}. Define Θi similarly. Notice that M ∪ Θ is logically equivalent to ⋃i γi(Mi ∪ Θi), since each can be transformed into the other by repeated substitutions.

¹ Since expressions are DAGs, we must be careful about what is meant by replacing a sub-expression. The intended meaning here and throughout is that the expression is considered as a tree, and only occurrences of the term which qualify for replacement in the tree are replaced. This means that some occurrences may not be replaced at all, and the resulting DAG may look significantly different as a result.
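The map γi can be illustrated with a small, hypothetical purification pass over terms represented as nested tuples. The two signatures and all names below are invented for the example, and the footnote's caveat about DAGs is sidestepped by treating terms as trees.

```python
# Sketch of gamma_i: replace i-alien subterms by fresh variables v(t),
# so that each resulting expression is pure in theory i.
ARITH = {"+", "-", "0"}        # hypothetical signature of theory 1
EUF   = {"f", "g"}             # hypothetical signature of theory 2

fresh = {}                     # t -> v(t), shared across calls

def v(t):
    if isinstance(t, str):     # variables map to themselves
        return t
    if t not in fresh:
        fresh[t] = f"v{len(fresh)}"
    return fresh[t]

def purify(t, sig):
    """Return the i-pure version of term t for the theory with signature sig."""
    if isinstance(t, str):
        return t
    op, args = t[0], t[1:]
    if op in sig:              # i-term: recurse into the children
        return (op,) + tuple(purify(a, sig) for a in args)
    return v(t)                # i-alien subterm: abstract it as v(t)

# f(x + 0) seen from EUF: the arithmetic argument becomes a fresh variable.
t = ("f", ("+", "x", ("0",)))
print(purify(t, EUF))
```

Because v is a function of the term, the same alien subterm is always abstracted by the same variable, which is what makes the sets γi(Mi ∪ Θi) agree on shared terms.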
We define V, the set of shared terms, as the set of all terms t such that v(t) appears in at least two distinct sets γi(Mi ∪ Θi), 1 ≤ i ≤ n. Let E(V) = {a = b | a, b ∈ V ∧ a ∼ b}, and let D(V) = {a ≠ b | a, b ∈ V ∧ a ≁ b}. For a set of expressions S, an arrangement π(S) is a set such that for every two expressions a and b in S, exactly one of a = b or a ≠ b is in π(S). We denote by π(V) the arrangement E(V) ∪ D(V) of V determined by ∼. Now we can state the property required for completeness.

Completeness Property. If the program is in a consistent state in the user code, then Ti ∪ γi(Mi ∪ π(V)) is satisfiable.

The following lemmas are needed before proving completeness.

Lemma 6. If the program is in an uncorrupted state, then T ∪ M |= F.

Proof. Every formula in F is either in M or can be derived from formulas in M using reflexivity, symmetry, and transitivity of equality. □

Lemma 7. If the program is in an uncorrupted and consistent state in the user code, then T ∪ M |= A.

Proof. Suppose e ∈ A. Then we know that Assert(e) was called at some time previously. We can conclude by monotonicity of the find database that true ≁ false at the time of that call. Thus, e' := Simplify(e) was executed. By the Simplify Lemma, if F1 was the find database at the time of the call, T ∪ F1 |= e = e'. Now, if e' ≡ true, then T ∪ F1 |= e, and so by monotonicity and Lemma 6, T ∪ M |= e. Otherwise, Merge is called. Let x be the argument to Merge. It is easy to see that T ∪ F1 |= e = x. But x ∈ M, so T ∪ M |= x. It then follows easily by monotonicity and Lemma 6 that T ∪ M |= e. □

The following theorem is from [19].

Theorem 3. Let T1 and T2 be two stably-infinite, signature-disjoint theories and let φ1 be a set of formulas in the language of T1 and φ2 a set of formulas in the language of T2. Let v be the set of their shared variables and let π(v) be an arrangement of v. If φi ∧ π(v) is satisfiable in Ti for i = 1, 2, then φ1 ∧ φ2 is satisfiable in T1 ∪ T2.

Theorem 4. If the procedure always maintains an uncorrupted state and the completeness property holds for each theory, then the procedure is complete.

Proof. Suppose that for a consistent state in the user code, Ti ∪ γi(Mi ∪ π(V)) is satisfiable for each i. This implies that Ti ∪ γi(Mi ∪ Θi ∪ π(V)) is satisfiable (since each equation in Θi simply defines a new variable), which is logically equivalent (by applying substitutions from Θi) to Ti ∪ γi(Mi ∪ Θi) ∪ γ0(π(V)). Now, each set γi(Mi ∪ Θi) is a set of formulas in the language of Ti, and γ0(π(V)) is an arrangement of the variables shared among these sets, so we can conclude
by repeated application of Theorem 3 that ⋃i γi(Mi ∪ Θi) is satisfiable in T. But ⋃i γi(Mi ∪ Θi) is equivalent to M ∪ Θ, which is satisfiable in T iff T ∪ M is satisfiable. Finally, by Lemma 7, T ∪ M |= A. Thus we can conclude that T ∪ A is satisfiable. □

4.4 Termination
We must show that each function in the framework terminates. The following requirements guarantee this.

Termination Requirements.
1. The preconditions for Find, SetFind, and Union always hold.
2. For each i-expression e, TheoryRewritei(e) terminates.
3. If s is a sequence of expressions in which the next member of the sequence e' is formed from the previous member e by calling TheoryRewritei(e), then beyond some element of the sequence, all the expressions are identical.
4. For each i-expression e, TheorySetupi(e) terminates.
5. After Union(a,b) is called,
   (a) no new entries are added to a.notify, and
   (b) each call to each function in a.notify terminates.
6. For each theory Ti, PropagateEqualitiesi terminates, and after calling PropagateEqualitiesi some finite number of times, it will always return false.
Theorem 5. If the termination requirements hold, then each function in the framework terminates.

Proof. The first condition guarantees that Find terminates, from which it follows that Satisfiable terminates. The next two ensure that Rewrite terminates. It then follows easily that Simplify must terminate. The next few conditions are sufficient to ensure that Setup and Merge terminate, from which it follows that Assert terminates. This, together with the last condition, allows us to conclude that AddFormula terminates. □

It is not hard to see that without any theory-specific code, these requirements hold.
5 Examples Using the Framework
In this section we will give two examples to show how the framework can accommodate different kinds of theory-specific code.
5.1 Nelson–Oppen Theories
A Nelson–Oppen style satisfiability procedure for a theory Ti must be able to determine the satisfiability of a set of formulas in the language of Σi as well as which equalities between variables are entailed by that set of formulas [12]. We present a method for integrating such theories which is flexible and efficient. Suppose we have a Nelson–Oppen style satisfiability procedure which treats alien terms as variables and provides the following methods:

AddFormulai: adds a new formula to the set Ai.
Satisfiablei: true iff Ti ∪ γi(Ai) is satisfiable.
AddTermToPropagatei: adds a term to the set ∆i.
GetEqualitiesi: returns the largest set of equalities Ei between terms in ∆i such that Ti ∪ γi(Ai) |= γi(Ei).
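The four methods above amount to an interface. The sketch below (our invention, not the paper's code) phrases it as an abstract Python class and instantiates it with a toy theory of uninterpreted equalities that closes its asserted equations under symmetry and transitivity.

```python
from abc import ABC, abstractmethod

class NelsonOppenTheory(ABC):
    """Sketch of the interface a Nelson-Oppen style procedure must offer."""

    @abstractmethod
    def add_formula(self, e): ...           # AddFormula_i: extend A_i
    @abstractmethod
    def satisfiable(self): ...              # Satisfiable_i
    @abstractmethod
    def add_term_to_propagate(self, t): ... # AddTermToPropagate_i: extend Delta_i
    @abstractmethod
    def get_equalities(self): ...           # GetEqualities_i: entailed equalities

class ToyEqualityTheory(NelsonOppenTheory):
    # Trivial instance: a formula is a pair (a, b) meaning a = b.
    def __init__(self):
        self.eqs, self.delta = set(), set()

    def add_formula(self, e):
        self.eqs.add(frozenset(e))

    def satisfiable(self):
        return True                         # pure equalities are always satisfiable

    def add_term_to_propagate(self, t):
        self.delta.add(t)

    def get_equalities(self):
        # Equalities between Delta_i terms entailed by the asserted equations,
        # computed via a small union-find over Delta_i.
        parent = {t: t for t in self.delta}
        def root(t):
            while parent[t] != t:
                t = parent[t]
            return t
        for eq in self.eqs:
            a, b = tuple(eq)
            if a in parent and b in parent:
                parent[root(a)] = root(b)
        return {(a, b) for a in self.delta for b in self.delta
                if a < b and root(a) == root(b)}

t = ToyEqualityTheory()
for term in ("x", "y", "z"):
    t.add_term_to_propagate(term)
t.add_formula(("x", "y")); t.add_formula(("y", "z"))
print(sorted(t.get_equalities()))
```

The framework's PropagateEqualities would call get_equalities, filter by IsShared, and Assert anything new, exactly as in Fig. 2.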
A new expression attribute, shared, is used to keep track of which terms are relevant to more than one theory. Each theory is given an index i, and the shared attribute is set to i if the term is used by theory i. If more than one theory uses the term, the shared attribute is set to 0. This is encapsulated in the SetShared and IsShared methods shown below.

SetShared(e,i)
  IF e.shared = ⊥ THEN e.shared := i;
  ELSE IF e.shared ≠ i THEN e.shared := 0;
  AddTermToPropagatei(e);

IsShared(e)
  RETURN e.shared = 0;
Figure 2 shows the theory-specific code needed to add a theory Ti with a satisfiability procedure as described above. We will refer to a theory implemented in this way as a Nelson–Oppen theory. Each i-expression is passed to TheorySetupi. TheorySetupi marks these terms and their alien children as used by Ti. It also ensures that Notifyi will be called if any of these expressions are merged with something else. When Notifyi is called, the formula is passed along to the satisfiability procedure for Ti. These steps correspond to the decomposition into pure formulas in other implementations (but without the introduction of additional variables). PropagateEqualitiesi asserts any equations between shared terms that have been deduced by the satisfiability procedure for Ti. This corresponds to the equality propagation step in other methods. It is sufficient to propagate equalities between shared variables, a fact also noted in [19].

We also introduce a new optimization. Not all theories need to know about all equalities between shared terms. A theory is only notified of an equality if the left side of that equality is a term that it has seen before. In order to guarantee that this results in fewer propagations, we have to ensure that whenever an equality between two terms is in M, if one of the terms is not shared, then the left term is not shared. We can easily do this by modifying RewriteEquality to put non-shared terms on the left. However, this is not necessary for correctness, a fact which allows the integration of Shostak-style satisfiability procedures, which require a different implementation of RewriteEquality, as described in Sec. 5.2 below.
TheorySetupi(e)
  FOREACH i-alien child a of e DO BEGIN
    a.notify := a.notify ∪ {⟨Notifyi, ∅⟩};
    SetShared(a,i);
  END
  e.notify := e.notify ∪ {⟨Notifyi, ∅⟩};
  IF e is a term THEN SetShared(e,i);

TheoryRewritei(e)
  RETURN e;

PropagateEqualitiesi()
  propagate := false;
  IF Satisfiable() THEN BEGIN
    IF ¬Satisfiablei() THEN Merge(true = false);
    ELSE FOREACH x = y ∈ GetEqualitiesi DO
      IF IsShared(x) AND IsShared(y) AND x ≁ y THEN BEGIN
        propagate := true;
        Assert(x = y);
      END
  END
  RETURN propagate;

Notifyi(e)
  IF e[1] is an i-alien term THEN BEGIN
    x := Find(e[2]);
    x.notify := x.notify ∪ {⟨Notifyi, ∅⟩};
    e := (e[1] = x);
  END
  AddFormulai(e);
Fig. 2. Code for implementing a Nelson–Oppen theory Ti.
A final optimization is to associate variables with theories in such a way as to avoid causing terms to be shared unnecessarily. For example, if x = t is a formula in M, x is a variable, and t is an i-term, it is desirable for x to be an i-term as well (otherwise, t immediately becomes a shared term). In our implementation, expressions are type-checked and each type is associated with a theory. Thus, we can easily guarantee this by associating x with the theory associated with its type.
Correctness. The proof of the following theorem is similar to that given for the framework code and is omitted.

Theorem 6. If the program is in an uncorrupted state located in the theory-specific code for a Nelson–Oppen theory, then the next state is also uncorrupted.
To show that the completeness property holds, we must show that if the program is in a consistent state in the user code, then Ti ∪ γi(Mi ∪ π(V)) is satisfiable. This requires the following invariant to hold for each theory Ti.

Shared Term Requirement. There has been a call to SetShared(e,i) if v(e) appears in γi(Mi ∪ Θi).

Lemma 8. If Ti is a Nelson–Oppen theory, then the shared term requirement holds for Ti.

Corollary 1. If Ti is a Nelson–Oppen theory, and v(t) appears in γi(Mi ∪ Θi), then t ∈ ∆i.

Let ∆′i = ∆i ∪ {x | x is a term and t = x ∈ Ai for some term t}.

Lemma 9. If Ti is a Nelson–Oppen theory and the program is in an uncorrupted state in the user code and x = y ∈ M, where x ∈ ∆′i, then x = z ∈ Ai, where z ≡ Find(y) at some previous time.

Proof. Suppose x ∈ ∆i. Then SetShared was called. It is easy to see from the code that at the time it was called, Notifyi was added to x.notify. If, on the other hand, x ∉ ∆i, then t = x ∈ Ai for some t which is not an i-term. But then, when t = x was added to Ai, Notifyi was added to x.notify. In each case, Notifyi(x = y) will be called after Merge(x = y) is called, so that x = Find(y) is added to Ai. □

Lemma 10. If Ti is a Nelson–Oppen theory and the program is in an uncorrupted state in the user code and x ∼ y, where x, y ∈ ∆′i, then Ti ∪ γi(Ai) |= γi(x = y).

Proof. We can show by the previous lemma that since Find(x) ≡ Find(y), there is a chain of equalities in Ai linking x to y. □

Let Di = {a ≠ b | a, b ∈ (∆i ∩ V)}, and let D′i = {a ≠ b | a, b ∈ (∆′i ∩ V)}.

Lemma 11. If Ti is a Nelson–Oppen theory and the program is in an uncorrupted and consistent state in the user code, then Ti ∪ γi(Ai ∪ Di) is satisfiable.

Proof. No single disequality x ≠ y ∈ Di can be inconsistent, because if it were, that would mean Ti ∪ γi(Ai) |= γi(x = y). But if this is the case, since PropagateEqualitiesi terminated, it must be the case that x ∼ y.
Since no single equality x = y is entailed, it follows from the convexity of Ti that no disjunction of equalities can be entailed. □

Lemma 12. If Ti is a Nelson–Oppen theory and the program is in an uncorrupted and consistent state in the user code, then Ti ∪ γi(Ai ∪ D′i) is satisfiable.
Proof. If t′1 ≠ t′2 ∈ D′i, we can find (by the definition of ∆′i) some t1 and t2 such that t1 ≠ t2 ∈ Di and Ai |= (t1 = t′1 ∧ t2 = t′2). The result follows by the previous lemma. □

Theorem 7. If each theory satisfies the shared term requirement and the program is in an uncorrupted and consistent state in the user code, then if Ti is a Nelson–Oppen theory, the completeness property holds for Ti.

Proof. It is not hard to show that if v(x) ∈ γi(Ai ∪ Θi), then x ∈ ∆′i. It then follows that an interpretation satisfying Ti ∪ γi(Ai ∪ D′i) can be modified to also satisfy γi(π(V)). □

Termination. The only termination condition that is non-trivial is the last one. The following requirement is sufficient to fulfill this condition.

Nelson–Oppen Termination Requirement. Suppose that before a call to Assert from PropagateEqualitiesi, n is the number of equivalence classes in ∼ containing at least one term t ∈ V. Then either the state following the call to Assert is inconsistent, or, if m is the number of equivalence classes in ∼ containing at least one term t ∈ V after returning from Assert, m < n.

If every theory is a Nelson–Oppen theory, it is not hard to see that this requirement holds. This is because each call to Assert merges the equivalence classes of two shared variables without creating any new ones.

5.2 Adding Shostak Theories
Suppose we have a theory Ti with no predicate symbols which provides two functions, σ and ω, which we refer to as the canonizer and solver respectively. Note that if we have more than one such theory, we can often combine the canonizers and solvers to form a canonizer and solver for the combined theory, as described in [17].² The functions σ and ω have the following properties.

σ is a canonizer for Ti if
1. Ti |= γi(a = b) iff σ(a) ≡ σ(b),
2. σ(σ(t)) ≡ σ(t) for all terms t,
3. γi(σ(t)) contains only variables occurring in γi(t),
4. σ(t) ≡ t if t is a variable or not an i-term, and
5. if σ(t) is a compound i-term, then σ(x) ≡ x for each child x of σ(t).
ω is a solver³ for Ti if

² Although it has been claimed that solvers can always be combined to form a solver for the combined theory [6,17], this is not always possible, as pointed out in [11].
³ Shostak allows the solved form to be more general. To simplify the presentation, we assume the solver returns a single, logically equivalent, equation.
1. If Ti |= γi(x ≠ y), then ω(x = y) ≡ false.
2. Otherwise, ω(x = y) ≡ a = b, where a and b are terms,
3. Ti |= (x = y) ↔ (a = b),
4. γi(a) is a variable and does not appear in γi(b),
5. neither γi(a) nor γi(b) contain variables not occurring in γi(x = y), and
6. ω(a = b) ≡ a = b and σ(b) ≡ b.
We call such a theory a Shostak theory. The code in Fig. 3 shows the additional code needed to integrate a Shostak theory.
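Linear rational arithmetic is a standard example of a theory admitting such a σ and ω. The minimal sketch below is ours, not the paper's: a term is represented as a map from variables to rational coefficients (with the constant under the key 1), σ canonizes by dropping zero coefficients, and ω solves an equation by isolating one variable on the left.

```python
from fractions import Fraction

# Sketch of a Shostak canonizer/solver for linear rational arithmetic.
# A term is a dict {var: coeff} with the constant stored under the key 1.

def sigma(t):
    """Canonizer: drop zero coefficients so equal terms are identical dicts."""
    return {k: c for k, c in t.items() if c != 0}

def sub(x, y):
    """Canonical form of the term x - y."""
    d = dict(x)
    for k, c in y.items():
        d[k] = d.get(k, Fraction(0)) - c
    return sigma(d)

def omega(x, y):
    """Solver: turn x = y into (v, rest) meaning v = rest, or True/False."""
    d = sub(x, y)              # x - y = 0 in canonical form
    if not d:
        return True            # trivially valid equation
    vars_ = [k for k in d if k != 1]
    if not vars_:
        return False           # nonzero constant = 0: unsatisfiable
    v = vars_[0]
    c = d.pop(v)
    rest = sigma({k: -coef / c for k, coef in d.items()})
    return (v, rest)           # v is a variable not occurring in rest

# Solve 2x + 3 = y, i.e. isolate x on the left.
lhs = {"x": Fraction(2), 1: Fraction(3)}
rhs = {"y": Fraction(1)}
v, rest = omega(lhs, rhs)
print(v, rest)
```

One can check on examples that this solved form behaves as properties 4 and 6 require: the isolated variable does not occur in the right-hand side, and re-solving the returned equation leaves it unchanged.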
RewriteEquality(e)
  IF e[1] ≡ e[2] THEN RETURN true;
  IF one child of e is true THEN RETURN the other child;
  IF e[1] ≡ false THEN RETURN (e[2] = e[1]);
  IF e[1] is a term THEN RETURN ω(e);
  RETURN e;

TheorySetupi(e)
  FOREACH a which is an i-leaf in e DO BEGIN
    IF Op(e) = '=' THEN a.notify := a.notify ∪ {⟨UpdateDisequality,e⟩};
    ELSE a.notify := a.notify ∪ {⟨UpdateShostak,e⟩};
    SetShared(a,i);
  END
  IF e is a term THEN SetShared(e,i);

TheoryRewritei(e)
  RETURN σ(e);

PropagateEqualitiesi()
  RETURN false;

UpdateDisequality(x,y)
  IF ¬Satisfiable() ∨ ¬HasFind(y) THEN RETURN;
  Replace each i-leaf c in y with Find(c);
  y' := Rewrite(y);
  IF y' ≢ false THEN Merge(y' = false);

UpdateShostak(x,y)
  IF Find(y) ≡ y THEN BEGIN
    Replace each i-leaf c in y with Find(c) to get y';
    Merge(y = σ(y'));
  END
Fig. 3. Code for implementing a Shostak theory Ti.
Correctness. It is not hard to show that this code satisfies the preconditions and requirements of the framework.

Theorem 8. If the program is in an uncorrupted state located in the theory-specific code for a Shostak theory, then the next state is also uncorrupted.

Included in the Shostak code are the calls to SetShared necessary to allow this theory to be integrated with Nelson–Oppen theories. We have not included the code typically included for handling uninterpreted functions. This is because our approach allows us to consider uninterpreted functions as belonging to a separate Nelson–Oppen theory. Though we do not show how in this paper, any simple congruence closure algorithm can be integrated as a Nelson–Oppen theory. Omitting details related to uninterpreted functions simplifies the presentation and proof. We have also included code for handling disequalities, which Shostak's original procedure does not handle directly. We will give some intuition for how this works after making a few definitions.

Let ∆i = {t | t is an i-leaf in some expression e ∈ M}. Let E = {a = b | a ∈ ∆i ∧ b ≡ Find(a)}. For an expression e, define τ(e) to be the expression obtained from e by replacing each i-leaf x in e by Find(x). Shostak's method works by ensuring that Find(t) ≡ σ(τ(t)). This, together with the properties of the solver, ensures that the set E is equivalent to a substitution, meaning it is easily satisfiable. These are the key ideas of the completeness argument.

Lemma 13. If the program is in an uncorrupted and consistent state which is not inside of a call to Merge, then for each term t such that HasFind(t), Find(t) ≡ σ(τ(t)). Also, if Find(t) ≡ t, then τ(t) ≡ t.

Proof. When SetFind is first called on an expression e, the Merge preconditions together with the solver and canonizer guarantee that e ≡ σ(τ(e)). Then, whenever an i-leaf is merged, UpdateShostak is called to preserve the invariant. □

Lemma 14.
If the program is in an uncorrupted and consistent state in the user code, and Ti is a Shostak theory, then Ti ∪ γi(E) is satisfiable.

Proof. Let M be a model of Ti, and let x ∈ ∆i. If Find(x) ≡ x, then assign v(x) an arbitrary value. Otherwise, assign v(x) the same value as γi(Find(x)). By the above lemma, this assignment satisfies γi(E). □

Lemma 15. If the program is in an uncorrupted and consistent state in the user code and Ti is a Shostak theory, then Ti ∪ γi(Mi) is satisfiable.

Proof. Suppose e ∈ Mi. Clearly e[1] ∼ e[2]. If e is an equality between terms, it follows from Lemma 13 that σ(τ(e[1])) ≡ σ(τ(e[2])). By properties of σ, it follows that Ti |= τ(e[1]) = τ(e[2]). Then, by the definition of E, it follows that Ti ∪ E |= e[1] = e[2] and hence Ti ∪ γi(E) |= γi(e[1] = e[2]). Suppose, on the other hand, that e is the literal (x = y) = false, and suppose that Ti ∪ γi(E) |= γi(x = y). The same argument as above in reverse shows that
Find(x) ≡ Find(y). The UpdateDisequality code ensures that in this case true will be merged with false, contradicting the assumption that the state is consistent. Thus, Ti ∪ γi(E) ⊭ γi(x = y). Since Ti is convex, it follows that Ti ∪ γi(E ∪ Mi) is satisfiable. □

Theorem 9. If the program is in an uncorrupted and consistent state in the user code and Ti is a Shostak theory, then the completeness property holds for Ti.

Proof. The above lemma shows that Ti ∪ γi(E ∪ Mi) is satisfiable. Suppose a and b are shared terms. If a ∼ b, a similar argument to that given above shows that Ti ∪ γi(E) |= γi(a = b). If, on the other hand, a ≁ b, it follows easily that Ti ∪ γi(E) ⊭ γi(a = b). Since each equality in γi(Mi ∪ π(V)) is entailed by Ti ∪ γi(E) and none of the disequalities are, it follows by convexity that Ti ∪ γi(Mi ∪ π(V)) is satisfiable. □

Termination. The idempotency of the solver and canonizer is sufficient to guarantee termination of rewrites. For each expression e, it is not hard to show that something is added to e.notify only if Find(e) ≡ e. Consider the functions called by Merge, which are UpdateDisequality and UpdateShostak. Both of them call Merge recursively. Each of them reduces the value of some measure of the program state. For UpdateDisequality, the measure is the number of equality expressions e such that HasFind(e) and ω(τ(e)) ≢ false. For UpdateShostak, the measure is the number of expressions e such that Find(e) ≡ e and Find(c) ≢ c for some i-leaf c of e. With some effort, it can be verified that none of the functions in the theory-specific code presented thus far which can be called after Union increase either of these measures. The other termination conditions are trivial.

Finally, in order to combine Shostak and Nelson–Oppen theories, the Shostak code must not break the Nelson–Oppen Termination Requirement. Any new call to Merge has the potential to "create" new shared terms by causing a new term to show up in Mi for some i.
A careful analysis shows that if Assert(x = y) is called from the Nelson–Oppen code, any resulting call to Merge does not increase the number of equivalence classes containing shared terms. Lemma 13 ensures that by the time Assert has returned, x ∼ y, so the number of equivalence classes containing shared terms decreases as required.
6 Conclusion
We have presented a framework for combining decision procedures for disjoint first-order theories, and shown how it can be used to implement and integrate Nelson–Oppen and Shostak style decision procedures. This work has shed considerable light on the individual methods as well as on what is required to combine them. We discovered that a more restricted set of equalities can be propagated in the Nelson–Oppen framework without losing
completeness. Also, by separating the uninterpreted functions from the Shostak method, the code is simpler and easier to verify. We are working on an extension of the framework which would handle non-convex theories and more general Shostak solvers. In future work, we hope also to be able to relax the requirements that the theories be disjoint and stably-infinite. We also plan to complete and distribute a new version of SVC based on these results.
Acknowledgments We would like to thank Natarajan Shankar at SRI for helpful discussions and insight into Shostak’s decision procedure. This work was partially supported by the National Science Foundation Grant MIPS-9806889 and NASA contract NASI-98139. The third author is supported by a National Science Foundation Graduate Fellowship.
References

1. F. Baader and C. Tinelli. A new approach for combining decision procedures for the word problem, and its connection to the Nelson–Oppen combination method. In W. McCune, editor, 14th International Conference on Automated Deduction, Lecture Notes in Computer Science, pages 19–33. Springer-Verlag, 1997.
2. Clark Barrett, David Dill, and Jeremy Levitt. Validity checking for combinations of theories with equality. In M. Srivas and A. Camilleri, editors, Formal Methods in Computer-Aided Design, volume 1166 of Lecture Notes in Computer Science, pages 187–201. Springer-Verlag, 1996.
3. N. Bjorner. Integrating Decision Procedures for Temporal Verification. PhD thesis, Stanford University, 1999.
4. Michael A. Colon and Tomas E. Uribe. Generating finite-state abstractions of reactive systems using decision procedures. In International Conference on Computer-Aided Verification, volume 1427 of Lecture Notes in Computer Science, pages 293–304. Springer-Verlag, 1998.
5. D. Cyrluk. Private communication. 1999.
6. D. Cyrluk, P. Lincoln, and N. Shankar. On Shostak's decision procedure for combinations of theories. In M. McRobbie and J. Slaney, editors, 13th International Conference on Automated Deduction, volume 1104 of Lecture Notes in Computer Science, pages 463–477. Springer-Verlag, 1996.
7. Satyaki Das, David L. Dill, and Seungjoon Park. Experience with predicate abstraction. In 11th International Conference on Computer-Aided Verification, pages 160–172. Springer-Verlag, July 1999. Trento, Italy.
8. David L. Detlefs, K. Rustan M. Leino, Greg Nelson, and James B. Saxe. Extended static checking. Technical Report 159, Compaq SRC, 1998.
9. Z. Manna et al. STeP: Deductive-algorithmic verification of reactive and real-time systems. In 8th International Conference on Computer-Aided Verification, volume 1102 of Lecture Notes in Computer Science, pages 415–418. Springer-Verlag, 1996.
98
Clark W. Barrett, David L. Dill, and Aaron Stump
Modular Reasoning in Isabelle
Florian Kammüller
GMD First, 12489 Berlin, Germany
[email protected]
Abstract. The concept of locales for Isabelle enables local definitions and assumptions for interactive mechanical proofs. Furthermore, dependent types are constructed in Isabelle/HOL for a first class representation of structure. These two concepts are introduced briefly. Although each of them has proved useful in itself, their real power lies in combination. This paper illustrates by examples from abstract algebra how this combination works, and argues that it enables modular reasoning.
1 Motivation
Modules for theorem provers are a means for organizing the theories of applications. Generic interactive theorem provers like PVS [OSRSC98], IMPS [FGT93], and HOL [GM93] define their applications as object logics. Modules are used to maintain and structure these object logics. Being a classical software engineering concept for re-usability and structuring, modules are the obvious method for organizing the formalizations of theorem provers. Apart from just organizing big theories, advanced modular features — like parameterization and instantiation — make modules suitable for representing (mathematical) structure logically. For example, the abstract algebraic structure of groups is represented by a module in the following fashion (cf. [OSRSC98]).

Module Group [G: TYPE, o : G -> G -> G, inv: G -> G, e: G]
∀ x: G. x o e = x
∀ x: G. x o (inv x) = e
∀ x, y, z: G. x o (y o z) = (x o y) o z
D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 99–114, 2000. © Springer-Verlag Berlin Heidelberg 2000

The abstract character of groups is modeled in systems like PVS by using (generic) sorts or explicit parameters to model the contents of the group. Reasoning about properties of group elements and the operation ◦ is possible inside such a theory. The parameterization enables the instantiation of the group theory to actual groups. The abstractly derived results can thus be reused by an instantiation. This is also what we think of as modular reasoning: reasoning where the abstraction and structuring of modules become part of the proof process. However, an adequate way of reasoning is not possible in this setting. For example, we must consider the class of all groups to enable reasoning about general properties which hold, say, only for finite groups. This class of all groups cannot be defined here because the theory level is separate from the reasoning level. There are more examples, like quotients of groups forming groups again; the
problem is always the same. Since modules are not first class citizens, we cannot use the structure defined by a module in any formula. Hence, formalizations using modules to represent mathematical structure are not adequate; we can only reason about a restricted set of aspects of the (mathematical) world. In rich type theories there is the concept of dependent types. Systems like Coq [D+93] and LEGO [LP92] implement such type theories. If the hierarchies of the type theory are rich enough, then dependent types are first class citizens. Usually, type theories do not have advanced module concepts as they are known in interactive theorem provers like PVS and IMPS. However, it is well known that dependent types may be used to represent modules (e.g. [Mac86]). We verified by case studies (e.g. [KP99]) that a module system where the modules are first class citizens is actually necessary for an adequate representation of (mathematical) structures in the logic of a theorem prover. Yet, it turns out that we sometimes need just some form of local scope and not a first class representation. We need locality, i.e. the possibility to declare concepts whose scope is limited or temporary. Locality and adequacy are separate concerns that do not generally coincide. We propose to use separate devices, i.e. locales [KWP99] and dependent types [Kam99b]. We have designed and implemented them for Isabelle. In this paper, we show that in combination they realize modular reasoning. In Section 2.1 we briefly introduce the concept of locales for Isabelle. The way we represent dependent types in Isabelle/HOL is sketched in Section 2.2. The introduction to these topics has been presented elsewhere and goes only as far as is needed for the understanding of the following. In Section 3 we present various case studies. They illustrate the use of locales and dependent types and validate that the combination of these concepts enables modular reasoning.
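The first-class reading of group structures argued for above can be illustrated outside any prover. The following Python sketch is purely our own analogy (the names Group, is_group, and Z_mod are hypothetical, not part of Isabelle or PVS): a group is an ordinary value, so we can instantiate it and quantify over a class of groups in a formula-like way.

```python
# Illustrative sketch only: a group structure as a first-class value.
from dataclasses import dataclass
from itertools import product
from typing import Callable, FrozenSet

@dataclass(frozen=True)
class Group:
    carrier: FrozenSet[int]
    op: Callable[[int, int], int]
    inv: Callable[[int], int]
    e: int

def is_group(G: Group) -> bool:
    """Check the group axioms by exhaustive search over a finite carrier."""
    c = G.carrier
    closed = all(G.op(x, y) in c for x, y in product(c, c))
    unit = all(G.op(x, G.e) == x and G.op(G.e, x) == x for x in c)
    invs = all(G.inv(x) in c and G.op(x, G.inv(x)) == G.e for x in c)
    assoc = all(G.op(x, G.op(y, z)) == G.op(G.op(x, y), z)
                for x, y, z in product(c, c, c))
    return closed and unit and invs and assoc

def Z_mod(n: int) -> Group:
    """Instantiate the abstract structure to the concrete group Z/nZ."""
    return Group(frozenset(range(n)), lambda x, y: (x + y) % n,
                 lambda x: (-x) % n, 0)

# Because groups are ordinary values ("first class"), we can state a
# property over a class of groups, not just inside one fixed theory:
assert all(is_group(Z_mod(n)) for n in range(1, 8))
```

Here instantiation (Z_mod) plays the role of module instantiation, and the final line is the kind of quantification over "the class of all groups" that a module-level-only representation cannot express.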
2 Prerequisites and Concepts
Isabelle is a higher order logic theorem prover [Pau94]. It is generic, that is, it can be instantiated to form theorem provers for a wide range of logics. These can be made known to the prover by defining theories that contain sort and type declarations, constants, and related definitions and rules. The most popular object logics are Isabelle/HOL and Isabelle/ZF. A powerful parser supports intelligible syntactic abbreviations for user-defined constants. Definitions, rules, and other declarations that are contained in an Isabelle theory are visible whenever that theory is loaded into an Isabelle session. All theories on which the current theory is built are also visible. All entities contained in a current theory stay visible for any other theory that uses the current one. Thus, theory rules and definitions are not suited for formalizing concepts that are of only local significance in certain contexts or proofs. Isabelle theories form hierarchies. However, theories do not have any parameters or other advanced features typical for modules in theorem provers. That is, Isabelle did not have a module concept prior to the developments presented in the current section.
2.1 Locales
Locales [KWP99] declare a context of fixed variables, local assumptions, and local definitions. Inside this context, theorems can be proved that may depend on the assumptions and definitions, and the fixed variables are treated like constants. The result will then depend on the locale assumptions, while the definitions of a locale are eliminated. The definition of a locale is static, i.e. it resides in a theory. Nevertheless, there is a dynamic aspect of locales corresponding to the interactive side of Isabelle. Locales are by default inactive. If the current theory context of an Isabelle session contains a theory that entails locales, they can be invoked. The list of currently active locales is called the scope. The process of activating them is called opening; the reverse is closing. Locales can be defined in a nested style, i.e. a new locale can be defined as the extension of an existing one. Locales realize a form of polymorphism with a binding of type variables not normally possible in Isabelle (see Section 3.2). Theorems proved in the scope of a locale may be exported to the surrounding context. The exporting device for locales dissolves the contextual structure of a locale: locale definitions become expanded, locale assumptions are attached as individual assumptions, and locale constants are transformed into variables that may be instantiated freely. That is, exporting reflects a locale to Isabelle’s meta-logic. Although they do not have a first class representation, locales have at least a meta-logical explanation. In Section 3.4 we will see that this is crucial for the sound combination of locales and dependent types. Locales have been part of the official distribution since Isabelle version 98-1. They can be used in all of Isabelle’s object logics — not just Isabelle/HOL — and have already been used in many applications apart from the ones presented here.

2.2 Dependent Types as First Class Modules
In rich type theories, e.g. UTT [Bai98], groups can be represented as

Σ G : set. Σ e : G. Σ ◦ : map2 G G G. Σ ⁻¹ : map G G. group axioms
where group axioms abbreviates the usual rules for groups, corresponding to the body of a module for groups. The elements G, e, ◦, and ⁻¹ correspond to the parameters of a module and occur in group axioms. Since this Σ-type can be considered as a term in a higher type universe, we can use it in other formulas. Hence, this modular formalization of groups is adequate. Naïvely, a Σ-type may be understood as the Cartesian product A × B, and a Π-type as a function type A → B, but with B having a “slot” of type A, i.e. being parameterized over an element of A. The latter part of this intuition suggests using these type constructors to model the parameterization, and hence the abstraction, of modules. Isabelle’s higher order logic does not have dependent types. For the first class representation of abstract algebraic structures we construct an embedding
of Σ-types and Π-types as typed sets into Isabelle/HOL using set-theoretic definitions [Kam99b]. Since sets in Isabelle/HOL are constants, algebraic structures become first class citizens. Moreover, abstract structures that use other structures as parameters may be modeled as well. We call such structures higher order structures. An example is the set of group homomorphisms, i.e. maps from the carrier of a group G to the carrier of a group H that respect the operations.

Hom ≡ Σ G ∈ Group. Σ H ∈ Group.
  {Φ | Φ ∈ G.<cr> → H.<cr> ∧
       (∀ x, y ∈ G.<cr>. Φ(G.<f> x y) = H.<f> Φ(x) Φ(y))}

The postfix tags, like .<f>, are field descriptors of the components of a structure. In general, Σ is used as a constructor for higher order structures. In some cases, however, a higher order structure is uniquely constructed, as for example the factorization of a group by one of its subgroups (see Section 3.3). In those cases, we use the Π-type. We define a set-typed λ-notation that enables the construction of functions of a Π-type.
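The dependent-pair reading of Hom (the constraint on Φ depends on the earlier components G and H) can be made concrete for finite groups. The sketch below is our own illustration, not part of the paper's Isabelle development; the names is_hom, Z6, and add6 are hypothetical.

```python
# An element of Hom is morally a triple (G, H, phi) where the condition on
# phi depends on G and H, mirroring Σ G ∈ Group. Σ H ∈ Group. {Φ | ...}.
from itertools import product

def is_hom(carrier_g, op_g, carrier_h, op_h, phi):
    """phi respects the operations: phi(x ∘_G y) = phi(x) ∘_H phi(y)."""
    maps_into = all(phi(x) in carrier_h for x in carrier_g)
    respects = all(phi(op_g(x, y)) == op_h(phi(x), phi(y))
                   for x, y in product(carrier_g, carrier_g))
    return maps_into and respects

# x ↦ 2x mod 6 is a homomorphism from (Z/6, +) to itself:
Z6 = set(range(6))
add6 = lambda x, y: (x + y) % 6
assert is_hom(Z6, add6, Z6, add6, lambda x: (2 * x) % 6)
```

Because the check is an ordinary predicate on ordinary values, a statement such as "Φ ∈ Hom" can appear inside any formula, which is the first-class property the Σ-type embedding provides.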
3 Locales + Dependent Types = Modules

The main idea of this work is that a combination of the concepts of locales and dependent types enables adequate representation of, and convenient proof with, modular structures. To validate this hypothesis we present various case studies pointing out the improvements that are gained through the combination of the two concepts. Some basic formalizations in Section 3.1 explain the use and interaction of the two concepts. In Section 3.2, we reconsider the case study of Sylow’s theorem [KP99], which mainly illustrates the necessity of locales. Then, in Section 3.3, we discuss in detail the quotient of a group, which clearly demonstrates the need for the first class property of the dependent type representation; it illustrates as well how the additional use of locales enhances the reasoning. After summarizing other examples and analyzing the improvements, we show in Section 3.4 how operations on structures may be performed with the combined use of locales and dependent types.

3.1 Formalization of Group Theory
Groups and Subgroups. The class of groups is defined as a set over a record type with four elements: the carrier, the binary operation, the inverse, and the unit element that constitute a group. They can be referred to using the projections G.<cr>, G.<f>, G.<inv>, and G.<e> for some group G. Since the class of all groups is a set, it is a first class citizen. We can write G ∈ Group as a logical formula to express “G is a group”. Hence, all group properties can be derived from the definition. In the definition of the subgroup property we can use an elegant approach which reads informally: a subset H of G is a subgroup if it is a group with G’s operations. Only because groups are first class citizens can we describe subgroups
in that way. A Σ-structure is used to model the subgroup relation. The (| |)-enclosed quadruple constructs the subgroup as an element of the record type of the elements of Group. Our λ-notation enables the restriction of the group operations to the subset H.

Σ G ∈ Group. {H | H ⊆ (G.<cr>) ∧
  (| carrier = H,
     bin_op  = λ x ∈ H. λ y ∈ H. (G.<f>) x y,
     inverse = λ x ∈ H. (G.<inv>) x,
     unit    = (G.<e>) |) ∈ Group}
The convenient syntax H <<= G, for H is a subgroup of G, may be used to abbreviate (G, H) ∈ subgroup. In addition to the first class representation of groups and subgroups, we define a locale group to provide a local proof context for group related proofs. The fixes, assumes, and defines parts introduce the constants with their polymorphic types, the assumptions, and the definitions of the locale¹. The ’a is a polymorphic type variable (see Section 3.2).

locale group =
  fixes
    G     :: "’a grouptype"
    e     :: "’a"
    binop :: "’a => ’a => ’a"  (infixr "#" 80)
    inv   :: "’a => ’a"        ("i(_)" [90] 91)
  assumes
    Group_G   "G ∈ Group"
  defines
    e_def     "e == (G.<e>)"
    binop_def "x # y == (G.<f>) x y"
    inv_def   "i x == (G.<inv>) x"
This locale is attached to the theory file for groups. Prior to starting the proofs concerning groups, we open this locale and can subsequently use the syntax and the local assumption G ∈ Group throughout all proofs for groups. This improves the readability of the derivations and reduces the length of the proofs. For example, instead of

[| G ∈ Group; x ∈ (G.<cr>); (G.<f>) x x = x |] ==> x = (G.<e>)
we can state this theorem now as

[| x ∈ (G.<cr>); x # x = x |] ==> x = e
Subgoals of the form G ∈ Group that would normally be created in proofs no longer appear, because they are now matched by the corresponding locale rule. All group related proofs share this assumption. Thus, the use of a locale rule reduces the length of the proofs.
¹ We omitted an abbreviation for G.<cr> to distinguish it from the group G.
Cosets. To enable the proof of Sylow’s theorem and further results from group theory we define left and right cosets of a group, a product, and an inverse operation for subsets of groups. We create a separate theory for cosets named Coset containing their definitions.

r_coset G H a ≡ (λ x. (G.<f>) x a) ‘‘ H
l_coset G a H ≡ (λ x. (G.<f>) a x) ‘‘ H
set_r_cos G H ≡ r_coset G H ‘‘ (G.<cr>)
set_inv G H   ≡ (λ x. (G.<inv>) x) ‘‘ H
Cosets immediately give rise to the definition of a special class of subgroups, the so-called normal subgroups of a group.

Normal ≡ Σ G ∈ Group.
  {H | H <<= G ∧ (∀ x ∈ (G.<cr>). r_coset G H x = l_coset G x H)}
We define the convenient syntax H <| G for (G, H) ∈ Normal. As is apparent from the definition, normal subgroups are a special case of subgroups of a group where left and right cosets coincide. This is not necessarily the case in non-Abelian groups. Since the notion of a coset, e.g. r_coset G H a, depends on the binary operation of the group, cosets carry the additional parameter G. The mathematical notation is Ha; we want to have at least a notation like H #> a. Locales give us this support. We define a locale for the use of cosets to enable convenient syntax for cosets and products. This locale is defined as an extension of the locale for groups. In the scope of the locale coset, we can omit the group parameter G that is necessary for an adequate formalization, and we can define local infix syntax. That is, we can write H #> a instead of r_coset G H a, I(H) instead of set_inv G H, H1 <#> H2 instead of set_prod G H1 H2, and {* H *} for set_r_cos G H. Logically, the short forms refer to the adequate definitions, as may be revealed in theorems by export (see Section 2.1). The theorems we derive about cosets and the set product of groups are needed as a calculational basis for Lagrange’s theorem used in Sylow’s proof (see Section 4.1) and in the theorems involving the quotient of a group (see Section 3.3). The binary operation of groups is lifted to the level of subsets of a group. We derive algebraic rules relating the coset operators <# and #> with the product operation for subsets <#>. For example, the theorem

set_prod G (r_coset G H x) (r_coset G H y) = r_coset G H ((G.<f>) x y)
can be written in the scope of the locale coset as (H #> x) <#> (H #> y) = H #> (x # y)
The advantage is considerable, especially if we consider that the syntax is not only important when we type in a goal for the first time, but we are confronted with it in each proof step. Hence, the syntactical improvements are crucial for a good interaction with the proof assistant.
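The lifted coset algebra above can be checked concretely on a small finite group. The following Python rendering is our own illustration (names r_coset and set_prod mimic the Isabelle constants but nothing here is generated from the formalization), evaluated in (Z/6, +) with the normal subgroup H = {0, 3}.

```python
# Coset operations over the finite group (Z/6, +), H = {0, 3} normal.
op = lambda x, y: (x + y) % 6
carrier = frozenset(range(6))
H = frozenset({0, 3})

def r_coset(S, a):          # H #> a = {h ∘ a | h ∈ H}
    return frozenset(op(s, a) for s in S)

def set_prod(A, B):         # A <#> B = {a ∘ b | a ∈ A, b ∈ B}
    return frozenset(op(a, b) for a in A for b in B)

# The lifted rule (H #> x) <#> (H #> y) = H #> (x ∘ y) holds here:
for x in carrier:
    for y in carrier:
        assert set_prod(r_coset(H, x), r_coset(H, y)) == r_coset(H, op(x, y))

# The right cosets partition the carrier:
cosets = {r_coset(H, a) for a in carrier}
print(sorted(sorted(c) for c in cosets))  # → [[0, 3], [1, 4], [2, 5]]
```

The three cosets of size |H| = 2 partitioning the six carrier elements are exactly the counting fact behind Lagrange's theorem mentioned above.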
3.2 Sylow’s Theorem
Sylow’s theorem gives criteria for the existence of subgroups of prime power order in finite groups.

Theorem 1. If G is a group, p a prime, and p^α divides the order of G, then G contains a subgroup of order p^α.

In the first mechanization of the theorem [KP99], here referred to as the ad hoc version, we were forced to abuse the theory mechanism to achieve readable syntax for the main proof. We declared the local constants and definitions as Isabelle constants and definitions. To model local rules, we used axioms, i.e. Isabelle rules. This works, but contradicts the meaning of axioms and definitions in a theory (cf. Section 2). Locales offer the ideal support for this procedure, and with them the mechanization is methodically sound. In the theory of cosets, we define a locale for the proof of Sylow’s theorem. The natural number constants we had to define in [KP99] as constants of an Isabelle theory now become locale constants. The names we use as abbreviations for larger formulas, like the set M ≡ {S ⊆ G.<cr> | card(S) = p^α}, are also added as locale constants. So, the fixes section of the locale sylow is

locale sylow = coset +
  fixes
    p, a, m :: "nat"
    calM    :: "’a set set"
    RelM    :: "(’a set * ’a set) set"
The following defines section introduces the local definitions of the set M and the relation ∼ on M (here calM and RelM).

defines
  calM_def "calM == {s | s ⊆ (G.<cr>) ∧ card(s) = (p ^ a)}"
  RelM_def "RelM == {(N1,N2) | (N1,N2) ∈ calM × calM ∧
                               (∃ g ∈ (G.<cr>). N1 = (N2 #> g))}"
Note that the previous definitions depend on the locale constants p, a, and m (and on G from locale group). We can abbreviate in a convenient way using locale constants without being forced to parameterize the definitions; without locales we would have to write calM G p a m and RelM G p a m. Furthermore, without locales the definitions of calM and RelM would have to be theory level definitions — visible everywhere — whereas now they are just local. Finally, we add the locale assumptions to the locale sylow. Here, we can state all assumptions that are local to the 52 theorems of the Sylow proof. In the mechanization of the proof without locales in [KP99] all these merely local assumptions had to become rules of the theory for Sylow.

assumes
  Group_G  "G ∈ Group"
  prime_p  "p ∈ prime"
  card_G   "order(G) = (p ^ a) * m"
  finite_G "finite (G.<cr>)"
The locale sylow can subsequently be opened to provide just the right context to conduct the proof of Sylow’s theorem in the way we discovered in the ad hoc approach [KP99] to be appropriate, but now we can define this context soundly. In the earlier mechanization of the theorem, we abused the theory mechanisms of constants, rules, and definitions to that end. Apart from the fact that it is meaningless to define local entities globally, we could not use a polymorphic type ’a for the base type of groups. Using polymorphism would have led to inconsistencies, as we would have assumed the Sylow premises for all groups. Hence, we had to use a fixed type, whereby the theorem was not generally applicable to groups. Also, we restricted the definition of local proof contexts to the scope as outlined above, i.e. to entities that are used in the theorem. Having locales, we can extend the encapsulation of local proof context much further than in the ad hoc mechanization, closer to the way the paper proof operates.

More Encapsulation. Now, we may soundly use the locale mechanism for any merely locally relevant definition. In particular we can define the abbreviation

H == {g | g ∈ (G.<cr>) ∧ M1 #> g = M1}
for the main object of concern, the Sylow subgroup that is constructed in the proof. Naturally, we refrained from using a definition for this set before, because in the global theorem it is not visible at all, i.e. it is a temporary definition. But by adding the above line to a new locale, after introducing a suitably typed locale constant in the fixes part, the proofs for Sylow’s theorem improve a lot. A further measure taken now is to define in the new locale the two assumptions that are visible in most of the 52 theorems of the proof of Sylow’s theorem. Summarizing, a locale for the central part of Sylow’s proof is given by:

locale sylow_central = sylow +
  fixes
    H  :: "’a set"
    M  :: "’a set set"
    M1 :: "’a set"
  assumes
    M_ass  "M ∈ calM / RelM ∧ ¬(p ^ ((max-n r. p ^ r | m) + 1) | card(M))"
    M1_ass "M1 ∈ M"
  defines
    H_def "H == {g | g ∈ (G.<cr>) ∧ M1 #> g = M1}"
We open this locale after the first few lemmas, when we arrive at theorems that use the locale assumptions and definitions. Subsequently, we assume that the locales group and coset are open. Henceforth, the conjectures become shorter and more readable than in the ad hoc version. For example,

[| M ∈ calM / RelM ∧ ¬(p ^ ((max-n r. p ^ r | m) + 1) | card(M));
   M1 ∈ M;
   x ∈ {g | g ∈ (G.<cr>) ∧ M1 #> g = M1};
   xa ∈ {g | g ∈ (G.<cr>) ∧ M1 #> g = M1} |]
==> x # xa ∈ {g | g ∈ (G.<cr>) ∧ M1 #> g = M1}
can now be stated as [| x ∈ H; xa ∈ H |] ==> x # xa ∈ H
Figure 1 illustrates how the scoping for Sylow’s proof works. Apart from the theory Coset locale group locale coset locale sylow locale sylow central ∃ M1. M1 ∈ M ∃ M. M ∈ calM / RelM ∧ ¬(p ^ (max-n r. p ^ r | m)+ 1)|card(M)) H << G ∧ card(H) = p ^ a H Export
HH
∃ H. H <<= G ∧ card(H) = p ^ a export
HH H
[| ?p ∈ prime; finite(?G.); ?G ∈ Group; order(?G) = (?p ^ ?a) * ?m |] ==> ∃ H. H <<= ?G ∧ card(H) = ?p ^ ?a
Fig. 1. Sylow’s theorem at different levels of nested locales
main theorem, only two other theorems need to be exported from the innermost locale sylow central. These two theorems prove existence of witnesses for the locale assumptions M ass and M1 ass and are used to cancel the latter assumptions from the main theorem at the level of locale sylow. When we finally export the main theorem from the context of locale sylow using the generally normalizing function export, we achieve the desired form of Sylow’s theorem which is independent from any local definitions and assumptions. The theorem stands alone as a global theorem of the Isabelle theory Coset, and is hence applicable to any group. This becomes visible in the resulting theorem by the question marks indicating schematic variables. There is another feature of locales that we used and that is particularly decisive for the combination of locales and dependent types: a slightly changed polymorphism. Adapted Polymorphism. The declarations of locale constants may use polymorphism, as seen in most of the examples so far, but this is different to the one usual in Isabelle. Usually, Isabelle’s polymorphic declarations are completely
108
Florian Kamm¨ uller
independent of each other, e.g. if the same type variable ’a is used in two declarations, these constants may be still instantiated to different types. In locales, we enrich the expressiveness of polymorphic definitions by extending the scope of the polymorphic variable names over all constant declarations of a locale. This changes the usual polymorphism of Isabelle, in that equal names imply the same variable. That is, polymorphic variables are fixed by the variable names, e.g. ’a, inside the locale. Although the locale as an entity can still be instantiated to arbitrary constant types of appropriate sort, the instantiation is implicitly forced to be the same for all constants of a locale that use the same variable name. This restriction only holds if the same names are used. Naturally we preserve the same freedom of expressiveness that was there before: if we use different variable names in polymorphic declarations of locale constants, they can be instantiated independently. An example is the use of two different groups for the construction of the direct product of groups (see Section 2). However, in the Sylow case study — and in abstract algebraic applications in general — this is exactly what we need: we want to constrain different constructors to the same type, while we still want to stay abstract, i.e. use polymorphic declarations. Most of the locales defined for the examples of this paper, e.g. groups, use one polymorphic type variable ’a in different locale constant declarations, while referring to constituents of one structure. That is, they are abstract, but the same type. This “connected” form of polymorphic declaration reflects the connection that is there in the dependent type corresponding to the locale. For example, the fact that the constituents of a Group element are ranging over the same base type, needs to be reflected to the polymorphic type ’a in the constant declarations of locale groups. 3.3
Quotient of a Group
If a group is factorized by one of its normal subgroups then the quotient together with the induced operations on the cosets is again a group. This is a quite standard result of group theory, but it is challenging because it contains a selfreference: a structure constructed from a group shall be a group again. The quotient of a group illustrates the need for structures as dependent types, and hence first class citizens. In addition, we will analyze to what extent locales can be helpful. For this proof we define a new theory that builds on the theory of cosets. The factorization of a group by one of its normal subgroups is given by the set of cosets. The operations on the cosets are described by the group operations lifted to the level of cosets, i.e. the binary operation is given by the product of cosets, the inverse operation is given by the inverse coset, and the factor of the quotient serves as a unit element, a normal subgroup H. To describe this construction formally, we use our typed λ-notation (see Section 2.2). FactGroup ≡ λ G ∈ Group. λ H ∈ Normal ↓ G. (| carrier = set_r_cos G H, bin_op = λ X ∈ set_r_cos G H.
Modular Reasoning in Isabelle
109
λ Y ∈ set_r_cos G H. set_prod G X Y, inverse = λ X ∈ set_r_cos G H. set_inv G X, unit = H |)
We define the theory syntax G Mod H for the quotient FactGroup G H. To enhance the readability of the construction and thereby the proofs about it, we employ locales. We cannot use any nicer syntax in the above definition of the quotient because in the body of the λ-term above, the terms G and H are parameters. Hence, they have to stay flexible. However, using locales we can fix a group G and a normal subgroup H in G for the local proof context. locale factgroup = coset + fixes F :: "(’a set) grouptype" H :: "(’a set)" assumes H_ass "H <| G" defines F_def "F == FactGroup G H"
By defining this locale as an extension of the locale coset, we incorporate all the syntactical abbreviations we defined for cosets and operations on cosets in Section 1. In addition, we have the group G already as a fixed local constant. The additional definition of the quotient as F lets us derive in the scope of this locale2 F = (| carrier = {* H *}, bin_op = (λ X ∈ {* H *}. λ Y ∈ {* H *}. X <#> Y), inverse = (λ X ∈ {* H *}. I(X)), unit = H |)
The derivation is an application of Isabelle’s simplifier to the corresponding definitions, and the reduction rules for λ. By the additional use of the locale properties of fixing and local definition, we achieve a readable syntax in a local scope. With these preparations, we can prove that this quotient is again a group, which is trivially stated as F ∈ Group in the scope of the locale. The proof is straightforward. By backward resolution with the introduction rule for the group property GroupI we can reduce it to six subgoals that can be solved by repeatedly applying previously derived results about cosets and the operations on them. Note that here the initial application of GroupI illustrates the advantage we gain through the normalization performed by export. Although we proved the rule GroupI for the fixed group G we can now apply it again to the group F which is even constructed with that same G. By exporting the result that F is a group we get the general formula [| ?G ∈ Group; ?H <| ?G |] 2
==> ?G Mod ?H ∈ Group
Opening factgroup automatically opens coset and group.
110
Florian Kamm¨ uller
In an earlier version of this experiment, we did not employ locales. The statement of the conjecture was even without locales not such a problem; it corresponded to the above formula. But, in the proof of the group property, where all the definitions of the lifted operations have to be employed, we were formerly exposed to formulas that were hard to read. As a further illustration of the concept of higher order structures, we consider the proof that the constructed λ-term FactGroup is an element of a suitable Πset. FactGroup ∈ (Π G ∈ Group. (Normal ↓ G) -> Group)
This membership statement is equivalent to the structural proposition that the quotient of a group is a function mapping a group and a normal subgroup of this group to another group. We call this kind of theorem a structural proposition because membership in a structure (a set) entails the proposition we just proved, i.e. that the quotient is a group. If we interpret the sets that are our structures as types, then we see how the Curry-Howard isomorphism of propositions-as-types [How80] is embodied in a statement like above. In contrast to type theory, we do not need to state this isomorphism as a paradigm — it is inherent because we use sets: from the above we can derive the logical proposition. Further Examples and Analysis. Other theorems we mechanized [Kam99a] but cannot present here due to space limitations are: – the direct product of two groups is again a group – the set of bijections with the appropriate operations of composition of bijections, inverse bijection, and identical bijection forms a group – the automorphisms of a ring form a group – the full version of Tarski’s fixed point theorem, i.e. the classical theorem with the addition that the set of all fixed points of the continuous function f is itself a complete lattice. As with the quotient of groups, we first performed the mechanization without the use of locales3 . In comparison, we could reduce the size of the proofs by 50% using locales. Although in the latter version some savings are due to polishing the proofs by improving the applications of automatic simplification tactics, a larger portion is due to locales. Furthermore, the streamlining of the proofs was made much easier because of the greatly improved comprehensibility. Where we were lost before in grasping huge complicated terms, and thus sometimes misled from the optimum solution, the natural representation achieved by locales leads the way now. Locales allow us to use the same local definition and assumptions as in a module. 
At the same time the structures, like groups and rings, are first class citizens, whereby we achieve adequacy. Through the combination with locales, the higher complexity of the adequate formalizations is balanced out. Hence, the combination of locales and dependent types adds up to modular reasoning. Since locales are reflected into the meta-logic (see Section 2.1), this combination does not introduce inconsistencies and enables reuse of locales, as we will see in the following section.

³ At the time the implementation was not capable of dealing with nested locales.

Modular Reasoning in Isabelle

3.4 Operations on Modules
Through the embedding of structures as Σ-types and Π-types, we achieve first class representations of modules. Thereby, we are able to use structures in formulas. Moreover, we illustrate in the present section that we can express general operations on structures, such as forgetful functors. We show how the substructure of a ring that is an Abelian group can be revealed. For this example we first have to explain how we formalized rings. Using extension of record types, we can build the base type for rings on the base type for groups, grouptype.

record 'a ringtype = 'a grouptype +
  Rmult :: "['a, 'a] => 'a"
Thereby, we inherit the components of groups and can form rings by just extending the latter by the second operation Rmult⁴. We add the syntax R.<m> for the additional element Rmult of a ring to adapt the notation for rings to the syntax of the group projections. To isolate the group contained in a ring we can use an element of a Π-set. This λ-function represents a forgetful functor. It "forgets" some structure.

group_of :: "'a ringtype => 'a grouptype"
"group_of == λ R ∈ Ring. (| carrier = (R.<cr>), bin_op = (R.<f>),
                            inverse = (R.<inv>), unit = (R.<e>) |)"
Thereby, we are able to refer to the substructure of the ring that forms an Abelian group using the forgetful functor group_of. We can derive the theorem⁵

R ∈ Ring ==> group_of R ∈ AbelianGroup    [R_Abel]
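In programming terms, the record extension and group_of behave like a subclass and a projection back to the superclass. The following Python sketch is our own illustration; the class and function names mirror the Isabelle ones, but the code is not from the paper.

```python
from dataclasses import dataclass
from typing import Callable

# A "group record": carrier set plus the group operations.
@dataclass(frozen=True)
class GroupType:
    carrier: frozenset
    bin_op: Callable[[int, int], int]
    inverse: Callable[[int], int]
    unit: int

# A "ring record" extends the group record by one operation, mirroring
# record 'a ringtype = 'a grouptype + Rmult :: "['a, 'a] => 'a".
@dataclass(frozen=True)
class RingType(GroupType):
    rmult: Callable[[int, int], int]

def group_of(r: RingType) -> GroupType:
    """Forgetful functor: keep the additive structure, forget Rmult."""
    return GroupType(r.carrier, r.bin_op, r.inverse, r.unit)

# The integers mod 5 with + and * form a ring; group_of yields the
# underlying additive (Abelian) group.
z5 = RingType(
    carrier=frozenset(range(5)),
    bin_op=lambda a, b: (a + b) % 5,
    inverse=lambda a: (-a) % 5,
    unit=0,
    rmult=lambda a, b: (a * b) % 5,
)
```

Applying group_of to z5 produces a GroupType in which rmult is simply no longer present, matching the theorem R_Abel: the image of a ring under the forgetful functor is an Abelian group.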
This enables a better structuring and decomposition of proofs. In particular, we can use this functor when we employ locales for ring related proofs. Then we want to use the encapsulation already provided for groups by the locale group. To achieve this we define the locale for rings as an extension.

locale ring = group +
  fixes
    R     :: "'a ringtype"
    rmult :: "['a, 'a] => 'a"    (infixr "**" 80)
  assumes
    Ring_R    "R ∈ Ring"
  defines
    rmult_def "x ** y == (R.<m>) x y"
    R_id_G    "G == group_of R"

⁴ Note that we formalize rings without 1. Mathematical textbooks sometimes use the notion of rings for rings with a 1 for convenience.
⁵ AbelianGroup is the structure of Abelian, i.e. commutative, groups. Their definition is a simple extension of the one of groups by the additional commutativity axiom.
Florian Kammüller
Note that we are able to use the locale constant G again in a locale definition, i.e. in R_id_G. This is sound because we have not defined G before. If one gives a constant an inconsistent definition, then one will simply be unable to instantiate results proved in the locale. This way of reusing the local proof context of groups for the superstructure of rings illustrates the flexibility of locales, as well as the ease of integration with the mechanization of structures given by Σ and Π. Theorems that are proved in the setting of the locale ring using group results will, in their exported form, carry the assumptions

[| R ∈ Ring; group_of R ∈ Group ... |] ==> ...
But, as a consequence of the theorem R_Abel, we can easily derive

R ∈ Ring ==> group_of R ∈ Group

Thus, the second premise can be cancelled. Although we have to perform a final proof step to cancel the additional premise, this shows the advantage of locales being reflected onto premises of the global representations of theorems: it is impossible to introduce unsoundness. A definition of a locale constant that is not consistent with the properties stated by the locale rules would simply not be applicable. Since locale assumptions and definitions are explained through meta-assumptions, the resulting theorem would carry the inconsistent assumptions implicitly. We see that the nested structure of locales is consistent with the logical structure because locales are reflected to the meta-logic. Thereby reuse of the locale of groups is possible.
4 Conclusion

4.1 Related Work
The proof of the theorem of Lagrange has been performed with the Boyer-Moore prover [Yu90]. E. Gunter formalized group theory in HOL [Gun89]. In the higher order logic theorem prover IMPS [FGT93], some portion of abstract algebra, including Lagrange, has been proved. Mizar's [Try93] library of formalized mathematics probably contains more abstract algebra theorems than any other system. However, to our knowledge we were the first to mechanically prove Sylow's first theorem. Since it uses Lagrange's theorem, we had to prove that first. In contrast to the formalization in [Yu90], the form of Lagrange that we need for Sylow's theorem is not just the statement that the order of the subgroup divides the order of the group; instead, it gives the precise representation of the group's order as the product of the order of the subgroup and the index of this subgroup in G. Since we have a first class representation of groups, we can express this order equation and can use general results about finite sets to reduce it to simpler theorems about cosets. Hence, compared to [Yu90], our proof of Lagrange is simpler. Locales implement a sectioning device similar to that in AUTOMATH [dB80] or Coq [Dow90]. In contrast to this kind of sections, locales are defined statically. Also, optional pretty printing syntax is part of the concept. The HOL system [GM93] has a concept of abstract theories, based on Gunter's experiments with abstract algebra [Gun89,Gun90], in parts comparable to locales.
4.2 Discussion
Modules for theorem provers can be considered as a means to represent abstract structures. In that case modules need to be first class citizens to enable adequacy. Another aspect of modules is the locality they provide by their scoping, which is useful, and sometimes even necessary, for shortening formulas and proofs and hence for their readability. The embedding of dependent types combines the expressiveness of type theories with the convenience and power of higher order logic theorem proving. Although the dependent types are only modeled as typed sets of Isabelle/HOL, we get the "expressive advantage". In contrast to earlier mechanizations of dependent types in higher order logic [JM93], our embedding is relatively lightweight, as it is based on a simple set-theoretic embedding. At the same time the Π- and Σ-types are strong enough to express higher-level modular notions, like mappings between parameterized structures. Locales are a general concept for locality. They are not restricted to any particular object logic of Isabelle, i.e. they can be used for any kind of reasoning. Although they are not first class citizens, there is the export device that reflects locales to meta-logical assumptions, thereby explaining them in terms of Isabelle's meta-logic. Locality and adequacy are separate aspects that may sometimes coincide, but in general they should be treated individually. We have illustrated this by showing how the devices of locales for locality and dependent types for adequacy add up to support modular reasoning. The presented case studies contain both aspects that have to be expressed adequately by first class modules given by dependent types and aspects that need the structuring and syntactic support of locales. Moreover, we have shown that the separation of the concepts does not hinder their smooth combination. Where the first class representations become too complicated, locales can be used to reduce them.
Moreover, in intricate combinations like the forgetful functor in Section 3.4 we have seen that the reflection of locales to the meta-logic preserves consistency and enables reuse. Hence, instead of using one powerful but inadequate concept for modular reasoning, like classical modules, we think that locales combined with dependent types are appropriate. The separation is tailored to Isabelle, yet it is applicable to other theorem provers. Since the difference between adequacy and locality is a general theoretical issue, the conceptual design presented in this paper, a combination of two devices for the support of modular reasoning, is of more general interest.
References

Bai98. A. Bailey. The Machine-Checked Literate Formalisation of Algebra in Type Theory. PhD thesis, University of Manchester, 1998.
D+93. G. Dowek et al. The Coq Proof Assistant User's Guide. Technical Report 154, INRIA-Rocquencourt, 1993.
dB80. N. G. de Bruijn. A Survey of the Project AUTOMATH. In Seldin and Hindley [SH80], pages 579–606.
Dow90. G. Dowek. Naming and Scoping in a Mathematical Vernacular. Technical Report 1283, INRIA-Rocquencourt, 1990.
FGT93. W. M. Farmer, J. D. Guttman, and F. J. Thayer. IMPS: An Interactive Mathematical Proof System. Journal of Automated Reasoning, 11:213–248, 1993.
GM93. M. J. C. Gordon and T. F. Melham, editors. Introduction to HOL: A Theorem Proving Environment for Higher Order Logic. Cambridge University Press, 1993.
Gun89. E. L. Gunter. Doing Algebra in Simple Type Theory. Technical Report MS-CIS-89-38, Department of Computer and Information Science, University of Pennsylvania, 1989.
Gun90. E. L. Gunter. The Implementation and Use of Abstract Theories in HOL. In Third HOL Users Meeting, Aarhus University, 1990.
How80. W. A. Howard. The Formulae-as-Types Notion of Construction. In Seldin and Hindley [SH80], pages 479–490.
JM93. B. Jacobs and T. F. Melham. Translating Dependent Type Theory into Higher Order Logic. In M. Bezem and J. F. Groote, editors, Typed Lambda Calculi and Applications, number 664 in LNCS. Springer, 1993.
Kam99a. F. Kammüller. Modular Reasoning in Isabelle. PhD thesis, University of Cambridge, 1999. Technical Report 470.
Kam99b. F. Kammüller. Modular Structures as Dependent Types in Isabelle. In Types for Proofs and Programs, volume 1657 of LNCS. Springer, 1999.
KP99. F. Kammüller and L. C. Paulson. A Formal Proof of Sylow's First Theorem – An Experiment in Abstract Algebra with Isabelle HOL. Journal of Automated Reasoning, 23(3–4):235–264, 1999.
KWP99. F. Kammüller, M. Wenzel, and L. C. Paulson. Locales – A Sectioning Concept for Isabelle. In Theorem Proving in Higher Order Logics, TPHOLs'99, volume 1690 of LNCS. Springer, 1999.
LP92. Z. Luo and R. Pollack. LEGO Proof Development System: User's Manual. Technical Report ECS-LFCS-92-211, University of Edinburgh, 1992.
Mac86. D. B. MacQueen. Using Dependent Types to Express Modular Structures. In Proc. 13th ACM Symp. on Principles of Programming Languages. ACM Press, 1986.
OSRSC98. S. Owre, N. Shankar, J. M. Rushby, and D. W. J. Stringer-Calvert. PVS Language Reference. Part of the PVS manual. Available on the Web at http://www.csl.sri.com/pvsweb/manuals.html, September 1998.
Pau94. L. C. Paulson. Isabelle: A Generic Theorem Prover, volume 828 of LNCS. Springer, 1994.
SH80. J. P. Seldin and J. R. Hindley, editors. To H. B. Curry: Essays on Combinatory Logic, Lambda Calculus and Formalism. Academic Press, 1980.
Try93. A. Trybulec. Some Features of the Mizar Language. 1993. Available from the Mizar user's group.
Yu90. Y. Yu. Computer Proofs in Group Theory. Journal of Automated Reasoning, 6:251–286, 1990.
An Infrastructure for Intertheory Reasoning

William M. Farmer
Department of Computing and Software
McMaster University, 1280 Main Street West
Hamilton, Ontario L8S 4L7, Canada
[email protected]
Abstract. The little theories method, in which mathematical reasoning is distributed across a network of theories, is a powerful technique for describing and analyzing complex systems. This paper presents an infrastructure for intertheory reasoning that can support applications of the little theories method. The infrastructure includes machinery to store theories and theory interpretations, to store known theorems of a theory with the theory, and to make definitions in a theory by extending the theory “in place”. The infrastructure is an extension of the intertheory infrastructure employed in the imps Interactive Mathematical Proof System.
1 Introduction
Mathematical reasoning is always performed within some context, which includes vocabulary and notation for expressing concepts and assertions, and axioms and inference rules for proving conjectures. In informal mathematical reasoning, the context is almost entirely implicit. In fact, substantial mathematical training is often needed to "see" the context. The situation is quite different in formal mathematics performed in logical systems, often with the aid of computers. The context is formalized as a mathematical structure. The favored mathematical structure for this purpose is an axiomatic theory within a formal logic. An axiomatic theory, or theory for short, consists of a formal language and a set of axioms expressed in the language. It is a specification of a set of objects: the language provides names for the objects, and the axioms constrain what properties the objects have.

Sophisticated mathematical reasoning usually involves several related but different mathematical contexts. There are two main ways of dealing with a multitude of contexts using theories. The big theory method is to choose a highly expressive theory, often based on set theory or type theory, that can represent many different contexts. Each context that arises is represented in the theory or in an extension of the theory. Contexts are related to each other in the theory itself. An alternate approach is the little theories method, in which separate contexts are represented by separate theories. Structural relationships between contexts are represented as interpretations between theories (see [4,19]).

D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 115–131, 2000. © Springer-Verlag Berlin Heidelberg 2000

Interpretations serve as conduits for passing information (e.g., definitions and theorems) from abstract theories to more concrete theories, or indeed to other equally abstract theories. As a result, the big theory is replaced with a network of theories, which can include both small compact theories and large powerful theories. The little theories approach has been used in both mathematics and computer science (see [10] for references). In [10] we argue that the little theories method offers important advantages for mechanized mathematics. Many of these advantages have been demonstrated by the imps Interactive Mathematical Proof System [9,11], which supports the little theories method.

A mechanized mathematics system based on the little theories method requires a different infrastructure than one based on the big theory method. In the big theory method all reasoning is performed within a single theory, while in the little theories method there is both intertheory and intratheory reasoning. This paper presents an infrastructure for intertheory reasoning that can be employed in several kinds of mechanized mathematics systems, including theorem provers, software specification and verification systems, computer algebra systems, and electronic mathematics libraries. The infrastructure is closely related to the intertheory infrastructure used in imps, but it includes some capabilities which are not provided by the imps intertheory infrastructure.

The little theories method is a major element in the design of several software specification systems including ehdm [18], iota [16], kids [20], obj3 [12], and Specware [21]. The intertheory infrastructures of these systems are mainly for constructing theories and linking them together into a network. They do not support the rich interplay of making definitions, proving theorems, and "transporting" definitions and theorems from one theory to another that is needed for developing and exploring theories within a network.
The Ergo [17] theorem proving system is, besides imps, another system that directly supports the little theories method.¹ Its infrastructure for intertheory reasoning provides full support for constructing theories from other theories via inclusion and interpretation, but only partial support for developing theories by making definitions and proving theorems. In Ergo, theory interpretation is static: theorems from the source theory of an interpretation can be transported to the target theory of the interpretation only when the interpretation is created [14]. Theory interpretation is dynamic in the intertheory infrastructure of this paper (and of imps).

The rest of the paper is organized as follows. The underlying logic of the intertheory infrastructure is given in section 2. Section 3 discusses the design requirements for the infrastructure. The infrastructure itself is presented in section 4. Finally, some applications of the infrastructure are described in section 5.

¹ Many theorem proving systems indirectly support the little theories method by allowing a network of theories to be formalized within a big theory.
2 The Underlying Logic
The intertheory infrastructure presented in this paper assumes an underlying logic. Many formal systems, including first-order logic and Zermelo-Fraenkel set theory, could serve as the underlying logic. For the sake of convenience and precision, we have chosen a specific underlying logic for the infrastructure rather than treating the underlying logic as a parameter. Our choice is Church's simple theory of types [3], denoted in this paper by C. The underlying logics of many theorem proving systems are based on C. For example, the underlying logic of imps (and its intertheory infrastructure) is a version of C called lutins [5,6,8]. Unlike C, lutins admits undefined terms, partial functions, and subtypes. By virtue of its support for partial functions and subtypes, many theory interpretations can be expressed more directly in lutins than in C [8]. We will now give a brief presentation of C. The missing details can be filled in by consulting Church's original paper [3] or one of the logic textbooks, such as [1], that contain a full presentation of C. We will also define a number of logical notions in the context of C, including the notions of a theory and an interpretation.

2.1 Syntax of C
The types of C are defined inductively as follows:
1. ι is a type (which denotes the type of individuals).
2. ∗ is a type (which denotes the type of truth values).
3. If α and β are types, then (α → β) is a type (which denotes the type of total functions that map values of type α to values of type β).
Let T denote the set of types of C. A tagged symbol is a symbol tagged with a member of T. A tagged symbol whose symbol is a and whose tag is α is written as aα. Let V be a set of tagged symbols called variables such that, for each α ∈ T, the set of members of V tagged with α is countably infinite. A constant is a tagged symbol cα such that cα ∉ V. A language L of C is a set of constants. (In the following, let a "language" mean a "language of C".) An expression of type α of L is a finite sequence of symbols defined inductively as follows:
1. Each aα ∈ V ∪ L is an expression of type α.
2. If F is an expression of type α → β and A is an expression of type α, then F(A) is an expression of type β.
3. If xα ∈ V and E is an expression of type β, then (λ xα . E) is an expression of type α → β.
4. If A and B are expressions of type α, then (A = B) is an expression of type ∗.
5. If A and B are expressions of type ∗, then ¬A, (A ⊃ B), (A ∧ B), and (A ∨ B) are expressions of type ∗.
6. If xα ∈ V and E is an expression of type ∗, then (∀ xα . E) and (∃ xα . E) are expressions of type ∗.
Expressions of type α are denoted by Aα, Bα, Cα, etc. Let E_L denote the set of expressions of L. "Free variable", "closed expression", and similar notions are defined in the obvious way. Let S_L denote the set of sentences of L, i.e., the set of closed expressions of type ∗ of L.

2.2 Semantics of C
For each language L, there is a set M_L of models and a relation |= between models and sentences of L. M |= A∗ is read as "M is a model of A∗". Let L be a language, A∗ ∈ S_L, Γ ⊆ S_L, and M ∈ M_L. M is a model of Γ, written M |= Γ, if M |= B∗ for all B∗ ∈ Γ. Γ logically implies A∗, written Γ |= A∗, if every model of Γ is a model of A∗.

2.3 Theories
A theory of C is a pair T = (L, Γ) where L is a language and Γ ⊆ S_L. Γ serves as the set of axioms of T. (In the following, let a "theory" mean a "theory of C".) A∗ is a (semantic) theorem of T, written T |= A∗, if Γ |= A∗. T is consistent if some sentence of L is not a theorem of T. A theory T′ = (L′, Γ′) is an extension of T, written T ≤ T′, if L ⊆ L′ and Γ ⊆ Γ′. T′ is a conservative extension of T, written T ⊴ T′, if T ≤ T′ and, for all A∗ ∈ S_L, if T′ |= A∗, then T |= A∗. The following lemma about theory extensions is easy to prove.

Lemma 1. Let T1, T2, and T3 be theories.
1. If T1 ≤ T2 ≤ T3, then T1 ≤ T3.
2. If T1 ⊴ T2 ⊴ T3, then T1 ⊴ T3.
3. If T1 ≤ T2 ≤ T3 and T1 ⊴ T3, then T1 ⊴ T2.
4. If T1 ⊴ T2 and T1 is consistent, then T2 is consistent.

2.4 Interpretations
Let T = (L, Γ) and T′ = (L′, Γ′) be theories, and let Φ = (γ, µ, ν) where γ ∈ T and µ : V → V and ν : L → E_L′ are total functions. For α ∈ T, Φ(α) is defined inductively as follows:
1. Φ(ι) = γ.
2. Φ(∗) = ∗.
3. If α, β ∈ T, then Φ(α → β) = Φ(α) → Φ(β).
Φ is a translation from L to L′ if:
1. For all xα ∈ V, µ(xα) is of type Φ(α).
2. For all cα ∈ L, ν(cα) is of type Φ(α).
Suppose Φ is a translation from L to L′. For Eα ∈ E_L, Φ(Eα) is the member of E_L′ defined inductively as follows:
1. If Eα ∈ V, then Φ(Eα) = µ(Eα).
2. If Eα ∈ L, then Φ(Eα) = ν(Eα).
3. Φ(Fα→β(Aα)) = Φ(Fα→β)(Φ(Aα)).
4. Φ(λ xα . Eβ) = (λ Φ(xα) . Φ(Eβ)).
5. Φ(Aα = Bα) = (Φ(Aα) = Φ(Bα)).
6. Φ(¬E∗) = ¬Φ(E∗).
7. Φ(A∗ □ B∗) = (Φ(A∗) □ Φ(B∗)) where □ ∈ {⊃, ∧, ∨}.
8. Φ(□ xα . E∗) = (□ Φ(xα) . Φ(E∗)) where □ ∈ {∀, ∃}.
Φ is an interpretation of T in T′ if it is a translation from L to L′ that maps theorems to theorems, i.e., for all A∗ ∈ S_L, if T |= A∗, then T′ |= Φ(A∗).

Theorem 1 (Relative Consistency). Suppose Φ is an interpretation of T in T′ and T′ is consistent. Then T is consistent.

Proof. Assume Φ = (γ, µ, ν) is an interpretation of T in T′, T′ is consistent, and T is inconsistent. Then F∗ = (∃ xι . ¬(xι = xι)) is a theorem of T, and so Φ(F∗) = (∃ µ(xι) . ¬(µ(xι) = µ(xι))) is a theorem of T′, which contradicts the consistency of T′. □

The next theorem gives a sufficient condition for a translation to be an interpretation.

Theorem 2 (Interpretation Theorem). Suppose Φ is a translation from L to L′ and, for all A∗ ∈ Γ, T′ |= Φ(A∗). Then Φ is an interpretation of T in T′.

Proof. The proof is similar to the proof of Theorem 12.4 in [6]. □
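The inductive clauses above translate directly into a structural recursion. The following Python sketch is a hypothetical, much-reduced rendering (expressions are tagged tuples; all names are ours, not from the paper): phi_type implements the clauses for types, and translate implements clauses 1–8 for expressions.

```python
# Types are "iota", "*", or ("->", a, b); expressions are tagged tuples
# such as ("var", name, type), ("const", name, type), ("app", f, a).

def phi_type(gamma, ty):
    """Translate a type: ι maps to γ, ∗ to ∗, arrows map componentwise."""
    if ty == "iota":
        return gamma
    if ty == "*":
        return "*"
    _, a, b = ty
    return ("->", phi_type(gamma, a), phi_type(gamma, b))

def translate(mu, nu, e):
    """Apply Φ = (γ, µ, ν) to an expression by structural recursion."""
    tag = e[0]
    if tag == "var":                     # clause 1: variables through µ
        return mu(e)
    if tag == "const":                   # clause 2: constants through ν
        return nu(e)
    if tag == "not":                     # clause 6
        return ("not", translate(mu, nu, e[1]))
    if tag in ("app", "lam", "eq",       # clauses 3-5
               "and", "or", "implies",   # clause 7
               "forall", "exists"):      # clause 8
        return (tag, translate(mu, nu, e[1]), translate(mu, nu, e[2]))
    raise ValueError(f"unknown expression: {e!r}")

# A translation with γ = "nat" that primes variable names and maps the
# constant 0 of type ι to a constant zeroN of type nat.
gamma = "nat"
mu = lambda v: ("var", v[1] + "'", phi_type(gamma, v[2]))
nu = lambda c: ("const", "zeroN", "nat") if c[1] == "0" else c

sentence = ("forall", ("var", "x", "iota"),
            ("not", ("eq", ("var", "x", "iota"), ("const", "0", "iota"))))
image = translate(mu, nu, sentence)
```

Note how clause 4 falls out for free: the bound variable of a λ (or quantifier) is itself translated through µ, exactly as Φ(λ xα . Eβ) = (λ Φ(xα) . Φ(Eβ)) prescribes.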
3 Design Requirements
At a minimum, an infrastructure for intertheory reasoning should provide the capabilities to store theories and interpretations and to record theorems as they are discovered. We present in this section a “naive” intertheory infrastructure with just these capabilities. We then show that the naive infrastructure lacks several important capabilities. From these results we formulate the requirements that an intertheory infrastructure should satisfy.
3.1 A Naive Intertheory Infrastructure
We now present a naive intertheory infrastructure. In this design, the state of the infrastructure is a set of infrastructure objects. The infrastructure state is initially the empty set. It is changed by the application of infrastructure operations, which add new objects to the state or modify objects already in the state. There are three kinds of infrastructure objects, for storing theories, theorems, and interpretations, respectively, and there are four infrastructure operations, for creating the three kinds of objects and for "installing" theorems in theories. Infrastructure objects are denoted by boldface letters. The three infrastructure objects are defined simultaneously as follows:

1. A theory object is a tuple T = (n, L, Γ, Σ) where n is a string, L is a language, Γ ⊆ S_L, and Σ is a set of theorem objects. n is called the name of T and is denoted by [T]. (L, Γ) is called the theory of T and is denoted by thy(T).
2. A theorem object is a tuple A = ([T], A∗, J) where T = (n, L, Γ, Σ) is a theory object, A∗ ∈ S_L, and J is a justification² that thy(T) |= A∗.
3. An interpretation object is a tuple I = ([T], [T′], Φ, J) where T and T′ are theory objects, Φ is a translation, and J is a justification that Φ is an interpretation of thy(T) in thy(T′).

Let S denote the infrastructure state. The four infrastructure operations are defined as follows:

1. Given a string n, a language L, and Γ ⊆ S_L as input, if, for all theory objects T′ = (n′, L′, Γ′, Σ′) ∈ S, n ≠ n′ and thy(T) ≠ thy(T′), then create-thy-obj adds the theory object (n, L, Γ, ∅) to S; otherwise, the operation fails.
2. Given a theory object T ∈ S, a sentence A∗, and a justification J as input, if A = ([T], A∗, J) is a theorem object, then create-thm-obj adds A to S; otherwise, the operation fails.
3. Given two theory objects T, T′ ∈ S, a translation Φ, and a justification J as input, if I = ([T], [T′], Φ, J) is an interpretation object, then create-int-obj adds I to S; otherwise, the operation fails.
4. Given a theorem object A = ([T], A∗, J) ∈ S and a theory object T′ = (n′, L′, Γ′, Σ′) ∈ S as input, if thy(T) ≤ thy(T′), then install-thm-obj replaces T′ in S with the theory object (n′, L′, Γ′, Σ′ ∪ {A}); otherwise, the operation fails.

² The notion of a justification is not specified. It could, for example, be a formal proof.
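A minimal executable rendering of this naive design might look as follows. This is a sketch under our own simplifications (none of these names come from the paper): justifications are omitted, languages and axiom sets are plain finite sets of strings, and only create-thy-obj and install-thm-obj are shown.

```python
from dataclasses import dataclass, field

@dataclass
class TheoryObject:
    name: str                 # n, the name [T]
    language: frozenset       # L, a set of constants
    axioms: frozenset         # Γ, a set of sentences
    theorems: set = field(default_factory=set)   # Σ, installed theorems

    def thy(self):
        """The theory of the object: thy(T) = (L, Γ)."""
        return (self.language, self.axioms)

state = []   # the infrastructure state S, initially empty

def create_thy_obj(name, language, axioms):
    """create-thy-obj: fails if a stored object shares the name or the theory."""
    candidate = (frozenset(language), frozenset(axioms))
    for t in state:
        if t.name == name or t.thy() == candidate:
            raise ValueError("duplicate theory object")
    t = TheoryObject(name, *candidate)
    state.append(t)
    return t

def install_thm_obj(sentence, source, target):
    """install-thm-obj: requires thy(source) <= thy(target)."""
    if not (source.language <= target.language
            and source.axioms <= target.axioms):
        raise ValueError("target does not extend source")
    target.theorems.add(sentence)

# A theory and an extension of it; a theorem of the smaller theory
# can be installed in the larger one.
t_mon = create_thy_obj("Monoid", {"op", "e"}, {"assoc", "left-unit"})
t_grp = create_thy_obj("Group", {"op", "e", "inv"},
                       {"assoc", "left-unit", "left-inverse"})
install_thm_obj("right-unit", t_mon, t_grp)
```

The duplicate-theory check and the extension check are exactly the side conditions of operations 1 and 4; the other two operations would be analogous wrappers that record justifications.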
3.2 Missing Capabilities
The naive infrastructure is missing four important capabilities:

A. Definitions. Suppose we would like to make a definition that the constant is_zeroι→∗ is the predicate (λ xι . xι = 0ι) in a theory T stored in a theory object T = (n, L, Γ, Σ) ∈ S. The naive infrastructure offers only one way to do this: create the extension T′ = (L′, Γ′) of T, where L′ = L ∪ {is_zeroι→∗} and Γ′ = Γ ∪ {is_zeroι→∗ = (λ xι . xι = 0ι)}, and then store T′ in a new theory object T′ by invoking create-thy-obj. If is_zeroι→∗ is not in L, T and T′ can be regarded as the same theory, since T ⊴ T′ and is_zeroι→∗ can be "eliminated" from any expression of L′ by replacing every occurrence of it with (λ xι . xι = 0ι). Definitions are made all the time in mathematics, and thus implementing definitions in this way will lead to an explosion of theory objects storing theories that are essentially the same. A better way of implementing definitions would be to extend T to T′ "in place" by replacing T in T with T′. The resulting object would still be a theory object because every theorem of T is also a theorem of T′.

This approach, however, would introduce a new problem. If an interpretation object I = ([T], [T′], Φ, J) ∈ S exists and thy(T) is extended in place by making a definition cα = Eα, then Φ would no longer be an interpretation, because Φ(cα) would not be defined. There are three basic solutions to this problem. The first is to automatically extend Φ by defining Φ(cα) = Φ(Eα). However, this solution has the disadvantage that, when an expression of T containing cα is translated via the extended Φ, the expression will be expanded into a possibly much bigger expression of the target theory. The second solution is to automatically transport the definition cα = Eα along Φ by making a new definition of the form dβ = Φ(Eα) in the target theory and defining Φ(cα) = dβ. The implementation of this solution would require care because, when two similar theories are both interpreted in a third theory, common definitions in the source theories may be transported multiple times to the target theory, resulting in definitions in the target theory that define different constants in exactly the same way. The final solution is to let the user extend Φ by hand whenever necessary. This solution is more flexible than the first two, but it would impose a heavy burden on the user. Our experience in developing imps suggests that the best solution would be some combination of these three basic solutions.

B. Profiles. Suppose we would like to make a "definition" that the constant a_non_zeroι has a value not equal to 0ι in a theory T stored in a theory object T = (n, L, Γ, Σ) ∈ S. That is, we would like to add a new constant a_non_zeroι to L whose value is specified, but not necessarily uniquely determined, by the sentence ¬(a_non_zeroι = 0ι). More precisely, let T′ = (L′, Γ′) where L′ = L ∪ {a_non_zeroι} and Γ′ = Γ ∪ {¬(a_non_zeroι = 0ι)}. If a_non_zeroι is not in L and the sentence (∃ xι . ¬(xι = 0ι)) is a theorem of T, then T ⊴ T′. We call definitions of this kind profiles.³ A profile introduces a finite number of new constants that satisfy a given property. Like ordinary definitions, profiles produce conservative extensions, but unlike ordinary definitions, the constants introduced by a profile cannot generally be eliminated. A profile can be viewed as a generalization of a definition, since any definition can be expressed as a profile. Profiles are very useful for introducing new machinery into a theory. For example, a profile can be used to introduce a collection of objects plus a set of operations on the objects, what is called an "algebra" in mathematics and an "abstract datatype" in computer science. The new machinery will not compromise the original machinery of T because the resulting extension T′ of T will be conservative. Since T′ is a conservative extension of T, any reasoning performed in T could just as well be performed in T′. Thus the availability of T′ normally makes T obsolete. Making profiles in the naive infrastructure leads to theory objects which store obsolete theories. The way of implementing definitions by extending theories in place would work just as well for profiles. As with definitions, extending theories in place could cause some interpretations to break. A combination of the second and third basic solutions given above for definitions could be used for profiles. The first basic solution is not applicable because profiles do not generally have the eliminability property of definitions.
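Capabilities A and B can be made concrete with a toy sketch. Everything below is ours, not the paper's: theories are dictionaries, "expressions" are strings, and the textual substitution is purely illustrative. make_definition extends a theory in place and extends an interpretation out of it in the style of the first basic solution, Φ(cα) = Φ(Eα); make_profile admits a new constrained constant only after its existential obligation has been discharged, here by brute-force search over a small intended domain.

```python
# Toy sketch: theories as dicts, interpretations as constant-to-string maps.

def make_definition(theory, interps, constant, body):
    """Extend `theory` in place with the definition constant = body, and
    extend every interpretation whose source is `theory` by
    Phi(constant) = Phi(body) (the first basic solution)."""
    assert constant not in theory["language"], "defined constant must be new"
    theory["language"].add(constant)
    theory["axioms"].add(f"{constant} = {body}")
    for phi in interps:
        if phi["source"] is theory:
            image = body
            for c, img in phi["map"].items():   # naive textual translation
                image = image.replace(c, img)
            phi["map"][constant] = image        # note: possibly much bigger

def make_profile(theory, constant, prop, domain):
    """Extend `theory` in place with a constant constrained by `prop`,
    admitted only if the obligation 'some element satisfies prop' holds
    (checked here by enumerating a small finite domain)."""
    if not any(prop(x) for x in domain):
        raise ValueError("obligation fails: the profile has no witness")
    theory["language"].add(constant)
    theory["axioms"].add((constant, prop))      # the profiling axiom, abstractly

T = {"language": {"zero"}, "axioms": set()}
T2 = {"language": {"nil"}, "axioms": set()}
phi = {"source": T, "target": T2, "map": {"zero": "nil"}}

make_definition(T, [phi], "is_zero", "lambda x. x = zero")
make_profile(T, "a_non_zero", lambda x: x != 0, domain=range(5))
```

After the calls, Φ maps is_zero to the translated body "lambda x. x = nil" rather than to a single constant, which is exactly the expansion disadvantage of the first solution noted above; the profile succeeds because a nonzero witness exists in the domain.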
C. Theory Extensions. Suppose that S contains two theory objects T and T′ with thy(T) ≤ thy(T′). In most cases (but not all), one would want every theorem object installed in T to also be installed in T′. The naive infrastructure does not have this capability: there is no support for having theorem objects installed in a theory object automatically installed in preselected extensions of that theory object. An intertheory infrastructure should guarantee that, for each theory object T and each preselected extension T′ of T, every theorem, definition, and profile installed in T is also installed in T′.
D. Theory Copies. The naive infrastructure does not allow the infrastructure state to contain two theory objects storing the same theory. As a consequence, it is not possible to add a copy of a theory object to the infrastructure state. We will see in section 5 that creating copies of a theory object is a useful modularization technique.

³ Profiles are called constant specifications in [13] and constraints in [15].
3.3 Requirements
Our analysis of the naive intertheory infrastructure suggests that an intertheory infrastructure should satisfy the following requirements:

R1 The infrastructure enables theories and interpretations to be stored.
R2 Known theorems of a theory can be stored with the theory.
R3 Definitions can be made in a theory by extending the theory in place.
R4 Profiles can be made in a theory by extending the theory in place.
R5 Theorems, definitions, and profiles installed in a theory are automatically installed in certain preselected extensions of the theory.
R6 An interpretation of T1 in T2 can be extended in place to an interpretation of T1′ in T2′ if Ti is extended to Ti′ by definitions or profiles, for i = 1, 2.
R7 A copy of a stored theory can be created and then developed independently from the original theory.

The naive infrastructure satisfies only requirements R1 and R2. The imps intertheory infrastructure satisfies all of the requirements except R4 and R7.
4  The Intertheory Infrastructure
This section presents an intertheory infrastructure that satisfies all seven requirements in section 3.3. It is the same as the naive infrastructure except that the infrastructure objects and operations are different. That is, the infrastructure state is a set of infrastructure objects, is initially the empty set, and is changed by the application of infrastructure operations, which add new objects to the state or modify objects already in the state. As in the naive infrastructure, let S denote the infrastructure state.
4.1  Objects
There are five kinds of infrastructure objects. The first four are defined simultaneously as follows:

1. A theory object is a tuple T = (n, L0, Γ0, L, Γ, ∆, σ, N) where:
   (a) n is a string called the name of T. It is denoted by [T].
   (b) L0 and L are languages such that L0 ⊆ L. L0 and L are called the base language and the current language of T, respectively.
   (c) Γ0 ⊆ S_{L0} and Γ ⊆ S_L with Γ0 ⊆ Γ. The members of Γ0 and Γ are called the base axioms and the current axioms of T, respectively.
   (d) Γ ⊆ ∆ ⊆ {A∗ ∈ S_L : Γ |= A∗}. The members of ∆ are called the known theorems of T, and ∆ is denoted by thms(T).
124    William M. Farmer
   (e) σ is a finite sequence of theorem, definition, and profile objects called the event history of T.
   (f) N is a set of names of theory objects called the principal subtheories of T. For each [T′] ∈ N with T′ = (n′, L′0, Γ′0, L′, Γ′, ∆′, σ′, N′), we have L′0 ⊆ L0, Γ′0 ⊆ Γ0, L′ ⊆ L, Γ′ ⊆ Γ, ∆′ ⊆ ∆, and σ′ is a subsequence of σ.

   The base theory of T is the theory (L0, Γ0) and the current theory of T, written thy(T), is the theory (L, Γ).

2. A theorem object is a tuple A = ([T], A∗, J) where:
   (a) T is a theory object with thy(T) = (L, Γ).
   (b) A∗ ∈ S_L. A∗ is called the theorem of A.
   (c) J is a justification that Γ |= A∗.

3. A definition object is a tuple D = ([T], cα, Eα, J) where:
   (a) T is a theory object with thy(T) = (L, Γ).
   (b) cα is a constant not in L.
   (c) Eα ∈ E_L. cα = Eα is called the defining axiom of D.
   (d) J is a justification that Γ |= O∗ where O∗ is (∃ xα . xα = Eα) and xα does not occur in Eα.⁴ O∗ is called the obligation of D.

4. A profile object is a tuple P = ([T], C, Eβ, J) where:
   (a) T is a theory object with thy(T) = (L, Γ).
   (b) C = {c^1_{α_1}, . . . , c^m_{α_m}} is a set of constants not in L.
   (c) Eβ = (λ x^1_{α_1} · · · λ x^m_{α_m} . B∗) where x^1_{α_1}, . . . , x^m_{α_m} are distinct variables. Eβ(c^1_{α_1}) · · · (c^m_{α_m}) is called the profiling axiom of P.
   (d) J is a justification that Γ |= O∗ where O∗ is (∃ x^1_{α_1} · · · ∃ x^m_{α_m} . B∗). O∗ is called the obligation of P.

An event object is a theorem, definition, or profile object. Let T ≤ T′ mean thy(T) ≤ thy(T′) and T ≪ T′ mean thy(T) ≪ thy(T′) (i.e., thy(T′) is a conservative extension of thy(T)). T is a structural subtheory of T′ if one of the following is true:

1. T = T′.
2. T is a structural subtheory of a principal subtheory of T′.

T is a structural supertheory of T′ if T′ is a structural subtheory of T.

For a theory object T = (n, L0, Γ0, L, Γ, ∆, σ, N) and an event object e whose justification is correct, T[e] is the theory object defined as follows:

1. Let e be a theorem object ([T′], A∗, J).
   If T′ ≤ T, then T[e] = (n, L0, Γ0, L, Γ, ∆ ∪ {A∗}, σ⌢⟨e⟩, N); otherwise, T[e] is undefined.

2. Let e be a definition object ([T′], cα, Eα, J). If T′ ≤ T and cα ∉ L, then T[e] = (n, L0, Γ0, L ∪ {cα}, Γ ∪ {A∗}, ∆ ∪ {A∗}, σ⌢⟨e⟩, N) where A∗ is the defining axiom of e; otherwise, T[e] is undefined.
⁴ In C, Γ |= (∃ xα . xα = Eα) always holds and so no justification is needed, but in other logics such as LUTINS a justification is needed since Γ |= (∃ xα . xα = Eα) will not hold if Eα is undefined.
3. Let e be a profile object ([T′], C, Eβ, J). If T′ ≤ T and C ∩ L = ∅, then T[e] = (n, L0, Γ0, L ∪ C, Γ ∪ {A∗}, ∆ ∪ {A∗}, σ⌢⟨e⟩, N) where A∗ is the profiling axiom of e; otherwise, T[e] is undefined.

An event history σ is correct if the justification in each member of σ is correct. For a correct event history σ, T[σ] is defined by:

1. Let σ = ⟨⟩. Then T[σ] = T.
2. Let σ = σ′⌢⟨e⟩. If (T[σ′])[e] is defined, then T[σ] = (T[σ′])[e]; otherwise, T[σ] is undefined.

Let the base of T, written base(T), be the theory object (n_base, L0, Γ0, L0, Γ0, Γ0, ⟨⟩, ∅). T is proper if the following conditions are satisfied:

1. Its event history σ is correct.
2. thy(T) = thy(base(T)[σ]).
3. thms(T) = thms(base(T)[σ]).

Lemma 2. If T is a proper theory object, then A∗ is a known theorem of T iff A∗ is a base axiom of T or a theorem, defining axiom, or profiling axiom of an event object in the event history of T.

Proof. Follows immediately from the definitions above. ∎

Theorem 3. If T is a proper theory object, then base(T) ≪ T.

Proof. Since T is proper, the event history σ of T is correct and thy(T) = thy(base(T)[σ]). We will show base(T) ≪ T by induction on |σ|, the length of σ.

Basis. Assume |σ| = 0. Then thy(T) = thy(base(T)) and so base(T) ≪ T is obviously true.

Induction step. Assume |σ| > 0. Suppose σ = σ′⌢⟨e⟩. By the induction hypothesis, base(T) ≪ base(T)[σ′]. We claim base(T)[σ′] ≪ (base(T)[σ′])[e]. If e is a theorem object, then clearly thy(base(T)[σ′]) = thy((base(T)[σ′])[e]) and so base(T)[σ′] ≪ (base(T)[σ′])[e]. If e is a definition or profile object, then base(T)[σ′] ≪ (base(T)[σ′])[e] by the justification of e. Therefore, base(T) ≪ T follows by part (2) of Lemma 1. ∎

We will now define the fifth and last infrastructure object. An interpretation object is a tuple I = ([T], [T′], Φ, J) where:

1. T is a theory object called the source theory of I.
2. T′ is a theory object called the target theory of I.
3. Φ is a translation.
4. J is a justification that Φ is an interpretation of thy(base(T)[σ]) in thy(base(T′)[σ′]) where σ and σ′ are initial segments of the event histories of T and T′, respectively.
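In programming terms, a theory object is a record and T[e] is a functional update of that record. The sketch below (Python, with illustrative names; languages and axiom sets are modeled as frozensets of strings, and justifications are left abstract) shows the shape of theory and theorem objects and the theorem-event case of T[e]:

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class TheoremObject:
    """A theorem object ([T], A*, J); the justification is left abstract."""
    theory_name: str
    theorem: str          # the sentence A*
    justification: object

@dataclass(frozen=True)
class TheoryObject:
    """A theory object T = (n, L0, Gamma0, L, Gamma, Delta, sigma, N)."""
    name: str
    base_language: frozenset
    base_axioms: frozenset
    language: frozenset
    axioms: frozenset
    theorems: frozenset                    # the known theorems Delta
    history: tuple = ()                    # the event history sigma
    subtheories: frozenset = frozenset()   # names of principal subtheories

def apply_theorem_event(T: TheoryObject, e: TheoremObject) -> TheoryObject:
    """The theorem-event case of T[e]: add A* to the known theorems and
    append e to the event history; language and axioms are unchanged."""
    return replace(T, theorems=T.theorems | {e.theorem},
                   history=T.history + (e,))
```

Because the dataclasses are frozen, applying an event produces a new object, mirroring the fact that T[e] is defined from T rather than destructively computed; a stateful infrastructure would then replace T by T[e] in S.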
4.2  Operations
The infrastructure design includes ten operations. There are operations for creating the infrastructure objects:

1. Given a string n, a language L, a set Γ of sentences, and theory objects Ti = (ni, L0^i, Γ0^i, L^i, Γ^i, ∆^i, σ^i, N^i) ∈ S for i = 1, . . . , m as input, let
   (a) L′0 = L0^1 ∪ · · · ∪ L0^m.
   (b) Γ′0 = Γ0^1 ∪ · · · ∪ Γ0^m.
   (c) L′ = L^1 ∪ · · · ∪ L^m.
   (d) Γ′ = Γ^1 ∪ · · · ∪ Γ^m.
   (e) ∆′ = ∆^1 ∪ · · · ∪ ∆^m.
   (f) σ′ = σ^1⌢ · · · ⌢σ^m.
   If T = (n, L ∪ L′0, Γ ∪ Γ′0, L ∪ L′, Γ ∪ Γ′, Γ ∪ ∆′, σ′, {[T1], . . . , [Tm]}) is a theory object and n ≠ [T′] for any theory object T′ ∈ S, then create-thy-obj adds T to S; otherwise, the operation fails.
2. Given a theory object T ∈ S, a sentence A∗, and a justification J as input, if A = ([T], A∗, J) is a theorem object, then create-thm-obj adds A to S; otherwise, the operation fails.
3. Given a theory object T ∈ S, a constant cα, an expression Eα, and a justification J as input, if D = ([T], cα, Eα, J) is a definition object, then create-def-obj adds D to S; otherwise, the operation fails.
4. Given a theory object T ∈ S, a set C of constants, an expression Eβ, and a justification J as input, if P = ([T], C, Eβ, J) is a profile object, then create-pro-obj adds P to S; otherwise, the operation fails.
5. Given two theory objects T, T′ ∈ S, a translation Φ, and a justification J as input, if I = ([T], [T′], Φ, J) is an interpretation object, then create-int-obj adds I to S; otherwise, the operation fails.
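The componentwise unions in operation 1 can be sketched as follows (a hypothetical Python rendering in which theory objects are plain dicts keyed by their tuple components; the well-formedness and name-clash checks are omitted):

```python
def create_thy_obj(name, language, axioms, parents):
    """Sketch of create-thy-obj.  Each parent is a dict with keys matching
    the components of a theory object; the result takes componentwise
    unions of the parents' components (the primed unions in item 1) and
    adds the new language and axioms on top."""
    def union(key):
        out = set()
        for p in parents:
            out |= p[key]
        return out
    history = []
    for p in parents:
        history += p["history"]          # sigma' = sigma^1 ... sigma^m
    return {
        "name":          name,
        "base_language": language | union("base_language"),  # L u L'0
        "base_axioms":   axioms   | union("base_axioms"),    # Gamma u Gamma'0
        "language":      language | union("language"),       # L u L'
        "axioms":        axioms   | union("axioms"),         # Gamma u Gamma'
        "theorems":      axioms   | union("theorems"),       # Gamma u Delta'
        "history":       history,
        "subtheories":   {p["name"] for p in parents},
    }
```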
There are operations for installing theorem, definition, and profile objects in theory objects:

1. Given a theorem object A = ([T0], A∗, J) ∈ S and a theory object T1 ∈ S, if T0 ≤ T1, then install-thm-obj replaces every structural supertheory T of T1 in S with T[A]; otherwise, the operation fails.
2. Given a definition object D = ([T0], cα, Eα, J) ∈ S and a theory object T1 ∈ S, if T0 ≤ T1 and T[D] is defined for every structural supertheory T of T1 in S, then install-def-obj replaces every structural supertheory T of T1 in S with T[D]; otherwise, the operation fails.
3. Given a profile object P = ([T0], C, Eβ, J) ∈ S and a theory object T1 ∈ S, if T0 ≤ T1 and T[P] is defined for every structural supertheory T of T1 in S, then install-pro-obj replaces every structural supertheory T of T1 in S with T[P]; otherwise, the operation fails.
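The install operations all follow the same pattern: find every structural supertheory of the given theory object and update each one. A minimal sketch, assuming theory objects are dicts held in a state table keyed by name (precondition checks and the definition/profile variants omitted):

```python
def structural_supertheories(state, t_name):
    """All theory objects in the state whose chain of principal
    subtheories reaches t_name (every object is a structural
    supertheory of itself).  Assumes the subtheory graph is acyclic."""
    def reaches(obj):
        return obj["name"] == t_name or any(
            reaches(state[s]) for s in obj["subtheories"])
    return [obj for obj in state.values() if reaches(obj)]

def install_thm_obj(state, theorem, t_name):
    """Sketch of install-thm-obj: apply the theorem event to every
    structural supertheory of the named theory object."""
    for obj in structural_supertheories(state, t_name):
        obj["theorems"] = obj["theorems"] | {theorem}
```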
There are operations to extend an interpretation object and to copy a theory object:

1. Given an interpretation object I = ([T], [T′], Φ, J) ∈ S, a translation Φ′, and a justification J′ as input, if Φ′ extends Φ and I′ = ([T], [T′], Φ′, J′) is an interpretation object, then extend-int replaces I in S with I′; otherwise, the operation fails.
2. Given a string n and a theory object T = (n′, L0, Γ0, L, Γ, ∆, σ, N) ∈ S as input, if n ≠ [T′] for any theory object T′ ∈ S, then create-thy-copy adds the theory object T′ = (n, L0, Γ0, L, Γ, ∆, σ, N) to S; otherwise, the operation fails.

The infrastructure operations guarantee that the following theorem holds:

Theorem 4. If the justification of every event object in S is correct, then:

1. Every object in S is a well-defined theory, theorem, definition, profile, or interpretation object.
2. Every theory object in S is proper.
3. Distinct theory objects in S have distinct names.
Some Remarks about the Intertheory Infrastructure:

1. Theory and interpretation objects are modifiable, but event objects are not.
2. The event history of a theory object records how the theory object is constructed from its base theory.
3. The theory stored in a theory object T extends all the theories stored in the principal subtheories of T.
4. Theorem, definition, and profile objects installed in a theory object T in S are automatically installed in every structural supertheory of T in S.
5. The infrastructure allows definitions and profiles to be made in a theory object T both by modifying T using install-def-obj and install-pro-obj and by creating an extension of T using create-thy-obj.
6. By Theorem 2, if Φ is a translation from thy(T) to thy(T′) which maps the base axioms of T to known theorems of T′, then Φ is an interpretation of thy(T) in thy(T′).
7. The interpretation stored in an interpretation object is allowed to be incomplete. It can be extended as needed using extend-int.
5  Some Applications

5.1  Theory Development System
The intertheory infrastructure provides a strong foundation on which to build a system for developing axiomatic theories. The infrastructure operations enable theories and interpretations to be created and extended. Many additional operations can be built on top of the ten infrastructure operations. Examples include operations for transporting theorems, definitions, and profiles from one theory to another and for instantiating theories.

Given a theorem object A = ([T0], A∗, J0) installed in T ∈ S and an interpretation object I = ([T], [T′], Φ, J) ∈ S as input, the operation transport-thm-obj would invoke create-thm-obj and install-thm-obj to create a new theorem object ([T′], Φ(A∗), J′) and install it in T′. The justification J′ would be formed from J0 and J.

Given a constant dβ, a definition object D = ([T0], cα, Eα, J0) installed in T ∈ S, and an interpretation object I = ([T], [T′], Φ, J) ∈ S as input, if Φ(α) = β and dβ is not in the current language of T′, the operation transport-def-obj would invoke create-def-obj and install-def-obj to create a new definition object ([T′], dβ, Φ(Eα), J′) and install it in T′; otherwise, the operation fails. The justification J′ would be formed from J0 and J. An operation transport-pro-obj could be defined similarly.

Given theory objects T, T′ ∈ S and an interpretation object I = ([T0], [T′0], Φ, J) ∈ S as input, if T0 ≤ T and T′0 ≤ T′, the operation instantiate-thy would invoke create-thy-obj to create a new theory object T″ and create-int-obj to create a new interpretation object I′ = ([T], [T″], Φ′, J) such that:

– T″ is an extension of T′ obtained by “instantiating” T0 in T with T′. How T′ is cemented to the part of T outside of T0 is determined by Φ. The constants of T which are not in T0 may need to be renamed and retagged to avoid conflicts with the constants in T′.
– Φ′ is an interpretation of thy(T) in thy(T″) which extends Φ.

For further details, see [7].
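The description of transport-thm-obj can be sketched as follows (illustrative Python; the translation Φ is modeled as a plain function on sentence strings, the install step is passed in as a callback, and the combined justification is only symbolic):

```python
def transport_thm_obj(thm, interp, install):
    """Sketch of transport-thm-obj.  `thm` is ([T0], A*, J0); `interp`
    is ([T], [T'], Phi, J).  The translated theorem Phi(A*) is installed
    in the target theory; the new justification is formed (here only
    symbolically) from J0 and J."""
    source_name, sentence, j0 = thm
    _, target_name, phi, j = interp
    new_thm = (target_name, phi(sentence), ("from", j0, j))
    install(new_thm, target_name)
    return new_thm
```

For example, an interpretation of an additive monoid theory in a multiplicative one might translate sentences by renaming the operation symbol.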
This notion of theory instantiation is closely related to the one proposed by Burstall and Goguen [2]; in both approaches a theory is instantiated via an interpretation. However, in our approach, any theory can be instantiated with respect to any of its subtheories. In the Burstall-Goguen approach, only “parameterized theories” can be instantiated, and only with respect to the explicit parameter of the parameterized theory.
5.2  Foundational Theory Development System
A theory development system is foundational if every theory developed in the system is consistent relative to one or more “foundational” theories which are known or regarded to be consistent. Since the operations for installing theorems, definitions, and profiles in a theory always produce conservative extensions of the original theory by Theorem 3, these operations preserve consistency. Therefore, a foundational theory development system can be implemented on top of the infrastructure design simply by using a new operation for creating theory objects that succeeds only when the theory stored in the object is consistent relative to one of the foundational theories.

The new operation can be defined as follows. Suppose T∗ is a foundational theory. Given a string n, a language L, a set Γ of sentences, theory objects T1, . . . , Tm ∈ S, a translation Φ, and a justification J as input, if J is a justification that Φ is an interpretation of T = (L, Γ) in thy(T∗), the new operation would invoke create-thy-obj on (n, L, Γ, {[T1], . . . , [Tm]}) to create a theory object T and then invoke create-int-obj on ([T], [T∗], Φ, J) to create an interpretation object I; otherwise, the operation fails. If the operation is successful and J is correct, then thy(T) would be consistent relative to thy(T∗) by Theorem 1.
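The guarded creation operation might look like this (an illustrative sketch; `justifies` stands in for checking the justification J, which a real system would verify proof-theoretically, and the create-thy-obj and create-int-obj calls are inlined as plain data construction):

```python
def create_foundational_thy_obj(name, language, axioms, parents,
                                phi, justifies, foundational, state):
    """Sketch of the guarded operation: succeed only when `justifies`
    certifies that phi is an interpretation of (language, axioms) in the
    foundational theory; then create the theory object and the
    interpretation object."""
    if not justifies(phi, (language, axioms), foundational):
        raise ValueError("no interpretation into the foundational theory")
    thy = {"name": name, "language": set(language), "axioms": set(axioms),
           "subtheories": {p["name"] for p in parents}}
    state[name] = thy
    interp = (name, foundational["name"], phi)
    return thy, interp
```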
5.3  Encapsulated Theory Development
Proving a theorem in a theory may require introducing several definitions and proving several lemmas in the theory that would not be useful after the theorem is proved. Such “local” definitions and lemmas would become logical clutter in the theory. One strategy for handling this kind of clutter is to encapsulate local development in an auxiliary theory so that it can be separated from the development of the main theory. The infrastructure design makes this encapsulation possible.

Suppose that one would like to prove a theorem in a theory stored in theory object T using some local definitions and lemmas. One could use create-thy-copy to create a copy T′ of T and create-int-obj to create an interpretation object I storing the identity interpretation of thy(T′) in thy(T). Next, the needed local definitions and lemmas could be installed as definition and theorem objects in T′. Then the theorem could be proved and installed as a theorem object in T′. Finally, the theorem could be transported back to T using the interpretation stored in I. The whole local development needed to prove the theorem would reside in T′, completely outside of the development of T. A different way to encapsulate local theory development is used in the ACL2 theorem prover [15].
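The encapsulation recipe (copy, develop locally, transport the theorem back) can be sketched as follows (illustrative Python, with theory objects as dicts; with the identity interpretation, the transport step is the identity on sentences):

```python
def prove_encapsulated(state, t_name, develop, transport):
    """Sketch of the encapsulation recipe: copy T to a fresh T', run the
    local development (definitions, lemmas, and the target theorem) in
    the copy, then move only the theorem back to T."""
    T = state[t_name]
    copy_name = t_name + "-copy"               # fresh name, illustrative
    state[copy_name] = {**T, "name": copy_name,
                        "theorems": set(T["theorems"])}
    theorem = develop(state[copy_name])        # local clutter stays here
    T["theorems"] = T["theorems"] | {transport(theorem)}
    return theorem
```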
5.4  Sequent-Style Proof System
A goal-oriented sequent-style proof system can be built on top of the intertheory infrastructure. A sequent would have the form T → A∗ where T is a theory object called the context and A∗ is a sentence in the current language of T
called the assertion. The system would include the usual inference rules of a sequent-style proof system plus rules to:

– Install a theorem, definition, or profile into the context of a sequent.
– Transport a theorem, definition, or profile from a theory object to the context of a sequent.

Some of the proof rules, such as the deduction rule, would add axioms to or remove axioms from the context of a sequent, thereby defining new theory objects. The proof rules for universal generalization and existential generalization would be implemented by installing a profile in the context of a sequent.

A sentence A∗ in the current language of a theory object T would be proved as follows. create-thy-copy would be used to create a copy T′ of T and create-int-obj would be used to create an interpretation object I storing the identity interpretation of thy(T′) in thy(T). Then the sequent T′ → A∗ would be proved, possibly with the help of local or imported definitions and lemmas. The contexts created in the course of the proof would be distinct supertheories of T′. A theorem or definition installed in a context appearing in some part of the proof would be available wherever else the context appeared in the proof. When the proof is finished, A∗ would be installed as a theorem object in T′. The theorem could then be transported back to T using the interpretation stored in I. The theory objects needed for the proof (T′ and its supertheories) would be separated from T and the other theory objects in S.
Acknowledgments

Many of the ideas in this paper originated in the design and implementation of IMPS, done jointly by Dr. Joshua Guttman, Dr. Javier Thayer, and the author.
References

1. P. B. Andrews. An Introduction to Mathematical Logic and Type Theory: To Truth through Proof. Academic Press, 1986.
2. R. Burstall and J. Goguen. The semantics of Clear, a specification language. In Advanced Course on Abstract Software Specifications, volume 86 of Lecture Notes in Computer Science, pages 292–332. Springer-Verlag, 1980.
3. A. Church. A formulation of the simple theory of types. Journal of Symbolic Logic, 5:56–68, 1940.
4. H. B. Enderton. A Mathematical Introduction to Logic. Academic Press, 1972.
5. W. M. Farmer. A partial functions version of Church's simple theory of types. Journal of Symbolic Logic, 55:1269–91, 1990.
6. W. M. Farmer. A simple type theory with partial functions and subtypes. Annals of Pure and Applied Logic, 64:211–240, 1993.
7. W. M. Farmer. A general method for safely overwriting theories in mechanized mathematics systems. Technical report, The MITRE Corporation, 1994.
8. W. M. Farmer. Theory interpretation in simple type theory. In J. Heering et al., editors, Higher-Order Algebra, Logic, and Term Rewriting, volume 816 of Lecture Notes in Computer Science, pages 96–123. Springer-Verlag, 1994.
9. W. M. Farmer, J. D. Guttman, and F. J. Thayer Fábrega. IMPS: An updated system description. In M. McRobbie and J. Slaney, editors, Automated Deduction—CADE-13, volume 1104 of Lecture Notes in Computer Science, pages 298–302. Springer-Verlag, 1996.
10. W. M. Farmer, J. D. Guttman, and F. J. Thayer. Little theories. In D. Kapur, editor, Automated Deduction—CADE-11, volume 607 of Lecture Notes in Computer Science, pages 567–581. Springer-Verlag, 1992.
11. W. M. Farmer, J. D. Guttman, and F. J. Thayer. IMPS: An Interactive Mathematical Proof System. Journal of Automated Reasoning, 11:213–248, 1993.
12. J. A. Goguen and T. Winkler. Introducing OBJ3. Technical Report SRI-CSL-88-9, SRI International, August 1988.
13. M. J. C. Gordon and T. F. Melham. Introduction to HOL: A Theorem Proving Environment for Higher Order Logic. Cambridge University Press, 1993.
14. N. Hamilton, R. Nickson, O. Traynor, and M. Utting. Interpretation and instantiation of theories for reasoning about formal specifications. In M. Patel, editor, Proceedings of the Twentieth Australasian Computer Science Conference, volume 19 of Australian Computer Science Communications, pages 37–45, 1997.
15. M. Kaufmann and J S. Moore. Structured theory development for a mechanized logic, 1999. Available at http://www.cs.utexas.edu/users/moore/publications/acl2-papers.html.
16. R. Nakajima and T. Yuasa, editors. The IOTA Programming System, volume 160 of Lecture Notes in Computer Science. Springer-Verlag, 1982.
17. R. Nickson, O. Traynor, and M. Utting. Cogito ergo sum—providing structured theorem prover support for specification formalisms. In K. Ramamohanarao, editor, Proceedings of the Nineteenth Australasian Computer Science Conference, volume 18 of Australian Computer Science Communications, pages 149–158, 1997.
18. J. Rushby, F. von Henke, and S. Owre. An introduction to formal specification and verification using EHDM. Technical Report SRI-CSL-91-02, SRI International, 1991.
19. J. R. Shoenfield. Mathematical Logic. Addison-Wesley, 1967.
20. D. Smith. KIDS: A knowledge-based software development system. In M. Lowry and R. McCartney, editors, Automating Software Design, pages 483–514. MIT Press, 1991.
21. Y. Srinivas and R. Jullig. Specware: Formal support for composing software. In Proceedings of the Conference on Mathematics of Program Construction, 1995.
G¨ odel’s Algorithm for Class Formation Johan Gijsbertus Frederik Belinfante Georgia Institute of Technology, Atlanta, GA 30332–0160 (U.S.A.) [email protected]
Abstract. A computer implementation of Gödel's algorithm for class formation in Mathematica™ is useful for automated reasoning in set theory. The original intent was to forge a convenient preprocessing tool to help prepare input files for McCune's automated reasoning program Otter. The program is also valuable for discovering new theorems. Some applications are described, especially to the definition of functions. A brief extract from the program is included in an appendix.
1  Introduction
Robert Boyer et al. (1986) proposed clauses capturing the essence of Gödel's finite axiomatization of the von Neumann–Bernays theory of sets and classes. Their work was simplified significantly by Art Quaife (1992a and 1992b). About four hundred theorems of elementary set theory were proved using McCune's automated reasoning program Otter. A certain degree of success has been achieved recently (1999a and 1999b) in extending Quaife's work. Some elementary theorems of ordinal number theory were proved, based on Isbell's definition (1960) of ordinal number, which does not require the axiom of regularity to be assumed.

An admitted disadvantage of Gödel's formalism is the absence of the usual class formation {x | p(x)} notation. Replacing the axiom schema for class formation are a small number of axioms for certain basic class constructions. Definitions of classes must be expressed in terms of two basic classes, the universal class V and the membership relation E, and seven other basic class constructors: the unary constructors complement, domain, flip, and rotate, and the binary constructors pairset, cart, and intersection. Gödel also included an axiom for inverse, but it can be deduced from the others.
2  A Brief Description of the GOEDEL Program
As a replacement for the axiom schema for class formation, Kurt Gödel (1940) proved a fundamental Class Existence Metatheorem Schema for class formation. His proof of this metatheorem is constructive; a recursive algorithm for converting customary definitions of classes using class formation to expressions built out of the primitive constructors is presented, together with a proof of termination. An implementation of Gödel's algorithm in Mathematica™ was created (1996)

D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 132–147, 2000.
© Springer-Verlag Berlin Heidelberg 2000
G¨ odel’s Algorithm for Class Formation
133
to help prepare input files for proofs in set theory using McCune's automated reasoning program Otter. The likelihood of success in proving theorems using programs like Otter depends critically on the simplicity of the definitions used and the brevity of the statements of the theorems to be proved. To mitigate the effects of combinatorial explosion, one typically sets a weight limit to exclude complicated expressions from being considered. Although combinatorial explosion cannot be prevented, the idea is to snatch a proof quickly before the explosion gets well under way.

Because one needs compact definitions for practical applications, and because the output of Gödel's original algorithm is typically extremely complicated, a large number of simplification rules were added to the Mathematica implementation of Gödel's algorithm. With the addition of simplification rules, Gödel's proof of termination no longer applies. No assurance can be given that the added simplification rules will not cause looping to occur, but we have tested the program on a suite of several thousand examples, and it appears that it can be used as a practical tool to help formulate definitions and to simplify the statements of theorems. The GOEDEL program contains no mechanism for carrying out deductions, but it does sometimes manage to prove statements by simplifying them to True.

Much of the complexity of Gödel's original algorithm stems from his use of Kuratowski's definition for ordered pairs. The Mathematica implementation does not assume any particular construction of ordered pairs, but instead includes additional rules to deal with ordered pairs. The self-membership rule in the original algorithm was modified because in our work on ordinal numbers the axiom of regularity is not assumed. The stripped-down version of the GOEDEL program presented in the Appendix omits many membership rules for defined constructors as well as most of the simplification rules.
The modified G¨ odel’s algorithm is presented as a series of definitions for a Mathematica function class[x,p]. The first argument x, assumed to be the name of a set, must be either an atomic symbol, or an expression of the form pair[u, v] where u and v in turn are either atomic symbols or pairs, and so on. It should be noted that G¨ odel did not allow both u and v to be pairs, but this unnecessary limitation has been removed to make the formalism more flexible. The second argument p is some statement which can involve the variables that appear in x, as well as other variables that may represent arbitrary classes (not just sets). The statement can contain quantifiers, but the quantified variables must be sets. The G¨ odel algorithm does not apply to statements containing quantifiers over proper classes. The quantifiers forall and exists used in the GOEDEL program are explicitly restricted to set variables. A few simple examples will be presented to illustrate how the GOEDEL program is used. For convenience, Mathematica style notation will be employed, which does not quite conform to the notational requirements of Otter. For example, Mathematica permits one to define intersection to be an associative and commutative function of any number of variables. For brevity we write
a −→ b to mean that Mathematica input a produces Mathematica output b for some version of the GOEDEL program. The functions FIRST and SECOND, which project out the first and second components of an ordered pair, respectively, can be specified as the classes

  class[pair[pair[x, y], z], equal[z, x]] −→ FIRST,
  class[pair[pair[x, y], z], equal[z, y]] −→ SECOND.

Examples which involve quantifiers include the domain and range of a relation:

  class[x, exists[y, member[pair[x, y], z]]] −→ domain[z],
  class[y, exists[x, member[pair[x, y], z]]] −→ range[z].

It is implicitly assumed that all quantified variables refer to sets, but the free variable z here can stand for any class.
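The two quantifier examples above are instances of fixed rewrite patterns. A toy recognizer for just these two patterns, with terms encoded as nested tuples, can illustrate the idea (this is an illustration only, not the Mathematica implementation or the full algorithm):

```python
def class_(x, p):
    """Toy recognizer for two class-formation patterns.  Terms are nested
    tuples: ("exists", var, body), ("member", term, cls), ("pair", a, b).
    Only the domain and range patterns from the text are covered."""
    if p[0] == "exists":
        _, y, body = p
        if body[0] == "member" and body[1] == ("pair", x, y):
            return ("domain", body[2])   # {x | exists y. (x, y) in z}
        if body[0] == "member" and body[1] == ("pair", y, x):
            return ("range", body[2])    # {y | exists x. (x, y) in z}
    raise NotImplementedError("pattern not covered in this sketch")
```

The real algorithm instead recurses over arbitrary statements, compiling them into the primitive constructors; this sketch only pattern-matches the two shapes shown.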
3  Eliminating Flip and Rotate
G¨ odel’s algorithm uses two special constructors flip[x] and rotate[x] which produce ternary relations. The ternary relation flip[x] is class[pair[pair[u, v], w], member[pair[pair[v, u], w], x]] while rotate[x] is class[pair[pair[u, v], w], member[pair[pair[v, w], u], x]]. Because these functors are not widely used in mathematics, it may be of interest to note that they could be eliminated in favor of more familiar ones. One can rewrite flip[x] as composite[x, SWAP], where SWAP = flip[Id] is the relation class[pair[pair[u, v], pair[x, y]], and[equal[u, y], equal[v, x]]] −→ SWAP. Note that the functions that project out the first and second members of an ordered pair are related by SECOND = flip[FIRST] and FIRST = flip[SECOND]. The general formula for rotate[x] is more complicated, but G¨odel’s algorithm actually only involves the special case where x is a Cartesian product. In this special case one has the simple formula, rotate[cart[x , y ]] := composite[x, SECOND, id[cart[y, V]]]. Using these formulas, the constructors flip and rotate could be completely eliminated from G¨ odel’s algorithm, as well as from G¨ odel’s axioms for class theory, if one instead takes as primitives the constructors composite, inverse, FIRST and SECOND. We have done so in the abbreviated version of the GOEDEL program listed in the Appendix. The function SWAP mentioned above, for example, could be defined in terms of these new primitives as intersection[composite[inverse[FIRST],SECOND], composite[inverse[SECOND],FIRST]] := SWAP.
G¨ odel’s Algorithm for Class Formation
4
135
Equational Set Theory without Variables
The simplification rules in the GOEDEL program can be used not only to simplify descriptions of classes, but can also be induced to simplify statements. Given any statement p, one can form the class class[w, p] where w is any variable that does not occur in the statement p. This class is the universal class V if p is true, and is the empty class when p is false. One can form a new statement equivalent to the original one by the definition

  assert[p_] := Module[{w = Unique[]}, equal[V, class[w, p]]]

The occurrence of class causes Gödel's algorithm to be invoked, the meaning of the statement p to be interpreted, and the simplification rules in the GOEDEL program to be applied. While there can be no assurance that the transformed statement will actually be simpler than the statement one started with, in practice it often is. For instance, the input

  assert[equal[composite[cross[x, x], DUP], composite[DUP, x]]]

produces the statement FUNCTION[composite[Id, x]] as output. To improve the readability of the output, in the current version of the GOEDEL program, rules have been added which may convert the equations obtained with assert back to simpler non-equational statements.

Since some theorem provers are limited to equational statements, it is of interest to reformulate set theory in equational terms. Alfred Tarski and Steven Givant (1987) have shown that all statements of set theory can be reformulated as equations without variables, somewhat reminiscent of combinatory logic. But whereas combinatory logic uses function-like objects as primitives, their calculus is based on the theory of relations. It has recently been proposed by Omodeo and Formisano (1998) that this formalism be used to recast set theory in a form accessible to purely equational automated reasoning programs. It is interesting to note that the assert mechanism in the GOEDEL program achieves the same objective. Any statement is converted by assert into an equation of the form equal[V, x].
If one prefers, one may also write this equation in the equivalent form equal[0, complement[x]]. Another consequence of the assert process is that one can always convert negative statements into positive ones. For example, the negative statement not[equal[0, x]] is converted by assert into the equivalent positive statement equal[V, image[V, x]]. Thus it appears that at least in set theory it does not make too much sense to make a big distinction between positive and negative literals, because one can always convert the one into the other. Also, one can always convert a clause with several literals into a unit clause; the clause or[equal[0, x], equal[0, y]], for example, is equivalent to the unit clause equal[0, intersection[image[V, x], image[V, y]]]. The class image[V, x] which appears in these expressions is equal to the empty set if x is empty, and is equal to the universal class V if x is not empty. This class
136
Johan Gijsbertus Frederik Belinfante
is quite useful for reformulating conditional statements as unconditional ones. Many equations in set theory hold only for sets and not for proper classes. For example, the union of the singleton of a class x is x when x is a set, but is the empty set otherwise. This rule can be written as a single equation which applies to both cases as follows: U[singleton[x_]] := intersection[x, image[V, singleton[x]]] (This is in fact one of the thousands of rules in the GOEDEL program.) Although such unconditional statements initially appear to be more complex than the conditional statements that they replace, experience both with Otter and with the GOEDEL program indicates that the unconditional statements are in fact preferable. In Otter the unconditional rule can often be added to the demodulator list. In Mathematica, an unconditional simplification rule generally works faster than a conditional one. When assert is applied to a statement containing quantifiers, the statement is converted to a logically equivalent equation without quantifiers. All quantified variables are eliminated. What happens is that the quantifiers are neatly built into equivalent set-theoretic constructs like domain and composite. For example, the axiom of regularity is usually formulated using quantifiers as: implies[not[equal[x, 0]], exists[u, and[member[u, x], disjoint[u, x]]]]. When assert is applied, this statement is automatically converted into the equivalent quantifier-free statement or[equal[0, x], not[subclass[x, complement[P[complement[x]]]]]]. In this case the quantifier was hidden in the introduced power class functor. Replacing x by its complement, one obtains the following neat reformulation of the axiom of regularity: implies[subclass[P[x], x], equal[x, V]]. That is, the axiom of regularity says that the universal class is the only class which contains its own power class. When the axiom of regularity is not assumed, there may be other classes with this property.
In particular, the Russell class RUSSELL = complement[fix[E]] has this property, a fact that is useful in the Otter proofs in ordinal number theory. This reformulation of the axiom of regularity has the advantage over the original one in that its clausification does not introduce new Skolem functions.
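The reformulation implies[subclass[P[x], x], equal[x, V]] can be verified exhaustively on a small well-founded model. The Python sketch below is an illustration, not part of the GOEDEL program; it takes the hereditarily finite sets of rank at most 3 (16 sets) as the universe and checks that the full universe is the only class containing its own relativized power class:

```python
from itertools import combinations

def powerset(s):
    s = list(s)
    return [frozenset(c) for n in range(len(s) + 1) for c in combinations(s, n)]

hf = {frozenset()}
for _ in range(3):
    hf = set(powerset(hf))              # a small well-founded universe (16 sets)
hf_list = list(hf)
n = len(hf_list)
full = set(hf_list)                     # plays the role of the universal class V

for mask in range(1 << n):              # every class x of this model
    x = {hf_list[i] for i in range(n) if mask >> i & 1}
    Px = {s for s in hf_list if s <= x} # the relativized power class P[x]
    if Px <= x:                         # subclass[P[x], x] ...
        assert x == full                # ... forces equal[x, V]
print("ok")
```

In a well-founded universe the check succeeds for all 2^16 classes; a model violating regularity could have other classes with this property, as noted above.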
5 Functions, Vertical Sections, and Cancellation Machines
The process of eliminating variables and hiding quantifiers is facilitated by having available a supply of standard functions corresponding to the primitive constructors, as well as important derived constructors. For example, Quaife introduced the function SUCC corresponding to the constructor succ[x_] := union[x, singleton[x]]
Gödel's Algorithm for Class Formation
137
so that the statement that the set omega of natural numbers is closed under the successor operation could be written in the compact variable-free form as the condition subclass[image[SUCC, omega], omega]. This is just one of the many techniques that Quaife exploited to reduce the plethora of Skolem functions that had appeared in the earlier work of Robert Boyer et al. Replacing the function symbols of first order logic by bona fide set-theoretic functions not only helps to eliminate Skolem functions, but also improves the readability of the statements of theorems. A standard way to obtain definitions for most of these functions is in terms of a basic constructor VERTSECT, enabling one to introduce a lambda calculus for defining functions by specifying the result obtained when they are applied to an input. The basic idea is not limited to functions; any relation can be specified by giving a formula for its vertical sections. The vertical sections of a relation z are the family of classes image[z, singleton[x]] = class[y, member[pair[x, y], z]]. One is naturally led to introduce the function which assigns these vertical sections: VERTSECT[z] == class[pair[x, y], equal[y, image[z, singleton[x]]]] (Formisano and Omodeo (1998) call this function ∇(z).) Gödel's algorithm converts this formula to the expression VERTSECT[z] == composite[Id,intersection[ complement[composite[E,complement[z]]], complement[composite[complement[E],z]]]]. Of course, for many relations z the vertical sections need not be sets. The domain of VERTSECT[z] in general is the class of all sets x for which image[z, singleton[x]] is also a set. We call a relation thin when all vertical sections are sets. In addition to functions, there are many important relations, such as inverse[E] and inverse[S], that are thin. Using Otter, we have proved many facts about VERTSECT, making it unnecessary to repeat such work for individual functions.
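The notion of vertical sections can be made concrete on a finite relation. The Python sketch below is illustrative only (the names image and vertsect are hypothetical, not the GOEDEL program's); it computes VERTSECT of a relation as the function assigning to each x the section image[z, singleton[x]]. On a finite universe every relation is thin, so the assignment is defined everywhere:

```python
U = {0, 1, 2}

def image(r, x):
    """image[r, x]: all b with (a, b) in r for some a in x."""
    return frozenset(b for (a, b) in r if a in x)

def vertsect(z):
    """VERTSECT[z]: the function taking each x to its vertical section."""
    return {(x, image(z, {x})) for x in U}

z = {(0, 1), (0, 2), (1, 1)}        # an arbitrary relation on U
vs = dict(vertsect(z))
assert vs[0] == frozenset({1, 2})   # the vertical section at 0
assert vs[1] == frozenset({1})
assert vs[2] == frozenset()
# on a finite universe every relation is thin: all sections are sets
assert set(vs) == U
print("ok")
```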
When f is a function, image[f, singleton[x]] is a singleton, and one can select the element in that singleton by applying either the sum class operation U, as Quaife does, or by applying the unary intersection operation A defined by class[u, forall[v, implies[member[v, x], member[u, v]]]] −→ A[x] or equivalently, complement[image[complement[inverse[E]], x]] −→ A[x]. The difference between using U and A only affects the case that x is a proper class. Nevertheless, using A instead of U in the definition of application has many practical advantages. For example, one can use VERTSECT to obtain a formula for any function from a formula for its application A[image[f, singleton[x]]]. This can be done neatly in the GOEDEL program by introducing the Mathematica definition
lambda[x_,e_] := Module[{y=Unique[]},VERTSECT[class[pair[x,y],member[y,e]]]]
This Mathematica function lambda satisfies:
FUNCTION[f] := True; lambda[x, A[image[f, singleton[x]]]] −→ f.
It should be noted that nothing like this works when one replaces A by U because U does not distinguish between 0 and singleton[0], whereas A does. For the constant function f := cart[x, singleton[0]], for example, one has U[image[f, singleton[y]]] −→ 0, whereas A[image[f, singleton[y]]] −→ complement[image[V, intersection[x, singleton[y]]]]. Because the formula for U[image[f, singleton[y]]] has lost all information about the domain x of the function f, one cannot reconstruct f from this formula, but one can reconstruct f from the formula for A[image[f, singleton[y]]]. As examples of definitions obtained using lambda we mention the function SINGLETON which takes any set to its singleton, lambda[x, singleton[x]] −→ VERTSECT[Id], and the function POWER which takes any set to its power set, lambda[x, P[x]] −→ VERTSECT[inverse[S]]. The function VERTSECT[x] itself satisfies lambda[w, image[x, singleton[w]]] −→ VERTSECT[x]. In addition to VERTSECT, it is also convenient to introduce a related constructor IMAGE, defined by VERTSECT[composite[x_, inverse[E]]] := IMAGE[x]. The function IMAGE[x] satisfies lambda[u, image[x, u]] −→ IMAGE[x]. The definition IMAGE[inverse[E]] := BIGCUP of the function BIGCUP which corresponds to the constructor U[x] was one of the first applications found for IMAGE. The function IMAGE[inverse[S]] is the hereditary closure operator, which takes any set x to its hereditary closure image[inverse[S],x]. This function is closely related to the POWER function mentioned earlier. The functions IMAGE[FIRST] and IMAGE[SECOND] take x to its domain and range, respectively, while IMAGE[SWAP] takes x to its inverse. The function IMAGE[cross[u,v]] takes x to composite[v,x,inverse[u]]. For example, the function that corresponds to the constructor flip is IMAGE[cross[SWAP,Id]].
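The claim that A distinguishes 0 from singleton[0] while U does not can be checked directly. In the Python sketch below (an illustration only; U_ and A_ are hypothetical names), U_ is the sum class and A_ the unary intersection, with the convention that the intersection over the empty family is the universal class:

```python
from itertools import combinations

def powerset(s):
    s = list(s)
    return [frozenset(c) for n in range(len(s) + 1) for c in combinations(s, n)]

hf = {frozenset()}
for _ in range(3):
    hf = set(powerset(hf))          # hereditarily finite sets up to rank 3
V = frozenset(hf)                   # the universal class of this model

def U_(x):                          # sum class: union of the members of x
    return frozenset().union(*x) if x else frozenset()

def A_(x):                          # unary intersection; A[0] is the universal class
    return frozenset.intersection(*x) if x else V

zero = frozenset()
# U cannot tell 0 from singleton[0]: both have empty sum class ...
assert U_(zero) == U_(frozenset({zero})) == zero
# ... but A can: A[0] = V while A[singleton[0]] = 0
assert A_(zero) == V
assert A_(frozenset({zero})) == zero
print("ok")
```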
The constructor IMAGE is not a functor in the category theory sense. The function IMAGE[x] does not in general preserve composites, but only when the right hand factor is thin: domain[VERTSECT[t]] := V; IMAGE[composite[x, t]] −→ composite[IMAGE[x], IMAGE[t]]. IMAGE preserves the global identity function: IMAGE[Id] −→ Id; but in general IMAGE[id[x]] is not an identity function. It is nonetheless a useful function: lambda[w, intersection[x, w]] −→ IMAGE[id[x]]. An important application of VERTSECT is to provide a mechanism for recovering a function f from a formula for composite[inverse[E], f]. One can use VERTSECT to cancel factors of inverse[E]; for example, the Mathematica input FUNCTION[f1] := True; FUNCTION[f2] := True; domain[f1] := V; domain[f2] := V; Map[VERTSECT, composite[inverse[E],f1]==composite[inverse[E],f2]] produces the output f1 == f2. When the assumptions about the domains of the functions are omitted, the results are slightly more complicated, but one can nonetheless obtain a formula for each function in terms of the other. It is possible to use VERTSECT to construct other such cancellation machines which cancel factors of S, inverse[S] or DISJOINT. These machines were found to be quite useful in our investigations of the binary functions which correspond to the constructors intersection, cart, union and so forth.
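The cancellation of a factor inverse[E] can be replayed on a finite model. The Python sketch below is an illustration (assuming total functions on a hereditarily finite universe, with hypothetical helper names): composite[inverse[E], f] relates each argument a to the members of f(a), so f is recovered from the composite by extensionality, which is exactly why the factor inverse[E] cancels.

```python
import random
from itertools import combinations

def powerset(s):
    s = list(s)
    return [frozenset(c) for n in range(len(s) + 1) for c in combinations(s, n)]

hf = {frozenset()}
for _ in range(3):
    hf = set(powerset(hf))          # universe: hereditarily finite sets up to rank 3
sets = list(hf)

def comp_invE(f):
    """composite[inverse[E], f]: relates a to every member of f(a)."""
    return {(a, c) for a, b in f.items() for c in b}

random.seed(0)
for _ in range(50):
    f = {a: random.choice(sets) for a in sets}   # a random total function
    R = comp_invE(f)
    # f(a) is exactly {c : (a, c) in R}, so f is recovered from the composite
    for a in sets:
        assert frozenset(c for (x, c) in R if x == a) == f[a]
print("ok")
```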
6 Binary Functions and Proof by Rotation
Binary functions such as CART, CAP, CUP, corresponding to the constructors cart, intersection, union, are important for obtaining variable-free expressions in many applications. To apply the lambda formalism to these functions, it is convenient to introduce the abbreviations first[x_] := A[domain[singleton[x]]]; second[x_] := A[range[singleton[x]]]. One then has lambda[x, intersection[first[x], second[x]]] −→ CAP, lambda[x, union[first[x], second[x]]] −→ CUP, lambda[x, cart[first[x], second[x]]] −→ CART. (It should be noted that first and second here are technically different from the rather similar constructors 1st and 2nd introduced by Quaife.) Although Gödel's rotate functor can be completely eliminated, nevertheless it does in fact have many desirable properties. For example, the rotate functor
preserves unions, intersections and relative complements, whereas composite preserves only unions. In the study of binary functions, the rotate constructor has turned out to be extremely useful. Often we can take one equation for binary functions and rotate it to obtain another. The SYMDIF function corresponding to the symmetric difference operation is rotation invariant. Schröder's transposition theorem can be given a succinct variable-free formulation as the statement that the relation composite[DISJOINT, COMPOSE] is rotation invariant, where DISJOINT is class[pair[x, y], disjoint[x, y]], and COMPOSE is the binary function corresponding to composite. We mention three applications of these binary functions for defining classes. The class of all transitive relations can be specified as: class[x, subclass[composite[x, x], x]] −→ fix[composite[S, COMPOSE, DUP]]. The class of all disjoint collections, specified as the input class[z,forall[x,y, implies[and[member[x,z],member[y,z]], or[equal[x,y],disjoint[x,y]]]]] produces fix[image[inverse[CART], P[union[DISJOINT, Id]]]] as output. The class of all topologies, input as class[t,and[subclass[image[BIGCUP,P[t]],t], subclass[image[CAP,cart[t,t]],t]]] produces the output intersection[ complement[fix[composite[complement[E],BIGCUP,inverse[S]]]], fix[composite[S,IMAGE[CAP],CART,DUP]]].
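The specification of transitive relations by subclass[composite[x, x], x] can be checked by brute force over all 512 relations on a three-element universe. The following Python sketch is illustrative only (the helper names are not the GOEDEL program's):

```python
from itertools import product

U = [0, 1, 2]
pairs = list(product(U, U))

def composite(x, y):
    """relational composition: (a, c) with (a, b) in y and (b, c) in x."""
    return {(a, c) for (a, b) in y for (b2, c) in x if b == b2}

def transitive(r):
    """textbook definition with explicit quantifiers over the universe."""
    return all(not ((a, b) in r and (b, c) in r) or (a, c) in r
               for a in U for b in U for c in U)

# subclass[composite[x, x], x] holds exactly for the transitive relations:
for bits in product([0, 1], repeat=len(pairs)):
    r = {p for p, keep in zip(pairs, bits) if keep}
    assert (composite(r, r) <= r) == transitive(r)
print("ok")
```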
7 Conclusion
Proving theorems in set theory with a first order theorem prover such as Otter is greatly facilitated by the use of a companion program GOEDEL which permits one to automatically translate from the notations commonly used in mathematics to the special language needed for the Gödel theory of classes. Having an arsenal of set-theoretic functions that correspond to the function symbols of first order logic proves to be useful for systematically eliminating existential quantifiers and thereby avoiding the Skolem functions produced when formulas with existential quantifiers are converted to clause form. Although the main focus in this talk was on the use of the GOEDEL program to help find convenient definitions for all these functions, the GOEDEL program also permits one to
discover useful formulas that these functions satisfy. By adding these formulas as new simplification rules, the program has grown increasingly powerful over the years. The GOEDEL program currently contains well over three thousand simplification rules, many of which have been proved valid using Otter. The simplification rules can be used not only to simplify definitions, but also to simplify statements. This power to simplify statements has led to the discovery of many new formulas, especially new demodulators. Experience with Otter indicates that searches for proofs are dramatically improved by the presence of demodulators even when they are not directly used in the proof of a theorem, because they help to combat combinatorial explosion.
Appendix. An Extract from the GOEDEL Program Print[":Package Title: GOEDEL.M 2000 January 13 at 6:45 a.m. "]; (* :Context: Goedel‘ :Mathematica Version: 3.0 :Author: Johan G. F. Belinfante :Summary: The GOEDEL program implements Goedel’s algorithm for class formation, modified to avoid assuming the axiom of regularity, and Kuratowski’s construction of ordered pairs. :Sources: <description of algorithm, information for experts> Kurt Goedel, 1939 monograph on consistency of the axiom of choice and the generalized continuum hypothesis, pp. 9-14. :Warnings: <description of global effects, incompatibilities> 0 is used to represent the empty set. E is used to represent the membership relation. :Limitations: <special cases not handled, known problems> The simplification rules are not confluent; termination is not assured. There is no user control over the order in which simplification rules are applied. This stripped down version of GOEDEL51.A23 lacks 95% of the simplification rules needed to produce good output. Mathematica’s builtin Tracing commands are the only mechanism for discovering what rules were actually applied. :Examples: Sample files are available for various test suites. *) BeginPackage["Goedel‘"] and::usage = "and[x,y,...] is conjunction" assert::usage = "assert[p] produces a statement equivalent to p by applying Goedel’s algorithm to class[w,p]. Applying assert repeatedly sometimes simplifies a statement." cart::usage = "cart[x,y] is the cartesian product of classes x and y." class::usage = "class[x,p] applies Goedel’s algorithm to the class of all sets x satisfying the condition p. The variable x may be atomic, or of the form pair[u,v], where u and v in turn can be pairs, etc." complement::usage = "complement[x] is the class of all sets that do not belong to x" composite::usage = "composite[x,y,...]
is the composite of x,y, ... "
domain::usage = "domain[x] is the domain of x" E::usage = "E is the membership relation" equal::usage = "equal[x,y] is the statement that the classes x and y are equal"
exists::usage = "exists[x,y,...,p] means there are sets x,y, ... such that p" FIRST::usage = "FIRST is the function which takes pair[x,y] to x" forall::usage = "forall[x,y,..., p] means that p holds for all sets x,y,..." Id::usage = "Id is the identity relation" id::usage = "id[x] is the restriction of the identity relation to x" image::usage = "image[x,y] is the image of the class y under x" intersection::usage = "intersection[x,y,...] is the intersection of classes x,y,..." inverse::usage = "the relation inverse[x] is the inverse of x" LeftPairV::usage = "LeftPairV is the function that takes x to pair[V,x]" member::usage = "member[x,y] is the statement that x belongs to y" not::usage = "not[p] represents the negation of p" or::usage = "or[x,y,...] is the inclusive or" P::usage = "the power class P[x] is the class of all subsets of x" pair::usage = "pair[x,y] is the ordered pair of x and y." range::usage = "range[x] is the range of x" RightPairV::usage = "RightPairV is the function that takes x to pair[x,V]" S::usage = "S is the subset relation" SECOND::usage = "SECOND is the function that maps pair[x,y] to y" singleton::usage = "singleton[x] has no member except x; it is 0 if x is not a set" subclass::usage = "subclass[x,y] is the statement that x is contained in y" U::usage = "the sum class U[x] is the union of all sets belonging to x" union::usage = "union[x,y,...]
is the union of the classes x,y,... "
V::usage = "the universal class" Begin["‘Private‘"] (* begin the private context *) (* definitions of auxiliary functions not exported *) varlist[u_] := {u} /; AtomQ[u] varlist[pair[u_,v_]] := Union[varlist[u],varlist[v]] (* Is the expression x free of all variables which occur in y? *) allfreeQ[x_,y_] := Apply[And,Map[FreeQ[x,#]&,varlist[y]]] (* definitions of exported functions *) (* Rules that must be assigned before attributes are set. *) and[p_] := p or[p_] := p Attributes[and] := {Flat, Orderless, OneIdentity} Attributes[or] := {Flat, Orderless, OneIdentity} composite[x_] := composite[Id,x] intersection[x_] := x union[x_] := x Attributes[composite] := {Flat,OneIdentity} Attributes[intersection] := {Flat, Orderless, OneIdentity} Attributes[union] := {Flat, Orderless, OneIdentity}
not[True] := False not[False] := True
(* Truth Table *)
(* abbreviation for multiple quantifiers *) exists[x_,y__,p_] := exists[x,exists[y,p]] (* elimination rule for universal quantifiers *) forall[x__,y_] := not[exists[x,not[y]]] (* basic rules for membership *) member[u_,0] := False (* Added to avoid assuming axiom of regularity. Goedel assumes member[x,x] = 0. *) class[w_,member[x_,x_]] := Module[{y=Unique[]},class[w, exists[y,and[member[x,y],equal[x,y]]]]] class[z_,member[w_,cart[x_,y_]]] := Module[{u = Unique[],v = Unique[]}, class[z,exists[u,v,and[equal[pair[u,v],w],member[u,x],member[v,y]]]]] member[pair[u_,v_],cart[x_,y_]] := and[member[u,x],member[v,y]] member[u_,complement[x_]] := and[member[u,V],not[member[u,x]]] class[w_,member[z_,composite[x_,y_]]] := Module[{t=Unique[],u=Unique[],v=Unique[]}, class[w,exists[t,u,v,and[equal[z,pair[u,v]],and[member[pair[u,t],y], member[pair[t,v],x]]]]]] class[z_,member[w_,cross[x_,y_]]] := Module[{u1=Unique[],u2=Unique[],v1=Unique[],v2=Unique[]}, class[z,exists[u1,u2,v1,v2,and[equal[pair[pair[u1,u2],pair[v1,v2]],w], member[pair[u1,v1],x],member[pair[u2,v2],y]]]]] (* Goedel’s definition 1.5 *) class[w_,member[u_,domain[x_]]] := Module[{v = Unique[]},class[w,exists[v, and[member[u,V],member[pair[u,v],x]]]]] class[z_,member[w_,E]] := Module[{u = Unique[],v = Unique[]},class[z,exists[u,v, and[equal[pair[u,v],w],member[u,v]]]]] class[w_,member[x_,FIRST]] := Module[{u = Unique[],v = Unique[]},class[w, exists[u,v,equal[pair[pair[u,v],u],x]]]] class[z_,member[w_,Id]] := Module[{u = Unique[]},class[z,exists[u,equal[w,pair[u,u]]]]] class[z_,member[w_,id[x_]]] := Module[{u = Unique[]},class[z,exists[u, and[member[u,x],equal[w,pair[u,u]]]]]] class[w_,member[v_,image[z_,x_]]] := Module[{u = Unique[]},class[w,exists[u, and[member[v,V],member[u,x],member[pair[u,v],z]]]]] member[u_,intersection[x_,y_]] := and[member[u,x],member[u,y]] class[x_,member[w_,inverse[z_]]] := Module[{u = Unique[],v = Unique[]}, class[x,exists[u,v,and[equal[pair[u,v],w],member[pair[v,u],z]]]]] 
class[x_,member[w_,LeftPairV]] := Module[{u = Unique[],v = Unique[]},class[x, exists[u,v,and[equal[pair[u,v],w],equal[v,pair[V,u]]]]]] member[x_,P[y_]] := and[member[x,V],subclass[x,y]] class[u_,member[v_,pair[x_,y_]]] := Module[{z=Unique[]},class[u,exists[z, and[equal[pair[x,y],z],member[v,z]]]]] class[w_,member[v_,range[z_]]] := Module[{u = Unique[]},class[w,exists[u, and[member[v,V],member[pair[u,v],z]]]]] class[x_,member[w_,RightPairV]] := Module[{u = Unique[],v = Unique[]},class[x, exists[u,v,and[equal[pair[u,v],w],equal[v,pair[u,V]]]]]]
class[w_,member[x_,S]] := Module[{u = Unique[],v = Unique[]},class[w,exists[u,v, and[equal[pair[u,v],x],subclass[u,v]]]]] class[w_,member[x_,SECOND]] := Module[{u = Unique[],v = Unique[]},class[w, exists[u,v,equal[pair[pair[u,v],v],x]]]] member[u_,singleton[x_]] := and[equal[u,x],member[u,V]] member[u_,union[x_,y_]] := or[member[u,x],member[u,y]] class[w_,member[x_,U[z_]]] := Module[{y = Unique[]},class[w,exists[y, and[member[x,y],member[y,z]]]]] class[w_,subclass[x_,y_]] := Module[{u = Unique[]},class[w, forall[u,or[not[member[u,x]], member[u,y]]]]] class[x_,False]:=0 class[x_,True]:=V /; AtomQ[x] class[pair[u_,v_],True] := cart[class[u,True],class[v,True]] class[u_,member[u_,x_]] := x /; And[FreeQ[x,u],AtomQ[u]] (* axiom B.1 membership relation *) class[pair[u_,v_],member[u_,v_]] := E /; And[AtomQ[u],AtomQ[v]] (* axiom B.2 intersection *) class[x_,and[p_,q_]] := intersection[class[x,p],class[x,q]] class[x_,or[p_,q_]] := union[class[x,p],class[x,q]] (* axiom B.3 complement *) class[x_,not[p_]] := intersection[complement[class[x,p]],class[x,True]] (* axiom B.4 domain and Goedel’s equation 2.8 on page 9 *) class[x_,exists[y_,p_]] := domain[class[pair[x,y],p]] (* axiom B.5 cartesian product *) class[pair[u_,v_],member[u_,x_]] := cart[x,V] /; And[FreeQ[x,u],FreeQ[x,v],AtomQ[u],AtomQ[v]] (* axiom B.6 inverse *) class[pair[u_,v_],member[v_,u_]] := inverse[E] /; And[AtomQ[u],AtomQ[v]] (* an interpretation of Goedel’s equation 2.41 on page 9 *) class[pair[u_,v_],p_] := cart[class[u,p],class[v,True]] /; allfreeQ[p,v] (* an interpretation of Goedel’s equation 2.7 on page 9 *) class[pair[u_,v_],p_] := cart[class[u,True],class[v,p]] /; allfreeQ[p,u] (*
Four rules to replace the rotation rules on Goedel’s page 9: *)
class[pair[pair[u_,v_],w_],p_] := composite[class[pair[v,w],p],SECOND, id[cart[class[u,True],V]]] /; allfreeQ[p,u] class[pair[pair[u_,v_],w_],p_] := composite[class[pair[u,w],p],FIRST, id[cart[V,class[v,True]]]] /; allfreeQ[p,v] class[pair[w_,pair[u_,v_]],p_] := composite[id[cart[class[u,True],V]], inverse[SECOND],class[pair[w,v],p]] /; allfreeQ[p,u] class[pair[w_,pair[u_,v_]],p_] := composite[id[cart[V,class[v,True]]], inverse[FIRST],class[pair[w,u],p]] /; allfreeQ[p,v] (* special maneuver on page 10 of Goedel’s monograph *) class[u_,member[x_,y_]] := Module[{v = Unique[]}, class[u,exists[v,and[equal[x,v],member[v,y]]]]] /; FreeQ[varlist[u],x]
(* new rules for equality *) equal[x_,x_] := True class[pair[u_,v_],equal[u_,v_]] := Id /; And[AtomQ[u],AtomQ[v]] class[pair[u_,v_],equal[v_,u_]] := Id /; And[AtomQ[u],AtomQ[v]] class[x_,equal[x_,y_]] := intersection[singleton[y],class[x,True]] /; allfreeQ[y,x] (* Goedel’s Axiom A.3 of Coextension. *) class[w_,equal[x_,y_]] := intersection[class[w,subclass[x,y]], class[w,subclass[y,x]]] /; And[Or[Not[MemberQ[varlist[w],x]], Not[MemberQ[varlist[w],y]]], Not[SameQ[Head[x],pair]], Not[SameQ[Head[y],pair]]] class[x_,equal[y_,x_]] := intersection[singleton[y],class[x,True]] /; allfreeQ[y,x] equal[pair[x_,y_],0] := False equal[pair[x_,y_],V] := False (* equality of pairs *) equal[pair[u_,v_],pair[x_,y_]] := and[equal[singleton[u],singleton[x]], equal[singleton[v],singleton[y]]] class[w_,equal[singleton[u_],singleton[v_]]] := class[w,equal[u,v]] /; MemberQ[varlist[w],u] || MemberQ[varlist[w],v] || member[u,V] || member[v,V] (* flip equations involving a single pair to put pair on the left *) equal[x_,y_pair] := equal[y,x] (* rules that apply when x or y is known not to be a set *) pair[x_,y_] := pair[V,y] /; Not[V === x] && not[member[x,V]] pair[x_,y_] := pair[x,V] /; Not[V === y] && not[member[y,V]] (* rule that applies when z does not occur in varlist[u] or when z occurs in x or y. *) class[u_,equal[pair[x_,y_],z_]] := Module[{v=Unique[]}, class[u,exists[v,and[equal[pair[x,y],v],equal[v,z]]]]] /; Not[MemberQ[varlist[u],z]] || Not[FreeQ[{x,y},z]] (* rule that applies when z does occur in varlist[w] and z does not occur in either x or y. This rule only applies when x and y are known to be sets. 
*) class[w_,equal[pair[x_,y_],z_]] := Module[{u=Unique[],v=Unique[]}, class[(w/.z->pair[u,v]),and[equal[x,u],equal[y,v]]]] /; And[MemberQ[varlist[w],z],FreeQ[{x,y},z], Or[member[x,V],MemberQ[varlist[w],x]], Or[member[y,V],MemberQ[varlist[w],y]]] (* rule that applies when one does not know whether or not x is a set *) class[u_,equal[pair[x_,y_],z_]] := Module[{v=Unique[]}, class[u,or[and[not[member[x,V]],equal[pair[V,y],z]], exists[v,and[equal[x,v],equal[pair[v,y],z]]]]]] /; Not[MemberQ[varlist[u],x]] && UnsameQ[V,x] && Not[member[x,V] === True] (* rule that applies when one does not know whether or not y is a set *) class[u_,equal[pair[x_,y_],z_]] := Module[{v=Unique[]}, class[u,or[and[not[member[y,V]],equal[pair[x,V],z]], exists[v,and[equal[y,v],equal[pair[x,v],z]]]]]] /; Not[MemberQ[varlist[u],y]] && UnsameQ[V,y] && Not[member[y,V] === True] class[pair[u_,v_],equal[pair[V,u_],v_]] := LeftPairV class[pair[u_,v_],equal[pair[u_,V],v_]] := RightPairV class[pair[u_,v_],equal[pair[V,v_],u_]] := inverse[LeftPairV] class[pair[u_,v_],equal[pair[v_,V],u_]] := inverse[RightPairV] image[inverse[RightPairV],x_] := 0 /; composite[Id,x] == x image[inverse[LeftPairV],x_] := 0 /; composite[Id,x] == x
class[w_,equal[pair[V,y_],z_]] := Module[{v = Unique[]}, class[w,or[and[not[member[y,V]],equal[pair[V,V],z]], and[member[y,V],exists[v,and[equal[pair[V,v],z],equal[v,y]]]]]]] /; Not[allfreeQ[y,w]] class[w_,equal[pair[x_,V],z_]] := Module[{v = Unique[]}, class[w,or[and[not[member[x,V]],equal[pair[V,V],z]], and[member[x,V],exists[v,and[equal[pair[v,V],z],equal[v,x]]]]]]] /; Not[allfreeQ[x,w]] class[w_,equal[pair[V,V],w_]] := singleton[pair[V,V]] class[w_,equal[pair[V,V],x_]] := Module[{v = Unique[]}, class[w,exists[v,and[equal[pair[V,V],v],equal[v,x]]]]] /; Not[MemberQ[x,varlist[w]]] (* assertions *) assert[p_] := Module[{w = Unique[]}, equal[V,class[w,p]]] (* a few simplification rules *) cart[x_,0] := 0 cart[0,x_] := 0 complement[0] := V complement[complement[x_]] := x complement[union[x_,y_]] := intersection[complement[x],complement[y]] complement[V] := 0 composite[x_,cart[y_,z_]] := cart[y,image[x,z]] composite[cart[x_,y_],z_] := cart[image[inverse[z],x],y] composite[Id,x_,y_] := composite[x,y] composite[x_,Id] := composite[Id,x] composite[Id,Id] := Id domain[cart[x_,y_]] := intersection[x,image[V,y]] domain[composite[x_,y_]] := image[inverse[y],domain[x]] domain[Id] := V domain[id[x_]] := x id[V] := Id image[0,x_] := 0 image[x_,0] := 0 image[composite[x_,y_],z_] := image[x,image[y,z]] image[Id,x_] := x image[id[x_],y_] := intersection[x,y] intersection[cart[x_,y_],z_] := composite[id[y],z,id[x]] intersection[V,x_] := x inverse[0] := 0 inverse[cart[x_,y_]] := cart[y,x] inverse[complement[x_]] := composite[Id,complement[inverse[x]]] inverse[composite[x_,y_]] := composite[inverse[y],inverse[x]] inverse[Id] := Id inverse[inverse[x_]] := composite[Id,x] range[Id] := V union[0,x_] := x End[ ] (* end the private context *) Protect[ and, assert, cart, class, complement, composite, domain, E, equal, exists, FIRST, forall, Id, id, image, intersection, inverse, LeftPairV, member, not, or, P, pair, range, RightPairV, S, SECOND, singleton, subclass, U, union, V ] EndPackage[ 
]
(* end the package context *)
References

[1996] Belinfante, J. G. F., On a Modification of Gödel's Algorithm for Class Formation, Association for Automated Reasoning Newsletter, No. 34 (1996), pp. 10–15.
[1997] Belinfante, J. G. F., On Quaife's Development of Class Theory, Association for Automated Reasoning Newsletter, No. 37 (1997), pp. 5–9.
[1999a] Belinfante, J. G. F., Computer Proofs in Gödel's Class Theory with Equational Definitions for Composite and Cross, Journal of Automated Reasoning, vol. 22 (1999), pp. 311–339.
[1999b] Belinfante, J. G. F., On Computer-Assisted Proofs in Ordinal Number Theory, Journal of Automated Reasoning, vol. 22 (1999), pp. 341–378.
[1991] Bernays, P., Axiomatic Set Theory, North-Holland Publishing Co., Amsterdam. First edition 1958; second edition 1968. Republished in 1991 by Dover Publications, New York.
[1986] Boyer, R., Lusk, E., McCune, W., Overbeek, R., Stickel, M., and Wos, L., Set Theory in First Order Logic: Clauses for Gödel's Axioms, Journal of Automated Reasoning, vol. 2 (1986), pp. 287–327.
[1998] Formisano, A., and Omodeo, E. G., An Equational Re-Engineering of Set Theories, presented at the FTP'98 International Workshop on First-Order Theorem Proving (November 23–25, 1998).
[1940] Gödel, K., The Consistency of the Axiom of Choice and of the Generalized Continuum Hypothesis with the Axioms of Set Theory, Princeton University Press, Princeton, 1940.
[1960] Isbell, J. R., A Definition of Ordinal Numbers, The American Mathematical Monthly, vol. 67 (1960), pp. 51–52.
[1994] McCune, W. W., Otter 3.0 Reference Manual and Guide, Argonne National Laboratory Report ANL-94/6, Argonne National Laboratory, Argonne, IL, January 1994.
[1997] Megill, N. D., Metamath: A Computer Language for Pure Mathematics, 1997.
[1993] Noël, P. A. J., Experimenting with Isabelle in ZF Set Theory, Journal of Automated Reasoning, vol. 10 (1993), pp. 15–58.
[1996] Paulson, L. C., and Grąbczewski, K., Mechanizing Set Theory, Journal of Automated Reasoning, vol. 17 (1996), pp. 291–323.
[1992a] Quaife, A., Automated Deduction in von Neumann-Bernays-Gödel Set Theory, Journal of Automated Reasoning, vol. 8 (1992), pp. 91–147.
[1992b] Quaife, A., Automated Development of Fundamental Mathematical Theories, Ph.D. thesis, University of California at Berkeley; Kluwer Academic Publishers, Dordrecht, 1992.
[1987] Tarski, A., and Givant, S., A Formalization of Set Theory without Variables, American Mathematical Society Colloquium Publications, vol. 41, Providence, Rhode Island, 1987.
[1988] Wos, L., Automated Reasoning: 33 Basic Research Problems, Prentice Hall, Englewood Cliffs, NJ, 1988.
[1989] Wos, L., The Problem of Finding an Inference Rule for Set Theory, Journal of Automated Reasoning, vol. 5 (1989), pp. 93–95.
[1992] Wos, L., Overbeek, R., Lusk, E., and Boyle, J., Automated Reasoning: Introduction and Applications, Second Edition, McGraw-Hill, New York, 1992.
[1996] Wolfram, S., The Mathematica Book, Wolfram Media Inc., Champaign, Illinois, 1996.
Automated Proof Construction in Type Theory Using Resolution

Marc Bezem¹, Dimitri Hendriks², and Hans de Nivelle³

¹ Utrecht University, Department of Philosophy [email protected]
² Utrecht University, Department of Philosophy [email protected]
³ Max Planck Institute [email protected]
Abstract. We provide techniques to integrate resolution logic with equality in type theory. The results may be rendered as follows. – A clausification procedure in type theory, equipped with a correctness proof, all encoded using higher-order primitive recursion. – A novel representation of clauses in minimal logic such that the λ-representation of resolution proofs is linear in the size of the premisses. – A translation of resolution proofs into lambda terms, yielding a verification procedure for those proofs. – The power of resolution theorem provers becomes available in interactive proof construction systems based on type theory.
1 Introduction
Type theory (= typed Lambda Calculus) offers a powerful formalism for formalizing mathematics. Strong points are: the logical foundation, the fact that proofs are first-class citizens, and the generality which naturally facilitates extensions, such as inductive types. Type theory captures definitions, reasoning and computation at various levels in an integrated way. In a type-theoretical system, formalized mathematical statements are represented by types, and their proofs are represented by λ-terms. The problem whether π is a proof of statement A reduces to checking whether the term π has type A. Computation is based on a simple notion of rewriting. The level of detail is such that the well-formedness of definitions and the correctness of derivations can automatically be verified. However, there are also weak points. It is exactly the appraised expressivity and the level of detail that makes automation at the same time necessary and difficult. Automated deduction appears to be mostly successful in weak systems, such as propositional logic and predicate logic, systems that fall short of formalizing a larger body of mathematics. Apart from the problem of the expressivity of these systems, only a minor part of the theorems that can be expressed can actually be proved automatically. Therefore it is necessary to combine automated
D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 148–163, 2000. © Springer-Verlag Berlin Heidelberg 2000
Automated Proof Construction in Type Theory Using Resolution
theorem proving with interactive theorem proving. Recently a number of proposals in this direction have been made. In [MS99] Otter is combined with the Boyer-Moore theorem prover. (A verified program rechecks proofs generated by Otter.) In [Hur99] Gandalf is linked to HOL. (The translation generates scripts to be run by the HOL-system.) In [ST95], proofs are translated into Martin-Löf's type theory, for the Horn clause fragment of first-order logic. In the Omega system [Hua96,Omega] various theorem provers have been linked to a natural deduction proof checker. The purpose there is to automatically generate proofs from so-called proof plans. Our approach is different in that we generate complete proof objects for both the clausification and the refutation part. Resolution theorem provers, such as Bliksem [Blk], are powerful, but have the drawback that they work with normal forms of formulae, so-called clausal forms. Clauses are (universally closed) disjunctions of literals, and a literal is either an atom or a negated atom. The clausal form of a formula is essentially its Skolem-conjunctive normal form, which need not be exactly logically equivalent to the original formula. This makes resolution proofs hard to read and understand, and makes the interactive navigation of the theorem prover through the search space very difficult. Moreover, optimized implementations of proof procedures are error-prone (cf. recent CASC disqualifications). In type theory, the proof generation capabilities suffer from the small granularity of the inference steps and the corresponding astronomic size of the search space. Typically, one hyperresolution step requires a few dozen inference steps in type theory. In order to make the formalisation of a large body of mathematics feasible, the level of automation of interactive proof construction systems based on type theory, such as Coq [Coq98], has to be improved. We propose the following proof procedure.
Identify a non-trivial step in a Coq session that amounts to a first-order tautology. Export this tautology to Bliksem, and delegate the proof search to the Bliksem inference engine. Convert the resolution proof to type-theoretic format and import the result back into Coq. We stress the fact that the above procedure is as secure as Coq. Hypothetical errors (e.g. the clausification procedure not producing clauses, possible errors in the resolution theorem prover, or the erroneous formulation of the λ-terms corresponding to its proofs) are irrelevant, because the resulting proofs are type-checked by Coq. The security could be made independent of Coq by using another type-checker. Most of the necessary meta-theory is already known. The negation normal form transformation can be axiomatized by classical logic. The prenex and conjunctive normal form transformations require that the domain is non-empty. Skolemization can be axiomatized by so-called Skolem axioms, which can be viewed as specific instances of the Axiom of Choice. Higher-order logic is particularly suited for this axiomatization: we get logical equivalence modulo classical logic plus the Axiom of Choice, instead of awkward invariants such as equiconsistency or equisatisfiability in the first-order case. By adapting a result of Kleene, Skolem functions and axioms could be eliminated from resolution proofs, which would allow us to obtain directly a proof of
Marc Bezem, Dimitri Hendriks, and Hans de Nivelle
the original formula (cf. [Pfe84]), but currently we still make use of the Axiom of Choice. The paper is organized as follows. In Section 2 we set out a two-level approach and define a deep embedding to represent first-order logic. Section 3 describes a uniform clausification procedure. We explain how resolution proofs are translated into λ-terms in Sections 4 and 5. Finally, the outlined constructions are demonstrated in Section 6.
2 A Two-Level Approach
The basic sorts in Coq are ∗p and ∗s. An object M of type ∗p is a logical proposition and denotes the class of proofs of M. Objects of type ∗s are usual sets such as the set of natural numbers, lists etc. In type theory, the typing relation is expressed by t : T, to be interpreted as 't belongs to set T' when T : ∗s, and as 't is a proof of proposition T' when T : ∗p. As usual, → associates to the right; → is used for logical implication as well as for function spaces. Furthermore, well-typed application is denoted by (M N) and associates to the left. Scopes of bound variables are always extended to the right as far as possible. We use the notation

  T : ∗s := constructor_0 : · · · → T
            ...
            constructor_n : · · · → T

to define the inductive set T, that is: the smallest set of objects that is freely generated by constructor_0, . . . , constructor_n. Moreover, we use

  λt : T. Cases t of pattern_0 ⇒ rhs_0
                     ...
                     pattern_m ⇒ rhs_m

for the exhaustive case analysis on t in the inductive type T. If t matches pattern_i, it is replaced by the right-hand side rhs_i. We opt for a deep embedding, adopting a two-level approach for the treatment of arbitrary first-order languages. The idea is to represent first-order formulae as objects in an inductive set o : ∗s, accompanied by an interpretation function E that maps these objects into ∗p.¹ The next paragraphs explain why we distinguish a higher (meta-, logical) level ∗p and a lower (object-, computational) level o. The universe ∗p includes higher-order propositions; in fact it encompasses full impredicative type theory. As such, it is too large for our purposes. Given a suitable signature, any first-order proposition ϕ : ∗p will have a formal counterpart p : o such that ϕ equals (E p), the interpretation of p. Thus, the first-order
¹ Both o and E depend on a fixed but arbitrary signature.
fragment of ∗p can be identified as the collection of interpretations of objects in o. Secondly, Coq supplies only limited computational power on ∗p, whereas o, as every inductive set, is equipped with the powerful computational device of higher-order primitive recursion. This enables the syntactical manipulation of object-level propositions. Reflection of object-level propositions is used for the proof construction of first-order formulae in ∗p, in the following way. Let ϕ : ∗p be a first-order proposition. Then there is some ϕ̇ : o such that (E ϕ̇) is convertible with ϕ.² Moreover, suppose we have proved ∀p : o. (E (T p)) → (E p) for some function T : o → o. Then, to prove ϕ it suffices to prove (E (T ϕ̇)). Matters are presented schematically in Figure 1. In Section 3 we discuss a concrete function T, for which we have proved the above. For this T, proofs of (E (T ϕ̇)) will be generated automatically, as will be described in Sections 4 and 5.
[Figure 1: a commuting square with ϕ and (E (T ϕ̇)) at the meta-level ∗p, and ϕ̇ and (T ϕ̇) at the object-level o, connected by the translation ˙, the transformation T and the interpretation E.]
Fig. 1. Schematic overview of the general procedure. The proof of the implication from (E (T ϕ̇)) to ϕ can be generated uniformly in ϕ̇.
Object-Level Propositions and the Reflection Operation. In Coq, we have constructed a general framework to represent first-order languages with multiple sorts. Bliksem is (as yet) one-sorted, so we describe the setup for one-sorted signatures only. Assume a domain of discourse σ : ∗s. Suppose we have relation symbols R0, . . . , Rk typed σ^e0 → ∗p, . . . , σ^ek → ∗p respectively. Here e0, . . . , ek are natural numbers and σ^n is the Cartesian product of n copies of σ, that is:

  σ^0 = unit    σ^1 = σ    σ^(n+2) = σ × σ^(n+1)
The set unit is a singleton with sole inhabitant tt.
² The mapping ˙ is a syntax-based translation outside Coq.
Let L be the non-empty³ list of arities [e0, . . . , ek]. We define o, the set of objects representing propositions, inductively by:

  o : ∗s := rel : Πi : (index L). σ^(select L i) → o
            ¬̇ : o → o
            →̇, ∨̇, ∧̇ : o → o → o
            ∀̇, ∃̇ : (σ → o) → o
We use the dot-notation ˙ to distinguish the object-level constructors from Coq's predefined connectives. Connectives are written infix. The function select is of type ΠL : (nelist nat). (index L) → nat, where (index L) computes the set {0, . . . , k}. We have (select L i) =βδι ei. Thus, an atomic proposition is of the form (rel i t), with t an argument tuple in σ^ei. The constructors ∀̇, ∃̇ map propositional functions of type σ → o to propositions of type o. This representation has the advantage that binding and predication are handled by λ-abstraction and λ-application. On the object-level, existential quantification of x over p (of type o, possibly containing occurrences of x) is written as (∃̇ (λx : σ. p)). Although this representation suffices for our purposes, it causes some well-known difficulties. E.g. we cannot write a boolean function which recognizes whether a given formal proposition is in prenex normal form. As there is no canonical choice of a fresh term in σ, it is not possible to recursively descend under abstractions in λ-terms. See [NM98, Sections 8.3, 9.2] for a further discussion. For our purposes, a shallow embedding of function symbols is sufficient. We have not defined an inductive set term representing the first-order terms in σ like we have defined o representing the first-order fragment of ∗p. Instead, 'meta-level' terms of type σ are taken as arguments of object-level predicates. Due to this shallow embedding, we cannot check whether certain variables have occurrences in a given term. Because of that, e.g., distributing universal quantifiers over conjuncts can yield dummy abstractions. These problems could be overcome by using de Bruijn indices (see [dB72]) for a deep embedding of terms in Coq. The interpretation function E is a canonical homomorphism, recursively defined as follows.

  E : o → ∗p := λp : o. Cases p of
    (rel i t) ⇒ (Ri t)
    ¬̇ p0 ⇒ ¬(E p0)
    p1 →̇ p2 ⇒ (E p1) → (E p2)
    p1 ∨̇ p2 ⇒ (E p1) ∨ (E p2)
    p1 ∧̇ p2 ⇒ (E p1) ∧ (E p2)
    (∀̇ p0) ⇒ ∀x : σ. (E (p0 x))
    (∃̇ p0) ⇒ ∃x : σ. (E (p0 x))
³ We require the signature to contain at least one relation symbol.
In the above definitions of o, its constructors and of E, the dependency on the signature has been suppressed. In fact we have:

  o : ∗s → (nelist nat) → ∗s
  rel : Πσ : ∗s. ΠL : (nelist nat). Πi : (index L). σ^(select L i) → (o σ L)
  E : Πσ : ∗s. ΠL : (nelist nat). (Πi : (index L). σ^(select L i) → ∗p) → (o σ L) → ∗p

In the next section, we fix an arbitrary signature and leave the above dependencies implicit.
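The two-level setup can be mirrored in an illustrative Python sketch: a deep embedding of formulas as a datatype, with binders represented as functions from terms to formulas (as in the paper), and an interpretation function playing the role of E. All names here are ours, not the paper's, and Boolean truth values over a finite domain stand in for the universe ∗p.

```python
from dataclasses import dataclass
from typing import Callable

# Deep embedding of object-level propositions (an analogue of the inductive set o).
@dataclass
class Rel:              # (rel i t): relation symbol number i applied to a term tuple
    i: int
    args: tuple

@dataclass
class Not:
    p: object

@dataclass
class Imp:
    p: object
    q: object

@dataclass
class Or:
    p: object
    q: object

@dataclass
class And:
    p: object
    q: object

@dataclass
class Forall:           # binder as a function from terms to formulas, as in the paper
    body: Callable

@dataclass
class Exists:
    body: Callable

def E(p, rels, domain):
    """Interpretation: a homomorphism in the style of E, with Booleans over
    a finite domain standing in for propositions in *p (finiteness makes the
    quantifier cases computable in this sketch)."""
    if isinstance(p, Rel):
        return rels[p.i](*p.args)
    if isinstance(p, Not):
        return not E(p.p, rels, domain)
    if isinstance(p, Imp):
        return (not E(p.p, rels, domain)) or E(p.q, rels, domain)
    if isinstance(p, Or):
        return E(p.p, rels, domain) or E(p.q, rels, domain)
    if isinstance(p, And):
        return E(p.p, rels, domain) and E(p.q, rels, domain)
    if isinstance(p, Forall):
        return all(E(p.body(x), rels, domain) for x in domain)
    if isinstance(p, Exists):
        return any(E(p.body(x), rels, domain) for x in domain)
    raise TypeError(p)
```

For instance, `E(Forall(lambda x: Imp(Rel(0, (x,)), Rel(0, (x,)))), [even], range(3))` evaluates the interpretation of ∀x. R0(x) → R0(x) for a given interpretation of R0.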
3 Clausification and Correctness
We describe the transformation to minimal clausal form (see Section 4), which is realized on both levels. On the object-level, we define an algorithm mcf : o → o that converts object-level propositions into their clausal form. On the meta-level, clausification is realized by a term mcfprf, which transforms a proof of (E (mcf p)) into a proof of (E p). The algorithm mcf consists of the subsequent application of the following functions: nnf, pnf, cnf, sklm, duqc, impf, standing for transformations to negation, prenex and conjunctive normal form, Skolemization, distribution of universal quantifiers over conjuncts and transformation to implicational form, respectively. As an illustration, we describe the functions nnf and sklm. Concerning negation normal form, a recursive call like (nnf ¬̇(A ∧̇ B)) = (nnf ¬̇A) ∨̇ (nnf ¬̇B) is not primitive recursive, since ¬̇A and ¬̇B are not subformulae of ¬̇(A ∧̇ B). Such a call requires general recursion. Coq's computational mechanism is higher-order primitive recursion, which is weaker than general recursion but ensures universal termination. The function nnf : o → pol → o defined below⁴ makes use of the so-called polarity of an input formula. Polarities are:

  pol : ∗s := ⊕ : pol
              ⊖ : pol
⁴ For Q = ∀̇, ∃̇, we write Qx : σ. p instead of (Q (λx : σ. p)).
nnf : o → pol → o := λp : o. λa : pol. Cases p a of
  (rel i t) ⊕ ⇒ (rel i t)
  (rel i t) ⊖ ⇒ ¬̇(rel i t)
  ¬̇p0 ⊕ ⇒ (nnf p0 ⊖)
  ¬̇p0 ⊖ ⇒ (nnf p0 ⊕)
  p1 →̇ p2 ⊕ ⇒ (nnf p1 ⊖) ∨̇ (nnf p2 ⊕)
  p1 →̇ p2 ⊖ ⇒ (nnf p1 ⊕) ∧̇ (nnf p2 ⊖)
  p1 ∨̇ p2 ⊕ ⇒ (nnf p1 ⊕) ∨̇ (nnf p2 ⊕)
  p1 ∨̇ p2 ⊖ ⇒ (nnf p1 ⊖) ∧̇ (nnf p2 ⊖)
  p1 ∧̇ p2 ⊕ ⇒ (nnf p1 ⊕) ∧̇ (nnf p2 ⊕)
  p1 ∧̇ p2 ⊖ ⇒ (nnf p1 ⊖) ∨̇ (nnf p2 ⊖)
  (∀̇ p0) ⊕ ⇒ ∀̇x : σ. (nnf (p0 x) ⊕)
  (∀̇ p0) ⊖ ⇒ ∃̇x : σ. (nnf (p0 x) ⊖)
  (∃̇ p0) ⊕ ⇒ ∃̇x : σ. (nnf (p0 x) ⊕)
  (∃̇ p0) ⊖ ⇒ ∀̇x : σ. (nnf (p0 x) ⊖)
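The polarity-driven recursion of nnf can be sketched in Python as follows (an illustrative rendering with our own names: tagged tuples stand in for the inductive set o, and binders are functions, as in the paper). Every recursive call is on an immediate subformula, which is what makes the definition primitive recursive.

```python
# Formulas as tagged tuples: ('rel', i, args), ('not', p), ('imp', p, q),
# ('or', p, q), ('and', p, q), ('all', f), ('ex', f), with f a function
# from terms to formulas.
POS, NEG = '+', '-'   # the polarities ⊕ and ⊖

def nnf(p, a=POS):
    """Negation normal form by structural recursion on (p, polarity)."""
    tag = p[0]
    if tag == 'rel':
        return p if a == POS else ('not', p)
    if tag == 'not':
        # flip the polarity instead of recursing on a non-subformula
        return nnf(p[1], NEG if a == POS else POS)
    if tag == 'imp':
        if a == POS:
            return ('or', nnf(p[1], NEG), nnf(p[2], POS))
        return ('and', nnf(p[1], POS), nnf(p[2], NEG))
    if tag in ('or', 'and'):
        out = tag if a == POS else {'or': 'and', 'and': 'or'}[tag]
        return (out, nnf(p[1], a), nnf(p[2], a))
    if tag in ('all', 'ex'):
        out = tag if a == POS else {'all': 'ex', 'ex': 'all'}[tag]
        f = p[1]
        return (out, lambda x: nnf(f(x), a))
    raise ValueError(p)
```

For example, `nnf(('not', ('and', A, B)))` yields the disjunction of the negated conjuncts, matching the ∧̇-with-⊖ case above.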
We have proved the following lemma.

  EM → ∀p : o. ((E p) ↔ (E (nnf p ⊕))) ∧ (¬(E p) ↔ (E (nnf p ⊖)))

where EM is the principle of excluded middle, defined in such a way that it affects the first-order fragment only.

  EM : ∗p := ∀p : o. (E p) ∨ ¬(E p)

Skolemization of a formula means the removal of all existential quantifiers and the replacement of the variables that were bound by the removed existential quantifiers by new terms, that is, Skolem functions applied to the universally quantified variables whose quantifier had the existential quantifier in its scope. Instead of quantifying over each of the Skolem functions, we introduce an index type skolT, which may be viewed as a family of Skolem functions.

  skolT : ∗s := nat → nat → Πn : nat. σ^n → σ

A Skolem function, then, is a term (f i j n) : σ^n → σ, with f : skolT and i, j, n : nat. Here, i and j are indices that distinguish the family members. If the output of nnf yields a conjunction, the remaining clausification steps are performed separately on the conjuncts. (This yields a significant speed-up in performance.) Index i denotes the position of the conjunct, j denotes the number of the replaced existentially quantified variable in that conjunct. The function sklm is defined as follows.

  sklm : skolT → nat → nat → Πn : nat. σ^n → o → o :=
    λf : skolT. λi, j, n : nat. λt : σ^n. λp : o. Cases p of
      (∀̇ p0) ⇒ ∀̇x. (sklm f i j (n + 1) (insert x n t) (p0 x))
      (∃̇ p0) ⇒ (sklm f i (j + 1) n t (p0 (f i j n t)))
      p0 ⇒ p0
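The quantifier-prefix traversal of sklm can be sketched as follows (an illustrative Python rendering with our own names; a function `skolem(i, j, univ)` stands in for the Skolem term (f i j n t), and formulas are the tagged tuples used above, with binders as functions):

```python
def sklm(p, skolem, i=0, j=0, univ=()):
    """Skolemize a prenex formula: universal variables are kept and
    accumulated in `univ`; each existential variable is replaced by the
    Skolem term skolem(i, j, univ).  `i` numbers the conjunct, `j` the
    eliminated existential, mirroring the indices of sklm."""
    tag = p[0]
    if tag == 'all':
        f = p[1]
        # keep the quantifier, extend the tuple of universal variables
        return ('all', lambda x: sklm(f(x), skolem, i, j, univ + (x,)))
    if tag == 'ex':
        f = p[1]
        # plug the Skolem term into the hole and increment j
        return sklm(f(skolem(i, j, univ)), skolem, i, j + 1, univ)
    return p   # quantifier-free matrix: nothing remains to be done
```

Applied to ∀x. ∃y. R(x, y), this produces ∀x. R(x, f(0, 0, (x))), with the Skolem term depending on exactly the universals in whose scope the existential occurred.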
Given a variable x : σ, an arity n : nat and a tuple t : σ^n, the term (insert x n t) adds x at the end of t, resulting in a tuple of type σ^(n+1). Thus, if p is a universal statement, the quantified variable is added at the end of the tuple t of universally quantified variables constructed so far. In case p matches (∃̇ p0), the term (f i j n t) is substituted for the existentially quantified variable (the 'hole' in p0) and index j is incremented. The third case, p0, exhausts the five remaining cases. As we force input formulae to be in prenex normal form (via the definition of mcf), nothing remains to be done. We have proved the following lemma.

  σ → AC → ∀i : nat. ∀p : o. (E p) → ∃f : skolT. (E (sklm f i 0 0 tt p))

Here, σ → · · · expresses the condition that σ is non-empty. AC is the Axiom of Choice, which allows us to form Skolem functions.

  AC : ∗p := ∀α : σ → skolT → o. (∀x : σ. ∃f : skolT. (E (α x f))) → ∃F : σ → skolT. ∀x : σ. (E (α x (F x)))

Reconsider Figure 1 and substitute mcf for T. We have proved that for all objects p : o, the interpretation of the result of applying mcf to p implies the interpretation of p. Thus, given a suitable signature, from any first-order formula ϕ : ∗p, we can construct the classical equivalent (E (mcf ϕ̇)) ∈ MCF. The term mcfprf makes clausification effective on the meta-level.

  mcfprf : EM → AC → σ → ∀p : o. (E (mcf p)) → (E p)

Given inhabitants em : EM and ac : AC, an element s : σ, a proposition p : o and a proof ρ : (E (mcf p)), the term (mcfprf em ac s p ρ) is a proof of (E p). The term (E (mcf p)) : ∗p computes to a format C1 → · · · → Cn → ⊥. Here C1, . . . , Cn : ∗p are universally closed clauses that will be exported to Bliksem, which constructs the λ-term ρ representing a resolution refutation of these clauses (see Sections 4 and 5). Finally, ρ is type-checked in Coq. Section 6 demonstrates the outlined constructions.
The complete Coq-script generating the correctness proof of the clausification algorithm comprises ± 65 pages. It is available at the following URL. www.phil.uu.nl/~hendriks/claus.tar.gz
4 Minimal Resolution Logic
There exist many representations of clauses and corresponding formulations of resolution rules. The traditional form of a clause is a disjunction of literals, that is, of atoms and negated atoms. Another form which is often used is that of a sequent, that is, the implication of a disjunction of atoms by a conjunction of atoms. Here we propose yet another representation of clauses, as far as we know not used before. There are three main considerations.
- A structural requirement is that the representation of clauses is closed under the operations involved, such as instantiation and resolution.
- The Curry-Howard correspondence is most direct between minimal logic (→, ∀) and a typed lambda calculus with product types (with → as a special, non-dependent, case of Π). Conjunction and disjunction in the logic require either extra type-forming primitives and extra terms to inhabit these, or impredicative encodings.
- The λ-representation of resolution proofs should preferably be linear in the size of the premisses.

These considerations have led us to represent a clause

  L1 ∨ · · · ∨ Lp

by the following classically equivalent implication in minimal logic:

  L̄1 → · · · → L̄p → ⊥

Here L̄i is the complement of Li in the classical sense (i.e. double negations are removed). If C is the disjunctive form of a clause, then we denote its implicational form by [C]. As usual, these expressions are implicitly or explicitly universally closed. A resolution refutation of given clauses C1, . . . , Cn proves their inconsistency, and can be taken as a proof of the following implication in minimal logic:

  C1 → · · · → Cn → ⊥

The logic is called minimal as we use no particular properties of ⊥. We are now ready for the definition of the syntax of minimal resolution logic.

Definition 1. Let ∀x⃗. φ denote the universal closure of φ. Let Atom be the set of atomic propositions. We define the sets Literal, Clause and MCF of, respectively, literals, clauses and minimal clausal forms by the following abstract syntax.

  Literal ::= Atom | Atom → ⊥
  Clause ::= ⊥ | Literal → Clause
  MCF ::= ⊥ | (∀x⃗. Clause) → MCF

Next we elaborate the familiar inference rules for factoring, permuting and weakening clauses, as well as the binary resolution rule.

Factoring, Permutation, Weakening. Let C and D be clauses, such that C subsumes D propositionally, that is, any literal in C also occurs in D. Let A1, . . . , Ap, B1, . . . , Bq be literals (p, q ≥ 0) and write

  [C] = A1 → · · · → Ap → ⊥
and

  [D] = B1 → · · · → Bq → ⊥

assuming that for every 1 ≤ i ≤ p there is 1 ≤ j ≤ q such that Ai = Bj. A proof of [C] → [D] is the following λ-term:

  λc : [C]. λb1 : B1 . . . λbq : Bq. (c π1 . . . πp)

with πi = bj, where j is such that Bj = Ai.

Binary Resolution. In the traditional form of the binary resolution rule for disjunctive clauses we have premisses C1 and C2, containing one or more occurrences of a literal L and of L̄, respectively. The conclusion of the rule, the resolvent, is then a clause D consisting of all literals of C1 different from L joined with all literals of C2 different from L̄. This rule is completely symmetric with respect to C1 and C2. For clauses in implicational form there is a slight asymmetry in the formulation of binary resolution. Let A1, . . . , Ap, B1, . . . , Bq be literals (p, q ≥ 0) and write

  [C1] = A1 → · · · → Ap → ⊥,

with one or more occurrences of the negated atom A → ⊥ among the Ai, and

  [C2] = B1 → · · · → Bq → ⊥,

with one or more occurrences of the atom A among the Bj. Write the resolvent D as

  [D] = D1 → · · · → Dr → ⊥,

consisting of all literals of C1 different from A → ⊥ joined with all literals of C2 different from A. A proof of [C1] → [C2] → [D] is the following λ-term:

  λc1 : [C1]. λc2 : [C2]. λd1 : D1 . . . λdr : Dr. (c1 π1 . . . πp)

For 1 ≤ i ≤ p, πi is defined as follows. If Ai ≠ (A → ⊥), then πi = dk, where k is such that Dk = Ai. If Ai = A → ⊥, then we put πi = λa : A. (c2 ρ1 . . . ρq), with ρj (1 ≤ j ≤ q) defined as follows. If Bj ≠ A, then ρj = dk, where k is such that Dk = Bj. If Bj = A, then ρj = a. It is easily verified that πi : (A → ⊥) in this case. If (A → ⊥) occurs more than once among the Ai, then (c1 π1 . . . πp) need not be linear. This can be avoided by timely factoring. Even without factoring, a linear proof term is possible, by taking the following β-expansion of (c1 π1 . . . πp) (with a0 replacing copies of proofs of (A → ⊥)):

  (λa0 : A → ⊥. (c1 π1 . . . a0 . . . a0 . . . πp)) (λa : A. (c2 ρ1 . . . ρq))

This remark applies to the rules in the next subsections as well.
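The index bookkeeping behind these proof terms is mechanical, which is what makes them easy to generate. The following illustrative Python sketch (our own names and encoding; clauses as lists of literals, a negated atom written ('not', A)) computes the index map for factoring/weakening and the resolvent together with the π-structure for binary resolution:

```python
def factoring_proof(C, D):
    """Factoring/permutation/weakening: C propositionally subsumes D.
    Return the index map i -> j realizing
        lambda c. lambda b1 ... bq. (c b_j(1) ... b_j(p)),
    i.e. each hypothesis A_i of [C] is discharged by some b_j with B_j = A_i."""
    return [D.index(A) for A in C]   # raises ValueError if C does not subsume D

def resolve(C1, C2, A):
    """Binary resolution on implicational clauses: C1 contains the negated
    atom ('not', A), C2 contains the atom A.  Returns the resolvent D and
    the argument list (pi_1 ... pi_p) of the proof term
        lambda c1 c2 d1 ... dr. (c1 pi_1 ... pi_p),
    where pi_i is either a hypothesis ('d', k) or the subterm
    lambda a : A. (c2 rho_1 ... rho_q), encoded as ('lam_a', rhos)."""
    D = [L for L in C1 if L != ('not', A)]
    D += [L for L in C2 if L != A and L not in D]   # join, dropping duplicates
    pis = []
    for Ai in C1:
        if Ai == ('not', A):
            # pi_i = lambda a : A. (c2 rho_1 ... rho_q)
            rhos = ['a' if Bj == A else ('d', D.index(Bj)) for Bj in C2]
            pis.append(('lam_a', rhos))
        else:
            pis.append(('d', D.index(Ai)))
    return D, pis
```

For the clauses [C1] with literals {¬A, B} and [C2] with literals {A, C}, the sketch yields the resolvent {B, C} and a π-list whose first entry is the λa-abstracted call to c2, exactly as in the rule above.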
Paramodulation. Paramodulation combines equational reasoning with resolution. For equational reasoning we use the inductive equality of Coq. In order to simplify matters, we assume a fixed domain of discourse σ, and denote equality of s1, s2 ∈ σ by s1 ≈ s2. Coq supplies us with the following terms:

  eqrefl : ∀s : σ. (s ≈ s)
  eqsubst : ∀s : σ. ∀P : σ → ∗p. (P s) → ∀t : σ. (s ≈ t) → (P t)
  eqsym : ∀s1, s2 : σ. (s1 ≈ s2) → (s2 ≈ s1)

As an example we define eqsym from eqsubst and eqrefl:

  λs1, s2 : σ. λh : (s1 ≈ s2). (eqsubst s1 (λs : σ. (s ≈ s1)) (eqrefl s1) s2 h)

Paramodulation for disjunctive clauses is the rule with premiss C1 containing the equality literal t1 ≈ t2 and premiss C2 containing a literal A[t1]. The conclusion is then a clause D containing all literals of C1 different from t1 ≈ t2, joined with C2 with A[t2] instead of A[t1]. Let A1, . . . , Ap, B1, . . . , Bq be literals (p, q ≥ 0) and write

  [C1] = A1 → · · · → Ap → ⊥,

with one or more occurrences of the negated equality atom t1 ≈ t2 → ⊥ among the Ai, and

  [C2] = B1 → · · · → Bq → ⊥,

with one or more occurrences of the atom A[t1] among the Bj. Write the conclusion D as

  [D] = D1 → · · · → Dr → ⊥

and let l be such that Dl = A[t2]. A proof of [C1] → [C2] → [D] can be obtained as follows:

  λc1 : [C1]. λc2 : [C2]. λd1 : D1 . . . λdr : Dr. (c1 π1 . . . πp)

If Ai ≠ (t1 ≈ t2 → ⊥), then πi = dk, where k is such that Dk = Ai. If Ai = (t1 ≈ t2 → ⊥), then we again want πi : Ai and therefore put πi = λe : (t1 ≈ t2). (c2 ρ1 . . . ρq). If Bj ≠ A[t1], then ρj = dk, where k is such that Dk = Bj. If Bj = A[t1], then we also want ρj : Bj and put (with dl : Dl)

  ρj = (eqsubst t2 (λs : σ. A[s]) dl t1 (eqsym t1 t2 e))

The term ρj has type A[t1] in the context e : (t1 ≈ t2). The term ρj contains an occurrence of eqsym because the equality t1 ≈ t2 comes in the wrong direction for proving A[t1] from A[t2]. With this definition of ρj, the term πi indeed has type Ai = (t1 ≈ t2 → ⊥).
As an alternative, it is possible to expand the proof of eqsym in the proof of the paramodulation step.
Equality Factoring. Equality factoring for disjunctive clauses is the rule with premiss C containing equality literals t1 ≈ t2 and t1 ≈ t3, and conclusion D which is identical to C but for the replacement of t1 ≈ t3 by t2 ≉ t3. The soundness of this rule relies on t2 ≈ t3 ∨ t2 ≉ t3. Let A1, . . . , Ap, B1, . . . , Bq be literals (p, q ≥ 0) and write

  [C] = A1 → · · · → Ap → ⊥,

with equality literals t1 ≈ t2 → ⊥ and t1 ≈ t3 → ⊥ among the Ai. Write the conclusion D as

  [D] = B1 → · · · → Bq → ⊥

with Bj′ = (t1 ≈ t2 → ⊥) and Bj′′ = (t2 ≈ t3). We get a proof of [C] → [D] from

  λc : [C]. λb1 : B1 . . . λbq : Bq. (c π1 . . . πp)

If Ai ≠ (t1 ≈ t3 → ⊥), then πi = bj, where j is such that Bj = Ai. For Ai = (t1 ≈ t3 → ⊥), we put

  πi = (eqsubst t2 (λs : σ. (t1 ≈ s → ⊥)) bj′ t3 bj′′)

The type of πi is indeed t1 ≈ t3 → ⊥. Note that the equality factoring rule is constructive in the implicational translation, whereas its disjunctive counterpart relies on the decidability of ≈. This phenomenon is well-known from the double negation translation.

Positive and Negative Equality Swapping. The positive equality swapping rule for disjunctive clauses simply swaps an atom t1 ≈ t2 into t2 ≈ t1, whereas the negative rule swaps the negated atom. Both versions are obviously sound, given the symmetry of ≈. We give the translation for the positive case first and will then sketch the simpler negative case. Let C be the premiss and D the conclusion and write

  [C] = A1 → · · · → Ap → ⊥,

with some of the Ai equal to t1 ≈ t2 → ⊥, and

  [D] = B1 → · · · → Bq → ⊥.

Let j′ be such that Bj′ = (t2 ≈ t1 → ⊥). The following term is a proof of [C] → [D]:

  λc : [C]. λb1 : B1 . . . λbq : Bq. (c π1 . . . πp)

If Ai ≠ (t1 ≈ t2 → ⊥), then πi = bj, where j is such that Bj = Ai. Otherwise

  πi = λe : (t1 ≈ t2). (bj′ (eqsym t1 t2 e))
such that also πi : (t1 ≈ t2 → ⊥) = Ai. In the negative case the literals t1 ≈ t2 in question are not negated, and we change the above definition of πi into

  πi = (eqsym t2 t1 bj′)

In this case we have bj′ : (t2 ≈ t1), so that πi : (t1 ≈ t2) = Ai also in the negative case.

Equality Reflexivity Rule. The equality reflexivity rule simply cancels a negative equality literal of the form t ≉ t in a disjunctive clause. We write once more the premiss

  [C] = A1 → · · · → Ap → ⊥,

with some of the Ai equal to t ≈ t, and the conclusion

  [D] = B1 → · · · → Bq → ⊥.

The following term is a proof of [C] → [D]:

  λc : [C]. λb1 : B1 . . . λbq : Bq. (c π1 . . . πp)

If Ai ≠ (t ≈ t), then πi = bj, where j is such that Bj = Ai. Otherwise πi = (eqrefl t).
5 Lifting to Predicate Logic
Until now we have only considered inference rules without quantifications. In this section we explain how to lift the resolution rule to predicate logic. Lifting the other rules is very similar. Recall that we must assume that the domain is not empty. Proof terms below may contain a variable s : σ as free variable. By abstraction λs : σ we will close all proof terms. This extra step is necessary since ∀s : σ. ⊥ does not imply ⊥ when the domain σ is empty. This is to be compared to 2⊥ being true in a blind world in modal logic. Consider the following clauses C1 = ∀x1 , . . . , xp : σ. [A1 ∨ R1 ] and C2 = ∀y1 , . . . , yq : σ. [¬A2 ∨ R2 ] and their resolvent R = ∀z1 , . . . , zr : σ. [R1θ1 ∨ R2 θ2 ] Here θ1 and θ2 are substitutions such that A1 θ1 = A2 θ2 and z1 , . . . , zr are all variables that actually occur in the resolvent, that is, in R1 θ1 ∨ R2 θ2 after
application of θ1 , θ2 . It may be the case that xi θ1 and/or yj θ2 contain other variables than z1 , . . . , zr ; these are understood to be replaced by the variable s : σ (see above). It may be the case that θ1 , θ2 do not represent a most general unifier. For soundness this is no problem at all, but even completeness is not at stake since the resolvent is not affected. The reason for this subtlety is that the proof terms involved must not contain undeclared variables. Using the methods of the previous sections we can produce a proof π that has the type [A1 ∨ R1 ]θ1 → [¬A2 ∨ R2 ]θ2 → [R1 θ1 ∨ R2 θ2 ]. A proof of C1 → C2 → R is obtained as follows: λc1 : C1 . λc2 : C2. λz1 . . . zr : σ. (π (c1 (x1 θ1 ) . . . (xp θ1 )) (c2 (y1 θ2 ) . . . (yq θ2 ))) We finish this section by showing how to assemble a λ-term for an entire resolution refutation from the proof terms justifying the individual steps. Consider a Hilbert-style resolution derivation C1 , . . . , Cm , Cm+1 , . . . , Cn with premisses c1 : C1 , . . . , cm : Cm . Starting from n and going downward, we will define by recursion for every m ≤ k ≤ n a term πk such that πk [cm+1 , . . . , ck ] : Cn in the context extended with cm+1 : Cm+1 , . . . , ck : Ck . For k = n we can simply take πn = cn . Now assume πk+1 has been constructed for some k ≥ m. The proof πk is more difficult than πk+1 since πk cannot use the assumption ck+1 : Ck+1 . However, Ck+1 is a resolvent, say of Ci and Cj for some i, j ≤ k. Let ρ be the proof of Ci → Cj → Ck+1 . Now define πk [cm+1 , . . . , ck ] = (λx : Ck+1 .πk+1 [cm+1 , . . . , ck , x])(ρ ci cj ) : Cn The downward recursion yields a proof πm : Cn which is linear in the size of the original Hilbert-style resolution derivation. Observe that a forward recursion from m to n would yield the normal form of πm , which could be exponential.
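The downward recursion can be sketched as follows (an illustrative Python rendering with our own term encoding: proof terms as tagged tuples, clauses numbered 1..n with premisses 1..m). Each derived clause is bound by a β-redex, so the assembled term grows by a constant amount per derivation step and stays linear overall:

```python
def assemble(m, steps):
    """Assemble a lambda-term for a refutation C1,...,Cm,Cm+1,...,Cn.
    `steps[t] = (i, j, rho)` says clause m+1+t is derived from clauses i, j
    by a step whose proof term is rho (a proof of Ci -> Cj -> Cm+1+t).
    Following the paper's downward recursion, each derived clause is bound
    by a beta-redex (lambda x:Ck+1. pi_{k+1})(rho ci cj)."""
    n = m + len(steps)
    term = ('var', n)                          # pi_n = c_n, the final clause
    for t in range(len(steps) - 1, -1, -1):
        i, j, rho = steps[t]
        k1 = m + 1 + t                         # index of the clause derived here
        term = ('app', ('lam', k1, term),
                       ('app2', rho, ('var', i), ('var', j)))
    return term
```

A forward recursion would instead substitute each (ρ ci cj) into later steps, i.e. normalize the term, which can blow up exponentially; the β-redexes are what keep the term linear.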
6 Example
Let P be a property of natural numbers such that P holds for n if and only if P does not hold for any number greater than n. Does this sound paradoxical? It is contradictory. We have P (n) if and only if ¬P (n + 1), ¬P (n + 2), ¬P (n + 3), . . ., which implies ¬P (n + 2), ¬P (n + 3), . . ., so P (n + 1). It follows that ¬P (n) for all n. However, ¬P (0) implies P (n) for some n, contradiction. A closer analysis of this argument shows that the essence is not arithmetical, but relies on the fact that < is transitive and serial. The argument is also valid in a finite structure, say 0 < 1 < 2 < 2. This qualifies for a small refutation problem, which we formalize in Coq. Type dependencies are given explicitly.
Thus, nat is the domain of discourse. We declare a unary relation P and a binary relation <.

  P : nat → ∗p
  < : nat × nat → ∗p

Let L : (nelist nat) := [1, 2] be the corresponding list of arities. The relations are packaged by Rel.

  Rel : Πi : (index L). nat^(select L i) → ∗p := λi : (index L). Cases i of
    0 ⇒ P
    1 ⇒ <

For instance, (E nat L Rel (rel nat L 1 (2, 0))) =βδι (Rel 1 (2, 0)) =βδι 2 < 0. It is convenient to represent the relations P, < as object-level constants.

  Ṗ : nat → (o nat L) := (rel nat L 0)
  <̇ : nat × nat → (o nat L) := (rel nat L 1)

Let us construct the formal propositions trans and serial, stating that <̇ is transitive and serial. (We write n <̇ m instead of (<̇ (n, m)).)

  trans : (o nat L) := ∀̇x, y, z : nat. (x <̇ y ∧̇ y <̇ z) →̇ x <̇ z
  serial : (o nat L) := ∀̇x : nat. ∃̇y : nat. x <̇ y

We define foo.

  foo : (o nat L) := ∀̇x : nat. (Ṗ x) ↔̇ (∀̇y : nat. x <̇ y →̇ ¬̇(Ṗ y))

Furthermore, we define taut on the object-level, representing the example informally stated at the beginning of this section. (If the latter is denoted by ϕ, then taut = ϕ̇.)

  taut : (o nat L) := (trans ∧̇ serial) →̇ ¬̇foo

Interpreting taut, that is, βδι-normalizing (E nat L Rel taut), results in 'taut without dots'. We declare em : (EM nat L Rel), ac : (AC nat L Rel ) and use 0 to witness the non-emptiness of nat. We reduce the goal (E nat L Rel taut), using the result of Section 3, to the goal (E nat L Rel (mcf nat L taut)). If we prove this latter goal, say by a term ρ, then

  (mcfprf nat L Rel em ac 0 taut ρ) : (E nat L Rel taut)

We normalize the new goal:

  (E nat L Rel (mcf nat L taut)) =βδι
  ∀f : (skolT nat).
    (∀x, y, z : nat. x < y → y < z → (x < z → ⊥) → ⊥) →
    (∀x : nat. (x < (f 1 0 1 x) → ⊥) → ⊥) →
    (∀x : nat. (x < (f 2 0 1 x) → ⊥) → ((P x) → ⊥) → ⊥) →
    (∀x : nat. ((P (f 2 0 1 x)) → ⊥) → ((P x) → ⊥) → ⊥) →
    (∀x, y : nat. (P x) → x < y → (P y) → ⊥) →
    ⊥
This is the minimal clausal form of the original goal. We refrained from exhibiting its proof ρ for reasons of space. The Coq-script generating ρ can be found in example.v in the tar file mentioned at the end of Section 3.
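The informal argument that the example is contradictory can be checked by brute force on the finite structure 0 < 1 < 2 < 2 mentioned above (an illustrative Python sketch with our own encoding): < is transitive and serial on this structure, yet no predicate P on {0, 1, 2} satisfies foo.

```python
from itertools import product

D = [0, 1, 2]
lt = {(0, 1), (0, 2), (1, 2), (2, 2)}   # the finite structure 0 < 1 < 2 < 2

# transitivity and seriality of <
trans = all((x, z) in lt
            for x, y, z in product(D, D, D)
            if (x, y) in lt and (y, z) in lt)
serial = all(any((x, y) in lt for y in D) for x in D)

def foo_holds(P):
    # foo: for all x, P(x) iff for all y with x < y, not P(y)
    return all(P[x] == all(not P[y] for y in D if (x, y) in lt) for x in D)

# Exhaust all 2^3 predicates P on the domain: none satisfies foo.
no_P = not any(foo_holds(dict(zip(D, bits)))
               for bits in product([False, True], repeat=3))
```

This confirms the refutability of the clause set on a three-element model, matching the informal analysis: the essence of the paradox is only the transitivity and seriality of <.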
References

Blk. Bliksem is available at URL: www.mpi-sb.mpg.de/~bliksem.
Coq98. B. Barras, S. Boutin, C. Cornes, J. Courant, J.-C. Filliâtre, E. Giménez, H. Herbelin, G. Huet, C. Muñoz, C. Murthy, C. Parent, C. Paulin-Mohring, A. Saïbi, B. Werner. The Coq Proof Assistant Reference Manual, version 6.2.4. INRIA, 1998. Available at: ftp.inria.fr/INRIA/coq/V6.2.4/doc/Reference-Manual.ps.
dB72. N. de Bruijn. Lambda calculus notation with nameless dummies, a tool for automatic formula manipulation. Indagationes Mathematicae 34, pages 381–392, 1972.
Hen98. D. Hendriks. Clausification of First-Order Formulae, Representation & Correctness in Type Theory. Master's Thesis, Utrecht University, 1998. URL: www.phil.uu.nl/~hendriks/thesis.ps.gz.
Hua96. X. Huang. Translating machine-generated resolution proofs into ND-proofs at the assertion level. In Proceedings of PRICAI-96, pages 399–410, 1996.
Hur99. J. Hurd. Integrating Gandalf and HOL. In Proceedings TPHOLs '99, LNCS 1690, pages 311–321. Springer Verlag, 1999.
MS99. W. McCune and O. Shumsky. IVY: A preprocessor and proof checker for first-order logic. Preprint ANL/MCS-P775-0899, Argonne National Laboratory, Argonne IL, 1999.
NM98. G. Nadathur and D. Miller. Higher-order logic programming. In D. Gabbay et al. (eds.), Handbook of Logic in Artificial Intelligence, Vol. 5, pages 499–590. Clarendon Press, Oxford, 1998.
Omega. Omega can be found on www.ags.uni-sb.de/~omega/.
Pfe84. F. Pfenning. Analytic and non-analytic proofs. In Proceedings CADE 7, LNCS 170, pages 394–413. Springer Verlag, 1984.
ST95. J. Smith and T. Tammet. Optimized encodings of fragments of type theory in first-order logic. In Proceedings Types '95, LNCS 1158, pages 265–287. Springer Verlag, 1995.
System Description: TPS: A Theorem Proving System for Type Theory

Peter B. Andrews¹, Matthew Bishop², and Chad E. Brown¹

¹ Department of Mathematical Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
[email protected], [email protected]
² Department of Computer Science, King's College London, Strand, London WC2R 2LS, England
[email protected]

1  Introduction
This is a brief update on the Tps automated theorem proving system for classical type theory, which was described in [3]. Manuals and information about obtaining Tps can be found at http://gtps.math.cmu.edu/tps.html. In Section 2 we discuss some examples of theorems which Tps can now prove automatically, and in Section 3 we discuss an example which illustrates one of the many challenges of theorem proving in higher-order logic.

We first provide a brief summary of the key features of Tps. Tps uses Church's type theory [8] (typed λ-calculus) as its logical language. Wffs are displayed on the screen and in printed proofs in the notation of this system of symbolic logic. One can use Tps in automatic, semi-automatic, or interactive mode to construct proofs in natural deduction style, and a mixture of these modes of operation is most useful for significant applications. Our current research is focused primarily on increasing the power of the purely automatic search procedures, since these are useful in speeding up the construction of proofs even if many of the key ideas must be supplied interactively.

When searching for a proof of a theorem, Tps first tries to find an expansion proof [11], of which an important component is a mating [1] (otherwise known as a spanning set of connections [4]). Various search procedures are implemented in Tps, most notably those described in [6], [5], [10], and [9]. The method of dual instantiation of definitions discussed in [7] is also implemented in Tps. Once an expansion proof has been found, it is translated into a natural deduction proof by the methods of [13] and [14]. Many aspects of the behavior of Tps can be varied by changing the settings of flags. These flags provide a convenient facility for exploring various aspects of the problem of searching for proofs, and are essential in setting bounds for the many dimensions of proof search in higher-order logic.*
* This material is based upon work supported by the National Science Foundation under grant CCR-9732312.
D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 164–169, 2000.
© Springer-Verlag Berlin Heidelberg 2000
2  New Theorems
In the notation used by Tps, o is the type of truth values, ι is the type of individuals, and (αβ) (which some authors prefer to write as (β → α)) is the type of functions from elements of type β to elements of type α. An entity of type (oα) is regarded as a set of elements of type α, and foα xα can be interpreted as meaning that xα is in foα. γβα is an abbreviation for ((γβ)α). A dot stands for a left bracket whose mate is as far to the right as is consistent with the pairing of brackets already present.

The following theorems were all proven completely automatically by Tps once the flags were set. All the timings quoted below represent the internal runtime, excluding garbage-collect time, used by Tps to find an expansion proof and translate it into a natural deduction proof on a Tangent workstation with a Pentium III processor and 512 megabytes of RAM using Allegro Common Lisp 5.0 for Linux. The numbers are useful only for their approximate magnitudes. They may not represent optimal settings of the flags, and the times required to prove these theorems will probably increase as ways are found to move more of the burden of setting flags from users to Tps.

We start with several theorems concerned with various formulations of the Axiom of Choice.¹ We first list these formulations. In [15] these are presented as statements of axiomatic set theory; in a type-theoretic context their variables must be given types, and they take the form of axiom schemas. Logical relations between these formulations of the Axiom of Choice are then complicated by the need to have appropriate relations between the types which are involved.

AC1(β) from [15]: ∀so(oβ) .∀Xoβ [sX ⊃ ∃yβ Xy] ⊃ ∃fβ(oβ) ∀X.sX ⊃ X.fX
If s is a set of non-empty sets, there is a function f such that for every x ∈ s, f(x) ∈ x.

AC3(β, α) from [15]: ∀r(oβ)α ∃gβα ∀xα .∃yβ rxy ⊃ rx.gx
For every function r, there is a function g such that for every x, if x is in the domain of r and r(x) ≠ ∅, then g(x) ∈ r(x). (In a set-theoretic context it may be assumed that the values of r are sets, but in a type-theoretic context r must be given a type compatible with this assumption.)

AC17(α) from [15]: ∀goα(α(oα)) .∀hα(oα)∃uα [gh]u ⊃ ∃fα(oα) gf.f.gf
If s is a set (which we represent as the set of all elements of type α), t is the collection of all non-empty subsets of s, F is the set of all functions (which must have type (α(oα))) from t to s, and g is a function from F to t, then there is an f ∈ F such that f(g(f)) ∈ g(f).

AC(α) from [2]: ∃fα(oα) ∀Xoα .∃tα Xt ⊃ X.fX
There is a universal choice function f (for elements of type α) such that if X is any non-empty set (whose elements are of type α), then fX ∈ X.

THM532: AC1(β) ⊃ AC3(β, α) (3.96 seconds)
THM533: AC3(α, oα) ⊃ AC1(α) (1.92 seconds)
THM560: AC3(α, oα) ≡ AC1(α) (19.26 seconds)

¹ See [12] for a discussion of interactive proofs of many similar theorems.
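The typing conventions introduced at the start of this section can be mirrored in a few lines of code (an illustrative Python sketch, not part of Tps; the function and type names are ours): a type is a base type such as o or ι, or a pair (αβ) denoting functions from β to α, and applying a function of type (αβ) to an argument of type β yields a result of type α.

```python
# Illustrative sketch (not TPS code) of the type notation above.
# A type is a base-type name ('o', 'i', ...) or a pair (alpha, beta),
# meaning "functions from elements of type beta to elements of type alpha".

def apply_type(fun_ty, arg_ty):
    """Type of (f x) when f : fun_ty and x : arg_ty, or None if ill-typed."""
    if isinstance(fun_ty, tuple) and fun_ty[1] == arg_ty:
        return fun_ty[0]  # (alpha, beta) applied to a beta gives an alpha
    return None

# f of type (o alpha) applied to x of type alpha is a truth value:
# "x is in the set f".
assert apply_type(('o', 'a'), 'a') == 'o'

# gamma-beta-alpha abbreviates ((gamma beta) alpha): apply to an alpha,
# then to a beta, to obtain a gamma.
gba = (('g', 'b'), 'a')
assert apply_type(apply_type(gba, 'a'), 'b') == 'g'
```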
THM534: AC1(α) ⊃ AC17(α) (4.71 seconds)
THM541: AC(α) ≡ AC1(α) (5.00 seconds)
THM531E: FINITE-SET Coα ∧ Boα ⊆ C ⊃ FINITE-SET B (21.50 seconds)
THM531E says that a subset of a finite set is finite. FINITE-SET is defined as [λXoα ∀Po(oα) .∀Eoα [∼ ∃tα Et ⊃ P E] ∧ ∀Yoα ∀xα ∀Zoα [[P Y ∧ .Z ⊆ .Y + x] ⊃ P Z] ⊃ P X], which is one of several ways one can define finiteness inductively. Yoα + xα is defined as [λtα .Yoα t ∨ t = xα], which is another notation for Yoα ∪ {xα}.

THM196B: ∼ [aι = bι ] ⊃ ∼ ∀jιι ∀kιι .ITERATE+ j[k ◦ j] ⊃ ITERATE+ jk (1.41 seconds)
ITERATE+ is defined as λfαα λgαα ∀po(αα) .pf ∧ ∀jαα [pj ⊃ p.f ◦ j] ⊃ pg, and so ITERATE+ fg means that g is an iterate of f — i.e., a function of the form f ◦ ... ◦ f. The symbol ◦ denotes the composition of functions, and is defined as λfαβ λgβγ λxγ f.gx. The theorem refutes the conjecture that if k ◦ j is an iterate of j, then k must be an iterate of j. Of course, the conjecture is trivially true if there is just one individual, so the theorem depends on the assumption that there are two distinct individuals. Tps proves the theorem by constructing the simple counterexample where k is the identity function and j is the constant function whose value is b. The proof consists of a verification that this is indeed a counterexample.

THM563: CLOS-SYS1 .λ Woβ . W 0β ∧ ∀ xβ ∀ yβ [ W y ∧ x ≤ y ⊃ W x] ∧ ∀ x∀ y∀ zβ . W x ∧ W y ∧ JOIN x y z ⊃ W z (49.2 minutes)
THM563 states that the collection of sets Woβ which contain an element 0, are downward closed with respect to a binary relation ≤, and are closed with respect to a ternary relation JOIN, is a closure system. CLOS-SYS1 is defined as λ CLo(oβ) ∀ So(oβ) . S ⊆ CL ⊃ CL . ⋂S, and ⋂ is defined as λ So(oβ) λ xβ ∀ Woβ . S W ⊃ W x, so a closure system is a collection of sets closed under arbitrary intersections. THM563 is very general, but we can illustrate it with the following special case. Let β be the type of finite binary trees. We use 0 as a name for the tree with a single node.
We can define a partial ordering ≤ on this type by saying a tree x is less than a tree y if we can replace the leaves of x by some trees to obtain y. Thus, 0 is the smallest member of β. Given trees x and y, there is a tree called [x ∨ y] which is the lub{x, y} such that JOIN x y [x ∨ y]. We can represent each infinite binary tree by a certain set Woβ of finite binary trees, which approximate the infinite tree in the sense illustrated below, where the finite trees approximate the infinite tree on the right.

[Figure: a ≤-increasing chain of finite binary trees, each extending the previous one, approximating an infinite binary tree shown on the right.]
(We can regard finite binary trees as special cases of infinite binary trees. A finite binary tree xβ when considered as an infinite binary tree is the set {yβ |y ≤ x},
an object of type (oβ).) A set Woβ that represents a tree contains the tree 0, is downward closed, and is closed with respect to joins. In the case of this example, THM563 shows that the set of infinite trees constitutes a closure system, and therefore forms a complete lattice under the subset ordering.

X5204: #fαβ [⋃wo(oβ)] = ⋃.#[#f]w (33.16 seconds)
# is defined as λfαβ λxoβ λzα ∃tβ .xt ∧ z = ft, and so #fαβ xoβ is the image of the set xoβ under the function fαβ. This is a polymorphic definition, and the instances of # in the theorem have appropriate types attached to them. ⋃ is defined as λDo(oα) λxα ∃Soα .DS ∧ Sx; hence ⋃Do(oα) is the union of the collection Do(oα) of sets.

X5311A-EXT: ∀yα [ι[= y] = y] ∧ ∀poα ∀qoα [∀xα [px ≡ qx] ⊃ ∀ro(oα) .rp ⊃ rq] ⊃ ∀p.Σ11 p ⊃ p.ιp (3.0 minutes)
Σ11 is defined as λpoα ∃yα .py ∧ ∀zα .pz ⊃ y = z, and represents the property of being a one-element set. This is essentially theorem 5311 from [2]. In order to prove it one needs axioms of descriptions and extensionality, so they are made antecedents of the main implication. The theorem says that if p is a one-element set, then the description operator ι maps p to the unique entity which is in p.
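The set-theoretic content of X5204 (the image of a union equals the union of the images) can be spot-checked on small finite instances. The following is an illustrative Python sketch, not Tps code:

```python
# Spot-check of X5204 on finite sets: #f applied to the union of a
# collection w equals the union of the set of images #(#f) w, where
# #f x is the image of the set x under f.

def image(f, s):
    return frozenset(f(t) for t in s)

def big_union(d):
    return frozenset(x for s in d for x in s)

f = lambda t: t * t
w = {frozenset({1, 2}), frozenset({2, 3})}

lhs = image(f, big_union(w))                      # #f applied to U w
rhs = big_union(image(lambda s: image(f, s), w))  # U of the set of images
assert lhs == rhs == frozenset({1, 4, 9})
```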
3  A Challenge
We conclude with a discussion of an example which poses a significant challenge for Tps and other theorem provers for higher-order logic and set theory. Cantor's theorem for sets says that if U is any set and W is its power set, then W has larger cardinality than U. This is usually expressed by saying that there is no surjection from U onto W. If one takes the members of U as the set of individuals, the theorem can be expressed simply by the wff ∼ ∃goιι ∀foι ∃jι .gj = f, which we called X5304 in [2] and [3]. Tps has been able to prove this for many years. However, one can also express the fact that W has larger cardinality than U by saying that there is no injection from W into U, which we formalize as follows:

X5309: ∼ ∃hι(oι) ∀poι ∀qoι .hp = hq ⊃ p = q (not proven)

We call this the Injective Cantor Theorem. Here is an informal proof of this theorem. Suppose there is a function h : W → U such that (1) h is injective. Let (2) D = {ht | t ∈ W and ht ∉ t}. Note that (3) D ∈ W. Now suppose that (4) hD ∈ D. Then (by 2) there is a set t such that (5) t ∈ W and (6) ht ∉ t and (7) hD = ht. Therefore (8) D = t (by 1, 7), so (9) hD ∉ D (by 6, 8). This argument (4-9) shows that (10) hD ∉ D. Thus (11) hD ∈ D (by 2, 3, 10). This contradiction shows that there can be no such h.

It is easy to prove parts of this argument automatically. Define IDIAG to be [λhι(oι) #h.λsoι . ∼ s.hs]. Then [IDIAG h] represents the set D of the informal argument above. Tps can automatically prove the following theorems, from which X5309 follows trivially:
THM143B: ∀hι(oι) .∀poι ∀qoι [hp = hq ⊃ p = q] ⊃ ∼ IDIAG h.h.IDIAG h (3.35 seconds)
THM144B: ∀hι(oι) .IDIAG h.h.IDIAG h (0.47 seconds)

However, a completely automatic proof of X5309 seems well beyond the present capabilities of Tps. The expansion proof which corresponds to the argument above involves instantiating a quantifier on a set variable with a wff which contains another quantifier on a set variable, which must also be instantiated with a wff which contains a quantifier. We may say that such an expansion proof has quantificational depth 3. Thus far Tps has found expansion proofs only of quantificational depth ≤ 2. We may define the quantificational depth of a theorem to be the minimum of the quantificational depths of its expansion proofs. (Thus all theorems of first-order logic have quantificational depth at most 1.) Research on methods of proving theorems which are deep in this sense should stimulate significant progress in higher-order theorem proving.²
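The diagonal construction behind THM143B and THM144B can be checked exhaustively on a small finite carrier. The following is an illustrative Python sketch, not a Tps proof: for every function h from the power set of U into U, the set D = {h(t) : t ⊆ U, h(t) ∉ t} satisfies h(D) ∈ D (the analogue of THM144B); since no such h is injective when U is finite, the hypothesis of THM143B is vacuous here, and the contradiction of the informal argument never arises.

```python
from itertools import product

# Exhaustive finite check of the diagonal argument: for every
# h : P(U) -> U, the set D = {h(t) | t subset of U, h(t) not in t}
# satisfies h(D) in D. If h(D) were outside D, then D itself would be
# a witness t with h(t) not in t, forcing h(D) into D after all.
U = [0, 1]
subsets = [frozenset(s) for s in [[], [0], [1], [0, 1]]]

for values in product(U, repeat=len(subsets)):   # all 16 functions h
    h = dict(zip(subsets, values))
    D = frozenset(h[t] for t in subsets if h[t] not in t)
    assert h[D] in D                             # THM144B, finitely
```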
References

1. Peter B. Andrews. Theorem Proving via General Matings. Journal of the ACM, 28:193–214, 1981.
2. Peter B. Andrews. An Introduction to Mathematical Logic and Type Theory: To Truth Through Proof. Academic Press, 1986.
3. Peter B. Andrews, Matthew Bishop, Sunil Issar, Dan Nesmith, Frank Pfenning, and Hongwei Xi. TPS: A Theorem Proving System for Classical Type Theory. Journal of Automated Reasoning, 16:321–353, 1996.
4. Wolfgang Bibel. Automated Theorem Proving. Vieweg, Braunschweig, second edition, 1987.
5. Matthew Bishop. A Breadth-First Strategy for Mating Search. In Harald Ganzinger, editor, Proceedings of the 16th International Conference on Automated Deduction, volume 1632 of Lecture Notes in Artificial Intelligence, pages 359–373, Trento, Italy, 1999. Springer-Verlag.
6. Matthew Bishop. Mating Search Without Path Enumeration. PhD thesis, Department of Mathematical Sciences, Carnegie Mellon University, April 1999. Department of Mathematical Sciences Research Report No. 99–223. Available at http://gtps.math.cmu.edu/tps.html.
7. Matthew Bishop and Peter B. Andrews. Selectively Instantiating Definitions. In Claude Kirchner and Hélène Kirchner, editors, Proceedings of the 15th International Conference on Automated Deduction, volume 1421 of Lecture Notes in Artificial Intelligence, pages 365–380, Lindau, Germany, 1998. Springer-Verlag.
8. Alonzo Church. A Formulation of the Simple Theory of Types. Journal of Symbolic Logic, 5:56–68, 1940.
9. Sunil Issar. Path-Focused Duplication: A Search Procedure for General Matings. In AAAI–90. Proceedings of the Eighth National Conference on Artificial Intelligence, volume 1, pages 221–226. AAAI Press/The MIT Press, 1990.
² We are not aware of a proof that X5309 or any other theorem has depth greater than 2, but we conjecture that there is no upper bound on the quantificational depths of theorems of type theory.
10. Sunil Issar. Operational Issues in Automated Theorem Proving Using Matings. PhD thesis, Carnegie Mellon University, 1991. 147 pp.
11. Dale A. Miller. A Compact Representation of Proofs. Studia Logica, 46(4):347–370, 1987.
12. Lawrence Paulson and Krzysztof Grabczewski. Mechanising Set Theory: Cardinal Arithmetic and the Axiom of Choice. Journal of Automated Reasoning, 17:291–323, 1996.
13. Frank Pfenning. Proof Transformations in Higher-Order Logic. PhD thesis, Carnegie Mellon University, 1987. 156 pp.
14. Frank Pfenning and Dan Nesmith. Presenting Intuitive Deductions via Symmetric Simplification. In M. E. Stickel, editor, Proceedings of the 10th International Conference on Automated Deduction, volume 449 of Lecture Notes in Artificial Intelligence, pages 336–350, Kaiserslautern, Germany, 1990. Springer-Verlag.
15. Herman Rubin and Jean E. Rubin. Equivalents of the Axiom of Choice, II. North-Holland, 1985.
[The next paper in the proceedings (pp. 170–176) is illegible in this copy: its text was extracted from a damaged font encoding and only isolated fragments survive. The surviving fragments indicate that it concerns the Nuprl-5 system as an open logical environment. The labels of its architecture figure are recoverable: a central Library storing THEORY objects (defs, thms, tactics, rules, structure), including (HOL) and (PVS) theories, connected to Editors (including a Web editor), Evaluators (Maude, MetaPRL, SoS (Lisp)), Refiners (Nuprl, MetaPRL, HOL/SPIN, PVS, ΩMEGA), and Translators (Java, OCaml). The remaining prose and the references of this paper are not recoverable.]
System Description: aRa – An Automatic Theorem Prover for Relation Algebras*

Carsten Sinz

Symbolic Computation Group, WSI for Computer Science,
Universität Tübingen, D-72076 Tübingen, Germany
[email protected]
Abstract. aRa is an automatic theorem prover for various kinds of relation algebras. It is based on Gordeev’s Reduction Predicate Calculi for n-variable logic (RPCn ) which allow first-order finite variable proofs. Employing results from Tarski/Givant and Maddux we can prove validity in the theories of simple semi-associative relation algebras, relation algebras and representable relation algebras using the calculi RPC3 , RPC4 and RPCω . aRa, our implementation in Haskell, offers different reduction strategies for RPCn , and a set of simplifications preserving n-variable provability.
1  Introduction
Relations are an indispensable ingredient in many areas of computer science, such as graph theory, relational databases, logic programming, and semantics of computer programs, to name just a few. So relation algebras – which are extensions of Boolean algebras – form the basis of many theoretical investigations. As they can also be approached from a logic point of view, an application of ATP methods promises to be beneficial. We follow the lines of Tarski [TG87], Maddux [Mad83] and Gordeev [Gor95] by converting equations from various relation algebraic theories to finite variable first-order logic sentences. We are then able to apply Gordeev’s n-variable calculi RPCn to the transformed formulae. Our implementation aRa is a prover for the RPCn calculi with a front-end to convert relation algebraic propositions to 3-variable first-order sentences. It implements a fully automatic proof procedure, various reduction strategies, and some simplification rules, to prove theorems in the theories SSA1 , RA, and RRA.
2  Theoretical Foundations
Gordeev’s Reduction Predicate Calculi. Gordeev developed a cut free formalization of predicate logic without equality using only finitely many distinct ? 1
This work was partially supported by DFG under grant Ku 966/4-1. This “simple” variant of semi-associative relation algebra (SA) includes the identity ◦ axiom A 1 = A only for literals A.
D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 177–182, 2000.
© Springer-Verlag Berlin Heidelberg 2000
variables. In [Gor95] the reduction predicate calculi RPCn were introduced. These term rewriting systems reduce valid formulae of n-variable logic to true. n-variable logic comprises only formulae containing no more than n distinct variables; the same restriction applies to RPCn. Moreover, all RPCn formulae are supposed to be in negation normal form. Formulae provable in RPCn are exactly those provable in the standard modus ponens calculus, or, alternatively, the sequent calculus with cut rule, using at most n distinct variables ([Gor95], Theorem 3.1 and Corollary). The RPCn rewriting systems consist of the following rules:²

(R1) A ∨ ⊤ −→ ⊤
(R2) L ∨ ¬L −→ ⊤
(R3) A ∧ ⊤ −→ A
(R4) A ∨ (B ∧ C) −→ (A ∨ B) ∧ (A ∨ C)
(R5) ∃xA −→ ∃xA ∨ A[x/t]
(R6) ∀xA ∨ B −→ ∀xA ∨ B ∨ ∀y(∀yB ∨ A[x−y][x/y])
(R6′) ∀xA −→ ∀xA ∨ A[−x] ∨ ∀y(A[x−y][x/y])

Here, L denotes an arbitrary literal, A, B and C are formulae, x and y are individual variables, and t is any term. A term may be a variable or a constant; function symbols do not occur in RPCn. F[x/t] denotes substitution of x by t, where F[x/t] does not introduce new variables. F[−x] and F[x−y] are variable elimination operators (see [Gor95] for a rigorous definition), where F[−x] deletes all free occurrences of x from F by replacing the respective (positive and negative) literals contained in F by falsum (⊥). The binary elimination operator F[x−y] is defined by F[x−y] = F[−y] if x ≠ y, and F[x−y] = F otherwise. Note that the variable elimination operators are – just like substitution – meta-operators on formulae, and not part of the formal language of the logic itself.

A formula F is provable in RPCn (RPCn ⊢ F) iff it can be reduced to true, i.e., iff F −→∗RPC ⊤. We call a formula sequence (F0, . . . , Fn) a reduction chain if Fi −→RPC Fi+1. A reduction strategy is a computable function that extends a reduction chain by one additional formula.

As all essential rules of RPCn are of the form F −→ F ∨ G, and thus no "wrong" reductions are possible, strategies can be used instead of a search procedure. This also means that no backtracking is needed and the RPCn calculi are confluent on the set of valid formulae of n-variable logic. aRa implements various reduction strategies rather than unrestricted breadth-first search or iterative deepening.

Translation from Relation Algebra to Predicate Logic. In order to prove formulae in the theory of relation algebra, we follow the idea of [TG87] and transform sentences of relation algebra to 3-variable first-order sentences. The transformation τxyz is straightforward, where x and y denote the predicate's arguments and z is a free variable. For brevity, we give only part of the definition
² ∨ and ∧ are supposed to be AC-operators.
of τxyz, for predicate symbols, relative and absolute product (◦ and ·):

τxyz(R) = xRy
τxyz(Φ ◦ Ψ) = ∃z(τxzy(Φ) ∧ τzyx(Ψ))
τxyz(Φ · Ψ) = τxyz(Φ) ∧ τxyz(Ψ)

Here, Φ and Ψ stand for arbitrary relation algebraic expressions. A sentence Φ = Ψ from relation algebra is then translated by τ into an equivalence expression of first-order logic, a relational inclusion Φ ≤ Ψ into an implication:

τ(Φ = Ψ) = ∀x∀y(τxyz(Φ) ↔ τxyz(Ψ))
τ(Φ ≤ Ψ) = ∀x∀y(τxyz(Φ) → τxyz(Ψ))

We can simulate proofs in various theories of relation algebra by using the following tight link between n-variable logic and relation algebras proved by Maddux (see, e.g., [Mad83]):

1. A sentence is valid in every semi-associative relation algebra (SA) iff its translation can be proved in 3-variable logic.
2. A sentence is valid in every relation algebra (RA) iff its translation can be proved in 4-variable logic.
3. A sentence is valid in every representable relation algebra (RRA) iff its translation can be proved in ω-variable logic.

We have to restrict SA in case of 3-variable logic to its simple variant SSA, as RPCn – as well as the more familiar Hilbert-Bernays first-order formalism – contains only the simple Leibniz law. Compared to the generalized Leibniz law used in Tarski's and Maddux's formalisms, this simple schema is finitely (i.e., as an axiom) representable in RPCn. The corresponding refinement is not necessary in case of RA and RRA, as the generalized Leibniz law for 3-variable formulae is deducible from its simple form in RPC4, and thus in RPCω (see, e.g., [Gor99]).
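The recursive shape of the translation τ can be transcribed almost literally from the clauses above. The following Python sketch is illustrative only (aRa itself is written in Haskell, and the tagged-tuple encoding and names here are ours, not aRa's):

```python
# Sketch of the translation tau_xyz from relation-algebraic terms to
# 3-variable first-order formulae. Terms are tagged tuples:
# ('atom', R), ('relprod', a, b), ('absprod', a, b) -- our encoding.

def tau(x, y, z, term):
    op = term[0]
    if op == 'atom':                       # tau_xyz(R) = xRy
        return f"{term[1]}({x},{y})"
    if op == 'relprod':                    # exists z (tau_xzy /\ tau_zyx)
        a, b = term[1], term[2]
        return f"exists {z}({tau(x, z, y, a)} & {tau(z, y, x, b)})"
    if op == 'absprod':                    # conjunction, same variables
        a, b = term[1], term[2]
        return f"({tau(x, y, z, a)} & {tau(x, y, z, b)})"
    raise ValueError(op)

def translate_inclusion(phi, psi):         # tau(Phi <= Psi)
    return f"forall x forall y({tau('x', 'y', 'z', phi)} -> {tau('x', 'y', 'z', psi)})"

# (R relprod S) absprod T uses only the three variables x, y, z:
t = ('absprod', ('relprod', ('atom', 'R'), ('atom', 'S')), ('atom', 'T'))
assert tau('x', 'y', 'z', t) == "(exists z(R(x,z) & S(z,y)) & T(x,y))"
```

Note how the relative product reuses the outer variables x, y, z in rotated roles, which is what keeps the translation inside 3-variable logic.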
3  Implementation
The aRa prover is a Haskell implementation of the RPCn calculi with a front end to transform relation algebraic formulae to first-order logic. It offers different reduction strategies and a set of additional simplification rules. Most of these simplification rules preserve n-variable provability and can thus be used for SSA- and RA-proofs.

Input Language. The aRa system is capable of proving theorems of the form {E1, . . . , En} ⊢ E, where E and Ei are relation algebraic equations or inclusions.³ Each equation in turn may use the connectives for relative and absolute sum
³ Instead of an equation E a formula F may also be used, which is interpreted as the equation F = 1, where 1 stands for the absolute unit predicate.
and product, negation, conversion, and arbitrary predicates, among them the absolute and relative unit and the absolute zero as predefined predicates. The translation of a relational conjecture of the form {E1, . . . , En} ⊢ E to first-order logic is done in accordance with Tarski's deduction theorem for L× ([TG87], 3.3):

τ({E1, . . . , En} ⊢ E) = τ(E1) ∧ . . . ∧ τ(En) → τ(E)

To give an impression of what the actual input looks like, we show the representation of Dedekind's rule (Q ◦ R) · S ≤ (Q · (S ◦ R˘)) ◦ (R · (Q˘ ◦ S)) as a conjecture for aRa:

|- (Q@R)*S < (Q*(S@R^))@(R*(Q^@S));

Literal and Reduction Tracking. To guarantee completeness of the deterministic proof search introduced by reduction strategies, we employ the technique of reduction tracking in our implementation. The idea is as follows: while successively constructing the reduction chain, record the first appearance of each reduction possibility⁴ and track the changes performed on it. A strategy is complete if each reduction possibility is eventually considered. Literal tracking is used by the LP reduction strategy described later. It keeps track of the positions of certain literal occurrences during part of the proof.

Reduction Strategies. We implemented a trivial strategy based on the reduction tracking idea described above and several variants of a literal tracking strategy. The latter select a pair of complementary literals (and therefore are called LP strategies) that can be disposed of by a series of reduction steps. In order to find such pairs, an equation system is set up (similar to unification) that is solvable iff there is a reduction sequence that moves the literals (or one of their descendants) into a common disjunction. Then, RPC reductions are selected according to the equation system to make the literal pair vanish.

Additional Simplification Rules. To improve proof search behavior we added some simple rules and strategies to the RPCn calculus that preserve n-provability:

1. Give priority to shortening rules (R1), (R2) and (R3).
2. Remove quantifiers that bind no variables.
3. Minimize quantifier scopes.
4. Subgoal generation: to prove F ∧ G, prove first F and then G.
5. Additional ∀-rule: ∀xA ∨ B −→ ∀y(A[x/y] ∨ B) if y ∉ Fr(A) ∪ Fr(B), which is used with priority over (R6) and (R6′).
6. Delete pure literals.
7. Replace F ∨ F̃ resp. F ∧ F̃ by F, if F̃ is a bound renaming of F.
⁴ A reduction possibility consists of a position in the formula and, in case of the RPC-rules (R5), (R6) and (R6′), an additional reduction term.
Simplification rule 3 is applied only initially, before the actual proof search starts. Rules 5 and 7 are used to keep formula sizes smaller during proof search, where rule 5 is a special form of (R6) that has the purpose to accelerate introduction of so far unused variables. Moreover, there are two additional simplification rules that, however, may change n-provability: (1) partial Skolemization to remove ∀-quantifiers and (2) replacement of free variables by new constants.
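The quantifier-free core of the calculus can be sketched in a few lines. The following Python sketch is our own simplified reading, not the aRa source: distributing with (R4) turns an NNF formula into a conjunction of clauses; (R2) closes any clause containing a complementary pair of literals, and (R1)/(R3) then collapse the conjunction to ⊤ exactly when every clause is closed.

```python
# Quantifier-free sketch of rules (R1)-(R4): a formula in NNF reduces
# to top iff, after distributing 'or' over 'and' (R4), every clause
# contains a complementary literal pair (R2), whereupon (R1) and (R3)
# collapse the whole conjunction. Formulas are tagged tuples
# ('lit', name, sign), ('and', a, b), ('or', a, b) -- our encoding.

def clauses(f):
    """Distribute or over and; return clauses as sets of (name, sign)."""
    op = f[0]
    if op == 'lit':
        return [frozenset([(f[1], f[2])])]
    if op == 'and':                               # collect both sides
        return clauses(f[1]) + clauses(f[2])
    if op == 'or':                                # rule (R4), repeatedly
        return [c | d for c in clauses(f[1]) for d in clauses(f[2])]
    raise ValueError(op)

def reduces_to_top(f):
    return all(any((n, not s) in c for (n, s) in c) for c in clauses(f))

A, nA = ('lit', 'A', True), ('lit', 'A', False)
B, nB = ('lit', 'B', True), ('lit', 'B', False)
# (A and B) or (not A) or (not B) reduces to top ...
assert reduces_to_top(('or', ('and', A, B), ('or', nA, nB)))
# ... while A or B does not.
assert not reduces_to_top(('or', A, B))
```

The real calculus additionally interleaves (R5), (R6) and (R6′) to introduce and eliminate quantified variables, which is where the reduction strategies above do their work.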
4  Experimental Results
We made some experiments with our implementation on a Sun Enterprise 450 Server running at 400 MHz. The Glasgow Haskell Compiler, version 4.04, was used to translate our source files. In Table 1 the results of our tests are summarized. The problem class directly corresponds to the number of variables used for the proof, as indicated at the end of Section 2.

Table 1. aRa run times for some relation algebra problems.

problem     source  class  strat.  proofs  steps   time
3.2(v)      [TG87]  SSA    LI           2     46    130
3.2(vi)     [TG87]  SSA    LI           2     41    100
3.2(xvii)   [TG87]  SSA    LI           3     22     40
3.1(iii)()  [TG87]  RA     AI           6    104    200
3.2(xix)    [TG87]  RA     LA           3     25     50
Thm 2.7     [CT51]  RA     AI           1     12     40
Thm 2.11    [CT51]  RA     AI           1     19     50
Cor 2.19    [CT51]  RA     AI           1     77  75170
Dedekind    [DG98]  RA     AI           1     37     90
Cor 2.19    [CT51]  RRA    AI           1     38    140
In the last three columns the following information is given: the number of proofs that the problem consists of, the total number of RPC-reductions⁵, and the total proof time in milliseconds. The first letter in the strategy column is "L" for the normal LP reduction strategy and "A" for the LP strategy with priority for the additional ∀-simplification rule. The second letter corresponds to the selection of disjunctive subformulae, i.e., A in rule (R4) and B in rule (R6). The strategy indicated by letter "I" selects a minimal suitable disjunction, "A" a maximal one. The proof of Corollary 2.19 from [CT51] reveals an unexpectedly long runtime, which is reduced considerably by allowing more variables for the proof and thus switching to RRA.
⁵ Only reductions with one of the rules (R4), (R5), (R6) and (R6′) are considered.
5  Conclusion and Future Work
By using aRa, many small and medium-sized theorems in various relation algebras could be proved. Formulae not containing the relative unit predicate can be handled quite efficiently, other formulae suffer from the fact that neither the RPCn calculi nor the aRa prover offer a special treatment of equality. Compared with other implementations [HBS94, vOG97, DG98], the most obvious differences are the automatic proof procedure using reduction strategies and the translation to the RPCn calculi. The RALF system [HBS94] offers no automatic proof search, but has particular strengths in proof presentation. RALL [vOG97] is based on HOL and Isabelle, and thus is able to deal with higher order constructs. It also offers an experimental automatic mode using Isabelle’s tactics. δRA is a Display Logic calculus for relation algebra, and its implementation [DG98] is based on Isabelle’s metalogic. It also offers an automatic mode using Isabelle’s tactics. aRa can also be used to generate proofs in ordinary first-order logic and in restricted variable logics. As aRa is the initial implementation of a new calculus, we expect that further progress is very well possible. Implementation of new strategies or built-in equality may be viable directions for improvement. Availability. The aRa system is available as source and binary distribution from www-sr.informatik.uni-tuebingen.de/~sinz/ARA.
References

[CT51] L. H. Chin and A. Tarski. Distributive and modular laws in the arithmetic of relation algebras. University of California Publications in Mathematics, New Series, 1(9):341–384, 1951.
[DG98] J. Dawson and R. Goré. A mechanized proof system for relation algebra using display logic. In JELIA'98, LNAI 1489, pages 264–278. Springer, 1998.
[Gor95] L. Gordeev. Cut free formalization of logic with finitely many variables, part I. In CSL'94, LNCS 933, pages 136–150. Springer, 1995.
[Gor99] L. Gordeev. Variable compactness in 1-order logic. Logic Journal of the IGPL, 7(3):327–357, 1999.
[HBS94] C. Hattensperger, R. Berghammer, and G. Schmidt. RALF – a relation-algebraic formula manipulation system and proof checker. In AMAST'93, Workshops in Computing, pages 405–406. Springer, 1994.
[Mad83] R. Maddux. A sequent calculus for relation algebras. Annals of Pure and Applied Logic, 25:73–101, 1983.
[TG87] A. Tarski and S. Givant. A Formalization of Set Theory without Variables, volume 41 of Colloquium Publications. American Mathematical Society, 1987.
[vOG97] D. von Oheimb and T. Gritzner. RALL: Machine-supported proofs for relation algebra. In Automated Deduction – CADE-14, LNAI 1249, pages 380–394. Springer, 1997.
Scalable Knowledge Representation and Reasoning Systems Henry Kautz AT&T Labs-Research 180 Park Ave Florham Park NJ 07974, USA [email protected]
Abstract. Traditional work in knowledge representation (KR) aimed to create practical reasoning systems by designing new representation languages and specialized inference algorithms. In recent years, however, an alternative approach based on compiling combinatorial reasoning problems into a common propositional form, and then applying general, highly efficient search engines, has shown dramatic progress. Some domains can be compiled to a tractable form, so that run-time problem-solving can be performed in worst-case polynomial time. But there are limits to tractable compilation techniques, so in other domains one must compile instead to a minimal combinatorial "core". The talk will describe how both problem specifications and control knowledge can be compiled together and then solved by new randomized search and inference algorithms.
D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 183–183, 2000. c Springer-Verlag Berlin Heidelberg 2000
Efficient Minimal Model Generation Using Branching Lemmas Ryuzo Hasegawa, Hiroshi Fujita, and Miyuki Koshimura Graduate School of Information Science and Electrical Engineering Kyushu University, Kasuga-shi, Fukuoka 816-8580, Japan {hasegawa,fujita,koshi}@ar.is.kyushu-u.ac.jp http://ss104.is.kyushu-u.ac.jp/
Abstract. An efficient method for minimal model generation is presented. The method employs branching assumptions and lemmas so as to prune branches that lead to nonminimal models, and to reduce minimality tests on obtained models. This method is applicable to other approaches, such as Bry's complement splitting and constrained search or Niemelä's groundedness test, and greatly improves their efficiency. We implemented MM-MGTP based on the method. Experimental results with MM-MGTP show a remarkable speedup compared to MM-SATCHMO.
1 Introduction
The notion of minimal models is important in a wide range of areas such as logic programming, deductive databases, software verification, and hypothetical reasoning. Some applications in such areas actually need to generate the Herbrand minimal models of a given set of first-order clauses. Although the conventional tableaux and Davis-Putnam methods can construct all minimal models, they may also generate nonminimal models, which are redundant and thus cause inefficiency. In general, in order to ensure that a model M is minimal, it is necessary to check that M is not subsumed by any other model. We call this a minimality test on M. Since minimality tests on obtained models become ever more expensive as the number of models increases, it is important to avoid the generation of nonminimal models. Recently, two typical approaches in the tableaux framework have been reported. Bry and Yahya [1] presented a sound and complete procedure for generating minimal models and implemented MM-SATCHMO [2] in Prolog. The procedure rejects nonminimal models by means of complement splitting and constrained search. Niemelä also presented a propositional tableaux calculus for minimal model reasoning [8], in which he introduced the groundedness test as a substitute for constrained search. However, both approaches have the following problems: they perform unnecessary minimality tests on models that are assured to be minimal through a simple analysis of a proof tree, and they cannot completely prune all redundant branches that lead to nonminimal models. To solve these problems, we propose a new method that employs branching lemmas. It is applicable to the above approaches to enhance their ability.
D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 184–199, 2000. © Springer-Verlag Berlin Heidelberg 2000
Branching lemmas provide an efficient way of applying factorization [6] to minimal model generation, and their use is justified by the notion of proof commitment. Consider model generation with complement splitting. If no proof commitment occurs between a newly generated model M and any other model M′ that has been obtained (that is, no branch extended below a node labeled with a literal in M′ is closed by the negation of a literal in M), then M is guaranteed to be minimal, and a minimality test on M can be omitted. In addition, by pruning many branches that result in nonminimal models, the search space is greatly reduced. Both effects can be achieved with branching lemmas. We implemented the method on a Java version of MGTP [3,4] into which the functions of CMGTP [9] are already incorporated. We call this system MM-MGTP. Like MM-SATCHMO, it is applicable to first-order clauses. Experimental results show a remarkable speedup compared to MM-SATCHMO. This paper is organized as follows: in Section 2 the basic procedure of MGTP is outlined, while in Section 3 key techniques for minimal model generation are described. In Section 4 we define the branching lemma and explain how it works for minimal model generation. Section 5 describes the features of MM-MGTP, and Section 6 proves the soundness and completeness of minimal model generation with complement splitting and branching lemmas. In Section 7 we compare experimental results obtained by running MM-MGTP and MM-SATCHMO, and we discuss related work in Section 8.
2 Outline of MGTP
Throughout this paper, a clause is represented in implication form:

A1 ∧ . . . ∧ Am → B1 ∨ . . . ∨ Bn

where the Ai (1 ≤ i ≤ m) and Bj (1 ≤ j ≤ n) are literals; the left-hand side of → is called the antecedent, and the right-hand side the consequent. A clause is said to be positive if its antecedent is true (m = 0), and negative if its consequent is false (n = 0). A clause with n ≤ 1 is called a Horn clause; otherwise (n > 1) it is called a non-Horn clause. A clause is said to be range-restricted if every variable in its consequent appears in its antecedent, and violated under a set M of ground literals if, for some substitution σ, it holds that ∀i(1 ≤ i ≤ m) Aiσ ∈ M ∧ ∀j(1 ≤ j ≤ n) Bjσ ∉ M.
A sequential algorithm of the MGTP procedure mg is sketched in Fig. 1. Given a set S of clauses, mg tries to construct a model by extending the current model candidate M so as to satisfy the clauses violated under M (model extension). This process forms a proof tree called an MG-tree. In Fig. 1, operations to construct a model tree T consisting only of models are added to the original procedure, for use in the model checking type of MM-MGTP explained later.

    U0 ← positive Horn clauses; D0 ← positive non-Horn clauses; T ← φ;
    Ans ← mg0(U0, D0, ∅, T);

    function mg0(U, D, M, var T) {
        T′ ← φ;
        while (U ≠ ∅) {
            U ← U \ {u ∈ U};                                        · · · (1)
            if (u ∉ M) {
                M ← M ∪ {u}; T′ ⊕ u; CJM(u, M);
                if (M is rejected) return UNSAT;
            }
            if (U = ∅) {
                Simp&Subsump(D, M);                                 · · · (2)
                if (M is rejected) return UNSAT;
            }
        }
        if (D ≠ ∅) {
            d ← (L1 ∨ . . . ∨ Ln) ∈ D; D ← D \ {d};                 · · · (3)
            A ← UNSAT;
            for j ← 1 to n { Tj ← φ; A ← A ◦ mg0(U ∪ {Lj}, D, M, Tj); }
            T ⊕ T′ ⊕ ⟨T1, . . . , Tn⟩; return A;
        } else { T ⊕ T′; return SAT; }                              · · · (4)
    }

Fig. 1. The MGTP procedure mg

Fig. 2 (unit refutation): from L ∈ M and ¬L ∈ M, derive ⊥.
Fig. 3 (disjunction simplification): from L (resp. ¬L) ∈ M and ¬L (resp. L) ∨ C ∈ D, derive C.

The function mg0 takes, as initial input, the consequents of the positive Horn and non-Horn clauses, an empty model candidate M, and a null model tree T, and returns SAT/UNSAT as the proof result. It works as follows:
(1) As long as the unit buffer U is not empty, mg0 picks a unit literal u from U and extends the model candidate M with u (Horn extension). T′ ⊕ u means that u is attached to the leaf of T′. Then the conjunctive matching procedure CJM(u, M) is invoked to search for clauses whose antecedents are satisfied by M and u. If such non-negative clauses are found, their consequents are added to U or to the disjunction buffer D, according to the form of each consequent. When the antecedent of a negative clause is satisfied by M ∪ {u} in CJM(u, M), or the unit refutation rule shown in Fig. 2 applies to M ∪ {u}, mg0 rejects M and returns UNSAT (model rejection).
(2) When U becomes empty, the procedure Simp&Subsump(D, M) is invoked to apply the disjunction simplification rule shown in Fig. 3 and to perform subsumption tests on D against M. If a singleton disjunction is derived as a consequence of disjunction simplification, it is moved from D to U. When an empty clause is derived, mg0 rejects M and returns UNSAT.
(3) If D is not empty, mg0 picks a disjunction d from D and recursively calls mg0 to expand M with each disjunct Lj ∈ d (non-Horn extension). A ◦ B returns SAT if either A or B is SAT, and otherwise returns UNSAT. T′ ⊕ ⟨T1 . . . Tn⟩ means that each sub model tree Tj of Lj is attached to the leaf of T′, where ⟨φ, φ⟩ = φ and T ⊕ φ = T.
(4) When both U and D become empty, mg0 returns SAT.
The nodes of an MG-tree, except the root node, are all labeled with the literals used for model extension. A branch, i.e., a path from the root to a leaf, corresponds to a model candidate. Failed branches are those closed by model rejection, and are marked with × at their leaves. A branch is a success branch if it ends with a node at which no further model extension can be performed.

S1 = {→ a ∨ b ∨ c, a → b, c →}

Fig. 4. S1 and its MG-tree: the branch a is extended to b (model {a, b}, marked ⊗), the branch b succeeds immediately (model {b}), and the branch c is closed (×).

Fig. 5. Splitting rule: a disjunction (B11 ∧ . . . ∧ B1k1) ∨ . . . ∨ (Bn1 ∧ . . . ∧ Bnkn) splits into n branches, the j-th of which extends the model with Bj1, . . . , Bjkj.

Figure 4 gives an MG-tree for the clause set S1. Here two models, {a, b} and {b}, are obtained, while the model candidate {c} is rejected. {a, b} is nonminimal since it is subsumed by {b}; we say that a model M subsumes M′ if M ⊆ M′. A mark placed at a leaf on a success branch indicates whether the model corresponding to the branch is minimal or nonminimal (⊗). MGTP allows an extended clause of the form Ante → (B11 ∧ . . . ∧ B1k1) ∨ . . . ∨ (Bn1 ∧ . . . ∧ Bnkn), as in [5]; model extension with such a clause is performed according to the splitting rule shown in Fig. 5.
Major operations in MGTP, such as conjunctive matching, subsumption testing, unit refutation, and disjunction simplification, reduce to membership tests checking whether a literal L belongs to the model candidate M, so speeding up this test is the key to achieving good performance. For this, we introduced a facility called an Activation-cell (A-cell) [4]. It retains a boolean flag indicating whether a literal L is in the current model candidate M under construction (active) or not (inactive). Moreover, all occurrences of L are uniquely represented as a single object in the system (no copies are made), and the object has an ac field referring to its A-cell. Thus, whether L ∈ M or not is determined by merely checking the ac field of L. By using the A-cell facility, every major operation in MGTP can be performed in O(1) w.r.t. the size of the model candidate.
3 Minimal Model Generation
The first clause → a ∨ b ∨ c in S1 is equivalent to the extended clause → (a ∧ ¬b ∧ ¬c) ∨ (b ∧ ¬c) ∨ c. By applying the splitting rule in Fig. 5 to this extended clause, the nonminimal model {a, b} of S1 can be pruned, since the unit refutation rule applies to b and ¬b. The added ¬b and ¬c are called branching assumptions and are denoted by [¬b] and [¬c], respectively. In general, non-Horn extension with a disjunction L1 ∨ L2 ∨ . . . ∨ Ln is actually performed using the augmented disjunction (L1 ∧ [¬L2] ∧ . . . ∧ [¬Ln]) ∨ (L2 ∧ [¬L3] ∧ . . . ∧ [¬Ln]) ∨ . . . ∨ Ln, which corresponds exactly to an application of the complement splitting rule [1]. Complement splitting guarantees that the leftmost model in an MG-tree is always minimal, as proven by Bry and Yahya [1]. However, the models generated to the right of it are not necessarily minimal. For instance, given the clause set S2 in Fig. 6, we obtain a minimal model {a} on the leftmost branch, while obtaining the nonminimal model {b, a} on the rightmost branch.
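The augmentation is mechanical, as the following sketch shows (hypothetical Python mirroring the complement splitting rule described above; the ('not', L) encoding of assumptions is an assumption of the sketch):

```python
def complement_split(disjuncts):
    """Complement splitting (sketch): the i-th branch extends the model with
    Li together with branching assumptions [¬Lj] for every right sibling Lj.
    Assumptions are encoded as ('not', L) pairs."""
    return [[disjuncts[i]] + [("not", d) for d in disjuncts[i + 1:]]
            for i in range(len(disjuncts))]

# -> a ∨ b ∨ c becomes (a ∧ ¬b ∧ ¬c) ∨ (b ∧ ¬c) ∨ c, as in the text:
for branch in complement_split(["a", "b", "c"]):
    print(branch)
# ['a', ('not', 'b'), ('not', 'c')]
# ['b', ('not', 'c')]
# ['c']
```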
S2 = {→ a ∨ b, b → a}

Fig. 6. Ineffective branching assumption: in the MG-tree for S2, the branch a (with assumption [¬b]) yields the minimal model {a}, while the branch b is extended to a, yielding the nonminimal model {b, a}.

S3 = {→ a ∨ e ∨ b, a → c ∨ d, c → b, b → a}

Fig. 7. An MG-tree with branching lemmas for S3: the root splits into a1 ([¬e1], [¬b1]), e1 ([¬b1]), and b1 ([[¬e1]]). Below a1, the branch c1 ([¬d1]) is closed (×), while d1 ([[¬c1]]) yields model M1 (⊗); the branch e1 yields M2; below b1, a2 is derived and split into c2 ([¬d2]) and d2 ([[¬c2]]), yielding M3 and M4.
In order to ensure that every model obtained is minimal, MM-SATCHMO employs constrained search based on model constraints [1], as follows. When a minimal model {L1, . . . , Lm} is found, a model constraint, i.e., a new negative clause L1 ∧ . . . ∧ Lm →, is added to the given clause set. For instance, in Fig. 6, the negative clause a → is added to S2 when the minimal model {a} is obtained; it forces the nonminimal model {b, a} to be rejected. However, this method needs to maintain the dynamically added negative clauses, the number of which may increase significantly. Moreover, it may incur rather heavy overhead due to conjunctive matching on the negative clauses, which is performed every time a model candidate is extended. To alleviate this memory consumption problem, Niemelä's approach [8] seems promising. His method works as follows. Whenever a model M = {L1, . . . , Lm} is obtained, it is tested whether S ∪ M̄ |= L holds for all L ∈ M, where S is the given clause set and M̄ = {¬L′ | L′ ∉ M}. This test, called the groundedness test, amounts to constructing a new tableau for the temporarily augmented clause set SM = S ∪ M̄ ∪ {L1 ∧ . . . ∧ Lm →}. If SM is unsatisfiable, then it is concluded that M is minimal; otherwise, M is nonminimal.
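The constrained-search idea can be illustrated in a few lines (a hypothetical Python sketch, not MM-SATCHMO's Prolog code; the names model_constraint and rejects are assumptions):

```python
def model_constraint(minimal_model):
    """A found minimal model {L1,...,Lm} becomes the negative clause
    L1 ∧ ... ∧ Lm -> , encoded here as (antecedent, empty consequent)."""
    return (tuple(sorted(minimal_model)), ())

def rejects(constraint, candidate):
    """The constraint rejects any later candidate containing all of L1..Lm."""
    ante, _ = constraint
    return all(l in candidate for l in ante)

# S2 from Fig. 6: after the minimal model {a} is found, the constraint
# a ->  rejects the nonminimal candidate {b, a} but not {b}.
c = model_constraint({"a"})
print(rejects(c, {"b", "a"}), rejects(c, {"b"}))  # True False
```

The memory problem described above corresponds to the list of such constraints growing with every minimal model found.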
4 Branching Lemma
If branching assumptions are added symmetrically, inference with them becomes unsound. For instance, consider the clause set S2′ obtained by adding the clause a → b to S2 in Fig. 6. If ¬a is added to the disjunct b, no models are obtained for S2′, although the minimal model {a, b} does exist. However, for S2, ¬a can be added to b to reject the model {b, a}, because the proof below a does not depend on that of b; that is, there is no mutual proof commitment between the two branches. In this situation, we can use ¬a as a unit lemma in the proof below b.
Definition 1. Let Li be a disjunct in L1 ∨ . . . ∨ Ln used for non-Horn extension. The disjunct Li is called a committing disjunct if a branch expanded below Li is closed by the branching assumption [¬Lj] of some right sibling disjunct Lj (i + 1 ≤ j ≤ n). On the other hand, every right sibling disjunct Lk (i + 1 ≤ k ≤ n) is called a committed disjunct from Li.
S4 = {→ a ∨ b ∨ c, b → a, c → b}

Fig. 8. Pruning by branching lemmas: in the MG-tree for S4, the branch a ([¬b], [¬c]) yields a model, while the branch b ([¬c], [[¬a]]) derives a and is closed by [[¬a]] (×), and the branch c ([[¬a]], [[¬b]]) derives b and is closed by [[¬b]] (×).

S5 = {→ a ∨ b, → c ∨ d}

Fig. 9. Omitting minimality tests: in the MG-tree for S5, each of the branches a ([¬b]) and b ([[¬a]]) is extended with c ([¬d]) or d ([[¬c]]); all four models are safe, and hence minimal.
Definition 2. If Li in L1 ∨ . . . ∨ Ln used for non-Horn extension is not a committing disjunct, we add ¬Li to every right sibling disjunct Lj (i + 1 ≤ j ≤ n). Such a ¬Li is called a branching lemma and is denoted by [[¬Li]]. For example, in Fig. 7, a1 is a committing disjunct since the assumption [¬b1] is used to close the leftmost branch expanded below a1. Here, a superscript is added to a literal to identify an occurrence of the identical literal. e1 and b1 are committed disjuncts since they are committed from a1. Branching lemmas [[¬c1]], [[¬e1]], and [[¬c2]] are generated from the non-committing disjuncts c1, e1, and c2, respectively, whereas [[¬a1]] cannot be generated from the committing disjunct a1.
Definition 3. Let M be a model obtained in an MG-tree. If it contains a committed disjunct Lj in some L1 ∨ . . . ∨ Ln used for non-Horn extension, each committing disjunct Li appearing as a left sibling of Lj is said to be a committing disjunct relevant to M. M is said to be a safe model if it contains no committed disjuncts; otherwise, M is said to be a warned model.
With branching lemmas, it is possible to prune branches that would lead to nonminimal models, as shown in Fig. 8. In addition, branching lemmas have a great effect in reducing minimality tests, as described below.
Omitting a Minimality Test. If an obtained model M is safe, M is assured to be minimal, so no minimality test is required. Intuitively, this is justified as follows. If Lj ∈ M is a disjunct in some disjunction L1 ∨ . . . ∨ Ln, it cannot be a committed disjunct, by Definition 3. For each left sibling disjunct Lk (1 ≤ k ≤ j − 1), a model M′ containing Lk, if any, satisfies the following: Lj ∉ M′ under branching assumption [¬Lj], and Lk ∉ M under branching lemma [[¬Lk]]. Thus, M is not subsumed by M′. For instance, in Fig. 9, all obtained models of S5 are assured to be minimal without performing any minimality test, since they are safe.
Restricting the Range of a Minimality Test. On the other hand, if an obtained model M is warned, it is necessary to perform a minimality test on M against the models obtained so far. However, minimality tests need be performed only against models that contain committing disjuncts relevant
to M. We call this a restricted minimality test. The reason why a minimality test is necessary in this case is as follows. Suppose that Lk is a committing disjunct relevant to M, that Lj is the corresponding committed disjunct in M, and that M′ is a model containing Lk. Although Lj ∉ M′ under branching assumption [¬Lj], it may hold that Lk ∈ M, since the branching lemma [[¬Lk]] is not available for M. Thus, M may be subsumed by M′. For example, in Fig. 7, model M1 is safe because it contains no committed disjunct; thus, a minimality test on M1 can be omitted. Models M2, M3, and M4 are warned because they contain the committed disjunct e1 or b1. Hence, they require minimality tests. Since a1 is the committing disjunct relevant to each of them, minimality tests on them are performed¹ only against M1, which contains a1.
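The interplay of assumptions and lemmas in Definitions 1–3 can be sketched as follows (hypothetical Python; in the real prover the committing status of a branch is only known after it has been explored depth-first, so this static formulation is a simplification and all names are assumptions):

```python
def split_with_lemmas(disjuncts, committing):
    """Non-Horn extension with branching assumptions and branching lemmas
    (sketch). 'committing' holds the indices of committing disjuncts; by
    Definition 2, a lemma [[¬Li]] is added below every right sibling only
    for non-committing Li."""
    branches = []
    for i, li in enumerate(disjuncts):
        assumptions = [("assume_not", d) for d in disjuncts[i + 1:]]
        lemmas = [("lemma_not", disjuncts[k])
                  for k in range(i) if k not in committing]
        branches.append([li] + assumptions + lemmas)
    return branches

# First split of Fig. 7 (a ∨ e ∨ b, with a committing): the b-branch gets
# the lemma [[¬e]] but, as in the text, no lemma [[¬a]].
print(split_with_lemmas(["a", "e", "b"], {0})[2])  # ['b', ('lemma_not', 'e')]
```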
5 Implementation of MM-MGTP
We have implemented two types of the minimal model generation prover MM-MGTP: model checking and model re-computing. The former is based on Bry and Yahya's method, and the latter on Niemelä's method.
Model Checking MM-MGTP. Although the model checking MM-MGTP is similar to MM-SATCHMO, the treatment of model constraints differs somewhat. Instead of dynamically adding model constraints (negative clauses) to the given clause set, MM-MGTP retains them in the form of a model tree T. Thus, the constrained search for minimal models in MM-SATCHMO is replaced by a model tree traversal for minimality testing. For this, whenever a warned model M is obtained at (4) in Fig. 1, mg invokes the attached procedure mchk. Consider Fig. 7 again. When the proof of a1 has been completed, the node Na1 labeled with a1 in T is marked as having a committing disjunct, and a pointer to the A-cell allocated for its parent (the root) is assigned to the com field of the corresponding committed literals e1 and b1. By this, e1 and b1 are recognized as committed just by checking their com fields, and the branches below e1 and b1 are identified as warned. Hence, when the model M3 is generated, mchk first finds the committed disjunct b1 in M3. Then, finding b1's left sibling node Na1, mchk traverses down the paths below Na1, searching for a minimal model that subsumes M3. During the traversal of a path in T, each node on the path is examined to determine whether the literal L labeling it belongs to the current model M (is active) or not, by checking the ac field of L. If L is active, then L ∈ M; otherwise L ∉ M. In the latter case, mchk quits traversing the path immediately and searches for another one. If mchk reaches an active leaf, then M is subsumed by the minimal model on the traversed path, and thus M is nonminimal. Here, we employ early pruning as follows: if the current model M = {L1, . . . , Li, . . . , Lm} is subsumed by a previous model M′ such that M′ ⊆ {L1, . . . , Li}, we can prune the branches below Li. Although mchk is invoked after M has been generated, our method is more efficient than using model constraints, since it performs a minimality test not every time a model extension with Lj ∈ M occurs, but only once, when a warned model M is obtained.
¹ For further refinement, a minimality test on a model M that contains no committing disjunct relevant to it can be omitted. This is the case for M2.
Model Re-computing MM-MGTP. The model re-computing MM-MGTP can also be implemented easily. In this version, the model tree operations are removed from mg, and the re-computation procedure rcmp for minimality testing is attached at (4) in Fig. 1. rcmp is the same as mg except that some routines are modified for restarting the execution. It basically performs groundedness tests in the same way as Niemelä's method: whenever a warned model M = {L1, . . . , Lm} is obtained, mg invokes rcmp to restart model generation for S ∪ M̄ with a negative clause CM = L1 ∧ . . . ∧ Lm → added temporarily. If rcmp returns UNSAT, then M is assured to be minimal. Otherwise, a model M′ satisfying M′ ⊂ M is found, and M is rejected, since it turns out to be nonminimal. Note that no model M″ satisfying M″ ⊄ M will be generated by rcmp, because of the constraints M̄ and CM. Note also that any minimal models subsuming M must be found to the left of M in the MG-tree, due to complement splitting. The slight difference from Niemelä's method is that, in place of CM above, we use a shortened negative clause Lk1 ∧ . . . ∧ Lkr →, consisting of the committed disjuncts in M. It is obtained by removing from CM the uncommitted literals Lu ∈ M, i.e., those not committed from their left siblings. For instance, in Fig. 7, when the model M4 = {b1, a2, d2} is obtained, the shortened negative clause b1 → is created instead of b1 ∧ a2 ∧ d2 →, since only b1 is a committed disjunct in M4. The use of shortened negative clauses corresponds to the restricted minimality test and avoids groundedness tests on the uncommitted literals Lu. The validity of using shortened negative clauses is given in Theorem 4.
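The shortened negative clause itself is a small operation, sketched below (hypothetical Python; detection of committed disjuncts is assumed to be done elsewhere, e.g., via the com fields described in the model checking variant):

```python
def shortened_constraint(model, committed):
    """Build the shortened negative clause Lk1 ∧ ... ∧ Lkr ->  used for the
    groundedness test: keep only the committed disjuncts of M, dropping the
    uncommitted literals (sketch; 'committed' is assumed precomputed)."""
    return (tuple(l for l in model if l in committed), ())

# Fig. 7: for M4 = {b1, a2, d2} with only b1 committed, the test uses
# b1 ->  instead of b1 ∧ a2 ∧ d2 -> .
print(shortened_constraint(["b1", "a2", "d2"], {"b1"}))  # (('b1',), ())
```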
6 Soundness and Completeness
In this section, we present some results on the soundness and completeness of the MM-MGTP procedure. First, we show that model generation with factorization is complete for generating minimal models. This implies that the MM-MGTP procedure is also complete, because the use of branching assumptions and lemmas can be viewed as an application of factorization. Second, we give a necessary condition for a generated model to be nonminimal. The restricted minimality test preserves minimal model soundness because it is performed whenever this condition is satisfied. Last, we prove that using shortened negative clauses for the groundedness test guarantees the minimality of generated models.
Theorem 1. Let T be a proof tree of a set S of clauses, let N1 and N2 be sibling nodes in T, and let Li be the literal labeling Ni and Ti the subproof tree below Ni (i = 1, 2), as shown in Fig. 10(a). If N2 has a descendant node N3 labeled with L1, then for each model M through a subproof tree T3 below N3, there exists a model M′ through T1 such that M′ ⊆ M (Fig. 10(b)).
Fig. 10. Proof trees explaining Theorems 1 and 2 and Definition 5

Proof. We define the sequence of literals s1, s2, . . . constituting M′ through T1 by induction. Let I be the set of literals on the path P ending at N1. If N1 is a leaf (T1 = φ), P must be a success path and M′ = I ⊆ M, because M is a model through T3 and L1 ∈ M. Otherwise, there is a clause C1 = Γ1 → L11 ∨ . . . ∨ L1n1 violated under I and used for model extension at N1 (Fig. 10(c)). Then M |= Γ1 and M |= L11 ∨ . . . ∨ L1n1, since M is a model. So there exists a node labeled with s1 such that s1 ∈ {L11, . . . , L1n1} and s1 ∈ M. Suppose that we have defined the first n literals of M′ in T1 to be s1, . . . , sn, by traveling down the successor nodes whose labels belong to M. If the node labeled with sn is a leaf, we are done. Otherwise, we continue the definition of M′. The sequence either ends with the label of a leaf after finitely many steps or continues forever. In either case, there exists a model M′ containing the literals in the sequence such that M′ = I ∪ {s1, s2, . . .} ⊆ M. ⊓⊔
The above is a fundamental theorem for proving the minimal model completeness of model generation with factorization. We define our factorization essentially in the same manner as tableau factorization [6]. To avoid a circular argument, a factorization dependency relation is maintained on the proof tree.
Definition 4 (Factorization Dependency Relation). A factorization dependency relation on a proof tree is a strict partial ordering ≺ relating sibling nodes in the proof tree. A relation N1 ≺ N2 means that the search for minimal models below N2 is committed to that below N1.
Definition 5 (Factorization). Given a proof tree T and a factorization dependency relation ≺ on T, first select a node N3 labeled with a literal L1 and another node N1 labeled with the same literal L1 such that (1) N3 is a descendant of a node N2 which is a sibling of N1, and (2) N2 ⊀ N1. Next, close the branch extended to N3 (denoted by ?), modify ≺ by adding the relation N1 ≺ N2, and form the transitive closure of the relation. The symbol ? means that the proof of N3 is committed to that of N1. The situation is depicted in Fig. 10(d).
Corollary 1. Let S be a set of clauses. If a minimal model M of S is built by model generation, then M is also built by model generation with factorization.
Proof. Immediate from Theorem 1. ⊓⊔
The model generation procedure is minimal model complete (in the sense that it generates all minimal models) for range-restricted clauses [1]. This implies the minimal model completeness of model generation with factorization.
Corollary 2 (Minimal Model Completeness of Model Generation with Factorization). Let S be a satisfiable set of range-restricted clauses and T a proof tree produced by model generation with factorization. If M is a minimal model of S, then M is found in T.
We view model generation with branching assumptions and lemmas as arranging the factorization dependency relation on sibling nodes N1, . . . , Nm labeled with L1, . . . , Lm, respectively, as follows: Nj ≺ Ni for all j (i < j ≤ m) if Li is a committing disjunct, while Ni ≺ Nj if [[¬Li]] is used below Nj. This observation leads to the minimal model completeness of the MM-MGTP procedure.
Corollary 3 (Minimal Model Completeness of MM-MGTP). Let S be a satisfiable set of range-restricted clauses and T a proof tree produced by model generation with branching assumptions and branching lemmas. If M is a minimal model of S, then M is found in T.
Although model generation with factorization can suppress the generation of nonminimal models, it may still generate them. In order to make the procedure sound, that is, to make it generate minimal models only, we need a minimality test on each obtained model. The following theorem gives a necessary condition for a generated model to be nonminimal.
Theorem 2. Let S be a set of clauses and T a proof tree of S produced by model generation with factorization. Let N1 and N2 be sibling nodes in T, Ti the subproof tree below Ni, and Mi a model through Ti (i = 1, 2). If N2 ⊀ N1, then M1 ⊈ M2.
Proof. Suppose that Ni is labeled with a literal Li (i = 1, 2) (Fig. 10(a)). It follows from N2 ⊀ N1 that either (1) N1 ≺ N2, or (2) there is no ≺-relation between N1 and N2. If (1) holds, L1 ∉ M2, because every node labeled with L1 in T2 has been factorized with N1; on the other hand, L1 ∈ M1. Therefore, M1 ⊈ M2. If (2) holds, L1 ∉ M2 because there is no node labeled with L1 in T2. Therefore, M1 ⊈ M2. ⊓⊔
Theorem 2 says that (1) if N2 ⊀ N1, no minimality test on M2 against M1 is required; otherwise, (2) if N2 ≺ N1, we need to check the minimality of M2 against M1. In MM-MGTP, which is based on a depth-first, left-first search, omitting the minimality test on a safe model is justified by (1), while the restricted minimality test on a warned model is justified by (2). If N1 were a left sibling of N2 such that N1 ≺ N2, e.g., because a branching lemma [[¬L1]] is used below N2, a minimality test on M1 against M2 would be required according to Theorem 2. However, it is unnecessary in MM-MGTP, since it always holds that M2 ⊈ M1, as shown next.
Theorem 3. Let T be a proof tree produced by a depth-first, left-first search version of model generation with factorization, and let M1 be a model found in T. If a model M2 is found to the right of M1 in T, then M2 ⊈ M1.
Proof. Let PMi be the path corresponding to Mi (i = 1, 2). Then there are sibling nodes N1 on PM1 and N2 on PM2. Let Li be the label of Ni (i = 1, 2). Now assume that M2 ⊆ M1. This implies L2 ∈ M1. Then there is a node NL2 labeled with L2 on PM1. However, NL2 can be factorized with N2 in the depth-first, left-first search. This contradicts the fact that M1 is found in T. Therefore, M2 ⊈ M1. ⊓⊔
Corollary 4 (Minimal Model Soundness of MM-MGTP). Let S be a satisfiable set of range-restricted clauses and T a proof tree produced by model generation with branching assumptions, branching lemmas, and restricted minimality tests. If M is a model found in T, then M is a minimal model of S.
The following theorem says that a shortened negative clause suffices for the groundedness test to guarantee the minimality of generated models.
Definition 6. Let S be a set of clauses, T a proof tree of S produced by model generation with factorization, and M a model found in T. For each literal L ∈ M, NL denotes the node labeled with L on the path of M. Let Mp ⊆ M be the set satisfying the following condition: for every L ∈ Mp, there exists a node N such that NL ≺ N. Note that Mp is the set of committed disjuncts in M. CMp denotes the shortened negative clause L1 ∧ . . . ∧ Lm →, where Li ∈ Mp (i = 1, . . . , m).
Theorem 4. Let S be a set of clauses. M is a minimal model of S if and only if SMp = S ∪ M̄ ∪ {CMp}² is unsatisfiable, where M̄ = {L′ → | L′ ∉ M}.
Proof. (Only-if part) Let M′ be a model of S. There are three cases, according to the relationship between M and M′: (1) M′ \ M ≠ ∅, (2) M′ = M, or (3) M′ ⊂ M. If (1) holds, M′ is rejected by a negative clause in M̄. If (2) holds, M′ is rejected by the shortened negative clause CMp. Case (3) conflicts with the assumption that M is minimal. Hence no model of S is a model of SMp; therefore, SMp is unsatisfiable. ⊓⊔
Proof. (If part) Let T be a proof tree of S produced by model generation with factorization. Suppose that M is not minimal. Then there exists a model M′ of S found in T such that M′ ⊆ M. Let PM and PM′ be the paths corresponding to M and M′, respectively. Then there are sibling nodes N and N′ in T such that N is on PM and N′ is on PM′. Let L and L′ be the labels of N and N′, respectively. In case N ≺ N′, M′ conflicts neither with CMp (because L ∉ M′) nor with M̄ (because M′ ⊆ M). Thus, M′ is a model of SMp. This contradicts the fact that SMp is unsatisfiable. In case N ⊀ N′, since a node labeled with L′ cannot appear on PM, both L′ ∉ M and L′ ∈ M′ hold. This contradicts M′ ⊆ M. Therefore, M is minimal. ⊓⊔
² If Mp = ∅, then L1 ∧ . . . ∧ Lm → becomes the empty clause →, which denotes contradiction. In this case, we conclude that M is minimal without a groundedness test.
7 Experimental Results
This section compares experimental results for MM-MGTP with those for MM-SATCHMO and MGTP. Regarding MM-MGTP, we also compare four versions: model re-computing with/without branching lemmas (Rcmp+BL/Rcmp), and model checking with/without branching lemmas (Mchk+BL/Mchk). MM-MGTP and MGTP are implemented in Java, while MM-SATCHMO is implemented in ECLiPSe Prolog. All experiments were performed on a Sun Ultra10 (333 MHz, 128 MB). Table 1 shows the results. The examples used are as follows.
ex1. Sn = {→ ak ∨ bk ∨ ck ∨ dk ∨ ek ∨ fk ∨ gk ∨ hk ∨ ik ∨ jk | 1 ≤ k ≤ n}
This problem is taken from the benchmark examples for MM-SATCHMO. The MG-tree for ex1 is a balanced tree of branching factor 10, and every generated model is minimal. Since every success branch contains no committed disjunct, i.e., the corresponding model is safe, no minimality test is required if branching lemmas are used.
ex2. Sn = {ai−1 → ai ∨ bi ∨ ci, bi → ai, ci → bi | 2 ≤ i ≤ n} ∪ {→ a1}
The MG-tree for ex2 becomes a right-heavy unbalanced tree. Only the leftmost branch gives a minimal model, which subsumes all other models to the right. With branching lemmas, these nonminimal models can be rejected.
ex3. T1 = {→ a1 ∨ b1, a1 → b1, b1 → a2 ∨ b2, a2 → b2 ∨ d1}
T2 = {b2 → a3 ∨ b3, a3 → a2 ∨ c2, a3 ∧ a2 → b3 ∨ d2, a3 ∧ c2 → b3 ∨ d2}
Tj = {bj → aj+1 ∨ bj+1, aj+1 → aj ∨ cj, cj → aj−1 ∨ cj−1, aj+1 ∧ a2 → bj+1 ∨ dj, aj+1 ∧ c2 → bj+1 ∨ dj} (j ≥ 3)
Sn = T1 ∪ . . . ∪ Tn
The MG-tree for ex3 is a right-heavy unbalanced tree, as for ex2. Since every success branch contains committed disjuncts, minimality tests are inevitable. However, none of the obtained models is rejected by the minimality test.
ex4. Sa = {→ ai ∨ bi ∨ ci ∨ di ∨ ei | 1 ≤ i ≤ 4} ∪ {a3 → a2, a4 → a3, a1 → a4}
ex5. Sabcd = Sa ∪ {b3 → b2, b4 → b3, b1 → b4} ∪ {c3 → c2, c4 → c3, c1 → c4} ∪ {d3 → d2, d4 → d3, d1 → d4}
ex4 and ex5 are taken from the paper [8]. No nonminimal models can be rejected without using branching lemmas.
syn9-1.
An example taken from the TPTP library [11], which is unsatisfiable.
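The structure of ex2 can be checked directly by brute force. The sketch below is our own illustration, not part of MM-MGTP (the clause encoding and helper names are ours): it enumerates all models of ex2 for a small n and keeps only the minimal ones, confirming that the single minimal model is {a1, . . . , an}.

```python
from itertools import product

def ex2_clauses(n):
    """Clause set ex2 from the text, as (body, head-disjunction) pairs of atom sets."""
    cs = [(set(), {"a1"})]                               # -> a1
    for i in range(2, n + 1):
        cs.append(({f"a{i-1}"}, {f"a{i}", f"b{i}", f"c{i}"}))
        cs.append(({f"b{i}"}, {f"a{i}"}))                # b_i -> a_i
        cs.append(({f"c{i}"}, {f"b{i}"}))                # c_i -> b_i
    return cs

def models(n):
    cs = ex2_clauses(n)
    atoms = sorted({x for body, head in cs for x in body | head})
    for bits in product([False, True], repeat=len(atoms)):
        m = {a for a, v in zip(atoms, bits) if v}
        # m satisfies body -> head iff body is not contained in m, or head meets m
        if all(not body <= m or head & m for body, head in cs):
            yield m

def minimal_models(n):
    ms = list(models(n))
    return [m for m in ms if not any(other < m for other in ms)]

print(minimal_models(4))    # the single minimal model {a1, a2, a3, a4}
```

Since b_i forces a_i and c_i forces b_i, every model must contain a1, . . . , an, so {a1, . . . , an} is the unique minimal model; all other models are subsumed by it, as the text states.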
Ryuzo Hasegawa, Hiroshi Fujita, and Miyuki Koshimura
Table 1. Performance comparison

Problem     |   Rcmp+BL |   Mchk+BL |      Rcmp |      Mchk |      MM-SAT |      MGTP
------------+-----------+-----------+-----------+-----------+-------------+----------
ex1 (N=5)   |     0.271 |     0.520 |     2.315 |     0.957 |    8869.950 |     0.199
            |    100000 |    100000 |    100000 |    100000 |      100000 |    100000
            |         0 |         0 |         0 |         0 |           0 |         0
ex1 (N=7)   |    34.150 | OM (>144) |   324.178 | OM (>115) | OM (>40523) |    19.817
            |  10000000 |         − |  10000000 |         − |           − |  10000000
            |         0 |         − |         0 |         − |           − |         0
ex2 (N=14)  |     0.001 |     0.001 |    82.112 |    16.403 |    1107.360 |     9.013
            |         1 |         1 |         1 |         1 |           1 |   1594323
            |        26 |        26 |   1594322 |   1594322 |     1594323 |         0
ex3 (N=16)  |    19.816 |     5.076 |    19.550 |     5.106 |  OM (>2798) |   589.651
            |     65536 |     65536 |     65536 |     65536 |           − |  86093442
            |         1 |         1 |         1 |         1 |           − |         0
ex3 (N=18)  |    98.200 |    26.483 |    95.436 |    26.103 |  OM (>1629) |  5596.270
            |    262144 |    262144 |    262144 |    262144 |           − | 774840978
            |         1 |         1 |         1 |         1 |           − |         0
ex4         |     0.002 |     0.002 |     0.009 |     0.003 |         0.3 |     0.004
            |       341 |       341 |       341 |       341 |         341 |       501
            |        96 |        96 |       160 |       160 |         284 |         0
ex5         |     0.001 |     0.001 |     0.002 |     0.001 |        0.25 |     0.001
            |        17 |        17 |        17 |        17 |          17 |       129
            |        84 |        84 |        88 |        88 |         608 |         0
syn9-1      |     0.105 |     0.109 |     0.101 |     0.092 | TO (>61200) |     0.088
            |         0 |         0 |         0 |         0 |           − |         0
            |     19683 |     19683 |     19683 |     19683 |           − |     19683
channel     |     4.016 |     4.064 |    46.166 |     4.517 |          NA |     3.702
            |     51922 |     51922 |     51922 |     51922 |           − |     51922
            |        78 |        78 |        78 |        78 |           − |        78

top: time (sec); middle: No. of models; bottom: No. of failed branches.
MM-SAT: MM-SATCHMO; OM: out of memory; TO: time out; NA: not available due to lack of constraint handling.
Channel. A channel-routing problem [12] in which constraint propagation with negative literals plays an essential role in pruning the search space. For this problem, even MGTP generates only minimal models. The last two first-order examples are used to estimate the overhead of minimality testing in MM-MGTP.

MM-MGTP vs. MM-SATCHMO. Since MM-SATCHMO frequently aborted execution due to memory overflow, we consider only the problems that MM-SATCHMO could solve. MM-MGTP shows a great advantage on ex1, which needs no minimality tests, and on ex2, where branching lemmas have a strong pruning effect. The fastest version of MM-MGTP achieves speedups of 33,000 and 1,100,000 over MM-SATCHMO for ex1 and ex2, respectively. Even for small problems like ex4 and ex5, MM-MGTP is more than one hundred
Efficient Minimal Model Generation Using Branching Lemmas
times faster than MM-SATCHMO. In addition, it is reported that Niemelä's system takes less than 2 and 0.5 seconds for ex4 and ex5, respectively [8].

MM-MGTP vs. MGTP. Compared to MGTP, the proving time for MM-MGTP shortens considerably as the number of nonminimal models rejected by branching lemmas and minimality tests increases. In particular, ex2 and ex3 exhibit a great effect of minimal model generation with branching lemmas. Although ex1, syn9-1, and channel are problems in which no nonminimal model is created, very little overhead is observed for MM-MGTP with branching lemmas, because the minimality tests can be omitted.

Rcmp vs. Mchk. For the propositional problems ex1 (except N=7) through ex5, which do not require a term memory [10], the proving time for Rcmp is 2 to 5 times that for Mchk because of re-computation overhead. For the first-order problem channel, which requires the term memory, Rcmp is about 10 times slower than Mchk. This is because the overhead of term memory access is large, and Rcmp moreover doubles the access frequency when performing groundedness tests. Next, consider the effect of branching lemmas. For ex1 and channel, since minimality tests can be omitted with branching lemmas, Rcmp+BL and Mchk+BL obtain 8.5- to 11.5-fold and 1.1- to 1.84-fold speedups, respectively, compared to the versions without branching lemmas. Although the speedup ratio is rather small for Mchk+BL, this shows that Mchk, based on model tree traversal, is already very efficient. ex2 is a typical example demonstrating the effect: both Rcmp+BL and Mchk+BL achieve a several-ten-thousand-fold speedup, as expected.

Rcmp+BL vs. Mchk+BL. For ex3, in which minimality tests cannot be omitted, Mchk+BL is about 4 times faster than Rcmp+BL. Although for ex1 (N=5) no difference between Mchk+BL and Rcmp+BL should exist in principle, the former is about 2 times slower than the latter. This is because Mchk+BL has to retain all generated models, thereby causing frequent garbage collection.
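The headline speedup figures can be recomputed directly from the times in Table 1. The numbers below are transcribed from the table (fastest MM-MGTP version vs. MM-SATCHMO):

```python
# Times in seconds transcribed from Table 1: (fastest MM-MGTP version, MM-SATCHMO).
times = {
    "ex1 (N=5)":  (0.271, 8869.950),
    "ex2 (N=14)": (0.001, 1107.360),
    "ex4":        (0.002, 0.3),
    "ex5":        (0.001, 0.25),
}
for problem, (mm_mgtp, mm_satchmo) in times.items():
    print(f"{problem}: {mm_satchmo / mm_mgtp:,.0f}-fold speedup")
```

This reproduces the approximately 33,000-fold (ex1) and 1,100,000-fold (ex2) speedups quoted above, and confirms that even on ex4 and ex5 the ratio exceeds one hundred.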
8
Related Work
In a tableau framework, Letz presented factorization [6] to prune tableau trees. Complement splitting (folding-down in [6]) is a restricted way of implementing factorization. It is restricted in the sense that a precedence relation is pre-determined between the disjuncts in each disjunction, and only a disjunct with higher precedence can commit its proof to that of a sibling disjunct with lower precedence, whereas no such precedence is pre-determined in factorization. Although factorization is more powerful than complement splitting, without any guide or control it may also generate nonminimal models.

Lu [7] proposed a minimal model generation procedure which, in a sense, relaxes the above restriction by adding branching assumptions symmetrically, as in (a ∧ [¬b]) ∨ (b ∧ [¬a]) for a disjunction a ∨ b. However, his method involves post-determination of the precedence between the disjuncts. This is because mutual proof commitment may occur due to the symmetrical branching assumptions, and some possibly open branches are forced to be closed, thereby making the proof unsound (and incomplete w.r.t. model finding). In that case, some tentatively closed branches have to be re-opened, so performance may degrade.

Branching lemmas, as proposed in this paper, can still be seen as a restricted implementation of factorization: a disjunct is barred from generating a branching lemma once a branching assumption of some sibling disjunct is used to prove the disjunct, whether or not mutual proof commitment actually occurs. Nevertheless, our method provides an efficient way of applying factorization to minimal model generation, since it is unnecessary to compute the transitive closure of the factorization relation. The effects of the branching lemma mechanism are summarized as follows: it can (1) suppress the generation of nonminimal models to a great extent, (2) avoid unnecessary minimality tests, and (3) restrict the range of minimality tests on the current model M to models in which committing disjuncts relevant to M appear.

The model checking version of MM-MGTP aims to improve MM-SATCHMO by introducing branching lemmas; it is also based on complement splitting and constrained searches. The major differences between the two systems are the following. MM-SATCHMO stores model constraints as negative clauses and performs minimality tests through conjunctive matching on those clauses, which is very inefficient in both space and time. Our model checking version, on the other hand, is more efficient because model constraints are retained in a model tree in which multiple models can share common paths, and minimality tests are suppressed or restricted by using branching lemmas.
Since the above two systems depend on model constraints, which are a kind of memoization, they may consume much memory; in the worst case the space requirement can grow exponentially. This situation is alleviated by Niemelä's method [8]. It can reject every nonminimal model without performing a minimality test against previously found minimal models, by means of the cut rule, which is essentially equivalent to complement splitting, and the groundedness test, which is an alternative to the constrained search. The model re-computing version of MM-MGTP takes advantage of Niemelä's method, in which it is unnecessary to retain model constraints. However, both systems repeatedly perform groundedness tests, which are rather more expensive than constrained searches. In addition, they necessarily generate each minimal model twice. In the model re-computing version, the latter problem is remedied to some extent by introducing shortened negative clauses. Moreover, branching lemmas make it possible to invoke as few groundedness tests as possible.
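The pruning effect of complement splitting discussed in this section can be illustrated with a toy model-generation search. The code below is our own sketch of the mechanism only, not the MM-MGTP procedure: when the k-th disjunct of a violated clause is chosen, the negations of the earlier disjuncts are assumed on that branch, so later branches that would regenerate them are closed.

```python
def generate(clauses, splitting=True):
    """Yield models of `clauses` (pairs: body atom set -> head disjunction set).

    With splitting=True, complement splitting is applied: choosing the k-th
    disjunct assumes the negations of disjuncts 1..k-1 on that branch.
    """
    def dfs(m, neg):
        violated = next((c for c in clauses
                         if c[0] <= m and not (c[1] & m)), None)
        if violated is None:
            yield frozenset(m)
            return
        disjuncts = sorted(violated[1])
        for k, h in enumerate(disjuncts):
            if h in neg:
                continue                       # branch closed by an assumption
            assumed = neg | set(disjuncts[:k]) if splitting else neg
            yield from dfs(m | {h}, assumed)
    yield from dfs(set(), set())

# ex2 with n = 3: a1; a1 -> a2 v b2 v c2; b2 -> a2; c2 -> b2; likewise at level 3.
ex2 = [(set(), {"a1"}),
       ({"a1"}, {"a2", "b2", "c2"}), ({"b2"}, {"a2"}), ({"c2"}, {"b2"}),
       ({"a2"}, {"a3", "b3", "c3"}), ({"b3"}, {"a3"}), ({"c3"}, {"b3"})]

print(sorted(map(sorted, generate(ex2, splitting=False))))  # includes nonminimal models
print(sorted(map(sorted, generate(ex2, splitting=True))))   # only the minimal model
```

Without splitting, the right-hand branches (choosing b2 or c2, which force a2 anyway) produce models strictly containing {a1, a2, a3}; with splitting, those branches are closed by the assumed negations, matching the behavior described for ex2.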
9
Conclusion
We have presented an efficient method for constructing minimal models by means of branching assumptions and lemmas. Our work was motivated by two approaches: Bry's method, based on complement splitting and constrained searches, and Niemelä's method, which employs the groundedness test. Both methods may contain redundant computation, which can be suppressed by using branching lemmas in MM-MGTP. The experimental results with MM-MGTP show that speedups of orders of magnitude can be achieved for some problems. Nevertheless, we still need minimality tests when branching lemmas are not applicable. Future work should pursue omitting as many minimality tests as possible, for instance through a static analysis of clauses. It would also be worthwhile to combine our method with other pruning techniques such as folding-up and full factorization, or to apply it to stable model generation.
References

1. Bry, F. and Yahya, A.: Minimal Model Generation with Positive Unit Hyper-Resolution Tableaux. Proc. of 5th Workshop on Theorem Proving with Analytic Tableaux and Related Methods, LNAI 1071 (1996) 143–159
2. Bry, F.: http://www.pms.informatik.uni-muenchen.de/software/MM-SATCHMO/ (1999)
3. Fujita, H. and Hasegawa, R.: Implementing a Model-Generation Based Theorem Prover MGTP in Java. Research Reports on Information Science and Electrical Engineering of Kyushu University, Vol. 3, No. 1 (1998) 63–68
4. Hasegawa, R. and Fujita, H.: A New Implementation Technique for Model Generation Theorem Provers To Solve Constraint Satisfaction Problems. Research Reports on Information Science and Electrical Engineering of Kyushu University, Vol. 4, No. 1 (1999) 57–62
5. Inoue, K., Koshimura, M. and Hasegawa, R.: Embedding Negation as Failure into a Model Generation Theorem Prover. Proc. CADE-11, Springer-Verlag (1992) 400–415
6. Letz, R., Mayer, K. and Goller, C.: Controlled Integration of the Cut Rule into Connection Tableau Calculi. J. of Automated Reasoning 13 (1994) 297–337
7. Lu, W.: Minimal Model Generation Based on E-hyper Tableaux. Research Report 20/96 (1996)
8. Niemelä, I.: A Tableaux Calculus for Minimal Model Reasoning. Proc. of 5th Workshop on Theorem Proving with Analytic Tableaux and Related Methods, LNAI 1071 (1996) 278–294
9. Shirai, Y. and Hasegawa, R.: Two Approaches for Finite-Domain Constraint Satisfaction Problems — CP and MGTP —. Proc. of the 12th Int. Conf. on Logic Programming, MIT Press (1995) 249–263
10. Stickel, M.: The Path-Indexing Method for Indexing Terms. Technical Note No. 473, AI Center, SRI International (1989)
11. Sutcliffe, G., Suttner, C. and Yemenis, T.: The TPTP Problem Library. Proc. CADE-12 (1994) 252–266
12. Zhou, N.: A Logic Programming Approach to Channel Routing. Proc. 12th Int. Conf. on Logic Programming, MIT Press (1995) 217–231
Rigid E-Unification Revisited⋆

Ashish Tiwari¹, Leo Bachmair¹, and Harald Ruess²

¹ Department of Computer Science, SUNY at Stony Brook, Stony Brook, NY 11794, U.S.A.
{leo,astiwari}@cs.sunysb.edu
² SRI International, Menlo Park, CA 94025, U.S.A.
[email protected]
Abstract. This paper presents a sound and complete set of abstract transformation rules for rigid E-unification. Abstract congruence closure, syntactic unification and paramodulation are the three main components of the proposed method. The method obviates the need for using any complicated term orderings and easily incorporates suitable optimization rules. Characterization of substitutions as congruences allows for a comparatively simple proof of completeness using proof transformations. When specialized to syntactic unification, we obtain a set of abstract transition rules that describe a class of efficient syntactic unification algorithms.
1
Introduction
Rigid E-unification arises when tableaux-based theorem proving methods are extended to logic with equality. The general, simultaneous rigid E-unification problem is undecidable [7], and it is not known whether a complete set of rigid E-unifiers in the sense of [10] gives a complete proof procedure for first-order logic with equality. Nevertheless, complete tableau methods for first-order logic with equality can be designed based on incomplete, but terminating, procedures for rigid E-unification [8]. A simpler version of the problem is known to be decidable and also NP-complete, and several corresponding algorithms have been proposed in the literature (not all of them correct) [9, 10, 5, 8, 11, 6]. In the current paper, we consider this standard, non-simultaneous version of the problem.

Most of the known algorithms for finding a complete set of (standard) rigid unifiers employ techniques familiar from syntactic unification, completion and paramodulation. Practical algorithms also usually rely on congruence closure procedures in one form or another, though the connection between the various techniques has never been clarified. The different methods that figure prominently in known rigid unification procedures—unification, narrowing, superposition, and congruence closure—have all been described in a framework based on transformation rules. We use the recent work on congruence closure as a starting point [12, 4] and formulate a rigid E-unification method in terms of fairly abstract transformation rules.

⋆ The research described in this paper was supported in part by the National Science Foundation under grant CCR-9902031.
D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 220–234, 2000.
© Springer-Verlag Berlin Heidelberg 2000
This approach has several advantages. For one thing, we provide a concise and clear explication of the different components of rigid E-unification and the connections between them. A key technical problem has been the integration of congruence closure with unification techniques, the main difficulty being that congruence closure algorithms manipulate term structures over an extended signature, whereas unifiers need to be computed over the original signature. We solved this problem by rephrasing unification problems in terms of congruences and then applying proof-theoretic methods that had originally been developed in the context of completion and paramodulation. Some of the new and improved features of the resulting rigid E-unification method in fact depend on the appropriate use of extended signatures.

Almost all known rigid E-unification algorithms require relatively complicated term orderings. In particular, most approaches go to great lengths to determine a suitable orientation of equations (between terms to be unified), such as x ≈ fy, a decision that depends, of course, on the terms that are substituted (in a "rigid" way) for the variables x and y. But since the identification of a substitution is part of the whole unification problem, decisions about the ordering have to be made during the unification process, either by orienting equations non-deterministically, as in [10], or by treating equations as bi-directional constrained rewrite rules (and using unsatisfiable constraints to eliminate wrong orientations) [5]. In contrast, the only orderings we need are simple ones in which the newly introduced constants are smaller than all other non-constant terms.
The advantage of such simple orderings is twofold: not only the description of the rigid E-unification method itself, but also the corresponding completeness proofs, become simpler.¹ Certain optimizations that reduce some of the non-determinism still inherent in the unification procedure can easily be incorporated into our method. The treatment of substitutions as congruences defined by special kinds of rewrite systems (rather than as functions or morphisms) is a novel feature that allows us to characterize various kinds of unifiers in proof-theoretic terms via congruences.

As an interesting fallout of this work, we obtain an abstract description of a class of efficient syntactic unification algorithms based on recursive descent. Other descriptions of these algorithms are typically based on data structures and the manipulation of term dags. Since our approach is suitable for abstractly describing sharing, we obtain a purely rule-based description.

One motivation for the work presented here has been the generalization of rigid E-unification modulo theories like associativity and commutativity, which we believe are of importance for theorem proving applications. Our approach, especially because of its use of extended signatures and substantially weaker assumptions about term orderings, should more easily facilitate the development of such generalized unification procedures.

¹ A key idea of congruence closure is to employ a concise and simplified term representation via variable abstraction, so that complicated term orderings are no longer necessary or even applicable. There usually is a trade-off between the simplicity of the terms thus obtained and the loss of term structure [4]. In the case of rigid unification, we feel that simplicity outweighs the loss of some structure, as the non-determinism inherent in the procedure limits the effective exploitation of a more complicated term structure in any case.
2
Preliminaries
Given a set Σ = ∪n Σn and a disjoint set V, we define T(Σ, V) as the smallest set containing V and such that f(t1, . . . , tn) ∈ T(Σ, V) whenever f ∈ Σn and t1, . . . , tn ∈ T(Σ, V). The elements of the sets Σ, V and T(Σ, V) are respectively called function symbols, variables and terms (over Σ and V). The set Σ is called a signature, and the index n of the set Σn to which a function symbol f belongs is called the arity of the symbol f. Symbols of arity 0 are called constants. By T(Σ) we denote the set T(Σ, ∅) of all variable-free, or ground, terms. The symbols s, t, u, . . . are used to denote terms; f, g, . . . , function symbols; and x, y, z, . . . , variables.

A substitution is a mapping from variables to terms such that xσ = x for all but finitely many variables x. We use post-fix notation for the application of substitutions and use the letters σ, θ, . . . to denote substitutions. A substitution σ can be extended to the set T(Σ, V) by defining f(t1, . . . , tn)σ = f(t1σ, . . . , tnσ). The domain Dom(σ) of a substitution σ is defined as the set {x ∈ V : xσ ≠ x}, and the range Ran(σ) as the set of terms {xσ : x ∈ Dom(σ)}. A substitution σ is idempotent if σσ = σ.² We usually represent a substitution σ with domain {x1, . . . , xn} as a set of variable "bindings" {x1 ↦ t1, . . . , xn ↦ tn}, where ti = xiσ. By a triangular form representation of a substitution σ we mean a sequence of bindings, [x1 ↦ t1; x2 ↦ t2; . . . ; xn ↦ tn], such that σ is the composition σ1σ2 . . . σn of the substitutions σi = {xi ↦ ti}.

Congruences

An equation is a pair of terms, written as s ≈ t. The replacement relation →Eg induced by a set of equations E is defined by: u[l] →Eg u[r] if, and only if, l ≈ r is in E. The rewrite relation →E induced by a set of equations E is defined by: u[lσ] →E u[rσ] if, and only if, l ≈ r is in E and σ is some substitution.
In other words, the rewrite relation induced by E is the replacement relation induced by ∪σ Eσ, where Eσ is the set {sσ ≈ tσ : s ≈ t ∈ E}.

² We use juxtaposition στ to denote function composition, i.e., x(στ) = (xσ)τ.
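To make these definitions concrete, here is a small sketch (the tuple encoding and helper names are our own, not the paper's): terms are nested tuples, substitutions are dicts, and composition satisfies x(στ) = (xσ)τ as in the footnote.

```python
def is_var(t):
    """Variables are the strings x, y, z (an encoding convention adopted here)."""
    return t in ("x", "y", "z")

def apply_subst(t, sigma):
    """Extend a substitution homomorphically: f(t1, ..., tn)σ = f(t1σ, ..., tnσ)."""
    if is_var(t):
        return sigma.get(t, t)
    if isinstance(t, str):                       # a constant
        return t
    return (t[0],) + tuple(apply_subst(u, sigma) for u in t[1:])

def compose(sigma, tau):
    """στ, so that t(στ) = (tσ)τ."""
    out = {x: apply_subst(t, tau) for x, t in sigma.items()}
    out.update({x: t for x, t in tau.items() if x not in out})
    return out

sigma = {"x": ("f", "y")}                        # x ↦ f(y)
assert apply_subst(("g", "x", "a"), sigma) == ("g", ("f", "y"), "a")
assert compose(sigma, sigma) == sigma            # σ is idempotent: σσ = σ
assert compose({"x": ("f", "x")}, {"x": ("f", "x")}) != {"x": ("f", "x")}

# Triangular form: σ = {x ↦ f(a), y ↦ g(f(a))} is the composition σ1 σ2
# of σ1 = {y ↦ g(x)} and σ2 = {x ↦ f(a)}.
tri = compose({"y": ("g", "x")}, {"x": ("f", "a")})
assert tri == {"y": ("g", ("f", "a")), "x": ("f", "a")}
```

The last assertion illustrates the triangular form representation defined above: composing the single-binding substitutions in sequence reproduces the idempotent substitution.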
If → is a binary relation, then ← denotes its inverse, ↔ its symmetric closure, →+ its transitive closure and →∗ its reflexive-transitive closure. Thus, ↔∗Eg denotes the congruence relation³ induced by E. The equational theory of a set E of equations is defined as the relation ↔∗E. Equations are often called rewrite rules, and a set E a rewrite system, if one is interested particularly in the rewrite relation →∗E rather than the equational theory ↔∗E.

Substitutions as Congruences

It is often useful to reason about a substitution σ by considering the congruence relation ↔∗Eσg induced by the set of equations Eσ = {xσ ≈ x : x ∈ Dom(σ)}. The following proposition establishes a connection between substitutions and congruences.

Proposition 1. (a) For all terms t ∈ T(Σ, V), t ↔∗Eσg tσ. Therefore, Eσ ⊆ ↔∗(E∪Eσ)g.
(b) If the substitution σ is idempotent, then for any two terms s, t ∈ T(Σ, V), we have sσ = tσ if, and only if, s ↔∗Eσg t.
Proof. Part (a) is straightforward and also implies the "only if" direction of part (b). For the "if" direction, note that if u ↔Eσg v, then u = u[l] and v = v[r] for some equation l ≈ r or r ≈ l in Eσ. Thus, uσ = (u[l])σ = uσ[lσ] ↔(Eσσ)g uσ[rσ] = (u[r])σ = vσ. Therefore, if s ↔∗Eσg t, then sσ ↔∗(Eσσ)g tσ. But if σ is idempotent, then Eσσ consists only of trivial equations t ≈ t, and hence sσ and tσ are identical.

Theorem 1. Let σ be an idempotent substitution and [x1 ↦ t1; . . . ; xn ↦ tn] be a triangular form representation of σ. Then the congruences ↔∗Eσg and ↔∗(∪i Eσi)g are identical, where σi = {xi ↦ ti}.

Proof. It is sufficient to prove that Eσ ⊆ ↔∗(∪i Eσi)g and Eσi ⊆ ↔∗Eσg for all i, 1 ≤ i ≤ n. If xiσ ≈ xi is an equation in Eσ, then xiσ = xiσ1 . . . σn = xiσi . . . σn = tiσi+1 . . . σn, and therefore, using Proposition 1 part (a),

xi ↔∗Eσig ti ↔∗Eσ(i+1)g tiσi+1 ↔∗Eσ(i+2)g · · · ↔∗Eσng tiσi+1 . . . σn = xiσ.

For the converse, using part of the above proof, we get ti ↔∗(∪k>i Eσk)g xiσ ↔∗Eσg xi. By the induction hypothesis we can assume Eσk ⊆ ↔∗Eσg for k > i, and then the above proof establishes Eσi ⊆ ↔∗Eσg.

³ A congruence relation is a reflexive, symmetric and transitive relation on terms that is also a replacement relation.
The theorem indicates that if an idempotent substitution σ can be expressed as a composition σ1σ2 . . . σn of finitely many idempotent substitutions σi with disjoint domains, then the congruence induced by Eσ is identical to the congruence induced by ∪i Eσi. We denote by Eσ the set ∪i Eσi.

The variable dependency ordering ≻SV induced by a set S of equations on the set V of variables is defined by: x ≻SV y if there exists an equation t[x] ≈ y in S. A finite set S of equations is said to be substitution-feasible if (i) the right-hand sides of the equations in S are all distinct variables and (ii) the variable dependency ordering ≻SV induced by S is well-founded. If S is a substitution-feasible set {ti ≈ xi : 1 ≤ i ≤ n} such that xj ⊁SV xi whenever i > j, then the idempotent substitution σ represented by the triangular form [x1 ↦ t1; . . . ; xn ↦ tn] is called the substitution corresponding to S. Given an idempotent substitution σ and any triangular form representation σ1σ2 . . . σn, the sets Eσ and ∪i Eσi are substitution-feasible.

Rigid E-Unification

Definition 1. Let E be a set of equations (over Σ ∪ V) and let s and t be terms in T(Σ, V). A substitution σ is called a rigid E-unifier of s and t if sσ ↔∗(Eσ)g tσ.

When E = ∅, rigid E-unification reduces to syntactic unification.

Theorem 2. An idempotent substitution σ is a rigid E-unifier of s and t if and only if s ↔∗(E∪Eσ)g t.

Proof. Let σ be an idempotent substitution that is a rigid E-unifier of s and t. By definition we have sσ ↔∗(Eσ)g tσ. Using Proposition 1 part (a), we get s ↔∗Eσg sσ ↔∗(E∪Eσ)g tσ ↔∗Eσg t. Conversely, suppose σ is an idempotent substitution such that s ↔∗(E∪Eσ)g t. Then sσ ↔∗(Eσ∪Eσσ)g tσ. But (Eσ)σ consists of trivial equations of the form t ≈ t, and hence we have sσ ↔∗(Eσ)g tσ.

If the substitution σ is not idempotent, the above proof does not go through, as the set (Eσ)σ may contain non-trivial equations.
However, we may use Theorem 1 to establish that the congruences induced by E ∪ Eσ and E ∪ Eθ, where θ1 . . . θn is a triangular form representation of σ, are identical. We obtain a characterization of standard E-unification if we replace the congruence induced by E ∪ Eσ by the congruence induced by ∪σ Eσ ∪ Eσ in the above theorem, and a characterization of syntactic unifiers if E = ∅.

Orderings on Substitutions

Unification procedures are designed to find most general unifiers of given terms. A substitution σ is said to be more general than another substitution θ with
Rigid E-Unification Revisited
225
respect to a set of variables V, denoted by σ ≤V θ, if there exists a substitution σ′ such that xσσ′ = xθ for all x ∈ V. A substitution σ is called more general modulo E g on V than θ, denoted by σ ≤VE g θ, if there exists a substitution σ′ such that xσσ′ ↔∗(Eθ)g xθ for all x ∈ V. We also define an auxiliary relation ⊑ between substitutions by σ ⊑VE g θ if xσ ↔∗(Eθ)g xθ for all x ∈ V. Two substitutions σ and θ are said to be equivalent modulo E g on V if σ ≤VE g θ and θ ≤VE g σ. If σ is a rigid E-unifier of s and t, then there exists an idempotent rigid E-unifier of s and t that is more general modulo E g than σ. Hence, in this paper, we will be concerned only with idempotent unifiers. Comparisons between idempotent substitutions can be characterized via congruences.

Theorem 3. Let σ and θ be idempotent substitutions and V the set of variables in the domain or range of σ. Then, σ ≤VE g θ if and only if Eσ ⊆ ↔∗(E∪Eθ )g.
Proof. If σ is idempotent then we can prove that σ ≤VE g θ if and only if σθ ⊑VE g θ. Now, assuming σ ≤VE g θ, we have xσθ ↔∗(Eθ)g xθ for all x ∈ V. But xθ ↔(Eθ )g x and Eθ ⊆ ↔∗(E∪Eθ )g by Proposition 1. Therefore, it follows that Eσθ ⊆ ↔∗(E∪Eθ )g. But again by Proposition 1, xσθ ↔∗(Eθ )g xσ, and therefore Eσ ⊆ ↔∗(E∪Eθ )g. Conversely, if Eσ ⊆ ↔∗(E∪Eθ )g then Eσ θ ⊆ ↔∗(Eθ)g ∪(Eθ θ)g, and since the equations in Eθ θ are all trivial equations of the form u ≈ u, it follows that Eσ θ ⊆ ↔∗(Eθ)g, which implies σθ ⊑VE g θ and hence σ ≤VE g θ.
3
Rigid E-Unification
We next present a set of abstract transformation (or transition) rules that can be used to describe a variety of rigid E-unification procedures. By Theorem 2, the problem of finding a rigid E-unifier of two terms s and t amounts to finding a substitution-feasible set S such that s ↔∗(E∪S)g t, and involves (1) constructing a substitution-feasible set S, and (2) verifying that s and t are congruent modulo E ∪ S. Part (1), as we shall see, can be achieved by using syntactic unification, narrowing and superposition. Efficient techniques for congruence testing via abstract congruence closure can be applied to part (2).

Abstract Congruence Closure

A term t is in normal form with respect to a rewrite system R if there is no term t′ such that t →R t′. A rewrite system R is said to be (ground) confluent if for all (ground) terms t, u and v with u ←∗R t →∗R v there exists a term w such that u →∗R w ←∗R v. It is terminating if there exists no infinite reduction sequence t0 →R t1 →R t2 · · · of terms. Rewrite systems that are (ground) confluent and terminating are called (ground) convergent. Let Γ be a set of function symbols and variables and K be a disjoint set of constants. An (abstract) congruence closure (with respect to Γ and K) is a
226
Ashish Tiwari, Leo Bachmair, and Harald Ruess
ground convergent rewrite system R over the signature Γ ∪ K⁴ such that (i) each rule in R is either a D-rule of the form f(c1, . . . , ck) ≈ c0, where f is a k-ary symbol in Γ and c0, c1, . . . , ck are constants in K, or a C-rule of the form c0 ≈ c1 with c0, c1 ∈ K, and (ii) for each constant c ∈ K that is in normal form with respect to R, there exists a term t ∈ T(Γ) such that t →∗Rg c. Furthermore, if E is a set of equations (over Γ ∪ K) and R is such that (iii) for all terms s and t in T(Γ), s ↔∗E g t if, and only if, s →∗Rg ◦ ←∗Rg t, then R is called an (abstract) congruence closure for E.

For example, let E0 = {gfx ≈ z, fgy ≈ z} and Γ = {g, f, x, y, z}. The set E1 consisting of the rules x ≈ c1, y ≈ c2, z ≈ c3, fc1 ≈ c4, gc4 ≈ c3, gc2 ≈ c5, fc5 ≈ c3 is an abstract congruence closure (with respect to Γ and {c1, . . . , c5}) for E0. The key idea underlying (abstract) congruence closure is that the constants in K serve as names for congruence classes, and equations f(c1, . . . , ck) ≈ c0 define relations between congruence classes: a term f(t1, . . . , tk) is in the congruence class c0 if each ti is in the congruence class ci.

The construction of a congruence closure will be an integral part of our rigid E-unification method. We will not list specific transformation rules, but refer the reader to the description in [2], which can easily be adapted to the presentation in the current paper. For our purposes, transition rules are defined on quintuples (K, E; V, E?; S), where Σ is a given fixed signature, V is a set of variables, K is a set of constants disjoint from Σ ∪ V, and E, E?, and S are sets of equations. The first two components of the quintuple represent a partially constructed congruence closure, whereas the third and fourth components are needed to formalize syntactic unification, narrowing and superposition. The substitution-feasible set in the fifth component stores an answer substitution in the form of a set of equations.
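The normalization behaviour of an abstract congruence closure can be made concrete. The sketch below is an illustration of the idea, not the paper's transition rules (the encoding and the `normalize` helper are our own assumptions): D-rules are stored as a map from flat terms to class constants, C-rules as constant-to-constant redirections, and terms are normalized bottom-up. It replays the E0/E1 example from the text.

```python
# Minimal sketch: variables are treated as constants, so nullary keys
# like ("x",) name the class of variable x.

def normalize(term, d_rules, c_rules):
    """Rewrite a term to its normal form.  d_rules maps flat terms
    (f, c1, ..., ck) -> c0; c_rules maps a constant to its
    representative via C-rules."""
    def canon(c):
        while c in c_rules:
            c = c_rules[c]
        return c
    if isinstance(term, str):               # a variable-as-constant
        return canon(d_rules.get((term,), term))
    f, *args = term
    key = (f, *[normalize(a, d_rules, c_rules) for a in args])
    return canon(d_rules[key]) if key in d_rules else key

# The closure E1 for E0 = {gfx = z, fgy = z} from the text:
D = {("x",): "c1", ("y",): "c2", ("z",): "c3",
     ("f", "c1"): "c4", ("g", "c4"): "c3",
     ("g", "c2"): "c5", ("f", "c5"): "c3"}

# g(f(x)) and f(g(y)) both land in c3, the congruence class of z:
assert normalize(("g", ("f", "x")), D, {}) == "c3"
assert normalize(("f", ("g", "y")), D, {}) == "c3"
assert normalize("z", D, {}) == "c3"
```

Two terms are congruent modulo E0 exactly when they normalize to the same constant, which is how condition (iii) above is exploited for congruence testing.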
For a given state (K, E; V, E?; S), we try to find a substitution σ with Dom(σ) ⊆ V that is a rigid E-unifier of each equation in the set E?. By an initial state we mean a tuple (∅, E0; V0, {s ≈ t}; ∅), where V0 is the set of all variables that occur in E0, s or t. Transition rules specify ways in which one quintuple state can be transformed into another such state. The goal is to successively transform a given initial state into a state in which the fourth component is empty.
C-Closure:
(K, E; V, E?; S)
(K′, E′; V, E?; S)
if K ⊂ K′ and E′ is an abstract congruence closure (with respect to Σ ∪ V and K′) for E. Note that we need not choose any term ordering, which is one of the main differences between our approach and most other rigid unification methods.

⁴ We treat variables as constants and in this sense speak of a ground convergent system R.
Syntactic Unification

C-closure can potentially extend the signature by a set of constants. Thus we obtain substitutions (or substitution-feasible sets) and terms over an extended signature that need to be translated back to substitutions and terms in the original signature, essentially by replacing these constants by terms from the original signature. For example, consider the abstract congruence closure E1 for E0 = {gfx ≈ z, fgy ≈ z} described above, and the substitution-feasible set S = {c3 ≈ x, x ≈ y}. This set can be transformed by replacing the constant c3 by z to give a substitution-feasible set {z ≈ x, x ≈ y}. Unfortunately, this may not always be possible. For example, in the substitution-feasible set S = {c1 ≈ x}, we cannot eliminate the constant c1, since x is the only term congruent to c1 modulo E1, but the resulting set {x ≈ x} is not substitution-feasible. We say that a (substitution-feasible) set S = {ti ≈ xi : ti ∈ T(Σ ∪ K, V), xi ∈ V, 1 ≤ i ≤ n} of rules is E-feasible on V if there exist terms si ∈ T(Σ, V) with si ↔∗E g ti such that the set S↑E = {si ≈ xi : 1 ≤ i ≤ n} is substitution-feasible.

Recall that if σ is a rigid E-unifier of s and t, then there exists a proof s ↔∗(E∪Eσ )g t. The transition rules are obtained by analyzing this hypothetical proof: they attempt to deduce the equations in Eσ by simplifying the proof. We first consider the special case when s ↔∗Eσg t, and hence s and t are syntactically unifiable. Trivial proofs can be deleted.

Deletion:
(K, E; V, E? ∪ {t ≈ t}; S)
(K, E; V, E?; S)
If Eσ is a substitution-feasible set and the top function symbols in s and t are identical, then all replacement steps in the proof s ↔∗Eσg t occur inside a non-trivial context, and hence this proof can be broken up into simpler proofs. Decomposition:
(K, E; V, E? ∪ {f(t1, . . . , tn) ≈ f(s1, . . . , sn)}; S)
(K, E; V, E? ∪ {t1 ≈ s1, . . . , tn ≈ sn}; S)
if f ∈ Σ is a function symbol of arity n. Finally, if the proof s ↔∗Eσg t is a single replacement step (at the root position, and within no contexts), we can eliminate it. Elimination:
(K, E; V, E? ∪ {x ≈ t}; S)
(K ∪ {x}, E ∪ Eθ; V − {x}, E?; S ∪ Eθ)
if (i) θ = {x ↦ t}, (ii) the set Eθ = {t ≈ x} is E-feasible on V, and (iii) x ∈ V. Deletion and decomposition are identical to the transformation rules for syntactic unification, cf. [1]. However, elimination (and the narrowing and superposition rules described below) does not apply the substitution represented by Eθ (or Eθ↑E) to the sets E? and S, as is done in the corresponding standard rules for syntactic unification. Instead we add the equations Eθ to the second component of the state.
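For intuition, the three rules can be prototyped as a single-step transformer on the goal component E?. Note that this sketch deliberately follows the standard textbook variant that applies the substitution eagerly to the remaining goals, whereas the rules above instead add Eθ to the congruence component; all names here are hypothetical.

```python
# Terms: a string is a variable, a tuple (symbol, args...) an application.
# `goals` plays the role of E?, `S` collects answer equations t ~ x.

def occurs(x, t):
    """Occur check, as in condition (ii) of Elimination."""
    if isinstance(t, str):
        return t == x
    return any(occurs(x, a) for a in t[1:])

def substitute(t, x, u):
    if isinstance(t, str):
        return u if t == x else t
    return (t[0], *[substitute(a, x, u) for a in t[1:]])

def step(goals, variables, S):
    """Apply one of Deletion, Decomposition, Elimination; None if stuck."""
    for i, (s, t) in enumerate(goals):
        rest = goals[:i] + goals[i + 1:]
        if s == t:                                    # Deletion
            return rest, variables, S
        if isinstance(s, tuple) and isinstance(t, tuple) \
                and s[0] == t[0] and len(s) == len(t):  # Decomposition
            return rest + list(zip(s[1:], t[1:])), variables, S
        for x, u in ((s, t), (t, s)):                 # Elimination
            if isinstance(x, str) and x in variables and not occurs(x, u):
                solved = [(substitute(a, x, u), substitute(b, x, u))
                          for (a, b) in rest]
                return solved, variables - {x}, S + [(u, x)]
    return None
```

Running `step` to a fixpoint on {f(x) ≈ f(g(y))} produces the answer set {g(y) ≈ x}, while x ≈ f(x) gets stuck on the occur check.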
Decomposition, deletion and elimination can be replaced by a single rule that performs full syntactic unification in one step. We chose to spell out the rules above as they provide a method to abstractly describe an efficient quadratic-time syntactic unification algorithm by recursive descent, cf. [1].

Narrowing and Superposition

The following rule reflects attempts to identify and eliminate steps in a proof s ↔∗(E∪Eσ )g t that use equations in E g.

Narrowing:
(K, E; V, E? ∪ {s[l′] ≈ t}; S)
(K ∪ V′, E ∪ Eθ; V − V′, E? ∪ {s[c] ≈ t}; S ∪ Eθ)
where (i) l ≈ c ∈ E, (ii) θ is the most general unifier of l′ and l, (iii) the set Eθ is E-feasible on V, (iv) V′ = Dom(θ) ⊂ V, (v) E is an abstract congruence closure with respect to Σ and K ∪ V, and (vi) either l′ ∉ V or l ∈ V. We may also eliminate certain “proof patterns” involving rules in E g (and Eσg) from the proof s ↔∗(E∪Eσ )g t via superposition of rules in E.

Superposition:
(K, E = E′ ∪ {t ≈ c, C[t′] ≈ d}; V, E?; S)
(K ∪ V′, E′ ∪ {t ≈ c} ∪ T; V − V′, E?; S ∪ Eθ)
if (i) θ is the most general unifier of t and t′, (ii) Eθ is E-feasible on V, (iii) T = Eθ ∪ {C[c] ≈ d}, (iv) V′ = Dom(θ) ⊂ V, (v) E is an abstract congruence closure with respect to Σ and K ∪ V, and (vi) either t′ ∉ V or t ∈ V.

Narrowing, elimination and superposition add new equations to the second component of the state, which are subsequently processed by C-closure. We illustrate the transition process by considering the problem of rigidly unifying the two terms fx and gy modulo the set E0 = {gfx ≈ z, fgy ≈ z}. Let E1 denote an abstract congruence closure {x ≈ c1, y ≈ c2, z ≈ c3, fc1 ≈ c4, gc4 ≈ c3, gc2 ≈ c5, fc5 ≈ c3} for E0 and K1 be the set {c1, . . . , c5} of constants.

i  Ki         Ei              Vi         E?i         Si             Rule
0  ∅          E0              {x, y, z}  {fx ≈ gy}   ∅              C-Closure
1  K1         E1              {x, y, z}  {fx ≈ gy}   ∅              Narrow
2  K1 ∪ {x}   E1 ∪ {x ≈ c5}   {y, z}     {c3 ≈ gy}   {c5 ≈ x}       C-Closure
3  K2         E3              {y, z}     {c3 ≈ gy}   {c5 ≈ x}       Narrow
4  K2 ∪ {y}   E3 ∪ {y ≈ c3}   {z}        {c3 ≈ c3}   S3 ∪ {c3 ≈ y}  Delete
5  K4         E4              {z}        ∅           S4
where E3 = {x ≈ c1, y ≈ c2, z ≈ c3, fc1 ≈ c3, gc3 ≈ c3, gc2 ≈ c1, c5 ≈ c1, c4 ≈ c3} is an abstract congruence closure for E2. Since the set E?5 is empty, the rigid unification process is complete. The substitution corresponding to any set S4↑E4 is a rigid unifier of fx and gy. For instance, we may choose gy for the constant c5 and z for c3 to get the set S4↑E4 = {gy ≈ x, z ≈ y} and the corresponding unifier [x ↦ gy; y ↦ z].
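The computed unifier can be verified mechanically: applying σ = [x ↦ gy; y ↦ z] to the goal and to E0 yields a ground congruence problem, decidable by any congruence closure procedure. The following sketch uses a deliberately naive O(n³) closure over nested-tuple terms (our own illustration, not the abstract closure of Section 3) to confirm that (fx)σ and (gy)σ are congruent modulo E0σ.

```python
def congruent(equations, s, t):
    """Naive ground congruence closure.  Terms are nested tuples
    ('f', arg, ...) with constants as nullary tuples like ('z',).
    Decides s ~ t modulo the given ground equations."""
    terms = set()
    def collect(u):
        terms.add(u)
        for a in u[1:]:
            collect(a)
    for (l, r) in equations + [(s, t)]:
        collect(l)
        collect(r)
    parent = {u: u for u in terms}
    def find(u):
        while parent[u] != u:
            u = parent[u]
        return u
    def union(u, v):
        parent[find(u)] = find(v)
    for (l, r) in equations:
        union(l, r)
    changed = True
    while changed:                       # propagate congruence upward
        changed = False
        for u in terms:
            for v in terms:
                if find(u) != find(v) and u[0] == v[0] and len(u) == len(v) \
                        and all(find(a) == find(b)
                                for a, b in zip(u[1:], v[1:])):
                    union(u, v)
                    changed = True
    return find(s) == find(t)
```

With σ applied, (fx)σ = f(g(z)), (gy)σ = g(z), and E0σ = {g(f(g(z))) ≈ z, f(g(z)) ≈ z}; the closure identifies both goal terms with z, confirming σ is a rigid E0-unifier.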
Optimizations

A cautious reader might note that, in their present form, the transition rules contain a high degree of non-determinism. In particular, after an initial congruence closure step, every variable x in the third component of a state occurs as the left-hand side of some rule x ≈ c in the second component. Consequently, this rule can be used for superposition or narrowing with any rule in the second or fourth component. A partial solution to this problem is to replace all occurrences of c by x in the second component and then delete the rule x ≈ c. This is correct under certain conditions.

Compression1:
(K ∪ {c}, E ∪ {x ≈ c}; V ∪ {x}, E?; S)
(K, Eθ; V ∪ {x}, E?θ; S′)
if (i) θ is the substitution {c ↦ x}, (ii) E ∪ {x ≈ c} is a fully-reduced abstract congruence closure (with respect to Σ and K ∪ V), (iii) c does not occur on the right-hand side of any rule in E, and (iv) S′ is obtained from S by applying the substitution θ only to the left-hand sides of equations in S. A related optimization rule is:

Compression2:
(K ∪ {c, d}, E ∪ {c ≈ d}; V, E?; S)
(K ∪ {d}, E; V, E?θ; S′)
if (i) θ is the substitution {c ↦ d}, (ii) E ∪ {c ≈ d} is a fully-reduced abstract congruence closure (with respect to Σ and K ∪ V), and (iii) S′ is obtained from S by applying the substitution θ only to the left-hand sides of equations in S. These two optimization rules can be integrated into the congruence closure phase. More specifically, we assume that an application of the C-closure rule is always followed by an exhaustive application of the above compression rules. We refer to this combination as an “Opt-Closure” rule. To illustrate these new rules, note that both x ≈ c1 and y ≈ c2 can be eliminated from the abstract congruence closure E1 for E0. We obtain an optimized congruence closure {z ≈ c3, fx ≈ c4, gc4 ≈ c3, gy ≈ c5, fc5 ≈ c3}. Note that we cannot remove z ≈ c3 from the above set.
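The compression of E1 described above can be reproduced in a few lines. In this sketch (rules are encoded as pairs of a flat left-hand-side tuple and a right-hand-side constant; the encoding and names are our own, and we assume the input closure is already fully reduced), a rule x ≈ c is eliminated by substituting x for c throughout whenever c is not the right-hand side of another rule, mirroring condition (iii) of Compression1.

```python
def compress(rules, variables):
    """Exhaustively apply a Compression1-style step: drop a rule
    (x,) -> c with x a variable, substituting x for c everywhere,
    provided c does not occur as the right-hand side of another rule."""
    rules = list(rules)
    changed = True
    while changed:
        changed = False
        for (lhs, rhs) in rules:
            if len(lhs) == 1 and lhs[0] in variables:
                x, c = lhs[0], rhs
                others = [r for r in rules if r != (lhs, rhs)]
                if all(r != c for (_, r) in others):
                    swap = lambda sym: x if sym == c else sym
                    rules = [(tuple(swap(s) for s in l), swap(r))
                             for (l, r) in others]
                    changed = True
                    break
    return set(rules)
```

Applied to E1 with variables {x, y, z}, this removes x ≈ c1 and y ≈ c2 but keeps z ≈ c3 (c3 occurs as the right-hand side of gc4 ≈ c3 and fc5 ≈ c3), reproducing the optimized closure shown above.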
4
Correctness
Let U be an infinite set of constants from which new constants are chosen in opt-closure. If a state ξi = (Ki, Ei; Vi, E?i; Si) is transformed to a state ξj = (Kj, Ej; Vj, E?j; Sj) by opt-closure, then (i) Ej is an abstract congruence closure (with respect to Σ and K ∪ V) for Ei and (ii) Ej is contained in a well-founded simplification ordering⁵.

⁵ For instance, a simple lexicographic path ordering based on a partial precedence ≻ on symbols in Σ ∪ U ∪ V for which f ≻ c whenever f ∈ Σ and c ∈ U ∪ V, and x ≻ c whenever x ∈ V and c ∈ U − V, will suffice.
We use the symbol ⊢REU to denote the one-step transformation relation induced by opt-closure, deletion, decomposition, elimination, narrowing and superposition. A derivation is a sequence of states ξ0 ⊢REU ξ1 ⊢REU · · · with no two consecutive applications of opt-closure.

Theorem 4 (Termination). All derivations starting from an initial state (∅, E0; V0, {s ≈ t}; ∅) are finite.

Proof. Define the measure associated with a state (K, E; V, E?; S) to be the pair (|V|, mE?), where |V| denotes the cardinality of the set V and mE? = {{s, t} : s ≈ t ∈ E?}. These pairs are compared lexicographically, using the greater-than relation on the integers in the first component and the two-fold multiset extension of that ordering in the second component. This induces a well-founded ordering on states with respect to which each transition rule is reducing.

Lemma 1. Let (Kn, En; Vn, E?n; Sn) be the final state of a derivation from (∅, E0; V0, E?0; ∅), where E0 ∪ E?0 are equations over T(Σ, V0). Then (a) the set Sn is En-feasible on V0 and (b) if E?n ⊆ ↔∗(En ∪Sn↑En )g, then E?0 ⊆ ↔∗(E0 ∪Sn↑En )g.
Theorem 5 (Soundness). If (Kn, En; Vn, E?n; Sn) is the final state of a derivation from (∅, E0; V0, E?0; ∅), then the set Sn is En-feasible and the substitution corresponding to (any) set Sn↑En is a rigid E0-unifier of s and t.

Proof. The En-feasibility of Sn on V0 follows from Lemma 1. Since E?n = ∅, the antecedent of the implication in part (b) of Lemma 1 is vacuously satisfied, and hence E?0 ⊆ ↔∗(E0 ∪Sn↑En )g. Note that by Theorem 1, the ground congruence induced by Sn↑En is identical to the congruence induced by Eσ, where σ is the idempotent substitution corresponding to the set Sn↑En. Hence, s ↔∗(E0 ∪Eσ )g t. Using Theorem 2, we establish that σ is a rigid E0-unifier of s and t.

Theorem 6 (Completeness). Let θ be an idempotent rigid E0-unifier of s and t and V0 the set of variables in E0, s and t. Then, there exists a (finite) derivation with initial state (∅, E0; V0, {s ≈ t}; ∅) and final state (Kn, En; Vn, ∅; Sn) where ESn↑En ⊆ ↔∗(E0 ∪Eθ )g.
Proof. (Sketch) Let ξi = (Ki, Ei; Vi, E?i; Si) be a state. We say a substitution-feasible set S^i is a solution for state ξi if S^i is Ei-feasible on Vi and E?i ⊆ ↔∗(Ei ∪Si ∪S^i )g. Now, given a state ξi and a solution S^i for ξi, we show how to obtain a new state ξj and a solution S^j for ξj such that the pair ⟨ξj, S^j⟩ is smaller in a certain well-founded ordering than the pair ⟨ξi, S^i⟩ and the congruences induced by
Ej ∪ Sj ∪ S^j and Ei ∪ Si ∪ S^i are identical. The well-founded ordering will be a lexicographic combination of the ordering on states ξi used in the proof of Theorem 4 and a well-founded ordering on the substitution-feasible sets S^i. If a pair ⟨ξi, S^i⟩ cannot be reduced, then we show that E?i = ∅. This yields the desired conclusion. The above reduction of a pair ⟨ξi, S^i⟩ can be achieved in two ways: (i) by an REU transformation on ξi, suitably guided by the given solution S^i, or (ii) by some simple transformation of the set S^i. The latter transformation rules are defined in the context of the state ξi. The initial state is (S^i, ∅).

R1: (D′ ∪ {c ≈ x}, C′)
    (D′, C′ ∪ {c ≈ x})
    if c ∈ Ki ∪ Vi and x ̸→∗(Ei g \C′g ) c.

R2: (D′ ∪ {c ≈ x}, C′)
    (D′, C′)
    if c ∈ Ki ∪ Vi and x →∗(Ei g \C′g ) c.

R3: (D′ ∪ {t[l′] ≈ x}, C′)
    (D′ ∪ {t[c] ≈ x}, C′)
    if l ≈ c ∈ Ei and l ↔∗C′g l′.

R4: (D′ ∪ {t[l′] ≈ x}, C′)
    (D′ ∪ {t[y] ≈ x}, C′)
    if l ≈ y ∈ D′, l ↔∗C′g l′, and l ∉ Ki ∪ Vi.

These rules construct a generalized congruence closure for the initial set D′ ∪ C′ (modulo the congruence induced by Ei). If (D′, C′) can be obtained from (S^i, ∅) by repeated application of these rules, then the set D′ ∪ C′ is (i) substitution-feasible, (ii) Ei-feasible with respect to Vi, and (iii) equivalent modulo Ei g on Vi to S^i. The set of rules R1–R4 is terminating.
5
Specialization to Syntactic Unification
Since rigid unification reduces to syntactic unification when the set E0 is empty, a pertinent question is which procedure the REU transformation rules yield in this special case. Note that elimination does not apply a substitution to the fourth and fifth components of the state, but does perform an occur check in condition (ii). This is in the spirit of syntactic unification by recursive descent, an algorithm that works on term directed acyclic graphs and has quadratic time complexity. In fact, in the case of syntactic unification, every equation in the second component is of a special form in which one side is always a variable. Hence, we can argue that for each c ∈ K, there is at most one rule in E of the form f(. . .) → c where f ∈ Σ. We may therefore replace superposition by the following rule:

Det-Decompose:
(K, E; V, E? ∪ {c ≈ t}; S)
(K, E; V, E? ∪ {f(c1, . . . , ck) ≈ t}; S)
if there exists exactly one rule f(c1, . . . , ck) ≈ c with right-hand side c in E.
In addition, we may restrict narrowing so that the unifier θ is always the identity substitution; that is, narrowing is used only to simplify terms in the fourth component E? by equations in the second component E. We can obtain various efficient syntactic unification algorithms by using specific strategies over our abstract description. Other descriptions of quadratic-time syntactic unification algorithms are usually based on dags and on abstract rules that manipulate the dags directly. However, since we can abstractly capture the notion of sharing, we obtain rules for this class of efficient algorithms that work on terms and are very similar to the rules describing naive syntactic unification procedures (which have worst-case exponential behavior).
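The sharing-based unification alluded to here is commonly implemented with union-find over term dags. The following sketch is the standard textbook algorithm, not the REU rules themselves; for brevity it omits the occur check (so it would also accept cyclic solutions), which a real implementation restores with a final acyclicity pass.

```python
# Hypothetical dag-node unification via union-find.  A Node with
# sym=None is a variable; `rep` links merged equivalence classes.

class Node:
    def __init__(self, sym=None, args=()):
        self.sym, self.args, self.rep = sym, args, None

def find(n):
    """Follow representative links to the class representative."""
    while n.rep is not None:
        n = n.rep
    return n

def unify(a, b):
    a, b = find(a), find(b)
    if a is b:
        return True
    if a.sym is None:              # variable: bind to the other class
        a.rep = b
        return True
    if b.sym is None:
        b.rep = a
        return True
    if a.sym != b.sym or len(a.args) != len(b.args):
        return False               # symbol or arity clash
    a.rep = b                      # merge classes BEFORE recursing,
    return all(unify(x, y)         # so shared subterms are visited once
               for x, y in zip(a.args, b.args))
```

Merging the classes before descending into the arguments is exactly what makes the algorithm quadratic on dags rather than exponential on the unfolded terms.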
6
Summary
We have presented a formulation of rigid E-unification in terms of fairly abstract transformation rules. The main feature is the integration of (abstract) congruence closure with transformation rules for syntactic unification, paramodulation and superposition. The use of an extended signature (inherent in abstract congruence closure) helps to dispense with term orderings over the original signature. An abstract rule-based description facilitates various optimizations. The specialization of the transformation rules to syntactic unification yields a set of abstract transition rules that describe a class of efficient syntactic unification algorithms. Our transformation rules can be derived from proof simplification arguments. In [10], a congruence closure algorithm is used in a rigid E-unification procedure, but not as a submodule; congruence closure is used “indirectly” to do ground completion. The work on abstract congruence closure shows that congruence closure actually is ground completion with extension. But for the purpose of rigid E-unification, we do not need to translate the abstract closure to a ground system over the original signature, though we do need to translate the substitutions back to the original signature. Extended signatures also help in that we do not need to guess an ordering to orient equations such as x ≈ fa when the substitution for x is not yet known. This is a major concern in [10], where the dependence on orderings complicates the unification process. In [5], the problem of correctly orienting equations is solved by delaying the choice of orientation and maintaining constraints. Constraint satisfiability is required to ensure that orientations are chosen in a consistent manner, and to guarantee the termination of such a procedure.
We would like to point out that the transformation process involves “don't-care” non-determinism (where it does not matter which rule one applies) and “don't-know” non-determinism (where an application of a wrong rule may lead to failure even if a unifier exists). Whereas opt-closure, deletion and narrowing with the identity substitution can be applied “don't-care” non-deterministically, the other rules have to be applied in a “don't-know” non-deterministic manner. The rules for syntactic unification described in Section 5 are “don't-care” non-deterministic.
All algorithms for computing the set of rigid unifiers for a pair of terms can be seen as combinations of top-down and bottom-up methods. In a pure bottom-up approach a substitution is guessed non-deterministically: for every variable one tries every subterm that occurs in the given unification problem; see [13] for details. Superposition and narrowing using a rule that contains a variable as its left-hand side capture the bottom-up aspect in our description. A top-down approach is characterized by the use of narrowing to simplify the terms in the goal equations E?. We note that for variables that cannot be eliminated from the left-hand sides of rules using compression1, we need to try many possible substitutions, because such variables can unify with almost all subterms in the second and fourth components. This is the cause of a bottom-up computation for these variables. For other variables, however, we need to try only those substitutions that are produced by a unifier during an application of narrowing or superposition, and hence a top-down approach works for these variables. We illustrate some of the above observations via an example. Let E0 = {gx ≈ x, x ≈ a}, and suppose we wish to find a rigid E0-unifier of the terms gfffgffx and fffx. The substitution {x ↦ fa} is a rigid E-unifier, but it cannot be obtained unless one unifies the variable x with an appropriate subterm. We believe that our approach to describing rigid E-unification can be used to obtain an easier proof of the fact that this problem is in NP. We need to show that (i) the length of a maximal derivation from any initial state is bounded by some polynomial in the input size, (ii) each rule can be applied efficiently, and (iii) there are not too many choices between the rules to obtain the next step in a derivation. It is easy to see that (i) holds. For the second part, a crucial argument involves showing that the test for E-feasibility can be done efficiently.
This is indeed the case, but due to space limitations we do not describe how here. The notion of an abstract congruence closure is easily extended to handle associative and commutative functions [3]. The use of extended signatures is particularly useful when one incorporates such theories. This leads us to believe that our proposed description of rigid E-unification can be suitably generalized to such applications.

Acknowledgements. We would like to thank David Cyrluk for initial discussions on the problem and the anonymous reviewers for their helpful comments.
References

[1] F. Baader and T. Nipkow. Term Rewriting and All That. Cambridge University Press, Cambridge, 1998.
[2] L. Bachmair, C. Ramakrishnan, I.V. Ramakrishnan, and A. Tiwari. Normalization via rewrite closures. In P. Narendran and M. Rusinowitch, editors, 10th Intl Conf on Rewriting Techniques and Applications, pages 190–204, 1999. LNCS 1631.
[3] L. Bachmair, I.V. Ramakrishnan, A. Tiwari, and L. Vigneron. Congruence closure modulo associativity and commutativity. In H. Kirchner and C. Ringeissen, editors, Frontiers of Combining Systems, 3rd Intl Workshop FroCoS 2000, pages 245–259, 2000. LNAI 1794.
[4] L. Bachmair and A. Tiwari. Abstract congruence closure and specializations. In D. McAllester, editor, 17th Intl Conf on Automated Deduction, 2000.
[5] G. Becher and U. Petermann. Rigid unification by completion and rigid paramodulation. In B. Nebel and L. Dreschler-Fischer, editors, KI-94: Advances in Artificial Intelligence, 18th German Annual Conf on AI, pages 319–330, 1994. LNAI 861.
[6] B. Beckert. A completion-based method for mixed universal and rigid E-unification. In A. Bundy, editor, 12th Intl Conf on Automated Deduction, CADE-12, pages 678–692, 1994. LNAI 814.
[7] A. Degtyarev and A. Voronkov. The undecidability of simultaneous rigid E-unification. Theoretical Computer Science, 166(1–2):291–300, 1996.
[8] A. Degtyarev and A. Voronkov. What you always wanted to know about rigid E-unification. Journal of Automated Reasoning, 20(1):47–80, 1998.
[9] J. Gallier, P. Narendran, D. Plaisted, and W. Snyder. Rigid E-unification: NP-completeness and applications to equational matings. Information and Computation, 87:129–195, 1990.
[10] J. Gallier, P. Narendran, S. Raatz, and W. Snyder. Theorem proving using equational matings and rigid E-unification. Journal of the Association for Computing Machinery, 39(2):377–429, April 1992.
[11] J. Goubault. A rule-based algorithm for rigid E-unification. In G. Gottlob, A. Leitsch, and D. Mundici, editors, Computational Logic and Proof Theory, Proc. of the Third Kurt Gödel Colloquium, KGC 93, pages 202–210, 1993. LNCS 713.
[12] D. Kapur. Shostak's congruence closure as completion. In H. Comon, editor, 8th Intl Conf on Rewriting Techniques and Applications, pages 23–37, 1997. LNCS 1232.
[13] E. de Kogel. Rigid E-unification simplified. In P. Baumgartner, R. Hähnle, and J. Posegga, editors, Theorem Proving with Analytic Tableaux and Related Methods, 4th International Workshop, TABLEAUX '95, pages 17–30, 1995. LNAI 918.
Connecting Bits with Floating-Point Numbers: Model Checking and Theorem Proving in Practice Carl-Johan Seger Intel Strategic CAD Labs Portland OR [email protected]
Abstract. Model checking and theorem proving have largely complementary strengths and weaknesses. Thus, a research goal for many years has been to find effective and practical ways of combining these approaches. However, this goal has been much harder to reach than originally anticipated, and several false starts have been reported in the literature. In fact, some researchers have gone so far as to question whether there even exists an application domain in which such a hybrid solution is needed. In this talk I will argue that formal verification of the floating-point circuits of modern high-performance microprocessors is such a domain. In particular, when a correctness statement linking the actual low-level (gate-level) implementation with abstract floating-point numbers is needed, a combined model checking and theorem proving based approach is essential. To substantiate the claim, I will draw from data we have collected during the verification of the floating point units of several generations of Intel microprocessors. In addition, I will discuss the in-house formal verification environment we have created that has enabled this effort with an emphasis on how model checking and theorem proving have been integrated without sacrificing usability.
D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 235–235, 2000. c Springer-Verlag Berlin Heidelberg 2000
Reducing Model Checking of the Many to the Few*

E. Allen Emerson and Vineet Kahlon
Department of Computer Sciences
The University of Texas at Austin, Austin, TX 78712, USA
{emerson,kahlon}@cs.utexas.edu
http://www.cs.utexas.edu/users/{emerson,kahlon}
Abstract. The Parameterized Model Checking Problem (PMCP) is to determine whether a temporal property is true for every size instance of a system comprised of many homogeneous processes. Unfortunately, it is undecidable in general. We are able to establish, nonetheless, decidability of the PMCP in quite a broad framework. We consider asynchronous systems comprised of an arbitrary number of homogeneous copies of a generic process template. The process template is represented as a synchronization skeleton, while correctness properties are expressed using Indexed CTL*\X. We reduce model checking for systems of arbitrary size n to model checking for systems of size up to a small cutoff size c. This establishes decidability of PMCP, as it is only necessary to model check a finite number of relatively small systems. Efficient decidability can be obtained in some cases. The results generalize to systems comprised of multiple heterogeneous classes of processes, where each class is instantiated by many homogeneous copies of the class template (e.g., m readers and n writers).
1
Introduction
Systems with an arbitrary number of homogeneous processes can be used to model many important applications. These include classical problems such as mutual exclusion, readers and writers, as well as protocols for cache coherence and data communication, among others. It is often the case that correctness properties are expected to hold irrespective of the size of the system, as measured by the number of processes in it. However, time and space constraints permit us to verify correctness only for instances with a small number of processes. This makes it impossible to guarantee correctness in general and thus motivates consideration of automated methods to permit verification for system instances of arbitrary sizes. The general problem, known in the literature as the Parameterized Model Checking Problem (PMCP), is the following: to decide whether a temporal property is true for every size instance of a given system. This problem
* This work was supported in part by NSF grant CCR-980-4737, SRC contract 99TJ-685 and TARP project 003658-0650-1999.
D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 236–254, 2000. c Springer-Verlag Berlin Heidelberg 2000
is known to be undecidable in general [1]. However, by imposing certain stipulations on the organization of the processes we can get a useful framework with a decidable PMCP. We establish our results in the synchronization skeleton framework. Our results apply to systems comprised of multiple heterogeneous classes of processes with many homogeneous process instances in each class. Thus, given a family (U1, ..., Uk) of k process classes and a tuple (n1, ..., nk) of natural numbers, we let (U1, ..., Uk)(n1 ,...,nk ) denote the concrete system composed of n1 copies or instances of U1 through nk copies or instances of Uk running in parallel asynchronously (i.e., with interleaving semantics). By abuse of notation, we also write (U1, ..., Uk)(n1 ,...,nk ) for the associated state graph, where each process starts in its designated initial state.

Correctness properties are expressed using a fragment of Indexed CTL*\X. The basic assertions are of the form “for all processes Ah”, or “for all processes Eh”, where h is an LTL\X formula (built using F “sometimes”, G “always”, U “until”, but without X “next-time”) over propositions indexed just by the processes being quantified over, and A “for all futures” and E “for some future” are the usual path quantifiers. Use of such an indexed, stuttering-insensitive logic is natural for parameterized systems. We consider correctness properties of the following types:

1. Over all individual processes of a single class Ul: ⋀il Ah(il) and ⋀il Eh(il), where il ranges over (indices of) individual processes in Ul.
2. Over pairs of different processes of a single class Ul: ⋀il≠jl Ah(il, jl) and ⋀il≠jl Eh(il, jl), where il, jl range over pairs of distinct processes in Ul.
3. Over one process from each of two different classes Ul, Um: ⋀il,jm Ah(il, jm) and ⋀il,jm Eh(il, jm), where il ranges over Ul and jm ranges over Um.
We say that the k-tuple (c1, ..., ck) of natural numbers is a cutoff of (U1, ..., Uk) for formula f iff:

∀(n1, ..., nk): (U1, ..., Uk)^(n1,...,nk) |= f iff ∀(m1, ..., mk) ≤ (c1, ..., ck): (U1, ..., Uk)^(m1,...,mk) |= f,

where we write (m1, ..., mk) ≤ (c1, ..., ck) to mean that (m1, ..., mk) is component-wise less than or equal to (c1, ..., ck), and (m1, ..., mk) ≥ (c1, ..., ck) to mean (c1, ..., ck) ≤ (m1, ..., mk). In this paper, we show that for systems in the synchronization skeleton framework with transition guards of a particular disjunctive or conjunctive form, there is a small cutoff. This, in effect, reduces PMCP to ordinary model checking over a relatively few small, finite-sized systems. In some cases, depending on the kind of property and guards, we can get an efficient (quadratic in the size of the template processes) solution to PMCP. Each process class is described by a generic process, a process template for the class. A system with k classes is given by templates (U1, ..., Uk). For such a system, define ci = |Ui| + 3 and di = 2|Ui| + 1, where |Ui| is the size, i.e., the number of local states, of template Ui. Then, for conjunctive and disjunctive guards, cutoffs of (d1, ..., dk) and (c1, ..., ck), respectively, suffice for all three types
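Operationally, a cutoff turns the parameterized question into finitely many ordinary model checking runs. The following sketch (purely illustrative; the routine `check`, standing in for an ordinary model checker, is hypothetical) enumerates all system sizes component-wise below the cutoff:

```python
from itertools import product

def pmcp_via_cutoff(cutoff, check):
    """Decide 'for all (n1,...,nk) >= (1,...,1): (U1,...,Uk)^(n1,...,nk) |= f'
    using a cutoff (c1,...,ck).

    `check(sizes)` is a hypothetical ordinary model checker returning True
    iff the concrete instance of the given sizes satisfies f.  By the cutoff
    property, it suffices to call it on every size tuple that is
    component-wise <= the cutoff.
    """
    size_ranges = [range(1, c + 1) for c in cutoff]
    return all(check(sizes) for sizes in product(*size_ranges))
```

For example, with cutoff (2, 3) this performs six ordinary model checking runs, on the instances (1,1), (1,2), (1,3), (2,1), (2,2), and (2,3).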
238
E. Allen Emerson and Vineet Kahlon
of formulae described above. These results give decision procedures for PMCP for conjunctive or disjunctive guards. Since these guards form a broad framework and PMCP is undecidable in general, we view this as quite a positive result. However, the decision procedures are not necessarily efficient ones, although they may certainly be usable on small examples. Because the cutoff is proportional to the sizes of the template processes, the global state graph of the cutoff system is of size exponential in the template sizes, and the decision procedures are also exponential. In the case of disjunctive guards, if we restrict ourselves to the A path quantifier but still permit all three types of properties, then the cutoff can be reduced, in quadratic time in the size of the template processes, to something of the form (1, ..., 2, ..., 1) or (1, ..., 3, ..., 1). In fact, depending on the type of property, we can show that it is possible to simplify the guards to ensure that only two or three classes need be retained. On the other hand, for conjunctive guards, if we restrict ourselves to model checking over infinite paths or over finite paths, then sharper cutoffs of the form (1, ..., 3, ..., 1), (1, ..., 2, ..., 1) or even (1, ..., 1) can, in some cases, be obtained. The rest of the paper is organized as follows. Section 2 defines the system model. Section 3 describes how to exploit the symmetry inherent in the model and correctness properties. Sections 4 and 5 prove the results pertaining to disjunctive and conjunctive guards, respectively. We show some applications of our results in Section 6. In the concluding Section 7, we discuss related work.
2 The System Model
We focus on systems comprised of multiple heterogeneous classes of processes modelled as synchronization skeletons (cf. [2]). Here, an individual concrete process has a transition of the form l --g--> m, indicating that the process can transit from local state l to local state m provided the guard g is true. Each class is specified by giving a generic process template. If I is an index set {1, ..., n}, then we use U^I, or (U)^(n) for short, to denote the concurrent system U^1 || ... || U^n comprised of the n isomorphic (up to re-indexing) processes U^i running in parallel asynchronously. For a system with k classes associated with the given templates U1, U2, ..., Uk, we have corresponding (disjoint) index sets I1, I2, ..., Ik. Each index set Ij is (a copy of) an interval {1, ..., nj} of natural numbers, denoted {1_j, ..., n_j} for emphasis¹. In practice, we assume the k index sets are specified by giving a k-tuple (n1, ..., nk) of natural numbers, corresponding to I1 being (a copy of) the interval {1, ..., n1} through Ik being (a copy of) the interval {1, ..., nk}. Given a family (U1, ..., Uk) of k template processes and a k-tuple (n1, ..., nk) of natural numbers, we let (U1, ..., Uk)^(n1,...,nk) denote the concrete system composed of n1 copies of U1 through nk copies of Uk running in parallel asynchronously (i.e., with interleaving semantics). A template process Ul = (Sl, Rl, il) for class l is comprised of a finite set Sl of (local) states, a set of transition
¹ E.g., if I1 is a copy of {1, 2, 3}, the said copy is denoted {1_1, 2_1, 3_1}. Informally, the subscripted index 3_1 means process 3 of class 1; formally, it is the ordered pair (3, 1), as is usual with indexed logics.
edges Rl, and an initial (local) state il. Each transition in Rl is labelled with a guard, a boolean expression over atomic propositions corresponding to local states of other template processes. Then, given index i and template process Ul, Ul^i = (Sl^i, Rl^i, il^i) is used to denote the ith copy of the template process Ul. Here Sl^i, the state set of Ul^i, Rl^i its transition relation, and il^i its initial state are obtained from Sl, Rl and il, respectively, by uniformly superscripting the states of Ul with i. Thus, for local states sl, tl of Sl, sl^i, tl^i denote local states of Ul^i, and (sl, tl) ∈ Rl iff (sl^i, tl^i) ∈ Rl^i. Given the guards of transitions in the template process, we now describe how to get the corresponding guards for the concrete process Ul^i of (U1, ..., Uk)^(n1,...,nk). In this paper, we consider the following two types of guards.

i) Disjunctive guards, of the general form (a1 ∨ ... ∨ b1) ∨ ... ∨ (ak ∨ ... ∨ bk), where the various al, ..., bl are (propositions identified with) the local states of template Ul, labelling each transition (sl, tl) ∈ Rl. In concrete process Ul^i of the system (U1, ..., Uk)^(n1,...,nk), the corresponding transition (sl^i, tl^i) ∈ Rl^i is then labelled by the guard

⋁_{r ≠ i} (al^r ∨ ... ∨ bl^r) ∨ ⋁_{j ≠ l} ⋁_{k ∈ [1..nj]} (aj^k ∨ ... ∨ bj^k),

where the proposition aj^k is understood to be true when process k in class Uj, i.e., Uj^k, is in the local state aj of template process Uj.

ii) Conjunctive guards with initial state, of the general form (i1 ∨ a1 ∨ ... ∨ b1) ∧ ... ∧ (ik ∨ ak ∨ ... ∨ bk). In concrete process i of class l, Ul^i, in the system (U1, ..., Uk)^(n1,...,nk), the corresponding transition is then labelled by the guard

⋀_{r ≠ i} (il^r ∨ al^r ∨ ... ∨ bl^r) ∧ ⋀_{j ≠ l} ⋀_{k ∈ [1..nj]} (ij^k ∨ aj^k ∨ ... ∨ bj^k).

Note that the initial local states of processes must be present in these guards. Thus, the initial state of a process has a "neutral" character, so that when process j is in its initial state it does not prevent progress by another process i. This natural condition permits modelling a broad range of applications (and is helpful technically).

We now formalize the asynchronous concurrent (interleaving) semantics. A process transition with guard g is enabled in global state s iff s |= g, i.e., g is true over the local states in s. A transition can be fired in global state s iff its guard g is enabled. Let (U1, ..., Uk)^(n1,...,nk) = (S^(n1,...,nk), R^(n1,...,nk), i^(n1,...,nk)) be the global state transition graph of the system instance (n1, n2, ..., nk). A state s ∈ S^(n1,...,nk) is written as an (n1 + ... + nk)-tuple (u1^1, ..., u1^{n1}, u2^1, ..., uk^{nk}), where the projection of s onto process i of class l, denoted s(l, i), equals ul^i, the local state of the ith copy of the template process Ul. The initial state i^(n1,...,nk) = (i1^1, ..., ik^{nk}). A global transition (s, t) ∈ R^(n1,...,nk) iff t results from s by firing an enabled transition of some process, i.e., there exist i, l such that the guard labelling (ul^i, vl^i) ∈ Rl^i is enabled at s, s(l, i) = ul^i, t(l, i) = vl^i, and for all
(j, k) ≠ (i, l), s(k, j) = t(k, j). We write (U1, ..., Uk)^(n1,...,nk) |= f to indicate that the global state graph of (U1, ..., Uk)^(n1,...,nk) satisfies f at the initial state i^(n1,...,nk). Finally, for a global state s, define Set(s) = {t | s contains an indexed local copy of t}. For a computation path x = x0, x1, ..., we define PathSet(x) = ∪_i Set(xi). We say that the sequence of global states y = y0, y1, ... is a stuttering of computation path x iff there exists a parsing P0 P1 ... of y such that for all j ≥ 0 there is some r > 0 with Pj = (xj)^r (cf. [3]). Also, we extend the definition of projection to computation sequences as follows: for i ∈ [1..nl], the sequence of local states x0(l, i), x1(l, i), ... is denoted by x(l, i).
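To illustrate the interleaving semantics just defined, the following sketch (illustrative only; the set-based guard representation is a simplification of the disjunctive and conjunctive guard forms above) computes the global successors of a state, firing exactly one enabled local transition per global step:

```python
# A template is a dict: local state -> list of (guard, next_state), where
# for this sketch a guard is simply a set W of template states: under the
# disjunctive reading the transition is enabled iff some OTHER process is
# currently in a state of W; under the conjunctive reading, iff EVERY
# other process is in a state of W.

def successors(state, templates, disjunctive=True):
    """Global successors of `state`, a tuple of (class, local_state) pairs,
    one per concrete process, under interleaving semantics."""
    succs = []
    for i, (cls, local) in enumerate(state):
        # local states of all OTHER processes
        others = [state[j][1] for j in range(len(state)) if j != i]
        for guard, nxt in templates[cls].get(local, []):
            if disjunctive:
                enabled = any(s in guard for s in others)
            else:
                enabled = all(s in guard for s in others)
            if enabled:
                t = list(state)
                t[i] = (cls, nxt)      # exactly one process moves
                succs.append(tuple(t))
    return succs
```

For instance, with a single class whose template moves from N to T when some other process is in N, the global state ((0,"N"), (0,"N")) has two successors (either process may fire), while ((0,"T"), (0,"N")) has none.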
3 Appeal to Symmetry
We can exploit the symmetry inherent in the system model and the properties, in the spirit of the "state symmetry" codified by [8] (cf. [16], [12]), to simplify our proof obligation. To establish formulae of the types ⋀_{il} f(il), ⋀_{il ≠ jl} f(il, jl) and ⋀_{il, jm} f(il, jm), it suffices to show the results with the formulae replaced by f(1_l), f(1_l, 2_l) and f(1_l, 1_m), respectively. The basic idea is that in a system comprised of fully interchangeable processes 1 through n of a given class, symmetry considerations dictate that process 1 satisfies a property iff each process i ∈ [1..n] satisfies the property. Proofs are omitted for the sake of brevity.
4 Systems with Disjunctive Guards
In this section, we show how to reduce the PMCP for systems with disjunctive guards to model checking systems of sizes bounded by a small cutoff, where the size of the cutoff for each process class is essentially the number of local states of the individual process template for the class. This yields decidability for this formulation of PMCP, a pleasant result since PMCP is undecidable in full generality. But this result, by itself, does not give us an efficient decision procedure for the PMCP at hand. We go on to show that in the case of universal-path-quantified specification formulae (Ah), efficient decidability can be obtained.

4.1 Properties Ranging over All Processes in a Single Class
We will first establish the

Theorem 4.1.1 (Disjunctive Cutoff Theorem). Let f be ⋀_{il} Ah(il) or ⋀_{il} Eh(il), where h is an LTL\X formula and l ∈ [1..k]. Then we have the following:

∀(n1, ..., nk) ≥ (1, ..., 1): (U1, ..., Uk)^(n1,...,nk) |= f iff ∀(d1, ..., dk) ≤ (c1, ..., ck): (U1, ..., Uk)^(d1,...,dk) |= f,

where the cutoff (c1, ..., ck) is given by cl = |Ul| + 2 and, for i ≠ l, ci = |Ui| + 1.

As a corollary, we will have the
Theorem 4.1.2 (Disjunctive Decidability Theorem). PMCP for systems with disjunctive guards and single-index assertions as above is decidable in exponential time.

Proof idea
By the Disjunctive Cutoff Theorem, it is enough to model check each of the exponentially many exponential-size state graphs corresponding to the systems (U1, ..., Uk)^(d1,...,dk) for all (d1, ..., dk) ≤ (c1, ..., ck). □

For notational brevity, we establish the above results for systems with just two process classes. We begin by proving the following lemmas.

Lemma 4.1.1 (Disjunctive Monotonicity Lemma).
(i) ∀n ≥ 1: (V1, V2)^(1,n) |= Eh(1_2) implies (V1, V2)^(1,n+1) |= Eh(1_2).
(ii) ∀n ≥ 1: (V1, V2)^(1,n) |= Eh(1_1) implies (V1, V2)^(1,n+1) |= Eh(1_1).

Proof idea
(i) The idea is that for any computation x of (V1, V2)^(1,n), there exists an analogous computation y of (V1, V2)^(1,n+1) wherein the (n+1)st copy of template process V2 stutters in its initial state and the rest of the processes behave as in x.
(ii) This part follows by a similar argument. □

The following lemma allows reduction of the system size, one coordinate at a time.

Lemma 4.1.2 (Disjunctive Bounding Lemma).
(i) ∀n ≥ |V2| + 2: (V1, V2)^(1,n) |= Eh(1_2) iff (V1, V2)^(1,c2) |= Eh(1_2), where c2 = |V2| + 2.
(ii) ∀n ≥ |V2| + 1: (V1, V2)^(1,n) |= Eh(1_1) iff (V1, V2)^(1,|V2|+1) |= Eh(1_1).

Proof
(i) (⇒) Let x = x0, x1, ... denote a computation sequence of (V1, V2)^(1,n). Define Reach = {s1, ..., sr} to be the set of all local states of template process V2 occurring in x. For st ∈ Reach, let t1, t2, ..., tm be a finite local computation of minimal length in x ending in st. Then we use MinLength(st) to denote m, and MinComputation(st) to denote the sequence t1, t2, ..., t_{m-1}, (tm)^ω. Let v = (i2)^ω. If x is an infinite computation sequence and x(1, 1) and x(2, 1) are finite local computations, then there exists an infinite local computation sequence u, say. In that case, reset v = u.
Construct a formal sequence y = y0, y1, ... of global states of (V1, V2)^(1,c2) from x as follows:
1. y(1, 1) = x(1, 1) and y(2, 1) = x(2, 1), i.e., the local computation paths in x of process index 1 of classes V1, V2 are preserved, and
2. For each state sj ∈ Reach, we set y(2, j + 1) = MinComputation(sj), i.e., we let the (j + 1)st copy of V2 perform a local computation of minimum length in x leading to sj and then let it stutter in sj forever. This condition has the implication that for all i ≥ 1, Set(xi) ⊆ Set(yi). To see this, let t ∈ Set(xi). Then MinLength(t) ≤ i. Also, t ∈ Set(xi) implies that t ∈ Reach, i.e., t = sq for some q ∈ [1..r]. Then y(2, q + 1) stutters in sq for all k ≥ MinLength(sq), and therefore for all k ≥ i also. Hence yi(2, q + 1) is an indexed copy of sq, i.e., t ∈ Set(yi). Thus for all i ≥ 1, Set(xi) ⊆ Set(yi).
3. y(2, c2) = v. This ensures that if x is an infinite computation sequence, then in y infinitely many local transitions are fired.

However, it might be the case that the sequence y violates the interleaving semantics requirement. Clearly, this happens iff the following scenario occurs. Let states sp, sq ∈ Reach be such that MinComputation(sp) and MinComputation(sq) are realized by the same local computation of x, and suppose that MinLength(sp) ≤ MinLength(sq). Then, if for i < MinLength(sp), (ti, ti+1) is a transition in MinComputation(sp), (yi(2, p+1), yi+1(2, p+1)) and (yi(2, q+1), yi+1(2, q+1)) are both local transitions driving yi to yi+1. This violates the interleaving semantics condition requiring that there be at most one local transition driving each global transition. There are two things to note here. First, for a transition (yi, yi+1), the violation occurs only for values of i ≤ max_{j ∈ [1..r]} MinLength(sj); secondly, for a fixed i, all violations are caused by a unique template transition (s, t) of V2, namely the one involved in the transition (xi, xi+1). To solve this problem, we construct a sequence of states w = w0, w1, ... from y by "staggering" copies of the same local transition, as described below.
Let (yi, yi+1) be a transition where the interleaving semantics requirement is violated by process indices in1, ..., ind of V2 executing indexed copies (s2^{in1}, t2^{in1}), ..., (s2^{ind}, t2^{ind}), respectively, of the template transition (s2, t2) of V2. Replace (yi, yi+1) with a sequence u1, u2, ..., uf such that u1 = yi, uf = yi+1 and, for all j, the transition (uj, uj+1) results by executing the local transition (s2^{inj}, t2^{inj}). Clearly the interleaving semantics requirement is met, as at most one local transition is executed for each global transition. Also, it is not hard to see that for all j, Set(yi) ⊆ Set(uj), and hence for all k, the transition (uk, uk+1) is valid. Finally, note that processes with indices other than in1, ..., ind are made to stutter finitely often in u1, ..., uf, which is allowed since we are considering only formulae without the next-time operator X. Thus, given a computation path x of (V1, V2)^(1,n), we have constructed a stuttering computation path w of (V1, V2)^(1,c2) such that the local computation sequence w(2, 1) is a stuttering of the local computation sequence x(2, 1). From this path correspondence, we easily have the result.
(⇐) The proof follows by repeated application of the Disjunctive Monotonicity Lemma.
(ii) This part follows by a similar argument.
□
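The bookkeeping at the heart of the Bounding Lemma — MinLength and MinComputation — amounts to finding, for each local state occurring in x, a shortest local computation of x that reaches it. A small illustrative sketch (representing MinComputation(s) by its finite prefix, since the ω-suffix merely repeats the last state):

```python
def min_computations(local_paths):
    """For each local state occurring in any of the given local
    computations (lists of local states), return a pair
    (MinLength(s), finite prefix t1,...,tm of MinComputation(s)),
    i.e. a minimal-length local computation ending in that state."""
    best = {}  # state -> (MinLength, prefix of MinComputation)
    for path in local_paths:
        for m in range(1, len(path) + 1):
            s = path[m - 1]
            # keep the shortest computation seen so far that ends in s
            if s not in best or m < best[s][0]:
                best[s] = (m, path[:m])
    return best
```

For instance, given the local computations i,a,b and i,b, the state b is reachable in two steps via the second path, even though the first path reaches it only in three.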
The following lemma allows reduction of the system size over multiple coordinates simultaneously (two coordinates, for notational brevity).

Lemma 4.1.3 (Disjunctive Truncation Lemma).
∀n1, n2 ≥ 1: (U1, U2)^(n1,n2) |= Eh(1_2) iff (U1, U2)^(n1',n2') |= Eh(1_2), where n2' = min(n2, |U2| + 2) and n1' = min(n1, |U1| + 1).

Proof
If n2 > |U2| + 2, set V1 = U1^{n1} and V2 = U2. Then (U1, U2)^(n1,n2) |= Eh(1_2) iff (V1, V2)^(1,n2) |= Eh(1_2) iff (V1, V2)^(1,n2') |= Eh(1_2) (by the Disjunctive Bounding Lemma) iff (U1, U2)^(n1,n2') |= Eh(1_2).
If n1 ≤ |U1| + 1, then n1 = n1' and we are done; else set V1 = U2^{n2'} and V2 = U1. Then (U1, U2)^(n1,n2') |= Eh(1_2) iff (U2, U1)^(n2',n1) |= Eh(1_1) iff (V1, V2)^(1,n1) |= Eh(1_1) iff (V1, V2)^(1,|U1|+1) |= Eh(1_1) (by the Disjunctive Bounding Lemma) iff (U1, U2)^(n1',n2') |= Eh(1_2). □

An easy but important consequence of the Disjunctive Truncation Lemma is the following

Theorem 4.1.3 (Disjunctive Cutoff Result). Let f be ⋀_{il} Ah(il) or ⋀_{il} Eh(il), where h is an LTL\X formula and l ∈ [1..2]. Then we have the following:

∀(n1, n2) ≥ (1, 1): (U1, U2)^(n1,n2) |= f iff ∀(d1, d2) ≤ (c1, c2): (U1, U2)^(d1,d2) |= f,

where the cutoff (c1, c2) is given by cl = |Ul| + 2 and, for i ≠ l, ci = |Ui| + 1.

Proof
By appeal to symmetry and the fact that A and E are duals, it suffices to prove the result for formulae of the type Eh(1_2). The (⇒) direction is trivial. For the (⇐) direction, let n1, n2 ≥ 1. Define n1' = min(n1, |U1| + 1) and n2' = min(n2, |U2| + 2). Then (U1, U2)^(n1,n2) |= Eh(1_2) iff (U1, U2)^(n1',n2') |= Eh(1_2) by the Disjunctive Truncation Lemma. The latter is true since (n1', n2') ≤ (c1, c2). This proves the cutoff result. □

The earlier-stated Cutoff Theorem re-articulates the above Cutoff Result more generally for systems with k ≥ 1 different classes of processes; since its proof is along similar lines but is notationally more complex, we omit it for the sake of brevity.
4.2 Efficient Decidability for "For All Future" Properties
It can be shown that for "for some future" properties, corresponding to formulae of the type ⋀ Eh, the reduction entailed in the previous result is, in general, the best possible. We omit the proof for the sake of brevity.
However, for universal-path-quantified properties, it is possible to be much more efficient. We will establish the

Theorem 4.2.1 (Reduction Theorem). Define V = Um' if for some m ∈ [1..k] the transition graph for Um' has a nontrivial strongly connected component; else set V = U1'. Then (U1, ..., Uk)^(c1,...,ck) |= ⋀_{il} Ah(il) iff (Ul', V)^(1,1) |= Ah(1_1), where cl = |Ul| + 2, ci = |Ui| + 1 for i ≠ l, and Ul' is the simplified process that we get from Ul by the reduction technique described below.

This makes precise our claim that for formulae of the type ⋀_{il} Ah(il), it is possible to give efficient decision procedures for the PMCP at hand, by reducing it to model checking systems consisting of two or three template processes. To this end, we first prove the following lemma, which states that the PMCP for the above-mentioned properties reduces to model checking just the single system instance of size equal to the (small) cutoff (as opposed to all systems of size less than or equal to the cutoff).

Lemma 4.2.1 (Single-Cutoff Lemma).
∀n1, n2 ≥ 1: (U1, U2)^(n1,n2) |= Ah(1_2) iff (U1, U2)^(c1,c2) |= Ah(1_2), where c1 = |U1| + 1 and c2 = |U2| + 2.

Proof
(⇒) This direction follows easily by instantiating n1 = c1 and n2 = c2 on the left-hand side.
(⇐) Choose arbitrary k1, k2 ≥ 1. Set k1' = min(k1, c1) and k2' = min(k2, c2). Then (U1, U2)^(k1,k2) |= Eh(1_2) iff (U1, U2)^(k1',k2') |= Eh(1_2) (by the Disjunctive Truncation Lemma), which implies (U1, U2)^(c1,c2) |= Eh(1_2) (by repeated application of the Disjunctive Monotonicity Lemma). Now, by contraposition, (U1, U2)^(c1,c2) |= Ah(1_2) implies (U1, U2)^(k1,k2) |= Ah(1_2). Since k1, k2 were arbitrarily chosen, the proof is complete. □

Next, we transform the given template processes, and follow that up with lemmas giving the soundness and completeness proofs for the transformation.
Given template processes U1, ..., Uk, define ReachableStates(U1, ..., Uk) = (S1', ..., Sk'), where Si' = {t ∈ Si | for some n1, n2, ..., nk ≥ 1, there exists a computation path of (U1, ..., Uk)^(n1,...,nk) leading to a global state that contains a local indexed copy of t}.
∀j ≥ 0, ∀l ∈ [1..k], we define Pl^j as follows:
Pl^0 = {il}.
Pl^{j+1} = Pl^j ∪ {p' | ∃p ∈ Pl^j: there is a transition p --g--> p' ∈ Rl, and the expression g contains a state in ∪_t Pt^j}.
For l ∈ [1..k], define Pl = ∪_j Pl^j.
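Since the sets Pl^j are increasing and bounded, Pl can be computed by a straightforward least-fixpoint iteration. An illustrative sketch (assuming each template is given as a list of transitions (p, G, p'), where G is the set of template states mentioned in the guard g):

```python
def reachable_sets(templates, initials):
    """Compute (P_1, ..., P_k): the least sets containing each class's
    initial state and closed under transitions whose guard mentions some
    state already known reachable (in any class).

    templates[l] is a list of (p, guard_states, p2) triples for class l;
    initials[l] is the initial state of class l."""
    P = [{init} for init in initials]
    changed = True
    while changed:
        changed = False
        known = set().union(*P)          # union over all classes of P_t
        for l, transitions in enumerate(templates):
            for p, guard_states, p2 in transitions:
                if p in P[l] and p2 not in P[l] and guard_states & known:
                    P[l].add(p2)
                    changed = True
    return P
```

For example, with two classes where class 0 can move i0 → a0 once some process is in i1, class 1 can move i1 → b1 once some process is in a0, and class 0 can then move a0 → c0 once some process is in b1, the iteration discovers a0, then b1, then c0.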
Lemma 4.2.2 (Soundness Lemma). Given j, for all l ∈ [1..k], define al = |Pl^j|. Then there exists a finite computation sequence x = x0, x1, ..., xm of (U1, ..., Uk)^(a1,...,ak) such that ∀l ∈ [1..k]: ∀sl ∈ Pl^j: (∃p ∈ [1..al]: xm(l, p) = sl^p).

Proof
The proof is by induction on j. The base case, j = 0, is vacuously true. Assume that the result holds for j ≤ u, and let y = y0, y1, ..., yt be a computation sequence of (U1, ..., Uk)^(r1,...,rk), where rl = |Pl^u|, with the property that ∀l ∈ [1..k]: ∀sl ∈ Pl^u: (∃p ∈ [1..rl]: yt(l, p) = sl^p). Now assume that Pl^{u+1} ≠ Pl^u, and let sl ∈ Pl^{u+1} \ Pl^u. Furthermore, let (sl', sl) be the transition that led to the inclusion of sl into Pl^{u+1}. Clearly, sl' ∈ Pl^u. Then, by the induction hypothesis, ∃q ∈ [1..rl]: yt(l, q) is an indexed copy of sl'. Consider the sequence y' = y0', y1', ..., y'_{2t+1} of states of (U1, ..., Uk)^(r1,...,rl+1,...,rk), where for i ∈ [1..k], c ∈ [1..ri]: y'(i, c) = y(i, c)(yt(i, c))^{t+1}, and y'(l, rl + 1) = (il^{rl+1})^t z, where z is y(l, q) sl^{rl+1} with the index q replaced by rl + 1. It can be seen that y' is a valid stuttering computation path of (U1, ..., Uk)^(r1,...,rl+1,...,rk), where y'_{2t+1} has the property that ∀l ∈ [1..k]: ∀sl ∈ Pl^u: ∃p ∈ [1..rl]: y'_{2t+1}(l, p) = sl^p, and y'_{2t+1}(l, rl + 1) = sl^{rl+1}. Repeating the above procedure for all states in Pl^{u+1} \ Pl^u, we get a computation path with the desired property. This completes the induction step and proves the lemma. □

Lemma 4.2.3 (Completeness Lemma). (S1', ..., Sk') = (P1, ..., Pk).

Proof
By the above lemma, ∀i ∈ [1..k]: Pi ⊆ Si'. Suppose, for contradiction, that (S1', ..., Sk') ≠ (P1, ..., Pk). Then the set D = ∪_i (Si' − Pi) ≠ ∅. For definiteness, let sl ∈ D ∩ Sl'. Then, by the definition of Sl', there exists a finite computation sequence x = x0, x1, ..., xm such that for some i, xm(l, i) = sl^i. Let j ∈ [0..m] be the smallest index such that Set(xj) ∩ D ≠ ∅. Then PathSet(x0, ..., x_{j−1}) ⊆ ∪_i Pi, which implies that there exists a transition (sl', sl) in Rl, with guard g, such that x_{j−1} |= g. But this implies that for some t, sl would be included in Pl^t, i.e., sl ∈ Pl, a contradiction to our assumption that sl ∈ D. Thus D = ∅ and we are done. □

We now modify the k-tuple of template processes (U1, ..., Uk) to get the k-tuple (U1', ..., Uk'), where Ui' = (Si', Ri', ii), with (si, ti) ∈ Ri' iff the guard gi labelling (si, ti) in Ui contains an indexed copy of a state in ∪_{i ∈ [1..k]} Si'. Furthermore, every transition in the new system is labelled with gU, a universal guard that evaluates to true irrespective of the current global state of the system. The motivation behind these definitions is that since, for any n1, n2, ..., nk ≥ 1, no indexed copy of a state in Si \ Si' is reachable in any computation of (U1, ..., Uk)^(n1,...,nk), we can safely delete these states from their respective template processes. Also, any guard of a template process involving only states in Si \ Si' will then always evaluate to false, and hence the transition labelled by this guard will never be fired. This justifies deleting such transitions from the transition graphs of the respective template processes. This brings us to the following Reduction Result which, by appeal to symmetry, yields the Reduction Theorem stated before.

Theorem 4.2.2 (Reduction Result). Define V = Ur' if for some r ∈ [1..k] the transition graph for Ur' has a nontrivial strongly connected component; else set V = U1'. Then (U1, ..., Uk)^(c1,...,ck) |= Ah(1_p) iff (Up', V)^(1,1) |= Ah(1_1), where cp = |Up| + 2 and ci = |Ui| + 1 for i ≠ p.
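Operationally, the passage from (U1, ..., Uk) to (U1', ..., Uk') is a simple filtering step on top of the reachable sets. An illustrative sketch, following the definitions above (guard sets are represented as in the earlier fixpoint sketch; restricting transitions to reachable endpoints is part of restricting to Si'):

```python
def simplify_template(states, transitions, reachable):
    """Build U' from U: keep only reachable local states, and keep a
    transition iff its guard mentions at least one reachable state; the
    guard of every kept transition is then replaced by the universal
    guard g_U (represented here by True).

    `transitions` is a list of (s, guard_states, t); `reachable` is the
    union of the reachable-state sets of all classes."""
    new_states = [s for s in states if s in reachable]
    new_transitions = [
        (s, True, t)                     # guard replaced by g_U
        for (s, guard_states, t) in transitions
        if s in reachable and t in reachable and guard_states & reachable
    ]
    return new_states, new_transitions
```

A transition whose guard mentions only unreachable states is dropped, since such a guard can never evaluate to true in any concrete instance.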
Proof
We show that (U1, ..., Uk)^(c1,...,ck) |= Eh(1_p) iff (Up', V)^(1,1) |= Eh(1_1). For definiteness, let V = Ur'.
(⇒) Define the sequence u = (ir)^ω. If Ur' has a nontrivial strongly connected component, then there exists an infinite path v, say, in its transition graph; in that case, reset u = v. Let x = x1, x2, ... be a computation sequence of (U1, ..., Uk)^(c1,...,ck). Define a formal sequence y = y1, y2, ... as follows. Set y(1, 1) = x(p, 1), and in case x(p, 1) is a finite computation sequence of length f, say, set y(2, 1) = (ir)^f u; else set y(2, 1) = (ir)^ω. To prove that y is a valid computation sequence of (Up', V)^(1,1), it suffices to show that all transitions of the local path y(1, 1) are valid. This follows from the definition of Up' and by noting that all states occurring in x(p, 1) are reachable and all transitions in x(p, 1) are labelled by guards whose expressions involve a state in ∪_j Sj', and hence they occur in Rp'.
(⇐) By the Soundness and Completeness Lemmas, it follows that there exists a finite computation path u = u0, u1, ..., um of (U1, ..., Uk)^(|U1|,...,|Uk|), starting at i^(|U1|,...,|Uk|), such that ∀j ∈ [1..k]: ∀qj ∈ Sj': ∃t ∈ [1..|Uj|]: um(j, t) = qj^t. Let x = x0, x1, ... be a computation path of (Up', V)^(1,1). Define a formal sequence y = y0, y1, ... of states of (U1, ..., Uk)^(c1,...,ck) as follows. Set y(p, 1) = ((ip)^m) x(1, 1), y(r, 1) = ((ir)^m) x(2, 1), ∀z ∈ [1..|Up|]: y(p, z + 1) = u(p, z), ∀z ∈ [1..|Ur|]: y(r, z + 1) = u(r, z), and ∀j ∈ [1..k], j ≠ p, r: ∀z ∈ [1..|Uj|]: y(j, z) = u(j, z)(um(j, z))^ω. Note that ∀l ≥ m: Set(yl) = ∪_j Sj', and hence for all l ≥ m, all template transitions in Rp' ∪ Rr' are enabled at yl. Thus for all i ≥ m, all transitions (yi, yi+1) are valid, and hence it follows that y is a stuttering of a valid computation path of (U1, ..., Uk)^(c1,...,ck), with the local path y(1, 1) being a stuttering of the local path x(1, 1). The path correspondence gives us the result. □

Finally, we get the

Theorem 4.2.3 (Efficient Decidability Theorem). For systems with disjunctive guards and properties of the type ⋀_{il} Ah(il), the PMCP is decidable in time quadratic in the size of the given family (U1, ..., Uk), where size is defined as Σ_j (|Sj| + |Rj|), and linear in the size of the Büchi automaton for ¬h(1_l).
Proof
We first argue that we can construct the simplified system Ul' efficiently. By definition, ∀j ≥ 0: Pl^j ⊆ Pl^{j+1}. Let P^i = ∪_l Pl^i. Then it is easy to see that ∀j ≥ 0: P^j ⊆ P^{j+1}, and that if P^j = P^{j+1}, then ∀i ≥ j: P^i = P^j. Also, ∀i: P^i ⊆ ∪_l Sl. Thus, to evaluate the sets Pl^j for all j, it suffices to evaluate them for values of j ≤ Σ_l |Sl|. Furthermore, given Pl^j, to evaluate Pl^{j+1} it suffices to make a pass through all transitions leading to states in Sl \ Pl^j, to check whether a guard leading to any of these states contains a state in ∪_l Pl^j. This can clearly be accomplished in time Σ_j (|Sj| + |Rj|). The above remarks imply that the evaluation of the sets Pl^j can be done in time O((Σ_j (|Sj| + |Rj|))^2). Furthermore, given p, whether Up' has a nontrivial strongly connected component can be decided in time O(|Sp'| + |Rp'|) by constructing all strongly connected components of Up'. Thus, determining whether such a p exists can be done in time O(Σ_j (|Sj| + |Rj|)).
The Reduction Theorem reduces the PMCP to model checking the system (Ul', V)^(1,1), where V = Ur' if for some r ∈ [1..k] the transition graph for Ur' has a nontrivial strongly connected component, else V = U1'. Now, (Ul', V)^(1,1) |= Ah(1_1) iff (Ul', V)^(1,1) |= ¬E¬h(1_1). Thus it suffices to check whether (Ul', V)^(1,1) |= E¬h(1_1), for which we use the automata-theoretic approach of [20]. We construct a Büchi automaton B_¬h for ¬h(1_1), and check that the language of the product Büchi automaton P of (Ul', V)^(1,1) and B_¬h is nonempty (cf. [14]). Since the nonemptiness check for P can be done in time linear in the size of P, and the size of (Ul', V)^(1,1) is O((Σ_j (|Sj| + |Rj|))^2), we are done. □
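The nontrivial-SCC test invoked in the proof is a standard linear-time computation; an illustrative sketch via Tarjan's algorithm (a strongly connected component is nontrivial iff it contains more than one state, or a single state with a self-loop):

```python
def has_nontrivial_scc(graph):
    """graph: dict mapping a state to its list of successor states.
    Returns True iff some strongly connected component is nontrivial,
    i.e. contains a cycle.  Runs in time O(|S| + |R|) via Tarjan's
    SCC algorithm."""
    index, low, on_stack, stack = {}, {}, set(), []
    counter = [0]
    found = [False]

    def strongconnect(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:
                strongconnect(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:           # v is the root of an SCC
            comp = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                comp.append(w)
                if w == v:
                    break
            if len(comp) > 1 or v in graph.get(v, []):
                found[0] = True

    for v in graph:
        if v not in index:
            strongconnect(v)
    return found[0]
```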
4.3 Properties Ranging over Pairs of Processes from Two Classes

Using similar kinds of arguments to those used in proving the assertions in Sections 4.1 and 4.2, we can prove the following results.

Theorem 4.3.1 (Cutoff Theorem). Let f be ⋀_{il, jm} Ah(il, jm) or ⋀_{il, jm} Eh(il, jm), where h is an LTL\X formula and l, m ∈ [1..k]. Then we have the following:

∀(n1, ..., nk) ≥ (1, ..., 1): (U1, ..., Uk)^(n1,...,nk) |= f iff ∀(d1, ..., dk) ≤ (c1, ..., ck): (U1, ..., Uk)^(d1,...,dk) |= f,

where the cutoff (c1, ..., ck) is given by cl = |Ul| + 2, cm = |Um| + 2 and, for i ≠ l, m, ci = |Ui| + 1.

Theorem 4.3.2 (Reduction Theorem).
(U1, ..., Uk)^(c1,...,ck) |= ⋀_{il, jm} Ah(il, jm) iff (Ul', Um')^(1,1) |= ⋀_{il, jm} Ah(il, jm), where cl = |Ul| + 2, cm = |Um| + 2 and ∀i ≠ l, m: ci = |Ui| + 1.

Again, we get the analogous Decidability Theorem and Efficient Decidability Theorem. Moreover, we can specialize these results to apply when l = m. This permits reasoning about formulae of the type ⋀_{il ≠ jl} Ah(il, jl) or ⋀_{il ≠ jl} Eh(il, jl), for properties ranging over all pairs of processes in a single class l.
5 Systems with Conjunctive Guards
The development of results for conjunctive guards closely resembles that for disjunctive guards. Hence, for the sake of brevity, we only provide a proof sketch for each of the results.

Lemma 5.1 (Conjunctive Monotonicity Lemma).
(i) ∀n ≥ 1: (V1, V2)^(1,n) |= Eh(1_2) implies (V1, V2)^(1,n+1) |= Eh(1_2).
(ii) ∀n ≥ 1: (V1, V2)^(1,n) |= Eh(1_1) implies (V1, V2)^(1,n+1) |= Eh(1_1).

Proof Sketch
The intuition behind this lemma is that for any computation x of (V1, V2)^(1,n), there exists an analogous computation y of (V1, V2)^(1,n+1) wherein the (n+1)st copy of template process V2 stutters in its initial state and the rest of the processes behave as in x. □

Lemma 5.2 (Conjunctive Bounding Lemma).
(i) ∀n ≥ 2|V2| + 1: (V1, V2)^(1,n) |= Eh(1_2) iff (V1, V2)^(1,c2) |= Eh(1_2), where c2 = 2|V2| + 1.
(ii) ∀n ≥ 2|V2|: (V1, V2)^(1,n) |= Eh(1_1) iff (V1, V2)^(1,2|V2|) |= Eh(1_1).

Proof Sketch
Let x be an infinite computation of (V1, V2)^(1,n). Set v = (i2)^ω, where i2 is the initial state of V2. If neither x(1, 1) nor x(2, 1) is an infinite local computation, then there exists l ≠ 1 such that x(2, l) is an infinite local computation; in that case, reset v = x(2, l). Construct a formal sequence y of (V1, V2)^(1,c2) as follows. Set y(1, 1) = x(1, 1), y(2, 1) = x(2, 1), y(2, 2) = v, and ∀j ∈ [3..c2]: y(2, j) = (i2)^ω. Then it can be proved that y is a stuttering of a valid infinite computation of (V1, V2)^(1,c2).
Now consider the case where x = x0 x1 ... xd is a deadlocked computation sequence of (V1, V2)^(1,n). Let S = Set(xd) ∩ S2. For each s ∈ S, define an index set Is as follows. If there exists a unique indexed copy xd(2, in) of s in xd, set Is = {in}; else set Is = {in1, in2}, where xd(2, in1) and xd(2, in2) are indexed copies of s and in1 ≠ in2. Let I = ∪_s Is.
Also, for index j and global state s, define Set(s, j) = {t | t ∈ S1 ∪ S2 and t has a copy with index other than j in s}. Construct a formal sequence y = y0, ..., yd of states of (V1, V2)^(1,2|V2|+1) by projecting each global state xi onto the process 1 coordinate of V1 and the process index coordinates of V2, where index = 1 or index ∈ I. From our construction, it follows that for all j, Set(yj) ⊆ Set(xj); hence all transitions (yi, yi+1) are valid. Also, for each i ∈ [1..(2|V2| + 1)], there exists j ∈ [1..n] such that y(2, i) is a projection of x(2, j). Then, from our construction, it follows that Set(xd, j) = Set(yd, i), and thus process V2^i is deadlocked in yd iff V2^j is deadlocked in xd. Then, from the fact that xd is deadlocked, we can conclude that yd is a deadlocked state, and hence y is a stuttering of a deadlocked computation of (V1, V2)^(1,c2). In both cases, when constructing y from x, we preserved the local computation sequence of process V2^1. This path correspondence gives us the result. □
Reducing Model Checking of the Many to the Few
Again as before, the following lemma allows reduction in system size over multiple coordinates simultaneously (2 coordinates for notational brevity).

Lemma 5.3 (Conjunctive Truncation Lemma).
∀n1, n2 ≥ 1 : (U1, U2)^(n1,n2) |= Eh(1_2) iff (U1, U2)^(n'1,n'2) |= Eh(1_2), where n'2 = min(n2, 2|U2| + 1) and n'1 = min(n1, 2|U1|).

Proof Idea. Use the Conjunctive Bounding Lemma and associativity of the || operator.
□
Theorem 5.1 (Conjunctive Cutoff Result). Let f be ⋀_{i_l} Ah(i_l) or ⋀_{i_l} Eh(i_l), where h is an LTL\X formula and l ∈ [1..2]. Then we have the following:
∀(n1, n2) ≥ (1, 1) : (U1, U2)^(n1,n2) |= f iff ∀(d1, d2) ≤ (c1, c2) : (U1, U2)^(d1,d2) |= f,
where the cutoff (c1, c2) is given by cl = 2|Ul| + 1, and for i ≠ l : ci = 2|Ui|.

Proof Sketch. Follows easily from the Truncation Lemma.
□
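Concretely, the cutoff tuple of Theorem 5.1 (and of its k-class generalization, Theorem 5.2 below) is a simple function of the template sizes. A minimal illustrative helper (the function name and list encoding are ours, not the paper's):

```python
def conjunctive_cutoffs(sizes):
    """Given template sizes [|U_1|, ..., |U_k|], return for each class l
    the cutoff tuple (c_1, ..., c_k) used when the formula refers to
    class l: c_l = 2|U_l| + 1 and c_i = 2|U_i| for i != l."""
    return [tuple(2 * m + (1 if i == l else 0) for i, m in enumerate(sizes))
            for l in range(len(sizes))]
```

For a single class with a 3-state template this gives cutoff 7, matching the 7-process starvation-freedom check in the applications section below.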
More generally, for systems with k ≥ 1 classes of processes we have

Theorem 5.2 (Conjunctive Cutoff Theorem). Let f be ⋀_{i_l} Ah(i_l) or ⋀_{i_l} Eh(i_l), where h is an LTL\X formula and l ∈ [1..k]. Then we have the following:
∀(n1, ..., nk) ≥ (1, ..., 1) : (U1, ..., Uk)^(n1,...,nk) |= f iff ∀(d1, ..., dk) ≤ (c1, ..., ck) : (U1, ..., Uk)^(d1,...,dk) |= f,
where the cutoff (c1, ..., ck) is given by cl = 2|Ul| + 1, and for i ≠ l : ci = 2|Ui|.

Although the above results yield decidability for PMCP in the conjunctive guards case, the decision procedures are not efficient. We now show that if we limit path quantification to range over infinite paths only (i.e., ignore deadlocked paths), or over finite paths only, then we can give an efficient decision procedure for this version of the PMCP. We use Ainf for "for all infinite paths", Einf for "for some infinite path", Afin for "for all finite paths", and Efin for "for some finite path".

Theorem 5.3 (Infinite Conjunctive Reduction Theorem). For any LTL\X formula h and l ∈ [1..k], we have
(i) ∀(n1, ..., nk) ≥ (1, ..., 1) : (U1, ..., Uk)^(n1,...,nk) |= ⋀_{i_l} Einf h(i_l), iff (U1, ..., Uk)^(c1,...,ck) |= Einf h(1_l);
(ii) ∀(n1, ..., nk) ≥ (1, ..., 1) : (U1, ..., Uk)^(n1,...,nk) |= ⋀_{i_l} Ainf h(i_l), iff (U1, ..., Uk)^(c1,...,ck) |= Ainf h(1_l),
where (c1, ..., ck) = (1, ..., 2, ..., 1), with the 2 in position l.
E. Allen Emerson and Vineet Kahlon
Proof Sketch. To obtain (i), by appeal to symmetry, it suffices to establish that for each (n1, ..., nk) ≥ (1, ..., 1) : (U1, ..., Uk)^(n1,...,nk) |= Einf h(1_l) iff (U1, ..., Uk)^(c1,...,ck) |= Einf h(1_l). Using the duality between Ainf and Einf on both sides of the latter equivalence, we can also appeal to symmetry to obtain (ii). We establish the latter equivalence as follows.

(⇒) Let x = x0 --(b0,g0)--> x1 --(b1,g1)--> ... denote an infinite computation of (U1, ..., Uk)^(n1,...,nk), where bi indicates which process fired the transition driving the system from global state xi to xi+1, and gi is the guard enabling the transition. Since x is infinite, it follows that there exists some process such that projecting x onto that process results in a stuttering of an infinite local computation of the process. By appeal to symmetry, we can, without loss of generality, assume that for each process class Up, if a copy of Up in (U1, ..., Uk)^(n1,...,nk) has the above property, then that copy is in fact the concrete process Up^1 in case p ≠ l, and the concrete process Up^2 in case p = l and local computation x(l, 1) is finite.
Define a (formal) sequence y = y0 --(b'0,g'0)--> y1 --(b'1,g'1)--> ... by projecting each global state xi onto the process-1 coordinate for each class Up with p ≠ l, and onto process coordinates 1 and 2 for process class Ul, to get a state yi. We let b'i = 1_l if bi = 1_l, b'i = 2_l if bi = 2_l, and b'i = ε otherwise, while g'i is the syntactic guard resulting from gi by deleting all conjuncts corresponding to indices not preserved in the projection. Then, by our construction and the fact that x was an infinite computation, we have that y denotes a stuttering of a genuine infinite computation of (U1, ..., Uk)^(c1,...,ck). To see this, note that for any i such that yi ≠ yi+1, the associated (formal) transitions have their guard g'i true (since for conjunctive guards gi and their projections g'i, xi |= gi implies yi |= g'i), and can thus fire in (U1, ..., Uk)^(c1,...,ck). For any stuttering i where yi = yi+1, the (formal) transition is labelled by b'i = ε. Thus, given an infinite computation path of (U1, ..., Uk)^(n1,...,nk), there exists a stuttering of an infinite computation path of (U1, ..., Uk)^(c1,...,ck) such that the local computation path of Ul^1 is the same in both. This path correspondence proves the result.
(⇐) Let y = y0, y1, ... be an infinite computation path of (U1, ..., Uk)^(c1,...,ck). Then consider the sequence of states x = x0, x1, ..., where x(l, 1) = y(l, 1), x(l, 2) = y(l, 2) and ∀(k, j) ≠ (l, 1), (l, 2) : x(k, j) = (i_k)^ω, with i_k the initial state of Uk. Let gi be the guard labelling the transition s_{1_l} → t_{1_l} in state yi. Then all the other processes are in their initial states in xi, and since the guards allow initial states of all template processes as "nonblocking" states, in that their presence in the global state does not falsify any guard, we have xi |= gi. Thus, given an infinite computation path y of (U1, ..., Uk)^(c1,...,ck), there exists an infinite computation path x of (U1, ..., Uk)^(n1,...,nk) such that the local computation path of Ul^1 is the same in both. This path correspondence easily gives us the desired result. □
In a similar fashion, we may prove the following result.

Theorem 5.4 (Finite Conjunctive Reduction Theorem). For any LTL\X formula h and l ∈ [1..k], we have
(i) ∀(n1, ..., nk) ≥ (1, ..., 1) : (U1, ..., Uk)^(n1,...,nk) |= ⋀_{i_l} Efin h(i_l), iff (U1, ..., Uk)^(1,...,1) |= Efin h(1_l);
(ii) ∀(n1, ..., nk) ≥ (1, ..., 1) : (U1, ..., Uk)^(n1,...,nk) |= ⋀_{i_l} Afin h(i_l), iff (U1, ..., Uk)^(1,...,1) |= Afin h(1_l).

Note that the above theorem permits us to verify safety properties efficiently. Informally, this is because if there is a finite path leading to a "bad" state in the system (U1, ..., Uk)^(n1,...,nk), then there exists a finite path leading to a bad state in (U1, ..., Uk)^(1,...,1). Thus, checking that there is no finite path leading to a bad state in (U1, ..., Uk)^(n1,...,nk) reduces to checking it for (U1, ..., Uk)^(1,...,1). We can use this to obtain an Efficient Conjunctive Decidability Theorem. Moreover, the results can be readily extended to formulae with multiple indices, as in the disjunctive guards case.
6 Applications
Here, we consider a solution to the mutual exclusion problem. The template process is given below. Initially, every process is in local state N, the non-critical region.

[Figure: the template process over local states N, T, and C, with transitions guarded by U and G as described below.]

U = T ∨ N ∨ C denotes the universal guard, which is always true independent of the local states of other processes. If a process wants to enter the critical section C, it goes into the trying region T, which it can always do since U is always true. Guard G = N ∨ T, instantiated for process i of n processes, takes the conjunctive form ⋀_{j≠i} (Nj ∨ Tj). When G is true, no other process is in the critical section, and the transition from T to C can be taken. Note that all guards are conjunctive with neutral (i.e., non-blocking) initial state N. Thus, by the Finite Conjunctive Reduction Theorem for multi-indexed properties, PMCP for all sizes n with the mutual exclusion property ⋀_{i,j: i≠j} Afin G¬(Ci ∧ Cj) can be reduced to checking a 2-process instance. Using the Conjunctive Cutoff Theorem, the starvation-freedom property ⋀_i A(G(Ti ⇒ FCi)) can be checked by a 7-process instance. In this simple example, mutual exclusion is maintained but starvation-freedom fails.
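For a small fixed number of processes, the protocol above can be checked by explicit-state reachability. The sketch below is our own encoding (the C → N exit transition under the universal guard is an assumption drawn from the surrounding description, not the paper's formal model); by the Finite Conjunctive Reduction Theorem, the 2-process check suffices to conclude mutual exclusion for every size n:

```python
# Template: local states N (non-critical), T (trying), C (critical).
# N -U-> T and C -U-> N with U always true; T -G-> C with
# G_i = AND over j != i of (N_j or T_j), i.e. nobody else is critical.
def step(state, i):
    """Yield global states reachable by one move of process i."""
    others = state[:i] + state[i+1:]
    if state[i] == 'N':
        yield state[:i] + ('T',) + state[i+1:]
    elif state[i] == 'T' and all(o in ('N', 'T') for o in others):
        yield state[:i] + ('C',) + state[i+1:]
    elif state[i] == 'C':
        yield state[:i] + ('N',) + state[i+1:]

def reachable(n):
    """All global states reachable from the all-N initial state."""
    init = ('N',) * n
    seen, stack = {init}, [init]
    while stack:
        s = stack.pop()
        for i in range(n):
            for t in step(s, i):
                if t not in seen:
                    seen.add(t)
                    stack.append(t)
    return seen

def mutex_ok(n):
    """Mutual exclusion: no reachable state has two critical processes."""
    return all(s.count('C') <= 1 for s in reachable(n))
```

Running `mutex_ok(2)` confirms the safety property; starvation-freedom, being a liveness property, needs the cutoff instance instead.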
7 Concluding Remarks
PMCP is, in general, undecidable [1]. However, under certain restrictions, a variety of positive results have been obtained. Early work includes [15], which uses an abstract graph of exponential size "downstairs" to capture the behaviour of arbitrary-sized parameterized asynchronous programs "upstairs" over Fetch-and-Add primitives; however, while it caters for partial automation, the completeness of the method is not established, and it is not clear that it can be made fully automatic. A semi-automated method requiring construction of a closure process which represents computations of an arbitrary number of processes is described in [4]; it is shown that, if for some k, C||U^k is appropriately bisimilar to C||U^(k+1), then it suffices to check instances of size at most k to solve the PMCP. But it is not shown that such a cutoff k exists, and the method is not guaranteed to be complete. Kurshan and McMillan [13] introduce the related notion of a process invariant (cf. [22]). Ip and Dill [12] describe another approach to dealing with many processes using an abstract graph; it is sound but not guaranteed to be complete; [18] proposes a similar construction for verification of safety properties of cache coherence protocols, which is also sound but not complete. A theme is that most of these methods suffer, first, from the drawback of being only partially automated and hence requiring human ingenuity, and, second, from being sound but not guaranteed complete (i.e., a path "upstairs" maps to a path "downstairs", but paths downstairs do not necessarily lift). Other methods can be fully automated but do not appear to have a clearly defined class of protocols on which they are guaranteed to terminate successfully (cf. [5], [21], [19]). For systems comprised of CCS processes, German and Sistla [10] combine the automata-theoretic method with process closures to permit efficient solution to PMCP for single-index properties, modulo deadlock.
But efficient solution is only yielded for processes in a single class. Even for systems of the form C||U^n a doubly exponential decision procedure results, which likely limits its practical use. Emerson and Namjoshi [7] show that in a single-class (or client-server) synchronous framework, PMCP is decidable but with PSPACE-complete complexity. Moreover, this framework is undecidable in the asynchronous case. A different type of parameterized reasoning, about time bounds, is considered in [9]. In some sense, the closest results might be those of Emerson and Namjoshi [6], who, for the token ring model, reduce reasoning for multi-indexed temporal logic formulae over rings of arbitrary size to rings up to a small cutoff size. These results are significant in that, like ours, correctness over all sizes holds iff correctness of (or up to) the small cutoff size holds. But these results were formulated only for a single process class, and for a restricted version of the token ring model, namely one where the token cannot be used to pass values. Also related are the results of Attie and Emerson [2]. In the context of program synthesis, rather than program verification, it is shown how certain 2-process solutions to synchronization problems can be inflated to n-process solutions. However, the correspondence is not an "iff", but is established in only one direction for conjunctive-type guards. Disjunctive guards are not considered, nor are multiple process classes.

We believe that our positive results on PMCP are significant for several reasons. Because PMCP solves (a major aspect of) the state explosion problem and the scalability problem in one fell swoop, many researchers have attempted to make it more tractable, despite its undecidability in general. Of course, PMCP seems to be prone to undecidability in practice as well, as is evidenced by the wide range of solution methods proposed that are only partially automated, or incomplete, or lack a well-defined domain of applicability. Our methods are fully automated, returning a yes/no answer, and they are sound and complete, as they rely on establishing exact (up to stuttering) correspondences (yes upstairs iff yes downstairs). In many cases, our methods are efficient, making the problem genuinely tractable. An additional advantage is that downstairs we have a small system of cutoff size that, but for its size, looks like a system of size n. This contrasts with methods that construct an abstract graph downstairs which may have a complex and non-obvious organization.
References

1. K. Apt and D. Kozen. Limits for automatic verification of finite-state concurrent systems. Information Processing Letters, 15, pages 307-309, 1986.
2. P.C. Attie and E.A. Emerson. Synthesis of Concurrent Systems with Many Similar Processes. ACM Transactions on Programming Languages and Systems, Vol. 20, No. 1, January 1998, pages 51-115.
3. M.C. Browne, E.M. Clarke and O. Grumberg. Reasoning about Networks with Many Identical Finite State Processes. Information and Control, 81(1), pages 13-31, April 1989.
4. E.M. Clarke and O. Grumberg. Avoiding the State Explosion Problem in Temporal Logic Model Checking Algorithms. In Proceedings of the Sixth Annual ACM Symposium on Principles of Distributed Computing, pages 294-303, 1987.
5. E.M. Clarke, O. Grumberg and S. Jha. Verifying Parameterized Networks using Abstraction and Regular Languages. In CONCUR '95: Concurrency Theory, Proceedings of the 6th International Conference, LNCS 962, pages 395-407, Springer-Verlag, 1995.
6. E.A. Emerson and K.S. Namjoshi. Reasoning about Rings. In Conference Record of POPL '95: 22nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 85-94, 1995.
7. E.A. Emerson and K.S. Namjoshi. Automatic Verification of Parameterized Synchronous Systems. In Computer Aided Verification, Proceedings of the 8th International Conference, LNCS, Springer-Verlag, 1996.
8. E.A. Emerson and A.P. Sistla. Symmetry and Model Checking. In Computer Aided Verification, Proceedings of the 5th International Conference, LNCS 697, Springer-Verlag, 1993.
9. E. Emerson and R. Trefler. Parametric Quantitative Temporal Reasoning. In LICS 1999, pages 336-343.
10. S.M. German and A.P. Sistla. Reasoning about Systems with Many Processes. J. ACM, 39(3), July 1992.
11. C. Ip and D. Dill. Better verification through symmetry. In Proceedings of the 11th International Symposium on Computer Hardware Description Languages and their Applications, 1993.
12. C. Ip and D. Dill. Verifying Systems with Replicated Components in Murphi. In CAV 1996, pages 147-158.
13. R.P. Kurshan and L. McMillan. A Structural Induction Theorem for Processes. In Proceedings of the Eighth Annual ACM Symposium on Principles of Distributed Computing, pages 239-247, 1989.
14. O. Lichtenstein and A. Pnueli. Checking that finite state concurrent programs satisfy their linear specifications. In Conference Record of POPL '85: 12th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 97-107, 1985.
15. B. Lubachevsky. An Approach to Automating the Verification of Compact Parallel Coordination Programs I. Acta Informatica 21, 1984.
16. K. McMillan. Verification of Infinite State Systems by Compositional Model Checking. In CHARME '99.
17. A. Pnueli. The Temporal Logic of Programs. In Proceedings of the Eighteenth Symposium on Foundations of Computer Science, 1977.
18. F. Pong and M. Dubois. A New Approach for the Verification of Cache Coherence Protocols. IEEE Transactions on Parallel and Distributed Systems, August 1995.
19. A.P. Sistla. Parameterized Verification of Linear Networks Using Automata as Invariants. In CAV 1997, pages 412-423.
20. M. Vardi and P. Wolper. An Automata-theoretic Approach to Automatic Program Verification. In Proceedings, Symposium on Logic in Computer Science, pages 332-344, 1986.
21. I. Vernier. Specification and Verification of Parameterized Parallel Programs. In Proceedings of the 8th International Symposium on Computer and Information Sciences, Istanbul, Turkey, pages 622-625, 1993.
22. P. Wolper and V. Lovinfosse. Verifying Properties of Large Sets of Processes with Network Invariants. In J. Sifakis (ed.), Automatic Verification Methods for Finite State Systems, LNCS 407, Springer-Verlag, 1989.
Simulation Based Minimization

Doron Bustan and Orna Grumberg
Computer Science Dept., Technion, Haifa 32000, Israel
[email protected]
Abstract. This¹ work presents a minimization algorithm. The algorithm receives a Kripke structure M and returns the smallest structure that is simulation equivalent to M. The simulation equivalence relation is weaker than bisimulation but stronger than the simulation preorder. It strongly preserves ACTL and LTL (as sub-logics of ACTL*). We show that every structure M has a unique (up to isomorphism) reduced structure that is simulation equivalent to M and smallest in size. We give a Minimizing Algorithm that constructs the reduced structure. It first constructs the quotient structure for M, then eliminates transitions to little brothers, and finally deletes unreachable states. The first step has maximal space requirements, since it is based on the simulation preorder over M. To reduce these requirements we suggest the Partitioning Algorithm, which constructs the quotient structure for M without ever building the simulation preorder. The Partitioning Algorithm has a better space complexity but might have worse time complexity.
1
Introduction
Temporal logic model checking is a method for verifying finite-state systems with respect to propositional temporal logic specifications. The method is fully automatic and quite efficient in time, but is limited by its high space requirements. Many approaches to beating the state explosion problem of model checking have been suggested, including abstraction, partial order reduction, modular methods, and symmetry ([8]). All are aimed at reducing the size of the model (or Kripke structure) to which model checking is applied, thus extending its applicability to larger systems. Abstraction methods, for instance, hide some of the irrelevant details of a system and then construct a reduced structure. The abstraction is required to be weakly preserving, meaning that if a property is true for the abstract structure then it is also true for the original one. Sometimes we require the abstraction to be strongly preserving so that, in addition, a property that is false for the abstract structure is also false for the original one. In a similar manner, for modular model checking we construct a reduced abstract environment for a part of the system that we wish to verify. In this case
¹ The full version of this paper, including proofs of correctness, can be found in [6].
D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 255-270, 2000. © Springer-Verlag Berlin Heidelberg 2000
as well, properties that are true (false) of the abstract environment should be true (false) of the real environment.

It is common to define equivalence relations or preorders on structures in order to reflect strong or weak preservation of various logics. For example, language equivalence (containment) strongly (weakly) preserves the linear-time temporal logic LTL. Other relations that are widely used are the bisimulation equivalence [15] and the simulation preorder [14]. The former guarantees strong preservation of branching-time temporal logics such as CTL and CTL* [7]. The latter guarantees weak preservation of the universal fragment of these logics (ACTL and ACTL* [10]). Bisimulation has the advantage of preserving more expressive logics. However, this is also a disadvantage, since it requires the abstract structure to be too similar to the original one, thus allowing less powerful reductions. The simulation preorder, on the other hand, allows more powerful reductions, but it provides only weak preservation. Language equivalence provides strong preservation and large reduction; however, its complexity is exponential, while the complexity of computing bisimulation and simulation is polynomial.

In this paper we investigate the simulation equivalence relation, which is weaker than bisimulation but stronger than the simulation preorder and language equivalence. Simulation equivalence strongly preserves ACTL*, and also strongly preserves LTL and ACTL as sublogics of ACTL*. Both ACTL and LTL are widely used for model checking in practice. As an equivalence relation that is weaker than bisimulation, it admits a smaller minimized structure. For example, the structure in part 2 of Figure 1 is minimized with respect to simulation equivalence. In comparison, the minimized structure with respect to bisimulation is the structure in part 1 of Figure 1, and the minimized structure with respect to language equivalence is the structure in part 3 of Figure 1.
[Figure: three structures over state labels a-e, minimized with respect to (1) bisimulation, (2) simulation equivalence, and (3) language equivalence.]

Fig. 1. Different minimized structures with respect to different equivalence relations
Given a Kripke structure M, we would like to find a structure M′ that is simulation equivalent to M and is the smallest in size (number of states and transitions). For bisimulation this can be done by constructing the quotient structure, in which the states are the equivalence classes with respect to bisimulation. Bisimulation has the property that if one state in a class has a successor in another class, then all states in the class have a successor in the other class. Thus, in the quotient structure there will be a transition between two classes if every (some) state in one class has a successor in the other. The resulting structure is the smallest in size that is bisimulation equivalent to the given structure M.

The quotient structure for simulation equivalence can be constructed in a similar manner. There are two main difficulties, however. First, it is not true that all states in an equivalence class have successors in the same classes. As a result, if we define a transition between classes whenever all states of one have a successor in the other, then we get the ∀-quotient structure. If, on the other hand, we have a transition between classes if there exists a state of one with a successor in the other, then we get the ∃-quotient structure. Both structures are simulation equivalent to M, but the ∀-quotient structure has fewer transitions and is therefore preferable. The other difficulty is that the quotient model for simulation equivalence is not the smallest in size. Actually, it is not even clear that there is a unique smallest structure that is simulation equivalent to M.

The first result in this paper is showing that every structure has a unique (up to isomorphism) smallest structure that is simulation equivalent to it. This structure is reduced, meaning that it contains no simulation equivalent states, no little brothers (states that are smaller by the simulation preorder than one of their brothers), and no unreachable states.
Our next result is the Minimizing Algorithm, which, given a structure M, constructs the reduced structure for M. Based on the maximal simulation relation over M, the algorithm first builds the ∀-quotient structure with respect to simulation equivalence. Then it eliminates transitions to little brothers. Finally, it removes unreachable states. The time complexity of the algorithm is O(|S|³). Its space complexity is O(|S|²), which is due to the need to hold the simulation preorder in memory.

Since our main concern is space requirements, we suggest the Partitioning Algorithm, which computes the quotient structure without ever computing the simulation preorder. Similarly to [13], the algorithm starts with a partition Σ0 of the state space into classes whose states are equally labeled. It also initializes a preorder H0 over the classes in Σ0. At iteration i + 1, Σi+1 is constructed by splitting classes in Σi. The relation Hi+1 is updated based on Σi, Σi+1 and Hi. When the algorithm terminates (after k iterations), Σk is the set of equivalence classes with respect to simulation equivalence. These classes form the states of the quotient structure. The final Hk is the maximal simulation preorder over the states of the quotient structure. Thus, the Partitioning Algorithm replaces the first step of the Minimizing Algorithm. Since every step in the Minimizing
Algorithm further reduces the size of the initial structure, the first step handles the largest structure. Therefore, improving its complexity most influences the overall complexity of the algorithm. The space complexity of the Partitioning Algorithm is O(|Σk|² + |S| · log(|Σk|)). We assume that in most cases |Σk| ≪ |S|; thus this complexity is significantly smaller than that of the Minimizing Algorithm. Unfortunately, the time complexity will probably become worse (depending on the size of Σk). It is bounded by O(|S|² · |Σk|² · (|Σk|² + |R|)). However, since our main concern is the reduction in memory requirements, the Partitioning Algorithm is valuable.

Other works also suggest minimization algorithms. In [13], the quotient structure with respect to bisimulation is constructed without first building the bisimulation relation. We follow a similar approach. However, in our case states may remain in the same class even when they do not have successors in the same classes. Thus, our analysis is more complicated and requires both Σi and Hi. Symbolic bisimulation minimization is suggested in [5]. In [4] a minimized structure with respect to bisimulation is generated directly out of the text. In [9] a bisimulation minimization is applied to the intersection of the system automaton and the specification automaton; the algorithm from [13] is used. [12] shows that eliminating little brothers results in a simulation equivalent structure. However, the paper does not consider the minimization problem. Several works minimize a structure in a compositional way, preserving language containment [2] or a given CTL formula [1]. Minimizing with respect to a given formula may result in a more powerful reduction; however, it requires determining the checked formula in advance.

The rest of the paper is organized as follows. Section 2 gives our basic definitions. Section 3 defines reduced structures and shows that every structure has a unique simulation equivalent reduced structure.
Section 4 presents the Minimizing Algorithm. Finally, Section 5 describes the Partitioning Algorithm and discusses its space and time complexity.
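The three phases of the Minimizing Algorithm outlined above (∀-quotient, little-brother elimination, removal of unreachable states) can be sketched as follows. This is an illustrative encoding of ours, assuming the maximal simulation preorder H over M is already available as a set of pairs (s, t) meaning "s is simulated by t":

```python
def minimize(S, R, s0, H):
    """Reduce a structure with states S, successor map R, initial state
    s0, given the maximal simulation preorder H over S x S."""
    sim_eq = lambda s, t: (s, t) in H and (t, s) in H
    # Step 1: quotient by simulation equivalence, with forall-transitions.
    classes = []
    for s in S:
        for c in classes:
            if sim_eq(s, c[0]):
                c.append(s)
                break
        else:
            classes.append([s])
    cls = {s: tuple(sorted(c)) for c in classes for s in c}
    Sq = sorted(set(cls.values()))
    Rq = {(a, b) for a in Sq for b in Sq
          if all(any(cls[t] == b for t in R[s]) for s in a)}
    # Step 2: drop transitions to little brothers, i.e. to a class b
    # strictly below a sibling class b2 in the simulation preorder.
    below = lambda b, b2: (b[0], b2[0]) in H and (b2[0], b[0]) not in H
    Rq = {(a, b) for (a, b) in Rq
          if not any(below(b, b2) for (a2, b2) in Rq if a2 == a)}
    # Step 3: keep only classes reachable from the initial class.
    reach, stack = {cls[s0]}, [cls[s0]]
    while stack:
        a = stack.pop()
        for (a2, b) in Rq:
            if a2 == a and b not in reach:
                reach.add(b)
                stack.append(b)
    return reach, {(a, b) for (a, b) in Rq if a in reach}
```

On a structure with two simulation equivalent states, the pair collapses into a single class, illustrating the size reduction each step promises.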
2 Preliminaries
Let AP be a set of atomic propositions. A Kripke structure M over AP is a four-tuple M = (S, s0, R, L), where S is a finite set of states; s0 ∈ S is the initial state; R ⊆ S × S is the transition relation, which must be total, i.e., for every state s ∈ S there is a state s′ ∈ S such that R(s, s′); and L : S → 2^AP is a function that labels each state with the set of atomic propositions true in that state. The size |M| of a Kripke structure M is the pair (|S|, |R|). We say that |M| ≤ |M′| if |S| < |S′|, or |S| = |S′| and |R| ≤ |R′|.

Given two structures M and M′ over AP, a relation H ⊆ S × S′ is a simulation relation [14] over M × M′ iff the following conditions hold:
1. (s0, s′0) ∈ H.
2. For all (s, s′) ∈ H, L(s) = L′(s′) and ∀t[(s, t) ∈ R → ∃t′[(s′, t′) ∈ R′ ∧ (t, t′) ∈ H]].
We say that M′ simulates M (denoted by M ⪯ M′) if there exists a simulation relation H over M × M′.

The logic ACTL* [10] is the universal fragment of the powerful branching-time logic CTL*. ACTL* consists of the temporal operators X (next-time), U (until) and R (release) and the universal path quantifier A (for all paths). For lack of space the formal definition is omitted; it can be found in [8]. The following lemma and theorem have been proven in [10].

Lemma 1. ⪯ is a preorder on the set of structures.

Theorem 2. Suppose M ⪯ M′. Then for every ACTL* formula f, M′ |= f implies M |= f.

Given two Kripke structures M, M′, we say that M is simulation equivalent to M′ iff M ⪯ M′ and M′ ⪯ M. It is easy to see that this is an equivalence relation. By Theorem 2, if M and M′ are simulation equivalent then they are equivalent with respect to ACTL*. However, they are not necessarily equivalent with respect to CTL*.

A simulation relation H over M × M′ is maximal iff for all simulation relations H′ over M × M′, H′ ⊆ H. In [10] it has been shown that if there is a simulation relation over M × M′ then there is a unique maximal simulation over M × M′.
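The maximal simulation relation can be computed as a greatest fixpoint: start from all label-compatible pairs and repeatedly delete pairs that violate the successor condition. A minimal sketch (our own encoding, with each structure given as a successor map and a label map; this is the naive refinement scheme, not a specific algorithm from the paper):

```python
def max_simulation(R1, L1, R2, L2):
    """Largest H with (s, t) in H iff L1[s] == L2[t] and every
    R1-successor of s is matched by some R2-successor of t within H."""
    H = {(s, t) for s in R1 for t in R2 if L1[s] == L2[t]}
    changed = True
    while changed:
        changed = False
        for (s, t) in list(H):
            # every move of s must be answered by some move of t
            if not all(any((u, v) in H for v in R2[t]) for u in R1[s]):
                H.discard((s, t))
                changed = True
    return H
```

M ⪯ M′ then holds iff the pair of initial states appears in the computed relation.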
3 The Reduced Structure
Given a Kripke structure M, we would like to find a reduced structure that is simulation equivalent to M and smallest in size. In this section we show that a reduced structure always exists. Furthermore, we show that all reduced structures of M are isomorphic to each other.

Let M be a Kripke structure. The maximal simulation relation over M × M always exists and is denoted by HM. We need the following two definitions in order to characterize reduced structures. Two states s1, s2 ∈ M are simulation equivalent iff (s1, s2) ∈ HM and (s2, s1) ∈ HM. A state s1 is a little brother of a state s2 iff there exists a state s3 such that:
– (s3, s2) ∈ R and (s3, s1) ∈ R.
– (s1, s2) ∈ HM and (s2, s1) ∉ HM.

Definition 3. A Kripke structure M is reduced if:
1. There are no simulation equivalent states in M.
2. There are no states s1, s2 such that s1 is a little brother of s2.
3. All states in M are reachable from s0.

Theorem 4. Let M and M′ be two reduced Kripke structures. Then the following two statements are equivalent:
1. M and M′ are simulation equivalent.
2. M and M′ are isomorphic.

The proof that 2 implies 1 is straightforward. In the rest of this section we assume that M and M′ are reduced Kripke structures. We will show that if M ⪯ M′ and M′ ⪯ M then M and M′ are isomorphic. We use H_MM′ and H_M′M to denote the maximal simulation relations over M × M′ and M′ × M respectively. The composed relation H_MM′M ⊆ S × S is defined by H_MM′M = {(s1, s2) | ∃s′ ∈ S′. (s1, s′) ∈ H_MM′ ∧ (s′, s2) ∈ H_M′M}.

Lemma 5. The composed relation H_MM′M is a simulation relation.

For the reduced Kripke structures M and M′, we define the matching relation f ⊆ S′ × S as follows: (s′, s) ∈ f iff (s′, s) ∈ H_M′M and (s, s′) ∈ H_MM′. We show that f is an isomorphism between M′ and M, i.e., f is a one-to-one and onto total function that preserves the state labeling and the transition relation.

Lemma 6. Let f ⊆ S′ × S be the matching relation. Then f is a one-to-one, onto, and total function from S′ to S.

Proof Sketch: First we prove that f is a function from S′ to S. Assume to the contrary that there are distinct states s1, s2 ∈ S and s′ ∈ S′ such that (s′, s1) ∈ f and (s′, s2) ∈ f. We show that (s1, s2) ∈ H_MM′M and (s2, s1) ∈ H_MM′M. Since H_MM′M is included in HM, this contradicts the assumption that M is reduced. The proof that f⁻¹ is a function from S to S′ is similar. Thus, we conclude that f is one-to-one. Next, we prove that f is onto, i.e., for every state s in S there exists a state s′ in S′ such that (s′, s) ∈ f. The proof is by induction on the distance of s ∈ S from the initial state (since all states are reachable, the distance is bounded by |S|). Again we use the composed relation H_MM′M to show that if f is not onto then M′ is not reduced. Similarly, we can show that f⁻¹ is onto and therefore f is total. □

Lemma 7. For all s′ ∈ S′, L′(s′) = L(f(s′)).
Furthermore, for all s′1, s′2 ∈ S′, (s′1, s′2) ∈ R′ iff (f(s′1), f(s′2)) ∈ R. Thus, we conclude Theorem 4.

Theorem 8. Let M be a non-reduced Kripke structure. Then there exists a reduced Kripke structure M′ such that M and M′ are simulation equivalent and |M′| < |M|.

In order to prove Theorem 8, we present in the next sections an algorithm that receives a Kripke structure M and computes a reduced Kripke structure M′, which is simulation equivalent to M, such that |M′| ≤ |M|. Moreover, if M is not reduced then |M′| < |M|.

Lemma 9. Let M′ be a reduced Kripke structure. For every M that is simulation equivalent to M′, if M and M′ are not isomorphic then |M′| < |M|.
Simulation Based Minimization
4 The Minimizing Algorithm
In this section we present the Minimizing Algorithm, which receives a Kripke structure M and computes a reduced Kripke structure M′ that is simulation equivalent to M and satisfies |M′| ≤ |M|. If M is not reduced then |M′| < |M|. The algorithm consists of three steps. First, a quotient structure is constructed in order to eliminate equivalent states. The resulting quotient model is simulation equivalent to M but may not be reduced. The next step disconnects little brothers and the last one removes all unreachable states. In each step of the algorithm, if the resulting structure differs from the original one then it is strictly smaller than the original structure.
4.1 The ∀-Quotient Structure
In order to compute a simulation equivalent structure that contains no equivalent states, we compute the ∀-quotient structure with respect to the simulation equivalence relation. We fix M to be the original Kripke structure. We denote by [s] the equivalence class which includes s.
Definition 10. The ∀-quotient structure Mq = <Sq, Rq, s0q, Lq> of M is defined as follows:
– Sq is the set of equivalence classes of the simulation equivalence. (We will use Greek letters to represent equivalence classes.)
– Rq = {(α1, α2) | ∀s1 ∈ α1 ∃s2 ∈ α2. (s1, s2) ∈ R}
– s0q = [s0].
– Lq([s]) = L(s).
The transitions in Mq are ∀-transitions, in which there is a transition between two equivalence classes iff every state of the one has a successor in the other. We could also define ∃-transitions, in which there is a transition between classes if there exists a state in one with a successor in the other. Both definitions result in a simulation equivalent structure. However, the former has a smaller transition relation and is therefore preferable. Note that |Sq| ≤ |S| and |Rq| ≤ |R|. If |Sq| = |S|, then every equivalence class contains a single state. In this case, Rq is identical to R and Mq is isomorphic to M. Thus, when M and Mq are not isomorphic, |Sq| < |S|. Next, we show that M and Mq are simulation equivalent.
Definition 11. Let G ⊆ S be a set of states. A state sm ∈ G is maximal in G iff there is no state s ∈ G such that (sm, s) ∈ H_M and (s, sm) ∉ H_M.
Definition 12. Let α be a state of Mq, and t1 a successor of some state in α. The set G(α, t1) is defined as follows: G(α, t1) = {t2 ∈ S | ∃s2 ∈ α. (s2, t2) ∈ R ∧ (t1, t2) ∈ H_M}.
Doron Bustan and Orna Grumberg
Intuitively, G(α, t1) is the set of states that are greater than t1 and are successors of states in α. Notice that since all states in α are simulation equivalent, every state in α has at least one successor in G(α, t1).
Lemma 13. Let α, t1 be as defined in Definition 12. Then for every maximal state tm in G(α, t1), [tm] is a successor of α.
Proof: Let tm be a maximal state in G(α, t1), and let sm ∈ α be a state such that tm is a successor of sm. We prove that every state s ∈ α has a successor t ∈ [tm], which implies that [tm] is a successor of α. s, sm ∈ α implies (sm, s) ∈ H_M. This implies that there exists a successor t of s such that (tm, t) ∈ H_M. By transitivity of the simulation relation, (t1, t) ∈ H_M. Thus t ∈ G(α, t1). Since tm is maximal in G(α, t1), (t, tm) ∈ H_M. Thus, t and tm are simulation equivalent and t ∈ [tm]. □
Theorem 14. The structures M and Mq are simulation equivalent.
Proof Sketch: It is straightforward to show that H′ = {(α, s) | s ∈ α} is a simulation relation over Mq × M. Thus, Mq ≤ M. In order to prove that M ≤ Mq we choose H′ = {(s1, α) | there exists a state s2 ∈ α such that (s1, s2) ∈ H_M}. Clearly, (s0, s0q) ∈ H′ and for all (s, α) ∈ H′, L(s) = Lq(α). Assume (s1, α1) ∈ H′ and let t1 be a successor of s1. We prove that there exists a successor α2 of α1 such that (t1, α2) ∈ H′. We distinguish between two cases:
1. s1 ∈ α1. Let tm be a maximal state in G(α1, t1); then Lemma 13 implies that (α1, [tm]) ∈ Rq. Since tm is maximal in G(α1, t1), (t1, tm) ∈ H_M, which implies (t1, [tm]) ∈ H′.
2. s1 ∉ α1. Let s2 ∈ α1 be a state such that (s1, s2) ∈ H_M. Since (s1, s2) ∈ H_M there is a successor t2 of s2 such that (t1, t2) ∈ H_M. The first case implies that there exists an equivalence class α2 such that (α1, α2) ∈ Rq and (t2, α2) ∈ H′. By (t2, α2) ∈ H′ we have that there exists a state t3 ∈ α2 such that (t2, t3) ∈ H_M. By transitivity of simulation, (t1, t3) ∈ H_M.
Thus, (t1, α2) ∈ H′. □
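The ∀-transitions of Definition 10 translate directly into code. Below is a minimal explicit-state sketch (our own illustration, not the authors' implementation; all names are ours) that, given the simulation equivalence classes and the transition relation R, computes Rq:

```python
def quotient(classes, R):
    """Build the forall-quotient transition relation Rq (Definition 10).

    classes: list of frozensets partitioning S into simulation
    equivalence classes; R: set of transitions (s, t).
    A pair (i, j) of class indices is in Rq iff every state of
    classes[i] has at least one successor inside classes[j].
    """
    succ = {}
    for s, t in R:
        succ.setdefault(s, set()).add(t)
    Rq = set()
    for i, a1 in enumerate(classes):
        for j, a2 in enumerate(classes):
            if all(succ.get(s, set()) & a2 for s in a1):
                Rq.add((i, j))
    return Rq
```

Note the ∀ in the definition: a single state of α1 without a successor in α2 suppresses the quotient transition, which is exactly why Rq is never larger than the ∃-variant.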
4.2 Disconnecting Little Brothers
Our next step is to disconnect the little brothers from their fathers. As a result of applying this step to a Kripke structure M with no equivalent states, we get a Kripke structure M′ satisfying:
1. M and M′ are simulation equivalent.
2. There are no equivalent states in M′.
3. There are no little brothers in M′.
4. |M′| ≤ |M|, and if M and M′ are not identical, then |M′| < |M|.
change := true
while (change = true) do
    Compute the maximal simulation relation H_M
    change := false
    If there are s1, s2, s3 ∈ S such that s1 is a little brother of s2
            and s3 is the father of both s1 and s2 then
        change := true
        R := R \ {(s3, s1)}
    end
end
Fig. 2. The Disconnecting Algorithm.
In Figure 2 we present an iterative algorithm which disconnects little brothers and results in M′. Since in each iteration of the algorithm one edge is removed, the algorithm terminates after at most |R| iterations. We will show that the resulting structure is simulation equivalent to the original one.
Lemma 15. Let M′ = <S′, R′, s′0, L′> be the result of the Disconnecting Algorithm on M. Then M and M′ are simulation equivalent.
Proof Sketch: We prove the lemma by induction on the number of iterations. Base: at the beginning, M and M are simulation equivalent. Induction step: Let M″ be the result of the first i iterations and H″ be the maximal simulation over M″ × M″. Let M′ be the result of the (i+1)th iteration, where R′ = R″ \ {(s″1, s″2)}. Assume that M and M″ are simulation equivalent. It is straightforward to see that H′ = {(s′1, s″2) | (s″1, s″2) ∈ H″} is a simulation relation over M′ × M″. Thus, M′ ≤ M″. To show that M″ ≤ M′ we prove that H′ = {(s″1, s′2) | (s″1, s″2) ∈ H″} is a simulation relation. Clearly, (s″0, s′0) ∈ H′ and for all (s″1, s′2) ∈ H′, L″(s″1) = L′(s′2). Suppose (s″1, s′2) ∈ H′ and t″1 is a successor of s″1. Since H″ is a simulation relation, there exists a successor t″2 of s″2 such that (t″1, t″2) ∈ H″. This implies that (t″1, t′2) ∈ H′. If (s′2, t′2) ∈ R′ then we are done. Otherwise, (s″2, t″2) was removed from R″ because t″2 is a little brother of some successor t″3 of s″2. Since (s″2, t″2) is the only edge removed at the (i+1)th iteration, (s′2, t′3) ∈ R′. Because t″2 is a little brother of t″3, (t″2, t″3) ∈ H″. By transitivity of the simulation relation, (t″1, t″3) ∈ H″, and thus (t″1, t′3) ∈ H′. □
We have proved that the result M′ of the Disconnecting Algorithm is simulation equivalent to the original structure M. Note that M′ has the same set of states as M.
We now show that the maximal simulation relation over M is identical to the maximal simulation relations of all intermediate structures M″ (including M′) computed by the Disconnecting Algorithm. Since there are no simulation equivalent states in M, there are no such states in M′ either.
Lemma 16. Let M′ = <S, R′, s0, L> be the result of the Disconnecting Algorithm on M and let H′ ⊆ S × S be the maximal simulation over M′ × M′. Then H_M = H′.
The lemma is proved by induction on the number of iterations. As a result of the last lemma, the Disconnecting Algorithm can be simplified significantly. The maximal simulation relation is computed once on the original structure M and is used in all iterations. If the algorithm is executed symbolically (with BDDs) then this operation can be performed efficiently in one step:
R′ = R \ {(s1, s2) | ∃s3. (s1, s3) ∈ R ∧ (s2, s3) ∈ H_M ∧ (s3, s2) ∉ H_M}.
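The one-step symbolic formula above has a direct explicit-state analogue. The following sketch (our own illustration, not the authors' BDD implementation) removes every little-brother edge in a single pass, given R and the maximal simulation H_M as sets of pairs:

```python
def disconnect_little_brothers(R, H):
    """One-step little-brother removal (explicit-state sketch).

    R: set of transitions (s, t); H: maximal simulation as a set of
    pairs (s, t) meaning s is simulated by t.  An edge (s1, s2) is
    dropped exactly when s1 also has a successor s3 that strictly
    dominates s2: (s2, s3) in H but (s3, s2) not in H.
    """
    return {
        (s1, s2)
        for (s1, s2) in R
        if not any(
            (s1, s3) in R and (s2, s3) in H and (s3, s2) not in H
            for s3 in {t for (s, t) in R if s == s1}
        )
    }
```

Because H is computed once on the original structure (Lemma 16), no recomputation is needed between removals.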
4.3 The Algorithm
We now present our algorithm for constructing the reduced structure for a given one.
1. Compute the ∀-quotient structure Mq of M and the maximal simulation relation H_M over Mq × Mq.
2. R′ = Rq \ {(s1, s2) | ∃s3. (s1, s3) ∈ Rq ∧ (s2, s3) ∈ H_M}
3. Remove all unreachable states.
Fig. 3. The Minimizing Algorithm
Note that in the second step we eliminate the check (s3, s2) ∉ H_M. This is based on the fact that Mq does not contain simulation equivalent states. Removing unreachable states does not change the properties of simulation with respect to the initial states. The size of the resulting structure is equal to or smaller than that of the original one. As in the first two steps of the algorithm, if the resulting structure is not identical to the original then it is strictly smaller in size. We have proved that the result M′ of the Minimizing Algorithm is simulation equivalent to the original structure M. Thus we can conclude that Theorem 8 is correct. Figure 4 presents an example of the three steps of the Minimizing Algorithm applied to a Kripke structure.
1. Part 1 contains the original structure, where the maximal simulation relation is (not including the trivial pairs): {(2, 3), (3, 2), (11, 2), (11, 3), (4, 5), (6, 5), (7, 8), (8, 7), (9, 10), (10, 9)}. The equivalence classes are: {{1}, {2, 3}, {11}, {4}, {5}, {6}, {7, 8}, {9, 10}}.
2. Part 2 presents the ∀-quotient structure Mq. The maximal simulation relation H_M is (not including the trivial pairs): H_M = {({11}, {2, 3}), ({4}, {5}), ({6}, {5})}.
3. {11} is a little brother of {2, 3} and {1} is their father. Part 3 presents the structure after the removal of the edge ({1}, {11}).
Fig. 4. An example of the Minimizing Algorithm
4. Finally, part 4 contains the reduced structure, obtained by removing the unreachable states.
4.4 Complexity
The complexity of each step of the algorithm depends on the size of the Kripke structure resulting from the previous step. In the worst case the Kripke structure does not change, so all three steps depend on the original Kripke structure. Let M be the given structure. We analyze each step separately (a naive analysis):
1. First, the algorithm constructs the equivalence classes. To do that it needs to compute the maximal simulation relation. [3,11] showed that this can be done in time O(|S| · |R|). Once the algorithm has the simulation relation, the equivalence classes can be constructed in time O(|S|²). Next, the algorithm constructs the transition relation. This can be done in time O(|S| + |R|). As a whole, building the quotient structure can be done in time O(|S| · |R|).
2. Disconnecting little brothers can be done in O(|S|³).
3. Removing unreachable states can be done in O(|R|).
As a whole, the algorithm works in time O(|S|³). The space bottleneck of the algorithm is the computation of the maximal simulation relation, whose size is bounded by |S|².
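Step 3 is plain graph reachability. A possible O(|S| + |R|) sketch (representation and names are ours), assuming R is given as a successor map:

```python
from collections import deque

def reachable_states(succ, s0):
    """BFS from the initial state; O(|S| + |R|) time.

    succ: dict mapping each state to an iterable of its successors.
    Returns the set of states reachable from s0; restricting the
    structure to this set is step 3 of the Minimizing Algorithm.
    """
    seen = {s0}
    queue = deque([s0])
    while queue:
        s = queue.popleft()
        for t in succ.get(s, ()):
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return seen
```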
5 Partition Classes
In the previous section, we presented the Minimizing Algorithm. The algorithm consists of three steps, each of which results in a structure that is smaller in size. Since the first step handles the largest structure, improving its complexity will have the greatest influence on the overall complexity of the algorithm. In this section we suggest an alternative algorithm for computing the set of equivalence classes. The algorithm avoids the construction of the simulation relation over the original structure. As a result, it has a better space complexity, but its time complexity is worse. Since the purpose of the Minimizing Algorithm is to reduce space requirements, it is more important to reduce its own space requirements.
5.1 The Partitioning Algorithm
Given a structure M, we would like to build the equivalence classes of the simulation equivalence relation without first calculating H_M. Our algorithm, called the Partitioning Algorithm, starts with a partition Σ0 of S into classes. The classes in Σ0 differ from one another only by their state labeling. In each iteration, the algorithm refines the partition and forms a new set of classes. We use Σi to denote the set of classes obtained after i iterations. In order to refine the partitions we build an ordering relation Hi over Σi × Σi, which is updated in every iteration according to the previous and current partitions (Σi−1 and Σi) and the previous ordering relation (Hi−1). Initially, H0 includes only the identity pairs (of classes). In the algorithm, we use succ(s) for the set of successors of s. Whenever Σi is clear from the context, [s] is used for the equivalence class of s. We also use a function Π that associates with each class α ∈ Σi the set of classes α′ ∈ Σi−1 that contain a successor of some state in α:
Π(α) = {[t]_{i−1} | ∃s ∈ α. (s, t) ∈ R}
We use English letters to denote states, capital English letters to denote sets of states, Greek letters to denote equivalence classes, and capital Greek letters to denote sets of equivalence classes. The Partitioning Algorithm is presented in Figure 5.
Definition 17. The partial order ≤i on S is defined by: s1 ≤i s2 iff L(s1) = L(s2) and, if i > 0, ∀t1 [(s1, t1) ∈ R → ∃t2 [(s2, t2) ∈ R ∧ ([t1], [t2]) ∈ Hi−1]]. In the case i = 0, s1 ≤0 s2 iff L(s1) = L(s2). Two states s1, s2 are i-equivalent iff s1 ≤i s2 and s2 ≤i s1.
In the rest of this section we explain how the algorithm works. There are three invariants which are preserved during the execution of the algorithm.
Invariant 1: For all states s1, s2 ∈ S, s1 and s2 are in the same class α ∈ Σi iff s1 and s2 are i-equivalent.
Invariant 2: For all states s1, s2 ∈ S, s1 ≤i s2 iff ([s1], [s2]) ∈ Hi.
Invariant 3: Hi is transitive.
Σi is a set of equivalence classes with respect to the i-equivalence relation. In the ith iteration we split the equivalence classes of Σi−1 so that only states that are i-equivalent remain in the same class. A class α ∈ Σi−1 is repeatedly split by choosing an arbitrary state sp ∈ α (called the splitter) and identifying the states in α that are i-equivalent to sp. These states form an i-equivalence class α′ that is inserted into Σi. α′ is constructed in two steps. First we calculate the set of states GT ⊆ α that contains all states sg such that sp ≤i sg. Next we calculate the set of states LT ⊆ α that contains all states sl such that sl ≤i sp. The states in the intersection of GT and LT are the states in α that are i-equivalent to sp. Hi captures the partial order ≤i, i.e., s1 ≤i s2 iff ([s1], [s2]) ∈ Hi. Note that the sequence ≤0, ≤1, . . . satisfies ≤0 ⊇ ≤1 ⊇ ≤2 ⊇ . . .. Therefore, if s1 ≤i s2 then
Initialize the algorithm:
    change := true
    for each label a ∈ 2^AP construct α_a ∈ Σ0 such that s ∈ α_a ⇔ L(s) = a.
    H0 = {(α, α) | α ∈ Σ0}
while change = true do begin
    change := false
    refine Σ:
        Σ_{i+1} := ∅
        for each α ∈ Σi do begin
            while α ≠ ∅ do begin
                choose sp such that sp ∈ α
                GT := {sg | sg ∈ α ∧ ∀tp ∈ succ(sp) ∃tg ∈ succ(sg). ([tp], [tg]) ∈ Hi}
                LT := {sl | sl ∈ α ∧ ∀tl ∈ succ(sl) ∃tp ∈ succ(sp). ([tl], [tp]) ∈ Hi}
                α′ := GT ∩ LT
                if α ≠ α′ then change := true
                α := α \ α′
                Add α′ as a new class to Σ_{i+1}.
            end
        end
    update H:
        H_{i+1} = ∅
        for every (α1, α2) ∈ Hi do begin
            for each α′2, α′1 ∈ Σ_{i+1} such that α2 ⊇ α′2, α1 ⊇ α′1 do begin
                Φ = {φ | ∃ξ ∈ Π(α′2). (φ, ξ) ∈ Hi}
                if Φ ⊇ Π(α′1) then insert (α′1, α′2) into H_{i+1}
                else change := true
            end
        end
end
Fig. 5. The Partitioning Algorithm
s1 ≤i−1 s2. Thus, ([s1], [s2]) ∈ Hi implies ([s1], [s2]) ∈ Hi−1. Based on that, when constructing Hi it is sufficient to check (α′1, α′2) ∈ Hi only in the case α2 ⊇ α′2, α1 ⊇ α′1, and (α1, α2) ∈ Hi−1. For suitable α′1 and α′2, we first construct the set Φ of classes that are "smaller" than the classes in Π(α′2). By checking whether Φ ⊇ Π(α′1) we determine whether every class in Π(α′1) is "smaller" than some class in Π(α′2), in which case (α′1, α′2) is inserted into Hi. When the algorithm terminates, ≤i is the maximal simulation relation and i-equivalence is the simulation equivalence relation over M × M. Moreover, Hi is the maximal simulation relation over the corresponding quotient structure Mq. The algorithm runs until there is no change in either the partition Σi or the relation Hi. A change in Σi is the result of a partitioning of some class α ∈ Σi. The number of changes in Σi is bounded by the number of possible partitions, which is bounded by |S|.
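One refine pass can be sketched as follows. This is our own illustration of Definition 17 and the splitter loop of Figure 5 (not the authors' implementation), assuming an explicit representation: the partition as a state-to-class-id map, Hi as a set of class-id pairs, and succ as a successor map:

```python
def refine(partition, H, succ):
    """One refine pass of the Partitioning Algorithm (explicit sketch).

    partition: dict mapping each state to its class id in Sigma_i;
    H: set of pairs (c1, c2) of class ids, the ordering H_i;
    succ: dict mapping each state to its set of successors.
    Returns the new partition Sigma_{i+1} as a state -> class-id dict.
    """
    def le(s1, s2):
        # s1 <=_{i+1} s2: every successor of s1 is matched by some
        # successor of s2 whose class dominates it in H (Definition 17).
        return all(
            any((partition[t1], partition[t2]) in H for t2 in succ[s2])
            for t1 in succ[s1]
        )

    # group states by their current class
    by_class = {}
    for s, c in partition.items():
        by_class.setdefault(c, set()).add(s)

    new_partition, next_id = {}, 0
    for alpha in by_class.values():
        while alpha:
            sp = next(iter(alpha))               # the splitter
            gt = {sg for sg in alpha if le(sp, sg)}   # GT
            lt = {sl for sl in alpha if le(sl, sp)}   # LT
            block = gt & lt          # states (i+1)-equivalent to sp
            for s in block:
                new_partition[s] = next_id
            next_id += 1
            alpha -= block
    return new_partition
```

Iterating refine together with the corresponding update of H until neither changes yields the final partition Σk.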
268
Doron Bustan and Orna Grumberg
A change in Hi results in the relation ≤i+1, which is contained in ≤i and smaller in size, i.e., |≤i| > |≤i+1|. The number of changes in Hi is therefore bounded by |≤0|, which is bounded by |S|². Thus, the algorithm terminates after at most |S|² + |S| iterations. Note that it is possible that in some iteration i, Σi will not change but Hi will, and in a later iteration j > i, Σj will change again.
Example: In this example we show how the Partitioning Algorithm is applied to the Kripke structure presented in Figure 6.
[Figure 6: a Kripke structure over states 0–9; states 0, 1, 2 are labeled a, states 3, 4, 5 are labeled b, states 6, 7 are labeled c, and states 8, 9 are labeled d (the edges are not recoverable from the extracted text).]
Fig. 6. An example structure
– We initialize the algorithm as follows: Σ0 = {α0, β0, γ0, δ0}, H0 = {(α0, α0), (β0, β0), (γ0, γ0), (δ0, δ0)}, where α0 = {0, 1, 2}, β0 = {3, 4, 5}, γ0 = {6, 7}, δ0 = {8, 9}.
– The first iteration results in the relations: Σ1 = {α1, α2, β1, β2, β3, γ0, δ0}, H1 = {(α1, α1), (α2, α2), (β1, β1), (β2, β2), (β3, β3), (β1, β2), (β3, β2), (γ0, γ0), (δ0, δ0)}, where α1 = {0}, α2 = {1, 2}, β1 = {3}, β2 = {4}, β3 = {5}, γ0 = {6, 7}, δ0 = {8, 9}.
– The second iteration results in the relations: Σ2 = {α1, α2, β1, β2, β3, γ1, γ2, δ0}, H2 = {(α1, α1), (α2, α2), (β1, β1), (β2, β2), (β3, β3), (β1, β2), (β3, β2), (γ1, γ1), (γ2, γ2), (γ1, γ2), (δ0, δ0)}, where α1 = {0}, α2 = {1, 2}, β1 = {3}, β2 = {4}, β3 = {5}, γ1 = {6}, γ2 = {7}, δ0 = {8, 9}.
– The third iteration results in Σ3 = Σ2 and H3 = H2, so change = false. The equivalence classes are: α1 = {0}, α2 = {1, 2}, β1 = {3}, β2 = {4}, β3 = {5}, γ1 = {6}, γ2 = {7}, δ0 = {8, 9}.
Since the third iteration results in no change to the computed partition or ordering relation, the algorithm terminates. Σ2 is the final set of equivalence classes
which constitutes the set Sq of states of Mq. H2 is the maximal simulation relation over Mq × Mq. The proof of correctness of the algorithm can be found in the full version.
5.2 Space and Time Complexity
The space complexity of the Partitioning Algorithm depends on the size of Σi. We assume that the algorithm is applied to Kripke structures with some redundancy, so that |Σi| ≪ |S|. We measure the space complexity with respect to the size of the following three relations:
1. The relation R.
2. The relations Hi, whose size depends on Σi. We can bound the size of Hi by |Σi|².
3. A relation that relates each state to its equivalence class. Since every state belongs to a single class, the size of this relation is O(|S| · log(|Σi|)).
In the ith iteration we do not need to keep all of H0, H1, . . . and Σ0, Σ1, . . ., since we only refer to Hi, Hi+1 and Σi, Σi+1. By the above we conclude that the total space complexity is O(|R| + |Σk|² + |S| · log(|Σk|)). In practice, we often do not hold the transition relation R in memory. Rather, we use it to provide, whenever needed, the set of successors of a given state. Thus, the space complexity is O(|Σk|² + |S| · log(|Σk|)). Recall that the space complexity of the naive algorithm for computing the equivalence classes of the simulation equivalence relation is bounded by |S|², which is the size of the simulation relation over M × M. In the case |Σk| ≪ |S|, the Partitioning Algorithm achieves a much better space complexity. As we already mentioned, the algorithm runs for at most |S|² + |S| iterations. In every iteration it performs one refine and one update. refine can be done in O(|Σk|³ + |Σk| · |R|) and update can be done in O(|Σk|² · (|Σk|² + |R|)). Thus the total time complexity is O(|S|² · |Σk|² · (|Σk|² + |R|)).
References
1. A. Aziz, T.R. Shiple, V. Singhal, and A.L. Sangiovanni-Vincentelli. Formula-dependent equivalence for compositional CTL model checking. In D. Dill, editor, Proceedings of the Sixth Conference on Computer Aided Verification (CAV'94), volume 818 of LNCS, pages 324–337, 1994.
2. A. Aziz, V. Singhal, G.M. Swamy, and R.K. Brayton. Minimizing interacting finite state machines: A compositional approach to language containment. In Proceedings of the International Conference on Computer Design, pages 255–261, 1994.
3. B. Bloom and R. Paige. Transformational design and implementation of a new efficient solution to the ready simulation problem. Science of Computer Programming, 24:189–220, 1996.
4. A. Bouajjani, J.-C. Fernandez, and N. Halbwachs. Minimal model generation. In E.M. Clarke and R.P. Kurshan, editors, Computer-Aided Verification, pages 197–203, New York, June 1990. Springer-Verlag.
5. A. Bouali and R. de Simone. Symbolic bisimulation minimisation. In G. V. Bochmann and D. K. Probst, editors, Proceedings of the 4th Conference on Computer-Aided Verification, volume 663 of LNCS, pages 96–108. Springer-Verlag, July 1992.
6. D. Bustan and O. Grumberg. Simulation based minimization. Technical Report TR #CS-2000-04, Computer Science Department, Technion, Haifa, April 2000.
7. E.M. Clarke and E.A. Emerson. Synthesis of synchronization skeletons for branching time temporal logic. In Logic of Programs: Workshop, Yorktown Heights, NY, May 1981, volume 131 of LNCS. Springer-Verlag, 1981.
8. E.M. Clarke, O. Grumberg, and D.A. Peled. Model Checking. MIT Press, December 1999.
9. K. Fisler and M. Vardi. Bisimulation minimization in an automata-theoretic verification framework. In FMCAD, pages 115–132, 1998.
10. O. Grumberg and D.E. Long. Model checking and modular verification. ACM Transactions on Programming Languages and Systems, 16(3):843–871, 1994.
11. M.R. Henzinger, T.A. Henzinger, and P.W. Kopke. Computing simulations on finite and infinite graphs. In Proc. Symp. Foundations of Computer Science, pages 453–462, 1995.
12. A. Kucera and R. Mayr. Simulation preorder on simple process algebras. In International Colloquium on Automata, Languages and Programming, volume 1644 of LNCS, 1999.
13. D. Lee and M. Yannakakis. Online minimization of transition systems. In Proceedings of the 24th ACM Symp. on Theory of Computing, 1992.
14. R. Milner. An algebraic definition of simulation between programs. In Proc. of the 2nd IJCAI, pages 481–489, London, UK, 1971.
15. D. Park. Concurrency and automata on infinite sequences. In 5th GI-Conference on Theoretical Computer Science, volume 104 of LNCS, pages 167–183. Springer-Verlag, 1981.
Rewriting for Cryptographic Protocol Verification
Thomas Genet¹ and Francis Klay²
¹ IRISA / Université de Rennes, Campus de Beaulieu, F-35042 Rennes Cedex, [email protected]
² France Telecom, avenue Pierre Marzin, F-22307 Lannion Cedex, [email protected]
Abstract. On a case study, we present a new approach for verifying cryptographic protocols, based on rewriting and on tree automata techniques. Protocols are operationally described using Term Rewriting Systems and the initial set of communication requests is described by a tree automaton. Starting from these two representations, we automatically compute an over-approximation of the set of exchanged messages (also recognized by a tree automaton). Then, proving classical properties like confidentiality or authentication can be done by automatically showing that the intersection between the approximation and a set of prohibited behaviors is the empty set. Furthermore, this method enjoys a simple and powerful way of describing the intruder's work, the ability to consider an unbounded number of parties, an unbounded number of interleaved sessions, and a theoretical property ensuring safeness of the approximation.
Introduction
In this paper, we present a new way of verifying cryptographic protocols. We do not aim here at discovering attacks on the protocol; our goal is to prove that there are none, which is a more difficult problem. In practice, positive proofs of security properties of cryptographic protocols are highly desirable results, since they give a better guarantee of the reliability of the protocol than any amount of passed tests. In [9], a decidable approximation of the set of descendants (reachable terms) was presented. In this paper, we propose to apply those theoretical results to the verification of cryptographic protocols. Our case study is the Needham-Schroeder Public Key protocol [19] (NSPK for short). We chose this particular example for two reasons. First of all, this protocol is real but can be easily understood. The second reason is that, in spite of its apparent simplicity and robustness, and in spite of several verification attempts, this protocol designed in 1978 was proved insecure only in 1995 by G. Lowe [13] and in 1996 by C. Meadows [17]. In particular, G. Lowe found a smart attack invalidating the main security properties of the protocol. In this paper, we will use the corrected version of the NSPK protocol also proposed by G. Lowe in [14]. Starting from a TRS representing the protocol and a tree automaton recognizing the initial set of communication requests, we automatically compute
D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 271–290, 2000.
© Springer-Verlag Berlin Heidelberg 2000
a superset of the set of exchanged messages by over-approximating the set of reachable terms. This model (also a tree automaton) takes into account an unbounded number of parties, an unbounded number of interleaved sessions, as well as a powerful description of the intruder's activity. For building this model, we needed to extend the approximation technique of [9], initially designed to approximate functional programs encoded by left-linear TRSs, to the more general class of TRSs (possibly non left-linear) with associative and commutative symbols. In section 1, we recall basic definitions of terms, term rewriting systems, and tree automata. In section 2, we recall the technique for approximating the set of descendants for left-linear term rewriting systems and regular sets of terms [9]. In section 3, we shortly present the Needham-Schroeder Public Key protocol and comment on its expected properties, and we propose an encoding into a term rewriting system in section 4. However, the term rewriting system describing the NSPK is not left-linear, has Associative and Commutative (AC for short) symbols and, consequently, is out of the scope of the basic approximation technique of [9]. Thus, in section 5, we show how to extend our technique to the case of non left-linear and AC TRSs. We also describe the application of the approximation to NSPK and show how to prove confidentiality and authentication properties. Finally, in section 6, we conclude, compare with other approaches, and present ongoing developments.
1 Preliminaries
We now introduce some notations and basic definitions. Comprehensive surveys can be found in [7] for term rewriting systems, in [3] for tree automata and tree language theory, and in [11] for connections between regular tree languages and term rewriting systems.
Terms, Substitutions, Rewriting Systems. Let F be a finite set of symbols associated with an arity function, X be a countable set of variables, T(F, X) the set of terms, and T(F) the set of ground terms (terms without variables). Positions in a term are represented as sequences of integers. The set of positions in a term t, denoted by Pos(t), is ordered by the lexicographic ordering ≺. The empty sequence ε denotes the top-most position. If p ∈ Pos(t), then t|p denotes the subterm of t at position p and t[s]p denotes the term obtained by replacement of the subterm t|p at position p by the term s. For any term s ∈ T(F, X), we denote by PosF(s) the set of functional positions in s, i.e., {p ∈ Pos(s) | p ≠ ε and Root(s|p) ∈ F}, where Root(t) denotes the symbol at position ε in t. A ground context is a term of T(F ∪ {□}) with exactly one occurrence of □, where □ is a special constant not occurring in F. For any term t ∈ T(F), C[t] denotes the term obtained after replacement of □ by t in the ground context C[ ]. The set of variables of a term t is denoted by Var(t). A term t is linear if any variable of Var(t) has exactly one occurrence in t. A substitution is a mapping σ from X into T(F, X), which can be uniquely extended to an endomorphism of T(F, X). Its domain Dom(σ) is {x ∈ X | xσ ≠ x}.
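The position operations t|p and t[s]p can be illustrated concretely. In the sketch below (our own encoding, not part of the paper), a term f(t1, . . . , tn) is a tuple ('f', t1, . . . , tn) and a position is a tuple of 1-based child indices:

```python
def subterm(t, p):
    """t|p : follow the sequence of child indices p."""
    for i in p:
        t = t[i]          # child i of ('f', t1, ..., tn)
    return t

def replace(t, p, s):
    """t[s]_p : replace the subterm of t at position p by s."""
    if not p:             # p is the empty sequence: replace t itself
        return s
    i, rest = p[0], p[1:]
    return t[:i] + (replace(t[i], rest, s),) + t[i + 1:]
```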
A term rewriting system R is a set of rewrite rules l → r, where l, r ∈ T(F, X), l ∉ X, and Var(l) ⊇ Var(r). A rewrite rule l → r is left-linear (resp. right-linear) if the left-hand side (resp. right-hand side) of the rule is linear. A rule is linear if it is both left- and right-linear. A TRS R is linear (resp. left-linear, right-linear) if every rewrite rule l → r of R is linear (resp. left-linear, right-linear). The relation →R induced by R is defined as follows: for any s, t ∈ T(F, X), s →R t if there exist a rule l → r in R, a position p ∈ Pos(s), and a substitution σ such that lσ = s|p and t = s[rσ]p. The reflexive transitive closure of →R is denoted by →*R. The set of R-descendants of a set of ground terms E is denoted by R*(E), where R*(E) = {t ∈ T(F) | ∃s ∈ E s.t. s →*R t}.
Automata, Regular Tree Languages. Let Q be a finite set of symbols of arity 0, called states. T(F ∪ Q) is called the set of configurations. A transition is a rewrite rule c → q, where c ∈ T(F ∪ Q) and q ∈ Q. A normalized transition is a transition c → q where c = q′ ∈ Q or c = f(q1, . . . , qn), with f ∈ F, ar(f) = n, and q1, . . . , qn ∈ Q. A bottom-up nondeterministic finite tree automaton (tree automaton for short) is a quadruple A = <F, Q, Qf, ∆>, where Qf ⊆ Q and ∆ is a set of normalized transitions. A tree automaton is deterministic if there are no two rules with the same left-hand side. The rewriting relation induced by ∆ is denoted either by →∆ or by →A. The tree language recognized by A is L(A) = {t ∈ T(F) | ∃q ∈ Qf s.t. t →*A q}. For a given q ∈ Q, the tree language recognized by A and q is L(A, q) = {t ∈ T(F) | t →*A q}. A tree language (or a set of terms) E is regular if there exists a bottom-up tree automaton A such that L(A) = E. The class of regular tree languages is closed under the boolean operations ∪, ∩, \, and inclusion is decidable. A Q-substitution is a substitution σ : X → Q. Let Σ(Q, X) be the set of Q-substitutions.
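The relation →R can be illustrated with a small matching-based sketch (our own encoding, not part of the paper: variables are strings starting with '?', and a term f(t1, . . . , tn) is a tuple ('f', t1, . . . , tn)):

```python
def match(pattern, term, subst=None):
    """Syntactic matching: find sigma with pattern*sigma == term.
    Returns a substitution dict, or None if there is no match."""
    subst = dict(subst or {})
    if isinstance(pattern, str) and pattern.startswith("?"):
        if pattern in subst and subst[pattern] != term:
            return None            # non-linear variable must agree
        subst[pattern] = term
        return subst
    if isinstance(pattern, str) or isinstance(term, str):
        return subst if pattern == term else None
    if pattern[0] != term[0] or len(pattern) != len(term):
        return None
    for p, t in zip(pattern[1:], term[1:]):
        subst = match(p, t, subst)
        if subst is None:
            return None
    return subst

def apply(term, subst):
    """Apply a substitution to a term."""
    if isinstance(term, str):
        return subst.get(term, term)
    return (term[0],) + tuple(apply(t, subst) for t in term[1:])

def rewrite_once(term, rules):
    """All terms reachable from `term` in one ->_R step."""
    results = set()
    for l, r in rules:
        sigma = match(l, term)
        if sigma is not None:
            results.add(apply(r, sigma))        # rewrite at the root
    if not isinstance(term, str):
        for i, sub in enumerate(term[1:], 1):   # rewrite in a subterm
            for new in rewrite_once(sub, rules):
                results.add(term[:i] + (new,) + term[i + 1:])
    return results
```

Iterating rewrite_once from a set E enumerates R-descendants, but in general this never terminates, which is precisely why the over-approximation technique of section 2 is needed.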
For every transition, there exists an equivalent set of normalized transitions. Normalization consists in decomposing a transition s → q into a set Norm(s → q) of normalized transitions. The method consists in abstracting subterms s′ of s s.t. s′ ∉ Q by states of Q. We first define the abstraction function as follows:
Definition 1. Let F be a set of symbols, and Q a set of states. For a given configuration s ∈ T(F ∪ Q) \ Q, an abstraction of s is a mapping α:
α : {s|p | p ∈ PosF(s)} → Q
The mapping α is extended to T(F ∪ Q) by defining α as the identity on Q, i.e., ∀q ∈ Q : α(q) = q.
Definition 2. Let F be a set of symbols, Q a set of states, s → q a transition s.t. s ∈ T(F ∪ Q) and q ∈ Q, and α an abstraction of s. The set Normα(s → q) of normalized transitions is inductively defined by:
1. if s = q, then Normα(s → q) = ∅, and
2. if s ∈ Q and s ≠ q, then Normα(s → q) = {s → q}, and
3. if s = f(t1, . . . , tn), then Normα(s → q) = {f(α(t1), . . . , α(tn)) → q} ∪ Normα(t1 → α(t1)) ∪ . . . ∪ Normα(tn → α(tn)).
Example 1. Let F = {f, g, a} and A = <F, Q, Qf, ∆>, where Q = {q0, q1, q2, q3, q4}, Qf = {q0}, and ∆ = {f(q1) → q0, g(q1, q1) → q1, a → q1}.
• The languages recognized by q1 and q0 are the following: L(A, q1) is the set of terms built on {g, a}, i.e., L(A, q1) = T({g, a}), and L(A, q0) = L(A) = {f(x) | x ∈ L(A, q1)}.
• Let s = f(g(q1, f(a))), and let α1 be an abstraction of s mapping g(q1, f(a)) to q2, f(a) to q3, and a to q4. The normalization of the transition f(g(q1, f(a))) → q0 with abstraction α1 is the following: Normα1(f(g(q1, f(a))) → q0) = {f(q2) → q0, g(q1, q3) → q2, f(q4) → q3, a → q4}.
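Definition 2 and Example 1 can be reproduced with a short sketch (our own encoding, not part of the paper: states are strings, a configuration f(t1, . . . , tn) is a tuple ('f', t1, . . . , tn) with constants as 1-tuples, and the iterator `fresh` plays the role of the abstraction α on proper subterms):

```python
def normalize(s, q, fresh):
    """Normalize the transition s -> q (Definition 2, sketch).

    `fresh` is an iterator of unused state names, consumed left to
    right as subterms are abstracted.  Transitions are returned as
    (lhs, state) pairs, where lhs is a state or a flat configuration
    ('f', q1, ..., qn).
    """
    if isinstance(s, str):
        return set() if s == q else {(s, q)}       # cases 1 and 2
    head, *args = s
    rules, arg_states = set(), []
    for t in args:
        if isinstance(t, str):                     # alpha is identity on Q
            arg_states.append(t)
        else:
            qt = next(fresh)                       # alpha abstracts t by qt
            rules |= normalize(t, qt, fresh)       # recurse on the subterm
            arg_states.append(qt)
    rules.add(((head, *arg_states), q))            # f(q1,...,qn) -> q
    return rules
```

Running it on the transition of Example 1 with fresh states q2, q3, q4 yields exactly the four normalized transitions listed there.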
2
Approximation Technique
For a regular set of terms E ⊆ T(F), although there exist some restricted classes of TRSs R such that R∗(E) is regular (see [5,21,4,12]), this is not the case in general [11,12]. In [9], for any tree automaton A (s.t. L(A) ⊇ E) and for any left-linear TRS R, it is proposed to build an approximation automaton TR↑(A) such that L(TR↑(A)) ⊇ R∗(E). The quality of the approximation highly depends on an approximation function, called γ, which defines some folding positions: subterms that can be approximated. We now briefly recall the construction of TR↑(A) [9]: let R be a left-linear term rewriting system and A = ⟨F, Q, Qf, ∆⟩ a tree automaton such that E = L(A) (or even E ⊆ L(A)). First, we extend the set of states Q of A with an infinite number of new states, initially not occurring in Q. Note that since we modify neither ∆ nor Qf (in particular, they remain finite), the language recognized by A is unchanged. On the other hand, it is always possible to come back to a finite set of states for A by restricting Q to the set of accessible states, i.e. the states q such that L(A, q) ≠ ∅. Starting from A0 = A, we incrementally build a finite number of tree automata Ai = ⟨F, Q, Qf, ∆i⟩ with i ≥ 0, such that ∀i ≥ 0 : L(Ai) ⊂ L(Ai+1), until we get an automaton Ak with k ∈ N such that L(Ak) ⊇ R∗(L(A0)), i.e. L(Ak) ⊇ R∗(E). We denote this automaton Ak by TR↑(A). To construct Ai+1 from Ai, the technique consists in finding a term s in L(Ai) such that s →R t and t ∉ L(Ai), and then in building ∆i+1 such that L(Ai) ⊂ L(Ai+1) and t ∈ L(Ai+1).

(Figure: nested languages L(A) ⊂ L(A1) ⊂ L(A2) ⊂ ···, with a rewrite chain s →R t →R s′ →R ··· crossing the successive language boundaries.)
Rewriting for Cryptographic Protocol Verification

Since Ai and Ai+1 differ only by their respective transition sets, to ensure L(Ai) ⊂ L(Ai+1) it is enough to construct ∆i+1 such that it strictly contains ∆i. In order to also have t ∈ L(Ai+1), it is necessary to add some transitions to ∆i to obtain ∆i+1. This can be viewed as a completion step between two term rewriting systems: the set of transitions ∆i of Ai, and R. If there exists a term s in L(Ai) such that s →R t, then by definition of →R there exist a rule l → r, a ground context C[ ] and a substitution (a match) σ such that s = C[lσ] →R C[rσ] = t. On the other hand, by construction of tree automata, s = C[lσ] ∈ L(Ai) means that (1) there exists a state q ∈ Q such that lσ →∗Ai q, and (2) C[q] →∗Ai q′ with q′ ∈ Qf. Hence, from (1) we obtain the following critical pair between the transitions of Ai and the rules of R: lσ →R rσ and lσ →∗Ai q.
Since every transition of Ai is in Ai+1 (i.e. ∆i ⊆ ∆i+1), for the term t to be recognized by Ai+1 it is enough to ensure that (3) rσ →∗Ai+1 q. This is sufficient since we can then rewrite t = C[rσ] into C[q], and from (2) we get C[q] →∗Ai+1 q′, since ∆i ⊆ ∆i+1. Finally, since q′ ∈ Qf, t ∈ L(Ai+1). To ensure (3), we need to add some transitions to ∆i+1, i.e. to join the critical pair: lσ →R rσ, lσ →∗Ai q, and rσ →∗Ai+1 q.
A direct solution to have rσ →∗Ai+1 q is to have a transition of the form rσ → q in Ai+1. However, this is not compatible with the standard normalized form of the tree automata we use here¹. Thus, before adding rσ → q to the transitions of Ai, we first normalize it thanks to the Normα function (see Definition 2). Hence, ∆i+1 = ∆i ∪ Normα(rσ → q). We now give an example of the completion process on a simple TRS.

Example 2. Let F = {f, g, a} and let R be the one-rule TRS R = {f(g(x)) → g(f(x))}. Let A0 = ⟨F, Q, Qf, ∆0⟩ with Qf = {qf} and ∆0 = {f(qf) → qf, g(qa) → qf, a → qa}. We have L(A0) = f∗(g(a)). Between R and the transitions of A0 there exists a critical pair: f(g(qa)) →R g(f(qa)) and f(g(qa)) →∗A0 qf.

¹ Keeping tree automata in standard normalized form allows us, in particular, to apply the usual algorithms: intersection, union, etc.
The Q-substitution used here is σ = {x → qa}. As defined before, we have ∆1 = ∆0 ∪ Normα(g(f(qa)) → qf). Let α be the abstraction function such that α(f(qa)) = qnew, where qnew is a state not occurring in the transitions of A0. Then we have ∆1 = ∆0 ∪ {g(qnew) → qf, f(qa) → qnew}. Except in some simple decidable cases, this completion procedure is not guaranteed to converge; instead, it may infinitely add new transitions and thus generate an infinite number of tree automata A1, A2, etc. However, choosing particular values for α may force the completion process to converge, by approximating infinitely many transitions by finite sets of more general transitions. These particular abstraction functions are associated with approximation functions, denoted by γ, defining some folding positions: positions in the right-hand sides of rules where subterms are approximated by regular languages. For each completion step from Ai to Ai+1 involving a rewrite step lσ →R rσ, a folding position p is a position in r which is assigned a state q′ such that we only ensure L(Ai+1, q′) ⊇ {rσ|p} instead of the strict equality L(Ai+1, q′) = {rσ|p}. This comes from the fact that the same state q′ can be used for recognizing different terms obtained at different positions, by different rules or substitutions. The role of the approximation function is to relate rσ|p and the state q′. Folding positions depend on the applied rule l → r and on the substitution σ. Furthermore, since in our setting a rewriting step s = C[lσ] →R C[rσ] = t is modeled by a completion step on the critical pair lσ →R rσ and lσ →∗Ai q, the state q is also a parameter of the approximation function. Finally, the approximation function γ maps every triple (l → r, q, σ) to a sequence of states (one for each position in PosF(r)) used for the normalization of the transition rσ → q.

Definition 3. Let Q be a set of states and Q∗ the set of sequences q1 ··· qk of states in Q.
An approximation function is a mapping γ : R × Q × Σ(Q, X) → Q∗, such that γ(l → r, q, σ) = q1 ··· qk, where k = Card(PosF(r)). To every γ(l → r, q, σ) = q1 ··· qk we can associate q1, ..., qk with the positions p1, ..., pk in PosF(r). This can be done by defining the corresponding abstraction function α on the restricted domain {rσ|p | ∀l → r ∈ R, ∀p ∈ PosF(r), ∀σ ∈ Σ(Q, X)}: α(rσ|pi) = qi for all pi ∈ PosF(r) = {p1, ..., pk}, s.t. pi ≺ pi+1 for i = 1 ... k−1 (where ≺ is the lexicographic ordering). In the following, we write Normγ for the normalization function whose α value is defined according to γ as above. Starting from a left-linear TRS R, a tree automaton A and an approximation function γ, the algorithm for building the approximation automaton TR↑(A) is the following. First, set A0 to A. Then, to construct Ai+1 from Ai:
1. search for a critical pair, i.e. a state q ∈ Q, a rewrite rule l → r and a substitution σ ∈ Σ(Q, X) such that lσ →∗Ai q and rσ ↛∗Ai q;
2. set ∆i+1 = ∆i ∪ Normγ(rσ → q).
This process is iterated until it stops on a tree automaton Ak such that ∀q ∈ Q, ∀l → r ∈ R and ∀σ ∈ Σ(Q, X): if lσ →∗Ak q then rσ →∗Ak q. Then TR↑(A) = Ak. The fact that Q and Σ(Q, X) may be infinite is not a problem in practice since, for finding a critical pair, we can restrict Q to the finite set of accessible states in Ai, without changing L(Ai) or L(Ai+1). We now recall a theorem of [9].

Theorem 1. (Completeness) Given a tree automaton A and a left-linear TRS R, for any approximation function γ, L(TR↑(A)) ⊇ R∗(L(A)).

The γ function fixes the quality of the approximation. For example, one of the roughest approximations is obtained with a constant γ function mapping every triple (l → r, σ, q) to sequences of a unique state q′ of Q: ∀l → r ∈ R, ∀σ ∈ Σ(Q, X), ∀q ∈ Q : γ(l → r, σ, q) = q′ ··· q′. At the opposite extreme, the finest approximation consists in mapping every triple (l → r, σ, q) to sequences of pairwise distinct states. However, although any rough approximation built with the first γ is guaranteed to terminate, this is not necessarily the case for the second one. From a practical point of view, the fact that the completeness of the approximation construction does not depend on the chosen γ (Theorem 1) is a very interesting property. Indeed, it guarantees that for any approximation function, TR↑(A) is a safe model of R∗(E), in the sense of abstract interpretation.

Example 3. Back to Example 2: adding to A0 the transitions {g(qnew) → qf, f(qa) → qnew} to obtain A1 brings another critical pair: f(g(qnew)) →R g(f(qnew)) and f(g(qnew)) →∗A1 qf.
As in the previous example, we build ∆2 by adding Normα(g(f(qnew)) → qf) to ∆1. However, if α maps f(qnew) to another state q′new not occurring in ∆1, we add some new transitions and get another critical pair, and the process may go on forever. Instead, we can here define an approximation function γ in a simple and static way, for example: ∀σ ∈ Σ(Q, X), ∀q ∈ Q : γ(f(g(x)) → g(f(x)), q, σ) = qnew. Since PosF(g(f(x))) = {1} is a singleton, note that the γ function maps triples of the form (f(g(x)) → g(f(x)), q, σ) to sequences of states of length one. This γ function defines a very rough approximation, since the same state qnew is used for every normalization, whatever the values of q and σ may be. Thanks to this approximation function γ, the completion terminates. The value of ∆1 remains the same but, for the next completion step, we have Normγ(g(f(qnew)) → qf) = {g(qnew) → qf, f(qnew) → qnew}. Thus ∆2 = ∆1 ∪ {f(qnew) → qnew}, there is no new critical pair between ∆2 and the rule f(g(x)) → g(f(x)), and we have L(A2) = f∗(g(f∗(a))). Once TR↑(A) is obtained, it is easy to verify some reachability properties on R and E. It can be shown, for example, that a regular set of terms F cannot be reached from terms of E by →∗R. This can be done by showing that L(TR↑(A)) ∩ F = ∅. We will apply this to the verification of the Needham-Schroeder Public Key Protocol in section 5.
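The whole completion of Examples 2 and 3 is small enough to be replayed mechanically. The Python sketch below uses our own tuple encoding of terms (states are strings, variables are plain names like "x"); the γ function is hard-wired to always return qnew, as in Example 3, and we assume no state-to-state transitions, which holds for this example.

```python
from itertools import product

def states_of(t, delta):
    """All states q with t ->* q under delta, evaluated bottom-up
    (assumes no q -> q' transitions, true for this example)."""
    if isinstance(t, str):
        return {t}
    f, *args = t
    combos = product(*[states_of(a, delta) for a in args])
    return {q for c in combos for (lhs, q) in delta if lhs == (f, *c)}

def subst(t, sigma):
    if isinstance(t, str):
        return sigma.get(t, t)
    return (t[0],) + tuple(subst(a, sigma) for a in t[1:])

def normalize(s, q, alpha):
    if isinstance(s, str):
        return set() if s == q else {(s, q)}
    f, *args = s
    trans = {((f, *[a if isinstance(a, str) else alpha[a] for a in args]), q)}
    for a in args:
        if not isinstance(a, str):
            trans |= normalize(a, alpha[a], alpha)
    return trans

l, r = ("f", ("g", "x")), ("g", ("f", "x"))        # the rule f(g(x)) -> g(f(x))
delta = {(("f", "qf"), "qf"), (("g", "qa"), "qf"), (("a",), "qa")}  # Delta_0

changed = True
while changed:                                      # completion until no critical pair
    changed = False
    states = {q for (_, q) in delta} | {s for (lhs, _) in delta for s in lhs[1:]}
    for qx in sorted(states):
        sigma = {"x": qx}
        ls, rs = subst(l, sigma), subst(r, sigma)
        for q in sorted(states_of(ls, delta)):
            if q not in states_of(rs, delta):
                # gamma: the single folding position always gets state qnew
                delta |= normalize(rs, q, {subst(("f", "x"), sigma): "qnew"})
                changed = True
```

Running it yields exactly ∆2 = ∆0 ∪ {g(qnew) → qf, f(qa) → qnew, f(qnew) → qnew}, and terms of f∗(g(f∗(a))) such as f(g(f(a))) are accepted in qf.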
3
Needham-Schroeder Public Key Protocol
In this section, we present our case study on the Needham-Schroeder Public Key protocol (NSPK). More precisely, we use here the fixed version of the protocol [14], without a key server. Key servers have been discarded for the sake of simplicity. Note that the attacks from [14] have been found on the NSPK without key servers. Moreover, the approximation technique has also been successfully applied to the protocol with key servers. The NSPK protocol aims at mutual authentication of two agents, an initiator A and a responder B, separated by an insecure network. Mutual authentication means that, when a protocol session is completed between two agents, they should be assured of each other's identity. In general, the main property expected from this kind of protocol is to prevent an intruder from impersonating one of the two agents. The protocol is based on an exchange of nonces (usually fresh random numbers or time stamps) and on asymmetric encryption of messages: every agent has a public key (for encryption) and a private key (for decryption). Every public key is supposed to be known by every agent, whereas the private key of an agent X is supposed to be known only by X. Thus, in this setting, we suppose that messages encrypted with the public key of X can only be decrypted and read by X. Here is a description of the three steps of the fixed version of the protocol, borrowed from [14]:
1. A ↪ B : {NA, A}KB
2. B ↪ A : {NA, NB, B}KA
3. A ↪ B : {NB}KB
In the first step, A tries to initiate a communication with B: A creates a nonce NA and sends to B a message, containing NA as well as A's identity, encrypted with the public key KB of B. Then, in the second step, B sends back to A a message encrypted with the public key of A, containing the nonce NA that B received, a new nonce NB, and B's identity. Finally, in the last step, A returns the nonce NB received from B.
If the protocol is completed, mutual authentication of the two agents is ensured:
– as soon as A receives the message containing the nonce NA, sent back by B at step 2, A believes that this message was really built and sent by B. Indeed, NA was encrypted with the public key of B and thus B is the only agent able to send back NA;
– similarly, when B receives the message containing the nonce NB, sent back by A at step 3, B believes that this message was really built and sent by A.
Another property that may be expected from this kind of protocol is confidentiality of nonces. In particular, if the nonces remain confidential, they can be used later as keys for symmetric encryption of communications between A and B. Thus, confidentiality of nonces may also be of interest. A cryptographic protocol is supposed to resist any attack by an intruder. In particular for NSPK, we intend to show that, for agents respecting the protocol, and whatever the intruder may do,
– nonces and private keys remain confidential (confidentiality),
– if an agent X believes that a message was built by another agent Y, then the message was effectively built by Y (authentication).
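The nonce round-trips behind these two properties can be illustrated by a toy symbolic run of the three protocol steps. This is purely our own illustration, not the paper's term encoding: encryption is modeled by a tuple ("encr", key, payload), there is no intruder, and the two assertions are exactly the checks A and B perform before believing in each other's identity.

```python
def enc(pub, payload):
    # symbolic encryption: only the key's owner is assumed able to open it
    return ("encr", pub, payload)

A, B = ("agt", "A"), ("agt", "B")
NA, NB = ("N", A, B), ("N", B, A)          # nonces created by A and by B

msg1 = enc(("pub", B), (NA, A))            # 1. A -> B : {NA, A}_KB
# B opens msg1 with his private key, learning NA and the claimed initiator
nonce_a, initiator = msg1[2]
msg2 = enc(("pub", initiator), (nonce_a, NB, B))   # 2. B -> A : {NA, NB, B}_KA
# A only proceeds if her own nonce came back, so the sender must know KB^-1
assert msg2[2][0] == NA
msg3 = enc(("pub", B), (msg2[2][1],))      # 3. A -> B : {NB}_KB
# B checks that his nonce came back, completing mutual authentication
assert msg3[2][0] == NB
```

An intruder who cannot decrypt msg1 never learns NA, which is why the returned nonce serves as evidence of B's participation.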
4
Encoding the Protocol and the Intruder
In this section, we show how to model NSPK by a TRS. First, we present the signature F and the terms of T(F) used for representing agents, messages, keys, etc. Each agent is labeled by a unique identifier; let Lagt be the set of agent labels (the terms representing agent labels will be given later). For any agent label l ∈ Lagt, the term agt(l) denotes the agent whose label is l. The term mesg(x, y, c) represents a message whose header refers to agent x as emitter and agent y as receiver, and whose content is c. The term pubkey(a) denotes the public key of agent a, and encr(k, a, c) denotes the result of encrypting the content c with the key k. In this last term, a is a flag recording who has performed the encryption. This field is not used by the protocol rules but will be used for verification. The term N(x, y) represents a nonce generated by agent x for identifying a communication with y. We also use an AC binary symbol ⊔ in order to represent sets. For example, the term x ⊔ (y ⊔ z) (equivalent modulo AC to (x ⊔ y) ⊔ z) represents the set {x, y, z}. Starting from a set of initial requests, our aim is to compute a tree automaton recognizing an over-approximation of all sent messages. The approximation also contains some terms signaling either communication requests or established communications. For example, a term of the form goal(x, y) means that x expects to open a communication with y. A term of the form c_init(x, y, z) means that x believes it has initiated a communication with y but, in reality, x communicates with z. Conversely, a term c_resp(y, x, z) means that y believes it has responded to a communication request coming from x, but z is the real author of the request.
Then, the encoding of the protocol into AC rewrite rules² is straightforward: each step of the protocol is described by a rewrite rule whose left-hand side is a precondition on the current state (the set of received messages and communication requests), and whose right-hand side represents the message to be sent (and sometimes an established communication) if the precondition is met. The sent message is added to the current state. As a result, every rewrite rule we use is a 'cumulative rule', i.e. of the form l → l ⊔ X. Thus, for convenience, we choose to use the shorthand LHS for the term l occurring in the right-hand side. For instance, the rule mesg(x, y, c) → LHS ⊔ c_init(x, y, y) represents the rule mesg(x, y, c) → mesg(x, y, c) ⊔ c_init(x, y, y). Now, for each step of the protocol,
² We describe our encoding here in a general way. However, for the particular case of NSPK, the encoding could have been done without the AC symbol ⊔, since ⊔ is only needed when the sending of a message depends on the reception of two (or more) distinct messages, i.e. rules of the form m1 ⊔ m2 → m3. In general, such rules are necessary to model protocols, but this is not the case for this simple version of NSPK.
we give the corresponding rewrite rule. The encoding into a TRS is longer than the initial protocol specification of section 3 because it is more complete. For instance, whereas the initial specification only informally defines how to check the content of messages and how to deal with communication requests, these points are formally defined in our specification with rewrite rules. Furthermore, the initial specification can be viewed as a trace of one correct execution of the NSPK protocol for two specific agents A and B. Thus, this specification cannot be directly used in a more general context where other agents also use the protocol. Hence, another difference between our specification and the initial specification of section 3 is that the agents' identities of the initial specification (A and B) have been abstracted by terms with variables, of the form agt(x), agt(y). In the following, x, y, z, u, v, x2, x3 and z2 are supposed to be variables, since we consider an unbounded number of agents and transactions.

1. A ↪ B : {NA, A}KB. The emission of the first message is encoded by the rule:
goal(x, y) → LHS ⊔ mesg(x, y, encr(pubkey(y), x, ⟦N(x, y), x⟧))
The meaning of this rule is the following: if an agent x wants to establish a communication with y, then x sends to y a message whose content is encrypted with the public key of y. The content is here represented by a list (built with the classical operators cons and null) containing a nonce N(x, y), produced by x for y, as well as x's identity. For convenience, lists are written in the usual way; for example, a list of the form cons(u, cons(v, null)) is denoted by ⟦u, v⟧.

2. B ↪ A : {NA, NB, B}KA.
mesg(x, agt(u), encr(pubkey(agt(u)), z, ⟦v, agt(x2)⟧)) → LHS ⊔ mesg(agt(u), agt(x2), encr(pubkey(agt(x2)), agt(u), ⟦v, N(agt(u), agt(x2)), agt(u)⟧))
The second message is sent by an agent agt(u) when he receives the first message from an agent agt(x2) whose identity is enclosed in the message³.
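Rule 1 above can be sketched as an executable function on the current network state. This is our own rendering, not the paper's machinery: a Python set stands in for the AC term built with ⊔ (so AC-matching reduces to iterating over the set), and terms are nested tuples.

```python
def rule1(state):
    """Rule 1: goal(x, y) -> LHS ⊔ mesg(x, y, encr(pubkey(y), x, [N(x,y), x]))."""
    out = set(state)                       # the rule is cumulative: keep LHS
    for t in state:
        if t[0] == "goal":
            _, x, y = t
            content = (("N", x, y), x)     # the list [N(x, y), x]
            out.add(("mesg", x, y, ("encr", ("pubkey", y), x, content)))
    return out

net = {("goal", ("agt", "A"), ("agt", "B"))}
net = rule1(net)
# the goal term stays in the state and the first protocol message is added
```

Note how the cumulative shape l → l ⊔ X shows up directly: the input facts are copied into the output and only new terms are added.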
Note that in these rules we achieve some kind of type checking on the content of the message. For instance, in the left-hand side of this rule, by expecting the message content pattern ⟦v, agt(x2)⟧ instead of a more general pattern like ⟦v, x3⟧, we check that this element of the message is an agent's identity. This kind of type checking is important since it permits avoiding some attacks based on type confusion, like those described in [17].

3. A ↪ B : {NB}KB. This step is encoded by the rule:
mesg(x, agt(y), encr(pubkey(agt(y)), z2, ⟦N(agt(y), agt(z)), u, agt(z)⟧)) → LHS ⊔ mesg(agt(y), agt(z), encr(pubkey(agt(z)), agt(y), ⟦u⟧)) ⊔ c_init(agt(y), agt(z), z2)
³ In this protocol, the agent's identity contained in the header of the message (x in our example) is never used, since it may have been corrupted by an intruder. However, this information is sometimes used, for example in the extended version of NSPK where a key server is also involved.
When agent agt(y) receives from agt(z) the nonce N(agt(y), agt(z)) that he built for agt(z), he performs two actions. The first action is to send the last protocol message to agt(z). The second action consists in reporting the communication that agt(y) thinks he has established with agt(z). However, the reality may be different, and the identity of the real author of the message, z2, is used to fill the third field of the c_init term.

4. In the last step of the protocol, no message is sent, but when an agent receives the last message of the protocol, sent at step 3, he reports a communication in which he has the responder role.
mesg(x, agt(y), encr(pubkey(agt(y)), z2, ⟦N(agt(y), z)⟧)) → LHS ⊔ c_resp(agt(y), z, z2)

To prove the authentication property of the protocol, we need to prove that any pair of agents can securely establish a communication through the network, whatever the behavior of the other agents and of an intruder may be. Thus, we assume that there is an unbounded number of agent labels in Lagt, but we will observe two agents more precisely, namely the agents labeled by A and B. For the unbounded number of other agent labels we use integers built on the usual operators 0 and s (successor). Hence, Lagt = {A, B} ∪ N, and the initial set of terms E is the set of terms of the form goal(agt(x), agt(y)) where x, y ∈ Lagt. In other words, E is the set of all communication requests
– from A or B towards any other agent agt(i) with i ∈ N, and
– from agt(i) with i ∈ N towards A or B, and
– from any agent agt(i) to any agent agt(j), i, j ∈ N, and
– from A to B, B to A, A to A and B to B.

Note that we work in a very general setting where we also take into account the case where an agent uses the protocol to authenticate himself. Self-authentication of an agent may not be of practical interest but, if it happens, we want to verify that the intruder cannot take advantage of it to build an attack. The set E is recognized by the following tree automaton A0. The final state of A0 is qnet, and here is the set of transitions:

0 → qint                      s(qint) → qint
A → qA                        B → qB
agt(qint) → qagtI             agt(qA) → qagtA             agt(qB) → qagtB
qnet ⊔ qnet → qnet
goal(qagtA, qagtB) → qnet     goal(qagtB, qagtA) → qnet
goal(qagtA, qagtA) → qnet     goal(qagtB, qagtB) → qnet
goal(qagtA, qagtI) → qnet     goal(qagtI, qagtA) → qnet
goal(qagtB, qagtI) → qnet     goal(qagtI, qagtB) → qnet
goal(qagtI, qagtI) → qnet
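The automaton A0 just given is easy to replay mechanically. Below is a sketch in our own encoding (normalized transitions as pairs ((f, q1, ..., qn), q), ground terms as nested tuples, the AC symbol ⊔ written "U"); the bottom-up evaluator assumes there are no state-to-state transitions, which holds for A0.

```python
from itertools import product

DELTA0 = {
    (("0",), "qint"), (("s", "qint"), "qint"),
    (("A",), "qA"), (("B",), "qB"),
    (("agt", "qint"), "qagtI"), (("agt", "qA"), "qagtA"), (("agt", "qB"), "qagtB"),
    (("U", "qnet", "qnet"), "qnet"),           # the AC symbol ⊔, written U here
    (("goal", "qagtA", "qagtB"), "qnet"), (("goal", "qagtB", "qagtA"), "qnet"),
    (("goal", "qagtA", "qagtA"), "qnet"), (("goal", "qagtB", "qagtB"), "qnet"),
    (("goal", "qagtA", "qagtI"), "qnet"), (("goal", "qagtI", "qagtA"), "qnet"),
    (("goal", "qagtB", "qagtI"), "qnet"), (("goal", "qagtI", "qagtB"), "qnet"),
    (("goal", "qagtI", "qagtI"), "qnet"),
}

def states_of(t, delta):
    """All states reachable from ground term t (no epsilon transitions assumed)."""
    if isinstance(t, str):
        return {t}
    f, *args = t
    combos = product(*[states_of(a, delta) for a in args])
    return {q for c in combos for (lhs, q) in delta if lhs == (f, *c)}

# goal(agt(A), agt(B)) and goal(agt(0), agt(s(0))) are both in E = L(A0)
g1 = ("goal", ("agt", ("A",)), ("agt", ("B",)))
g2 = ("goal", ("agt", ("0",)), ("agt", ("s", ("0",))))
```

Evaluating g1 and g2 bottom-up reaches qnet, while an ill-formed request such as goal(A, B) (agent labels not wrapped in agt) reaches no state at all.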
Description of the Intruder

In this last automaton, the state qnet is a special state representing both the network and the fact base containing communication requests and communication reports. As in many other verification approaches for cryptographic protocols,
the intruder is supposed to have total control over the network. In particular, the intruder is assumed to know every message sent on the network. In our approach this assumption is a bit stronger: the intruder is the network. A direct consequence of this choice is that the knowledge of the intruder, and every message that the intruder can build, is supposed to always remain on the network. Furthermore, we suppose that the agents agt(i) with i ∈ N (i.e. every agent that is not A or B) may be dishonest and deliberately give the intruder their private key, as well as the content of any message they send or receive. The intruder can also disassemble messages or build new ones from his knowledge. Rewrite rules are the simplest way to describe how an intruder can decrypt or disassemble the components of a message. Since the agents agt(i) with i ∈ N are foolish enough to give their private keys to the intruder, he can decrypt the messages encrypted with their public keys. In contrast, we assume that the intruder has no means of guessing the private key of A or B. Here are the corresponding rules, which can be applied to the AC term representing the network, i.e. the intruder's knowledge:

cons(x, y) ⊔ z → LHS ⊔ x                         /* Disassembling */
cons(x, y) ⊔ z → LHS ⊔ y
mesg(x, y, z) ⊔ u → LHS ⊔ z
encr(pubkey(agt(0)), y, z) ⊔ u → LHS ⊔ z         /* Decrypting */
encr(pubkey(agt(s(x))), y, z) ⊔ u → LHS ⊔ z
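One pass of these disassembling and decrypting rules can be sketched as a function over the network, again with a Python set standing in for the AC term and tuples for terms (our encoding, not the paper's tool). Integer agent labels are ("0",) or ("s", ...), and only those agents' keys are compromised.

```python
def intruder_step(net):
    """Apply the disassembling/decrypting rules once to every term on the net."""
    out = set(net)                               # cumulative rules: keep everything
    for t in net:
        if t[0] == "cons":                       # disassemble a list cell
            out.add(t[1])
            out.add(t[2])
        elif t[0] == "mesg":                     # read a message body off the net
            out.add(t[3])
        elif t[0] == "encr":                     # decrypt only with a leaked key
            key = t[1]                           # key = ("pubkey", ("agt", label))
            if key[0] == "pubkey" and key[1][1][0] in ("0", "s"):
                out.add(t[3])                    # content of encr(k, author, content)
    return out

secret = ("N", ("agt", ("A",)), ("agt", ("0",)))
leak = ("encr", ("pubkey", ("agt", ("0",))), ("agt", ("A",)), secret)
safe = ("encr", ("pubkey", ("agt", ("A",))), ("agt", ("0",)), ("N", ("agt", ("0",)), ("agt", ("A",))))
net = intruder_step({leak, safe})
```

A message encrypted for agt(0) leaks its content into the intruder's knowledge, while the same content encrypted for agt(A) stays opaque.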
On the other hand, the intruder's ability to build new messages from his knowledge is concisely defined by means of some tree automaton transitions. Since qnet is the state of A0 recognizing all the messages on the network, and since in our setting the knowledge of the intruder is the network, qnet is also the state recognizing the knowledge of the intruder. First, we assume that the intruder knows the identity of every agent on the network, as well as their public keys:

agt(qint) → qnet          agt(qA) → qnet          agt(qB) → qnet
pubkey(qagtI) → qnet      pubkey(qagtA) → qnet    pubkey(qagtB) → qnet
Agents agt(i) with i ∈ N give the intruder the nonces they produce for other agents:

N(qagtI, qagtA) → qnet    N(qagtI, qagtB) → qnet    N(qagtI, qagtI) → qnet
Finally, starting from the components he already knows or will obtain later (i.e. terms in qnet), the intruder can combine them into lists with the cons operator, encrypt them with anything (including keys) he knows with the operator encr, build messages with the operator mesg, etc., in order to enrich his knowledge (the language recognized by qnet). Note, however, that the second field of the operator encr (which is a flag) cannot be corrupted by the intruder and always refers to qagtI, the real author of the encryption, i.e. the intruder:

cons(qnet, qnet) → qnet          null → qnet
mesg(qnet, qnet, qnet) → qnet    encr(qnet, qagtI, qnet) → qnet
There are several things to notice here. First, the initial description of L(A0, qnet) is as wide and loose as possible: roughly, it authorizes the intruder to build nearly every term of T(F), except terms containing nonces built by A or B, i.e. terms containing subterms of the form N(agt(A), agt(x)) or N(agt(B), agt(y)). This can be obtained automatically by a complement operation. This kind of specification is quite natural with regard to the intruder description, since it is much simpler and more convincing to specify what cannot be built by the intruder than to precisely and totally define what he can do. Consequently, the language recognized by the state qnet is loose, and it may also contain strangely formed messages whose effect on the protocol can hardly be predicted, for example:
mesg(agt(A), agt(B), encr(pubkey(agt(B)), agt(0), ⟦encr(pubkey(agt(A)), agt(0), ⟦N(agt(0), agt(A))⟧), N(agt(0), agt(B))⟧))
i.e. a message of the form agt(A) ↪ agt(B) : {{Nagt(0)}Kagt(A), Nagt(0)}Kagt(B). The language recognized by qnet also contains, for instance, terms representing repeated encryption (an unbounded number of times), which are important to consider for cryptographic protocol verification: encr(pubkey(agt(A)), agt(s(0)), encr(pubkey(agt(B)), agt(0), encr(... The last thing to remark here is that, during the approximation construction, new messages or message components m obtained by rewriting are added to the language recognized by the automaton Ai as new transitions in Ai+1, s.t. m →∗Ai+1 qnet, and can thus be used 'dynamically' as new base components for the intruder's message constructions. To sum up, we have described a model where we consider an unbounded number of agents executing an unbounded number of protocol sessions in parallel.
In particular, note that if there exists an attack based on parallel protocol sessions between, say, four agents A, B, C and D, this attack will appear in the model: C and D can be represented by two 'dishonest' agents, say agt(i) and agt(j) with i, j ∈ N and i ≠ j, since all 'dishonest' agents are able to respect the protocol.
5
Approximation and Verification
Extensions of Approximations to AC, non Left-Linear TRSs

In this section, we show how to extend the approximation construction to this larger class of TRSs. Roughly, the problem with non left-linear rules is the following: let f(x, x) → g(x) be a rule of R, and let A be a tree automaton whose set of transitions contains f(q1, q1) → q0 and f(q2, q3) → q0. Although we can construct a valid substitution σ = {x → q1} for matching the rewrite rule on the first transition, this is not the case for the second one. The semantics of a completion between the rule f(x, x) → g(x) and the transition f(q2, q3) → q0 would be to find the common language of terms recognized both by q2 and q3. This can
be obtained by computing a new tree automaton A′ with a set of states Q′ such that Q′ is disjoint from the states of A and ∃q ∈ Q′ : L(A′, q) = L(A, q2) ∩ L(A, q3). Then, to end the completion step, it would be enough to add the transitions of A′ to A together with the new transition g(q) → q0. However, adding the transitions of A′ to A also adds Q′ to the states of A. Thus, we add new states to A and, in some cases, this may lead to non-termination of the approximation construction. On the other hand, one can remark that the non-linearity problem would disappear with deterministic automata, since for any deterministic automaton Adet and for all distinct states q, q′ of Adet we trivially have L(Adet, q) ∩ L(Adet, q′) = ∅. However, the determinization of a tree automaton may result in an exponential blowup of the number of states [3]. Thus, we chose here to use locally deterministic tree automata: non-deterministic tree automata with some deterministic states, i.e. states q such that there are no two rules t → q and t → q′ with q ≠ q′. Hence, for every deterministic state q, we have ∀q′ ≠ q : L(A, q) ∩ L(A, q′) = ∅. During the approximation construction, if all the states matched by a non-linear variable of the left-hand side of a rule are deterministic, then it is enough to build critical pairs where the non-linear variables of the left-hand side are mapped to the same state. For instance, in the last example, it is enough to build the first critical pair, add the transition g(q1) → q0, and keep q2, q3 deterministic, i.e. such that L(TR↑(A), q2) ∩ L(TR↑(A), q3) = ∅. We now show the completeness of this algorithm on locally deterministic tree automata. For any non-linear term t, let us denote by tlin the linearized version of t, i.e. the term where all occurrences of non-linear variables are replaced by pairwise disjoint variables. For example, if t = f(x, y, g(x, x)), then tlin = f(x′, y, g(x″, x‴)).

Definition 4.
(States Matching) Let A be a tree automaton, Q its set of states, t ∈ T(F, X) a non-linear term, and {p1, ..., pn} ⊆ Pos(t) the set of positions of a non-linear variable x in t. We say that the states q1, ..., qn ∈ Q are matched by x iff ∃σ ∈ Σ(Q, X) s.t. tlinσ →∗A q ∈ Q, and tlinσ|p1 = q1, ..., tlinσ|pn = qn.

Theorem 2. (Completeness Extended to non Left-Linear TRSs) Let A be a tree automaton, R a TRS, TR↑(A) the corresponding approximation automaton and Q its set of states. For every non left-linear rule l → r ∈ R, for every non-linear variable x of l, and for all states q1, ..., qn ∈ Q matched by x, if either q1 = ... = qn or L(TR↑(A), q1) ∩ ... ∩ L(TR↑(A), qn) = ∅, then L(TR↑(A)) ⊇ R∗(L(A)).

Proof. (Sketch; see [10] for a detailed proof) Assume that there exists a term t such that t ∈ R∗(L(A)) and t ∉ L(TR↑(A)). Let s ∈ L(A) be such that s →∗R t. On the rewrite chain from s to t, let t1, t2 be the first two terms such that t1 ∈ L(TR↑(A)), t1 →R t2 and t2 ∉ L(TR↑(A)). We then show that the rule l → r ∈ R applied for rewriting t1 into t2 is necessarily a non left-linear rule (otherwise t2 would be in L(TR↑(A))). Then, we obtain that there exists a subterm u of t1 matched by all occurrences of a non-linear variable x in l, and that there exist at least two distinct states q, q′ of TR↑(A) such that u →∗TR↑(A) q and u →∗TR↑(A) q′. This contradicts the hypothesis of the theorem, since q and q′ are matched by x in l and L(TR↑(A), q) ∩ L(TR↑(A), q′) ⊇ {u} ≠ ∅. □
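The linearization tlin used above is purely mechanical and can be sketched in a few lines. In this hypothetical encoding (ours, not the paper's): variables are plain lowercase strings, terms are tuples whose first component is the function symbol, and fresh variables are written x_0, x_1, ... instead of x′, x″, x‴.

```python
from itertools import count

def linearize(t):
    """Replace every occurrence of a non-linear variable by a fresh variable;
    variables occurring only once are left untouched."""
    occ = {}
    def tally(u):                      # count variable occurrences in t
        if isinstance(u, str):
            occ[u] = occ.get(u, 0) + 1
        else:
            for a in u[1:]:
                tally(a)
    tally(t)
    fresh = count()
    def rename(u):                     # left-to-right renaming pass
        if isinstance(u, str):
            return "%s_%d" % (u, next(fresh)) if occ[u] > 1 else u
        return (u[0],) + tuple(rename(a) for a in u[1:])
    return rename(t)

t = ("f", "x", "y", ("g", "x", "x"))
# the three occurrences of the non-linear x become pairwise distinct variables,
# while the linear y is unchanged, matching the f(x', y, g(x'', x''')) example
```

For the example term of the text, linearize yields f(x_0, y, g(x_1, x_2)).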
Rewriting for Cryptographic Protocol Verification
285
In our framework, the states matched by non-linear variables are easily kept deterministic. For example, in the NSPK specification, non-linear variables always match the terms A, B, i ∈ N (representing agent labels), which are initially recognized by qA, qB, qint, respectively. These states are initially deterministic, and this property is trivially preserved during completion since agent labels do not occur in the right-hand sides of rules, and thus do not occur in the new transitions to be added. However, when necessary, we can also automatically check this property on TR↑(A) by proving that L(TR↑(A), q1) ∩ ... ∩ L(TR↑(A), qn) = ∅ for each non-linear variable x of a rule matching distinct states q1, ..., qn. For dealing with the AC symbol, the extension is straightforward. Since the approximation can deal with non-terminating TRSs, we can explicitly define the AC behavior of a symbol. Thus, we replace in F the (implicit) AC symbol ⊔ by a non-AC symbol U and add to R the following left-linear rules defining explicitly the AC behavior of U:

x U y → y U x
(x U y) U z → x U (y U z)
x U (y U z) → (x U y) U z
Approximation Function

Let R and A0 be, respectively, the set of all rewrite rules and the tree automaton given above. Our aim is now to compute a tree automaton TR↑(A0) recognizing a superset of R∗(L(A0)), and thus to over-approximate the network, i.e. the set of all possible sent messages (as well as the set of communication reports). We now give the approximation function γ defining the folding positions for R and A0. For the approximation, the first choice we have made is to confuse the dishonest agents (agt(i) with i ∈ N) together. In other words, in our approximation, no difference is made between agents agt(i) and agt(j) for any i, j ∈ N. However, we still distinguish between agt(A), agt(B) and any agent agt(i) with i ∈ N. In a similar manner, we collapse together all the messages sent and received by dishonest agents, but we still do not confuse messages involving agt(A) or agt(B). For example, the approximation function used for rule 1, i.e. goal(x, y) → LHS U mesg(x, y, encr(pubkey(y), x, ⟦N(x, y), x⟧)), is such that there are only seven distinct values for γ (the detail of the sequences of new states used for each value can be found in [10], with the complete specification):

(i) γ(1, qnet, {x → qagtA, y → qagtB})     (ii) γ(1, qnet, {x → qagtB, y → qagtA})
(iii) γ(1, qnet, {x → qagtA, y → qagtA})   (iv) γ(1, qnet, {x → qagtB, y → qagtB})
(v) γ(1, qnet, {x → qagtI, y → qagtA})     (vi) γ(1, qnet, {x → qagtI, y → qagtB})
(vii) γ(1, qnet, {y → qagtI})
According to case (i), all messages generated by rule (1), where x is the agent labeled A and y is the agent labeled B, are decomposed using the same states, defined by the sequence γ((1), qnet, {x ↦ qagtA, y ↦ qagtB}). Similarly, case (vii) means that all messages generated by rule (1), where y is an
Thomas Genet and Francis Klay
agent labeled by i ∈ N and x is any agent, are decomposed using states of the same sequence γ((1), qnet, {y ↦ qagtI}). Thus, no difference is made, for example, between messages sent by agt(A) to agt(i), messages sent by agt(B) to agt(j), and messages sent by agt(i) to agt(j) for any i, j ∈ N. This is in fact natural, since all messages sent to a dishonest agent are captured and factorized by the same intruder.
Verification
We use a prototype, based on a tree automata library [9,8] developed in ELAN [2], which automatically computes approximations for a given R, A0 and approximation function γ. Thanks to the approximation function given above, we obtain a finite tree automaton TR↑(A0), with about 130 states and 340 transitions, recognizing a regular superset of R∗(L(A0)). See [10] for the complete specification and a complete listing of the automaton TR↑(A0). With this automaton, we can directly verify that NSPK has the confidentiality and authentication properties. For confidentiality, it is enough to verify that the intruder cannot capture a nonce of the form N(agt(x), agt(y)) where x, y ∈ {A, B}. Since in our model the intruder emits all his knowledge on the network (as explained in Section 4), this can be done by checking that the intruder cannot emit a nonce of the form N(agt(A), agt(B)), N(agt(B), agt(A)), . . . , i.e. that the intersection between TR↑(A0) and the automaton Aconf is empty. The final state of Aconf is qnet and its transitions are:
A → qA    B → qB
agt(qA) → qagtA    agt(qB) → qagtB
N(qagtA, qagtB) → qnet    N(qagtB, qagtA) → qnet
N(qagtA, qagtA) → qnet    N(qagtB, qagtB) → qnet
qnet U qnet → qnet
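A bottom-up tree automaton such as Aconf can be sketched in a few lines of executable code (an illustrative reconstruction with our own encoding, not the ELAN library): transitions map a symbol applied to states to a resulting state, and running the automaton on a candidate secret term performs the kind of recognition test underlying the emptiness-of-intersection check.

```python
# Illustrative sketch (our encoding, not the ELAN tree-automata library):
# a bottom-up tree automaton is a dict mapping (symbol, tuple-of-states) to a
# state; run() returns the set of states reachable at the root of a ground term.
def run(trans, term):
    """Return the set of states the automaton can reach at the root of term."""
    f, args = term
    arg_states = [run(trans, a) for a in args]
    states = set()
    for (g, qs), q in trans.items():
        if g == f and len(qs) == len(args) and all(
                s in reach for s, reach in zip(qs, arg_states)):
            states.add(q)
    return states

# A fragment of the transitions of A_conf; final state 'qnet'.
conf = {
    ('A', ()): 'qA', ('B', ()): 'qB',
    ('agt', ('qA',)): 'qagtA', ('agt', ('qB',)): 'qagtB',
    ('N', ('qagtA', 'qagtB')): 'qnet',
    ('N', ('qagtB', 'qagtA')): 'qnet',
}
secret = ('N', [('agt', [('A', [])]), ('agt', [('B', [])])])
print('qnet' in run(conf, secret))  # the nonce N(agt(A), agt(B)) is recognized
```

The actual confidentiality check computes the product automaton of TR↑(A0) and Aconf and tests its language for emptiness; the membership test above is the cell-level operation that the product construction repeats over all state pairs.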
The intersection can be automatically computed, and we obtain a tree automaton whose set of states is empty, i.e. the recognized language is empty. Hence, there is no term of L(Aconf) in L(TR↑(A0)), nor in R∗(L(A0)). Similarly, the cases where authentication is violated can be described by the following automaton Aaut, whose final state is qnet and whose transitions are:
0 → qint    s(qint) → qint    A → qA    B → qB
agt(qint) → qagtI    agt(qA) → qagtA    agt(qB) → qagtB
qnet U qnet → qnet
c_init(qagtA, qagtB, qagtI) → qnet    c_init(qagtA, qagtB, qagtA) → qnet
c_resp(qagtB, qagtA, qagtI) → qnet    c_resp(qagtB, qagtA, qagtB) → qnet
c_init(qagtB, qagtA, qagtI) → qnet    c_init(qagtB, qagtA, qagtB) → qnet
c_resp(qagtA, qagtB, qagtI) → qnet    c_resp(qagtA, qagtB, qagtA) → qnet
c_init(qagtA, qagtA, qagtI) → qnet    c_resp(qagtA, qagtA, qagtI) → qnet
c_init(qagtA, qagtA, qagtB) → qnet    c_resp(qagtA, qagtA, qagtB) → qnet
c_init(qagtB, qagtB, qagtI) → qnet    c_resp(qagtB, qagtB, qagtI) → qnet
c_init(qagtB, qagtB, qagtA) → qnet    c_resp(qagtB, qagtB, qagtA) → qnet
encoding all the cases where there is a distortion, in the communication reports, between the beliefs of the parties and reality; for example, terms of the form c_init(agt(A), agt(B), agt(k)) for k ∈ N ∪ {A}, meaning that agt(A) thinks it has established a communication with B but has in reality been fooled and communicates with some agt(i), i ∈ N, or with itself. The intersection between TR↑(A0) and the automaton Aaut is also empty (see [10] for traces of execution).
Rewriting for Cryptographic Protocol Verification
6 Conclusion
In this paper, we have shown an application of descendant approximation to cryptographic protocol verification. We have obtained a positive proof of authentication and confidentiality for NSPK. Moreover, applying the same approximation mechanism to the flawed NSPK specification of [19] has led to some non-empty intersections with Aconf and Aaut, signaling violations of the confidentiality and authentication properties. An interesting aspect of this method is that it takes advantage of theorem proving and of a form of abstract interpretation called approximation. The basic deduction mechanism, coming from the domain of theorem proving, provides some simple and efficient tools – tree automata – for manipulating infinite objects. On the other hand, approximation simplifies the proof in such a way that it can be computed automatically afterwards. Compared to other rewriting-based verification techniques, like proofs by consistency or proofs by induction, the properties that can be proved with the approximation technique are clearly more restricted: they could be qualified as 'regular properties'. However, by restricting attention to 'regular properties', we obtain a verification technique that enjoys many interesting practical properties: termination of the TRS is not needed, the TRS may include AC symbols, proofs are obtained by intersections with TR↑(A0) (automatically and quickly computed), and the construction of TR↑(A0) is automatic, incremental and can be guaranteed to terminate by a good choice of the approximation function γ (as in the NSPK case above, or in a fully automatic way as in [9]). Constructing an approximation function does not require any particular skill in formal proof, since it only consists of pointing out some sets of objects (represented here by states recognizing regular sets of terms) to be merged together in order to build an approximated model.
In the NSPK case, the approximation γ has been given entirely by hand, but it is systematic: for each distinct value of the co-domain of γ, the user has to give a sequence of fresh states used for normalizing new transitions. For historical reasons, this step is manual in our prototype, but it will be automated in the new implementation of this tool, which is in progress. We can also compare this technique with other verification techniques used for cryptographic protocols. The first main difference to be pointed out is that our technique is not designed for discovering attacks. From the approximation TR↑(A0), we can derive some information on the context of those attacks, but it is approximate and should be studied with a theorem prover or a model checker to reconstruct an exact trace of the attack. Model checking is, in fact, particularly well suited to attack discovery, as shown by the many flaws discovered by G. Lowe [15]. Furthermore, when attacks are no longer found, model checking can also be used to verify cryptographic protocols by lifting the properties proved on a finite domain to an unbounded one. However, the lifting has to be done by hand, as G. Lowe did in [14], or in a more automatic way by abstract interpretation, as done by D. Bolignano in [1]. Although we started with a different formalism and used a different technique, our approach is very close to D. Bolignano's. In particular, approximation functions can be seen as particular abstract
interpretations. Nevertheless, approximations enjoy a property that abstract interpretations do not have in general: the safety of our abstract model (the approximation) is implicit and guaranteed by Theorem 1 for every approximation function γ. Automated theorem proving has also been widely used for cryptographic protocol verification. The NRL Protocol Analyzer, developed by C. Meadows [17], uses narrowing. L. Paulson applied induction proofs and the theorem prover Isabelle/HOL to the verification of cryptographic protocols [20]. These two theorem-proving approaches achieve a very detailed verification of protocols. In particular, they provide one of the most convincing answers to the problem of freshness. On the other hand, the proofs may diverge, and the main difficulty remains to inject the right lemma at the right moment in order to make the proof converge. Thus, the automation of this kind of method remains partial. Furthermore, proofs are long and complex, and they require a user with strong practical experience of the prover. A more recent work is due to C. Weidenbach [22], who gave a positive proof for the Neuman-Stubblebine protocol using the theorem prover SPASS. His technique is based on the saturation of sets of Horn clauses, which is related to the descendant computation we use here. For a restricted class of clauses called semi-linear, saturation can be computed exactly. However, when the protocol specification cannot be encoded into semi-linear clauses, the saturation process may diverge. Thus, specifications must be modified in order to ensure termination of the process. In our framework, no restriction is set on the TRSs we use; instead, we defined an over-approximation technique in order to tackle the divergence problem. In [6], G. Denker, J. Meseguer and C. Talcott proposed to encode NSPK into object-oriented TRSs. This encoding is executable and is used for detecting attacks in the initial version of the protocol by testing.
Using objects is clearly a great advantage for the clarity and readability of the encoding. Nevertheless, since rewriting remains the operational model of object-oriented rewriting, it should be possible to extend approximations to objects and thus benefit from the clarity of object-oriented specifications. In [18], D. Monniaux also uses tree automata and a completion mechanism for verifying cryptographic protocols. With regard to our work, an important difference is that his method can only deal with a bounded number of agents and a bounded number of protocol sessions. From a more technical point of view, unlike our approach, rewriting is only used for estimating intruder knowledge and not for encoding the protocol itself. Moreover, his completion mechanism is limited to the decidable and well-known case of collapsing rules4, covered by the decidable and more general case of right-linear and monadic rules [21]. However, this approach is interesting since it shows a possible way of combining tree automata and state-transition models for the abstract interpretation of protocols: tree automata and completion for abstracting structures, and state-transition models for representing the notion of time in the abstract model.
4 The right-hand side of a collapsing rule is reduced to a variable occurring in its left-hand side.
In the approximation model we consider for NSPK, time is totally collapsed, i.e. every message is considered to be permanently sent and received at every moment. Collapsing time lets us easily consider an infinite number of protocol sessions in a finite model. Although this does not raise problems for proving confidentiality or authentication properties of NSPK, this is not the case in general. For instance, in electronic-commerce protocols like SET [16], there is little hope of proving any security property on an abstract model with no time, since freshness plays a central role. A direct solution is to consider several states of the network (i.e. of the intruder knowledge) at different steps of the protocol, instead of collapsing all states into one. Our main goal is to be able to handle protocols as complex as SET. To achieve this goal, the first thing to consider is to formally define the concepts used in cryptographic protocols (keys, nonces, agents, . . . ) in order to get a natural protocol description language and an automatic translator to the encoding presented in this paper. The second point would be, on the one hand, to extend the present work with conditional rules in order to get a more powerful behavior description language and, on the other hand, to handle other tree grammars to get finer approximations. Finally, we think that approximations could be used for the verification of systems other than cryptographic protocols. Rewriting-based approximation seems to be a way to combine, in the same formalism, automated theorem-proving techniques and abstract interpretation: theorem proving for proving properties needing high-level proof techniques – like induction – and approximations for proving the remaining parts of the proof, where abstract interpretation and model checking are enough.
Acknowledgments
We would like to thank Pascal Brisset for discussions about cryptographic protocols and Pierre-Etienne Moreau for technical help with ELAN.
References
1. D. Bolignano. Towards a Mechanization of Cryptographic Protocol Verification. In Proc. 9th CAV Conf., Haifa (Israel), volume 1254 of LNCS. Springer-Verlag, 1997.
2. P. Borovanský, C. Kirchner, H. Kirchner, P.-E. Moreau, and M. Vittek. ELAN: A logical framework based on computational systems. In Proc. 1st WRLA, volume 4 of ENTCS, Asilomar (California), 1996.
3. H. Comon, M. Dauchet, R. Gilleron, F. Jacquemard, D. Lugiez, S. Tison, and M. Tommasi. Tree automata techniques and applications. http://l3ux02.univ-lille3.fr/tata/, 1997.
4. J. Coquidé, M. Dauchet, R. Gilleron, and S. Vágvölgyi. Bottom-up tree pushdown automata and rewrite systems. In R. V. Book, editor, Proc. 4th RTA Conf., Como (Italy), volume 488 of LNCS, pages 287–298. Springer-Verlag, 1991.
5. M. Dauchet and S. Tison. The theory of ground rewrite systems is decidable. In Proc. 5th LICS Symp., Philadelphia (Pa., USA), pages 242–248, June 1990.
6. G. Denker, J. Meseguer, and C. Talcott. Protocol Specification and Analysis in Maude. In Proc. 2nd WRLA Workshop, Pont-à-Mousson (France), 1998.
7. N. Dershowitz and J.-P. Jouannaud. Handbook of Theoretical Computer Science, volume B, chapter 6: Rewrite Systems, pages 244–320. Elsevier Science Publishers B. V. (North-Holland), 1990. Also as: Research report 478, LRI.
8. T. Genet. Tree Automata Library. http://www.loria.fr/ELAN/.
9. T. Genet. Decidable approximations of sets of descendants and sets of normal forms. In Proc. 9th RTA Conf., Tsukuba (Japan), volume 1379 of LNCS, pages 151–165. Springer-Verlag, 1998.
10. T. Genet and F. Klay. Rewriting for cryptographic protocols verification (extended version). Technical report, INRIA, 2000. http://www.irisa.fr/lande/genet/publications.html.
11. R. Gilleron and S. Tison. Regular tree languages and rewrite systems. Fundamenta Informaticae, 24:157–175, 1995.
12. F. Jacquemard. Decidable approximations of term rewriting systems. In H. Ganzinger, editor, Proc. 7th RTA Conf., New Brunswick (New Jersey, USA), pages 362–376. Springer-Verlag, 1996.
13. G. Lowe. An Attack on the Needham-Schroeder Public-Key Protocol. IPL, 56:131–133, 1995.
14. G. Lowe. Breaking and fixing the Needham-Schroeder public-key protocol using CSP and FDR. In Proc. 2nd TACAS Conf., Passau (Germany), volume 1055 of LNCS, pages 147–166. Springer-Verlag, 1996.
15. G. Lowe. Some New Attacks upon Security Protocols. In 9th Computer Security Foundations Workshop. IEEE Computer Society Press, 1996.
16. Mastercard & Visa. Secure Electronic Transactions. http://www.visa.com/set/, 1996.
17. C. A. Meadows. Analyzing the Needham-Schroeder Public Key Protocol: A comparison of two approaches. In Proc. 4th ESORICS Symp., Rome (Italy), volume 1146 of LNCS, pages 351–364. Springer-Verlag, 1996.
18. D. Monniaux. Abstracting Cryptographic Protocols with Tree Automata. In Proc. 6th SAS, Venezia (Italy), 1999.
19. R. M. Needham and M. D. Schroeder.
Using Encryption for Authentication in Large Networks of Computers. CACM, 21(12):993–999, 1978.
20. L. Paulson. Proving Properties of Security Protocols by Induction. In 10th Computer Security Foundations Workshop. IEEE Computer Society Press, 1997.
21. K. Salomaa. Deterministic Tree Pushdown Automata and Monadic Tree Rewriting Systems. J. of Computer and System Sciences, 37:367–394, 1988.
22. C. Weidenbach. Towards an Automatic Analysis of Security Protocols. In Proc. 16th CADE Conf., Trento (Italy), volume 1632 of LNAI, pages 378–382. Springer-Verlag, 1999.
System Description: *sat
A Platform for the Development of Modal Decision Procedures⋆

Enrico Giunchiglia and Armando Tacchella
DIST, Università di Genova
Viale Causa 13 – 16145 Genova, Italy
{enrico,tac}@dist.unige.it
Abstract. *sat is a platform for the development of modal decision procedures. Currently, *sat features decision procedures for the normal modal logic K(m) and for the classical modal logic E(m). *sat embodies a state-of-the-art SAT solver, and includes techniques for optimizing automated deduction in modal and temporal logics. Owing to its modular design and to the extensive reuse of software components, *sat provides an open, easy-to-maintain, yet efficient implementation framework.
1 Introduction
In this paper we present *sat, a platform for the development of SAT-based decision procedures. By SAT-based we mean built on top of a SAT solver, in the spirit of [1]. Currently, *sat features SAT-based decision procedures for the normal modal logic K(m) and for the classical modal logic E(m) [9,2]. The *sat propositional engine is an embedded version of sato 3.2, one of the most efficient SAT checkers publicly available [3]. We chose sato because it is a fast propositional reasoner and it features many optimizations that we exploited in *sat. We also implemented other optimizations that speed up modal reasoning:
– early investigation of modal successors [1],
– internal optimized clause form conversions [2], and
– caching structures and retrieval algorithms [4].
*sat has been designed to be modular and to allow for an easy integration of new decision procedures and optimizations. The system is implemented in C and extensively reuses software components from state-of-the-art systems, i.e., sato and the glu library of data types from the vis model-checking system [5]. The glu library provides *sat with efficient implementations of, e.g., lists, hash tables and sparse matrices. By taking sato and glu off-the-shelf, we inherit and exploit
⋆ We wish to thank Fausto Giunchiglia, Peter Patel-Schneider and Roberto Sebastiani for useful discussions related to *sat. This work is supported by MURST.
D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 291–296, 2000.
© Springer-Verlag Berlin Heidelberg 2000
in *sat several years of experience in building highly optimized data structures and algorithms for automated deduction in propositional and temporal logics. *sat source code, documentation and experimental results are available on the WWW at: http://www.mrg.dist.unige.it/~tac/StarSAT.html
2 Algorithms
For the sake of clarity, we introduce some preliminary notions. The set of formulas is constructed starting from a given set of propositional letters and applying the 0-ary operators ⊤ and ⊥ (representing truth and falsity respectively), the unary operators ¬ and □, and the binary operators ∧, ∨, ⊃ and ≡.1 A modal logic is a set of formulas (called theorems) closed under tautological consequence. A formula ϕ is consistent in a modal logic L (or L-consistent) if ¬ϕ is not a theorem of L, i.e., if ¬ϕ ∉ L. By atom we mean a propositional letter or a formula of the form □ϕ. A literal is either an atom or the negation of an atom. An assignment is any conjunction µ of literals such that for any pair ψ, ψ′ of conjuncts in µ, it is not the case that ψ = ¬ψ′. An assignment µ satisfies a formula ϕ if µ entails ϕ by propositional reasoning. Consider a formula ϕ and let L be a modal logic. Whether ϕ is L-consistent can be determined by implementing two mutually recursive procedures:
– Lsat(ϕ) for the generation of assignments satisfying ϕ, and
– Lconsist(µ) for testing the L-consistency of each generated assignment µ.
The procedure Lsat is independent of the particular modal logic L considered, and can be based on any propositional decision procedure (see [2]). Indeed, the logic-specific reasoning is delegated to Lconsist. Currently, *sat features the procedures Econsist and Kconsist, playing the role of Lconsist for the logics E(m) and K(m) respectively. For lack of space, we present only the Lsat algorithm here. For Econsist and Kconsist, see [2] and also [4]. The Lsat procedure implemented in *sat is based on sato 3.2 [3], an efficient implementation of the Davis-Putnam-Logemann-Loveland (DP) procedure [6]. Figure 1 shows a high-level description of Lsat. In the figure:
– cnf(ϕ) is a set of clauses obtained from ϕ by applying a conversion to conjunctive normal form (CNF) based on renaming (see, e.g., [7]).
– choose-literal(Φ, µ) returns a literal occurring in Φ, chosen according to some heuristic criterion.
– if l is a literal, l̄ stands for ¬A if l = A, and for A if l = ¬A;
– for any literal l and set Φ of clauses, assign(l, Φ) is the set of clauses obtained from Φ by (i) deleting the clauses in which l occurs as a disjunct, and (ii) eliminating l̄ from the others.
1 For simplicity, we consider the case with only one modality.
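The renaming-based CNF conversion mentioned above (cf. [7]) can be sketched as follows; this is our own minimal illustration over ∧/∨ formulas of literals, not the optimized internal conversion of *sat.

```python
# Minimal sketch (ours) of renaming-based CNF conversion in the style of
# structure-preserving translations [7]: each compound subformula gets a
# fresh letter x, and only the clauses for the direction x -> definition
# (positive polarity) are kept.
import itertools

fresh = itertools.count(1)

def neg(l):
    """Complement of a string literal: 'p' <-> '-p'."""
    return l[1:] if l.startswith('-') else '-' + l

def rename(phi, clauses):
    """Return a literal equisatisfiable with phi; definitions go to clauses."""
    if isinstance(phi, str):          # already a literal
        return phi
    op, l, r = phi
    a, b = rename(l, clauses), rename(r, clauses)
    x = 'x%d' % next(fresh)
    if op == 'or':                    # x -> a ∨ b, i.e. the clause ¬x ∨ a ∨ b
        clauses.append([neg(x), a, b])
    elif op == 'and':                 # x -> a and x -> b
        clauses.append([neg(x), a])
        clauses.append([neg(x), b])
    return x

clauses = []
top = rename(('or', ('and', 'p', 'q'), 'r'), clauses)
clauses.append([top])                 # assert the renamed top-level formula
print(clauses)
```

The resulting clause set is equisatisfiable with the input formula and only linearly larger, which is why *sat can feed arbitrary formulas to a CNF-only SAT engine.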
function Lsat(ϕ)
  return Lsatdp(cnf(ϕ), ⊤).

function Lsatdp(Φ, µ)
  if Φ = ∅ then return Lconsist(µ);                             /* base */
  if ∅ ∈ Φ then return False;                                   /* backtrack */
  if a unit clause {l} is in Φ then
    return Lsatdp(assign(l, Φ), µ ∧ l);                         /* unit */
  if not Lconsist(µ) then return False;                         /* early pruning */
  l := choose-literal(Φ, µ);                                    /* split */
  return Lsatdp(assign(l, Φ), µ ∧ l) or Lsatdp(assign(l̄, Φ), µ ∧ l̄).

Fig. 1. Lsat and Lsatdp

As can be observed, the procedure Lsatdp in Figure 1 is the DP procedure modulo (i) the call to Lconsist(µ) when it finds an assignment µ satisfying the input formula (Φ = ∅), and (ii) the early pruning step, i.e., a call to Lconsist(µ) that forces backtracking after each unit propagation when the incomplete assignment is not L-consistent. Early pruning prevents *sat from thrashing, i.e., from repeatedly generating different assignments that contain the same inconsistent kernel [1,8].
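The procedure of Figure 1 translates almost line for line into executable form. The following is an illustrative Python rendering (ours, not the sato-based C implementation); clauses are frozensets of integer literals, and lconsist is stubbed to accept every assignment, which reduces the procedure to plain DPLL:

```python
# Illustrative rendering of Lsat_dp (Fig. 1): clauses are frozensets of
# integer literals (negative = negated letter), an assignment mu is the set
# of literals chosen so far. lconsist is a stub for the logic-specific check
# (Kconsist/Econsist in *sat); here it accepts everything.
def lconsist(mu):
    return True  # stub: purely propositional case

def assign(l, clauses):
    # delete clauses containing l, remove the complement of l from the rest
    return [c - {-l} for c in clauses if l not in c]

def lsat_dp(clauses, mu=frozenset()):
    if not clauses:
        return lconsist(mu)                          # base
    if frozenset() in clauses:
        return False                                 # backtrack
    unit = next((c for c in clauses if len(c) == 1), None)
    if unit:
        (l,) = unit
        return lsat_dp(assign(l, clauses), mu | {l})  # unit
    if not lconsist(mu):
        return False                                 # early pruning
    l = next(iter(clauses[0]))                       # choose-literal
    return (lsat_dp(assign(l, clauses), mu | {l}) or
            lsat_dp(assign(-l, clauses), mu | {-l}))  # split

# (p ∨ q) ∧ (¬p ∨ q) ∧ ¬q is unsatisfiable; (p ∨ q) ∧ ¬p is satisfiable
print(lsat_dp([frozenset({1, 2}), frozenset({-1, 2}), frozenset({-2})]))  # False
print(lsat_dp([frozenset({1, 2}), frozenset({-1})]))                      # True
```

With a real Lconsist, the base case hands each complete propositional assignment to the modal consistency check, and the early-pruning call cuts off partial assignments that are already modally inconsistent.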
3 Implementation and Features
The *sat modular architecture is depicted in Figure 2. The thickest external box represents the whole system and, inside it, each solid box represents a different module. By module, we mean a set of routines dedicated to a specific task.2 The dashed horizontal lines single out the four main parts of *sat:
INTERFACE: The modules KRIS, KSATC, LWB, and TPTP are parsers for different input syntaxes. The module TREES stores the input formula as a tree, at the same time performing some simple preprocessing (e.g. pushing negations down to atoms).
DATA: The module DAGS (for Directed Acyclic Graphs) implements the main data structures of *sat. The input formula is preprocessed and stored as a DAG. A Look-Up Table (LUT), mapping each atom □ψ into a newly introduced propositional letter Cψ, is built. Then, each modal atom is replaced by the corresponding propositional letter. The initial preprocessing makes it possible to map trivially equivalent3 modal atoms into a single propositional letter, thus fostering the detection of (un)satisfiable subformulae [1,10].
ENGINE: This part includes the module SAT, the propositional core of *sat. Since SAT implements a DP algorithm, techniques like semantic branching,
2 As a matter of fact, each module corresponds to a file in the *sat distribution package.
3 Technically, the preprocessing maps a formula ϕ into a formula ϕ′ which is logically equivalent to ϕ in any classical modal logic (see [9] for the definition of classical modal logic).
[Fig. 2. *sat modular architecture: the INTERFACE part (parsers KSATC, KRIS, LWB, TPTP and the TREES module), the DATA part (DAGS), the ENGINE part (SAT with its CNF routines), and the LOGICS part (K(m), E(m) and CACHING), tied together by DPSAT, with MONITOR and STAT spanning all parts.]
boolean constraint propagation (BCP) and heuristic-guided search are inherited for free by *sat. The dashed box (labeled CNF) stands for a set of DPSAT routines implementing CNF conversions based on renaming. The CNF routines allow *sat to handle any formula, even if the SAT decider accepts CNF formulae only.
LOGICS: Currently, *sat features the two modules K(m) and E(m), implementing Kconsist and Econsist respectively. The dotted box "*" is a placeholder for other L-consistency modules that will be implemented in the near future. CACHING implements data structures and retrieval algorithms that are used to optimize the L-consistency checking routines contained in the logic-dependent modules (see [2] for caching in E(m) and [4] for caching in K(m)).
The modules DPSAT, MONITOR and STAT span different parts of *sat. DPSAT interfaces the inner modules with each other. The result is that these modules are loosely coupled and can be modified/replaced (almost) independently of each other. MONITOR records information about *sat performance, e.g., cpu time, memory consumption, and number of L-consistency checks. STAT explores the preprocessed input formula and provides information like the number of occurrences of a variable and the number of nested boxes. This information is used by different modules, e.g., for dimensioning the internal data structures.
To understand the behavior of *sat, let ϕ be the formula ¬(¬□(□C2 ∧ □C1) ∧ □C2). *sat first stores ϕ as an intermediate representation (provided by TREES), where it undergoes some preliminary transformations. In our case, ϕ becomes (□(□C2 ∧ □C1) ∨ ¬□C2). Then, the building of the internal representation (provided by DAGS) causes lexical normalization ((□C2 ∧ □C1) would be
[Fig. 3. Internal representation of concepts in *sat]
rewritten into (□C1 ∧ □C2)) and propositional simplification (e.g., (C ∨ C) would be rewritten into C) to be performed on ϕ. The resulting formula is represented by the data structure depicted in Figure 3 (left). Next, *sat creates the LUT and replaces each modal atom with the corresponding propositional letter. The result is depicted in Figure 3 (right), where the numbers appearing in the LUT have the obvious meaning. Notice that the top-level formula ϕ′ = C5 ∨ ¬C3 is now purely propositional. If SAT accepts only CNF formulae, then (i) for every LUT entry Cψ, both ψ and ¬ψ are converted to CNF, and (ii) the top-level formula ϕ′ is replaced by its CNF conversion. Finally, the core decision process starts. SAT is properly initialized and called with ϕ′ as input. Once a satisfying truth assignment is found, a logic-dependent module (e.g. K(m)) is called to check its L-consistency. The recursive tests are built in constant time, using the LUT to reference the subformulae. The process continues until no more truth assignments are possible or a model is found ([2] details this process for K(m), E(m) and several other classical modal logics).
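The LUT step just described can be sketched in executable form (an illustrative reconstruction with our own names; *sat itself does this in C over shared DAGs): walk the formula bottom-up, replace each boxed subformula by a fresh propositional letter, and reuse the same letter for syntactically equal boxed bodies.

```python
# Hypothetical sketch of the LUT step: formulas are tuples such as
# ('box', phi), ('and', l, r), ('or', l, r), ('not', phi), ('lit', name).
# abstract() returns a purely propositional formula and fills `lut`,
# which maps each fresh letter back to the (abstracted) boxed body.
def abstract(phi, lut):
    op = phi[0]
    if op == 'box':
        body = abstract(phi[1], lut)
        # equal bodies share one letter (cf. unique storage in DAGS)
        for letter, stored in lut.items():
            if stored == body:
                return ('lit', letter)
        letter = 'C%d' % (len(lut) + 1)
        lut[letter] = body
        return ('lit', letter)
    if op in ('and', 'or'):
        return (op, abstract(phi[1], lut), abstract(phi[2], lut))
    if op == 'not':
        return ('not', abstract(phi[1], lut))
    return phi  # propositional letter, left untouched

lut = {}
phi = ('or', ('box', ('and', ('box', ('lit', 'p')), ('box', ('lit', 'q')))),
             ('not', ('box', ('lit', 'q'))))
top = abstract(phi, lut)
print(top)   # purely propositional formula over the fresh letters
print(lut)   # letters mapped back to their boxed bodies
```

Note how the second occurrence of the boxed subformula over q is mapped to the same letter as the first, so the SAT engine sees one propositional variable for both.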
References
1. F. Giunchiglia and R. Sebastiani. Building decision procedures for modal logics from propositional decision procedures – the case study of modal K. In Proc. CADE-96, LNAI, 1996.
2. E. Giunchiglia, F. Giunchiglia, and A. Tacchella. SAT-Based Decision Procedures for Classical Modal Logics. To appear in Journal of Automated Reasoning.
3. H. Zhang. SATO: An efficient propositional prover. In Proc. CADE-97, volume 1249 of LNAI, 1997.
4. E. Giunchiglia and A. Tacchella. Subset-matching Size-bounded Caching for Satisfiability in Modal Logics, 2000. Submitted.
5. VIS-group. VIS: A system for verification and synthesis. In Proc. CAV-96, LNAI, 1996.
6. M. Davis, G. Logemann, and D. Loveland. A machine program for theorem proving. Communications of the ACM, 5(7), 1962.
7. D. A. Plaisted and S. Greenbaum. A Structure-preserving Clause Form Translation. Journal of Symbolic Computation, 2:293–304, 1986.
8. I. Horrocks. Optimizing Tableaux Decision Procedures for Description Logics. PhD thesis, University of Manchester, 1997.
9. B. F. Chellas. Modal Logic – an Introduction. Cambridge University Press, 1980.
10. U. Hustadt and R. A. Schmidt. On evaluating decision procedures for modal logic. In Proc. IJCAI-97, 1997.
System Description: DLP

Peter Patel-Schneider
Bell Labs Research, Murray Hill, NJ, U.S.A.
[email protected]
DLP (Description Logic Prover) is an experimental description logic knowledge representation system. DLP implements an expressive description logic that includes propositional dynamic logic as a subset. DLP provides a simple interface allowing users to build knowledge bases of descriptions in this description logic but, as an experimental system, DLP does not have a full user interface. Because of the correspondence between description logics and propositional modal logics, DLP can serve as a reasoner for several propositional modal logics. As well as propositional dynamic logic, the logic underlying DLP contains fragments that are in direct correspondence with the propositional modal logics K(m) and K4(m). DLP provides an interface that allows direct satisfiability checking of formulae in K(m) and K4(m). Using a standard encoding, the interface also allows satisfiability checking of formulae in KT(m) and S4(m). DLP is available via the WWW at http://www.bell-labs.com/user/pfps. DLP is implemented in SML/NJ. The current version of DLP, version 4.1, includes a number of new optimisations and options not included in previous versions. One of the purposes in building DLP was to investigate various optimisations for description logic systems. A number of these optimisations have appeared in various description logic systems [1,3,7]. As there is still a need to investigate optimisations further and to develop new optimisation techniques, DLP has a number of compile-time options to select various description logic optimisations. DLP implements the description logic in Figure 1. In the syntax chart, A is an atomic concept; C and D are arbitrary concepts; P is an atomic role; R and S are arbitrary roles; and n is an integer. There is an obvious correspondence between most of the constructs in this description logic and propositional dynamic logic, which is given in the chart.
Implementation
DLP uses the now-standard method for subsumption testing in description logics, namely translating subsumption tests into satisfiability tests and checking for satisfiability using an optimised tableaux method. DLP was designed from the beginning to be an experimental system. As a result, much more attention has been paid to making the internal algorithms correct and efficient in the worst case than to reducing constant factors. Similarly, the internal data structures have been chosen for their flexibility rather than for having the absolute best modification and access speeds. Some care has been taken to make the internal
D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 297–301, 2000.
© Springer-Verlag Berlin Heidelberg 2000
Concepts (Formulae):
  DL syntax   PDL syntax   Semantics
  A           A            A^I ⊆ ∆^I
  ⊤           T            ∆^I
  ⊥           F            ∅
  ¬C          ∼C           ∆^I − C^I
  C ⊓ D       C ∧ D        C^I ∩ D^I
  C ⊔ D       C ∨ D        C^I ∪ D^I
  ∃R.C        ⟨R⟩C         {d ∈ ∆^I : R^I(d) ∩ C^I ≠ ∅}
  ∀R.C        [R]C         {d ∈ ∆^I : R^I(d) ⊆ C^I}
  ≥ n P                    {d ∈ ∆^I : |P^I(d)| ≥ n}
  ≤ n P                    {d ∈ ∆^I : |P^I(d)| ≤ n}
  P : n                    {d ∈ ∆^I : P^I(d) ∋ n}
Roles (Modalities/Actions):
  P           P            P^I ⊆ ∆^I × ∆^I
  R ⊔ S       R ∪ S        R^I ∪ S^I
  R ◦ S       R ; S        R^I ◦ S^I
  R / C       R ; C?       R^I ∩ (∆^I × C^I)
  R+          R ; R∗       ∪_{n≥1} (R^I)^n

Fig. 1. Simplified Syntax for DLP
data structures reasonably fast; there is considerable use of binary maps and hash tables instead of lists to store sets, for example. DLP is implemented in SML/NJ instead of a language like C so that it can be more easily changed. There is some price to be paid for this, as SML/NJ does not allow some of the low-level optimisations possible in languages like C. Further, DLP is implemented in a mostly-functional fashion. The only non-functional portions of the satisfiability checker in DLP have to do with the unique storage of formulae, and with caching of several kinds of information. All this caching is monotone, i.e., it does not have to be undone during a proof, or even between proofs. Nonetheless, DLP is quite fast on several problem sets, including the Tableaux'98 propositional modal logic comparison benchmark [9] and several collections of hard random formulae in K [10,8,11].
Optimisation Techniques
Many of the optimisation techniques in DLP have already appeared in various description logic systems. The most complete description of these optimisations can be found in Ian Horrocks' thesis [7]. The basic algorithm in DLP is a simple tableau algorithm that searches for a model that demonstrates the satisfiability of a description logic description or, equivalently, of a propositional modal logic formula. The algorithm processes modal constructs by building successor nodes with attached formulae that represent related possible worlds. The algorithm incorporates the usual control mechanisms to guarantee termination, including a check for equality of the formulae at nodes to guarantee termination for transitive roles (modalities).
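The equality-based termination check just mentioned can be sketched roughly as follows (our illustrative reconstruction, not DLP's SML code): a new successor is expanded only if its set of attached formulae differs from every node already on the current path.

```python
# Rough sketch (ours, not DLP's implementation) of equality blocking for a
# transitive modality: a successor is blocked when a node with the same
# formula set was already encountered, which guarantees termination since
# only finitely many sets of subformulae exist.
def expand_successor(formulas, ancestors):
    """Return True if the successor must be expanded, False if blocked."""
    node = frozenset(formulas)
    if node in ancestors:
        return False  # blocked: an equal node already exists on this path
    return True

ancestors = {frozenset({'[R]p', 'p'})}
print(expand_successor({'p', '[R]p'}, ancestors))   # blocked
print(expand_successor({'q', '[R]p'}, ancestors))   # expanded
```

Representing node labels as frozensets makes the equality test a constant-time hash lookup, which matches the paper's emphasis on hash tables over lists for storing sets.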
System Description: DLP
299
Before the model search algorithm in DLP starts, incoming formulae are converted into a normal form, and common sub-formulae are uniquely stored. This conversion detects analytically satisfiable sub-formulae. This unique storage of formulae also allows values to be efficiently given to any sub-formula in the formula, not just propositional variables. This can result in search failures being detected much earlier than would otherwise be the case. DLP performs semantic branching search. When DLP decides to branch, it picks a formula and assigns that formula to true and false in turn, instead of picking a disjunction and assigning each of its disjuncts to true in turn. Semantic branching is guaranteed to explore each section of the search space at most once, as opposed to syntactic branching; this is important in propositional modal logics, as the generation and analysis of successors can result in large overlap in the search space when using syntactic branching. DLP looks for formulae whose value is determined by the current set of assignments, and immediately gives these formulae the appropriate value. This technique can result in dramatic reductions in the search space, particularly in the presence of semantic branching. For every sub-formula, DLP keeps track of which choice points lead to the deduction of that sub-formula. When backtracking to a choice point, DLP checks to see if the current search failure depends on that choice; if it does not, the alternative branch need not be considered, as it would just lead to the same failure. This technique, often called backjumping [2], can dramatically reduce the search space, but does have some overhead. During a satisfiability check, successor nodes with the same set of formulae as a previously-encountered node are often generated. As all that matters is whether such a node is satisfiable or not, DLP caches the status of each node and reuses it.
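The interplay of semantic branching and backjumping can be sketched on plain propositional clauses. This is an illustrative Python sketch, not DLP's code; the representation of dependencies as sets of decision levels is an assumption of the example.

```python
# Hedged sketch (not DLP's implementation): DPLL-style search with semantic
# branching (branch on x = true, then x = false) and backjumping. Every
# implied variable records the set of decision levels it depends on; when a
# conflict does not depend on the current choice, the other branch is skipped.

def backjump_solve(clauses):
    """clauses: list of lists of nonzero ints (-v means negated v).
    Returns a satisfying dict {var: bool} or None."""

    def lit_val(lit, asg):
        v = asg.get(abs(lit))
        return None if v is None else (v if lit > 0 else not v)

    def propagate(asg, deps):
        """Unit propagation; returns a conflict's dependency set or None."""
        changed = True
        while changed:
            changed = False
            for clause in clauses:
                if any(lit_val(l, asg) is True for l in clause):
                    continue
                unassigned, clause_deps = [], set()
                for lit in clause:
                    if lit_val(lit, asg) is None:
                        unassigned.append(lit)
                    else:
                        clause_deps |= deps[abs(lit)]
                if not unassigned:
                    return clause_deps               # conflict found
                if len(unassigned) == 1:
                    lit = unassigned[0]
                    asg[abs(lit)] = lit > 0
                    deps[abs(lit)] = clause_deps
                    changed = True
        return None

    def search(asg, deps, level):
        conflict = propagate(asg, deps)
        if conflict is not None:
            return None, conflict
        free = [abs(l) for c in clauses for l in c if abs(l) not in asg]
        if not free:
            return dict(asg), None
        var, confs = free[0], set()
        for val in (True, False):                    # semantic branching
            a2, d2 = dict(asg), dict(deps)
            a2[var], d2[var] = val, {level}
            model, conflict = search(a2, d2, level + 1)
            if model is not None:
                return model, None
            if level not in conflict:                # backjump past this choice
                return None, conflict
            confs |= conflict
        return None, confs - {level}

    model, _ = search({}, {}, 0)
    return model
```

The backjump test (`level not in conflict`) is exactly the check described in the text: if the failure does not depend on the choice, the alternative branch would fail for the same reason.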
Care has to be taken to ensure that caching does not interfere with the rest of the algorithm, particularly the determination of dependencies and loop analysis. Caching does require that information about each node generated be retained for a longer period of time than in a basic depth-first implementation of the satisfiability checker. However, caching can produce dramatic gains in speed. There are many heuristic techniques that can be used to determine which sub-formula to branch on first. However, these techniques require considerable information to be computed for each sub-formula of the unexpanded disjunctions. Further, the heuristic techniques available have mostly been devised for non-modal logics and are not necessarily suitable for modal logics. Nonetheless, DLP includes some simple heuristics to guide its search, mostly heuristics for more-effective backjumping.

New Techniques

Version 4.1 of DLP includes quite a number of new techniques to improve its performance. In previous versions of DLP, the cache did not include dependency information, which meant that a conservative approximation to this information had
to be made, possibly resulting in less-than-optimal backjumping. The formula cache has now been expanded to incorporate the dependency information needed in backjumping, so that caching does not interfere with backjumping. Of course, this does increase the size of cache entries. The low-level computations in DLP used to be quite expensive for very large formulae. If the formula was also difficult to solve, this cost would be masked by the search time, but if the formula was easy to solve, the low-level computation cost would dominate the solution time. Version 4.1 of DLP dramatically reduces the time taken for low-level computations both by reducing the amount of heuristic information generated when there are many clauses active and also by caching some of this information so that it does not have to be repeatedly computed. Of course, DLP is still much slower on large-but-easy formulae than provers that use imperative techniques, but such provers are much harder to build and debug than DLP. DLP used to completely generate assignments for the current node before investigating any modal successors. The current version of DLP has an option to investigate modal successors whenever a choice point is encountered, a technique taken from KSatC [5]. This option can be beneficial but often increases solution times. DLP can now retain not only the status of nodes, but the model found if the node is satisfiable. This model can be used to restart the search when reinvestigating modal successors, reducing the time overhead for early investigation of modal successors—at the cost of considerably increasing the space required for the cache. DLP can now also return a model for satisfiable formulae. DLP now incorporates a variant of dynamic backtracking [4]. When jumping over a choice point, a determination is made as to whether any invalidated branch(es) from that choice point depends on the choice being changed. 
If it does not, then the search ignores the invalidated branch(es) when the choice point is again encountered.

Summary

DLP has not been used in any actual applications, and as an experimental system, it is unlikely to receive any such use. DLP has been used to classify a version of the Galen medical knowledge base [12]. DLP performed capably on this knowledge base, creating the subsumption partial order in 210 seconds on a Sparc Ultra 1-class machine. DLP has also been tested on several sets of benchmarks, including the Tableaux'98 comparison benchmarks [6] and several collections of hard random modal formulae [10,8,11]. DLP is the fastest modal decision procedure for many of these tests. As it is an experimental system, I did not expect DLP to be particularly fast on hard problems. It was gratifying to me that it is competitive with existing propositional modal reasoners including FaCT and KSatC. My current plan for DLP is to incorporate inverse roles (converse modalities), a change that requires considerable modification to the implementation of the system.
References 1. F. Baader, E. Franconi, B. Hollunder, B. Nebel, and H.-J. Profitlich. An empirical analysis of optimization techniques for terminological representation systems or: Making KRIS get a move on. In Bernhard Nebel, Charles Rich, and William Swartout, editors, Principles of Knowledge Representation and Reasoning: Proceedings of the Third International Conference (KR’92), pages 270–281. Morgan Kaufmann Publishers, San Francisco, California, October 1992. Also available as DFKI RR-93-03. 2. A. B. Baker. Intelligent Backtracking on Constraint Satisfaction Problems: Experimental and Theoretical Results. PhD thesis, University of Oregon, 1995. 3. P. Bresciani, E. Franconi, and S. Tessaris. Implementing and testing expressive description logics: a preliminary report. In Gerard Ellis, Robert A. Levinson, Andrew Fall, and Veronica Dahl, editors, Knowledge Retrieval, Use and Storage for Efficiency: Proceedings of the First International KRUSE Symposium, pages 28–39, 1995. 4. M. L. Ginsberg. Dynamic backtracking. Journal of Artificial Intelligence Research, 1:25–46, 1993. 5. E. Giunchiglia, F. Giunchiglia, R. Sebastiani, and A. Tacchella. More evaluation of decision procedures for modal logics. In Anthony G. Cohn, Lenhart Schubert, and Stuart C. Shapiro, editors, Principles of Knowledge Representation and Reasoning: Proceedings of the Sixth International Conference (KR’98), pages 626–635. Morgan Kaufmann Publishers, San Francisco, California, June 1998. 6. A. Heuerding and S. Schwendimann. A benchmark method for the propositional modal logics K, KT, and S4. Technical report IAM-96-015, University of Bern, Switzerland, October 1996. 7. I. Horrocks. Optimising Tableaux Decision Procedures for Description Logics. PhD thesis, University of Manchester, 1997. 8. I. Horrocks, P. Patel-Schneider, and R. Sebastiani. An Analysis of Empirical Testing for Modal Decision Procedures. Logic Journal of the IGPL, 2000. 9. I. Horrocks and P. Patel-Schneider. FaCT and DLP. 
In Harrie de Swart, editor, Automated Reasoning with Analytic Tableaux and Related Methods: International Conference Tableaux’98, number 1397 in Lecture Notes in Artificial Intelligence, pages 27–30, Berlin, May 1998. Springer-Verlag. 10. I. Horrocks and P. Patel-Schneider. Performance of DLP on random modal formulae. In Proceedings of the 1999 Description Logic Workshop, pages 120–124, July 1999. 11. P. Patel-Schneider and I. Horrocks. DLP and FaCT. In Neil V. Murray, editor, Automated Reasoning with Analytic Tableaux and Related Methods: International Conference Tableaux’99, number 1617 in Lecture Notes in Artificial Intelligence, pages 19–23, Berlin, June 1999. Springer-Verlag. 12. A. Rector, S. Bechhofer, C. A. Goble, I. Horrocks, W. A. Nowlan, and W. D. Solomon. The Grail concept modelling language for medical terminology. Artificial Intelligence in Medicine, 9:139–171, 1997.
Two Techniques to Improve Finite Model Search

Gilles Audemard, Belaid Benhamou, and Laurent Henocque

Laboratoire d'Informatique de Marseille, Centre de Mathématiques et d'Informatique, 39, rue Joliot-Curie, 13453 Marseille Cedex 13, France
Tel: 04 91 11 36 25 - Fax: 04 91 11 36 02
{audemard,benhamou,henocque}@lim.univ-mrs.fr
Abstract. This article introduces two techniques to improve the propagation efficiency of CSP-based finite model generation methods. One approach consists in statically rewriting some selected clauses so as to trigger added constraint propagations. The other approach uses a dynamic lookahead strategy to both filter out inconsistent domain values and select the most appropriate branching variable according to a first-fail heuristic.
1 Introduction
Many methods have been implemented to deal with many-sorted or uni-sorted theories: FINDER [7], FMSET [3], SATO [8], SEM [11], and FMC [5] are known systems which have solved some open problems. The method SEM (System for Enumerating Models), introduced by J. Zhang and H. Zhang in [11], is one of the most powerful known methods for solving problems expressed as many-sorted theories. The goal of this article is to explore ways to improve SEM by increasing the propagations it performs (i.e. the number of inferred negative ground literals) so as to reduce the search space and overall computation time. A first possible improvement is a static preprocessing which automatically rewrites clauses having a specific structure. A second improvement consists in a dynamic domain filtering achieved by using a lookahead at some nodes of the search tree. This lookahead procedure uses unit propagation and detects incompatible assignments (e.g. trying f(0) = 0, then f(0) = 1 ...). This filtering is augmented by the introduction of a new heuristic, in the spirit of the SATZ propositional solver (see [4]). This article is organized as follows: Section 2 defines the first-order logic theories accepted as input language and the background of the SEM algorithm. In Section 3 we study two techniques which improve SEM's efficiency. In Section 4, we compare our work with other methods on mathematical problems. Section 5 concludes.
D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 302–308, 2000. © Springer-Verlag Berlin Heidelberg 2000
2 Background and SEM Description
The theories accepted as input by the model generator SEM are many-sorted first-order theories, with equality, without existential quantifiers, in clause normal form (CNF). Since we are interested in finite models only, all sorts are finite. Because all the variables are universally quantified, the quantifiers are usually omitted. We call the degree of a literal the number of its functional symbol occurrences. We call a cell a ground term f(e1, ..., ek) where all ei are sort elements. An interpretation of a theory maps each cell to a value from the appropriate sort. A model of a theory is an interpretation which satisfies all its clauses. As an initial preprocessing stage, SEM expands the original theory axioms to the set of their terminal instances (i.e. ground clauses), by substituting for each logical variable all the members of the appropriate sort. SEM's finite model search is described in Algorithm 1. It uses the following parameters: A the set of assignments, B the set of unassigned cells and their possible values, and C the set of ground clauses. The function Propa of the search algorithm propagates the assignments from A to C. This simplifies C and may force some cells in B to become assigned. It modifies (A, B, C) until a fixed point is reached or an inconsistency is detected, and returns the modified triple (A, B, C) upon success. For a full description of SEM and the propagation algorithm one can refer to [9] and [11].

Function Search(A, B, C): Return Boolean
  If B = ∅ Then Return TRUE
  Choose and delete (cei, Di) from B
  If Di = ∅ Then Return FALSE
  For All e ∈ Di Do
    (A′, B′, C′) = Propa(A ∪ (cei, e), B, C)
    If C′ ≠ False Then Search(A′, B′, C′)

Algorithm 1: SEM Search Algorithm
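The search scheme of Algorithm 1 can be sketched on a concrete finite-model problem. The following is an illustrative Python sketch, not SEM itself: it searches for an idempotent quasigroup of order n, with a simple forward-checking propagation standing in for Propa.

```python
# Hedged sketch of a SEM-style search: cells f(i, j) take values in
# {0..n-1}; the theory says f is an idempotent quasigroup (a Latin square
# with f(i, i) = i). Propagation prunes row/column domains. Illustrative
# only; SEM's Propa is far more general.

def find_idempotent_quasigroup(n):
    cells = [(i, j) for i in range(n) for j in range(n)]
    # idempotency fixes the diagonal cells up front
    domains = {c: ({c[0]} if c[0] == c[1] else set(range(n))) for c in cells}

    def propagate(asg, doms, cell, val):
        # Latin square condition: val may not reappear in this row/column
        i, j = cell
        for c in cells:
            if c in asg or c == cell:
                continue
            if (c[0] == i or c[1] == j) and val in doms[c]:
                doms[c] = doms[c] - {val}
                if not doms[c]:
                    return False            # inconsistency detected
        return True

    def search(asg, doms):
        if len(asg) == len(cells):
            return asg                       # every cell assigned: a model
        # choose the unassigned cell with the smallest domain
        cell = min((c for c in cells if c not in asg),
                   key=lambda c: len(doms[c]))
        for val in sorted(doms[cell]):
            a2, d2 = dict(asg), dict(doms)
            a2[cell] = val
            if propagate(a2, d2, cell, val):
                model = search(a2, d2)
                if model is not None:
                    return model
        return None

    return search({}, dict(domains))
```

For order 3 this finds a model (e.g. the operation x.y = 2x + 2y mod 3 is one), while for order 2 no idempotent quasigroup exists and the search returns None.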
3 Two Domain Filtering Techniques
SEM's propagation algorithm allows propagation of negative assignments only when literals of the form ce ≠ e exist in the set of clauses (ce is a cell and e an element). Otherwise, only positive facts (value assignments) are propagated. This leads to an increase in the number of decision points necessary for the search, and potentially increases run times. Because SEM performs some amount of value elimination using negative facts, one approach consists in favoring it by rewriting some clauses to give them the appropriate structure. This static technique is performed in a preprocessing phase.
The second approach is dynamic, involving computations performed at each decision node. It consists in using a lookahead technique, called unit propagation, to eliminate selected values from the domains of selected cells.

3.1 Clause Transformation

SEM performs value elimination when clauses contain negative literals of degree two. We can thus rewrite some selected clauses, using a flattening technique as in FMSET [3], into logically equivalent clauses containing negative literals of degree two. Because such a transformation introduces auxiliary variables, and thus increases the number of ground clause instances, the rewriting is a tradeoff. Candidate clauses are therefore carefully selected: we restrict the rewriting process to clauses of degree 3. On some problems this transformation drastically reduces the number of decision points and the execution times, as shown in Section 4; the results obtained with this rewriting technique are listed under the name CTSEM (Clause Transformation in SEM).

Definition 1. A reducible clause is a clause which contains a literal with the following pattern: f(x1, ..., xm, g(xk+1, ..., xl), xm+1, ..., xk) = x0, where f and g are two functional symbols and the xi∈{1..l} are variables. Such a literal is called reducible.

By using the clause transformation algorithm described in [3], we can rewrite each reducible literal to the form f(x1, ..., xm, v, xm+1, ..., xk) = x0 ∨ v ≠ g(xk+1, ..., xl). This preserves the semantics of the clause and introduces the negative literal v ≠ g(xk+1, ..., xl), at the price of an auxiliary variable v.

Example 1. The literal h(h(x, y), x) = y is reducible and can be transformed to its logical equivalent h(v, x) = y ∨ v ≠ h(x, y). Now the ground clause h(0, 1) ≠ 0 ∨ h(0, 0) = 1 exists in the set of ground clauses. When we assign h(0, 0) the value 0, the second literal of this clause becomes false, so SEM can propagate the fact h(0, 1) ≠ 0. This eliminates 0 from the domain of the cell h(0, 1).
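The rewriting of a reducible literal can be sketched as follows. This is illustrative Python; the tuple-based term encoding and the fresh-variable name v are assumptions of the example.

```python
# Hedged sketch of the flattening step: a term is a variable (a string) or
# a tuple (symbol, arg1, ...). A reducible literal f(..., g(...), ...) = x0
# becomes f(..., v, ...) = x0  ∨  v ≠ g(...), with v a fresh variable
# (the name 'v' is illustrative).

def flatten_reducible(lhs, rhs, fresh='v'):
    """lhs is a functional term, rhs a variable. Returns the clause as a
    list of literals (positive, left, right); positive=False means ≠."""
    fun, *args = lhs
    for i, a in enumerate(args):
        if isinstance(a, tuple):                 # nested symbol: reducible
            flat = args[:i] + [fresh] + args[i + 1:]
            return [(True, (fun, *flat), rhs),   # f(..., v, ...) = x0
                    (False, fresh, a)]           # v ≠ g(...)
    return [(True, lhs, rhs)]                    # already flat: unchanged
```

On the literal of Example 1, h(h(x, y), x) = y, this yields the clause h(v, x) = y ∨ v ≠ h(x, y).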
3.2 Value Elimination

At a given node of the search tree, let B be the set of yet unassigned cells and CA the set of axioms simplified by the assignments of cells in A; in other words, CA = C′ such that (A′, B′, C′) = Propa(A, B, C). Let (ce, Dce) ∈ B and let e ∈ Dce. If Propa(A ∪ {(ce, e)}, B, C) leads to an inconsistency, then we can remove the value e from Dce. However, such unit propagations are time consuming. We must restrict the number of calls to Propa in order to obtain an efficient method. We use here a property similar to the one introduced for SAT problems in [2]. After the call to Propa(A ∪ {(ce, e)}, B, C), there are two possibilities:
- CA∪{(ce,e)} = False, and then Dce = Dce − {e}: value elimination.
- CA∪{(ce,e)} ≠ False: the value assignments (ci, ei) propagated during the process are compatible with the current assignment and would not lead to value elimination if tried later.

This drastically reduces the number of possible candidates for propagation, and minimizes the number of calls to Propa. Formally, we have the following propositions:

Proposition 1. Let (ce, e) ∈ B. If CA∪{(ce,e)} ⊨ ⊥ then CA is equivalent to CA ∧ (ce ≠ e).

This property is used to eliminate values from the cell domains.

Proposition 2. Let (ce, e) ∈ B. If CA∪{(ce,e)} ⊭ ⊥ and if CA∪{(ce,e)} ⊨ (ce1, e1), ..., (cen, en), then ∀i ∈ {1...n} such that cei ∈ B, CA∪{(cei,ei)} ⊭ ⊥.

This property avoids propagating useless facts and allows fewer calls to the Propa procedure. An additional possibility to reduce the number of unit propagations is to select which cells must be tried. We denote by T ⊆ B the set of cells which are candidates for unit propagation. Because of symmetries (LNH), only the cells with indices less than or equal to mdn need to be considered; we call those cells mdn cells. The results obtained using this dynamic filtering technique are listed in Section 4 under the name VESEM.
This property avoids to propagate useless facts. This allows to perform fewer calls to the Propa procedure. An additional possibility to reduce the number of unit propagations is to select which cells must be tried. We note T ⊆ B the set of cells which are candidates for unit propagation. Because of symmetries (LNH), only the cells with indices less or equal than mdn need to be considered. We call those cells mdn cells. The results obtained using this dynamic ltering technique are listed in section 4 under the name VESEM. 3.3 A First Fail Heuristic In its original version, SEM chooses as the next cell the one with the smallest domain, and tries to not increment mdn. We note H these previous conditions. Then the heuristic chooses as the next variable to instantiate, the one that both satises conditions H and that maximizes the count of the number of propagations done on each cell for all their possible values. This approach is similar to the one described in [4] for propositional logic. The algorithm of this heuristic and value elimination process is shown in the algorithm 2. In algorithm 2, Mark[ce,e]=True means that the value e of the cell ce can be suppressed. The number N b equals the number of propagations.
Remark 1.
4 Experiments
We compare SEM, VESEM (SEM + Value Elimination), CTSEM (SEM + Clause Transformation preprocessing), and CTVESEM (SEM + Clause Transformation + Value Elimination) on a set of well-known problems. Run times are in seconds. All experiments were carried out under Linux on a K6-II 400 PC with 128 MB of RAM. A '+' indicates that a program fails to solve a problem in less than two hours.
Function Up_Heuristic(A, B, C): Return next cell to choose
  For All (ce, D) ∈ T Do
    For All e ∈ D Do Mark[ce, e] = True
  For All (ce, D) ∈ T Do
    Nb = 0
    For All e ∈ D such that Mark[ce, e] = True Do
      (A′, B′, C′) = Propa(A ∪ (ce, e), B, C)
      For All (ce′, e′) propagated Do Mark[ce′, e′] = False
      Nb = Nb + 1
      If C′ = False Then
        D = D − {e}
        If |D| = 1 Then Return ce
      Else w(ce) = w(ce) + Nb
  Return the cell ce with the smallest domain maximising w
Algorithm 2: Value Elimination and Heuristic

4.1 Quasigroup Problems

A quasigroup is a binary operator '.' such that the equations a.x = b and x.a = b have a unique solution for all a, b. We deal here with idempotent quasigroups, satisfying the additional axiom x.x = x. Adding different extra axioms leads to several problem instances, fully described in [6]. None of these axioms are reducible. The results obtained with quasigroups are listed in Table 1. The results show that VESEM always explores fewer nodes than SEM. The amount of memory required to solve these problems is the same with both algorithms. Because of the cost of computing the heuristic, computation times are not significantly improved in general, except on one example (QG6). Only two examples (QG7 and QG1) exhibit results slightly worse with VESEM than with SEM. Although the quasigroup problems do not clearly prove a superiority of VESEM, they show that the value elimination and lookahead strategy generally results in a favorable tradeoff and should be used.

4.2 Group and Ring Problems

We compare VESEM, CTSEM, and CTVESEM to SEM on a list of group and ring problems described by J. Zhang in [10]. The results are listed in Table 2. Our algorithms explore fewer nodes than SEM. The lookahead strategy implemented in VESEM generally leads to improved computation times. The execution time ratio is sometimes very important: about 60 for NG and GRP. CTSEM and VESEM not only solve problems faster, but solve problems of much higher orders (NG, GRP, RU). To the best of our knowledge, it is the first time that a program ever computes a finite model for NG34 and RU24 or proves the inconsistency of GRP38. The program CTVESEM, combining both techniques (Clause Transformation and Value Elimination), visits fewer search tree nodes. But almost all the values suppressed (leading to skipped nodes) are due to the clause rewriting technique.
Table 1. Quasigroup Problems - Comparison.

Problem  Nb  Models |  SEM: Time      Nodes | VESEM: Time      Nodes
QG1       7      4  |        22        411  |          24        194
QG2       6      0  |       1.4         17  |         1.4          9
QG2       7      3  |        63        871  |          59        401
QG3       9      0  |       7.4     48 278  |         6.3     40 015
QG3      10      0  |      1416  7 948 372  |        1335  3 558 564
QG4       9     74  |       6.3     38 407  |         5.3     17 116
QG4      10      0  |      1263  6 946 603  |        1099  2 941 094
QG5      14      0  |        83    320 728  |          53    106 703
QG5      15      0  |      2031  7 518 920  |        1306  2 251 311
QG6      11      0  |        40    840 542  |         2.3     13 690
QG6      12      0  |      2519 50 290 872  |         142    929 781
QG7      13      2  |      14.5     69 053  |          16     37 132
QG7      14      0  |       443  2 015 778  |         528  1 107 404
Thus, adding value elimination to clause transformation is redundant and results in increased computation times. All results obtained and a fully detailed description of the different algorithms described in this paper are available in [1].

Table 2. Ring and Group Problems - Comparison

Problem  Models |  SEM: Time     Nodes | VESEM: Time    Nodes | CTSEM: Time    Nodes | CTVESEM: Time   Nodes
AG 28       162 |      328    642 103  |      321     76 663  |      336     57 941  |       394     41 859
AG 32     2 295 |      940  2 037 525  |      956    624 304  |      968    101 356  |     1 272     76 393
NG 28        51 |    6 934  8 359 103  |      806    108 120  |      432    100 036  |     1 105     88 832
NG 29         0 |        +             |      752     94 417  |      489    108 922  |     1 191     82 519
NG 34         3 |        +             |    5 450    504 182  |    3 469    478 337  |         +
GRP 31        0 |    3 831  2 751 805  |      272     21 821  |       97     14 711  |       378     24 691
GRP 32    2 712 |        +             |    1 620    740 797  |      529     35 546  |     2 204     93 420
GRP 38        0 |        +             |    6 690    584 374  |    3 480    442 039  |         +
RU 19         1 |    4 591  2 720 769  |    1 729     94 326  |      848    197 953  |     1 666     15 741
RU 20        21 |        +             |    2 904    370 652  |    3 678    609 320  |     2 957    336 612
RU 24       445 |        +             |    5 029    434 006  |        +             |     5 019    366 597
RNA 14        0 |      592    646 421  |        +             |      354    131 355  |       426     45 162
RNA 15        0 |      592    646 421  |    1 021    150 538  |      513    144 613  |       682     56 623
RNA 16        ? |        +             |        +             |        +             |         +
RNB 17        0 |       15     13 148  |       20      6 389  |       15      2 287  |        21      1 309
RNB 18        0 |       16     13 238  |       36      2 171  |       16      2 377  |        34        852
5 Conclusion
We introduce two techniques that can be used to improve CSP approaches to finite model generation for first-order theories. Their efficiency stems from the introduction of negative facts in the clause transformation technique (CTSEM), and from the elimination of domain values at some nodes of the search tree in the dynamic filtering case (VESEM). The behaviour of the algorithms on the AG and RNA problems suggests searching for improvements in the heuristic strategy associated with the lookahead procedure in VESEM, and also eliminating more isomorphic subspaces than is currently done with the LNH heuristic used in those programs. VESEM seems to provide the basis for a general algorithm for finite model search in first-order theories.
References

[1] G. Audemard, B. Benhamou, and L. Henocque. Two techniques to improve finite model search. Technical report, Laboratoire d'Informatique de Marseille, 1999. Accessible electronically at http://www.cmi.univ-mrs.fr/~audemard/publi.html.
[2] G. Audemard, B. Benhamou, and P. Siegel. La méthode d'avalanche aval: une méthode énumérative pour SAT. In JNPC, pages 17–25, 1999.
[3] B. Benhamou and L. Henocque. A hybrid method for finite model search in equational theories. Fundamenta Informaticae, 39(1-2):21–38, June 1999.
[4] Chu Min Li and Anbulagan. Heuristics based on unit propagation for satisfiability problems. In Proceedings of the 15th International Joint Conference on Artificial Intelligence (IJCAI-97), pages 366–371, August 23–29, 1997.
[5] Nicolas Peltier. A new method for automated finite model building exploiting failures and symmetries. Journal of Logic and Computation, 8(4):511–543, 1998.
[6] J. Slaney, M. Fujita, and M. Stickel. Automated reasoning and exhaustive search: Quasigroup existence problems. Computers and Mathematics with Applications, 29(2):115–132, 1993.
[7] J. Slaney. FINDER: Finite domain enumerator. Version 3 notes and guides. Technical report, Australian National University, 1993.
[8] H. Zhang and Mark E. Stickel. Implementing the Davis-Putnam algorithm by tries. Technical report, Department of Computer Science, University of Iowa, 1994.
[9] J. Zhang and H. Zhang. Constraint propagation in model generation. In Proceedings of CP'95, Marseille, 1995.
[10] Jian Zhang. Constructing finite algebras with FALCON. Journal of Automated Reasoning, 17(1):1–22, August 1996.
[11] Jian Zhang and Hantao Zhang. SEM: a system for enumerating models. In Chris S. Mellish, editor, Proceedings of the Fourteenth IJCAI, pages 298–303, 1995.
Eliminating Dummy Elimination

Jürgen Giesl (1) and Aart Middeldorp (2)

(1) Computer Science Department, University of New Mexico, Albuquerque, NM 87131, USA. [email protected]
(2) Institute of Information Sciences and Electronics, University of Tsukuba, Tsukuba 305-8573, Japan. [email protected]
Abstract. This paper is concerned with methods that automatically prove termination of term rewrite systems. The aim of dummy elimination, a method to prove termination introduced by Ferreira and Zantema, is to transform a given rewrite system into a rewrite system whose termination is easier to prove. We show that dummy elimination is subsumed by the more recent dependency pair method of Arts and Giesl. More precisely, if dummy elimination succeeds in transforming a rewrite system into a so-called simply terminating rewrite system then termination of the given rewrite system can be directly proved by the dependency pair technique. Even stronger, using dummy elimination as a preprocessing step to the dependency pair technique does not have any advantages either. We show that to a large extent these results also hold for the argument filtering transformation of Kusakari et al.
1 Introduction
Traditional methods to prove termination of term rewrite systems are based on simplification orders, like polynomial interpretations [6,12,17], the recursive path order [7,14], and the Knuth-Bendix order [9,15]. However, the restriction to simplification orders represents a significant limitation on the class of rewrite systems that can be proved terminating. Indeed, there are numerous important and interesting rewrite systems which are not simply terminating, i.e., their termination cannot be proved by simplification orders. Transformation methods (e.g. [5,10,11,16,18,20,21,22]) aim to prove termination by transforming a given term rewrite system into a term rewrite system whose termination is easier to prove. The success of such methods has been measured by how well they transform non-simply terminating rewrite systems into simply terminating rewrite systems, since simply terminating systems were the only ones where termination could be established automatically. In recent years, the dependency pair technique of Arts and Giesl [1,2] emerged as the most powerful automatic method for proving termination of rewrite systems. For any given rewrite system, this technique generates a set of constraints which may then be solved by standard simplification orders. In this way, the power of traditional termination proving methods has been increased significantly, i.e., the class of systems where termination is provable mechanically by

D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 309–323, 2000. © Springer-Verlag Berlin Heidelberg 2000
the dependency pair technique is much larger than the class of simply terminating systems. In light of this development, it is no longer sufficient to base the claim that a particular transformation method is successful on the fact that it may transform non-simply terminating rewrite systems into simply terminating ones. In this paper we compare two transformation methods, dummy elimination [11] and the argument filtering transformation [16], with the dependency pair technique. With respect to dummy elimination we obtain the following results:

1. If dummy elimination transforms a given rewrite system R into a simply terminating rewrite system R′, then the termination of R can also be proved by the most basic version of the dependency pair technique.
2. If dummy elimination transforms a given rewrite system R into a DP simply terminating rewrite system R′, i.e., the termination of R′ can be proved by a simplification order in combination with the dependency pair technique, then R is also DP simply terminating.

These results are constructive in the sense that the constructions in the proofs are solely based on the termination proof of R′. This shows that proving termination of R directly by dependency pairs is never more difficult than proving termination of R′. The second result states that dummy elimination is useless as a preprocessing step to the dependency pair technique. Not surprisingly, the reverse statements do not hold. In other words, as far as automatic termination proofs are concerned, dummy elimination is no longer needed. The recent argument filtering transformation of Kusakari, Nakamura, and Toyama [16] can be viewed as an improvement of dummy elimination by incorporating ideas of the dependency pair technique. We show that the first result above also holds for the argument filtering transformation.
The second result does not extend in its full generality, but we show that under a suitable restriction on the argument filtering applied in the transformation of R to R′, DP simple termination of R′ also implies DP simple termination of R. The remainder of the paper is organized as follows. In the next section we briefly recall some definitions and results pertaining to termination of rewrite systems and in particular, the dependency pair technique. In Section 3 we relate the dependency pair technique to dummy elimination. Section 4 is devoted to the comparison of the dependency pair technique and the argument filtering transformation. We conclude in Section 5.
2 Preliminaries
An introduction to term rewrite systems (TRSs) can be found in [4], for example. We first introduce the dependency pair technique. Our presentation combines features of [2,13,16]. Apart from the presentation, all results stated below are due to Arts and Giesl. We refer to [2,3] for motivations and proofs. Let R be a (finite) TRS over a signature F. As usual, all root symbols of left-hand sides of rewrite rules are called defined, whereas all other function symbols are constructors. Let F♯ denote the union of F and {f♯ | f is a defined symbol of R} where f♯ has
the same arity as f. Given a term t = f(t1 , . . . , tn ) ∈ T (F , V) with f defined, we write t] for the term f ] (t1 , . . . , tn ). If l → r ∈ R and t is a subterm of r with defined root symbol then the rewrite rule l] → t] is called a dependency pair of R. The set of all dependency pairs of R is denoted by DP(R). In examples we often write F for f ] . For instance, consider the following well-known one-rule TRS R from [8]: f(f(x)) → f(e(f(x)))
(1)
Here f is defined, e is a constructor, and DP(R) consists of the two dependency pairs F(f(x)) → F(e(f(x)))
F(f(x)) → F(x)
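The construction of DP(R) is entirely mechanical. The following sketch is our own encoding, not from the paper: terms are nested tuples (symbol, (args...)), variables are strings, and the marked symbol f♯ is written f#.

```python
# Minimal sketch of computing the dependency pairs DP(R).
# Terms: variables are strings; applications are (symbol, (args...)).

def subterms(t):
    """Yield t and all of its subterms."""
    yield t
    if isinstance(t, tuple):
        for arg in t[1]:
            yield from subterms(arg)

def defined_symbols(R):
    """The root symbols of left-hand sides are the defined symbols."""
    return {l[0] for l, r in R}

def dependency_pairs(R):
    """All pairs l# -> t# where t is a subterm of a right-hand side
    with defined root symbol."""
    mark = lambda t: (t[0] + '#', t[1])
    D = defined_symbols(R)
    return {(mark(l), mark(t))
            for l, r in R
            for t in subterms(r)
            if isinstance(t, tuple) and t[0] in D}

# The one-rule TRS (1): f(f(x)) -> f(e(f(x)))
R = [(('f', (('f', ('x',)),)), ('f', (('e', (('f', ('x',)),)),)))]
print(dependency_pairs(R))
# the two dependency pairs F(f(x)) -> F(e(f(x))) and F(f(x)) -> F(x)
```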
An argument filtering [2] for a signature F is a mapping π that associates with every n-ary function symbol either an argument position i ∈ {1, ..., n} or a (possibly empty) list [i1, ..., im] of argument positions with 1 ≤ i1 < ··· < im ≤ n. The signature Fπ consists of all function symbols f such that π(f) is some list [i1, ..., im], where in Fπ the arity of f is m. Every argument filtering π induces a mapping from T(F, V) to T(Fπ, V), also denoted by π:

  π(t) = t                          if t is a variable,
  π(t) = π(ti)                      if t = f(t1, ..., tn) and π(f) = i,
  π(t) = f(π(ti1), ..., π(tim))     if t = f(t1, ..., tn) and π(f) = [i1, ..., im].

Thus, an argument filtering is used to replace function symbols by one of their arguments or to eliminate certain arguments of function symbols. For example, if π(f) = π(F) = [1] and π(e) = 1, then we have π(F(e(f(x)))) = F(f(x)). However, if we change π(e) to [ ], then we obtain π(F(e(f(x)))) = F(e). A preorder (or quasi-order) is a transitive and reflexive relation. A rewrite preorder is a preorder ≿ on terms that is closed under contexts and substitutions. A reduction pair [16] consists of a rewrite preorder ≿ and a compatible well-founded order > which is closed under substitutions. Here compatibility means that the inclusion ≿ · > ⊆ > or the inclusion > · ≿ ⊆ > holds. In practice, > is often chosen to be the strict part ≻ of ≿ (or the order where s > t iff sσ ≻ tσ for all ground substitutions σ). The following theorem presents the (basic) dependency pair approach of Arts and Giesl.

Theorem 1. A TRS R over a signature F is terminating if and only if there exists an argument filtering π for F♯ and a reduction pair (≿, >) such that π(R) ⊆ ≿ and π(DP(R)) ⊆ >.

Because rewrite rules are just pairs of terms, π(R) ⊆ ≿ is a shorthand for π(l) ≿ π(r) for every rewrite rule l → r ∈ R.
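The three cases of the induced mapping translate directly into a recursive function. A hypothetical sketch (our own tuple encoding, with variables as strings) reproducing the two filterings of F(e(f(x))) from the example:

```python
# Sketch of applying an argument filtering pi to a term.
# pi maps a symbol either to an int (collapse to that argument) or to
# a list of argument positions to keep (1-based, ascending).

def apply_pi(pi, t):
    if isinstance(t, str):                        # variable
        return t
    f, args = t
    p = pi.get(f, list(range(1, len(args) + 1)))  # default: keep all
    if isinstance(p, int):                        # pi(f) = i: collapse
        return apply_pi(pi, args[p - 1])
    return (f, tuple(apply_pi(pi, args[i - 1]) for i in p))

t = ('F', (('e', (('f', ('x',)),)),))             # F(e(f(x)))
print(apply_pi({'F': [1], 'f': [1], 'e': 1}, t))  # F(f(x))
print(apply_pi({'F': [1], 'f': [1], 'e': []}, t)) # F(e), e now a constant
```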
In our example, when using π(e) = [ ], the inequalities f(f(x)) ≿ f(e), F(f(x)) > F(e), and F(f(x)) > F(x) resulting from the dependency pair technique are satisfied by the recursive path order, for instance. Hence, termination of this TRS is proved.
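As an illustration of this last step, the sketch below implements a multiset path order, a simple instance of a recursive path order, and checks the three filtered constraints. The encoding and the precedence f, F > e are our own choices for the example, not fixed by the paper.

```python
# Sketch of a multiset path order (a simple recursive path order).
# Terms: variables are strings; applications are (symbol, (args...)).

def occurs(x, t):
    return t == x or (isinstance(t, tuple) and any(occurs(x, s) for s in t[1]))

def mul_gt(ss, ts, gt):
    """Multiset extension: cancel common elements, then every remaining
    t must be dominated by some remaining s."""
    ss, ts = list(ss), list(ts)
    for s in list(ss):
        if s in ts:
            ss.remove(s); ts.remove(s)
    return bool(ss) and all(any(gt(s, t) for s in ss) for t in ts)

def mpo_gt(s, t, prec):
    if s == t or isinstance(s, str):
        return False
    if isinstance(t, str):                        # s > x iff x occurs in s
        return occurs(t, s)
    f, ss = s
    g, ts = t
    if any(si == t or mpo_gt(si, t, prec) for si in ss):
        return True                               # an argument of s covers t
    if prec[f] > prec[g]:
        return all(mpo_gt(s, ti, prec) for ti in ts)
    if f == g:
        return mul_gt(ss, ts, lambda a, b: mpo_gt(a, b, prec))
    return False

prec = {'f': 1, 'F': 1, 'e': 0}
fx = ('f', ('x',))
e = ('e', ())                                     # e after filtering pi(e) = []
assert mpo_gt(('f', (fx,)), ('f', (e,)), prec)    # f(f(x)) > f(e)
assert mpo_gt(('F', (fx,)), ('F', (e,)), prec)    # F(f(x)) > F(e)
assert mpo_gt(('F', (fx,)), ('F', ('x',)), prec)  # F(f(x)) > F(x)
print('all three constraints oriented')
```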
Jürgen Giesl and Aart Middeldorp
Rather than considering all dependency pairs at the same time, as in the above theorem, it is advantageous to treat groups of dependency pairs separately. These groups correspond to clusters in the dependency graph of R. The nodes of the dependency graph are the dependency pairs of R and there is an arrow from node l1♯ → t1♯ to l2♯ → t2♯ if there exist substitutions σ1 and σ2 such that t1♯σ1 →*_R l2♯σ2. (By renaming variables in different occurrences of dependency pairs we may assume that σ1 = σ2.) The dependency graph of R is denoted by DG(R). We call a non-empty subset C of dependency pairs of DP(R) a cluster if for every two (not necessarily distinct) pairs l1♯ → t1♯ and l2♯ → t2♯ in C there exists a non-empty path in C from l1♯ → t1♯ to l2♯ → t2♯.

Theorem 2. A TRS R is terminating if and only if for every cluster C in DG(R) there exists an argument filtering π and a reduction pair (≿, >) such that π(R) ⊆ ≿, π(C) ⊆ ≿ ∪ >, and π(C) ∩ > ≠ ∅.

Note that π(C) ∩ > ≠ ∅ denotes the situation that π(l♯) > π(t♯) for at least one dependency pair l♯ → t♯ ∈ C. In the above example, the dependency graph only contains an arrow from F(f(x)) → F(x) to itself and thus {F(f(x)) → F(x)} is the only cluster. Hence, with the refinement of Theorem 2 the inequality F(f(x)) > F(e) is no longer necessary. See [3] for further examples which illustrate the advantages of regarding clusters separately. Note that while in general the dependency graph cannot be computed automatically (since it is undecidable whether t1♯σ →*_R l2♯σ holds for some σ), one can nevertheless approximate this graph automatically, cf. [1,2,3, "estimated dependency graph"]. In this way, the criterion of Theorem 2 can be mechanized. Most classical methods for automated termination proofs are restricted to simplification (pre)orders, i.e., to (pre)orders satisfying the subterm property f(... t ...) ≻ t or f(... t ...) ≿ t, respectively.
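In implementations one does not enumerate all clusters explicitly: every cluster lies inside a strongly connected component of the (estimated) dependency graph, so it suffices to compute the SCCs that contain at least one edge. The sketch below is our own; the graph's arrows are taken as given input rather than approximated.

```python
# Tarjan-style SCC computation over a dependency graph given as a set
# of arrows (u, v) between dependency pair indices.

def sccs_with_edge(nodes, edges):
    """Strongly connected components containing at least one internal
    edge (this also covers a single node with a self-loop)."""
    index, low, stack, on, out = {}, {}, [], set(), []
    def strongconnect(v, counter=[0]):
        index[v] = low[v] = counter[0]; counter[0] += 1
        stack.append(v); on.add(v)
        for (u, w) in edges:
            if u != v:
                continue
            if w not in index:
                strongconnect(w); low[v] = min(low[v], low[w])
            elif w in on:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:            # v is the root of an SCC
            comp = set()
            while True:
                w = stack.pop(); on.discard(w); comp.add(w)
                if w == v:
                    break
            out.append(comp)
    for v in nodes:
        if v not in index:
            strongconnect(v)
    return [c for c in out
            if any(u in c and w in c for (u, w) in edges)]

# One-rule TRS: DP 0 = F(f(x)) -> F(e(f(x))), DP 1 = F(f(x)) -> F(x);
# the only arrow is from DP 1 to itself.
print(sccs_with_edge({0, 1}, {(1, 1)}))   # only {1} survives
```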
Hence, these methods cannot prove termination of TRSs like (1), as the left-hand side of its rule is embedded in the right-hand side (so the TRS is not simply terminating). However, with the development of the dependency pair technique, the TRSs for which an automated termination proof is potentially possible are now those systems where the inequalities generated by the dependency pair technique are satisfied by simplification (pre)orders. A straightforward way to generate a simplification preorder from a simplification order ≻ is to define s ⪰ t if s ≻ t or s = t, where = denotes syntactic equality. Such relations are particularly relevant, since many existing techniques generate simplification orders rather than preorders. By restricting ourselves to this class of simplification preorders, we obtain the notion of DP simple termination.

Definition 1. A TRS R is called DP simply terminating if for every cluster C in DG(R) there exists an argument filtering π and a simplification order ≻ such that π(R ∪ C) ⊆ ⪰ and π(C) ∩ ≻ ≠ ∅.

Simple termination implies DP simple termination, but not vice versa. For example, the TRS (1) is DP simply terminating, but not simply terminating.
The above definition coincides with the one in [13] except that we use the real dependency graph instead of the estimated dependency graph of [1,2,3]. The reason for this is that we do not want to restrict ourselves to a particular computable approximation of the dependency graph, for the same reason that we do not insist on a particular simplification order to make the conditions effective.
3
Dummy Elimination
In [11], Ferreira and Zantema defined an automatic transformation technique which transforms a TRS R into a new TRS dummy(R) such that termination of dummy(R) implies termination of R. The advantage of this transformation is that non-simply terminating systems like (1) may be transformed into simply terminating ones. Thus, after the transformation, standard techniques may be used to prove termination. Below we define Ferreira and Zantema's dummy elimination transformation. While our formulation of dummy(R) is different from the one in [11], it is easily seen to be equivalent.

Definition 2. Let R be a TRS over a signature F. Let e be a distinguished function symbol in F of arity m ≥ 1 and let ◇ be a fresh constant. We write F◇ for (F \ {e}) ∪ {◇}. The mapping cap: T(F, V) → T(F◇, V) is inductively defined as follows:

  cap(t) = t                          if t ∈ V,
  cap(t) = ◇                          if t = e(t1, ..., tm),
  cap(t) = f(cap(t1), ..., cap(tn))   if t = f(t1, ..., tn) with f ≠ e.

The mapping dummy assigns to every term in T(F, V) a subset of T(F◇, V), as follows: dummy(t) = {cap(t)} ∪ {cap(s) | s is an argument of an e symbol in t}. Finally, we define dummy(R) = {cap(l) → r′ | l → r ∈ R and r′ ∈ dummy(r)}. The mappings cap and dummy are illustrated in Figure 1, where we assume that the numbered contexts do not contain any occurrences of e. Ferreira and Zantema [11] showed that dummy elimination is sound.

Theorem 3. Let R be a TRS. If dummy(R) is terminating then R is terminating.

For the one-rule TRS (1), dummy elimination yields the TRS consisting of the two rewrite rules f(f(x)) → f(◇)
f(f(x)) → f(x)
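Definition 2 is directly executable. A sketch in our own tuple encoding (variables as strings; the fresh constant ◇ is written '<>'), reproducing the two rules above from rule (1):

```python
# Sketch of dummy elimination: the mappings cap, dummy, and dummy(R).
# Terms: variables are strings; applications are (symbol, (args...)).

DIAMOND = ('<>', ())                      # the fresh constant

def cap(t, e='e'):
    if isinstance(t, str):
        return t
    f, args = t
    if f == e:
        return DIAMOND
    return (f, tuple(cap(a, e) for a in args))

def e_arguments(t, e='e'):
    """All arguments of occurrences of e in t."""
    if isinstance(t, str):
        return []
    f, args = t
    out = list(args) if f == e else []
    for a in args:
        out += e_arguments(a, e)
    return out

def dummy(t, e='e'):
    return {cap(t, e)} | {cap(s, e) for s in e_arguments(t, e)}

def dummy_R(R, e='e'):
    return {(cap(l, e), r1) for l, r in R for r1 in dummy(r, e)}

# Rule (1): f(f(x)) -> f(e(f(x)))
R = [(('f', (('f', ('x',)),)), ('f', (('e', (('f', ('x',)),)),)))]
print(dummy_R(R))
# the two rules f(f(x)) -> f(<>) and f(f(x)) -> f(x)
```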
[Figure omitted: a term t with nested occurrences of e and numbered e-free contexts, its cap(t), and the resulting set dummy(t).]

Fig. 1. The mappings cap and dummy.
In contrast to the original system, the new TRS is simply terminating and its termination is easily shown automatically by standard techniques like the recursive path order. Hence, dummy elimination can transform non-simply terminating TRSs into simply terminating ones. However, as indicated in the introduction, nowadays the right question to ask is whether it can transform non-DP simply terminating TRSs into DP simply terminating ones. Before answering this question we show that if dummy elimination succeeds in transforming a TRS into a simply terminating TRS then the original TRS is DP simply terminating. Even stronger, whenever termination of dummy(R) can be proved by a simplification order, then the same simplification order satisfies the constraints of the dependency pair approach. Thus, the termination proof using dependency pairs is not more difficult or more complex than the one with dummy elimination.

Theorem 4. Let R be a TRS. If dummy(R) is simply terminating then R is DP simply terminating.

Proof. Let F be the signature of R. We show that R is DP simply terminating even without considering the dependency graph refinement. So we define an argument filtering π for F♯ and a simplification order ≻ on T(F♯π, V) such that π(R) ⊆ ⪰ and π(DP(R)) ⊆ ≻. The argument filtering π is defined as follows: π(e) = [ ] and π(f) = [1, ..., n] for every n-ary symbol f ∈ (F \ {e})♯. Moreover, if e is a defined symbol, we define π(e♯) = [ ]. Let ⊐ be any simplification order that shows the simple termination of dummy(R). We define the simplification order ≻ on T(F♯π, V) as follows: s ≻ t if and only if s′ ⊐ t′, where (·)′ denotes the mapping from T(F♯π, V) to T(F◇, V) that first replaces every marked symbol F by f and afterwards replaces every occurrence of the constant e by ◇. Note that ≻ and ⊐ are essentially the same. It is very easy to show that π(t)′ = π(t♯)′ = cap(t) for every term t ∈ T(F, V). Let l → r ∈ R.
Because cap(l) → cap(r) is a rewrite rule in dummy(R), we get π(l)′ = cap(l) ⊐ cap(r) = π(r)′ and thus π(l) ≻ π(r). Hence π(R) ⊆ ≻ and thus certainly π(R) ⊆ ⪰. Now let l♯ → t♯ be a dependency pair of R, originating from the rewrite rule l → r ∈ R. From t ⊴ r (⊴ denotes the subterm relation) we easily infer the existence of a term u ∈ dummy(r) such that cap(t) ⊴ u. Since cap(l) → u is a rewrite rule in dummy(R), we have
π(l♯)′ = cap(l) ⊐ u. The subterm property of ⊐ yields u ⊒ cap(t) = π(t♯)′. Hence π(l♯)′ ⊐ π(t♯)′ and thus π(l♯) ≻ π(t♯). We conclude that π(DP(R)) ⊆ ≻. □

The previous result states that dummy elimination offers no advantage compared to the dependency pair technique. On the other hand, dependency pairs succeed for many systems where dummy elimination fails [1,2] (an example is given in the next section). One could imagine that dummy elimination may nevertheless be helpful in combination with dependency pairs. Then to show termination of a TRS one would first apply dummy elimination and afterwards prove termination of the transformed TRS with the dependency pair technique. In the remainder of this section we show that such a scenario cannot handle TRSs which cannot already be handled by the dependency pair technique directly. In short, dummy elimination is useless for automated termination proofs. We proceed in a stepwise manner. First we relate the dependency pairs of R to those of dummy(R).

Lemma 1. If l♯ → t♯ ∈ DP(R) then cap(l)♯ → cap(t)♯ ∈ DP(dummy(R)).

Proof. In the proof of Theorem 4 we observed that there exists a rewrite rule cap(l) → u in dummy(R) with cap(t) ⊴ u. Since root(cap(t)) is a defined symbol in dummy(R), cap(l)♯ → cap(t)♯ is a dependency pair of dummy(R). □

Now we prove that reducibility in R implies reducibility in dummy(R).

Definition 3. Given a substitution σ, the substitution σcap is defined as cap ∘ σ (i.e., the composition of cap and σ where σ is applied first).

Lemma 2. For all terms t and substitutions σ, we have cap(tσ) = cap(t)σcap.

Proof. Easy induction on the structure of t.
□
Lemma 3. If s →*_R t then cap(s) →*_dummy(R) cap(t).

Proof. It is sufficient to show that s →_R t implies cap(s) →*_dummy(R) cap(t). There must be a rule l → r ∈ R and a position p such that s|p = lσ and t = s[rσ]p. If p is below the position of an occurrence of e, then we have cap(s) = cap(t). Otherwise, cap(s)|p = cap(lσ) = cap(l)σcap by Lemma 2. Thus, cap(s) →_dummy(R) cap(s)[cap(r)σcap]p = cap(s)[cap(rσ)]p = cap(t). □

Next we show that if there is an arrow between two dependency pairs in the dependency graph of R then there is an arrow between the corresponding dependency pairs in the dependency graph of dummy(R).

Lemma 4. Let s, t be terms with defined root symbols. If s♯σ →*_R t♯σ for some substitution σ, then cap(s)♯σcap →*_dummy(R) cap(t)♯σcap.

Proof. Let s = f(s1, ..., sn). We have s♯σ = f♯(s1σ, ..., snσ). Since f♯ is a constructor, no step in the sequence s♯σ →*_R t♯σ takes place at the root position and thus t♯ = f♯(t1, ..., tn) with siσ →*_R tiσ for all 1 ≤ i ≤ n. We obtain cap(si)σcap = cap(siσ) →*_dummy(R) cap(tiσ) = cap(ti)σcap for all 1 ≤ i ≤ n by Lemmata 2 and 3. Hence cap(s)♯σcap →*_dummy(R) cap(t)♯σcap. □
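Identities like Lemma 2, cap(tσ) = cap(t)σcap, are easy to sanity-check mechanically. A sketch in our own tuple encoding, comparing both sides on a sample term and substitution:

```python
# Checking Lemma 2 on a sample: cap(t sigma) = cap(t) sigma_cap.
# Terms: variables are strings; applications are (symbol, (args...)).

def cap(t, e='e'):
    if isinstance(t, str):
        return t
    f, args = t
    return ('<>', ()) if f == e else (f, tuple(cap(a, e) for a in args))

def subst(t, sigma):
    if isinstance(t, str):
        return sigma.get(t, t)
    f, args = t
    return (f, tuple(subst(a, sigma) for a in args))

t = ('f', (('e', (('g', ('x',)),)), 'y'))        # f(e(g(x)), y)
sigma = {'x': ('e', ('z',)), 'y': ('g', ('z',))}
sigma_cap = {v: cap(u) for v, u in sigma.items()}  # cap composed with sigma
assert cap(subst(t, sigma)) == subst(cap(t), sigma_cap)
print('Lemma 2 holds on the sample')
```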
Finally we are ready for the main theorem of this section.

Theorem 5. Let R be a TRS. If dummy(R) is DP simply terminating then R is DP simply terminating.

Proof. Let C be a cluster in the dependency graph of R. From Lemmata 1 and 4 we infer the existence of a corresponding cluster, denoted by dummy(C), in the dependency graph of dummy(R). By assumption, there exist an argument filtering π′ and a simplification order ⊐ such that π′(dummy(R) ∪ dummy(C)) ⊆ ⊒ and π′(dummy(C)) ∩ ⊐ ≠ ∅. Let F be the signature of R. We define an argument filtering π for F♯ as follows: π(f) = π′(f) for every f ∈ (F \ {e})♯, π(e) = [ ] and, if e is a defined symbol of R, π(e♯) = [ ]. Slightly different from the proof of Theorem 4, let (·)′ denote the mapping that just replaces every occurrence of the constant e by ◇ and every occurrence of e♯ by ◇♯. It is easy to show that π(t)′ = π′(cap(t)) for every term t ∈ T(F, V) and π(t♯)′ = π′(cap(t)♯) for every term t ∈ T(F, V) with a defined root symbol. Similar to Theorem 4, we define the simplification order ≻ on T(F♯π, V) as s ≻ t if and only if s′ ⊐ t′. We claim that π and ≻ satisfy the constraints for C, i.e., π(R ∪ C) ⊆ ⪰ and π(C) ∩ ≻ ≠ ∅. If l → r ∈ R, then cap(l) → cap(r) ∈ dummy(R) and thus π(l)′ = π′(cap(l)) ⊒ π′(cap(r)) = π(r)′. Hence π(l) ⪰ π(r). If l♯ → t♯ ∈ C, then cap(l)♯ → cap(t)♯ ∈ dummy(C) by Lemma 1 and thus π(l♯)′ = π′(cap(l)♯) ⊒ π′(cap(t)♯) = π(t♯)′. Hence π(l♯) ⪰ π(t♯), and if π′(cap(l)♯) ⊐ π′(cap(t)♯), then π(l♯) ≻ π(t♯). □

We stress that the proof is constructive in the sense that a DP simple termination proof of dummy(R) can be automatically transformed into a DP simple termination proof of R (i.e., the orders and argument filterings required for the DP simple termination proofs of dummy(R) and R are essentially the same). Thus, the termination proof of dummy(R) is not simpler than a direct proof for R.
Theorem 5 also holds if one uses the estimated dependency graph of [1,2,3] instead of the real dependency graph. As mentioned in Section 2, such a computable approximation of the dependency graph must be used in implementations, since constructing the real dependency graph is undecidable in general. The proof is similar to the one of Theorem 5, since again for every cluster in the estimated dependency graph of R there is a corresponding one in the estimated dependency graph of dummy(R).
4
Argument Filtering Transformation
By incorporating argument filterings, a key ingredient of the dependency pair technique, into dummy elimination, Kusakari, Nakamura, and Toyama [16] recently developed the argument filtering transformation. In their paper they proved the soundness of their transformation and they showed that it improves upon dummy elimination. In this section we compare their transformation to the dependency pair technique. We proceed as in the previous section. First we recall the definition of the argument filtering transformation.
Definition 4. Let π be an argument filtering, f a function symbol, and 1 ≤ i ≤ arity(f). We write f ⊥π i if neither i ∈ π(f) nor i = π(f). Given two terms s and t, we say that s is a preserved subterm of t with respect to π, and we write s ⊴π t, if s ⊴ t and either s = t or t = f(t1, ..., tn), s is a preserved subterm of ti, and not f ⊥π i.

Definition 5. Given an argument filtering π, the argument filtering π̄ is defined as follows:

  π̄(f) = π(f)     if π(f) = [i1, ..., im],
  π̄(f) = [π(f)]   if π(f) = i.

The mapping AFTπ assigns to every term in T(F, V) a subset of T(Fπ, V), as follows:

  AFTπ(t) = {π(t) | π̄(t) contains a defined symbol} ∪ ⋃_{s ∈ S} AFTπ(s)

with S denoting the set of outermost non-preserved subterms of t. Finally, we define AFTπ(R) = {π(l) → r′ | l → r ∈ R and r′ ∈ AFTπ(r) ∪ {π(r)}}.

Consider the term t of Figure 1. Figure 2 shows AFTπ(t) for the two argument filterings with π(e) = [1] and π(e) = 2, respectively, and π(f) = [1, ..., n] for every other n-ary function symbol f. Here we assume that all numbered contexts contain defined symbols, but no occurrence of e.
[Figure omitted: for π(e) = [1] the set AFTπ(t) is the singleton {π(t)}; for π(e) = 2 it contains π(t) together with the filtered eliminated subterms.]

Fig. 2. The mappings π and AFTπ.

So essentially, AFTπ(t) contains π(s) for s = t and for all (maximal) subterms s of t which are eliminated if the argument filtering π is applied to t.
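Definition 5 can also be executed directly. The sketch below is our own tuple encoding (variables as strings; the defined symbols are read off from the left-hand sides); it computes AFTπ(R) and shows the effect of the collapsing filtering π(e) = 1 on the rule e(g(x)) → e(x).

```python
# Sketch of the argument filtering transformation AFT_pi (Definition 5).
# pi maps a symbol to an int (collapse) or a list of kept positions.

def apply_pi(pi, t):
    if isinstance(t, str):
        return t
    f, args = t
    p = pi.get(f, list(range(1, len(args) + 1)))
    if isinstance(p, int):
        return apply_pi(pi, args[p - 1])
    return (f, tuple(apply_pi(pi, args[i - 1]) for i in p))

def pi_bar(pi):
    """Like pi, but a collapsing entry i becomes the list [i]."""
    return {f: (p if isinstance(p, list) else [p]) for f, p in pi.items()}

def contains_defined(t, D):
    if isinstance(t, str):
        return False
    return t[0] in D or any(contains_defined(a, D) for a in t[1])

def outermost_nonpreserved(pi, t):
    """Arguments sitting directly below a filtered position."""
    if isinstance(t, str):
        return []
    f, args = t
    p = pi.get(f, list(range(1, len(args) + 1)))
    kept = [p] if isinstance(p, int) else p
    out = []
    for i, a in enumerate(args, start=1):
        if i in kept:
            out += outermost_nonpreserved(pi, a)
        else:                     # f is 'bottom'-related to i: a is dropped
            out.append(a)
    return out

def AFT_term(pi, t, D):
    res = {apply_pi(pi, t)} if contains_defined(apply_pi(pi_bar(pi), t), D) else set()
    for s in outermost_nonpreserved(pi, t):
        res |= AFT_term(pi, s, D)
    return res

def AFT_rules(pi, R):
    D = {l[0] for l, r in R}
    return {(apply_pi(pi, l), r1)
            for l, r in R
            for r1 in AFT_term(pi, r, D) | {apply_pi(pi, r)}}

# Collapsing filtering pi(e) = 1 applied to the rule e(g(x)) -> e(x):
R = [(('e', (('g', ('x',)),)), ('e', ('x',)))]
print(AFT_rules({'e': 1}, R))     # the single rule g(x) -> x
```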
However, one only needs terms π(s) in AFTπ(t) where s contained a defined symbol outside eliminated arguments (otherwise the original subterm s cannot have been responsible for a potential non-termination). Kusakari et al. [16] proved the soundness of the argument filtering transformation.

Theorem 6. If AFTπ(R) is terminating then R is terminating.

We show that if AFTπ(R) is simply terminating then R is DP simply terminating and again, a termination proof by dependency pairs works with the same argument filtering π and the simplification order used to orient AFTπ(R). Thus, the argument filtering transformation has no advantage compared to dependency pairs. We start with two easy lemmata.¹

Lemma 5. Let s and t be terms. If s ⊴π t then π(s) ⊴ π(t).

Proof. By induction on the definition of ⊴π. If s = t then the result is trivial. Suppose t = f(t1, ..., tn), s ⊴π ti, and not f ⊥π i. The induction hypothesis yields π(s) ⊴ π(ti). Because not f ⊥π i, π(ti) is a subterm of π(t) and thus π(s) ⊴ π(t) as desired. □

Lemma 6. Let r be a term. For every subterm t of r with a defined root symbol there exists a term u ∈ AFTπ(r) such that π(t) ⊴ u.

Proof. We use induction on the structure of r. In the base case we must have t = r and we take u = π(r). Note that π(r) ∈ AFTπ(r) because root(π̄(r)) = root(r) is defined. In the induction step we distinguish two cases. If t ⊴π r then we also have t ⊴π̄ r and hence π̄(t) ⊴ π̄(r) by Lemma 5. As root(π̄(t)) = root(t) is defined, the term π̄(r) contains a defined symbol. Hence π(r) ∈ AFTπ(r) by definition and thus we can take u = π(r). In the other case t is not a preserved subterm of r. This implies that t ⊴ s for some outermost non-preserved subterm s of r. The induction hypothesis, applied to t ⊴ s, yields a term u ∈ AFTπ(s) such that π(t) ⊴ u. We have AFTπ(s) ⊆ AFTπ(r) and hence u satisfies the requirements. □

Theorem 7. Let R be a TRS and π an argument filtering.
If AFTπ(R) is simply terminating then R is DP simply terminating.

Proof. As in the proof of Theorem 4 there is no need to consider the dependency graph. Let ≻ be a simplification order that shows the (simple) termination of AFTπ(R). We claim that the dependency pair constraints are satisfied by π and ≻, where π and ≻ are extended to F♯ by treating each marked symbol F in the same way as the corresponding unmarked f. For rewrite rules l → r ∈ R we have π(l) ≻ π(r) as π(l) → π(r) ∈ AFTπ(R). Let l♯ → t♯ be a dependency pair of R, originating from the rewrite rule l → r. We show that π(l) ≻ π(t) and hence, π(l♯) ≻ π(t♯) as well. We have t ⊴ r. Since root(t) is a defined function symbol
¹ Arguments similar to the proofs of Lemma 6 and Theorem 7 can also be found in [16, Lemma 4.3 and Theorem 4.4]. However, [16] contains neither Theorem 7 nor our main Theorem 8, since the authors do not compare the argument filtering transformation with the dependency pair approach.
by the definition of dependency pairs, we can apply Lemma 6. This yields a term u ∈ AFTπ(r) such that π(t) ⊴ u. The subterm property of ≻ yields u ⪰ π(t). By definition, π(l) → u ∈ AFTπ(R) and thus π(l) ≻ u by compatibility of ≻ with AFTπ(R). Hence π(l) ≻ π(t) as desired. □

Note that in the above proof we did not make use of the possibility to treat marked symbols differently from unmarked ones. This clearly shows why the dependency pair technique is much more powerful than the argument filtering transformation; there are numerous DP simply terminating TRSs which are no longer DP simply terminating if we are forced to interpret a defined function symbol and its marked version in the same way. As a simple example, consider

  R1 = {  x − 0 → x,              0 ÷ s(y) → 0,
          x − s(y) → p(x − y),    s(x) ÷ s(y) → s((x − y) ÷ s(y)),
          p(s(x)) → x }

Note that R1 is not simply terminating as the rewrite step s(x) ÷ s(s(x)) → s((x − s(x)) ÷ s(s(x))) is self-embedding. To obtain a terminating TRS AFTπ(R1), the rule p(s(x)) → x enforces not p ⊥π 1 and not s ⊥π 1. From not p ⊥π 1 and the rules for − we infer that π(−) = [1, 2]. But then, for all choices of π(÷), the rule s(x) ÷ s(y) → s((x − y) ÷ s(y)) is transformed into one that is incompatible with a simplification order. So AFTπ(R1) is not simply terminating for any π. (Similarly, dummy elimination cannot transform this TRS into a simply terminating one either.) On the other hand, DP simple termination of R1 is easily shown by the argument filtering π(p) = 1, π(−) = 1, π(−♯) = [1, 2], and π(f) = [1, ..., arity(f)] for every other function symbol f in combination with the recursive path order. This example illustrates that treating defined symbols and their marked versions differently is often required in order to benefit from the fact that the dependency pair approach only requires weak decreasingness for the rules of R1.
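The self-embedding claim for R1 can be verified with a standard homeomorphic embedding test. The sketch below is our own (variables are treated as constants, ÷ is written div and − is written minus); it confirms that the left-hand side of the step embeds into the right-hand side.

```python
# Homeomorphic embedding check: emb(s, t) iff s embeds into t.
# Terms: variables/constants are strings; applications (symbol, (args...)).

def emb(s, t):
    if s == t:
        return True
    if isinstance(t, str):
        return False
    g, ts = t
    if any(emb(s, ti) for ti in ts):         # dive into an argument of t
        return True
    if isinstance(s, str):
        return False
    f, ss = s
    # same root: embed argument-wise
    return f == g and len(ss) == len(ts) and all(emb(a, b) for a, b in zip(ss, ts))

# The step  s(x) div s(s(x))  ->  s((x minus s(x)) div s(s(x)))
sx = ('s', ('x',))
lhs = ('div', (sx, ('s', (sx,))))
inner = ('div', (('minus', ('x', sx)), ('s', (sx,))))
rhs = ('s', (inner,))
assert emb(lhs, rhs)                         # the step is self-embedding
print('left-hand side embeds into right-hand side')
```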
The next question we address is whether the argument filtering transformation can be useful as a preprocessing step for the dependency pair technique. Surprisingly, the answer to this question is yes. Consider the TRS

  R2 = {  f(a) → f(c(a)),     f(a) → f(d(a)),
          f(c(x)) → x,        f(d(x)) → x,
          f(c(a)) → f(d(b)),  f(c(b)) → f(d(a)),
          e(g(x)) → e(x) }

This TRS is not DP simply terminating, which can be seen as follows. The dependency pair E(g(x)) → E(x) constitutes a cluster in the dependency graph of R2. Hence, if R2 were DP simply terminating, there would be an argument filtering π and a simplification order ≻ such that (amongst others)

  π(f(a)) ⪰ π(f(c(a)))       π(f(a)) ⪰ π(f(d(a)))
  π(f(c(x))) ⪰ x             π(f(d(x))) ⪰ x
  π(f(c(a))) ⪰ π(f(d(b)))    π(f(c(b))) ⪰ π(f(d(a)))
From π(f(c(x))) ⪰ x and π(f(d(x))) ⪰ x we infer that not f ⊥π 1, not c ⊥π 1, and not d ⊥π 1. Hence π(f(a)) ⪰ π(f(c(a))) and π(f(a)) ⪰ π(f(d(a))) can only be satisfied
if π(c) = π(d) = 1. But then π(f(c(a))) ⪰ π(f(d(b))) and π(f(c(b))) ⪰ π(f(d(a))) amount to either f(a) ⪰ f(b) and f(b) ⪰ f(a) (if π(f) = [1]) or a ⪰ b and b ⪰ a (if π(f) = 1). Since f(a) ≠ f(b) and a ≠ b, the required simplification order does not exist. On the other hand, if π(e) = 1 then AFTπ(R2) consists of the first six rewrite rules of R2 together with g(x) → x. One easily verifies that there are no clusters in DG(AFTπ(R2)) and hence AFTπ(R2) is trivially DP simply terminating.

Definition 6. An argument filtering π is called collapsing if π(f) = i for some defined function symbol f.

The argument filtering in the previous example is collapsing. In the remainder of this section we show that for non-collapsing argument filterings the implication "AFTπ(R) is DP simply terminating ⇒ R is DP simply terminating" is valid. Thus, using the argument filtering transformation with a non-collapsing π as a preprocessing step to the dependency pair technique has no advantages. First we prove a lemma to relate the dependency pairs of R and AFTπ(R).

Lemma 7. Let π be a non-collapsing argument filtering. If l♯ → t♯ ∈ DP(R) then π(l)♯ → π(t)♯ ∈ DP(AFTπ(R)).

Proof. By definition there is a rewrite rule l → r ∈ R and a subterm t ⊴ r with defined root symbol. According to Lemma 6 there exists a term u ∈ AFTπ(r) such that π(t) ⊴ u. Thus, π(l) → u ∈ AFTπ(R). Since π is non-collapsing, root(π(t)) = root(t). Hence, as root(t) is defined, π(l)♯ → π(t)♯ is a dependency pair of AFTπ(R). □

Example R2 shows that the above lemma is not true for arbitrary argument filterings. The reason is that e(g(x))♯ → e(x)♯ is a dependency pair of R2, but with π(e) = 1 there is no corresponding dependency pair in AFTπ(R2). The next three lemmata will be used to show that clusters in DG(R) correspond to clusters in DG(AFTπ(R)).

Definition 7. Given an argument filtering π and a substitution σ, the substitution σπ is defined as π ∘ σ (i.e., σ is applied first).

Lemma 8.
For all terms t, argument filterings π, and substitutions σ, π(tσ) = π(t)σπ . Proof. Easy induction on the structure of t.
□
Lemma 9. Let R be a TRS and π a non-collapsing argument filtering. If s →*_R t then π(s) →*_AFTπ(R) π(t).

Proof. It suffices to show that π(s) →*_AFTπ(R) π(t) whenever s →_R t is a single rewrite step. Let s = C[lσ] and t = C[rσ] for some context C, rewrite rule l → r ∈ R, and substitution σ. We use induction on C. If C is the empty context, then π(s) = π(lσ) = π(l)σπ and π(t) = π(rσ) = π(r)σπ according to
Lemma 8. As π(l) → π(r) ∈ AFTπ(R), we have π(s) →_AFTπ(R) π(t). Suppose C = f(s1, ..., C′, ..., sn) where C′ is the i-th argument of C. If f ⊥π i then π(s) = π(t). If π(f) = i (which is possible for constructors f) then π(s) = π(C′[lσ]) and π(t) = π(C′[rσ]), and thus we obtain π(s) →*_AFTπ(R) π(t) from the induction hypothesis. In the remaining case we have π(f) = [i1, ..., im] with ij = i for some j and hence π(s) = f(π(si1), ..., π(C′[lσ]), ..., π(sim)) and π(t) = f(π(si1), ..., π(C′[rσ]), ..., π(sim)). In this case we obtain π(s) →*_AFTπ(R) π(t) from the induction hypothesis as well. □

The following lemma states that if two dependency pairs are connected in R's dependency graph, then the corresponding pairs are connected in the dependency graph of AFTπ(R) as well.

Lemma 10. Let R be a TRS, π a non-collapsing argument filtering, and s, t terms with defined root symbols. If s♯σ →*_R t♯σ for some substitution σ then π(s)♯σπ →*_AFTπ(R) π(t)♯σπ.

Proof. We have s = f(s1, ..., sn) and t = f(t1, ..., tn) for some n-ary defined function symbol f with siσ →*_R tiσ for all 1 ≤ i ≤ n. Let π(f) = [i1, ..., im]. This implies π(sσ)♯ = f♯(π(si1σ), ..., π(simσ)) and π(tσ)♯ = f♯(π(ti1σ), ..., π(timσ)). From the preceding lemma we know that π(sijσ) →*_AFTπ(R) π(tijσ) for all 1 ≤ j ≤ m. Hence, using Lemma 8, π(s)♯σπ = π(sσ)♯ →*_AFTπ(R) π(tσ)♯ = π(t)♯σπ. □

Now we can finally prove the main theorem of this section.

Theorem 8. Let R be a TRS and π a non-collapsing argument filtering. If AFTπ(R) is DP simply terminating then R is DP simply terminating.

Proof. Let C be a cluster in DG(R). According to Lemmata 7 and 10, there is a corresponding cluster in DG(AFTπ(R)), which we denote by π(C). By assumption, there exist an argument filtering π′ and a simplification order ≻ such that π′(AFTπ(R) ∪ π(C)) ⊆ ⪰ and π′(π(C)) ∩ ≻ ≠ ∅.
We define an argument filtering π″ for R as the composition of π and π′. For a precise definition, let ♭ denote the unmarking operation, i.e., f♭ = f and (f♯)♭ = f for all f ∈ F. Then for all f ∈ F♯ we define

  π″(f) = [i_{j1}, ..., i_{jk}]   if π(f♭) = [i1, ..., im] and π′(f) = [j1, ..., jk],
  π″(f) = i_j                     if π(f♭) = [i1, ..., im] and π′(f) = j,
  π″(f) = i                       if π(f♭) = i.

It is not difficult to show that π″(t) = π′(π(t)) and π″(t♯) = π′(π(t)♯) for all terms t without marked symbols. We claim that π″ and ≻ satisfy the constraints for C, i.e., π″(R ∪ C) ⊆ ⪰ and π″(C) ∩ ≻ ≠ ∅. These two properties follow from the two assumptions π′(AFTπ(R) ∪ π(C)) ⊆ ⪰ and π′(π(C)) ∩ ≻ ≠ ∅ in conjunction with the obvious inclusion π(R) ⊆ AFTπ(R). □

Theorem 8 also holds for the estimated dependency graph instead of the real dependency graph.
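The composition of two argument filterings is easy to validate on concrete data. A sketch in our own encoding (ignoring marked symbols) that builds π″ from π and π′ following the case analysis and checks π″(t) = π′(π(t)) on a sample term:

```python
# Composing argument filterings: pi''(f) selects, among the positions
# kept by pi, those that pi' keeps (or collapses to) afterwards.
# pi values: int (collapse) or list of kept positions (1-based).

def apply_pi(pi, t):
    if isinstance(t, str):
        return t
    f, args = t
    p = pi.get(f, list(range(1, len(args) + 1)))
    if isinstance(p, int):
        return apply_pi(pi, args[p - 1])
    return (f, tuple(apply_pi(pi, args[i - 1]) for i in p))

def compose(pi, pi2, symbols_arities):
    """pi'' with pi''(t) = pi2(pi(t)); pi is applied first."""
    out = {}
    for f, n in symbols_arities.items():
        p = pi.get(f, list(range(1, n + 1)))
        if isinstance(p, int):
            out[f] = p                    # f already collapses under pi
            continue
        q = pi2.get(f, list(range(1, len(p) + 1)))
        out[f] = p[q - 1] if isinstance(q, int) else [p[j - 1] for j in q]
    return out

pi = {'f': [1, 3]}                        # keep the 1st and 3rd argument of f
pi2 = {'f': 2}                            # then collapse f to its 2nd argument
pi3 = compose(pi, pi2, {'f': 3, 'g': 1})
t = ('f', (('g', ('x',)), 'y', ('g', ('z',))))   # f(g(x), y, g(z))
assert pi3['f'] == 3
assert apply_pi(pi3, t) == apply_pi(pi2, apply_pi(pi, t)) == ('g', ('z',))
print("the composed filtering agrees with applying pi then pi' on the sample")
```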
5
Conclusion
In this paper, we have compared two transformational techniques for termination proofs, viz. dummy elimination [11] and the argument filtering transformation [16], with the dependency pair technique of Arts and Giesl [1,2,3]. Essentially, all these techniques transform a given TRS into new inequalities or rewrite systems which then have to be oriented by suitable well-founded orders. Virtually all well-founded orders which can be generated automatically are simplification orders. As our focus was on automated termination proofs, we therefore investigated the strengths of these three techniques when combined with simplification orders. To that end, we showed that whenever an automated termination proof is possible using dummy elimination or the argument filtering transformation, then a corresponding termination proof can also be obtained by dependency pairs. Thus, the dependency pair technique is more powerful than dummy elimination or the argument filtering transformation on their own. Moreover, we examined whether dummy elimination or the argument filtering transformation would at least be helpful as a preprocessing step to the dependency pair technique. We proved that for dummy elimination and for an argument filtering transformation with a non-collapsing argument filtering, this is not the case. In fact, whenever there is a (pre)order satisfying the dependency pair constraints for the rewrite system resulting from dummy elimination or a non-collapsing argument filtering transformation, then the same (pre)order also satisfies the dependency pair constraints for the original TRS. As can be seen from the proofs of our main theorems, this latter result even holds for arbitrary (i.e., non-simplification) (pre)orders. Thus, in particular, Theorems 5 and 8 also hold for DP quasi-simple termination [13].
This notion captures those TRSs where the dependency pair constraints are satisfied by an arbitrary simplification preorder ≿ (instead of just a preorder whose equivalence relation is syntactic equality, as in DP simple termination). Future work will include a further investigation of the usefulness of collapsing argument filtering transformations as a preprocessing step to dependency pairs. Note that our counterexample R2 is DP quasi-simply terminating (but not DP simply terminating). In other words, at present it is not clear whether the argument filtering transformation is useful as a preprocessing step to the dependency pair technique if one admits arbitrary simplification preorders to solve the generated constraints. However, an extension of Theorem 8 to DP quasi-simple termination and to collapsing argument filterings π is not straightforward, since clusters of dependency pairs in R may disappear in AFTπ(R) (i.e., Lemma 7 does not hold for collapsing argument filterings). We also intend to examine the relationship between dependency pairs and other transformation techniques such as "freezing" [20].

Acknowledgements. Jürgen Giesl is supported by the DFG under grant GI 274/4-1. Aart Middeldorp is partially supported by the Grant-in-Aid for Scientific Research C(2) 11680338 of the Ministry of Education, Science, Sports and Culture of Japan.
References

1. T. Arts and J. Giesl, Automatically Proving Termination where Simplification Orderings Fail, Proc. 7th TAPSOFT, Lille, France, LNCS 1214, pp. 261-273, 1997.
2. T. Arts and J. Giesl, Termination of Term Rewriting Using Dependency Pairs, Theoretical Computer Science 236, pp. 133-178, 2000. Long version available at www.inferenzsysteme.informatik.tu-darmstadt.de/~reports/ibn-97-46.ps.
3. T. Arts and J. Giesl, Modularity of Termination Using Dependency Pairs, Proc. 9th RTA, Tsukuba, Japan, LNCS 1379, pp. 226-240, 1998.
4. F. Baader and T. Nipkow, Term Rewriting and All That, Cambridge University Press, 1998.
5. F. Bellegarde and P. Lescanne, Termination by Completion, Applicable Algebra in Engineering, Communication and Computing 1, pp. 79-96, 1990.
6. A. Ben Cherifa and P. Lescanne, Termination of Rewriting Systems by Polynomial Interpretations and its Implementation, Science of Computer Programming 9, pp. 137-159, 1987.
7. N. Dershowitz, Orderings for Term-Rewriting Systems, Theoretical Computer Science 17, pp. 279-301, 1982.
8. N. Dershowitz, Termination of Rewriting, Journal of Symbolic Computation 3, pp. 69-116, 1987.
9. J. Dick, J. Kalmus, and U. Martin, Automating the Knuth-Bendix Ordering, Acta Informatica 28, pp. 95-119, 1990.
10. M.C.F. Ferreira, Termination of Term Rewriting: Well-foundedness, Totality and Transformations, Ph.D. thesis, Utrecht University, The Netherlands, 1995.
11. M.C.F. Ferreira and H. Zantema, Dummy Elimination: Making Termination Easier, Proc. 10th FCT, Dresden, Germany, LNCS 965, pp. 243-252, 1995.
12. J. Giesl, Generating Polynomial Orderings for Termination Proofs, Proc. 6th RTA, Kaiserslautern, Germany, LNCS 914, pp. 426-431, 1995.
13. J. Giesl and E. Ohlebusch, Pushing the Frontiers of Combining Rewrite Systems Farther Outwards, Proc. 2nd FROCOS, 1998, Amsterdam, The Netherlands, Studies in Logic and Computation 7, Research Studies Press, Wiley, pp. 141-160, 2000.
14. S. Kamin and J.-J. Lévy, Two Generalizations of the Recursive Path Ordering, unpublished manuscript, University of Illinois, USA, 1980.
15. D.E. Knuth and P. Bendix, Simple Word Problems in Universal Algebras, in: Computational Problems in Abstract Algebra (ed. J. Leech), Pergamon Press, pp. 263-297, 1970.
16. K. Kusakari, M. Nakamura, and Y. Toyama, Argument Filtering Transformation, Proc. 1st PPDP, Paris, France, LNCS 1702, pp. 48-62, 1999.
17. D. Lankford, On Proving Term Rewriting Systems are Noetherian, Report MTP-3, Louisiana Technical University, Ruston, USA, 1979.
18. A. Middeldorp, H. Ohsaki, and H. Zantema, Transforming Termination by Self-Labelling, Proc. 13th CADE, New Brunswick (New Jersey), USA, LNAI 1104, pp. 373-387, 1996.
19. J. Steinbach, Simplification Orderings: History of Results, Fundamenta Informaticae 24, pp. 47-87, 1995.
20. H. Xi, Towards Automated Termination Proofs Through "Freezing", Proc. 9th RTA, Tsukuba, Japan, LNCS 1379, pp. 271-285, 1998.
21. H. Zantema, Termination of Term Rewriting: Interpretation and Type Elimination, Journal of Symbolic Computation 17, pp. 23-50, 1994.
22. H. Zantema, Termination of Term Rewriting by Semantic Labelling, Fundamenta Informaticae 24, pp. 89-105, 1995.
Extending Decision Procedures with Induction Schemes

Deepak Kapur¹* and Mahadevan Subramaniam²

¹ Department of Computer Science, University of New Mexico, Albuquerque, NM
[email protected]
² HAL Computer Systems, Fujitsu Inc., Campbell, CA
[email protected]
Abstract. Families of function definitions and conjectures based on quantifier-free decidable theories are identified for which the inductive validity of conjectures can be decided by the cover set method, a heuristic implemented in the rewrite-based induction theorem prover Rewrite Rule Laboratory (RRL) for mechanizing induction. The conditions characterizing definitions and conjectures are syntactic and can be easily checked, thus making it possible to determine a priori whether a given conjecture can be decided. The concept of a T-based function definition is introduced, which consists of a finite set of terminating complete rewrite rules of the form f(s1, · · ·, sm) → r, where s1, · · ·, sm are interpreted terms from a decidable theory T, and r is either an interpreted term or has nonnested recursive calls to f with all other function symbols from T. Two kinds of conjectures are considered. Simple conjectures are of the form f(x1, · · ·, xm) = t, where f is T-based, the xi's are distinct variables, and t is interpreted in T. Complex conjectures differ from simple conjectures in their left sides, which may contain many function symbols whose definitions are T-based, provided the nested order in which these function symbols appear in the left sides has the compatibility property with their definitions. The main objective is to ensure that for each induction subgoal generated from a conjecture after selecting an induction scheme, the resulting formula can be simplified so that the induction hypothesis(es), whenever needed, is applicable, and the result of this application is a formula in T. The decidable theories considered are the quantifier-free theory of Presburger arithmetic, congruence closure on ground terms (with or without associative-commutative operators), propositional calculus, and the quantifier-free theory of constructors (mostly free constructors, as in the case of finite lists and finite sequences).
A byproduct of the approach is that it can predict the structure of intermediate lemmas needed for automatically deciding this subclass of conjectures. Several examples over lists and numbers, and of properties involved in establishing the number-theoretic correctness of arithmetic circuits, are given.

* Partially supported by the National Science Foundation Grant nos. CCR-9712396, CCR-9712366, CCR-9996150, and CDA-9503064.
D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 324–345, 2000. c Springer-Verlag Berlin Heidelberg 2000
1 Introduction
Inductive reasoning is ubiquitous in verifying properties of computations realized in hardware and software. Automation of inductive reasoning is hampered by the fact that proofs by induction need an appropriate selection of variables for performing induction and a suitable induction scheme, as well as intermediate lemmas. It is well known that inductive reasoning often needs considerable user guidance, because of which its automation has been a major challenge. A lot of effort has been spent on mechanizing induction in theorem provers (e.g., Nqthm, ACL2, RRL, INKA, Oyster-Clam), and the induction heuristics in these provers have been successfully used to establish several nontrivial properties. However, the use of induction as a mechanized rule of inference is seriously undermined by the lack of automation in applying this rule. Many reasoning tools, including model checkers (in conjunction with decision procedures), invariant generators, and deductive synthesis tools, preclude induction for lack of automation. This severely limits the reasoning capability of these tools; in many cases inductive properties are established outside these tools manually. For hardware circuit descriptions, the need for inductive reasoning arises when reasoning is attempted about a circuit description parameterized by data width and/or generic components. In protocol verification, induction is often needed when a protocol has to be analyzed for a large set of processors (or network channels). Inductive reasoning in many such cases is not as challenging as in software specifications or recursive and loop programs. This paper is an attempt to address this limitation of these automated tools while preserving their automation, and without requiring the full generality of a theorem prover.
It is shown how decision procedures for simple theories about certain data structures, e.g., numbers, booleans, finite lists, and finite sequences, can be enhanced to include induction techniques with the objective that proofs employing such techniques can be done automatically. The result is an extended decision procedure with a built-in induction scheme, implying that an inductive theorem prover can be run in push-button mode as well. We believe the proposed approach can substantially enhance the reasoning power of tools built using decision procedures and model checkers without losing the advantage of automation. This cannot be done in general, however. Conditions are identified on function definitions and conjectures which guarantee such automation. It becomes possible to determine a priori whether a given conjecture can be decided automatically, thus predicting the success or failure of using a theorem proving strategy. That is the main contribution of the paper. A byproduct of the proposed approach is that, in case of a failure of the theorem proving strategy, it can predict, for a subclass of conjectures, the structure of lemmas needed for proof attempts to succeed. The proposed approach is based on two main ideas. First, terminating recursive definitions of function symbols, given as rewrite rules oriented using a well-founded ordering, can be used to generate induction schemes providing useful induction hypotheses for proofs by induction; this idea is the basis for the cover
set method proposed in [16] and implemented in the rewrite-rule based induction theorem prover Rewrite Rule Laboratory (RRL) [14]. Second, for inductive proofs of conjectures satisfying certain conditions, induction schemes generated from T-based recursive function definitions (a concept characterized precisely below) lead to subgoals in T (after the application of induction hypotheses), where T is a decidable theory. These conditions are based on the structure of the function definitions and the conjecture. The concept of a T-based function definition is introduced to achieve the above objective. It is shown that conjectures of the form f(x1, · · ·, xm) = r, where f has a T-based function definition, the xi's are distinct variables, and r is an interpreted term in T (i.e., r includes only function symbols from T), can be decided using the cover set method¹. The reason for focusing on such simple conjectures is that there is only one induction scheme to be considered by the cover set method. It might be possible to relax this restriction and consider more complicated conjectures insofar as they suggest one induction scheme and the induction hypothesis(es) is applicable to the subgoals after using the function definitions for simplification. The decidable theories considered are the quantifier-free theory of Presburger arithmetic, congruence closure on ground terms (with or without associative-commutative operators), propositional calculus, as well as the quantifier-free theory of constructors (mostly free constructors, as in the case of finite lists and finite sequences). For each such theory, decision procedures exist, and RRL, for instance, has an implementation of them integrated with rewriting [7,8]. Below, we review two examples providing an overview of the proposed approach and of the subclass of conjectures and definitions which can be considered.

1.1 A Simple Conjecture
Consider the following very simple but illustrative example:

(C1): double(m) = m + m,

where double is recursively defined using the rewrite rules:

1. double(0)    --> 0,
2. double(s(x)) --> s(s(double(x))).
A proof by induction with m as the induction variable and using the standard induction scheme (i.e., Peano's principle of mathematical induction over numbers) leads to one basis goal and one induction step goal. For the basis subgoal, the substitution m <- 0 gives double(0) = 0 + 0, which simplifies using the definition of double to a valid formula in Presburger arithmetic. In the step subgoal, the conclusion generated using the substitution m <- s(x) is double(s(x)) = s(x) + s(x), with the induction hypothesis, got by the substitution m <- x, being double(x) = x + x.

¹ As will be shown later, it is not necessary to require that each argument to f be a distinct variable; instead, non-induction arguments can be interpreted terms that do not include variables in inductive positions of f.
By the second rule in the definition of double, the formula simplifies and the induction hypothesis applies, resulting again in a valid formula s(s(x + x)) = s(x) + s(x) in Presburger arithmetic. Hence, (C1) is valid using Presburger arithmetic and the induction scheme of double. Similarly, a conjecture double(m) = m can be decided to be false: the basis case will go through, but the formula resulting from the induction step and the application of the induction hypothesis is not valid. The main features of the above conjectures and the definition of double are:

1. an unambiguous induction scheme using which the formula can be decided,
2. induction hypotheses strong enough to be applicable to the induction subgoals, and finally
3. formulas resulting from subgoals, after applying the definition and the induction hypotheses, that are decidable.

Properties of T-based definitions and simple conjectures ensure the above.

1.2 A Complex Conjecture
For complex conjectures involving many function symbols with T-based definitions, it becomes necessary to consider the interaction among their definitions based on their nesting order in conjectures. This aspect is captured by the compatibility property of function definitions (which is precisely characterized in a later section). The key insight is similar to the one observed for a simple conjecture. Compatible function definitions can be viewed as composing into a single T-based function definition, so that a complex conjecture can be viewed as a simple conjecture in terms of the composed function, as illustrated below.

(C2): log(exp2(m)) = m.
The definitions of the functions log and exp2 (logarithm and exponentiation to the base 2, respectively) are as follows. Following the mathematical convention, log is defined on positive numbers only².

1. log(s(0))     --> 0,
2. log(x + x)    --> s(log(x)),
3. log(s(x + x)) --> s(log(x)).

4. exp2(0)    --> s(0),
5. exp2(s(x)) --> exp2(x) + exp2(x).
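Read over the positive integers, these rules compute the floor of the base-2 logarithm and exponentiation. The following sketch (our own encoding, not the paper's) lets one observe on instances that (C2) holds while the related conjecture (C2') discussed below fails:

```python
def log2(n):
    """log per rules 1-3; defined on positive numbers only."""
    assert n >= 1
    if n == 1:                   # rule 1: log(s(0)) --> 0
        return 0
    return 1 + log2(n // 2)      # rules 2, 3: log(x+x) and log(s(x+x)) --> s(log(x))

def exp2(n):
    if n == 0:                   # rule 4: exp2(0) --> s(0)
        return 1
    return exp2(n - 1) + exp2(n - 1)  # rule 5: exp2(s(x)) --> exp2(x) + exp2(x)

# (C2): log(exp2(m)) = m holds for every m tried ...
assert all(log2(exp2(m)) == m for m in range(8))
# ... whereas exp2(log(m)) = m already fails at m = 3, matching the
# inconsistent step subgoal x + x = s(x + x) derived in the text
assert exp2(log2(3)) != 3
```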
² This implies that the induction schemes generated using the definition of log can be used to decide the validity of conjectures over positive numbers only. For a detailed discussion of the use of cover sets and induction schemes derived from definitions such as log, please refer to [9].

Unlike a simple conjecture, the left side of (C2) is a nested term. Again, there is only one induction variable m, and the induction scheme used for attempting a proof by induction is the principle of mathematical induction for numbers (suggested by the cover set of exp2, as explained later in Section 2.1). There is one basis goal and one induction step subgoal. The basis subgoal is log(exp2(0)) = 0; its left side rewrites using the definitions of exp2 and then log, resulting in a valid formula in Presburger arithmetic. In the induction step, the conclusion is log(exp2(s(x))) = s(x), with the hypothesis being log(exp2(x)) = x. By the definition of exp2, exp2(s(x)) simplifies to exp2(x) + exp2(x). This subgoal will simplify to a formula in Presburger arithmetic if log(exp2(x) + exp2(x)) rewrites to s(log(exp2(x))), either as a part of the definition of log or via an intermediate lemma, and then the induction hypothesis can apply. Such interaction between the definitions of log and exp2 is captured by compatibility. Since the definition of log includes such a rule, the validity of the induction step case, and hence the validity of (C2), can be decided. The validity of a closely related conjecture,

(C2'): exp2(log(m)) = m,
can be similarly decided since exp2 is compatible with log. An induction proof can be attempted using m as the induction variable as before. However, m can take only positive values, since the function log is defined only for these. The induction scheme used is different from the principle of mathematical induction; instead, it is based on the definition of log. There is a basis case corresponding to the number s(0), and two step cases corresponding to m being a positive even or a positive odd number, respectively (this scheme is derived from the cover set of log, as explained later in Section 2.1). In one of the induction step subgoals, the conclusion is exp2(log(s(x + x))) = s(x + x); its left side rewrites by the definition of log to exp2(s(log(x))), which then rewrites to exp2(log(x)) + exp2(log(x)), to which the hypothesis exp2(log(x)) = x applies to produce the inconsistent Presburger arithmetic formula x + x = s(x + x). As stated above, if log and exp2 are combined to develop the definition of the composed function log(exp2(x)) from their definitions, then (C2) is a simple conjecture about the composed function. Further, the definition of the composed function can be proved to be T-based as well, so the decidability of the conjecture follows. The notion of compatibility among the definitions of function symbols can be generalized to a compatible sequence of function symbols f1, · · ·, fd, where each fi is compatible with fi+1 at the ji-th argument, 1 ≤ i ≤ d − 1. A conjecture l = r can then be decided if the sequence of function symbols from the root of l to the innermost function symbol forms a compatible sequence and r is an interpreted term in T. The proposed approach is discussed below in the framework of our theorem prover Rewrite Rule Laboratory (RRL), but the results should apply to other induction provers that rely on decision procedures and support heuristics for selecting induction schemes, e.g., Boyer and Moore's theorem prover Nqthm,
ACL2, and INKA. Moreover, the proposed approach can be integrated into tools based on decision procedures and model checking. The main motivation for this work comes from our work on verifying properties of generic, parameterized arithmetic circuits, including adders, multipliers, dividers, and square root circuits [12,10,11,13]. The approach is illustrated on several examples, including properties arising in proofs of arithmetic circuits, as well as commonly used properties of numbers and lists involving defined function symbols. A byproduct of this approach is that if a conjecture with the above-mentioned restriction cannot be decided, the structure of the intermediate lemmas needed for deciding it can be predicted. This can aid in automatic lemma speculation.

1.3 Related Work
Boyer and Moore, while describing the integration of linear arithmetic into Nqthm [3], discussed the importance of reasoning about formulas involving defined function symbols and interpreted terms. Many examples of such conjectures were discussed there, and they illustrated how these examples can be done using the interaction of the theorem prover and the decision procedure. In this paper we have focussed on automatically deciding the validity of such conjectures; most of the examples described there can be automatically decided using the proposed approach. Fribourg [4] showed that properties of certain recursive predicates over lists and numbers, expressed as logic programs, can be decided. Most of the properties established there can be formulated as equational definitions and decided using the proposed approach. The procedure in [4] used bottom-up evaluation of logic programs, which need not terminate if the successor operation over numbers is included; the proposed approach does not appear to have this limitation. Gupta's dissertation [1] was an attempt to integrate (a limited form of) inductive reasoning with a decision procedure for propositional formulas, e.g., ordered BDDs. She showed how properties about a certain subclass of circuits of arbitrary data width can be verified automatically. Properties automatically verified using her approach constitute a very limited subset, however.
2 Cover Set Induction
The cover set method is used to mechanize well-founded induction in RRL, and has been used to successfully perform proofs by induction in a variety of nontrivial application domains [12,10,11]. For attempting a proof by induction of a conjecture containing a subterm t = f(x1 , · · · , xm ), where each xi is a distinct variable, an induction scheme from a complete definition of f given as a set of terminating rewrite rules, is generated as follows. There is one induction subgoal corresponding to each terminating rule in the definition of f. The induction conclusion is generated using the substitution from the left side of a rule, and an induction hypothesis is generated using the substitution from each recursive
function call in the right side of the rule. Rules without recursive calls in their right sides lead to subgoals without any induction hypotheses (basis steps). The recursive definitions of function symbols appearing in a conjecture can thus be used to come up with an induction scheme. Heuristics have been developed and implemented in RRL which, in conjunction with failure analysis of induction schemes and backtracking in case of failure, have been found appropriate for prioritizing induction schemes, automatically selecting the "most appropriate" induction scheme (thus selecting induction variables), and generating the proofs of many conjectures.

2.1 Definitions and Notation
Let T(F, X) denote a set of terms, where F is a finite set of function symbols and X is a set of variables. A term is either a variable x ∈ X, or a function symbol f ∈ F followed by a finite sequence of terms, called the arguments of f. Let Vars(t) denote the variables appearing in a term t. The subterms of a term are the term itself and the subterms of its arguments. A position is a finite sequence of positive integers separated by "."s, which is used to identify a subterm in a term. The subterm of t at the position denoted by the empty sequence is t itself. If f(t1, · · ·, tm) is the subterm at a position p, then tj is the subterm at the position p.j. Let depth(t) denote the depth of t; depth(t) is 0 if t is a variable or a constant (denoted by a function symbol with arity 0), and depth(f(t1, · · ·, tm)) = maximum(depth(ti)) + 1 for 1 ≤ i ≤ m. A term f(t1, · · ·, tm) is called basic if each ti is a distinct variable. A substitution θ is a mapping from a finite set of variables to terms, denoted as {x1 ← t1, · · ·, xm ← tm}, m ≥ 0, where the xi's are distinct. θ applied to s = f(s1, · · ·, sm) is f(θ(s1), · · ·, θ(sm)). Term s matches t under θ if θ(s) = t. Terms s and t unify under θ if θ(s) = θ(t). A rewrite rule s → t is an ordered pair of terms (s, t) with Vars(t) ⊆ Vars(s). A rule s → t is applicable to a term u iff for some substitution θ and position p in u, θ(s) = u|p. The application of the rule rewrites u to u[p ← θ(t)], the term obtained after replacing the subterm at position p in u by θ(t). A rewrite system R is a finite set of rewrite rules. R induces a relation among terms denoted →R; s →R t denotes the rewriting of s to t by a single application of a rule in R, and →+R and →*R denote the transitive and the reflexive-transitive closures of →R. The set F is partitioned into defined and interpreted function symbols. An interpreted function symbol comes from a decidable theory T.
A defined function symbol is defined by a finite set of terminating rewrite rules, and its definition is assumed to be complete. A term t is interpreted if all the function symbols in it are from T. The underlying decidable theories are quantifier-free theories. An equation s = t is inductively valid (valid, henceforth) iff for each variable in it, whenever any ground term of the appropriate type is substituted into s = t, the instantiated equation is in the equational theory (modulo the decidable theory T) of the definitions of the function symbols in s and t. This is equivalent to s = t holding in the initial model of the equations corresponding to the definitions and the decidable theory T.
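The matching and rewriting notions above can be sketched in a few lines. The tuple encoding of terms and the outermost-leftmost strategy below are our own illustrative choices, not the paper's:

```python
# Terms: a variable is a lowercase string ('x'); a constant is a string like '0';
# a compound term is a tuple (f, t1, ..., tm).
def is_var(t):
    return isinstance(t, str) and t.islower()

def match(pat, term, theta=None):
    """Return a substitution theta with theta(pat) == term, or None."""
    theta = dict(theta or {})
    if isinstance(pat, str):
        if is_var(pat):
            if theta.get(pat, term) != term:
                return None
            theta[pat] = term
            return theta
        return theta if pat == term else None
    if not isinstance(term, tuple) or term[0] != pat[0] or len(term) != len(pat):
        return None
    for p, t in zip(pat[1:], term[1:]):
        theta = match(p, t, theta)
        if theta is None:
            return None
    return theta

def subst(t, theta):
    if isinstance(t, str):
        return theta.get(t, t)
    return (t[0],) + tuple(subst(a, theta) for a in t[1:])

def rewrite(t, rules):
    """One rewrite step at the first applicable position, outermost first; None if normal."""
    for lhs, rhs in rules:
        theta = match(lhs, t)
        if theta is not None:
            return subst(rhs, theta)
    if isinstance(t, tuple):
        for i, a in enumerate(t[1:], 1):
            r = rewrite(a, rules)
            if r is not None:
                return t[:i] + (r,) + t[i + 1:]
    return None

rules = [(('double', '0'), '0'),
         (('double', ('s', 'x')), ('s', ('s', ('double', 'x'))))]
step1 = rewrite(('double', ('s', '0')), rules)
assert step1 == ('s', ('s', ('double', '0')))       # double(s(0)) ->R s(s(double(0)))
assert rewrite(step1, rules) == ('s', ('s', '0'))   # ->R s(s(0))
```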
Given a complete function definition as a finite set of terminating rewrite rules {li → ri | li = f(s1, · · ·, sm), 1 ≤ i ≤ k}, the main steps of the cover set method are:

1. Generating a Cover Set from a Function Definition: A cover set associated with a function f is a finite set of triples. For a rule l → r, where l = f(s1, · · ·, sm) and f(ti1, · · ·, tim) is the i-th recursive call to f in the right side r, the corresponding triple is ⟨⟨s1, · · ·, sm⟩, {· · ·, ⟨ti1, · · ·, tim⟩, · · ·}, {}⟩³. The second component of a triple is the empty set if there is no recursive call to f in r. The cover sets of double, exp2 and log obtained from their definitions in Section 1 are given below.

Cover(double): {<<0>, {}, {}>, <<s(x)>, {<x>}, {}>},
Cover(exp2):   {<<0>, {}, {}>, <<s(x)>, {<x>}, {}>},
Cover(log):    {<<s(0)>, {}, {}>, <<x + x>, {<x>}, {}>, <<s(x + x)>, {<x>}, {}>}.
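As a sketch (using a hypothetical tuple encoding of terms, not the paper's notation), the cover-set triples can be read off a definition mechanically:

```python
# A term is a variable/constant (string) or a tuple (f, t1, ..., tm).
def subterms(t):
    yield t
    if isinstance(t, tuple):
        for a in t[1:]:
            yield from subterms(a)

def cover_set(f, rules):
    """rules: list of (lhs, rhs) with lhs = (f, s1, ..., sm). Returns one triple
    per rule: (lhs arguments, arguments of each recursive call to f in rhs,
    empty third component for an unconditional rule)."""
    triples = []
    for lhs, rhs in rules:
        rec = [list(u[1:]) for u in subterms(rhs)
               if isinstance(u, tuple) and u[0] == f]
        triples.append((list(lhs[1:]), rec, []))
    return triples

double_rules = [(('double', '0'), '0'),
                (('double', ('s', 'x')), ('s', ('s', ('double', 'x'))))]
# Mirrors Cover(double) above: one triple per rule, second component
# empty for the non-recursive rule.
assert cover_set('double', double_rules) == \
    [(['0'], [], []), ([('s', 'x')], [['x']], [])]
```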
2. Generating Induction Schemes using Cover Sets: Given a conjecture C, a basic term t = f(x1, · · ·, xm) appearing in C can be chosen for generating an induction scheme from the cover set of f. The variables in argument positions in t over which the definition of f recurses are called induction variables, and the corresponding positions in t are called the inductive (or changeable) positions; other positions are called the unchangeable positions [2]. An induction scheme is a finite set of induction cases, each of the form ⟨σc, {θi}⟩, generated from a cover set triple ⟨⟨s1, · · ·, sm⟩, {· · ·, ⟨ti1, · · ·, tim⟩, · · ·}, {}⟩ as follows⁴: σc = {x1 ← s1, · · ·, xm ← sm}, and θi = {x1 ← ti1, · · ·, xm ← tim}⁵. The induction scheme generated from the cover sets of double and exp2 is the principle of mathematical induction. The scheme generated from the cover set of log is different, since the function log is defined over positive numbers only. There is one basis step, ⟨{x ← s(0)}, {}⟩, and two induction steps, ⟨{x ← m + m}, {{x ← m}}⟩ and ⟨{x ← s(m + m)}, {{x ← m}}⟩. The variable m is a positive number.
³ The third component in a triple is a condition under which the conditional rewrite rule is applicable; for simplicity, we are considering only unconditional rewrite rules, so the third component is empty, meaning that the rule is applicable whenever its left side matches. The proposed approach extends to conditional rewrite rules as well. See [16,15,13].
⁴ The variables in a cover set triple are suitably renamed if necessary.
⁵ To generate an induction scheme, it suffices to unify the subterm t in a conjecture C with the left side of each rule in the definition of f as well as with the recursive calls to f in the right side of the rule. This is always possible in case t is a basic term; but it even works if only the variables in the induction positions of t are distinct.
3. Generating Induction Subgoals using an Induction Scheme: Each induction case generates an induction subgoal: σc is applied to the conjecture to generate the induction conclusion, whereas each substitution θi applied to the conjecture generates an induction hypothesis. Basis subgoals come from induction cases whose second component is empty. The reader can consult examples (C1) and (C2) discussed above, and see how induction subgoals are generated using the induction schemes generated from the cover sets of double and exp2.
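Instantiating a conjecture with an induction case is just a pair of substitution applications. A minimal sketch for (C1), again under our own tuple encoding:

```python
def subst(t, theta):
    if isinstance(t, str):
        return theta.get(t, t)
    return (t[0],) + tuple(subst(a, theta) for a in t[1:])

# (C1): double(m) = m + m, as a pair (lhs, rhs)
conjecture = (('double', 'm'), ('+', 'm', 'm'))

# induction cases <sigma_c, {theta_i}> derived from Cover(double)
scheme = [({'m': '0'}, []),                    # basis case
          ({'m': ('s', 'x')}, [{'m': 'x'}])]   # step case

subgoals = []
for sigma_c, thetas in scheme:
    conclusion = tuple(subst(side, sigma_c) for side in conjecture)
    hypotheses = [tuple(subst(side, th) for side in conjecture) for th in thetas]
    subgoals.append((conclusion, hypotheses))

# basis subgoal: double(0) = 0 + 0, with no hypotheses
assert subgoals[0] == ((('double', '0'), ('+', '0', '0')), [])
# step subgoal: double(s(x)) = s(x) + s(x), with hypothesis double(x) = x + x
assert subgoals[1][1] == [(('double', 'x'), ('+', 'x', 'x'))]
```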
3 T-Based Definitions and Simple Conjectures
Definition 1. A definition of a function symbol f is T-based in a decidable theory T iff for each rule f(t1, · · ·, tm) → r in the definition, each ti, 1 ≤ i ≤ m, is an interpreted term in T, any recursive calls to f in r only have interpreted terms as arguments, and the abstraction of r, defined by replacing the recursive calls to f in r by variables, is an interpreted term in T⁶. For example, the definitions of double, log and exp2 given in Section 1 are T-based over Presburger arithmetic. So is the definition of * given by the rules:

1. x * 0    --> 0,
2. x * s(y) --> x + (x * y).
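The T-basedness conditions are purely syntactic, so they can be checked mechanically. A sketch under our tuple encoding, with the Presburger signature assumed to be {0, s, +}:

```python
INTERPRETED = {'0', 's', '+'}   # assumed signature of Presburger arithmetic

def is_var(t):
    return isinstance(t, str) and t.islower()

def symbols(t):
    """Function symbols (not variables) occurring in a term."""
    if is_var(t):
        return set()
    if isinstance(t, str):
        return {t}
    out = {t[0]}
    for a in t[1:]:
        out |= symbols(a)
    return out

def rec_calls(t, f):
    """Argument tuples of all calls to f inside t."""
    if isinstance(t, str):
        return []
    calls = [t[1:]] if t[0] == f else []
    for a in t[1:]:
        calls += rec_calls(a, f)
    return calls

def abstract(t, f):
    """Replace calls to f by a variable: the 'abstraction' of a right side."""
    if isinstance(t, str):
        return t
    if t[0] == f:
        return 'z'
    return (t[0],) + tuple(abstract(a, f) for a in t[1:])

def t_based(f, rules):
    for lhs, rhs in rules:
        if any(symbols(a) - INTERPRETED for a in lhs[1:]):
            return False   # left-side arguments must be interpreted
        if any(symbols(a) - INTERPRETED for c in rec_calls(rhs, f) for a in c):
            return False   # recursive calls take interpreted (hence nonnested) arguments
        if symbols(abstract(rhs, f)) - INTERPRETED:
            return False   # abstraction of the right side must be interpreted
    return True

times = [(('*', 'x', '0'), '0'),
         (('*', 'x', ('s', 'y')), ('+', 'x', ('*', 'x', 'y')))]
assert t_based('*', times)   # the definition of * above is T-based
```

Note that requiring interpreted arguments in recursive calls also rules out nesting, since a nested call to f would contribute f itself to the argument's symbols.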
We abuse notation slightly and call the functions themselves T-based whenever their definitions are T-based. In order to use T-based definitions for generating induction schemes, they should be complete as well as terminating over T. For a brief discussion of how to perform such checks, see [9,6]. It should be easy to see that the terms in the cover set generated from a T-based function definition are interpreted in T.

3.1 Simple Conjectures
Definition 2. A term is T-based if it contains only variables, interpreted function symbols from T, and function symbols with T-based definitions.

Definition 3. A conjecture f(x1, · · ·, xm) = r, where f has a T-based definition, the xi's are distinct variables, and r is interpreted in T, is called simple.

Note that both sides of a simple conjecture are T-based. For example, the conjecture (C1): double(m) = m + m about double is simple over Presburger arithmetic, whereas the conjecture (C2): log(exp2(m)) = m about log is not simple over Presburger arithmetic. For a simple conjecture, the cover set method proposes only one induction scheme, which is generated from the cover set derived from the definition of f.

⁶ If r includes occurrences of cond, a special built-in operator in RRL for doing simulated conditional rewriting and automatic case analysis, then the first argument to cond is assumed to be an interpreted boolean term in T.
Theorem 4. A simple conjecture C over a decidable theory T can be decided using the cover set method.

Proof. Given f(x1, · · ·, xm) = r, where r is interpreted in T, an induction scheme can be generated from the cover set associated with the definition of f, and a proof can be attempted. Since σc(f(x1, · · ·, xm)) = li for some rule li → ri in the definition of f, the left side of a basis subgoal σc(f(x1, · · ·, xm)) = σc(r) rewrites using the rule to ri, an interpreted term in T. The result is a decidable formula in T. This part of the proof exploits the fact that the right side of a simple conjecture, r, is an interpreted term in T. For each induction step subgoal derived from a rule lj → rj in the definition of f, where rj = h(· · ·, f(· · ·), · · ·) has recursive calls to f, the conclusion is σc(f(x1, · · ·, xm)) = σc(r), with σc(f(x1, · · ·, xm)) = lj, and with θi(f(x1, · · ·, xm)) = θi(r) being an induction hypothesis corresponding to each recursive call to f in rj. The left side of the conclusion simplifies by the corresponding rule to rj, which includes an occurrence of θi(f(x1, · · ·, xm)) as a subterm at a position pi in rj. The application of these hypotheses generates the formula rj[p1 ← θ1(r), · · ·, pk ← θk(r)] = σc(r) of T, since the abstraction of rj, after the recursive calls to f have been replaced by variables, is an interpreted term in T. Since every basis and induction step subgoal generated by the cover set method can be decided in T, the conjecture C can be decided by the cover set method. □

As the above proof suggests, a slightly more general class of simple conjectures can be decided. Not all the arguments to f need be distinct variables; it suffices if the inductive positions in f are distinct variables, and the other positions are interpreted and do not contain variables appearing in the inductive positions. The above proof would still work. For example, the following conjecture,

(C3): append(n, nil) = n,
is not simple. The validity of the conjecture (C3) can be decided over the theory of free constructors nil and cons for lists. The definition of append is:

1. append(nil,x)       --> x,
2. append(cons(x,y),z) --> cons(x,append(y,z)).
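A quick sketch (our encoding: nil as None, cons as a pair) confirms (C3) on instances and shows that append(m, m) = m is false for non-empty lists:

```python
NIL = None

def cons(x, y):
    return (x, y)

def append(l, z):
    if l is NIL:                  # rule 1: append(nil, x) --> x
        return z
    x, y = l
    return cons(x, append(y, z))  # rule 2: append(cons(x,y), z) --> cons(x, append(y,z))

def from_list(xs):
    """Build a cons-list from a Python list."""
    l = NIL
    for x in reversed(xs):
        l = cons(x, l)
    return l

# (C3): append(n, nil) = n on a few instances
for xs in ([], [1], [1, 2, 3]):
    n = from_list(xs)
    assert append(n, NIL) == n

# append(m, m) = m fails for any non-empty m
m = from_list([1])
assert append(m, m) != m
```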
The requirement that unchangeable positions in a conjecture do not refer to the induction variables seems essential for the above proof to work, as otherwise the application of the induction hypotheses may get blocked. For example, the cover set method fails to disprove a conjecture such as

append(m, m) = m
from the definition of the function append. An inductive proof attempt based on the cover set of the function append results in an induction step subgoal with the conclusion

append(cons(x, y), cons(x, y)) = cons(x, y),
and the hypothesis append(y, y) = y. The conclusion rewrites to cons(x, append(y, cons(x, y))) = cons(x, y) to which the hypothesis cannot be applied. Therefore, the cover set method fails since the induction step subgoal cannot be established.
4 Complex T-Based Conjectures
To decide more complex conjectures by inductive methods, the choice of induction schemes has to be limited, and the interaction among the function definitions has to be analyzed. In [15], such an analysis is undertaken to predict the failure of proof attempts a priori, without actually attempting the proof. The notion of compatibility of function definitions, an idea illustrated in Section 1, is introduced for characterizing this interaction and for identifying intermediate steps in a proof which get blocked in the absence of additional lemmas. In this section, we use related concepts to identify conditions under which conjectures such as (C2), more complex than the generalized simple conjectures discussed in Section 3, can be decided. We first consider the interaction between two function symbols. This is subsequently extended to the interaction among a sequence of function symbols.

Definition 5. A T-based term t is composed if

1. t is a basic term f(x1, · · ·, xm), where f is T-based and the xi's are distinct variables, or
2. (a) t = f(s1, · · ·, t', · · ·, sm), where t' is composed and is in an inductive position of a T-based function f, and each si is an interpreted term, and
   (b) the variables xi appearing in the inductive positions of the basic subterm (in the innermost position) of t do not appear elsewhere in t. Other variables, in unchangeable positions of the basic subterm, can appear elsewhere in t.

For example, the left side of the conjecture (C2), log(exp2(m)), is a composed term of depth 2. The first requirement in the above definition can be relaxed as in the case of simple conjectures: only the variables in the inductive positions of a basic subterm in t have to be distinct; terms interpreted in T can appear in the unchangeable positions of the basic subterm. Given a conjecture of the form l = r, where l is composed and r is interpreted, it is easy to see that there is only one basic subterm in it whose outermost symbol
Extending Decision Procedures with Induction Schemes
is T -based. The cover set method thus suggests only one induction scheme. We will first consider conjectures such as (C2) in which the left side l is of depth 2; conjectures in which l is of greater depth are considered later. For a conjecture f(t1, · · · , g(x1, · · · , xk), · · · , tm) = r, the interaction between the right sides of the rules defining g and the left sides of the rules defining f must be considered, as seen in the proof of the conjecture (C2). This interaction is formalized below as the property of compatibility.

Definition 6. A definition of f is compatible with a definition of g in its ith argument in T iff for each right side rg of a rule defining g, the following conditions hold:
1. whenever rg is interpreted, f(x1, · · · , rg, · · · , xm) rewrites to an interpreted term in T , and
2. whenever rg = h(s1, · · · , g(t1, · · · , tk), · · · , sn), having a single recursive call to g, the definition of f rewrites f(x1, · · · , h(s1, · · · , y, · · · , sn), · · · , xm) to h′(u1, · · · , f(x1, · · · , y, · · · , xm), · · · , un), where the xi's are distinct variables, h and h′ are interpreted symbols in T , and the si's and uj's are interpreted terms of T .⁷

In case rg has many recursive calls to g, say h(s1, · · · , g(t1, · · · , tk), · · · , g(v1, · · · , vk), · · · , sn), the definition of f rewrites f(x1, · · · , h(s1, · · · , y, · · · , z, · · · , sn), · · · , xm) to h′(u1, · · · , f(x1, · · · , y, · · · , xm), · · · , f(x1, · · · , z, · · · , xm), · · · , un). The definition of a function f is compatible with a definition of g iff it is compatible with g in every argument position.
As will be shown later, the above requirements on compatibility allow the function symbol f to be distributed over the interpreted terms so as to have g as an argument, so that the induction hypothesis(es) can be applied.⁸ The above definition is also applicable for capturing the interaction between an interpreted function symbol and a T -based function symbol. For example, the interpreted symbol + in Presburger arithmetic is compatible with * (in both arguments) because of the associativity and commutativity properties of +, which are valid formulas in T . As stated and illustrated in the introduction, the compatibility property can be viewed as requiring that the composition of f with g has a T -based definition. Space limitations do not allow us to elaborate on this interpretation of the compatibility property. For ensuring the compatibility property, any lemmas already proved about f can be used along with the definition of f. The requirements for showing compatibility can be used to speculate bridge lemmas as well.

The conjecture (C2) is of depth 2. The above insight can be generalized to complex conjectures in which the left side is of arbitrary depth. A conjecture in which a composed term of depth d is equated to an interpreted term, can
⁷ The requirement on the definition of f can be relaxed by including bridge lemmas along with the defining rules of f.
⁸ In [5], we have given a more abstract treatment of these conditions. The above requirement is one way of ensuring the conditions in [5].
Deepak Kapur and Mahadevan Subramaniam
be decided if all the function symbols from the root to the position p of the basic subterm in its left side can be pushed in so that the induction hypothesis is applicable. The notion of compatibility of a function definition with another function definition is extended to a compatible sequence of definitions of function symbols. In a compatible sequence of function symbols ⟨f1, · · · , fd⟩, each fi is compatible with fi+1 in its ji-th argument, 1 ≤ i ≤ d − 1. For example, consider the following conjecture (C4): bton(pad0(ntob(m))) = m.
Functions bton and ntob convert binary representations to numbers and numbers to binary representations, respectively. The function pad0 adds a leading binary zero to a bit vector. These functions are used to reason about number-theoretic properties of parameterized arithmetic circuits [12,10]. Padding the output bit vectors of one stage with leading zeros before using them as input to the next stage is common in multiplier circuits realized using a tree of carry-save adders. An important property used while establishing the correctness of such circuits is that the padding does not affect the number output by the circuit. The underlying decidable theory is the combination of the quantifier-free theory of bit vectors, with free constructors nil, cons and b0, b1 standing for binary 0 and 1, and Presburger arithmetic. In the definitions below, bits increase in significance in the list, with the first element of the list being the least significant. The definitions of bton, ntob, and pad0 are T -based.
1. bton(nil) --> 0,
2. bton(cons(b0, y1)) --> bton(y1) + bton(y1),
3. bton(cons(b1, y1)) --> s(bton(y1) + bton(y1)),
4. ntob(0) --> cons(b0, nil),
5. ntob(s(0)) --> cons(b1, nil),
6. ntob(s(s(x2+x2))) --> cons(b0, ntob(s(x2))),
7. ntob(s(s(s(x2+x2)))) --> cons(b1, ntob(s(x2))),
8. pad0(nil) --> cons(b0, nil),
9. pad0(cons(b0, y)) --> cons(b0, pad0(y)),
10. pad0(cons(b1, y)) --> cons(b1, pad0(y)).
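The rules above can be mirrored as ordinary recursive functions and (C4) spot-checked on concrete values. The sketch below is plain Python rather than the paper's RRL rules, with bit vectors represented as least-significant-bit-first lists of 0s and 1s:

```python
# Executable mirror of rules 1-10: bton, ntob, and pad0.
# Bit vectors are lists of 0/1, least significant bit first,
# matching cons(b0, ...) / cons(b1, ...).

def bton(v):
    """Rules 1-3: bton(nil) = 0, bton(cons(b, y)) = bton(y) + bton(y) (+1 if b = b1)."""
    if not v:
        return 0
    n = bton(v[1:])
    return n + n + (1 if v[0] == 1 else 0)

def ntob(n):
    """Rules 4-7: ntob(0) = [0], ntob(1) = [1]; ntob(2k) and ntob(2k+1)
    recurse on ntob(k) for k >= 1."""
    if n <= 1:
        return [n]
    return [n % 2] + ntob(n // 2)

def pad0(v):
    """Rules 8-10: append a leading (most significant) binary zero."""
    if not v:
        return [0]
    return [v[0]] + pad0(v[1:])

# (C4), padding invariance, and the induction-step identity, on a range.
for m in range(200):
    assert bton(pad0(ntob(m))) == m                      # (C4)
    assert bton(pad0(ntob(m))) == bton(ntob(m))          # padding preserves the number
    assert bton(pad0(ntob(2*m + 2))) == 2 * bton(pad0(ntob(m + 1)))
```

The third assertion is the numeric form of the induction-step identity used in the proof of (C4): bton(pad0(ntob(s(s(x2 + x2))))) equals bton(pad0(ntob(s(x2)))) + bton(pad0(ntob(s(x2)))).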
The function pad0 is compatible with ntob; bton is compatible with pad0 as well as with ntob. However, ntob is not compatible with bton, since ntob(s(bton(y1) + bton(y1))) cannot be rewritten using the definition of ntob. Nevertheless, bridge lemmas,

ntob(bton(y1) + bton(y1)) = cons(b0, ntob(bton(y1)))
ntob(s(bton(y1) + bton(y1))) = cons(b1, ntob(bton(y1)))
can be identified such that, along with these lemmas, ntob is compatible with bton. A proof attempt of (C4) leads to two basis and two step subgoals based on the cover set of ntob. The first basis subgoal, where m <- 0, is
bton(pad0(ntob(0))) = 0.
The subterm ntob(0) rewrites using the definition of ntob to cons(b0, nil); then pad0(cons(b0, nil)) rewrites to cons(b0, cons(b0, nil)); and finally, bton(pad0(ntob(0))) rewrites to 0 + 0 + 0 + 0, simplifying the above equation to a valid formula in Presburger arithmetic. The second basis subgoal is similar. Consider the first induction step subgoal. The conclusion is

bton(pad0(ntob(s(s(x2 + x2))))) = s(s(x2 + x2))
with the hypothesis being bton(pad0(ntob(s(x2)))) = s(x2).
The subterm ntob(s(s(x2 + x2))) in the left side of the conclusion rewrites to cons(b0, ntob(s(x2))) by the definition of ntob; the subterm pad0(cons(b0, ntob(s(x2)))) then rewrites to cons(b0, pad0(ntob(s(x2)))). The term bton(pad0(ntob(s(s(x2 + x2))))) thus rewrites to bton(pad0(ntob(s(x2)))) + bton(pad0(ntob(s(x2)))), on which the hypothesis is applicable. The result is a valid formula s(x2) + s(x2) = s(s(x2 + x2)) in Presburger arithmetic. It can be shown that the second step subgoal also simplifies to a valid formula in Presburger arithmetic. Every induction subgoal can be decided, and hence (C4) can be decided. The reader will have noticed that the compatibility requirement ensures that all the function symbols are pushed over interpreted symbols for the induction hypothesis to be applicable.

Note: For understanding the proof below, it is helpful to concurrently consult the proofs of the examples (C2) in Section 1 and (C4) above.

Theorem 7. The validity of a conjecture l = r, where l is a composed term and r is interpreted in T , can be decided by the cover set method if the sequence of function symbols ⟨fd, fd−1, · · · , f2, f1⟩ from the outermost function symbol fd of l to the basic subterm f1(x1, · · · , xm) is compatible.

Proof. By induction on the depth d of l.

Basis case (d = 2): Consider a conjecture l = r where l = f2(t1, · · · , f1(x1, · · · , xm), · · · , tk) and ⟨f2, f1⟩ form a compatible sequence (i.e., f2 is compatible with f1 in the corresponding argument position), and each tj, 1 ≤ j ≤ k, and r are interpreted terms in T . Recall that any induction variable xi appearing in an inductive position of f1 does not occur in any tj. The cover set method uses the induction scheme generated from the cover set associated with the definition of f1. Consider a basis subgoal σc(l) = σc(r) generated from a rule l1 → r1 in the definition of f1, where r1 does not have any recursive calls to f1.
Since σc(f1(x1, · · · , xm)) = l1, σc(l) rewrites to f2(σc(t1), · · · , r1, · · · , σc(tk)). Since no ti includes any induction variable, σc(ti) = ti, implying f2(σc(t1), · · · , r1, · · · , σc(tk)) = f2(t1, · · · , r1, · · · , tk). Because of the compatibility of f2 with f1,
f2(t1, · · · , r1, · · · , tk) rewrites to an interpreted term. The basis subgoal therefore simplifies to a formula in T . Consider an induction step subgoal generated from a rule l2 → r2 where r2 = h1(s1, · · · , f1(v1, · · · , vm), · · · , sn), with a single recursive call to f1 (for simplicity). Let the conclusion be σc(f2(t1, · · · , f1(x1, · · · , xm), · · · , tk)) = σc(r) and the induction hypothesis be θi(f2(t1, · · · , f1(x1, · · · , xm), · · · , tk)) = θi(r), where σc(f1(x1, · · · , xm)) = l2 and θi(f1(x1, · · · , xm)) = f1(v1, · · · , vm). The left side of the conclusion rewrites to f2(t1, · · · , r2, · · · , tk) (just as in the basis case). By the definition of compatibility of f2 with f1, f2(y1, · · · , h1(s1, · · · , y, · · · , sn), · · · , yk) rewrites to h2(s1′, · · · , f2(y1, · · · , y, · · · , yk), · · · , sn′), where the yi's and y are distinct variables, and h2 and the sj's and sj′'s are in T . This means that the left side of the simplified conclusion f2(t1, · · · , r2, · · · , tk) rewrites by the same sequence of rules to h2(δ(s1′), · · · , δ(f2(y1, · · · , y, · · · , yk)), · · · , δ(sn′)), where δ(yi) = ti, 1 ≤ i ≤ k, and δ(y) = f1(v1, · · · , vm). The hypothesis applies since θi(tj) = tj for all tj (recall that there are no xi's in the tj's) and θi(f1(x1, · · · , xm)) = f1(v1, · · · , vm), which simplifies the conclusion to h2(δ(s1′), · · · , θi(r), · · · , δ(sn′)) = σc(r), a formula in T .

The above proof step assumed a single recursive call in r2 and the application of a single induction hypothesis. The proof generalizes when there are multiple recursive calls in r2 and many possibly different hypotheses have to be applied.

Induction step case: Assume that the statement of the theorem holds for all conjectures l′ = r′, where l′ is a composed term of depth d′ < d, and r′ is interpreted.
The main idea in this proof is to use the fact that a conjecture l = r, in which the composed term l = fd(t1, · · · , ld−1, · · · , tk) is of depth d, uses the same induction scheme as a related conjecture ld−1 = c, where ld−1 is a composed term of depth d − 1 and c is an interpreted term. By the induction hypothesis, ld−1 = c can be decided, since all subgoals, including basis and induction steps, can be decided. Because of the compatibility of fd with fd−1, the outermost symbol of ld−1, it can be shown that each subgoal of l = r using the same induction scheme can also be decided. In the basis step, the instantiated conjecture rewrites to a formula in T , and in the induction step, fd can be pushed over the interpreted symbols to surround fd−1 so that the hypothesis is again applicable, resulting in a formula in T . More details follow.

The same basic term f1(x1, · · · , xm) in ld−1 used for generating an induction scheme for ld−1 = c is also used for generating an induction scheme for l = r. By the induction hypothesis, each of the subgoals generated from ld−1 = c using this induction scheme can be decided in T . We show that for l = r as well, each subgoal σc(fd(t1, · · · , ld−1, · · · , tk)) = σc(r) using the same substitution can be decided. Consider a basis subgoal σc(ld−1) = σc(c), where σc(ld−1) simplifies to an interpreted term u through a sequence of rewrite steps using the definitions of the T -based function symbols in ld−1. (The T -based function symbols in ld−1 are successively eliminated in a bottom-up fashion, starting with f1, until finally fd−1 rewrites to u by a rule of the form fd−1(· · · , rg, · · ·) → u in the definition of fd−1.) By the compatibility of fd with fd−1, fd(t1, · · · , u, · · · , tk) rewrites to an
interpreted term, say u′, implying that the basis subgoal σc(l) = σc(r) simplifies to u′ = σc(r), a formula in T . Consider an induction step subgoal with the conclusion σc(ld−1) = σc(c) and the hypothesis θi(ld−1) = θi(c), generated from a rule in the definition of f1 whose right side has a single recursive call to f1 (for simplicity). The left side of the conclusion simplifies, through a sequence of rewrite steps using the definitions of the T -based function symbols in ld−1, to a term of the form hd−1(s1′, · · · , θi(ld−1), · · · , sn′), where the sj′'s and hd−1 are interpreted. The T -based functions in ld−1 are successively pushed over interpreted function symbols in a bottom-up fashion until finally fd−1 is pushed, using a rule of the form fd−1(y1, · · · , hd−2(s1, · · · , y, · · · , sn), · · · , yk) → hd−1(s1′, · · · , fd−1(y1, · · · , y, · · · , yk), · · · , sn′), to get the left side of the hypothesis. By the compatibility of fd with fd−1, fd(z1, · · · , hd−1(s1′, · · · , z, · · · , sn′), · · · , zk) rewrites to hd(s1′′, · · · , fd(z1, · · · , z, · · · , zk), · · · , sn′′), where hd and the sj′′'s are interpreted. This implies that in the corresponding induction step subgoal, the left side of the conclusion σc(fd(t1, · · · , ld−1, · · · , tk)) will simplify to hd(s1′′, · · · , θi(fd(t1, · · · , ld−1, · · · , tk)), · · · , sn′′), a term containing the left side of the hypothesis. The application of the hypothesis simplifies the conclusion to hd(s1′′, · · · , θi(r), · · · , sn′′) = σc(r), a formula in T . □
5 Relaxing Linearity Requirement: Nonlinear Conjectures
To cover a larger class of formulas, we discuss conditions for deciding a conjecture with multiple occurrences of induction variables in its left side.

Definition 8. A conjecture f(s1, · · · , sm) = r, where f(s1, · · · , sm) is a T -based term, r is interpreted in T , and for 1 ≤ i ≤ m, either si is interpreted in T or si = gi(x1, · · · , xn) is a basic term, is called basic nonlinear if some variable has multiple occurrences in the left side f(s1, · · · , sm).

In a basic nonlinear conjecture, induction variables (as well as noninduction variables) appearing as arguments in basic terms can be shared. For example, the conjecture below is basic nonlinear:
(C5): append(blast(m), last(m)) = m,
where last returns the singleton list containing the last element of a list, and blast returns the input list without its last element.
1. last(cons(x, nil)) --> cons(x, nil),
2. last(cons(x, cons(y, z))) --> last(cons(y, z)),
3. blast(cons(x, nil)) --> nil,
4. blast(cons(x, cons(y, z))) --> cons(x, blast(cons(y, z))).
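The rules for last and blast can likewise be mirrored in plain Python to spot-check (C5) on concrete nonempty lists (the inputs covered by the cover sets above); the standard recursive append is assumed here, since its rules are not repeated in this section:

```python
# Executable mirror of rules 1-4 for last and blast, plus a standard append.

def append(xs, ys):
    # assumed: append(nil, y) = y, append(cons(h, t), y) = cons(h, append(t, y))
    if not xs:
        return ys
    return [xs[0]] + append(xs[1:], ys)

def last(xs):
    """Rules 1-2: the singleton list holding the last element (nonempty input)."""
    if len(xs) == 1:
        return [xs[0]]
    return last(xs[1:])

def blast(xs):
    """Rules 3-4: the input list without its last element (nonempty input)."""
    if len(xs) == 1:
        return []
    return [xs[0]] + blast(xs[1:])

# (C5): append(blast(m), last(m)) = m for every nonempty list m.
for m in ([1], [1, 2], [3, 1, 4, 1, 5], list(range(10))):
    assert append(blast(m), last(m)) == m
```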
To decide such a conjecture, additional conditions become necessary. First, since many induction schemes are possible, one generated from the cover set of each basic term, it is required that they can be merged into a
single induction scheme [2,9] (the case when each cover set generates the same induction scheme trivially satisfies this requirement). The second requirement is similar to that of compatibility: f above must be simultaneously compatible with each gi. In the definition below, we assume, for simplicity, that there is at most one recursive call in the function definitions.

Definition 9. The definition of f is simultaneously compatible in T with the definitions of g and h in its ith and jth arguments, where i ≠ j, iff for each right side rg and rh of the rules in the definitions of g and h, respectively:
1. whenever rg and rh are interpreted in T , f(x1, · · · , rg, · · · , rh, · · · , xm) rewrites to an interpreted term in T , and
2. whenever rg = h1(· · · , g(· · ·), · · ·) and rh = h2(· · · , h(· · ·), · · ·), the definition of f rewrites f(x1, · · · , h1(· · · , x, · · ·), · · · , h2(· · · , y, · · ·), · · · , xm) to h3(· · · , f(x1, · · · , x, · · · , y, · · · , xm), · · ·).

For example, append is simultaneously compatible with blast in its first argument and last in its second argument.

Theorem 10. A basic nonlinear conjecture f(· · · , g(x1, · · · , xn), · · · , h(x1, · · · , xn), · · ·) = r, such that x1, · · · , xn do not appear elsewhere in the left side of the conjecture and the remaining arguments of f are interpreted terms, can be decided by the cover set method if f is simultaneously compatible with g and h in its ith and jth arguments, respectively, and the induction schemes suggested by g(x1, · · · , xn) and h(x1, · · · , xn) can be merged.

The proof is omitted due to lack of space; it is similar to the proof of the basis case of Theorem 7. The main steps are illustrated using a proof of (C5). The induction schemes suggested by the basic terms blast(m) and last(m) in (C5) are identical. There is one basis subgoal and one induction step subgoal. The basis subgoal, obtained by m <- cons(x, nil),

append(blast(cons(x, nil)), last(cons(x, nil))) = cons(x, nil),
simplifies to a valid formula by the definitions of blast and last, and then by the definition of append. In the step subgoal, the induction conclusion is append(blast(cons(x,cons(y,z))),last(cons(x,cons(y,z)))) = cons(x,cons(y,z)),
with the hypothesis being append(blast(cons(y, z)), last(cons(y, z))) = cons(y, z).
The left side of the conclusion rewrites by the definitions of last and blast to append(cons(x, blast(cons(y, z))), last(cons(y, z))), which rewrites using the definition of append to cons(x, append(blast(cons(y, z)), last(cons(y, z)))), to which the hypothesis applies, leading to the valid formula cons(x, cons(y, z)) = cons(x, cons(y, z)).
The notion of simultaneous compatibility and the above theorem generalize to complex nonlinear conjectures, similar to the complex conjecture (C4) discussed in Section 4, in which a conjecture includes a sequence of simultaneously compatible function symbols. Because of space limitations, we cannot discuss this in detail here; the example below illustrates the idea to some extent. The underlying theory is that of free constructors with 0 and s. The function symbol + is assumed to have the usual recursive definition: 0 + y --> y, s(x) + y --> s(x + y). The equation (C6):
mod2(x) + (half(x) + half(x)) = x,
is a complex nonlinear conjecture with the following definitions of half and mod2.
1. half(0) --> 0,
2. half(s(0)) --> 0,
3. half(s(s(x))) --> s(half(x)),
4. mod2(0) --> 0,
5. mod2(s(0)) --> s(0),
6. mod2(s(s(x))) --> mod2(x).
For + to be compatible with half in both its arguments, an intermediate lemma (either the commutativity of + or x + s(y) = s(x + y)) is needed as well.⁹ It can be determined a priori that (C6) can be decided by the cover set method, since the basic terms half(x) and mod2(x) suggest the same induction scheme, and the function symbol + is simultaneously compatible with mod2 as well as half in the presence of the above lemma about +.
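A quick numeric spot-check of (C6), with half and mod2 mirrored as plain Python functions over the naturals (a sketch, not the paper's rewrite engine):

```python
# Executable mirror of the rules for half and mod2.

def half(n):
    # Rules 1-3: half(0) = 0, half(s(0)) = 0, half(s(s(x))) = s(half(x))
    if n <= 1:
        return 0
    return 1 + half(n - 2)

def mod2(n):
    # Rules 4-6: mod2(0) = 0, mod2(s(0)) = s(0), mod2(s(s(x))) = mod2(x)
    if n <= 1:
        return n
    return mod2(n - 2)

# (C6): mod2(x) + (half(x) + half(x)) = x
for x in range(500):
    assert mod2(x) + (half(x) + half(x)) == x
```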
6 Bootstrapping
As discussed above, simple and complex conjectures with T -based function symbols can be decided using the cover set method, giving an extended decision procedure and an extended decidable theory. In this section, we outline preliminary ideas for bootstrapping this extended decidable theory with the definitions of T -based function symbols and the associated induction schemes, to define and decide a larger class of conjectures.

Definition 11. A definition of a function symbol f is extended T -based for a decidable theory T if for each rule f(t1, · · · , tm) → r in the definition, where the ti's are interpreted over T , the only recursive call to f in r, if any, has only T -based terms as arguments, and the abstraction of r, after replacing the recursive call to f by a variable, is either an interpreted term over T or a basic term g(· · ·), where g has an (extended) T -based definition.
⁹ If + is defined by recursing on the second argument, even then the commutativity of + or s(x) + y = s(x + y) is needed.
For example, exp, denoting exponentiation, defined below, is extended T -based over Presburger arithmetic. For the rules defining *, please refer to the beginning of Section 3.
1. exp(x, 0) --> s(0),
2. exp(x, s(y)) --> x * exp(x, y).
Unlike simple conjectures, an inductive proof attempt of an extended T -based conjecture may involve multiple applications of the cover set method: induction may be required to decide the validity of the induction subgoals. In order to determine this a priori, the number of recursive calls in any rule of an extended T -based definition is restricted to at most one. The abstracted right side r can be an interpreted term in T , or a basic term with an extended T -based function.

Theorem 12. A simple extended T -based conjecture f(x1, · · · , xm) = r, where f is an extended T -based function and r is interpreted over T , can be decided by the cover set method.

The key ideas are suggested in the disproof of an illustrative conjecture about exp:

(C7): exp(s(0), m) = s(m).
In the proof attempt of (C7), with induction variable m, there is one basis and one step subgoal. The basis subgoal, exp(s(0), 0) = s(0), rewrites by the definition of exp to the valid formula s(0) = s(0). In the step subgoal, the conclusion

exp(s(0), s(y)) = s(s(y)),
rewrites by the definition of exp to s(0) * exp(s(0), y) = s(s(y)), to which the hypothesis, exp(s(0), y) = s(y), applies to give s(0) * s(y) = s(s(y)), which then rewrites by the definition of * to s(0) * y = s(y), a simple conjecture which can be decided to be false by the cover set method. Complex extended T -based conjectures can be similarly defined, and conditions for deciding their validity can be developed. This is currently being explored.
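The disproof can be replayed numerically. Mirroring exp as a plain Python function over the naturals (with the numeral 1 standing for s(0)), a search finds the expected counterexamples to (C7):

```python
# Executable mirror of rules 1-2 for exp.

def exp(x, y):
    # Rule 1: exp(x, 0) = s(0); rule 2: exp(x, s(y)) = x * exp(x, y)
    if y == 0:
        return 1
    return x * exp(x, y - 1)

# exp(s(0), m) is 1 for every m, so (C7) exp(s(0), m) = s(m)
# fails at every m >= 1; only m = 0 satisfies it.
counterexamples = [m for m in range(10) if exp(1, m) != m + 1]
assert counterexamples == list(range(1, 10))
```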
7 Conclusion
This paper describes how inductive proof techniques implemented in existing theorem provers, such as RRL, can be used to decide a subclass of equational conjectures. Sufficient conditions for such automation are identified, based on the structure of the conjectures, the definitions of the function symbols appearing in the conjectures, and the interaction among the function definitions.
The basic idea is that if the conditions are met, the induction subgoals automatically generated from a conjecture by the cover set method simplify to formulas in a decidable theory. This is first shown for simple conjectures with a single function symbol recursively defined using interpreted terms in a decidable theory. Subsequently, this is extended to complex conjectures with nested function symbols by defining the notion of compatibility among their definitions. The compatibility property ensures that in induction subgoals, function symbols can be pushed inside the instantiated conjectures using definitions and bridge lemmas, so as to enable the application of the induction hypotheses, leading to decidable subgoals. It is shown that certain nonlinear conjectures with multiple occurrences of induction variables can also be decided by extending the notion of compatibility to the simultaneous compatibility of a function symbol with many function symbols. Some preliminary ideas on bootstrapping the proposed approach are discussed by considering conjectures with function symbols that are defined in terms of other recursively defined function symbols.

Our preliminary experience regarding the effectiveness of the proposed conditions is encouraging. Several examples about properties of lists and numbers, as well as properties used to establish the number-theoretic correctness of arithmetic circuits, have been successfully tried. Some representative conjectures, both valid and invalid formulas, decided by the proposed approach are given below. With each conjecture, the annotations indicate whether it is simple or complex, as discussed above, its validity, and the underlying decidable subtheories. Conjectures are annotated as nonlinear if they contain multiple basic terms with the same induction variables. For example, conjectures 12-16 below are nonlinear since they have multiple basic terms with the induction variable x.
However, conjectures 7-9 are not nonlinear, since they do not contain multiple basic terms even though they contain multiple occurrences of the variable x. In conjectures 18-20, the underlying theory is Presburger arithmetic extended with the function symbol *. Conjectures 16 and 17 establish the correctness of a restricted form of ripple-carry and carry-save adders, respectively. The arguments to the two adders are restricted to be the same in these conjectures. This restriction can be relaxed, and the number-theoretic correctness of parameterized ripple-carry and carry-save adders [12,10] can be established using the proposed approach. In addition, several intermediate lemmas involved in the proofs of multiplier circuits and the SRT divider circuit [10,11] can be handled.

1. half(double(x)) = x, [Complex, valid, Presburger]
2. mod2(double(x)) = 0, [Complex, valid, Presburger]
3. half(mod2(x)) = 0, [Complex, valid, Presburger]
4. log(mod2(x)) = 0, [Complex, valid, Presburger]
5. exp2(log(x)) = x, [Complex, inval, Presburger]
6. log(exp2(x)) = x, [Complex, valid, Presburger]
7. x * log(mod2(x)) = 0, [Complex, valid, Presburger]
8. x * mod2(double(x)) = 0, [Complex, valid, Presburger]
9. memb(x, delete(x, y)) = false, [Complex, valid, lists]
344
Deepak Kapur and Mahadavan Subramaniam
10. bton(pad0(ntob(x))) = x, [Complex, valid, lists]
11. last(ntob(double(x))) = 0, [Complex, valid, lists]
12. length(append(x, y)) - (length(x) + length(y)) = 0, [Complex, valid, nonlinear, Presburger, lists]
13. rotate(length(x), x) = x, [Complex, valid, nonlinear, Presburger, lists]
14. length(nth(x, y)) <= length(x), [Complex, valid, nonlinear, Presburger, lists]
15. length(delete(x, y)) <= length(y), [Complex, valid, nonlinear, Presburger, lists]
16. bton(carry-saveadder(ntob(x), ntob(x), ntob(x))) = x + x + x, [Complex, valid, nonlinear, Presburger, bitvectors]
17. bton(ripple-carryadder(ntob(x), ntob(x), ntob(x))) = x + x + x, [Complex, valid, nonlinear, Presburger, bitvectors]
18. exp(1, x) = x, [Simple, valid, Presburger extended by *]
19. exp(1, x) = s(x), [Simple, inval, Presburger extended by *]
20. exp(x, mod2(double(y))) = s(0), [Complex, valid, Presburger extended by *]
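A few of the list conjectures above can be spot-checked with ordinary Python lists. The definitions of length, append, rotate, and delete below are standard assumptions introduced for illustration (delete is assumed to remove all occurrences of its first argument), not the RRL rules used in the experiments:

```python
# Spot-checks for conjectures 12, 13, and 15 with assumed list definitions.

def length(xs):
    return 0 if not xs else 1 + length(xs[1:])

def append(xs, ys):
    return ys if not xs else [xs[0]] + append(xs[1:], ys)

def rotate(n, xs):
    # assumed: rotate(0, x) = x; rotate(s(n), cons(h, t)) = rotate(n, append(t, [h]))
    if n == 0 or not xs:
        return xs
    return rotate(n - 1, append(xs[1:], [xs[0]]))

def delete(x, ys):
    # assumed: remove all occurrences of x from ys
    if not ys:
        return []
    rest = delete(x, ys[1:])
    return rest if ys[0] == x else [ys[0]] + rest

for xs, ys in (([], []), ([1], [2, 3]), ([1, 2, 3], [1, 2])):
    assert length(append(xs, ys)) - (length(xs) + length(ys)) == 0   # conjecture 12
    assert rotate(length(xs), xs) == xs                              # conjecture 13
    assert length(delete(1, ys)) <= length(ys)                       # conjecture 15
```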
Inductive reasoning plays a central role in several nontrivial applications, but induction techniques are hardly supported in many reasoning tools, primarily due to the intense manual intervention required to perform inductive proofs in general. The proposed approach can be used to integrate induction proof methods in other reasoning tools and to selectively invoke these methods, significantly enhancing the reasoning capabilities of these tools without compromising automation. For instance, procedures implementing the cover set induction method can be integrated as a component decision procedure in a cooperating decision procedures framework; they can be invoked to check the validity of inductive subgoals. To make the proposed approach more effective, it should be generalized to decide more general quantifier-free formulas as well as to mechanically generate subsidiary conditions under which a given quantifier-free formula is valid. Such an investigation has been initiated and preliminary results are discussed in [5]. It is also necessary to consider the decidability of formulas that require nested induction. Another promising direction for extending this work is to use the proposed approach to guide the generation of intermediate lemmas.

Acknowledgements: Thanks to Jürgen Giesl and the referees for useful comments on an earlier draft of the paper.
References
1. A. Gupta, Inductive Boolean Function Manipulation: A Hardware Verification Methodology for Automatic Induction. Ph.D. Thesis, Department of Computer Science, Carnegie Mellon University, Pittsburgh, 1994.
2. R.S. Boyer and J S. Moore, A Computational Logic. ACM Monographs in Computer Science, 1979.
3. R.S. Boyer and J S. Moore, "Integrating decision procedures into heuristic theorem provers: A case study of linear arithmetic," Machine Intelligence 11, J.E. Hayes, D. Mitchie and J. Richards (eds.), 1988.
4. L. Fribourg, "Mixing list recursion and arithmetic," Proc. Seventh Symp. on Logic in Computer Science, 1992.
5. J. Giesl and D. Kapur, Decidable Classes of Inductive Theorems. Technical Report, Department of Computer Science, University of New Mexico, Feb. 2000.
6. D. Kapur, "Automated tools for analyzing completeness of specifications," Proc. 1994 Intl. Symp. on Software Testing and Analysis (ISSTA), Seattle, WA, August 1994, 28-43.
7. D. Kapur, "Rewriting, decision procedures and lemma speculation for automated hardware verification," Proc. 10th Intl. Conf. Theorem Proving in Higher Order Logics, LNCS 1275, 1997.
8. D. Kapur and X. Nie, "Reasoning about numbers in Tecton," Proc. 8th Intl. Symp. Methodologies for Intelligent Systems (ISMIS'94), North Carolina, October 1994.
9. D. Kapur and M. Subramaniam, "New uses of linear arithmetic in inductive theorem proving," J. Automated Reasoning, 16 (1-2), 1996, 39-78.
10. D. Kapur and M. Subramaniam, "Mechanically verifying a family of multiplier circuits," Proc. Computer Aided Verification (CAV'96), New Jersey, Springer LNCS 1102 (eds. Alur & Henzinger), 1996, 135-146.
11. D. Kapur and M. Subramaniam, "Mechanizing reasoning about arithmetic circuits: SRT division," Proc. 17th FSTTCS, LNCS (eds. Sivakumar & Ramesh), 1997.
12. D. Kapur and M. Subramaniam, "Mechanical verification of adder circuits using powerlists," J. of Formal Methods in System Design, Nov. 1998.
13. D. Kapur and M. Subramaniam, "Using an induction prover for verifying arithmetic circuits," to appear in J. of Software Tools for Technology Transfer, Springer Verlag, March 1999.
14. D. Kapur and H. Zhang, "An overview of Rewrite Rule Laboratory (RRL)," J. of Computer and Mathematics with Applications, 29, 2, 1995, 91-114.
15. M. Subramaniam, Failure Analyses in Inductive Theorem Provers. Ph.D. Thesis, Department of Computer Science, University at Albany, State University of New York, 1997.
16. H. Zhang, D. Kapur, and M.S. Krishnamoorthy, "A mechanizable induction principle for equational specifications," Proc. 9th Intl. Conf. Automated Deduction (CADE), Springer LNCS 310 (eds. Lusk and Overbeek), Chicago, 1988, 250-265.
Stratified Resolution

Anatoli Degtyarev¹ and Andrei Voronkov²

¹ Manchester Metropolitan University
[email protected]
² University of Manchester
[email protected]
Abstract. We introduce a calculus of stratified resolution, in which special attention is paid to clauses that “define” relations. If such clauses are discovered in the initial set of clauses, they are treated using the rule of definition unfolding, i.e. the rule that replaces defined relations by their definitions. Stratified resolution comes with a new, previously not studied, notion of redundancy: a clause to which definition unfolding has been applied can be removed from the search space. To prove completeness of stratified resolution with redundancies we use a novel technique of traces.
1 Introduction
In this article we introduce two versions of stratified resolution, a resolution calculus with special rules for handling hierarchical definitions of relations. Stratified resolution generalizes SLD-resolution for Horn clauses to a more general case, where clauses may be non-Horn but "Horn with respect to a set of relations".

Example 1 Suppose we try to establish inconsistency of a set of clauses S containing a recursive definition of a relation split that splits a list of conferences into two sublists: of deduction-related conferences, and of all other conferences.

split([x|y], [x|z], u) :- deduction(x), split(y, z, u).
split([x|y], z, [x|u]) :- ¬deduction(x), split(y, z, u).
split([], [], []).

Suppose that S also contains other clauses, for example ¬split(x, y, z) ∨ conference_list(x). If we use ordered resolution with negative selection (as most state-of-the-art systems would do), we face several choices in selecting the ordering and the negative literals in clauses. For example, if we choose an ordering in which every literal with the relation deduction is greater than any literal with the relation split, then we must select either ¬deduction(x) or ¬split(y, z, u) in the first clause. It seems much more natural to select split([x|y], [x|z], u) instead; then we can use the

D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 365–384, 2000.
© Springer-Verlag Berlin Heidelberg 2000
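Read functionally, the three clauses above compute a deterministic function of their first argument. A minimal executable sketch of that reading, where the set of deduction-related conferences is our own sample data, not something fixed by the paper:

```python
# A direct functional reading of the three `split` clauses of Example 1.
# DEDUCTION is an assumed sample extension of the deduction/1 relation.
DEDUCTION = {"cade", "lpar"}

def deduction(x):
    return x in DEDUCTION

def split(lst):
    """Split a conference list into (deduction-related, others),
    mirroring the three recursive clauses of split/3."""
    if not lst:                      # split([], [], []).
        return [], []
    x, *y = lst
    z, u = split(y)
    if deduction(x):                 # split([x|y], [x|z], u) :- deduction(x), ...
        return [x] + z, u
    else:                            # split([x|y], z, [x|u]) :- ¬deduction(x), ...
        return z, [x] + u

print(split(["cade", "www", "lpar"]))   # (['cade', 'lpar'], ['www'])
```

Running the clauses "forwards" like this is exactly what SLD-style definition unfolding achieves when the first argument is instantiated enough.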
366
Anatoli Degtyarev and Andrei Voronkov
first clause in the same way it would be used in logic programming. Likewise, if we always try to select a negative literal in a clause, the literal ¬split(y, z, u) will be selected in the second clause, which is most likely a wrong choice, since then any resolvent with the second clause will give us a larger clause. Let us now choose an ordering in which the literals split([x|y], [x|z], u) and split([x|y], z, [x|u]) are maximal in their clauses, and select these literals. Consider the fourth clause. If we select ¬split(x, y, z) in it, we can resolve this literal with all three clauses defining split. It would be desirable to select conference_list(x) in it (if our ordering allows us to do so), since a resolvent upon conference_list(x) is likely to instantiate x to a nonvariable term t, and then the literal ¬split(t, y, z) can be resolved with only two, one, or no clauses at all, depending on the form of t. In all cases, it seems reasonable to choose an ordering and selection function in such a way that the first three clauses will be used as a definition of split, so that we unfold this definition, i.e. replace the heads of these clauses with their bodies. Such an ordering would give us the best results if we have a right strategy of negative selection, which says: select ¬split(t, r, s) only if t is instantiated enough, or if we have no other choice. In order to implement this idea we have to be able to formalize the right notion of a "definition" in a set of clauses. Such a formalization is undertaken in our paper, in the form of a calculus of stratified resolution. Stratified resolution is based on the following ideas that can be traced back to earlier ideas developed in logic programming.

1. Logic programming is based on the idea of using definite clauses as definitions of relations. Similar to the notion of definite clause, we introduce a more general notion of a set of clauses definite w.r.t. a set of relations. These relations are regarded as defined by this set of clauses.
2. In logic programming, relations are often defined in terms of other relations. The notion of stratification [5,1,8] allows one to formalize the notion "P is defined in terms of Q". We use a similar idea of stratification, but in our case stratification must be related to a reduction ordering on literals.

Consider another example.

Example 2 The difficult problem is to find automatically the right ordering that makes the atom in the head of a "definition" greater than the atoms in the body of this definition. Consider, for example, clauses defining reachability in a directed graph, where the graph is formalized by the binary relation edge:

reachable(x, y) :- edge(x, y).
reachable(x, z) :- edge(x, y), reachable(y, z).

There is no well-founded ordering stable under substitutions that makes the atom reachable(x, z) greater than reachable(y, z). So standard ordered resolution with negative selection cannot help us in selecting the "right" ordering. The theory developed in this paper allows one to select only the literal reachable(x, z) in this clause, even though this literal is not the greatest.
Stratified Resolution
367
In addition to intelligent selection of literals in clauses, stratified resolution has a new kind of redundancy, not exploited so far in automated deduction. To explain this kind of redundancy, let us come back to Example 1. Suppose we have a clause

¬split([cade, www, lpar], y, z).    (1)

Stratified resolution can resolve this clause with the first two clauses in the definition of split, obtaining two new clauses

deduction(cade) ∨ ¬split([www, lpar], y, z);
¬deduction(cade) ∨ ¬split([www, lpar], y, z).

In resolution-based theorem proving these two clauses would be added to the search space. We prove that they can replace clause (1), thus making the search space smaller. When the initial set of clauses contains no definitions or cannot be stratified, stratified resolution becomes ordinary ordered resolution with negative selection. However, sets of clauses which contain definitions and can be stratified in our sense are often met in practice, since they correspond to (possibly recursive) definitions of relations of a special form. For example, a majority of TPTP problems can be stratified.

This paper is organized as follows. In Section 2 we define the ground version of stratified resolution and prove its soundness and completeness. Then in Section 3 we define stratified resolution with redundancies, a calculus in which a clause can be removed from the search space after definition unfolding has been applied to it. Then in Section 4 we define a nonground version of stratified resolution with redundancies. In this paper we deal with first-order logic without equality; we will only briefly discuss equality in Section 6.

Related Work

There are not so many papers in the automated deduction literature relevant to ours. Our formal system resembles SLD-resolution [6]. When the initial set of clauses is Horn, our stratified resolution with redundancies becomes SLD-resolution. The possibility of arbitrary selection for Horn clauses, even in the case of equational logic, was proved in [7]. For one of our proofs we used a renaming technique introduced in [2,3].
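At the level of sets of clauses, the step just described removes clause (1) and adds its resolvents against every clause that contains the unfolded atom positively. A minimal ground sketch, with clauses encoded as frozensets of (sign, atom) pairs; the atom names below are our own propositional placeholders for the literals of Example 1, not the paper's notation:

```python
# Definition unfolding as a search-space operation (ground sketch):
# the clause carrying the selected negative literal ¬P is REPLACED by its
# resolvents against all clauses that contain P positively.
def unfold(clauses, clause, p):
    """Replace `clause` (which must contain ('-', p)) by its resolvents."""
    assert ('-', p) in clause
    rest = clause - {('-', p)}
    resolvents = {
        frozenset((d - {('+', p)}) | rest)
        for d in clauses
        if ('+', p) in d
    }
    return (clauses - {clause}) | resolvents

# Propositional stand-ins for the two matching split clauses and clause (1):
defs = {
    frozenset({('+', 'split1'), ('-', 'ded_cade'), ('-', 'split2')}),
    frozenset({('+', 'split1'), ('+', 'ded_cade'), ('-', 'split3')}),
}
goal = frozenset({('-', 'split1')})
new = unfold(defs | {goal}, goal, 'split1')
assert goal not in new     # clause (1) is deleted, not merely kept around
assert len(new) == 4       # the two definitions plus the two resolvents
```

The point of the redundancy result is precisely that the deleted clause need not be retained for completeness.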
2 Stratified Resolution: The Ground Case
As usual, we begin with the propositional case; lifting to the general case will be standard. Throughout this paper, we denote by L a set of propositional atoms. The literal complementary to a literal L is denoted by L̄. For every set of atoms P,
we call a P-atom any atom in P, and a P-literal any literal whose atom is in P. A clause is a finite multiset of literals. We denote literals by L, M and clauses by C, D, maybe with indices. The empty clause is denoted by □. We use the standard notation for multisets; for example, we write C1, C2 for the multiset union of two clauses C1 and C2, and write L ∈ C if the literal L is a member of the clause C. For two ground clauses C1 and C2, we say that C1 subsumes C2 if C1 is a submultiset of C2. This means, for example, that A, A does not subsume A.

Let ≻ be a total well-founded ordering on L. We extend this ordering to literals over L in the standard way, so that for every atom A we have ¬A ≻ A and there is no literal L such that ¬A ≻ L ≻ A. If L is a literal and C is a clause, we write L ≻ C if for every literal M ∈ C we have L ≻ M. Usually, we will write a clause as a disjunction of its literals. As usual, we write L ⪰ L′ if L ≻ L′ or L = L′, and write L ⪰ C if for every literal M ∈ C we have L ⪰ M. In this paper we assume that ≻ always denotes a fixed well-founded ordering.

We will now define a notion of selection function which deviates slightly from the standard notions (see e.g. [3]). However, the resulting inference system will be standard, and several subsequent definitions will become simpler. We call a selection function any function σ on the set of clauses such that (i) σ(C) is a set of literals in C, (ii) σ(C) is nonempty whenever C is nonempty, and (iii) if A ∈ σ(C) and A is a positive literal, then A ⪰ C. If L ∈ σ(C), we say that L is selected by σ in C. When we use a selection function, we underline selected literals, so when we write A ∨ C, this means that A (and maybe some other literals) are selected in A ∨ C.

In our proofs we will use the result on completeness of the inference system of ordered binary resolution with negative selection (but our main inference system will be different). Let us define this inference system.
Definition 3 Let σ be a selection function. The inference system R_σ consists of the following inference rules:

1. Positive ordered factoring:

   C ∨ A ∨ A
   ─────────
     C ∨ A

   where A is a positive literal.

2. Binary ordered resolution with selection:

   C ∨ A    D ∨ ¬A
   ────────────────
        C ∨ D

   where A ≻ C.

The following theorem is well-known (for example, a stronger statement can be found in [3]).

Theorem 4 Let S be a set of clauses. Then S is unsatisfiable if and only if □ is derivable from S in R_σ.
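On ground clauses the two rules of R_σ are easy to state operationally. The following is a sketch under our own encoding assumptions: literals are (sign, atom) pairs, the atom precedence is an explicit list, and the maximal literal plays the role of the selected one (a full implementation would check the selection conditions more carefully):

```python
# Ground sketch of R_sigma: positive ordered factoring and binary ordered
# resolution. PREC fixes an illustrative atom precedence: R > Q > P.
PREC = ['P', 'Q', 'R']

def key(lit):
    sign, atom = lit
    # Extend the atom ordering to literals so that ¬A sits just above A.
    return (PREC.index(atom), 0 if sign == '+' else 1)

def factor(clause):
    """Positive ordered factoring on a duplicated maximal positive literal."""
    m = max(clause, key=key)
    if m[0] == '+' and clause.count(m) >= 2:
        rest = list(clause)
        rest.remove(m)
        return tuple(rest)
    return None

def resolve(c1, c2):
    """Resolve upon the maximal literal A of c1 (required to be positive)
    against the complementary literal ¬A occurring in c2."""
    a = max(c1, key=key)
    if a[0] != '+' or ('-', a[1]) not in c2:
        return None
    rest1, rest2 = list(c1), list(c2)
    rest1.remove(a)
    rest2.remove(('-', a[1]))
    return tuple(rest1 + rest2)
```

For instance, resolving R ∨ ¬P with ¬R ∨ Q yields ¬P ∨ Q, and factoring R ∨ R ∨ ¬P yields R ∨ ¬P.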
The following two definitions are central to our paper.

Definition 5 Let P ⊆ L be a set of atoms. A set S of clauses is Horn with respect to P if every clause of S contains at most one positive P-literal. A clause D is called definite with respect to P if D contains exactly one positive P-literal. Let P ∨ L1 ∨ . . . ∨ Ln be a clause definite w.r.t. P such that P ∈ P. We will sometimes denote this clause by P :- L1, . . . , Ln.

Definition 6 (Stratification) We call a ≻-stratification of L any finite sequence Ln, . . . , L0 of subsets of L such that

1. L = L0 ∪ . . . ∪ Ln;
2. if m > n, A ∈ Lm, and B ∈ Ln, then A ≻ B.

We will denote ≻-stratifications by Ln ≻ . . . ≻ L0 and call them simply stratifications. From now on we assume a fixed stratification of L of the form

Qn ≻ Pn ≻ Qn−1 ≻ Pn−1 ≻ . . . ≻ Q1 ≻ P1 ≻ Q0.    (2)

We denote P = Pn ∪ . . . ∪ P1 and Q = Qn ∪ . . . ∪ Q0 and use this notation throughout the paper. Atoms in P will be denoted by P (maybe with indices). Let C be a clause definite w.r.t. P. Then C contains a positive literal P ∈ Pi, for some i. We say that C admits stratification (2) if all atoms occurring in C belong to Pi ∪ Qi−1 ∪ . . . ∪ P1 ∪ Q0. Note that every such clause has the form

P :- P1, . . . , Pk, L1, . . . , Ll

where P1, . . . , Pk are atoms in Pi ∪ . . . ∪ P1 and L1, . . . , Ll are Qi−1 ∪ . . . ∪ Q0-literals.
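The condition "C admits stratification (2)" is a simple containment check: every atom of the clause must lie at or below the stratum of the clause's head. A sketch, with strata encoded lowest-first as (kind, atom set) pairs (the encoding and the sample atom names are ours, not the paper's):

```python
# Sketch: does a clause definite w.r.t. P admit the fixed stratification
# Qn > Pn > ... > Q1 > P1 > Q0 of Definition 6?
# `strata` lists the strata lowest-first: [Q0, P1, Q1, P2, ...].
def admits(clause_atoms, head_atom, strata):
    """clause_atoms: set of all atoms in the clause.
    head_atom: the unique positive P-literal's atom.
    strata: list of (kind, atom_set) with kind in {'Q', 'P'}."""
    # Find the stratum P_i that contains the head atom.
    head_idx = next(i for i, (kind, s) in enumerate(strata)
                    if kind == 'P' and head_atom in s)
    # Every atom must lie in P_i ∪ Q_{i-1} ∪ ... ∪ P_1 ∪ Q_0.
    allowed = set().union(*(s for _, s in strata[:head_idx + 1]))
    return clause_atoms <= allowed

strata = [('Q', {'q0'}), ('P', {'p1'}), ('Q', {'q1'}), ('P', {'p2'})]
assert admits({'p1', 'q0'}, 'p1', strata)       # P1 :- a Q0-literal: fine
assert not admits({'p1', 'q1'}, 'p1', strata)   # q1 lies strictly above P1
```

This is the shape of check one would run per definite clause before enabling the definition unfolding rule on its head relation.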
Example 7 Consider the set consisting of four clauses: A ∨ B, A ∨ ¬B, ¬A ∨ B, and ¬A ∨ ¬B. This set is Horn with respect to {A} and also with respect to {B}, but not with respect to {A, B}. This set of clauses admits the stratification ∅ ≻ {A} ≻ {B}, in which A is considered as a relation defined in terms of B, but it also admits ∅ ≻ {B} ≻ {A}, in which B is considered as defined in terms of A. This example shows that it is hard to expect to find the "greatest" (in any sense) stratification.

Let us fix a well-founded order ≻, a selection function σ, and a ≻-stratification (2).

Definition 8 (Stratified Resolution) The inference system of stratified resolution, denoted SR, consists of the following inference rules.
1. Positive ordered factoring:

   C ∨ A ∨ A
   ─────────
     C ∨ A

   where A is a positive literal and C contains no positive P-literals.

2. Binary ordered resolution with selection:

   C ∨ A    D ∨ ¬A
   ────────────────
        C ∨ D

   where A ≻ C and C, D, A contain no positive P-literals.

3. Definition unfolding:

   C ∨ P    D ∨ ¬P
   ────────────────
        C ∨ D

   where P ∈ P and D contains no positive P-literals. Note that in this rule we do not require that P be selected in C ∨ P.

Theorem 9 (Soundness and Completeness of Stratified Resolution) Let S be a set of clauses Horn w.r.t. P. Let, in addition, every clause in S definite w.r.t. P admit stratification (2). Then S is unsatisfiable if and only if □ is derivable from S in the system of stratified resolution.

Proof. Soundness is obvious. To prove completeness, we use a technique of [3] (see also [2]) for proving completeness of resolution with free selection for Horn sets. Let us explain the idea of this proof. The inference rules of stratified resolution remind us of the rules of ordered resolution with negative selection, but with a nonstandard selection function: in any clause P ∨ C definite w.r.t. P the literal P is selected. We could use Theorem 4 on completeness of ordered resolution with negative selection if P were maximal in P ∨ C, since we could then select P in P ∨ C using a standard selection function. However, P is not necessarily maximal: there can be other P-literals in C greater than P. What we do is to "rename" the clause P ∨ C so that P becomes greater than C.

Formally, let P′ be a new set of atoms of the form P^n, where P ∈ P and n is a natural number. Denote by L′ the set of atoms P′ ∪ Q. We will refer to the set L′ as the new language, as opposed to the original language L. Define a mapping ρ : L′ −→ L by (i) ρ(P^n) = P for any natural number n, (ii) ρ(A) = A if A ∈ Q. Extend the mapping to literals and clauses in the new language in a natural way: ρ(¬A) = ¬ρ(A) and ρ(L1 ∨ . . . ∨ Ln) = ρ(L1) ∨ . . . ∨ ρ(Ln). For every clause C Horn w.r.t. P of the original language we define a set of clauses C^ρ in the new language as follows.

1. If C is definite w.r.t. P, then C has the form P ∨ ¬P1 ∨ . . . ∨ ¬Pk ∨ D, where k ≥ 0 and D has no P-literals. We define C^ρ as the set of clauses

   {P^{1+n1+...+nk} ∨ ¬P1^{n1} ∨ . . . ∨ ¬Pk^{nk} ∨ D | n1, . . . , nk are natural numbers}.
2. Otherwise, C has the form ¬P1 ∨ . . . ∨ ¬Pk ∨ D, where k ≥ 0 and D has no P-literals. We define C^ρ as the set of clauses

   {¬P1^{n1} ∨ . . . ∨ ¬Pk^{nk} ∨ D | n1, . . . , nk are natural numbers}.

For any set of clauses S Horn w.r.t. P we define S^ρ as the set of clauses ⋃_{C∈S} C^ρ. We prove the following:

   S is satisfiable if and only if so is S^ρ.    (3)

Suppose S is satisfiable; then some valuation τ : L −→ {true, false} satisfies S. Define a valuation τ′ : L′ −→ {true, false} such that τ′(A) = τ(A) if A ∈ Q, and τ′(P^n) = τ(P). It is not hard to argue that τ′ satisfies S^ρ. Now we suppose that S is unsatisfiable and show that S^ρ is unsatisfiable, too. We apply induction on the number k of Q-literals occurring in S.

1. Case k = 0. Then S is a set of Horn clauses. This case has been considered in [3]. (The idea is that P^n is interpreted as "P has a derivation by SLD-resolution in n steps".)
2. Case k > 0. For any set of clauses T and literal L, denote by T/L the set of clauses obtained from T by removing all clauses containing L and removing from the remaining clauses all occurrences of the literal L̄. Note that if L is a Q-literal occurring in T, then T/L contains fewer Q-literals than T. It is not hard to argue that T is unsatisfiable if and only if so are both T/L and T/L̄. Take any Q-literal L occurring in S. Since S is unsatisfiable, so are both S/L and S/L̄. By the induction hypothesis, (S/L)^ρ and (S/L̄)^ρ are unsatisfiable, too. It is easy to see that (S/L)^ρ = S^ρ/L and (S/L̄)^ρ = S^ρ/L̄, so both S^ρ/L and S^ρ/L̄ are unsatisfiable. But then S^ρ is unsatisfiable, too.

The proof of (3) is completed. Let us continue the proof of Theorem 9. Define an order ≻′ on L′ as follows:

1. ≻′ and ≻ coincide on Q-literals.
2. If L is a Q-atom and P is a P-atom, then for every n we let L ≻′ P^n if and only if L ≻ P.
3. P1^{n1} ≻′ P2^{n2} if P1 ∈ Pi, P2 ∈ Pj and i > j.
4. P1^{n1} ≻′ P2^{n2} if P1 and P2 belong to the same set Pi and n1 > n2.
5. P1^n ≻′ P2^n if P1 and P2 belong to the same set Pi and P1 ≻ P2.

Using the selection function σ on clauses of the original language we will now define a selection function σ′ on clauses of the new language. We will only be interested in the behavior of σ′ on clauses Horn w.r.t. P′, so we do not define it for other clauses.

1. If a clause C is definite w.r.t. P′, then σ′ selects in C the maximal literal.
2. If C contains no positive P-literals, then σ′ selects a literal L in C if and only if σ selects ρ(L) in ρ(C).
Note that our ordering is defined in such a way that for any clause (P^n ∨ C) ∈ S^ρ definite w.r.t. P′ we have P^n ≻′ C, and hence P^n is always selected in this clause. It is not hard to argue that σ′ is a selection function. To complete the proof we use Theorem 4 on completeness of the system R_{σ′} applied to S^ρ.

Consider a derivation of a clause D′ from S^ρ with respect to R_{σ′}. We show simultaneously that:

(4) there exists a derivation of the clause D = ρ(D′) from S in the system SR;
(5) if D′ ∉ S^ρ, then D′ contains no positive P′-literal.

We apply induction on the length of the derivation of D′. If D′ ∈ S^ρ, then ρ(D′) ∈ S, so the induction base is straightforward. Assume now that the derivation of D′ is obtained by an inference from D1′, . . . , Dn′. We will show that D can be obtained by an inference in SR from ρ(D1′), . . . , ρ(Dn′); this will imply (4), since ρ(D1′), . . . , ρ(Dn′) can be derived in SR from S by the induction hypothesis. Consider the following cases.

1. Case: D′ = C′ ∨ A is derived from C′ ∨ A ∨ A by positive ordered factoring. Then A ⪰′ C′. Note that A cannot be a P′-atom: C′ ∨ P^n ∨ P^n ∉ S^ρ because all clauses in S^ρ are Horn w.r.t. P′, and C′ ∨ P^n ∨ P^n cannot be derived by the induction hypothesis. Therefore, A is a Q-atom, and hence ρ(C′ ∨ A ∨ A) = ρ(C′) ∨ A ∨ A. Moreover, C′ contains no positive literal P^n, because in this case the clause (ρ(C′) ∨ A ∨ A) ∈ S would be definite w.r.t. P, so it would admit the stratification, so P ≻ A, which contradicts A ⪰′ P^n. Therefore, C′ ∨ A contains no positive P′-literal. Then by our definition of σ′ and ≻′, A is maximal and selected in ρ(C′) ∨ A ∨ A, so we can apply positive ordered factoring and derive ρ(C′) ∨ A in SR. It is easy to see that ρ(C′) ∨ A = ρ(C′ ∨ A) and that ρ(C′ ∨ A) contains no positive P′-literal.

2. Case: D′ = C1′ ∨ C2′ is obtained from C1′ ∨ A and C2′ ∨ ¬A by binary ordered resolution with selection. Denote C1 = ρ(C1′) and C2 = ρ(C2′). Consider two cases.

   (a) Case: A ∈ Q. Then ρ(A) = A. In this case we show that

       C1 ∨ A    C2 ∨ ¬A
       ──────────────────
            C1 ∨ C2

       is an inference in SR by binary ordered resolution with selection. Since A ≻′ C1′, we have A ≻ C1. By our definition of σ′, A (¬A) is selected by σ in C1 ∨ A (in C2 ∨ ¬A) because A (¬A) is selected by σ′ in C1′ ∨ A (C2′ ∨ ¬A). It remains to check that C1, C2 contain no positive P-literals. This is done exactly as in the case of positive ordered factoring.

   (b) Case: A = P^n. We prove that

       C1 ∨ P    C2 ∨ ¬P
       ──────────────────
            C1 ∨ C2
is an inference in SR by definition unfolding. Indeed, since ¬P^n is selected by σ′ in C2′ ∨ ¬P^n, the literal ¬P is selected by σ in C2 ∨ ¬P. As in the previous cases, we can prove that C1′, C2′ contain no positive literal of the form P^m.

Now we can conclude the proof of Theorem 9. Suppose that S is unsatisfiable. By (3), S^ρ is unsatisfiable too. Then by Theorem 4 the empty clause □ is derivable from S^ρ in R_{σ′}. Hence, by (4), ρ(□) is derivable from S in SR. But ρ(□) = □, so □ is derivable from S in SR.

We will now show that the condition on clauses to admit stratification is essential.

Example 10 This example is taken from [7]. Consider the following set of clauses:

   ¬Q ∨ R       ¬P ∨ Q       ¬R ∨ ¬Q
   ¬Q ∨ ¬P      ¬P ∨ ¬R
   ¬R ∨ P       R ∨ Q ∨ P

This set of clauses is unsatisfiable and Horn w.r.t. {P}. Consider the ordering R ≻ Q ≻ P. The empty clause cannot be derived from this set, even if tautologies are allowed: the conclusion of any inference by stratified resolution is subsumed by one of the clauses in the set. The problem is that the clause R ∨ Q ∨ P admits no ≻-stratification.
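The unsatisfiability claim in Example 10 is small enough to confirm by exhaustive enumeration of the eight valuations of P, Q, R:

```python
# Brute-force check that the seven clauses of Example 10 have no model.
from itertools import product

CLAUSES = [  # each clause: list of (atom, required polarity)
    [('Q', False), ('R', True)],
    [('Q', False), ('P', False)],
    [('R', False), ('P', True)],
    [('P', False), ('Q', True)],
    [('P', False), ('R', False)],
    [('R', True), ('Q', True), ('P', True)],
    [('R', False), ('Q', False)],
]

def satisfiable(clauses):
    atoms = sorted({a for c in clauses for a, _ in c})
    for vals in product([True, False], repeat=len(atoms)):
        v = dict(zip(atoms, vals))
        if all(any(v[a] == pol for a, pol in c) for c in clauses):
            return True
    return False

assert not satisfiable(CLAUSES)        # the full set has no model
assert satisfiable(CLAUSES[1:])        # dropping ¬Q ∨ R makes it satisfiable
```

So the set is indeed unsatisfiable, and the incompleteness exhibited by the example is due to the missing stratification, not to the input.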
3 Redundancies in Stratified Resolution
In this section we make stratified resolution into an inference system on sets of clauses and add two kinds of inferences that remove clauses from sets. Then we prove completeness of the resulting system using a new technique of traces. The derivable objects in the inference system of stratified resolution with redundancies are sets of clauses. Each inference has exactly one premise S and conclusion S′; we will denote such an inference by S ▷ S′. We assume an ordering ≻, a stratification, and a selection function defined as in the case of SR. In this section we always assume that S0 is an initial set of clauses Horn w.r.t. P and having the following property: for every P ∈ P, S0 contains only a finite number of clauses in which P is a positive literal.

Definition 11 (Stratified Resolution with Redundancies) The inference system of stratified resolution with redundancies, denoted SRR, consists of the following inference rules.

1. Suppose that (C ∨ A ∨ A) ∈ S and A is a positive literal. Then

   S ▷ S ∪ {C ∨ A}

   is an inference by positive ordered factoring.
2. Suppose that {C ∨ A, D ∨ ¬A} ⊆ S, where A ≻ C and C, D, A contain no positive P-literals. Then

   S ▷ S ∪ {C ∨ D}

   is an inference by binary ordered resolution.

3. Suppose (C ∨ ¬P) ∈ S, where P ∈ P. Furthermore, suppose that P ∨ D1, . . . , P ∨ Dk are all clauses in S containing P positively. Then

   S ▷ (S − {C ∨ ¬P}) ∪ {C ∨ D1, . . . , C ∨ Dk}

   is an inference by definition rewriting. Note that this inference deletes the clause C ∨ ¬P. Moreover, if S contains no clauses containing the positive literal P, then C ∨ ¬P is deleted from the search space and not replaced by any other clause.

4. Suppose {C, D} ⊆ S, C ≠ D, and C subsumes D. Then

   S ▷ S − {D}

   is an inference by subsumption.

For the new calculus, we have to change the notion of derivation. We call a derivation from a set of clauses S0 any sequence of inferences

   S0 ▷ S1 ▷ S2 ▷ . . . ,

possibly infinite. The derivation is said to succeed if some Si contains □, and to fail otherwise. Consider a derivation S = S0 ▷ S1 ▷ . . . . The set ⋃_i ⋂_{j≥i} Sj is called the limit of this derivation and denoted by lim S. The derivation is called fair if the following three conditions are fulfilled:

1. If (C ∨ A ∨ A) belongs to the limit of S and A ⪰ C, then some inference in S is the inference by positive ordered factoring Si ▷ Si ∪ {C ∨ A}.
2. If C ∨ A, D ∨ ¬A belong to the limit of S, A ≻ C, and C, D, A contain no positive P-literals, then some inference in S is the inference by binary ordered resolution Si ▷ Si ∪ {C ∨ D}.
3. Let a clause C ∨ ¬P belong to the limit of S and P ∈ P. Then some inference in S is the inference by definition rewriting Si ▷ (Si − {C ∨ ¬P}) ∪ {C ∨ D1, . . . , C ∨ Dk}.
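The subsumption rule relies on submultiset inclusion as defined in Section 2, so A ∨ A does not subsume A. On ground clauses this is a counting check; a minimal sketch:

```python
# Ground subsumption as a submultiset test (clauses as lists of literals).
from collections import Counter

def subsumes(c, d):
    """C subsumes D iff every literal occurs in D at least as often as in C."""
    cc, dc = Counter(c), Counter(d)
    return all(dc[lit] >= n for lit, n in cc.items())

assert subsumes(['A'], ['A', 'B'])
assert not subsumes(['A', 'A'], ['A'])   # multiset, not set, inclusion
```

The multiset (rather than set) reading matters because factoring, which merges duplicate literals, is a separate inference that must not be subsumed away.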
We call a selection function σ subsumption-stable if it has the following property: suppose C subsumes D, L is a literal in D selected by σ, and L ∈ C; then L is also selected in C by σ. It is evident that subsumption-stable selection functions exist.

Theorem 12 (Soundness and Completeness of SRR) Let S0 be a set of clauses Horn w.r.t. P such that every clause definite w.r.t. P in S0 admits stratification (2). Let σ be a subsumption-stable selection function. Consider derivations in SRR. (i) If any derivation from S0 succeeds, then S0 is unsatisfiable. (ii) If S0 is unsatisfiable, then every fair derivation from S0 succeeds.

Proof. Soundness is easy to prove. Let a derivation S0 ▷ . . . ▷ Sn succeed. It is easy to see that every clause in every Si is a logical consequence of S0. Then □ is a logical consequence of S0, and hence S0 is unsatisfiable.

Completeness is more difficult to prove. We will introduce several notions, prove some intermediate lemmas, and then come back to the proof of this theorem. The main obstacle in the proof of completeness is that some clauses can be deleted from the search space when we apply definition rewriting or subsumption. If we only had subsumption, the proof could be done by standard methods based on clause orderings. However, definition rewriting can rewrite clauses into larger ones. For example, consider a set S of clauses that contains two clauses definite w.r.t. P: P1 :- P2 and P2 :- P1. Then the following is a valid derivation:

   S ∪ {¬P1} ▷ S ∪ {¬P2} ▷ S ∪ {¬P1} ▷ . . . ,

so independently of the choice of ordering, one of these inferences replaces a smaller clause by a bigger one. To prove completeness we will use a novel technique of traces. This technique was originally introduced in [10], where it was used to prove completeness of a system of modal logic with subsumption. Intuitively, a trace is whatever remains of a clause when the clause is deleted. Unlike [10], in this paper a trace can be an infinite set of clauses.

Suppose that S0 is the set of initial clauses, Horn w.r.t. P. Consider any clause D that contains no positive P-literals. Consider the following one-player game that builds a tree of clauses. Initially, the tree contains one root node D. We call a node C closed if either C = □ or the literal selected in C by σ is a Q-literal. An open node is any leaf that is not closed. At every step of the game the player has two choices of moves:

1. (Subsumption move). Select any leaf C (either closed or open) and add to C as the child node any clause C′ such that C′ subsumes C and C′ ≠ C.
2. (Expansion move). Select an open leaf C. If the literal selected in C by σ is a negative P-literal ¬P, then C has the form ¬P ∨ C′. (As we shall see later, no clause in the tree can contain a positive P-literal, so in this case a negative P-literal is always selected.) Let all clauses in S0 that contain the positive literal P be P ∨ D1, . . . , P ∨ Dn. Then add to this node as children all the nodes C′ ∨ D1, . . . , C′ ∨ Dn.
Using the property that S0 is Horn w.r.t. P, it is not hard to argue that no clause in the tree can contain a positive P-literal, so an open leaf always contains a selected negative P-literal. A game is fair if every non-closed node is eventually selected by the player. Let us call a tree for D any tree obtained as the limit of a fair game. We call a cover for D the set of all closed leaves in any tree T for D. We will say that the cover is obtained from the tree T and denote the cover by cl T.

Example 13 Suppose that the set of clauses S0 is

   {Pi ∨ ¬Qi | i = 1, 2, . . .} ∪ {Pi ∨ ¬Pi+1 | i = 1, 2, . . .},

where all the Pi are P-literals and all the Qi are Q-literals. Suppose that the selection function σ always selects a P-literal in a clause that contains at least one P-literal. Then one possible tree for ¬P1 ∨ Q1 is as follows:

   ¬P1 ∨ Q1
   ├─ ¬Q1 ∨ Q1
   └─ ¬P2 ∨ Q1
      ├─ ¬Q2 ∨ Q1
      └─ ¬P3 ∨ Q1
         └─ . . .

In this tree no subsumption move was applied. The cover for ¬P1 ∨ Q1 obtained from this tree is the set of clauses {¬Qi ∨ Q1 | i = 1, 2, . . .}.

Let us prove some useful properties of covers.

Lemma 14 Every cover for a clause C is also a cover for any clause C ∨ D.

Proof. Let S be a cover for C and let T be the tree from which this cover is obtained. Apply a subsumption move to C ∨ D, adding C as its child:

   C ∨ D
   └─ C

and extend this tree by the tree T below C. Evidently, we obtain a tree for C ∨ D, and the cover obtained from this tree is also S.

The following straightforward lemma asserts that it is enough to consider trees with no repeated subsumption moves.
Lemma 15 For every cover S for a clause C there exists a tree T for C such that S = cl T and no subsumption move in T is applied to a node obtained by a subsumption move.

Proof. Let T′ be a tree for C such that S = cl T′. Suppose that T′ contains a node obtained by a subsumption move to which a subsumption move is applied as well:

   D1 ∨ D2 ∨ D3
   └─ D1 ∨ D2
      └─ D1

Evidently, these two subsumption moves can be replaced by one subsumption move:

   D1 ∨ D2 ∨ D3
   └─ D1

In this way we can eliminate all sequences of subsumption moves and obtain the required tree T.

Let C be a clause and S be a derivation. We call a trace of C in S any cover S for C such that S ⊆ lim S. When we speak about a trace of C in S we do not assume that C occurs in S.

Lemma 16 Let S = S0 ▷ S1 ▷ . . . be a fair derivation. Then every clause D occurring in any Si has a trace in S.

Proof. We will play a game which builds a tree T such that cl T is a trace of D in S. We will define a sequence of trees T1, T2, . . . such that each Tj+1 is obtained from Tj by zero or more moves, and T is the limit of this sequence. The initial tree T1 consists of the root node D. Each tree Tj+1 is obtained from Tj by inspecting the inference Sj ▷ Sj+1 as follows.

1. Suppose this inference is a subsumption inference Sj ▷ Sj − {C ∨ D}, where C ∈ Sj and C ∨ D is a leaf node in Tj. Then make a subsumption move for each such leaf, adding C to it as the child.
2. Suppose this inference is a definition rewriting inference Sj ▷ Sj − {C ∨ ¬P} ∪ {C ∨ D1, …, C ∨ Dn}, such that C ∨ ¬P is a leaf in Tj. Then make an expansion move for each such leaf, adding to it the clauses C ∨ D1, …, C ∨ Dn as children.

3. In all other cases make no move at all.

Let us make several observations about the tree T and the game.

1. Every leaf in any tree Tj is a member of some Sk. This is proved by induction on j. For T1 this holds by the definition of T1, since it contains one leaf D ∈ Si. By our construction, every leaf of any Tj+1 which is not a leaf of Tj is a member of Sj+1. It remains to prove that every node which is a leaf in both Tj and Tj+1 belongs to Sj+1. Take any such node C; by the induction hypothesis it belongs to Sj. Suppose, by contradiction, that C ∉ Sj+1; then C has been removed by either a subsumption or a definition rewriting inference. But by our construction, in both cases C becomes a nonleaf in Tj+1, so we obtain a contradiction.

2. The game is fair. Suppose, by contradiction, that some open leaf is never selected during the game. This leaf contains a clause C ∨ ¬P. Let Tj be any tree containing this open leaf. Then the leaf belongs to all trees Tk for k ≥ j. By the previous item, C ∨ ¬P ∈ Sk for all k ≥ j. Therefore, the clause C ∨ ¬P belongs to the limit of S, but definition rewriting has not been applied to it. This contradicts the fairness of S.

3. Every closed leaf C in T belongs to the limit of S. Suppose this leaf first appears in a tree Tj. Then it belongs to all trees Tk for k ≥ j. By the first item, C ∈ Sk for all k ≥ j, but then C ∈ lim S.

Now consider cl T. Since T was built using a fair game, cl T is a cover for D. But we proved that every element of cl T belongs to the limit of S, hence cl T is also a trace of D in S.

Lemma 17 Let S = S0 ▷ S1 ▷ ⋯ be a fair derivation. Further, let E be a clause derivable from S0 in SR. Then E has a trace in S.

Proof.
The proof is by induction on the derivation of E in SR. When the derivation consists of 0 inferences, E ∈ S0 . By Lemma 16 every clause of every Si has a trace in S, so E has a trace. When the derivation has at least one inference, we consider the last inference of the derivation. By the induction hypothesis, all premises of this inference have traces in S. We consider three cases corresponding to the inference rules of stratified resolution. In all cases, using Lemma 15 we can assume that no tree contains a subsumption move applied to a node which itself is obtained by a subsumption move.
1. The last inference is by positive ordered factoring:

       C ∨ A ∨ A
       ─────────
         C ∨ A
Consider a trace S of C ∨ A ∨ A in S and a tree T from which this trace was obtained. Since A is selected by σ in C, there are only two possibilities for T.

(a) Case: T consists of one node C ∨ A ∨ A. Then {C ∨ A ∨ A} is a trace, and hence (C ∨ A ∨ A) ∈ lim S. Since S is fair, some Si contains the clause C ∨ A obtained from C ∨ A ∨ A by positive ordered factoring. Then by Lemma 16, C ∨ A has a trace in S.

(b) Case: the first move of T is a subsumption move. This subsumption move puts a clause C′ which subsumes C ∨ A ∨ A as the child of C ∨ A ∨ A. Consider two cases.

  i. Case: C′ has the form C″ ∨ A ∨ A, where C″ subsumes C. Since the selection function is subsumption-stable, A is selected in C″ ∨ A ∨ A by σ, so the tree T contains no node below C″ ∨ A ∨ A. Then (C″ ∨ A ∨ A) ∈ lim S. Since S is fair, some Si contains the clause C″ ∨ A obtained from C″ ∨ A ∨ A by positive ordered factoring. By Lemma 16, C″ ∨ A has a trace in S. But C″ ∨ A subsumes C ∨ A, so by Lemma 14 this trace is also a trace of C ∨ A.

  ii. Case: C′ subsumes C ∨ A. Note that cl T is also a trace of C′, so by Lemma 14 cl T is also a trace of C ∨ A.

2. The last inference is by binary ordered resolution with selection:

       C ∨ A    D ∨ ¬A
       ────────────────
            C ∨ D

where A ≻ C and C, D, A contain no positive P-literals. Consider traces S1, S2 of C ∨ A and D ∨ ¬A, respectively, and trees T1, T2 from which these traces were obtained. We consider two simple cases and then the remaining case.

(a) If the first move of T1 is a subsumption move that replaces C ∨ A by a clause C′ that subsumes C, then C′ has a trace in S; but C′ subsumes C, and C subsumes C ∨ D, so by Lemma 14 this trace is also a trace of C ∨ D.

(b) Likewise, if the first move of T2 is a subsumption move that replaces D ∨ ¬A by a clause D′ that subsumes D, then D′ has a trace in S, so by Lemma 14 this trace is also a trace of C ∨ D.

(c) If neither of the previous cases takes place, then either T1 consists of one node or the top move in T1 is a subsumption move:

       C ∨ A
         │
       C′ ∨ A
such that C′ subsumes C. Note that in the latter case, since the selection function is subsumption-stable, A is selected in C′ ∨ A. In both cases the limit of S contains a clause C′ ∨ A such that C′ subsumes C (in the former case we can take C as C′). Likewise, we can prove that the limit of S contains a clause D′ ∨ ¬A such that D′ subsumes D. Since S is fair, some Si contains the clause C′ ∨ D′ obtained from C′ ∨ A and D′ ∨ ¬A by binary ordered resolution with selection. By Lemma 16, C′ ∨ D′ has a trace in S. But C′ ∨ D′ subsumes C ∨ D, so by Lemma 14 C ∨ D has a trace in S.

3. The last inference is by definition unfolding:

       C ∨ P    D ∨ ¬P
       ────────────────
            C ∨ D

where P ∈ P. Consider a trace S of D ∨ ¬P, and a tree T from which this trace was obtained.

(a) If the first move of T is a subsumption move that replaces D ∨ ¬P by a clause D′ that subsumes D, then D′ has a trace in S, so by Lemma 14 this trace is also a trace of C ∨ D.

(b) If the previous case does not take place, there are two possibilities for the top move of T: it is either an expansion move or a subsumption move followed by an expansion move. We consider the former case; the latter case is similar. The top of the tree T has the form

           D ∨ ¬P
        /     │     \
      ...   C ∨ D   ...
Denote the subtree of T rooted at C ∨ D by T′. Note that cl T′ is a cover of C ∨ D. But since T′ is a subtree of T, we have cl T′ ⊆ cl T. Since cl T is a trace, we also have cl T ⊆ lim S, so cl T′ ⊆ lim S, and hence cl T′ is a trace. So C ∨ D has a trace in S. The proof is complete.

We can now easily complete the proof of completeness of stratified resolution with redundancies.

Proof (of Theorem 12, continued). Suppose S0 is unsatisfiable. Take any fair derivation S = S0 ▷ S1 ▷ ⋯. By Theorem 9 there exists a derivation of the empty clause □ from clauses in S0 by stratified resolution. By Lemma 17, □ has a trace in S, i.e., a cover whose members belong to the limit of S. By the definition of a cover, □ has only one cover, {□}. Then □ belongs to the limit of S, and hence S is successful.

The proof can be easily modified for a system with redundancies in which clauses containing positive P-literals can also be subsumed and in which tautologies are deleted.
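The ground definition rewriting inferences manipulated throughout this proof can be sketched concretely. The following is hypothetical code (clause sets as Python frozensets of signed ground atoms), not the authors' implementation: a clause C ∨ ¬P, for a defined atom P, is replaced by the clauses C ∨ Di obtained from all clauses P ∨ Di in the current set.

```python
# Hypothetical ground sketch of a definition rewriting inference:
# S ▷ (S − {C ∨ ¬P}) ∪ {C ∨ D1, ..., C ∨ Dn}, where P ∨ D1, ..., P ∨ Dn
# are all the clauses in S containing the positive defined atom P.
# Clauses are frozensets of ('+'/'-', atom) literals.

def definition_rewrite(S, clause, p):
    """Rewrite `clause` = C ∨ ¬p on the defined ground atom p."""
    assert ("-", p) in clause
    C = clause - {("-", p)}
    bodies = [d - {("+", p)} for d in S if ("+", p) in d]   # the D_i of P ∨ D_i
    return (S - {clause}) | {C | body for body in bodies}

S = {
    frozenset({("+", "p"), ("-", "q1")}),   # p ∨ ¬q1 (a definition of p)
    frozenset({("+", "p"), ("-", "q2")}),   # p ∨ ¬q2 (another definition of p)
    frozenset({("-", "p"), ("+", "r")}),    # r ∨ ¬p  (the clause to rewrite)
}
S2 = definition_rewrite(S, frozenset({("-", "p"), ("+", "r")}), "p")
for c in sorted(sorted(cl) for cl in S2):
    print(c)
```

Note that, as in the system studied here, the rewritten clause is removed while the defining clauses remain in the set.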
4 Nonground Case
In this section we introduce the general, nonground, version of stratified resolution with redundancies. Most of the definitions given below are obtained by a simple modification of the definitions for the ground case. In this section we treat L, P, and Q as sets of relations rather than atoms. To define stratification, we use a precedence relation ≻L, i.e., a total ordering on L. We call a P-literal any literal whose relation belongs to P, and similarly for other sets of relations instead of P. The notions of Horn set w.r.t. P, definite clause w.r.t. P, stratification, and clause admitting a stratification are the same as in the ground case.

Now we have to define an ordering corresponding to our notion of stratification. We require ≻ to be a well-founded ordering on atoms, stable under substitutions (i.e., A ≻ B implies Aθ ≻ Bθ) and total on ground atoms. In addition, we require that ≻ is compatible with the precedence relation in the following sense: if A1 ≻L A2, then A1(t1, …, tn) ≻ A2(s1, …, sm) for all relations A1, A2 and terms t1, …, tn and s1, …, sm. It is obvious how to modify recursive path orderings or Knuth–Bendix orderings so that they satisfy this requirement.

In the definition of selection function for the nonground case we revert, in fact, to one of the earlier notions given in [4]. We define a (nonground) selection function to be a function σ on the set of clauses such that (i) σ(C) is a set of literals in C, (ii) σ(C) is nonempty whenever C is nonempty, and (iii) at least one negative literal or otherwise all maximal literals must be selected. Similarly to the ground case we assume that selection functions are subsumption-stable in the following sense: if a literal L is selected in C ∨ L and D is a submultiset of C, then L is selected in D ∨ L. Let us fix an ordering ≻, a selection function σ, and stratification (2).
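As an illustration of the compatibility requirement, here is a sketch (hypothetical code, not from the paper) of a ground-atom ordering that consults the precedence of the head relations first and compares arguments only on ties. A full recursive path or Knuth–Bendix ordering would refine the tie-breaking; the precedence-first shape is the point.

```python
# Hypothetical sketch of an atom ordering compatible with a precedence >_L:
# if rel(A1) >_L rel(A2) then A1 > A2 for ANY argument lists, as required.
# Atoms are (relation, args); terms are strings or (functor, args) tuples.

def term_size(t):
    if isinstance(t, tuple):                    # compound term (functor, args)
        return 1 + sum(term_size(a) for a in t[1])
    return 1                                    # constant

def atom_greater(a1, a2, precedence):
    """True iff a1 > a2 on ground atoms; `precedence` maps relation -> rank."""
    (rel1, args1), (rel2, args2) = a1, a2
    if precedence[rel1] != precedence[rel2]:
        return precedence[rel1] > precedence[rel2]      # precedence decides
    # Same relation: total tie-break by size, then a fixed lexicographic order.
    key1 = (sum(term_size(t) for t in args1), tuple(repr(t) for t in args1))
    key2 = (sum(term_size(t) for t in args2), tuple(repr(t) for t in args2))
    return key1 > key2

prec = {"q": 0, "p": 1}                          # p >_L q
big_q = ("q", (("f", (("f", ("a",)),)),))        # q(f(f(a))), large arguments
assert atom_greater(("p", ("a",)), big_q, prec)  # p(a) > q(f(f(a))) regardless of size
```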
Definition 18 (Stratified Resolution with Redundancies) The inference system of stratified resolution with redundancies, denoted SRR, consists of the following inference rules.

1. Positive ordered factoring is the following inference rule: S ▷ S ∪ {(C ∨ A1)θ}, such that (i) S contains a clause C ∨ A1 ∨ A2, (ii) A1, A2 are positive literals, and (iii) θ is a most general unifier of A1 and A2.

2. Binary ordered resolution with selection is the following inference rule: S ▷ S ∪ {(C ∨ D)θ}, such that (i) S contains clauses C ∨ A1 and D ∨ ¬A2, (ii) A1θ ∉ Cθ, (iii) C, D, A1 contain no positive P-literals, and (iv) θ is a most general unifier of A1 and A2.
3. Definition rewriting. Suppose (C ∨ ¬P) ∈ S, where P ∈ P. Furthermore, suppose that P1 ∨ D1, …, Pk ∨ Dk are all clauses in S such that Pi is unifiable with P. Then S ▷ (S − {C ∨ ¬P}) ∪ {(C ∨ D1)θ1, …, (C ∨ Dk)θk}, where each θi is a most general unifier of P and Pi, is an inference by definition rewriting.

Theorem 19 Let S be a set of clauses Horn w.r.t. P. Let, in addition, every clause in S definite w.r.t. P admit stratification (2). Then S is unsatisfiable if and only if every fair derivation from S in the system SRR contains the empty clause.

The proof can be obtained by standard lifting, with special care taken over the selection function.
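To make the role of the most general unifier in these rules concrete, here is a small sketch of positive factoring with unification (hypothetical code; the ordering and selection side conditions of Definition 18, and the occurs check, are omitted for brevity).

```python
# Hypothetical sketch of positive factoring with a most general unifier:
# from C ∨ A1 ∨ A2 derive (C ∨ A1)θ with θ = mgu(A1, A2).
# Variables are capitalized strings; compound terms are (functor, args) tuples.

def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def walk(t, s):
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(t1, t2, s):
    """Extend substitution s to unify t1 and t2, or return None."""
    t1, t2 = walk(t1, s), walk(t2, s)
    if t1 == t2:
        return s
    if is_var(t1):
        return {**s, t1: t2}
    if is_var(t2):
        return {**s, t2: t1}
    if (isinstance(t1, tuple) and isinstance(t2, tuple)
            and t1[0] == t2[0] and len(t1[1]) == len(t2[1])):
        for a, b in zip(t1[1], t2[1]):
            s = unify(a, b, s)
            if s is None:
                return None
        return s
    return None

def apply_sub(t, s):
    t = walk(t, s)
    if isinstance(t, tuple):
        return (t[0], tuple(apply_sub(a, s) for a in t[1]))
    return t

def positive_factors(clause):
    """Literals are (sign, relation, args); a clause is a tuple of literals."""
    for i, (s1, r1, a1) in enumerate(clause):
        for j, (s2, r2, a2) in enumerate(clause):
            if i < j and s1 == s2 == "+" and r1 == r2:
                theta = unify(("args", a1), ("args", a2), {})  # unify argument lists
                if theta is not None:
                    rest = clause[:j] + clause[j + 1:]         # drop A2, keep A1
                    yield tuple((sg, r, tuple(apply_sub(t, theta) for t in ar))
                                for sg, r, ar in rest)

# p(X, b) ∨ p(a, Y) factors to p(a, b)
clause = (("+", "p", ("X", "b")), ("+", "p", ("a", "Y")))
print(list(positive_factors(clause)))   # [(('+', 'p', ('a', 'b')),)]
```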
5 How to Select a Stratification
Example 7 shows that a set of clauses may admit several different stratifications. How can one choose a “good” stratification? Suppose that a set of clauses S is Horn w.r.t. P. Then we can always use the stratification ∅ ≻ P ≻ Q, in which any P-atom is greater than any Q-atom. Unfortunately, this stratification may not be good enough, since it gives us too little choice for selecting positive Q-literals. Let us illustrate this on the clauses of Example 1. Assume that P is {split}. We obtain the stratification ∅ ≻ {split} ≻ {deduction, conference_list}. This stratification does not allow us to select the literal conference_list(x) in ¬split(x, y, z) ∨ conference_list(x), while intuitively it should be the right selection.

This observation shows that it can be better to split the set of atoms into as many strata as possible. Then we have more options for selecting positive Q-atoms in clauses. In Example 1, a more flexible stratification would be {conference_list} ≻ {split} ≻ {deduction}. We are planning experiments with the choice of stratification using our theorem prover Vampire [9].
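The "as many strata as possible" idea can be sketched as a topological sort of a relation dependency graph (hypothetical code and example; the paper does not prescribe this construction): each head relation is placed strictly above the relations its defining clauses use, giving one stratum per relation whenever the dependencies allow it.

```python
# Hypothetical sketch of a "finest" stratification: order relations so that
# each head relation sits above the relations in its clause bodies.

from graphlib import TopologicalSorter   # Python 3.9+

def finest_stratification(clauses):
    """clauses: iterable of (head_relation, body_relations).
    Returns relations from the lowest stratum to the highest."""
    deps = {}
    for head, body in clauses:
        deps.setdefault(head, set()).update(body)
        for b in body:
            deps.setdefault(b, set())
    return list(TopologicalSorter(deps).static_order())

# Hypothetical program: p defined from q, q defined from r.
print(finest_stratification([("p", ["q"]), ("q", ["r"])]))   # ['r', 'q', 'p']
```

On cyclic (recursive) definitions `TopologicalSorter` raises an error, so mutually recursive relations would have to share a stratum; handling that refinement is left out of this sketch.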
6 Conclusion
We mention several interesting open problems associated with stratified resolution. In our proofs, we could not get rid of the annoying condition that the selection function be subsumption-stable.

Problem 1 Is stratified resolution with redundancies complete with selection functions that are not subsumption-stable?

It is quite possible that this can be done, but it may require a different proof technique.

Problem 2 Find new techniques for proving completeness of stratified resolution with redundancies.

Problem 3 Find a powerful generalization of definite resolution for logic with equality.

There is quite a simple generalization based on the following transformation of clauses definite w.r.t. P: we can replace any clause P(t1, …, tn) :- C by P(x1, …, xn) :- x1 = t1, …, xn = tn, C. Then we can define stratified resolution and prove its completeness in essentially the same way as before. However, in this case any clause containing ¬P(s1, …, sn) will be unifiable with the head of any clause defining P, so the gain from using definition rewriting is no longer obvious.

The standard semantics of stratified logic programs are based on nonmonotonic reasoning. Stratified resolution makes one think of a logic that combines nonmonotonic reasoning with monotonic resolution-based reasoning. Such a logic, its semantics, and ways of automated reasoning in it should be investigated. So we have

Problem 4 Find a combination of stratified resolution with nonmonotonic logics.

Stratified resolution differs from ordered resolution with negative selection in that it allows one to select heads of clauses even when they are not maximal in their clauses. It is interesting to see whether this can give us new decision procedures for decidable fragments of the predicate calculus. So we have

Problem 5 Find new decision procedures based on stratified resolution.
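The clause transformation mentioned under Problem 3 is mechanical; here is a sketch on a string-level clause representation (hypothetical code, with fresh variable names assumed not to occur in the clause):

```python
# Hypothetical sketch of the Problem 3 transformation:
# replace  P(t1, ..., tn) :- C   by   P(x1, ..., xn) :- x1 = t1, ..., xn = tn, C.

def flatten_head(head_args, body):
    """head_args: list of term strings; body: list of body-literal strings.
    The fresh names X1, X2, ... are assumed not to clash with the clause."""
    fresh = [f"X{i}" for i in range(1, len(head_args) + 1)]
    equations = [f"{x} = {t}" for x, t in zip(fresh, head_args)]
    return fresh, equations + body

head, body = flatten_head(["f(a)", "Y"], ["q(Y)"])
print(head, body)   # ['X1', 'X2'] ['X1 = f(a)', 'X2 = Y', 'q(Y)']
```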
7 Recent Developments
Recently, Robert Nieuwenhuis proposed an interesting new method for proving completeness of stratified resolution. His method gives a positive answer to Problem 1: the restriction that selection functions be subsumption-stable is not
necessary. In a way, it also answers Problem 2, since his technique is rather different from ours. He also proposed a more concise definition of stratification and of clauses admitting a stratification, in terms of well-quasi-orderings.

Acknowledgments. The idea of this paper arose from discussions the second author had during CADE-16 with Harald Ganzinger and Tanel Tammet. Harald Ganzinger explained that the Saturate system tries to use an ordering under which the heads of “definition-like” clauses are the greatest in their clauses, and hence can be selected. This allows one to use definitions in the way they are often used in mathematics. Tanel Tammet explained that he had tried to implement in his prover Gandalf “rewriting on the clause level”, again for clauses that look like definitions, though the implementation was not ready by CASC-16.
References

1. K.R. Apt and H.A. Blair. Towards a theory of declarative knowledge. In J. Minker, editor, Foundations of Deductive Databases and Logic Programming, pages 89–148. Morgan Kaufmann, 1988.
2. L. Bachmair and H. Ganzinger. A theory of resolution. Research report MPI-I-97-2-005, Max-Planck-Institut für Informatik, Saarbrücken, Germany, 1997. To appear in Handbook of Automated Reasoning, edited by A. Robinson and A. Voronkov.
3. L. Bachmair and H. Ganzinger. A theory of resolution. In A. Robinson and A. Voronkov, editors, Handbook of Automated Reasoning. Elsevier Science and MIT Press, 2000. To appear.
4. L. Bachmair, H. Ganzinger, C. Lynch, and W. Snyder. Basic paramodulation. Information and Computation, 121:172–192, 1995.
5. A. Van Gelder. Negation as failure using tight derivations for general logic programs. In J. Minker, editor, Foundations of Deductive Databases and Logic Programming, pages 149–177. Morgan Kaufmann, 1988.
6. R.A. Kowalski and D. Kuehner. Linear resolution with selection function. Artificial Intelligence, 2:227–260, 1971.
7. C. Lynch. Oriented equational logic is complete. Journal of Symbolic Computation, 23(1):23–45, 1997.
8. T.C. Przymusinski. On the declarative semantics of deductive databases and logic programs. In J. Minker, editor, Foundations of Deductive Databases and Logic Programming, pages 193–216. Morgan Kaufmann, 1988.
9. A. Riazanov and A. Voronkov. Vampire. In H. Ganzinger, editor, Automated Deduction—CADE-16, 16th International Conference on Automated Deduction, volume 1632 of Lecture Notes in Artificial Intelligence, pages 292–296, Trento, Italy, July 1999.
10. A. Voronkov. How to optimize proof-search in modal logics: a new way of proving redundancy criteria for sequent calculi. Preprint 1, Department of Computer Science, University of Manchester, February 2000. To appear in LICS 2000.
System Description: SystemOnTPTP

Geoff Sutcliffe
School of Information Technology, James Cook University, Townsville, Australia
[email protected]
Abstract. SystemOnTPTP is a WWW interface that allows an ATP problem to be easily and quickly submitted in various ways to a range of ATP systems. The interface uses a suite of currently available ATP systems. The interface allows the problem to be selected from the TPTP library, or a problem written in TPTP syntax to be provided by the user. The problem may be submitted to one or more of the ATP systems in sequence, or may be submitted via the SSCPA interface to multiple systems in parallel. SystemOnTPTP can also provide system recommendations for a problem.
SystemOnTPTP is a WWW interface that allows an ATP problem to be easily and
quickly submitted in various ways to a range of ATP systems. The interface uses a suite of currently available ATP systems, which are maintained in a database structure. The interface is generated directly from the database, and thus is as current as the database. The interface is a single WWW page, in three parts: the problem specification part, the mode specification part, and the system selection part. A user
Fig. 1. Problem specification
D. McAllester (Ed.): CADE-17, LNAI, pp. 406-410, 2000. © Springer-Verlag Berlin Heidelberg 2000
System Description: SystemOnTPTP
407
specifies the problem first, optionally selects systems for use, and then specifies the mode of use.

Figure 1 shows the problem specification part of the WWW page. There are three ways that the problem can be specified. First, a problem in the TPTP library [SS98] can be specified by name. The interface provides a tool for browsing the TPTP problems if desired. Second, a file, containing a problem written in TPTP syntax, on the same computer as the WWW browser can be specified for uploading. The interface provides an option for browsing the local disk and selecting the file. Third, the problem formulae can be provided directly in a text window. Links to example TPTP files are provided to remind the user of the TPTP syntax if required.

Figure 2 shows the start of the system selection part of the interface. There is one line for each system in the suite, indicating the system name and version, a default CPU time limit, the default tptp2X transformations for the system, the default tptp2X format for the system, and the default command line for the system. To select a system the user selects the corresponding tickbox. The default values in the text fields can be modified, if required.

Fig. 2. System selection
Figure 3 shows the mode specification part of the interface. The lefthand side contains information and the RecommendSystems button for obtaining system recommendations for the specified problem. System recommendations are generated

Fig. 3. Mode specification
408
Geoff Sutcliffe
as follows: ATP problems have been classified into 14 disjoint specialist problem classes (SPCs) according to problem characteristics such as effective order, use of equality, and syntactic form. In a once-off analysis phase for each SPC, performance data for ATP systems (not necessarily all in the suite), for some carefully selected TPTP problems in the SPC, is analyzed. Systems that solve a subset of the problems solved by another system are discarded. The remaining systems are recorded as recommended, in order of the number of problems solved in the SPC. Later, at run time, when system recommendations for a specified problem (not necessarily one of those used in the analysis phase, or even one from the TPTP) are requested, the problem is classified into its SPC and the corresponding system recommendations are returned. The righthand side of the mode specification part of the interface gives information, options, and the submit buttons for using the ATP systems sequentially or in parallel. When a problem is submitted using either of these submit buttons, the ATP systems are executed on a server at James Cook University. Due to resource restrictions, only one public user may submit at a time. A password field is provided that allows privileged users to submit at any time. The Output mode options specify how much output is returned during processing. In Quiet mode only the final result is returned, giving the type of result and the time taken. The result is either that a proof was found, that it was established that no proof can be found, or that the systems gave up trying for an unknown reason. In Interesting mode information about the progress of the submission is returned; see below for an example. In Verbose mode a full trace of the submission is returned, including the problem in TPTP format, the problem after transformation and formatting for the systems, and all standard output produced by the systems.
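The subset-discarding step of the recommendation analysis can be sketched in a few lines (hypothetical code and performance data, not the actual SystemOnTPTP scripts):

```python
# A sketch of the recommendation analysis: systems whose solved set is a
# subset of another system's are discarded; the rest are ranked by number
# of problems solved. (Two systems with identical solved sets would discard
# each other; the real analysis presumably breaks such ties.)

def recommend(perf):
    """perf: dict mapping system name -> set of solved problem names."""
    kept = [s for s in perf
            if not any(s != t and perf[s] <= perf[t] for t in perf)]
    return sorted(kept, key=lambda s: len(perf[s]), reverse=True)

perf = {
    "SystemA": {"P1", "P2", "P3"},
    "SystemB": {"P2", "P3"},        # subset of SystemA's problems: discarded
    "SystemC": {"P3", "P4"},
}
print(recommend(perf))              # ['SystemA', 'SystemC']
```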
The RunSelectedSystems button sequentially gives the specified problem to each of the systems selected in the system selection part of the interface. For each selected system, the problem is transformed and formatted using the tptp2X utility as specified in the system selection. The transformed and formatted problem is given to the system using the specified command line, with a CPU time limit as specified for the system. The Parallel mode options specify the type of parallelism to use when a problem is submitted using the RunParallel button. All of the parallel modes perform competition parallelism [SS94], i.e., multiple systems are run in parallel on the machine (using UNIX multitasking if there are fewer available CPUs than systems) and when any one gets a deciding result all of the systems are killed. The differences between the modes are which systems are used and the individual time limits imposed on each system’s execution. A limit on the total CPU time that can be taken by the executing systems is specified in the seconds field of the interface. In Naive selected mode all of the systems selected in the system selection part of the interface are run in parallel with equal CPU time limits (the appropriate fraction of the total time limit). In Naive mode the specified number of systems, taken in alphabetical order from the selection list, are run in parallel with equal time limits. In SSCPA mode the system recommendation component is used to get system recommendations for the specified problem. The suite of systems is then checked for versions of the recommended systems, in order of recommendation, until the specified number of systems have been found, or the recommendations are exhausted. The systems are then run in parallel with equal time limits.
In Eager SSCPA mode the systems used are the same as for SSCPA mode, but the individual system time limits are calculated by repeatedly dividing the total time limit by two, and allocating the values to the systems in order. In this manner the
highest recommended system gets half the total time limit, the next system gets a quarter, and so on, with the last two systems getting equal time limits. The motivations and effects of these parallel modes are discussed in [SS99]. The effectiveness of SSCPA was demonstrated in the CADE-16 ATP System Competition [Sut00]. Figure 4 shows the interesting output from submission of the TPTP problem CID003-1 in the Eager SSCPA mode. The execution is first transferred from the WWW server onto a SUN workstation where the ATP systems are installed. The system recommendation component is then invoked. The problem is identified as being real 1st order, having some equality, in CNF, and Horn. Five systems are recommended for the SPC: E 0.32, E-SETHEO 99csp, Vampire 0.0, OtterMACE 437, and Gandalf
Fig. 4. Interesting output for CID003-1 in Eager SSCPA mode
c-1.0d. The suite of systems is then checked for versions of these systems, and it is found that versions of four of them, E 0.51, Vampire 0.0, OtterMACE 437, and Gandalf c-1.0d, are in the suite. The submission required three systems, so E 0.51, Vampire 0.0, and OtterMACE 437 are used. The individual system time limits out of the total specified limit of 300 seconds are then computed, 150 seconds for E 0.51 and 75 seconds each for Vampire 0.0 and OtterMACE 437. The problem is then transformed and formatted for each of the systems, and the systems are run in parallel. E 0.51 finds a proof after 39.8 seconds CPU time, 62.2 seconds wall clock time, at which stage all of the systems are killed.
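The Eager SSCPA time allocation can be sketched in a few lines (hypothetical code, consistent with the 300 second example above: 150 seconds for the top system, then 75 and 75):

```python
# Sketch of the Eager SSCPA time allocation: halve the remaining total for
# each successive system, with the last two systems sharing equal limits.

def eager_limits(total, n):
    limits, remaining = [], total
    for _ in range(n - 1):
        remaining /= 2
        limits.append(remaining)
    limits.append(remaining)        # the final two systems get equal limits
    return limits

print(eager_limits(300, 3))         # [150.0, 75.0, 75.0]
```

Note that the slices always sum to the total, since the last slice is duplicated instead of halved again.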
SystemOnTPTP is implemented by Perl scripts that generate the WWW interface from the system database, accept the submission from the browser, extract the system recommendations, invoke the tptp2X utility to transform and reformat the problem, and control the execution of the systems. The interface is available at: http://www.cs.jcu.edu.au/cgi-bin/tptp/SystemOnTPTPFormMaker
SystemOnTPTP makes it easy for users to quickly submit a problem in TPTP syntax to an appropriate ATP system. The user is absolved of the responsibilities and chores of selecting systems to use, installing the systems, transforming and formatting the problem for the systems, and controlling their execution. This user-friendly environment is particularly appropriate for ATP system users who want to focus on the problem content rather than the mechanisms of ATP. The interface is not designed for, and is therefore not suitable for, users who wish to submit a batch of problems to a particular ATP system. Such users should obviously install that ATP system on their own computer, which would also allow use of the system’s own input format rather than the TPTP format. ATP system developers are invited to submit their systems and performance data for inclusion and use in the interface.

References

SS94 Suttner C.B., Schumann J. (1994), Parallel Automated Theorem Proving, in Kanal L., Kumar V., Kitano H., Suttner C., Parallel Processing for Artificial Intelligence 1, pp. 209–257, Elsevier Science.
SS98 Sutcliffe G., Suttner C.B. (1998), The TPTP Problem Library: CNF Release v1.2.1, Journal of Automated Reasoning 21(2), pp. 177–203.
SS99 Sutcliffe G., Seyfang D. (1999), Smart Selective Competition Parallelism ATP, in Kumar A., Russell I., Proceedings of the 12th Florida Artificial Intelligence Research Symposium (Orlando, USA), pp. 341–345, AAAI Press.
Sut00 Sutcliffe G. (To appear), The CADE-16 ATP System Competition, Journal of Automated Reasoning.
System Description: PTTP+GLiDeS
Semantically Guided PTTP

Marianne Brown and Geoff Sutcliffe
School of Information Technology, James Cook University
{marianne,geoff}@cs.jcu.edu.au
http://www.cs.jcu.edu.au/~marianne/GLiDeS.html
Introduction

PTTP+GLiDeS is a semantically guided linear deduction theorem prover, built from PTTP [9] and MACE [7]. It takes problems in clause normal form (CNF), generates semantic information about the clauses, and then uses that semantic information to guide its search for a proof. In the last decade there has been some work in the area of semantic guidance, across a variety of first-order theorem proving paradigms: SCOTT [8] is based on OTTER and is a forward chaining resolution system; CLIN-S [3] uses hyperlinking; RAMCS [2] uses constrained clauses to allow it to search for proofs and models simultaneously; and SGLD [11] is a chain format linear deduction system based on Graph Construction. Of these, CLIN-S and SGLD need to be supplied with semantics by the user. SCOTT uses FINDER [8] to generate models, and RAMCS generates its own models.

The Semantic Guidance

PTTP+GLiDeS uses a semantic pruning strategy based on one that can be applied to linear-input deductions. In a completed linear-input refutation, all centre clauses are FALSE in all models of the side clauses. This leads to a semantic pruning strategy that, at every stage of a linear-input deduction, requires all centre clauses in the deduction so far to be FALSE in a model of the side clauses. To implement this strategy it is necessary to know which clauses are the potential side clauses, so that a model can be built. A simple possibility is to choose a negative top clause from a set of Horn clauses, in which case the mixed clauses are the potential side clauses. More sensitive analysis is also possible [4,10]. Linear-input deduction, and hence this pruning strategy, is complete only for Horn clauses. Unfortunately, the extension of this pruning strategy to linear deduction, which is also complete for non-Horn clauses, is not direct: the possibility of ancestor resolutions means that centre clauses may be TRUE in a model of the side clauses.
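The FALSE-in-a-model condition at the heart of this pruning can be illustrated with a small sketch (our own illustrative code, not part of PTTP+GLiDeS). An interpretation is given by the set of ground atoms that are TRUE, and a ground clause is FALSE exactly when all of its literals are:

```python
# Model from the paper's example: {money, tickets(buy), ~tickets(sell)}.
# Atoms are (predicate, args) pairs; only listed atoms are TRUE.
M = {("money", ()), ("tickets", ("buy",))}

def literal_true(lit, model):
    positive, atom = lit
    return (atom in model) == positive

def clause_false(clause, model):
    """A ground clause (a list of (sign, atom) literals) is FALSE in
    the model iff every one of its literals evaluates to FALSE."""
    return all(not literal_true(lit, model) for lit in clause)

# ~money v ~tickets(buy): both atoms are TRUE, so both negative
# literals are FALSE, and the clause is FALSE in M.
c = [(False, ("money", ())), (False, ("tickets", ("buy",)))]
print(clause_false(c, M))   # True

# money v tickets(sell) is not FALSE in M (money is TRUE).
d = [(True, ("money", ())), (True, ("tickets", ("sell",)))]
print(clause_false(d, M))   # False
```

The pruning strategy rejects any deduction step whose centre clause fails this test against the model of the side clauses.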
D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 411-416, 2000. © Springer-Verlag Berlin Heidelberg 2000

In PTTP+GLiDeS, rather than placing a constraint on entire centre clauses, a semantic constraint is placed on certain literals of the centre clauses: The input clauses other than the chosen top clause of a linear deduction are named
the model clauses. In a completed linear refutation, all centre clause literals that have resolved against input clause literals are required to be FALSE in a model of the model clauses. TRUE centre clause literals must be resolved against ancestor clause literals.

PTTP+GLiDeS implements linear deduction using the Model Elimination [6] (ME) paradigm. ME uses a chain format, where a chain is an ordered list of A- and B-literals. The disjunction of the B-literals is the clause represented by the chain. Input chains are generated from the input clauses and are composed entirely of B-literals. The chains that form the linear sequence are called the centre chains. A-literals are used in centre chains to record information about ancestor clauses in the deduction. The input chains that are resolved against centre chains are called side chains. PTTP+GLiDeS maintains a list of all the A-literals created throughout the entire deduction, called the A-list. The pruning strategy requires that at every stage of the deduction there must exist at least one ground instance of the A-list that is FALSE in a model of the model clauses. The result is that only FALSE B-literals are extended upon, and TRUE B-literals must reduce. Figure 1 shows an example of a PTTP+GLiDeS refutation. The problem clauses are {∼money ∨ tickets(buy), ∼tickets(sell) ∨ money, money ∨ tickets(X), ∼money ∨ ∼tickets(X)}. The clause ∼money ∨ ∼tickets(X) is chosen to form the top chain, so that the other three clauses are the model clauses. The model M is {money, tickets(buy), ∼tickets(sell)}. The A-list is shown in braces under the centre chains.

Since the work described in [1], PTTP+GLiDeS has been enhanced to order the use of side chains, using the model of the model clauses. The model is used to give a score to each clause as follows: if there are N ground domain instances of a clause C with k literals, then for each literal L, nL is the number of TRUE instances of L within the N ground instances.
L is given the score nL/N. The score for the clause C is (1/(N·k)) · Σ_{L=1..k} nL. The clause set is then ordered in descending order of scores. This gives preference to clauses that have more TRUE literal instances in the model. The use of these clauses leads to early pruning and forces the deduction into areas more likely to lead to a proof.

Implementation

PTTP+GLiDeS consists of a modified version of PTTP version 2e and MACE v1.3.3, combined with a csh script. It requires the problem to be presented in both PTTP and OTTER formats. The OTTER format file is processed so that it contains only the model clauses, and is used by MACE. Initially the domain size for the model to be generated by MACE is set to equal the number of constants in the problem. If a model of this size cannot be found, the domain size is reset to 2 and MACE is allowed to determine the domain size. If no model is found, PTTP+GLiDeS exits. If a model is found, the modified PTTP is started. The modified PTTP uses the model to reorder the clause set, then transforms the reordered clauses into Prolog procedures that implement the ME deduction and maintain the A-list.
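The clause scoring and descending reordering described above can be sketched in executable form (an illustrative reconstruction with our own data representation, not the PTTP+GLiDeS code; literals are (sign, predicate, args) triples, and variables are capitalised strings, Prolog-style):

```python
from itertools import product

def is_var(t):
    # Prolog-style convention: variables start with an uppercase letter.
    return isinstance(t, str) and t[:1].isupper()

def clause_vars(clause):
    return sorted({a for _, _, args in clause for a in args if is_var(a)})

def literal_true(lit, model):
    sign, pred, args = lit
    return ((pred, args) in model) == sign

def score(clause, domain, model):
    """(1/(N*k)) * sum over the k literals L of n_L, where n_L counts
    the TRUE ground instances of L among the N ground instances."""
    vs = clause_vars(clause)
    insts = list(product(domain, repeat=len(vs)))   # the N ground instances
    N, k = len(insts), len(clause)
    total = 0
    for sign, pred, args in clause:
        for inst in insts:
            env = dict(zip(vs, inst))
            gargs = tuple(env.get(a, a) for a in args)
            total += literal_true((sign, pred, gargs), model)
    return total / (N * k)

# Model from the paper's example: {money, tickets(buy), ~tickets(sell)}.
domain = ("buy", "sell")
model = {("money", ()), ("tickets", ("buy",))}
c1 = [(True, "money", ()), (True, "tickets", ("X",))]     # money v tickets(X)
c2 = [(False, "money", ()), (True, "tickets", ("buy",))]  # ~money v tickets(buy)
# Descending order of score prefers clauses with more TRUE literal instances.
ordered = sorted([c1, c2], key=lambda c: score(c, domain, model), reverse=True)
```

Here score(c1, ...) is 3/4 and score(c2, ...) is 1/2, so money ∨ tickets(X) would be tried first.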
[Figure 1: the sequence of centre chains in the example refutation, with the A-list shown in braces under each chain; extension, reduction, and truncation steps are marked, including one extension that fails the semantic check and is rejected with backtracking.]
Fig. 1. A PTTP+GLiDeS refutation

A semantic check is performed on the A-list after each extension and reduction operation. If the A-list does not have an instance in which every literal evaluates to FALSE in the model provided by MACE, then the extension or reduction is rejected.

Performance

Testing was carried out on 541 "difficult" problems from the TPTP problem library [12] v2.1.0. Both PTTP and PTTP+GLiDeS were tested on the same problems under the same conditions. Experiments were carried out on a SunSPARC 20 server using ECLiPSe v3.7.1 as the Prolog engine. A CPU time limit of 300
seconds was used. The results are summarized in Table 1. MACE failed to generate a model in 272 cases, and so PTTP+GLiDeS could not attempt those problems. On the 269 problems for which models were generated, the worst performance is on Horn problems: all of the problems solved by PTTP and not by PTTP+GLiDeS are Horn. MACE tends to generate trivial models (all positive literals TRUE) for Horn problems. If the top centre clause is negative then, for a Horn clause set, a trivial model does not lead to any pruning. With the additional overhead of the semantic checking, this leads to poor performance by PTTP+GLiDeS. Of the 269 models produced by MACE, 155 were effective, i.e., resulted in some pruning. Of the problems with effective models solved by both PTTP and PTTP+GLiDeS, in 13 out of 18 cases PTTP+GLiDeS had a lower inference count, in some cases significantly lower: the average number of inferences for PTTP+GLiDeS is about 2.5 times smaller than that of PTTP. This shows that the pruning is having a positive effect. PTTP+GLiDeS performs best on non-Horn problems. Table 2 shows some results for non-Horn problems where PTTP+GLiDeS performed better than PTTP. For these problems even trivial models can be of assistance.

Table 1. Summary of experimental data (counts given as total (Horn/Non-Horn)).

Total number of problems:              541 (311/230)
CPU time limit:                        300 s
Number of models generated:            269 (227/42)
Number of problems solved from 269:    PTTP 66 (60/6)    PTTP+GLiDeS 59 (51/8)
Number of effective models generated:  155 (120/35)
Number of problems solved from 155:    PTTP 21 (16/5)    PTTP+GLiDeS 20 (13/7)

For the 18 problems (from 155) solved by both systems:
                                        PTTP         PTTP+GLiDeS
Average CPU time:                       34.24        69.18
Average number of inferences:           119634.28    47812.22
Average number of rejected inferences:  -            3896.78
With respect to the ordering of the clause set, experiments have been carried out using both ascending and descending order of truth score. Initially it was thought that ordering the clause set in ascending order of truth score (from 'less TRUE' to 'more TRUE') would lead the search away from pruning and therefore towards the proof. This turns out not to be the case. While the results are not statistically significantly different in terms of inferences and rejected inferences, descending ordering solved 4 more problems overall, of which 3 had effective models. As solving problems is the most significant measure of a theorem prover's ability, this suggests that pruning early is more effective.
Table 2. Results for some non-Horn problems where PTTP+GLiDeS outperforms PTTP.

                          PTTP+GLiDeS                                PTTP
Problem    Model          CPU Time  Inferences  Rejected Inferences  CPU Time  Inferences
CAT003-3   non-Trivial    68.5      64232       10451                TIMEOUT   -
CAT012-3   Trivial        32.0      49220       4174                 54.6      175367
GRP008-1   Trivial        248.3     404198      3937                 TIMEOUT   -
SYN071-1   non-Trivial    70.1      84908       27653                262.8     832600
Conclusion

In those cases where a strongly effective model has been obtained, results are good. This leads to the question: what makes a model effective? At present the first model generated by MACE is used. If the characteristics of a strongly effective model can be quantified, then it should be possible to generate many models and select the one most likely to give good performance. PTTP is not a high performance implementation of ME, and thus the performance of PTTP and PTTP+GLiDeS is somewhat worse than that of current state-of-the-art ATP systems. This work has used PTTP to establish the viability of the semantic pruning strategy. It is planned to implement the pruning strategy in the high performance ME implementation SETHEO [5] in the near future. On the completeness issue, this prover prunes away proofs which contain complementary A-literals on different branches of the tableau. In the few cases examined to date, another proof that conforms to this extended admissibility rule has always been found. Whether there is always another such proof is not known.
References

1. M. Brown and G. Sutcliffe. PTTP+GLiDeS: Guiding Linear Deductions with Semantics. In N. Foo, editor, Advanced Topics in Artificial Intelligence: 12th Australian Joint Conference on Artificial Intelligence, AI'99, number 1747 in LNAI, pages 244-254. Springer-Verlag, 1999.
2. R. Caferra and N. Peltier. Extending Semantic Resolution via Automated Model Building: Applications. In C.S. Mellish, editor, Proceedings of the 14th International Joint Conference on Artificial Intelligence, pages 328-334. Morgan Kaufmann, 1995.
3. H. Chu and D. Plaisted. Semantically Guided First-order Theorem Proving using Hyper-linking. In A. Bundy, editor, Proceedings of the 12th International Conference on Automated Deduction, number 814 in LNAI, pages 192-206. Springer-Verlag, 1994.
4. D.A. de Waal and J.P. Gallagher. The Applicability of Logic Programming Analysis and Transformation to Theorem Proving. In A. Bundy, editor, Proceedings of the 12th International Conference on Automated Deduction, number 814 in LNAI, pages 207-221. Springer-Verlag, 1994.
5. R. Letz, J. Schumann, S. Bayerl, and W. Bibel. SETHEO: A High-Performance Theorem Prover. Journal of Automated Reasoning, 8(2):183-212, 1992.
6. D.W. Loveland. A Simplified Format for the Model Elimination Theorem-Proving Procedure. Journal of the ACM, 16(3):349-363, 1969.
7. W.W. McCune. A Davis-Putnam Program and its Application to Finite First-Order Model Search: Quasigroup Existence Problems. Technical Report ANL/MCS-TM-194, Argonne National Laboratory, Argonne, USA, 1994.
8. J.K. Slaney. SCOTT: A Model-Guided Theorem Prover. In R. Bajcsy, editor, Proceedings of the 13th International Joint Conference on Artificial Intelligence, pages 109-114. Morgan Kaufmann, 1993.
9. M.E. Stickel. A Prolog Technology Theorem Prover: A New Exposition and Implementation in Prolog. Technical Note 464, SRI International, Menlo Park, USA, 1989.
10. G. Sutcliffe. Linear-Input Subset Analysis. In D. Kapur, editor, Proceedings of the 11th International Conference on Automated Deduction, number 607 in LNAI, pages 268-280, Saratoga Springs, NY, USA, June 1992. Springer-Verlag.
11. G. Sutcliffe. The Semantically Guided Linear Deduction System. In D. Kapur, editor, Proceedings of the 11th International Conference on Automated Deduction, number 607 in LNAI, pages 677-680, Saratoga Springs, NY, USA, June 1992. Springer-Verlag.
12. G. Sutcliffe and C.B. Suttner. The TPTP Problem Library: CNF Release v1.2.1. Journal of Automated Reasoning, 21(2):177-203, 1998.
A Resolution Decision Procedure for Fluted Logic

Renate A. Schmidt¹ and Ullrich Hustadt²

¹ Department of Computer Science, University of Manchester, Manchester M13 9PL, United Kingdom, [email protected]
² Centre for Agent Research and Development, Manchester Metropolitan University, Manchester M1 5GD, United Kingdom, [email protected]
Abstract. Fluted logic is a fragment of first-order logic without function symbols in which the arguments of atomic subformulae form ordered sequences. A consequence of this restriction is that, whereas first-order logic is only semi-decidable, fluted logic is decidable. In this paper we present a sound, complete and terminating inference procedure for fluted logic. Our characterisation of fluted logic is in terms of a new class of so-called fluted clauses. We show that this class is decidable by an ordering refinement of first-order resolution and a new form of dynamic renaming, called separation.
1 Introduction
Fluted logic is of interest for a number of reasons. One of our main motivations for studying fluted logic is the continuation of the programme of characterising first-order decidability by resolution methods. There are various ways of defining decidable fragments of first-order logic. Fragments considered until the sixties usually involve some form of restriction on quantification. In prefix classes such as the Bernays-Schönfinkel class, the initially extended Ackermann class and the initially extended Gödel class, the quantifier prefixes are restricted to ∃∗∀∗, ∃∗∀∃∗ and ∃∗∀∀∃∗, respectively. In the guarded and loosely guarded fragments, which were introduced more recently, quantifiers are restricted to conditional quantifiers of the form ∃y(G(x, y) ∧ ϕ) or ∀y(G(x, y) → ϕ), where G(x, y) is a guard formula satisfying certain restrictions (G(x, y) is an atom in the case of the guarded fragment). In Maslov's class K (more precisely, in the dual class K) there is a restriction on universal quantification. Other decidable classes such as the monadic class and FO2 are defined over predicate symbols with bounded arity. By contrast, the restriction of first-order logic which ensures decidability for fluted logic is an ordering on variables and arguments. With the exception of fluted logic, the mentioned logics have been studied in the context of resolution and superposition, see for example Joyner [18], Fermüller, Leitsch, et al. [5,6], Bachmair, Ganzinger and Waldmann [3], de Nivelle [4], Ganzinger and de Nivelle [7], Hustadt and Schmidt [15].

D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 433-448, 2000. © Springer-Verlag Berlin Heidelberg 2000
Another reason for our interest in fluted logic is its relationship to non-classical logics. Extended modal logics and expressive description logics play an increasingly important role in various areas of computer science. Fluted logic may be viewed as a generalisation of modal logic, just as the guarded fragments can. The properties fluted logic is known to share with modal logics are decidability and the finite model property [21,22,24]. From a modal perspective an advantage of fluted logic over the guarded fragment is that relational atoms may be negated. This means that logics such as Boolean modal logic [8] and other enriched modal logics [9,12,13], as well as expressive description logics like ALB (without converse) [16], which cannot be embedded in the guarded fragment, can be embedded in fluted logic. Interestingly, translations of propositional modal formulae by both the relational translation and a variation of the functional translation (described and used in [11,14]) are fluted formulae. This raises the question whether the results of Ohlbach and Schmidt [19,26] can be generalised to fluted logic. The answer to this question is negative, though. Already the use of the quantifier exchange operator, which swaps existential and universal quantifiers in a non-standard fashion [19], leads to a loss of soundness. A counter-example is the relational translation of the second formula in the class of branching K-formulae, defined in [10, Prop. 6.5]. Historically, fluted logic arose as a byproduct of the predicate functor logic introduced by Quine [25]. Adding various combinatory operators to fluted logic defines a lattice of fluted logics, in which fluted logic is the weakest logic and first-order logic with equality is the most expressive logic. (The combinatory operators are equality, binary converse, permutation of arguments, addition of vacuous arguments, fusion of arguments, and composition of binary atoms.)
In a series of papers [21,22,24] Purdy studies the decision problem of the fluted logics in this lattice, and establishes the limit of decidability to be the boundary of the ideal generated by the fluted logic with binary converse and equality [24]. This logic is the most expressive decidable logic in the lattice of fluted logics. In [23] Purdy describes an application of fluted logics in computational linguistics, for modelling ordinary English. In this paper we characterise fluted logic by a new class of clauses, called the class of fluted clauses. We present a decision procedure for this class which is based on an ordering refinement of resolution and an additional separation rule. This is a new inference rule which does dynamic renaming. It replaces a clause C ∨ D by two clauses ¬A ∨ C and A ∨ D, where A is an atom with a newly introduced predicate symbol. The rule is sound, in general, and resolution extended by this rule remains complete, if for any set N of clauses the number of applications of separation in any derivation from N is finitely bounded. Separation is essential for our decision procedure, since it allows us to transform certain problematic fluted clauses into so-called strongly fluted clauses. A strongly fluted clause is a fluted clause that contains a literal which includes all the variables of the clause. When inference is restricted to such literals, (i) the number of variables in any derivable clause is finitely bounded; in particular, the number of variables does not exceed the number of variables in the original clause
set. To show termination, it is usually sufficient [18] to show in addition that (ii) there is a bound on the depth of terms occurring in derived clauses. Because separation introduces new predicate names during the derivation, in our case we also need to show that (iii) there is a bound on the number of applications of the separation rule. Exhibiting (ii) and (iii), along with verifying the deductive closure of the class of (strongly) fluted clauses, are the most difficult parts of the termination proof. The difficulty can be attributed to the fact that the depth of terms can grow during the derivation, as is the case for some other solvable clausal classes, for example, those associated with Maslov's class K [5,14].

The paper is organised as follows. Fluted logic is defined in Section 2. Section 3 gives a brief description of the general ordered resolution calculus. The class of fluted clauses is defined in Section 4. In Section 5 we specify how fluted formulae can be translated into sets of fluted clauses. The new separation rule is defined in Section 6. In Section 7 we define an ordering refinement and prove termination. We conclude with some remarks about the complexity of the class. Because of the space limitations all proofs had to be omitted; they can be found in [17]. Throughout, our notational convention is the following: x, y, z are the letters reserved for first-order variables, s, t, u, v for terms, a, b for constants, f, g, h for function symbols, and p, q, r, P, Q, R for predicate symbols. The Greek letters ϕ, ψ, φ are reserved for formulae. A is the letter reserved for atoms, L for literals, and C, D for clauses. For sets of clauses we use the letter N.
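The separation rule mentioned above (its precise definition is given in Section 6) can be sketched in executable form. The code and clause encoding below are ours, and in particular giving the fresh atom exactly the variables shared between the two parts is an illustrative choice, not the paper's definition:

```python
from itertools import count

_fresh = count()   # supply of newly introduced predicate symbols

def is_var(t):
    # Variables are capitalised strings in this encoding.
    return isinstance(t, str) and t[:1].isupper()

def lit_vars(lit):
    _sign, _pred, args = lit
    return {a for a in args if is_var(a)}

def separate(c_part, d_part):
    """Replace the clause C v D by ~A v C and A v D, where A is an atom
    with a fresh predicate symbol (here taken over the shared variables)."""
    shared = sorted(set.union(set(), *map(lit_vars, c_part)) &
                    set.union(set(), *map(lit_vars, d_part)))
    atom = ("_sep%d" % next(_fresh), tuple(shared))
    return [(False,) + atom] + c_part, [(True,) + atom] + d_part

# Separate P(X,Y) v R(Y,Z) on the shared variable Y:
neg_half, pos_half = separate([(True, "P", ("X", "Y"))],
                              [(True, "R", ("Y", "Z"))])
```

The two resulting clauses are equisatisfiable with the original: any model of C ∨ D can interpret the fresh predicate so as to satisfy both halves, which is why the rule is sound.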
2 Fluted Logic
Let P be a finite set of predicate symbols and let Xm = {x1, . . . , xm} be an ordered set of variables. An atomic fluted formula of P over Xi is an n-ary atom P(xl, . . . , xi), with l = i − n + 1 and n ≤ i. Fluted formulae are defined inductively as follows:

1. Any atomic fluted formula over Xi is a fluted formula over Xi.
2. ∃xi+1 ϕ and ∀xi+1 ϕ are fluted formulae over Xi, if ϕ is a fluted formula over Xi+1.
3. Any Boolean combination of fluted formulae over Xi is a fluted formula over Xi. That is, ϕ → ψ, ¬ϕ, ϕ ∧ ψ, etcetera, are fluted formulae over Xi, if both ϕ and ψ are.

By definition, for any formula ϕ, if there is a variable renaming h such that h(ϕ) is a fluted formula according to the above definition, then ϕ is a fluted formula. In this paper the assumption is that all fluted formulae are closed. The semantics of fluted logic is defined like the semantics of first-order logic. Three examples of fluted formulae from a linguistic or knowledge representation setting are the following. (mwmc is short for 'married couple all of whose
children are married'.)

∀x1 (cheese-eater(x1) ↔ ∃x2 (cheese(x2) ∧ eats(x1, x2)))
∀x1 (cheese-lover(x1) ↔ ∀x2 (cheese(x2) → (eats(x1, x2) ∧ likes(x1, x2))))
∀x1 x2 (mwmc(x1, x2) ↔ (married(x1, x2) ∧ ∀x3 (have-child(x1, x2, x3) → ∃x4 married(x3, x4))))

The first formula can be expressed by a multi-modal formula, while the second can only be expressed in a modal logic with an enriched language like Boolean modal logic or description logics with role negation. Because guards may only have a certain polarity, the second formula does not belong to the guarded or loosely guarded fragments. The third formula is also not guarded and does not belong to Maslov's class K either. On the other hand, the formulae

∀x1 x2 (married(x1, x2) ∧ ∀x3 (is-child(x3, x1, x2) → doctor(x3)))
∀x1 x2 (married(x1, x2) ∧ ∃x3 (have-child(x1, x2, x3) → ∃x4 married(x4, x3)))
∀x1 x2 x3 ((ancestor(x1, x2) ∧ ancestor(x2, x3)) → ancestor(x1, x3))

are not fluted formulae, because in all instances the ordering of the arguments is violated in some atom.
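The argument-ordering condition that separates the fluted from the non-fluted examples can be written as a small executable check (illustrative code of ours; an atom is represented only by the tuple of the indices of its variable arguments):

```python
def atomic_fluted(args, i):
    """args: tuple of variable indices of an atom over X_i.
    True iff the arguments are exactly the final len(args)-segment
    (x_{i-n+1}, ..., x_i) of the variable ordering, in order."""
    n = len(args)
    return n <= i and args == tuple(range(i - n + 1, i + 1))

print(atomic_fluted((2, 3), 3))      # eats(x2, x3) over X3: True
print(atomic_fluted((1, 2, 3), 3))   # have-child(x1, x2, x3) over X3: True
print(atomic_fluted((1, 3), 3))      # skips x2, as in is-child above: False
print(atomic_fluted((3, 2), 3))      # arguments out of order: False
```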
3 Resolution
The usual definition of clausal logic is assumed. A literal is an atom or the negation of an atom. The former is said to be a positive literal and the latter a negative literal. In this paper clauses are assumed to be multisets of literals, and will be denoted by P (x) ∨ P (x) ∨ ¬R(x, y), for example. The components in the variable partition of a clause are called variable-disjoint or split components, that is, split components do not share variables. A clause which cannot be split further will be called a maximally split clause. The condensation cond(C) of a clause C is a minimal subclause of C which is a factor of C. We take equality of clauses (or formulae) to be equality modulo variable renaming. Two clauses (or formulae) that are equal modulo variable renaming are said to be variants of each other. We say an expression is functional if it contains a constant or a non-nullary function symbol. Otherwise it is called non-functional. An expression is shallow if it does not contain a non-constant functional term. The set of variables of an expression E will be denoted by var(E). Next, we briefly recall the definition of ordered resolution from Bachmair and Ganzinger [1,2]. Derivations are controlled through an admissible ordering . In the full calculus a second parameter, a selection function, may be used, but for the results of this paper it is not essential. By definition, an ordering is admissible, if (i) it is a total well-founded ordering on the set of ground literals, (ii) for any atoms A and B, it satisfies: ¬A A, and B A implies B ¬A, and (iii) it is stable under the application
A Resolution Decision Procedure for Fluted Logic Deduce:
N N ∪ {cond(C)}
Delete:
N ∪ {C} N
Split:
N ∪ {C ∨ D} N ∪ {C} | N ∪ {D}
437
if C is a factor or resolvent of premises in N . if C is redundant. if C and D are variable-disjoint.
Resolvents and factors are computed with: C ∨ A1 ¬A2 ∨ D (C ∨ D)σ provided (i) σ is the most general unifier of A1 and A2 , (ii) A1 σ is strictly maximal with respect to Cσ, and (iii) ¬A2 σ is maximal with respect to Dσ.
Ordered resolution:
C ∨ A1 ∨ A2 (C ∨ A1 )σ provided (i) σ is the most general unifier of A1 and A2 , and (ii) A1 σ is maximal with respect to Cσ. Ordered factoring:
Fig. 1. The calculus R
of substitutions. (An ordering is said to be liftable if it satisfies (iii).) The multiset extension of ≻ provides an admissible ordering on clauses. A literal L is said to be (strictly) maximal with respect to a clause C if for any literal L′ in C, L′ ⊁ L (L′ ⋡ L). A literal in a clause C is said to be eligible if it is maximal with respect to C. An ordering ≻ is compatible with a given complexity measure cL on ground literals, if cL ≻ cL′ implies L ≻ L′ for any two ground literals L and L′. Let R be the resolution calculus defined by the rules of Figure 1. The completeness proof sanctions a global notion of redundancy, with which additional don't-care non-deterministic simplification and deletion rules can be supported. Essentially, a ground clause is redundant in a set N with respect to the ordering ≻ if it follows from smaller instances of clauses in N, and a non-ground clause is redundant in N if all its ground instances are redundant in N. For example, any tautologous clause is redundant. A (theorem proving) derivation from a set N of clauses is a finitely branching tree with root N constructed by applications of the expansion rules. A derivation T is a refutation if for every path N (= N0), N1, . . . , the clause set ∪j Nj contains the empty clause. A derivation T from N is called fair if for any path N (= N0), N1, . . . in the tree T, with limit N∞ = ∪j ∩k≥j Nk, it is the case that each clause C that can be deduced from non-redundant premises in N∞ is contained in some set Nj.

Theorem 1 ([3]). Let N be a set of clauses and let T be a fair R-derivation from N (up to redundancy). Then, N is unsatisfiable iff for every path N (= N0), N1, . . . , the clause set ∪j Nj contains the empty clause.
It should be noted that inferences with ineligible literals are not forbidden, but are provably redundant. In other words, only inferences with eligible literals need to be performed for soundness and completeness. Strictly, the “Split” rule is inessential for the results of this paper, though it may have some computational advantages (we comment on this in the final section). However, the inclusion of splitting allows for a more concise presentation of fluted clauses.
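The eligibility restriction can be made concrete on ground clauses (an illustrative sketch with an ad hoc admissible ordering of our own choosing, not an implementation of the calculus; literals are (sign, atom) pairs, atoms are compared via a precedence, and ¬A is placed immediately above A):

```python
def key(lit, rank):
    """Admissible ordering on ground literals: atoms compared by the
    given precedence; for equal atoms the negative literal is bigger."""
    sign, atom = lit
    return (rank[atom], 0 if sign else 1)

def maximal(lit, rest, rank):
    return all(key(l, rank) <= key(lit, rank) for l in rest)

def strictly_maximal(lit, rest, rank):
    return all(key(l, rank) < key(lit, rank) for l in rest)

def ordered_resolvent(c, d, rank):
    """Return a resolvent of ground clauses c and d respecting
    eligibility, or None if no such inference exists: the positive
    literal must be strictly maximal in c, the negative one maximal in d."""
    for i, (s1, a1) in enumerate(c):
        if not s1:
            continue
        rest_c = c[:i] + c[i + 1:]
        if not strictly_maximal((s1, a1), rest_c, rank):
            continue
        for j, (s2, a2) in enumerate(d):
            if s2 or a2 != a1:
                continue
            rest_d = d[:j] + d[j + 1:]
            if maximal((s2, a2), rest_d, rank):
                return rest_c + rest_d
    return None

rank = {"P": 0, "Q": 1, "R": 2}
c = [(True, "Q"), (True, "R")]         # Q v R: only R is eligible
d = [(False, "R"), (True, "P")]        # ~R v P: ~R is maximal
print(ordered_resolvent(c, d, rank))   # [(True, 'Q'), (True, 'P')]
# Resolving on Q is blocked: Q is not strictly maximal in Q v R.
print(ordered_resolvent(c, [(False, "Q")], rank))   # None
```

This key satisfies the admissibility conditions (i) and (ii) above: for each atom, ¬A sits just above A, and comparisons between distinct atoms are decided by the precedence alone.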
4 Fluted Clauses
This section introduces the class of fluted clauses into which fluted formulae can be translated. Without loss of generality we consider only maximally split clauses. In fluted clauses the arguments of literals have a characteristic form which will be described with the help of a sequence notation. (ui) will denote a finite, possibly empty, sequence (ui, ui+1, . . . , um) of terms. In this paper, unless specified otherwise, each non-empty sequence (ui) is assumed to end with um. Thus, the sequences (u1), (u2), . . . , (um) are linearly ordered by (the converse of) the 'is a proper suffix of' relationship. Note that (um) = um. Given that (ui) = (ui, . . . , um),

(ui, t) will denote the sequence (ui, . . . , um, t),
f(ui) will denote the term f(ui, . . . , um),
P(ui) will denote the atom P(ui, . . . , um),
C(ui) will denote a (possibly empty) clause of literals of the form (¬)P(ui).
If (ui) is the empty sequence then f(ui), P(ui) and C(ui) respectively denote a constant, a propositional literal and a (possibly empty) propositional clause. (ui) is said to be the argument sequence of f(ui), P(ui) and C(ui). A sequence with n elements will be called an n-sequence. Assume m is a non-negative integer, and Xm = {x1, . . . , xm} is a set of m ordered variables. We refer to a sequence of terms u = (u1, . . . , un) as a fluted sequence over Xm if the following conditions are all satisfied: (i) n > m, (ii) u1 = x1, . . . , um = xm, (iii) the number of variables occurring in (um+1, . . . , un) is m, and (iv) for every k with m < k ≤ n, there is an i with 1 ≤ i < k such that uk = f(ui, . . . , uk−1) for some function symbol f. The sequence (x1, . . . , xm) will be called the variable prefix of u. Examples of fluted sequences are:

(a), a fluted sequence over X0 = ∅,
(x1, x2, x3, f(x1, x2, x3)),
(x1, x2, x3, f(x2, x3), g(x1, x2, x3, f(x2, x3))),
(x1, x2, f(x1, x2), g(f(x1, x2)), h(x2, f(x1, x2), g(f(x1, x2)))).

However, (x1, x2, x3, f(x2, x3)) is not a fluted sequence, as condition (iii) is violated.
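Conditions (i)–(iv) can be checked mechanically. The following is a minimal sketch, not from the paper: variables are strings such as 'x1', compound terms are tuples ('f', arg1, ..., argk), and ('a',) is a constant. We read condition (iv) as also admitting the empty argument list (i = k), so that constants qualify, which matches the first example (a).

```python
def vars_of(t):
    """Variables of a term; a term is a variable (a string such as 'x1')
    or a tuple ('f', arg1, ..., argk); ('a',) encodes a constant."""
    if isinstance(t, str):
        return {t}
    return set().union(*map(vars_of, t[1:]))

def is_fluted(u, m):
    """Check conditions (i)-(iv) for a sequence u over Xm = {x1,...,xm}."""
    n = len(u)
    if n <= m:                                          # (i)  n > m
        return False
    if any(u[i] != 'x%d' % (i + 1) for i in range(m)):  # (ii) prefix x1..xm
        return False
    if len(set().union(*map(vars_of, u[m:]))) != m:     # (iii) tail uses m vars
        return False
    for k in range(m, n):                               # (iv) uk = f(ui..uk-1)
        if not any(isinstance(u[k], tuple) and u[k][1:] == tuple(u[i:k])
                   for i in range(k + 1)):
            return False
    return True
```

On the examples above, `is_fluted((('a',),), 0)` and `is_fluted(('x1', 'x2', 'x3', ('f', 'x1', 'x2', 'x3')), 3)` hold, while the last sequence fails at the condition (iii) check.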
A Resolution Decision Procedure for Fluted Logic
439
By definition, a clause C is a fluted clause over Xm if one of the following holds.

(FL0) C is a (possibly empty) propositional clause.
(FL1) C is not empty, var(C) = Xm, and for any literal L in C, there is some i with 1 ≤ i ≤ m such that the argument sequence of L is (xi, xi+1, . . . , xm).
(FL2) C is functional and not empty, var(C) = Xm, and for any literal L in C the argument sequence of L is either (xi, xi+1, . . . , xm) or (uj, uj+1, . . . , un), where 1 ≤ i ≤ m and (uj, uj+1, . . . , un) is a suffix of some fluted sequence u = (u1, . . . , un) over {xk, . . . , xm}, for some k with 1 ≤ k ≤ m. u will be referred to as the fluted sequence associated with L. (By 4. of Lemma 1 below there can be just one fluted sequence associated with a given literal.)
(FL3) C is not empty, var(C) = Xm+1, and for any literal L in C, the argument sequence of L is either (x1, x2, . . . , xm) or (xi, . . . , xm, xm+1), where 1 ≤ i ≤ m.

A fluted clause will be called a strongly fluted clause if it is either ground or has a literal which contains all the variables of the clause. It may be helpful to consider some examples. The clause

P(x1, x2, x3, x4, x5) ∨ Q(x1, x2, x3, x4, x5) ∨ ¬R(x4, x5) ∨ S(x5)

satisfies the scheme (FL1), and is defined over five variables. Examples of fluted clauses of type (FL3) which are defined over two (!) variables are:

Q(x1, x2) ∨ ¬P(x1, x2, x3) ∨ ¬R(x2, x3) ∨ S(x3)
Q(x1, x2) ∨ ¬R(x2, x3) ∨ S(x3)

The following are fluted clauses of type (FL2), where (x1) = (x1, . . . , x4).

R(x2, f(x1, x2)) ∨ S(f(x1, x2))
Q(x1, x2) ∨ R(x2) ∨ P(x1, x2, f(x1, x2)) ∨ R(x2, f(x1, x2)) ∨ S(f(x1, x2))
Q(x3, x4) ∨ R(x4) ∨ P(x4, g(x1), f(x4, g(x1))) ∨ R(f(x4, g(x1)))
Q(x1) ∨ P(f(x1, h(x1)), g(x1, h(x1), f(x1, h(x1)))) ∨ R(x4, h(x1), g′(x4, h(x1)))

A few remarks are in order. First, the non-functional subclause of a (FL2)-clause will be denoted by ∇.
Note that ∇ satisfies (FL1), in other words, clauses of the form (FL1) are building blocks of (FL2)-clauses. Second, clauses of the form (FL3) are defined to be fluted clauses over m variables, even though they contain m + 1 variables. This may seem a bit strange, but this definition ensures a direct association of fluted formulae over m variables to fluted clauses over m variables. Third, no fluted clause can simultaneously satisfy any two of (FL0), (FL1), (FL2) and (FL3). Fourth, using the previously introduced notation a
schematic description of the non-propositional shallow clauses (FL1) and (FL3) is:

C(x1) ∨ C(x2) ∨ . . . ∨ C(xm) (= ∇)
C(x1) ∨ C(x1, xm+1) ∨ C(x2, xm+1) ∨ . . . ∨ C(xm, xm+1) ∨ C(xm+1)

Fifth, strongly fluted clauses have special significance in connection with termination of resolution, particularly with respect to the existence of a bound on the number of variables in any clause. Under the refinement we will use, the eligible literals are literals which contain all the variables of the clause. So, the number of variables in resolvents of strongly fluted premises will always be less than or equal to the number of variables in any of the parent clauses. The next results give some properties of fluted sequences and strongly fluted clauses.

Lemma 1. Let u be a fluted sequence over Xm. Then:
1. There is an element uk of u such that uk = f(u1, . . . , uk−1), for some f.
2. If un is the last element of u then var(un) = Xm.
3. u is uniquely determined by its last element.
4. If (uj, uj+1, . . . , un) is a suffix of u then (u1, . . . , uj−1) is uniquely determined by (uj, uj+1, . . . , un).

By the definition of (FL2)-clauses:
Lemma 2. Let L be any literal of a (FL2)-clause defined over Xm. Then, all occurrences of variable sequences in L are suffixes of (x1, . . . , xm).

Lemma 3. Let C be a fluted clause over m variables. C is strongly fluted iff
1. C satisfies exactly one of the conditions (FL0), (FL1), (FL2), or
2. C satisfies condition (FL3), and it contains a literal with m + 1 variables.

In other words, with the exception of certain (FL3)-clauses all fluted clauses include at least one literal which contains all the variables of the clause.
5 From Fluted Formulae to Fluted Clauses
Our transformation of fluted formulae into clausal form employs a standard renaming technique, known as structural transformation or renaming; see for example [20]. For any first-order formula ϕ, the definitional form obtained by introducing new names for subformulae at positions in Λ will be denoted by DefΛ(ϕ).

Theorem 2. Let ϕ be a first-order formula. For any subset Λ of the set of positions of ϕ,
1. ϕ is satisfiable iff DefΛ(ϕ) is satisfiable, and
2. DefΛ(ϕ) can be computed in polynomial time.
In this paper we assume the clausal form of a first-order formula ϕ, written Cls(ϕ), is computed by transformation into conjunctive normal form, outer Skolemisation, and clausifying the Skolemised formula. By introducing new literals for each non-literal subformula position, any given fluted formula can be transformed into a set of strongly fluted clauses.

Lemma 4. Let ϕ be any fluted formula. If Λ contains all non-literal subformula positions of ϕ then Cls(DefΛ(ϕ)) is a set of strongly fluted clauses (provided the newly introduced literals have the form (¬)Qλ(xi)).

Transforming any given fluted formula into a set of fluted clauses requires the introduction of new symbols for all quantified subformulae.¹

Lemma 5. Let ϕ be any fluted formula over m ordered variables. If Λ contains at least the positions of any subformulae ∃xi+1 ψ, ∀xi+1 ψ, then Cls(DefΛ(ϕ)) is a set of fluted clauses (again, provided the new literals have the form (¬)Qλ(xi)).
6 Separation
The motivation for introducing separation is that the class of fluted clauses is not closed under resolution. In particular, resolvents of non-strongly fluted (FL3)-clauses are not always fluted and can cause (potentially) unbounded variable chaining across literals. This is illustrated by considering resolution between

P1(x1, x2) ∨ Q1(x2, x3) ∨ R(x2, x3) and ¬R(x1, x2) ∨ P2(x1, x2) ∨ Q2(x2, x3),

which produces the resolvent P1(x1, x2) ∨ Q1(x2, x3) ∨ P2(x2, x3) ∨ Q2(x3, x4). We note that it contains four variables, whereas the premises each contain only three variables. The class of strongly fluted clauses is also not closed under resolution. Fortunately, however, inferences with two strongly fluted clauses always produce fluted clauses, and non-strongly fluted clauses are what we call separable and can be restored to strongly fluted clauses. Consider the resolvent C = P(x1, x2) ∨ P(x2, x3) of the strongly fluted clauses P(x1, x2) ∨ R(x1, x2, x3) and ¬R(x1, x2, x3) ∨ P(x2, x3). C is a fluted clause of type (FL3), but it is not strongly fluted, as none of its literals contains all the variables of the clause. Consequently, the literals are incomparable under an admissible ordering (in particular, a liftable ordering), because the literals have a common instance, for example C{x1 ↦ a, x2 ↦ a, x3 ↦ a} = P(a, a) ∨ P(a, a). The 'culprits' are the variables x1 and x3. Because they do not occur together in any literal, C can be separated and replaced by the following two clauses, where q is a new predicate symbol.

¬q(x2) ∨ P(x1, x2)
q(x2) ∨ P(x2, x3)
¹ More generally, it requires at least the introduction of new symbols for all positive occurrences of universally quantified subformulae, all negative occurrences of existentially quantified subformulae, and all quantified subformulae with zero polarity. But then inner Skolemisation needs to be used, first Skolemising the deepest existential formulae.
The first clause is of type (FL1) (and thus strongly fluted) and the second is a strongly fluted clause of type (FL3). In the remainder of this section we will formally define separation and consider under which circumstances soundness and completeness hold. In the next section we will show how separation can be used to stay within the class of fluted clauses. Let C be an arbitrary (not necessarily fluted) clause. C is separable if it can be partitioned into two non-empty subclauses D1 and D2 such that var(D1) ⊄ var(D2) and var(D2) ⊄ var(D1). For example, the clauses P(x1, x2) ∨ Q(x2, x3) and P(x1) ∨ Q(x2) are separable, but P(x1, x2) ∨ Q and P(x1, x2) ∨ Q(x2, x3) ∨ R(x1, x3) are not. (The last clause is not fluted.)

Theorem 3. Let C ∨ D be a separable clause such that var(C) ⊄ var(D), var(D) ⊄ var(C), and var(C) ∩ var(D) = {x1, . . . , xn} for n ≥ 0. Let q be a fresh predicate symbol with arity n (q does not occur in N). Then, N ∪ {C ∨ D} is satisfiable iff N ∪ {¬q(x1, . . . , xn) ∨ C, q(x1, . . . , xn) ∨ D} is satisfiable.

On the basis of this theorem we can define the following replacement rule:

Separate:
N ∪ {C ∨ D}
------------------------------------------------
N ∪ {¬q(x1, . . . , xn) ∨ C, q(x1, . . . , xn) ∨ D}
provided (i) C ∨ D is separable such that var(C) ⊄ var(D) and var(D) ⊄ var(C), (ii) var(C) ∩ var(D) = {x1, . . . , xn} for n ≥ 0, and (iii) q does not occur in N, C or D. C and D will be referred to as the separation components of C ∨ D.

Lemma 6. The replacements of a separable clause C each contain fewer variables than C.

Even though it is possible to define an ordering under which the replacement clauses are strictly smaller than the original clause, and consequently C ∨ D is redundant in N ∪ {¬q(x1, . . . , xn) ∨ C, q(x1, . . . , xn) ∨ D}, in general "Separate" is not a simplification rule in the sense of Bachmair and Ganzinger. Nevertheless, we can prove the following.

Theorem 4. Let Rsep denote the extension of R with the separation inference rule. Let N be a set of clauses and let T be a fair Rsep-derivation from N such that separation is applied only finitely often in any path of T. Then N is unsatisfiable iff for every path N (= N0), N1, . . . , the clause set ⋃j Nj contains the empty clause.

More generally, this theorem holds also if Rsep is based on ordered resolution (or superposition) with selection. By Lemma 3 separable fluted clauses have the form

C(x1) ∨ C(xi, xm+1) ∨ . . . ∨ C(xm, xm+1) ∨ C(xm+1),   (1)
where (x1) is a non-empty m-sequence, C(x1) is not empty, and i is the smallest integer with 1 < i ≤ m such that C(xi, xm+1) is not empty. Let sep be a mapping from separable fluted clauses of the form (1) to sets of clauses defined by

sep(C) = {¬q(xi) ∨ C(x1), q(xi) ∨ C(xi, xm+1) ∨ . . . ∨ C(xm, xm+1) ∨ C(xm+1)}

where q is a fresh predicate symbol uniquely associated with C and all its variants. Further, let sep(N) = ⋃{sep(C) | C ∈ N}. For example:

sep(P(x1, x2) ∨ Q(x2, x3)) = {¬q(x2) ∨ P(x1, x2), q(x2) ∨ Q(x2, x3)}.

Lemma 7. The separation of a separable fluted clause (1) is a set of strongly fluted clauses.

Lemma 8. For fluted clauses a separation inference step can be performed in linear time.
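The separation step lends itself to a direct set-based sketch. The encoding below is ours, not the paper's: a clause is a frozenset of literals (sign, predicate, args). The brute-force search over bipartitions is for illustration only; by Lemma 8, for fluted clauses of the form (1) the split can be found in linear time.

```python
from itertools import combinations

def lit_vars(lit):
    """A literal is (sign, predicate, args): sign is +1 or -1, args is a
    tuple of variable names (strings starting with 'x') and constants."""
    return {a for a in lit[2] if isinstance(a, str) and a.startswith('x')}

def separate(clause, fresh='q'):
    """One application of 'Separate': find a partition C, D of the clause
    with var(C) and var(D) mutually non-contained, and link the parts by
    a fresh literal over the shared variables. Returns None if the
    clause is not separable."""
    lits = sorted(clause)
    for r in range(1, len(lits)):
        for c_part in combinations(lits, r):
            d_part = [l for l in lits if l not in c_part]
            vc = set().union(*map(lit_vars, c_part))
            vd = set().union(*map(lit_vars, d_part))
            if not vc <= vd and not vd <= vc:
                shared = tuple(sorted(vc & vd))
                return (frozenset(c_part) | {(-1, fresh, shared)},
                        frozenset(d_part) | {(+1, fresh, shared)})
    return None
```

On the clause P(x1, x2) ∨ Q(x2, x3) this reproduces the example above, yielding ¬q(x2) ∨ P(x1, x2) and q(x2) ∨ Q(x2, x3).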
7 Termination
In this section we define a minimal resolution calculus Rsep and prove that it provides a decision procedure for fluted logic. The ordering of Rsep is required to be any admissible ordering compatible with the following complexity measure. Let ≻s denote the proper superterm ordering. Define the complexity measure of any literal L by cL = (ar(L), max(L), sign(L)), where ar(L) is the arity (of the predicate symbol) of L, max(L) is a ≻s-maximal term occurring in L, and sign(L) = 1 if L is negative and sign(L) = 0 if L is positive. The ordering on the complexity measures is given by the lexicographic combination of >, ≻s, and > (where > is the usual ordering on the non-negative integers). Let Rsep be any calculus in which (i) derivations are generated by strategies applying "Delete", "Split", "Separate" (namely, N ∪ {C} / N ∪ sep(C)), and "Deduce" in this order, (ii) no application of "Deduce" with identical premises and identical consequence may occur twice on the same path in derivations, and (iii) the ordering is based on the complexity measure ordering defined above. Now we address the question as to whether the class of fluted clauses is closed under Rsep-inferences.

Lemma 9. A factor of a strongly fluted clause C is again a strongly fluted clause of the same type as C.

In fact, any (unordered) factor of a strongly fluted clause C is again a strongly fluted clause of the same type. The next lemma is the most important technical result.
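For concreteness, the complexity measure and its lexicographic comparison can be sketched as follows. The term and literal encodings are ours; the sketch covers non-propositional literals only, and the resulting relation is partial, since ≻s-incomparable maximal terms leave two literals incomparable.

```python
def subterms(t):
    """All subterms of a term; terms are strings (variables/constants)
    or tuples ('f', arg1, ..., argk)."""
    yield t
    if isinstance(t, tuple):
        for a in t[1:]:
            yield from subterms(a)

def proper_superterm(s, t):
    """s >_s t: t occurs strictly inside s."""
    return s != t and any(u == t for u in subterms(s))

def max_term(args):
    """Some >_s-maximal argument: one not properly inside another."""
    return next(a for a in args
                if not any(proper_superterm(b, a) for b in args))

def measure_greater(l1, l2):
    """c_L1 > c_L2 under the lexicographic combination of > (arity),
    >_s (maximal term) and > (sign); literals are (sign, pred, args)
    with sign 1 for negative, 0 for positive."""
    (s1, _, a1), (s2, _, a2) = l1, l2
    if len(a1) != len(a2):
        return len(a1) > len(a2)            # arity dominates
    m1, m2 = max_term(a1), max_term(a2)
    if proper_superterm(m1, m2):
        return True                         # deeper maximal term wins
    if proper_superterm(m2, m1):
        return False
    return s1 > s2                          # negative beats positive
```

For example, a binary literal beats a unary one, and ¬P(x1, x2) is larger than P(x1, x2).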
Lemma 10. Let C = C′ ∨ A1 and D = ¬A2 ∨ D′ be (FL2)-clauses. Suppose A1 and ¬A2 are eligible literals in C and D, respectively, and suppose σ is the most general unifier of A1 and A2. Then:
1. Cσ, Dσ and Cσ ∨ Dσ are (FL2)-clauses.
2. For any functional literal Lσ in Cσ ∨ Dσ, the fluted sequence associated with Lσ is the σ-instance of a fluted sequence v associated with some literal L′ in C ∨ D.

Lemma 11. Let C = C′ ∨ A1 and D = ¬A2 ∨ D′ be strongly fluted clauses. Suppose A1 and ¬A2 are eligible literals in C and D, respectively, and suppose σ is the most general unifier of A1 and A2. Then Cσ ∨ Dσ is a strongly fluted clause.

Lemma 12. Let C, D and σ be as in Lemma 11. Then, |var(Cσ ∨ Dσ)| ≤ max{|var(C)|, |var(D)|}.

Lemma 13. Removing any subclause from a fluted clause produces a fluted clause.

This cannot be said for strongly fluted clauses, in particular, not for clauses of the form (FL3). For all other forms the statement is also true for strongly fluted clauses, namely, removing any subclause from strongly fluted clauses produces strongly fluted clauses. Consequently:

Lemma 14. The condensation of any (strongly) fluted clause is a (strongly) fluted clause.

Lemma 15. The resolvent of any two strongly fluted clauses is a strongly fluted clause, or it is only a fluted clause if one of the premises is a (FL3)-clause.

Lemma 16. Any maximally split, condensed and separated factor or resolvent of strongly fluted clauses is strongly fluted.

This proves that the class of (strongly) fluted clauses is closed under Rsep-resolution with eager application of condensing, splitting and separation. In the next three lemmas, N is assumed to be a finite set of fluted clauses (which will be transformed into a set of strongly fluted clauses during the derivation, see Lemma 7). Our goal is to exhibit the existence of a term depth bound for all inferred clauses, as well as the existence of a bound on the number of variables occurring in any inferred clause.
The latter follows immediately from Lemmas 6 and 12.

Lemma 17. All clauses occurring in an Rsep-derivation from N contain at most m + 1 variables, where m is the maximal arity of any predicate symbol in N.
The definition of fluted clauses places no restriction on the level of nesting of functional terms. But:

Lemma 18. A bound on the maximal term depth of clauses derived by Rsep from N is m, where m is the maximal arity of any predicate symbol in N.

Because the signature is extended dynamically during the derivation, it remains to show that separation cannot be performed infinitely often.

Lemma 19. The number of applications of the "Separate" rule in an Rsep-derivation from N is bounded.

Now we can state the main theorem of this paper.

Theorem 5. Let ϕ be any fluted formula and N = Cls(DefΛ(ϕ)), where DefΛ satisfies the restrictions of Lemma 5. Then:
1. Any Rsep-derivation from N (up to redundancy) terminates.
2. ϕ is unsatisfiable iff the Rsep-saturation (up to redundancy) of N contains the empty clause.

The final theorem gives a rough estimate of an upper bound on the space requirements.

Theorem 6. The number of maximally split, condensed strongly fluted clauses in any Rsep-derivation from N is an O(m)-story exponential, where m is the maximal arity of any predicate symbol in N.
8 Concluding Remarks
Developing a resolution decision procedure for fluted logic turned out to be more complicated than expected. Even though, to begin with, clauses are simple in the sense that no nesting of non-nullary function symbols occurs (Lemma 4), the class of fluted clauses is rather complex. It is thus natural to ask whether there is a less complex clausal class which corresponds to fluted logic. The complexity of the class is a result of the ordering we have proposed. This ordering is unusual as it first considers the arity of a literal, while more conventional ordering refinements first consider the depth of a literal. A conventional ordering has the advantage that term depth growth can be avoided completely. This would induce a class of clauses which can be described by these schemes:

propositional clauses   (2)
C(x1) ∨ C(x2) ∨ . . . ∨ C(xm) (= ∇)   (3)
∇ ∨ C(x1, f(x1)) ∨ C(x2, f(x1)) ∨ . . . ∨ C(xm, f(x1)) ∨ C(f(x1))   (4)
C(x1) ∨ C(x1, xm+1) ∨ C(x2, xm+1) ∨ . . . ∨ C(xm, xm+1) ∨ C(xm+1)   (5)
The difference between this class and the class of fluted clauses defined in Section 4 is scheme (4). Clauses satisfying scheme (4) are (FL2)-clauses, but not
every (FL2)-clause has the form (4). With separation (on (5)-clauses without an embracing literal) it is possible to stay within the confines of this class. However, the danger with the separation rule is that it could be applied infinitely often. It is open whether there is a clever way of applying the separation rule so that only finitely many new predicate symbols are introduced. For fluted logic with binary converse an example giving rise to an unbounded derivation is the following.

P2(x1, x2, x3) ∨ P1(f(x1, x2, x3), x3)
¬P2(x1, x2, x3) ∨ ¬P1(x1, x2) ∨ P0(x2, x3)
¬Q1(x2) ∨ ¬P1(x1, x2)

We do not know whether an alternative form of the separation rule could help. Noteworthy about fluted logic and the proposed method is that, in order to establish an upper bound on the number of variables in derived clauses, a truly dynamic renaming rule (namely separation) is needed. Though renaming is a standard technique for transforming formulae into well-behaved clausal classes, it is usually applied in advance; see for example [7,15]. From a theoretical point of view, whenever it is possible to do the renaming transformations as part of preprocessing it is sensible to do so. The above example illustrates what could go wrong otherwise. It should be added, though, that there are instances where renaming on the fly is useful [27]. For fluted logic it is open whether there is a resolution decision procedure which does not require dynamic renaming. Going by the experience with other solvable classes, for example Maslov's class K [5,14], where renaming is only necessary when liftable ordering refinements are used, one possibility for avoiding dynamic renaming may be to use a refinement which is based on a non-liftable ordering. However, it would seem that the problems described in Section 6 are the same with non-liftable orderings.
Even if it turns out that there is a resolution decision procedure which does not use separation, one could imagine that the separation rule can have a favourable impact on the performance of a theorem prover, for, with separation, the size of clauses can be kept small, which is generally desirable, and for fluted logic separation is a cheap operation (Lemma 8). As noted earlier, the splitting rule is not essential for the results of this paper. The separation rule already facilitates some form of 'weak splitting', because, if C and D are variable-disjoint and non-ground subclauses of C ∨ D, then separation will replace it by q ∨ C and ¬q ∨ D, where q is a new propositional symbol. A closer resemblance to the splitting rule can be achieved by making q minimal in q ∨ C and selecting ¬q in ¬q ∨ D. Nevertheless, splitting has the advantage that more redundancy elimination operations are possible, for example forward subsumption. The realisation of a practical decision procedure for fluted logic would require a modest extension of one of the many available first-order theorem provers based on ordered resolution with an implementation of the separation rule. Modern theorem provers such as Spass [28] are equipped with a wide range of simplification rules, so reasonable efficiency could be expected.
Acknowledgements We wish to thank Bill Purdy and the referees for valuable comments.
References

1. L. Bachmair and H. Ganzinger. Rewrite-based equational theorem proving with selection and simplification. J. Logic Computat., 4(3):217–247, 1994.
2. L. Bachmair and H. Ganzinger. Resolution theorem proving. In J. A. Robinson and A. Voronkov, eds., Handbook of Automated Reasoning. Elsevier, 2000. To appear.
3. L. Bachmair, H. Ganzinger, and U. Waldmann. Superposition with simplification as a decision procedure for the monadic class with equality. In Proc. Third Kurt Gödel Colloquium (KGC'93), vol. 713 of LNCS, pp. 83–96. Springer, 1993.
4. H. de Nivelle. A resolution decision procedure for the guarded fragment. In Automated Deduction—CADE-15, vol. 1421 of LNAI, pp. 191–204. Springer, 1998.
5. C. Fermüller, A. Leitsch, T. Tammet, and N. Zamov. Resolution Methods for the Decision Problem, vol. 679 of LNCS. Springer, 1993.
6. C. G. Fermüller, A. Leitsch, U. Hustadt, and T. Tammet. Resolution theorem proving. In J. A. Robinson and A. Voronkov, eds., Handbook of Automated Reasoning. Elsevier, 2000. To appear.
7. H. Ganzinger and H. de Nivelle. A superposition decision procedure for the guarded fragment with equality. In Fourteenth Annual IEEE Symposium on Logic in Computer Science, pp. 295–303. IEEE Computer Society Press, 1999.
8. G. Gargov and S. Passy. A note on Boolean modal logic. In P. P. Petkov, ed., Mathematical Logic: Proceedings of the 1988 Heyting Summerschool, pp. 299–309. Plenum Press, 1990.
9. V. Goranko and S. Passy. Using the universal modality: Gains and questions. J. Logic Computat., 2(1):5–30, 1992.
10. J. Y. Halpern and Y. Moses. A guide to completeness and complexity for modal logics of knowledge and belief. Artificial Intelligence, 54:319–379, 1992.
11. A. Herzig. A new decidable fragment of first order logic, 1990. In Abstracts of the Third Logical Biennial, Summer School & Conference in Honour of S. C. Kleene, Varna, Bulgaria.
12. I. L. Humberstone. Inaccessible worlds. Notre Dame J. Formal Logic, 24(3):346–352, 1983.
13. I. L. Humberstone. The modal logic of 'all and only'. Notre Dame J. Formal Logic, 28(2):177–188, 1987.
14. U. Hustadt and R. A. Schmidt. An empirical analysis of modal theorem provers. J. Appl. Non-Classical Logics, 9(4), 1999.
15. U. Hustadt and R. A. Schmidt. Maslov's class K revisited. In Automated Deduction—CADE-16, vol. 1632 of LNAI, pp. 172–186. Springer, 1999.
16. U. Hustadt and R. A. Schmidt. Issues of decidability for description logics in the framework of resolution. In Automated Deduction in Classical and Non-Classical Logics, vol. 1761 of LNAI, pp. 192–206. Springer, 2000.
17. U. Hustadt and R. A. Schmidt. A resolution decision procedure for fluted logic. Technical Report UMCS-00-3-1, University of Manchester, UK, 2000.
18. W. H. Joyner Jr. Resolution strategies as decision procedures. J. ACM, 23(3):398–417, 1976.
19. H. J. Ohlbach and R. A. Schmidt. Functional translation and second-order frame properties of modal logics. J. Logic Computat., 7(5):581–603, 1997.
20. D. A. Plaisted and S. Greenbaum. A structure-preserving clause form translation. J. Symbolic Computat., 2:293–304, 1986.
21. W. C. Purdy. Decidability of fluted logic with identity. Notre Dame J. Formal Logic, 37(1):84–104, 1996.
22. W. C. Purdy. Fluted formulas and the limits of decidability. J. Symbolic Logic, 61(2):608–620, 1996.
23. W. C. Purdy. Surrogate variables in natural language. To appear in M. Böttner, ed., Proc. of the Workshop on Variable-Free Semantics, 1996.
24. W. C. Purdy. Quine's 'limits of decision'. J. Symbolic Logic, 64:1439–1466, 1999.
25. W. V. Quine. Variables explained away. In Proc. American Philosophy Society, vol. 104, pp. 343–347, 1960.
26. R. A. Schmidt. Decidability by resolution for propositional modal logics. J. Automated Reasoning, 22(4):379–396, 1999.
27. G. S. Tseitin. On the complexity of derivations in propositional calculus. In A. O. Slisenko, ed., Studies in Constructive Mathematics and Mathematical Logic, Part II, pp. 115–125. Consultants Bureau, New York, 1970.
28. C. Weidenbach. Spass, 1999. http://spass.mpi-sb.mpg.de.
ZRes: The Old Davis–Putnam Procedure Meets ZBDD

Philippe Chatalic and Laurent Simon

Laboratoire de Recherche en Informatique, U.M.R. CNRS 8623
Université Paris-Sud, 91405 Orsay Cedex, France
{chatalic,simon}@lri.fr
Abstract. ZRes is a propositional prover based on the original procedure of Davis and Putnam, as opposed to the modified version of Davis, Logemann and Loveland on which most current efficient SAT provers are based. On some highly structured SAT instances, such as the well-known Pigeon Hole and Urquhart problems, both of which are provably hard for resolution, ZRes performs very well and surpasses all classical SAT provers by an order of magnitude.
1 The DP and DLL Algorithms
Stimulated by hardware progress, many increasingly efficient SAT solvers have been designed during the last decade. It is striking that most of the complete solvers are based on the procedure of Davis, Logemann and Loveland (DLL for short) presented in 1962 [11]. The DLL procedure may roughly be described as a backtrack procedure that searches for a model. Each step amounts to the extension of a partial interpretation by choosing an assignment for a selected variable. The success of this procedure is mainly due to its space complexity, since making choices only results in simplifications. However, the number of potential extensions remains exponential. Therefore, if the search space cannot be pruned by clever heuristics, this approach becomes intractable in practice. The picture is very different with DP, the original Davis–Putnam algorithm [3]. DP is able to determine whether a propositional formula f, expressed in conjunctive normal form (CNF), is satisfiable or not. Assuming the reader is familiar with propositional logic, DP may be roughly described as follows [9]:

I. Choose a propositional variable x of f.
II. Replace all the clauses which contain the literal x (or ¬x) by all binary resolvents (on x) of these clauses (cut elimination of x), and remove all subsumed clauses.
III. a. If the new set of clauses is reduced to the empty clause, then the original set is unsatisfiable.
     b. If it is empty, then the original formula is satisfiable.
     c. Otherwise, repeat steps I–III for this new set of clauses.

D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 449–454, 2000.
© Springer-Verlag Berlin Heidelberg 2000
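Steps I–III can be sketched directly on explicit sets of clauses, with integer literals as in DIMACS-style encodings. This is an illustration of the procedure itself, not of ZRes's data structures.

```python
def dp_satisfiable(clauses):
    """Sketch of the original DP procedure (steps I-III above).
    Clauses are sets of non-zero integers; a negative integer
    denotes a negated variable."""
    clauses = {frozenset(c) for c in clauses}
    while True:
        if frozenset() in clauses:
            return False                    # III.a: empty clause derived
        if not clauses:
            return True                     # III.b: no clauses left
        x = abs(next(iter(next(iter(clauses)))))   # I: pick a variable
        pos = {c for c in clauses if x in c}
        neg = {c for c in clauses if -x in c}
        rest = clauses - pos - neg
        # II: cut elimination of x -- all binary resolvents on x,
        # dropping tautologies
        resolvents = {(c - {x}) | (d - {-x}) for c in pos for d in neg}
        resolvents = {r for r in resolvents if not any(-l in r for l in r)}
        clauses = rest | resolvents
        clauses = {c for c in clauses       # remove subsumed clauses
                   if not any(d < c for d in clauses)}
```

Each iteration eliminates the chosen variable entirely, so the loop terminates, at the cost of a possibly exponential blow-up in the number of clauses.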
As opposed to DLL, DP avoids making choices by considering the two possible instantiations of a variable simultaneously. It amounts to a sequence of cut eliminations. Since the number of clauses generated at each step may grow exponentially, it is widely acknowledged in the literature as inefficient, although no real experimental study has been conducted to confirm this point. Dechter and Rish [5] first pointed out some instances for which DLL is not appropriate and where DP obtains better results. A more comprehensive experimentation was conducted in [2] to evaluate DP on a variety of instances. While this study confirms the superiority of DLL on random instances, it also shows that for structured instances DP may outperform some of the best DLL procedures. Substantial progress for DLL is due to better heuristics. For DP, significant improvements are possible thanks to efficient data structures for representing very large sets of clauses. Several authors have pointed out that resolution-based provers (like DP and DLL) are intrinsically limited, since instances have been found that require an exponential number of resolution steps to be solved (e.g. [9,10,14]). This is the case for the Pigeon Hole [10] and for the Urquhart [14] problems. They suggest that more powerful proof systems have to be used in practice to solve such problems efficiently. However, all these results are based on the implicit hypothesis that successive resolutions in DP and DLL are performed one by one. This paper presents the ZRes system, an implementation of the DP algorithm that is able to perform several resolutions in a single step. As a result, ZRes is able to solve instances of such hard problems much more efficiently than the best current DLL provers.
2 Reviving DP
The crucial point of DP is step II, which tends to generate a very large number of clauses. Eliminating subsumed clauses at this step induces a significant overhead but still pays off.

Efficient Data Structures. In [2], trie structures are used to represent sets of clauses. Until now, they seem to have remained the state-of-the-art data structure for subsumption checking [15,4]. Tries allow the factorization of clauses that begin in the same way, according to a given order on literals. In ZRes we generalize this principle further, allowing the simultaneous factorization of both the beginning and the end of clauses. Sets of clauses are thus represented by means of directed acyclic graphs (DAGs) instead of trees. Using a DAG to represent a boolean formula has been intensively investigated in many works on binary decision diagrams (BDDs) [1,7]. Many variants of BDDs have been proposed, but all attempt to compute the BDD encoding of the formula, expressed in Shannon normal form. From the SAT point of view, and since the resulting BDD characterizes both the validity and the satisfiability of the formula, this construction is de facto more difficult than testing for satisfiability.
ZRes: The Old Davis–Putnam Procedure Meets ZBDD
The approach followed by ZRes is quite different since, instead of computing the Shannon normal form, we use BDD-like structures only to represent sets of clauses. In practice, the set of clauses is represented by a ZBDD (a variant of BDD [12]), which has proved useful for manipulating large sets. Sets are represented by their characteristic (Boolean) functions, and basic set operations can thus be performed as Boolean operations on ZBDDs. Moreover, it has been shown that the size of such a ZBDD is not directly related to the size of the corresponding set. Since the cost of basic set operations depends only on the size of the considered ZBDDs, this suggests performing the cut elimination step of the DP algorithm directly at the set level. Another way to represent f in the cut elimination step of DP is to factorize x and ¬x among its clauses. The formula f can then be rewritten as (x ∨ fx+) ∧ (¬x ∨ fx−) ∧ fx0, where fx+ (resp. fx−) is the CNF obtained from the set of clauses containing x (resp. ¬x), after factorization, and where fx0 denotes the set of clauses containing neither x nor ¬x. The second step of the algorithm then amounts to putting the formula (fx+ ∨ fx−) ∧ fx0 into CNF. This can be done in 3 stages. First, distribute the set of clauses fx+ over fx−. Second, eliminate tautologies and subsumed clauses from the resulting clauses. Third, compute the union of the remaining clauses with those of fx0, while deleting subsumed clauses. The first two stages could be performed successively, using standard operations on ZBDDs. However, the ZBDDs used in ZRes have a special semantics and thus a more efficient algorithm, called clause-distribution, can be designed. This operation guarantees that, during the bottom-up construction of the result, each intermediate ZBDD is free of tautologies and subsumed clauses. Tautologies are eliminated on the fly and subsumed clauses are deleted by a set difference operation, at each level.
Similarly, in the third stage of the cut elimination, subsumed clauses may be deleted while computing the union of the two sets of clauses. This new algorithm takes full advantage of the data structure used to represent sets of clauses.
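Over explicit clause sets, the three stages of a cut elimination read as follows. This Python sketch is only for illustration; in ZRes all of this happens at the ZBDD level, with subsumption handled by set-difference operations:

```python
def subsumption_reduce(clauses):
    """Keep only the subsumption-minimal clauses of a set."""
    kept = []
    for c in sorted(clauses, key=len):
        if not any(k <= c for k in kept):   # k subsumes c if k ⊆ c
            kept.append(c)
    return set(kept)

def cut_eliminate(clauses, x):
    """Eliminate variable x from a set of clauses (frozensets of ints)."""
    fx_pos = [c - {x} for c in clauses if x in c]
    fx_neg = [c - {-x} for c in clauses if -x in c]
    fx_0 = {c for c in clauses if x not in c and -x not in c}
    # Stages 1 and 2: distribute fx+ over fx-, dropping tautologies
    # on the fly, then delete subsumed clauses.
    distributed = set()
    for p in fx_pos:
        for n in fx_neg:
            r = p | n
            if not any(-lit in r for lit in r):
                distributed.add(frozenset(r))
    distributed = subsumption_reduce(distributed)
    # Stage 3: union with fx0, again deleting subsumed clauses.
    return subsumption_reduce(distributed | fx_0)
```

The point of the clause-distribution operation is precisely to avoid ever materializing the intermediate `distributed` set clause by clause.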
3
Experimental Results
ZRes is written in C, using the Cudd package [13], which provides basic ZBDD operations as well as useful dynamic reordering functions. We have tested ZRes on two classes of hard problems for resolution: Hole and Urquhart. These tests have been performed on a Linux Pentium-II 400 MHz with 256 MB. Our results are compared, when possible, with those of two DLL implementations: Asat [8], which is a good but simple DLL implementation, and Sato 3.0 [15], which includes many optimizations and recent improvements, such as backjumping and conflict memorization. Cpu times are given in seconds. An instance that cannot be solved within 10000 seconds is counted as 10000 seconds.
ZRes is available at http://www.lri.fr/~simon/research/zres. On the DIMACS [6] machine-scale benchmark, our tests rated this machine at a user-time speedup of 305% compared with the reference Sparc 10.41.
Philippe Chatalic and Laurent Simon
The Hole Problem. The following table describes the results obtained on different instances of the Hole problem by Asat, Sato and ZRes. For ZRes, the std and cd columns give the time obtained without (resp. with) the clause-distribution operation.

Instances  Var. Nb.  Cl. Nb.     Asat     Sato  ZRes std  ZRes cd
Hole-09          90      415    11.94     8.90      1.87     1.01
Hole-10         110      561   141.96    80.94      3.26     1.61
Hole-11         132      738  1960.77  7373.65      5.76     2.65
Hole-12         156      949    10000    10000     10.18     4.06
Hole-20         420     4221        –        –     654.8       69
Hole-30         930    13981        –        –     10000     1102
Hole-40        1640    32841        –        –         –     9421
ZRes clearly surpasses both Asat and Sato. While the DLL algorithms cannot solve instances of Hole-n for n > 11, ZRes manages to solve much larger instances. As we can see, the speedup induced by the clause-distribution operation is significant on such instances. Our experiments have shown that this problem is very sensitive to the heuristic function used to choose the cut variable. Surprisingly, the best results were obtained using a heuristic function that tends to maximize the number of clauses produced by the cut. Other heuristics, such as those in [2], did not allow us to solve these Hole instances. The Urquhart Problem. Urquhart has described a class of problems based on properties of expander graphs [14]. Each Urq-n is in fact a class of problems in which the numbers of clauses and variables are not fixed but only bounded within a specific interval. In the last table, MnV (resp. MnC) denotes the mean number of variables (resp. clauses) for a set of instances of a given class. Contrary to Hole, the Urquhart problem does not seem sensitive to the heuristics used. The results of Asat and Sato on 100 Urq-3 instances attest to the hardness of these problems:

System   Total cpu time  #resolved  Mean cpu time (resolved)
Asat            404 287         69                      1366
Sato            776 364         26                      1398
ZRes cd            69.2        100                      0.69
Solving instances for greater values of n seems beyond the scope of these systems. On the other hand, ZRes performs quite well on such instances. The following table gives the mean time on 1000 instances for greater values of n. Note that we do not give the std time, because the speedup due to the clause-distribution operation is not relevant for this problem.
We even solved the Hole-55 instance, with 3080 variables and 84756 clauses, in less than 2 days.
Instances  MnV   MnC  Mean cpu-time
Urq-3       42   364           0.57
Urq-4       77   705           1.72
Urq-5      123  1143           4.25
Urq-6      178  1665           8.88
Urq-7      242  2299           16.5
Urq-8      317  3004           29.6
Urq-9      403  3837           48.8
About the Compression Power of ZBDD. The previous examples illustrate quite well that DP, combined with the ZBDD encoding of sets of clauses, may in some cases be more effective than DLL. The experiment on Hole-40 shows that for some of the cut eliminations, the number of clauses corresponding to fx+ and fx− may exceed 10^60. Clearly, the result of such a cut could not be computed efficiently without such an extremely compact data structure. The ability of ZBDDs to capture redundancies in sets of clauses suits the DP algorithm particularly well. Indeed, additional redundancies are produced during each cut elimination, when each clause of fx+ is merged with each clause of fx−. Moreover, unlike in random instances, we think that such redundancies may also be found in structured instances corresponding to real-world problems. To appreciate the compression power of ZBDD structures, it is interesting to consider the level of compression, which may be characterized by the ratio (number of literals)/(number of nodes). We have recorded its successive values on 1000 random 3-SAT instances with 42 variables and 180 clauses, on which DP is known to be a poor candidate [2]. For such instances, the initial value of the ratio is about 2; it then increases up to 6.15 and eventually decreases together with the number of clauses. In contrast, on Hole-40 this ratio varies from 10 to more than 10^60. Similarly, on Urq-10 it may exceed 10^64. The Pigeon Hole and Urquhart classes, however, correspond to extreme cases. We have also tested ZRes on some other instances of the SAT DIMACS base [6]. The results are particularly interesting. For some instances the compression level is much higher than for random instances (more than 10^8), while on others, like ssa or flat-50, it is very close to that of random instances. It is striking, however, that the latter instances, even though they correspond to concrete problems, were generated in a random manner.
This seems to confirm that our hypothesis (structured problems generally have regularities) is well founded.
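As a rough, self-contained illustration of this literals/nodes ratio, one can approximate ZBDD sharing by counting distinct clause tails. This is a toy proxy, written under the assumption that suffix sharing dominates; the real measure in ZRes uses the actual ZBDD node count:

```python
def compression_ratio(clauses):
    """Total literal occurrences divided by the number of distinct
    non-empty suffixes of the sorted clauses. Shared tails are counted
    once, mimicking (very roughly) how a ZBDD shares clause endings."""
    total_literals = sum(len(c) for c in clauses)
    suffixes = set()
    for clause in clauses:
        ordered = tuple(sorted(clause))
        for i in range(len(ordered)):
            suffixes.add(ordered[i:])
    return total_literals / max(len(suffixes), 1)
```

On highly regular clause sets this ratio grows; on random clause sets almost every suffix is distinct and the ratio stays near 1.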
4
Conclusion
Dechter and Rish [5] were the first to revive interest in the original Davis–Putnam procedure for SAT. But DP also proves useful for knowledge compilation and validation techniques, which was our initial motivation [2]. The introduction of ZBDDs brings significant improvements, allowing ZRes to deal with huge sets of clauses. It led us to completely reconsider the performance of cut elimination, which can now be performed independently of the number of clauses handled.
It is thus able to solve two hard problems that are beyond the scope of other resolution-based provers. Although such examples might be considered somewhat artificial, their importance in the study of the complexity of resolution procedures must not be forgotten. On other examples, such as the DIMACS ones, the results are not as good, but a significant compression level, due to ZBDDs, can be observed on real-world instances. The strength of ZRes definitely comes from its ability to capture regularities in sets of clauses. Although it has no chance of competing on random instances, which lack such regularities, it might be a better candidate for solving real-world problems. Further improvements are possible. The Hole example pointed out that standard heuristics for DP are not always appropriate for ZRes. We are studying new heuristics, based on the structure of the ZBDDs rather than on what they represent. One may also investigate better-adapted reordering algorithms that take advantage of the particular semantics of the ZBDDs used in ZRes. Finally, DP and DLL may be considered as complementary approaches. An interesting idea is to design a hybrid algorithm integrating both DP and DLL in ZRes.
References
1. R.E. Bryant. Graph-based algorithms for Boolean function manipulation. IEEE Trans. on Comp., 35(8):677–691, 1986.
2. Ph. Chatalic and L. Simon. Davis and Putnam 40 years later: a first experimentation. Technical Report 1237, LRI, Orsay, France, 2000. Submitted to the Journal of Automated Reasoning.
3. M. Davis and H. Putnam. A computing procedure for quantification theory. Journal of the ACM, pages 201–215, 1960.
4. Johan de Kleer. An improved incremental algorithm for generating prime implicates. In AAAI'92, pages 780–785, 1992.
5. R. Dechter and I. Rish. Directional resolution: The Davis–Putnam procedure, revisited. In Proceedings of KR-94, pages 134–145, 1994.
6. The DIMACS challenge benchmarks. ftp://ftp.dimacs.rutgers.edu/challenges/sat.
7. R. Drechsler and B. Becker. Binary Decision Diagrams: Theory and Implementation. Kluwer Academic Publishers, 1998.
8. Olivier Dubois. Can a very simple algorithm be efficient for SAT? ftp://ftp.dimacs.rutgers.edu/pub/challenges/sat/contributed/dubois.
9. Zvi Galil. On the complexity of regular resolution and the Davis–Putnam procedure. Theoretical Computer Science, 4:23–46, 1977.
10. A. Haken. The intractability of resolution. Theoretical Computer Science, 39:297–308, 1985.
11. M. Davis, G. Logemann, and D. Loveland. A machine program for theorem-proving. Communications of the ACM, pages 394–397, 1962.
12. S. Minato. Zero-suppressed BDDs for set manipulation in combinatorial problems. In 30th ACM/IEEE Design Automation Conference, 1993.
13. F. Somenzi. CUDD release 2.3.0. http://bessie.colorado.edu/~fabio.
14. A. Urquhart. Hard examples for resolution. Journal of the ACM, 34:209–219, 1987.
15. Hantao Zhang. SATO: An efficient propositional prover. In CADE-14, LNCS 1249, pages 272–275, 1997.
System Description: Tramp Transformation of Machine-Found Proofs into Natural Deduction Proofs at the Assertion Level Andreas Meier Fachbereich Informatik, Universität des Saarlandes 66041 Saarbrücken, Germany [email protected] http://www.ags.uni-sb.de/~ameier Abstract. The Tramp system transforms the output of several automated theorem provers for first order logic with equality into natural deduction proofs at the assertion level. Through this interface, other systems, such as proof presentation systems or interactive deduction systems, can access proofs originally produced by any system interfaced by Tramp simply by adapting the assertion level proofs to their own needs.
1
Introduction
Today's theorem proving systems (automatic and interactive ones) have reached a considerable strength. However, it has become clear that no single system is capable of handling all sorts of deduction tasks. Therefore, it is a well-established approach to delegate subgoals to other (specialist) systems such as automated theorem provers (ATPs). Unfortunately, most ATPs use their own particular formalism. These machine-oriented formalisms make the proofs difficult to read. Hence, in order to make use of the results of the ATPs, other systems need to adapt the output of an ATP into input that they can further process. To minimize the transformation effort it is advisable to use an interface that transforms the machine-found proofs of various formalisms into a uniform format. Interactive deduction systems and proof presentation systems, in particular, need a uniform format that they can easily transform into a presentation comprehensible to humans. Hence, a uniform format suitable for such systems should consist of intuitive steps and should be compact. Some approaches transform the machine-found proofs into natural deduction (ND) proofs [2,12]. But the resulting ND proofs suffer from the problem that they usually consist of a large number of low-level steps which are purely syntactic manipulations of logical quantifiers and connectives. An approach to alleviate these problems is to produce ND proofs at the assertion level [8]. The assertion level allows for human-oriented macro-steps justified by the application of theorems, lemmas, or definitions, which are collectively called assertions. For instance, the assertion level step

    F ⊂ G    c ∈ F
    --------------- DEF⊂
         c ∈ G

derives the conclusion c ∈ G by an application of the subset definition DEF⊂, formalized by ∀S1.∀S2.(S1 ⊂ S2 ⇔ ∀x.(x ∈ S1 ⇒ x ∈ S2)), from the premises c ∈ F and F ⊂ G. A corresponding basic ND proof consists of a whole sequence of basic ND steps. Other work indicates that assertion level proofs are well suited as a basic representation level for further human-oriented proof presentation [7]. In the following we describe the Tramp system, which can transform the output of several ATPs for first order logic with equality into ND proofs at the assertion level. Moreover, we give an example of an assertion level proof produced by Tramp and discuss current applications of Tramp and potential extensions.
D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 460–464, 2000. © Springer-Verlag Berlin Heidelberg 2000
2
The Tramp System
Tramp consists of three parts: (1) For each ATP interfaced by Tramp there is a minor transformation process that transforms a problem description (consisting of a set of first order formulas, the assumptions, and one first order formula, the conclusion) into input suitable for this ATP. (2) At the heart is the transformation process that transforms a problem description and the corresponding output of an ATP into an ND proof at the assertion level. (3) A communication shell handles the access to the ATPs by Tramp and the way other systems can reach Tramp. The transformation processes producing the inputs for the ATPs all work in the same manner: compute the clause normal form of the formulas of the problem description; then use these clauses to create an input file for the ATP. In the following we focus on the transformation of proofs at the assertion level and the integration in a networked proof development environment.
2.1 Proof Transformation at the Assertion Level
The transformation process takes as input a problem description and the corresponding output of an ATP. It produces an ND proof at the assertion level. The proof is created through three subprocesses, all embedded into one shell: Structuring and Transforming into Refutation Graphs: The output of the ATP is structured by cutting off lemmas from the main proof. The resulting proof parts of the (remaining) main problem and the lemmas are each transformed into refutation graphs (refutation graphs are ground clause graphs representing refutation proofs [5]). Transformation at the Assertion Level: Each refutation graph is transformed into an ND proof at the assertion level. These proofs are connected via the lemmas such that we obtain a single ND proof at the assertion level. Optional Expansion of Assertion Steps: If requested by the user, each assertion application can be expanded into a sequence of basic ND steps, so that the resulting proof is a basic ND proof. We enriched this basic transformation procedure with several heuristics, which produce especially short and comprehensible proof parts and avoid indirect parts. By structuring and transforming the output of the ATPs into refutation
graphs we obtain an intermediate uniform representation of the different formalisms of the ATPs. To integrate further ATPs we only require appropriate extensions of this subprocess. We prefer refutation graphs as the intermediate uniform representation because proofs in many other refutation-based formalisms can easily be transformed into refutation graphs (e.g., a transformation algorithm for resolution proofs is described in [5]). Furthermore, the correspondences between the input clauses (which literals are contradictory?) are directly visible. The second subprocess consists of two phases: First, Tramp decomposes the assumptions, the conclusion, and the refutation graphs until refutation graphs are reached that consist only of a sequence of steps representing translatable assertion applications (Tramp identifies assertion applications already in the refutation graphs). Then, in a second phase, these refutation graphs are transformed by translating these steps successively into corresponding assertion applications in the ND proof. A detailed description of this algorithm, which is an extension of [9] to refutation graphs with equality, can be found in [11].
2.2 Integration in a Networked Proof Development Environment
All transformation processes are implemented in Allegro Common Lisp and run in one Lisp process. This Lisp process runs within a MathWeb communication shell [6]. MathWeb is a system for distributed automated theorem proving. Existing tools are equipped with a communication shell and are integrated into a networked proof development environment. Via MathWeb, Tramp can be reached by other MathWeb services and can reach the ATPs, which are also available as MathWeb services. As input Tramp accepts problem descriptions in POST syntax [1]. Moreover, the user can feed Tramp directly with the corresponding output of an ATP or can instruct Tramp to access several ATPs to prove the input problem. Currently, Tramp is able to produce the input and process the output of the ATPs Spass, Bliksem, Otter, WaldMeister, ProTeIn, and EQP (see [14] for references). When instructed to access ATPs, Tramp computes the inputs for the chosen ATPs and distributes these inputs via MathWeb among the ATPs. The distributed ATPs run competitively. When an ATP finishes, it sends its output via MathWeb back to Tramp, which transforms the output into an ND proof at the assertion level. Thus, when instructed to access an ATP, Tramp behaves for an external user like an ND-ATP. As output Tramp again produces POST syntax. A created ND proof is expressed in the linearized version first used in [2]. Thereby, an ND line consists of a finite set of formulas Δ, called the hypotheses, a single formula F, called the conclusion, and a justification (R). Such a line is denoted as: L. Δ ⊢ F (R), where L is a label for this line. Our set of basic ND rules is based on Gentzen's natural deduction calculus NK, but is enriched with further derived rules to obtain
Except WaldMeister, all ATPs interfaced by Tramp are refutation based. However, Tramp can transform the output of WaldMeister into refutation graphs by deriving a contradiction between the proved theorem t = t′ and its negation t ≠ t′.
more comprehensible and more compact proofs. The justification "application of assertion LA on premises LP1, . . . , LPn" is written as (LA LP1 . . . LPn).
3
An Example
We apply Tramp to the problem SET001 of the TPTP problem library [13]. The input for Tramp is the following problem description:
Assumptions: ∀S1.∀S2.((S1 = S2) ⇔ ((S1 ⊂ S2) ∧ (S2 ⊂ S1)))
             ∀S1.∀S2.∀x.((x ∈ S1) ∧ (S1 ⊂ S2) ⇒ (x ∈ S2))
Conclusion:  ∀S1.∀S2.∀x.((x ∈ S1) ∧ (S1 = S2) ⇒ (x ∈ S2))
Tramp computes the input for an ATP and applies the ATP via MathWeb. Then, it transforms its output into the following refutation graph G, whose clause nodes (connected by links between complementary literals) are:

{+(sk1 = sk2)}    {+(sk3 ∈ sk1)}    {−(sk3 ∈ sk2)}
{−(sk1 = sk2), +(sk1 ⊂ sk2)}
{−(sk1 ⊂ sk2), −(sk3 ∈ sk1), +(sk3 ∈ sk2)}
Here sk1, sk2, sk3 are Skolem constants. Afterwards, Tramp transforms G together with the input problem description into the following assertion level ND proof:

L1.  L1          ⊢ ∀S1.∀S2.((S1 = S2) ⇔ ((S1 ⊂ S2) ∧ (S2 ⊂ S1)))   (Hyp)
L2.  L2          ⊢ ∀S1.∀S2.∀x.((x ∈ S1) ∧ (S1 ⊂ S2) ⇒ (x ∈ S2))    (Hyp)
L3.  L3          ⊢ (c ∈ F) ∧ (F = G)                               (Hyp)
L4.  L3          ⊢ c ∈ F                                           (∧E L3)
L5.  L3          ⊢ F = G                                           (∧E L3)
L6.  L1, L3      ⊢ F ⊂ G                                           (L1 L5)
L7.  L1, L2, L3  ⊢ c ∈ G                                           (L2 L6 L4)
L8.  L1, L2      ⊢ (c ∈ F) ∧ (F = G) ⇒ (c ∈ G)                     (⇒I L3 L7)
L9.  L1, L2      ⊢ ∀x.((x ∈ F) ∧ (F = G) ⇒ (x ∈ G))                (∀I L8)
L10. L1, L2      ⊢ ∀S2.∀x.((x ∈ F) ∧ (F = S2) ⇒ (x ∈ S2))          (∀I L9)
L11. L1, L2      ⊢ ∀S1.∀S2.∀x.((x ∈ S1) ∧ (S1 = S2) ⇒ (x ∈ S2))    (∀I L10)
In this proof the lines L6 and L7 are justified by assertion applications.
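The bookkeeping of hypothesis sets in such a linearized proof can be sketched as follows. This Python illustration uses the line labels of the example above but hypothetical rule names of our own; it is not part of Tramp itself:

```python
# Each line maps to (rule, premise labels). "Hyp" lines depend on
# themselves; "ImpI" (⇒I) discharges the hypothesis line it cites first.
proof = {
    "L1": ("Hyp", []), "L2": ("Hyp", []), "L3": ("Hyp", []),
    "L4": ("AndE", ["L3"]), "L5": ("AndE", ["L3"]),
    "L6": ("Assertion", ["L1", "L5"]),
    "L7": ("Assertion", ["L2", "L6", "L4"]),
    "L8": ("ImpI", ["L3", "L7"]),    # discharges hypothesis L3
    "L9": ("ForallI", ["L8"]), "L10": ("ForallI", ["L9"]),
    "L11": ("ForallI", ["L10"]),
}

def hyps(label):
    """Recompute the hypothesis set of a line from its justification."""
    rule, premises = proof[label]
    if rule == "Hyp":
        return {label}
    hs = set().union(*(hyps(p) for p in premises))
    if rule == "ImpI":
        hs -= {premises[0]}          # the discharged hypothesis
    return hs
```

Running `hyps` on each label reproduces exactly the hypothesis column of the proof above.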
4
Experience, Discussion, and Future Work
We have tested the implementation of Tramp on about 100 examples from the TPTP problem library which can be proved by the ATPs interfaced by Tramp. Furthermore, Tramp is used permanently by the systems ProVerb [10] and Ωmega [3]. ProVerb, a proof presentation system, uses Tramp to obtain assertion level proofs that it can translate into natural language proofs. Ωmega, an interactive mathematical assistant system, calls the ATPs via Tramp on open goals in its proof object. Then Ωmega integrates the proofs provided by Tramp into its own proof. Our experiments show that, for a problem containing some applicable assertions, the length of the assertion proof is typically about half the length of the basic ND proof that results from expanding the abstract assertion steps
(e.g., the assertion proof in Sec. 3 consists of 11 lines, whereas the corresponding basic ND proof consists of 21 lines). Although pure equality proofs such as those produced by EQP and WaldMeister are neither shortened nor abstracted by Tramp, we use these systems to push the solvability horizon of the ATPs interfaced by Tramp. Not unexpectedly, transforming proofs to the assertion level requires considerable computational effort for larger proofs. From our experience, this effort is justified for systems that need a human-oriented representation of the machine-found proofs, such as interactive deduction systems and proof presentation systems. Such systems can easily adapt the resulting assertion level proofs to their own needs for the following reasons: (1) The resulting assertion level proofs are significantly more compact than basic ND proofs. (2) They contain meaningful steps but hardly any indirect parts. (3) Each assertion application can be expanded into a sequence of basic ND steps when a more detailed derivation is needed. Otherwise, for the communication between fully automatic systems, other uniform representations suitable for this task can be produced with less effort (e.g., by using refutation graphs directly). We are currently working on an extension of Tramp to handle proofs found by LEO [4], a higher order resolution prover with built-in extensionality. The current version of Tramp is available at http://www.ags.uni-sb.de/~ameier/tramp.html. Soon there will also be a web interface for Tramp.
References
1. POST. See http://www.ags.uni-sb.de/~omega/primer/post.html, 1999.
2. P. B. Andrews. Transforming matings into natural deduction proofs. In Proc. of CADE-5, pages 281–292, 1980.
3. C. Benzmüller et al. Ωmega: Towards a mathematical assistant. In Proc. of CADE-14, pages 252–255, 1997.
4. C. Benzmüller and M. Kohlhase. LEO, a higher order theorem prover. In Proc. of CADE-15, pages 139–144, 1998.
5. N. Eisinger. Completeness, confluence, and related properties of clause graph resolution. PhD thesis, Universität Kaiserslautern, Germany, 1988.
6. A. Franke and M. Kohlhase. MathWeb, an agent-based communication layer for distributed automated theorem proving. In Proc. of CADE-16, pages 217–221, 1999.
7. H. Horacek. Presenting proofs in a human-oriented way. In Proc. of CADE-16, pages 142–156, 1999.
8. X. Huang. Reconstructing proofs at the assertion level. In Proc. of CADE-12, pages 738–752, 1994.
9. X. Huang. Translating machine-generated resolution proofs into ND-proofs at the assertion level. In Proc. of PRICAI-96, pages 399–410, 1996.
10. X. Huang and A. Fiedler. Presenting machine-found proofs. In Proc. of CADE-13, pages 221–225, 1996.
11. A. Meier. Transformation of machine-found proofs into assertion level proofs. Technical report, 2000. Available at http://www.ags.uni-sb.de/~ameier/tramp.html.
12. F. Pfenning. Proof transformation in higher-order logic. PhD thesis, CMU, Pittsburgh, Pennsylvania, USA, 1987.
13. G. Sutcliffe et al. The TPTP problem library. In Proc. of CADE-12, pages 252–266, 1994.
14. G. Sutcliffe and C. Suttner. The results of the CADE-13 ATP system competition. Journal of Automated Reasoning, 18(2):259–264, 1997.
On Unification for Bounded Distributive Lattices Viorica Sofronie-Stokkermans Max-Planck-Institut für Informatik Im Stadtwald, D-66123 Saarbrücken, Germany [email protected]
Abstract. We give a resolution-based procedure for deciding unifiability in the variety of bounded distributive lattices. The main idea is to use a structure-preserving translation to clause form to reduce the problem of testing the satisfiability of a unification problem S to the problem of checking the satisfiability of a set ΦS of (constrained) clauses. These ideas can be used for unification with free constants and for unification with linear constant restrictions. Complexity issues are also addressed.
1
Introduction
From an algebraic point of view, unification can be seen as solving (systems of) equations in the initial or free algebra of an equational theory. Apart from its theoretical interest, unification is used e.g. in resolution-based theorem proving and in term rewriting to deal with certain equational axioms (such as associativity and commutativity). The unification problem has been thoroughly studied for equationally defined theories characterized by axioms such as associativity, commutativity, distributivity, associativity-commutativity, associativity-commutativity-idempotency; and for several theories related to algebra (Abelian groups, commutative and Boolean rings, semilattices, Boolean algebras, primal algebras, discriminator varieties). For details cf. [5] and the bibliography cited there. The combination of unification algorithms has been studied in [4]. In this paper we present some results on unification in the equational theory of bounded distributive lattices. The study was motivated, on the one hand, by our interest in distributive lattices with operators, and, on the other hand, by the fact that unification problems in semilattice- and lattice-based structures are becoming of increasing interest in computer science (we mention e.g. the results of Baader and Narendran on unification of concept terms in description logics [2]; similar possible applications in set constraints may also be of interest). It is known that the class D01 of bounded distributive lattices has an undecidable first-order theory, but both its universal theory and its positive ∀∃ theory (hence the unification problem with free constants) are decidable. Unification for distributive lattices has only been addressed in a few papers. In [12], Gerhard and Petrich give a criterion for unifiability (with free constants) of two terms in the theory of distributive lattices. (We were not able to generalize the argument used in the proof of this result to handle conjunctions of equations.) 
D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 465–481, 2000. © Springer-Verlag Berlin Heidelberg 2000
Then, in the attempt to give a basis set for all unifiers of two terms, they considered
terms containing only one of the lattice operations ∨ or ∧ and, for more general terms, only particular cases containing few variables. The results of Ghilardi [13] show that the equational class D of distributive lattices has unification type 0, i.e. there exist D-unification problems with no minimal complete set of unifiers. We are not aware of any other results on unification for distributive lattices, e.g. concerning its complexity. Due to the interaction between the operators, neither the ideas used in [19] for distributive unification nor the results in [4] on the combination of unification algorithms can be applied in this case. In [20], we gave a resolution-based decision procedure for the universal theory of certain varieties of distributive lattices with operators. The arguments in [20] cannot be used for the positive ∀∃ theory of such varieties without modification. In this paper we further develop the ideas in [20] and show that the use of the Priestley representation for bounded distributive lattices allows us to give an algorithm based on resolution (with constrained clauses) for the unification problem in D01. The algorithm consists of the following steps:
1. Structure-preserving translation to clause form: testing the satisfiability of a unification problem S is reduced to the problem of checking the satisfiability of a set ΦS of clauses. Expressing ΦS as a set of constrained clauses further simplifies the representation of the problem.
2. Ordered resolution with selection for (constrained) clauses is used for testing the satisfiability of ΦS.
We also show that similar ideas can be used for unification with linear constant restrictions [4]. These results complete and improve the results in [12]. The main advantage of our approach is that the structure-preserving translation to clause form makes it much easier to treat the unification problem for bounded distributive lattices, using results from resolution theory. As a byproduct, using Prop.
5.6 in [3], our results show that resolution (for ground clauses without equality) can be used for deciding the positive theory of D01 . It seems that many of the results in this paper can be extended without difficulties to other varieties in which the free algebras have a description similar to those in D01 . This is the case for many subvarieties of the variety of Ockham algebras (bounded distributive lattices with a lattice antimorphism), such as, e.g., the variety of De Morgan algebras. For the sake of simplicity, in this paper we restrict our attention to the class of bounded distributive lattices only. The paper is structured as follows. Section 2 contains the background information needed in the paper. Section 3 contains generalities about the unification problem for bounded distributive lattices. In Section 4 we give a resolution-based algorithm for this problem, and an extension to unification with linear constant restrictions. Section 5 contains conclusions and plans for future work.
On Unification for Bounded Distributive Lattices
2 Preliminaries

2.1 Algebra
Let Σ be a signature and a : Σ → N an arity function. A Σ-algebra is a structure A = (A, {σA}σ∈Σ), where A is a non-empty set and for every σ ∈ Σ, σA : A^{a(σ)} → A. We denote by TΣ(X) the term algebra over Σ in the variables X. An equation is an expression of the form t1 = t2 where t1, t2 ∈ TΣ(X). A Σ-algebra A = (A, {σA}σ∈Σ) satisfies an equation t1 = t2 if t1 and t2 become equal for every substitution of elements in A for the variables. An equational class is the class of all algebras that satisfy a set of equations. If E is a set of equations in the signature Σ, then F^E_Σ(X) := TΣ(X)/≡E is the free algebra over X in the equational class of all algebras that satisfy E (where ≡E is the Σ-congruence on TΣ(X) generated by E). A system of equations is a finite set of equations S : {s1 = t1, . . . , sk = tk}, where si, ti ∈ TΣ(X) for every 1 ≤ i ≤ k. Let {y1, . . . , yn} ⊆ X be the set of all variables in S. An algebra A = (A, {σA}σ∈Σ) satisfies the existential closure, ∃y1, . . . , yn (s1 = t1 ∧ · · · ∧ sk = tk), of S if there exists a map h : X → A such that h̄(si) = h̄(ti) for every 1 ≤ i ≤ k, where h̄ : TΣ(X) → A is the unique homomorphism of Σ-algebras that extends h.
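As a small illustration of these definitions (our own sketch, not part of the paper; the term encoding and function names are ours), one can check by brute force whether a finite Σ-algebra satisfies an equation, by trying every substitution of carrier elements for the variables:

```python
from itertools import product

# Terms are nested tuples: ('var', name) or (op_symbol, arg_term, ...).
def evaluate(term, ops, env):
    if term[0] == 'var':
        return env[term[1]]
    return ops[term[0]](*(evaluate(arg, ops, env) for arg in term[1:]))

def collect_vars(term):
    if term[0] == 'var':
        return {term[1]}
    return set().union(set(), *(collect_vars(a) for a in term[1:]))

def satisfies(carrier, ops, t1, t2):
    """Check whether the algebra satisfies t1 = t2 (all substitutions)."""
    variables = sorted(collect_vars(t1) | collect_vars(t2))
    for values in product(carrier, repeat=len(variables)):
        env = dict(zip(variables, values))
        if evaluate(t1, ops, env) != evaluate(t2, ops, env):
            return False
    return True

# The two-element lattice 2 = ({0,1}, max, min):
two = {'or': max, 'and': min}
x, y, z = ('var', 'x'), ('var', 'y'), ('var', 'z')
# Distributivity: x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z)
lhs = ('and', x, ('or', y, z))
rhs = ('or', ('and', x, y), ('and', x, z))
print(satisfies([0, 1], two, lhs, rhs))  # True: 2 is distributive
```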
2.2 Lattice Theory
For the definition of partially-ordered set and order-filter we refer to [10]. If X = (X, ≤) is a partially-ordered set, we denote its set of order-filters by O(X). There is a bijective correspondence between O(X) and the set of all order-preserving maps from X to the partially-ordered set 2 = ({0, 1}, ≤), where 0 < 1. A structure L = (L, ∨, ∧), where L is a non-empty set and ∨ and ∧ are two binary operations on L, is a lattice if ∨ and ∧ are associative, commutative and idempotent and satisfy the absorption laws. A distributive lattice is a lattice that satisfies either of the distributive laws. A lattice L = (L, ∨, ∧) has a first element if there is an element 0 ∈ L such that 0 ≤ x for every x ∈ L; it has a last element if there is an element 1 ∈ L such that x ≤ 1 for every x ∈ L (where x ≤ y iff x ∨ y = y). A lattice having both a first and a last element is called bounded. In what follows, when we refer to bounded distributive lattices, the first and last element are supposed to be included in the signature. Thus, a bounded distributive lattice is a structure L = (L, ∨, ∧, 0, 1), where (L, ∨, ∧) is a distributive lattice and 0, 1 are constants such that 0 is the first element and 1 the last element in (L, ∨, ∧). We denote the equational class of all bounded distributive lattices by D01. D01 contains e.g. the two-element bounded lattice 2 = ({0, 1}, ∨, ∧, 0, 1), where 0 ∨ 1 = 1, 0 ∧ 1 = 0.
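The bijective correspondence between order-filters and order-preserving maps into 2 mentioned above can be checked by enumeration for a small poset (our own illustrative sketch, not from the paper; the poset encoding is ours):

```python
from itertools import product

def order_filters(elements, leq):
    """All up-closed subsets (order-filters) of a finite poset."""
    filters = []
    for bits in product([0, 1], repeat=len(elements)):
        subset = {e for e, b in zip(elements, bits) if b}
        # up-closed: x in subset and x <= y implies y in subset
        if all(y in subset for x in subset for y in elements if leq(x, y)):
            filters.append(frozenset(subset))
    return filters

def monotone_maps_to_2(elements, leq):
    """All order-preserving maps from the poset into 2 = ({0,1}, 0 < 1)."""
    maps = []
    for bits in product([0, 1], repeat=len(elements)):
        f = dict(zip(elements, bits))
        if all(f[x] <= f[y] for x in elements for y in elements if leq(x, y)):
            maps.append(f)
    return maps

# The 4-element poset (P({a,b}), ⊆)
elems = [frozenset(), frozenset('a'), frozenset('b'), frozenset('ab')]
leq = lambda p, q: p <= q
fs = order_filters(elems, leq)
ms = monotone_maps_to_2(elems, leq)
print(len(fs), len(ms))  # both 6: a filter is the preimage of 1 under a monotone map
```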
2.3 Priestley Representation
If L is a bounded distributive lattice, let D(L) := HomD01(L, 2) be the set of all 0,1-lattice homomorphisms from L to the two-element bounded distributive lattice. The space D(L) = (D(L), ≤, τ), where ≤ is the pointwise ordering on maps
and τ is the topology generated by all sets of the form Xa = {h ∈ D(L) | h(a) = 1} and their complements as a subbasis, is called the Priestley dual of L. Let HomP(D(L), 2) be the lattice of all continuous and order-preserving maps between the ordered topological space D(L) and the two-element partially-ordered set 2 with the discrete topology. Priestley [18] showed that for every L ∈ D01, L is isomorphic to HomP(D(L), 2). In particular, if L is finite, then τ is the discrete topology, so L is isomorphic to (O(D(L)), ∪, ∩, ∅, D(L)). The dual of a finite distributive lattice is much smaller and less complex than the lattice itself. Therefore, problems concerning finite distributive lattices are likely to become simpler when translated into problems about their duals. We illustrate this by comparing the free algebra in D01 over a finite set C, FD01(C), and its Priestley dual D(FD01(C)). The theorem below is well-known.

Theorem 1. Let C be a finite set. The following statements hold:
(1) The map pC : (D(FD01(C)), ≤) → (2^C, ≤) defined for every h ∈ D(FD01(C)) by pC(h) = h|C (the restriction of h : FD01(C) → 2 to C) is an order-isomorphism, where in both cases the order is defined pointwise.
(2) The map ηC : FD01(C) → O(2^C, ≤) defined for every t ∈ FD01(C) by ηC(t) = {f : C → {0, 1} | f̄(t) = 1} (where for every f : C → {0, 1}, f̄ : FD01(C) → 2 is the unique extension of f to a 0,1-lattice homomorphism) is a lattice isomorphism. Its inverse is defined for every U ∈ O(2^C, ≤) by η_C^{-1}(U) = ⋁_{f∈U} (⋀_{{c | f(c)=1}} c).

Every member of FD01(C) can be written as a finite join of finite meets of elements in C. Hence, FD01(C) is finite, and its number of elements is bounded by 2^{2^{|C|}}. |FD01(C)| has been computed only for small values of |C|. By Theorem 1(1), (D(FD01(C)), ≤) is order-isomorphic to (P(C), ⊆), hence has 2^{|C|} elements. The main idea of this paper relies on this remark.
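By Theorem 1, FD01(C) is isomorphic to the lattice of order-filters of (P(C), ⊆), so its size can be counted directly for small |C| and compared with the crude bound 2^{2^{|C|}} (our own illustrative sketch, not from the paper):

```python
from itertools import product, combinations

def powerset(s):
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def count_free_algebra(constants):
    """|FD01(C)| = number of order-filters (up-sets) of the poset (P(C), ⊆)."""
    pts = powerset(constants)
    count = 0
    for bits in product([0, 1], repeat=len(pts)):
        up = {p for p, b in zip(pts, bits) if b}
        if all(q in up for p in up for q in pts if p <= q):
            count += 1
    return count

for n in range(3):
    c = list('abc')[:n]
    print(n, count_free_algebra(c), 2 ** (2 ** n))
# |C| = 0, 1, 2 give |FD01(C)| = 2, 3, 6 (the Dedekind numbers),
# within the bound 2^(2^|C|) = 2, 4, 16.
```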
The relatively simple structure of D(FD01(C)) allows us to define a more efficient method for checking the satisfiability of unification problems with constants compared with methods that use the structure of FD01(C) and/or equational reasoning.

2.4 Unification
We present the definitions and results on E-unification needed in the paper. Definition 1. Let E be an equational theory, Σ its signature, and ∆ a signature containing Σ. Let S : {s1 = t1 , . . . , sk = tk } be a system of equations, where si , ti ∈ T∆ (Y ). Then S defines an E-unification problem over ∆. S is elementary iff ∆ ⊆ Σ; S is an E-unification problem with (free) constants iff ∆\Σ is a set of constant symbols; and S is an E-unification problem with linear constant restrictions iff it is an E-unification problem with constants and, in addition, a linear ordering < on the variables and free constants occurring in S is given. In a general E-unification problem ∆\Σ may contain arbitrary function symbols.
Definition 2. A unification problem S has a solution w.r.t. E if there is a substitution σ : Y → T∆(Y) such that σ(si) ≡E σ(ti) for every 1 ≤ i ≤ k. If S is an E-unification problem with linear constant restrictions, a solution for S is a substitution σ : Y → T∆(Y) with the additional property that for every variable y ∈ Y and every constant c, if y < c then c does not occur in σ(y).

In this context, one can study decidability of unifiability, the existence of unifiers, their classification according to "generality", or the possibility of determining minimal sets of unifiers which are complete, in the sense that all other unifiers are less general. In this paper we focus on testing unifiability. This is sufficient in many applications (e.g. in constraint-based approaches to automated deduction [9,17,16]) and is often simpler than computing complete sets of unifiers.

Theorem 2. For any E-unification problem S : {s1 = t1, . . . , sk = tk} with free constants in C and variables Y = {y1, . . . , ym} the following are equivalent:
(1) S has a solution w.r.t. E.
(2) The formula ∃y1, . . . , ym (s1 = t1 ∧ · · · ∧ sk = tk) is true in F^E_{Σ∪C}(∅).
(3) There exists h : Y → F^E_Σ(C) such that h̄(si) = h̄(ti) for every 1 ≤ i ≤ k, where h̄ : TΣ(Y ∪ C) → F^E_Σ(C) is the unique extension of h to a homomorphism such that, for all c ∈ C, h̄(c) = [c] (where [c] is the equivalence class of c in F^E_Σ(C)).

Proof: (Sketch) The equivalence of (1) and (2) is proved e.g. in [7]. The equivalence of (2) and (3) follows from the fact that U(F^E_{Σ∪C}(Y)) is isomorphic to F^E_Σ(Y ∪ C), where U maps a Σ ∪ C-algebra to a Σ-algebra by forgetting the constants in the signature, i.e. U((A, {σA}σ∈Σ∪C)) = (A, {σA}σ∈Σ). □

The importance of E-unification with linear constant restrictions is justified by the following theorem.

Theorem 3 ([5,3]). Let E be a non-trivial equational theory. The following statements are equivalent:
(1) The positive theory¹ of E is decidable.
(2) General E-unification is decidable.
(3) E-unification with linear constant restrictions is decidable.

More precisely, as pointed out e.g. in [1], the decision problem for E-unification with linear constant restrictions can be reduced to the decision problem for general unification in linear time. The nondeterministic polynomial algorithm given in [3] can be used to reduce the decision problem for general unification to the decision problem for E-unification with linear constant restrictions.

¹ The positive theory of E is the collection of those closed formulae valid in the class of all models of E which are (equivalent to a formula) of the form (Q1 x1) . . . (Qm xm)(⋁_{i=1}^{q} (si1 = ti1 ∧ · · · ∧ si,ni = ti,ni)), where Q1, . . . , Qm ∈ {∃, ∀}.
3 Unification with Constants in D01. General Remarks
We now study the unification problem with free constants in the equational class D01 of bounded distributive lattices. We denote by D01 the equational theory of D01. Let Σ = {∨, ∧, 0, 1} be the signature of bounded distributive lattices. The following result is a direct consequence of Theorem 2.

Corollary 1. For any D01-unification problem S : {s1 = t1, . . . , sk = tk}, with free constants in C and variables Y, the following are equivalent:
(1) S has a solution w.r.t. D01.
(2) There exists h : Y → FD01(C) such that h̄(si) = h̄(ti) for every 1 ≤ i ≤ k, where h̄ : TΣ(Y ∪ C) → FD01(C) is the unique extension of h to a homomorphism such that h̄(c) = [c] for all c ∈ C.

By Corollary 1 and the fact that FD01(C) is finite for every finite C it follows that D01-unification with free constants is decidable. This problem is co-NP hard: if S contains only one equation and no variables it reduces to the word problem for D01, which has been shown to be co-NP hard [14]. We first present a straightforward (and rather inefficient) method for testing whether a unification problem has a solution. We then show how a simpler case (only one equation) is solved in [12]. In Section 4 we give a simpler method, which allows us to test by resolution whether a unification problem has a solution.

The Straightforward Method. Let S : {s1 = t1, . . . , sk = tk} be a D01-unification problem with free constants in a finite set C and variables in the finite set Y. We can check if S has a solution by checking if there is an instantiation of the variables in S with elements in FD01(C) that satisfies S. There exist at most (2^{2^{|C|}})^{|Y|} such instantiations. For each instantiation h : Y → FD01(C), one has to check if h̄(si) ≡D01 h̄(ti), 1 ≤ i ≤ k. There exists an algorithm for disproving the equivalence of two terms which is nondeterministically polynomial in the length of the terms [14].
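The straightforward method just described can be sketched as follows (our own illustrative code, not the paper's; we identify FD01(C) with the order-filters of (P(C), ⊆) as in Section 2, and encode terms as tuples ('var', v), ('const', c), '0', '1', ('and'/'or', t1, t2)):

```python
from itertools import product, combinations

def powerset(s):
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

def order_filters(pts):
    out = []
    for bits in product([0, 1], repeat=len(pts)):
        up = frozenset(p for p, b in zip(pts, bits) if b)
        if all(q in up for p in up for q in pts if p <= q):  # up-closed
            out.append(up)
    return out

def eval_term(t, full, h):
    if t == '0': return frozenset()
    if t == '1': return full
    if t[0] == 'var': return h[t[1]]
    if t[0] == 'const':                       # h(c) = ↑{c}
        return frozenset(p for p in full if t[1] in p)
    a, b = eval_term(t[1], full, h), eval_term(t[2], full, h)
    return a | b if t[0] == 'or' else a & b   # ∨ is union, ∧ is intersection

def unifiable(eqs, consts, variables):
    pts = powerset(consts)
    full = frozenset(pts)
    filters = order_filters(pts)
    for choice in product(filters, repeat=len(variables)):
        h = dict(zip(variables, choice))
        if all(eval_term(s, full, h) == eval_term(t, full, h) for s, t in eqs):
            return True
    return False

# The paper's later example: S = {y ∧ c = 0, y ∨ c = 1} with constant c.
S = [(('and', ('var', 'y'), ('const', 'c')), '0'),
     (('or', ('var', 'y'), ('const', 'c')), '1')]
print(unifiable(S, ['c'], ['y']))  # False: S has no solution
```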
The elements in FD01(C) can be written as disjunctions of conjunctions of elements in C; the length of such a term is at most |C| · 2^{|C|}. Hence, the length of h̄(si) and h̄(ti), 1 ≤ i ≤ k, can be at most |Y| · |C| · 2^{|C|} + max(S), where max(S) is the maximal length of a term occurring in S.

A Special Case. In [12] Gerhard and Petrich present the following criterion for unifiability of one single equation, i.e. for the unification problem S : {s = t}.
1. Let s0 and t0 be the disjunctive normal forms of s resp. t.
2. If neither s0 nor t0 has a constant term² then s and t are unifiable.
3. If s0 or t0 has constant terms, let h : Y → TD01(C) be defined by h(x) = D for every x ∈ Y, where D is the disjunction of all constant terms in s0 and t0.
4. If h(s) ≡D01 h(t) then s and t are unifiable, otherwise they are not unifiable.
² If s = ⋁_i ⋀_{j∈Ii} sj is in disjunctive normal form, then a constant term of s is any of the conjunctions ⋀_{j∈Ii} sj in s not containing any variable.
The disjunction D in Step 3 can be determined in polynomial time w.r.t. length(s0) + length(t0). The same holds for the process of replacing every variable in s and t by D. Both the length of D and the length of the result of replacing all variables in s, t by D (h(s) resp. h(t)) are polynomial in length(s0) + length(t0), but may be exponential in length(s) + length(t). Hence, the complexity of the criterion above is given by the complexity of Step 1 (computing the disjunctive normal forms of s and t) and Step 4 (solving a word problem). The last problem is co-NP complete [14]; there exists an algorithm for disproving the equivalence of h(s) and h(t) which is nondeterministically polynomial in length(h(s)) + length(h(t)).
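The four steps of the Gerhard–Petrich criterion can be sketched as follows (our own illustrative code, not theirs; terms are encoded as tuples ('var', v), ('const', c), '0', '1', ('and'/'or', t1, t2), a DNF as a set of conjuncts, each a frozenset of atoms; the word problem in Step 4 is decided by evaluating under all 0/1 assignments of the constants):

```python
from itertools import product

def dnf(t):
    if t == '0': return set()
    if t == '1': return {frozenset()}
    if t[0] in ('var', 'const'): return {frozenset([t])}
    a, b = dnf(t[1]), dnf(t[2])
    if t[0] == 'or': return a | b
    return {x | y for x in a for y in b}      # distribute ∧ over ∨

def eval_dnf(d, assign):
    return any(all(assign[a] for a in conj) for conj in d)

def unifiable_one_eq(s, t):
    s0, t0 = dnf(s), dnf(t)
    # Step 2: constant terms are conjuncts without variables
    const_terms = {c for c in s0 | t0 if all(a[0] == 'const' for a in c)}
    if not const_terms:
        return True
    # Step 3: substitute D (the disjunction of the constant terms) for every
    # variable, distributing the conjunction over the disjuncts of D.
    def subst(d):
        out = set()
        for conj in d:
            vs = [a for a in conj if a[0] == 'var']
            cs = frozenset(a for a in conj if a[0] == 'const')
            for pick in product(list(const_terms), repeat=len(vs)):
                out.add(cs.union(*pick))
        return out
    hs, ht = subst(s0), subst(t0)
    # Step 4: h(s) ≡ h(t) iff equal under every 0/1 assignment of constants
    consts = sorted({a for conj in hs | ht for a in conj})
    for bits in product([0, 1], repeat=len(consts)):
        assign = dict(zip(consts, bits))
        if eval_dnf(hs, assign) != eval_dnf(ht, assign):
            return False
    return True

print(unifiable_one_eq(('and', ('var', 'y'), ('const', 'c')), ('const', 'c')))
# True: y := 1 (or y := c) unifies y ∧ c with c
```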
4 A Resolution-Based Algorithm
We give a simpler algorithm for the problem of deciding whether a D01-unification problem with free constants has a solution. The algorithm consists of two steps:
1. Structure-preserving translation to clause form:
– Reduce the problem of testing the satisfiability of a unification problem S to checking the satisfiability of a set ΦS of clauses.
– Show that ΦS can be expressed as a set of constrained clauses.
2. Checking satisfiability by ordered resolution with selection:
– Use ordered resolution with selection for constrained sets of clauses to test the satisfiability of ΦS.

4.1 Structure-Preserving Translation to Clause Form
In this section we reduce the problem of testing the satisfiability of a unification problem S to that of checking the satisfiability of a set of clauses. We do this in two steps: Theorem 4 shows that FD01 (C) can be replaced with the lattice of order-filters of (P(C), ⊆); Theorem 5 further reduces the problem to that of checking the satisfiability of a set of first-order (ground) clauses. Theorem 4. For any D01 -unification problem S : {s1 = t1 , . . . , sk = tk } with free constants C and variables Y , the following are equivalent: (1) S has a solution w.r.t. D01 . (2) There exists h : Y → FD01 (C) such that h(si ) = h(ti ) for every 1 ≤ i ≤ k, where h : TΣ (Y ∪ C) → FD01 (C) is the unique homomorphism that extends h, such that h(c) = [c] for all c ∈ C. (3) There exists g : Y → O(P(C), ⊆) such that g(si ) = g(ti ) for every 1 ≤ i ≤ k, where g : TΣ (Y ∪ C) → O(P(C), ⊆) is the unique homomorphism that extends g, such that g(c) = ↑{c} = {X ⊆ C | c ∈ X} for every c ∈ C. Proof : (Idea) The equivalence of (1) and (2) follows directly from Corollary 1. The equivalence of (2) and (3) follows from the fact that there exists a 0,1-lattice isomorphism ηC : FD01 (C) → O(P(C), ⊆) defined for every t ∈ FD01 (C) by
ηC(t) = {f^{-1}(1) ∩ C | f : FD01(C) → 2 is a 0,1-lattice homomorphism; f(t) = 1} such that for every c ∈ C, ηC([c]) = ↑{c}. □

Theorem 4 justifies a reduction of the problem of checking whether a unification problem with constants S has a solution to the problem of checking the satisfiability of a system of set constraints. This reduction can then be used to give a structure-preserving translation to clause form. Thus, the problem of checking whether a unification problem with constants S has a solution can be reduced to the problem of checking the satisfiability of a set of clauses. The structure-preserving translation to clause form is inspired by Tseitin's well-known method for transforming quantifier-free formulae to clausal normal form and by the ideas in [20]. The link with set constraints mentioned above also explains the similarities with the structure-preserving translation to clause form presented in [6]. The remarks above are formally expressed by the following theorem.

Theorem 5. Let S : {s1 = t1, . . . , sk = tk} be a D01-unification problem with free constants C, and variables Y = {y1, . . . , yn}. Let ST(S) be the set of all subterms of terms occurring in S. The following are equivalent:
(1) There exists h : {y1, . . . , yn} → O(P(C), ⊆) such that h̄(si) = h̄(ti) for every 1 ≤ i ≤ k, where h̄ : TΣ(Y ∪ C) → O(P(C), ⊆) is the unique homomorphism that extends h, such that h̄(c) = ↑{c} for every c ∈ C.
(2) There exists a family {Ie}e∈ST(S), such that Ie ⊆ P(C) for all e ∈ ST(S), and for all X, X1, X2 ⊆ C the following hold:
– if X1 ∈ Iy and X1 ⊆ X2 then X2 ∈ Iy, for every y ∈ {y1, . . . , yn};
– X ∈ Ie1∧e2 iff X ∈ Ie1 and X ∈ Ie2;
– X ∈ Ie1∨e2 iff X ∈ Ie1 or X ∈ Ie2;
– I0 = ∅; I1 = C; and for every c ∈ C, X ∈ Ic iff c ∈ X;
– X ∈ Isi iff X ∈ Iti for all 1 ≤ i ≤ k.
(3) The conjunction of the following formulae is satisfiable:

(Her)  Py(X1) → Py(X2)                for all X1 ⊆ X2 ⊆ C, y ∈ {y1, . . . , yn}
(Ren)
 (∧n)  Pe1∧e2(X) → Pei(X)             for all X ⊆ C, i = 1, 2
 (∧p)  Pe1(X) ∧ Pe2(X) → Pe1∧e2(X)    for all X ⊆ C
 (∨n)  Pe1∨e2(X) → Pe1(X) ∨ Pe2(X)    for all X ⊆ C
 (∨p)  Pei(X) → Pe1∨e2(X)             for all X ⊆ C, i = 1, 2
 (1)   P1(X)                          for all X ⊆ C
 (0)   ¬P0(X)                         for all X ⊆ C
 (cp)  Pc(X)                          for all X ⊆ C with c ∈ X
 (cn)  ¬Pc(X)                         for all X ⊆ C with c ∉ X
(P)    Psi(X) ↔ Pti(X)                for all X ⊆ C, for all 1 ≤ i ≤ k

where each formula in (Her) ∪ (Ren) ∪ (P) is the conjunction of all formulae obtained by instantiating the variables X, resp. X1, X2 with subsets of C satisfying the additional conditions; the indices e1 ∨ e2, e1 ∧ e2, 0, 1, c range over all elements in ST(S); y ranges over all variables in {y1, . . . , yn}.
Proof: (Sketch) (1) ⇒ (2) For every e ∈ ST(S) let Ie := h̄(e). Since h̄ is a 0,1-homomorphism with h̄(c) = ↑{c}, and the lattice operations in O(P(C), ⊆) are union and intersection, the family {Ie}e∈ST(S) satisfies the conditions in (2).
(2) ⇒ (3) Let {Ie}e∈ST(S) be a family satisfying the conditions in (2). Then (P(C), I), where I(Pe) := Ie for all e ∈ ST(S), is a model for (Her) ∪ (Ren) ∪ (P).
(3) ⇒ (1) Assume that (Her) ∪ (Ren) ∪ (P) (a conjunction of ground clauses) is satisfied by the map I : {Pe(X) | e ∈ ST(S), X ⊆ C} → {0, 1}. For every y ∈ Y let h(y) := {X ∈ P(C) | I(Py(X)) = 1}. Let h̄ : TΣ(Y ∪ C) → O(P(C), ⊆) be the unique homomorphism that extends h, such that h̄(c) = ↑{c} for every c ∈ C. As I satisfies (Her) ∪ (Ren), h̄(e) = {X ∈ P(C) | I(Pe(X)) = 1} for all e ∈ ST(S). Since I satisfies (P), h̄(si) = h̄(ti) for every 1 ≤ i ≤ k. □

Corollary 2. The D01-unification problem S : {s1 = t1, . . . , sk = tk} with free constants C has a solution w.r.t. the equational theory of D01 iff the set of clauses (Her) ∪ (Ren) ∪ (P) is satisfiable.

The satisfiability of (Her) ∪ (Ren) ∪ (P) can be checked for instance by resolution. We now give an upper bound on the complexity of deciding the satisfiability of (Her) ∪ (Ren) ∪ (P), i.e. for deciding the unifiability of S.

Theorem 6. (1) The problem of deciding whether the D01-unification problem S has a solution can be solved in at most nondeterministically polynomial time in |ST(S)| · 2^{|C|} (and in exponential time in |ST(S)| · 2^{|C|} by using resolution).
(2) If S only contains the operation symbols ∧, 0, 1, and, possibly, constants, then the problem can be decided in at most polynomial time in |ST(S)| · 2^{|C|}.

Proof: Note first that the structure-preserving translation to clause form in Theorem 5 is polynomial in |ST(S)| · 2^{|C|}. The size of the conjunction of all formulae in (Her) ∪ (Ren) ∪ (P) is also polynomial in |ST(S)| · 2^{|C|}.
(1) follows from this and the fact that the number of all distinct literals that can occur in the conjunction of ground clauses (Her) ∪ (Ren) ∪ (P) in Theorem 5(3) is bounded by |ST(S)| · 2^{|C|}. To prove (2) note that if only the operators ∧, 0, 1 and, possibly, constants, occur in S, then the clause form of (Her) ∪ (Ren) ∪ (P) is a set of ground Horn clauses. Dowling and Gallier [11] showed that satisfiability of a set Φ of ground Horn clauses can be proved in linear time w.r.t. the number of clauses in Φ. □

Note. It is not necessary to explicitly add to (Her) ∪ (Ren) ∪ (P) formulae expressing the order relationship between the elements in P(C). However, these relationships have to be known when expressing (Her) ∪ (Ren) ∪ (P) as the conjunction of ground formulae by instantiating the variables with elements in P(C).
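Part (2) of the proof of Theorem 6 relies on the linear-time satisfiability test for ground Horn clauses of Dowling and Gallier [11]. A minimal sketch of such a test by unit propagation (our own illustrative code, not theirs; a clause is encoded as (body, head) with head None for a negative clause, and bodies are assumed duplicate-free):

```python
from collections import deque

def horn_sat(clauses):
    """Satisfiability of ground Horn clauses by forward unit propagation."""
    watch = {}                      # atom -> clauses containing it in the body
    count = [len(b) for b, _ in clauses]
    for i, (body, _) in enumerate(clauses):
        for a in body:
            watch.setdefault(a, []).append(i)
    if any(not b and h is None for b, h in clauses):
        return False                # the empty clause is present
    true = set()
    queue = deque(h for b, h in clauses if not b and h is not None)  # facts
    while queue:
        a = queue.popleft()
        if a in true:
            continue
        true.add(a)
        for i in watch.get(a, []):
            count[i] -= 1
            if count[i] == 0:       # whole body satisfied
                h = clauses[i][1]
                if h is None:
                    return False    # a goal clause fires: unsatisfiable
                queue.append(h)
    return True

# P, P -> Q, Q -> False is unsatisfiable; without the goal it is satisfiable.
print(horn_sat([([], 'P'), (['P'], 'Q'), (['Q'], None)]))  # False
print(horn_sat([([], 'P'), (['P'], 'Q')]))                 # True
```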
4.2 Translation to Constrained Clause Form
The clause form of the set of formulae (Her)∪(Ren)∪(P) defined in Theorem 5(3) can be naturally expressed by constrained clauses of a special form as follows:
(Her)  Py(X1) → Py(X2)                [[X1 ⊆ X2 ⊆ C]], y ∈ Y
(Ren)
 (∧n)  Pe1∧e2(X) → Pei(X)             [[X ⊆ C]], i = 1, 2
 (∧p)  Pe1(X) ∧ Pe2(X) → Pe1∧e2(X)    [[X ⊆ C]]
 (∨n)  Pe1∨e2(X) → Pe1(X) ∨ Pe2(X)    [[X ⊆ C]]
 (∨p)  Pei(X) → Pe1∨e2(X)             [[X ⊆ C]], i = 1, 2
 (1)   P1(X)                          [[X ⊆ C]]
 (0)   ¬P0(X)                         [[X ⊆ C]]
 (cp)  Pc(X)                          [[X ⊆ C, c ∈ X]]
 (cn)  ¬Pc(X)                         [[X ⊆ C, c ∉ X]]
(P)    Psi(X) ↔ Pti(X)                [[X ⊆ C]], for all 1 ≤ i ≤ k
Definition 3. A constrained clause has the form D[[φ]], where (i) D is a first-order clause with variables X, X1, . . . , Xn, . . . ranging over a countably infinite set V; all predicates occurring in D are unary; and (ii) the constraint φ is of the form ⋀_{i∈I1}(ci ∈ Xi) ∧ ⋀_{i∈I2}(ci ∉ Xi) ∧ ⋀_{i∈I3,j∈I4}(Xi ⊆ Xj) ∧ ⋀_{i∈I1∪···∪I4}(Xi ⊆ C).

Let S be a D01-unification problem, and ΦS the set of constrained clauses associated with S as explained above. Then ΦS can be constructed in polynomial time in the size of S. The size of ΦS is polynomial in the size of S. A substitution of the variables in V is called ground when it replaces every variable by an element of P(C) (this is the Herbrand universe of (Her) ∪ (Ren) ∪ (P)). A constrained clause D[[φ]] represents the set (D[[φ]])g = {Dσ | σ ground; φσ true} of all ground instances of D by instantiations of the variables which satisfy φ. We say that a set Φ of constrained clauses is satisfiable if the set of all its ground instances, Φg = ⋃_{D[[φ]]∈Φ} (D[[φ]])g, is satisfiable.
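The set of ground instances (D[[φ]])g can be computed by enumerating ground substitutions over the Herbrand universe P(C) and filtering by the constraint (our own illustrative sketch; the constraint encoding as tuples is ours, and X ⊆ C is implicit):

```python
from itertools import product, combinations

def powerset(C):
    return [frozenset(s) for r in range(len(C) + 1) for s in combinations(C, r)]

def holds(atom, sigma):
    """Constraint atoms: ('in', c, X), ('notin', c, X), ('sub', X1, X2)."""
    if atom[0] == 'in':    return atom[1] in sigma[atom[2]]
    if atom[0] == 'notin': return atom[1] not in sigma[atom[2]]
    return sigma[atom[1]] <= sigma[atom[2]]          # ('sub', X1, X2)

def ground_instances(clause_vars, phi, C):
    """Yield the substitutions sigma over P(C) satisfying the constraint phi."""
    for choice in product(powerset(C), repeat=len(clause_vars)):
        sigma = dict(zip(clause_vars, choice))
        if all(holds(a, sigma) for a in phi):
            yield sigma

# Clause (Her): P_y(X1) -> P_y(X2) [[X1 ⊆ X2 ⊆ C]], over C = {c}:
C = ['c']
insts = list(ground_instances(['X1', 'X2'], [('sub', 'X1', 'X2')], C))
print(len(insts))  # 3 substitutions: (∅,∅), (∅,{c}), ({c},{c})
```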
4.3 A Resolution Calculus for Constrained Clauses
We now formulate a resolution calculus CRes_S^≻ for the type of constrained clauses considered here. The calculus is parameterized by a total ordering ≻ on the predicate symbols, and a selection function S that assigns to each constrained clause D[[φ]] a (possibly empty) multiset of (occurrences of) negative literals³, called the selected literals of D[[φ]]. CRes_S^≻ consists of the following rules.

Ordered Resolution.

D1 ∨ Pe(X) [[φ1(X, X1, . . . , Xn)]]    D2 ∨ ¬Pe(Z) [[φ2(Z, Z1, . . . , Zn)]]
――――――――――――――――――――――――――――――――――
D1 ∨ D2σ [[φ1(X, X1, . . . , Xn) ∧ φ2(Z, Z1, . . . , Zn)σ]]

where σ(Z) = X and σ(W) = W otherwise; Pe is the largest predicate symbol in D1 ∨ Pe(X) and no literal is selected in D1 ∨ Pe(X), and either ¬Pe(Z)

³ The ordered resolution calculus with selection Res_S^≻ is complete for any well-founded and total ordering ≻ on ground literals and any selection function S. Here we consider a less restrictive form of resolution for constrained clauses, in order to simplify the presentation by avoiding the necessity of also handling order constraints (w.r.t. ≻).
is selected in D2 ∨ ¬Pe(Z), or otherwise nothing is selected in D2 ∨ ¬Pe(Z) and Pe is the largest predicate symbol in D2 ∨ ¬Pe(Z).

Positive Factoring.

D ∨ Pe(X) ∨ Pe(Z) [[φ(X, Z, X1, . . . , Xn)]]
――――――――――――――――――――――――
Dσ ∨ Pe(X) [[φ(X, Z, X1, . . . , Xn)σ]]

where σ(Z) = X, and σ(W) = W otherwise; Pe is the largest predicate symbol in D ∨ Pe(X) ∨ Pe(Z) and nothing is selected in D ∨ Pe(X) ∨ Pe(Z).

Theorem 7. Let Φ be a set of constrained clauses, ≻ a total order on the predicate symbols, and S a selection function. Φ is unsatisfiable iff the empty clause □[[φ]] (constrained by a satisfiable constraint φ) is derivable from Φ in CRes_S^≻.

Proof: (Idea) The proof uses the completeness of ordered resolution with selection for ground clauses and a lifting lemma for constrained clauses; the arguments are similar to those in [9]. □

Example: Decide whether the D01-unification problem S : {y ∧ c = 0, y ∨ c = 1} has a solution, where c is a constant and y a variable. (Note that S corresponds to the formula ∀c∃y(y ∧ c = 0 ∧ y ∨ c = 1).)

Solution: Let ≻ be a total ordering on the predicate symbols, defined such that Pe ≻ Pe′ whenever (i) e′ is a subterm of e; or (ii) e is a non-atomic term and e′ a constant; or (iii) e is a non-atomic term or a constant and e′ a variable. Let S be a selection function that selects all negative occurrences of literals except in Ren(∨p, ∧p), where nothing is selected. By the structure-preserving translation to clause form in Theorem 5(3) we obtain the following set of constrained clauses:
Py (X) → Py (Y ) Py∧c(X) → Py (X) Py∧c(X) → Pc (X)
[[X ⊆ Y ⊆ C]] [[X ⊆ C]] [[X ⊆ C]]
(4) Py (X) ∧ Pc(X) → Py∧c(X) [[X ⊆ C]] (5) Py∨c(X) → Py (X) ∨ Pc(X) [[X ⊆ C]] (6) (7) (8) (9) (10) (11) (12) (13) (14) (15)
Py (X) → Py∨c(X) Pc(X) → ¬P0(X) P1(X) Pc(X) ¬Pc(X) Py∧c(X) → P0(X) → Py∨c(X) → P1(X) →
Py∨c(X)
P0 (X) Py∧c(X) P1 (X) Py∨c(X)
[[X ⊆ C]] [[X [[X [[X [[X [[X [[X [[X [[X [[X
⊆ C]] ⊆ C]] ⊆ C]] ⊆ C, c ∈ X]] ⊆ C, c 6∈ X]] ⊆ C]] ⊆ C]] ⊆ C]] ⊆ C]]
where C = {c}, the selected literals are underlined and the positive literals containing the maximal predicate symbol are in boxes. All ground inferences of (13)
and (14) are redundant (so, (13) and (14) can be considered to be redundant). We obtain the following deduction of the empty clause □:

(16) Py(X) ∧ Pc(X) → P0(X)  [[X ⊆ C]]                    (by (12) and (4))
(17) Py(X) → P0(X)          [[c ∈ X, X ⊆ C]]             (by (10) and (16))
(18) Py∨c(X)                [[X ⊆ C]]                    (by (15) and (9))
(19) Py(X) ∨ Pc(X)          [[X ⊆ C]]                    (by (18) and (5))
(20) Py(X)                  [[c ∉ X, X ⊆ C]]             (by (19) and (11))
(21) Py(Y)                  [[c ∉ X, X ⊆ Y ⊆ C]]         (by (20) and (1))
(22) P0(X)                  [[c ∉ X, c ∈ X, X ⊆ C]]      (by (20) and (17))
(23) □                      [[c ∉ X, c ∈ X, X ⊆ C]]      (by (22) and (8))
(24) P0(Y)                  [[c ∉ X, c ∈ Y, X ⊆ Y ⊆ C]]  (by (21) and (17))
(25) □                      [[c ∉ X, c ∈ Y, X ⊆ Y ⊆ C]]  (by (24) and (8))
The constraint in (23) is unsatisfiable, but the constraint in (25) is satisfiable (e.g. by X = ∅ and Y = C)⁴. So, the set consisting of the clauses (1)–(15) is unsatisfiable, hence S has no solution.
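For this example the ground clause set (Her) ∪ (Ren) ∪ (P) of Theorem 5(3) is small enough (12 atoms, since ST(S) has 6 elements and |P(C)| = 2) to test by brute force, which confirms the unsatisfiability (our own illustrative sketch; the atom and clause encodings are ours):

```python
from itertools import product

subterms = ['y', 'c', '0', '1', 'y&c', 'y|c']   # ST(S)
sets = ['{}', '{c}']                            # P(C) for C = {c}
atoms = [(e, X) for e in subterms for X in sets]

def clauses():
    # literals are (polarity, atom); polarity True = positive
    cs = [[(False, ('y', '{}')), (True, ('y', '{c}'))]]   # (Her): {} ⊆ {c}
    for X in sets:
        cs += [[(False, ('y&c', X)), (True, ('y', X))],               # (∧n)
               [(False, ('y&c', X)), (True, ('c', X))],
               [(False, ('y', X)), (False, ('c', X)), (True, ('y&c', X))],  # (∧p)
               [(False, ('y|c', X)), (True, ('y', X)), (True, ('c', X))],   # (∨n)
               [(False, ('y', X)), (True, ('y|c', X))],               # (∨p)
               [(False, ('c', X)), (True, ('y|c', X))],
               [(True, ('1', X))], [(False, ('0', X))]]               # (1), (0)
        cs.append([(True, ('c', X))] if X == '{c}' else [(False, ('c', X))])
        cs += [[(False, ('y&c', X)), (True, ('0', X))],               # (P)
               [(False, ('0', X)), (True, ('y&c', X))],
               [(False, ('y|c', X)), (True, ('1', X))],
               [(False, ('1', X)), (True, ('y|c', X))]]
    return cs

def satisfiable(cs):
    for bits in product([0, 1], repeat=len(atoms)):
        v = dict(zip(atoms, bits))
        if all(any(v[a] if pos else not v[a] for pos, a in cl) for cl in cs):
            return True
    return False

print(satisfiable(clauses()))  # False: S has no solution
```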
4.4 Complexity Considerations for Some Special Cases
We now analyze some situations in which deciding D01-unifiability is especially easy. We start by showing that for unification problems of the form S : {s = t} the algorithm performs well. We end by analyzing the complexity of unification without constants.

Unification with Free Constants: General Case. Let S be a unification problem. Let ≻ be a total ordering on the predicate symbols {Pe | e ∈ ST(S)}, defined such that Pe ≻ Pe′ whenever (i) e′ is a subformula of e; or (ii) e is a non-atomic formula and e′ a constant; or (iii) e is a non-atomic formula or a constant and e′ a variable. Let S be a selection function that (i) selects nothing in Ren(∨p, ∧p), and (ii) in every other non-positive clause selects the set of all occurrences of negative literals that contain the maximal predicate symbol(s) among those occurring in the negative literals in the clause. Then:
1. no inferences are possible between (Her) and (Ren);
2. all inferences between two clauses in (Ren) generate tautologies;
3. no inferences are possible between (Her) and clauses in (P);
4. inferences between Psi(X) → Pti(X) in (P) and clauses in (Ren) lead to:
(a) ⋀_{j∈J} Peij(X) → Pti(X) [[X ⊆ C]], where for every J, {eij | j ∈ J} is a multiset of subterms of ST(si), containing no repetition of subterms that occur at the same position in si;
(b) (⋀_j Pcj(X)) ∧ (⋀_l Pxl(X)) → Pti(X) [[X ⊆ C, di ∈ X, i ∈ I]];
(c) ⋀_l Pxl(X) → Pti(X) [[X ⊆ C, di ∈ X, i ∈ I]];
(d) Pti(X) [[X ⊆ C, di ∈ X, i ∈ I]].

⁴ This shows that inferences with the clause (Her) in Theorem 5(3) (in particular, with clause (1) for this example) are necessary for the correctness of the method.
There are at most 2^{length(si)} such clauses; all constraints are linear in length(si).
5. inferences between Pti(X) [[X ⊆ C, di ∈ X, i ∈ I]] and clauses in (Ren) lead to:
(a′) ⋁_{j∈J} Peij(X) [[X ⊆ C, di ∈ X, i ∈ I]], where for every J, {eij | j ∈ J} is a multiset of subterms of ST(ti), containing no repetition of subterms that occur at the same position in ti;
(b′) (⋁_j Pcj(X)) ∨ (⋁_l Pxl(X)) [[X ⊆ C, di ∈ X, i ∈ I, d′k ∉ X, k ∈ K]];
(c′) ⋁_l Pxl(X) [[X ⊆ C, di ∈ X, i ∈ I, d′k ∉ X, k ∈ K]];
(d′) □ [[X ⊆ C, di ∈ X, i ∈ I]],
or factors thereof. There are at most 2^{length(ti)} such clauses; all constraints are linear in length(si) + length(ti).
6. inferences between clauses of type (c′) and (Her) produce clauses of the form
(e′) ⋁_l Pxl(Xi) [[X ⊆ C, di ∈ X, i ∈ I, d′k ∉ X, k ∈ K, X ⊆ Xi]];
7. further inferences may be possible between clauses of the form (a)–(c) and clauses of type (d) or (a′)–(c′) and (e′) and resolvents thereof. (These inferences can be further controlled, e.g. by adopting an additional labeling of the predicate symbols Pe that also indicates in which of the terms t1, . . . , tk, s1, . . . , sk and at which position e occurs, and by defining redundancy criteria. Here, we do not enter into further details.)

Corollary 3. If all clauses generated from (Her) ∪ (Ren) ∪ (P) by CRes_S^≻ contain a negative literal then S has a solution.

Proof: Follows from the remarks above and the fact that if all clauses generated contain a (selected) negative literal then □ cannot be obtained by CRes_S^≻. □

This happens e.g. if for all 1 ≤ i ≤ k, the disjunctive normal forms of si, ti do not contain constant terms.

Unification with Free Constants: One Equation. If S consists of only one equation, the last type of inferences can be proved to produce only constrained clauses with the property that all their ground instances are already subsumed by the set of ground instances of clauses of type (d) previously generated, as will be shown in Corollary 4.

Lemma 1.
The satisfiability of a constraint φ of size |φ| can be checked in time linear in |C| · |φ|.

Proof: (Sketch) Note first that a constraint

φ = ⋀_{i∈I1} (ci ∈ Xi) ∧ ⋀_{i∈I2} (di ∉ Xi) ∧ ⋀_{i∈I3,j∈I4} (Xi ⊆ Xj) ∧ ⋀_{i∈I1∪···∪I4} (Xi ⊆ C)

is satisfiable iff the set Eφ = {PXi(ci) | i ∈ I1} ∪ {¬PXj(dj) | j ∈ I2} ∪ {PXi(x) → PXj(x) | i ∈ I3, j ∈ I4} of Horn clauses is satisfiable by a Herbrand interpretation. Moreover, Eφ is satisfiable iff the set Gφ of all its ground instances (which has cardinality |C| × |φ|) is satisfiable. Since all clauses in Gφ are ground Horn clauses, it follows by results in [11] that the satisfiability of Gφ can be checked in time linear in the size of Gφ, i.e. linear in |C| × |φ|. □
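The test behind Lemma 1 amounts to closing the memberships forced by the constraint under the inclusions and then checking the non-memberships (our own illustrative sketch; the Horn encoding of the proof is replaced here by direct least-fixpoint propagation, and the encoding of constraints as lists of pairs is ours):

```python
def constraint_sat(members, nonmembers, incls, variables):
    """members/nonmembers: lists of (c, X) for c ∈ X resp. c ∉ X;
    incls: list of (X1, X2) for X1 ⊆ X2."""
    val = {X: set() for X in variables}
    for c, X in members:
        val[X].add(c)
    changed = True
    while changed:                 # propagate memberships along inclusions
        changed = False
        for X1, X2 in incls:
            if not val[X1] <= val[X2]:
                val[X2] |= val[X1]
                changed = True
    # the least solution satisfies the constraint iff no c ∉ X is violated
    return all(c not in val[X] for c, X in nonmembers)

# The constraint of clause (23) in the example, c ∈ X ∧ c ∉ X: unsatisfiable
print(constraint_sat([('c', 'X')], [('c', 'X')], [], ['X']))                 # False
# The constraint of clause (25), c ∈ Y ∧ c ∉ X ∧ X ⊆ Y: satisfiable
print(constraint_sat([('c', 'Y')], [('c', 'X')], [('X', 'Y')], ['X', 'Y']))  # True
```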
Corollary 4. Let S : {s = t} consist of only one equation. Then the satisfiability of S can be checked in time 2^{length(s)+length(t)} · (length(s) + length(t)).

Proof: (Idea) In a first step, by inferences with (Ren), Ps(X) → Pt(X) generates clauses of the form (a)–(c), and, possibly, of type (d). Let Pt(X)[[φi]], i ∈ I, be all clauses of type (d) generated this way. Similarly, let Ps(X)[[φ′j]], j ∈ J, be all clauses of type (d) generated from Pt(X) → Ps(X). By the remarks at the beginning of this subsection, φi, φ′j contain only constraints of the form X ⊆ C and ci ∈ X. By inferences between Pt(X)[[φi]], i ∈ I, and Pt(Z) → Ps(Z) [[Z ⊆ C]] the clause Ps(X)[[φi]] is generated. The clauses Pt(X)[[φ′j]] are generated, for all j ∈ J, in a similar way. The constraint of a clause of type (a′)–(d′) contains, as a conjunct, one of the constraints φi or φ′j. Let Di → Pt(X)[[ψ]] be one of the clauses in (a)–(c). A resolution with a clause of type (a′)–(d′) would produce a clause of the form D′i → Pt(X)[[ψ ∧ φi ∧ ρ]]. The set of ground instances of such a clause is redundant with respect to any set of clauses that contains all ground instances of Pt(X)[[φi]], hence (by the proof of Theorem 7) all such clauses can be considered redundant. Similar arguments concerning redundancy of generated clauses can be used to control the inferences with (Her) and to prove that all resolvents of clauses of type (a′)–(d′), as well as resolvents of inferences with (Her) and clauses of type (a)–(d), have the property that all their ground instances are subsumed by ground instances of clauses of type (d). The conclusion follows since 2^{|S|} clauses are generated this way and the constraints can be checked in linear time. □

Unification without Constants in D01. If C = ∅, then FD01(C) is the two-element lattice. By Theorem 2, a D01-unification problem S : {s1 = t1, . . . , sk = tk} with variables {y1, . . . , yn} and no constants has a solution w.r.t.
the equational theory of D01 iff the existential closure of S, ∃y1 , . . . , yn (s1 = t1 ∧ · · · ∧ sk = tk ), is valid in the two-element lattice. The number of clauses corresponding to (Her) ∪ (Ren) ∪ (P) in Theorem 5 is in this case polynomial in |ST (S)|.
Theorem 8 (Complexity). Let S be a D01 -unification problem without constants. Assume that all terms in S have been simplified by (recursively) applying the following simplification rules(5): e ∧ 1 ↦ e; e ∧ 0 ↦ 0; e ∨ 1 ↦ 1; e ∨ 0 ↦ e.
(1) If {0, 1} 6⊆ ST (S), then S always has a solution.
(2) If {0, 1} ⊆ ST (S) and S consists of only one equation (or else it contains the equation 1 = 0), then S has no solution.
(3) If S only contains the operators ∧, 0, 1, then the problem of checking whether S has a solution can be solved in polynomial time. The same holds if S only contains the operators ∨, 0, 1.
(4) Otherwise, the problem of checking whether S has a solution is NP-complete.
Proof : (Sketch) (1) Assume that 1 6∈ ST (S). Let ΦS be the set of clauses associated to S by the structure-preserving translation to clause form in Theorem 5(3).
(5) This can be done in polynomial time.
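The simplification rules of Theorem 8 can be applied bottom-up in one recursive pass, which is where the polynomial-time claim of the footnote comes from. A sketch follows; the tuple representation of terms is ours, purely for illustration.

```python
# Recursive application of the simplification rules of Theorem 8:
# e∧1 ↦ e, e∧0 ↦ 0, e∨1 ↦ 1, e∨0 ↦ e.
# A term is a nested tuple ('and', l, r) or ('or', l, r), one of the
# constant strings '0'/'1', or a variable name (any other string).

def simplify(t):
    if isinstance(t, str):
        return t                      # variable or constant: nothing to do
    op, l, r = t
    l, r = simplify(l), simplify(r)   # simplify subterms first
    if op == 'and':
        if '1' in (l, r):             # e ∧ 1 ↦ e
            return l if r == '1' else r
        if '0' in (l, r):             # e ∧ 0 ↦ 0
            return '0'
    else:  # op == 'or'
        if '0' in (l, r):             # e ∨ 0 ↦ e
            return l if r == '0' else r
        if '1' in (l, r):             # e ∨ 1 ↦ 1
            return '1'
    return (op, l, r)
```

After simplification, 0 and 1 occur in a term only if they cannot be removed, which is what the case split of Theorem 8 relies on.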
On Unification for Bounded Distributive Lattices
Since no constant occurs in S, all the clauses in ΦS are non-positive, so ΦS is satisfiable (consider a selection function that selects all negative literals in all clauses; then no resolution inference is possible). The case 0 6∈ ST (S) follows by duality. (2) is obvious, and (3) follows from the second part of Theorem 6. (4) The problem of deciding whether a unification problem without constants has a solution is clearly in NP. NP-hardness follows from the fact that the satisfiability problem for Boolean formulae of the form E = F ∧ ¬G, where F and G only contain the operators ∨ and ∧ (which is NP-complete [15]), can be reduced in polynomial time to the satisfiability of a D01 -unification problem with at least two equations si = 0 and sj = 1, namely S : {F = 1, G = 0}. 2
4.5 Unification with Linear Constant Restrictions
We now show that the idea used in the method described in Section 4.1 can be adapted to give an algorithm for unification with linear constant restrictions. We first express the fact that t ∈ FD01 (C\{c}) by using the isomorphism ηC : FD01 (C) → O(P(C), ⊆) defined for every t ∈ FD01 (C) by ηC (t) = {f⁻¹(1) ∩ C | f : FD01 (C) → 2 is a 0,1-lattice homomorphism with f (t) = 1} (cf. Theorem 1).
Lemma 2. If t ∈ FD01 (C) then there exists t′ ∈ FD01 (C\{c}) such that t ≡D01 t′ iff ηC (t) = ∅ or C\{c} ∈ ηC (t). Proof : (Idea) This is a consequence of the fact that for every t ∈ FD01 (C), there exists t′ ∈ FD01 (C\{c}) such that t ≡D01 t′ iff for every X ⊆ C, if c ∈ X then either X 6∈ ηC (t) or X is not minimal in ηC (t). The proof of the equivalence above uses the way ηD is defined for every D, and the fact that there exists an injective homomorphism i : FD01 (C\{c}) → FD01 (C), such that for every U ∈ O(P(C\{c})), ηC (i(η⁻¹C\{c} (U ))) is the order-filter generated by U in (P(C), ⊆). 2
As in the case of unification with free constants, the remark above justifies a structure-preserving translation to clause form.
Theorem 9. Let S : {s1 = t1 , . . . , sk = tk } be a D01 -unification problem with linear constant restrictions Lcr, constants C and variables Y . The following are equivalent: (1) S has a solution w.r.t. the equational theory of D01 . (2) The conjunction of the following set of formulae is satisfiable:
(Her) Py (X1 ) → Py (X2 ), for all X1 ⊆ X2 ⊆ C, y ∈ Y
(Ren) Pe1∧e2 (X) ↔ Pe1 (X) ∧ Pe2 (X), for all X ⊆ C
      Pe1∨e2 (X) ↔ Pe1 (X) ∨ Pe2 (X), for all X ⊆ C
      P1 (X), for all X ⊆ C
      ¬P0 (X), for all X ⊆ C
      Pc (X), for all X ⊆ C with c ∈ X
      ¬Pc (X), for all X ⊆ C with c 6∈ X
(Lcr) Py (X) → Py (C\{c}), for all X ⊆ C, if y < c ∈ Lcr
(P) Psi (X) ↔ Pti (X), for all X ⊆ C, 1 ≤ i ≤ k
Proof : (Sketch) This follows from Definition 2, Lemma 2, and arguments similar to those used for Theorem 5. 2
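To make the size of the translation in Theorem 9 concrete, the sketch below enumerates the ground instances of the (Her) clauses over all pairs of subsets X1 ⊆ X2 ⊆ C; the (Ren), (Lcr), and (P) groups are grounded analogously. The clause encoding is ours, for illustration only. Note that the number of (Her) instances per variable is 3^|C|, so the ground translation is exponential in |C|, matching the bound discussed in the conclusion.

```python
from itertools import combinations

# Ground (Her) clauses of Theorem 9:  Py(X1) -> Py(X2)  for all
# X1 ⊆ X2 ⊆ C and every variable y.  Subsets are modelled as frozensets;
# a clause is a pair (antecedent_atom, consequent_atom).

def subsets(C):
    elems = sorted(C)
    for k in range(len(elems) + 1):
        for combo in combinations(elems, k):
            yield frozenset(combo)

def her_clauses(C, variables):
    clauses = []
    for y in variables:
        for X1 in subsets(C):
            for X2 in subsets(C):
                if X1 <= X2:                     # only X1 ⊆ X2 pairs
                    clauses.append((('P', y, X1), ('P', y, X2)))
    return clauses
```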
5 Conclusion
We presented a resolution-based method for deciding unifiability w.r.t. the equational theory of bounded distributive lattices with operators. The method uses the Priestley representation for bounded distributive lattices, in particular the description of the dual D(FD01 (C)) of the free lattice in D01 over C as (P(C), ⊆). This helped us to reduce the problem of checking whether a D01 -unification problem S with constants C (and linear constant restrictions) has a solution, to the problem of checking the satisfiability of a set ΦS of clauses. ΦS can be represented both as a finite set of ground clauses, and as a set of constrained clauses; the latter representation is much more compact. We formulated a resolution calculus for such constrained clauses and proved its soundness and completeness. The method we give is in general still exponential in |ST (S)| · 2^|C|. However, in several situations it is more efficient than other existing methods: syntactic information about the terms in S is sometimes reflected by the form of clauses, which allows us to establish better upper bounds for these particular problems. Our algorithm also behaves well for unification problems consisting of only one equation. It would be interesting to compare our method with more general unification algorithms, e.g. based on rewriting. One such algorithm [8] decides unifiability over the free algebra, i.e. in an algebraic extension of the free algebra. We analyzed this more general problem for the equational theory of D01 . As part of work in progress, we prove that this reduces to Boolean unifiability (due to the fact that the algebraically closed elements of D01 are the Boolean lattices).
Acknowledgments. I thank H. Ganzinger for drawing my attention to the results on unification with linear constant restrictions (cf. e.g.
[5]) and on Boolean unification [1], and to the possibility of using constrained clauses (for theorem proving for many-valued logics), an idea which proved to be useful in this paper. I thank J.M. Talbot for the discussion we had on possible applications of unification to set constraints.
References
1. F. Baader. On the complexity of Boolean unification. Information Processing Letters, 67(4):215–220, 1998.
2. F. Baader and P. Narendran. Unification of concept terms in description logics. In H. Prade, editor, Proceedings of ECAI'98, pages 331–335. Wiley, 1998.
3. F. Baader and K.U. Schulz. Unification in the union of disjoint equational theories: Combining decision procedures. J. Symbolic Computation, 21:211–243, 1996.
4. F. Baader and K.U. Schulz. Combination of constraint solvers for free and quasi-free structures. Theoretical Computer Science, 192:107–161, 1998.
5. F. Baader and W. Snyder. Unification theory. In J.A. Robinson and A. Voronkov, editors, Handbook of Automated Reasoning. Elsevier, 2000. To appear.
6. L. Bachmair, H. Ganzinger, and U. Waldmann. Set constraints are the monadic class. In Eighth Annual IEEE Symposium on Logic in Computer Science, pages 75–83, Montreal, Canada, June 19–23, 1993. IEEE Computer Society Press, Los Alamitos, CA, USA.
7. A. Bockmayr. Algebraic and logical aspects of unification. In K.U. Schulz, editor, Proceedings of Word Equations and Related Topics IWWERT'90, Tübingen, October 1990, LNCS 572, pages 171–180. Springer Verlag, 1992.
8. A. Bockmayr. Model-theoretic aspects of unification. In K.U. Schulz, editor, Proceedings of Word Equations and Related Topics IWWERT'90, Tübingen, October 1990, LNCS 572, pages 181–196. Springer Verlag, 1992.
9. H.J. Bürckert. A resolution principle for a logic with restricted quantifiers. LNAI 568. Springer Verlag, 1991.
10. S. Burris and H.P. Sankappanavar. A Course in Universal Algebra. Graduate Texts in Mathematics. Springer, 1981.
11. W.F. Dowling and J.H. Gallier. Linear-time algorithms for testing the satisfiability of propositional Horn formulae. J. Logic Programming, 1(3):267–284, 1984.
12. J.A. Gerhard and M. Petrich. Unification in free distributive lattices. Theoretical Computer Science, 126(2):237–257, 1994.
13. S. Ghilardi. Unification through projectivity. J. Logic and Computation, 7(6):733–752, 1997.
14. H.B. Hunt, D.J. Rosenkrantz, and P.A. Bloniarz. On the computational complexity of algebra of lattices. SIAM J. Comput., 16(1):129–148, 1987.
15. H.B. Hunt and R.E. Stearns. The complexity of very simple Boolean formulas with applications. SIAM J. Comput., 19(1):44–70, 1990.
16. C. Kirchner and H. Kirchner. Constrained equational reasoning. In UNIF'89, Extended Abstracts of the 3rd Int. Workshop on Unification, pages 160–171, Pfalzakademie, Lambrecht, 1989.
17. R. Nieuwenhuis and A. Rubio. Theorem proving with ordering constrained clauses. In D. Kapur, editor, Proceedings of CADE-11, LNAI 607, pages 477–491. Springer Verlag, 1992.
18. H.A. Priestley. Ordered topological spaces and the representation of distributive lattices. Proc. London Math. Soc., 3:507–530, 1972.
19. M. Schmidt-Schauß. A decision algorithm for distributive unification. Theoretical Computer Science, 208(1–2):111–148, 1998.
20. V. Sofronie-Stokkermans. On the universal theory of varieties of distributive lattices with operators: Some decidability and complexity results. In H. Ganzinger, editor, Proceedings of CADE-16, LNAI 1632, pages 157–171. Springer Verlag, 1999.
Reasoning with Individuals for the Description Logic SHIQ
Ian Horrocks (1), Ulrike Sattler (2), and Stephan Tobies (2)
(1) Department of Computer Science, University of Manchester, UK, [email protected]
(2) LuFg Theoretical Computer Science, RWTH Aachen, Germany, {sattler,tobies}@informatik.rwth-aachen.de
Abstract. While there has been a great deal of work on the development of reasoning algorithms for expressive description logics, in most cases only Tbox reasoning is considered. In this paper we present an algorithm for combined Tbox and Abox reasoning in the SHIQ description logic. This algorithm is of particular interest as it can be used to decide the problem of (database) conjunctive query containment w.r.t. a schema. Moreover, the realisation of an efficient implementation should be relatively straightforward as it can be based on an existing highly optimised implementation of the Tbox algorithm in the FaCT system.
1 Motivation
A description logic (DL) knowledge base (KB) is made up of two parts, a terminological part (the terminology or Tbox) and an assertional part (the Abox), each part consisting of a set of axioms. The Tbox asserts facts about concepts (sets of objects) and roles (binary relations), usually in the form of inclusion axioms, while the Abox asserts facts about individuals (single objects), usually in the form of instantiation axioms. For example, a Tbox might contain an axiom asserting that Man is subsumed by Animal, while an Abox might contain axioms asserting that both Aristotle and Plato are instances of the concept Man and that the pair hAristotle, Platoi is an instance of the role Pupil-of. For logics that include full negation, all common DL reasoning tasks are reducible to deciding KB consistency, i.e., determining if a given KB admits a non-empty interpretation [6]. There has been a great deal of work on the development of reasoning algorithms for expressive DLs [2,12,16,11], but in most cases these consider only Tbox reasoning (i.e., the Abox is assumed to be empty). With expressive DLs, determining consistency of a Tbox can often be reduced to determining the satisfiability of a single concept [2,23,3], and, as most DLs enjoy the tree model property (i.e., if a concept has a model, then it has a tree model), this problem can be decided using a tableau-based decision procedure. The relative lack of interest in Abox reasoning can also be explained by the fact that many applications only require Tbox reasoning, e.g., ontological engineering [15,20] and schema integration [10]. Of particular interest in this regard is the DL SHIQ [18], which is powerful enough to encode the logic DLR [10], and which can thus be used
D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 482–496, 2000. © Springer-Verlag Berlin Heidelberg 2000
for reasoning about conceptual data models, e.g., Entity-Relationship (ER) schemas [9]. Moreover, if we think of the Tbox as a schema and the Abox as (possibly incomplete) data, then it seems reasonable to assume that realistic Tboxes will be of limited size, whereas realistic Aboxes could be of almost unlimited size. Given the high complexity of reasoning in most DLs [23,7], this suggests that Abox reasoning could lead to severe tractability problems in realistic applications.1 However, SHIQ Abox reasoning is of particular interest as it allows DLR schema reasoning to be extended to reasoning about conjunctive query containment w.r.t. a schema [8]. This is achieved by using Abox individuals to represent variables and constants in the queries, and to enforce co-references [17]. In this context, the size of the Abox would be quite small (it is bounded by the number of variables occurring in the queries), and should not lead to severe tractability problems. Moreover, an alternative view of the Abox is that it provides a restricted form of reasoning with nominals, i.e., allowing individual names to appear in concepts [22,5,1]. Unrestricted nominals are very powerful, allowing arbitrary co-references to be enforced and thus leading to the loss of the tree model property. This makes it much harder to prove decidability and to devise decision procedures (the decidability of SHIQ with unrestricted nominals is still an open problem). An Abox, on the other hand, can be modelled by a forest, a set of trees whose root nodes form an arbitrarily connected graph, where the number of trees is limited by the number of individual names occurring in the Abox. Even the restricted form of co-referencing provided by an Abox is quite powerful, and can extend the range of applications for DL reasoning services.
In this paper we present a tableau-based algorithm for deciding the satisfiability of unrestricted SHIQ KBs (i.e., ones where the Abox may be non-empty) that extends the existing consistency algorithm for Tboxes [18] by making use of the forest model property. This should make the realisation of an efficient implementation relatively straightforward as it can be based on an existing highly optimised implementation of the Tbox algorithm (e.g., in the FaCT system [14]). A notable feature of the algorithm is that, instead of making a unique name assumption w.r.t. all individuals (an assumption commonly made in DLs [4]), increased flexibility is provided by allowing the Abox to contain axioms explicitly asserting inequalities between pairs of individual names (adding such an axiom for every pair of individual names is obviously equivalent to making a unique name assumption).
2 Preliminaries
In this section, we introduce the DL SHIQ. This includes the definition of syntax, semantics, inference problems (concept subsumption and satisfiability, Abox consistency, and all of these problems with respect to terminologies(2)), and their relationships. SHIQ is based on an extension of the well-known DL ALC [24] to include transitively closed primitive roles [21]; we call this logic S due to its relationship with
(1) Although suitably optimised algorithms may make reasoning practicable for quite large Aboxes [13].
(2) We use terminologies instead of Tboxes to underline the fact that we allow for general concept inclusion axioms and do not disallow cycles.
the propositional (multi-)modal logic S4(m) [23].(3) This basic DL is then extended with inverse roles (I), role hierarchies (H), and qualifying number restrictions (Q).
Definition 1. Let C be a set of concept names and R a set of role names with a subset R+ ⊆ R of transitive role names. The set of roles is R ∪ {R− | R ∈ R}. To avoid considering roles such as R−−, we define a function Inv on roles such that Inv(R) = R− if R is a role name, and Inv(R) = S if R = S−. We also define a function Trans which returns true iff R is a transitive role. More precisely, Trans(R) = true iff R ∈ R+ or Inv(R) ∈ R+. A role inclusion axiom is an expression of the form R v S, where R and S are roles, each of which can be inverse. A role hierarchy is a set of role inclusion axioms. For a role hierarchy R, we define the relation v* to be the transitive-reflexive closure of v over R ∪ {Inv(R) v Inv(S) | R v S ∈ R}. A role R is called a sub-role (resp. super-role) of a role S if R v* S (resp. S v* R). A role is simple if it is neither transitive nor has any transitive sub-roles. The set of SHIQ-concepts is the smallest set such that
– every concept name is a concept, and
– if C, D are concepts, R is a role, S is a simple role, and n is a non-negative integer, then C u D, C t D, ¬C, ∀R.C, ∃R.C, >nS.C, and 6nS.C are also concepts.
A general concept inclusion axiom (GCI) is an expression of the form C v D for two SHIQ-concepts C and D. A terminology is a set of GCIs. Let I = {a, b, c, . . . } be a set of individual names. An assertion is of the form a : C, (a, b) : R, or a 6= b for a, b ∈ I, a (possibly inverse) role R, and a SHIQ-concept C. An Abox is a finite set of assertions. Next, we define the semantics of SHIQ and the corresponding inference problems. Definition 2.
An interpretation I = (∆I , ·I ) consists of a set ∆I , called the domain of I, and a valuation ·I which maps every concept to a subset of ∆I and every role to a subset of ∆I × ∆I such that, for all concepts C, D, roles R, S, and non-negative integers n, the following equations are satisfied, where ]M denotes the cardinality of a set M and (RI )+ the transitive closure of RI :
RI = (RI )+ for each role R ∈ R+ (transitive roles)
(R− )I = {hx, yi | hy, xi ∈ RI } (inverse roles)
(C u D)I = C I ∩ DI (conjunction)
(C t D)I = C I ∪ DI (disjunction)
(¬C)I = ∆I \ C I (negation)
(∃R.C)I = {x | ∃y. hx, yi ∈ RI and y ∈ C I } (exists restriction)
(∀R.C)I = {x | ∀y. hx, yi ∈ RI implies y ∈ C I } (value restriction)
(>nR.C)I = {x | ]{y | hx, yi ∈ RI and y ∈ C I } > n} (>-number restriction)
(6nR.C)I = {x | ]{y | hx, yi ∈ RI and y ∈ C I } 6 n} (6-number restriction)
An interpretation I satisfies a role hierarchy R iff RI ⊆ S I for each R v S in R. Such an interpretation is called a model of R (written I |= R).
(3) The logic S has previously been called ALC R+ , but this becomes too cumbersome when adding letters to represent additional features.
An interpretation I satisfies a terminology T iff C I ⊆ DI for each GCI C v D in T . Such an interpretation is called a model of T (written I |= T ). A concept C is called satisfiable with respect to a role hierarchy R and a terminology T iff there is a model I of R and T with C I 6= ∅. A concept D subsumes a concept C w.r.t. R and T iff C I ⊆ DI holds for each model I of R and T . For an interpretation I, an element x ∈ ∆I is called an instance of a concept C iff x ∈ C I . For Aboxes, an interpretation maps, additionally, each individual a ∈ I to some element aI ∈ ∆I . An interpretation I satisfies an assertion a : C iff aI ∈ C I , (a, b) : R iff haI , bI i ∈ RI , and a 6= b iff aI 6= bI . An Abox A is consistent w.r.t. R and T iff there is a model I of R and T that satisfies each assertion in A. For DLs that are closed under negation, subsumption and (un)satisfiability can be mutually reduced: C v D iff C u ¬D is unsatisfiable, and C is unsatisfiable iff C v A u ¬A for some concept name A. Moreover, a concept C is satisfiable iff the Abox {a : C} is consistent. It is straightforward to extend these reductions to role hierarchies, but terminologies deserve special care: In [2,23,3], the internalisation of GCIs is introduced, a technique that reduces reasoning w.r.t. a (possibly cyclic) terminology to reasoning w.r.t. the empty terminology. For SHIQ, this reduction must be slightly modified. The following lemma shows how general concept inclusion axioms can be internalised using a "universal" role U , that is, a transitive super-role of all roles occurring in T and their respective inverses. Lemma 1. Let C, D be concepts, A an Abox, T a terminology, and R a role hierarchy. We define CT := u{¬Ci t Di | Ci v Di ∈ T }.
Let U be a transitive role that does not occur in T , C, D, A, or R. We set RU := R ∪ {R v U, Inv(R) v U | R occurs in T , C, D, A, or R}.
– C is satisfiable w.r.t. T and R iff C u CT u ∀U.CT is satisfiable w.r.t. RU .
– D subsumes C with respect to T and R iff C u ¬D u CT u ∀U.CT is unsatisfiable w.r.t. RU .
– A is consistent with respect to R and T iff A ∪ {a : CT u ∀U.CT | a occurs in A} is consistent w.r.t. RU .
The proof of Lemma 1 is similar to the ones that can be found in [23,2]. Most importantly, it must be shown that (a) if a SHIQ-concept C is satisfiable with respect to a terminology T and a role hierarchy R, then C, T have a connected model, i.e., a model where any two elements are connected by a role path over those roles occurring in C and T , and (b) if y is reachable from x via a role path (possibly involving inverse roles), then hx, yi ∈ U I . These are easy consequences of the semantics and the definition of U.
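The construction of Lemma 1 is easy to mechanise: build CT as the conjunction of all ¬Ci t Di, add a fresh transitive role U above every occurring role and its inverse, and conjoin ∀U.CT. A sketch follows; the tuple encodings of concepts and roles are ours, not the authors' notation.

```python
# Internalisation of GCIs as in Lemma 1.  A GCI is a pair (C_i, D_i);
# concepts are nested tuples such as ('and', ...), ('or', l, r),
# ('not', c), ('all', role, c).  Purely illustrative encoding.

def internalise(tbox, roles, universal='U'):
    # C_T := conjunction of ¬C_i ⊔ D_i over all GCIs C_i ⊑ D_i in T
    c_t = ('and', *[('or', ('not', c), d) for c, d in tbox])
    # R_U := R plus  R ⊑ U  and  Inv(R) ⊑ U  for every occurring role;
    # U must additionally be declared transitive.
    role_hierarchy = [(r, universal) for r in roles] + \
                     [(('inv', r), universal) for r in roles]
    def reduce_concept(c):
        # C is satisfiable w.r.t. T iff this concept is satisfiable
        # w.r.t. the extended role hierarchy alone.
        return ('and', c, c_t, ('all', universal, c_t))
    return reduce_concept, role_hierarchy
```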
Theorem 1. Satisfiability and subsumption of SHIQ-concepts w.r.t. terminologies and role hierarchies are polynomially reducible to (un)satisfiability of SHIQ-concepts w.r.t. role hierarchies, and therefore to consistency of SHIQ-Aboxes w.r.t. role hierarchies. Consistency of SHIQ-Aboxes w.r.t. terminologies and role hierarchies is polynomially reducible to consistency of SHIQ-Aboxes w.r.t. role hierarchies.
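The reductions behind Theorem 1 can be phrased in a few lines, assuming some Abox-consistency oracle consistent(abox, role_hierarchy) (a hypothetical name for a decision procedure such as the one developed in Section 3): C is satisfiable iff the one-assertion Abox {a : C} is consistent, and D subsumes C iff C u ¬D is unsatisfiable.

```python
# Reductions of Section 2: satisfiability and subsumption via an
# assumed Abox-consistency oracle.  Concept tuples are illustrative.

def satisfiable(c, consistent, rh=()):
    # C is satisfiable iff the Abox {a : C} is consistent
    return consistent([('a', c)], rh)

def subsumes(d, c, consistent, rh=()):
    # C ⊑ D iff C ⊓ ¬D is unsatisfiable
    return not satisfiable(('and', c, ('not', d)), consistent, rh)
```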
3 A SHIQ-Abox Tableau Algorithm
With Theorem 1, all standard inference problems for SHIQ-concepts and Aboxes can be reduced to Abox-consistency w.r.t. a role hierarchy. In the following, we present a tableau-based algorithm that decides consistency of SHIQ-Aboxes w.r.t. role hierarchies, and therefore all other SHIQ inference problems presented. The algorithm tries to construct, for a SHIQ-Abox A, a tableau for A, that is, an abstraction of a model of A. Given the notion of a tableau, it is then quite straightforward to prove that the algorithm is a decision procedure for Abox consistency.
3.1 A Tableau for Aboxes
In the following, if not stated otherwise, C, D denote SHIQ-concepts, R a role hierarchy, A an Abox, RA the set of roles occurring in A and R together with their inverses, and IA is the set of individuals occurring in A. Without loss of generality, we assume all concepts C occurring in assertions a : C ∈ A to be in NNF, that is, negation occurs in front of concept names only. Any SHIQ-concept can easily be transformed into an equivalent one in NNF by pushing negations inwards using a combination of DeMorgan's laws and the following equivalences:
¬(∃R.C) ≡ (∀R.¬C)
¬(∀R.C) ≡ (∃R.¬C)
¬(6nR.C) ≡ >(n + 1)R.C
¬(>nR.C) ≡ 6(n − 1)R.C, where 6(−1)R.C := A u ¬A for some A ∈ C
For a concept C we will denote the NNF of ¬C by ∼C. Next, for a concept C, clos(C) is the smallest set that contains C and is closed under sub-concepts and ∼. We use clos(A) := ∪{clos(C) | a : C ∈ A}, the union of the closures clos(C) of each concept C occurring in A. It is not hard to show that the size of clos(A) is polynomial in the size of A.
Definition 3. T = (S, L, E, I) is a tableau for A w.r.t. R iff
– S is a non-empty set,
– L : S → 2^clos(A) maps each element in S to a set of concepts,
– E : RA → 2^(S×S) maps each role to a set of pairs of elements in S, and
– I : IA → S maps individuals occurring in A to elements in S.
Furthermore, for all s, t ∈ S, C, C1 , C2 ∈ clos(A), and R, S ∈ RA , T satisfies:
(P1) if C ∈ L(s), then ¬C 6∈ L(s),
(P2) if C1 u C2 ∈ L(s), then C1 ∈ L(s) and C2 ∈ L(s),
(P3) if C1 t C2 ∈ L(s), then C1 ∈ L(s) or C2 ∈ L(s),
(P4) if ∀S.C ∈ L(s) and hs, ti ∈ E(S), then C ∈ L(t),
(P5) if ∃S.C ∈ L(s), then there is some t ∈ S such that hs, ti ∈ E(S) and C ∈ L(t),
(P6) if ∀S.C ∈ L(s) and hs, ti ∈ E(R) for some R v* S with Trans(R), then ∀R.C ∈ L(t),
(P7) hx, yi ∈ E(R) iff hy, xi ∈ E(Inv(R)),
(P8) if hs, ti ∈ E(R) and R v* S, then hs, ti ∈ E(S),
(P9) if 6nS.C ∈ L(s), then ]S T (s, C) 6 n,
(P10) if >nS.C ∈ L(s), then ]S T (s, C) > n,
(P11) if (./ n S C) ∈ L(s) and hs, ti ∈ E(S), then C ∈ L(t) or ∼C ∈ L(t),
(P12) if a : C ∈ A, then C ∈ L(I(a)),
(P13) if (a, b) : R ∈ A, then hI(a), I(b)i ∈ E(R),
(P14) if a 6= b ∈ A, then I(a) 6= I(b),
where ./ is a place-holder for both 6 and >, and S T (s, C) := {t ∈ S | hs, ti ∈ E(S) and C ∈ L(t)}.
Lemma 2. A SHIQ-Abox A is consistent w.r.t. R iff there exists a tableau for A w.r.t. R.
Proof: For the if direction, if T = (S, L, E, I) is a tableau for A w.r.t. R, a model I = (∆I , ·I ) of A and R can be defined as follows:
∆I := S,
AI := {s | A ∈ L(s)} for concept names A in clos(A),
aI := I(a) for individual names a ∈ I, and
RI := E(R)+ if Trans(R), and RI := E(R) ∪ ∪{P I | P v* R, P 6= R} otherwise, for role names R ∈ R,
where E(R)+ denotes the transitive closure of E(R). The interpretation of non-transitive roles is recursive in order to correctly interpret those non-transitive roles that have a transitive sub-role. From the definition of RI and (P8), it follows that, if hs, ti ∈ S I , then either hs, ti ∈ E(S) or there exists a path hs, s1 i, hs1 , s2 i, . . . , hsn , ti ∈ E(R) for some R with Trans(R) and R v* S. Due to (P8) and by definition of I, we have that I is a model of R. To prove that I is a model of A, we show that C ∈ L(s) implies s ∈ C I for any s ∈ S. Together with (P12), (P13), and the interpretation of individuals and roles, this implies that I satisfies each assertion in A. This proof can be given by induction on the length kCk of a concept C in NNF, where we count neither negation nor integers in number restrictions. The only interesting case is C = ∀S.E: let t ∈ S with hs, ti ∈ S I . There are two possibilities: – hs, ti ∈ E(S). Then (P4) implies E ∈ L(t). – hs, ti 6∈ E(S). Then there exists a path hs, s1 i, hs1 , s2 i, . . . , hsn , ti ∈ E(R) for some R with Trans(R) and R v * S. Then (P6) implies ∀R.E ∈ L(si ) for all 1 ≤ i ≤ n, and (P4) implies E ∈ L(t).
488
Ian Horrocks, Ulrike Sattler, and Stephan Tobies
In both cases, t ∈ E I by induction and hence s ∈ C I . For the converse, for I = (∆I , ·I ) a model of A w.r.t. R, we define a tableau T = (S, L, E, I) for A and R as follows: S := ∆I ,
E(R) := RI ,
L(s) := {C ∈ clos(A) | s ∈ C I },
and
It is easy to demonstrate that T is a tableau for D.
I(a) = aI . t u
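The transformation to NNF assumed in Section 3.1 (and thus the ∼C operation) can be sketched as one recursive pass over concepts, using DeMorgan's laws and the dualities for ∀/∃ and number restrictions, including the 6(−1)R.C := A u ¬A special case. The tuple encoding of concepts is ours, purely for illustration.

```python
# Negation normal form: push 'not' inwards until it meets concept names.
# Concepts: strings (names), ('not', c), ('and'/'or', l, r),
# ('some'/'all', role, c), ('atleast'/'atmost', n, role, c).

def nnf(c):
    if isinstance(c, str):
        return c
    if c[0] != 'not':                       # recurse below other operators
        if c[0] in ('and', 'or'):
            return (c[0], nnf(c[1]), nnf(c[2]))
        if c[0] in ('some', 'all'):
            return (c[0], c[1], nnf(c[2]))
        return (c[0], c[1], c[2], nnf(c[3]))  # number restrictions
    d = c[1]                                # c = ('not', d)
    if isinstance(d, str):
        return c                            # negated concept name: done
    op = d[0]
    if op == 'not':                         # double negation
        return nnf(d[1])
    if op == 'and':                         # DeMorgan
        return ('or', nnf(('not', d[1])), nnf(('not', d[2])))
    if op == 'or':
        return ('and', nnf(('not', d[1])), nnf(('not', d[2])))
    if op == 'some':                        # ¬∃R.C ≡ ∀R.¬C
        return ('all', d[1], nnf(('not', d[2])))
    if op == 'all':                         # ¬∀R.C ≡ ∃R.¬C
        return ('some', d[1], nnf(('not', d[2])))
    if op == 'atmost':                      # ¬(≤n R.C) ≡ ≥(n+1) R.C
        return ('atleast', d[1] + 1, d[2], nnf(d[3]))
    if d[1] == 0:                           # ¬(≥0 R.C): ≤(−1)R.C := A ⊓ ¬A
        return ('and', 'A', ('not', 'A'))
    return ('atmost', d[1] - 1, d[2], nnf(d[3]))
```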
3.2 The Tableau Algorithm
In this section, we present a completion algorithm that tries to construct, for an input Abox A and a role hierarchy R, a tableau for A w.r.t. R. We prove that this algorithm constructs a tableau for A and R iff there exists a tableau for A and R, and thus decides consistency of SHIQ Aboxes w.r.t. role hierarchies. Since Aboxes might involve several individuals with arbitrary role relationships between them, the completion algorithm works on a forest rather than on a tree, which is the basic data structure for those completion algorithms deciding satisfiability of a concept. Such a forest is a collection of trees whose root nodes correspond to the individuals present in the input Abox. In the presence of transitive roles, blocking is employed to ensure termination of the algorithm. In the additional presence of inverse roles, blocking is dynamic, i.e., blocked nodes (and their sub-branches) can be un-blocked and blocked again later. In the additional presence of number restrictions, pairs of nodes are blocked rather than single nodes.
Definition 4. A completion forest F for a SHIQ Abox A is a collection of trees whose distinguished root nodes are possibly connected by edges in an arbitrary way. Moreover, each node x is labelled with a set L(x) ⊆ clos(A) and each edge hx, yi is labelled with a set L(hx, yi) ⊆ RA of (possibly inverse) roles occurring in A. Finally, completion forests come with an explicit inequality relation 6= on nodes and an explicit equality relation =, which are implicitly assumed to be symmetric. If nodes x and y are connected by an edge hx, yi with R ∈ L(hx, yi) and R v* S, then y is called an S-successor of x and x is called an Inv(S)-predecessor of y. If y is an S-successor or an Inv(S)-predecessor of x, then y is called an S-neighbour of x. A node y is a successor (resp. predecessor or neighbour) of a node x if it is an S-successor (resp. S-predecessor or S-neighbour) of x for some role S. Finally, ancestor is the transitive closure of predecessor. For a role S, a concept C and a node x in F we define S F (x, C) by S F (x, C) := {y | y is an S-neighbour of x and C ∈ L(y)}. A node is blocked iff it is not a root node and it is either directly or indirectly blocked. A node x is directly blocked iff none of its ancestors are blocked, and it has ancestors x0 , y and y0 such that
1. y is not a root node and
2. x is a successor of x0 and y is a successor of y0 and
3. L(x) = L(y) and L(x0 ) = L(y0 ) and
4. L(hx0 , xi) = L(hy0 , yi).
In this case we will say that y blocks x. A node y is indirectly blocked iff one of its ancestors is blocked, or it is a successor of a node x and L(hx, yi) = ∅; the latter condition avoids wasted expansions after an application of the 6-rule. Given a SHIQ-Abox A and a role hierarchy R, the algorithm initialises a completion forest FA consisting only of root nodes. More precisely, FA contains a root node xi0 for each individual ai ∈ IA occurring in A, and an edge hxi0 , xj0 i if A contains an assertion (ai , aj ) : R for some R. The labels of these nodes and edges and the relations 6= and = are initialised as follows: L(xi0 ) := {C | ai : C ∈ A}, L(hxi0 , xj0 i) := {R | (ai , aj ) : R ∈ A}, xi0 6= xj0 iff ai 6= aj ∈ A, and the =-relation is initialised to be empty. FA is then expanded by repeatedly applying the rules from Figure 1. For a node x, L(x) is said to contain a clash if, for some concept name A ∈ C, {A, ¬A} ⊆ L(x), or if there is some concept 6nS.C ∈ L(x) and x has n + 1 S-neighbours y0 , . . . , yn with C ∈ L(yi ) and yi 6= yj for all 0 ≤ i < j ≤ n. A completion forest is clash-free if none of its nodes contains a clash, and it is complete if no rule from Figure 1 can be applied to it. For a SHIQ-Abox A, the algorithm starts with the completion forest FA . It applies the expansion rules in Figure 1, stopping when a clash occurs, and answers "A is consistent w.r.t. R" iff the completion rules can be applied in such a way that they yield a complete and clash-free completion forest, and "A is inconsistent w.r.t. R" otherwise. Since both the 6-rule and the 6r -rule are rather complicated, they deserve some more explanation. Both rules deal with the situation where a concept 6nR.C ∈ L(x) requires the identification of two R-neighbours y, z of x that contain C in their labels. Of course, y and z may only be identified if y 6= z is not asserted.
If these conditions are met, then one of the two rules can be applied. The 6-rule deals with the case where at least one of the nodes to be identified, namely y, is not a root node, and this can lead to one of two possible situations, both shown in Figure 2. The upper situation occurs when both y and z are successors of x. In this case, we add the label of y to that of z, and the label of the edge hx, yi to the label of the edge hx, zi. Finally, z inherits all inequalities from y, and L(hx, yi) is set to ∅, thus blocking y and all its successors. The second situation occurs when both y and z are neighbours of x, but z is the predecessor of x. Again, L(y) is added to L(z), but in this case the inverse of L(hx, yi) is added to L(hz, xi), because the edge hx, yi was pointing away from x while hz, xi points towards it. Again, z inherits the inequalities from y and L(hx, yi) is set to ∅. The 6r rule handles the identification of two root nodes. An example of the whole procedure is given in the lower part of Figure 2. In this case, special care has to be taken to preserve the relations introduced into the completion forest due to role assertions in
490
Ian Horrocks, Ulrike Sattler, and Stephan Tobies
⊓-rule: if C1 ⊓ C2 ∈ L(x), x is not indirectly blocked, and {C1, C2} ⊄ L(x),
then L(x) → L(x) ∪ {C1, C2}

⊔-rule: if C1 ⊔ C2 ∈ L(x), x is not indirectly blocked, and {C1, C2} ∩ L(x) = ∅,
then L(x) → L(x) ∪ {E} for some E ∈ {C1, C2}

∃-rule: if ∃S.C ∈ L(x), x is not blocked, and x has no S-neighbour y with C ∈ L(y),
then create a new node y with L(⟨x, y⟩) := {S} and L(y) := {C}

∀-rule: if ∀S.C ∈ L(x), x is not indirectly blocked, and there is an S-neighbour y of x with C ∉ L(y),
then L(y) → L(y) ∪ {C}

∀+-rule: if ∀S.C ∈ L(x), x is not indirectly blocked, there is some R with Trans(R) and R ⊑* S, and there is an R-neighbour y of x with ∀R.C ∉ L(y),
then L(y) → L(y) ∪ {∀R.C}

choose-rule: if (⋈ n S C) ∈ L(x), x is not indirectly blocked, and there is an S-neighbour y of x with {C, ∼C} ∩ L(y) = ∅,
then L(y) → L(y) ∪ {E} for some E ∈ {C, ∼C}

≥-rule: if ≥nS.C ∈ L(x), x is not blocked, and there are no n S-neighbours y1, …, yn such that C ∈ L(yi) and yi ≠̇ yj for 1 ≤ i < j ≤ n,
then create n new nodes y1, …, yn with L(⟨x, yi⟩) = {S}, L(yi) = {C}, and yi ≠̇ yj for 1 ≤ i < j ≤ n

≤-rule: if 1. ≤nS.C ∈ L(x), x is not indirectly blocked, and 2. ♯S^F(x, C) > n, there are S-neighbours y, z of x with not y ≠̇ z, y is neither a root node nor an ancestor of z, and C ∈ L(y) ∩ L(z),
then 1. L(z) → L(z) ∪ L(y), and
2. if z is an ancestor of x, then L(⟨z, x⟩) → L(⟨z, x⟩) ∪ Inv(L(⟨x, y⟩)), else L(⟨x, z⟩) → L(⟨x, z⟩) ∪ L(⟨x, y⟩),
3. L(⟨x, y⟩) → ∅, and
4. set u ≠̇ z for all u with u ≠̇ y

≤r-rule: if 1. ≤nS.C ∈ L(x), and 2. ♯S^F(x, C) > n and there are two S-neighbours y, z of x which are both root nodes, C ∈ L(y) ∩ L(z), and not y ≠̇ z,
then 1. L(z) → L(z) ∪ L(y), and
2. for all edges ⟨y, w⟩: i. if the edge ⟨z, w⟩ does not exist, create it with L(⟨z, w⟩) := ∅; ii. L(⟨z, w⟩) → L(⟨z, w⟩) ∪ L(⟨y, w⟩),
3. for all edges ⟨w, y⟩: i. if the edge ⟨w, z⟩ does not exist, create it with L(⟨w, z⟩) := ∅; ii. L(⟨w, z⟩) → L(⟨w, z⟩) ∪ L(⟨w, y⟩),
4. set L(y) := ∅ and remove all edges to/from y,
5. set u ≠̇ z for all u with u ≠̇ y, and
6. set y =̇ z
Fig. 1. The Expansion Rules for SHIQ-Aboxes.
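Although the expansion rules above are stated declaratively, the deterministic ones are straightforward to mechanise. The following Python sketch is our own toy encoding, not from the paper: concepts are strings or tuples, with ("and", C1, C2) for C1 ⊓ C2 and ("<=", n, S, C) for ≤nS.C. It illustrates the two clash conditions and the ⊓-rule.

```python
from itertools import combinations

def is_negation_clash(labels):
    """Clash condition 1: {A, ~A} subset of L(x) for some concept name A."""
    return any(isinstance(a, str) and not a.startswith("~")
               and ("~" + a) in labels for a in labels)

def at_most_clash(labels, s_neighbours, distinct):
    """Clash condition 2: some (<= n S.C) in L(x) while x has n+1
    S-neighbours, pairwise asserted distinct, whose labels contain C.
    `s_neighbours` maps a role S to (node, label-set) pairs; `distinct`
    holds frozensets recording the explicit inequality relation."""
    for c in labels:
        if isinstance(c, tuple) and c[0] == "<=":
            _, n, s, conc = c
            ys = [y for (y, ly) in s_neighbours.get(s, []) if conc in ly]
            for group in combinations(ys, n + 1):
                if all(frozenset(p) in distinct
                       for p in combinations(group, 2)):
                    return True
    return False

def apply_and_rule(labels):
    """Deterministic conjunction rule: while some (C1 and C2) is in L(x)
    with {C1, C2} not a subset of L(x), add C1 and C2 to L(x)."""
    labels, changed = set(labels), True
    while changed:
        changed = False
        for c in list(labels):
            if isinstance(c, tuple) and c[0] == "and" \
                    and not {c[1], c[2]} <= labels:
                labels |= {c[1], c[2]}
                changed = True
    return labels
```

For instance, a node labelled {A, ~A} clashes immediately, while a ≤1S.C label clashes only once two explicitly distinct S-neighbours both carry C.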
Reasoning with Individuals for the Description Logic SHIQ
the Abox, and to memorise the identification of root nodes (this will be needed in order to construct a tableau from a complete and clash-free completion forest). The ≤r-rule includes some additional steps that deal with these issues. Firstly, as well as adding L(y) to L(z), the edges (and their respective labels) between y and its neighbours are also added to z. Secondly, L(y) and all edges going from/to y are removed from the forest. This will not lead to dangling trees, because all neighbours of y became neighbours of z in the previous step. Finally, the identification of y and z is recorded in the =̇ relation.

[Figure 2, omitted here, illustrates the two situations of the ≤-rule (merging a successor y into a sibling successor z, and merging a successor y into the predecessor z of x) and the ≤r-rule (merging the root node y into the root node z and redirecting y's edges to z); in each case L(y) is added to L(z), edge labels are combined accordingly, and L(⟨x, y⟩) (respectively L(y) and y's edges) is set to ∅.]

Fig. 2. Effect of the ≤- and the ≤r-rule

Lemma 3. Let A be a SHIQ-Abox and R a role hierarchy. The completion algorithm terminates when started for A and R.

Proof: Let m = ♯clos(A), n = |R_A|, and n_max := max{n | ≥nR.C ∈ clos(A)}. Termination is a consequence of the following properties of the expansion rules:
1. The expansion rules never remove nodes from the forest. The only rules that remove elements from the labels of edges or nodes are the ≤- and the ≤r-rule, which set them to ∅. If an edge label is set to ∅ by the ≤-rule, the node below this edge is blocked and will remain blocked forever. The ≤r-rule only sets the label of a root node x to ∅, and after this, x's label is never changed again since all edges to/from x are removed. Since no new root nodes are generated, this removal may only happen a finite number of times, and the new edges generated by the ≤r-rule guarantee that the resulting structure is still a completion forest.
2. Nodes are labelled with subsets of clos(A) and edges with subsets of R_A, so there are at most 2^{2mn} different possible labellings for a pair of nodes and an edge. Therefore, if a path p is of length at least 2^{2mn}, the pair-wise blocking condition implies the existence of two nodes x, y on p such that y directly blocks x. Since a path on which nodes are blocked cannot become longer, paths are of length at most 2^{2mn}.
3. Only the ∃- or the ≥-rule generate new nodes, and each generation is triggered by a concept of the form ∃R.C or ≥nR.C in clos(A). Each of these concepts triggers the generation of at most n_max successors yi: note that if the ≤- or the ≤r-rule subsequently causes L(⟨x, yi⟩) to be changed to ∅, then x will have some R-neighbour z with L(z) ⊇ L(yi). This, together with the definition of a clash, implies that the rule application which led to the generation of yi will not be repeated. Since clos(A) contains a total of at most m such concepts, the out-degree of the forest is bounded by m·n_max. □

Lemma 4. Let A be a SHIQ-Abox and R a role hierarchy. If the expansion rules can be applied to A and R such that they yield a complete and clash-free completion forest, then A has a tableau w.r.t. R.

Proof: Let F be a complete and clash-free completion forest. The definition of a tableau T = (S, L, E, I) from F works as follows.
Intuitively, an individual in S corresponds to a path in F from some root node to some node that is not blocked, and which goes only via non-root nodes. More precisely, a path is a sequence of pairs of nodes of F of the form p = [(x0, x0′), …, (xn, xn′)]. For such a path we define Tail(p) := xn and Tail′(p) := xn′. With [p | (x(n+1), x(n+1)′)] we denote the path [(x0, x0′), …, (xn, xn′), (x(n+1), x(n+1)′)]. The set Paths(F) is defined inductively as follows:
– For root nodes x_0^i of F, [(x_0^i, x_0^i)] ∈ Paths(F), and
– for a path p ∈ Paths(F) and a node z in F:
• if z is a successor of Tail(p) and z is neither blocked nor a root node, then [p | (z, z)] ∈ Paths(F), or
• if, for some node y in F, y is a successor of Tail(p) and z blocks y, then [p | (z, y)] ∈ Paths(F).
Please note that, since root nodes are never blocked, nor do they block other nodes, the only place where they can occur in a path is the first one. Moreover, by construction
of Paths(F), if p ∈ Paths(F), then Tail(p) is not blocked, Tail(p) = Tail′(p) iff Tail′(p) is not blocked, and L(Tail(p)) = L(Tail′(p)). We define a tableau T = (S, L, E, I) as follows:

S = Paths(F)
L(p) = L(Tail(p))
E(R) = {⟨p, [p | (x, x′)]⟩ ∈ S × S | x′ is an R-successor of Tail(p)}
∪ {⟨[q | (x, x′)], q⟩ ∈ S × S | x′ is an Inv(R)-successor of Tail(q)}
∪ {⟨[(x, x)], [(y, y)]⟩ ∈ S × S | x, y are root nodes and y is an R-neighbour of x}
I(ai) = [(x_0^i, x_0^i)] if x_0^i is a root node in F with L(x_0^i) ≠ ∅, and
I(ai) = [(x_0^j, x_0^j)] if L(x_0^i) = ∅ and x_0^j is a root node in F with L(x_0^j) ≠ ∅ and x_0^i =̇ x_0^j.
Please note that L(x) = ∅ implies that x is a root node and that there is another root node y with L(y) ≠ ∅ and x =̇ y. We show that T is a tableau for A:
– T satisfies (P1) because F is clash-free.
– (P2) and (P3) are satisfied by T because F is complete.
– For (P4), let p, q ∈ S with ∀R.C ∈ L(p) and ⟨p, q⟩ ∈ E(R). If q = [p | (x, x′)], then x′ is an R-successor of Tail(p) and, due to completeness of F, C ∈ L(x′) = L(x) = L(q). If p = [q | (x, x′)], then x′ is an Inv(R)-successor of Tail(q) and, due to completeness of F, C ∈ L(Tail(q)) = L(q). If p = [(x, x)] and q = [(y, y)] for two root nodes x, y, then y is an R-neighbour of x, and completeness of F yields C ∈ L(y) = L(q). (P6) and (P11) hold for similar reasons.
– For (P5), let ∃R.C ∈ L(p) and Tail(p) = x. Since x is not blocked and F is complete, x has some R-neighbour y with C ∈ L(y).
• If y is a successor of x, then y can either be a root node or not.
∗ If y is not a root node: if y is not blocked, then q := [p | (y, y)] ∈ S; if y is blocked by some node z, then q := [p | (z, y)] ∈ S.
∗ If y is a root node: since y is a successor of x, x is also a root node. This implies p = [(x, x)] and q := [(y, y)] ∈ S.
• If x is an Inv(R)-successor of y, then either
∗ p = [q | (x, x′)] with Tail(q) = y, or
∗ p = [q | (x, x′)] with Tail(q) = u ≠ y. Since x only has one predecessor, u is not the predecessor of x. This implies x ≠ x′, x blocks x′, and u is the predecessor of x′ due to the construction of Paths. Together with the definition of the blocking condition, this implies L(⟨u, x′⟩) = L(⟨y, x⟩) as well as L(u) = L(y), or
∗ p = [(x, x)] with x being a root node. Hence y is also a root node and q := [(y, y)].
In any of these cases, ⟨p, q⟩ ∈ E(R) and C ∈ L(q).
– (P7) holds because of the symmetric definition of the mapping E.
– (P8) is due to the definition of R-neighbours and R-successors.
– Suppose (P9) were not satisfied. Hence there is some p ∈ S with (≤nS.C) ∈ L(p) and ♯S^T(p, C) > n.
We will show that this implies ♯S^F(Tail(p), C) > n, contradicting either the clash-freeness or the completeness of F. Let x := Tail(p) and P := S^T(p, C). We distinguish two cases:
• P contains only paths of the form [p | (y, y′)] and [(y′, y′)]. Then ♯P > n is impossible, since the function Tail′ is injective on P: if we assume that there are two distinct paths q1, q2 ∈ P with Tail′(q1) = Tail′(q2) = y′, then each qi is of the form qi = [p | (yi, y′)] or qi = [(y′, y′)]. From q1 ≠ q2, we have that qi = [p | (yi, y′)] holds for some i ∈ {1, 2}. Since root nodes occur only at the beginning of paths and q1 ≠ q2, we have q1 = [p | (y1, y′)] and q2 = [p | (y2, y′)]. If y′ is not blocked, then y1 = y′ = y2, contradicting q1 ≠ q2. If y′ is blocked in F, then both y1 and y2 block y′, which implies y1 = y2, again a contradiction. Hence Tail′ is injective on P and thus ♯P = ♯Tail′(P). Moreover, each y′ ∈ Tail′(P) is an S-successor of x with C ∈ L(y′). This implies ♯S^F(x, C) > n.
• P contains a path q with p = [q | (x, x′)]. Obviously, P may contain only one such path. As in the previous case, Tail′ is injective on the set P′ := P \ {q}, each y′ ∈ Tail′(P′) is an S-successor of x, and C ∈ L(y′) for each y′ ∈ Tail′(P′). Let z := Tail(q). We distinguish two cases:
∗ x = x′. Hence x is not blocked, and thus x is an Inv(S)-successor of z. Since Tail′(P′) contains only successors of x, we have z ∉ Tail′(P′) and, by construction, z is an S-neighbour of x with C ∈ L(z).
∗ x ≠ x′. This implies that x′ is blocked by x and that x′ is an Inv(S)-successor of z. Due to the definition of pairwise blocking, x is an Inv(S)-successor of some node u with L(u) = L(z). Again, u ∉ Tail′(P′) and, by construction, u is an S-neighbour of x with C ∈ L(u).
– For (P10), let (≥nS.C) ∈ L(p). Hence there are n S-neighbours y1, …, yn of x = Tail(p) in F with C ∈ L(yi). For each yi there are three possibilities:
• yi is an S-successor of x and yi is not blocked in F. Then qi := [p | (yi, yi)], or yi is a root node and qi := [(yi, yi)], is in S.
• yi is an S-successor of x and yi is blocked in F by some node z. Then qi := [p | (z, yi)] is in S.
Since the same z may block several of the yj, it is indeed necessary to include yi explicitly in the path to make the paths distinct.
• x is an Inv(S)-successor of yi. There may be at most one such yi if x is not a root node. Hence either p = [qi | (x, x′)] with Tail(qi) = yi, or p = [(x, x)] and qi := [(yi, yi)].
Hence for each yi there is a different path qi in S with ⟨p, qi⟩ ∈ E(S) and C ∈ L(qi), and thus ♯S^T(p, C) ≥ n.
– (P12) is due to the fact that, when the completion algorithm is started for an Abox A, the initial completion forest F_A contains, for each individual name ai occurring in A, a root node x_0^i with L(x_0^i) = {C ∈ clos(A) | ai : C ∈ A}. The algorithm never blocks root nodes, and, for each root node x_0^i whose label and edges are removed by the ≤r-rule, there is another root node x_0^j with x_0^i =̇ x_0^j and {C ∈ clos(A) | ai : C ∈ A} ⊆ L(x_0^j). Together with the definition of I, this yields (P12). (P13) is satisfied for similar reasons.
– (P14) is satisfied because the ≤r-rule does not identify two root nodes x_0^i, x_0^j when x_0^i ≠̇ x_0^j holds. □
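As an aside, the inductive definition of Paths(F) is easy to animate. The sketch below uses our own representation, not the paper's: a forest is given by a successor map and a blocking map, and a path is a tuple of (Tail, Tail′) pairs. A length bound is needed because unravelling a blocked node into its blocker can make Paths(F) infinite.

```python
def paths(roots, children, blocks, max_len=5):
    """Enumerate Paths(F) up to length max_len.

    roots:    set of root-node names
    children: maps a node to its successors in the forest
    blocks:   maps a blocked node to the node that blocks it
    A path is a tuple of (x, x') pairs; Tail(p) is the first component
    of the last pair, Tail'(p) the second."""
    result = [((r, r),) for r in roots]        # base case: root-node paths
    frontier = list(result)
    while frontier:
        p = frontier.pop()
        if len(p) >= max_len:
            continue
        tail = p[-1][0]                        # Tail(p) = top of last pair
        for z in children.get(tail, ()):
            if z in blocks:
                q = p + ((blocks[z], z),)      # pair (blocker, blocked node)
            elif z not in roots:
                q = p + ((z, z),)              # ordinary non-root successor
            else:
                continue                       # root nodes only start paths
            result.append(q)
            frontier.append(q)
    return result
```

On a toy forest with root r, successor chain r → a → b, and b blocked by a, the enumeration produces the root path, its extension by (a, a), and then the unravelled pair (a, b), exactly as the inductive definition prescribes.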
Lemma 5. Let A be a SHIQ-Abox and R a role hierarchy. If A has a tableau w.r.t. R, then the expansion rules can be applied to A and R such that they yield a complete and clash-free completion forest.

Proof: Let T = (S, L, E, I) be a tableau for A and R. We use T to trigger the application of the expansion rules such that they yield a completion forest F that is both complete and clash-free. To this purpose, a function π is used which maps the nodes of F to elements of S. The mapping π is defined as follows:
– For individuals ai in A, we define π(x_0^i) := I(ai).
– If π(x) = s is already defined, and a successor y of x was generated for ∃R.C ∈ L(x), then π(y) := t for some t ∈ S with C ∈ L(t) and ⟨s, t⟩ ∈ E(R).
– If π(x) = s is already defined, and successors yi of x were generated for ≥nR.C ∈ L(x), then π(yi) := ti for n distinct ti ∈ S with C ∈ L(ti) and ⟨s, ti⟩ ∈ E(R).
Obviously, the mapping for the initial completion forest for A and R satisfies the following conditions:
L(x) ⊆ L(π(x)),
if y is an S-neighbour of x, then ⟨π(x), π(y)⟩ ∈ E(S), and      (∗)
x ≠̇ y implies π(x) ≠ π(y).
It can be shown that the following claim holds:
CLAIM: Let F be generated by the completion algorithm for A and R and let π satisfy (∗). If an expansion rule is applicable to F, then this rule can be applied such that it yields a completion forest F′ and a (possibly extended) π that satisfy (∗).
As a consequence of this claim, (P1), and (P9), if A and R have a tableau, then the expansion rules can be applied to A and R such that they yield a complete and clash-free completion forest. □

From Theorem 1 and Lemmas 2, 3, 4, and 5, we thus have the following theorem:

Theorem 2. The completion algorithm is a decision procedure for the consistency of SHIQ-Aboxes and the satisfiability and subsumption of concepts with respect to role hierarchies and terminologies.
4 Conclusion

We have presented an algorithm for deciding the satisfiability of SHIQ KBs where the Abox may be non-empty and where the uniqueness of individual names is not assumed but can be asserted in the Abox. This algorithm is of particular interest as it can be used to decide the problem of conjunctive query containment w.r.t. a schema [17]. An implementation of the SHIQ Tbox satisfiability algorithm is already available in the FaCT system [14], and is able to reason efficiently with Tboxes derived from realistic ER schemas. This suggests that the algorithm presented here could form the basis of a practical decision procedure for the query containment problem. Work is already underway to test this conjecture by extending the FaCT system with an implementation of the new algorithm.
References

1. C. Areces, P. Blackburn, and M. Marx. A road-map on complexity for hybrid logics. In Proc. of CSL'99, number 1683 in LNCS, pages 307–321. Springer-Verlag, 1999.
2. F. Baader. Augmenting concept languages by transitive closure of roles: An alternative to terminological cycles. In Proc. of IJCAI-91, 1991.
3. F. Baader, H.-J. Bürckert, B. Nebel, W. Nutt, and G. Smolka. On the expressivity of feature logics with negation, functional uncertainty, and sort equations. Journal of Logic, Language and Information, 2:1–18, 1993.
4. F. Baader, H.-J. Heinsohn, B. Hollunder, J. Müller, B. Nebel, W. Nutt, and H.-J. Profitlich. Terminological knowledge representation: A proposal for a terminological logic. Technical Memo TM-90-04, DFKI, Saarbrücken, Germany, 1991.
5. P. Blackburn and J. Seligman. What are hybrid languages? In Advances in Modal Logic, volume 1, pages 41–62. CSLI Publications, Stanford University, 1998.
6. M. Buchheit, F. M. Donini, and A. Schaerf. Decidable reasoning in terminological knowledge representation systems. J. of Artificial Intelligence Research, 1:109–138, 1993.
7. D. Calvanese. Reasoning with inclusion axioms in description logics: Algorithms and complexity. In Proc. of ECAI'96, pages 303–307. John Wiley & Sons Ltd., 1996.
8. D. Calvanese, G. De Giacomo, and M. Lenzerini. On the decidability of query containment under constraints. In Proc. of PODS'98, pages 149–158, 1998.
9. D. Calvanese, G. De Giacomo, M. Lenzerini, D. Nardi, and R. Rosati. Source integration in data warehousing. In Proc. of DEXA-98. IEEE Computer Society Press, 1998.
10. D. Calvanese, G. De Giacomo, M. Lenzerini, D. Nardi, and R. Rosati. Description logic framework for information integration. In Proc. of KR-98, 1998.
11. G. De Giacomo and F. Massacci. Combining deduction and model checking into tableaux and algorithms for converse-PDL. Information and Computation, 1998. To appear.
12. G. De Giacomo and M. Lenzerini. What's in an aggregate: Foundations for description logics with tuples and sets. In Proc. of IJCAI-95, 1995.
13. V. Haarslev and R. Möller. An empirical evaluation of optimization strategies for abox reasoning in expressive description logics. In Lambrix et al. [19], pages 115–119.
14. I. Horrocks. FaCT and iFaCT. In Lambrix et al. [19], pages 133–135.
15. I. Horrocks, A. Rector, and C. Goble. A description logic based schema for the classification of medical data. In Proc. of the 3rd Workshop KRDB'96. CEUR, June 1996.
16. I. Horrocks and U. Sattler. A description logic with transitive and inverse roles and role hierarchies. Journal of Logic and Computation, 9(3):385–410, 1999.
17. I. Horrocks, U. Sattler, S. Tessaris, and S. Tobies. Query containment using a DLR ABox. LTCS-Report 99-15, LuFG Theoretical Computer Science, RWTH Aachen, Germany, 1999.
18. I. Horrocks, U. Sattler, and S. Tobies. Practical reasoning for expressive description logics. In Proc. of LPAR'99, number 1705 in LNAI, pages 161–180. Springer-Verlag, 1999.
19. P. Lambrix, A. Borgida, M. Lenzerini, R. Möller, and P. Patel-Schneider, editors. Proc. of the International Workshop on Description Logics (DL'99), 1999.
20. E. Mays, R. Weida, R. Dionne, M. Laker, B. White, C. Liang, and F. J. Oles. Scalable and expressive medical terminologies. In Proc. of the 1996 AMIA Annual Fall Symposium, 1996.
21. U. Sattler. A concept language extended with different kinds of transitive roles. In 20. Deutsche Jahrestagung für KI, volume 1137 of LNAI. Springer-Verlag, 1996.
22. A. Schaerf. Reasoning with individuals in concept languages. Data and Knowledge Engineering, 13(2):141–176, 1994.
23. K. Schild. A correspondence theory for terminological logics: Preliminary report. In J. Mylopoulos and R. Reiter, editors, Proc. of IJCAI-91, Sydney, 1991.
24. M. Schmidt-Schauß and G. Smolka. Attributive concept descriptions with complements. Artificial Intelligence, 48(1):1–26, 1991.
System Description: Embedding Verification into Microsoft Excel*

Graham Collins¹ and Louise A. Dennis²

¹ Department of Computing Science, University of Glasgow, Glasgow G12 8QQ, UK
² Division of Informatics, University of Edinburgh, Edinburgh EH1 1HN, UK
Abstract. The aim of the Prosper project is to allow the embedding of existing verification technology into applications in such a way that the theorem proving is hidden, or presented to the end user in a natural way. This paper describes a system built to test whether the Prosper toolkit satisfied this aim. The system combines the toolkit with Microsoft Excel, a popular commercial spreadsheet application.
1 Introduction
The Prosper project is researching and developing a toolkit [1] that allows an expert to easily and flexibly assemble proof engines from existing tools to provide embedded formal reasoning support inside applications. The ultimate goal is to make the reasoning and proof support invisible to the end-user—or at least, more realistically, to incorporate it securely within the interface and style of interaction to which they are already accustomed. Several large case studies are taking place within the project to investigate this. This paper describes a preliminary case study embedding verification into Microsoft Excel without inventing or re-implementing any existing theorem proving techniques or mathematical decision procedures.1 The primary aim was to show that the technology is effective when applied to real, standard applications not designed by project members. In addition we were interested in investigating a “lightweight” theorem proving approach where only a small amount of theorem proving functionality is added but it is completely hidden from the user. This paper begins with a brief overview of the Prosper toolkit (§2) and Excel (§3) followed by a discussion of the system developed.
2 Extending Applications with Custom Proof Engines
A central part of Prosper's vision is the idea of a proof engine: a custom-built verification engine which can be operated by another program through an Application Programming Interface (API). A proof engine can be built by a system
* Work funded by ESPRIT Framework IV Grant LTR 26241.
¹ The case study is available from http://www.collins-peak.net/p-excel/
D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 497–501, 2000. © Springer-Verlag Berlin Heidelberg 2000
developer using the toolkit provided by the project. A proof engine is based upon the functionality of a theorem prover with additional capabilities provided by 'plugins' formed from existing, off-the-shelf tools. The toolkit includes a set of libraries based on a language-independent specification, the Prosper Integration Interface (PII), for communication between components of a final system. The theorem prover's command language is treated as a kind of scripting or glue language for managing plugin components and orchestrating the proofs. The PII consists of several parts. There is a datatype for communication of data between components of a system; it includes the language of higher order logic used by the HOL system [2], and so any formula expressible in higher order logic can be passed between components. There is support for installing procedures in an API and calling them remotely. There are also parts for managing low-level communication, which are largely invisible to an application developer. The PII is currently implemented in ML, C, Java, Python, λProlog, and Ada. Proof engines are constructed on top of a small subset of HOL, called the Core Proof Engine. This consists of theorems, inference rules for higher order logic, and an ML implementation of the PII. A developer can write extensions to the Core Proof Engine and place them in an API to form a custom proof engine. When incorporating a proof engine into an application, the developer calls the customised API through the PII.
3 Microsoft Excel
Excel is a spreadsheet package marketed by Microsoft [4]. Its basic constituents are rows and columns of cells into which either values or formulae may be entered. Formulae refer to other cells, which may contain either values or further formulae. Users of Excel are likely to have no interest in using or guiding mathematical proof, but they do want to know that they have entered formulae correctly. They therefore have an interest in 'sanity checking functions' that they can use to reassure themselves of correctness. This made Excel well suited to a case study, since the users have a notion of formulae and correctness; all that needs to be hidden is the proof. Another advantage is that Excel was designed to allow new functionality to be added, and although its developers were not concerned with verification, there is support for calling external tools. As a simple example, the authors undertook to incorporate a sanity checking function into Excel. We chose to implement an equality checking function which would take two cells containing formulae and attempt to determine whether these formulae were equal for all possible values of the cells to which they refer. Simplifying assumptions were made for the case study. The most important were that cell values were only natural numbers or booleans and that only a small subset of the functions available in Excel (some simple arithmetical and logical functions) appeared in formulae. Given these assumptions, less than 150 lines of code were needed to produce a prototype. This prototype handled only a small range of formulae decidable by linear arithmetic or propositional logic decision procedures, but it demonstrated the basic functionality.
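The prototype's checking is done by genuine decision procedures (described below); as a rough illustration of the kind of sanity check ISEQUAL provides, one can compare two formulae by exhaustive evaluation over a small sample of natural-number values for their cells. This stand-in is entirely our own, not the Prosper implementation: agreement on the sample merely suggests equality, while any disagreement refutes it.

```python
from itertools import product

def iseq_sample(f, g, n_vars, domain=range(4)):
    """Compare two formulae, given as Python callables over their free
    cells, on every assignment of `domain` values to the n_vars cells.
    Returns False as soon as a counterexample is found, True otherwise."""
    for vals in product(domain, repeat=n_vars):
        if f(*vals) != g(*vals):
            return False
    return True
```

For example, `iseq_sample(lambda a, b: a + b, lambda a, b: b + a, 2)` accepts the commutativity of plus (the first fact the real system proved), whereas swapping `+` for `-` in one formula is rejected by the first differing assignment.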
4 Architecture
The main difficulty in the system was that Excel is Windows based and expects Microsoft's Component Object Model (COM) to be used for communication between processes, whereas the Prosper toolkit had been developed for UNIX machines² and uses sockets for communication between components. Several possible solutions to this problem were considered, including implementing the PII in Visual Basic and using internet sockets to let Excel communicate with a proof engine. We did not take this approach because our aim was to show that theorem proving technology can be incorporated into applications in as natural a way as possible. For Excel this meant making the functionality of the Prosper tools available as a COM server. The Prosper COM server was implemented in Python, a dynamically typed, object-oriented scripting language which supports both COM and sockets. The server consists of two parts: the Python implementation of the PII and the additional code, described below, which is specific to this example. The remaining decision was where to convert Excel's formulae, which we access as strings, into terms. This requires some type inference but is simple to do and could have been written in either the Python or Visual Basic components. This was done in Python since it was the preferred language of the authors. From the Excel side, the Python component is a COM server which makes available a small number of functions that Excel can call. The use of a UNIX-based theorem prover is not visible to Excel. From the proof engine side, the Python component behaves like any other application calling the proof engine using the PII. The use of Excel is not visible to the theorem prover. A view of the current two-operating-system architecture is shown below.
[Figure: system architecture. Excel, running on Windows, passes formula strings to the COM Server / PII Client, which exchanges data with the Proof Engine and its Prover Plugin, both running on UNIX.]

5 Custom Proof Engine
The initial custom proof procedure is very simple-minded. It uses a linear arithmetic decision procedure provided by HOL and a propositional logic plugin (based on Prover Technology's proof tool [6,5]) to decide the truth of formulae. While the approach is not especially robust, it is strong enough to handle many formulae.
² It is expected that a future version will be ported to Windows.
The additional code required to create this custom proof procedure is very small (approx. 45 extra lines of ML were needed). All the verification code used already existed either in HOL or the plugin, the new code concentrated on gluing together the decision procedures and deciding which should be used. A proof engine which could handle a wider range of formulae would require more work. It is possible that more decision procedures could be used to provide this, for instance we could exploit HOL’s simplifier. Alternatively it might prove necessary to implement some specialised theorem proving algorithms. This would also be possible using the Prosper toolkit.
6 Python COM Component
The main piece of code developed for this system is the Python implementation of the PII. This was simple to write, partly since the structure is similar to the existing Java PII, and partly because this is the sort of application for which Python was designed. The code makes use of dynamic typing and other features of the language to provide a compact and natural implementation of the PII. Although written for this one application, the Python implementation makes available the objects of the PII, and hence the functionality of the Prosper tools to any language that supports COM. In addition to the PII implementation the COM component contains some additional code specific to this example. This first parses the strings to logical terms. This assumes that the semantics of the operators is the same in Excel and HOL. The terms are then passed on to the proof engine. It returns the result of the proof attempt as true, false, or ‘unable to decide’, which is displayed in the cell containing the ISEQUAL formula. This result can be used by other cells and will be automatically recomputed if necessary.
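The parsing step described above can be pictured with a toy recursive-descent parser for a small fragment of Excel-style formulae (cell references, naturals, +, =, AND). Everything here, including the term representation as nested tuples, is our own illustration; the real component builds HOL terms and sends them through the PII.

```python
import re

# One token per match: a natural, a cell reference, AND, or a symbol.
TOKEN = re.compile(r"\s*(\d+|[A-Z]+\d+|AND|\+|=|\(|\))")

def tokenize(s):
    out, i = [], 0
    while i < len(s):
        m = TOKEN.match(s, i)
        if not m:
            raise ValueError("bad token at: " + s[i:])
        out.append(m.group(1))
        i = m.end()
    return out

def parse(s):
    toks, pos = tokenize(s), [0]

    def peek():
        return toks[pos[0]] if pos[0] < len(toks) else None

    def eat():
        tok = toks[pos[0]]
        pos[0] += 1
        return tok

    def atom():                       # number, cell, or parenthesised formula
        t = eat()
        if t == "(":
            e = formula()
            eat()                     # consume ")"
            return e
        if t.isdigit():
            return ("num", int(t))
        return ("cell", t)

    def term():                       # sums of atoms: type nat
        e = atom()
        while peek() == "+":
            eat()
            e = ("+", e, atom())
        return e

    def formula():                    # equalities and conjunctions: type bool
        e = term()
        if peek() == "=":
            eat()
            e = ("=", e, term())
        while peek() == "AND":
            eat()
            e = ("AND", e, formula())
        return e

    return formula()

def type_of(t):
    """Trivial type inference: '=' and 'AND' yield bool, all else nat."""
    return "bool" if t[0] in ("=", "AND") else "nat"
```

For instance, `parse("A1+2=B2")` yields `("=", ("+", ("cell", "A1"), ("num", 2)), ("cell", "B2"))` with inferred type `bool`, matching the case study's assumption that cells hold only naturals or booleans.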
7 Excel Macro
We wrote a Visual Basic function, ISEQUAL, using Excel's macro editor. Once written, it automatically appears in Excel's function list as a User Defined Function and can be used in a spreadsheet like any other function. ISEQUAL takes two cell references as arguments. It recursively extracts the formulae contained in the cells as strings (support for this already exists in Excel) and passes them on to the Python object. The macro consists of about 30 lines of Visual Basic code.
8 Conclusions
There are numerous Add-Ins to Excel, many of which, unsurprisingly, extend its mathematical ability. The Maple 6 Add-In provides computer algebra techniques to Excel spreadsheets. Interval Solver [3] extends Excel with interval constraint solving to allow spreadsheet users to reason with incomplete and uncertain information. We believe that theorem proving could also have a role to play in this
field. We have demonstrated that the Prosper approach provides a framework in which this could be done. We were surprised and pleased with the ease with which a very basic prototype of verification support for Excel could be produced. It took two programmers, neither of whom had any experience with Visual Basic, Python, or COM, only 48 hours to get to the point where Excel was able to prove the commutativity of plus. While this may seem uninteresting, the reordering of the mathematical operators in large formulae is exactly the kind of lightweight sanity check that may appeal to users. Extending the system to handle more arithmetic and logical operators was easy, and the system has been tested on a range of linear arithmetic and spreadsheet-style examples. The system could be extended further, and more complex and interesting proof strategies could be programmed. The system is a proof of concept of the claim made by the Prosper project that their toolkit would enable the embedding of verification into applications not designed with it specifically in mind. The only significant piece of new code is the Python port of the PII, which is a general purpose component that could be used for other systems. Adding even limited theorem proving functionality by programming a procedure from scratch instead of using existing tools would have taken much longer, as would interfacing to a theorem prover without using the Prosper tools. The use of two operating systems is not ideal but could be removed if the Prosper tools were ported to Windows. The current setup would be reasonable in a networked setting with many copies of Excel accessing one proof engine. The embedding of verification into Excel also serves as an example of the concepts of "lightweight" theorem proving and the "invisible" use of verification. Here all the infrastructure is invisible to the user, who simply gets an extra function available in Excel.
References

1. L. A. Dennis, G. Collins, M. Norrish, R. Boulton, K. Slind, G. Robinson, M. Gordon and T. Melham, The PROSPER Toolkit, TACAS 2000, to appear, 2000.
2. M. J. C. Gordon and T. F. Melham (eds), Introduction to HOL: A theorem proving environment for higher order logic, Cambridge University Press, 1993.
3. E. Hyvönen and S. De Pascale, A New Basis for Spreadsheet Computing: Interval Solver™ for Microsoft Excel. Proceedings of 16th National Conference on Artificial Intelligence and 11th Innovative Applications of Artificial Intelligence Conference (AAAI/IAAI-99), AAAI Press / The MIT Press, pp. 799–806, 1999.
4. Microsoft Corporation, Microsoft Excel, http://www.microsoft.com/excel.
5. M. Sheeran and G. Stålmarck, A tutorial on Stålmarck's proof procedure for propositional logic. The Second International Conference on Formal Methods in Computer-Aided Design, Lecture Notes in Computer Science 1522, Springer-Verlag, pp. 82–99, 1998.
6. G. Stålmarck and M. Säflund, Modelling and Verifying Systems and Software in Propositional Logic. Proceedings of SAFECOMP '90, Pergamon Press, pp. 31–36, 1990.
System Description: Interactive Proof Critics in XBarnacle

Mike Jackson¹ and Helen Lowe²

¹ Department of Electronic and Electrical Engineering, University of Edinburgh, Scotland [email protected]
² Department of Computer Studies, Glasgow Caledonian University, Cowcaddens Road, Glasgow G4 0BA, Scotland [email protected]

1 Introduction
Proof critics [2] extend the power of a theorem prover by, for example, allowing lemmas to be postulated and proved in the course of a proof. However, extending the automated theorem prover CLAM by adding critics also increases the search space. XBarnacle [5] was developed to make the process of interacting with a semi-automatic theorem prover more tractable for the non-expert user. We have now substantially extended XBarnacle to make the work of expert users more efficient as they interact with proof critics. Of course, we have also made cosmetic improvements to aid navigability and to bridge the gulf of evaluation [1], which is such an obstacle to making theorem provers more accessible and their expert users more efficient.
2 System Requirements
In building the new version of XBarnacle, we were able to build on experience with several systems, each with strengths and weaknesses. An obvious meta-requirement was to incorporate the strengths of each whilst avoiding the weaknesses.
− Clam version 3.2 could patch proofs by applying critics under certain patterns of failure in the pre-conditions of methods. However, this greatly increases the search space, making interaction necessary.
− XClam [3] was a graphical interface to Clam version 3.2 but, like its parent system, had no persistent representation of the partial plan, making undoing and re-planning nodes impossible; it also used non-hierarchical planning, making navigation difficult.
− The version of XBarnacle (based on Clam version 2.2) reported in [5] allowed the user to interact with the proof tree in the course of a proof but could not patch proofs automatically.

D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 502–506, 2000. © Springer-Verlag Berlin Heidelberg 2000
− Another version of XBarnacle [4] incorporated critics in the "flat" structure of Clam 3.2. This made proof trees very rapidly become large and unnavigable in practice, if not in theory. Coincidentally, during the evaluation users requested passive as opposed to active critiquing (so that the decision to apply a critic was the user's rather than the program's).

The proof engine of the new version of XBarnacle was therefore to be an amalgam of CLAM version 2.6, which has no critics but a hierarchical method set, and CLAM version 3.2, which incorporates critics in a flat method structure. In addition, we had found that a common request on the original XBarnacle system reported in [5] was to be able to "open up" the high-level proof steps to see the individual smaller steps within. This facility of hierarchical tree browsing has therefore been provided in the current system, not merely to help less experienced users see how the steps are performed, but so that expert users can interact directly with the step requiring the use of critics.
3 Hierarchical Tree Browsing
With previous versions of XBarnacle only the top-level goals were retained and displayed. The intermediate goals arising in sub-plans were not stored. For example, if the method used several rewrite rules, only the final result of rewriting was shown, not the intermediate steps. We made modifications so that all of these steps are retained and can be displayed on request. This helps to bridge the gulf of evaluation for users who may find that the granularity is too high, and that the system is making too big a leap from one step to another. It also facilitates the use of critics. Experienced users can open up nodes until they see a likely looking sub-goal for which they know a lemma will almost certainly be proposed by a critic.

A data structure in Prolog to facilitate hierarchical tree browsing was provided as a set of tree_nodes. The Tcl/Tk side also has a tree_node array. This array is indexed by the canvas in which the sub-plan will be drawn if the user requests it. It indicates whether any of the (sub-)goals may be critiqued, in other words, whether a critic might be applicable.
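The idea of a node that retains its intermediate sub-goals and reveals them only on request can be sketched as follows. This is a hypothetical reconstruction in Python rather than the actual Prolog/Tcl data structures, and all names are invented:

```python
from dataclasses import dataclass, field

@dataclass
class TreeNode:
    """One node of a hierarchical proof plan: a goal, the method that
    closed it, and the sub-plan of intermediate steps hidden inside."""
    goal: str
    method: str
    sub_plan: list = field(default_factory=list)  # intermediate steps, hidden by default
    critiquable: bool = False                     # whether a critic might apply here
    expanded: bool = False

    def open_up(self):
        """Reveal the intermediate steps, as when the user expands a node."""
        self.expanded = True
        return self.sub_plan

    def any_critiquable(self) -> bool:
        """True if this node or any node in its sub-plan may be critiqued,
        so the display can mark the node with a distinctive icon."""
        return self.critiquable or any(n.any_critiquable() for n in self.sub_plan)

# A rewrite step whose intermediate rewrites are retained but not shown:
node = TreeNode(goal="x + y = y + x", method="ripple",
                sub_plan=[TreeNode("s(x) + y = y + s(x)", "wave", critiquable=True)])
print(node.any_critiquable())  # True: a sub-goal deep in the plan is critiquable
```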
4 Critics
XBarnacle is a co-operative system and allows the user to critique nodes in the proof plan. Interactive proof critics will then propose ways in which the proof might be improved, for example by the addition of some wave rules or by generalising a goal, and the user is free to accept or reject the proposed patches. All nodes that may be critiqued with the loaded critics will be marked by XBarnacle using a distinctive icon.
XBarnacle allows the user to critique either a specific goal, in which case critics are tested for applicability at that goal only; or an entire sub-plan, in which case every goal in the sub-plan (and recursively in sub-plans of the sub-plan, etc.) will be critiqued (i.e. the system will test for applicability of critics). The type of critiquing supported on a node will be indicated by a shaded mark on the node. The user can then indicate that they wish to critique the goal, possibly by using hierarchical tree browsing, or the sub-plan. After having selected whether to critique a goal or a sub-plan, XBarnacle will display a list of the critics that may be used to critique the goal or sub-plan. Currently these are lemma calculation, lemma speculation, generalization, and induction revision. The user can select one or more of these critics to try to apply to the goal/sub-plan. For example, if the user thinks that the original goal must be generalized before the theorem is provable, they will choose generalization. If there seems to be a missing lemma, lemma calculation and/or lemma speculation will be chosen. Once the critics are selected, XBarnacle will see if they can propose some patches for the selected goal, or for the goals in a sub-plan. If the user has no idea which to choose, they can simply leave the choice to XBarnacle: instead of choosing a selected set of critics to apply as described above, pressing the Critique with All Critics button causes XBarnacle to critique the goal/sub-plan using all the allowable critics. The proposed patches can then be viewed. There will be a short delay as CLAM tests whether the critics propose any patches. Critiquing a sub-plan may take a while since every goal in the sub-plan hierarchy must be critiqued.
If any of the chosen critics propose patches for the goal/sub-plan goals being critiqued then XBarnacle will display a Proposed Patches window with a list of the possible patches, each patch including the name of the critic that proposed it. The user can apply a patch or retrieve a number of types of information about a patch. Some patches may need to be customized. All of these actions are specific to the currently selected patch. Clicking on a patch in the list of patches displayed selects that patch. Actions such as customization, applying the patch, viewing locations, explanations, and viewing patches as wave rules will then be specific to the selected patch until the user selects another one. To apply a selected patch, the user presses the Apply button, and XBarnacle will attempt to apply the selected patch. There will be a delay as XBarnacle plans the node where the patch is to be applied and attempts to apply the patch. If the attempt to apply the patch fails then XBarnacle will just apply an applicable method instead.
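The critiquing workflow — test the selected critics for applicability, collect proposed patches tagged with the critic that produced them — can be sketched as a simple skeleton. This is illustrative only, not XBarnacle's Prolog code; the stand-in critic functions are invented:

```python
def critique(goal, critics):
    """Run each selected critic on the goal; collect (critic name, patch)
    pairs for every patch proposed, as in the Proposed Patches window."""
    patches = []
    for name, critic in critics.items():
        patch = critic(goal)  # None means the critic is not applicable here
        if patch is not None:
            patches.append((name, patch))
    return patches

def critique_subplan(goals, critics):
    """Critiquing a sub-plan tests every goal in it (here already flattened)."""
    return [p for g in goals for p in critique(g, critics)]

# Stand-in critics: each proposes a patch or declines.
critics = {
    "lemma calculation": lambda g: f"lemma for {g}" if "=" in g else None,
    "generalization":    lambda g: None,  # not applicable in this toy example
}
print(critique("rev(rev(l)) = l", critics))
# [('lemma calculation', 'lemma for rev(rev(l)) = l')]
```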
Two types of explanation may be viewed for each patch proposed in the Proposed Patches window. Both are more suitable for users familiar with proof planning and rippling than for general or novice users.
− The Why critic applicable? button displays an explanation as to why the method associated with the critic failed, in terms of the pattern of precondition failure of the method, and some extra information;
− The What critic does? button gives a general explanation as to what the critic will do to patch the proof.
Patches intended to be used as wave rules, which, at present, are the patches proposed by the lemma calculation and lemma speculation critics, may be viewed as wave rules, shaded graphically. Some patches need to be customized by the user before they can be applied. The user is expected to provide some information. On selecting a customizable patch the Customize... button in the Proposed Patches window will become active. On pressing this button XBarnacle will display the Customize window, displaying the patch. The customizable parts of the patch (higher order meta-terms) will be displayed so that the patch can be customized by editing these to provide the necessary instantiations. The instantiations may use any of the variables listed in the Customize window, any of the functions loaded into XBarnacle, and any of the standard constants and operators. Infix versions of common functions, such as + for plus, are also accepted. Once the user has instantiated all the meta-terms in the patch, they can try to apply it by pressing the Apply button. XBarnacle will first perform checks to ensure that the patch contains no uninstantiated meta-terms, that there are no unknown function or other symbols, that the term is syntactically correct, and that there are no type violations.
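The pre-application checks listed above can be pictured as a small validator. This is a hypothetical Python sketch with made-up conventions (meta-terms written as ?Name, known symbols supplied explicitly, syntax reduced to balanced parentheses), not XBarnacle's actual checker:

```python
import re

def check_patch(patch: str, known_symbols: set[str]) -> list[str]:
    """Return the problems that would block applying a patch;
    an empty list means the patch passes these checks."""
    problems = []
    # 1. No uninstantiated meta-terms (written here as ?Name, an invented convention).
    metas = re.findall(r"\?[A-Za-z]\w*", patch)
    if metas:
        problems.append(f"uninstantiated meta-terms: {sorted(set(metas))}")
    # 2. No unknown function or other symbols (meta-terms excluded from this check).
    stripped = re.sub(r"\?[A-Za-z]\w*", "", patch)
    for sym in sorted(set(re.findall(r"[A-Za-z]\w*", stripped))):
        if sym not in known_symbols:
            problems.append(f"unknown symbol: {sym}")
    # 3. Syntactic correctness, here reduced to balanced parentheses.
    depth = 0
    for ch in patch:
        depth += (ch == "(") - (ch == ")")
        if depth < 0:
            break
    if depth != 0:
        problems.append("unbalanced parentheses")
    return problems

known = {"plus", "x", "y"}
print(check_patch("plus(x, y) = plus(y, x)", known))    # []
print(check_patch("plus(?M, y) = plus(y, ?M)", known))  # flags the meta-term ?M
```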
5 Obtaining the System
The interface side of XBarnacle is written in Tcl 8.0 and Tk 8.0. These may be downloaded free from http://www.scriptics.com/software/8.0.html. The proof engine is written in SICStus Prolog 3.5, obtainable from http://www.sics.se/isl/sicstus.html. XBarnacle is available at http://members.xoom.com/helen_lowe/XBarnacle.tar. After extracting the files, go to the tk_tcl/make subdirectory and edit the Makefile as described in the file at http://members.xoom.com/helen_lowe/readme.txt. Follow the instructions in that file to make and run the executable. The doc/ sub-directory contains substantial user and developer information. The user guide includes a short tutorial. The developer manual contains details of the architecture, the interplay between Prolog and Tcl/Tk, data structures used to hold information
to be displayed and to facilitate navigation, and the rationale behind some of the design decisions taken.
6 Further Work
The system was built for, and evaluated by, expert users. As HCI practitioners, we focus strongly on the user and the task for which the system is designed. One maxim is "Speak the User's Language" [6]. The users in this case were all members of, or close associates of, the Mathematical Reasoning Group in the Division of Informatics at the University of Edinburgh. Their interests were in theorem proving per se. Redesigning the system so that, for example, it supports users of proof tools interested primarily in program development is not just a simple question of rewording explanations, although that could easily be done. However, much of what we have learned can be carried through to other users and other tasks. A hierarchical display of the proof seems fairly generally desirable, with only the level of granularity differing between novices and experts. There is no conclusive evidence that passive as opposed to active critiquing is desired by all users, but the two could easily be provided as alternatives, a feature customizable by the user.
References

1. Hutchins, E. L., Hollan, J. D. and Norman, D. Direct manipulation interfaces. In User Centred System Design (Norman, D. and Draper, S., eds.), pp. 87–124. Hillsdale, NJ: Lawrence Erlbaum Associates, 1986.
2. Ireland, A., and Bundy, A. Productive Use of Failure in Inductive Proof. Special edition of Journal of Automated Reasoning on Inductive Proof, 16, 1996.
3. Ireland, A., Jackson, M., and Reid, G. Interactive Proof Critics. Formal Aspects of Computing 11, 302–325, Springer-Verlag, 1999.
4. Jackson, M. Interacting with Semi-automated Theorem Provers via Interactive Proof Critics. PhD Thesis, School of Computing, Napier University, 1999.
5. Lowe, H., and Duncan, D. XBarnacle: Making Theorem Provers More Accessible. Proceedings of the Fourteenth Conference on Automated Deduction, Townsville, Australia, pp. 108–121, 1997.
6. Molich, R. and Nielsen, J. Improving a human-computer dialogue. Communications of the ACM, 33(3), 338–348, 1990.
! " # # # # # $ # ! % ! % &% $ % %' ' # % '# ( ' ! ) & ' ' * * ' ! % % ' !
! " & ' ++! & ' % &% ! & #
' % % # # & % % # # ' ! , - # % # ! . % ' $ / # % % $ $ % ! # & %!
) ) 0 1 )2 .& 2 ! % ' 3# & # ++# 4 ! ' ' ' % !
!"#$" % ! & '"
!
"
#$$% & ' $ ( $ ) *) + , -+ ). , /012/ 3!/ 4 %% , # %%& ( +
, 5 -+, 6 ' ) ) 7 + ) , 8 %%%, 6 , # & ( +
, , ) , 6 , %% , #%!& ( +
) "
, , ) %%!, 9 ' : :%!: , #& "
, ! " , : , ,
Tutorial: Automated Deduction and Natural Language Understanding

Stephen Pulman
University of Cambridge Computer Laboratory and SRI International, Cambridge
[email protected]
The purpose of this tutorial is to introduce the Automated Deduction community to a growing area in which their expertise can be applied to a novel set of problems. No knowledge of natural language processing will be assumed. (If you have time for some background reading, James Allen’s ‘Natural Language Understanding’, 2nd edition, Addison Wesley, 1995 can be recommended). In this tutorial I will describe and illustrate some of the areas in which NLU interacts with theorem proving, and say what our problems are. My hope is that out of this you will get some interesting new problems to work on, and that we will eventually get answers to some of our questions. The topics to be covered include: Semantic Assembly. At some point, all interesting natural language processing applications need to relate sentences to a meaning representation of some kind. In our case, the target representation is usually first order logic, augmented with some higher order constructs. We choose first order logic because that is what it is easiest to mechanise inferences for, but natural language is intrinsically higher order. I will describe some of the typical problems that arise in choosing logical representations for English constructs that are (a) capable of being derived compositionally from parsed sentences and (b) capable of supporting the necessary inferences. Usually this is a balancing act between expressiveness and efficiency of inference. What we need from you is some guidance about what higher order constructs are ‘safe’ in that they can be mechanised with reasonable efficiency, and some lessons in how to transform or compile the rest into something that can be used in practical systems. Underspecified Representations. Unfortunately, sentences usually contain context-dependent constructs (pronouns, etc.) whose interpretation will depend on the circumstances in which the sentence is produced. 
This means that semantic interpretation has to take place in two stages: one relatively compositional phase in which the meanings of the words and their syntactic configuration are used to build a 'quasi-logical form', and a second stage in which inferences are made from the context to flesh out the quasi-logical form to something that is evaluable independently. We have two problems here: firstly, the 'quasi-logical' forms only have a semantics indirectly, and so we need inference mechanisms for non-standard logics, or at least non-standard, linguistically oriented representations. Secondly, the context may not be fully represented either, and so we often need to make conditional or abductive inferences leading to conclusions of the form 'P if Q holds'. Again, many of the constructs we are dealing with are higher order, and typically, the search spaces can be very large.

Disambiguation. Most sentences are ambiguous. We can use various methods for choosing the most likely reading. Many readings can be eliminated because they are inconsistent with the model that has been built up by previous sentences. I will describe current applications of model building and model checking for this purpose. Our problems here include (a) reasoning efficiently with large numbers of axioms and (b) efficient reasoning with equality. Yet again, many of the constructs we are dealing with are higher order.

Some web sites which offer demos of NLP systems using theorem provers in the ways described above:
– http://www.coli.uni-sb.de/~bos/doris: Johan Bos's DORIS system
– http://www.cs.rochester.edu/research/epilog: Len Schubert's EPILOG demo
– http://ubatuba.ccl.umist.ac.uk: Allan Ramsay's PARASITE system

D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 509–510, 2000. © Springer-Verlag Berlin Heidelberg 2000
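The "semantic assembly" stage described earlier can be illustrated with the textbook lambda-calculus treatment: each word denotes a function, and the logical form of a phrase is built by applying those functions along the syntax tree. This is a generic illustration in Python, not the machinery of any of the systems above:

```python
# Each word's meaning is a function; parsing drives function application.
# Target: "every man snores" -> all x. man(x) -> snores(x)

def every(noun):
    """Determiner meaning: takes a noun, returns a function over verb meanings."""
    return lambda verb: f"all x. {noun('x')} -> {verb('x')}"

man    = lambda x: f"man({x})"
snores = lambda x: f"snores({x})"

# Compose along the syntax tree: (every man) snores
logical_form = every(man)(snores)
print(logical_form)  # all x. man(x) -> snores(x)
```

The balancing act mentioned above shows up immediately: the determiner meaning is higher order (a function taking functions), yet the output is a plain first-order formula that a prover can consume.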
Tutorial: Using TPS for Higher-Order Theorem Proving and ETPS for Teaching Logic

Peter B. Andrews and Chad E. Brown
Department of Mathematical Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
[email protected], [email protected]
TPS is an automated theorem proving system which can be used to prove theorems of first- or higher-order logic automatically, interactively, or in a combination of these modes of operation. Proofs in TPS are presented in natural deduction style. ETPS is a program which was obtained from TPS by deleting all the facilities for proving theorems automatically. ETPS can be used by students to learn how to prove theorems interactively. The objective of the tutorial is to teach participants how to make effective use of TPS and ETPS. Information about TPS, including manuals and information about obtaining the system, can be found at http://gtps.math.cmu.edu/tps.html. ETPS is intended to be used as a teaching tool in logic courses. ETPS can be used effectively in a course which is concerned purely with first-order logic as well as one which also deals with higher-order logic. ETPS gives students immediate feedback for both correct and incorrect actions, and makes it easy to display selected parts of proofs, as well as modify and rearrange them. Proofs, and the active lines of the proof, are displayed in proof windows which are automatically updated as the proof is constructed interactively. ETPS enables students to construct rigorous proofs of more difficult theorems than they otherwise find tractable. ETPS checks proofs automatically, and creates records of the theorems proved by each student which can be automatically transferred to the teacher's grade file. The logical language of TPS is Church's type theory, a formulation of higher-order logic in which theorems of mathematics can be expressed very naturally. The notation of this language is displayed on the screen and in printed proofs. Definitions are handled elegantly by λ-notation. The tutorial presupposes familiarity with first-order logic, but not with higher-order logic.
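The immediate feedback ETPS gives on each proof step can be pictured as checking the step against its inference rule. The following is a toy checker for a single rule, not ETPS itself:

```python
def check_mp(premise1: str, premise2: str, conclusion: str) -> bool:
    """Check one modus ponens step: from A and A -> B, infer B.
    Formulas are plain strings; the implication is written 'A -> B'."""
    a, b = None, None
    if "->" in premise2:
        a, b = (s.strip() for s in premise2.split("->", 1))
    return a == premise1.strip() and b == conclusion.strip()

print(check_mp("P", "P -> Q", "Q"))  # True: the step is accepted
print(check_mp("P", "P -> Q", "R"))  # False: the student gets immediate feedback
```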
The tutorial includes an introduction to the notation of type theory, examples showing how to express theorems of mathematics (including those involving inductive definitions) in this language, and lessons on how to write theorems and definitions in TPS and put them into a TPS library. The facilities for constructing natural deduction proofs and an editor for wffs are common to TPS and ETPS. In addition, TPS has tactics for applying natural deduction rules of inference semi-automatically, and automatic procedures for constructing complete proofs or filling in gaps in partially completed natural deduction proofs. TPS searches for proofs in automatic mode by first searching for an expansion proof, and then translating this into a natural deduction proof. TPS has a number of search procedures, and there are many flags which control the behavior of TPS and set bounds for the many dimensions of proof search in higher-order logic. TPS is designed to be a research tool as well as a theorem proving system. It has facilities for working on unification problems and mating searches, displaying wffs in vertical path diagrams, printing proofs in various styles including TeX, and translating back and forth between natural deduction proofs and expansion proofs. TPS has library facilities, online help, and extensive documentation (some of which is produced automatically). The tutorial provides opportunities for hands-on experience with TPS and ETPS, and discussion of how to treat examples provided by the participants.

The development of TPS and ETPS was supported by the National Science Foundation under grant CCR-9732312 and previous grants.

D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 511–512, 2000. © Springer-Verlag Berlin Heidelberg 2000
Workshop: Model Computation – Principles, Algorithms, Applications

Peter Baumgartner, Chris Fermueller, Nicolas Peltier, and Hantao Zhang

Computing models of first-order or propositional logic specifications is the complementary problem to refutational theorem proving. A deduction system capable of producing models significantly extends the functionality of purely refutational systems by providing the user with useful information when no refutation exists. Ideally, any theorem prover that terminates without finding a refutational proof should be able to output (information on) countermodels. Characterizing classes of inputs for which termination can be guaranteed, defining appropriate formalisms for representing such models, and providing algorithms for working with the resulting model representations (e.g., evaluating clauses, testing equivalence, etc.) is a great challenge in automated deduction. Computing models is becoming an increasingly important topic in automated deduction. This is due to potential application areas such as disproving conjectures in classical theorem proving and software verification; discourse representation in natural language; deductive databases; product configuration; hardware verification; model-based diagnosis; planning; model checking; etc. Some of these methods currently rely heavily on first-order logic with finite domains or propositional logic (e.g. model checking). On the other hand, methods for computing models for first-order specifications have been emerging recently by linking fields like term rewriting, term schematizations, and constraint evaluation, and their potential is worth exploring much further. The workshop therefore emphasizes model construction principles for the first-order case, although contributions concentrating on finite-domain and propositional logics, and on more expressive logics such as higher-order and modal logics, are also highly welcome.
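The complementarity described above is easiest to see in the propositional case: a search that fails to refute a clause set can report the satisfying assignment it found instead. A brute-force sketch follows; this is illustrative only — practical systems use DPLL-style procedures rather than enumeration:

```python
from itertools import product

def find_model(clauses, variables):
    """Clauses are sets of literals ('p' or '~p'). Return a satisfying
    assignment if one exists, else None (i.e. the set is refutable)."""
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        def lit_true(lit):
            return not assignment[lit[1:]] if lit.startswith("~") else assignment[lit]
        # A model must make at least one literal true in every clause.
        if all(any(lit_true(l) for l in clause) for clause in clauses):
            return assignment
    return None

# {p -> q, p, ~q} is unsatisfiable; dropping ~q yields a countermodel.
print(find_model([{"~p", "q"}, {"p"}, {"~q"}], ["p", "q"]))  # None: refutable
print(find_model([{"~p", "q"}, {"p"}], ["p", "q"]))          # {'p': True, 'q': True}
```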
More specifically, the goal of the workshop is to discuss (non-exclusively) research on the following issues: – Theoretical background, such as representation formalisms for models and their properties (like expressivity and complexity). – Calculi and respective procedures to compute models, implementations, experiments and performance issues. – Applications and related topics, such as finding the appropriate formulation, application problems, and problem sets. The goal of the workshop is to bring together researchers working on these and related topics. As an outcome, the workshop would help to identify important problems in model computation, concentrate our efforts coming from different directions to attack these problems, get new insights by mutually learning from the various aspects of model computation, and stimulate further research. Workshop home page: http://www.uni-koblenz.de/~peter/CADE17-WS-MODELS/
D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 513–513, 2000. © Springer-Verlag Berlin Heidelberg 2000
Workshop: Automation of Proof by Mathematical Induction

Carsten Schürmann
Carnegie Mellon University, Pittsburgh, USA
Mathematical induction is required for reasoning about objects or events containing repetition, e.g. computer programs with recursion or iteration, electronic circuits with feedback loops or parameterized components, and properties that hold for all time forward. It is thus a vital ingredient of formal methods techniques for synthesizing, verifying and transforming software and hardware. The automation of proof by induction strengthens the capabilities of mechanical assistants, reduces the need for designers to be skilled in mathematical proof techniques, and improves productivity by automating tedious and error-prone aspects of formal system development. This workshop is organized around four sessions. Inductive Theorem Proving and Formal Methods: Formal system development is becoming a mature and established discipline and induction is one of the key techniques for dealing with abstract concepts. The aim of this session is to bring together the merits of inductive theorem proving techniques and formal methods in industrial application scenarios. Higher-Order Inductive Theorem Proving: Higher-order logics provide a rich framework for expressing and reasoning about formal specifications. The importance of mechanizing formal arguments within higher-order logics is reflected by the sustained growth in popularity of verification environments such as HOL, Isabelle, Nuprl, and PVS. The aim of this session is to discuss recent advances of automated reasoning techniques within the context of higher-order logics. Integrating Inductive and High-Performance Theorem Provers: Many first-order theorem provers are based on tableaux, matrix, and resolution techniques and their implementations are in general very efficient and highly specialized. The aim of this session is to elaborate how to integrate inductive theorem provers with other existing theorem proving technology.
Meeting the Challenges: We are interested in problems which demonstrate the unique merits of inductive theorem proving techniques. Submitted challenge problems will be displayed on the homepage prior to the workshop. Researchers are invited to submit solutions or counter challenges. The aim of this workshop session is to debate the relative merits of challenges and their solutions. The workshop homepage is located at www.cs.cmu.edu/~carsten/apmi00 and the workshop committee consists of Carsten Schürmann, Andrew Ireland, Deepak Kapur, Christoph Kreitz, and Toby Walsh.
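A classic toy instance of proof by mathematical induction, of the kind any of the inductive provers discussed above handles, is commutativity of addition over the natural numbers. A sketch in Lean 4 syntax (lemma names may vary across versions, so treat this as illustrative):

```lean
-- Commutativity of addition, proved by induction on the second argument.
theorem add_comm' (m n : Nat) : m + n = n + m := by
  induction n with
  | zero      => simp                                  -- base case: m + 0 = 0 + m
  | succ k ih => simp [Nat.add_succ, Nat.succ_add, ih] -- step case uses the hypothesis
```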
D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 514–514, 2000. © Springer-Verlag Berlin Heidelberg 2000
Workshop: Type-Theoretic Languages: Proof-Search and Semantics

Didier Galmiche
LORIA - UHP, Nancy, France
Much recent work has been devoted to type theory and its applications to proof- and program-development in various logical settings. The focus of this workshop is on proof-search, with a specific interest in semantic aspects of, and semantic approaches to, type-theoretic languages and their underlying logics (e.g., classical, intuitionistic, linear, substructural). Such languages can be seen as logical frameworks for representing proofs and in some cases formalize connections between proofs and programs that support program synthesis. The theory of proof-search has developed mostly along proof-theoretic lines but using many type-theoretic techniques. The utility of type-theoretic methods suggests that semantic methods of the kind found to be valuable in the semantics of programming languages should be useful in tackling the main outstanding difficulty in the theory of proof-search, i.e., the representation of intermediate stages in the search for a proof. The objective of the workshop is to provide a forum for discussion between, on the one hand, researchers interested in all aspects of proof-search in type theory, logical frameworks and their underlying (e.g., classical, intuitionistic, substructural) logics and, on the other, researchers interested in the semantics of computation.
Topics of interest, in this context, include but are not restricted to the following: Foundations of proof-search in type-theoretic languages (sequent calculi, natural deduction, logical frameworks, etc.); Systems, methods and techniques related to proof construction or to counter-model generation (tableaux, matrix, resolution, semantic techniques, proof plans, etc.); Decision procedures, strategies, complexity results; Logic programming as search-based computation, integration of model-theoretic semantics, semantic foundations for search spaces; Computational models based on structures such as games and realizability; Proof synthesis vs program synthesis and applications, equational theories and rewriting; Applications of proof-theoretic and semantic techniques to the design and implementation of theorem provers.

Programme Committee: D. Galmiche, LORIA - UHP, Nancy, France. P. Lincoln, SRI, Stanford, U.S.A. F. Pfenning, CMU, Pittsburgh, U.S.A. D. Pym, Queen Mary and Westfield College, London, U.K. J. Smith, Chalmers University, Göteborg, Sweden.
D. McAllester (Ed.): CADE-17, LNAI 1831, pp. 515–515, 2000. © Springer-Verlag Berlin Heidelberg 2000
Workshop: Automated Deduction in Education Erica Melis Universität des Saarlandes, FB Informatik, Germany
One of the potential real-world applications of deduction systems is mathematics education; Patrick Suppes' instruction system, for example, was an early pioneer in this regard. While the potential has been mentioned in discussions at previous CADE conferences, there is currently renewed interest in this topic as well as several activities and projects within the CADE community. In an intelligent tutoring system a deduction component might be used, e.g., to provide the expert model, to provide potential models of erroneous reasoning, as a basis for topic sequencing, or as a basis for automated diagnosis. Typically, however, a mathematics education system will not consist of a deduction system alone, because in an educational context the need for explanation dominates the requirement of correctness in theorem proving. That is, the power of automated deduction has to be combined with appropriate interfaces, user models, theory construction, and explanation functionalities before a system can be didactically effective. Though extensive production-quality systems are still in the future, some of the knowledge and the knowledge representation currently used in automated and interactive theorem-proving systems can be employed for educational needs as well. A purpose of this workshop, in this application area of automated and interactive theorem proving, is to establish more communication between current education projects in the CADE community, to exchange ideas and opinions, and to make available the experience with education systems from other AI communities. We explicitly encourage the submission of project descriptions. We plan to focus the workshop on the following topics and questions:
– How best can automated and interactive theorem-proving systems contribute to mathematics education?
– What are the proof-presentation and explanation needs for such teaching?
– What sort of integration of specialized reasoning systems (e.g., computer algebra systems) should we expect?
– How do we generate good examples and counterexamples in various subjects?
– What is the role of knowledge-based theorem proving for mathematics education?
– What are the human-factors requirements for good systems?
– How do we evaluate the educational success of such systems?
Further information is available at http://www.ags.uni-sb.de/~melis/cade00ws.htm
Workshop: The Role of Automated Deduction in Mathematics Simon Colton, Volker Sorge, and Ursula Martin The purpose of this workshop is to discuss the role of automated deduction in all areas of mathematics. This includes looking at the interaction between automated deduction programs and other computational systems which have been developed over recent years to automate different areas of mathematical activity. Such systems include computer algebra packages, tutoring programs, mathematical discovery systems, and systems developed to help present and archive mathematical theories. The workshop will also include discussions of the use of automated theorem proving in the wider mathematical community. Presentations which detail the employment of automated deduction techniques in any area of mathematical research have been encouraged. With initiatives such as the Calculemus project, automated deduction is increasingly being seen not as an isolated area of research, but as part of an integrated attack on the problem of automating mathematics. We are interested in the interaction of automated theorem proving programs with (i) computer algebra (CA) packages, (ii) constraint solvers, (iii) model generators, (iv) tutoring systems, (v) interactive textbooks, (vi) theory formation programs, and (vii) mathematical databases. In all these fields automated deduction is either already used or could be fruitfully employed to enhance the power and reliability of existing systems. Particular ongoing projects include the use of deduction to certify and to enhance CA systems. Other projects include the incorporation of deduction into mathematical tutoring systems and interactive mathematical textbooks, and the use of theory formation to help in automated theorem proving. The interaction between these programs could be in terms of improving automated deduction, or in terms of using automated deduction to improve the techniques employed in the other system.
The workshop is intended to inspire the use of automated deduction within other fields of mathematics, as well as the incorporation of techniques from other fields into automated deduction. We intend to provide a forum for discussion between researchers from the field of automated deduction and researchers from particular domains of mathematics. In particular, the workshop will address mathematical results proved in part by automated deduction techniques, as well as theorems which can potentially be proved with automated techniques. An original goal of automated theorem proving was its application to mathematics, whether by proving established results, enhancing calculation techniques, or facilitating the discovery of new results. There is still much scope for the use of automated deduction to add to mathematics, and we hope to explore these possibilities in the workshop. The workshop home page is located at: http://www.dai.ed.ac.uk/~simonco/conferences/CADE00
Author Index
Allen, Stuart F. 170
Andrews, Peter B. 164, 511
Appel, Andrew W. 7
Audemard, Gilles 302
Bachmair, Leo 64, 220
Barrett, Clark W. 79
Baumgartner, Peter 200, 513
Belinfante, Johan G.F. 132
Benhamou, Belaid 302
Bezem, Marc 148
Bishop, Matthew 164
Borralleras, Cristina 346
Brown, Chad E. 164, 511
Brown, Marianne 411
Bustan, Doran 255
Chatalic, Philippe 449
Collins, Graham 497
Colton, Simon 517
Constable, Robert L. 170
Degtyarev, Anatoli 365
Dennis, Louise A. 497
Dill, David L. 79
Eaton, Rich 170
Emerson, E. Allen 236
Farmer, William M. 115
Fermueller, Chris 513
Ferreira, Maria 346
Franke, Andreas 455
Fujita, Hiroshi 184
Galmiche, Didier 515
Genet, Thomas 271
Giesl, Jürgen 309
Gillard, Guillaume 417
Giunchiglia, Enrico 291
Grumberg, Orna 255
Harrison, John 1
Hasegawa, Ryuzo 184
Hendriks, Dimitri 148
Henocque, Laurent 302
Horrocks, Ian 482
Horton, Joseph D. 385
Hustadt, Ullrich 433
Jackson, Mike 502
Kahlon, Vineet 236
Kammüller, Florian 99
Kapur, Deepak 324
Kautz, Henry 183
Klay, Francis 271
Kohlhase, Michael 455
Koshimura, Miyuki 184
Kreitz, Christoph 170
Lee, Peter 25
Lorigo, Lori 170
Lowe, Helen 502
Martin, Ursula 517
McCune, William 401
Meier, Andreas 460
Melis, Erica 516
Michael, Neophytos G. 7
Middeldorp, Aart 309
Necula, George C. 25
Nivelle, Hans de 148
Patel-Schneider, Peter 297
Peltier, Nicolas 513
Pulman, Stephen 509
Rubio, Albert 346
Ruess, Harald 220
Sattler, Ulrike 482
Schmidt, Renate A. 433
Schürmann, Carsten 507, 514
Seger, Carl-Johan 235
Shumsky, Olga 401
Simon, Laurent 449
Sinz, Carsten 177
Slind, Konrad 45
Sofronie-Stokkermans, Viorica 465
Sorge, Volker 517
Spencer, Bruce 385
Stump, Aaron 79
Subramaniam, Mahadavan 324
Sutcliffe, Geoff 406, 411
Tacchella, Armando 291
Tiwari, Ashish 64, 220
Tobies, Stephan 482
Voronkov, Andrei 365
Zhang, Hantao 513