INTRODUCTION TO METALOGIC
WITH AN
APPENDIX
ON
TYPE-THEORETICAL EXTENSIONAL AND INTENSIONAL LOGIC
BY
IMRE RUZSA
ARON PUBLISHERS
* BUDAPEST * 1997
ISBN 963 85504 5 7 ARON PUBLISHERS, H-1447 BUDAPEST, P.O.BOX 487, HUNGARY © Imre Ruzsa, 1997. Printed in Hungary
Copies of this book are available at Aron Publishers, H-1447 BUDAPEST, P.O.BOX 487, HUNGARY and L. Eotvos University, Department of Symbolic Logic, H-1364 BUDAPEST, P.O.BOX 107, HUNGARY
Printing: CP STUDIO, Budapest
In memoriam my wife (1942-1996)
ACKNOWLEDGEMENTS

The material of the present monograph originates from a series of lectures held by the author at the Department of Symbolic Logic of L. Eotvos University, Budapest. The questions and critical remarks of my students and colleagues were a very valuable help in developing my investigations. My sincere thanks are due to all of them. Special thanks are due to PROFESSOR ISTVAN NEMETI and Ms AGNES KURUCZ, who read the first version of the manuscript and made very important critical remarks. In preparing and printing this monograph, I got substantial help from my son DR. FERENC RUZSA as well as from my daughter AGNES RUZSA.
*** This work was partly supported by the Hungarian Scientific Research Foundation
(OTKA II3, 2258) and by the Hungarian Ministry of Culture and Education (MKM 384/94).
Imre Ruzsa
Budapest, June 1996.
TABLE OF CONTENTS

Chapter 1 Introduction
 1.1. The subject matter of metalogic
 1.2. Basic postulates on languages
 1.3. Speaking on languages
 1.4. Syntax and semantics

Chapter 2 Instruments of metalogic
 2.1. Grammatical means
 2.2. Variables and quantifiers
 2.3. Logical means
 2.4. Definitions
 2.5. Class notation

Chapter 3 Language radices
 3.1. Definition and postulates
 3.2. The simplest alphabets

Chapter 4 Inductive classes
 4.1. Inductive definitions
 4.2. Canonical calculi
 4.3. Some logical languages
 4.4. Hypercalculi
 4.5. Enumerability and decidability

Chapter 5 Normal algorithms
 5.1. What is an algorithm?
 5.2. Definition of normal algorithms
 5.3. Deciding algorithms
 5.4. Definite classes

Chapter 6 The first-order calculus (QC)
 6.1. What is a logical calculus?
 6.2. First-order languages
 6.3. The calculus QC
 6.4. Metatheorems on QC
 6.5. Consistency. First-order theories

Chapter 7 The formal theory of canonical calculi (CC*)
 7.1. Approaching intuitively
 7.2. The canonical calculus Σ*
 7.3. Truth assignment
 7.4. Undecidability: Church's Theorem

Chapter 8 Completeness with respect to negation
 8.1. The formal theory CC
 8.2. Diagonalization
 8.3. Extensions and discussions

Chapter 9 Consistency unprovable
 9.1. Preparatory work
 9.2. The proof of the unprovability of Cons

Chapter 10 Set theory
 10.1. Sets and classes
 10.2. Relations and functions
 10.3. Ordinal, natural, and cardinal numbers
 10.4. Applications

References
Index
List of symbols

APPENDIX (Lecture Notes): Type-Theoretical Extensional and Intensional Logic
 Part 1: Extensional Logic
 Part 2: Montague's Intensional Logic
 References
Chapter 1
INTRODUCTION

1.1 The Subject Matter of Metalogic

Modern logic is not a single theory. It consists of a considerable (and ever growing) number of logical systems (often called - regrettably - logics). Metalogic is the science of logical systems. Its theorems are statements either about a particular logical system or about some interrelations between certain logical systems. In fact, every system of logic has its own metalogic that, among other things, describes the construction of the system, investigates the structure of proofs in the system, and so on. Many theorems usually known as "laws of logic" are, in fact, metalogical statements. For example, the statement saying that modus ponens is a valid rule of inference - say, in classical first-order logic - is a metalogical theorem about a system of logic. A deeper metalogical theorem about classical first-order logic tells us that a certain logical calculus is sound and complete with respect to the set-theoretical semantics of this system of logic.

Remark. It is assumed here that this essay is not the reader's first encounter with logic (nor with mathematics), so that the examples above, and some similar ones later on, are intelligible. However, this does not mean that the understanding of this book depends on some previous knowledge of logic or mathematics.
Another very important task of metalogic is to answer the problem: How is logic possible? To give some insight into the seriousness of this question, let me refer to the well-known fact that modern logic uses mathematical means intensively, whereas modern mathematical theories are based on some system(s) of logic. Is there a way out of this - seemingly - vicious circle? This is the foundational problem of logic, and its solution is the task of the introductory part of metalogic. The greater part of this essay is devoted to this foundational problem. The device we shall apply in the course consists in dealing alternately with mathematical and logical knowledge, without drawing a sharp borderline between mathematical and logical means. No knowledge of a logical or a mathematical theory will be presupposed. The only presupposition we shall exploit is that the reader knows the elements of the grammar of some natural language (e.g., his/her mother tongue), can read, write and count, and is able to follow so-called formal expositions. (Of course, the ability last mentioned assumes - tacitly - some skills which can best be mastered in mathematics.)

The introductory part of metalogic is similar to the discipline created by David Hilbert, called metamathematics. (See, e.g., HILBERT 1926.) The aim of metamathematics was to find a solid foundation for some mathematical systems (e.g., number theory, set theory) by using only so-called finite means. In Hilbert's view, finite mathematics is sufficient for proving the consistency of transfinite (non-finite, infinite) mathematical theories. In a sense, the foundation of a logical calculus (which is sufficient for mathematical proofs in most cases) was included in Hilbert's programme. (Perhaps this is the reason that scientists who think that modern logic is just a branch of mathematics - called mathematical logic - often say metamathematics instead of metalogic.)

Metalogic is not particularly interested in the foundation of mathematical theories. It is interested in the foundation of logical systems. In its beginning, metalogic will use very simple, elementary means which could be qualified as finite ones. However, the author does not dare to draw a borderline between the finite and the transfinite. We shall proceed starting with simple means and using them to construct more complex systems.

Every system of modern logic is based on a formal language. As a consequence, our investigation will start with the problem: How is it possible to construct a language? Some of our results may turn out to be applicable not only to formal languages but to natural languages as well.

This essay is almost self-contained. Two exceptions, where most proofs will be omitted, are the first-order calculus of logic (called here QC, Chapter 6) and set theory (Chapter 10). The author assumes that the detailed study of these disciplines is the task of other courses in logic.
Technical remarks. Detailed information about the structure of this book is to be found in the Table of Contents. At the end of the book, the Index and the List of Symbols help the reader to find the definitions of the notions and symbols referred to. In internal references, the abbreviations 'Ch.', 'Sect.', 'Def.', and 'Th.' are used for 'Chapter', 'Section', 'Definition', and 'Theorem', respectively. References to the literature are given, as usual, by the (last) name of the author(s) and the year of publication (e.g., 'HILBERT 1926'); the full data are to be found in the References (at the end of the book). - No technical term or symbol will be used without definition in this book. A bullet '•' indicates the end of a longer proof or definition. - A considerable part of the material in this book is borrowed from a work by the author written in Hungarian (RUZSA 1988).
1.2 Basic Postulates on Languages

On the basis of experience gained from natural languages, we can formulate our first postulate concerning languages:

(L1) Each language is based on a finite supply of primitive objects. This supply is called the alphabet of the language.
In the case of formal languages, this postulate will serve as a normative rule.
In spoken languages, the objects of the alphabet are called phonemes, in written languages letters (or characters). The phonemes are (at least theoretically) perceivable as sound events, and the letters as visible (written or printed) paint marks on the surface of some material (e.g., paper) - this is the difference between spoken and written languages. We shall be interested only in written languages.
In using a language, we form finite strings from the members of the alphabet, allowing the repetition (repeated occurrence) of the members of the alphabet. In the case of spoken languages, the members of such a string are ordered by their temporal succession. In the case of written languages, the order is regulated by certain conventions of writing. (The author assumes that no further details are necessary on this point.) Finite strings formed from the members of the alphabet are called expressions - or, briefly, words - of that language. The reader may comment here that not all possible strings of letters (or of phonemes) are used in a (natural) language. Only a part of the totality of possible expressions is useful; this part is called the totality of well-formed or meaningful expressions. (But a counterexample may occur in a formal language!) Be that as it may, to define the well-formed expressions, we certainly must refer to the totality of all expressions. Thus, our notion of expressions (words) is not superfluous.
Note that one-member strings are not excluded from the totality of expressions. Hence, the alphabet of a language is always a part of the totality of words of that language. Moreover, for technical reasons, it is useful - although not indispensable - to include the empty string, called the empty word, amongst the words of a language. (We shall return to this problem later on.) Our second postulate is again based on experience with natural languages:

(L2) If we know the alphabet of a language, we know the totality of its words (expressions). In other words: The alphabet of a language uniquely determines the totality of its words.
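Postulates (L1) and (L2) admit a small computational illustration (an aside of mine, not part of the text; the two-letter alphabet below is an arbitrary choice): given any finite alphabet, the totality of words over it, including the empty word, can be enumerated systematically, with no further data needed.

```python
from itertools import count, product

def words(alphabet):
    """Enumerate all finite strings over `alphabet`, shortest first.
    The alphabet alone determines this totality, as postulate (L2) states;
    the length-0 round yields the empty word."""
    for n in count(0):                               # lengths 0, 1, 2, ...
        for letters in product(alphabet, repeat=n):  # all strings of length n
            yield ''.join(letters)

gen = words(['a', 'b'])
first_seven = [next(gen) for _ in range(7)]
# ['', 'a', 'b', 'aa', 'ab', 'ba', 'bb']
```

The generator never terminates, reflecting the fact that the totality of words over a non-empty alphabet is infinite even though the alphabet is finite.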
To avoid philosophical and logical difficulties we need a third postulate:

(L3) The expressions of any language are ideal objects which are sensibly realizable or representable (in any number of copies) by physical objects (paint marks) or events (sound events or others).
This assumption is no graver than the view that natural numbers are ideal objects. And it makes intelligible the use of a language in communication. Thus, in speaking of an expression (of a language) we speak of an ideal object rather than of a perceivable object, i.e., a concrete representation of that ideal object. In other words: Statements on an expression refer to all of its possible realizations, not only to some particular representation of the expression. (Reference to a concrete copy of an expression must be indicated explicitly.)
1.3 Speaking about Languages

Speaking (or writing) about a language, we must use a language. In this case, the language we want to speak about is called the object language, and the language we use is called the language of communication or, briefly, the used language. The object language and the used language may be the same. The used language is, in most cases, some natural language, even when the object language is a formal one. However, the used language is, in most cases, not simply the everyday language: it is supplemented with some technical terms (borrowed from the language of some scientific discipline) and perhaps with some special symbols which might be useful in the systematic treatment of the object language. (If the object language is a natural one, then the everyday language used is to be supplemented, obviously, with terms of linguistics. In the case of formal languages, the additional terms and symbols are borrowed from mathematics, as we shall see later on.)

The fragment of the used language that is necessary and sufficient for the description and the examination of an object language is usually called the metalanguage of the object language in question. In fact, this metalanguage is relativized to the used language. (If the language of communication changes, the metalanguage of an object language changes, too.) If the object language is a formal one, there might be a possibility to formulate its metalanguage as a formal language. (Theoretically, this possibility cannot be excluded even for natural object languages.) However, in such a case we need a meta-metalanguage for explaining - making intelligible - the formalized metalanguage. This device can be iterated as often as one wishes, but in the end we cannot avoid the use of a natural language, provided one does not want to play an unintelligible game.

When speaking about an object language, it may occur that we have to formulate some statements about some words of that language.
If we want to speak about a word, we must use a name of the word. Some words (but not too many) of a language may have a proper name (e.g., epsilon is the proper name of the Greek letter ε), others can be named via descriptions (e.g., 'the definite article of English'). A universal method in a written used language (to be used in this essay as well) consists in putting the expression to be named between simple quotation marks (inverted commas); e.g., 'Alea iacta est' is a Latin sentence which is translated into English by 'The die is cast'. (Note that in a written natural language the space between words counts as a letter.) The omission of quotation marks can lead to ambiguity, and, hence, it is a source of language humour. An example:
- What word becomes shorter by the addition of a syllable? - The word 'short'. Surely, it becomes 'shorter' but not shorter.
1.4 Syntax and Semantics

The science dealing with symbols (or signs) and systems of symbols is called semiotics. Languages, as systems of symbols, belong, obviously, under the authority of semiotics. Semiotics is, in general, divided into three main parts: syntax, semantics, and pragmatics. (See, e.g., MORRIS 1938, CARNAP 1961.) The syntax of a language is the part of the description (or investigation) of the language dealing exclusively with the words of the language, independently of their meaning and use. Its main task is to define the well-formed (meaningful) expressions of the language and to classify them. The part of linguistic investigation that deals with the meaning of words and with the interrelations between language and the outer world, but is indifferent with respect to the circumstances of using the language, belongs to the sphere of semantics. Finally, if the linguistic investigation is interested also in the circumstances of language use, then it belongs to the sphere of pragmatics. No rigid borderlines exist between these regions of semiotics.
In natural languages, most parts of syntactic investigation cannot be separated from the study of the communicative function of the language, and hence investigations in the three regions become strongly intertwined. Of course, in the systematization of the results of such studies, it is possible to omit the semantical and the pragmatic aspects; this makes pure syntax possible as a relatively independent area of language investigation.
In the case of formal languages, the situation is somewhat different. A formal language is not (or, at least, rarely) used for communication. It is an artificial product aiming at the theoretical systematization of a scientific discipline (e.g., a system of logic, or a mathematical theory). Its syntax (grammar) and semantics are not discovered by empirical investigations; rather, they are created, constituted. Thus, seemingly, here we have a possibility for the rigid separation of syntax and semantics. However, if our formal language is not destined to be a l'art pour l'art game, its syntax must be suitable for some scientific purposes; at least a part of its expressions must be interpretable, translatable into a natural language. Consequently, the syntax and the semantics of a formal language created for some scientific purpose must be intertwined: it is impossible to outline, to create, the former without taking into consideration the latter. After the outline of the language, of course, the description of its syntax is possible independently from its semantics. In this case, the role and function of the syntactic notions and relations will be intelligible only after the study of the semantics.
In this essay, the following strategy will be applied: Syntax and semantics (of a formal language) will be treated separately, but in the description of the syntax, we shall give preliminary hints with respect to the semantics. By this, the reader will get an intuitive picture of the function of the syntactic notions. However, our main subject matter belongs to the realm of syntax.

Formal languages are "used" as well, even if not for communicative purposes, but in some scientific discipline (e.g., in logic). Applications of a formal language can be assumed to belong to the sphere of pragmatics - if somebody likes to think so. However, this viewpoint is not applied in the literature of logic.

After these introductory explanations, we should like to turn to our first problem: How is it possible to construct the syntax of a language? However, we shall deal first with the means used in metalanguages. This is the subject of the following chapter.
Chapter 2
INSTRUMENTS OF METALOGIC

2.1 Grammatical Means

The basic grammatical instruments of communication are the declarative sentences. This holds true for any metalanguage - and even for our present hyper-metalanguage used for the description of the instruments of metalanguages. We formulate definitions, postulates, and theorems by means of declarative sentences. In what follows, we shall speak - for the sake of brevity - of sentences instead of declarative sentences. Thus, sentence is the basic grammatical category of metalanguages.

Another important grammatical category is that of individual names - in what follows, briefly, names. Names may occur in sentences, and they refer to (or denominate) individual objects - in our case, of course, grammatical objects (letters, words, expressions). In the simplest case, they are proper names introduced by convention. Compound names will be mentioned later on.

The most frequent components of sentences will be called functors. In a first approximation, functors are incomplete expressions (in contrast to sentences and names, which are complete ones insofar as their role in communication is fixed) containing one or more empty places, called argument places, which can be filled in by some complete expressions (names or sentences), whereby one gets a complete expression (name or sentence).

Remark. There exist functors of which some empty place is to be filled in by another functor. In our investigations, we shall not meet such functors. Hence, the above explanation of functors - although defective - will suffice for our purposes.
Functors can be classified into several types. The type of a functor can be fixed by determining (a) the category of words permitted to fill in its argument places (for each argument place), and (b) the category of the compound expression that results from filling in all its empty places. According to the number of empty places of a functor, we speak of one-place or monadic, two-place or dyadic, three-place or triadic, ..., multi-place or polyadic functors. A functor can be considered as an automaton whose inputs are the words filled in its empty places, and whose output is the compound expression that results from filling in its empty places. Using this terminology, we can say that the type of a functor is determined by the categories of its possible inputs and the category of its output. A functor is said to be homogeneous if all its inputs belong to the same category, i.e., if all its empty places are to be filled in with words of a single category. We shall deal only with homogeneous functors.
In metalanguages, we shall be interested in the following three types of (homogeneous) functors.
(1) Sentence functors forming compound sentences from sentences. Their inputs and outputs are sentences.
(2) Name functors forming compound names from names. Their inputs and outputs belong to the category of names.
(3) Predicates forming sentences from names. Their inputs are names, and their outputs are sentences.

Another cross-classification of functors consists in distinguishing logical and non-logical functors. Logical functors have the same fixed meaning in all metalanguages. All the sentence functors we shall use are logical ones. We shall meet them in Sect. 2.3. Among the predicates, there is a single one that counts as a logical one: the dyadic predicate of identity. All the other functors we shall deal with are non-logical ones.
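The three functor types can be mimicked by typed functions. In this illustrative Python sketch (the function names are mine, not the author's), categories are modelled by Python types: sentences are represented by their truth values (bool) and names by strings (str), so a functor's type is literally its input and output types.

```python
from typing import Callable

# (1) Sentence functor: sentences in, sentence out.
conjunction: Callable[[bool, bool], bool] = lambda p, q: p and q

# (2) Name functor: names in, compound name out.
addition_name: Callable[[str, str], str] = lambda a, b: f"{a}+{b}"

# (3) Predicate: names in, sentence out.
is_short: Callable[[str], bool] = lambda name: len(name) <= 4

compound = addition_name("5", "3")   # the compound name '5+3'
claim = is_short(compound)           # a 'sentence' about that name: True
```

All three functors here are homogeneous in the text's sense: each takes all its inputs from a single category.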
Name functors express operations on individual objects in order to get (new) objects. Well-known examples are the dyadic arithmetical operations: addition, multiplication, etc. Thus, the symbol of addition, '+', is a dyadic name functor. The expression '5+3' is a compound name (of a number) formed from the names '5' and '3'. Here the input places surround the functor (we can illustrate the empty places by writing '... + ---'); this is the general convention with respect to using dyadic functors (so-called infix notation). In metalanguages, the most important dyadic name functor we shall use is called
concatenation. It expresses the operation by which we form a new word from two given words, linking one of them after the other. For example, from the (English) words 'cow' and 'slip' we get by concatenation the word 'cowslip' (or even, in the reversed order, 'slipcow'). This operation will be expressed as

cow⌢slip

where '⌢' is the concatenation functor. Now, one sees that (in any language) the words consisting of more than one letter are composed from letters by (iterated) applications of concatenation. We shall deal with this functor in more detail in Sect. 3.1.
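Concatenation has an immediate computational counterpart. In Python (an aside of mine, not the book's notation), string concatenation plays exactly the role of the dyadic name functor above, and iterated concatenation builds multi-letter words from one-letter ones:

```python
from functools import reduce

def concat(u, v):
    """The dyadic name functor of concatenation: link word v after word u."""
    return u + v

cowslip = concat('cow', 'slip')   # 'cowslip'
slipcow = concat('slip', 'cow')   # the reversed order: 'slipcow'

# A longer word arises from one-letter words by iterated concatenation,
# i.e. (('c' concat 'o') concat 'w'):
word = reduce(concat, ['c', 'o', 'w'])   # 'cow'
```

Note that concatenation is associative but, as the 'cowslip'/'slipcow' pair shows, not commutative.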
Monadic predicates are used to express properties of individual objects. If the argument place of such a predicate is filled in by a name, we get a sentence stating that the object denominated by the name (if any) bears the property expressed by the predicate. An arithmetical sentence can serve as an illustration:
Eleven is an uneven number. Here the property is expressed by the predicate ' ... is an uneven number'; and its argument place is filled in by the name 'eleven'.
Multi-place (or polyadic) predicates express relations between individual objects. In arithmetic, the symbol '<' is an abbreviation of the dyadic predicate '... is smaller than ---'; thus, e.g., '9 < 7' expresses a (false) arithmetical statement. Among the dyadic predicates there is the logical predicate of identity. Its well-known symbol, '=', can be expressed in (English) words as '... is identical with ---'. Putting names into the empty places we get a sentence stating that the two names denote the same object; e.g.,

(1)  8 + 5 = 6 + 7,

(2)  9 × 7 = 61.

Obviously, sentence (1) is true (for '8+5' denotes the same number as '6+7' does), but (2) is false (since '9×7' and '61' denote different numbers). We can express the denial of (2) by

9 × 7 ≠ 61.
In general, let us agree to use the symbol

... ≠ ---

for expressing '... is not identical with ---', or, in other words, '... differs from ---'. No doubt, identity is a logical predicate in the sense that its meaning is uniquely fixed by the stipulation that its output is a true sentence if and only if its inputs denote the same object. As a consequence, a sentence of the form

a = a

where the letter 'a' is replaced by any name (assuming only that this name refers to a unique object) is always true (is a logical truth), although it conveys no information.

Remark. Unfortunately, in the literature of mathematics and logic, the term 'equality' is mostly used instead of 'identity' (and one finds 'equals' instead of 'is identical with'). This is regrettable, for identity and equality can be clearly distinguished. For example, a triangle may have two equal sides (or angles) which are not identical. Or, in the eye of the law, we are (supposedly) all equal, but certainly not identical with each other.
The grammatical categories reviewed in this section are very important in metalanguages. However, they are not all that we need. We cannot dispense with the further means treated in the next section.
2.2 Variables and Quantifiers

Let us consider the following sentence speaking about the grammar of a certain natural language:

(1)
Each substantive noun is a noun phrase.
In this sentence no (individual) name occurs. We see in it the predicate 'is a noun phrase', and we suspect that the expression 'substantive noun' refers somehow to the predicate 'is a substantive noun'. And we realize that the grammatical categories treated in the previous section are insufficient for parsing meaningful metalanguage sentences. Let us re-formulate (1) as follows:

(2)
Be it any word, if it is a substantive noun then it is a noun phrase.
Here the two predicates are directly present, but their argument places are filled in by the pronoun 'it' instead of a name. The core of (2), namely (3)
if it is a substantive noun then it is a noun phrase
seems to be a correct sentence of English, although its information content is unclear as long as the reference of the pronoun is not given. Now the prefix 'Be it any word' tells us that the reference of 'it' may be any word. Sentences containing pronouns in the place of names - such as (3) - may be called open sentences. By applying a suitable prefix - like 'be it any word', or, more generally, 'be it anything' - to an open sentence we can get a closed sentence having an unambiguous information content.

In formal languages, it is customary to use special symbols called variables instead of pronouns. Thus, variables are artificial pronouns in formal languages. They have been consciously and regularly used in mathematics in modern times. It proves to be useful to introduce them in metalanguages as well. (The parsing of our example (1), perhaps, does not give convincing evidence for the use of variables, but our later examples will.) In formal object languages, variables may occur in several grammatical categories. However, in metalanguages, we only need so-called individual variables referring to individual objects (in the case of syntax, to words of a certain language). Thus, variables may occur in every place
where names may occur. We shall use mainly single italicised letters as variables (Roman and Greek letters, both upper- and lower-case), but sometimes groups of letters, or letters with sub- or superscripts, will be used. Using the letter 'x' as a variable, we can re-formulate the open sentence (3) as follows:

(3')
if x is a substantive noun then x is a noun phrase.
And for the closed sentence (2), we introduce as its regular form:
(2')
For all x (if x is a substantive noun then x is a noun phrase).
Here the prefix 'For all x' is to be called the universal quantification of the variable x, and the open sentence between parentheses following the prefix is to be called the scope of the quantification. The parentheses serve to stress the limits of the scope (which might be important if the sentence occurs in a longer text). Now we shall introduce another device for getting closed sentences from open ones, called existential quantification. Let us consider the sentence:

(4)  Some adverbs end in '-ly'.

We shall re-formulate this as follows:

(4')  For some x (x is an adverb, and x ends in '-ly').

This is to be understood as stating that from the open sentence

x is an adverb, and x ends in '-ly'

we can get a true sentence in at least one case by putting a name (of a word) in place of the variable x. That is, (4') says that there is at least one adverb ending in '-ly'. Thus, the plural in (4) - which suggests that there are several such adverbs - is neglected in this reconstruction. The prefix 'For some x' in (4') is to be called the existential quantification of the variable x, and the open sentence between parentheses after the prefix is again to be called the scope of the quantification.

As abbreviations of the prefixes used in universal and existential quantification, we shall use '∧x' instead of 'For all x', and '∨x' instead of 'For some x'. The symbols '∧' and '∨' are called universal and existential quantifiers, respectively. These will be used in metalanguages only. (In object languages, we shall use other symbols for the quantifiers.) The adjectives 'universal' and 'existential' are understandable, but the terms 'quantifier' and 'quantification' are somewhat misleading: they come from the logic of the Middle Ages. Now the final parsing of our examples is as follows:

(2")  ∧x(if x is a substantive noun then x is a noun phrase).

(4")  ∨x(x is an adverb, and x ends in '-ly').
(In fact, these are not the final results. The expressions 'if... then' and 'and' will be reconsidered in the next section as logical sentence functors.)
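When the variable ranges over a finite, listable domain, the two quantifiers reduce to a finite conjunction and a finite disjunction. This Python sketch (my own illustration; the four-word lexicon and its classification are invented toy data) evaluates parsings of the shape of (2") and (4") with all() and any():

```python
lexicon = ['dog', 'quickly', 'softly', 'table']   # a toy finite word supply
nouns = {'dog', 'table'}                          # the substantive nouns
noun_phrases = {'dog', 'table'}                   # the noun phrases
adverbs = {'quickly', 'softly'}                   # the adverbs

# Universal quantification, as in (2"):
# for all x (if x is a substantive noun then x is a noun phrase)
universal = all((x not in nouns) or (x in noun_phrases) for x in lexicon)

# Existential quantification, as in (4"):
# for some x (x is an adverb, and x ends in '-ly')
existential = any(x in adverbs and x.endswith('ly') for x in lexicon)
# universal and existential are both True for this toy lexicon
```

Note how the conditional inside the universal case is rendered as '(not A) or B', the standard truth-functional reading of 'if A then B'.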
The quantifiers are not functors, at least not in the sense treated in the previous section. They belong to the category of variable-binding operators, which do not occur in natural languages. In order to introduce some fine distinctions, we have to take into consideration that
a variable may occur more than once in a sentence. Thus, we sometimes need to speak about certain occurrences (e.g., the first, the second, ..., etc. occurrence) of a variable in a sentence. Now, we say that a certain occurrence of a variable, say 'x', in a given sentence is a bound one if it falls within a subsentence of the form '∧x(...)' or '∨x(...)' (here the dots represent the scope of the quantifier), and we say that the quantifier binds the variable following it throughout its scope. Occurrences of a variable in a sentence that are not bound will be called free occurrences of that variable in the sentence spoken of. Thus, a variable may have both free and bound occurrences in the same sentence. However, in metalanguages, we shall try to avoid sentences in which the same variable occurs both free and bound. A sentence is said to be an open one if some variable has a free occurrence in it; in the contrary case it is said to be a closed sentence.

An application of a quantifier may be called effective if its variable (i.e., the variable following the quantifier immediately) has some free occurrences in its scope. In the contrary case, the quantification may be said to be vacuous or ineffective. In formal languages, vacuous quantification is permitted. In metalanguages, we shall avoid it as far as possible. Applying an (effective) quantification to an open sentence, the number of (distinct) free variables in the resulting sentence is diminished by 1, compared to the sentence in its scope.

A name may again be open or closed. A name is open if it involves some variables (think of a name functor whose empty places - or some of them - are filled in by variables); thus, a variable alone counts as an open name, too. A name is closed if no variables occur in it. We do not introduce an operator that forms a closed name from an open one. Thus, variables of a name can be bound only by quantifiers
applied to a sentence involving the name in question.
Substitution of free variables in a sentence. Given a sentence involving free occurrences of a variable, we can get another sentence by substituting the same closed name for each free occurrence of the variable. Substituting open names for a variable is also permitted, under the condition that no variable of the name becomes bound in the resulting sentence. In more detail: If x is the variable to be substituted by a name involving the variable y, and the sentence in question involves a subsentence of form '∧y(...)' or '∨y(...)', then x must not occur freely in this subsentence - for, in the contrary case, the quantifier with y would bind a free variable (namely y) of the name. To see the importance of this condition, let us consider the following example:
(5) If x is a word then ∨y(y is longer than x).
The universal quantification (with respect to x) of this sentence seems to be true, if applied to a language. Thus, one can think that the variable x can be substituted by any name without risking the meaningfulness of the result. However, by substituting y for x, we get:
If y is a word then ∨y(y is longer than y).
I suppose no comment is needed.
Bound variables are used in order to show clearly the inner structures of quantified sentences. They cannot be substituted by closed or compound names. However, a bound variable can be substituted by another one. It is unimportant what letter is used as a variable in a quantifier as long as the scope remains intact. Thus, we can substitute a bound variable, say 'x', by another variable, say 'y', provided y has no free occurrences in the scope of the quantification. Such a substitution is to be understood as putting 'y' for the occurrence of 'x' following the quantifier and for all free occurrences of 'x' in the scope of the quantifier. For example, in (5), the bound variable 'y' can be substituted by 'z' - but not by 'x':
If x is a word then ∨z(z is longer than x).
The substitution of bound variables is often called re-naming of bound variables. In metalanguages, we shall rarely apply this device. (In formal object languages, it is an important procedure.)
By a universal and an existential sentence let us mean a sentence of form '∧x(...)' and '∨x(...)', respectively, where 'x' is any variable. Given such a sentence, let us omit the initial quantifier (and the variable following it), and let us substitute the variable x by a closed name in the remaining sentence. The result will be called an instance or an instantiation of the original (universal or existential) sentence. Taking into consideration the meaning of the quantifiers, it is obvious that we can correctly infer (a) from a universal sentence to any of its instantiations, and (b) from any instantiation of an existential sentence to the existential sentence.
In metalanguages, we are compelled to use variables systematically. In most cases, our variables refer to the words of a certain language. In the contrary case, we shall give a declaration about the permitted reference of the variables. Without such a declaration,
the meaning of quantification would be unclear. (What does it mean 'For all x' if we do not know to what sort of objects the variable 'x' refers?)
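The capture condition on substitution discussed above can be made concrete in a short program. The following Python sketch is an illustration of ours, not part of the text: sentences are modelled as nested tuples, the labels 'all' and 'ex' stand for the two quantifiers, and ('name',) is an ad hoc placeholder for the substituted name.

```python
def free_vars(f):
    """The set of variables with free occurrences in formula f."""
    op = f[0]
    if op == 'var':                      # an occurrence of a variable
        return {f[1]}
    if op in ('all', 'ex'):              # ('all', v, body): v is bound in body
        return free_vars(f[2]) - {f[1]}
    return set().union(*(free_vars(g) for g in f[1:]))

def substitute(f, x, name_vars):
    """Substitute for the free occurrences of x a name whose variables
    are name_vars; refuse the substitution if a variable of the name
    would become bound (captured)."""
    op = f[0]
    if op == 'var':
        return ('name',) if f[1] == x else f
    if op in ('all', 'ex'):
        q, v, body = f
        if v == x:                       # x is bound here: nothing to do
            return f
        if v in name_vars and x in free_vars(body):
            raise ValueError(f"variable {v!r} of the name would be captured")
        return (q, v, substitute(body, x, name_vars))
    return (op,) + tuple(substitute(g, x, name_vars) for g in f[1:])

# Sentence (5):  if x is a word then ∨y(y is longer than x)
five = ('if', ('word', ('var', 'x')),
              ('ex', 'y', ('longer', ('var', 'y'), ('var', 'x'))))

substitute(five, 'x', set())      # a closed name: allowed
substitute(five, 'x', {'z'})      # an open name with variable z: allowed
try:
    substitute(five, 'x', {'y'})  # an open name with variable y: capture!
except ValueError as err:
    print(err)
```

Running the last lines shows that a closed name, or an open name containing the fresh variable z, is accepted, while an open name containing y is rejected, mirroring the discussion of example (5).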
Tacit quantification. Let us introduce the convention that if a "completed" metalanguage sentence (i.e., one which is not a part of a longer sentence) involves free variables, then it is to be understood as if its free variables were bound by universal quantifiers standing before the sentence and having as their scope the whole sentence. This is a usual device, applied both in mathematics and metalogic, and it is called tacit universal quantification. Of course, we can never omit in this way an existential quantifier, or a universal one applied only to a subsentence (a clause).
Naming open expressions. In a metalanguage, sometimes we must refer to expressions involving free variables. Imagine, e.g., a grammatical rule saying that if A and B are sentences then
(6)
if A then B
is a (compound) sentence as well. Here 'A' and 'B' are used as variables (referring to the words of some language), and, moreover, the rule seems to be a universal one, that is, these variables are tacitly quantified. How can we name the expression standing in line (6)? Enclosing it in quotation marks would be wrong, for we do not want to say that the expression 'if A then B' is a sentence. Instead, we should like to say that we get a sentence from the schema (6) whenever we put sentences in the place of 'A' and 'B'. Probably, a long and complicated circumscription would be possible, but it is simpler and more economical to introduce some new boundary marks in order to name schemata involving variables (such as (6)). We shall use double quotation marks (double inverted commas) for this purpose. Then, the grammatical rule mentioned above can be expressed by the following sentence:
(7) ∧A∧B (if A is a sentence and B is a sentence then "if A then B" is a sentence).
Expressions bordered by double inverted commas - such as "if A then B" above - will be called quasi-quotations. We agree that quantifiers - explicit or suppressed (tacit) ones - are effective with respect to variables occurring in a quasi-quotation (in contrast to variables occurring in a simple quotation). Now an instantiation of a universal sentence involving a quasi-quotation is to be formed as follows: Occurrences of variables within the quasi-quotation are to be substituted by words - not by names of words -; the signs of the quasi-quotation (i.e., the double inverted commas) are to be replaced by simple quotation marks (i.e., by simple inverted commas); and the other occurrences of variables (outside of the quasi-quotation) are to be substituted - as usual - by names of words. Thus, an instance of (7) is as follows:
if 'pro' is a sentence and 'con' is a sentence then 'if pro then con' is a sentence.
Provided, of course, that the words 'pro' and 'con' are possible values of the variables 'A' and 'B' occurring in (7).
Let us realize that an occurrence of a variable can be considered as an open name, and, hence, it could be included between double inverted commas. For example, a part of (7) could be written as
if "A" is a sentence and "B" is a sentence ...
According to our convention above, an instantiation of such a sentence would give just the correct result; e.g.,
if 'pro' is a sentence and 'con' is a sentence ...
Thus, considering occurrences of variables as quasi-quotations would lead to no confusion. However, this treatment would be superfluous, and, hence, we shall avoid its use. Let us agree that we shall only use quasi-quotations in naming complex expressions involving (free) variables. Also, quasi-quotations within a quasi-quotation must be avoided. The means introduced in this section will be used intensively in the next section.
Remark. If the reader is familiar with first-order logic then most of this section seems to be well-known to him/her. Note, however, the special use of variables and quantifiers in metalogic. In fact, our instruments in metalogic find room in the frame of classical first-order logic, but we do not refer here to any formal system of logic.
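The instantiation of a schema such as (6) can be pictured as template filling. The Python sketch below is our own illustration, not the author's (the function name and the token-based treatment are ad hoc assumptions):

```python
def instantiate(schema, values):
    """Put the given words in the places of the schema's variables;
    tokens that are not variables are copied unchanged."""
    return ' '.join(values.get(token, token) for token in schema.split())

schema = "if A then B"
print(instantiate(schema, {'A': 'pro', 'B': 'con'}))  # if pro then con
```

This mirrors the passage above: the variables inside the quasi-quotation are replaced by words themselves, not by names of words.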
2.3 Logical means
Our first topic in this section is the investigation of the sentence functors (mentioned in Sect. 2.1 already) used in metalanguages.
The single monadic sentence functor we shall use is called negation. It serves to express the denial of a sentence. In the case of simple sentences, it can be expressed by inserting a negative particle ('not', or 'does not') into the sentence. (We saw in 2.1 how the denial of an identity sentence can be expressed.) In the general case, negation can be expressed by prefixing the sentence with the phrase 'it is not the case that'.
Our dyadic sentence functors are called conjunction, alternation, conditional, and biconditional. Conjunction and alternation are expressed by inserting 'and' and 'or', respectively, between two sentences. The form of a conditional is
(1) if A, (then) B,
provided 'A' and 'B' refer to sentences. Here A is called the antecedent and B is called the consequent of the conditional sentence (1). The word 'then' is between parentheses, for it is sometimes omitted. Finally, the form of a biconditional is
(2) A if and only if B,
and it is used as an abbreviation of a more complex sentence of form
(2') if A then B, and if B then A.
The artificial expression 'if and only if' comes from mathematics, but its use has become general nowadays in scientific literature. It is often abbreviated by 'iff'. In what follows, we shall use this abbreviation systematically.
We shall introduce symbols for expressing these functors, according to the following conventions, where the variables 'A' and 'B' refer to (closed or open) sentences:
(3) "~A" for "it is not the case that A";
(4) "A & B" for "A and B";
(5) "A ∨ B" for "A or B";
(6) "A ⇒ B" for "if A, (then) B";
(7) "A ⇔ B" for "A if and only if B".
The meanings of our sentence functors will be fixed by the following truth conditions, based on the assumption that sentences are either true or false.
(a) If A is false then "~A" is true, otherwise it is false.
(b) If both A and B are true then "A & B" is true, otherwise it is false.
(c) If both A and B are false then "A ∨ B" is false, otherwise it is true. (This shows that our use of 'or' corresponds to that of 'and/or' rather than to that of 'either ... or'.)
(d) If A is true and B is false then "A ⇒ B" is false, otherwise it is true.
(e) If both A and B are true, or if both are false, then "A ⇔ B" is true, otherwise it is false. (Taking into consideration that, according to (2'), "A ⇔ B" is an abbreviation of "(A ⇒ B) & (B ⇒ A)", this condition follows from (b) and (d) above.)
Remarks. 1. On the basis of everyday experiences, the reader may doubt the assumption that sentences are either true or false. However, this doubt is not justified in metalanguages, where truth is created by fiat, i.e., some basic sentences are true as postulates or as definitions, and other sentences are inferred from these. Perhaps the reader may accept that in some limited domains the true-false dichotomy of sentences is acceptable, especially if we are speaking about ideal objects as in mathematics or in metalogic.
2. The truth conditions (a)-(e) are more or less in accordance with the everyday use of the words expressing our functors. Rule (d) seems to be most remote from the everyday use of 'if ... then', but this is just the sentence functor very useful in forming metalanguage sentences. In most cases, the symbol '⇒' will occur between open sentences standing in the scope of (tacit or explicit) universal quantification(s). Examples of this use of 'if ... then' are the sentences occurring just in the rules (a)-(e) above.
3. The symbols introduced in this section and in the preceding one will be used sometimes in the following explanations, whenever their use is motivated by the aims of exactness and/or conciseness. However, the everyday expressions of these symbols ('and', 'or', 'if ... then', 'iff', 'for all', etc.) will be used frequently.
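As an informal aside (no such program occurs in the text), the truth conditions (a)-(e) can be transcribed into Python, with True and False playing the roles of the two truth values:

```python
def neg(a):       return not a          # (a)  "~A"
def conj(a, b):   return a and b       # (b)  "A & B"
def alt(a, b):    return a or b        # (c)  "A ∨ B"  ('and/or')
def cond(a, b):   return (not a) or b  # (d)  "A ⇒ B"
def bicond(a, b): return a == b        # (e)  "A ⇔ B"

# The observation under (e): given the paraphrase (2'), rule (e)
# follows from rules (b) and (d).
for a in (True, False):
    for b in (True, False):
        assert bicond(a, b) == conj(cond(a, b), cond(b, a))
```

The final loop checks, over all four truth-value combinations, the remark made under (e) above.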
Given the meaning of our sentence functors, it is clear that they are logical functors. They - and their symbols - are often called (sentence) connectives in the literature of logic. Let us note that in mathematical logic, the terms disjunction, implication, and equivalence are used instead of our alternation, conditional, and biconditional, respectively. These are not apt phrases, for they can suggest misleading interpretations.
If we apply more than one of our sentence functors in a compound sentence, then the order of their applications can be indicated unambiguously by using parentheses. However, some parentheses can be omitted if we take into consideration the properties of our functors. First, it follows from the truth conditions above that conjunction and alternation are commutative and associative, and hence we can write, e.g., "A & B & C" and "A ∨ B ∨ C" (where the variables 'A', 'B', and 'C' refer to sentences) without using parentheses. Further, we can see easily that the truth conditions of
"(A & B) ⇒ C" and "A ⇒ (B ⇒ C)"
are the same. That is, if the consequent of a conditional is itself a conditional, then the antecedent of the latter can be transported by conjunction to the antecedent of the main conditional. This suggests the convention to omit the parentheses surrounding a conditional which is the consequent of a conditional, i.e., to write
"A ⇒ B ⇒ C" instead of "A ⇒ (B ⇒ C)".
Of course, this convention can be applied repeatedly.
Finally, we can realize that the biconditional is
(a) reflexive, in the sense that "A ⇔ A" is always true,
(b) symmetrical, in the sense that from "A ⇔ B" we can infer "B ⇔ A", and
(c) transitive, in the sense that from "A ⇔ B" and "B ⇔ C" we can infer "A ⇔ C".
Identity bears the same remarkable properties - a fact which legitimates the use of chains of identities of form
a = b = c = ...
(where the variables 'a', 'b', 'c' refer to names), practised from the beginning of primary school. Then, chains of biconditionals of form
A ⇔ B ⇔ C ⇔ ...
will be used sometimes in the course of metalogical investigations.
As illustrations of our new symbols, let us re-formulate the more detailed logical structure of some sentences used as examples:
(2'')* ∧x(x is a substantive noun ⇔ x is a noun phrase).
(4'')* ∨x(x is an adverb & x ends in '-ly').
(5)* x is a word ⇒ ∨y(y is longer than x).
(7)* ∧A∧B ((A is a sentence & B is a sentence) ⇒ "if A then B" is a sentence).
Inferences. In the metalogical investigations, we draw some inferences from our
starting postulates and definitions (definitions will be treated in the next section) on the basis of the meaning of our logical words - i.e., quantifiers, identity, sentence functors - be they expressed by symbols or by words of a natural language. The meaning of these words or symbols was exactly fixed by their truth conditions in the present and the preceding section. No formal system of logic will be used here as a basis legitimating our inferences - at least not before Chapter 6, which treats of a system of logic.
However, on the basis of the mentioned truth conditions, a list of important inference patterns could be compiled. In the preceding section, we mentioned, e.g., the inference from a universal sentence to its instantiations. The properties of the sentence functors treated in the present section also contain some hidden inference patterns. Instead of giving a large list of inference patterns, we only stress two important ones:
(A) From a conditional "A ⇒ B" and from its antecedent A we can correctly infer its consequent B. This pattern is called modus ponens [placing mood] (in formal systems, sometimes called detachment).
(B) From a conditional "A ⇒ B" and from the falsity of its consequent, i.e., from "~B", we can correctly infer the falsity of its antecedent, i.e., "~A". This pattern is called modus tollens [depriving mood]. It is the core of the so-called indirect proofs. In such a proof, one shows that accepting the negation of a sentence would lead to a sentence which is known to be false; that is, a conditional of form "~A ⇒ ~B" is accepted. From this and from the falsity of "~B" - i.e., from the truth of B -, the falsity of "~A" - i.e., the truth of A - follows by modus tollens.
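The correctness of the two inference patterns can be checked mechanically. The following Python fragment (our illustration, not the author's) runs through all truth-value combinations and verifies that neither pattern leads from true premisses to a false conclusion:

```python
def cond(a, b):
    return (not a) or b  # truth condition (d) of "A ⇒ B"

for a in (True, False):
    for b in (True, False):
        if cond(a, b) and a:      # modus ponens: from "A ⇒ B" and A, infer B
            assert b
        if cond(a, b) and not b:  # modus tollens: from "A ⇒ B" and ~B, infer ~A
            assert not a
```

The only row in which the premisses of either pattern are all true forces the conclusion to be true as well.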
2.4 Definitions
In metalanguages, we often use definitions, mainly in order to introduce terms or symbols instead of longer expressions. Such a definition consists of three parts: (a) the new term, called the definiendum; (b) the expression stating that we are dealing with a definition; and (c) the expression that explains the meaning of the definiendum, called the definiens. In verbal definitions, (b) is indicated by words such as 'we say that', 'is said to be', 'let us understand', etc. Some examples of verbal definitions:
(1) By the square of a number let us mean the number multiplied by itself.
(2) We say that a number x is smaller than the number y iff there is a positive number z such that x + z = y.
Here the definienda are italicized, and the words indicating that the sentence is a definition are printed in bold-face letters. In (2), the definiendum is, in fact, the relation expressed by the dyadic predicate 'is smaller than', but its argument places are filled in by variables. This shows that we cannot avoid the use of free variables in definitions.
Definitions involving free variables will be called contextual definitions. In (1), the use of a variable (referring to numbers) is suppressed, due to the fact that it defines a very simple monadic name functor (i.e., the operation of squaring numbers). In the canonical forms introduced below, a special symbol standing between the definiendum and the definiens will indicate that the complex expression counts as a definition. Now, the canonical form of a contextual definition of a predicate has the following shape:
(3) A ⇔df B,
where A indicates the definiendum: an open sentence formed from the predicate to be defined by filling in its argument places with (different) variables, and B indicates the definiens: an open sentence involving exactly the same free variables which occur in the definiendum. Of course, the predicate to be defined must not occur in the definiens (the prohibition of circularity in definitions). Furthermore, in order that the definition be reasonable, the definiens must contain only functors known already. For example, the canonical form of (2) - if we use the sign '<' instead of 'is smaller than' - is as follows:
(2') x < y ⇔df ∨z(z is positive & x + z = y).
The canonical form of the contextual definition of a name functor has the following shape:
(4) a =df b,
where the definiendum a involves the functor to be defined, its argument places filled in by different variables, and the definiens b is an open name involving exactly the same variables occurring in a. Again, the functor to be defined must not occur in b, and the functors occurring in b are to be known ones. In (4), we can put a new name for a, and a compound closed name for b, to get an explicit definition of a name. Practically, in this case the new name serves as an abbreviation for the (probably longer) name in the place of the definiens. As an example, let us re-formulate the verbal definition under (1):
square of x =df x·x.
The sign of definition in (3) is '⇔df', and in (4) it is '=df', taking into consideration that the symbol of the biconditional can only occur between sentences, and the symbol of identity can only occur between names. Contextual definitions are to be understood as valid for all permitted values of the free variables occurring in them. That is, contextual definitions are universal sentences with suppressed quantifiers. Consequently, we can infer from such a definition to its instantiations, omitting even the subscript 'df' from the sign '⇔df' or '=df'.
Furthermore, we can replace in any sentence an occurrence of a definiens of a definition by its definiendum, or vice versa; the resulting sentence counts as a logical consequence of the original one and the definition.
Another important type of definition is the so-called inductive definition. We shall meet it in Section 4.1.
Remark. Contextual definitions could be replaced by explicit ones if we used the so-called lambda operator. However, the author does not see considerable benefit in its introduction in these introductory chapters.
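As an informal illustration of a contextual definition at work (not part of the text), definition (2') can be transcribed into an executable predicate; the bounded search below is our own device for simulating the existential quantifier '∨z' over a finite range:

```python
def positive(z):
    return z > 0

def smaller(x, y):
    # x < y  ⇔df  ∨z(z is positive & x + z = y)   -- definition (2')
    # For natural numbers, any witness z must lie in 0..y.
    return any(positive(z) and x + z == y for z in range(0, y + 1))

print(smaller(2, 5), smaller(5, 2))  # True False
```

Instantiating the suppressed universal quantifiers of (2') with particular numbers corresponds to calling the predicate with particular arguments.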
2.5 Class Notation
Instead of saying that 'of' is an (English) preposition, we can say that 'of' is a member of the class of (English) prepositions. In general, instead of saying that a monadic predicate holds of a certain object, we can say, alternatively, that this object is a member of the class of objects of which the predicate in question holds. This class may be called the extension or the truth domain of that predicate. Instead of a monadic predicate, we can apply this way of speech to any open sentence involving a single variable. This way of speech is sometimes advantageous in metalogical investigations. Thus, we shall introduce new notational devices for using the terms mentioned above.
Let "φ(x)" denote an open sentence involving the single variable 'x', and let "φ(a)" denote the sentence resulting from "φ(x)" via substituting a name a for x. Then, expressions of form
(1) {x: φ(x)}
will be called class abstracts. (Recommended reading: "the class of φ-s".) Intuitively: (1) is a name of the class which is the extension or truth domain of the open sentence "φ(x)". The variable x is qualified here as a bound one (i.e., the expression "x:" works as a quantifier), in accordance with the fact that in verbal readings, the mentioning of this variable is often avoidable. For example,
{x: x is a preposition}
can be read as 'the class of prepositions'.
Remark. In the literature, instead of (1), the notation {x | φ(x)} is used as well.
The expression of form
(2) a ∈ {x: φ(x)}
- where a is a name - means, intuitively, that the object denoted by a is a member of the class "{x: φ(x)}". That is, the symbol '∈' expresses the relation 'is a member of', the membership relation. However, the exact meaning of (2) will be determined by the following definition:
(3) a ∈ {x: φ(x)} ⇔df φ(a).
For example:
'of' ∈ {x: x is a preposition} ⇔df 'of' is a preposition.
According to the definition in (3), class abstracts are eliminable; thus, their use does not compel us to accept the ontological hypothesis about the existence of classes as a new sort of abstract entities. The use of class notation does not mean, either, an entry into set theory (this will be treated of in Chapter 10). A class abstract is acceptable just in case the open sentence occurring in it is acceptable. To wit: the class of horses is just as exact as the term 'horse' (or the predicate 'is a horse'). Of course, we do not want to speak about the class of horses in our metalanguages; what we shall deal with will be classes of linguistic entities.
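The eliminability of class abstracts can be mimicked in a program. In the following Python sketch (ours, not the author's), a set comprehension over a small, hypothetical universe of words plays the role of {x: φ(x)}, and definition (3) reduces membership to the truth of φ:

```python
# A small, made-up universe of words; 'prepositions' stands in for the
# extension of the predicate φ ('x is a preposition').
universe = ['of', 'in', 'horse', 'quickly']
prepositions = {'of', 'in'}

def class_abstract(phi):
    """The analogue of the class abstract {x: φ(x)} over the universe."""
    return {x for x in universe if phi(x)}

phi = lambda x: x in prepositions
A = class_abstract(phi)

# Definition (3): "a ∈ {x: φ(x)}" means just "φ(a)".
assert ('of' in A) == phi('of')
assert ('horse' in A) == phi('horse')
```

As in the text, the abstract carries no ontological commitment here: it is fully determined by, and reducible to, the predicate phi.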
In the following definitions, we shall use Roman capital letters (A, B, C) as variables referring to class abstracts. We shall call them, briefly, class variables, although they refer merely to class abstracts. These variables are not quantifiable, for we do not assume the existence of the totality of classes, nor even the totality of all class abstracts. In fact, class abstracts are language-dependent (grammatical) entities. Thus, the open sentences (definitions, statements) involving class variables are to be understood as follows: Given a language, class variables can be substituted by any legitimate class abstracts of that language, i.e., by class abstracts of form "{x: φ(x)}" whenever "φ(x)" is an accepted open sentence (of the language) with a single free variable x. - Furthermore, lower-case Roman letters (a, b, c) will refer to names.
Some abbreviations. The denial of a sentence of form "a ∈ A" will be written as
a ∉ A
(read: "a is not a member of A"). The compound sentence "a ∈ A & b ∈ A" will sometimes be abbreviated as
a, b ∈ A.
This convention may be extended to more than two names.
Assume that the open sentence "φ(x)" has the following particular form:
x = a₁ ∨ x = a₂ ∨ ... ∨ x = aₙ,
that is, it is an n-member alternation of identities of form "x = aᵢ", where, of course, a₁, a₂, ..., aₙ are names. Let us include in this form even the case n = 1. Then, the members of the class "{x: φ(x)}" are just the objects denoted by a₁, a₂, ..., aₙ. This motivates the following abbreviation:
{a₁, a₂, ..., aₙ} =df {x: x = a₁ ∨ x = a₂ ∨ ... ∨ x = aₙ}.
(Did you hear the slogan according to which a class can be defined in two ways: (a) by a property of its members, or (b) by enumerating its members? Now you see that (b) is merely a special case of (a). Note, however, that enumeration works only for finite classes the members of which all have names. Try to define by enumeration the class of trees in a big forest!)
Relations between classes. We say that A is a subclass of B, or B is a superclass of A - in symbols: "A ⊆ B" - iff every member of A is a member of B. That is:
(5) A ⊆ B ⇔df ∧x(x ∈ A ⇒ x ∈ B).
The expressions 'x ∈ A' and 'x ∈ B' occurring in the definiens are eliminable by (3) whenever A and B are replaced by fixed class abstracts. For example:
{x: φ(x)} ⊆ {x: ψ(x)} ⇔ ∧x(φ(x) ⇒ ψ(x)).
If two classes are mutually subclasses of each other, then we say that their extensions coincide. Unfortunately, the symbol expressing this coincidence - used generally in the literature - is the sign of identity ('='). Thus, according to this convention, the definition of coincidence of extensions is as follows:
(6) A = B ⇔df (A ⊆ B & B ⊆ A).
Or, by using (5):
A = B ⇔df ∧x(x ∈ A ⇔ x ∈ B).
Note that "A = B" may hold even in a case where A and B are defined by different properties (open sentences). To mention an example, in the arithmetic of natural numbers we find that
{2} = {x: x is an even prime number}.
Identity between individual objects is a primitive relation, characterized by the fact that each object is identical only with itself. (However, an object may bear several different names, and this fact makes identity useful.) On the other hand, the symbol '=' as used between class abstracts bears a meaning introduced by definition (6); thus, "A = B" only means what this definition tells us.
If the extensions of A and B do not coincide, and A is a subclass of B, then we say that A is a proper subclass of B; in symbols: "A ⊂ B". That is:
(7) A ⊂ B ⇔df (A ⊆ B & A ≠ B).
It may occur that a quite meaningful predicate defines a class with no members. For example:
{x: x is a prime number & 7 < x < 11}.
The simplest definition of such an empty class is:
{x: x ≠ x}.
The extensions of two empty classes always coincide; that is, empty classes are "identical" with each other (in the sense of (6)). Hence, we can introduce the proper name '∅' for the empty classes:
(8) ∅ =df {x: x ≠ x}.
This definition is to be considered as the concise variant of the following contextual definition:
x ∈ ∅ ⇔df x ≠ x.
Analogously, we can introduce a proper name, say Γ, for any class abstract "{x: φ(x)}" by
Γ =df {x: φ(x)}.
By our definitions in (5) and (8), it is obvious that
∅ ⊆ A and A ⊆ A.
Operations with classes. We introduce the dyadic class functors of union, intersection, and difference, symbolized by '∪', '∩', and '-', respectively.
(9) A ∪ B =df {x: x ∈ A ∨ x ∈ B},
(10) A ∩ B =df {x: x ∈ A & x ∈ B},
(11) A - B =df {x: x ∈ A & x ∉ B}.
Some properties of these operations:
(12) A ∪ B = B ∪ A;  A ∩ B = B ∩ A;
(13) (A ∪ B) ∪ C = A ∪ (B ∪ C);  (A ∩ B) ∩ C = A ∩ (B ∩ C);
(14) A ∪ A = A;  A ∩ A = A;
(15) A ∪ ∅ = A;  A ∩ ∅ = ∅;
(16) A ∪ (B ∩ C) = (A ∪ B) ∩ (A ∪ C);  A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C);
(17) A - (A - B) = A ∩ B;  (A - B) ∪ B = A ∪ B;
(18) A ⊆ B ⇔ (A ∪ B = B) ⇔ (A ∩ B = A);  A - B ⊆ A.
Identities (12), (13), (14) tell us that union and intersection are commutative, associative, and idempotent (self-powering) operations. In (15), the role of the empty class in the operations is shown. Union and intersection are distributive with respect to each other, as (16) tells us. In (17), we see important connections between difference and the two other operations. Finally, in (18), some interrelations between the subclass relation and our operations are presented. - The proof of these laws is left to the reader. (Use the definitions (9), (10), (11), (5), and (8).)
Finally, let us note that two classes are said to be disjoint iff they have no common members, i.e., iff their intersection is empty. Thus,
A ∩ B = ∅
expresses that A and B are disjoint classes.
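For finite classes, the laws (12)-(18) can be spot-checked mechanically. The following Python fragment (ours, not from the text) runs through a few small sample classes, with the set operators |, &, - playing the roles of ∪, ∩, -:

```python
from itertools import product

samples = [set(), {1}, {1, 2}, {2, 3}, {1, 2, 3, 4}]
empty = set()

for A, B, C in product(samples, repeat=3):
    assert A | B == B | A and A & B == B & A                 # (12)
    assert (A | B) | C == A | (B | C)                        # (13)
    assert (A & B) & C == A & (B & C)
    assert A | A == A and A & A == A                         # (14)
    assert A | empty == A and A & empty == empty             # (15)
    assert A | (B & C) == (A | B) & (A | C)                  # (16)
    assert A & (B | C) == (A & B) | (A & C)
    assert A - (A - B) == A & B and (A - B) | B == A | B     # (17)
    assert (A <= B) == (A | B == B) == (A & B == A)          # (18)
    assert A - B <= A
```

Such a check is of course no proof of the laws; it merely tests all 125 combinations drawn from our sample classes.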
Chapter 3
LANGUAGE RADICES
3.1 Definition and Postulates
In what follows, we shall use script capital letters (𝒜, ℬ, 𝒞, etc.) to refer to alphabets. The totality of words formed from the letters of an alphabet 𝒜 will be denoted by "𝒜°", and the members of 𝒜° will sometimes be called "𝒜-words". According to our postulate (L1) in Sect. 1.2, an alphabet is always a finite supply of objects. Hence, we can use class notation in displaying an alphabet, e.g.,
𝒜 = {α₁, α₂, ..., αₙ},
where α₁, α₂, ..., αₙ stand for the letters of 𝒜.
According to the postulate (L2) in Sect. 1.2, if we know 𝒜 then we know 𝒜° as well. By using class notation, we may write: 𝒜° = {x: x is an 𝒜-word}. Of course, this identity cannot serve as a definition of 𝒜°, for 'is an 𝒜-word' is an undefined predicate. We mentioned already in Sect. 2.1 that the introduction of the empty word may be useful for technical reasons (this will be demonstrated in Ch. 4). We shall use the symbol '0' for the empty word:
0 =df the empty word.
Obviously, this notion is language-independent. Also in Sect. 2.1, the name functor concatenation was mentioned; its symbol is '⌢'. We can imagine that the words of an alphabet are "produced" starting from the empty word via iterated concatenation of letters to words given already.
Thus, at the beginning of the description of a language, we have to deal with four basic notions: the letters of the language, the words of the language, the empty word, and the name functor concatenation. Up to this point, we have an intuitive knowledge about these notions. We shall say that these four notions together form a language radix. Now we shall give a so-called axiomatic treatment of these notions, the significance and importance of which will be cleared up gradually later on.
DEFINITION. By a language radix let us mean a four-component system of form
(*) ⟨𝒜, 𝒜°, 0, ⌢⟩,
where 𝒜 and 𝒜° are nonempty classes, 0 is an individual object (the empty word), ⌢ is a dyadic operation between the members of 𝒜°, and the postulates (R1) to (R6) below are satisfied. The members of 𝒜 will be called letters, and the members of 𝒜° will be called words (𝒜-words).
Remark. Pointed brackets above under (*) are used to sum up the four components of a language radix into a whole. The reader need not think of the set-theoretic notion of an ordered quadruple, which is undefined up to this point.
(R1) 𝒜 ⊆ 𝒜° and 0 ∈ 𝒜°.
(R2) x, y ∈ 𝒜° ⇒ x ⌢ y ∈ 𝒜°.
(Tacit universal quantification of the variables. Similarly in the following postulates.)
(R3) Concatenation is associative: if x, y, z ∈ 𝒜° then (x ⌢ y) ⌢ z = x ⌢ (y ⌢ z).
Consequently, parentheses can (and will) be omitted in the case of iterated concatenations. - In what follows, the variables x, y, z will refer to members of 𝒜°.
(R4) x ≠ 0 ⇔ ∨y ∨α (α ∈ 𝒜 & x = y ⌢ α).
This tells us that a word is nonempty iff it has a final letter.
(R5) (α, β ∈ 𝒜 & x ⌢ α = y ⌢ β) ⇒ (x = y & α = β).
That is, the last letter of a word is uniquely determined.
(R6) (x ⌢ y = x ⇔ y = 0) & (x ⌢ y = y ⇔ x = 0).
That is, a concatenation is identical with one of its members iff its other member is the empty word.
Some consequences of our postulates:
By (R6), the empty word is ineffective in a concatenation:
(1) x ⌢ 0 = x = 0 ⌢ x.
From "α ∈ 𝒜 & x = y ⌢ α" it follows by (R4) that "x ≠ 0". In other words:
(2) α ∈ 𝒜 ⇒ y ⌢ α ≠ 0.
Particularly, if y = 0 then - with respect to (1) - we get that
(3) α ∈ 𝒜 ⇒ α ≠ 0,
which means that 0 ∉ 𝒜, that is, the empty word does not belong to the class of letters. (Note that this was not explicitly stated in our postulates.)
Assume that x, y ∈ 𝒜°, and
(4) x ⌢ y = 0.
Now, if y ≠ 0 then, by (R4), y is of form "z ⌢ α" where α ∈ 𝒜. Thus, in this case, (4) has the following form:
x ⌢ (z ⌢ α) = 0,
or, with respect to (R3),
(x ⌢ z) ⌢ α = 0.
However, this is excluded by (2). Hence, (4) excludes the case y ≠ 0. On the other hand, "x ⌢ 0 = 0" implies "x = 0", by (1). Thus, we have that
(5) x ⌢ y = 0 ⇒ (x = 0 & y = 0).
By (1), this holds conversely, too. Hence, we can replace the "⇒" in (5) by "⇔". In words: The result of a concatenation is empty (i.e., the empty word) iff both its members are empty.
Our postulates and their consequences are in full accordance with our intuitions with respect to the four basic notions of a language radix. Among others, they assure that the empty word can be "erased" everywhere, for it is ineffective in concatenations. What is more, these postulates determine "almost" uniquely the class 𝒜°: the objects which are 𝒜-words according to our intuitions are really (provably) in 𝒜°. However, (R1) to (R6) do not assure that
jIO
contains no "foreign" objects, i.e., objects which are
not words composed from the letters of jI. This deficiency could be supplied e.g. by the following postulate: (R7)
If B is a class such that
(i)
0 E B, and
(ii)
(x E B & a Ej'f) :::)
thenx"
X
n a E B,
c B.
In other words: 𝒜⁰ must be a subclass of all classes satisfying (i) and (ii). Another usual formulation: 𝒜⁰ is the smallest class which contains the empty word and the lengthening of every contained word by each letter of 𝒜. Now we see that (R7) involves a universal quantification over classes, and, hence, it passes the limits of our class notation introduced in Sect. 2.5. On the other hand, 𝒜⁰ is not perfectly determined by the remaining postulates (R1) to (R6). We are compelled to refer to our postulate (L2) introduced already in Sect. 1.2. On the basis of experiences with natural languages, we can accept that the class of 𝒜-words is perfectly determined by the alphabet 𝒜. - However, the notion of a language radix introduced in the present section will be utilized later on (mainly in Ch. 7).

Remark. The systems called language radices above are called free groupoids with a unit element in mathematics when the postulate (R7) is accepted as well. They form a particular family of algebraic systems. Here 𝒜⁰ is the field of the groupoid, ⌢ is the groupoid operation, 0 is the unit element, and 𝒜 is the class of free generators. - Let us note that accepting (R7) makes it possible to weaken some of the other postulates; e.g., (3) above is sufficient instead of (R4), and (2) instead of (R6).
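The system described by these postulates can be illustrated concretely. In the following sketch (mine, not the author's; the alphabet and the sample words are arbitrary choices), Python strings over a fixed alphabet play the role of 𝒜⁰, '' plays the role of the empty word 0, and string concatenation '+' plays the role of ⌢; the consequences (1), (2), and (5) can then be checked on samples:

```python
# Python strings over a fixed alphabet model a language radix:
# '' is the empty word 0, and '+' is the concatenation operation.

A = {"a", "b"}                                  # a sample two-letter alphabet
words = ["", "a", "b", "ab", "ba", "aab"]       # a few sample A-words

for x in words:
    assert x + "" == x == "" + x                # consequence (1)
    for y in words:
        # consequence (5): a concatenation is empty iff both members are empty
        assert (x + y == "") == (x == "" and y == "")
        for z in words:
            assert (x + y) + z == x + (y + z)   # postulate (R3): associativity

# consequence (2): lengthening a word by a letter never yields the empty word
assert all(y + a != "" for y in words for a in A)
```

Of course, such finite checks only illustrate the postulates; they cannot express (R7) itself, which quantifies over all classes B.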
3.2 The Simplest Alphabets

3.2.1. Notational conventions. (a) It is an obvious device to omit the sign of concatenation and to use simple juxtaposition instead, i.e., to write "xyz" instead of "x ⌢ y ⌢ z".

(b) In displaying an alphabet, we have to enumerate its letters between braces. The letters are objects; thus, in enumerating them, we have to name them. For example, the two-letter alphabet whose letters are '0' and '1' is to be displayed as {'0', '1'}. Could the quotation marks be omitted here? We can answer this question with YES, agreeing that inside the braces the letters stand in an autonymous sense, as names of themselves. However, we can give a deeper "theoretical" argument for doing so. Namely, if we do not want to use a language for communication, if we are only interested in the structure of that language, then we can totally avoid the use of the letters and words of the language in question - by introducing metalanguage names for the letters and words of the object language. (For example, the grammar of Greek or of Russian could be investigated without using Greek or Cyrillic letters.) Hence, we shall not use quotation marks in enumerating the letters of an alphabet, but we leave undecided the question - it being unimportant - whether the characters used in the enumeration are names in the metalanguage for the letters, or else they stand in an autonymous sense.
(c) In presenting an alphabet, different characters denote different letters. That is, the alphabet contains just as many letters as are enumerated in it.

The simplest alphabet is, obviously, a one-letter one:

(1) 𝒜₀ = {α}.

However, the words of this minimal alphabet are sufficient for naming the positive integers: the words α, αα, ααα, ... can represent the numbers 1, 2, 3, ... . (Even the empty word can be considered as 0.) - We shall exploit this interpretation of 𝒜₀ later on.

The two-letter alphabet

(2) 𝒜_d = {0, 1}

is used for naming natural numbers in the so-called dyadic (or binary) system. In everyday life, we use the decimal system in writing natural numbers; this system is based on a ten-letter alphabet. However, the dyadic system is exactly as good as the decimal one (although the word expressing the same number is usually longer in the dyadic system than in the decimal one). This leads to the idea: Is it possible to replace any multi-letter alphabet by a two-letter one? The existence of the Morse alphabet suggests the answer YES. In fact, this is a three-letter alphabet

{ ·, -, | }

where the third character serves to separate the translations of the letters of (say) the English alphabet. For example, the translation of 'apple' into a Morse-word is

· - | · - - · | · - - · | · - · · | ·

which shows that the Morse alphabet is, in fact, a three-letter one.

Let our "canonical" two-letter alphabet be

(3) 𝒜₁ = {α, β}.
Furthermore, let C be an alphabet with more than two letters, e.g.,

C = {γ₀, γ₁, ..., γ_n}.

We define a universal translation method from C into 𝒜₁. Let the translation of the letter γᵢ be the 𝒜₁-word beginning with α and continued by i copies of β (for 0 ≤ i ≤ n). This rule can be displayed in the following table:

γ₀ → α
γ₁ → αβ
γ₂ → αββ
. . .
γ_n → αβ...β (n copies of β)

Then, the translation of a C-word is defined, obviously, as the concatenation of the translations of its letters. In detail, in a hair-splitting manner: the translation of 0 is 0; and if the translation of a C-word c is c′, and the translation of γᵢ is gᵢ, then the translation of "cγᵢ" is "c′gᵢ". We avoided here the use of a separating symbol by the rule that the translation of each letter of C begins with α. Translations of C-words among the 𝒜₁-words can be uniquely recognized: divide the given 𝒜₁-word into parts by putting a separating sign, e.g., a vertical stroke, before each occurrence of the letter α. Now, if it is the translation of some C-word, then each part must be the translation of a letter of C (it is easy to check whether this holds or not). Re-translating the C-letters, we get the C-word whose translation the given 𝒜₁-word was. - Summing up:

3.2.2. THEOREM. Any language based on a finite alphabet with more than two letters can be equivalently replaced with a language based on the two-letter alphabet 𝒜₁.

We shall exploit this theorem in Sect. 4.4. Note that the table above can be continued beyond any great value of n. Thus, our theorem holds even for a denumerably infinite alphabet - an alphabet that has just as many letters as there are natural numbers.

A language radix itself is not a "full-fledged" language, although it is the radix of the "full tree" of a language. After fixing the radix of a language, the next task is to define the well-formed ("meaningful") expressions of the language. In most cases, these are divided into classes called categories of the language. The problem of defining such categories will be the main topic of the next chapter.
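The universal translation method of this section is easy to make concrete. The sketch below is my own illustration (with the ASCII stand-ins 'a' and 'b' for the letters α and β of 𝒜₁); re-translation splits the code before each occurrence of 'a', just as described above:

```python
def translate(word, letters):
    """Translate a C-word into an A1-word: the i-th letter of C becomes
    'a' followed by i copies of 'b'."""
    return "".join("a" + "b" * letters.index(ch) for ch in word)

def retranslate(code, letters):
    """Recover the C-word: split before each 'a', then decode each part."""
    parts = code.split("a")[1:]       # every letter-translation begins with 'a'
    return "".join(letters[len(p)] for p in parts)   # p is the run of 'b's

C = ["g", "h", "i"]                   # a sample three-letter alphabet
w = "hig"
coded = translate(w, C)               # 'ab' + 'abb' + 'a'
assert coded == "ababba"
assert retranslate(coded, C) == w     # the translation is uniquely invertible
```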
Chapter 4 INDUCTIVE CLASSES

4.1 Inductive Definitions

In order to solve the problem outlined at the end of the preceding chapter - i.e., the definition of the categories of a language - we shall introduce a new type of definition called inductive definition.
4.1.1. Explanation. Let us assume a fixed alphabet 𝒜 and, hence, a language radix ⟨𝒜⁰, 𝒜, 0, ⌢⟩. Assume, further, that we want to introduce a subclass F of 𝒜⁰ (intended as a category of the language based on 𝒜) by pure syntactic means. The general method realizing this plan is applying an inductive definition that consists of three parts:

(a) The base of the induction: we give a base class B (of course, B ⊆ 𝒜⁰) by an explicit definition, stating that F must include B (B ⊆ F).

(b) Inductive rules: we present some rules of the form

(1) (a₁ ∈ F & ... & a_k ∈ F) ⇒ b ∈ F

where a₁, ..., a_k and b are words formed from the letters of 𝒜 and, possibly, from variables referring to 𝒜-words; that is, the shape of these words is

(2) c₀x₁c₁ ... x_n c_n

where c₀, c₁, ..., c_n ∈ 𝒜⁰ (any of these may be 0), and x₁, ..., x_n are variables referring to 𝒜-words.

(c) Closure: we stipulate that the class F must contain just the words prescribed to be in F by (a) and (b). •

Except for some trivial cases, the stipulations (a), (b), (c) cannot be re-written in the form of an explicit or contextual definition (cf. Sect. 2.4) - at least we have no means to do so, up to the present time. Thus, inductive definition is, in fact, a new type of definition.
4.1.2. Supplements to the explanation of inductive definition.

Ad (a). The base B can be defined explicitly by enumerating its members, e.g.,

B = {b₁, ..., b_n}

where b₁, ..., b_n are fixed 𝒜-words (in this case, B is finite), or by using variables as under (2), e.g.,

B = {x : x ∈ 𝒜⁰ & ∃y(x = "ayb" ∨ x = "ycy")},

where a, b, c are fixed 𝒜-words (in this case, B is infinite). Finally, B can be a class defined earlier. - Note that if B is empty then the class F defined by the induction is empty, too.

Ad (b). The inductive rules involving free variables are to be understood as universal ones, i.e., with suppressed universal quantification of their free variables (in accordance with our convention on metalanguages). Two examples of inductive rules:

(3) (x ∈ F & y ∈ F) ⇒ "axbyc" ∈ F,
(4) (x ∈ F & "axbyc" ∈ F) ⇒ y ∈ F.

In both rules, a, b, c are fixed 𝒜-words (some of them may be 0).
Ad (c). In fact, the closure tells us that F must be the smallest subclass of 𝒜⁰ satisfying the conditions in (a) and (b). Thus, the closure contains a suppressed universal quantification over the subclasses of 𝒜⁰. We met a similar situation in the postulate (R7) (in Sect. 3.1), with the considerable difference that in the present case, the quantification is limited to the subclasses of 𝒜⁰ (not all classes in the whole world). However, in the sense we use class notation, the notion of the totality of all subclasses of 𝒜⁰ is undefined; thus, we have no logical justification of the closure. - In what follows, we shall omit the formulation of the closure whenever we indicate in advance that we want to give an inductive definition. Clearly, in such a case, it is sufficient to present the base and the inductive rules. •

We saw that inductive definitions are not eliminable by our present means. (The kernel of the problem lies in the closure.) However, our intuition suggests that a subclass of 𝒜⁰ is clearly determined by an inductive definition. And, what is perhaps more important, we are unable to determine categories of languages without using this tool. As a consequence, we accept inductive definitions as a new means in metalogical investigations.

Remark. In set theory, inductive definitions can be reduced to explicit definitions. However, the introduction of the language of set theory is impossible without inductive definitions. (More on this topic in Ch. 10, Sect. 4.1.)
Our next goal is to find some new devices for presenting inductive definitions that will pave the way for some generalizations as well. We approach this goal via some examples.
4.1.3. The simplest example of an inductive definition is as follows. Given an alphabet 𝒜, let us define the subclass A* of 𝒜⁰ by induction as follows:

Base: 0 ∈ A*.
Inductive rule: (x ∈ A* & α ∈ 𝒜) ⇒ "xα" ∈ A*.

This rule comprises as many rules as there are letters in 𝒜. Thus, if 𝒜 = {α₀, α₁, ..., α_n}, then we can enumerate them:

x ∈ A* ⇒ "xα₀" ∈ A*.
x ∈ A* ⇒ "xα₁" ∈ A*.
. . .
x ∈ A* ⇒ "xα_n" ∈ A*.

The reader sees that A* = 𝒜⁰. A "theoretical" consequence: 𝒜⁰ is an inductively defined subclass of itself, for any alphabet 𝒜.

We can simplify the notation by omitting the dull occurrences of "∈ A*" and the quasi-quotation signs. Furthermore, let us use '→' instead of '⇒'. We get a table:

(5)  0
     x → xα₀
     x → xα₁
     . . .
     x → xα_n
Any inductive definition can be presented by such a table. The first line represents the base, and the other lines represent rules. Each rule has an input on the left side of the arrow and an output on the right side of the arrow. Even the first line can be considered as a rule without any input. Note that the name of the class to be defined does not occur here; you can give it a name later on.

The rules mentioned as examples under (3) and (4) contain two inputs each. Remembering that "(A & B) ⇒ C" is the same as "A ⇒ (B ⇒ C)" (cf. Sect. 2.3), we can re-formulate them as follows:

(3') x → y → axbyc,
(4') x → axbyc → y.

Thus, a rule may involve more than one input.
4.1.4. Our next example will be more complicated. Let us consider the dyadic alphabet 𝒜_d = {0, 1} introduced in Sect. 3.2. We want to define the class D of dyadic numerals representing the natural numbers divisible by three. (We shall call numerals the words expressing numbers in a certain alphabet.) These numerals are 𝒜_d-words such as

(6) 0, 11, 110, 1001, 1100, 1111, 10010, ...
(in decimal notation: 0, 3, 6, 9, 12, 15, 18, ...). Some initial members of this (infinite) totality D can be included in the base, and the other ones are to be introduced via inductive rules. Our intuitive key to the inductive rules is: by adding three to any member of the class D, we get another member of D. The problem is to express "adding three" in the dyadic notation. Let us note that each member greater than three has one of the following forms in the dyadic notation:

x00, x01, x10, x11

where x is any dyadic numeral other than 0. It is obvious that by adding three to "x00" we get "x11". Thus, we can formulate a rule:

(7) x00 → x11.

Assuming that the input is "good" (i.e., represents a number divisible by three), the output is a "good" one as well. By adding three to "x01" we get "y00" where y is the numeral following x (i.e., x plus one). Similarly, "x10" plus three gives "y01", and "x11" plus three gives "y10" where, again, y is the numeral following x. In order to express these facts, let us introduce the notation "xFy" for "x is followed by y". Then we can formulate the missing three rules as follows:

(8) xFy → x01 → y00,
(9) xFy → x10 → y01,
(10) xFy → x11 → y10.
Of course, this is incomplete without defining the relation represented by 'F'. This is simple enough:

(11) x0Fx1
(12) xFy → x1Fy0.

The first line serves as the base: it says that any word ending with 0 is followed by the word we get by replacing its final letter by 1. The second line is an inductive rule saying that "x1" is followed by "y0" provided x is followed by y. - Now we see that even a relation (a dyadic predicate) can be defined via induction.

Putting the empty word for x in (11), we get '0F1', i.e., that 0 is followed by 1. Putting 0 for x and 1 for y in (12), we get '0F1 → 01F10'. Knowing that the input holds, we get '01F10'. However, '01' is not accepted as a well-formed dyadic numeral; thus, we cannot accept this result as saying that 1 is followed by 10. We must add an extra stipulation:

(13) 1F10.
The continuation is correct: we get '10F11' from (11) by putting 1 for x; then we get from (12)

1F10 → 11F100,

which, using (13), gives '11F100', and so on.

Now, the complete inductive definition of our class D can be compiled by taking the first three members of (6) as input-free rules, and collecting the rules from (7) to (13). We get the following nice table:

(14)  0
      11
      110
      x0Fx1
      1F10
      xFy → x1Fy0
      x00 → x11
      xFy → x01 → y00
      xFy → x10 → y01
      xFy → x11 → y10

We see that in some cases we shall use three sorts of letters in an inductive definition: (a) letters of the starting alphabet 𝒜, (b) letters as variables referring to 𝒜-words - of course, these letters must not occur in 𝒜, (c) subsidiary letters (like F in our example above) representing (monadic, dyadic, etc.) predicates necessary in the definition - again, these letters must differ from those in (a) and (b). Furthermore, the character '→' occurring in rules must be foreign to all the mentioned three supplies of letters.
Conventions for using subsidiary letters. In our example, the subsidiary letter 'F' was surrounded by its arguments. If a subsidiary letter, say 'G', represents a monadic predicate, then its argument will follow this letter, as in "Gx". If the letter 'H' stands for a triadic predicate, then its arguments x, y, z will be arranged as follows: xHyHz. In the case of a tetradic predicate letter 'K', the arrangement of the arguments x, y, z, u is a similar one: xKyKzKu. The convention can be extended to more-than-four-place predicate letters, but, fortunately, in our praxis we can stop here.
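Table (14) can be tested mechanically. The following sketch is my own transcription in Python (the bound on word length is an artifact of the test, not of the calculus): it applies the rules to a fixed point and compares the resulting class with ordinary divisibility by three.

```python
from itertools import product

def dyadic_D(max_len=6):
    """Close the rules of table (14) over {0,1}-words up to max_len."""
    words = [""] + ["".join(p) for n in range(1, max_len + 1)
                    for p in product("01", repeat=n)]
    D = {"0", "11", "110"}                  # the three input-free rules
    F = {(x + "0", x + "1") for x in words if len(x) < max_len}   # x0Fx1
    F.add(("1", "10"))                      # extra stipulation: 1F10
    changed = True                          # close F under: xFy -> x1Fy0
    while changed:
        changed = False
        for x, y in list(F):
            p = (x + "1", y + "0")
            if len(p[0]) <= max_len and p not in F:
                F.add(p)
                changed = True
    changed = True                          # close D under rules (7)-(10)
    while changed:
        changed = False
        new = {w[:-2] + "11" for w in D if w.endswith("00")}      # x00 -> x11
        for x, y in F:
            if x + "01" in D: new.add(y + "00")                   # xFy -> x01 -> y00
            if x + "10" in D: new.add(y + "01")                   # xFy -> x10 -> y01
            if x + "11" in D: new.add(y + "10")                   # xFy -> x11 -> y10
        new = {w for w in new if len(w) <= max_len}
        if not new <= D:
            D |= new
            changed = True
    return D

D = dyadic_D()
assert all(int(w, 2) % 3 == 0 for w in D)   # soundness: members divisible by 3
assert {"0", "11", "110", "1001", "1100", "1111", "10010"} <= D   # the words under (6)
```

The first assertion mirrors the informal soundness argument given above for rules (7)-(10): every F-pair is numerically a successor pair, so each rule adds exactly three.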
In what follows, tables such as (5) and (14) representing inductive definitions will be called canonical calculi. In fact, a canonical calculus is a finite supply of rules (permitting some rules without any input). But the more detailed explanations will follow in the next section, where the connections with inductive definitions will be cleared up exactly.
4.2 Canonical Calculi

4.2.1. DEFINITION. Let C be an alphabet and '→' a character not occurring in C. We define inductively the notion of a C-rule by the following two stipulations:
(i) If f ∈ C⁰, then f is a C-rule.
(ii) If r is a C-rule and f ∈ C⁰, then "f → r" is a C-rule.
(Remark. Note that by (i), 0 is a C-rule.)
4.2.2. DEFINITION. Let 𝒜 and 𝒜′ be alphabets such that 𝒜 ⊆ 𝒜′ and '→' ∉ 𝒜′. Then, a finite class K of 𝒜′-rules (given by enumeration of its members) will be called a canonical calculus over 𝒜. The letters in 𝒜′ − 𝒜 (if any) will be called variables of K, or K-variables. The members of K - i.e., the 𝒜′-rules occurring in K - will be called rules of K.

Remarks. 1. A canonical calculus K over 𝒜 is a finite class of 𝒜 ∪ 𝒱 ∪ {→}-words where 𝒱 is the class of K-variables. - 2. As a "pathological" case, the empty class is a canonical calculus over any alphabet. - 3. If K = {r₁, ..., r_n} and '·' is a new character (not occurring in 𝒜′ ∪ {→}), then K can be represented as a single word of the form

r₁ · r₂ · ... · r_n.

This means that canonical calculi can be assumed to be mere grammatical entities.
4.2.3. DEFINITION. Let 𝒜 be an alphabet and K be a canonical calculus over 𝒜. We define inductively the relation "f is derivable in K" - in symbols: "K ⊢ f" - by the following three stipulations:
(i) If f ∈ K, then K ⊢ f.
(ii) Substitution: If K ⊢ f, and f′ is the result of substituting an 𝒜-word for a K-variable x in f (the same word for all occurrences of x in f), then K ⊢ f′.
(iii) Detachment: If K ⊢ f, K ⊢ f→g, and the arrow '→' does not occur in f, then K ⊢ g.
(The stipulation for f means that f ∈ (𝒜 ∪ 𝒱)⁰ where 𝒱 is the class of K-variables.)
4.2.4. DEFINITION. Let 𝒜 be an alphabet. We say that the class of words F is an inductive subclass of 𝒜⁰ iff there exist ℬ and K such that
(i) ℬ is an alphabet and 𝒜 ⊆ ℬ,
(ii) K is a canonical calculus over ℬ, and
(iii) F = {x : x ∈ 𝒜⁰ & K ⊢ x}.
(Briefly: F is the class of 𝒜⁰-words derivable in K.) Here the members of ℬ − 𝒜 (if any) may be called the subsidiary letters of the calculus K.

Comments. 1. From now on, inductive definitions can be treated as definitions by means of a canonical calculus. Obviously, the notion of an inductive subclass of 𝒜⁰ (given above) includes (is a generalization of) the notion of an inductively defined subclass of 𝒜⁰ (as given in the Explanation 4.1.1 in the preceding section). Thus, the acceptance of inductive definitions can be re-formulated by stating that: whenever 𝒜 and ℬ are alphabets, 𝒜 ⊆ ℬ, and K is a canonical calculus over ℬ, then the open sentence

x ∈ 𝒜⁰ & K ⊢ x

is an accepted definition of a monadic predicate (and, hence, of a subclass of 𝒜⁰).

2. Note that a canonical calculus "in itself" defines nothing; to get a definition of a class F we must add the stipulation "F = {x : x ∈ 𝒜⁰ & K ⊢ x}". Even in doing so we may get that F is empty.

3. Using a canonical calculus K in defining a subclass of 𝒜⁰, we use, in fact, the alphabet 𝒜 ∪ 𝒱 ∪ S ∪ {→} where 𝒱 is the class of K-variables and S is the class of subsidiary letters used in K. One of 𝒱 or S or both may be empty. The usefulness of subsidiary letters was exemplified in the previous section under (14) in the definition of the class D.

4. If ℬ ⊆ C, and K is a canonical calculus over ℬ, then, obviously, it is a canonical calculus over C as well.

5. The definitions in the present section are (mostly) verbal inductive definitions. The question arises whether these definitions could be transformed into more rigorous ones, into definitions by some canonical calculi. We shall get the positive answer to this question in Sect. 4.4. •

Conventions. For the sake of brevity, we shall sometimes use the term 'calculus' instead of 'canonical calculus'. This convention will hold till Ch. 6 where we shall speak about logical calculi; after this, the adjectives 'canonical' and 'logical' cannot be omitted. - Instead of saying that "F is an inductive subclass of 𝒜⁰" we sometimes will say that "F is an inductive class". In general, the term 'inductive class' will be used in this sense.
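Definition 4.2.3 suggests a direct, if brute-force, implementation of derivability. The sketch below is mine (with '>' standing for the arrow, and with artificial bounds on word length and search depth, since the class {x : K ⊢ x} is in general infinite); it iterates the two kinds of steps, substitution and detachment:

```python
from itertools import product

def derive(rules, alphabet, variables, max_len=8, rounds=6):
    """Words derivable in K (Def. 4.2.3), explored within the given bounds."""
    words = [""] + ["".join(p) for n in (1, 2)
                    for p in product(sorted(alphabet), repeat=n)]
    derived = set(rules)                    # stipulation (i): every rule of K
    for _ in range(rounds):
        new = set()
        for f in derived:
            for x in variables:             # stipulation (ii): substitution
                if x in f:
                    for w in words:         # same word for all occurrences of x
                        g = f.replace(x, w)
                        if len(g) <= max_len:
                            new.add(g)
        for f in derived:                   # stipulation (iii): detachment
            if ">" not in f:                # the arrow must not occur in f
                for g in derived:
                    if g.startswith(f + ">"):
                        new.add(g[len(f) + 1:])
        derived |= new
    return derived

# demo: the calculus {0, x > xa} over A = {a} derives the {a}-words
K = ["", "x>xa"]
out = derive(K, alphabet={"a"}, variables={"x"})
assert {"", "a", "aa", "aaa"} <= {w for w in out if ">" not in w and "x" not in w}
```

The bounds make this only a finite approximation; but on such toy calculi it reproduces exactly the derivations described in the definition.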
4.2.5. THEOREM. For all alphabets 𝒜, the empty class ∅ and 𝒜⁰ are inductive subclasses of 𝒜⁰; and if E = {e₁, ..., e_n} is an enumeration of 𝒜-words, then E is an inductive subclass of 𝒜⁰.

Proof. The calculus involving the single rule "x → x" (where x is a variable) defines the empty class (the base of the induction being empty). Concerning 𝒜⁰, the
calculus under (5) of the preceding section does the job. Alternatively, the calculus involving the single input-free rule "x" defines 𝒜⁰, too. For E, the calculus involving just the input-free rules e₁, ..., e_n is appropriate.

4.2.6. THEOREM. If 𝒜 is an alphabet and F and G are inductive subclasses of 𝒜⁰, then so are F ∪ G and F ∩ G.

Proof. Assume that K₁ and K₂ are the calculi defining F and G, respectively. Choose two new letters, say 'δ₁' and 'δ₂'; for any rule r of Kᵢ, let us mean by r′ the rule obtained from r by inserting δᵢ in front of every word occurring in r (i = 1, 2), and let Kᵢ′ consist of the rules r′ for the rules r of Kᵢ. By adding the rules

δ₁x → x
δ₂x → x

(where x is a K₁- or a K₂-variable) to K₁′ ∪ K₂′, the resulting calculus K₃ defines F ∪ G:

{x : x ∈ 𝒜⁰ & K₃ ⊢ x} = F ∪ G.

On the other hand, by adding the single rule

δ₁x → δ₂x → x

to K₁′ ∪ K₂′, the resulting calculus K₄ defines F ∩ G:

{x : x ∈ 𝒜⁰ & K₄ ⊢ x} = F ∩ G.

Remark. The question may arise whether the difference F − G (of two inductive classes) is an inductive one. We shall see in Sect. 4.4 that this is not always the case. •
4.3 Some Logical Languages

4.3.1. A language of propositional logic. We shall deal with a very simple language called the language of classical propositional logic. (It is unimportant whether the reader is familiar with this system of logic.) We want to define the most important category of this language, called the category of formulas.
This category contains an infinite supply of atomic formulas which are meant to represent unanalyzed sentences. Using an "initial" letter 'π' and an "indexing" letter 'ι', they can be expressed as

π, πι, πιι, πιιι, ...

Compound formulas can be formed from given formulas x and y in the forms "−x" and "(x ⊃ y)", where '−' and '⊃' are the functors of negation and the conditional, respectively. (Thus, '⊃' corresponds to the metalanguage symbol '⇒'.) No other symbol is needed. Thus, the alphabet we need is as follows:

𝒜_PL = {(, ), π, ι, −, ⊃}.

(The subscript 'PL' refers to propositional logic.) In the calculus defining the class of formulas, we shall use the subsidiary letters 'I' and 'F' ('I' for index, and 'F' for formula), and the letters 'u' and 'v' as calculus variables. Now our calculus, called K_PL, consists of the following rules:

K_PL:
1. I0
2. Iu → Iuι
3. Iu → Fπu
4. Fu → F−u
5. Fu → Fv → F(u ⊃ v)
5*. Fu → u

(The numbering of rules does not belong to the calculus. We use the ordinals only as a help in referring to the single rules.)

Comments. Rule 1 tells that the empty word is an index. Let us agree that we omit 0 in any rule except when it stands alone, i.e., if '0' is a rule. Thus, rule 1 can be written simply as 'I'. - Rules 1 and 2 together define the class of indices; one sees that indices are just the {ι}-words. - Rule 3 defines the atomic formulas; rules 4 and 5 define compound formulas. Finally, rule 5* serves to release the derived words from the subsidiary letter 'F'. Let us call such rules releasing rules. Now we can define the class of formulas of classical propositional logic - in symbols: 'Form_PL' - as follows:

Form_PL = {x : x ∈ 𝒜_PL⁰ & K_PL ⊢ x}.
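For this simple object language, the class of formulas can also be recognized by an ordinary recursive procedure mirroring rules 3-5 (a re-implementation of mine, not the calculus itself; ASCII stand-ins: 'p' for π, 'i' for ι, '-' for −, '>' for ⊃):

```python
def is_formula(w):
    """Atomic formula, negation, or parenthesized conditional."""
    if w.startswith("p") and set(w[1:]) <= {"i"}:
        return True                              # rule 3: 'p' plus an index
    if w.startswith("-"):
        return is_formula(w[1:])                 # rule 4: Fu => F-u
    if w.startswith("(") and w.endswith(")"):    # rule 5: Fu, Fv => F(u>v)
        depth = 0
        for k, ch in enumerate(w[1:-1], start=1):
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            elif ch == ">" and depth == 0:       # the main conditional sign
                return is_formula(w[1:k]) and is_formula(w[k + 1:-1])
    return False

assert is_formula("pii")          # an atomic formula with index 'ii'
assert is_formula("(p>-pi)")      # a compound formula
assert not is_formula("p>p")      # missing parentheses
assert not is_formula("()")
```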
Let us note that Form_PL can be defined by a calculus immediately over 𝒜_PL, i.e., without any subsidiary letters. Namely:

11. π
12. πι
13. uι → uιι
14. u → −u
15. u → v → (u ⊃ v)

By 11 and 12 we can get the atomic formulas 'π' and 'πι'. By 13, a word terminating in ι can be lengthened by an ι. Thus, rules 11, 12, and 13 together are sufficient for producing atomic formulas. Rules 14 and 15 need no comment. However, the preceding calculus seems to be more in accord with our intuitions concerning the gradual explanation of "what the formulas are". Releasing of subsidiary letters seems to be unimportant here, due to the simplicity of the studied object language.

We can define a subclass of Form_PL called the class of logical truths of propositional logic. The formulas of this class have the property that they are always true, independently of whether the atomic formulas occurring in them are true or false, assuming that negation (−) and the conditional (⊃) have the same meanings as their counterparts in the metalanguage (cf. Sect. 2.3). For this definition, we introduce the calculus K_LPL as an enlargement of K_PL above. Our basic alphabet remains 𝒜_PL; we use a new subsidiary letter 'L' and a new variable w. Omit rule 5* from K_PL and add the following new rules:
6. Fu → Fv → L(u ⊃ (v ⊃ u))
7. Fu → Fv → Fw → L((u ⊃ (v ⊃ w)) ⊃ ((u ⊃ v) ⊃ (u ⊃ w)))
8. Fu → Fv → L((−u ⊃ −v) ⊃ (v ⊃ u))
9. Lu → L(u ⊃ v) → Lv
10. Lu → u

The last rule releases the subsidiary letter 'L'. The calculus K_LPL consists of the rules 1 to 5 (taken from K_PL) and the rules 6 to 10 just given. Let us define

L_PL = {x : x ∈ 𝒜_PL⁰ & K_LPL ⊢ x}

as the class of logical truths of propositional logic. Referring to the truth conditions of the negation and the conditional (as given in Sect. 2.3), it is easy to prove that the members of L_PL are, really, logical truths. In fact, L_PL contains all logical truths expressible in propositional logic - but we do not prove this statement here. (The proof belongs to the metatheory of classical propositional logic.)

4.3.2. A first-order language. Our next topic will be a language of classical first-order logic. First-order logic uses all the grammatical and logical means used in metalogic (cf. Sections 2.1, 2.2, 2.3): it applies (individual) variables, names, name functors, predicates, and quantifiers. (The adjective 'first-order' refers to the fact that only variables referring to individuals are used.) First-order languages may use different stocks of name functors and
predicates. We shall deal here with the maximal first-order language which has an infinite supply of name functors for all possible numbers of argument places and, similarly, an infinite supply of predicates for all possible numbers of argument places (and, of course, an infinite supply of variables). Thus, the alphabet we need must be much richer than 𝒜_PL.

We apply '𝔵' (gothic x) as the initial letter for variables; it will be followed by an {ι}-word as an index. For indicating the numbers of empty places of functors, we shall use the Greek letter 'ο' (omicron); {ο}-words may be called arities. As the initial letter for name functors we shall use the letter 'φ'; it will be followed by an arity and an index. If the arity is empty, we have a name. For predicates, we shall use the initial letter 'π', followed by an arity and an index. If the arity is empty, we have an unanalyzed atomic formula. We apply '∀' as the universal quantifier, and the symbols '−', '⊃', '=', and the parentheses will be used as well. (The missing sentence functors - e.g., conjunction - and the existential quantifier can be introduced via contextual definitions.) - Thus, our alphabet will be:

𝒜_MF = {(, ), ι, ο, 𝔵, φ, π, =, −, ⊃, ∀}.

(The subscript 'MF' refers to 'maximal first-order'.) The main category of this language is, again, that of the formulas. To define it, we need some auxiliary categories and relations which will be expressed by subsidiary letters. The class of our subsidiary letters is:

S = {I, A, V, N, P, T, F}.

Here I, A, V, T, and F stand for index, arity, variable, term, and formula, respectively. N and P represent dyadic predicates; the intuitive meaning of "xNy" is "y is a name functor of arity x", and that of "xPy" is "y is a predicate of arity x". Now we can formulate the following calculus over 𝒜_MF ∪ S with variables x, y, z:
1. I
2. Ix → Ixι
3. A
4. Ax → Axο
5. Ix → V𝔵x
6. Ax → Iy → xNφxy
7. Ax → Iy → xPπxy
8. Vx → Tx
9. Nx → Tx
10. Ax → xοNy → Tz → xNyz
11. Ax → xοPy → Tz → xPyz
12. Px → Fx
13. Tx → Ty → F(x = y)
14. Fx → F−x
15. Fx → Fy → F(x ⊃ y)
16. Vx → Fy → F∀xy
By adding the releasing rule

16*. Fx → x

we get the calculus K_MF defining the class of formulas, Form_MF, of our language. Other categories of this language can be defined by using a part of this calculus and other releasing rules; e.g.,

6*. xNy → y
7*. xPy → y
10*. Tx → x.

Rules 1 to 4, 6, and 6* define the class of (atomic) name functors; rules 1 to 4, 7, and 7* define the class of (atomic) predicates; rules 1 to 6, 8, 9, 10, and 10* define the class of terms; and rules 1 to 13 and 16* define the class of atomic formulas.

Comments. Most of our rules are obvious ones. The sign '0' of the empty word is omitted everywhere (in 1, 3, 9, and 12). Rule 9 says that a name functor of arity 0 is a term, and rule 12 states that a predicate of arity 0 is a formula. Rule 10 prescribes that by adding a term to a name functor of arity xο (i.e., x plus one) we get a name functor of arity x (i.e., the arity diminishes by one). Rule 11 tells the same for predicates. Note that these rules do not prescribe including the argument between parentheses when filling in the empty places of a functor. This is as good as it is, for a functor and its arguments are uniquely separable, due to the fact that every possible argument begins with '𝔵' or 'φ'.
1944, SMULLYAN 1961, MARTIN-LÖF 1966, FEFERMAN 1982. - Our 'rules' are sometimes called 'productions', the basic alphabet (of a calculus) is said to be the 'terminal alphabet', and the subsidiary letters are called 'non-terminal symbols'.
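The unique separability of a functor from its arguments, noted in the comments above, can be demonstrated by a small parser (my own sketch; ASCII stand-ins: 'x' for the gothic variable letter, 'f' for φ, 'o' for ο, 'i' for ι). It consumes exactly one term and reports where the term ends, so a word is a single term iff the parse ends exactly at the end of the word:

```python
def parse_term(w, k=0):
    """Return the position just after one term starting at w[k], or None."""
    if k >= len(w):
        return None
    if w[k] == "x":                        # a variable: 'x' followed by an index
        k += 1
        while k < len(w) and w[k] == "i":
            k += 1
        return k
    if w[k] == "f":                        # a name functor: 'f', arity, index
        k += 1
        arity = 0
        while k < len(w) and w[k] == "o":  # count the empty places
            arity += 1
            k += 1
        while k < len(w) and w[k] == "i":  # skip the index
            k += 1
        for _ in range(arity):             # then exactly `arity` argument terms
            k = parse_term(w, k)
            if k is None:
                return None
        return k
    return None                            # every term begins with 'x' or 'f'

# 'fooixifi': a two-place functor (arity 'oo', index 'i') applied to the
# variable 'xi' and the name 'fi' -- one term, parsed without parentheses
assert parse_term("fooixifi") == len("fooixifi")
assert parse_term("fooixi") is None        # a missing argument is detected
```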
4.4 Hypercalculi

We posed the question in Sect. 4.2: Could the notions concerning canonical calculi be defined by canonical calculi? We are going to answer this question positively in this section. The key to our answer lies in Th. 3.2.2, according to which each language can be replaced by a language based on the two-letter alphabet 𝒜₁ = {α, β}.

The intuitive background is as follows. Imagine a calculus K over an alphabet ℬ. According to the theorem just cited, ℬ can be replaced by 𝒜₁. Let ξ be a third letter; then, the K-variables can be replaced by words such as ξ, ξβ, ξββ, ξβββ, ... . Choose a fourth character, say '≺', in order to substitute the arrows ('→') in the rules of K. Then, each rule of K will be translated as a word of the alphabet {α, β, ξ, ≺}. Finally, choosing a fifth character, '·', we can form the expression

r₁ · r₂ · ... · r_n

as the translation of K into a single word of the five-letter alphabet

𝒜_cc = {α, β, ξ, ≺, ·}.

That is, any canonical calculus can be expressed - via translation - as an 𝒜_cc-word.

4.4.1. The hypercalculus H₁. - We define the hypercalculus H₁ that presents the
class of all calculi over 𝒜₁. Our starting alphabet will be 𝒜_cc, but we need the class of subsidiary letters

S₁ = {I, L, W, V, T, R, K}.

Thus, H₁ will be a calculus over 𝒜_cc ∪ S₁. We shall use the letters 'x' and 'y' as H₁-variables. The intuitive meaning of our subsidiary letters is included in the following glossary: I - index, L - letter, W - word, V - variable, T - term ({α, β, ξ}-word), R - rule, K - calculus.

H₁:
1. I
2. Ix → Ixβ
3. Ix → Lαx
4. Ix → Vξx
5. W
6. Wx → Ly → Wxy
7. T
8. Tx → Ly → Txy
9. Tx → Vy → Txy
10. Tx → Rx
11. Tx → Ry → Rx≺y
12. Rx → Kx
13. Kx → Ry → Kx·y
By rules 1 and 2, "indices" are the {β}-words. By 3, "letters" are the words beginning with α and continuing with an index. By 4, "variables" begin with ξ and continue with an index. By 5 and 6, "words" are formed from 0 by adding "letters". By 7, 8, and 9, "terms" are formed from 0 by adding "letters" and/or "variables". The remaining rules are, hopefully, obvious. By adding to H₁ the releasing rule

Kx → x

the resulting calculus H₁′ defines the class of canonical calculi:

CCal =df {k : k ∈ 𝒜_cc⁰ & H₁′ ⊢ k}.
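The translation of a calculus into a single 𝒜_cc-word can be sketched as follows (my own function; ASCII stand-ins 'a', 'b', 'x', '<', '.' for α, β, ξ, ≺ and the separating dot, with '>' as the arrow in the source rules):

```python
def encode_calculus(rules, letters, variables):
    """Translate a calculus (a list of rule strings using '>') into one word
    over the five-letter alphabet {a, b, x, <, .}."""
    def enc_char(ch):
        if ch == ">":
            return "<"                       # arrows become '<'
        if ch in letters:
            return "a" + "b" * letters.index(ch)   # i-th letter -> a b^i
        return "x" + "b" * variables.index(ch)     # i-th variable -> x b^i
    return ".".join("".join(enc_char(ch) for ch in rule) for rule in rules)

# the calculus {0, u > u1} over the letters {0, 1} with the variable u:
word = encode_calculus(["0", "u>u1"], letters=["0", "1"], variables=["u"])
assert word == "a.x<xab"
```

In this encoding the i-th letter of the source alphabet becomes α followed by i copies of β, exactly as in the translation table of Sect. 3.2.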
The alphabet Jtccu 5t is representable in the alphabet Jt 1 = {a ,~} as well. Thus , even
H t ' can be expressed by a single word hI , and hI
E
CCaI. That is, H{ derives itself
(more exactly, its own translation h₁).

4.4.2. The hypercalculus H₂. - Our next task is to define the relation "word f is derivable in calculus k" by means of a canonical calculus. For this aim, we shall enlarge the hypercalculus H₁ by new rules. We need two new subsidiary letters 'S' and 'D', and the new variables u, v, w, z. According to Def. 3 of Sect. 4.2, there are two important types of steps in the course of deriving a word: (1) substituting a word for a variable, and (2) detaching a derived word f (containing no arrow) from a derived word "f → g", to get g. The latter is easily expressible by a rule. Not so the former, although, intuitively, substitution seems to be a very simple notion.

The subsidiary letter 'S' will be used as a four-place predicate. The expression "vSuSySx" is to be understood as "we get v from u by substituting y for x". The conditions are always "Vx" and "Wy", i.e., that x is a "variable" and y is a "word". However, we omit these conditions whenever they are unimportant, i.e., if "uSuSySx" holds, that is, u remains intact with respect to the substitution. In fact, there are such cases. Let us begin with these:

14. Lu → uSuSySx
15. ≺S≺SySx
16. Vx → Iz → xβzSxβzSySx
17. Vx → Iz → xSxSySxβz

These rules tell that in substituting for a variable, the letters, the character '≺', and the other variables remain intact. Now an "active" rule:
18. Vx → Wy → ySxSySx

(we get y from x by substituting y for x). We finish by formulating that substitution in a compound "uz" means substitution in its parts u and z:

19. vSuSySx → wSzSySx → vwSuzSySx
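The effect that rules 14 to 19 capture can be illustrated directly: substitution replaces exactly the occurrences of the variable x as a complete token, leaving letters, '≺', and distinct (in particular, longer) variables intact. A hypothetical Python helper, with the same ASCII stand-ins as before ('a', 'b', 'x', '<' for α, β, ξ, '≺'):

```python
import re

def substitute(u, x, y):
    """Return the v with vSuSySx: replace each occurrence of the
    variable x - as a maximal token xi+index - in u by the word y.
    Letters, '<' and other variables stay intact (rules 14-17);
    compounds are handled piecewise (rule 19)."""
    tokens = re.findall(r"ab*|xb*|<|\.", u)
    return "".join(y if t == x else t for t in tokens)
```

Note that tokenizing into maximal variable tokens is what makes rule 16 come out right: a longer variable such as ξβ is never touched when substituting for ξ.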
Now we turn to derivations. We use the subsidiary letter 'D' as a dyadic predicate; "xDy" means that "y is derivable from x".

20. Rx → xDx
21. Rx → Ky → y.xDx
22. Rx → Ky → x.yDx
23. Rx → Ky → Kz → y.x.zDx
24. zDu → vSuSySx → zDv
25. xDy → xDy≺z → Ty → xDz
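Before commenting on these rules, note that their intent - closing a calculus under substitution and detachment - can be imitated by a brute-force search. The sketch below is a toy semi-test only, under assumptions of our own: substitution is restricted to short words, and a rule is encoded as a Python tuple (premise, ..., conclusion):

```python
from itertools import product

def derivable(rules, variables="XY", alphabet="ab", max_len=3, rounds=6):
    """Forward closure under substitution and detachment (the content of
    rules 20-25), with variables instantiated only by alphabet-words of
    length <= max_len. Returns the set of derived arrow-free words."""
    words = [""]
    for n in range(1, max_len + 1):
        words += ["".join(p) for p in product(alphabet, repeat=n)]
    derived = set()
    for _ in range(rounds):                      # iterate toward a fixpoint
        for rule in rules:
            vs = [v for v in variables if any(v in part for part in rule)]
            for combo in product(words, repeat=len(vs)):
                sub = dict(zip(vs, combo))
                inst = ["".join(sub.get(c, c) for c in part) for part in rule]
                *premises, conclusion = inst
                if all(p in derived for p in premises):
                    derived.add(conclusion)      # detachment step
    return derived

# Toy calculus over {a, b}: the axiom 'a' plus the rule X -> Xb.
D = derivable([("a",), ("X", "Xb")])
```

With the toy calculus above, the closure contains a, ab, abb, ... up to the search bounds.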
Rule 20 is obvious. Rules 21, 22, and 23 assure that in a calculus, each of its rules is derivable. Rules 24 and 25 express substitution and detachment, respectively.

Let H₂ be the hypercalculus constituted by the rules 1 to 25. We do not add a releasing rule to H₂, for its task is to define the relation D rather than a class. Now,

(H₂ ⊢ Ka) & (H₂ ⊢ Wb) & (H₂ ⊢ aDb)
holds iff a represents a calculus in which b ∈ 𝒜₁° is derivable.

4.4.3. The hypercalculus H₃. -
In order to get an important piece of information about inductive classes, we shall enlarge our hypercalculus H₂ by some new rules. As new subsidiary letters, we shall use 'F', 'G', and 'A'.

We define first the lexicographic ordering of the 𝒜cc-words. Such an ordering is almost the same as that used in vocabularies, dictionaries, and encyclopaedias, with the exception that a shorter word always precedes a longer one. It is based on the alphabetical ordering of the letters. In our case, we assume that α is the first letter; it is followed by β, β by ξ, ξ by '≺', and '≺' by '.'. Using "xFy" for "x is followed by y", our rules
defining the lexicographic ordering of 𝒜cc-words are as follows:

26. Fα
27. xαFxβ
28. xβFxξ
29. xξFx≺
30. x≺Fx.
31. xFy → x.Fyα

(Note that by 26, 0 is followed by α.)
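Rules 26 to 31 define what is nowadays called the shortlex (length-first lexicographic) successor, and computing it directly may clarify them. A Python sketch, with the usual ASCII stand-ins for the five 𝒜cc-letters:

```python
# Shortlex successor over the ordered alphabet of rules 26-30.
ALPHABET = ["a", "b", "x", "<", "."]

def successor(word):
    """The word y with 'xFy': the next word in the ordering in which
    shorter words always precede longer ones."""
    first, last = ALPHABET[0], ALPHABET[-1]
    w = list(word)
    i = len(w) - 1
    while i >= 0 and w[i] == last:   # maximal trailing letters behave like
        w[i] = first                 # trailing 9s in decimal: wrap and carry
        i -= 1
    if i < 0:
        return first + "".join(w)    # all letters were maximal: lengthen
    w[i] = ALPHABET[ALPHABET.index(w[i]) + 1]
    return "".join(w)
```

Rule 31 corresponds to the lengthening case: if x is followed by y, then x with a maximal letter appended is followed by y with the first letter appended.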
As the next step, we shall introduce numerical codes of 𝒜cc-words. We shall use {α}-words as numerals (cf. the comments on the alphabet 𝒜₀ = {α} in Sect. 3.2), and apply the simple strategy: the code of the empty word let be itself, and if the code of x is y, then the code of the word following x let be "yα". Encoding words (of a formal language) by natural numbers was first used by Kurt Gödel (in GÖDEL 1931). On this ground, we shall call the numeral coding of a word its Gödel numeral. In the following two rules, "xGy" may be read as "the Gödel numeral of x is y".

32. G
33. xFy → xGz → yGzα
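Rules 32 and 33 assign to each word the {α}-word whose length is the word's position in the ordering just defined. This position can also be computed in closed form; the following Python fragment is an illustration only ('a' stands for α, and the alphabet order is that of rules 26-30):

```python
# The Goedel numeral of an Acc-word: 'a' repeated rank(word) times,
# where rank is the word's position in the shortlex ordering.
ALPHABET = "abx<."

def goedel_numeral(word):
    k = len(ALPHABET)
    shorter = (k ** len(word) - 1) // (k - 1)   # count of strictly shorter words
    value = 0                                   # rank among words of equal length
    for ch in word:
        value = value * k + ALPHABET.index(ch)
    return "a" * (shorter + value)
```

By rule 32, the code of the empty word is the empty word; each application of rule 33 appends one α.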
Now assume that the word k ∈ 𝒜cc° represents a calculus (i.e., H₁ ⊢ Kk), and its Gödel numeral - as determined by the rules 26 to 33 - is g; that is, the word "kGg" is derivable in our enlarged calculus. Now, it may happen that H₂ ⊢ kDg, which means that in k, its own Gödel numeral is derivable. Such a calculus may be called an autonomous one, and its Gödel numeral an autonomous numeral. Using the subsidiary letter 'A' for 'is an autonomous numeral', we can define this notion by the rule:

34. xDy → xGy → Ay

Let H₃ be the hypercalculus consisting of the rules 1 to 34. Then, the class of autonomous numerals, Aut, can be defined as follows:

Aut =df {x : x ∈ 𝒜₀° & H₃ ⊢ Ax}.
Or, if we add the releasing rule 34* to H₃, to get the calculus H₃',

34*. Ax → x

then we have:

Aut =df {x : x ∈ 𝒜₀° & H₃' ⊢ x}
which shows explicitly that Aut is an inductive subclass of 𝒜₀°.

4.4.4. THEOREM. The class of non-autonomous numerals, 𝒜₀° − Aut, is not an inductive subclass of 𝒜₀°.

Proof. We have to show that no calculus defines the class of non-autonomous numerals. It is sufficient to deal with calculi over 𝒜₁ = {α, β} (remembering that every alphabet can be replaced by a two-membered one). Assume k ∈ CCal, and let its Gödel numeral be g (i.e., H₃ ⊢ Kk, and H₃ ⊢ kGg). Then, there are two possibilities:

(a) g is derivable in k (i.e., H₃ ⊢ kDg). Then g ∈ Aut; hence, it is not true that only non-autonomous numerals are derivable in k.
(b) g is not derivable in k, i.e., g is a non-autonomous numeral. Hence, it is not true that all non-autonomous numerals are derivable in k.

By (a) and (b), no calculus derives 𝒜₀° − Aut - which was to be proven. •

Corollary 1. For all alphabets 𝒜, there exist inductive subclasses F and G of 𝒜° such that F − G is not an inductive subclass of 𝒜°.

Proof. We can assume, without violating generality, that 𝒜₀ ⊆ 𝒜 (for a certain letter of 𝒜 may be chosen for the role of α). It is obvious that 𝒜₀° is an inductive subclass of 𝒜°, and so is Aut, for 𝒜 can be enlarged with subsidiary letters in order to reconstruct H₃'. By our theorem, 𝒜₀° − Aut is not definable by means of any calculus. Thus, taking 𝒜₀° in the role of F, and Aut in the role of G, our statement holds true. •

Corollary 2. For all alphabets 𝒜, there exists an inductive subclass F of 𝒜° such that 𝒜° − F is not inductive.

Proof. Take into consideration that 𝒜₀° is inductive, and that (𝒜° − Aut) ∩ 𝒜₀° = 𝒜₀° − Aut. Knowing that 𝒜₀° − Aut is not inductive, we have that 𝒜° − Aut cannot be inductive (cf. Th. 6 in Sect. 4.2). Thus, Aut fulfils the role of F. •

The results of these corollaries were foreshadowed in a Remark at the end of Sect. 4.2. The importance of these results will be cleared up later on, partly in the next section.
4.5 Enumerability and Decidability

Inductive definitions - i.e., definitions by means of some canonical calculi - are accepted tools for introducing categories of a language. If categories F and G are well-defined, then F ∪ G, F ∩ G, and F − G seem to be well-defined as well. However, as we have seen in the preceding section, it may happen that although F and G are inductive subclasses of a certain 𝒜°, F − G is not (though F ∪ G and F ∩ G always are, according to Th. 6 of Sect. 4.2).

What is the peculiarity that inductive classes possess and non-inductive classes do not possess? This peculiarity is enumerability, in an extended sense of the word. In general, a class F is said to be enumerable iff there exists some procedure (or algorithm) by which it is possible to list the members of F one after another, i.e., the procedure gives the initial member of the list, and whenever given a member in the list, the next member will be determined uniquely, except when the given member is the last one (in which case F is finite). More exactly, for all members of F - except at most a single one - the procedure uniquely determines a successor in such a way that
(i) the initial member is not a successor of any member,
(ii) no member of F is a successor of itself,
(iii) the successor of each member belongs to F,
(iv) each member of F except the initial member is a successor of a unique member of F (thus, successors of different members are different), and
(v) if there is a single member in F without a successor, then it is called the final member of the enumeration; in this case, F is finite.

In the case F is infinite, the enumerating procedure, of course, never ends and never stops; it runs ad infinitum. It is obvious that any finite class of words is enumerable. Also, the empty class will be qualified as enumerable (saying that the procedure "do nothing!" enumerates all members of 0). If 𝒜 is an alphabet, the class 𝒜° is enumerable, e.g., by means of the lexicographic ordering of its words. (The method of the preceding section expressed by the rules 26 to 31 can be easily adapted to any alphabet.) In Sect. 5.2, an algorithm will be presented which "calculates" the successor of any word f ∈ 𝒜°. Another algorithm (ibid.) will be able to "calculate" - for any word f ∈ 𝒜° which involves just k copies of a fixed letter γ ∈ 𝒜 - the word f' involving again k copies of γ and standing "nearest" after f in the lexicographic ordering of 𝒜-words. From this it then follows:

4.5.1. LEMMA. The class of 𝒜-words involving exactly k copies of a fixed letter of 𝒜 is enumerable; here k is any positive integer.

Now we can outline - sketchily - a procedure for enumerating all members of an inductive class. Let 𝒜 be an alphabet and F be an inductive subclass of 𝒜° defined by a calculus K. Disregarding trivial cases, let us assume that K contains some input-free rules (thus, F is not empty), and that some K-variables occur in the rules of K (thus, F is not finite). From the rules of K we can get further derived words either by substitution or by detachment. Our strategy consists in applying alternately these two permitted acts of derivation.
In order to apply all possible substitutions of variables, we need, in advance, an enumeration of all substitutions. Assume that

x₁, x₂, ..., xₖ

is an ordering of the variables of K. A total substitution of these variables is determined by a k-tuple of 𝒜-words a₁, ..., aₖ, meaning that x₁ is substituted by a₁ (in all rules of K), x₂ by a₂, ..., and xₖ by aₖ. Let '|' be a new character (not occurring in 𝒜); then, this substitution can be expressed by the 𝒜 ∪ {|}-word

a₁|a₂| ... |aₖ

involving just k−1 copies of the letter '|'. By our Lemma above, the class of such words is enumerable. Hence, the class of all substitutions of our variables is enumerable. Note that the first word in this enumeration is

0|0| ... |0,

or, omitting 0, this word consists of k−1 copies of '|'. That is, the first substitution consists in putting the empty word for all occurrences of all variables in the rules of K.

Now, our enumerating procedure runs as follows.

Step 1. Apply the first substitution of variables. Arrange the words resulting from the rules of K by this substitution in two sequences S and P: put the words resulting from the input-free rules (words without arrow) into S and the other ones into P. (The order of the words in S and P is indifferent; however, if you have an ordering of the rules of K, then the ordering of S and P may follow this ordering.)

Step 2. For each member of form "f → g" in P - where f involves no arrow - examine whether f occurs in S. If so, and if g involves no arrow, then add g to S (except when it occurs in S already), and omit "f → g" from P. If f occurs in S but g is not arrow-free, then replace "f → g" in P by g. Finally, if f does not occur in S, go further.

Step 3, and, in general, Step 2n+1 (n ≥ 1). Apply the second (in general, the (n+1)-th) substitution of variables. Extend S by adding the arrow-free words resulting from the rules of K by this substitution, and P by the other words.

Step 4, and, in general, Step 2n. The same as Step 2.
The sequence marked by 'S' will be extended in each step of odd number and, sometimes, in steps of even number. The other sequence marked by 'P' contains 𝒜 ∪ {→}-words; it increases in steps of odd number and may change or even decrease in steps of even number. Clearly, the ever-growing sequence S gives the enumeration of the 𝒜-words derivable in K.

In case K uses subsidiary letters, we need a further procedure to omit from S the words involving subsidiary letters. But this can be done by simple inspection of the members of S.

It seems to be obvious that a necessary condition of accepting a syntactic category of a language as a well-defined one is that the category be an inductive class defined by some calculus. If this condition is fulfilled, then - according to the results of the present section - the words belonging to the category can be enumerated systematically. However, if the category contains infinitely many words, then the enumeration - although it can be continued without any limit - never ends. Thus, the enumeration
itself does not give a method to decide of every word whether it belongs to the category in question.

4.5.2. DEFINITION. A category of a language is said to be decidable iff there exists a universal procedure by which, for any word of the language, it is decidable in a finite number of steps whether it belongs to the category in question.

Although the notion of procedure is unclear, we can give examples in which its use seems to be quite correct. This is the case in the next theorem.

4.5.3. THEOREM. If 𝒜 is an alphabet, and both F and 𝒜° − F are inductive subclasses of 𝒜°, then F is decidable.

Proof. We give a decision procedure for F. By our assumptions, both F and its complementary class 𝒜° − F are enumerable. Hence, any word a ∈ 𝒜° must occur in one (and only one) of the enumerations of these classes. Apply the enumeration procedure outlined above to both classes, forming alternately the strictly increasing sequences

S₁, S₂, ..., Sᵢ, ...   and   S₁', S₂', ..., Sᵢ', ...

of words of F and 𝒜° − F, respectively (where Sᵢ is the result of applying 2i steps in enumerating F, and Sᵢ' is the analogous result for 𝒜° − F). Since the word a must occur in one of these enumerations, there must be a number n such that either a is in Sₙ or a is in Sₙ'. Thus, after a finite number of steps of our procedure, the question "a ∈ F?" is answered. •

The question "What is a (deciding) procedure?" seems to be unimportant in case we can present a decision procedure for a class F. However, it becomes important if we are unable to present such a procedure and, moreover, we suspect that our class F is undecidable. To prove such a guess, an exact notion of procedure is indispensable. Furthermore, we can pose the questions: Is every inductive class decidable? Is every non-inductive class undecidable? (Or, in contraposition: Is every decidable class (of words) an inductive one?) Our next chapter is devoted to the investigation of these and similar questions.
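The proof's alternating search can be made concrete. In the sketch below, the two enumerations are represented by Python generators, and the procedure dovetails between them until the queried word appears (all names are ours; the enumerators are assumed to be exhaustive and disjoint, as the theorem requires):

```python
# Decision by dovetailing two enumerations, as in Th. 4.5.3.
def decide(a, enum_F, enum_coF):
    """enum_F / enum_coF: generator functions enumerating F and its
    complement without repetition. Returns True iff a belongs to F;
    termination is guaranteed because a occurs in exactly one list."""
    f, g = enum_F(), enum_coF()
    while True:
        if next(f) == a:
            return True
        if next(g) == a:
            return False

# Toy instance over the alphabet {'a'}: F = words of even length.
def evens():
    n = 0
    while True:
        yield "a" * n
        n += 2

def odds():
    n = 1
    while True:
        yield "a" * n
        n += 2
```

Neither enumeration alone decides membership; it is the alternation that turns two enumerations into a decision procedure.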
Chapter 5
NORMAL ALGORITHMS

5.1 What is an Algorithm?

Instead of the term 'procedure' - used in the preceding section - the expression 'algorithm' is widely used in mathematics. In the most general sense, an algorithm is a regulated method (a procedure) of solving all (mathematical) problems belonging to a certain class. All of us have learned in primary school a method of multiplying a pair of numbers written in decimal notation. This method is a simple example of an algorithm that is applicable to all pairs of numbers written in decimal notation. In fact, it deals not with numbers but with numerals, i.e., with words of a certain alphabet.

This last remark leads us to delimit the intuitive notion of an algorithm. We stipulate that each algorithm must be connected with an alphabet 𝒜, and its task is always to transform 𝒜-words into 𝒜-words. (Note that pairs, triples, etc. of 𝒜-words can be considered as words of a larger alphabet - e.g., of 𝒜 ∪ {|}; thus, our delimitation does not exclude the above example of the multiplying algorithm. We can apply subsidiary letters in an algorithm, too.) If an algorithm deals with 𝒜-words, we shall call it an algorithm over the alphabet 𝒜.

Now, an algorithm over 𝒜 must prescribe what we have to do with an input word f ∈ 𝒜°. In most cases, this depends on which letters occur in f. Thus, the working of an algorithm may begin with a question concerning the form of f. In most cases, such a question is: "does the word a occur in f?" The further steps depend on the two possible answers. The work of the algorithm may be continued either by putting a new question or else by a command telling what we have to do with the given word. A command, in general, prescribes changing a subword of the given word into another one. (This includes prefixing or suffixing the given word by some letters, as well as omitting some of its letters. Think of the application of the empty word!) After performing the act prescribed by the command, we have a new word f', and the algorithm may pose questions concerning f' or may give a new command. Thus, questions and commands may occur alternately in an algorithm. Finally, the algorithm may give a command to stop the work and produce the output. Thus, an algorithm may involve questions and commands, and a steering apparatus that tells us how to begin, how to continue, and how to finish our work.

In the general setting, we cannot exclude the case that an algorithm over an alphabet 𝒜 is not applicable to some 𝒜-words (including the case that it works "forever", getting tangled up in a circle of steps). Of course, a "good" algorithm must be applicable at least to the members of a certain (nonempty) subclass F of 𝒜°.

Since both canonical calculi and algorithms are based on alphabets, it may be useful to stress an intuitive but essential difference between them. The rules of a calculus tell us what we may do, what we are permitted to do. On the other hand, the commands of an algorithm tell us what we must do, what we are prescribed to do.

All that has been said up to this point is not an exact definition of algorithms; rather, it is merely an illustration of an inexact, intuitive notion used for centuries in mathematics. Only in the twentieth century was the claim raised to give an exact formal definition of this notion. However, we never can tell that a formal definition of a notion reflects exactly the content of a non-formal, intuitive notion. The most we can do is to introduce the notion of a class of algorithms, hoping that any effectively working algorithm can be substituted by a member of this class. To be more exact, let us take into consideration the following definition.

5.1.1. DEFINITION. We say that two algorithms are equivalent with respect to an alphabet 𝒜 iff for all 𝒜-words they apply with the same result, i.e., if f ∈ 𝒜°, then either (a) none of them is applicable to f, or else (b) both are applicable to f, and both transform f into the same word f'.

Now, the statement that every algorithm can be substituted by an algorithm belonging to a certain type (class) T of algorithms is to be understood as follows: Given an alphabet 𝒜 and an algorithm G, there exists an algorithm G' belonging to T such that G and G' are equivalent with respect to 𝒜. According to the preceding considerations, such a statement never can be proved rigorously - although it can be somewhat confirmed by empirical facts, as long as no counterexamples exist. Or it may serve as a definition of algorithms - in which case an alleged counterexample will be refused by telling that it is not an algorithm at all. (However, this would be a very curious situation. If it is proved that a procedure works effectively and successfully, then it is unreasonable to say that it is not an algorithm.)

Empirical investigations of existing algorithms show that their questions and commands can be "dissolved" - in most cases - into simpler steps. Then the challenge arises: try to find the simplest types of questions and commands, as well as the simplest forms of steering! This will lead to the most general type of algorithms. There are several proposed solutions to this problem which, in the course of time, were proved to be equivalent. In the field of research into the foundations of mathematics (i.e., in metamathematics), the problem of algorithms was re-formulated as the problem of effective computability ("reckonability") of (number-theoretic) functions. In this field, the most popular solutions are elaborated in the theory of recursive functions and in the theory of Turing machines (the latter are idealized computers). (See, e.g., KLEENE 1952.) Researches directed immediately toward algorithms were successful, too. We shall treat in detail the theory of Markov algorithms - also called normal algorithms - for this is best suited to our aims concerning the foundations of metalogic. (MARKOV 1954; a good report is in MENDELSON 1964, Ch. 5, § 1.)

We give here an intuitive picture of Markov (or normal) algorithms. (The formal definitions follow in the next section.) Let us agree in using the term normal algorithm in referring to this class of algorithms.
A normal algorithm over an alphabet 𝒜 is, in essence, an ordered class of commands of form

a → b     (1)

or

a → ·b     (2)

where a and b are 𝒜-words - called the input and the output of the command, respectively - and the characters '→' and '·' (the arrow and the dot) do not belong to 𝒜. A command of form (1) is said to be a non-stop command, and a command of form (2) is said to be a stop command.

A command C is said to be applicable to a word f ∈ 𝒜° iff its input a occurs as
a part in f. If C is applicable to f, then by applying it to f let us mean changing the first occurrence of its input a to its output b.

Given a normal algorithm N over an alphabet 𝒜 and a word f ∈ 𝒜°, the application of N to f runs as follows.

(1) Find the first command of N applicable to f. If there is no such command, then N does not apply to f. If there is such a command C, then apply it to f, to get another word g. If C is a stop command, the work of N is finished: N transformed f into g.

(2) If C is a non-stop command, then apply the procedure described in (1) to g, that is, find again the first command of N applicable to g, and so on.

Now, in applying N to a word f, there are three possible cases:

(a) The algorithm N blocks the word f, in the sense that either N does not apply to f, or after some steps we get a word g such that N does not apply to g.

(b) The algorithm N is successful with respect to f, in the sense that after a finite number of steps we get a word g as the result of applying a stop command. Then we say that N transformed f into g.

(c) The algorithm N is everlasting with respect to f: it runs forever, without blocking and without stopping by a stop command.

Of course, only case (b) is a useful one.

The empty word may occur in commands. A command of form

0 → b

means prefixing a word with b (for the first occurrence of 0 in a nonempty word is just before its first letter), and
a → 0

means erasing the first occurrence of a. A command

0 → 0

means doing nothing - thus, it seems to be superfluous - and it is applicable to all words. The stop command

0 → ·0

may indicate finishing the work of the algorithm. An algorithm containing this command never blocks, for it is applicable to any word.

Now we see that in normal algorithms the questions are included in the commands. Every question is of the form "does the word a occur in the studied word?", and every command prescribes changing the first occurrence of a subword of the studied word into another one. Probably, these are the simplest possible forms of questions and commands. Further, the steering in normal algorithms is uniformly regulated by the prescriptions (1) and (2) above; beyond these, only the order of the commands takes part in the steering. However, by using subsidiary letters, we can modify the steering - as we shall see later in the examples.
5.2 Definition of Normal Algorithms

5.2.1. DEFINITION. By a normal algorithm over the alphabet 𝒜 let us mean an ordered (nonempty) finite class of words called commands, of form

(i) a → b

and

(ii) a → ·b

where a, b ∈ 𝒜° (any of them may be empty), and the characters '→', '·' do not belong to 𝒜. Here a and b are called the input and the output of the command, respectively. A command is said to be a non-stop one if it is of form (i), and a stop one if it is of form (ii).

5.2.2. DEFINITION. A command C with input a is said to be applicable to the word f ∈ 𝒜° iff f is of form "xay". If b is the output of C, and a does not occur in x, then the word "xby" will be said to be the result of applying C to f = "xay", and it will be denoted by C(f).

5.2.3. DEFINITION. Let N be a normal algorithm over the alphabet 𝒜, and f ∈ 𝒜°. We define the relations "N blocks f", "N leads f to g", and "N transforms f into g" by simultaneous induction, according to the stipulations (a) to (e).
(a) If no command of N is applicable to f, then N blocks f - in symbols: "N(f) = #".

(b) Assume that C is the first command applicable to f, and C(f) = g. Then: if C is a non-stop command, N leads f to g - in symbols: "N(f/g)"; and if C is a stop command, N transforms f into g - in symbols: "N(f) = g".

(c) If N(f/g) and N(g/h), then N(f/h).
(d) If N(f/g) and N(g) = #, then N(f) = #.
(e) If N(f/g) and N(g) = h, then N(f) = h. •

5.2.4. DEFINITION. We say that the normal algorithm N is applicable to a word f iff there is a word g such that N(f) = g.
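Definitions 5.2.1 to 5.2.4 translate almost verbatim into a small interpreter. The Python sketch below represents a command as a triple (input, output, stop-flag) - a representation of our own choosing - and returns '#' for blocking; a step limit stands in for the "everlasting" case, which the definition leaves without a value:

```python
# A minimal interpreter for normal (Markov) algorithms.
BLOCK = "#"

def run(commands, f, max_steps=10_000):
    """Return N(f): the transformed word, or BLOCK if N blocks f."""
    for _ in range(max_steps):
        for a, b, stop in commands:
            if a in f:                         # C is applicable (Def. 5.2.2)
                i = f.index(a)                 # first occurrence of the input
                f = f[:i] + b + f[i + len(a):]
                if stop:
                    return f                   # stipulation (b): N(f) = g
                break                          # restart from the first command
        else:
            return BLOCK                       # stipulation (a): N blocks f
    raise RuntimeError("no verdict within max_steps (everlasting?)")

# Example: erase every 'b', then stop - over the alphabet {'a', 'b'}.
N = [("b", "", False), ("", "", True)]
```

Note that a command with empty input is applicable to every word, which is why it must come last if any other command is to get a chance.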
Comments. 1. According to our definitions, N(f) = g iff there exists a finite sequence f₁, f₂, ..., fₙ with f₁ = f such that N leads fᵢ to fᵢ₊₁ (for 1 ≤ i < n), and N transforms fₙ into g. Furthermore, N(f) = # iff there exists a finite sequence f₁, f₂, ..., fₙ with f₁ = f such that N leads fᵢ to fᵢ₊₁ (1 ≤ i < n) and N blocks fₙ.

2. We do not introduce a term and a notation for the case in which the algorithm neither blocks a word f nor is applicable to f, i.e., when, trying to apply the algorithm to f, it runs forever. We shall not be interested in this case.

3. If the commands of a normal algorithm are, in their ordering, C₁, ..., Cₙ, then we can represent the algorithm by the word

C₁,C₂, ... ,Cₙ

Thus, any normal algorithm over an alphabet 𝒜 can be represented by a single word of the alphabet 𝒜 ∪ {→, ·, ,}.
Illustration 1. Let 𝒜 = {a₁, ..., aₙ} be any alphabet. The normal algorithm N₀ below transforms each 𝒜-word into the empty word:

N₀:
1. a₁ → 0
...
n. aₙ → 0
n+1. 0 → ·0

We have that f ∈ 𝒜° ⟺ N₀(f) = 0.

It is clear that the ordering of the first n commands is indifferent here, but the (n+1)-th command must be the last one. As a short presentation of N₀, we can apply the following concise notation:

N₀:
1. a → 0   [a ∈ 𝒜]
2. 0 → ·0

Of course, line 1 comprises n different commands. - In what follows, similar abbreviations will be used systematically.

Illustration 2. Let 𝒜 be an alphabet containing a pair of parentheses, i.e.,

𝒜 = 𝒜' ∪ {(, )}

where 𝒜' = {a₁, ..., aₙ}, n ≥ 1.
Let us give a normal algorithm Np that checks whether the parentheses in 𝒜-words are well-paired. For this aim, we need three subsidiary letters L, R, and ι; thus, Np will be an algorithm over 𝒜 ∪ {L, R, ι}. Of course, the subsidiary letters must be alien to 𝒜, and we want Np to be applicable to all 𝒜-words.

Given a word f ∈ 𝒜°, our algorithm will prefix it with 'L', and then L will "jump" over the letters of f, one after the other. However, whenever L jumps over a left parenthesis '(', it will be prefixed by an 'ι' as an index counting the parentheses. These ι-s immediately precede the letter L. Whenever L jumps over a right parenthesis ')', an ι will be erased; but if there is no ι to erase, then L will change to 'R', and R will go to the end of f. Thus, finally, f will be followed either by L without any ι, or by one or more copies of ι and L, or by R. In the first case, the parentheses in f are well-paired; in the other cases, they are not. Now, our algorithm is as follows:

Np:
1. La → aL   [a ∈ 𝒜']
2. L( → (ιL
3. ιa → aι   [a ∈ 𝒜]
4. ιL) → )L
5. L) → )R
6. Ra → aR   [a ∈ 𝒜]
7. ιL → R
8. ιR → R
9. L → ·L
10. R → ·R
11. 0 → L

Commands 1 to 10 involve subsidiary letters in their inputs. Hence, if f ∈ 𝒜°, only command 11 is applicable to f (and it will never be applied in the following steps). Note that command 5 is to be applied only if none of the preceding ones are applicable, i.e., when L meets a right parenthesis and no ι precedes L, which means that our right parenthesis has no left mate. If some left parentheses have no right mates, then commands 7 and 8 change L to R and erase the remaining ι-s. The stop commands are 9 and 10. Thus, we have: If f ∈ 𝒜°, then Np(f) = "fL" if the parentheses in f are well-paired, and Np(f) = "fR" otherwise.
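The behaviour of Np can be checked mechanically. Below, a minimal interpreter is paired with the eleven commands, taking 𝒜' = {'c', 'd'} and writing 'i' for the subsidiary letter ι - a sketch under these assumptions, not the book's own formulation:

```python
# Running Np: commands are (input, output, stop) triples, tried in
# order, applied at the first occurrence of the input.
def apply_algo(commands, f):
    while True:
        for a, b, stop in commands:
            if a in f:
                j = f.index(a)
                f = f[:j] + b + f[j + len(a):]
                if stop:
                    return f
                break

ORD, ALL = "cd", "cd()"        # A' and A
Np = ([("L" + a, a + "L", False) for a in ORD]       # 1
      + [("L(", "(iL", False)]                       # 2
      + [("i" + a, a + "i", False) for a in ALL]     # 3
      + [("iL)", ")L", False), ("L)", ")R", False)]  # 4, 5
      + [("R" + a, a + "R", False) for a in ALL]     # 6
      + [("iL", "R", False), ("iR", "R", False)]     # 7, 8
      + [("L", "L", True), ("R", "R", True)]         # 9, 10
      + [("", "L", False)])                          # 11
```

Well-paired inputs end in L, ill-paired ones in R, exactly as stated above.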
Illustration 3. Assume that 𝒜 = {a₁, ..., aₙ}, n ≥ 2, and let us assume the lexicographic ordering of 𝒜-words based on the ordering of the enumeration of the letters a₁, ..., aₙ (cf. the hypercalculus H₃ in Sect. 4.4). The normal algorithm Nsuc below transforms every 𝒜-word into its successor (i.e., into the word following it immediately according to the lexicographic ordering). It uses the subsidiary letters 'I' and 'J'.

Nsuc:
1. Ia → aI   [a ∈ 𝒜]
2. I → J
3. aᵢJ → ·aᵢ₊₁   [i < n]
4. aₙJ → Ja₁
5. J → ·a₁
6. 0 → I

Command 6 prefixes the word f ∈ 𝒜° with 'I'. By iterated applications of command 1, 'I' goes to the end of f, and, by command 2, it changes to J. If the final letter of f is not aₙ, then command 3 finishes the work. Otherwise we apply command 4, and after this, command 3 or command 4 will be applicable. In case we must apply command 4 again and again (which means that each letter of f is aₙ), J goes back to the leftmost position, and command 5 closes the work.

Using the existence of this algorithm, and referring to the notion of enumerability (cf. Sect. 4.5), we have:

5.2.5. THEOREM. For all alphabets 𝒜, the class 𝒜° is enumerable.

Illustration 4. Let 𝒜 be as in Illustration 3. We produce a normal algorithm N[k] that transforms every word f ∈ 𝒜° into the word g containing exactly k copies of the letter aₙ and standing nearest after f (in the lexicographic ordering). Here k is a fixed positive integer. This algorithm uses the subsidiary letters I, J, K, K₀, K₁, ..., Kₖ, Kₖ₊₁.

N[k]:
1. Ia → aI   [a ∈ 𝒜]
2. I → J
3. aᵢJ → Kaᵢ₊₁   [i < n]
4. aₙJ → Ja₁
5. J → Ka₁
6. aK → Ka   [a ∈ 𝒜]
7. K → K₀
8. Kᵢaⱼ → aⱼKᵢ   [j < n, i ≤ k]
9. Kᵢaₙ → aₙKᵢ₊₁   [i ≤ k]
10. Kₖ₊₁ → 0
11. Kₖ → ·0
12. Kᵢ → 0   [i < k]
13. 0 → I
Commands 1 to 5 are almost the same as in Nsuc except that instead of the stop dot '·' we find the subsidiary letter 'K'. The work of the algorithm begins with command 13, which prefixes the input word f ∈ 𝒜° with 'I'. Commands 1 to 7 present the successor of f prefixed by 'K₀'. The next commands serve to count the occurrences of aₙ in the considered word. If the count is just k, command 11 closes the work. In other cases, command 10 or 12 erases Kₖ₊₁ or Kᵢ where i < k. Then, by command 13, we repeat the procedure, now applied to the successor of f. After a finite number of repetitions, we surely find the word involving just k copies of aₙ.

The existence of this algorithm proves the Lemma in Sect. 4.5. This Lemma was used in showing that inductive classes are enumerable. Thus, our results in Sect. 4.5 are reinforced by referring to the algorithms Nsuc and N[k].
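What N[k] computes can be restated directly: starting from the successor of f, keep stepping until a word with exactly k copies of aₙ appears - the restart loop of commands 10 to 13. A Python sketch over the two-letter alphabet {'a', 'b'}, with 'b' playing the role of aₙ (the helper names are ours):

```python
def successor(word, alphabet="ab"):
    """Lexicographic (shortlex) successor, as computed by Nsuc."""
    first, last = alphabet[0], alphabet[-1]
    w = list(word)
    i = len(w) - 1
    while i >= 0 and w[i] == last:
        w[i] = first
        i -= 1
    if i < 0:
        return first + "".join(w)
    w[i] = alphabet[alphabet.index(w[i]) + 1]
    return "".join(w)

def nearest_with_k(word, k, alphabet="ab"):
    """The first word after `word` containing exactly k copies of the
    last letter - the job of N[k]."""
    w = successor(word, alphabet)
    while w.count(alphabet[-1]) != k:
        w = successor(w, alphabet)
    return w
```

Since every length class contains words with exactly k copies of aₙ (once the length reaches k), the loop always terminates, which is the finiteness claim made above.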
5.3 Deciding Algorithms

In Sect. 4.5, we gave a definition of decidability of a class of words. Now, the decidability of a class can be effectively demonstrated by means of an appropriate normal algorithm. Assume that 𝒜 is an alphabet, F ⊆ 𝒜°, 𝒜' is an alphabet including 𝒜 (𝒜 ⊆ 𝒜'), w is a fixed 𝒜'-word, N is a normal algorithm over 𝒜' applicable to all 𝒜-words, and

f ∈ 𝒜° ⟹ (f ∈ F ⟺ N(f) = w).

Obviously, in this case F is a decidable subclass of 𝒜°, and N may be called the deciding algorithm of F.

We give an example of a deciding algorithm. In Sect. 4.3, we met the alphabet of the maximal first-order language,

𝒜MF = {(, ), ′, 0, x, ...},
and a canonical calculus KMF defining the class of formulas, Form, of this language. We construct a normal algorithm NMF which for all f ∈ (𝒜MF)° decides whether f ∈ Form. We need three subsidiary letters 'z', 'c', and 'q'; thus, NMF will be an algorithm over the alphabet 𝔅 = 𝒜MF ∪ {z, c, q}.

NMF:
1. ′′ → ′
2. x′ → x
3. ∀x → ∀z
4. x → c
...
16. π0c → q
17. π → q
18. (c = c) → q
19. ¬q → q
20. (q ⊃ q) → q
21. ∀zq → q
22*. 0 → ·0
Our algorithm is applicable to all 𝔅-words and, hence, to all 𝔄_MF-words. This is assured partly by the last command, which is trivially applicable to every word, and partly by the fact that no command of ours has an output longer than its input; moreover, except for commands 3, 4, 14, and 17, the outputs are shorter than the inputs, and these exceptional commands are applicable only in a finite number of cases. Hence, starting with a word f ∈ (𝔄_MF)⁰, we get sooner or later shorter words.
If the input word f is really a formula then our algorithm will transform it into the letter 'q'. Let us check this statement. Command 1 reduces connected occurrences of ι-s to a single ι. Command 2 erases ι after x, and command 5 replaces the remainders by the subsidiary letter 'c' (thus, indexed variables and names disappear). Occurrences of x after '∀' will be changed to 'z', and its other occurrences to 'c', by commands 3 and 4. The remaining ι-s will be erased by commands 6, 7, and 15. Commands 8 to 13 reduce the arities of name functors and predicates (the final step for predicates is embedded into command 16). Commands 8, 13, and 16 erase the occurrences of 'o'. By commands 8, 10, 12, 13, and 14, the occurrences of name functors are eliminated in favour of 'c'. Commands 16 to 18 transform atomic formulas into 'q', and 19 to 21 eliminate the characters '-', '⊃', '∀', 'z', and the parentheses. Thus, the starting formula is transformed into 'q', and command 22* stops the work.

Now we have to prove the converse of the statement just verified, namely, that if N_MF(f) = q then f must be a formula. For this aim, let us consider 'z' as a variable, 'c' as a name functor of arity 0 (i.e., as a name), and 'q' as a predicate of arity 0 (i.e., as an atomic formula). By this enlargement of the mentioned categories, the notions of term and formula will be enlarged, too. Let us call quasi-formulas the formulas taken in this extended sense. By checking our commands in N_MF, we see that by applying them in the converse direction - i.e., by changing the roles of input and output - we always get quasi-formulas from quasi-formulas. Thus, whenever C is a command of our algorithm and C(g) is a quasi-formula, then so is g. Hence, if N_MF(f) = q then f must be a quasi-formula. However, if f ∈ (𝔄_MF)⁰, it must be a formula. By these, we have: for all f ∈ (𝔄_MF)⁰,

f ∈ Form ⇔ N_MF(f) = q.
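The reduce-to-'q' strategy of N_MF can be replayed on a much smaller language. The sketch below is not the book's N_MF; it decides formulahood for a toy zero-order language (an assumed ASCII encoding: atomic formula 'p', negation '-', conditional written (A>B)). Exactly the words that are formulas collapse to the single letter 'q'.

```python
def run(cmds, w, fuel=10_000):
    """First applicable command, leftmost occurrence; True marks a
    stop command (the book's dot)."""
    while fuel:
        fuel -= 1
        for pat, rep, stop in cmds:
            i = w.find(pat)
            if i != -1:
                w = w[:i] + rep + w[i + len(pat):]
                if stop:
                    return w
                break
        else:
            return w
    raise RuntimeError("no termination")

CMDS = [("p", "q", False),      # atomic formulas become q
        ("-q", "q", False),     # a negated formula is a formula
        ("(q>q)", "q", False),  # a conditional of two formulas is one
        ("", "", True)]         # stop when nothing else applies

def is_formula(w):
    return run(CMDS, w) == "q"
```

As with N_MF, every command output is at most as long as its input, so the run always terminates, and non-formulas get stuck at a terminal word different from 'q'.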
It is easy to modify our algorithm in such a way that non-formulas will be transformed into a fixed word. For this purpose, we need three new subsidiary letters, say, α, β, and γ. Omit command 22* at the end of N_MF and add the following new commands:

22. βb → bβ   [b ∈ 𝔅]
23. αqβ →· q
24. α → γ
25. γb → γ   [b ∈ 𝔅]
26. γβ →· α
27. ∅ → αβ

If none of the commands 1 to 21 is applicable to a word f ∈ 𝔅⁰ then command 27 prefixes it with αβ. Then, by command 22, β goes to the end of the word, and if we get just 'αqβ' then command 23 stops the work. In the contrary case, commands 24, 25, and 26 erase the letters of f, yielding finally α. Thus, if the algorithm consisting of commands 1 to 27 is denoted by N*, we have: if f ∈ (𝔄_MF)⁰ then

N*(f) = q iff f ∈ Form, and
N*(f) = α otherwise.
To give another example of a deciding algorithm, assume that K is a calculus over an alphabet 𝔄 consisting only of input-free rules of form

(1)   → c₀x₁c₁x₂ … x_k c_k   (k ≥ 1)

where x₁, …, x_k are (different) K-variables, and c₀, c₁, …, c_k are 𝔄-words; c₀ and c_k may be empty but the other ones (if any) must not be empty. We let:

F = {f : f ∈ 𝔄⁰ & K ⊢ f}.

We construct a decision algorithm for the class F. In case K consists of a single rule of form (1), the following algorithm N[1] with the subsidiary letters α, α₀, α₁, …, α_k will suffice:

N[1]:
1. α_i c_i → c_i α_{i+1}   [0 ≤ i < k]
2. α_i a → a α_i   [a ∈ 𝔄, 0 < i ≤ k]
3. α_k c_k → c_k α_k
4. c_k α_k →· c_k α
5. α_i →· ∅   [0 ≤ i ≤ k]
6. ∅ → α₀

It is left to the reader to check that N[1](f) = fα if f ∈ F, and N[1](f) = f otherwise.
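Membership in F for a single input-free rule amounts to asking whether f can be cut into c₀x₁c₁ … x_k c_k for some words x_i. In Python this is exactly an anchored pattern match; the sketch below assumes (as the nondeterministic reading of K-variables suggests) that the x_i may take arbitrary words, the empty one included.

```python
import re

def member(word, cs):
    """cs = [c0, c1, ..., ck]: does word have the form
    c0 x1 c1 ... xk ck for some (possibly empty) words xi?"""
    pattern = ".*".join(re.escape(c) for c in cs)
    return re.fullmatch(pattern, word, re.DOTALL) is not None
```

`re.escape` keeps letters of the c_i literal, and `re.fullmatch` anchors the pattern at both ends, mirroring the fact that c₀ must be a prefix and c_k a suffix of f.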
If K contains more than one rule of form (1), we can construct for each of these rules a deciding algorithm. However, we can "unify" these algorithms into a single one. Assume that the number of rules is n > 1. Then their form is as follows:

(1_j)   → c_{j0}x₁c_{j1} … x_{k(j)}c_{j,k(j)}   (1 ≤ j ≤ n, k(j) ≥ 1).

Now we need the subsidiary letters α, β_j, and α_{ji} for 1 ≤ j ≤ n and 0 ≤ i ≤ k(j). Our deciding algorithm is as follows:

N[1,n]:
1. α_{ji} c_{ji} → c_{ji} α_{j,i+1}   [1 ≤ j ≤ n, 0 ≤ i < k(j)]
2. α_{ji} a → a α_{ji}   [1 ≤ j ≤ n, 0 < i ≤ k(j), a ∈ 𝔄]
3. α_{j,k(j)} c_{j,k(j)} → c_{j,k(j)} α_{j,k(j)}   [1 ≤ j ≤ n]
4. c_{j,k(j)} α_{j,k(j)} →· c_{j,k(j)} α   [1 ≤ j ≤ n]
5. α_{ji} → β_j   [1 ≤ j < n, 0 ≤ i ≤ k(j)]
6. aβ_j → β_j a   [1 ≤ j < n, a ∈ 𝔄]
7. β_j → α_{j+1,0}   [1 ≤ j < n]
8. α_{ni} →· ∅   [0 ≤ i ≤ k(n)]
9. ∅ → α_{1,0}

Again: N[1,n](f) = fα if f ∈ F, otherwise N[1,n](f) = f.
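The unified algorithm accepts exactly when at least one of the n rule patterns fits the word, which the following sketch mirrors directly (under the same assumption as before, that K-variables may take arbitrary words):

```python
import re

def member_any(word, rules):
    """rules: one list [c_j0, ..., c_jk(j)] per rule of form (1_j);
    accept iff some rule's anchored pattern matches the word."""
    return any(
        re.fullmatch(".*".join(map(re.escape, cs)), word, re.DOTALL)
        for cs in rules
    )
```

Trying the rules one after the other, as `any` does, is the content of commands 5 to 7 of N[1,n]: on failure of rule j, the scan restarts from the front with rule j+1.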
5.4 Definite Classes

5.4.1. DEFINITION. Let 𝔄 be an alphabet and F ⊆ 𝔄⁰. We say that F is a definite subclass of 𝔄⁰ iff there exist 𝔅, w, and N such that
(i) 𝔅 is an alphabet, 𝔄 ⊆ 𝔅, and w ∈ 𝔅⁰;
(ii) N is a normal algorithm over 𝔅 applicable to all 𝔄-words; and
(iii) f ∈ 𝔄⁰ ⇒ (f ∈ F ⇔ N(f) = w). •
By our definition, definite classes are always decidable ones. What about the converse of this statement? If we have some decision procedure for a class of words, does there exist a normal deciding algorithm for that class? The positive answer is a consequence of the following statement:

5.4.2. MARKOV'S THESIS. If 𝔄 is any alphabet and M is any algorithm over 𝔄 then there is a normal algorithm N such that M and N are equivalent with respect to 𝔄 (cf. Def. 5.1.1).

The content of this thesis was foreshadowed already in Sect. 5.1, where it was argued that no rigorous proof is possible for this statement.
Remark. It is proved that Markov's Thesis is just as strong as Church's Thesis, according to which every effectively calculable function (in number theory) is a general recursive one. (See KLEENE 1952, § 60; MENDELSON 1964, Ch. 5.)

Our next theorem prepares the way for proving that all definite classes are inductive ones.

5.4.3. THEOREM. Assume that N is a normal algorithm over an alphabet 𝔅, 𝔄 ⊆ 𝔅, and N is applicable to all 𝔄-words. Then there are C, K, and μ such that
(i) C is an alphabet, 𝔅 ⊆ C, μ ∈ C-𝔅;
(ii) K is a canonical calculus over C; and
(iii) for all x ∈ 𝔄⁰ and y ∈ 𝔅⁰,

N(x) = y ⇔ K ⊢ xμy.
Property (iii) can be expressed succinctly as follows: Calculus K represents algorithm N. Thus, our theorem can be re-phrased as: If a normal algorithm is applicable to all words of an alphabet then it is representable by a canonical calculus.

Proof. Let the commands of N be C₁, …, C_n (in this order). Firstly, we construct a subcalculus for each command.
In defining the subcalculus K_i connected to command C_i (1 ≤ i ≤ n), we have to distinguish two cases.
(a) C_i is of form "u_i → v_i" or "u_i →· v_i" where u_i ≠ ∅. Assume u_i = b₁ … b_k, where b₁, …, b_k ∈ 𝔅 and k ≥ 1. We need in K_i the subsidiary letters Δ^i, Δ_{i0}, Δ_{i1}, …, Δ_{ik}, Δ_{i,k+1} and the variables x and y. Then:

K_i:
i1. x → Δ_{i1}x
i2. xΔ_{i1}by → xbΔ_{i1}y   [b ∈ 𝔅-{b₁}]
i3. xΔ_{ij}by → xΔ_{i1}by   [b ∈ 𝔅-{b_j}, 1 ≤ j ≤ k]
i4. xΔ_{ij}b_j y → xb_j Δ_{i,j+1}y   [1 ≤ j ≤ k]
i5. xΔ_{ij} → Δ_{i0}x   [1 ≤ j ≤ k]
i6. xu_iΔ_{i,k+1}y → xu_i yΔ^i xv_i y

Comment. We check whether C_i is applicable to the word x ∈ 𝔅⁰. We start with rule i1 and then apply i2 or i4 (putting '∅' for x in the first case) to find the letter b₁. In case b₁ is not followed by b₂ (or, in general, b_j is not followed by b_{j+1}), we are looking for b₁ again, by rule i3. In the negative case, we get "Δ_{i0}x" by rule i5 (which expresses the fact that C_i is not applicable to x). In the positive case, we get by rule i6 "xΔ^i y" where y = C_i(x).
(b) C_i is of form "∅ → v_i" or "∅ →· v_i". Then K_i consists of a single rule:

i1. x → xΔ^i v_i x.

Our planned calculus will be the union of the subcalculi K₁, …, K_n and a subcalculus K₀ defined below. The subsidiary letters of K₀ are Δ_{i0}, Δ^i (1 ≤ i ≤ n), M, and μ. A new variable z (beside x and y) will be used, too.
The application of our calculus to a word x ∈ 𝔄⁰ runs, intuitively, as follows. We try to apply K₁. If it is successful, i.e., if we get "xΔ¹y", we use the following rule:

(i) xΔ¹y → xMy   if C₁ is not a stop command,
(ii) xΔ¹y → xμy   if C₁ is a stop command.

In case (ii), we are ready (we get "xμy" by a detachment). In case (i), we can repeat our procedure with y instead of x. If the application of K₁ to x is unsuccessful, i.e., if we get "Δ_{10}x", we turn to K₂ with x. This step is regulated by the following rule:

Δ_{10}x → xΔ²y → xMy   if C₂ is a non-stop command,
Δ_{10}x → xΔ²y → xμy   if C₂ is a stop command,

provided K₂ is successfully applicable to x. In the contrary case, we must go to K₃. If we have good luck then we apply the rule

Δ_{10}x → Δ_{20}x → xΔ³y → xZy

where Z is μ or M, according as C₃ is or is not a stop command. - If C₃ is not applicable to x, we turn to K₄, and so on. Now we define the calculus K₀ by the rules 1 to n+2 below. For all i from 1 to n, the letter Z in rule i is to be substituted by μ or M, according as C_i is or is not a stop command.

K₀:
1. xΔ¹y → xZy
2. Δ_{10}x → xΔ²y → xZy
3. Δ_{10}x → Δ_{20}x → xΔ³y → xZy
…
n. Δ_{10}x → Δ_{20}x → … → Δ_{n-1,0}x → xΔⁿy → xZy
n+1. xMy → yMz → xMz
n+2. xMy → yμz → xμz

Finally, we let K = K₀ ∪ K₁ ∪ … ∪ K_n. •
5.4.4. THEOREM. If 𝔄 is an alphabet and F is a definite subclass of 𝔄⁰ then F is an inductive subclass of 𝔄⁰. - In short: Every definite class of words is an inductive one.
Proof. Assume F is a definite subclass of 𝔄⁰. Then, by Def. 5.4.1, there are 𝔅, w, and N such that 𝔄 ⊆ 𝔅, w ∈ 𝔅⁰, N is a normal algorithm over 𝔅 applicable to all 𝔄-words, and

(1) f ∈ F ⇔ N(f) = w.

By our preceding theorem, there are C, μ, and K such that 𝔅 ⊆ C, μ ∈ C-𝔅, and K is a calculus over C representing N, i.e.,

f ∈ 𝔄⁰ ⇒ (N(f) = g ⇔ K ⊢ fμg).

Thus, in case g = w:

(2) f ∈ 𝔄⁰ ⇒ (N(f) = w ⇔ K ⊢ fμw).

Let us add to K the rule xμw → x. In this extended calculus K′ we have obviously:

(3) f ∈ 𝔄⁰ ⇒ (K ⊢ fμw ⇔ K′ ⊢ f).

By (1), (2), and (3), we get:

f ∈ 𝔄⁰ ⇒ (f ∈ F ⇔ K′ ⊢ f),

which means that F is, in fact, an inductive subclass of 𝔄⁰. •

COROLLARY 1. If F is a definite subclass of 𝔄⁰ then so is 𝔄⁰-F; hence, both F and 𝔄⁰-F are inductive subclasses of 𝔄⁰.
Proof. An algorithm deciding F can easily be modified into one deciding 𝔄⁰-F; see the example in Sect. 5.3 about the modification of N_MF into N*.

COROLLARY 2.
For any alphabet 𝔄, F is a definite subclass of 𝔄⁰ iff both F and 𝔄⁰-F are inductive subclasses of 𝔄⁰.
Proof. If F is definite then, by Corollary 1, both F and 𝔄⁰-F are inductive. Now, going to the converse, assume that both F and 𝔄⁰-F are inductive. Then, by Th. 4.5.3, F is decidable, i.e., there exists a procedure - an algorithm - for deciding membership in F. According to Markov's Thesis, this procedure can be replaced by a normal algorithm. Hence, F is definite. (Note that this was the first case in which we had to exploit Markov's Thesis.)

5.4.5. THEOREM. For every alphabet 𝔄 there exists a class F ⊆ 𝔄⁰ such that F is an inductive but not a definite subclass of 𝔄⁰. In short: There exist inductive but not definite classes of words.
Proof. By Corollary 2 of the preceding theorem, it is sufficient to show the existence of a class F ⊆ 𝔄⁰ such that F is inductive but 𝔄⁰-F is not. This was shown in Corollary 2 of Th. 4.4.4, where the class Aut (definable in any alphabet) was used in the role of F.
Final comments. For a grammatical category of a formal language, it seems to be an indispensable criterion that it be a definite class. This holds especially for the category of declarative sentences. In formal languages, the expressions representing sentences are called, in most cases, formulas. (Cf. 𝔄_PL and 𝔄_MF in Sect. 4.3.) Thus, the class of formulas - in a formal language - is to be a definite class.

Now we are in the position to present a full system of logic - by purely syntactic means. (Up to this point, we have no semantic means.) This will have the form of a logical calculus. It will involve an inductive definition of the (syntactic) consequence relation holding between a class of formulas Γ and a formula A (expressible by "A is a consequence of Γ"). This will be the subject matter of the next chapter.
Chapter 6
THE FIRST-ORDER CALCULUS (QC)

6.1 What is a Logical Calculus?

A logical calculus consists of a description of the common grammar of a family of languages and an inductive definition of the syntactic consequence relation. The adjective 'syntactic' is important here, for, in most cases, there is a possibility to define a semantic consequence relation (in which case one speaks of a semantic logical system rather than of a logical calculus).
The usual notation of a syntactic consequence relation is of form "Γ ⊢ A" where Γ is a class of formulas (of a certain language) and A is a formula. (The sign '⊢' is sometimes supplied with a subscript referring to the calculus, e.g., '⊢_QC'.) It is usually read as "A is deducible from Γ", where Γ may be called the class of premises. (As we shall see later on, this relation is similar to the derivability relation "K ⊢ f" used in the canonical calculi.)
The base of the inductive definition of the relation "Γ ⊢ A" may include the definition of a class of formulas deducible from the empty class of premises. The formulas of this class may be called basic formulas of the calculus. In most cases, they are exhibited by means of schemata (called basic schemata). Furthermore, it is postulated that "Γ ⊢ A" holds whenever A ∈ Γ or A is a basic formula.
The inductive rules in the definition of "Γ ⊢ A" are the usual ones: they tell us how, from deducibility relations already obtained, we can get "Γ′ ⊢ A′". They are called rules of deduction or simply proof rules.
In general, the definition of "Γ ⊢ A" is divided into two parts: (a) the definition of basic formulas and (b) the definition of rules of deduction (including the rule that if A is a basic formula or a member of Γ then "Γ ⊢ A" holds).
There are different possibilities of defining "Γ ⊢ A" with the same outcome. To be more exact, let us assume that for a given family of languages, we have two different definitions of the deducibility relation, say, '⊢₁' and '⊢₂'. We say that these two relations are equivalent iff for all languages of the family, "Γ ⊢₁ A" holds iff "Γ ⊢₂ A" holds. In such a case, we have, practically, different styles of formulation of the same relation, and different styles of construction or presentation of the same logical calculus.
Now, the literature of modern logic presents several different styles of formulating the same logical calculus. A possible variation lies in choosing the basic formulas and the rules of deduction. In general, the fewer basic schemata we choose, the more rules we need. As a limiting case, the class of basic formulas may be empty (this is the case in the systems of natural deduction). Another style is dominant in the systems of sequent calculus. (For the origin, see GENTZEN 1934.) Our formulation of the classical first-order calculus QC follows the style introduced by Gottlob Frege (FREGE 1879). We shall define a class of basic formulas and a single rule of deduction.
Remark. In the literature of mathematical logic, the Fregean style (of formulating a logical system) is usually called Hilbert-style, forgetting the fact that it was Frege who invented the first logical calculus and formulated it just in this style.
6.2 First-Order Languages

We met a language belonging to the family of (classical) first-order languages in Sect. 4.3.2. This language was qualified as the maximal one among first-order languages for the reason that it contains an infinite supply of name functors and predicates for all arities (i.e., for all numbers of argument places). Other first-order languages - which are very useful in formulating exact theories - may contain fewer name functors and predicates, perhaps only a finite number of them. This is the reason for giving a general definition of first-order languages.
Of course, every first-order language is to be based on a certain alphabet 𝔄. We shall not determine in advance the letters of this alphabet, for it depends on the richness of the language. However, we shall give hints in the course of the following definition with respect to the letters of 𝔄.

6.2.1. DEFINITION. By a (classical) first-order language L¹ let us mean a five-component system of form

L¹ = (Log, Var, Con, Term, Form)

where the components satisfy the following conditions (i) to (vii).
(i) For some alphabet 𝔄, Log, Var, Con, Term, and Form are definite subclasses of 𝔄⁰.
(ii) Log, Var, and Con are pairwise disjoint classes.
(iii) Log is the class of logical constants of L¹, namely:

Log = {(, ), -, ⊃, =, ∀};

the members of Log are called (in their order) left and right parentheses, the signs of negation, conditional, and identity, and the universal quantifier. - We can assume that the members of Log are letters of 𝔄.
(iv) Var is the class of variables of L¹, containing an infinite supply of 𝔄-words. - We can assume that 𝔄 involves two letters, x and ι, and that the variables are words of form "xi" where i is a {ι}-word.
(v) Con is the class of (non-logical) constants of L¹. In general:

Con = N ∪ P

where N is the class of name functors and P is the class of predicates of L¹; N ∩ P = ∅. We assume that to every member of Con, an {o}-word is associated as its arity. If Con is finite, we can assume that each member of Con is a letter of 𝔄; in this case, the arities need not belong to L¹ (it is sufficient to refer to arities in the metalanguage of L¹). In the contrary case, we can assume, similarly as in Sect. 4.3.2, that our constants are 𝔄-words formed from some initial letters followed by arity words and (if necessary) indices ({ι}-words).
(vi) Term is the class of terms of L¹. In its inductive definition, we need the auxiliary categories T(a) for all arities a. Members of T(a) may be called a-tuples of terms. The simultaneous inductive definition of the T(a)-s and Term is as follows:
1. Var ⊆ Term.
2. T(∅) = {∅}.
3. (s ∈ T(a) & t ∈ Term) ⇒ "s(t)" ∈ T(ao).
4. (φ ∈ N & φ is of arity a & s ∈ T(a)) ⇒ "φs" ∈ Term.
(Note that if φ is of arity ∅ then "φ" ∈ Term, by T(∅) = {∅}.)
(vii) Form is the class of formulas of L¹. Its inductive definition is as follows:
1. (π ∈ P & π is of arity a & s ∈ T(a)) ⇒ "πs" ∈ Form.
(In case a = ∅, π ∈ Form - a formula representing an unanalyzed sentence.)
2. s, t ∈ Term ⇒ "(s = t)" ∈ Form.
3. A ∈ Form ⇒ "-A" ∈ Form.
4. A, B ∈ Form ⇒ "(A ⊃ B)" ∈ Form.
5. (A ∈ Form & x ∈ Var) ⇒ "∀xA" ∈ Form.
Formulas introduced by rules 1 and 2 are called atomic formulas. •

6.2.2. Comments. 1. In textbooks of logic, we often find - instead of our prescriptions 2, 3, and 4 in (vi) - the definition: »If φ is a name functor of arity n, and t₁, …, t_n are terms, then "φ(t₁, …, t_n)" is a term.«
2. According to item 3 of (vi), the arguments of a functor must be surrounded by parentheses. This might be necessary to avoid ambiguities. However, if the grammar of L¹ guarantees that a functor and its arguments are unambiguously recognized by their grammatical form, then these parentheses can be omitted. This was the case in the maximal first-order language (cf. 4.3.2).
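The inductive clauses 3 to 5 of (vii) translate directly into a recursive recognizer. The sketch below uses an illustrative ASCII encoding that is an assumption of the example, not the book's alphabet: propositional letters P and Q stand for atomic formulas, '-' for negation, '>' for the conditional, and 'A' followed by a variable letter for the universal quantifier.

```python
def wff(w, atoms=("P", "Q")):
    """Recognize formulas built by rules 3-5 of (vii) over toy atoms."""
    if w in atoms:
        return True
    if w.startswith("-"):                      # rule 3: -A
        return wff(w[1:], atoms)
    if len(w) >= 3 and w[0] == "A" and w[1] in "xyz":
        return wff(w[2:], atoms)               # rule 5: AxA
    if w.startswith("(") and w.endswith(")"):  # rule 4: (A>B)
        depth = 0
        for i, ch in enumerate(w):
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
            elif ch == ">" and depth == 1:
                return wff(w[1:i], atoms) and wff(w[i + 1:-1], atoms)
    return False
```

The depth counter locates the principal conditional sign, which is why rule 4's surrounding parentheses matter: they make the split point unambiguous, illustrating comment 2 above.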
3. The classes Log, Var, Con, Term, and Form can vary in the different first-order languages. Hence, in a more exact notation, we should write

(1) Log(L¹), Var(L¹), Con(L¹), Term(L¹), and Form(L¹).

However, the omission of the reference to L¹ would cause confusion only in cases where we are dealing simultaneously with more than one (concrete) language. Thus, in the usual cases, we do not apply the notation in (1). - Moreover, we can assume that Log and Var are the same in all first-order languages. The carrier of the variability is the class Con. Note that Con may be empty; in this case Term = Var, and all formulas are built up from atomic ones of form "(x = y)" where x, y ∈ Var.
4. The intuitive meaning of the logical constants and the categories of L¹ is the same as given in Sect. 4.3.2 for the maximal first-order language. Comparing with Sections 2.1, 2.2, and 2.3, we see that the grammatical and logical means of metalogic are almost totally included in first-order languages. Exceptions are the sentence functors conjunction, alternation, and biconditional, and the existential quantifier. However, these missing operations can be introduced in first-order languages via contextual definitions as follows:

(A & B) =df -(A ⊃ -B)   [Conjunction.]
(A ∨ B) =df (-A ⊃ B)   [Alternation.]
(A ≡ B) =df ((A ⊃ B) & (B ⊃ A))   [Biconditional.]
∃xA =df -∀x-A   [Existential quantification.]
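Read as rewriting instructions on strings, the contextual definitions can be sketched as follows (an illustrative ASCII encoding, assumed for this example only: '>' for ⊃, '-' for negation, 'A' and 'E' for the quantifier signs):

```python
def conj(a, b):    # (A & B) =df -(A > -B)
    return f"-({a}>-{b})"

def disj(a, b):    # (A v B) =df (-A > B)
    return f"(-{a}>{b})"

def bicond(a, b):  # (A == B) =df ((A > B) & (B > A))
    return conj(f"({a}>{b})", f"({b}>{a})")

def exists(x, a):  # ExA =df -Ax-A
    return f"-A{x}-{a}"
```

Note that `bicond` is itself expanded through `conj`, mirroring the fact that the biconditional's definiens uses the already-defined conjunction.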
6.2.3. DEFINITION. Let L¹ = (Log, Var, Con, Term, Form) be as in the preceding definition. We introduce some grammatical relations in first-order languages.
(a) We say that B is a subformula of A iff A, B ∈ Form, and A is of form "uBv" where u, v ∈ 𝔄⁰ (any of them may be ∅).
Remark. An inductive definition of this relation may run as follows: (i) If A, B ∈ Form and x ∈ Var, then A is a subformula of A, "-A", "(A ⊃ B)", "(B ⊃ A)", and "∀xA". (ii) If A, B, C ∈ Form, A is a subformula of B, and B is a subformula of C, then A is a subformula of C.
(b) If x ∈ Var and A ∈ Form, an occurrence of x in A is said to be a bound occurrence of x in A iff it lies in a subformula of A having the form "∀xB". Occurrences of x in A which are not bound ones in A will be called free occurrences of x in A.
Remark. An inductive definition of these relations may begin by stating that every occurrence of x is a free one in an atomic formula, and is a bound one in "∀xA". The continuation is left to the reader.
(c) A term is said to be open if it involves a variable, and it is said to be closed in the contrary case.
(d) A formula is said to be open iff it involves some (at least one) free occurrence of a variable, and it is said to be closed iff it is not open. By the free variables of an open formula let us mean the variables having some free occurrences in it.
(e) We say that a formula A is free from the variable x iff A involves no free occurrences of x. A class of formulas Γ is said to be free from the variable x iff for all A ∈ Γ, A is free from x.
(f) Where A ∈ Form and x, y ∈ Var, we say that y is substitutable for x in A iff whenever "∀yB" is a subformula of A, then B is free from x. A term t is said to be substitutable for x in A iff every variable occurring in t (if any) is substitutable for x in A. (Some special cases: (i) x is always substitutable for x in A. (ii) If A is free from x then any term t is substitutable for x in A. (iii) If t is a closed term then t is always substitutable for x in A.)
(g) Assume that the term t is substitutable for the variable x in the formula A. Then, the metalanguage expression

(2) [A]^{t/x}

denotes the formula obtained from A via replacing all free occurrences of x in A (if any) by t. The square brackets can be omitted in this notation if A is represented by a single variable. Note that the use of the notation in (2) always presupposes that t is substitutable for x in A. (If A is free from x, then A^{t/x} = A. If t is closed, then A^{t/x} is always well-defined (i.e., it "exists").) •

6.2.4. Notation conventions. In connection with first-order languages, we shall use the metavariables A, B, C referring to formulas, x, y, z to variables, and s, t to terms. Outermost parentheses surrounding formulas will sometimes be omitted. We write "(A ⊃ B ⊃ C)" instead of "(A ⊃ (B ⊃ C))".
6.3 The Calculus QC

We shall denote by 'QC' (Quantification Calculus) the version of classical first-order calculus explained in the following two definitions.

6.3.1. DEFINITION: Basic formulas. Given a first-order language L¹, the class of its basic formulas BF is determined by the following two stipulations:
(i) If a formula has the form of one of the basic schemata (B1) to (B8) below then it is a basic formula.
(B1) (A ⊃ (B ⊃ A))
(B2) ((A ⊃ (B ⊃ C)) ⊃ ((A ⊃ B) ⊃ (A ⊃ C)))
(B3) ((-B ⊃ -A) ⊃ (A ⊃ B))
(B4) (∀xA ⊃ A^{t/x})
(B5) (∀x(A ⊃ B) ⊃ (∀xA ⊃ ∀xB))
(B6) (A ⊃ ∀xA)   provided A is free from x
(B7) (x = x)
(B8) ((x = y) ⊃ (A^{x/z} ⊃ A^{y/z}))

To get basic formulas from these schemata, A, B, C are to be substituted by formulas, x, y, z by variables, and t by terms of L¹.
(ii) If A ∈ BF and x ∈ Var then "∀xA" ∈ BF. •
Remark. It can be proved that BF is always a definite subclass of Form (and, even, of 𝔄⁰).
6.3.2. DEFINITION: Deducibility. Given L¹, Γ ⊆ Form, and A ∈ Form, we define by induction the relation "A is deducible from Γ" - in symbols: "Γ ⊢ A" - as follows:
(i) If A ∈ Γ ∪ BF then Γ ⊢ A.
(ii) If Γ ⊢ (A ⊃ B) and Γ ⊢ A then Γ ⊢ B.
In case ∅ ⊢ A we say that formula A is provable (in QC), and we write briefly "⊢ A". Rule (ii) is called modus ponens (MP) or sometimes the rule of detachment. •
Remark. In most textbooks of logic, our basic schemata are called axiom schemata, and our basic formulas axioms (of QC). This seems to be a wrong usage of the term axiom. For, in the generally accepted sense of the word, axioms are basic postulates of a scientific theory from which all theorems of the theory follow by means of logic. Are, then, the basic formulas axioms from which all theorems of QC (or what else) follow by means of logic? (By which logic?) Even the question is a confused one. The most we can say is that all provable formulas of QC follow from the basic formulas via applications of modus ponens. Do we identify the class of provable formulas with the theorems of QC? The latter notion is undefined; but the central notion in QC is the deducibility relation rather than provability. It is hard to find an acceptable reasoning in defence of the mentioned use of 'axiom'.
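Definition 6.3.2 makes a deduction checkable by a machine: every line must be a premise, a basic formula, or come from two earlier lines by detachment. A minimal sketch, with formulas as plain strings and '(A>B)' as an assumed encoding of the conditional:

```python
def check_deduction(premises, basic, proof):
    """Verify a Frege/Hilbert-style deduction line by line."""
    seen = []
    for line in proof:
        if line in premises or line in basic:
            seen.append(line)
        elif any(f"({a}>{line})" in seen for a in seen):
            seen.append(line)                  # modus ponens (rule (ii))
        else:
            return False
    return True
```

The checker only certifies a given deduction; finding one is the genuinely hard task, which is why the deducibility relation itself is defined inductively rather than by a search procedure.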
6.3.3. An intuitive justification of QC. We gave an intuitive interpretation of the sentence functors negation and conditional (denoted in first-order languages by '-' and '⊃', respectively) in Sect. 2.3, by referring to truth conditions. In Sect. 2.2, the meaning of the universal quantification was clarified as well. Now let us imagine a nonempty domain D of individual objects and assume that the members of Var (and Term) refer to members of D, so that "∀xA" says: "for all members x in D, A holds". Then one can check easily that - according to this intuitive interpretation - any formula of forms (B1) to (B8) is always true (is a logical truth), even independently of the choice of the domain D. (However, if we are dealing with more than one formula, we must assume the same D with respect to all formulas being used.) Of course, in cases (B7) and (B8), we must exploit the meaning of identity as well. In addition, if A is always true then so is "∀xA". Hence, the members of BF are logical truths. Furthermore, we see that the rule modus ponens leads to a true formula from true ones. Hence, if "Γ ⊢ A" holds in QC, and the members of Γ represent true sentences (with respect to a fixed domain D), then the formula A represents a true sentence (with respect to D). These considerations show that QC is really a logical calculus, a syntactic formulation of the consequence relation. We can use it with confidence in our reasoning.

6.3.4. THE CLASSICAL PROPOSITIONAL CALCULUS (PC). By a zero-order (or pure propositional) language L⁰ let us mean a three-component system (based on a certain alphabet)

(Log₀, At₀, Form₀)

where Log₀ = {(, ), -, ⊃}, At₀ is a nonempty class of words called atomic formulas, and Form₀ is defined by the two stipulations: (i) At₀ ⊆ Form₀, and (ii) (A, B ∈ Form₀) ⇒ ("-A", "(A ⊃ B)" ∈ Form₀). - We met such a language in 4.3.1.
A logical calculus for zero-order languages is the (classical) propositional calculus, PC. It can be presented as a fragment of QC, based on the schemata (B1), (B2), (B3) and the rule modus ponens. (Of course, the prescription (A ∈ BF ⇒ "∀xA" ∈ BF) is to be omitted here.) - In 4.3.1, the canonical calculus K_PL defines just the logical truths of PC.
PC is, in itself, a very weak system of logic. However, it is interesting as a fragment of QC. For, any first-order language has a zero-order fragment if we define At₀ as containing all the first-order atomic formulas and all formulas of form "∀xA". Then all laws of PC are laws of QC as well. In proving PC-laws, we use only the basic schemata (B1), (B2), (B3), and the rule MP. - This strategy will be applied in the next section.
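The claim in 6.3.3 that every instance of (B1) to (B3) is true under every assignment of truth values can be checked mechanically for the propositional schemata. A brute-force sketch:

```python
from itertools import product

def imp(a, b):
    """Truth condition of the conditional: false only when a and not b."""
    return (not a) or b

def tautology(f):
    """Check a Boolean function under every truth assignment."""
    n = f.__code__.co_argcount
    return all(f(*v) for v in product((False, True), repeat=n))

b1 = lambda a, b: imp(a, imp(b, a))
b2 = lambda a, b, c: imp(imp(a, imp(b, c)), imp(imp(a, b), imp(a, c)))
b3 = lambda a, b: imp(imp(not b, not a), imp(a, b))
```

Since modus ponens preserves truth under any fixed assignment, this check extends to every formula provable from (B1) to (B3), which is the propositional half of the soundness argument sketched in 6.3.3.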
6.4 Metatheorems on QC

It was noted already in Sect. 1.1 that the author assumes that the reader has some knowledge of classical first-order logic. Up to this point, this assumption was not formally exploited. In what follows, we shall give a list of metatheorems on QC without proofs, assuming that the reader is able to check (at least) the correctness of these statements. The particular style of presentation of QC given in this chapter, and the metatheorems listed below, will be essentially exploited in the remaining chapters of this essay. Hence, the reader's assumed familiarity with first-order logic does not make the explanations of the present chapter superfluous.
6.4.1. Metatheorems on PC. The first group of our theorems is based on the basic schemata (B1), (B2), and (B3) as well as the rule MP. Then, referring to 6.3.4, these laws may be called PC-laws. Thus, they will be numbered as PC.1, PC.2, and so on. Some of these will get a particular code-word, too. - In the notation, Γ and Γ′ refer to classes of formulas, and A, B, C to formulas.

PC.1. (Γ ⊢ A, Γ ⊆ Γ′) ⇒ Γ′ ⊢ A.
PC.2. ({A, A ⊃ B} ⊆ Γ) ⇒ Γ ⊢ B.
PC.3. (Γ ⊢ A ⊃ C) ⇒ (Γ ∪ {A} ⊢ C).
PC.4. Γ ⊢ A ⊃ A.
PC.5. (DT) (Γ ∪ {A} ⊢ C) ⇒ (Γ ⊢ A ⊃ C). - Deduction Theorem. The converse of PC.3.
PC.6. (Cut.) (Γ ⊢ A, Γ′ ∪ {A} ⊢ B) ⇒ (Γ ∪ Γ′ ⊢ B).
PC.7. (Γ ∪ {-A} ⊢ -B) ⇒ (Γ ∪ {B} ⊢ A).
PC.8. {--A} ⊢ A, and A ⊢ --A.
PC.9. (Co.po.) (Γ ∪ {B} ⊢ A) ⇒ (Γ ∪ {-A} ⊢ -B). (The law of contraposition.)
PC.10. {A, -A} ⊢ B.
PC.11. {A, -B} ⊢ -(A ⊃ B).
PC.12. {-A ⊃ A} ⊢ A, and A ⊢ -A ⊃ A.
PC.13. -A ⊢ A ⊃ B, and B ⊢ A ⊃ B.
PC.14. (Γ ∪ {A} ⊢ B, and Γ ∪ {-A} ⊢ B) ⇒ Γ ⊢ B.
6.4.2. Laws of quantification. For the proofs of the following laws, one needs the basic schemata (B1) to (B6). - In the notation, x and y refer to variables.

QC.1. (UG) If Γ ⊢ A, and Γ is free from the variable x, then Γ ⊢ ∀xA. - Especially: ⊢ A ⇒ ⊢ ∀xA. (Universal generalization.)
QC.2. If y is substitutable for x in A, and A is free from y, then ∀xA ⊢ ∀yA^{y/x} and ∀yA^{y/x} ⊢ ∀xA. (Re-naming of bound variables.)
QC.3. If the name t (i.e., a name functor of arity ∅) occurs neither in A nor in the members of Γ, and Γ ⊢ A^{t/x}, then Γ ⊢ ∀xA.
QC.4. ∀x∀yA ⊢ ∀y∀xA.
QC.5. If Q is a string of quantifiers "∀x₁∀x₂ … ∀x_n" (n ≥ 1) then {Q(A ⊃ B), QA} ⊢ QB. (A generalization of (B5).)

6.4.3. Laws of identity. Now we shall use the full list of our basic schemata (B1) to (B8). - In the notation, s, s′, and t refer to terms.

QC.6. ⊢ (t = t).
QC.7. {(s = t), A^{s/z}} ⊢ A^{t/z}.
QC.8. {(s = t)} ⊢ (t = s).
QC.9. {(s = s′), (s′ = t)} ⊢ (s = t).
6.4.4. DEFINITION. Let A be an open formula, and let x₁, …, x_n be an enumeration of all variables having free occurrences in A (say, in order of their first occurrences in A). Then, by the universal closure of A let us mean the formula "∀x₁ … ∀x_nA". - According to QC.4, the order of the quantifiers is unessential here.
6.5 Consistency. First-Order Theories

6.5.1. DEFINITION. Given a logical calculus Σ and a class of formulas Γ, we shall denote by "Cns_Σ(Γ)" the class of formulas deducible from Γ, i.e.,

Cns_Σ(Γ) = {A : Γ ⊢ A}.

We shall be interested in the cases where Σ is PC or QC. Obviously: Cns_PC(Γ) ⊆ Cns_QC(Γ).
We say that Γ is Σ-inconsistent iff Cns_Σ(Γ) = Form, i.e., iff every formula is deducible from Γ. Finally, Γ is said to be Σ-consistent iff it is not Σ-inconsistent. •
Clearly, if Γ is PC-inconsistent then it is QC-inconsistent, too. Or, by contraposition, if Γ is QC-consistent then it is PC-consistent as well. We know from PC.10 that a class of form {A, -A} is PC-inconsistent. We could prove now that the empty class (or, what is the same, the class BF) is QC-consistent, but we shall get this result as a corollary later on.

6.5.2. THEOREM. Γ ∪ {A} is PC-inconsistent iff Γ ⊢ -A. - The proof is left to the reader: use PC.12, PC.10, DT, and Cut.
6.5.3. THEOREM. If "A ⊃ B" ∈ Γ, and Γ is QC-consistent, then at least one of the classes Γ ∪ {-A}, Γ ∪ {B} is QC-consistent.
Proof (sketchily): Assume, indirectly, that both of the mentioned classes are inconsistent. Then, by the preceding theorem and PC.11, we have that

(i) Γ ⊢ A,   (ii) Γ ⊢ -B,   (iii) {A, -B} ⊢ -(A ⊃ B).

From (i) and (iii) we get Γ ∪ {-B} ⊢ -(A ⊃ B) by Cut. This and (ii) give - again by Cut - Γ ⊢ -(A ⊃ B), contradicting the assumption of the theorem.
6.5.4. THEOREM. If "-∀xA" ∈ Γ, Γ is QC-consistent, and the name t occurs neither in A nor in the members of Γ, then Γ ∪ {-A^{t/x}} is QC-consistent.
Proof (indirectly). If Γ ∪ {-A^{t/x}} is inconsistent then Γ ⊢ A^{t/x} (by Th. 6.5.2), and so Γ ⊢ ∀xA (by QC.3), contradicting the assumption of the theorem.
1
6.5.5. DEFINITION. The pair T = (L¹, Γ) is said to be a first-order theory iff L¹ is a first-order language and Γ is a class of closed formulas of L¹. The members of Γ are said to be the postulates (or axioms) of the theory T, and the members of Cns_QC(Γ) are called the theorems of T. The theory T is said to be inconsistent iff Γ is QC-inconsistent, and it is said to be consistent in the contrary case. •

In the limiting case Γ = ∅, the theorems of the theory are the logical truths expressible in the language L¹. According to our intuitive interpretation given in 6.3.3, we believe that such a theory is a consistent one. (For a more convincing proof, we need some patience in waiting.) In the next chapter, we shall introduce a first-order theory that will lead us to a very important metatheorem on QC. In addition, we shall have an opportunity to show the application of a canonical calculus in defining a first-order theory.
Chapter 7
THE FORMAL THEORY OF CANONICAL CALCULI (CC*)

7.1 Approaching Intuitively

Our aim in this chapter is to reconstruct the content of the hypercalculus H3 (see 4.4.3) in the frame of a first-order theory. We shall call this theory CC*. (Here the star '*' refers to the fact that this is an enlarged theory of canonical calculi. The restricted theory of canonical calculi would be based on H2 instead of H3. We shall meet this in Ch. 8.)

The kernel of this reconstruction consists in transforming the rules of H3 into (closed) first-order formulas which will serve as postulates of CC*. The transformation procedure will be regulated by the following stipulations (i) to (viii).
(i) The subsidiary letters of H3 are to be considered as predicates of the first-order language L1* to be defined.
(ii) The variables of H3 are to be replaced by first-order variables.
(iii) The letters of the alphabet A_cc = {α, β, γ, ≺, *} are to be considered as names (i.e., name functors of arity 0) of L1*.
(iv) A_cc-words are to be considered as closed terms of L1*. Hence, we would need a dyadic name functor in L1* to express concatenation. However, we shall follow the practice used in metalogic instead, expressing concatenation by simple juxtaposition. To do so, we formulate an unusual rule for terms as follows: if s and t are terms, "st" is a term.
(v) As the subsidiary letters are considered as predicates, their arguments are to be arranged according to the grammatical rules of first-order languages: the arguments are to be surrounded by parentheses, and they have to follow the predicates.
(vi) In some rules of H3, the invisible empty word occurs as an argument of some subsidiary letter. In the formulas of first-order languages, a predicate symbol must not stand "alone", without any arguments. Hence, we need a name representing the empty word; let it be 'ϑ'.
(vii) The arrows (→) in the rules are to be replaced by the sign of the conditional '⊃'. According to our convention, "(A ⊃ B ⊃ C)" stands for "(A ⊃ (B ⊃ C))"; thus, we need no inner parentheses within the translation of a rule.
(viii) Finally, after applying (i) to (vii) to a rule, let us include the result between parentheses if it involves some '⊃', and prefix it by universal quantifiers binding all the free variables (if any) occurring in it. (In case of more quantifiers, their order is unessential, by QC.4.) For example, the translations of rules 1, 13, and 16 are:
(1')  I(ϑ)
(13') ∀x∀n(K(x) ⊃ R(n) ⊃ K(x*n))
(16') ∀x∀n∀n₁(V(x) ⊃ I(n₁) ⊃ S(xβn₁)(xβn₁)(n)(x))
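Stipulations (i) to (viii) can be mimicked mechanically. The sketch below is our own illustration (not the book's apparatus): it turns a rule, given as a list of atomic-formula strings, into a closed formula by replacing arrows with '⊃', parenthesizing, and prefixing universal quantifiers. The representation of rules and variables is a deliberate simplification.

```python
# Hypothetical helper (our own): translate an arrow rule of the
# hypercalculus into a closed first-order formula per (vii)-(viii).

def translate_rule(atoms, variables):
    """atoms: the members of a rule 'A1 -> A2 -> ... -> An'.
    Arrows become conditionals; the free variables that actually
    occur get universal quantifiers prefixed."""
    body = ' \u2283 '.join(atoms)                 # (vii): '->' becomes '⊃'
    if len(atoms) > 1:
        body = '(' + body + ')'                   # (viii): outer parentheses
    used = [v for v in variables if any(v in a for a in atoms)]
    prefix = ''.join('\u2200' + v for v in used)  # universal quantifiers
    return prefix + body

# Rule 13 of H3:  K(x) -> R(n) -> K(x*n)
print(translate_rule(['K(x)', 'R(n)', 'K(x*n)'], ['x', 'n']))
# ∀x∀n(K(x) ⊃ R(n) ⊃ K(x*n))
```

An input-free rule (a single atom, such as rule 1) passes through unchanged, matching (1') above.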
The hypercalculus H3 consists of 34 rules. (The releasing rule 34* will be omitted here.) Thus, we get 34 postulates by transforming these rules into formulas. However, we need some other postulates in order to assure that the system should behave as a language radix (see Sect. 3.1). This means, in essence, that the postulates (R1) to (R6) are to be included into our planned theory.

After these preliminary explanations we can begin the systematic formulation of CC*.

7.1.1. DEFINITION. The first-order theory of canonical calculi CC* is defined by

CC* = (L1*, Γ*)

where L1* is a first-order language based on the alphabet

A_C* = { (, ), ι, x, −, ⊃, =, ∀, ϑ, α, β, γ, ≺, *, I, L, V, W, T, R, K, A, D, F, G, S }

and

L1* = (Log, Var, Con*, Term*, Form*)

where Con* = N* ∪ P*, and

N* = {ϑ, α, β, γ, ≺, *},
P* = {I, L, V, W, T, R, K, A, D, F, G, S}

(here the members of N* are of arity 0, the predicates D, F, G in P* are of arity oo, S is of arity oooo, and the other members of P* are of arity o). - The definition of Term*, Form*, and Γ* will be given later on. •

The total definition of CC* will be given by an enormous canonical calculus Σ* (described in the next section). More exactly: Σ* will define the class of theorems of CC*. The basic alphabet of Σ* will be just A_C*, but we shall need several subsidiary letters printed in bold-face in order to be distinguishable from the members of P* which were, originally, the subsidiary letters of H3. As variables in Σ*, we shall use the letters x, y, t, u, v, w, and z. The full class of subsidiary letters of Σ* is:

S_Σ = {I, V, N, T, P, F, FR, S, BF}.
7.2 The Canonical Calculus Σ*

The first group of rules of Σ* (from 1 to 29 below) defines the grammar of L1*. Its subsidiary letters are: I (index), V (variable), N (name), T (term), P (monadic predicate), and F (formula).

1. Iι
2. Ix → Ixι
3. Ix → Vxx
4. Nϑ
5. Nα
6. Nβ
7. Nγ
8. N≺
9. N*
10. Vx → Tx
11. Nx → Tx
12. Tx → Ty → Txy
13. PI
14. PL
15. PV
16. PW
17. PT
18. PR
19. PK
20. PA
21. Pu → Tx → Fu(x)
22. Tx → Ty → FD(x)(y)
23. Tx → Ty → FF(x)(y)
24. Tx → Ty → FG(x)(y)
25. Tx → Ty → F(x = y)
26. Tx → Ty → Tu → Tv → FS(x)(y)(u)(v)
27. Fx → F−x
28. Fx → Fy → F(x ⊃ y)
29. Vx → Fy → F∀xy
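The rules above are word-rewriting rules read forward: from already-derived words matching the premises, derive the conclusion. As an illustration only (not the book's formalism), the sketch below closes a set of words under rules encoded as Python functions; the miniature fragment echoes rules 1 and 2 (indices ι, ιι, ιιι, ...), with 'i' standing for ι and a length bound added just to keep the run finite.

```python
# Illustrative brute-force derivation engine (our own, hypothetical):
# rules are Python functions that take one or two derived words and
# return a new word, or None when they do not apply.

def derive(axioms, rules, rounds=5):
    """Close a set of words under the given rules, up to `rounds` passes."""
    words = set(axioms)
    for _ in range(rounds):
        new = set()
        for rule in rules:
            for w in list(words):
                for v in list(words):
                    out = rule(w, v)
                    if out is not None:
                        new.add(out)
        if new <= words:           # nothing new: closure reached
            break
        words |= new
    return words

# Fragment in the spirit of rules 1 and 2: 'Ii' is an axiom, and any
# derived index may be extended by one more 'i' (bounded for the demo).
axioms = {'Ii'}
def extend_index(w, _v):
    return 'I' + w[1:] + 'i' if w.startswith('I') and len(w) < 5 else None

print(sorted(derive(axioms, [extend_index])))
# ['Ii', 'Iii', 'Iiii', 'Iiiii']
```

The same forward-chaining reading underlies every rule group of Σ* below; only the premises and conclusions grow more elaborate.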
Our canonical calculus Σ* must include the full proof machinery of QC. In formulating the necessary rules, a crucial notion is the substitutability of a variable in a formula by a term. As a preparation of this notion, we define the relation "y is free from the variable x" by the rules 30 to 45 below, where the subsidiary letter-pair 'FR' represents this relation (in the form "yFRx").

30. Vx → Iy → xFRxιy
31. Vx → Iy → xιyFRx
32. Ny → yFRx
33. Py → yFRx
34. DFRx
35. FFRx
36. GFRx
37. SFRx
38. (FRx
39. )FRx
40. −FRx
41. ⊃FRx
42. =FRx
43. Fu → uFRx → Vy → ∀yuFRx
44. Vx → Fu → ∀xuFRx
45. yFRx → zFRx → yzFRx

These rules tell us that any variable is free from any other variable (30, 31); names, predicates, and logical symbols - except '∀' - are free from any x (32 to 42); if a formula u is free from x then so is "∀yu" (43); "∀xu" is always free from x (44); and if two words are free from x then so is their concatenation (45).

The following four rules regulate substitutions. The new subsidiary letter 'S' occurs here; the meaning of "vSuStSx" is: "we get v from u by substituting t for x".

46. Vx → Tt → tSxStSx
47. yFRx → ySyStSx
48. zSuStSx → wSvStSx → zwSuvStSx
49. Fu → Vy → vSuStSx → tFRy → xFRy → ∀yvS∀yuStSx

The crucial case is contained in rule 49: if we get v from u by substituting t for x, then we get "∀yv" from "∀yu" by the same substitution, provided t is free from y and x, y are different variables. (In case x = y, we get by 47 and 44 that "∀yu" remains intact at this substitution.) If these provisos are not fulfilled, the substitution is prohibited. This is essential in rules 53 and 57 below.

Now the proof machinery of QC is included in the rules 50 to 60 below. The new subsidiary letter-pair 'BF' stands for 'basic formula'.
50. Fu → Fv → BF (u ⊃ (v ⊃ u))
51. Fu → Fv → Fw → BF ((u ⊃ (v ⊃ w)) ⊃ ((u ⊃ v) ⊃ (u ⊃ w)))
52. Fu → Fv → BF ((−u ⊃ −v) ⊃ (v ⊃ u))
53. Vx → Tt → Fu → vSuStSx → BF (∀xu ⊃ v)
54. Vx → Fu → Fv → BF (∀x(u ⊃ v) ⊃ (∀xu ⊃ ∀xv))
55. Vx → Fu → uFRx → BF (u ⊃ ∀xu)
56. Vx → BF (x = x)
57. Vx → Vy → Vz → Fu → vSuSxSz → wSuSySz → BF ((x = y) ⊃ (v ⊃ w))
58. Vx → BFu → BF ∀xu
59. BFu → u
60. u → (u ⊃ v) → v
We continue by enumerating - as input-free rules of Σ* - the special postulates of CC*, i.e., the formulas of Γ*. Here we shall apply some notation conventions in order to make easier the grasping of the content of the postulates. Namely:
(i) First of all, we shall apply the conventions of omitting parentheses (see 6.2.4).
(ii) Instead of the variables xι, xιι, xιιι, ... we shall write x₁, x₂, x₃, ....
(iii) "−(s = t)" will be abbreviated to "(s ≠ t)".
(iv) The symbols '&', '∨', '∃' will be used sometimes in the sense of the definitions given in 6.2.2, Remark 4.

The first group of our postulates (from 61 to 81 below) will correspond to the language radix postulates (R1) to (R6), given in Sect. 3.1. Postulates (R1), (R2), and
(R3) are already included into the notion of Term* (cf. the rules 11 and 12). Now we have to postulate that the empty word is different from all letters of A_cc:

61. (α ≠ ϑ).  62. (β ≠ ϑ).  63. (γ ≠ ϑ).  64. (≺ ≠ ϑ).  65. (* ≠ ϑ).

Further postulates concerning the empty word:

66. ∀x(xϑ = x)
67. ∀x(ϑx = x)
68. ∀x∀x₁((xx₁ = ϑ) ⊃ ((x = ϑ) & (x₁ = ϑ)))
Postulates 61 to 68 assure, among others, that the empty word has no "final letter". This is half part of (R4); its other half is expressed in 69. According to (R5), words terminating in different letters must not be identical. Concerning our five-letter alphabet A_cc, this gives ten postulates (70 to 79). From these ten postulates we get (by applying (B4) and substituting ϑ for x and x₁) that the letters of A_cc are pairwise different. The remaining content of (R5) is included in 80. Half part of postulate (R6) is included in 66 and 67; the other half is in 81.
The second group of our postulates contains the translations of the rules of the hypercalculus H3.

82. I(ϑ)
83. ∀x(I(x) ⊃ I(xι))
84. ∀x(I(x) ⊃ L(αx))
85. ∀x(I(x) ⊃ V(βx))
86. W(ϑ)
87. ∀x∀x₁(W(x) ⊃ L(x₁) ⊃ W(xx₁))
88. T(ϑ)
89. ∀x∀x₁(T(x) ⊃ L(x₁) ⊃ T(xx₁))
90. ∀x∀x₁(T(x) ⊃ V(x₁) ⊃ T(xx₁))
91. ∀x(T(x) ⊃ R(x))
92. ∀x∀x₁(T(x) ⊃ R(x₁) ⊃ R(x≺x₁))
93. ∀x(R(x) ⊃ K(x))
94. ∀x∀x₁(K(x) ⊃ R(x₁) ⊃ K(x*x₁))
95. ∀x∀x₁∀x₂(L(x₂) ⊃ S(x₂)(x₂)(x₁)(x))
96. ∀x∀x₁ S(≺)(≺)(x₁)(x)
97. ∀x∀x₁∀x₂(V(x) ⊃ I(x₁) ⊃ S(xβx₁)(xβx₁)(x₂)(x))
98. ∀x∀x₁∀x₂(V(x) ⊃ I(x₁) ⊃ S(x)(x)(x₂)(xβx₁))
99. ∀x∀x₁(V(x) ⊃ W(x₁) ⊃ S(x₁)(x)(x₁)(x))
100. ∀x∀x₁∀x₂∀x₃∀x₄∀x₅(S(x₃)(x₂)(x₁)(x) ⊃ S(x₅)(x₄)(x₁)(x) ⊃ S(x₃x₅)(x₂x₄)(x₁)(x))
101. ∀x(R(x) ⊃ D(x)(x))
102. ∀x∀x₁(R(x) ⊃ K(x₁) ⊃ D(x₁*x)(x))
103. ∀x∀x₁(R(x) ⊃ K(x₁) ⊃ D(x*x₁)(x))
104. ∀x∀x₁∀x₂(R(x) ⊃ K(x₁) ⊃ K(x₂) ⊃ D(x₁*x*x₂)(x))
105. ∀x∀x₁∀x₂∀x₃∀x₄(D(x)(x₂) ⊃ S(x₃)(x₂)(x₁)(x₄) ⊃ D(x)(x₃))
106. ∀x∀x₁∀x₂(D(x)(x₁) ⊃ D(x)(x₁≺x₂) ⊃ T(x₁) ⊃ D(x)(x₂))
107. F(ϑ)(α)
108. ∀x F(xα)(xβ)
109. ∀x F(xβ)(xγ)
110. ∀x F(xγ)(x≺)
111. ∀x F(x≺)(x*)
112. ∀x∀x₁(F(x)(x₁) ⊃ F(x*)(x₁α))
113. G(ϑ)(ϑ)
114. ∀x∀x₁∀x₂(F(x)(x₁) ⊃ G(x)(x₂) ⊃ G(x₁)(x₂α))
115. ∀x∀x₁(D(x)(x₁) ⊃ G(x)(x₁) ⊃ A(x₁))
Here the list of rules of Σ* is finished. - Now we can continue Def. 7.1.1 as follows:

Term* = {x : Σ* ⊢ Tx}.  Form* = {x : Σ* ⊢ Fx}.  Γ* = {61, ..., 115}.

The last (irregular) identity is to be understood as saying that the members of Γ* are the formulas given by the input-free rules of Σ* from 61 to 115. - We then have:

A is a theorem of CC*  ⟺  (Σ* ⊢ FA & Σ* ⊢ A)  ⟺  Γ* ⊢ A.
7.3 Truth Assignment

We shall introduce a truth assignment for the formulas of the theory CC*, i.e., we define a dichotomy of true and false formulas. In this definition, we shall sometimes refer to the transformation procedure described at the beginning of Sect. 7.1 under (i) to (viii), according to which the rules of H3 are translated into formulas. It is obvious that this procedure can be applied to all words derivable in H3. Let us denote by "Tr(f)" the translation of the word f, where H3 ⊢ f.

7.3.1. DEFINITION. We define inductively the truth (and the falsity) of formulas - members of Form* - as follows.
(a) Closed formulas.
(1) A closed formula of form "(s = t)" is true if after deleting the occurrences of 'ϑ' (if any) both in s and t the resulting words are literally the same; otherwise "(s = t)" is false.
(2) A closed atomic formula A which is not an identity is true if for some word f, H3 ⊢ f and Tr(f) = A; otherwise A is false.
(3) "−A" is true if A is false, and it is false if A is true.
(4) "A ⊃ B" is false if A is true and B is false; in all other cases, "A ⊃ B" is true.
(5) "∀xA" is false if for some t ∈ A_cc°, At/x is false; otherwise it is true.
(b) Open formulas. An open formula is true if its universal closure (cf. Def. 6.4.4) is true; otherwise it is false.
•
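Clause (1) of the truth assignment is effectively checkable. The following sketch is ours (with the character 'e' standing in for the empty-word name ϑ, and ordinary characters standing in for the letters of A_cc): it decides the truth of a closed identity by deleting every occurrence of the name and comparing the remaining words.

```python
# 'e' is our stand-in for the empty-word name ϑ of the text.
EMPTY = 'e'

def identity_is_true(s, t):
    """Truth of the closed formula '(s = t)' per clause (1) of Def. 7.3.1:
    delete all occurrences of the empty-word name, then compare."""
    return s.replace(EMPTY, '') == t.replace(EMPTY, '')

print(identity_is_true('ae', 'ea'))    # True:  both reduce to 'a'
print(identity_is_true('ab', 'ba'))    # False: different words
print(identity_is_true('e', 'ee'))     # True:  both reduce to the empty word
```

This is exactly the comparison invoked again in the proof of CC.7 below, where the deleted occurrences of ϑ are restored by postulates 66 and 67.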
On the basis of this definition, the following statements CC.1 to CC.4 are almost trivially true (the detailed checking is left to the reader).

CC.1. All basic sentences of L1* are true. (See the rules 50 to 58 of Σ* in the preceding section.)
CC.2. The postulates under 61 to 81 that refer to identity are all true.

CC.3. All postulates from 82 to 115 are true. (These are the translations of the rules of H3.)

If both "A ⊃ B" and A are true, then B is true (according to (4) of the definition above). Then, using that MP is the single proof rule in QC, we have:
CC.4. Every theorem of CC* is true. It is an open question whether all (closed) true formulas are theorems of CC*; we shall return to this question only at the end of the next chapter. However, we can show that the true closed atomic formulas are theorems of CC*. This will be detailed in three steps.
CC.5. If H3 ⊢ f then Tr(f) is a theorem of CC*.
Proof: by induction with respect to derivations in H3.
Base: f is a rule of H3. Then Tr(f) is one of the formulas in the list from 82 to 115 (of the rules of Σ*), i.e., it is a member of Γ*.
Induction step (a): Assume that H3 ⊢ g, Tr(g) = A, A is a theorem of CC*, and we get f from g by substituting certain A_cc-words t₁, ..., t_k for certain H3-variables z₁, ..., z_k in g. In A, these H3-variables are replaced by some first-order variables x₁, ..., x_k. Then, we can assume (using QC.4 if necessary) that A is of form "∀x₁ ... ∀x_k B". Then, Tr(f) is the formula "B t₁/x₁ ... t_k/x_k", which is deducible from A by k applications of (B4), and, hence, is a theorem of CC*.
Induction step (b): Assume that H3 ⊢ g → f, H3 ⊢ g where g involves no arrows, Tr(g → f) = "Q₁(B ⊃ A)", Tr(g) = "Q₂B", where Q₁ and Q₂ are (possibly empty) strings of quantifiers of form "∀x". Assume that these formulas are theorems of CC*.
It is clear that by appropriately choosing the variables, we can assume that Q₂ is a part of Q₁ (use, if necessary, QC.2), and, by using (B6), we can replace Q₂ by Q₁. Thus, we can assume that our formulas are of form "Q₁(B ⊃ A)" and "Q₁B". From these we get "Q₁A" by QC.5. To get Tr(f), which is of form "Q₃A", we apply (B4) and QC.2 (if necessary) to omit the superfluous quantifiers and re-name the bound variables. Thus, Tr(f) is a theorem of CC*.

CC.6. If A is a true closed atomic formula but not of form "(s = t)" then A is a theorem of CC*.
Proof. According to item (2) of our truth assignment, there is a word f such that H3 ⊢ f and Tr(f) = A. Then, by CC.5, A is a theorem of CC*.

CC.7. If "(s = t)" is closed and true, it is a theorem of CC*.
Proof. According to item (1) of our truth assignment, we get from s and t the same term c by deleting the occurrences of 'ϑ' (if any). Now, "(c = c)" is obviously a theorem of CC* (see QC.6). The omitted occurrences of 'ϑ' can be placed back by using the postulates 66, 67, and the basic schema (B8). Hence, "(s = t)" is a theorem of CC*.

CC.8. If A is a true closed atomic formula then A is a theorem of CC*. - This is the summary of the previous two statements.

The formula '(α = β)' is obviously false; hence, by CC.4, it is not a theorem of CC*. Thus, not all formulas are theorems of CC*. In other words:

CC.9. Theory CC* is consistent. (Cf. Def.s 6.5.1 and 6.5.5.)

COROLLARY. The empty class of formulas - or the class BF of basic formulas - is consistent.
7.4 Undecidability: Church's Theorem

Let us pose the question: Is it possible to find a procedure - an algorithm - by which we would be able to decide for every formula of L1* whether it is a theorem of CC*? If we had such a procedure then we would be able to decide, among others, for all formulas of form "A(t)" - where t is an {α}-word - whether it is a theorem of CC*. Now, Γ* ⊢ A(t) iff H3 ⊢ At (by CC.4 and CC.8). However,

H3 ⊢ At iff t ∈ Aut

(cf. 4.4.3). Hence, in the presence of a decision procedure, we would be able to decide for all numerals (i.e., {α}-words) whether each is an autonomous one. By Th. 4.4.4, the class of non-autonomous numerals ({α}° − Aut) is not an inductive class. Then, by Th. 5.4.4 and its corollaries, Aut is not a definite class, and, according to Markov's Thesis (see 5.4.2), it is not a decidable one. Hence, the answer to our question turns out to be a negative one: no normal algorithm can decide theoremhood in CC*, and, if we accept Markov's Thesis, no decision procedure exists for the class of theorems of CC*. Summing up:
7.4.1. THEOREM. Theory CC* is undecidable in the sense that the class of its theorems is not a definite subclass of its formulas.

Since "A is a theorem of CC*" means the same as "Γ* ⊢ A", we get from the result above that in QC, no general procedure (or, at least, no normal algorithm) exists to decide the relation "Γ ⊢ A". Of course, we can imagine a general decision procedure as a schematic one that can be adjusted in some way or other to all particular first-order languages. To be more unambiguous, we can state that no decision procedure exists for the maximal first-order language (see 4.3.2 and the first paragraph of Sect. 6.2). For, this maximal language includes L1*; hence, if we had a decision procedure for the former then it would be applicable to the latter as well.

Let us realize, furthermore, that "Γ* ⊢ A" tells the same as "⊢ P₆₁ ⊃ ... ⊃ P₁₁₅ ⊃ A", where P₆₁, ..., P₁₁₅ are the members of Γ* (the formulas in Σ* enumerated from 61 to 115) - according to the Deduction Theorem (see PC.5). Thus, it follows from the undecidability of "Γ* ⊢ A" that the class of provable formulas of L1* is undecidable. The same holds, a fortiori, for the class of provable formulas in the maximal first-order language. Summing up:
7.4.2. THEOREM. (Church's Theorem.) In QC, there exists neither a universal procedure (representable by a normal algorithm) for deciding the deducibility relation ("Γ ⊢ A") nor for recognising the provable formulas ("⊢ A").

This theorem was first proved by Alonzo Church (CHURCH 1936) in another way than the one applied here. Obviously, this undecidability theorem holds for all logical calculi larger than or including QC. In some first-order languages there exist decision procedures for the deducibility relation; obviously, this does not contradict our theorem. For example, if a first-order language involves only names and monadic predicates as (nonlogical) constants then it is decidable. The same holds for zero-order (i.e., propositional) languages.

* * *

The investigations of the present chapter showed us an interesting example of the definition of a first-order theory by means of a canonical calculus, and, in addition, presented a very important metatheorem on the first-order calculus QC.
Chapter 8
COMPLETENESS WITH RESPECT TO NEGATION

8.1 The Formal Theory CC

In Sect. 7.3, we saw that every theorem of CC* is true (see CC.4). If the converse holds too (up to the present point, this question was not yet answered) then the identity

(1)  {A : A is a true formula of CC*} = {A : A is a theorem of CC*}

holds true. Using that for any formula A, exactly one of the pair A, "−A" is true, it follows from (1) that for all formulas A, either A or "−A" is a theorem of CC*; or, as we shall express this property, the theory CC* is complete with respect to negation.

8.1.1. DEFINITION. A first-order theory T is said to be complete with respect to negation - briefly: neg-complete - iff for all formulas A of T, one of A, "−A" is a theorem of T.

An inconsistent theory is, trivially, neg-complete. Thus, the problem of neg-completeness is an interesting one only for consistent theories, such as CC*. Intuitively, we can say that a consistent and neg-complete theory grasps its subject matter exhaustively.
Since any consistent class of formulas can serve as a basis (postulate class) of a consistent theory, it is not surprising that many consistent first-order theories are not neg-complete. As we shall see later on, this is the case even with CC*, which means that identity (1) does not hold. Moreover, there are surprising cases of neg-incomplete theories: theories which are irremediably neg-incomplete in the sense that any consistent enlargement (with new postulates) of the theory remains neg-incomplete. Such a theory is especially interesting if we can give a truth assignment of its formulas according to which all its theorems are true, and, in addition, our intuition suggests that the postulates of the theory characterize exhaustively the notions represented by the constants of the language of the theory.

We shall give an example of such a surprising first-order theory. It will be a fragment of CC*; let us call it CC. The intuitive background of CC* is the hypercalculus H3. In CC, we shall rely on H2 instead. Let us remember that H2 defines the notion of a canonical calculus and the derivability in a canonical calculus. The additional notions defined in H3 are the lexicographic ordering, the Gödel numbering, and the autonomous numerals. These considerations show that the theory CC that will be based on the hypercalculus H2 is
the smallest first-order theory of canonical calculi, whereas CC* is one of the possible enlargements of it. (However, CC* was useful in demonstrating the undecidability of QC.) According to our intuitions, H2 regulates exhaustively the notions involved in it. This gives the (illusory) hope that the theory CC based on H2 will be neg-complete.

The formulation of CC is simple enough: we get it by certain deletions from CC* (cf. Def. 7.1.1).

8.1.2. DEFINITION. The first-order theory CC is defined by

CC = (L1₀, Γ₀)

where L1₀ is a first-order language based on the 23-letter alphabet

A_C₀ = { (, ), ι, x, −, ⊃, =, ∀, ϑ, α, β, γ, ≺, *, I, L, V, W, T, R, K, D, S }

and

L1₀ = (Log, Var, Con₀, Term₀, Form₀)

where Con₀ = N₀ ∪ P₀, N₀ = N* (cf. Def. 7.1.1), and

P₀ = {I, L, V, W, T, R, K, D, S}.

The definition of Term₀, Form₀, and Γ₀ will be given below by means of a canonical calculus Σ.

Remark. Up to this point, CC differs from CC* merely by the omission of the predicates A, F, and G. We shall see that Term₀ = Term*, and Form₀ ⊂ Form*.
Again, the full description of CC will be given by a canonical calculus Σ.

8.1.3. DEFINITION. The canonical calculus Σ is that fragment of Σ* (cf. Sect. 7.2) which we get from Σ* by omitting the rules 20, 23, 24, 35, 36, and 107 to 115. In referring to the rules of Σ, we retain their original numbering given in Σ*. The omitted rules are just those involving the omitted predicates A, F, and G. - The subsidiary letters and variables in Σ are exactly the same as in Σ* (see the end of Sect. 7.1). Then:

Term₀ = {x : Σ ⊢ Tx} = Term*.
Form₀ = {x : Σ ⊢ Fx} ⊂ Form*.
Γ₀ = {61, ..., 106} ∪ {SUD}

i.e., the members of Γ₀ are the formulas enumerated from 61 to 106 as input-free rules in Σ*, and a further postulate denoted by 'SUD' which will be given in the next chapter (in Section 9.2). Furthermore:

A ∈ Form₀  ⟹  (Γ₀ ⊢ A ⟺ Σ ⊢ A).

The truth assignment defined in 7.3.1 can be applied to Form₀ (since Form₀ ⊂ Form*), of course, by referring to H2 instead of H3 in item (2). Thus, we can speak of true and false formulas of CC. Furthermore, the statements CC.1 to CC.9 (in Sect. 7.3) hold - mutatis mutandis - for CC as well. The truth of the additional postulate SUD will be shown in Section 9.2. Hence:

8.1.4. THEOREM. Theory CC is consistent.
8.2 Diagonalization

We know that any alphabet can be replaced by a two-letter alphabet (see Th. 3.2.2). Let us consider the 32-letter alphabet A_C₀ ∪ S_Σ which is the full alphabet of the canonical calculus Σ. Given a translation of this alphabet into the two-letter alphabet {α, β}, we can extend it to get a translation of Σ as a single word of the five-letter alphabet A_cc = {α, β, γ, ≺, *}, replacing the Σ-variables by γ, γγ, γγγ, ..., replacing the arrow (→) by '≺', and using '*' between the translations of the rules of Σ. Given this extended translation, let us denote the translation of a word f by [f]^ ∈ A_cc°, where the square brackets will be omitted if f consists of a single letter or a metavariable.

Now, Σ is translated into a single word of A_cc, say Σ^ = σ. Then, in the hypercalculus H2, the word 'Kσ' is derivable. Moreover, if Σ ⊢ f, then f^ is derivable in σ, which means that

(1)  H2 ⊢ σDf^.

Now let us remember the translating function Tr introduced at the beginning of Sect. 7.3 which translates words derivable in H3 into formulas of CC*. Let us restrict Tr to H2, by which its applications result in CC-formulas. We then get:

Tr(σDf^) = D(σ)(f^) ∈ Form₀

where f^ is a closed term (i.e., f^ ∈ A_cc°). It follows from (1) - by our truth assignment - that "D(σ)(f^)" is a true atomic formula, and, according to CC.8, it is a theorem of CC. Summing up:
8.2.1. LEMMA. If Σ ⊢ f, then Γ₀ ⊢ D(σ)(f^) - where σ = Σ^.

Now assume that A ∈ Form₀, A^ = a ∈ A_cc°, B = Aa/x, x ∈ Var, B^ = b ∈ A_cc°, B is a closed formula, and Γ₀ ⊢ B. (The stipulation that B is closed is, in fact, unessential. For, Γ₀ ⊢ B implies that B is true (cf. CC.4 in Sect. 7.3); and, if B is not closed, this means that its universal closure is true.) Then, the following words are derivable in Σ:

(2)  FA,  BSASaSx,  B.

Hence, by our previous Lemma, the following atomic formulas are theorems of CC:

(3)  D(σ)(F^a),  D(σ)(bS^a[SaSx]^),  D(σ)(b).

Then, obviously, their conjunction is a theorem of CC as well. Let us abbreviate this conjunction by "Diag_σ(a/x,b)":

(4)  Diag_σ(a/x,b) =df (D(σ)(F^a) & D(σ)(bS^a[SaSx]^) & D(σ)(b)).

Here the term 'Diag' reminds us that the formula B is, in a sense, a diagonalization of A: we get B from A via substituting its own translation A^ for the variable x. The subscript 'σ' reminds us that our procedure depends essentially on the canonical calculus Σ (whose translation is σ). However, in what follows, we omit this subscript whenever no misunderstanding can arise by its omission.
8.2.2. THEOREM. Whenever a, b ∈ A_cc°, Diag_σ(a/x,b) is a theorem of CC iff for some A ∈ Form₀, A^ = a, [Aa/x]^ = b, and the (closed) formula Aa/x is a theorem of CC.

Proof. Half part of this statement is proved above. Concerning its other half, assume that Γ₀ ⊢ Diag_σ(a/x,b). Then, by CC.4 (cf. Sect. 7.3), "Diag_σ(a/x,b)" is true. With respect to its definition under (4), we have that all the atomic formulas in (3) are true, and, by CC.8, they are theorems of CC. According to our truth assignment, this means that the following words are derivable in H2:

σDF^a,  σDbS^a[SaSx]^,  σDb.

Since σ is just the translation of Σ, a word is derivable in σ iff it is a translation of a word derivable in Σ. Thus, there must be words A and B such that the words in (2) are derivable in Σ. Then, really, B = Aa/x, and Γ₀ ⊢ B.
Now let us consider the open formula

A₀ = "∀x₁∀x₂ −Diag(x/x₁,x₂)",

and assume that [A₀]^ = a₀ and, with B₀ = A₀ a₀/x, [B₀]^ = b₀. By our preceding theorem:

(5)  (Γ₀ ⊢ ∀x₁∀x₂ −Diag(a₀/x₁,x₂)) ⟺ (Γ₀ ⊢ Diag(a₀/x,b₀)).

With respect to the basic schema (B4), we have:

(6)  (Γ₀ ⊢ ∀x₁∀x₂ −Diag(a₀/x₁,x₂)) ⟹ (Γ₀ ⊢ −Diag(a₀/x,b₀)).

From (5) and (6) we get immediately:

(7)  (Γ₀ ⊢ Diag(a₀/x,b₀)) ⟹ (Γ₀ ⊢ −Diag(a₀/x,b₀)).

Since CC is a consistent theory, we have that

(8)  'Diag(a₀/x,b₀)' is not a theorem of CC.

Then, by (5):

(9)  '∀x₁∀x₂ −Diag(a₀/x₁,x₂)' is not a theorem of CC.
However, we can show that this formula is true. For, let us consider the conjunction that is abbreviated by 'Diag(a₀/x₁,x₂)' (cf. (4)):

(10)  (D(σ)(F^a₀) & D(σ)(x₂S^a₀[Sa₀S]^x₁) & D(σ)(x₂)).

By (9), Σ ⊢ B₀ does not hold; thus (according to QC) Σ ⊢ A₀ cannot hold either. Then D(σ)(b₀) and D(σ)(a₀) are false atomic formulas.

According to our definitions of A₀ and B₀, the following words are derivable in Σ:

FA₀,  B₀SA₀Sa₀Sx.

(Concerning the second word, take into consideration that x₁ is not a free variable of A₀.) Then, the following atomic sentences are true:

D(σ)(F^a₀),  D(σ)(b₀S^a₀[Sa₀Sx]^).

By the rules 30 to 49 (of Σ), the substitution of variables by terms - represented by the four-place subsidiary letter S - is uniquely determined in the sense that if Σ ⊢ vSuStSx then the word v is uniquely determined by the words x, t, and u. This means that the second conjunct of (10) is true iff

x₂ is replaced by b₀ and x₁ is replaced by x, or x₂ is replaced by a₀ and x₁ is not x.

However, in both cases, the third conjunct of (10) is false (as we have seen above). Henceforth, (10) - that is, 'Diag(a₀/x₁,x₂)' - is false by any substitution of the variables x₁ and x₂; hence, its negation is "always" true; consequently, its universal closure '∀x₁∀x₂ −Diag(a₀/x₁,x₂)' is true. Then, the negation of this universal closure is false. By CC.4, no false formula is a theorem of CC. Hence:

(11)  '−∀x₁∀x₂ −Diag(a₀/x₁,x₂)' is not a theorem of CC.
With respect to (9) and (11) (and Def. 8.1.1) it is proved that:

8.2.3. THEOREM. Theory CC is not complete with respect to negation. - In other words: The class of theorems of CC is a proper subclass of the true formulas of CC.
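The mechanism behind A₀ and B₀ - feeding a formula the code of itself - can be illustrated outside the formal system. The following sketch is our illustration only, not the book's construction: `diagonalize` builds, from a template with one free slot, the sentence obtained by substituting the template's own quotation into that slot, just as B₀ arises from A₀ by substituting a₀ = [A₀]^ for x.

```python
# Hypothetical miniature of the diagonalization trick (names are ours).

def diagonalize(template):
    """Substitute the template's own quotation for its free slot {0}."""
    return template.format(repr(template))

t = 'the formula obtained from {0} by self-substitution is not derivable'
print(diagonalize(t))
```

The resulting sentence talks about the very operation that produced it; in CC, the analogous self-reference is what forces both '∀x₁∀x₂ −Diag(a₀/x₁,x₂)' and its negation out of the class of theorems.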
8.3 Extensions and Discussions To investigate the possible generalizations of Th . 8.2.3, let us look over carefully its proof. It is very important here the diagonalization procedure, i.e., the introduction of the schema"Diag(T(a/x,b)". Since o = L", it is exploited here that CC is defined by a canonical calculus :E. Reference to the hypercalculus H 2 is unavoidable. Furthermore, the truth assignmentto Forms was exploitedin the proofof Lemma 8.2.1, ofTh. 8.2.2, and of the statements in (8) and (11) of the preceding section. Now assume that T is a frrst-order theory including CC (in the sense that all theorems of CC are theorems of 1). Assume, further, that T is defined by a canonical calculus K. Then, we can defmethe translation of K into the alphabet Jitcc ; say, K"= k. Retaining our truth assignment with respect to the formulas of CC, we can prove the analogueof Lemma 8.2.1: If K ~ f then "D(k )(f')" is a theorem of T. Furthermore, we can introduce the diagonal schema "Diag, (a/x,b)", using k instead of o everywhere. Now, the proofof the analogue of Th. 8.2.2 is unproblematic, if we assume, in addition, that no false formulas of CC are theorems of T (fromthis, the consistency of T follows obviously). For, take into consideration that "Diag, (a/x,b)" E Forms (i.e., that it is a formula of CC), and, hence,if it is a theorem of T, it cannot be false. Also, the explanations from (5) to (11) in the preceding section, mutatis mutandis, remain correct. As a final consequence, we get that theory T is incomplete with respectto negation.- Let us summarise these observations: 8.3.1. THEOREM. LetT be afirst-order theory satisfying theconditions (i) to (iii) below: (i) CC is a subtheory of T (every theorem of CC is a theorem of 1). (ii) The class of theorems of T is definable by means of a canonical calculus. (iii) No false formula of CC (in the sense of the truth assignment defined in Sect. 7.3) is a theoremof T (hence, Tis consistent). 
Then: T is incomplete with respectto negation. 90
COROLLARY: Theory CC* (Ch. 7) is incomplete with respect to negation.
Remark. Condition (i) can be fulfilled by means of a translation procedure from the language of CC to the language of T satisfying some obvious provisos. ' What may be the reason of the neg-incompleteness of CC? Our postulates translating the rules of Hz seem to give an exhaustive report on canonical calculi and on derivations in them. However, we can be suspicious with respect to our postulates corresponding to the language radix postulates (Rl) to (R6). We noted in Sect. 3.1 that the supply of these postulates is incomplete: they do not determine uniquely the class of JiI.words (where JiI. is an alphabet). We should have the further postulate (R7) - but we abandoned it because it involves a quantification over classes. Thus, we can suspect that the neg-incompleteness of CC is caused by the incompleteness of our formulation of the notion of language radices. However, we can try to formulate the content of (R7). Let us consider the following rule that can be added to 1:: (1)
Fx → tSxSΛSx → ySxSxa₁Sx → zSxSxa₂Sx → uSxSxa₃Sx → vSxSxa₄Sx → wSxSxa₅Sx →
((t & ∀x(x ⊃ (y & z & u & v & w))) ⊃ ∀x x)

(where a₁, ..., a₅ stand for the letters of 𝒜_CC).
(Here x, y, t, z, u, v, w are Σ-variables.) To grasp more easily the content of this rule, let us assume that x is a monadic open formula having some free occurrences of the variable x, and let us write

x[s/x]

for the result of substituting s for x in x, where s is any term. Then we have that t, y, z, u, v, w are just the formulas

x[Λ/x], x[xa₁/x], x[xa₂/x], x[xa₃/x], x[xa₄/x], x[xa₅/x]

(writing a₁, ..., a₅ for the letters of 𝒜_CC), respectively. Then, the final output of the rule under (1) can be written as follows:

(2) ((x[Λ/x] & ∀x(x ⊃ (x[xa₁/x] & ... & x[xa₅/x]))) ⊃ ∀x x).

Its content is: If φ is a monadic predicate which (a) holds for the empty word, and (b) whenever it holds for an 𝒜_CC-word a then it holds for all words got from a by suffixing to it a letter of 𝒜_CC, then φ holds for all 𝒜_CC-words; and this is a harmless logical truth. Hence, it is sufficient to assume that x is free from all variables other than x. This means that x[Λ/x] - that is, t - is a closed formula.
Now, we can enlarge Σ by new rules to define closedness. Let C be a new subsidiary letter expressing the predicate 'is closed'. We need a list of input-free rules telling that the letters of 𝒜_CC⁰ (see 8.1.2) - except x and ∀ - are closed, e.g., C∼, C⊃, Cε, CL, ...; this means 21 rules. Then, we finish by the following two rules:

Cx → Cy → Cxy,
Fu → vSuSΛSx → Cv → C∀xu.

Finally, we can include into (1) the input "Ct".

Despite all our efforts, enlarging Σ and CC in the way outlined above gives a theory to which Th. 8.3.1 is applicable; that is, the enlarged theory remains incomplete with respect to negation. Hence, we can suspect that the force of the schema (2) is, nevertheless, less than that of (R7). To understand this situation, take into consideration that a monadic predicate defines a subclass of 𝒜_CC⁰ (the class of words of which the predicate holds true). We can define as many subclasses of 𝒜_CC⁰ as there are monadic open formulas in the language of CC. Schema (2) deals just with such formulas. Now, it may happen that 𝒜_CC⁰ has more subclasses than there are monadic predicates expressible in that language. However, the exact meaning of this conjecture can be explained only in set theory (see Ch. 10) where, in addition, its truth is provable as well.

It was Kurt Gödel who gave the first (non-trivial) example of a formal theory that is incomplete with respect to negation (GÖDEL 1931). He showed that if a theory includes the arithmetic of natural numbers, and no false formula of arithmetic is among its theorems, then the theory cannot be complete with respect to negation. This result is cited in the literature of metamathematics as Gödel's First Incompleteness Theorem. In Gödel's proof, a fragment of the theory of recursive functions played the role analogous to the role of H₂ in our approach.

Theorem 8.3.1 is, obviously, an analogue of Gödel's First Incompleteness Theorem. Its peculiarity - which deserves some attention - is that it refers to no mathematical theory. Whereas Gödel's proof is arithmetically based, our approach is purely grammatically based. It conforms to the motto: Keep metalogic aloof from arithmetic (in general: from mathematics) as long as it is possible. But this is not possible to the very last, as we shall see in Ch. 10.
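The cardinality argument hinted at here can be made concrete by a diagonal construction. The sketch below is our own illustration (the names `preds` and `diagonal` are assumptions, not the book's notation): given any enumeration of "definable" subsets of the natural numbers, each represented by a membership test, it produces a subset missed by the whole enumeration.

```python
# Given an enumeration pred[0], pred[1], ... of subsets of N (each a
# membership test), the diagonal set
#     D = {n : n not in pred[n]}
# differs from pred[n] at the number n, so D is missed by the enumeration.

def diagonal(preds):
    """Membership test of the diagonal set for a list of membership tests."""
    return lambda n: not preds[n](n)

preds = [lambda n: n % 2 == 0,   # even numbers
         lambda n: n > 10,       # numbers above 10
         lambda n: False]        # the empty set

d = diagonal(preds)
for i, p in enumerate(preds):
    assert d(i) != p(i)          # D and preds[i] disagree at the number i
```

The same disagreement-at-the-diagonal argument works for any list of predicates, which is why a denumerable supply of monadic open formulas cannot exhaust the subclasses of a denumerably infinite word class.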
Chapter 9
CONSISTENCY UNPROVABLE

In this chapter we shall show that although the consistency of the theory CC is expressible in its language by means of a formula, this formula is not a theorem of CC.
9.1 Preparatory Work

The consistency of CC can be expressed by the formula Cons_σ:

(1) Cons_σ =df ∃x(D(σ)(F^x) & ∼D(σ)(x)),

meaning that for some u, Σ ⊢ Fu but not Σ ⊢ u.

If both FA and A are derivable in Σ then (and only then) A is a theorem of CC. This leads us to define the schema of theoremhood:

(2) Th_σ(a) =df (D(σ)(F^a) & D(σ)(a)).

In what follows, the subscript 'σ' will be omitted both in (1) and (2).
9.1.1. LEMMA. If Σ ⊢ f → g → h, and the words f and g involve no arrows, then

(3) Γ₀ ⊢ D(σ)(f^) ⊃ D(σ)(g^) ⊃ D(σ)(h^).

Proof. By Lemma 8.2.1, it follows from our assumptions that

(4) Γ₀ ⊢ D(σ)(f^⌢g^⌢h^).

Rule 106 of Σ* is a postulate of CC (a member of Γ₀) from which we get by applications of the basic schema (B4) (of QC) that

(5) Γ₀ ⊢ D(σ)(f^) ⊃ D(σ)(f^⌢g^⌢h^) ⊃ T(f^) ⊃ D(σ)(g^⌢h^).

Furthermore, "T(f^)" is obviously true, and, hence, by CC.8 (see in Sect. 7.3),

(6) Γ₀ ⊢ T(f^).

We get from (4), (5), and (6) - by PC - that

(7) Γ₀ ⊢ D(σ)(f^) ⊃ D(σ)(g^⌢h^).

Again, we get from the postulate under 106 that

(8) Γ₀ ⊢ D(σ)(g^) ⊃ D(σ)(g^⌢h^) ⊃ T(g^) ⊃ D(σ)(h^)

and

(9) Γ₀ ⊢ T(g^).
From (7), (8), and (9) we get (3) by PC, which was to be proven.

We see that if h involves an arrow - i.e., if h^ is of form "h₁^⌢h₂^" - we can continue our proof to get "D(σ)(h₁^) ⊃ D(σ)(h₂^)" instead of "D(σ)(h^)", provided h₁ involves no arrow. Thus, we can extend our Lemma for rules of Σ containing more than two arrow-free inputs. Furthermore, our result is independent of whether the rule in question is an original one - i.e., listed in the presentation of Σ - or is a derived rule of Σ. To mention just a derived rule of Σ which will be important in the following discussions, let us consider the following one:

(10) Σ ⊢ Fu → vSuStSx → Fv.

Obviously, we get a formula from a formula via substitution. Thus, if the two inputs in (10) are derivable in Σ then the output "Fv" must be derivable in Σ as well.
9.2 The Proof of the Unprovability of Cons

Let us consider the following abbreviations (introduced partly in Sect. 8.2):

A₀ =df ∀x₁ ∼Diag(x/x,x₁);      a₀ =df A₀^;
B₀ =df ∀x₁ ∼Diag(a₀/x,x₁);     b₀ =df B₀^;
C₀ =df Diag(a₀/x,b₀);          c₀ =df C₀^.

Let us recall the main results of Sect. 8.2:

(1) (Γ₀ ⊢ C₀) ⇔ (Γ₀ ⊢ B₀).
(2) (Γ₀ ⊢ C₀) ⇒ (Γ₀ ⊢ ∼C₀).
(3) None of B₀, ∼B₀ is a theorem of CC.
Our proof will be detailed in several steps.

Step 1. Since

Σ ⊢ B₀SA₀Sa₀Sx,

we have by Lemma 8.2.1 that

Γ₀ ⊢ D(σ)(b₀⌢[SA₀Sa₀Sx]^).

We know that here the word b₀ is uniquely determined by the words A₀, a₀ and x. Hence, the following conditional is true:

(4) Γ₀ ⊢ D(σ)(x₁⌢[SA₀Sa₀Sx]^) ⊃ (b₀ = x₁).

To accept this formula as a theorem of CC, the following auxiliary postulate is sufficient:

(SUD)

Since Γ₀ ⊢ K(σ), we have that (4) follows from SUD (Substitution Uniquely Determined) by QC; thus, (4) is a theorem of CC.

Remark. The introduction of the auxiliary postulate SUD was mentioned in Def. 8.1.3 already. We could formulate a more general version of SUD; however, the present version suffices for our aims. - If someone objects to SUD, he/she can omit it from Σ and Γ₀; the results of Chapter 8 remain correct without SUD as well. However, SUD is indispensable in the present chapter (except if you find a proof of (4) without using SUD; this possibility is not ab ovo excluded).
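The content of SUD - that the result of a substitution is uniquely determined by the formula, the term, and the variable - is trivial at the level of strings. The following sketch is our own illustration (the name `substitute` and the textual-replacement reading of the S-relation are simplifying assumptions, not the book's word calculus):

```python
# The relation "v S u S t S x" (v results from u by substituting t for x)
# is functional in v: u, t and x together determine v uniquely. On plain
# strings this is just textual replacement.

def substitute(u: str, t: str, x: str) -> str:
    """The unique word v with v S u S t S x (string-level simplification)."""
    return u.replace(x, t)

# Performing the same substitution twice yields the same word:
v1 = substitute("F(x) & G(x)", "a0", "x")
v2 = substitute("F(x) & G(x)", "a0", "x")
assert v1 == v2 == "F(a0) & G(a0)"
```

SUD merely imports this functionality of substitution into CC as a postulate, since it is not derivable from the other postulates.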
A particular case of (10) of the preceding section is:

Σ ⊢ FA₀ → vSA₀Sa₀Sx → Fv

(where v is a Σ-variable). Then, by Lemma 9.1.1:

(5) Γ₀ ⊢ D(σ)(F^a₀) ⊃ D(σ)(x₁⌢[SA₀Sa₀Sx]^) ⊃ D(σ)(F^x₁).

From (4) and (5) we get by PC:

(6) Γ₀ ⊢ (D(σ)(F^a₀) & D(σ)(x₁⌢[SA₀Sa₀Sx]^) & D(σ)(x₁)) ⊃ (D(σ)(F^x₁) & (b₀ = x₁) & D(σ)(x₁)).

Here the antecedent is "Diag(a₀/x,x₁)", and the consequent yields "D(σ)(F^b₀) & D(σ)(b₀)", by the basic schema (B8) of QC. The latter formula is - by (2) of the preceding section - just "Th(b₀)". Hence:

Γ₀ ⊢ Diag(a₀/x,x₁) ⊃ Th(b₀).

Then, by QC:

Γ₀ ⊢ ∃x₁Diag(a₀/x,x₁) ⊃ Th(b₀)

(taking into consideration that "Th(b₀)" is free from x and x₁). Here the antecedent is just the negation of B₀. Thus, our final result is that:

(7) Γ₀ ⊢ ∼B₀ ⊃ Th(b₀).
Step 2. By (1), if one of C₀, B₀ is deducible in Σ then so is the other. Thus, we have the following derived rules:

Σ ⊢ B₀ → C₀   and   Σ ⊢ C₀ → B₀.

Then, by Lemma 9.1.1 and PC, we get easily:

Γ₀ ⊢ D(σ)(b₀) ≡ D(σ)(c₀).

With respect to the definition of 'Th' (see (2) in the preceding section) we then have:

Γ₀ ⊢ Th(b₀) ≡ Th(c₀).

From this and (7) we get:

(8) Γ₀ ⊢ ∼B₀ ⊃ Th(c₀).

Step 3. By (2), if C₀ is derivable in Σ then so is its negation. This yields the derived rule:

Σ ⊢ C₀ → ∼C₀.

Then, similarly as in the previous step, we have:

(9) Γ₀ ⊢ Th(c₀) ⊃ Th(∼^c₀).
Step 4. By PC, from a pair (of formulas) A, "∼A", any formula is deducible. Hence, we have the derived rule:

Σ ⊢ Fu → u → F∼u → ∼u → Fv → v.

Then, using the generalization of Lemma 9.1.1 and applying the definition of 'Th', we get:

Γ₀ ⊢ (Th(c₀) & Th(∼^c₀)) ⊃ (D(σ)(F^x₁) ⊃ D(σ)(x₁)).

From this it then follows by QC:

Γ₀ ⊢ (Th(c₀) & Th(∼^c₀)) ⊃ ∀x₁(D(σ)(F^x₁) ⊃ D(σ)(x₁)).

Here the consequent is exactly the negation of 'Cons' (see (1) in Sect. 9.1). Hence:

(10) Γ₀ ⊢ (Th(c₀) & Th(∼^c₀)) ⊃ ∼Cons.
Step 5. We get from (8), (9), and (10), by PC, that

Γ₀ ⊢ ∼B₀ ⊃ ∼Cons,

or, by contraposition:

Γ₀ ⊢ Cons ⊃ B₀.

Hence, if 'Cons' were a theorem of CC then so would be B₀. By (3), B₀ is not a theorem of CC. Consequently, 'Γ₀ ⊢ Cons' does not hold. Our aim was just to prove this statement. ∎
Our result can be extended to certain enlargements of CC. The conditions are the same as in Th. 8.3.1.

The metatheorem just proved is an analogue of Gödel's Second Incompleteness Theorem, which states that the consistency of Number Theory is unprovable - although expressible - within Number Theory.

Concluding remarks. We have finished our work on the pure syntactic means of metalogic. However, every system of logic is defective without a semantical foundation - at least according to the views of a number of logicians (including the author). Thus, if the question is posed, 'How to go further in studying metalogic?', the natural answer seems to be, 'Turn to the semantics!'.

Now, a logical semantics which is best connected to our intuitions concerning the task and applicability of logic can be explained within the frames of set theory. Set theory is a very important and deep discipline of mathematics. We need only a solid fragment of this theory in logical semantics (at least if we do not go far away from our intuitions concerning logic). Fortunately, set theory can be explained as a first-order theory. After studying its most important notions and devices, we can incorporate a part of this theory into our metalogical knowledge, and we can utilize it in developing logical semantics. Our next (and last) chapter will give a very short outline of set theory as well as some insights on its use in logical semantics. We assume here (similarly as in Ch. 6) that the reader has taken (or will take) a more detailed course in this discipline - our explanations are devoted merely to giving a feeling of the continuity in the transition from syntax to semantics.

Let us mention that another field of logical semantics is algebraic semantics. This is foreign to the subject matter of the present essay, for, in the view of the author, it does not help us to understand the true nature and essence of logic. However, it is a very important and nowadays very fashionable field of mathematical logic, presenting interesting mathematical theorems about systems of logic.
Chapter 10
SET THEORY

10.1 Sets and Classes

10.1.1. Informal introduction. The father of set theory was Georg Cantor (1845-1918). It became a formal theory (based on postulates) in the 20th century, due to the pioneering works of Ernst Zermelo and Abraham Fraenkel (quoted as 'Z-F Set Theory'). Further developments are due to Th. Skolem, J. v. Neumann, P. Bernays, K. Gödel and many other mathematicians. (On the works of Cantor, see CANTOR 1932.)

The intuitive idea of set theory is that some collections - or classes - of individual objects are to be considered as individual objects - called sets - which can be collected, again, into classes which might be, again, individuals, i.e., sets, and so on. Briefly: the operation of forming classes can be iterated; and classes which can be members of other classes are called sets. Thus, according to this intuition, sets are individualized classes. Then, an important task of set theory is to determine which classes can be individualized (i.e., considered as sets).

Now, formal set theory gives no answer to such questions as 'what are classes?' or 'what are sets?'. Its universe of discourse is the totality of sets, and most of its postulates deal with operations forming sets from given sets. There exist different formulations of (the same) set theory. In most formulations, set theory is presented as a first-order theory whose single nonlogical constant is the dyadic predicate '∈' ('is a member of'), and the possible values of free variables are assumed (tacitly) to be sets. Thus, in the formula "x ∈ y", both x and y are sets; members of sets (if any) must be, again, sets. Moreover, identity of sets is introduced via definition:

(1) (a = b) ⇔df ∀x((x ∈ a) ≡ (x ∈ b)).

From this "a = a" - and "∀x(x = x)" - is deducible; hence, the basic formula (B7) of QC is omitted. The same holds for (B8); instead of the latter, a postulate called axiom of extensionality is accepted:

(a = b) ⊃ ∀x((a ∈ x) ⊃ (b ∈ x)).

In QC, "∃x(x = x)" follows from "∀x(x = x)". This means that the domain of individuals is not empty. Hence, we need no postulate stating that there are sets (for, in this approach, everything is a set).

In logical semantics, it is advantageous to assume that there are domains - i.e., sets - whose members are not sets but other types of individual objects (e.g., physical or grammatical objects). By this, we shall depart a little from the usual formulation of set theory sketched above. The main peculiarity of our approach is to permit individuals other than sets; these will be called primary objects, briefly: primobs. Of course, they will have no members. To differentiate between sets and primobs, we need a monadic predicate 's', where "s(x)" represents the open sentence 'x is a set', and "∼s(x)" tells that x is a primob. We cannot omit the identity sign '=' from the supply of our logical constants, for, if we try to apply the definition (1) to primobs, we get that all primobs are identical with each other. Thus, we shall use the full machinery of QC, retaining (B7) and (B8) as well. - Note that we shall not prescribe the existence of primobs; we want only to permit them.

After these preliminary discussions let us return to the systematic explanation of our set theory.
10.1.2. The language of set theory. To avoid superfluous repetitions, it is sufficient to fix that in the language of set theory, the class of nonlogical constants is

Con = {s, ∈},

where s is a monadic and ∈ is a dyadic predicate. No name functors are used - although several such ones can be introduced via definitions. Thus, Term = Var.
Notation conventions in our metalanguage. We shall use lower-case Latin letters (a, b, c, x, y, z) as metavariables for referring to the object language variables (x, x₁, x₂, ...). The logical symbols &, ∨, ≡, ∃ introduced via definitions (see 6.2.2, Remark 4) will be used sometimes. The convention for omitting parentheses (see 6.2.4) will be applied, too. We write "(x ∈ y)" instead of "∈(x)(y)", and "(x ∉ y)" and "(x ≠ y)" instead of "∼(x ∈ y)" and "∼(x = y)", respectively. The expressions "φ(x)" and "ψ(x,y)" refer to arbitrary monadic and dyadic open formulas, respectively.
We do not want to list all postulates of set theory in advance. Instead, we shall present postulates, definitions and theorems alternately, giving a successive construction of the theory. Now, if Γₛ denotes the class of postulates of our set theory, we shall write "⊩ A" instead of "Γₛ ⊢ A" in this chapter. Most theorems - including postulates - will be presented by open formulas; these are to be understood as standing for their universal closures.
10.1.3. First postulates:

(P0) ∃x(x ∈ a) ⊃ s(a).
(P1) s(a) ⊃ s(b) ⊃ ∀x((x ∈ a) ≡ (x ∈ b)) ⊃ (a = b).

(According to our conventions, (P0) stands for "∀a(∃x(x ∈ a) ⊃ s(a))", and (P1) is to be prefixed with '∀a∀b'.)

(P0) says that if something has a member, it is a set. (But it does not state that every set has a member.) By contraposition:

∼s(a) ⊃ ∼∃x(x ∈ a),

i.e., primobs have no members. - (P1) tells us that if two sets coincide in extenso (containing the same members) then they are identical. Here the conditions s(a) and s(b) are essential; without them all primobs would be identical.
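The interplay of (P0) and (P1) can be mimicked in a small computational model. The following sketch is our own illustration (the choice of `frozenset` for sets and arbitrary Python objects for primobs is an assumption of the model, not part of the theory):

```python
# A toy model: "sets" are frozensets, "primobs" are any other objects.
# (P0): only sets have members; (P1): sets with the same members are equal.

def is_set(a) -> bool:
    return isinstance(a, frozenset)

def members(a) -> set:
    return set(a) if is_set(a) else set()   # primobs have no members

p, q = object(), object()        # two primobs: memberless, yet distinct
assert members(p) == members(q) == set()
assert p != q                    # extensionality must NOT apply to primobs

a = frozenset({1, 2})
b = frozenset({2, 1})
assert members(a) == members(b) and a == b   # extensionality for sets
```

The two primobs show exactly why the conditions s(a), s(b) in (P1) cannot be dropped: memberless objects would otherwise all collapse into one.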
Before going further, we shall extend our metalanguage.
10.1.4. Class abstracts and class variables. As in Sect. 2.5, we introduce class abstracts and class variables, with the stipulation that in the class abstract {x: φ(x)}, φ(x) must be a monadic open formula of the language of set theory. Then, class abstracts and class variables (A, B, C) will be permitted in place of the variables in atomic formulas (that is, everywhere in a formula except in quantifiers), but these occurrences of class symbols will be eliminable by means of the definitions (D1.1) to (D1.6) below. Thus, the introduction of these new symbols does not cause an extension of our object language; it gives only a convenient notation in the metalanguage. (Note that the class of monadic open formulas is a definite one; hence, the same holds for the class of class abstracts which is the domain of the permitted values of our class variables.) The six definitions below show how a class symbol is eliminable in atomic formulas.

(D1.1) a ∈ {x: φ(x)} ⇔df φ(a).
(D1.2) (A = B) ⇔df ∀x((x ∈ A) ≡ (x ∈ B)).
(D1.3) (a = A) ⇔df (A = a) ⇔df s(a) & ∀x((x ∈ a) ≡ (x ∈ A)).
(D1.4) (A ∈ b) ⇔df ∃a((a = A) & (a ∈ b)).
(D1.5) (A ∈ B) ⇔df ∃a((a = A) & (a ∈ B)).
(D1.6) s(A) ⇔df ∃a(a = A).
We get from (D1.3) that

⊩ ∼s(a) ⊃ (a ≠ A)

(primobs are not classes). By (D1.6), a class is a set iff it is coextensive with a set. Hence, "∼s(A)" means that the extension of A coincides with no set. In this case, A is said to be a proper class.
10.1.5. Proper classes. Set theory would be very easy if we could assume that every class is a set (as Cantor thought before the 1890's). As we know today, we cannot assume this without risking the consistency of our theory. Here follow the definitions of some interesting classes:

(D1.7) Ind =df {x: (x = x)}.
(D1.8) 0 =df {x: (x ≠ x)}.
(D1.9) Set =df {x: s(x)}.
(D1.10) Ru =df {x: s(x) & (x ∉ x)}.

Ind and Set are the classes of individuals and of sets, respectively. 0 is the empty class. Ru is the so-called Russell class: the class of "normal" sets (which are not members of themselves). Except 0, all these are proper classes. It is easy to show this about Ru. - Assume, indirectly, that s(Ru). Then there is a set, say r, such that

∀x((x ∈ r) ≡ (s(x) & (x ∉ x))).

Then, by (B4) of QC, we get

(r ∈ r) ≡ (s(r) & (r ∉ r)),

which implies "∼s(r)", contradicting our indirect assumption. Hence:

(Th.1.1) ⊩ ∼s(Ru),

i.e., Ru is a proper class. Since Ru ⊆ Set ⊆ Ind, we suspect that Set and Ind are proper classes, too. (Proof will follow later on.)

Thus, there "exist" proper classes. This - seemingly ontological - statement means merely: We cannot assume, without the risk of a logical contradiction, that for all monadic predicates "φ(x)" of the language of set theory there is a set a such that "∀x((x ∈ a) ≡ φ(x))" holds.

Remark. The Russell class Ru was invented by Bertrand Russell in 1901. (See e.g. RUSSELL 1959, Ch. 7.) The existence of proper classes was recognised (but not published) by Cantor some years earlier (see CANTOR 1932, pp. 443-450). These recognitions led to the investigations of finding new foundations for set theory.
10.1.6. Further definitions and postulates. - From now on, our treatment will be very sketchy. Let us introduce an abbreviation for the simplest class abstractions:

(D1.11) a^∈ =df {x: (x ∈ a)}.

By (D1.3) and (D1.1) we have:

(a = a^∈) ≡ (s(a) & ∀x((x ∈ a) ≡ (x ∈ a))).

From this we get by QC:

(Th.1.2) ⊩ (a = a^∈) ≡ s(a).

That is, any set a coincides with the class a^∈. Thus, all sets are classes (but not conversely). By this theorem, all definitions and theorems on classes hold for sets as well. On the other hand, if a is a primob, a^∈ has no members, and, by (D1.2), it coincides with the empty class 0:

(Th.1.3) ⊩ ∼s(a) ⊃ (a^∈ = 0).
In what follows, we shall use all notions and notations introduced for classes in Sect. 2.5 - see especially (4), (5), (7), (9), (10) and (11) in 2.5. In case of sets, we speak of sub- and supersets instead of sub- and superclasses, respectively. We define the union class of a class A - in symbols: "u(A)" - as follows:

(D1.12) u(A) =df {x: ∃y((x ∈ y) & (y ∈ A))}.

Note that by (P0), "(x ∈ y)" implies "s(y)". Thus, if no set is a member of A then u(A) = 0. Particularly:

⊩ ∼s(a) ⊃ (u(a^∈) = 0)   and   ⊩ u(0) = 0.

Now we can formulate two further postulates:

(P2) s({a, b}).   [Axiom of pairs.]
(P3) s(u(a^∈)).   [Axiom of union.]

We omit the proof of the following consequences of these new postulates:

(Th.1.4) ⊩ s({a}).
(Th.1.5) ⊩ s(a^∈ ∪ b^∈).

If a, b are sets, we can write "u(a)" and "a ∪ b" instead of "u(a^∈)" and "a^∈ ∪ b^∈", respectively. Let us introduce provisionally Zermelo's postulate:

(Z) s(a^∈ ∩ A),

which will be a consequence of the postulate (P6) introduced in the next section. Its important consequences (without proof):

(Th.1.6) ⊩ (B ⊆ a^∈) ⊃ s(B).
(Th.1.7) ⊩ s(a^∈ − B).
(Th.1.8) ⊩ s(0).
It follows from (Th.1.6) that Set and Ind - as superclasses of Ru - are proper classes.
The set corresponding to 0 is uniquely determined and is called the empty set. In set theory, it represents the natural number 0; hence, we shall use '0' as its proper name. However, the use of '0' in formulas is eliminable by the following contextual definition:

(D1.13) φ(0) ⇔df ∃a(s(a) & φ(a) & ∀x(x ∉ a)).

We define the power class of a class A - in symbols: "po(A)" - by

(D1.14) po(A) =df {x: s(x) & (x^∈ ⊆ A)}.

(The denomination is connected with the fact that if A has n members then po(A) has 2ⁿ members.) - Our next postulate:

(P4) s(po(a^∈)).   [Axiom of power set.]

This states that the power class of a set a is, again, a set, called the power set of a. Our next postulate - among others - excludes that a set could be a member of itself:

(P5) (a^∈ ≠ 0) ⊃ ∃x((x ∈ a) & (x^∈ ∩ a^∈ = 0)).   [Axiom of regularity.]

Its important consequences:

(Th.1.9) ⊩ (a ∈ b) ⊃ (b ∉ a).
(Th.1.10) ⊩ (a ∉ a).

These mean that the relation '∈' is asymmetrical and irreflexive (cf. Def. 10.2.4).
10.2 Relations and Functions

10.2.1. Ordered pairs. An ordered pair (or couple) is an (abstract) object to which there is associated an object a as its distinguished (or first) member, and an object b as its contingent (or second) member. Such a pair (couple) is denoted as "(a,b)"; the case b = a is not excluded. This seems to be an irreducible primitive notion that can be regulated only by means of the postulate:

(1) ((a,b) = (c,d)) ⇔ ((a = c) & (b = d)).

However, in set theory, there is a possibility of representing (or modelling) ordered pairs. Within set theory, this representation has the form of a definition:

(D2.1) (a,b) =df {{a}, {a,b}}.

This definition satisfies the postulate under (1). Note that (a,a) is reducible to {{a}}. Furthermore, if a, b ∈ Ind then ⊩ s((a,b)). - In what follows, we shall deal with ordered pairs only within set theory.

In defining classes of ordered pairs, we have to use class abstracts of the form

(2) {z: ∃x∃y((z = (x,y)) & ψ(x,y))}.

Let us agree to abbreviate (2) by

{(x,y): ψ(x,y)}.

The Cartesian product of the classes A and B - in symbols: "A × B" - is defined by

(D2.2) A × B =df {(x,y): (x ∈ A) & (y ∈ B)}.
(D2.3) A⁽²⁾ =df A × A.

It is easy to show that a^∈ × b^∈ ⊆ po(po(a^∈ ∪ b^∈)). Then, by (Th.1.5), (P4), and (Th.1.6), we have that:

(Th.2.1) ⊩ s(a^∈ × b^∈).

The class of all ordered pairs, Orp, is defined by

(D2.4) Orp =df Ind × Ind.
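The claim that (D2.1) satisfies the postulate under (1) can be checked mechanically on a small universe. The sketch below is our own illustration (the function name `pair` and the finite universe are assumptions for the test):

```python
# Kuratowski's pair {{a},{a,b}} modelled with frozensets, together with a
# check of the characteristic property: (a,b) = (c,d) iff a = c and b = d.
from itertools import product

def pair(a, b) -> frozenset:
    return frozenset({frozenset({a}), frozenset({a, b})})

universe = [0, 1, 2]
for a, b, c, d in product(universe, repeat=4):
    assert (pair(a, b) == pair(c, d)) == (a == c and b == d)

# The degenerate case (a,a) collapses to {{a}}:
assert pair(0, 0) == frozenset({frozenset({0})})
```

The exhaustive check over a small universe is, of course, no proof of (1), but it illustrates how the asymmetry between {a} and {a,b} encodes the order of the members.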
10.2.2. Relations. A class of ordered pairs is a possible (potential) extension of a dyadic predicate. Such a predicate expresses a relation (cf. Sect. 2.1). This is the reason that in set theory, subclasses of Orp are called relations (although, in fact, they are only potential extensions of relations). We introduce the metalogical predicate 'Rel' by

(D2.5) Rel(A) ⇔df A ⊆ Orp.

In the following group of definitions, let us assume that R is a relation.

(D2.6) xRy ⇔df (x,y) ∈ R,
       Dom(R) =df {x: ∃y(xRy)},
       Im(R) =df {y: ∃x(xRy)},
       Ar(R) =df Dom(R) ∪ Im(R).

Here Dom(R), Im(R), and Ar(R) are said to be the first domain, the second or image domain, and the area or field, respectively, of the relation R. A relation R may be considered as a projection from Dom(R) to Im(R). The restriction of a relation R to a class A - denoted by "R↓A" - is defined by

(D2.7) R↓A =df {(x,y): (x ∈ A) & xRy} = R ∩ (A × Im(R)).
If aRb holds, we can say that b is an R-image of a. The class of all R-images of a will be denoted by "R''{a}". We extend this notation to an arbitrary class A in the place of {a}:

(D2.8) R''A =df {y: ∃x((x ∈ A) & xRy)}.

If every member of Dom(R) has a single R-image then R is said to be a function. The metalogical predicate 'Fnc' is defined by

(D2.9) Fnc(R) ⇔df Rel(R) & ∀x∀y∀z((xRy & xRz) ⊃ (y = z)).

Now we can formulate Fraenkel's postulate of set theory:

(P6) Fnc(R) ⊃ s(R''a^∈),

telling that the R-image of a set is a set, provided R is a function. Now, let "Id↓A" be the class {(x,x): x ∈ A}, the identity relation restricted to the class A. Obviously, Id↓A is a function. Then, by (P6), we have:

⊩ s((Id↓A)''a^∈).

However, (Id↓A)''a^∈ = a^∈ ∩ A; hence:

⊩ s(a^∈ ∩ A),

which is exactly Zermelo's postulate (Z) in 10.1.6. - If R is a function, we can write "R(a) = b" instead of "aRb".
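On finite relations, the notions of (D2.6)-(D2.9) are directly computable. The sketch below is our own illustration (the function names are assumptions); it also retraces how Zermelo's postulate falls out of Fraenkel's (P6) via the restricted identity relation:

```python
# A relation is a set of ordered pairs (modelled by Python tuples).

def dom(R):   return {x for (x, y) in R}
def im(R):    return {y for (x, y) in R}
def image(R, A):                       # R''A, the class of R-images of A
    return {y for (x, y) in R if x in A}
def is_fnc(R):                         # Fnc(R): each x has a single image
    return all(len(image(R, {x})) == 1 for x in dom(R))

a = {1, 2, 3}
A = {2, 3, 4}
id_restricted = {(x, x) for x in A}    # Id restricted to A: a function
assert is_fnc(id_restricted)
# (Id restricted to A)''a = a ∩ A, whose sethood is Zermelo's postulate (Z):
assert image(id_restricted, a) == a & A == {2, 3}
```

This is only a finite combinatorial shadow of (P6), but it makes visible why restricting the identity function suffices to carve an intersection out of a set.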
10.2.3. Further notions concerning relations. The following definitions are useful in logical semantics. Changing the two domains of a relation R we get its converse, denoted by "R˘":

(D2.10) R˘ =df {(x,y): yRx}.

The relative product of the relations R and S - denoted by "R|S" - is defined by

(D2.11) R|S =df {(x,y): ∃z(xRz & zSy)}.

(D2.12) A function is said to be invertible iff its converse is, again, a function.

The class of all functions from B into A - denoted by "ᴮA" - is defined by

(D2.13) ᴮA =df {f: s(f) & Fnc(f) & (f ⊆ B × A) & Dom(f) = B}.

(Th.2.3) ⊩ (s(A) & s(B)) ⊃ s(ᴮA).
(Th.2.4) ⊩ ⁰A = {0}.
(Th.2.5) ⊩ (A ≠ 0) ⊃ (ᴬ0 = 0).

(D2.14) 1 =df {0}.

This is the set-theoretical representation of the natural number 1. Using it, (Th.2.4) can be written as

⊩ ⁰A = 1.

10.2.4.
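The content of (D2.13) and of the two small theorems about function spaces can be checked by brute enumeration on finite sets. The following sketch is our own illustration (the name `functions` is an assumption): it lists all functions from B into A as sets of pairs, so that the counts of (Th.2.4) and (Th.2.5) come out directly.

```python
# Enumerate all functions from B into A as sets of ordered pairs.
from itertools import product

def functions(B, A):
    B = sorted(B)
    return [set(zip(B, values))
            for values in product(sorted(A), repeat=len(B))]

A, B = {0, 1, 2}, {0, 1}
assert len(functions(B, A)) == len(A) ** len(B)   # 3^2 = 9 functions

# From the empty set there is exactly one function, the empty one (Th.2.4):
assert functions(set(), A) == [set()]
# Into the empty set from a nonempty set there is none (Th.2.5):
assert functions(B, set()) == []
```

The count |A|^|B| also explains the "power"-style prefix notation ᴮA.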
DEFINITION. The relation R is said to be

reflexive iff ∀x((x ∈ Ar(R)) ⊃ xRx),
irreflexive iff ∀x(∼xRx),
symmetrical iff ∀x∀y(xRy ⊃ yRx),
antisymmetrical iff ∀x∀y((xRy & yRx) ⊃ (x = y)),
asymmetrical iff ∀x∀y(xRy ⊃ ∼yRx),
transitive iff ∀x∀y∀z((xRy & yRz) ⊃ xRz),
connected iff ∀x∀y(((x ∈ Ar(R)) & (y ∈ Ar(R)) & (x ≠ y)) ⊃ (xRy ∨ yRx)),
an equivalence iff it is both symmetric and transitive,
a partial ordering iff it is reflexive, antisymmetric, and transitive,
a linear ordering iff it is irreflexive, transitive, and connected.

Note that

(R is symmetric and transitive) ⇒ R is reflexive,
(R is asymmetric) ⇒ R is irreflexive,
(R is irreflexive and transitive) ⇒ R is asymmetric.
10.3 Ordinal, Natural, and Cardinal Numbers

The successor of an individual a - denoted by "a⁺" - is defined as follows:

(D3.1) a⁺ =df a^∈ ∪ {a}.

Natural numbers in set theory are representable by the following definitions:

(D3.2) 0 =df 0 (the empty set),
       1 =df 0⁺ = {0},
       2 =df 1⁺ = {0,1},
       3 =df 2⁺ = {0,1,2},
       4 =df 3⁺ = {0,1,2,3},

and so on. Intuitively: any natural number n is the set of natural numbers less than n. Or: if a natural number n is defined already then the next natural number is defined as n⁺ = n ∪ {n}.
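The representation in (D3.2) is easy to compute with hereditarily finite sets. The following sketch is our own illustration (the name `succ` is an assumption):

```python
# Von Neumann naturals: 0 = {}, n+ = n ∪ {n}; thus n = {0, 1, ..., n-1}.

def succ(n: frozenset) -> frozenset:
    return n | frozenset({n})

zero = frozenset()
one, two, three = succ(zero), succ(succ(zero)), succ(succ(succ(zero)))

assert one == frozenset({zero})               # 1 = {0}
assert two == frozenset({zero, one})          # 2 = {0, 1}
assert three == frozenset({zero, one, two})   # 3 = {0, 1, 2}
assert len(three) == 3                        # n has exactly n members
assert two in three and two < three           # n < m iff n ∈ m (and n ⊂ m)
```

The last assertion previews (D3.10): on these sets, membership, proper inclusion, and the ordering of the naturals all coincide.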
How can we define the class of all natural numbers? This work needs a series of further definitions.

10.3.1. Ordinal numbers. We say that the relation R well-orders the class A - in symbols: "R.Wo.A" - iff R↓A is connected, its area is A, and every nonempty subset a of A has a singular member not belonging to Im(R↓a^∈). In details:

(D3.3) R.Wo.A ⇔df ∀x∀y(((x ∈ A) & (y ∈ A) & (x ≠ y)) ⊃ (xRy ∨ yRx)) &
       ∀a(s(a) ⊃ (a^∈ ⊆ A) ⊃ (a^∈ ≠ 0) ⊃ ∃x((x ∈ a) & ∼∃y((y ∈ a) & yRx))).

One can prove that if R.Wo.A then R↓A is a linear ordering.

(D3.4) A class A is said to be transitive iff ∀x((x ∈ A) ⊃ (s(x) & x^∈ ⊆ A)).

Let us denote by 'Eps' the relation 'is a member of', i.e.,

(D3.5) Eps =df {(x,y): (x ∈ y)}.

Now, all the sets 0, 1, 2, 3, 4 in (D3.2) are well-ordered by Eps and are transitive.

(D3.6) A class A is said to be an ordinal iff it is transitive and Eps.Wo.A.

Ordinals which are sets will be called ordinal numbers. Their class, On, is defined by

(D3.7) On =df {x: s(x) & x^∈ is an ordinal}.
10.3.2. The following statements can be proved:

(1) Every nonempty class of ordinal numbers has a singular member with respect to the relation Eps.
(2) Every member of an ordinal is an ordinal number.
(3) On is an ordinal.
(4) Every ordinal other than On is a member of On.
(5) On is a proper class, i.e., ∼s(On).
(6) The successor of an ordinal number is an ordinal number (i.e., (α ∈ On) ⊃ (α⁺ ∈ On)).

We shall use lower-case Greek letters - α, β, γ - referring to ordinal numbers.

An ordinal number other than 0 may or may not be a successor of another ordinal number; if not, it is called a limit ordinal number. Now, the class of non-limit ordinal numbers, On₁, is defined by

(D3.8) On₁ =df {α: (α = 0) ∨ ∃β((β ∈ On) & (β⁺ = α))},

whereas On − On₁ is the class of limit ordinal numbers.
10.3.3. Natural numbers. - In set theory, natural numbers are represented by those members of On₁ which, starting from 0, are attainable by means of the successor operation. Thus, the definition of the class of natural numbers, ω, is as follows:

(D3.9) ω =df {α: (α ∈ On₁) & (α^∈ ⊆ On₁)}.

Now, ω is proved to be an ordinal. Hence, either ω = On, or else ω ∈ On. Mathematics cannot be devoid of the following postulate:

(P7) s(ω).

From this it then follows that ω is a limit ordinal number.

(D3.10) (α < β) ⇔df (α ∈ β);   (α ≤ β) ⇔df ((α < β) ∨ (α = β)).
We omit the details of how the full arithmetic of natural numbers - including arithmetical operations - can be developed in the frames of set theory. The essence is that, accepting set theory in metalogic, we can use the notions of arithmetic as well.

By induction on ω, we can define the notion of an ordered (n+1)-tuple (n ≥ 2) as an ordered pair whose first member is an ordered n-tuple. We agree in writing

(a₁, ..., aₙ, aₙ₊₁) =df ((a₁, ..., aₙ), aₙ₊₁).

Similarly, we define for n > 0

A⁽ⁿ⁺¹⁾ =df A⁽ⁿ⁾ × A.

10.3.4. Sequences. Where α is an ordinal number, by an α-sequence let us mean any function defined on α, i.e., a member of ᵅInd. If s is an α-sequence, and β < α, then the s-image of β is called the β-th member of s. The usual notation: s = ⟨s_β⟩_{β<α}.

Sequences defined on a natural number (a member of ω) are called finite sequences. The single 0-sequence is 0. If ω ≤ α (α ∈ On), the α-sequences are called transfinite sequences, whereas ω-sequences are said to be (ordinary) infinite sequences. If S = ⟨sᵢ⟩_{i<n} and T = ⟨tᵢ⟩_{i<k} are finite sequences (n, k ∈ ω), then their concatenation can be defined by

S⌢T = S ∪ {(n+i, x): (i,x) ∈ T}

(here '+' denotes arithmetical addition, which is definable in set theory).
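Since a finite sequence is just a function on {0, ..., n-1}, the concatenation formula can be executed directly. The sketch below is our own illustration, modelling such functions as Python dicts (an assumption of the model):

```python
# Finite sequences as functions on a natural number, modelled by dicts with
# domain {0, ..., n-1}; concatenation shifts the indices of the second one.

def concat(S: dict, T: dict) -> dict:
    n = len(S)
    return {**S, **{n + i: x for i, x in T.items()}}

S = {0: 'a', 1: 'b'}          # the 2-sequence (a, b)
T = {0: 'c'}                  # the 1-sequence (c)
assert concat(S, T) == {0: 'a', 1: 'b', 2: 'c'}
assert concat({}, T) == T     # 0, the empty sequence, is a left unit
```

The dict-merge mirrors the union in the definition, and the index shift n+i is exactly the arithmetical addition mentioned above.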
10.3.5. Finite and infinite sets. (i) Sets a, b are said to be equivalent, or of equal cardinality - in symbols: "a ≈ b" - iff there exists an invertible function f such that Dom(f) = a and Im(f) = b. This relation is an equivalence on the class Set. (ii) A set a is said to be finite iff for some n ∈ ω, a ≈ n; denumerably infinite iff a ≈ ω; denumerable iff for some α ≤ ω, a ≈ α. (iii) We say that set a is of smaller cardinality than set b iff for some b′ ⊆ b, a ≈ b′, but a ≈ b does not hold. As Cantor showed, every set a is of smaller cardinality than po(a). Applying this theorem to ω we get that there are non-denumerable infinite sets (e.g., po(ω)).

10.3.6. Cardinal numbers. If α ∈ On, the class

A = {β : β ≈ α}

has a minimal member α₀ that will be called the cardinal number of each member of A (hence, of α, too). In general, an ordinal number α is said to be a cardinal number iff

∀β(((β ∈ On) & (β ≈ α)) ⊃ (α ≤ β)),

and if α is a cardinal number then it is said to be the cardinal number of any set a of which a ≈ α holds. (Thus, if a and b are of equal cardinality - in the sense of 10.3.5 (i) - and they have a cardinal number then they have the same cardinal number.) Does every set have a cardinal number? The investigations of Zermelo led to the result: if there exists an invertible function which well-orders a set a then there is an ordinal number α such that a ≈ α, and, hence, the cardinal number of α is the cardinal number of a. To prove that each set can be well-ordered, Zermelo needed the axiom of choice (AC), the final postulate of set theory:

(P8) If a is a set and ∀x((x ∈ a) ⊃ ∃y(y ∈ x)) then ∃f(Fnc(f) & (a ⊆ Dom(f)) & ∀x((x ∈ a) ⊃ (f(x) ∈ x))).

This postulate is rarely used in logical semantics. The arithmetic of ordinal and cardinal numbers coincides in the finite (natural numbers are cardinal numbers, too) but bifurcates in the infinite (the simplest example: ω+ ≈ ω, hence ω+ is not a cardinal number).
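For finite sets the notions of 10.3.5 collapse into counting: two finite sets are equivalent iff they have the same number of elements, and Cantor's theorem reduces to 2^n > n. A sketch (not from the book; the sample sets are arbitrary):

```python
# A sketch (not from the book): for finite sets, "equal cardinality" is
# just equal size, and Cantor's theorem reduces to 2**n > n.
from itertools import chain, combinations

def po(a):
    """Power set of a finite set, as a list of frozensets."""
    a = list(a)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(a, r) for r in range(len(a) + 1))]

def equivalent(a, b):
    """Finite sets are equivalent iff an invertible f from a onto b exists,
    i.e. iff they have the same number of elements."""
    return len(set(a)) == len(set(b))

a = {0, 1, 2}
print(len(po(a)))                      # 8 = 2**3 > 3
print(equivalent(a, {'x', 'y', 'z'}))  # True
print(equivalent(a, po(a)))            # False: a is of smaller cardinality
```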
Remark. We posed the question at the end of Sect. 8.3: Why is it impossible to enlarge theory CC into a neg-complete first-order theory? Now, set theory gives an answer. The class 𝒜cc° is denumerably infinite (if we assume that it is a set). Then its subclass, the class of all monadic predicates definable in 𝓛⁰, is a denumerable one. However, the truth domain of such a predicate is a subclass of 𝒜cc°. Then, the class of all possible truth domains is po(𝒜cc°), which is not a denumerable class (taking into consideration that 𝒜cc° ≈ ω). That is, there are more possible extensions of monadic predicates than there are such predicates expressible in a first-order language.
If the postulates from (P0) to (P7) form a consistent system (which we hope but do not know) then it remains consistent by adding either (P8) or the negation of (P8). In this sense, (P8) is independent of the other postulates of set theory. The same holds for the so-called generalized continuum hypothesis (GCH), which is proved to be independent even of (P8); GCH tells that if α is an infinite cardinal number (ω ≤ α) then no cardinal number exists between α and the cardinal number of po(α). The case α = ω is the original hypothesis - its truth was believed by Cantor.
Remark. Our (sketchy) treatment of set theory follows the style of TAKEUTI & ZARING 1971 - except the permission of primobs (which is missing in that work).
10.4. Applications
10.4.1. Inductive definitions. We say that a class A is closed with respect to the relation R iff for some n > 0, A⁽ⁿ⁾ ⊆ Dom(R), and R″A⁽ⁿ⁾ ⊆ A.
Let a be a set, b ⊆ a, and let a be closed with respect to the relations R₁, ..., Rₖ (k ≥ 1). A class C is said to be a (b, R₁, ..., Rₖ)-inductive class iff b ⊆ C, and C is closed with respect to R₁, ..., Rₖ. By our assumption, a is a (b, R₁, ..., Rₖ)-inductive set. Now, the intersection of all (b, R₁, ..., Rₖ)-inductive subsets of a, i.e.,

C₀ =df {x : ∀c((c ⊆ a & c is (b, R₁, ..., Rₖ)-inductive) ⊃ (x ∈ c))}

is obviously the smallest (b, R₁, ..., Rₖ)-inductive subclass of a. Since C₀ ⊆ a, we have that C₀ is a set. We can say that C₀ is inductively defined by the conditions (b, R₁, ..., Rₖ) where b is the base of the induction, and R₁, ..., Rₖ are the inductive rules (cf. Sect. 4.1). Now we see how inductive definitions can be transformed into explicit definitions in set theory - of course, if some conditions are fulfilled.
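The explicit definition of C₀ can be checked by brute force on a small finite set: intersect all inductive subsets of a. A sketch (not from the book; the base {0} and the rule x ↦ x+2 are hypothetical examples):

```python
# A sketch (not from the book): within a finite set a, the class C0
# inductively defined by base b and a unary rule is the intersection
# of all (b, R)-inductive subsets of a.
from itertools import combinations

def is_inductive(c, b, rule):
    # b is contained in c, and c is closed under the rule (None = undefined)
    return b <= c and all(rule(x) in c for x in c if rule(x) is not None)

def smallest_inductive(a, b, rule):
    subsets = [set(s) for r in range(len(a) + 1)
               for s in combinations(a, r)]
    c0 = set(a)
    for s in subsets:
        if is_inductive(s, b, rule):
            c0 &= s                     # intersect all inductive subsets
    return c0

# Hypothetical example: a = {0,...,9}, base b = {0}, rule x |-> x + 2.
a = set(range(10))
rule = lambda x: x + 2 if x + 2 in a else None
print(sorted(smallest_inductive(a, {0}, rule)))   # [0, 2, 4, 6, 8]
```

The same set would be obtained "from below" by iterating the rule on the base until nothing new appears; the intersection form is the explicit (non-iterative) definition that set theory provides.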
10.4.2. Reconstruction of syntax in set theory. Let 𝒜 be a (finite) alphabet. We can assume that the members of 𝒜 are primobs in our set theory. Since the members of 𝒜 can be enumerated, we get by (P2) and (P3) that 𝒜 is a set. 𝒜-words consisting of n letters can be represented as n-sequences of letters (members of ⁿ𝒜). Concatenation of words can be expressed as concatenation of sequences (cf. 10.3.4). Then, the empty word 0 is represented by the empty sequence, i.e., by 0. Finally, the class of 𝒜-words can be defined by

𝒜° =df ∪{x : ∃n((n ∈ ω) & (x ∈ ⁿ𝒜))}.

By referring to the postulates (P3), (P6), and (P7), we get that 𝒜° is a set.
Also, canonical calculi can be represented in set theory. We omit here the details, but we show an example in reconstructing a rule of a canonical calculus. Assume that the rule in question is of form

f₁, ..., fₖ → g
and the variables occurring in it are replaced by the first-order variables x₁, ..., xₙ. Consider the function G defined on (𝒜°)⁽ᵏ⁾ with Im(G) ⊆ 𝒜°:

G =df {⟨u₁, ..., uₖ, v⟩ : ∃x₁ ... ∃xₙ((x₁, ..., xₙ ∈ 𝒜°) & (u₁ = f₁′) & ... & (uₖ = fₖ′) & (v = g′))}

where f₁′, ..., fₖ′, and g′ represent the words f₁, ..., fₖ and g, respectively, in set theory. Now, 𝒜° is closed with respect to the function G. It then follows that if K is a canonical calculus over 𝒜, the class of 𝒜-words derivable in K can be defined by an inductive definition of set theory (in the sense of 10.4.1). Not surprisingly, theory CC (see Ch. 8) can be embedded, by some translation, into set theory. It then follows that if set theory is consistent, it cannot be complete with respect to negation.
10.4.3. First-order semantics (in a nutshell). Let 𝓛¹ = ⟨Log, Var, Con, Term, Form⟩ be a first-order language (cf. Sect. 6.2). By an interpretation of 𝓛¹ let us mean a couple Ip = ⟨U, ρ⟩ where U is a nonempty set and ρ is a function with Dom(ρ) = Con satisfying the following conditions:
(i) If φ is a name (i.e., a name functor with arity 0) then ρ(φ) ∈ U.
(ii) If φ is a name functor of arity n > 0 then ρ(φ) is a function from U⁽ⁿ⁾ to U.
(iii) If π is a predicate of arity 0 then ρ(π) ∈ 2 = {0,1}.
(iv) If π is a predicate of arity n > 0 then ρ(π) is a function from U⁽ⁿ⁾ to 2.
By a valuation of the variables - associated to Ip - let us mean a function v from Var into U; the set of all such valuations is denoted by V(U). If v ∈ V(U), x ∈ Var, and u ∈ U, we denote by "v[x:u]" the valuation which differs from v (at most) in v[x:u](x) = u. That is: v[x:u](y) = v(y) if y is other than x, and v[x:u](x) = u.
We define the semantic value of any term and formula of 𝓛¹ as determined by Ip and v. The semantic value of an expression A, according to Ip and v, will be denoted by "|A|_v^Ip". However, the superscript 'Ip' will be omitted, for it is constant throughout in the following inductive definition.
(1) x ∈ Var ⟹ |x|_v = v(x).
(2) If φ is a name functor of arity 0 then |φ|_v = ρ(φ).
(3) If s is an n-tuple of terms (n ≥ 1), and t ∈ Term, then |s(t)|_v = ⟨|s|_v, |t|_v⟩.
(4) If φ is a name functor of arity n > 0, and s is an n-tuple of terms, then |φs|_v = ρ(φ)(|s|_v).
(5) If π is a predicate of arity 0 then |π|_v = ρ(π).
(6) If π is a predicate of arity n > 0, and s is an n-tuple of terms, then |πs|_v = ρ(π)(|s|_v).
(7) s, t ∈ Term ⟹ |(s = t)|_v = 1 iff |s|_v = |t|_v (and 0 otherwise).
(8) A ∈ Form ⟹ |−A|_v = 1 − |A|_v.
(9) A, B ∈ Form ⟹ |(A ⊃ B)|_v = 0 iff |A|_v = 1 and |B|_v = 0 (and 1 otherwise).
(10) (x ∈ Var & A ∈ Form) ⟹ |∀x A|_v = 0 iff for some u ∈ U, |A|_v[x:u] = 0 (and 1 otherwise).
Let Γ be a class of formulas, Ip an interpretation, and v a valuation associated to Ip. We say that the pair ⟨Ip, v⟩ satisfies Γ iff

∀A((A ∈ Γ) ⟹ |A|_v^Ip = 1),

and we say that Γ is satisfiable iff there is a pair ⟨Ip, v⟩ that satisfies Γ. Furthermore, we say that sentence A is a semantic consequence of Γ - in symbols: "Γ ⊨ A" - iff Γ ∪ {−A} is not satisfiable. (In case Γ = 0, we write "⊨ A", and say that A is a logical truth (or a valid formula) of first-order logic.)
Metatheorems. (a) QC is sound with respect to first-order semantics in the sense that Γ ⊢ A ⟹ Γ ⊨ A. (b) QC is complete with respect to first-order semantics in the sense that Γ ⊨ A ⟹ Γ ⊢ A.
Proof of (a) is simple enough, whereas the proof of (b) needs some work.
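As an illustration only (not part of the book), the evaluation clauses can be turned into a small recursive evaluator over a finite universe. The tuple encoding of formulas and the sample predicate 'P' are hypothetical choices, not the book's notation.

```python
# A sketch (not from the book): the semantic-value clauses as a recursive
# evaluator. Formulas are nested tuples; rho maps constants to their values.

def value(e, U, rho, v):
    """Semantic value |e|_v in the interpretation <U, rho>."""
    op = e[0]
    if op == 'var':                               # clause (1)
        return v[e[1]]
    if op == 'fun':                               # clauses (2) and (4)
        _, phi, args = e
        vals = tuple(value(t, U, rho, v) for t in args)
        return rho[phi] if not args else rho[phi][vals]
    if op == 'pred':                              # clauses (5) and (6)
        _, pi, args = e
        vals = tuple(value(t, U, rho, v) for t in args)
        return rho[pi] if not args else rho[pi][vals]
    if op == '=':                                 # clause (7)
        return 1 if value(e[1], U, rho, v) == value(e[2], U, rho, v) else 0
    if op == 'neg':                               # clause (8)
        return 1 - value(e[1], U, rho, v)
    if op == 'imp':                               # clause (9)
        return 0 if (value(e[1], U, rho, v) == 1
                     and value(e[2], U, rho, v) == 0) else 1
    if op == 'all':                               # clause (10)
        _, x, body = e
        return 0 if any(value(body, U, rho, {**v, x: u}) == 0
                        for u in U) else 1
    raise ValueError(op)

# Hypothetical interpretation: U = {0, 1, 2}, P = "is even".
U = {0, 1, 2}
rho = {'P': {(0,): 1, (1,): 0, (2,): 1}}
A = ('all', 'x', ('pred', 'P', (('var', 'x'),)))   # "for all x, P(x)"
print(value(A, U, rho, {}))                        # 0, since P(1) = 0
```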
* * *
Closure. From now on, the definition of a logical system, after describing its language (or a family of languages), can be continued by formulating its semantics, using set-theoretic notions and operations. (This means that a part of set theory is included in our metalogic.) Later on, we can investigate whether there exists a logical calculus which is sound - and perhaps complete - with respect to our semantical system. This is the semantics-motivated way of logical investigations.
REFERENCES
CANTOR 1932
Georg Cantor, Gesammelte Abhandlungen (hrsg. E. Zermelo). Berlin.
CARNAP 1961
Rudolf Carnap, Introduction to Semantics and Formalization of Logic. Harvard Univ. Press, Cambridge, Mass.
CHURCH 1936
Alonzo Church, "A note on the Entscheidungsproblem." The Journal of Symbolic Logic 1, 40-41, 101-102.
FEFERMAN 1982
S. Feferman, "Inductively presented systems and the formalization of metamathematics." In: D. van Dalen et al. (eds.), Logic Colloquium '80. North-Holland, Amsterdam, 95-128.
FREGE 1879
Gottlob Frege, Begriffsschrift, eine der arithmetischen nachgebildete Formelsprache des reinen Denkens. Halle.
GENTZEN 1934
Gerhard Gentzen, "Untersuchungen über das logische Schliessen." Mathematische Zeitschrift 39, 176-210, 405-431.
GÖDEL 1931
Kurt Gödel, "Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme." Monatshefte für Mathematik und Physik 38, 173-198.
HILBERT 1926
David Hilbert, "Die Grundlagen der Mathematik." Abhandlungen aus dem Mathematischen Seminar der Hamburgischen Universität, Bd. VI (1928), 65-85.
KLEENE 1952
S. C. Kleene, Introduction to Metamathematics. Van Nostrand, Princeton/Amsterdam.
MARKOV 1954
A. A. Markov, Teorija algorifmov. (Russian.) Moscow.
MARTIN-LÖF 1966
Per Martin-Löf, Algorithmen und zufällige Folgen. Erlangen. (Unpublished manuscript.)
MENDELSON 1964
E. Mendelson, Introduction to Mathematical Logic. D. Van Nostrand Co., Inc., Princeton etc.
MORRIS 1938
Ch. Morris, Foundations of the Theory of Signs. Chicago.
POST 1943
E. L. Post, "Formal reductions of the general combinatorial decision problem." American Journal of Mathematics 65, 197-215.
POST 1944
E. L. Post, "Recursively enumerable sets of positive integers and their decision problems." Bulletin of the American Mathematical Society 50, 284-316.
RUSSELL 1959
Bertrand Russell, My Philosophical Development. George Allen & Unwin Ltd., London.
RUZSA 1988
I. Ruzsa, Logikai szintaxis és szemantika, I. ('Logical syntax and semantics', in Hungarian.) Akadémiai Kiadó, Budapest.
SMULLYAN 1961
Raymond M. Smullyan, Theory of Formal Systems. Princeton Univ. Press, Princeton.
TAKEUTI & ZARING 1971
G. Takeuti and W. M. Zaring, Introduction to Axiomatic Set Theory. Springer, New York etc.
INDEX A
B
.>t (alphabets), 25 .>to, j[1' 29
Bo , 89
(Bl) .. . (B8), 71
j[cc,43 86 .>td , 29,33 j[MF,41 Jil PL , 39
base of induction, 31 basic formulas/schemata, 66, 70 Bernays, Paul, 98 BF, 70 BF, 78,79 biconditional, 15,69 blocking (of an algorithm), 53,54 bound occurrences of variables, 12,69
j[c:O'
Ao , 89
103, 108 algebraic semantics , 97 algorithm, 47, 51-54 applicability of , 51, 53, 54 commands of, 51.54 deciding, 58 normal, 52-55 questions of, 51 steering of, 51 alphabet, 2, 25 alternation, 15, 69 antecedent, 16 antisymmetrical relation, 106 AT, 104 area (field) of a relation, 104 argument places (of functors), 7 arity, 41,68 associativity , 17, 24, 26 asymmetrical relation, 103, 106 atomic formulas in first-order languages, 42,68 in propositional languages, 39, 72 a-tuples of terms 68
A(2) , A(n),
C
C₀, 94; C, 92
CC.1 ... CC.9, 82-83; CC, 85-86; CC*, 76-77
calculus, canonical, 36-37; logical, 1, 64-65
Cantor, Georg, 98, 100, 101, 109, 111
cardinality, equal, 109
cardinal numbers, 109
Carnap, Rudolf, 5
Cartesian product, 104
categories of a language, 30
CCal, 44
characters, 2
Church, Alonzo, 61, 83, 84
Church's Theorem, 84
Church's Thesis, 61
class abstracts, 21, 100; notation, 20; variables, 21, 100
classical first-order logic, 1, 40, see also first-order calculus
closed formula/term, 69-70; sentence, 10, 12
closure (of inductive definition), 31
Cns, 74
coincidence of class extensions, 22-23
commands (of an algorithm), 51, 53, 54
commutativity, 17, 24
Aut, 46 autonomous numerals, 46 autonym sense, 28 Jil-words, 25 axiom, 71, 75 of choice, 109 of extensionality, 98 of Fraenkel, 106 of pairs, 102 of power set, 103 of regularity, 103 of union, 102 of Zermelo, 102 of 0), 108
completeness with respect to negation, 85 with respect to semantics, 1, 112 computability, 52 Con, 68 Cono, 86 Con*, 77 concatenation, 8,25, 108 conditional, 15-16,69 conjunction, 15, 69 connectedrelation, 106 connectives, 17 Cons.«, 93 consequencerelation, 66, 112 consequent, 16 consistency, 74,93 constants (in first-order languages), 68 continuum hypothesis, 110 contraposition, 73 converseof a relation, 105 co.po., 73 Cut, 73
Dom, 104
domains of a relation, 104 DT,73 dyadic functors, 7 numerals, 29, 33
E empty class, 23 set, 103 word, 3,25 enumerability, 47 Eps, 107 equality (versusidentity), 9 equivalence, 17 as relation, 106 of algorithms, 52 of logicalcalculi, 66 of sets, 109 existentialquantifier, 11,69 . sentence, 13 expressions (of a language), 3 extension(of a predicate), 21
D
F
D,33 decidability of a class of words, 47,50,58 of QC, 84 deciding algorithms, 58 deductibility, 66, 71 DeductionTheorem, 73 definiendum, 19 definiens, 19 definite classes, 61-64 , definition, 1-20 contextual/explicit, 20 inductive, 31 denumerablesets, 109 denumerably infinite sets, 109 derivability (in canonicalcalculi), 36 derived rules (of canonicalcalculi), 94 detachment, 18, 36, 71 Diagda/x.b), 88 diagonalization, 87-90 differenceof classes, 24 disjoint classes, 24 disjunction, 17 distributivity, 24
F, 78 Feferman,S. F., 42 field of a relation, 104 finite means, 1 sets, 109 first-order calculus, 2, 66-70, see also QC language, 66-69 maximal,41, 67 semantics, 111 theory, 74 Fnc, 105 Form. 68 Formo .86 Form*, 77,8] FormMF,42 FormpL, 39
formal language, 2 formula, 39-39,42,68 atomic, 39,68 foundational problemof logic, 1-2 FR,78 Fraenkel, Abraham, 98, 105
Fraenkel's postulate, 105 free from a variable, 70 groupoid, 28 free occurrences of variables, 12, 69 Frege, Gottlob, 67 function, 105 invertible, 105 functor, 7, 8 as automaton, 7
inferences, 18 infinitesets, 109 inputs of a functor, 7 ofarule, 33 of a command, 53 instantiation (of a universal or existentialsentence), 13 interpretation (of ,£1), 111 intersection of classes, 24 invertible function, 105
G
Ip, 111 irreflexive relation, 103, 106
GCH, 110
generalized continuum hypothesis, 110
Gentzen, Gerhard, 67
Gödel, Kurt, 46, 85, 92, 97, 98
Gödel's First Incompleteness Theorem, 92
Gödel's Second Incompleteness Theorem, 97
J
justification of QC, 71
juxtaposition, 28
K
Gödel numeral, 46
H
K LPL , 40 K MF , 42 KpL , 39 KIeene, S. C., 52, 61
8 1 , 43 8 2,44 8 3 , 45 Hilbert,David, 1, 67 homogeneous functor, 7 hypercalculi, 41-46
L ,£1,67 ,£1*, 77 ,£10, 86 (Ll), 2 (L2), (L3), 3 L pL , 40
I I, 78 Id, 105 ideal objects, 3 idempotence, 24 identity, 9,23,67, 73 in set theory, 98 'iff', 16 1m, 104 implication, 17 inconsistency, see consistency Ind, 101 indices, 39,41,68 indirectproof, 18 inductiveclasses, 36-37 definition, 31 in set theory, 110 rules, 31
lambdaoperator, 20 language first-order, 40, 67 maximal, 40-42 formal, 2,4
meta-, 4 natural, 2-3 object, 4 propositional, 38-40, 72 radix, 25-28 used, 4 laws of logic, 1 letters, 2, 25 lexicographic ordering, 45 limit ordinal numbers, 107
Log, 67 logic,the possibility of, logicalcalculus, 66 functors, 8, 15-16 systems, 1
naming words, 4 naturaldeduction, 67 natural language, 1, 2 numbers, 3, 106, 108 negation, 15,67 neg-complete (theory), 85 Neumann, J. von, 98 non-limitordinalnumbers, 107 non-logical functors, 8 non-stop commands, 53, 54 normal algorithm, 52, 54 notation conventions in first-order languages, 70, 79 in inductivedefinitions, 33 in presentinga language, 28 in set theory, 99 NumberTheory, 1,97 numerals, 33, 46 autonomous, 46
truth, 9,40, 71, 75, 112
logics, 1
M
Markov, A. A., 53, 84
Markov's Thesis, 61
Martin-Löf, Per, 42
mathematical logic, 1
maximal first-order language, 40, 67
meaningful expressions, 3
membership relation, 21
Mendelson, E., 53, 61
metalanguage, 4
metalogic, subject matter of, 1; instruments of, 7
metamathematics, 1
metatheorems, on PC, 73; on QC, 72-73
modus ponens, 1, 18, 71
modus tollens, 18
monadic functors, 7
Morris, Ch., 5
o object language, 4 occurrences of a variable, 12 in a formula, bound/free, 69 On, OnI, Onu, 107, ~08 open formula, 70 sentence, 10, 12 term, 69 operations, 8 ,orderedpairs, 103 n-tuples, 108 ordering, linear/partial, 106 well-, 107 ordinal, 107 numbers, 107 Orp, 103 output,of a functor, 7 ofa rule, 33
MP,71
N N,68 N*,77
No,86 N,78 N(f/g), 55 N[kl,57
Np,56 Nsuc , 57 N*,60
p
N[if], NUf•n] , 60, 61
P,68
NMF , 58 name,individual, 7, 41 closed/open, 12 functors, '7, 41 naming open expressions, 14
P*,77
Po, 86 P,78 (PO), (PI), 99 (P2), (P3), 102 (P4), (P5), 103
(P6), 105 (P7), 108 (P8), 109 parentheses, 11, 17, 67 PC,72 PC.1 ... PC.14, 73 phonemes, 3 po, 103 polyadic functors 7 Post, Emil L., 42 postulates on languages, 2, 3 of a theory, 75 of CC*, 79-81 of CC, 86 power class/set, 103 pragmatics, 5 predicates, 8,41,68 premises, 66 primary objects, 99 primobs, 99 procedure, 47,49 pronouns, 10 proof rules, 66 proper class, 100 properties (of individual objects) 8 propositional calculus, 72 logic, 38-40
relations, 103-106 relative product, 105 re-naming of bound variables, 13, 73 representability (of an algorithm by a canonical calculus), 62 restriction of a relation, 104 R-image, 105 RU,101
rules of a canonical calculus, 36 releasing, 39 . of deduction, 66 Russell, Bertrand, 101 Russell class, 101
s 51 , 43 5c I 78 S,78 satisfiability, 112 scope of quantification, 11 semantical foundation (of a logical system), 97, 111 semantic consequence, 66, 112 value, 112 semantics, 5, 97, 111 semiotics, 5 sentence, declarative, 7 closed/open, 10 functors, 8, 15-16 sequences (finite, infinite), 108 sequent calculus, 67 Set, 101 set theory, 1, 2, 98-100 Skolem, Thoralf, 98 Smullyan, Raymond, 42 soundness, 1, 112 steering (of an algorithm), 51 stop command (of an algorithm), 53 subclass, 22 proper, 23 subformula, 69 subsidiary letters, 35,37,54,56 substitution of free variables in formulas, 70 in sentences, 12 successor, 47-48, 57 in set theory, 106 SUD, 86-87, 95
Q quantification effectivenneffective, 12 existential/universal, 11 vacuous, 12 quantifiers, 11, 67, 69 quasi-quotation, 14 QC, 66, 70-71 QC.l ... QC.9, 73 questions of an algorithm, 51 quotation sign, double, 14 simple, 4
R (Rl) ... (R6), 26 (R7), 27 recursive functions, 52 reflexive relation, 17, 106 Rei, 104
superclass, 22 symmetricalrelation, 17, 106 syntax, 5 reconstruction of, in set theory, 111
union of classes, 24 universalclosure (of a formula), 74 generalization, 73 universalquantifier, 11, 41, 67 sentence, 13 used language, 4
T
v
T,78 tacit quantification, 14 Takeuti, G., 110 Term, 67-68 Term *, 77,81 Terms , 86 terms (in first-order languages),42,68 a-tuples of, 68 ThJa), 93 theory, first-order, 75 of canonicalcalculi, 76-77 Tr, 81 transfmite (mathematics), 2 transitive classes, 107 relation, 17, 106 triadic functors, 7 truth and falsity, 16 truth assignment (in CC*), 81-82 conditions, 16 domain (of a predicate), 21 Turing machines, 52
V,78 vacuousquantification, 12 valid formula, 112 valuationof variables, 112 Var, 67-68 variable, 10, 68 bound and free occurrencesof, 12 in canonical calculi, 36 binding operators, 12 V(U), 112
w well-formedexpressions, 3, 30 well-ordering, 107 WO,107
words (of a language), 3,26
z
u
Zaring, W. M., 110 Zermelo, Ernst, 98, 102, 105, 109 Zermelo's postulate, 102 zero-orderlanguage, 72 Z-F Set Theory, 98
VG,73 undecidability (of QC), 84 union class, 102
LIST OF SYMBOLS (Symbols composed from Latin letters are to be found in the alphabetic INDEX.)
(J
= =¢:.
Ax
Vx &
v => ~
~df
=df
{x: ~x)} E ~
8,25,108 9 9, 79 11
53 55 r 65 66 t 69 3 69 70 [Atx i) 76 77,81 r* 1:* 77 ro 86 1: 86 87 a fA 87 98,99 i 99 IrE a 101 102u(A) o(empty set) 103 (a,b) 103 AxB 104 RJ,A 104 IfIA . 105 U R 105 105 RIS B A 105 106 1 = {OJ a+ 106 (J) 108 108 <,~ 109 (V,p) 111 1A~lp 112 112 1= -7.
#
11
16,39,67 16,69 16,69 16 16 19 19 21 21,99 22,99
{ah .. . , an} 22
c c 0
u (1
A-B 51.0 (0 -7
*
»+
:::>
t 1t
x 0
'if
l; -<
22 23 23, 101 24 24 24 25 25 33,36,53 36,43,55 36 39,67 39 39 41 41 41,67 43 43
APPENDIX (LECTURE NOTES)
TYPE-THEORETICAL EXTENSIONAL AND INTENSIONAL LOGIC
CONTENTS
TECHNICAL INTRODUCTION 125
PART 1: EXTENSIONAL LOGIC 127
1.1. THE SEMANTICAL SYSTEM EL 127
1.1.1. The extensional type theory 127
1.1.2. The grammar of the EL languages 127
1.1.3. Semantics for EL languages 129
1.1.4. Some semantical metatheorems 130
1.1.5. Logical symbols introduced via definitions 132
1.1.6. The generalized semantics 133
1.2. THE CALCULUS EC 134
1.2.1. Definition of EC 134
1.2.2. Some proofs in EC 135
1.2.3. EC-consistent and EC-complete sets 142
1.2.4. The completeness of EC 144
PART 2: MONTAGUE'S INTENSIONAL LOGIC 148
2.1. THE SEMANTICAL SYSTEMS IL AND IL+ 148
2.1.1. Montague's type theory 148
2.1.2. The grammar of IL and IL+ 149
2.1.3. The semantics of IL and IL+ 150
2.1.4. The generalized semantics of IL 155
2.2. GALLIN'S CALCULUS IC 156
2.2.1. Definition of IC 156
2.2.2. The modal laws of IC 156
2.2.3. IC-consistent and IC-complete sets 159
2.2.4. Modal alternatives 160
2.2.5. The completeness of IC 162
2.3. APPLICATIONS OF IL+ 166
2.3.1. A fragment of English: LE 166
2.3.2. Translation rules from LE into L(ω) 171
2.3.3. Reduction of intensionality: meaning postulates 176
2.3.4. Some critical remarks 179
REFERENCES 182
TECHNICAL INTRODUCTION
These Notes contain only the most important technical parts of the lectures held by the author.
In these Notes, the canonical symbols of set theory will be used where it is necessary. The empty set is denoted by '0', and the set of natural numbers by 'ω'. An expression of form "{x : φ(x)}" refers to the set of objects x such that φ(x), where φ stands for some predicate. The set of functions from a set B into a set A will be denoted by "ᴮA".
In speaking about a formal language and its expressions, a metalanguage will be used which is common English augmented by some terms and symbols of set theory, by other symbols introduced via definitions, and by isolated letters (sometimes by groups of letters) used as metavariables. (The detailed use of metavariables will be explained in due course, preceding their actual applications in the text.) In speaking about a particular expression, say, about a symbol, we shall include it in between (simple) inverted commas, e.g. '&'. However, symbols will be mentioned sometimes autonymously (omitting the inverted commas) if this does not lead to a confusion.
We often have to speak about compound expressions of an object language; in such a case, we shall use schemata composed of metavariables and some symbols of the object language. These schemata will be included in between double inverted commas (serving as quasi-quotation marks), e.g. "((A & B) ⊃ C)". (Here A, B and C are metavariables referring to certain expressions of an object language.) Double inverted commas will be omitted if the schema is bordered by some symbol introduced in the metalanguage.
Definitions will be, in most cases, inductive ones. A class of expressions will be defined usually as the smallest set satisfying certain conditions. Among these conditions, there is one - or there are some - serving as the basis of the inductive definition, prescribing that some set(s) defined earlier must be included in the definiendum. The other conditions prescribe that the definiendum must be closed with respect to certain operations (in most cases, syntactic operations) applied to its members. Identity by definition will be expressed by '=df'.
Proofs will be, in most cases, of the same inductive nature. To show that all members of a set defined inductively have a certain property, it is sufficient to show that (i) the members of the basis set (of the inductive definition) have that property, and (ii) the property is hereditary via the operations mentioned in the inductive definition. This proof method will be called proof by structural induction if it is about a set of grammatical entities.
The symbol '⟹' stands for 'if ..., then ...', and '⟺' or 'iff' for 'if and only if'. 'Def.' and 'Th.' are abbreviations for 'Definition' and 'Theorem', respectively.
Budapest, April 1992.
I. Ruzsa
PART 1: EXTENSIONAL LOGIC
1.1. THE SEMANTICAL SYSTEM EL
We shall introduce a semantical system of the full type-theoretical extensional logic called EL.
1.1.1. THE EXTENSIONAL TYPE THEORY
We shall use 'o' (omicron) and 'ι' (iota) for the types of (declarative) sentences and (individual) names, respectively. A type of an extensional functor will be of form "α(β)" where β refers to the type of the input (i.e., the argument) and α refers to the type of the output (i.e., the expression obtained by combining the functor and its argument). The full inductive definition of the set of extensional types, EXTY, is as follows:

o, ι ∈ EXTY;
α, β ∈ EXTY ⟹ "α(β)" ∈ EXTY.

If β consists of a single character (o or ι), the parentheses surrounding it will be omitted. We shall use 'α', 'β', and 'γ' as variables referring to the members of EXTY. Instead of "α(β)" we write sometimes "αβ".
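As an aside (not from the book), the inductive structure of EXTY is easy to mirror in code. Anticipating Sect. 1.1.3, where each type α receives a domain D(α) with D(o) = {0,1}, D(ι) = U, and D(αβ) the set of functions from D(β) into D(α), the sizes of these domains over a finite universe can be computed recursively:

```python
# A sketch (not from the book): types encoded as 'o', 'i', or a pair
# (a, b) standing for the functor type a(b); dom_size gives |D(type)|.

def dom_size(ty, u_size):
    if ty == 'o':                 # truth values {0, 1}
        return 2
    if ty == 'i':                 # individuals, i.e. the universe U
        return u_size
    a, b = ty                     # a(b): functions from D(b) into D(a)
    return dom_size(a, u_size) ** dom_size(b, u_size)

# With a 3-element universe:
print(dom_size('o', 3))                     # 2
print(dom_size(('o', 'i'), 3))              # 8   (one-place predicates)
print(dom_size(('o', ('o', 'i')), 3))       # 256 (predicates of predicates)
```

The exponential growth at each functor level is exactly why type-theoretical domains are so much richer than first-order ones.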
1.1.2. THE GRAMMAR OF THE EL LANGUAGES
System EL deals with the grammar and the semantics of a family of type-theoretical extensional languages.
1.1.2.1. DEFINITION. By an EL language let us mean a quadruple

𝓛 = ⟨Log, Var, Con, Cat⟩

where:
Log = {(, ), λ, =}
is the set of logical symbols of the language (containing the left and right parentheses, the lambda operator λ, and the symbol of identity);
Var = ∪_{α ∈ EXTY} Var(α)
is the set of (bindable) variables of the language where each Var(α) is a denumerably infinite set of symbols called variables of type α;
Con = ∪_{α ∈ EXTY} Con(α)
is the set of (nonlogical) constants of the language where each Con(α) is a denumerable (perhaps empty) set of symbols called constants of type α;
all the sets mentioned up to this point are pairwise disjoint;
Cat = ∪_{α ∈ EXTY} Cat(α)
is the set of the well-formed expressions - briefly: terms - of the language where the sets Cat(α) are determined by the grammatical rules (G0) to (G3) below. For α ∈ EXTY, Cat(α) may be called the α-category of 𝓛.

(G0) Var(α) ∪ Con(α) ⊆ Cat(α).
(G1) A_αβ(B_β) ∈ Cat(α). [Read: If A ∈ Cat(αβ) and B ∈ Cat(β) then "A(B)" ∈ Cat(α).]
(G2) "(λx_β A_α)" ∈ Cat(αβ).
(G3) "(A_α = B_α)" ∈ Cat(o).

We write sometimes "(λx.A)" instead of "(λxA)" for the sake of easier reading.
1.1.2.2. DEFINITION. (i) An occurrence of a variable x in a term A is said to be a bound occurrence of x in A iff it lies in a part of form "(λx.B)" of A. An occurrence of x in A is said to be a free occurrence of x in A iff it is not a bound occurrence of x in A.
(ii) A term A is said to be a closed one iff no variable has a free occurrence in A. A term A is said to be an open one iff it is not a closed one.
(iii) The term A is said to be free from the variable x iff x has no free occurrence in A.
(iv) A variable x_α is said to be substitutable by the term B_α in the term A iff whenever "(λy.C)" is a part of A involving some occurrence of x which counts as a free occurrence of x in A, then B is free from y.
(v) By the result of substituting B_α for x_α in A let us mean the term A′ obtained from A via replacing all free occurrences of x by B, provided x is substitutable by B in A. We shall use the notation [A]^x_B for A′. (The square brackets will be omitted in case A consists of a single character.) In using this notation, we assume always the fulfilment of the proviso which assures that the free occurrences of variables in B remain free ones in A′ as well.
(vi) We shall denote by "C[B/A]" the term obtained from C via replacing a (single) occurrence of A - not preceded immediately by 'λ' - by B, provided A and B belong to the same category. This syntactic operation will be called replacement.
1.1.3. SEMANTICS FOR EL LANGUAGES
Throughout in this section, let 𝓛 be any EL language.
1.1.3.1. DEFINITION. By an interpretation of 𝓛 let us mean a couple Ip = ⟨U, ρ⟩ where U is a nonempty set and ρ is a function defined on Con such that

A ∈ Con(α) ⟹ ρ(A) ∈ D(α)

where D is a function defined on EXTY such that D(o) = {0,1}, D(ι) = U, and D(αβ) is the set of functions from D(β) into D(α). (Here '0' and '1' stand for the truth values False and True, respectively.) Given U, the function D is uniquely determined by these prescriptions. D(α) is said to be the domain of factual values of type α.
A function v defined on Var is said to be a valuation (of variables) joining to Ip iff

x ∈ Var(α) ⟹ v(x) ∈ D(α).

If x ∈ Var(α) and a ∈ D(α), we denote by "v[x:a]" the valuation which differs from v (at most) in v[x:a](x) = a. That is: if y is other than x, then v[x:a](y) = v(y).
1.1.3.2. DEFINITION. Given an interpretation Ip of 𝓛, we shall define, for all terms A ∈ Cat and for all valuations v joining to Ip, the factual value of A according to Ip and v - denoted by "|A|_v^Ip" - by the semantic rules (S0) to (S3) below. (In the notation, the superscript 'Ip' will be usually omitted whenever Ip is assumed to be fixed.)
Semantic rules:
(S0) If x ∈ Var, |x|_v = v(x). If C ∈ Con, |C|_v = ρ(C).
(S1) |A_αβ(B_β)|_v = |A|_v(|B|_v).
(S2) |(λx_β A_α)|_v is the function φ ∈ D(αβ) such that b ∈ D(β) ⟹ φ(b) = |A|_v[x:b]. In other words, for all b ∈ D(β), |(λx.A)|_v(b) = |A|_v[x:b].
(S3) |(A_α = B_α)|_v = 1 if |A|_v = |B|_v, and 0 otherwise.
1.1.3.3. LEMMA. The factual values |A|_v^Ip are uniquely determined by the rules (S0) to (S3), and, if A ∈ Cat(α), then |A|_v^Ip ∈ D(α).
Proof: by structural induction using the semantic rules. (The details are left to the reader.)
1.1.3.4. DEFINITION. Let Γ be a set of sentences (Γ ⊆ Cat(o)), Ip an interpretation, and v a valuation joining to Ip.
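A sketch (not from the book) of the rules (S0)-(S3) for a miniature fragment: the universe, the tuple encoding of terms, and the restriction of bound variables to type ι are hypothetical simplifications. Functions are tabulated extensionally as dicts, so that rule (S3) compares functional values extensionally, as the semantics requires.

```python
# A sketch (not from the book): (S0)-(S3) over U = {0, 1, 2}, with
# functions represented extensionally as dicts. Terms: ('var', x),
# ('con', c), ('app', A, B), ('lam', x, A), ('eq', A, B).

U = [0, 1, 2]

def make_key(val):
    # dicts cannot index dicts, so tabulated functions are frozen to tuples
    return val if not isinstance(val, dict) else tuple(sorted(val.items()))

def value(term, rho, v):
    op = term[0]
    if op == 'var':                                # (S0)
        return v[term[1]]
    if op == 'con':                                # (S0)
        return rho[term[1]]
    if op == 'app':                                # (S1)
        f = value(term[1], rho, v)
        return f[make_key(value(term[2], rho, v))]
    if op == 'lam':                                # (S2): tabulate over U
        x, body = term[1], term[2]
        return {make_key(b): value(body, rho, {**v, x: b}) for b in U}
    if op == 'eq':                                 # (S3)
        return 1 if value(term[1], rho, v) == value(term[2], rho, v) else 0

# |(lam x . x) = (lam y . y)|: both sides denote the identity on U.
lhs = ('lam', 'x', ('var', 'x'))
rhs = ('lam', 'y', ('var', 'y'))
print(value(('eq', lhs, rhs), {}, {}))    # 1
```

Note how alphabetic variants receive the same tabulated function, so the identity in (S3) holds between them; this is the semantic fact behind the re-naming of bound variables.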
(i) We say that the couple (Ip, v) is a model of riff for all A E I: IAI/p ='1. (ii) ris said to be satisfiable iff rhas a model, and ris said to be unsatisfiable iff rhas no model. (iii)The sentence A is said to be a semantic consequence of T - in symbols: "T ~ A" - iff everymodel of r is a model of {A}. (iv) Sentence A is said to be valid (or a logical truth of EL) - in symbols: "~ A" - iff A is a semantic consequence of the emptyset of sentences. (v) Terms A and B are said to be logically synonymousiff ~ (A = B). Note that if r is unsatisfiable then for all sentences A, r ~ A,. and if ~ A then for all r, r ~ A. 1.1.4. SOME SEMANTICAL METATHEOREMS Throughout in this section, a language L~ and an arbitrary interpretation Ip for L~ is assumed. Let us denote by "FV(A) " the set of variables having some free occurrences in the term A. Then: 1.1.4.1. LEMMA. If the valuations v and v/coincideon FV(A), then IAI/p = IAlv'/p . Proof. Our statement is obviously true if A is a variable or a constant. If A is of form "B(C)" or H(B = C)" then use the induction assumption that the lemmaholds true for B and C, and take into consideration that in these cases, FV(A) = FV(B) u FV(C) (and use the rules (S1) and (83). Finally, if A is of form "(A.xaBp)" then, FV(A) = FV(B)- {x}.
If v and
y'
coincide on FV(A) then for all b
E
D(b), v[x:b] and v' [x:b] coincide on
FV(B),' thus, by inductionassumption,
IBlv[x:b]
=IBlv'{x:b] .
Then (usingthe rule (S2», for all b e D(ft): I(AX.B)l v (b) = IBlv[x:b]
= IBlv'[x:b] =I(AX.B)lv'(b)
which means that I(AX.B)l v = I(Ax.B)l v " COROLLARY. If A is a closedtermthen for all valuations v and v~ IAlv = IAlv~ 1.1.4.2. LEMMA. If for all valuations v, lAaI v =I BuJv, then for all valuations v, ICl v = IC[B/A]l v ' (Cf. (vi) of Def. 1.1.2.2.) Proof. For the sake of brevity, we writeX'instead of "X[B/A] ". Our statement holds trivially if A is not a part of C, or Cis A. If C is of form "F(E)" or H(F =E)" then use the induction assumption that IFlv = IF' l, and IElv = IE' l, (for all v). If C is of form "(AXpE)" then C'must be "('Ax.E')". Using that for all v, IElv = IE' l, we have that for all v and for all b E D(ft), 130
IElv[x:bJ
=IE"lv[x:bJ
whichmeansthat for all valuations v, I(Ax.E)lv = 1(A.x.E")l v ' COROLLARY. If i= (A a =BaJ then i= (C = C[B/A]). Let us emphasize a furthercorollary: 1.1.4.3. THEOREM. The law of replacement. If i= (A a = BaJ and i= Cathen 1= C[B/A). 1.1.4.4. LEMMA. If xp is substitutable by BP in the term A then for all valuations v:
Proof. If A is free from x then A_x^B is the same as A, and (by Lemma 1.1.4.1) |A|_v = |A|_v[x:b]. Now assume that A is not free from x. Then we use, again, structural induction on A. If A is x then A_x^B is B, and, trivially, |B|_v = b = |x|_v[x:b]. The cases when A is of form "F(C)" or "(C = E)" are left to the reader. Now let us consider the case when A is of form "(λy.C)". Then y ≠ x (for otherwise "(λy.C)" would be free from x), and B must be free from y (for B is substitutable for x in "(λy.C)"). Hence, if y ∈ Var(γ) then for all c ∈ D(γ),

|B|_v[y:c] = |B|_v

(by Lemma 1.1.4.1). Then, for all c ∈ D(γ):

|(λy.C_x^B)|_v(c) = |C_x^B|_v[y:c] = |C|_v[y:c][x:b] (by induction assumption) = |(λy.C)|_v[x:b](c),

which yields that |(λy.C_x^B)|_v = |(λy.C)|_v[x:b].

1.1.4.5. THEOREM. The law of lambda-conversion. If x is substitutable by B in A then ⊨ ((λx_β.A_α)(B_β) = A_x^B).
Proof. We have by (S1) and (S2) that if |B|_v = b then |(λx_β.A_α)(B_β)|_v = |(λx.A)|_v(|B|_v) = |A|_v[x:b]. According to the preceding lemma (using the assumption on substitutability): |A|_v[x:b] = |A_x^B|_v. Hence |(λx.A)(B)|_v = |A_x^B|_v for all interpretations and valuations. Our statement follows trivially from this fact.
1.1.5. LOGICAL SYMBOLS INTRODUCED VIA DEFINITIONS
We define first the sentences ⊤ and ⊥, called Verum and Falsum, respectively:

⊤ =df ((λp_o.p) = (λp_o.p));  ⊥ =df ((λp_o.p) = (λp.⊤)).

[Show that |⊤|_v = 1 and |⊥|_v = 0, for all Ip and v.]
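A solution sketch for the bracketed exercise (our own verification, using rule (S3), by which an identity receives value 1 iff its two members receive the same factual value):

```latex
\begin{aligned}
|\top|_v &= |(\lambda p_o.\,p) = (\lambda p_o.\,p)|_v = 1
&&\text{(both members denote the same function, the identity map on } D(o)=\{0,1\}),\\
|\bot|_v &= |(\lambda p_o.\,p) = (\lambda p.\,\top)|_v = 0
&&\text{(the identity map and the constant-}1\text{ map differ at the argument } 0).
\end{aligned}
```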
We continue by introducing negation, '-':

(Df.-)  - =df (λp_o.(⊥ = p)).

Then -A_o =df (λp.(⊥ = p))(A). By the law of λ-conversion, the right side is logically synonymous to "(⊥ = A)". Hence, the contextual definition of '-' is as follows:

-A =df (⊥ = A).
The explicit definition of the universal quantifier ∀_α (of type o(oα)) is:

(Df.∀)  ∀_α =df (λf_oα.(f = (λx_α.⊤))).

Its contextual definition is:

∀(A_oα) =df (A = (λx.⊤)).

(Here the type subscript α of ∀ can be omitted.) We can introduce the usual notation by

∀x_α.A_o =df ∀(λx.A) [= ((λx.A) = (λx.⊤))].
The definition of the conjunction '&':

(Df.&)  & =df (λp_o.(λq_o.∀f_oo.[p = (f(p) = f(q))])).

[For the sake of easier reading, we applied here a pair of square brackets instead of the "regular" parentheses. This device will be applied sometimes later on.] We shall write the usual "(A & B)" instead of "&(A)(B)". Thus, the contextual definition of '&' is as follows:

(A & B) =df ∀f_oo.(A = (f(A) = f(B)))

where A and B must be free from the variable f. [Show that our definition of '&' satisfies the canonical truth condition of the conjunction.] The further logical symbols will be introduced via contextual definitions only:

(Df.⊃)  (A_o ⊃ B_o) =df -(A & -B);
(Df.∨)  (A_o ∨ B_o) =df -(-A & -B).

(We do not need a new symbol for the biconditional since "(A = B)" is appropriate to express it.)
(Df.∃)  ∃x_α.A_o =df -∀x.-A.
(Df.≠)  (A_α ≠ B_α) =df -(A = B).
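Concerning the earlier bracketed exercise on '&': the canonical truth condition can be checked by cases. Writing out ∀f_oo.(p = (f(p) = f(q))) for the four value combinations (our verification sketch; id is the identity map and const₁ the constant-1 map on D(o)):

```latex
\begin{array}{ll}
p=1,\ q=1: & f(p)=f(q) \text{ for every } f, \text{ so } p=(f(p)=f(q)) \text{ holds for every } f; \text{ value } 1.\\
p=1,\ q=0: & \text{for } f=\mathrm{id},\ (f(p)=f(q))=0\ne 1=p; \text{ value } 0.\\
p=0,\ q=1: & \text{for } f=\mathrm{const}_1,\ (f(p)=f(q))=1\ne 0=p; \text{ value } 0.\\
p=0,\ q=0: & f(p)=f(q) \text{ gives } 1\ne 0=p \text{ for every } f; \text{ value } 0.
\end{array}
```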
1.1.6. THE GENERALIZED SEMANTICS

It follows from a result of KURT GÖDEL (1931) that there exists no logical calculus which is both sound and complete with respect to our semantical system EL (i.e., a calculus in which a sentence A is deducible from a set of sentences Γ iff Γ ⊨ A holds in EL). However, following the method of LEONARD HENKIN (1950), it is possible to formulate a generalized semantics - briefly: a G-semantics - in such a way that the calculus EC - to be introduced in 1.2 - proves to be sound and complete with respect to this G-semantics. The present section is devoted to formulating the G-semantics of the EL languages. The semantics introduced in 1.1.3 may be distinguished by calling it the standard semantics.

1.1.6.1. DEFINITION. By a generalized interpretation - briefly: a G-interpretation - of a language L we mean a triple Ip = (U, D, ρ) satisfying the following conditions:
(i) U is a nonempty set.
(ii) D is a function defined on EXTY such that D(o) = {0,1}, D(ι) = U, and D(αβ) is a subset of the set of all functions from D(β) to D(α).
(iii) ρ is a function defined on Con such that C ∈ Con(α) ⇒ ρ(C) ∈ D(α).
(iv) Whenever v is a valuation joining to Ip (satisfying the condition v(x_α) ∈ D(α)), the semantic rules (S0) to (S3) in Def. 1.1.3.2 are applicable in determining the factual values (according to Ip and v) of the terms of L.

Comparing G-interpretations and standard interpretations (defined in 1.1.3.1), one sees the main difference in permitting '⊆' instead of '=' in the definition of D(αβ). However, the domains D(αβ) must not be quite arbitrary and 'too small': the restriction is contained in item (iv). For example, to assure a factual value for the term "(λx_α.(x = x))", the domain D(oα) must contain a function Φ such that for all a ∈ D(α), Φ(a) = 1 holds.
1.1.6.2. DEFINITION. Consider Def. 1.1.3.4. Replace the term 'interpretation' by 'G-interpretation', and prefix 'G-' before the defined terms. Then one gets the definition of the following notions: a G-model of a set Γ, G-satisfiability (and G-unsatisfiability), G-consequence - denoted by "Γ ⊨_G A" - and G-validity ("⊨_G A"), and G-synonymity. Since every standard interpretation is a G-interpretation, we have the following interrelations:

Γ is satisfiable ⇒ Γ is G-satisfiable;
Γ is G-unsatisfiable ⇒ Γ is unsatisfiable;
Γ ⊨_G A ⇒ Γ ⊨ A;
⊨_G A ⇒ ⊨ A.

We have proved some important semantic laws in Section 1.1.4. Fortunately, their proofs were based in each case on the semantic rules (S0) to (S3), which remain intact in the G-semantics, too. Hence:

1.1.6.3. THEOREM. All logical laws proved in the standard semantics - in Section 1.1.4 - are logical laws of the G-semantics as well.

The most important laws - which will be used in 1.2 - are the law of replacement (1.1.4.3) and the law of lambda-conversion (1.1.4.5).
1.2. THE CALCULUS EC

1.2.1. DEFINITION OF EC
The calculus EC introduced below will be a purely syntactical system joining to the semantical system EL. Our presuppositions here are: the extensional type theory, the grammar of the L languages (including the notational conventions), and the definitions of the (nonprimitive) logical symbols (such as ⊤, ⊥, -, ∀, &, etc.). (See 1.1.1, 1.1.2, 1.1.5.)

EC will be based on five basic schemata (E1)-(E5), and a single proof rule called replacement.

Basic schemata:

(E1) (A_α = A_α)
(E2) ((f_oo(⊤) & f(⊥)) = ∀p_o.f(p))
(E3) ((x_α = y_α) ⊃ (f_βα(x) = f(y)))
(E4) ((f_αβ = g_αβ) = ∀x_β[f(x) = g(x)])
(E5) ((λx_β.A_α)(B_β) = A_x^B)

Here the metavariables f, g, x, y refer to variables and A, B to arbitrary terms of a formal language L. (Of course, in (E5), it is assumed that the term B is substitutable for x in A.)

By a basic sentence (of L) we mean a sentence resulting by a correct substitution of terms of L into one of the basic schemata. (A substitution is said to be correct if the lower-case letters are substituted by variables of the indicated types and the upper-case letters are substituted by terms of the indicated types.)

Proof rule. Rule of replacement - RR. From "(A_α = B_α)" and C_o to infer "C[B/A]".

Proofs in EC. By a proof we shall mean a nonempty finite sequence of sentences such that each member of the sequence is either a basic sentence, or else it follows from two preceding members via RR.
A sentence A (of L) is said to be provable in EC - in symbols: "⊢_EC A" - iff there exists a proof in EC terminating in A. (As one sees, our definitions are language-dependent. In fact, we shall be interested, in most cases, in the proofs of sentence schemata rather than of singular sentences of a particular language.) In what follows, we shall omit the subscript 'EC' in the notation '⊢_EC', writing simply '⊢' instead. (The distinction is important only if we are speaking of different calculi.) The notion "A is a syntactic consequence of the set Γ of sentences" (or "A is deducible from Γ") will be introduced in Section 1.2.3.

It is easy to see that all basic sentences are valid in the semantical system EL. Furthermore, by Th. 1.1.4.3, the rule RR yields a valid sentence from valid ones. Hence: if ⊢ A then ⊨ A. Let us realize that this statement holds not only for our standard semantics of EL but even for the generalized semantics explained in 1.1.6, in consequence of Th. 1.1.6.3. Consequently:

1.2.1.1. THEOREM. The soundness of EC with respect to the generalized semantics of EL. If ⊢_EC A then ⊨_G A.

To prove the converse of this theorem, we first need to prove some theorems about provability in EC.

1.2.2. Some proofs in EC
In this section, we shall prove some metatheorems about provability in EC. Some of these theorems state that a certain sentence or sentence schema is provable in EC, and some others introduce derived proof rules. At the beginning, the proofs will be fully detailed. A detailed proof will be displayed in numbered lines. At the end of each line, there stands a reference between square brackets indicating the provability of the sentence/schema occurring in that line. Our references will have the following forms: 'ass.' stands for an assumption occurring in the formulation of the theorem. Reference to a basic schema or to a schema proved earlier will be indicated by the code of the schema (e.g., '(E2)', 'E3.2', etc.). A reference of form "Df. X" (e.g., 'Df.∀', 'Df.⊃') refers to the definition of the logical symbol standing in the place of 'X'. A reference of form "k/m" in the line numbered n states that line n follows by a replacement (RR) according to the identity standing in line k into the schema in line m. Instead of k or m, we sometimes use codes of schemata proved earlier. We shall often refer to the basic schema (E5) - the identity expressing λ-conversion - ; in this case we write simply 'λ' in the place of k.

References to derived proof rules will be of form "RX: k" or "RX: k,m" where "RX" is the code of the rule and k, m are the line numbers (or codes) of the schema(ta) to which the rule is to be applied. Later on, the proofs will be condensed, leaving some details to the reader. Outermost parentheses will sometimes be omitted. Note that instead of "(A_o ⊃ (B_o ⊃ C_o))" we write "A ⊃ B ⊃ C".
Proofs from (E1)

E1.1. ⊢ (A = B) ⇒ ⊢ (B = A). - Proof:
1. ⊢ (A = A) [(E1)]
2. ⊢ (A = B) [ass.]
3. ⊢ (B = A) [2/1]

COROLLARY. If ⊢ (A = B) and ⊢ C_o then ⊢ C[B/A]. In what follows, we shall refer to this rule as to our basic rule RR.
E1.2. (⊢ (A = B) and ⊢ (B = C)) ⇒ ⊢ (A = C). - Proof:
1. ⊢ (A = B) [ass.]
2. ⊢ (B = C) [ass.]
3. ⊢ (A = C) [1/2]

E1.3. ⊢ ⊤ [by Df.⊤ and (E1)].

E1.4. ⊢ (A_o = ⊤) ⇒ ⊢ A_o. - Proof:
1. ⊢ (A = ⊤) [ass.]
2. ⊢ ⊤ [E1.3]
3. ⊢ A [1/2]

E1.5. (⊢ (A_o = B_o) and ⊢ A) ⇒ ⊢ B. - Proof:
1. ⊢ (A = B) [ass.]
2. ⊢ A [ass.]
3. ⊢ B [1/2]

These results will be used, in most cases, without a particular reference.

Proofs from (E5)

E5.1. ⊢ (A = B) ⇒ ⊢ (A_x^C = B_x^C) (provided, of course, that C and x belong to the same category, and C is substitutable for x both in A and B). - Proof:
1. ⊢ (A = B) [ass.]
2. ⊢ ((λx.A)(C) = (λx.A)(C)) [(E1)]
3. ⊢ ((λx.A)(C) = (λx.B)(C)) [1/2]
4. ⊢ ((λx.A)(C) = A_x^C) [(E5)]
5. ⊢ ((λx.B)(C) = B_x^C) [(E5)]
6. ⊢ (A_x^C = B_x^C) [4,5/3]

Note that line 6 comprises two applications of RR. In what follows, steps such as 2 and 3 will be contracted into a single step with the reference [1/(E1)].

COROLLARIES. We get from (E2) and (E4) by E5.1 that:
(E2*) ⊢ ((F_oo(⊤) & F(⊥)) = ∀p_o.F(p)) [where F is free from p];
(E4*) ⊢ ((F_αβ = G_αβ) = ∀x_β[F(x) = G(x)]) [two applications; F and G must be free from x].
To apply this device to (E3), remember that "(A ⊃ B)" is an abbreviation for "(⊥ = (A & -B))". Hence, E5.1 is applicable to (E3) as well. By three applications we get:
(E3*) ⊢ ((A_α = B_α) ⊃ (F_oα(A) = F(B))).

E5.2. ⊢ ∀x.A_o ⇒ (⊢ (A_x^B = ⊤) and ⊢ A_x^B). - Proof:
1. ⊢ ((λx.A) = (λx.⊤)) [ass. and Df.∀]
2. ⊢ ((λx.A)(B) = (λx.⊤)(B)) [1/(E1)]
3. ⊢ (A_x^B = ⊤) [λ/2, twice]
4. ⊢ A_x^B [E1.4: 3]
[The provisos are analogous to those of E5.1.]

Proofs from (E4)

E4.1. ⊢ ((A = A) = ⊤). - Proof:
1. ⊢ (((λx.A) = (λx.A)) = ∀x[(λx.A)(x) = (λx.A)(x)]) [(E4*)]
2. ⊢ ∀x[(λx.A)(x) = (λx.A)(x)] [E1.5: 1, (E1)]
3. ⊢ ∀x(A = A) [λ/2, twice]
4. ⊢ ((A = A) = ⊤) [E5.2: 3]

E4.2. ⊢ (∀x.⊤ = ⊤). - Proof: "∀x.⊤" is "((λx.⊤) = (λx.⊤))". Now apply E4.1.

E4.3. ⊢ (-⊥ = ⊤). - Proof: '-⊥' is '(⊥ = ⊥)'. Apply E4.1.

E4.4. ⊢ (∀p_o.p = ⊥). - Use that "∀p.p" is "((λp.p) = (λp.⊤))", and the latter is ⊥, by its definition.

E4.5. ⊢ (∀p_o(p = ⊤) = ⊥). - Proof:
1. ⊢ (((λp.p) = (λp.⊤)) = ∀p(p = ⊤)) [(E4*) and λ]
2. ⊢ (⊥ = ∀p(p = ⊤)) [Df.⊥ / 1]
Complete by using E1.1.

Proofs from (E2)

E2.1. ⊢ ((⊤ & ⊤) = ⊤). - Proof: In (E2*), let F be "(λp_o.⊤)", and apply λ-conversions. At the right side, use E4.2.

E2.2. ⊢ ((⊤ & ⊥) = ⊥). - Proof: In (E2*), let F be "(λp_o.p)". Apply λ-conversions. At the right side, use E4.4.

E2.3. ⊢ ∀p_o((⊤ & p) = p). - Proof: Let F be "(λp_o.((⊤ & p) = p))" in (E2*). After λ-conversions, we have:

⊢ [((⊤ & ⊤) = ⊤) & ((⊤ & ⊥) = ⊥)] = ∀p((⊤ & p) = p).

On the left side, the two members reduce to (⊤ = ⊤) and (⊥ = ⊥) [by E2.1 and E2.2], these reduce to ⊤ [by E4.1, twice], and (⊤ & ⊤) reduces to ⊤ [by E2.1]. Complete by using E1.1 and E1.4.

E2.4. ⊢ ((⊤ & A_o) = A). [From E2.3, by E5.2.]

E2.5. ⊢ ((⊥ = ⊤) = ⊥). - Proof: Let F be "(λp.(p = ⊤))" in (E2*). Use λ-conversions, E4.1, E2.4, and (at the right side) E4.5.

E2.6. ⊢ (-⊤ = ⊥). [Df.- / E2.5.]

E2.7. ⊢ ∀p_o((p = ⊤) = p). - Proof: In (E2*), let F be "(λp.((p = ⊤) = p))". After λ-conversions, use E4.1, E2.5, and E2.1. Complete as in the proof of E2.3.

E2.8. ⊢ ((A_o = ⊤) = A). [From E2.7, by E5.2.]

E2.9. ⊢ ∀p_o(--p = p). - Proof: In (E2*), let F be "(λp.(--p = p))". Use E4.3 and E2.6.

E2.10. ⊢ (--A = A). [From E2.9, by E5.2.]

R∀. ⊢ A_o ⇒ ⊢ ∀x.A. - Proof:
1. ⊢ (A = ⊤) [from the ass. and E2.8]
2. ⊢ ((λx.A) = (λx.⊤)) [1/(E1)]
3. ⊢ ∀x.A [Df.∀: 2]

R&. (⊢ A and ⊢ B) ⇒ ⊢ (A & B). - Proof:
1. ⊢ (A = ⊤) [ass. and E2.8]
2. ⊢ ((⊤ & B) = B) [E2.4]
3. ⊢ ((A & B) = B) [1/2]
4. ⊢ (B = ⊤) [ass. and E2.8]
5. ⊢ ((A & B) = ⊤) [4/3]
6. ⊢ (A & B) [E2.8/5]
R⊤⊥. If A, B ∈ Cat(o), p ∈ Var(o), ⊢ A_p^⊤, and ⊢ A_p^⊥, then ⊢ A_p^B. - Proof: In (E2*), let F be "(λp.A)". Use the assumptions, R&, and E5.2.

R⊃. (⊢ (A_o ⊃ B_o) and ⊢ A) ⇒ ⊢ B. - Proof:
1. ⊢ (A = ⊤) [ass. and E2.8]
2. ⊢ (⊥ = (A & -B)) [ass. and Df.⊃]
3. ⊢ (⊥ = (⊤ & -B)) [1/2]
4. ⊢ (⊥ = -B) [E2.4/3]
5. ⊢ (-⊥ = --B) [4/(E1)]
6. ⊢ (⊤ = B) [E2.10, E4.3 / 5]
7. ⊢ B [E1.1, E2.8 / 6]
Proofs from (E3)

E3.1. ⊢ ((A_α = B_α) ⊃ (B = A)). - Proof: In (E3*), let F be "(λx_α.(B = x))" where A and B are free from x. After λ-conversions we have:

⊢ ((A = B) ⊃ ((B = A) = (B = B))).

Complete by using E4.1 and E2.8.

E3.2. ⊢ (⊤ ⊃ ⊤). - Proof: In (E3*), let A and B be ⊤, and let F be "(λp_o.⊤)". We have (by λ-conversions) that:
1. ⊢ ((⊤ = ⊤) ⊃ (⊤ = ⊤))
2. ⊢ (⊤ ⊃ ⊤) [E4.1/1]
Putting ⊥ for A, we get analogously:

E3.3. ⊢ (⊥ ⊃ ⊤). [Cf. E2.5.]

E3.4. ⊢ (A_o ⊃ ⊤). (Verum ex quodlibet.) - From E3.2 and E3.3, by R⊤⊥.

E3.5. ⊢ (⊥ ⊃ ⊥). - Proof: In (E3*), let F be "(λp_o.p)", and let A and B be ⊥ and ⊤, respectively. Use E2.5.

E3.6. ⊢ (⊥ ⊃ A_o). (Ex falso quodlibet.) - From E3.3 and E3.5, by R⊤⊥.

E3.7. ⊢ (A_o ⊃ A). - From E3.2 and E3.5, by R⊤⊥.

Note that by Df.⊃, (⊤ ⊃ ⊥) is -(⊤ & -⊥). By E4.3, E2.1, and E2.5, this reduces to:
(a) ⊢ ((⊤ ⊃ ⊥) = ⊥).
On the other hand, E3.5 and E2.8 yield:
(b) ⊢ ((⊥ ⊃ ⊥) = ⊤).
Similarly, E3.2 and E2.8 yield:
(c) ⊢ ((⊤ ⊃ ⊤) = ⊤).
From (a) and (b) we get by R⊤⊥ that:

E3.8. ⊢ ((A_o ⊃ ⊥) = -A).

We get analogously from (a) and (c) that:

E3.9. ⊢ ((⊤ ⊃ A_o) = A).

It follows from E3.8 that ⊢ (-(A_o ⊃ -⊤) = --A). Using that (A ⊃ -⊤) is, by Df.⊃ and E2.10, -(A & ⊤), this means that:

E3.10. ⊢ ((A_o & ⊤) = A).
The Propositional Calculus (PC)

PC1. ⊢ (((A_o ⊃ B_o) & (B ⊃ A)) ⊃ (A = B)). - Proof:
1. ⊢ (((A ⊃ ⊤) & (⊤ ⊃ A)) ⊃ (A = ⊤))
Here (A ⊃ ⊤), (⊤ ⊃ A), and (A = ⊤) reduce to ⊤, A, and A, respectively [E3.4, E3.9, E2.8]; thus line 1 reduces to ((⊤ & A) ⊃ A), i.e., to (A ⊃ A) [E2.4, E3.7].
2. ⊢ (((A ⊃ ⊥) & (⊥ ⊃ A)) ⊃ (⊥ = A))
Here the antecedent reduces to (-A & ⊤), i.e., to -A [E3.8; E3.6 and E2.8; E3.10, (E1)], and the consequent (⊥ = A) is -A by Df.-; thus line 2 reduces to (-A ⊃ -A).
3. ⊢ ((⊥ = A) ⊃ (A = ⊥)) [E3.1]
4. ⊢ (((A ⊃ ⊥) & (⊥ ⊃ A)) ⊃ (A = ⊥)) [2/3]
5. ⊢ (((A ⊃ B) & (B ⊃ A)) ⊃ (A = B)) [R⊤⊥: 1,4]

PC2. ⊢ ((A_α = B_α) = (B = A)). - Proof:
1. ⊢ [((A = B) ⊃ (B = A)) & ((B = A) ⊃ (A = B))] ⊃ [(A = B) = (B = A)] [by PC1]
2. ⊢ ((A = B) ⊃ (B = A)) [E3.1]
3. ⊢ ((B = A) ⊃ (A = B)) [E3.1]
Now use R& and R⊃.

Consider the proof of PC1. In line 1, the main '⊃' can be replaced by '='. (Why?) In line 2, '(⊥ = A)' can be replaced by '(A = ⊥)' (according to PC2). From these, one gets by R⊤⊥ that:

PC3. ⊢ (((A_o ⊃ B_o) & (B ⊃ A)) = (A = B)).

PC4. ⊢ ((A = B) ⊃ (A ⊃ B)). - Proof:
1. ⊢ ((⊤ = B) ⊃ (⊤ ⊃ B))
Here (⊤ = B) and (⊤ ⊃ B) both reduce to B [E2.8, E3.9]; thus line 1 reduces to (B ⊃ B).
2. ⊢ ((⊥ = B) ⊃ (⊥ ⊃ B))
Here (⊥ = B) is -B, and (⊥ ⊃ B) reduces to ⊤; thus line 2 reduces to (-B ⊃ ⊤) [E3.6, E3.4].
Complete by using R⊤⊥.
The following laws can be proved analogously: substitute ⊤ and ⊥, respectively, for A, and use R⊤⊥.

PC5. ⊢ (A_o ⊃ B_o ⊃ A).
PC6. ⊢ ((A_o ⊃ B_o ⊃ C_o) = ((A ⊃ B) ⊃ A ⊃ C)).
PC7. ⊢ ((-B_o ⊃ -A_o) = (A ⊃ B)).

By PC3 (and R⊃), the two latter laws can be weakened as follows:

PC8. ⊢ ((A_o ⊃ B_o ⊃ C_o) ⊃ (A ⊃ B) ⊃ A ⊃ C).
PC9. ⊢ ((-B_o ⊃ -A_o) ⊃ A ⊃ B).
Knowing that PC5, PC8, PC9 and R⊃ are sufficient for the foundation of PC, we have that EC contains PC. Note that PC3 and PC4 assure that the identity symbol '=' plays, between sentences, the part of the biconditional. In what follows, we shall refer by 'PC' to the laws of the classical propositional calculus.

Laws of Quantification (QC)
QC1. ⊢ ∀p_o(∀x_α(p ⊃ A_o) = (p ⊃ ∀x.A)). - Proof: Show that

∀x(p ⊃ A) = (p ⊃ ∀x.A)

is provable with ⊤ and ⊥ instead of p. Then use (E2*). - COROLLARY:

QC2. ⊢ (∀x_α(C_o ⊃ A_o) ⊃ (C ⊃ ∀x.A)), provided C is free from x.

Using that ⊢ (A_o ⊃ A) [PC] and using R∀, we get from QC2:

QC3. ⊢ (A_o ⊃ ∀x_α.A), provided A is free from x.

QC4. ⊢ (∀x_α.A_o ⊃ A_x^B). - Proof:
1. ⊢ (((λx.A) = (λx.⊤)) ⊃ ((λf_oα.f(B))(λx.A) = (λf.f(B))(λx.⊤))) [by (E3*)]
[Note that "(λf_oα.f(B_α))" ∈ Cat(o(oα)).]
2. ⊢ (((λx.A) = (λx.⊤)) ⊃ ((λx.A)(B) = (λx.⊤)(B))) [λ/1]
3. ⊢ (∀x.A ⊃ (A_x^B = ⊤)) [Df.∀ and λ/2]
4. ⊢ (∀x.A ⊃ A_x^B) [E2.8/3]
QC5. ⊢ (∀x_α(A_o ⊃ B_o) ⊃ ∀x.A ⊃ ∀x.B). - Proof:
1. ⊢ (∀x(A ⊃ B) ⊃ (A ⊃ B)) [QC4]
2. ⊢ (∀x.A ⊃ A) [QC4]
3. ⊢ ((∀x(A ⊃ B) & ∀x.A) ⊃ (A & (A ⊃ B))) [PC: 1,2]
4. ⊢ ((A & (A ⊃ B)) ⊃ B) [PC]
5. ⊢ ((∀x(A ⊃ B) & ∀x.A) ⊃ B) [PC: 3,4]
6. ⊢ ∀x[(∀x(A ⊃ B) & ∀x.A) ⊃ B] [R∀: 5]
7. ⊢ ((∀x(A ⊃ B) & ∀x.A) ⊃ ∀x.B) [R⊃: QC2, 6]
8. ⊢ (∀x(A ⊃ B) ⊃ ∀x.A ⊃ ∀x.B) [PC: 7]
The laws QC4, QC5, QC3, (E1), (E3*), and R∀ - together with the laws of PC - are sufficient for the foundation of the first-order Quantification Calculus QC. Hence, all laws of QC are provable in EC, with quantifiable variables of any type (in contrast to QC where the quantifiable variables are restricted to type ι).

QC6. If the variable y_β does not occur in the term A_α then ⊢ ((λx_β.A) = (λy_β.A_x^y)). - Proof: Let A' be A_x^y, and note that - owing to our proviso - [A']_y^z is the same as A_x^z.
1. ⊢ ((λy.A')(z_β) = [A']_y^z) [(E5), with suitable z]
2. ⊢ ((λy.A')(z) = A_x^z) [from 1, for [A']_y^z = A_x^z]
3. ⊢ ((λx.A)(z) = A_x^z) [(E5)]
4. ⊢ ((λx.A)(z) = (λy.A')(z)) [3/2]
5. ⊢ ∀z[(λx.A)(z) = (λy.A')(z)] [R∀: 4]
6. ⊢ ((λx.A) = (λy.A')) [(E4*)/5]
This is the law of renaming bound variables.

Finally, we prove a generalization of (E3) which will be useful in the next section.

(E3+) ⊢ ((A_β = B_β) ⊃ (C_αβ(A) = C(B))). - Proof: In (E3*), let F be "(λx_β.(C(x) = C(B)))" (which belongs to Cat(oβ)). After λ-conversions we have:

⊢ ((A = B) ⊃ ([C(A) = C(B)] = [C(B) = C(B)])).

Here [C(B) = C(B)] reduces to ⊤. By E4.1 and E2.8, we get the required result.

1.2.3. EC-consistent and EC-complete sets

1.2.3.1. DEFINITION. A sentence A is said to be a syntactic consequence of the set of sentences Γ (or deducible from Γ) - in symbols: "Γ ⊢_EC A" - iff either Γ is empty and ⊢ A, or Γ is nonempty and there exists a conjunction K (perhaps a one-member one) of some members of Γ such that ⊢ (K ⊃ A).

[Prove that ⊢ A ⇒ Γ ⊢ A, for all Γ. - Prove that Γ ∪ {C_o} ⊢ A iff Γ ⊢ (C ⊃ A) - this is the so-called Deduction Theorem.]
1.2.3.2. DEFINITION. A set Γ is said to be EC-inconsistent iff Γ ⊢ ⊥; and Γ is said to be EC-consistent iff Γ is not EC-inconsistent.

[Prove that (Γ is EC-inconsistent) ⇔ (for all sentences A, Γ ⊢ A) ⇔ (for some A ∈ Γ, Γ ⊢ -A). - Prove that if Γ is EC-consistent and Γ ⊢ A then Γ ∪ {A} is EC-consistent as well.]

1.2.3.3. DEFINITION. A set Γ is said to be EC-complete iff
(i) Γ is EC-consistent;
(ii) Γ is ∃-complete (existentially complete) in the sense that whenever "∃x_α.A_o" ∈ Γ then for some variable y_α, A_x^y ∈ Γ;
(iii) Γ is maximal in the sense that if A_o ∉ Γ then Γ ∪ {A} is EC-inconsistent.

1.2.3.4. THEOREM. If the members of Γ are free from the variable y_α, x_α does not occur in the sentence A_x^y, and Γ ⊢ A_x^y, then Γ ⊢ ∀x.A.

Proof. By the assumption, ⊢ (K ⊃ A_x^y) where K is a conjunction of some members of Γ, and K is free from y. Then, by R∀, QC2, and R⊃, ⊢ (K ⊃ ∀y.A_x^y). We get by QC6 that ⊢ (K ⊃ ∀x.A), that is, Γ ⊢ ∀x.A.
1.2.3.5. THEOREM. Every EC-consistent set of sentences is embeddable into an EC-complete set. More exactly: if Γ₀ is an EC-consistent set of sentences of a language L₀ then there exists a language L and an EC-complete set Γ of sentences of L such that Γ₀ ⊆ Γ.
Proof. For each α ∈ EXTY, let Var'(α) be a sequence of new variables, and let L be the enlargement of L₀ containing these new variables. Let (E_n), n ∈ ω, be an enumeration of all sentences of form "∃x_α.A_o" of the extended language L. Starting with the given set Γ₀, let us define the sequence of sets (Γ_n), n ∈ ω, by the schema:

Γ_{n+1} = Γ_n if Γ_n ∪ {E_n} is EC-inconsistent; and otherwise Γ_{n+1} = Γ_n ∪ {E_n, E'_n} where, if E_n is "∃x_α.A_o", then E'_n is A_x^y, y being the first member of Var'(α) occurring neither in the members of Γ_n nor in E_n.

LEMMA. If Γ_n is EC-consistent then so is Γ_{n+1}.

Proof. This is obvious in case Γ_{n+1} = Γ_n. In the other case, Γ_n ∪ {∃x_α.A_o} is assumed to be EC-consistent. Now assume, indirectly, that Γ_n ∪ {∃x.A, A_x^y} is EC-inconsistent, i.e., that

Γ_n ∪ {∃x.A} ⊢ -A_x^y.

Using that Γ_n ∪ {∃x.A} is free from y, we get by the preceding theorem that

Γ_n ∪ {∃x.A} ⊢ ∀x.-A.

Since "∃x.A" is "-∀x.-A", we have that Γ_n ∪ {∃x.A} is EC-inconsistent, which contradicts the assumption.

Continuing the proof of the theorem, let us define

Γ_ω = ∪_{n ∈ ω} Γ_n.

Show that Γ_ω is EC-consistent (using that in the contrary case, for some n, Γ_n would be EC-inconsistent) and ∃-complete.

Now let (C_n), n ∈ ω, be an enumeration of all sentences of L. Let us define the sequence of sets (Γ'_n), n ∈ ω, by the following recursion: Γ'_0 = Γ_ω;

Γ'_{n+1} = Γ'_n if Γ'_n ∪ {C_n} is EC-inconsistent, and in the contrary case Γ'_{n+1} = Γ'_n ∪ {C_n}.

Obviously, for all n, Γ'_n is EC-consistent and ∃-complete. Consequently, the same holds for

Γ = ∪_{n ∈ ω} Γ'_n.

Furthermore, Γ₀ ⊆ Γ. Finally, show that Γ is maximal (use that if A_o ∉ Γ then for some n, A is C_n, and Γ'_n ∪ {C_n} is EC-inconsistent).
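The two-stage extension above is, in effect, a recursion over two enumerations. As a toy illustration (entirely ours, not the book's): sentences are modelled by nonzero integers, with -n playing the role of the negation of n, a set counting as 'inconsistent' exactly when it contains some n together with -n, and `witness` standing in for the step from E_n to its witnessing instance A_x^y:

```python
# Toy model of the extension in Th. 1.2.3.5 (all names are ours, not the book's).
# Sentences are nonzero ints; -n plays the role of the negation of n.
def inconsistent(gamma):
    return any(-s in gamma for s in gamma)

def extend(gamma0, existentials, witness, all_sentences):
    gamma = set(gamma0)
    # Stage 1: add each consistently addable existential with a witness (Gamma_omega).
    for e in existentials:
        if not inconsistent(gamma | {e}):
            gamma |= {e, witness(e)}
    # Stage 2: maximalize over an enumeration of all sentences (Gamma).
    for c in all_sentences:
        if not inconsistent(gamma | {c}):
            gamma.add(c)
    return gamma
```

In the real construction the consistency test is of course not decidable; the sketch only mirrors the order of the steps (witnesses first, then maximalization).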
1.2.3.6. THEOREM. Assume that Γ is an EC-complete set of sentences. Then:
(i) Γ ⊢ A ⇔ A ∈ Γ.
(ii) If {(A = B), C} ⊆ Γ then "C[B/A]" ∈ Γ.
(iii) If the term A_α occurs in a member of Γ then for some variable x_α, "(A_α = x_α)" ∈ Γ.
(iv) If "(C_αβ = D_αβ)" ∈ Γ then for all variables x_β, "(C(x) = D(x))" ∈ Γ.

Proof. Ad (i). This follows from the maximality of Γ, using that Γ ∪ {A} is EC-consistent. - Ad (ii). By (E3+), ⊢ ((A = B) ⊃ (C = C[B/A])). Now use PC and (i). - Ad (iii). By (i), "(A = A)" ∈ Γ. By contraposition of QC4 we have that ⊢ ((A = A) ⊃ ∃y_α(A = y)), and, hence, "∃y(A = y)" ∈ Γ. By the ∃-completeness of Γ, for some x_α, "(A = x_α)" ∈ Γ. - Ad (iv). By (E4*), ⊢ ((C = D) = ∀y_β(C(y) = D(y))) (with y such that C and D are free from y), and, by QC4, ⊢ (∀y(C(y) = D(y)) ⊃ (C(x) = D(x))), with arbitrary x_β. Complete by using (i).

1.2.4. The completeness of EC

1.2.4.1. THEOREM. If Γ is an EC-complete set of sentences then there is a G-interpretation Ip = (U, D, ρ) and a valuation v such that for all A ∈ Γ,

(1) |A|_v = 1.
Proof. - Part I: The definition of Ip and v.
We shall define D and v by induction on EXTY.

(a) For p ∈ Var(o), we define v(p) simply by: v(p) = 1 if p ∈ Γ, and v(p) = 0 otherwise.

Knowing that ⊤ and ⊥ are terms of L, we have by (iii) of Th. 1.2.3.6 that for some p_o and q_o,

"(⊤ = p)" ∈ Γ and "(⊥ = q)" ∈ Γ.

Since ⊢ ⊤, we have that p ∈ Γ and, hence, v(p) = 1. If q ∈ Γ then, by (ii) of the theorem just quoted, ⊥ ∈ Γ, which is impossible (by the EC-consistency of Γ); hence, q ∉ Γ, and v(q) = 0. Thus, we can define: D(o) = {0, 1}.

Assume that "(p_o = q_o)" ∈ Γ and v(p) = 1, that is, p ∈ Γ. Then q ∈ Γ, too, and v(q) = 1. Assuming that v(q) = 1, we get analogously that v(p) = 1. Hence: "(p_o = q_o)" ∈ Γ iff v(p) = v(q).

(b) Let (z_n), n ∈ ω, be an enumeration of Var(ι). Let us define the function φ by

φ(z_n) = k ⇔ "(z_n = z_k)" ∈ Γ and for all i < k, "(z_n = z_i)" ∉ Γ [i, k, n ∈ ω, k ≤ n].

(In other words: let φ(z_n) be the smallest number k such that "(z_n = z_k)" ∈ Γ. Note that "(z_n = z_n)" ∈ Γ. Then φ(z_0) = 0.) Now let U = D(ι) be the counterdomain of φ, i.e.,

U = D(ι) = {k ∈ ω : for some x ∈ Var(ι), φ(x) = k}.

We then define v for members of Var(ι) by v(x) =df φ(x). [Using the symmetry and the transitivity of the identity, show that "(x_ι = y_ι)" ∈ Γ iff v(x) = v(y).]
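The map φ of part (b) simply assigns to each variable the least index of its identity class. A toy sketch (ours, not the book's), with the identity facts of Γ between variable indices modelled as a decidable reflexive relation:

```python
# phi(n): the least k <= n such that "z_n = z_k" belongs to Gamma.
# `identified` models Gamma's identity facts as a reflexive relation on indices.
def phi(n, identified):
    return min(k for k in range(n + 1) if identified(n, k))
```

The counterdomain (range) of φ then serves as the individual domain U = D(ι).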
(c) Now assume that D(α) and D(β) are already defined, v is defined for the members of Var(α) ∪ Var(β), and the following two conditions hold for γ ∈ {α, β}:
(i) If c ∈ D(γ) then for some x_γ, v(x) = c.
(ii) "(x_γ = y_γ)" ∈ Γ iff v(x) = v(y).
(Note that by (a) and (b), these conditions hold for γ ∈ {o, ι}.) Now we are going to define v for the members of Var(αβ) as well as the domain D(αβ). For f ∈ Var(αβ), a ∈ D(α), b ∈ D(β), we let:

v(f)(b) = a iff for some x_α, y_β such that v(x) = a and v(y) = b, "(f(y) = x)" ∈ Γ.

Using (i) and (ii), it is easy to prove that v(f) is a (unique) function from D(β) to D(α). - We define:

D(αβ) = {φ : φ is a function from D(β) to D(α) and for some f_αβ, v(f) = φ}.

Now prove that (i) and (ii) hold for γ = αβ. (For (ii), use (iv) of Th. 1.2.3.6.) By (a), (b), and (c), the definition of D and v is completed.

(d) If C ∈ Con(α) then for some x_α, "(C = x)" ∈ Γ. We then define ρ(C) = v(x). Show that this definition is unambiguous.
By these, the definition of Ip is completed.

Part II: The proof of (1).

(A) We prove (1) first for identities of form "(B_α = y_α)". If B is a variable, a term of form "f(x)", or a constant, then (1) holds according to the definition of Ip and v. In other cases, B is a compound term of form "F(C)", "(λx.C)", or "(C = D)". We shall investigate these three cases one by one. Meanwhile, we shall use the induction assumption that (1) holds true for sentences which are less compound than the one under investigation.

(A1) If "(F_αβ(C_β) = y_α)" ∈ Γ then - by (iii) of Th. 1.2.3.6 - for some variables f_αβ and x_β, "(F = f)" ∈ Γ and "(C = x)" ∈ Γ. Furthermore, by the EC-completeness of Γ, "(f(x) = y)" ∈ Γ. Then, by the definition of Ip and v,

(2) |(f(x) = y)|_v = 1.

We can assume that

(3) |(F = f)|_v = 1

and

(4) |(C = x)|_v = 1

for F and C are less compound terms than "F(C)". From (2), (3), and (4) it then follows that |(F(C) = y)|_v = 1
which was to be proved. (The further cases will be less detailed.)

(A2) If "((λx_β.C_α) = y_αβ)" ∈ Γ then - by (iv) of Th. 1.2.3.6 - for all z_β, "((λx.C)(z) = y(z))" ∈ Γ, and, by λ-conversion,

(5) "(C_x^z = y(z))" ∈ Γ.

However, for all z_β there is a u_z ∈ Var(α) such that

(6) "(y(z) = u_z)" ∈ Γ.

By (5) and (6), we have that for all z ∈ Var(β) there is a u_z ∈ Var(α) such that "(C_x^z = u_z)" ∈ Γ. By the induction assumption, |(C_x^z = u_z)|_v = 1, and, furthermore, |(y(z) = u_z)|_v = 1. Henceforth |(C_x^z = y(z))|_v = 1, that is,

(7) |((λx.C)(z) = y(z))|_v = 1

for all z_β. Remembering that for all b ∈ D(β) there is a z_β such that v(z) = b, we get from (7) that for all b ∈ D(β), |(λx.C)|_v(b) = v(y)(b), which means that |(λx.C)|_v = v(y), and, hence: |((λx.C) = y)|_v = 1.
(A3) If "((C_α = D_α) = p_o)" ∈ Γ then for some x_α, y_α: "(C = x)" ∈ Γ, "(D = y)" ∈ Γ, and

(8) "((x = y) = p)" ∈ Γ.

We can assume that

(9) |(C = x)|_v = 1

and

(10) |(D = y)|_v = 1.

Now v(p) is 1 or 0. If v(p) = 1 then "(p = ⊤)" ∈ Γ and, by (8), "(x = y)" ∈ Γ, that is, v(x) = v(y). This and (9) and (10) together imply that

(11) |((C = D) = p)|_v = 1.

On the other hand, if v(p) = 0 then "(p = ⊥)" ∈ Γ and, by (8) again, "-(x = y)" ∈ Γ; hence, v(x) ≠ v(y). This and (9) and (10) together imply (11).
= y) E r,
(B) Secondly, we prove (1) for identities of form "(B_α = C_α)" where both B and C may be compound terms. If "(B = C)" ∈ Γ then for some x_α and y_α, "(B = x)" ∈ Γ, "(C = y)" ∈ Γ, and "(x = y)" ∈ Γ. By (A), we have that

|(B = x)|_v = 1, |(C = y)|_v = 1, and |(x = y)|_v = 1,

which yield that |(B = C)|_v = 1.

(C) Finally, if the sentence A is not an identity then A ∈ Γ implies that "(A = ⊤)" ∈ Γ. Since |A|_v = |(A = ⊤)|_v, this case reduces to the preceding one.

1.2.4.2. THEOREM. If the set Γ is EC-consistent then Γ is G-satisfiable. This follows from Theorems 1.2.3.5 and 1.2.4.1.

1.2.4.3. THEOREM. The completeness of EC with respect to the G-semantics. If Γ ⊨_G A then Γ ⊢_EC A.
Proof. Assume that Γ ⊨_G A. Then Γ' = Γ ∪ {-A} is G-unsatisfiable, and, by contraposing the preceding theorem, Γ' is EC-inconsistent, which means that Γ ∪ {-A} ⊢ ⊥. Then, by the Deduction Theorem, we have that Γ ⊢ (-A ⊃ ⊥), which reduces to Γ ⊢ A.

1.2.4.4. THEOREM. (LÖWENHEIM-SKOLEM.) If the set Γ is G-satisfiable then Γ is "denumerably" satisfiable in the sense that Γ has a G-model Ip = (U, D, ρ) with a valuation v such that each D(α) is at most denumerably infinite.

Proof. Note first that if Γ is G-satisfiable then Γ is EC-consistent. (For if Γ ⊢ ⊥ then ... continue!) Then, by Th. 1.2.3.5, Γ is embeddable into an EC-complete set Γ', and, by the proof of Th. 1.2.4.1, Γ' has a G-model in which the cardinality of each D(α) is not greater than the cardinality of Var(α) (which is, of course, denumerably infinite). That is, Γ' has a "denumerable" G-model. Since Γ ⊆ Γ', this is a G-model of Γ as well.
PART 2: MONTAGUE'S INTENSIONAL LOGIC

The sources of this chapter are the following works:

R. MONTAGUE, Universal Grammar, 1970.
R. MONTAGUE, The Proper Treatment of Quantification in Ordinary English, 1973 - briefly: PTQ.
D. GALLIN, Intensional and Higher-Order Modal Logic, 1975.

We shall present the essence of the most important parts of these writings, of course, without literal repetitions. The results of Part 1 will be utilized extensively.
2.1. THE SEMANTICAL SYSTEMS IL AND IL+

2.1.1. MONTAGUE'S TYPE THEORY

Montague uses the basic symbols t, e, and s - t for truth value, e for entity, s for sense - in his type theory. The type of a functor with the input type α and output type β is denoted by "⟨α,β⟩". The full inductive definition of his types is as follows:

t and e are types.
If α, β are types, "⟨α,β⟩" is a type.
If α is a type, "⟨s,α⟩" is a type.

Here t and e correspond to our type symbols o and ι, respectively (cf. 1.1.1), and "⟨α,β⟩" corresponds to our β(α). Finally, "⟨s,α⟩" is the type of expressions naming the sense (or the intension) of an expression of type α. It is presupposed here that there exist terms naming intensions (senses) of terms. (For example, if A is a sentence, the term "that A" is a name of the sense (intension) of A; or if B is an individual name - say, 'the Pope' - then "the concept of B" is a name of the sense (intension) of B.) Note that the isolated 's' is not a type symbol.

However, we shall not use Montague's original notation for types. Instead, we shall follow our notation introduced in Section 1.1.1, of course, with suitable enlargements. Thus, our inductive definition of the set of Montagovian types - denoted by 'TYPM' - runs as follows:

o, ι ∈ TYPM;
α, β ∈ TYPM ⇒ "α(β)" ∈ TYPM;
β ∈ TYPM ⇒ "(β)s" ∈ TYPM.

Again, if β consists of a single character (o or ι), the parentheses surrounding it will be omitted. Furthermore, we usually write "β^s" instead of "(β)s" [except when it occurs in a subscript], and instead of "(β^s)^s" we write simply "β^ss" [e.g., o^s, ι^ss, (oι)^ssss, etc.].
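The correspondence between the two notations can be made mechanical. A small sketch (our own illustration, with invented names) encoding TYPM inductively and rendering a type in Montague's original notation:

```python
# Inductive encoding of TYPM (the names Base, Fun, Sense are ours).
from dataclasses import dataclass

class Ty: pass

@dataclass(frozen=True)
class Base(Ty):
    name: str            # "o" (truth value) or "i" (entity)

@dataclass(frozen=True)
class Fun(Ty):           # our alpha(beta): input type beta, output type alpha
    out: Ty
    inp: Ty

@dataclass(frozen=True)
class Sense(Ty):         # our (beta)^s, Montague's <s, beta>
    of: Ty

O, IOTA = Base("o"), Base("i")

def montague(t: Ty) -> str:
    """Render a TYPM type in Montague's notation (t, e, <a,b>, <s,a>)."""
    if t == O:
        return "t"
    if t == IOTA:
        return "e"
    if isinstance(t, Fun):   # our alpha(beta) is Montague's <beta, alpha>
        return f"<{montague(t.inp)},{montague(t.out)}>"
    if isinstance(t, Sense):
        return f"<s,{montague(t.of)}>"
    raise ValueError("not a TYPM type")
```

For instance, our type oι (one-place predicates of individuals) renders as Montague's ⟨e,t⟩.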
The unrestricted iterations and multiply embedded occurrences of's' may provoke some philosophical criticism, but let us put asidethis problem presently. 2.1.2. THE GRAMMAR OF IL AND IL + The semantical system IL is introduced in Universal Grammar. In PrQ, system IL is extended by the introduction of tense operators; this extended system will be called IL+. 2.1.2.1.
DEFINITION.
By an IL language let us mean a quadruple

L(i) = (Log, Var, Con, Cat)

where

Log = { (, ), λ, =, ^, ˇ }

is the set of logical symbols of the language (containing left and right parentheses, the lambda operator λ, the symbol of identity, the intensor ^, and the extensor ˇ);

Var = ∪α∈TYPM Var(α)

is the set of variables of the language where each Var(α) is a denumerably infinite set of symbols called variables of type α;

Con = ∪α∈TYPM Con(α)

is the set of (nonlogical) constants of the language where each Con(α) is a denumerable (perhaps empty) set of symbols called constants of type α; all the sets mentioned up to this point are pairwise disjoint;

Cat = ∪α∈TYPM Cat(α)

is the set of the well-formed expressions - briefly: terms - of the language where the sets Cat(α) are inductively defined by the grammatical rules (G0) to (G5) below. For α ∈ TYPM, Cat(α) may be called the α-category of L(i). [The notational conventions will be the same as in 1.1.2.]

Grammatical rules:

(G0) Var(α) ∪ Con(α) ⊆ Cat(α).
(G1) "Aαβ(Bβ)" ∈ Cat(α).
(G2) "(λxβ Aα)" ∈ Cat(αβ).
(G3) "(Aα = Bα)" ∈ Cat(o).
(G4) "^Aα" ∈ Cat(αs).
(G5) "ˇAαs" ∈ Cat(α).

Let us enlarge the set of logical symbols Log by the symbols 'P' and 'F' (called past and future tense operators, respectively) and let us add the rule (G6) to the grammatical rules:

(G6) "PAo", "FAo" ∈ Cat(o).
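Rules (G0) to (G6) amount to a type-checking algorithm: the category of a compound term is computed from the categories of its parts. The following sketch uses assumed encodings for illustration only: types are 'o', 'iota', ('fun', a, b) for α(β), ('s', a) for (α)s; terms are tagged tuples such as ('var', 'x', 'iota') or ('app', F, B).

```python
# Category computation per (G0)-(G6); encodings are illustrative assumptions.

def cat(term):
    """Return the category (type) of a well-formed term, else None."""
    tag = term[0]
    if tag in ('var', 'con'):                     # (G0)
        return term[2]
    if tag == 'app':                              # (G1): A_ab(B_b) is of type a
        f, b = cat(term[1]), cat(term[2])
        if f is not None and f[0] == 'fun' and f[2] == b:
            return f[1]
    elif tag == 'lam':                            # (G2): (lam x_b A_a) : a(b)
        x, body = term[1], cat(term[2])
        if body is not None and x[0] == 'var':
            return ('fun', body, x[2])
    elif tag == 'eq':                             # (G3): (A_a = B_a) : o
        if cat(term[1]) is not None and cat(term[1]) == cat(term[2]):
            return 'o'
    elif tag == 'int':                            # (G4): ^A_a : (a)s
        a = cat(term[1])
        if a is not None:
            return ('s', a)
    elif tag == 'ext':                            # (G5): vA_as : a
        a = cat(term[1])
        if a is not None and a[0] == 's':
            return a[1]
    elif tag in ('P', 'F'):                       # (G6), IL+ only: PA_o, FA_o : o
        if cat(term[1]) == 'o':
            return 'o'
    return None
```

For instance, applying a constant of type o(ι) to a variable of type ι yields a term of category o.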
By these enlargements, we get the grammar of the IL+ languages. [In PTQ, some symbols introduced via definitions in Universal Grammar are treated as primitive ones; but we do not follow this policy here.]

Remark. Montague speaks of a single language of IL, and, hence, he prescribes that each Con(α) must be (denumerably) infinite. We follow the policy of Part 1 in dealing with a family of languages the members of which may differ from each other in having different sets of nonlogical constants.
2.1.2.2.
DEFINITION. (i) Free and bound occurrences of a variable in a term as well as closed and open terms are distinguished exactly as in EL (cf. (i), (ii) and (iii) of Def. 1.1.2.2).

(ii) The set of rigid terms of L(i) - denoted by 'RGD' - is defined by the following induction:

(a) Var ⊆ RGD; "^Aα" ∈ RGD.
(b) Fαβ, Bβ ∈ RGD ⇒ "F(B)" ∈ RGD.
(c) A ∈ RGD ⇒ "(λxβ A)" ∈ RGD.
(d) A, B ∈ RGD ⇒ "(A = B)" ∈ RGD.
(e) In case IL+: Ao ∈ RGD ⇒ "P(A)", "F(A)" ∈ RGD.

In other words: rigid terms are composed of variables and terms of form "^A" via applications of the grammatical rules (G1), (G2), and (G3). A motivation of the adjective 'rigid' will be given in the next section.

(iii) A variable xα is said to be substitutable by the term Bα in the term A iff (a) whenever "(λy.C)" is a part of A involving some occurrence of x which counts as a free occurrence of x in A then B is free from y; and (b) if a free occurrence of x in A lies in a part of form "^C" (or - in case of IL+ - "P(C)" or "F(C)") of A then B is a rigid term.

(iv) The result of substituting Bα for xα in A - in symbols: "[A]x^B" - and the replacement of Aα by Bα in a term C - denoted by "C[B/A]" - are defined exactly as in EL (cf. (v) and (vi) of Def. 1.1.2.2).

2.1.3. THE SEMANTICS OF IL AND IL+

2.1.3.1.
DEFINITION.
(a) By an interpretation of an IL language L(i) let us mean a quadruple

Ip = (U, W, D, σ)

where U and W are nonempty sets; D is a function defined on TYPM such that

(1) D(o) = {0,1}, D(ι) = U, D(αβ) = D(α)^D(β), D(αs) = D(α)^W;

and σ is a function defined on Con such that

(2) C ∈ Con(α) ⇒ σ(C) ∈ Int(α) =df D(α)^W.

(b) By an interpretation of an IL+ language we mean a sixtuple

Ip = (U, W, T, <, D, σ)

where U, W, and T are nonempty sets, < is a linear ordering of T, D is the same as in (1) except that

D(αs) = D(α)^I

where I = W × T, and σ is as in (2) except that

Int(α) = D(α)^I.

(c) A function v defined on Var is said to be a valuation joining to Ip iff

x ∈ Var(α) ⇒ v(x) ∈ D(α).

The notation "v[x: a]" will be used analogously as in EL (cf. Def. 1.1.3.1).
Comments. W is said to be the set of (labels of) possible worlds, T is the set of possible time moments, and < represents the 'earlier than' relation between time moments. I = W × T is said to be the set of indices. For α ∈ TYPM, D(α) is the set of factual values and Int(α) is the set of intensions, of type α, respectively.

2.1.3.2. DEFINITION. Given an interpretation Ip of an IL or an IL+ language L(i), we shall define, for all terms A ∈ Cat and for all valuations v joining to Ip, the intension of A according to Ip and v - denoted by "‖A‖v^Ip" - by the semantic rules (S0) to (S6) below. According to our definition, if A ∈ Cat(α) then

‖A‖v^Ip ∈ Int(α)

will be satisfied, where I = W in the case of IL, and I = W × T in the case of IL+. Hence, ‖A‖v^Ip is defined iff for all i ∈ I,

|A|vi^Ip =df ‖A‖v^Ip(i) ∈ D(α)

is defined. We shall exploit this fact in our definition. The object |A|vi^Ip may be called the factual value of A, according to Ip and v, at the world index i. - In what follows, the superscript 'Ip' will be, in most cases, omitted.

(S0) If x ∈ Var, |x|vi = v(x). If C ∈ Con, ‖C‖v = σ(C).
(S1) |Fαβ(Bβ)|vi = |F|vi(|B|vi).
(S2) |(λxβ Aα)|vi is the function φ ∈ D(αβ) such that b ∈ D(β) ⇒ φ(b) = |A|v[x: b],i.
(S3) |(Aα = Bα)|vi = 1 if |A|vi = |B|vi, and 0 otherwise.
(S4) |^A|vi = ‖A‖v.
(S5) |ˇAαs|vi = |A|vi(i). [Note that if |Aαs|vi ∈ D(α)^I then |A|vi(i) is defined uniquely.]
(S6) [Only for IL+.] |P(Ao)|v,(w,t) = 1 if for some t' < t, |A|v,(w,t') = 1, and 0 otherwise; |F(Ao)|v,(w,t) = 1 if for some t' > t, |A|v,(w,t') = 1, and 0 otherwise.
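For a toy finite interpretation, the rules (S0) to (S5) for IL can be executed directly. In the sketch below, intensions are Python dicts from worlds to values, and the tagged-tuple term encoding is the same illustrative assumption used in the earlier sketches; this is only a sketch of the IL case, with no claim to the book's notation.

```python
# Rules (S0)-(S5) over a finite set of worlds (IL case: indices = worlds).
# v: valuation (name -> value); sigma: interpretation of constants
# (name -> intension, i.e. dict world -> value).  Illustrative sketch only.

def intension(term, v, sigma, worlds):
    """||A||_v : the function i |-> |A|_{v,i}, represented as a dict."""
    return {i: value(term, v, sigma, worlds, i) for i in worlds}

def value(term, v, sigma, worlds, i):
    """|A|_{v,i}: the factual value of A at index i."""
    tag = term[0]
    if tag == 'var':                                  # (S0)
        return v[term[1]]
    if tag == 'con':                                  # (S0): |C|_vi = sigma(C)(i)
        return sigma[term[1]][i]
    if tag == 'app':                                  # (S1)
        f = value(term[1], v, sigma, worlds, i)
        return f(value(term[2], v, sigma, worlds, i))
    if tag == 'lam':                                  # (S2)
        x = term[1][1]
        return lambda b: value(term[2], {**v, x: b}, sigma, worlds, i)
    if tag == 'eq':                                   # (S3)
        a = value(term[1], v, sigma, worlds, i)
        b = value(term[2], v, sigma, worlds, i)
        return 1 if a == b else 0
    if tag == 'int':                                  # (S4): |^A|_vi = ||A||_v
        return intension(term[1], v, sigma, worlds)
    if tag == 'ext':                                  # (S5): |vA|_vi = |A|_vi(i)
        return value(term[1], v, sigma, worlds, i)[i]
    raise ValueError('unknown term tag: %r' % (tag,))
```

For instance, with sigma = {'p': {0: 1, 1: 0}}, the term ('ext', ('int', ('con', 'p', 'o'))) has the same factual value as ('con', 'p', 'o') at every world, an instance of the law of deleting ˇ^ proved in 2.1.3.8.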
2.1.3.3. LEMMA. (A) The intensions ‖A‖v^Ip and the factual values |A|vi^Ip are uniquely determined by the rules (S0) to (S6), and, if A ∈ Cat(α), then ‖A‖v^Ip ∈ Int(α) and (for all i ∈ I) |A|vi^Ip ∈ D(α). - (B) If A ∈ RGD then ‖A‖v^Ip is a constant function on I, that is, for all i, j ∈ I, |A|vi = |A|vj. (This fact motivates the name rigid terms.)

Proof: by structural induction, using the semantic rules (S0) to (S6) in case (A), and using the conditions (a) to (e) of RGD (see (ii) of Def. 2.1.2.2) in case (B).

2.1.3.4. DEFINITION. Let Γ be a set of sentences (Γ ⊆ Cat(o)), Ip an interpretation, v a valuation joining to Ip, and i ∈ I an index. (i) We say that the triple (Ip, v, i) is a model of Γ iff for all A ∈ Γ, |A|vi^Ip = 1. (ii) Γ is said to be satisfiable [unsatisfiable] iff Γ has a model [has no model]. (iii) The notions semantic consequence, validity, and logical synonymity are defined literally as in EL (cf. (iii), (iv), and (v) of Def. 1.1.3.4).

2.1.3.5. LEMMA. If FV(A) is the set of variables having some free occurrences in A, and the valuations v and v' coincide on FV(A), then ‖A‖v = ‖A‖v'.

Proof. Similarly as in Lemma 1.1.4.1 (put |A|vi instead of |A|v everywhere), taking into consideration that FV(^A), FV(ˇA), FV(P(A)) and FV(F(A)) are the same as FV(A).
2.1.3.6. LEMMA. Given any interpretation, if for all valuations v, ‖Aα‖v = ‖Bα‖v, then for all valuations v, ‖C‖v = ‖C[B/A]‖v.

Proof: similarly as in Lemma 1.1.4.2. - A corollary:

2.1.3.7. THEOREM. The law of replacement. If ⊨ (Aα = Bα) and ⊨ Co then ⊨ C[B/A].

2.1.3.8. THEOREM. The law of deleting ˇ^: ⊨ (ˇ^A = A).

Proof. By (S4), |^A|vi = ‖A‖v, and, hence,

(3) |^A|vi(i) = ‖A‖v(i) = |A|vi.

Consequently:

|ˇ^A|vi = |^A|vi(i) [by (S5)] = |A|vi [by (3)].

Our theorem follows from this fact. Note that "(^ˇAαs = A)" is not valid. For "^ˇAαs" is a rigid term, and, hence, ‖^ˇA‖v ∈ Int(αs) is a constant function on I, whereas ‖A‖v ∈ Int(αs) might be a non-constant function on I.
2.1.3.9. LEMMA. If xβ is substitutable by Bβ in the term A then for all valuations v:

|B|vi = b ∈ D(β) ⇒ |[A]x^B|vi = |A|v[x: b],i.

Proof. The proof method is the same as in EL (see Lemma 1.1.4.4). The new cases not occurring in EL are as follows:

Case (G4): A is of form "^C". Then (assuming that x has some free occurrences in C) B must be a rigid term, that is,

(4) for all j ∈ I, |B|vj = b.

Our induction assumption (writing C' for "[C]x^B"): for all j ∈ I,

|C'|vj = |C|v[x: b],j, that is, ‖C'‖v = ‖C‖v[x: b].

Then:

|^C'|vi = ‖C'‖v = ‖C‖v[x: b] = |^C|v[x: b],i.

Case (G5): If A is of form "ˇC" then use the induction assumption that

|C'|vi = |C|v[x: b],i,

which implies that for all j ∈ I: |C'|vi(j) = |C|v[x: b],i(j). Then:

|ˇC'|vi = |C'|vi(i) = |C|v[x: b],i(i) = |ˇC|v[x: b],i.

Case (G6): A is of form "P(Co)" or "F(Co)". As in Case (G4), B must be a rigid term, i.e., (4) holds. Thus, we can use the induction assumption: for all t ∈ T,

|C'|v,(w,t) = |C|v[x: b],(w,t).

Then, we get by (S6) that

|P(C')|vi = |P(C)|v[x: b],i,

and similarly for F.
2.1.3.10. THEOREM. The law of lambda conversion. If x is substitutable by B in A then

⊨ ((λxβ Aα)(Bβ) = [A]x^B).

Proof: analogously as in Th. 1.1.4.5.

Logical symbols introduced via definitions

We adopt from EL the definitions of the symbols

⊤, ⊥, ¬, ∀, ∃, &, ∨, ⊃

without any alteration. (Cf. Section 1.1.5.) Note that ⊤ and ⊥ are rigid sentences.

As new logical symbols, we introduce the signs of necessity □ and possibility ◇ by the following contextual definitions:

(Df. □) □Ao =df (^A = ^⊤),
(Df. ◇) ◇Ao =df ¬□¬A.

One sees immediately that "□A" and "◇A" are rigid sentences. [Interestingly enough, a correct explicit definition of '□' is impossible.] The truth conditions of "□A" and "◇A" are obvious:

|□A|vi = 1 iff for all j ∈ I, |A|vj = 1.
|◇A|vi = 1 iff for some j ∈ I, |A|vj = 1.
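In a finite-worlds sketch (intensions as dicts from worlds to truth values, an assumption as before), the two truth conditions read:

```python
# |[]A|_vi = 1 iff |A|_vj = 1 for all j; |<>A|_vi = 1 iff for some j.
# Note that neither value depends on i: "[]A" and "<>A" are rigid.

def box(int_a, worlds):
    return 1 if all(int_a[j] == 1 for j in worlds) else 0

def diamond(int_a, worlds):
    return 1 if any(int_a[j] == 1 for j in worlds) else 0
```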
2.1.3.11. THEOREM. If Aαs and Bαs are rigid terms then

⊨ □(ˇAαs = ˇBαs) = (A = B).

Proof. Our assumptions on rigidity mean that for all i ∈ I, |A|vi = φ ∈ D(αs), and for all i ∈ I, |B|vi = ψ ∈ D(αs). Then for all j ∈ I, |ˇA|vj = φ(j), and |ˇB|vj = ψ(j). Now, |□(ˇA = ˇB)|vi = 1 iff for all j ∈ I, φ(j) = ψ(j), i.e., iff φ = ψ, i.e., iff |(A = B)|vi = 1.

COROLLARY. ⊨ □(Aα = Bα) = (^A = ^B).

Proof. Replace "^A" and "^B" for A and B, respectively, in the Theorem (noting that "^A" and "^B" are rigid terms), and delete "ˇ^" at the left side of the main '=' (by Th. 2.1.3.8).

The theorems and the lemmas of this section hold in both systems IL and IL+. In the next section 2.1.4 and in chapter 2.2, we shall deal with IL only.
2.1.4. THE GENERALIZED SEMANTICS OF IL
Having the standard semantics of IL, we are going to introduce the generalized semantics of IL - briefly: the GI-semantics. The calculus IC to be introduced in 2.2 will be proved both sound and complete with respect to this GI-semantics. (The enlarged system IL+ will not be touched upon here.) Our method will be the same as in 1.1.6 at the definition of the G-semantics of system EL. Hence, we can proceed very concisely.

2.1.4.1. DEFINITION. By a generalized interpretation - briefly: a GI-interpretation - of a language L(i) of IL we mean a quadruple Ip = (U, W, D, σ) satisfying the following conditions:

(i) U and W are nonempty sets.
(ii) D is a function defined on TYPM such that

D(o) = {0,1}, D(ι) = U, D(αβ) ⊆ D(α)^D(β), D(αs) ⊆ D(α)^W.

(iii) σ is a function defined on Con such that

C ∈ Con(α) ⇒ σ(C) ∈ Int(α) =df D(α)^W.

(iv) Whenever v is a valuation joining to Ip (satisfying the condition v(xα) ∈ D(α)), the semantic rules (S0) to (S5) in Def. 2.1.3.2 are applicable in determining the intensions (according to Ip and v) of the terms of L(i).

2.1.4.2. DEFINITION. Let Γ be a set of sentences of L(i), Ip a GI-interpretation of L(i), v a valuation joining to Ip, and w ∈ W. The triple (Ip, v, w) is to be said a GI-model of Γ iff for all A ∈ Γ, |A|vw^Ip = 1. The notions GI-satisfiability (GI-unsatisfiability), GI-consequence (Γ ⊨GI A), GI-validity (⊨GI A), and GI-synonymity are defined in the usual way. Also, the variant of Th. 1.1.6.3 holds:

2.1.4.3. THEOREM. All logical laws proved in the standard semantics of IL - in the previous section - are logical laws of the GI-semantics as well.

The most important laws - which will be used in 2.2 - are: the law of replacement (2.1.3.7), the law of deleting ˇ^ (2.1.3.8), the law of lambda conversion (2.1.3.10), and the law ⊨ □(ˇAαs = ˇBαs) = (A = B) where A, B are rigid terms (2.1.3.11).
2.2. GALLIN'S CALCULUS IC

2.2.1. DEFINITION OF IC

The calculus IC will be based on six basic schemata (I1) to (I6) and a single proof rule. Most of the basic schemata are known already from the calculus EC (cf. 1.2.1).

Basic schemata:

(I1) (ˇ^A = A)
(I2) = (E2)
(I3) = (E3)
(I4) = (E4)
(I5) = (E5)
(I6) □(ˇfαs = ˇgαs) = (f = g)

The notion of a basic sentence (of L(i)) is defined analogously as in EC (in 1.2.1).

Proof rule: Rule of replacement - RR. From "(Ao = Bo)" and Co to infer "C[B/A]" - exactly as in EC.

The notion of a proof in IC and the provability of a sentence A in IC - in symbols: "⊢IC A" - are defined analogously as the corresponding notions in EC. - In the notation "⊢IC", the subscript 'IC' will usually be omitted (except in cases when misunderstanding could arise by its omission).

At the end of the preceding section, Th. 2.1.4.3 verifies that the basic schemata (I1), (I5), and (I6) are GI-valid. (In case of (I6), take into consideration that the variables f, g are rigid terms.) By the same theorem, the rule RR yields a GI-valid sentence from GI-valid ones. The GI-validity of the schemata (I2), (I3), and (I4) can be verified easily. Hence:

2.2.1.1. THEOREM. The soundness of IC with respect to the GI-semantics of IL. If ⊢IC A then ⊨GI A.

2.2.1.2. THEOREM. IC includes EC: If ⊢EC A then ⊢IC A.

Proof. It is sufficient to show that schema (E1) is provable in IC. In fact:

1. ⊢ (ˇ^A = A) [by (I1)]
2. ⊢ (ˇ^A = A) [same]
3. ⊢ (A = A) [according to line 1, replace "ˇ^A" by A in line 2, using RR]

In what follows, we can utilize all laws of EC. The surplus of IC is hidden in the schemata (I1) and (I6). In the next section, we shall prove the modal laws of IC, most of which are based on (I1) and (I6).

2.2.2. THE MODAL LAWS OF IC

We shall prove first a generalization of (I6), similarly as we proved in EC the laws (E2*), (E3*), and (E4*) as generalizations of (E2), (E3), and (E4), respectively. Our proof technique will be the same as in EC, and we shall use the same method of reference (cf. the conventions introduced at the beginning of 1.2.2).
(I6*) If A, B ∈ RGD then ⊢ □(ˇAαs = ˇBαs) = (A = B). - Proof: From (I6) by E5.1, using that A and B - being rigid terms - are substitutable for f and g, respectively, in (I6).

COROLLARY:

I6.1. ⊢ □(Aα = Bα) = (^A = ^B). - Proof: Use (I6*), exploiting that {^A, ^B} ⊆ RGD, and delete ˇ^ by using (I1).

R□ - The rule of modal generalization. ⊢ Ao ⇒ ⊢ □A. - Proof:
1. ⊢ (A = ⊤) [from the ass.]
2. ⊢ (^A = ^A) [(E1)]
3. ⊢ (^A = ^⊤) [1/2]
4. ⊢ □A [Df. □: 3]

I1.1. ⊢ □Ao ⊃ A. - Proof:
1. ⊢ (^A = ^⊤) ⊃ ((λpos.ˇp)(^A) = (λp.ˇp)(^⊤)) [(E3+)]
2. ⊢ (^A = ^⊤) ⊃ (ˇ^A = ˇ^⊤) [λ: 1]
3. ⊢ (^A = ^⊤) ⊃ (A = ⊤) [(I1)/2]
4. ⊢ □A ⊃ A [Df. □ and E2.8: 3]

COROLLARY: ⊢ Ao ⊃ ◇A.

I1.2. If Ao ∈ RGD then ⊢ □(Ao ⊃ Bo) = (A ⊃ □B). - Proof:
1. ⊢ (⊤ ⊃ B) = B [PC]
2. ⊢ □(⊤ ⊃ B) = □B [1/(E1)]
3. ⊢ (⊤ ⊃ □B) = □B [PC]
4. ⊢ □(⊤ ⊃ B) = (⊤ ⊃ □B) [3/2]
5. ⊢ (⊥ ⊃ B) [PC]
6. ⊢ □(⊥ ⊃ B) [R□: 5]
7. ⊢ (⊥ ⊃ □B) [PC]
8. ⊢ □(⊥ ⊃ B) = (⊥ ⊃ □B) [PC: 6,7]
9. ⊢ □(A ⊃ B) = (A ⊃ □B) [R⊤⊥: 4,8]

(In the last step, we use that in "□(po ⊃ B) = (p ⊃ □B)", A is substitutable for p, since A is rigid.)

I1.3. If A ∈ RGD, ⊢ (□Ao = A). - Proof:
1. ⊢ ((^⊤ = ^⊤) = ⊤) [(E1) and E2.8]
2. ⊢ (□⊤ = ⊤) [Df. □: 1]
3. ⊢ □⊥ ⊃ ⊥ [I1.1]
4. ⊢ ⊥ ⊃ □⊥ [PC]
5. ⊢ (□⊥ = ⊥) [PC: 3,4]
6. ⊢ (□A = A) [R⊤⊥: 2,5]

COROLLARIES:

I1.4. ⊢ (□□A = □A), and ⊢ (◇□A = □A).

It follows from I1.3 that if Ao is rigid then ⊢ (¬□¬A = ¬¬A) and, hence:

I1.5. If Ao ∈ RGD, ⊢ (◇A = A).
COROLLARIES:

I1.6. ⊢ (◇◇A = ◇A), and ⊢ (□◇A = ◇A).

Furthermore, using that □A ⊃ A, and A ⊃ ◇A, we have that:

I1.7. ⊢ ◇□A ⊃ A, and ⊢ A ⊃ □◇A. (The Brouwer schemata.)

I1.8. ⊢ □(A ⊃ B) ⊃ (□A ⊃ □B), and ⊢ □(A ⊃ B) ⊃ (◇A ⊃ ◇B). - Proof:
1. ⊢ (□(A ⊃ B) & □A) ⊃ B [I1.1 twice, and PC]
2. ⊢ □[(□(A ⊃ B) & □A) ⊃ B] [R□: 1]
3. ⊢ (□(A ⊃ B) & □A) ⊃ □B [I1.2: (2 = 3)]
4. ⊢ □(A ⊃ B) ⊃ (□A ⊃ □B) [PC: 3]
5. ⊢ □(¬B ⊃ ¬A) ⊃ (□¬B ⊃ □¬A) [by 4]
6. ⊢ □(¬B ⊃ ¬A) = □(A ⊃ B) [PC, (E1), RR]
7. ⊢ □(A ⊃ B) ⊃ (◇A ⊃ ◇B) [PC: 5,6, and Df. ◇]
R□⊃. The Lemmon rules: ⊢ A ⊃ B ⇒ (⊢ □A ⊃ □B, and ⊢ ◇A ⊃ ◇B). - Proof:
1. ⊢ A ⊃ B [ass.]
2. ⊢ ¬B ⊃ ¬A [PC: 1]
3. ⊢ □(A ⊃ B) [R□: 1]
4. ⊢ □(¬B ⊃ ¬A) [R□: 2]
5. ⊢ □A ⊃ □B [(3 ⊃ 5) = I1.8]
6. ⊢ □¬B ⊃ □¬A [(4 ⊃ 6) = I1.8]
7. ⊢ ◇A ⊃ ◇B [PC: 6, and Df. ◇]

Lines 5 and 7 contain the results.
I1.9. ⊢ (□∀xα Ao = ∀x.□A). (Barcan's schema.) - Proof:
1. ⊢ ∀x.A ⊃ A [QC4]
2. ⊢ □∀x.A ⊃ □A [Lemmon: 1]
3. ⊢ □∀x.A ⊃ ∀x.□A [R∀: 2, and QC2]
4. ⊢ ∀x.□A ⊃ □A [QC4]
5. ⊢ ◇∀x.□A ⊃ ◇□A [Lemmon: 4]
6. ⊢ ◇□A ⊃ A [I1.7 (Brouwer)]
7. ⊢ ◇∀x.□A ⊃ A [PC: 5,6]
8. ⊢ ◇∀x.□A ⊃ ∀x.A [R∀: 7, and QC2]
9. ⊢ □◇∀x.□A ⊃ □∀x.A [Lemmon: 8]
10. ⊢ ∀x.□A ⊃ □◇∀x.□A [I1.7 (Brouwer)]
11. ⊢ ∀x.□A ⊃ □∀x.A [PC: 10,9]
12. ⊢ (□∀x.A = ∀x.□A) [PC: 3,11]

COROLLARY: ⊢ (◇∃x.A = ∃x.◇A). - (Prove this!) The next law will be used in Section 2.2.4.
I1.10. ⊢ (¬◇(B & A) & ◇(B & C)) ⊃ ◇(C & ¬A). - Proof:
1. ⊢ □((B & C) ⊃ (B & A)) ⊃ (◇(B & C) ⊃ ◇(B & A)) [I1.8]
2. ⊢ (¬◇(B & A) & ◇(B & C)) ⊃ ¬□((B & C) ⊃ (B & A)) [PC: 1]
3. ⊢ ¬((B & C) ⊃ (B & A)) ⊃ (C & ¬A) [PC]
4. ⊢ ¬□((B & C) ⊃ (B & A)) ⊃ ◇(C & ¬A) [Lemmon: 3]
5. ⊢ (¬◇(B & A) & ◇(B & C)) ⊃ ◇(C & ¬A) [PC: 2,4]

If the reader is familiar with modal logic, he/she realizes that the modal fragment of IC is an S5-type modal logic. Furthermore, the combination of quantifiers and modal operators yields a Barcan-style system characterized by the schema I1.9.
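The S5 character of these laws can be checked mechanically on small finite structures. The following brute-force sketch (a finite sanity check under assumed encodings, not a proof) verifies □A ⊃ A, the Brouwer schema A ⊃ □◇A, and the Barcan equivalence □∀x.A = ∀x.□A for a unary predicate over a constant domain:

```python
from itertools import product

worlds, dom = [0, 1], ['a', 'b']

def box(f):
    return lambda w: all(f(u) for u in worlds)   # universal accessibility (S5)

def dia(f):
    return lambda w: any(f(u) for u in worlds)

ok = True
# Propositional laws, A ranging over all functions worlds -> {False, True}:
for bits in product([False, True], repeat=len(worlds)):
    A = lambda w, bits=bits: bits[w]
    for w in worlds:
        ok = ok and ((not box(A)(w)) or A(w))        # []A -> A
        ok = ok and ((not A(w)) or box(dia(A))(w))   # A -> []<>A (Brouwer)
# Barcan, P ranging over all world-indexed unary predicates:
pairs = list(product(worlds, dom))
for bits in product([False, True], repeat=len(pairs)):
    P = dict(zip(pairs, bits))
    for w in worlds:
        lhs = all(all(P[(u, d)] for d in dom) for u in worlds)   # []Ax.P(x)
        rhs = all(all(P[(u, d)] for u in worlds) for d in dom)   # Ax.[]P(x)
        ok = ok and (lhs == rhs)
print(ok)
```

With universal accessibility and a constant domain, both sides of the Barcan schema amount to the same conjunction over all world-individual pairs, which is exactly why the check succeeds.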
2.2.3. IC-CONSISTENT AND IC-COMPLETE SETS

2.2.3.1. DEFINITION. (a) A sentence A is said to be an IC-consequence of the set of sentences Γ (or IC-deducible from Γ) - in symbols: "Γ ⊢IC A" - iff Γ is empty and ⊢IC A, or Γ is nonempty and there exists a conjunction K of some members of Γ such that ⊢IC K ⊃ A.
(b) Compare (a) with Def. 1.2.3.1, the definition of syntactic consequence in EC! One sees that the difference is merely in the reference to the calculus IC instead of EC. Consider Definitions 1.2.3.2-3. Substitute 'EC' everywhere by 'IC'. Then one gets the definitions of IC-inconsistent/IC-consistent sets, and of IC-complete sets.

2.2.3.2. THEOREM. If the members of Γ are free from the variable yα, the variable xα does not occur in the sentence [A]x^y, and Γ ⊢IC [A]x^y, then Γ ⊢IC ∀x.A. - For the proof see 1.2.3.4.

2.2.3.3. THEOREM. Every IC-consistent set is embeddable into an IC-complete set.

Proof. Consider the EC variant of this theorem, Th. 1.2.3.5, and its proof. The proof of our present theorem is essentially the same, with obvious modifications. The starting point is that Γ0 is an IC-consistent set of sentences of a language L0(i), and we shall define an enlargement L1(i) of L0(i) by introducing new variables in all types. Replace in the quoted proof everywhere: 'L0(e)' by 'L0(i)', 'L1(e)' by 'L1(i)', 'EXTY' by 'TYPM', and 'EC' by 'IC'.

2.2.3.4. THEOREM. If Γ is an IC-complete set of sentences then:
(i) Γ ⊢IC A ⇒ A ∈ Γ;
(ii) {(Aα = Bα), C} ⊆ Γ ⇒ "C[B/A]" ∈ Γ;
(iii) if the term Aα occurs in a member of Γ then for some variable xα, "(Aα = xα)" ∈ Γ;
(iv) if "(Cαβ = Dαβ)" ∈ Γ then for all variables xβ, "(C(x) = D(x))" ∈ Γ.
Proof: See Th. 1.2.3.6.
2.2.4. MODAL ALTERNATIVES

2.2.4.1. DEFINITION. We say that Φ is a modal alternative to Γ iff
(i) Γ and Φ are IC-complete sets of sentences (of the same language), and
(ii) for all sentences A, if "□A" ∈ Γ then A ∈ Φ.

2.2.4.2. LEMMA. If Φ is a modal alternative to Γ then for all Co ∈ Φ, "◇C" ∈ Γ.

Proof. Since Γ is IC-complete, one of "◇C", "¬◇C" must be in Γ. But "¬◇C" ∈ Γ implies "¬C" ∈ Φ (by (ii) of the preceding Def., "¬◇C" amounting to "□¬C"), which is impossible if C ∈ Φ (by the IC-consistency of Φ). Hence, "◇C" ∈ Γ.

Note that condition (ii) of Def. 2.2.4.1 is equivalent to the following condition (ii'): for all Ao ∈ Φ, "◇A" ∈ Γ. - Our lemma proves one half of this equivalence. Prove the other half!

2.2.4.3. THEOREM. Modal alternativeness (as defined in 2.2.4.1) is an equivalence relation (between IC-complete sets), i.e., it is reflexive, symmetric, and transitive.

Proof. (a) Reflexivity. If Γ is IC-complete then whenever "□A" ∈ Γ, A ∈ Γ (by I1.1).
(b) Symmetry. Assume that Γ' is a modal alternative to Γ. If "□A" ∈ Γ' then "◇□A" ∈ Γ (by the preceding lemma), and, by the Brouwer schema, A ∈ Γ. By this, Γ is a modal alternative to Γ'.
(c) Transitivity. Assume that Γ'' is a modal alternative to Γ', and Γ' is a modal alternative to Γ. We have to show that Γ'' is a modal alternative to Γ. If "□A" ∈ Γ then "□□A" ∈ Γ (by I1.4), hence "□A" ∈ Γ' and A ∈ Γ''.

Note that our theorem is not a general law of modal logic. It holds only for the S5-type modalities. Our proof exploits essentially the fact that S5 is included in IC.
2.2.4.4. THEOREM. If Γ is an IC-complete set and "◇B" ∈ Γ then there exists a modal alternative Φ to Γ such that B ∈ Φ.

Proof. (i) Let {Cn}n∈ω be an enumeration of all sentences of form "∃x.A" of the given language L(i). We shall define first a sequence {Hn}n∈ω of finite sets, and we shall denote by Kn the conjunction of all members of Hn.

H0 = {B}.

If "◇(Kn & Cn)" ∉ Γ then Hn+1 = Hn.

Now assume that "◇(Kn & Cn)" ∈ Γ and Cn is "∃xα.Ao". We can assume here that Kn is free from x (if not, let us "re-name" it), and so:

(1) ⊢ (Kn & ∃x.A) = ∃x(Kn & A) [by QC]
(2) ⊢ ◇(Kn & ∃x.A) = ◇∃x(Kn & A) [Lemmon: (1)]
(3) ⊢ ◇∃x(Kn & A) = ∃x.◇(Kn & A) [Barcan]

By these, "∃x.◇(Kn & A)" ∈ Γ. Then, by the ∃-completeness of Γ, for some yα, "◇(Kn & [A]x^y)" ∈ Γ. Furthermore:

(4) ⊢ (Kn & [A]x^y) ⊃ (Kn & [A]x^y & ∃x.A) [QC]
(5) ⊢ ◇(Kn & [A]x^y) ⊃ ◇(Kn & [A]x^y & ∃x.A) [Lemmon: (4)]

Hence, for some yα, "◇(Kn & [A]x^y & ∃x.A)" ∈ Γ. Choose such a y and define:

Hn+1 = Hn ∪ {∃x.A, [A]x^y} (then Kn+1 is "(Kn & ∃x.A & [A]x^y)").

Note that

(6) for all n ∈ ω, "◇Kn" ∈ Γ.

We now define:

Hω = ∪n∈ω Hn.

By this definition, Hω is ∃-complete, and, in consequence of (6): if K is any conjunction of some members of Hω then "◇K" ∈ Γ (for K being finite, it must be a subconjunction of some Kn).

(ii) Now let {An}n∈ω be an enumeration of all sentences of L(i). We define the sequence {Φn}n∈ω of sets by the following induction:

Φ0 = Hω;
Φn+1 = Φn if for some conjunction K of members of Φn, "¬◇(K & An)" ∈ Γ, and Φn+1 = Φn ∪ {An} otherwise.

Finally,

Φ = ∪n∈ω Φn.

By the definition of Φ, whenever K is a conjunction of some members of Φ, "◇K" ∈ Γ. Since

⊢ ¬K ⇒ ⊢ □¬K ⇒ "¬◇K" ∈ Γ,

the IC-consistency of Γ implies the IC-consistency of Φ.

(iii) Using that Hω ⊆ Φ, we have that Φ is ∃-complete. To show that Φ is IC-complete, we have to show that if Co ∉ Φ then Φ ∪ {C} is IC-inconsistent. Assume that C ∉ Φ. Then C is a member of our enumeration {An}n∈ω; say, C is Am. Then C ∉ Φm+1, which means that for some conjunction K of members of Φm, "¬◇(K & Am)" ∈ Γ. On the other hand, if K' is an arbitrary conjunction of some members of Φ then "◇(K & K')" ∈ Γ. However,

⊢ (¬◇(K & Am) & ◇(K & K')) ⊃ ◇(K' & ¬Am)

(cf. I1.10), which means that

(7) for all conjunctions K' of members of Φ, "◇(K' & ¬Am)" ∈ Γ.

Since "¬Am" occurs in our enumeration too - say, it is Ak - "¬Am" must be a member of Φk+1 (for "¬◇(K' & ¬Am)" ∈ Γ is excluded by (7) and the IC-consistency of Γ). Hence, "¬Am" ∈ Φ, that is, Φ ∪ {Am} is IC-inconsistent.

(iv) Finally, we have to show that Φ is a modal alternative to Γ and B ∈ Φ. The latter is obvious, for {B} = H0 ⊆ Hω = Φ0 ⊆ Φ. Now assume that "□A" ∈ Γ. By the IC-completeness of Φ, one of A, "¬A" must be in Φ. If "¬A" ∈ Φ then "◇¬A" ∈ Γ, contradicting the fact that "□A" ∈ Γ (and Γ is IC-consistent). Hence, A ∈ Φ.
°
W= {n
E tV: Wn }.
To show that W is complete we have to prove that whenever " 0, Cis B, , and C E Wk • 2.2.5. THE COMPLETENESS OF IC 2.2.5.1. THEOREM. If W is an IC-eomplete family then thereexists a GI-interpretation lp = {U, ~D, a} and a valuation v such that (1)
for all W E Wand for all A E W, IAl vw1p = 1.
Proof. Note first that the set of worlds W in the interpretation Ip is the same as the given IC-complete family W. Furthermore, consider the analogous Th. 1.2.4.1 of EC. We shall adapt some details from the proof of this theorem by the reference motto "as in EC". In case of such an adaptation, an obvious modification consists of introducing a reference to a world w: e.g., an expression of form "|A|v" is to be replaced by "|A|vw", and so on.

Part I: The definition of Ip and v. We shall define D and v by induction on TYPM. Let us choose a w0 from W. Since the variables are rigid terms, in defining v it is sufficient to refer to w0 only (owing to Lemma 2.2.4.6).

(a) For p ∈ Var(o) we define:

v(p) = 1 if p ∈ w0, and 0 otherwise;

and D(o) = {0,1}. As in EC, we have that there are po, qo such that v(p) = 1 and v(q) = 0; and v(p) = v(q) iff "(p = q)" ∈ w0.

(b) We define v for members of Var(ι) and the domain D(ι) = U exactly as in EC (but referring to w0 instead of Γ).

(c) Assume that D(α) and D(β) are defined, v is defined for Var(α) ∪ Var(β), and for γ ∈ {α, β}, (i) and (ii) below hold:

(i) a ∈ D(γ) ⇒ for some xγ, v(x) = a,
(ii) "(xγ = yγ)" ∈ w0 ⇒ v(x) = v(y).

We then define v for Var(αβ) and the domain D(αβ) similarly as in EC, putting w0 instead of Γ. Then (i) and (ii) hold for γ = αβ as well. Furthermore:

(2) "(fαβ(yβ) = xα)" ∈ w0 ⇒ v(f)(v(y)) = v(x).

(d) Turning to the type αs, we define: for all w ∈ W,

v(xαs)(w) = a iff for some yα, v(y) = a and "(ˇx = y)" ∈ w.

Here our induction assumptions are that v is defined for Var(α), D(α) is defined, and (i), (ii) above hold for γ = α. Now v(xαs) ∈ D(α)^W. Then:

D(αs) =df {φ : for some xαs, v(x) = φ} ⊆ D(α)^W.

Prove that (i) and (ii) hold for γ = αs, and

(3) "(ˇxαs = yα)" ∈ w ⇒ v(x)(w) = v(y).

(e) If C ∈ Con(α), then for all w ∈ W, there is an xα such that "(C = x)" ∈ w (cf. Th. 2.2.3.4, (iii)). We then define:

(4) σ(C)(w) = v(x) iff "(C = x)" ∈ w.

Now our definition of Ip and v is completed.
Part II: The proof of (1). (A) As in EC, we prove (1) firstly for identities of form "(Bα = yα)". If B is a variable or a term of form "f(x)" then - using that these terms are rigid ones - (1) holds according to the definition of Ip and v (see also (2) above, which holds not only for w0 but for all w ∈ W). If B is a constant then (1) holds by (4). In other cases, B is a compound term of form

(A1) "Fαβ(Cβ)", or
(A2) "(λxβ Cγ)", or
(A3) "(Cγ = Dγ)", or
(A4) "^Cα", or
(A5) "ˇCαs".

The proof for the cases (A1), (A2), and (A3) runs similarly as in EC (put "w ∈ W" for Γ, and "|X|vw" for "|X|v" everywhere). Let us turn to the remaining two cases.

(A4) If "(^Cα = yαs)" ∈ w' ∈ W then, by (I6*), using that "^C" and y are rigid terms, "□(C = ˇy)" ∈ w'. Then, for all w ∈ W, "(C = ˇy)" ∈ w. Furthermore, for all w ∈ W, there is an xw ∈ Var(α) such that "(ˇy = xw)" ∈ w. From these it then follows that for all w, "(C = xw)" ∈ w. By (3),

(5) |(ˇy = xw)|vw = 1.

We can apply the induction assumptions:

(6) for all w, |(C = xw)|vw = 1.

From (5) and (6) it then follows that

for all w, |(C = ˇy)|vw = 1,

and this implies that (for all w) |□(C = ˇy)|vw = 1, or, in another form,

(for all w ∈ W, including the case w = w') |(^C = y)|vw = 1.

(A5) If "(ˇCαs = yα)" ∈ w ∈ W then for some variable xαs, "(C = x)" ∈ w and "(ˇx = y)" ∈ w. By (3), we have that

(7) |(ˇx = y)|vw = 1,

and by the induction assumption, |(C = x)|vw = 1. The latter implies that

|(ˇC = ˇx)|vw = 1.

This and (7) together imply that

|(ˇC = y)|vw = 1.

(B) Now we can prove (1) for identities of form "(Bα = Cα)" - where both B and C may be compound terms - exactly as in EC [cf. the proof of 1.2.4.1, Part II, (B)].
(C) Finally, if the sentence A is not an identity then apply the same device as in EC [see the proof of 1.2.4.1, Part II, (C)].

2.2.5.2. THEOREM. If the set Γ is IC-consistent then Γ is GI-satisfiable.

Proof. By Th. 2.2.3.3, Γ is embeddable into an IC-complete set w0, and by Th. 2.2.4.7, w0 is embeddable into an IC-complete family W. By the preceding theorem, there is a GI-interpretation Ip = (U, W, D, σ) and a valuation v such that the triple (Ip, v, w0) is a GI-model of w0, and, since Γ ⊆ w0, it is a GI-model of Γ as well.

COROLLARY. (Löwenheim-Skolem.) If the set Γ is GI-satisfiable then Γ is "denumerably" satisfiable in the sense that Γ has a GI-model ((U, W, D, σ), v, w) such that each D(α) [α ∈ TYPM] is at most denumerably infinite. - Cf. Th. 1.2.4.4.

2.2.5.3. THEOREM. The completeness of IC with respect to the GI-semantics of IL. If Γ ⊨GI A then Γ ⊢IC A. - For the proof see Th. 1.2.4.3.
2.3. APPLICATIONS OF IL

2.3.1. A FRAGMENT OF ENGLISH: LE

In several papers, Montague formulated some fragments of English as a formal(ized) language, giving the lexicon, the syntax, and the semantics of the fragment. In his last two works (Universal Grammar and PTQ), the semantics of the fragment was not formulated directly; instead, he formulated translation rules from the English fragment into the language of IL (or IL+); thus, the semantics of the fragment was indirectly given via the semantics of Montague's intensional logic. In what follows, we shall present the approach explained in PTQ (with minor changes in the notation).

Let us call the fragment of English treated here LE. (Montague qualifies it as "a certain fragment of a certain dialect of English"; here the reference to "a certain dialect" will mean that LE involves some compound terms unusual in ordinary English.)

We shall begin with the definition of the system of categories of LE. (Here the categories correspond to the types of logical languages.) The basic categories are t and e: the category of declarative sentences and of individual names, respectively. The functor categories are of form "α/β" and "α//β" where the difference between the single and the double slash ('/') is of grammatical nature only, not concerning the semantic values. (This will be illuminated below in the particular cases.) An expression of either category is to be such that when it is combined (in a specified way which is different for the two categories) with an expression of category β, an expression of category α is produced. The functor categories explicitly used in LE are as follows:

IV = t/e, the category of intransitive verb phrases.

CN = t//e, the category of common noun phrases.

[Comment. Intransitive verbs and common nouns are, obviously, different categories of English (this holds for most languages); they may get different suffixes etc. From a logical point of view, both are monadic predicates, which means that their semantic values belong to the same domain. This motivates the notation.]

NOM = t/IV, the category of nominal phrases. - Roughly, nominal phrases are expressions which can occupy the subject (and the direct object) places of verbs.

[Remark. This category was denoted by 'T' and called 'the category of terms' by Montague. We use the word 'term' in the sense of 'well-formed expression' of some language. To avoid confusion, we have to abandon here Montague's original notation.]

TV = IV/NOM, the category of transitive verb phrases.

ADV = IV/IV, the category of IV-modifying adverbs.

VIV = IV//IV, the category of IV-taking verb phrases.
[Remark. An example of a VIV phrase is 'try to', e.g., in 'try to find'. Again, the grammatical rules governing ADV and VIV phrases are different(as we shall see this later on), but when applied to an IV phrase both result a compound IV phrase. This motivatesthe notation.] ADS =tIt, the category of sentence-modifying adverbs. SV = IVI t, the categoryof sentence-taking verb phrases. PRE =ADVI NOM, the categoryof adverb-making prepositions. The full inductive definition of categories of L E is as follows: t and e are categories. If a, P are categoriesthen "a I P" and "a II P" are categories.
However, only the categories listed above will play a role in LE. Concerning the lexicon, the set of basic terms of category α will be denoted by "B(α)". These sets are given explicitly as follows:

B(IV) = {run, walk, talk, rise, change},
B(CN) = {man, woman, park, fish, pen, unicorn, price, temperature},
B(TV) = {find, lose, eat, love, date, be, seek, conceive},
B(NOM) = {John, Mary, Bill, ninety, he₀, he₁, he₂, …},
B(ADV) = {rapidly, slowly, voluntarily, allegedly},
B(ADS) = {necessarily},
B(VIV) = {try to, wish to},
B(SV) = {believe that, assert that},
B(PRE) = {in, about},
B(α) = ∅ if α is any category other than those mentioned above.

We used here script letters in printing the basic terms of LE. In what follows, even the compound expressions of LE will be printed similarly, and we shall not use quotation marks surrounding them (except when the quoted text involves metavariables). One can think that the sets B(α) contain only sample terms and that they could be enlarged by further "similar" terms. Probably, this is true. However, one must be cautious in doing so, for it may happen that the application of the further rules to the new terms will lead to undesirable results.

The set B(NOM) contains a potentially infinite sequence of pronouns with numerical subscripts, (heₙ)ₙ∈ω. This will be useful in the construction of some complicated sentences. The basic sets B(t) and B(e) are empty. This is so because (a) in English, there are no one-word sentences, and (b) individual names are ranked into B(NOM) (why? - the answer will be given later on).
Note that, when applying the Montagovian approach to Hungarian, the basic set B(t) might contain such "subject-free" sentences as havazik, villámlik, tavaszodik ('it is snowing', 'it is lightning', 'spring is coming') etc.
The set of all terms of category α will be denoted by "T(α)". The inductive definition of these sets will be given by the syntactic rules (S1) to (S17) below.

Syntactic rules

Basic rules

(S1) B(α) ⊆ T(α) for every category α.

(S2) If A ∈ T(CN) then the terms "every A", "the A", and "a/an A" are in T(NOM). - Here the notation "a/an A" is to be understood as choosing the indefinite article a or an according as the initial letter of A is a consonant or a vowel. Note that the spaces (blanks) in the defined complex terms represent interspaces. - Examples: Since {man, woman, park} ⊆ B(CN) ⊆ T(CN), {every man, a woman, the park} ⊆ T(NOM).

(S3ₙ) If A ∈ T(CN) and S ∈ T(t) then "A such that S₍ₙ₎" ∈ T(CN), where S₍ₙ₎ comes from S by replacing each occurrence of heₙ or himₙ by he/she/it or him/her/it respectively, according as the first common noun in A is of masculine/feminine/neuter gender. [Here it is assumed that the gender of the members of B(CN) is given in advance.] - Example: Assuming that he₁ loves him₀ ∈ T(t) (cf. the example of (S5) below), woman such that she loves him₀ ∈ T(CN), by (S3₁).

Rules of functor application

(S4) If A ∈ T(NOM) and B ∈ T(IV) then "A B′" ∈ T(t), where B′ is the result of replacing the first verb (i.e., member of B(IV) ∪ B(TV) ∪ B(SV) ∪ B(VIV)) in B by its third person singular present. [Here it is assumed that the third person singular present form of each verb occurring in the lexicon is known.] - Example: Since a woman ∈ T(NOM) and talk ∈ T(IV), a woman talks ∈ T(t).

(S5) If B ∈ T(TV) and A ∈ T(NOM) then "B A′" ∈ T(IV), where A′ is himₙ if A is of form heₙ, and A′ is A in other cases. - Example: Since love ∈ T(TV) and he₀ ∈ T(NOM), love him₀ ∈ T(IV), and, by (S4), he₁ loves him₀ ∈ T(t).

(S6) If B ∈ T(PRE) and A ∈ T(NOM) then "B A′" ∈ T(ADV), where A′ is as in (S5). - Example: Since in ∈ B(PRE) and the park ∈ T(NOM), in the park ∈ T(ADV).

(S7) If B ∈ T(SV) and S ∈ T(t) then "B S" ∈ T(IV). - Example: Since believe that ∈ T(SV) and he₁ loves him₀ ∈ T(t), believe that he₁ loves him₀ ∈ T(IV).

(S8) If B ∈ T(VIV) and C ∈ T(IV) then "B C" ∈ T(IV). - Example: Since try to ∈ B(VIV) and run ∈ B(IV), try to run ∈ T(IV).

(S9) If B ∈ T(ADS) and S ∈ T(t) then "B, S" ∈ T(t). - Example: Since necessarily ∈ B(ADS) and a woman talks ∈ T(t), necessarily, a woman talks ∈ T(t).

(S10) If B ∈ T(ADV) and C ∈ T(IV) then "C B" ∈ T(IV). - Example: Since slowly ∈ B(ADV) and walk ∈ B(IV), walk slowly ∈ T(IV).
Rules of conjunction and alternation

(S11) If S₁, S₂ ∈ T(t) then "S₁ and S₂" and "S₁ or S₂" are in T(t).

(S12) If A, B ∈ T(IV) then "A and B" and "A or B" are in T(IV).

(S13) If A, B ∈ T(NOM) then "A or B" ∈ T(NOM). [Here the and operation is absent, for "A and B" in subject position requires the plural of the verb, but in this fragment only the third person singular form of verbs is used.]

Rules of quantification

(S14ₙ) If A ∈ T(NOM) and B ∈ T(t) then "B[A/n]" ∈ T(t), where
(i) if A is of form heₖ then B[A/n] comes from B by replacing all occurrences of heₙ or himₙ by heₖ or himₖ respectively,
(ii) and, in other cases, B[A/n] comes from B by replacing the first occurrence of heₙ or himₙ by A and all other occurrences of heₙ or himₙ by he/she/it or him/her/it respectively, according as the gender of the first common noun or nominal in A is masculine/feminine/neuter.
Example: Since a woman such that she loves him₀ ∈ T(NOM) and love ∈ T(TV), love a woman such that she loves him₀ ∈ T(IV) (by (S5)), and he₀ loves a woman such that she loves him₀ ∈ T(t). Using that every man ∈ T(NOM), we get by (S14₀) that

(1) every man loves a woman such that she loves him ∈ T(t).

(S15ₙ) If A ∈ T(NOM) and B ∈ T(CN) then "B[A/n]" ∈ T(CN), where B[A/n] is as in (S14ₙ).

(S16ₙ) If A ∈ T(NOM) and B ∈ T(IV) then "B[A/n]" ∈ T(IV), where B[A/n] is as in (S14ₙ).

Rules of tense and negation

(S17) If A ∈ T(NOM) and B ∈ T(IV) then "A B⁻", "A Bᶠ", "A B⁻ᶠ", "A Bᴾ", and "A B⁻ᴾ" are in T(t), where
B⁻ is the result of replacing the first verb in B by its negative third person singular present,
Bᶠ is the result of replacing the first verb in B by its third person singular future,
B⁻ᶠ is the result of replacing the first verb in B by its negative third person singular future,
Bᴾ is the result of replacing the first verb in B by its third person singular present perfect, and
B⁻ᴾ is the result of replacing the first verb in B by its negative third person singular present perfect.

As we see, the major operation of forming a sentence consists of a combination of a nominal phrase and an intransitive verb phrase where the latter may be in a tensed
and/or negative form. The rules of this operation are (S4) and (S17). The precise characterisation of the notions occurring in (S17) - such as the (negative) third person singular future or present perfect form of a verb - may be given, as Montague says, "in an obvious and traditional way"; but the author gives no details. Unfortunately, no example of a tensed or negated verb occurs in PTQ.

The most important novelty of this syntax is the rule (S14ₙ) [and its variants (S15ₙ) and (S16ₙ)], not occurring in the earlier writings of Montague. Without this rule, the construction of sentence (1) would be impossible. These rules are, in fact, rules of substituting (free) pronouns. The term which is substituted for the pronoun is often a quantifying expression (as in (1): every man); probably, this is the reason that Montague speaks of rules of quantification.

The construction of sentences may be demonstrated by analysis trees (often used by theoretical linguists). For example, the analysis tree of sentence (1) is as follows. [The numbers in square brackets refer to the number of the syntactic rule applied at the indicated step.]

every man loves a woman such that she loves him [14₀]
├── every man [2]
│   └── man
└── he₀ loves a woman such that she loves him₀ [4]
    ├── he₀
    └── love a woman such that she loves him₀ [5]
        ├── love
        └── a woman such that she loves him₀ [2]
            └── woman such that she loves him₀ [3₁]
                ├── woman
                └── he₁ loves him₀ [4]
                    ├── he₁
                    └── love him₀ [5]
                        ├── love
                        └── he₀
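Such analysis trees can be represented mechanically. Below is a sketch (Python; the nested-triple encoding and the helper are assumptions of this sketch) of the tree for sentence (1) as (term, rule, children) triples, with a helper listing the rules applied in top-down, left-to-right order:

```python
# Sketch: an analysis tree as nested (term, rule, children) triples;
# leaves are basic terms (rule = None).
tree = ("every man loves a woman such that she loves him", "S14_0",
        [("every man", "S2", [("man", None, [])]),
         ("he0 loves a woman such that she loves him0", "S4",
          [("he0", None, []),
           ("love a woman such that she loves him0", "S5",
            [("love", None, []),
             ("a woman such that she loves him0", "S2",
              [("woman such that she loves him0", "S3_1",
                [("woman", None, []),
                 ("he1 loves him0", "S4",
                  [("he1", None, []),
                   ("love him0", "S5",
                    [("love", None, []), ("he0", None, [])])])])])])])])

def rules_applied(node):
    """Collect the rule labels of a tree, root first, left to right."""
    term, rule, children = node
    return ([rule] if rule else []) + [r for c in children for r in rules_applied(c)]
```

Each interior node is licensed by exactly one syntactic rule, which is what the bracketed numbers in the diagram record.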
The term in each node is either a basic term or else it comes from the terms standing in inferior nodes by means of the indicated rule.

Montague acknowledges that some sentences of LE are ambiguous. Such a sentence has two (or more) essentially different analysis trees. His example is:

(2) John seeks a unicorn.

Here follow the two different analysis trees of (2):
John seeks a unicorn [4]
├── John
└── seek a unicorn [5]
    ├── seek
    └── a unicorn [2]
        └── unicorn

John seeks a unicorn [14₀]
├── a unicorn [2]
│   └── unicorn
└── John seeks him₀ [4]
    ├── John
    └── seek him₀ [5]
        ├── seek
        └── he₀
As Montague says, "the first of these trees correspond to the de dicto (or nonreferential) reading of the sentence, and the second to the de re (or referential) reading." In other words: the first reading does not presuppose that there are unicorns, whereas the second reading may be paraphrased as follows: "There exists a unicorn such that John wants to find it." The translation rules (given in the next section) will verify these statements.

In his Universal Grammar, Montague used four sorts of parentheses in order to get an unambiguous fragment of English. In the present LE, no parentheses are used. The disambiguation of an ambiguous sentence may be done here by supplying an analysis tree for the sentence. Thus, we can say that a pair ⟨S, T(S)⟩ - where S is a sentence of LE and T(S) is an analysis tree of S - represents a disambiguated sentence of LE. (To make these notions more exact needs some work; e.g., it is necessary to define the relation "not essentially different" between analysis trees of the same sentence. For example, applying he₃ instead of he₁ in the analysis tree of sentence (1), the resulting tree is not essentially different from the one in which he₁ is applied. But let us neglect this problem here.)

2.3.2. TRANSLATION RULES FROM LE INTO L^(i)
The first step towards the translation is to define a mapping f from the categories of LE to the types of L^(i). The intention is that if A ∈ T(α) then the translation of A is to be of type f(α). The inductive definition of the function f is as follows:

f(t) = o,    f(e) = ι,    f(α/β) = f(α//β) = f(α)(f(β)ˢ).

This means that every functor of LE counts as an intensional one, in the sense of operating on the intension of its argument. Experience shows that this is not always the case; e.g., run, man, find are extensional predicates. Montague does not deny this fact; his remedy will be treated in the next section. A tableau of the mapping f:

α:      IV and CN    NOM        TV           ADV and VIV
f(α):   oιˢ          o(oιˢ)ˢ    (oιˢ)(εˢ)    (oιˢ)((oιˢ)ˢ)

We use here the abbreviation ε = o(oιˢ)ˢ (= f(NOM)).
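The recursion in f can be sketched directly (Python; the string encoding of types - "o", "i", and a trailing "^s" for sense types - is an assumption of this sketch, not the book's notation):

```python
# Sketch of the type mapping f: f(t)=o, f(e)=i,
# f(a/b) = f(a//b) = f(a) applied to the sense type f(b)^s.
def f(cat):
    if cat == "t":
        return "o"
    if cat == "e":
        return "i"
    result, arg = cat          # functor category as a pair (result, argument)
    return "(" + f(result) + "(" + f(arg) + "^s))"

IV = ("t", "e")                # t/e ; CN = t//e gets the same type
NOM = ("t", IV)
TV = (IV, NOM)
```

Evaluating f(NOM) unfolds to (o((o(i^s))^s)), i.e., the type abbreviated ε above.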
Concerning the translation of the basic terms, Montague prescribes that if A ∈ B(α) then - with some exceptions - the translation of A is a member of Con(f(α)), of course, with the precondition that the translations of different terms must be different. The exceptions are the nominal terms (members of B(NOM)), the transitive verb be, and the sentence modifier necessarily; their translations are defined separately. Montague assumes a unique language of IL⁺, and, hence, he was compelled to choose some constants of this language as the translations of the basic terms of LE. However, we have always spoken about a family of IL⁺ languages. Hence, we can assume that there is a particular language

L_e^(i) = ⟨Log, Var, Con_e, Cat_e⟩

such that the (nonlogical) constants of this language are just the basic terms of LE. We shall proceed this way, whereby the translation rules will be somewhat simpler. The translation of a term A will be denoted by "[A]*", but the square brackets will be omitted if A consists of a single word. The translation rules (Tr1) to (Tr17) correspond to the syntactic rules (S1) to (S17) of the preceding section.

Translation rules

Basic rules

(Tr1) (a) If A ∈ {John, Mary, Bill, ninety} then A ∈ Con_e(ι) and A* = (λf.ˇf(ˆA)) ∈ Cat_e(ε).
(b) heₙ* = (λf.ˇf(ξₙ)) ∈ Cat_e(ε), where ξₙ is the 2n-th member of Var(ιˢ).
(c) be* = (λg(λx.ˇg(ˆ(λy(ˇx = ˇy))))) ∈ Cat_e(f(TV)),
(d) necessarily* = (λp.□ˇp) ∈ Cat_e(o(oˢ)).
(e) If A is a basic term of category α not occurring in (a), (b), (c), (d) above then A ∈ Con_e(f(α)) and A* = A.
[Here f ∈ Var((oιˢ)ˢ), g ∈ Var(εˢ), x, y ∈ Var(ιˢ), and p ∈ Var(oˢ).]
Note that by (a) and (b), proper nouns and pronouns are transferred from type ι into type ε. (Tr2) below shows that the translations of compound nominal terms belong to the same type. This is the reason that Montague ranked the proper nouns and pronouns into the category NOM (instead of category e).
(Tr2) If A ∈ T(CN) then
[every A]* = (λf.∀x[A*(x) ⊃ ˇf(x)]) ∈ Cat_e(ε),
[the A]* = (λf.∃y[∀x(A*(x) ≡ (x = y)) & ˇf(y)]) ∈ Cat_e(ε),
[a/an A]* = (λf.∃x[A*(x) & ˇf(x)]) ∈ Cat_e(ε).

(Tr3ₙ) If A ∈ T(CN) and S ∈ T(t) then [A such that S₍ₙ₎]* = (λξₙ(A*(ξₙ) & S*)) ∈ Cat_e(oιˢ). [For the meaning of S₍ₙ₎ see (S3ₙ).]

Rules of functor application

(Tr4) If A ∈ T(NOM) and B ∈ T(IV) then [A B′]* = A*(ˆB*). [For the meaning of B′ see (S4).]
(Tr5) If B ∈ T(TV) and A ∈ T(NOM) then [B A′]* = B*(ˆA*).
(Tr6) If B ∈ T(PRE) and A ∈ T(NOM) then [B A′]* = B*(ˆA*).
(Tr7) If B ∈ T(SV) and S ∈ T(t) then [B S]* = B*(ˆS*).
(Tr8) If B ∈ T(VIV) and C ∈ T(IV) then [B C]* = B*(ˆC*).
(Tr9) If B ∈ T(ADS) and S ∈ T(t) then [B, S]* = B*(ˆS*).
(Tr10) If B ∈ T(ADV) and C ∈ T(IV) then [C B]* = B*(ˆC*).

Rules of conjunction and alternation

(Tr11) If S₁, S₂ ∈ T(t) then [S₁ and S₂]* = (S₁* & S₂*), and [S₁ or S₂]* = (S₁* ∨ S₂*).
(Tr12) If A, B ∈ T(IV) then [A and B]* = (λx(A*(x) & B*(x))) and [A or B]* = (λx(A*(x) ∨ B*(x))).
(Tr13) If A, B ∈ T(NOM) then [A or B]* = (λf(A*(f) ∨ B*(f))).

Rules of quantification

(Tr14ₙ) If A ∈ T(NOM) and B ∈ T(t) then [B[A/n]]* = A*(ˆ(λξₙ B*)). [For the meaning of "B[A/n]", see (S14ₙ).]
(Tr15ₙ) If A ∈ T(NOM) and B ∈ T(CN) then [B[A/n]]* = (λy.A*(ˆ[λξₙ.B*(y)])).
(Tr16ₙ) If A ∈ T(NOM) and B ∈ T(IV) then [B[A/n]]* = (λy.A*(ˆ[λξₙ.B*(y)])).

Rules of tense and negation

(Tr17) If A ∈ T(NOM) and B ∈ T(IV) then
[A B⁻]* = ¬A*(ˆB*),
[A Bᶠ]* = F A*(ˆB*),
[A B⁻ᶠ]* = ¬F A*(ˆB*),
[A Bᴾ]* = P A*(ˆB*),
[A B⁻ᴾ]* = ¬P A*(ˆB*).
[Concerning the superscripts of B (B⁻, Bᶠ, etc.), see (S17).]
Examples of translation

We shall use the sign '≃' for expressing logical synonymity, i.e., "A ≃ B" abbreviates "⊨ (A ≡ B)", and we shall exploit the transitivity of this relation by sometimes writing "A ≃ B ≃ C ≃ …". Numbers in square brackets refer to the numbers of the translation rules applied. The frequent occurrence of "(λf.ˇf(A))" will sometimes be abbreviated by "(A)⁺".

As the first example, let us take the step by step translation of sentence (1) of the preceding section.

(a) [love him₀]* = love(ˆhe₀*) [1e,5]
≃ love(ˆ(λf.ˇf(ξ₀)))
≃ love(ˆ(ξ₀)⁺) [by our abbreviation]

(b) [he₁ loves him₀]* = he₁*(ˆ[love him₀]*) [1b,4]
≃ (λf.ˇf(ξ₁))(ˆ(love(ˆ(ξ₀)⁺))) [by (a)]
≃ ˇˆ[love(ˆ(ξ₀)⁺)](ξ₁) [by λ-conv.]
≃ (love(ˆ(ξ₀)⁺))(ξ₁) [by deleting ˇˆ]

(c) [woman such that she loves him₀]* = (λξ₁(woman(ξ₁) & [he₁ loves him₀]*)) [1e,3₁]
≃ (λξ₁[woman(ξ₁) & (love(ˆ(ξ₀)⁺))(ξ₁)]) [by (b)].

Let C abbreviate the term a woman such that she loves him₀.

(d) C* = [a woman such that she loves him₀]* =
= (λf.∃x([woman such that she loves him₀]*(x) & ˇf(x))) ≃
≃ (λf.∃x[(λξ₁[woman(ξ₁) & (love(ˆ(ξ₀)⁺))(ξ₁)])(x) & ˇf(x)]) ≃
≃ (λf.∃x[woman(x) & (love(ˆ(ξ₀)⁺))(x) & ˇf(x)]).
[We used here (Tr2), (c), and λξ₁-conversion.]

(e) [love C]* = love(ˆC*) [5]
≃ love(ˆ(λf.∃x[woman(x) & (love(ˆ(ξ₀)⁺))(x) & ˇf(x)])) [(d)]

(f) [he₀ loves C]* = (λf.ˇf(ξ₀))(ˆ[love C]*) [1b,4]
≃ ˇˆ[love C]*(ξ₀) ≃ [love C]*(ξ₀) [λf and deleting ˇˆ]

(g) [every man]* = (λf.∀y[man(y) ⊃ ˇf(y)]) [1e,2]

(h) [every man loves C]* = [every man]*(ˆ(λξ₀[he₀ loves C]*)) [14₀]
≃ ∀y[man(y) ⊃ ˇˆ(λξ₀[he₀ loves C]*)(y)] [(g), λf]
≃ ∀y(man(y) ⊃ [[he₀ loves C]*](ξ₀ := y)) [del. ˇˆ, λξ₀]
Finally, we have that

(1) [every man loves a woman such that she loves him]* ≃
≃ ∀y[man(y) ⊃ love(ˆ(λf.∃x[woman(x) & love(ˆ(y)⁺)(x) & ˇf(x)]))(y)].
The ambiguous sentence (2) of the preceding section has two different translations. Its de dicto meaning is expressed as follows:

(a) [seek a unicorn]* = seek(ˆ[a unicorn]*) [5]
≃ seek(ˆ(λf.∃x[unicorn(x) & ˇf(x)]))

(2.1) [John seeks a unicorn]* = John*(ˆ[seek a unicorn]*) [1a,4]
≃ (λf.ˇf(ˆJohn))(ˆ[seek(ˆ(λf.∃x(unicorn(x) & ˇf(x))))])
≃ seek(ˆ(λf.∃x[unicorn(x) & ˇf(x)]))(ˆJohn) [by λf and deleting ˇˆ].
The translation of the de re reading is as follows:

(a) [John seeks him₀]* = seek(ˆ(ξ₀)⁺)(ˆJohn) [5,4]

(b) [a unicorn]* = (λf.∃x[unicorn(x) & ˇf(x)]) [2]

(2.2) [John seeks a unicorn]* = [a unicorn]*(ˆ(λξ₀[John seeks him₀]*)) [14₀]
≃ (λf.∃x[unicorn(x) & ˇf(x)])(ˆ(λξ₀.seek(ˆ(ξ₀)⁺)(ˆJohn)))
≃ ∃x[unicorn(x) & seek(ˆ(x)⁺)(ˆJohn)] [λf, λξ₀, del. ˇˆ]
Our next examples will refer to the verb be.

(3) [be Bill]* = be*(ˆ(λf.ˇf(ˆBill))) =
= (λg(λx.ˇg(ˆ(λy(ˇx = ˇy)))))(ˆ(λf.ˇf(ˆBill))) ≃
≃ (λx.ˇˆ(λf.ˇf(ˆBill))(ˆ(λy(ˇx = ˇy)))) [λg]
≃ (λx.ˇˆ(λy(ˇx = ˇy))(ˆBill)) [λf]
≃ (λx(ˇx = ˇˆBill)) ≃ (λx(ˇx = Bill)) [λy, del. ˇˆ]

(The references '[λg]' etc. refer to λ-conversions with respect to the variable following λ.)

(4) [he₀ is Bill]* = (λf.ˇf(ξ₀))(ˆ(λx(ˇx = Bill))) ≃
≃ ˇˆ(λx(ˇx = Bill))(ξ₀) ≃ (ˇξ₀ = Bill) [λf, λx, del. ˇˆ]

[the temperature]* = (λf.∃y[∀x(temperature(x) ≡ (x = y)) & ˇf(y)]) [2]
[be ninety]* = (λx(ˇx = ninety)) [cf. (3)]

(5) [the temperature is ninety]* ≃
≃ ∃y[∀x(temperature(x) ≡ (x = y)) & (ˇy = ninety)].

(6) [the temperature rises]* = ∃y[∀x(temperature(x) ≡ (x = y)) & rise(y)].

(7) [ninety rises]* = rise(ˆninety).
In referring to the examples (5), (6), and (7), Montague wrote: "From the premises the temperature is ninety and the temperature rises, the conclusion ninety rises would appear to follow by normal principles of logic; yet there are occasions on which both premises are true, but none on which the conclusion is." (This example is due to Barbara Hall Partee.) Now, according to the translations above, the argument in question turns out not to be valid. The reason, according to Montague, is this: "The temperature 'denotes' an individual concept, not an individual; and rises, unlike most verbs, depends for its applicability on the full behaviour of individual concepts, not just on their extensions with respect to the actual world and (what is more relevant here) moment of time. Yet the sentence the temperature is ninety asserts the identity not of two individual concepts but only of their extensions."

Montague continues: "We thus see the virtue of having intransitive verbs and common nouns denote sets of individual concepts rather than sets of individuals - a consequence of our general development that might at first appear awkward and unnatural." We can add that the analogous treatment of transitive verbs can be appreciated in the light of the translation in example (2.1) above.

Montague remarks also that his translation rule for be adequately covers both the is of identity and the is of predication. Concerning identity, we have examples (4) and (5) above. Now let us take an example of the predicative/copulative use of is.

(8) [be a man]* = be*(ˆ[a man]*) =
= (λg(λx.ˇg(ˆ(λy(ˇx = ˇy)))))(ˆ(λf.∃z[man(z) & ˇf(z)])) [1c,5]
≃ (λx.ˇˆ(λf.∃z[man(z) & ˇf(z)])(ˆ(λy(ˇx = ˇy)))) [λg]
≃ (λx.∃z[man(z) & ˇˆ(λy(ˇx = ˇy))(z)]) [λf]
≃ (λx.∃z[man(z) & (ˇx = ˇz)]) [λy]

[Bill is a man]* = (λf.ˇf(ˆBill))(ˆ(λx.∃z[man(z) & (ˇx = ˇz)])) ≃
≃ ˇˆ(λx.∃z[man(z) & (ˇx = ˇz)])(ˆBill) [λf]
≃ ∃z[man(z) & (Bill = ˇz)] [λx]

We do not have here '(Bill = z)', hence, we cannot get the result man(ˆBill) [still less man(Bill)]. This is so because man is in Con_e(oιˢ), not in Con_e(oι). It seems here (and
in the earlier examples, too) that the translation rules are "over-intensionalized". Maybe some functors in our lexicon are really intensional ones, but most of them are extensional. To get rid of the superfluous intensions, Montague introduces some restrictions on the possible interpretations of L_e^(i).

2.3.3. REDUCTION OF INTENSIONALITY: MEANING POSTULATES

Montague suggests the following restrictions (M1) to (M7) with respect to the admissible interpretations of L_e^(i).
(M1) The proper names in B(NOM) must be rigid names. In other words, if A ∈ {John, Bill, Mary, ninety} then the sentence

∃x_ι □(x = A)

must be valid.

(M2) The common nouns in B(CN) - except price and temperature - are to be extensional predicates: if A is one of these terms then the sentence

∃g □∀x[A(x) ≡ ˇg(ˇx)]

must be valid. Then, in the role of ˇg, the term

(λx_ι A(ˆx))   (abbreviated A•)

will be suitable. We can replace in the translations A by A• (and its argument B by ˇB) everywhere. E.g., example (8) in the preceding section reduces to

(8•) [Bill is a man]* ≃ ∃x_ι[man•(x) & (Bill = x)] ≃ man•(Bill).
(M3) The same holds for the verbs in B(IV), except rise and change. An example:

[a man walks]* ≃ ∃x_ι[man•(x) & walk•(x)].
(M4) The transitive verbs in B(TV) - except seek and conceive - are to be extensional with respect to both of their arguments: if A is one of these verbs then the sentence

∃g □∀x∀y[A(ˆ(y)⁺)(x) ≡ ˇg(ˇy)(ˇx)]

is to be valid. Note that we need not apply this postulate to the verb be, for it holds automatically by the definition of be. Again, in the role of ˇg, the term

(λy_ι(λx_ι A(ˆ(ˆy)⁺)(ˆx)))   (abbreviated A•)

is suitable in the translations. [Remember that "(ˆy)⁺" abbreviates "(λf.ˇf(ˆy))".] E.g., example (1) of the preceding section reduces to

(1•) [every man loves a woman such that she loves him]* ≃
≃ ∀y_ι[man•(y) ⊃ ∃x_ι[woman•(x) & love•(y)(x)]].
(M5) The transitive verbs seek and conceive as well as the special verbs in B(VIV) and B(SV) are extensional with respect to their subject argument. The postulate for A ∈ {seek, conceive} is as follows:

∃g □∀f∀x[A(f)(x) ≡ ˇg(f)(ˇx)]   (f ∈ Var(εˢ), x ∈ Var(ιˢ)).
Replacing the variable of type εˢ by one of type oˢ or (oιˢ)ˢ, one gets the postulate for B(SV) or B(VIV), respectively. The examples (2.1), (2.2) in the preceding section can be simplified as follows:

(2.1•) seek(ˆ(λf.∃x_ι[unicorn•(x) & ˇf(ˆx)]))(ˆJohn)
(2.2•) ∃x_ι[unicorn•(x) & seek•(x)(John)]

Here "seek•(x)" denotes the extensional reduct of "seek(ˆ(x)⁺)", in the sense of our postulate.

(M6) The preposition in is extensional (but about is not!), which can be expressed by an analogous sentence.
Thus, if in• plays the role of ˇg, then:

[Bill walks in the park]* ≃ [the park]*(ˆ(λy.in•(ˇy)(walk•)(ˆBill))) ≃
≃ (λf.∃z[∀u(park(u) ≡ (u = z)) & ˇf(z)])(ˆ(λy.in•(ˇy)(walk•)(ˆBill))) ≃
≃ ∃z[∀u(park(u) ≡ (u = z)) & in•(ˇz)(walk•)(ˆBill)] ≃
≃ ∃x_ι[∀y_ι(park•(y) ≡ (y = x)) & in•(x)(walk•)(Bill)].

(M7) The verb seek is to be expressible as try to find, namely:

□[seek(f)(x) ≡ try-to(ˆfind(f))(x)].
These restrictions may be qualified as meaning postulates concerning the extensional functors of LE. Example: the de dicto and the de re readings of the sentence John tries to find a unicorn are translated as follows:

try-to(ˆ(λy.∃x_ι[unicorn•(x) & find•(x)(ˇy)]))(ˆJohn);
∃x_ι[unicorn•(x) & try-to(ˆ(λy.find•(x)(ˇy)))(ˆJohn)].

Similarly, John talks about a unicorn has two readings:

about(ˆ(λf.∃x_ι[unicorn•(x) & ˇf(ˆx)]))(ˆtalk•)(ˆJohn);
∃x_ι[unicorn•(x) & about(ˆ(x)⁺)(ˆtalk•)(ˆJohn)].
The next example shows that ambiguity can arise even when the sentence consists of purely extensional terms. Let us consider the sentence a woman loves every man. We can apply (S4) to the terms a woman and love every man to get this sentence. In this case, its translation is

∃x_ι[woman•(x) & ∀y_ι(man•(y) ⊃ love•(y)(x))].

On the other hand, we can apply (S14₀) to the terms every man and a woman loves him₀. The translation of the result is

∀y_ι[man•(y) ⊃ ∃x_ι(woman•(x) & love•(y)(x))].

[Every man is loved by some woman.]

The sentence Mary believes that John finds a unicorn and he eats it
has three different readings:

a) ∃x_ι[unicorn•(x) & believe-that(ˆ[find•(x)(John) & eat•(x)(John)])(ˆMary)]

b) ∃x_ι[unicorn•(x) & believe-that(ˆ(find•(x)(John)))(ˆMary) & eat•(x)(John)]

c) believe-that(ˆ∃x_ι[unicorn•(x) & find•(x)(John) & eat•(x)(John)])(ˆMary)
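Scope ambiguities of the kind shown for a woman loves every man can be made tangible on a toy model. The following sketch (Python; the model data are hypothetical) evaluates the two scopings:

```python
# Toy model check of the two scopings of "a woman loves every man".
women = {"w1", "w2"}
men = {"m1", "m2"}
loves = {("w1", "m1"), ("w2", "m2")}   # loves(x, y): x loves y

# Exists-forall reading: some one woman loves every man.
r1 = any(all((x, y) in loves for y in men) for x in women)
# Forall-exists reading: every man is loved by some woman.
r2 = all(any((x, y) in loves for x in women) for y in men)
```

On this model the first reading is false and the second true, so the two analysis trees really yield non-equivalent translations.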
In the following examples - due to the pronoun it - only the de re reading is possible:

John seeks a unicorn and Mary seeks it,
John tries to find a unicorn and wishes to eat it.
To get a nonreferential reading of these sentences, another grammar and another logic would be necessary.

2.3.4. SOME CRITICAL REMARKS

(A) The first group of our remarks concerns the system IL (and IL⁺).

(a) The system is "over-intensionalized" by permitting sequences of types such as αˢ, αˢˢ, αˢˢˢ, … ad infinitum, and by the unlimited iteration of the intensor (ˆ). No iteration of the type symbol s occurs even in the translations of sentences and terms of LE - although we have examples of nested occurrences of s. It seems that one has to find another device for distinguishing extensional and intensional functor types.

(b) Definite descriptions are absent from IL. Hence, the translation rule for the definite article the follows the Russellian-Quinean schema [see (Tr2)]. Of course, the introduction of definite descriptions in type ι would imply permitting the possibility of semantic value gaps, which is totally alien to the spirit of the semantics of IL.

(c) The domain of individuals - i.e., the domain of quantification of type ι - is the same at all indices (at all worlds and time moments), although our intuition suggests the variability of this domain. A partial remedy is to introduce a monadic predicate E - expressing actual existence - whose truth domain may vary from index to index, and to express quantification on "existing individuals" by "∀x_ι(E(x) ⊃ F(x))" instead of "∀x_ι F(x)".
Then a sentence of the form "∃x_ι F(x) & ¬∃x(E(x) & F(x))" would say: "There exists a nonexistent object of which F holds" - which is (at least) somewhat curious.

System IL is not strong enough to express differences in meaning. If A and B are valid sentences then ⊨ (A ≡ B), i.e., they count as logical synonyms of each other. Thus, for all A and B, "A ⊃ A" and "B ⊃ B" are synonymous, and so are A and "A & (B ∨ ¬B)". Hence, no difference in meaning can be exhibited in IL between the sentences:

Mary thinks that Bill sings.
Mary thinks that Bill sings and John sleeps or does not sleep.

(B) Returning to Montague's grammar of a fragment of English, LE, the following two mistakes may be qualified as simple oversights of the author (easily corrigible):

(a) By (S12), walk and talk ∈ T(IV). Then, by (S4), Mary walks and talk ∈ T(t).
For (S4) says that only the first verb is to be replaced by its third person singular present.

(b) By (S5) and (S4), he₀ loves him₀ ∈ T(t). Then, by (S14₀), Mary loves her ∈ T(t). According to the translation rule (Tr14₀), its translation is love•(Mary)(Mary), which means that Mary loves herself. It is very doubtful that the sentence Mary loves her has such a reading. It seems that (S14ₙ) needs some restrictive clauses.

A more essential reflection: in the syntax of LE, Montague does not distinguish extensional and intensional functors (in the same category). According to the translation rules, all functors are treated as intensional ones. Later on, the introduction of "meaning postulates" will nevertheless make a distinction between extensional and intensional functors (in certain categories). All this means that a correct construction of a fragment of English is impossible without a sharp distinction of extensional and intensional terms in some categories. Then it would be a plainer method to make this distinction in the syntax already. For example, the category of transitive verbs could be handled by introducing the basic sets B(TVext) and B(TVint), that is, the sets of extensional and intensional transitive verbs, respectively. Then, the translation rules for these verbs would be as follows:

If A ∈ B(TVint) then A ∈ Con_e(f(TV)), and A* = A.
If A ∈ B(TVext) then A ∈ Con_e(oιι), and A* = (λg[λx.ˇg(ˆ(λy.A(ˇx)(ˇy)))]) ∈ Cat_e(f(TV)). [Cf. (M4) of the preceding section.]

Thus, the translation of every transitive verb belongs to the same logical type. Let us consider an example of application:
[find a pen]* = find*(ˆ[a pen]*) =
= (λg[λx.ˇg(ˆ(λy.find(ˇx)(ˇy)))])(ˆ(λf.∃z_ι[pen•(z) & ˇf(ˆz)])) =
= [λx.ˇˆ(λf.∃z[pen•(z) & ˇf(ˆz)])(ˆ(λy.find(ˇx)(ˇy)))] ≃
≃ (λx.∃z_ι[pen•(z) & find(ˇx)(z)]) ∈ Cat_e(oιˢ).

[Bill finds a pen]* = (λf.ˇf(ˆBill))(ˆ[find a pen]*) ≃
≃ (λx.∃z_ι[pen•(z) & find(ˇx)(z)])(ˆBill) ≃ ∃z[pen•(z) & find(Bill)(z)].
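The final formula can be checked on a toy model. The following sketch (Python; the model data are hypothetical) evaluates ∃z[pen•(z) & find(Bill)(z)]:

```python
# Toy extensional model for the sentence "Bill finds a pen".
domain = {"Bill", "p1", "p2"}
pen = {"p1"}                     # extension of pen•
find = {("Bill", "p1")}          # extension of find: pairs (subject, object)

# Exists z [ pen•(z) & find(Bill)(z) ]
result = any(z in pen and ("Bill", z) in find for z in domain)
```

The existential quantifier becomes a generator expression over the domain, exactly mirroring the reduced, fully extensional translation.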
An analogous method is applicable for the categories CN, IV, and PRE. However, one can raise some doubts about the existence of intensional terms in B(CN) and B(IV). Montague's example concerning temperature and rise (cf. (5), (6), and (7) in 2.3.2) apparently proves that these predicates are intensional ones. However, the temperature in the sentence

the temperature rises

refers to a function (defined on time), whereas the same term in the sentence

the temperature is ninety

refers to the value (at a given time moment t) of that function. This is a case of the systematic ambiguity of natural language concerning measure functions (as, e.g., 'the velocity of your car', 'the height of the baby', 'the price of the wine', etc.). Montague's solution of this ambiguity seems to be an ad hoc one. Instead, a general analysis of the syntax and the semantics of measure functions would be necessary.

It seems to be somewhat disturbing that the rules governing the verb be permit the construction of the sentence:

the woman is every man.

Its translation is:

∃x_ι[∀y_ι(woman•(y) ≡ (y = x)) & ∀z_ι(man•(z) ⊃ (x = z))].

It is dubious whether an every expression can occur after is in a well-formed English sentence.

Let us note, finally, that T(e) is empty in LE, for the individual names belong to T(NOM). In fact, the basic categories of LE are t, IV, and CN; by means of these all other categories are definable. Why did Montague use e at all? The reason is, probably, that the definition of the function f mapping the categories into logical types (see the beginning of 2.3.2) became very short and elegant. If he had chosen t, IV, and CN as basic categories, the definition of f would be a single line longer. The mathematical elegance resulted in a grammatical inelegance: an empty basic category.
REFERENCES

BARCAN, R. C. 1946, 'A functional calculus of first order based on strict implication.' The Journal of Symbolic Logic, 11.
GALLIN, D. 1975, Intensional and Higher-Order Modal Logic. North Holland-American Elsevier, Amsterdam-New York.
HENKIN, L. 1950, 'Completeness in the theory of types.' The Journal of Symbolic Logic, 15.
LEWIS, C. I. AND LANGFORD, C. H. 1959, Symbolic Logic, second ed., Dover, New York.
MONTAGUE, R. 1970, 'Universal Grammar.' In: THOMASON 1974.
MONTAGUE, R. 1973, 'The proper treatment of quantification in ordinary English.' In: THOMASON 1974.
SKOLEM, TH. 1920, Selected Works in Logic, Oslo-Bergen-Tromsø, 1970, pp. 103-136.
THOMASON, R. H. (ed.) 1974, Formal Philosophy: Selected Papers of Richard Montague. Yale Univ. Press, New Haven-London.
RUZSA, I. 1991, Intensional Logic Revisited, Chapter 1. (Available at the Dept. of Symbolic Logic, E. L. University, Budapest.)