Advances in Computers, Volume 13

Contributors to This Volume

B. CHANDRASEKARAN
PATRICIA FULTON
JAMES JOYCE
BOZENA HENISZ THOMPSON
FREDERICK B. THOMPSON
RICHARD L. WEXELBLAT
Advances in Computers

EDITED BY

MORRIS RUBINOFF
Moore School of Electrical Engineering
University of Pennsylvania
and Pennsylvania Research Associates, Inc.
Philadelphia, Pennsylvania

AND

MARSHALL C. YOVITS
Department of Computer and Information Science
Ohio State University
Columbus, Ohio

VOLUME 13

ACADEMIC PRESS · New York · San Francisco · London · 1975
A Subsidiary of Harcourt Brace Jovanovich, Publishers
COPYRIGHT © 1975, BY ACADEMIC PRESS, INC.
ALL RIGHTS RESERVED.
NO PART OF THIS PUBLICATION MAY BE REPRODUCED OR TRANSMITTED IN ANY FORM OR BY ANY MEANS, ELECTRONIC OR MECHANICAL, INCLUDING PHOTOCOPY, RECORDING, OR ANY INFORMATION STORAGE AND RETRIEVAL SYSTEM, WITHOUT PERMISSION IN WRITING FROM THE PUBLISHER.
ACADEMIC PRESS, INC.
111 Fifth Avenue, New York, New York 10003

United Kingdom Edition published by
ACADEMIC PRESS, INC. (LONDON) LTD.
24/28 Oval Road, London NW1
LIBRARY OF CONGRESS CATALOG CARD NUMBER: 59-15761

ISBN 0-12-012113-1

PRINTED IN THE UNITED STATES OF AMERICA
Contents

CONTRIBUTORS ix
PREFACE xi

Programmed Control of Asynchronous Program Interrupts
Richard L. Wexelblat

1. Introduction 1
2. Definition of Terms 2
3. Attentions and Synchronism 4
4. Facilities in Current Languages 5
5. External Attentions 13
6. Extended Attention Handling 16
7. Examples 31
8. Conclusion 37
Appendix 1. Syntax of the Attention Handling Language 37
Appendix 2. Detectable Conditions in PL/I, COBOL, and FORTRAN 38
Appendix 3. Glossary of Terms 39
References 40
Poetry Generation and Analysis
James Joyce

1. Introduction 43
2. Computing Expertise 44
3. Poetry Generation: The Results 47
4. Poetry Analysis: Introduction 52
5. Concordance-Making 53
6. Stylistic Analysis 58
7. Prosody 61
8. Literary Influence: Milton on Shelley 62
9. A Statistical Analysis 63
10. Mathematical and Statistical Modeling 64
11. Textual Bibliography 67
12. Conclusion 69
References 70
Mapping and Computers
Patricia Fulton

1. Introduction 73
2. History 74
3. What Is a Map? 76
4. The Earth Ellipsoid 77
5. The Geoid 78
6. Geodetic Datum 79
7. Geodetic Surveys 80
8. Satellite Geodesy 87
9. Photogrammetry 89
10. Projections 92
11. Cartography 98
12. Data Banks 102
13. Future Trends 103
14. Conclusions 105
References 106
Practical Natural Language Processing: The REL System as Prototype
Frederick B. Thompson and Bozena Henisz Thompson

Introduction 110
1. Natural Language for Computers 110
2. What Constitutes a Natural Language? 111
3. The Prototype REL System 115
4. Semantics and Data Structures 122
5. Semantics Revisited 128
6. Deduction and Related Issues 135
7. English for the Computer 143
8. Practical Natural Language Processing 158
References 167
Artificial Intelligence--The Past Decade
B. Chandrasekaran

1. Introduction 170
2. The Objectives of the Review 173
3. Language Processing 176
4. Some Aspects of Representation, Inference, and Planning 195
5. Automatic Programming 202
6. Game-Playing Programs 205
7. Some Learning Programs 208
8. Heuristic Search 213
9. Pattern Recognition and Scene Analysis 217
10. Cognitive Psychology and Artificial Intelligence 220
11. Concluding Remarks 224
References 225

AUTHOR INDEX 233
SUBJECT INDEX 237
CONTENTS OF PREVIOUS VOLUMES 245
Contributors to Volume 13

Numbers in parentheses indicate the pages on which the authors' contributions begin.

B. CHANDRASEKARAN, Department of Computer and Information Science, The Ohio State University, Columbus, Ohio (169)
PATRICIA FULTON, U.S. Geological Survey, 12201 Sunrise Valley Drive, Reston, Virginia (73)
JAMES JOYCE, Computer Sciences Division, Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, California (43)
BOZENA HENISZ THOMPSON, California Institute of Technology, Pasadena, California (109)
FREDERICK B. THOMPSON, California Institute of Technology, Pasadena, California (109)
RICHARD L. WEXELBLAT, Bell Laboratories, Holmdel, New Jersey (1)
Preface

It gives us great pleasure to welcome Marshall C. Yovits as Co-Editor of Advances in Computers. As Guest Editor of Volume 11, Dr. Yovits commented on how extensive and diverse is the field commonly known as computer information science. The present volume demonstrates that diversity in its five comprehensive articles: four devoted to advanced computer applications ranging from the practicalities of geodetics and mapping to the esthetics of poetry generation and analysis, the fifth directed to the problems that arise when computer operation is asynchronously interrupted by service requests from the "outside" world.

The central theme of each of the articles deserves presentation here. In her article on Mapping and Computers, Patricia Fulton describes many of the recent successes of computerized cartography and related uses of cartographic data bases. Frederick B. Thompson and Bozena Henisz Thompson describe a language system that accommodates natural language communication with computers. B. Chandrasekaran points to the complexity of the processes involved in "creating intelligence" and provides a critical examination of the various branches of research on artificial intelligence. James Joyce describes poetry generation and analysis by computer, including concordance-making, stylistic analysis, prosody, literary influence, statistical analysis, and mathematical modeling. Richard L. Wexelblat presents extensions to existing programming languages that provide a capacity for interrupt handling through the use of "on-units" and a facility for the synchronization of independent tasks.

MORRIS RUBINOFF
Programmed Control of Asynchronous Program Interrupts

RICHARD L. WEXELBLAT
Bell Laboratories
Holmdel, New Jersey
1. Introduction 1
2. Definition of Terms 2
3. Attentions and Synchronism 4
4. Facilities in Current Languages 5
   4.1 PL/I 5
   4.2 COBOL 10
   4.3 FORTRAN 11
5. External Attentions 13
   5.1 An Example of an Asynchronous Attention 13
   5.2 External Attentions and Multiprocessing 15
6. Extended Attention Handling 16
   6.1 The On-Unit Approach to Attention Handling 16
   6.2 Multiprocessing Considerations 25
   6.3 Attention Handling through Multiprocessing--An Alternative Approach 27
   6.4 Extensions to FORTRAN 30
7. Examples 31
8. Conclusion 37
Appendix 1. Syntax of the Attention Handling Language 37
Appendix 2. Detectable Conditions in PL/I, COBOL, and FORTRAN 38
Appendix 3. Glossary of Terms 39
References 40
1. Introduction
Few high-level programming languages have sufficient richness to give the user the explicit ability to control asynchronous and independent parallel processes. This article discusses the problems associated with the handling of externally caused asynchronous interrupt conditions and presents extensions to existing languages that provide a capacity for interrupt handling. The PL/I programming language is used as a basis for many of the examples because it already has a rudimentary capacity in this area. However, consideration is given to other common languages such as FORTRAN and COBOL. Two basic control structures are used in the development: "on-units" and a facility achieved through the synchronization of independent tasks.

The primary area of concern is what happens to a running program as the result of an event or condition arising "outside" of the main program stream, such as might result from an I/O device, graphical terminal, or online process controller. Upon encountering this situation, a special segment of code is executed. This might, on the one hand, be a particular subroutine designated by the programmer to be activated when the external event occurs. On the other hand, it might be a separate program or routine previously activated but currently inactive (in a so-called "wait state") awaiting the occurrence of the external event. The response to the event might cause the program to stop or perhaps to take some special action such as printing some data, calling a subroutine, or signaling some special device.

Although such facilities may be achieved in almost any programming language through calls to machine language subroutines, this discussion is primarily devoted to explicit high-level language. Thus, special interest will be paid to the language a programmer will use to specify the action taken by the program when the external event occurs.

This article begins with a tutorial overview and a brief survey of the current state of the art, followed by a suggestion of a possible extension to existing facilities (Section 6). At present there does not seem to be any single "best way" to process interrupts and to specify the programs to do the processing. Alternative formulations are presented and potential efficiency considerations are discussed where appropriate.
2. Definition of Terms
This presentation will, for the most part, be concerned with the effect of external events on a single program. This may be a user's problem program or the operating system itself. Although there is usually a substantial difference in complexity between the two, the basic principles involved are the same. The term task is used to refer to a single running section of code. In the absence of multiprogramming, a task is roughly equivalent to a program. Under multiprogramming, there may be many tasks running concurrently, all associated with a single program or job. Two tasks that are executing independently but which communicate through some common data area and jointly compute some function or provide the solution to some problem are said to be cooperating. In the special case in which the tasks take turns executing, each in turn starting the other and waiting for it to return control, the cooperating tasks are called coroutines.

Let the term attention refer to that which a task experiences as the result of an external event. Depending upon what action the programmer has specified, the attention may or may not interrupt the program. The word condition will be used to refer to that event which gives rise to an attention. Thus, the occurrence of an overflow condition may cause an overflow attention to be raised, and this may in turn interrupt the program. (The term attention has deliberately been introduced to remove the potentially confusing ambiguity in the use of the word interrupt as both verb and noun. This double meaning could lead to the situation where an interrupt (noun) may not interrupt (verb) a program. Interrupt will be used here only as a verb, while attention will be the noun that names the action resulting from the external event.) A brief glossary of definitions of terms is included as Appendix 3.

A few examples from a familiar context will help in this informal definition of terms. Assume that a teacher is before a class and a student in the front row raises a hand. The student has raised an attention. This attention may or may not interrupt the teacher, depending upon just what the teacher is doing. In one case, the teacher may be reading and not looking at the class at that moment. The teacher's attention handling mechanism is said to be disabled. If things continue thus, the student may give up and the attention will have been ignored; or the student may keep his hand up and the attention may then be handled later. It is possible that the teacher might notice the raised hand, stop, and immediately ask the student what is the matter. In this case, the attention handling mechanism was enabled and the raised hand caused an asynchronous or immediate interrupt. Had the teacher noticed the hand, but chosen to finish the current activity before querying the student, the attention would have caused a synchronous or delayed interrupt.

The example can be carried a bit further. Assume the teacher has just begun to ask a question of the class in general. As the teacher talks, some of the students begin raising their hands, ready to answer the question. The teacher may note the order in which the hands are raised but go on to finish the question, later calling on the student whose hand went up first. In this situation the attention handling mechanism was enabled but did not cause an asynchronous interrupt. Rather, the attentions are stacked or queued and then dequeued by the teacher synchronously. Depending upon the answers received as the students are called on one at a time, the teacher may dequeue all the attentions or at any time decide to ignore the remainder of the queued attentions.
One final consideration: suppose the teacher begins a question and a student with an agonized look suddenly raises a hand very high. Although there may be several hands up, the teacher will very likely recognize the existence of a high priority attention and handle it immediately (i.e., asynchronously).
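To make the coroutine relationship defined earlier in this section concrete, here is a minimal sketch in C (my illustration; the article predates this API, and all names are invented). It uses the POSIX ucontext routines so that two tasks take turns executing, each starting the other and waiting for it to return control:

/* Two coroutines passing control back and forth via swapcontext.
   On some systems compile with -D_XOPEN_SOURCE=600. */
#include <stdio.h>
#include <ucontext.h>

static ucontext_t main_ctx, co_ctx;
static char co_stack[64 * 1024];

/* The partner task: does one unit of work, then returns control. */
static void cooperating_task(void) {
    for (int i = 0; i < 3; i++) {
        printf("coroutine step %d\n", i);
        swapcontext(&co_ctx, &main_ctx);   /* hand control back */
    }
}

int main(void) {
    getcontext(&co_ctx);
    co_ctx.uc_stack.ss_sp   = co_stack;
    co_ctx.uc_stack.ss_size = sizeof co_stack;
    co_ctx.uc_link = &main_ctx;            /* where to go if it finishes */
    makecontext(&co_ctx, cooperating_task, 0);

    for (int i = 0; i < 3; i++) {
        printf("main step %d\n", i);
        swapcontext(&main_ctx, &co_ctx);   /* start or resume the partner */
    }
    return 0;
}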
3. Attentions and Synchronism
Although the event that causes an attention is by definition external to a task, it may very well be the result of an action of the executing program. Following are some examples of conditions that can cause attentions to occur:

Computational conditions, such as underflow/overflow and zero divide;
I/O conditions, such as end of record, end of file, and transmission error;
Program conditions, such as subscript out of range and attempted transfer to a nonexistent label or address;
Machine error conditions, such as invalid machine operation code and improper data format.

Other attention interrupts may occur due to completely external events such as might arise in such real-time applications as process control and time sharing. Depending upon the specific hardware involved, some of these conditions can be recognized by the hardware; some must be simulated in the software. Ideally, the methods for programming the handling of all of these should be expressible in some uniform syntax.

Although not without exception, there seems to be a rule that the "more synchronous" the event causing the attention, the "easier" it is to handle. It would be reasonable then to consider the kinds of synchronism that must be dealt with. At the hardware level, a single machine instruction may be considered a basic unit. Even though the hardware may go through several internal cycles to execute the instruction (especially if microprogrammed), there is a tendency among hardware designers to synchronize interrupts with respect to the machine language. As computers grow in speed and capacity, however, it becomes harder and harder to determine precisely when machine conditions actually occur. To get the best use of hardware, it is necessary to permit interrupts to be "imprecise" or asynchronous with respect to the machine language. In some overlapped machines, an overflow condition, for example, may not be detected until many instructions after the execution of the actual instruction that initiated the computation that led to overflow. This problem is further compounded in a machine with a "pipeline" arithmetic section.

In a high level language such as ALGOL or PL/I, a single instruction may correspond to a great many machine instructions. An attention may be synchronous with respect to the machine language but asynchronous with respect to the higher level language. These two degrees of synchronism do not give enough resolution. Consider the PL/I language. At the implementation level of PL/I (F), IBM's original PL/I implementation for OS/360 (IBM, 1970), there are certain types of operations that are intrinsically noninterruptable. For example, in assigning a value to an item with a dope vector, if validity of data is to be maintained, no interrupt should be permitted between updating the data and updating the dope vector. This operation of updating a dope vector, although several machine instructions, must still be considered a primitive operation at the "implementation" level of PL/I.

Levels of interrupt can be specified corresponding to levels of hardware or software. For purposes of this exposition the following levels of interrupt are of interest:

Type 0: Asynchronous with respect to the machine language (although possibly synchronous with respect to an underlying microprogram).
Type 1: Synchronous with respect to the machine language; asynchronous with respect to an implementation.
Type 2: Synchronous with respect to an implementation; asynchronous with respect to the language.
Type 3: Synchronous with respect to the language.
Type 4: Synchronous with respect to a specific program (can occur only at specific points in a program).

Examples of these are given in the following sections.
4. Facilities in Current Languages
Most currently available high-level programming languages have at least a rudimentary facility for permitting the programmer to find out about attentions arising from conditions that result from program execution. The following sections describe what is currently available in PL/I and in the two most popular general purpose languages, COBOL and FORTRAN.

4.1 PL/I

At this time, PL/I is the only (fairly) widely used high-level language that contains explicit language for handling tasks, attentions, and interrupts. The software of the PL/I environment will classify an interrupt according to type (converting the type if necessary) and pass control to the appropriate part of a user's program. Unfortunately, the language as now defined by the IBM implementations tends to force synchronization of all interrupts. Some of the restrictions may be relaxed by the new proposed standard PL/I (ANSI/X3J1, 1973).

The MULTICS system on the Honeywell (formerly GE) 645 is implemented almost entirely in PL/I (Corbato et al., 1972). MULTICS PL/I does not take any special precautions concerning the possibility of unexpected interrupts except that the number of events that can occur asynchronously is severely limited. It is possible, however, that an attention from an online user's terminal will interrupt a program during a "noninterruptable" operation such as an area assignment. Although this sort of unfortunate timing may occasionally create difficulty with a user's problem program, it cannot bother the operating system itself.

4.1.1 PL/I Attention Handling
The basic attention processing feature of PL/I is the on-statement. A set of predefined conditions such as OVERFLOW, ZERODIVIDE, ENDFILE, CONVERSION, etc., is available and, by executing an on-statement referring to a particular condition name, the programmer may establish the statement or block of statements he wishes to be executed when an attention corresponding to the specified condition actually occurs. The full set of conditions defined in PL/I is given in Appendix 2. A detailed description of the on-statement and its action may be found in a PL/I manual or text (IBM, 1971; Bates and Douglas, 1970). For purposes of this exposition, the use of the on-statement can best be illustrated by example. The first example shows a simple program for copying one file into another, using the end of file condition to print the number of records processed and stop the program.

T: PROCEDURE;
   DCL (A, B) RECORD FILE, STR CHAR(80);
   ON ENDFILE(A) BEGIN;
      PUT LIST(J);
      STOP;
   END;
   DO J = 1 BY 1;
      READ FILE(A) INTO(STR);
      WRITE FILE(B) FROM(STR);
   END;
END T;
The on-statement establishes the action to be taken when an end of file occurs on file A. This action is specified by the block of statements following the ON ENDFILE (from BEGIN to END), called the on-unit, which will be executed when the end of file is detected by the program. The on-unit has been referred to as a "subroutine invoked by the gods." This is meant to reflect the behavior of an on-unit invoked out-of-line by an agency that is only indirectly the result of the program's execution. It should be noted that although the occurrence of an end of file is in general potentially a type 0 interrupt, completely asynchronous with respect to the CPU, on the System/360 the hardware (or microprogram) converts it to type 1. The PL/I language as currently defined, however, forces the conversion to type 4, and in the IBM PL/I implementations the end of file will appear to have occurred at the end of the read-statement. This has greater significance in the light of asynchronous I/O, as seen below. On-units may also be used with computational conditions:
T: PROC;
   DCL (X(10), Y(10)) FLOAT;
   ON ENDFILE(SYSIN) STOP;
   ON OVERFLOW BEGIN;
      PUT LIST('OVERFLOW IN CASE', J);
      GO TO END_LOOP;
   END;
   DO J = 1 BY 1;
      GET LIST(X, Y);
      /* computation on X and Y that might cause overflow */
   END_LOOP: END;
END T;

In this example, the overflow on-unit will be executed when and if overflow occurs. At the machine level, overflow is type 1, although the PL/I definition causes the condition to be treated as type 2. Once the on-unit is entered, however, synchronization with the source program is temporarily lost. If the overflow had occurred in the middle of a computation, the result of the computation would have been undefined. In the given example, the program forces execution back into synchronization with a transfer to the end of the loop. Had the goto-statement been omitted from the on-unit, the program would have continued from the point where the interrupt occurred, with the result of the computation undefined.
An interesting consideration related to fixed point overflow concerns the circumstances in which it can occur. When it results from an explicit programmer specified operation (e.g., I = J + K;), it is type 2, as mentioned above. The same hardware condition which occurred as the result of some operation in the environment, the evaluation of an expression specified as an array bound, for example, could be handled in the environment without the programmer ever being aware of it. In some cases, it is possible to set up attention handling code that can "fix up and continue."

ON CONVERSION BEGIN;
   IF ONSOURCE = ' ' THEN ONSOURCE = '0';
   ELSE STOP;
END;

When an attention is raised as the result of a conversion error (in PL/I, type 2, generated by the code in the software environment), the specified on-unit will check to see if the offending field was blank. If so, that field will be replaced by zero and execution will continue from the point of interrupt with the conversion attempted again. If the field is not blank, the stop-statement is executed.
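The same fix-up-and-continue pattern can be sketched in a modern language. The C fragment below is only an analogue of the ON CONVERSION unit above (the function names and handler protocol are my own, not part of PL/I or of any standard library): a blank field is repaired to zero and processing continues; any other bad field stops the run.

#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Returns 1 if the string contains nothing but blanks. */
static int is_blank(const char *s) {
    while (*s)
        if (!isspace((unsigned char)*s++)) return 0;
    return 1;
}

/* Convert a field to a number, fixing up blank fields as the
   ON CONVERSION on-unit does. */
static long parse_field(const char *field) {
    char *end;
    long v = strtol(field, &end, 10);
    if (end != field && *end == '\0')
        return v;                 /* converted cleanly */
    if (is_blank(field))
        return 0;                 /* fix-up: blank becomes zero, continue */
    fprintf(stderr, "conversion error: '%s'\n", field);
    exit(1);                      /* no fix-up possible: stop */
}

int main(void) {
    printf("%ld\n", parse_field("42"));   /* prints 42 */
    printf("%ld\n", parse_field("   "));  /* prints 0 after fix-up */
    return 0;
}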
4.1.2 PL/I Asynchronous I/O

All of the above cases have referred to synchronous attentions or to interrupts that have been synchronized by some hardware or software interface. Consider the following examples of how asynchronous operations are affected by attentions. PL/I has the ability to specify that certain input and output operations are to be executed asynchronously, in parallel with program execution. With each such I/O operation is associated an event variable which can record the completion status of the associated operation.

ON ENDFILE(F) BEGIN;
   /* things to do on end of file */
END;
...
DO ...;
   READ FILE(F) INTO(STR) EVENT(E);
   /* things to do that do not require the new input data */
   WAIT(E); /* when the data are needed */
   /* process the data */
END;

In this example, as above, the endfile on-unit specifies the code the programmer wishes to be executed when an end of file is encountered. The main section of the program is a loop containing an input statement which reads from file F into a variable STR. The presence of the EVENT option in the statement specifies that the read is to be performed asynchronously and associates the event variable E with the operation. E is set "incomplete" at the start of the input. The input activity is then free to go on in parallel with the computation, and E will be set "complete" when the input operation finishes. When the data are required, a wait-statement referencing E is executed and the program will wait, if necessary, until the input completes, whereupon the input data may be processed.

It would be natural to assume that an end of file attention will be raised immediately upon occurrence of the end of file condition, since it is likely of type 0 or type 1 in the hardware. Unfortunately, the definition of the System/360 precludes type 0, and the definition of PL/I forces the implementation to treat the interrupt as type 4: the program does not "find out" about the end of file until the wait-statement is executed. That is, if the condition has occurred, the appropriate on-unit will be invoked at the time the wait-statement is executed. Indeed, all conditions associated with asynchronous input/output operations are, by the definition of PL/I, forced to be treated as type 4, even transmission errors.
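For a runnable modern analogue (mine, not anything this article or IBM's PL/I defines), POSIX asynchronous I/O pairs an operation with a control block much as PL/I pairs a READ with an event variable: aio_read starts the transfer, aio_suspend plays the role of WAIT(E), and a zero count from aio_return is the end-of-file moment. The file name is illustrative; link with -lrt on Linux.

#include <aio.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    char buf[80];
    int fd = open("input.dat", O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    struct aiocb cb;
    memset(&cb, 0, sizeof cb);
    cb.aio_fildes = fd;
    cb.aio_buf    = buf;
    cb.aio_nbytes = sizeof buf;
    cb.aio_offset = 0;

    aio_read(&cb);               /* like READ ... EVENT(E): start the transfer */

    /* ... computation that does not require the new input data ... */

    const struct aiocb *list[1] = { &cb };
    aio_suspend(list, 1, NULL);  /* like WAIT(E): block until completion */

    ssize_t n = aio_return(&cb); /* completion status */
    if (n == 0)
        puts("end of file");     /* the moment the ENDFILE on-unit would run */
    close(fd);
    return 0;
}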
4.1.3 Enabling and Disabling in PL/I
On-units in PL/I have block scope. That is, once an on-statement is executed in a given block, that on-unit remains in effect until overridden by the execution of another on-statement for the same condition. When an on-statement is executed in a subroutine or block, the new on-unit will temporarily replace any on-unit for the given condition specified in an outer block. When control returns to the outer block, however, the on-unit for that block is restored. Within a single block level, the execution of an on-statement for a given condition overrides and replaces any existing unit from a previous on-statement executed at the same block level.

It is possible to enable and disable attentions from many of the PL/I conditions at the block or statement level by putting a prefix on the given block or statement. Thus,

(SUBSCRIPTRANGE): S: PROCEDURE;
   ...
END;

will enable the normally disabled subscript range checking for the duration of the execution of the procedure S, except where the checking is explicitly disabled by a NOSUBSCRIPTRANGE prefix on a statement within S. Similarly, the statement

(NOOVERFLOW): A = B + C/D*E;

will be executed with floating point overflow ignored. Note that this does not say that overflow will not occur; it merely says that it is not necessary to check for it and that, if it should occur, the program should not be interrupted.

What happens when a given condition occurs while disabled is an interesting question. From the point of view of the programmer and the programming language, the meaning of the program is defined only if the condition does not occur. If it does actually occur while disabled, the resulting execution is highly dependent on the particular condition and on the hardware and implementation involved. It would be unlikely that the behavior of a program under such a circumstance would be the same on two different types of machine. This leads to interesting problems in the area of program interchange and in the area of programming language standards. The situation is ignored in the definition of Standard FORTRAN (ANSI, 1966). The proposed PL/I Standard (ANSI/X3J1, 1973) attempts at least to identify such problems and to try to make the implementor aware of where the program execution becomes implementation dependent.

4.2 COBOL
The language for handling machine and computational condition attentions in COBOL is extremely rudimentary compared with what is available in PL/I. For example, the occurrence of an overflow condition may be detected only by an explicit test coded after each computation by the programmer. An optional ON SIZE ERROR clause may be appended to a computation sentence. If the computation results in a value too large to be held by the target variable specified in the sentence, a single imperative sentence specified by the SIZE ERROR clause will be executed. At least one COBOL text suggests using the SIZE ERROR clause on every calculation for which an explicit scaling computation has not previously been done (Coddington, 1971).

Due to the limited variety of control structures available in COBOL, the statement specified in the SIZE ERROR clause is usually a goto-statement. For example,

ADD AMOUNT, COUNT TO TOTAL; ON SIZE ERROR GO TO HELP.
COMPUTE X = Y/Z; ON SIZE ERROR GO TO NEXT-CASE.

Thus, the computational conditions of overflow and zero divide are converted by the COBOL operating environment to an attention class that is in a sense even more restrictive than type 4. The input/output conditions are treated similarly: an AT END clause may be included in a read statement to specify the action to take on encountering an end of file. An INVALID KEY clause is available for use in conjunction with direct access files.

READ SYSINFILE; AT END GO TO NO-MORE.

For some conditions, COBOL does provide a facility similar to the on-unit of PL/I. It is possible to specify out of line a section of code to be executed when an I/O error occurs. This code will be executed following an error, but after any system-provided error handling routines have executed. The USE sentence in the Declaratives Section of the Procedure Division serves to define the so-called USE procedure. For example,

PROCEDURE DIVISION.
DECLARATIVES.
ERROR-UNIT SECTION.
    USE AFTER STANDARD ERROR PROCEDURE ON INPUT.
    . . . code to be executed after an i/o error on an input file . . .
END DECLARATIVES.

It is possible to specify an error handling routine for all input files, for all output files, for all files, or for a list of files given explicitly by name. This code is established at compile time for the duration of execution, and it is not possible to turn the action off and on dynamically. The USE procedure facility may also be used in conjunction with tape label processing and report generation.
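As an aside of my own (nothing here is defined by COBOL), the explicit per-computation test that ON SIZE ERROR expresses has a close counterpart in the checked-arithmetic builtins of GCC and Clang:

#include <stdio.h>

int main(void) {
    int total = 2000000000, amount = 500000000, result;

    /* Like ADD ... ON SIZE ERROR: test this one computation explicitly. */
    if (__builtin_add_overflow(total, amount, &result)) {
        puts("size error: GO TO HELP");
        return 1;
    }
    printf("TOTAL = %d\n", result);
    return 0;
}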
4.3 FORTRAN

There is nothing in FORTRAN that corresponds even remotely to PL/I's on-conditions or COBOL's USE procedures. Indeed, in American National Standard FORTRAN (ANSI, 1966) there is no provision whatever for detecting any form of computational, machine, I/O, or external condition. Almost all implementations of FORTRAN, however,
include many extensions in these areas, and the current draft of the proposed revision of Standard FORTRAN includes some of these extensions (ANSI/X3J3, 1973). Typical of FORTRAN supersets is the IBM FORTRAN IV language (IBM, 1972b). Floating point overflow and underflow may be detected in IBM FORTRAN IV only after the fact. A service subroutine, OVERFL, is provided that may be called with an integer variable as argument. This variable is set to 1 if exponent overflow occurred, 2 if no overflow condition exists, and 3 if exponent underflow was last to occur. A side effect of the call is to reset the indicator. Similarly, there is a DVCHK service subroutine that sets its argument to 1 if a divide check occurred and 2 if not.

I/O conditions are handled in a different manner. The Standard FORTRAN formatted read-statement is of the form

READ(u, f) k

where u is a unit identification, f identifies a format, and k is an input list. IBM FORTRAN IV extends this to

READ(u, f, ERR=s1, END=s2) k

where s1, s2 are statement numbers; ERR= is an optional parameter specifying a statement to which transfer is made if a transmission error occurs during the read; and END= is an optional parameter specifying a statement to which transfer is made if an end of file is encountered during the read. For some reason, IBM's FORTRAN language implementors chose not to permit the ERR= parameter on the write-statement. Thus, all of IBM FORTRAN IV's conditions are converted to type 4 through either the system subroutines for computational condition detection or the additional I/O parameter mechanism.

Although Standard FORTRAN has no form of asynchronous I/O, the IBM implementation does provide a facility in this area similar to that present in PL/I. If a read- or write-statement contains an ID=n parameter (n is an integer or integer expression whose value must be unique to each read or write), then the transmission occurs asynchronously. A wait-statement is also provided; however, unlike the PL/I wait-statement, the FORTRAN version may specify an I/O list. The FORTRAN wait-statement must be executed to "complete" the I/O operation. If the operation has not finished when the wait-statement is executed, the program will wait for the completion at that time. The END= and ERR= parameters may not be used on asynchronous read-statements; rather, the function is served by an optional COND=i
parameter. If the parameter is present, the variable i is set to 1, 2, or 3, depending on whether a read completed normally, an error condition occurred, or an end of file was detected. There seems to be no logical reason for the inconsistency between the error and end of file mechanisms of synchronous and asynchronous input.

The FORTRAN implementation for the Honeywell Series 6000 computers (Honeywell, 1971) is supplied with extensions similar to those of IBM FORTRAN IV. In this case, however, the set of detectable extensions is larger and somewhat more consistent. The I/O error parameter may be used in conjunction with a write-statement as well as with a read. In addition, the service subroutines provided, as well as extending the set of computational conditions, permit the user to test for end of file and I/O error conditions. It is also possible, through a system subroutine, to establish the label of a statement to which transfer will be made whenever almost any error detectable by the environment, be it computational or I/O, occurs. Although it would require a bit of initialization, it is possible to specify a unique transfer point for each of the set of roughly 80 distinct conditions. This is approximately equivalent to the PL/I facility with normal return from an on-unit forbidden.
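A rough present-day counterpart of the END= and ERR= branches (my sketch, not from the text) is the standard C idiom of reading until failure and then asking whether the failure was a transmission error or an end of file:

#include <stdio.h>

int main(void) {
    FILE *f = fopen("data.txt", "r");    /* file name is illustrative */
    if (!f) { perror("fopen"); return 1; }
    char record[128];
    while (fgets(record, sizeof record, f)) {
        /* ... process one record ... */
    }
    if (ferror(f))
        puts("ERR= branch: transmission error");
    else
        puts("END= branch: end of file");
    fclose(f);
    return 0;
}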
5. External Attentions
All of the potentially interrupting conditions considered so far have one thing in common: they are associated directly with some statement, computation, or reference in the source program. In order to handle the general case of external attentions, additional language is needed, as well as additional semantics for current language.

5.1 An Example of an Asynchronous Attention
Below are two examples of programs that use a simple attention handling facility, written in a terminal oriented dialect of PL/I known as CPS (IBM, 1972a). CPS has a rudimentary attention handling facility of the form:

ON ATTENTION simple-statement;

The on-unit may consist only of a single simple statement. (The attention referred to is the "ATTN" or "BREAK" key of an on-line user's terminal.) The first example shows the use of an attention to cause the current index of a do-loop to be printed. During the loop's execution, should the programmer get impatient and wish to see how far his program has gone, he may push the attention button and the current value of the index will be printed, after which the program will continue.

ON ATTENTION PUT LIST(J);
DO J = 1 TO N;
   /* computation */
END;

One of the serious problems with online use of low speed terminals is that a verbose program can waste much time and create much annoyance while printing long messages, especially when the same program is run repeatedly. In the next example, the attention key may be used by the programmer at the terminal to cut short the undesired output and to permit the program to continue.
...
ON ATTENTION GO TO SKIP1;
PUT EDIT(...)(...);
SKIP1: ...
ON ATTENTION GO TO SKIP2;
PUT EDIT(...)(...);
SKIP2: ...
ON ATTENTION GO TO SKIP3;
PUT EDIT(...)(...);
SKIP3: ...

(To be completely correct, the ellipsis following each "skip" label should contain an ON ATTENTION ... that nullifies the previous ON ATTENTION GO TO ....)
In the absence of an attention on-unit, the CPS system default action in response to an attention is to stop the running program and return control to the command level. The user's explicit specification of an attention on-unit overrides the system response, at times making it hard to stop a looping program. In the CPS implementation on the IBM System/360, if an attention condition is raised while an attention on-unit is being executed, then the program will be stopped. It is sometimes difficult for the programmer to hit the attention key at the right time.
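A present-day analogue of the first CPS example, assuming the terminal's break key is modeled by the interrupt signal SIGINT (Ctrl-C), might look as follows in C. As in CPS, the handler only notes the attention, and the loop reports its index and continues:

#include <signal.h>
#include <stdio.h>

static volatile sig_atomic_t attention_raised = 0;

/* The handler does the minimum: it records that the attention occurred. */
static void on_attention(int sig) {
    (void)sig;
    attention_raised = 1;
}

int main(void) {
    signal(SIGINT, on_attention);   /* like ON ATTENTION PUT LIST(J); */
    for (long j = 1; j <= 100000000L; j++) {
        if (attention_raised) {
            printf("j = %ld\n", j); /* report progress, then continue */
            attention_raised = 0;
        }
        /* ... computation ... */
    }
    return 0;
}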
In one of the earliest time sharing systems, MIT's CTSS, the need for more than one level of terminal attention was recognized, and very early in the development of CTSS a mechanism was provided to permit the user to specify multiple levels of execution (Corbato et al., 1963). The problem was that the terminal devices used with CTSS usually had only one attention or break mechanism. The simple but quite successful solution adopted was to permit the terminal user to send his break signal a series of times, each successive signal raising the level of execution one step until the top or command level was reached. The most common use of this mechanism was to provide a two level attention facility:

level 0: notify the system that the user pushed the break button;
level 1: notify the problem program that the user pushed the break button.

Thus, running under a program with these two levels implemented, a single break was equivalent to raising an attention condition in the problem program, while two breaks in quick succession served to interrupt the program and return control to the command level.

5.2 External Attentions and Multiprocessing
Before going on to look into generalized attention handling language,
it will be necessary to look into one of the potential problems associated with the interaction between attentions and multiprocessing. Although the situation is presented in the context of PL/I, the resulting problem area will be present in any language that involves both multiprocessing and externally generated asynchronous attentions. When a subroutine is called from a PL/I task, it inherits all of the on-units of its progenitor, even if it is spawned as an independent task (i.e., free to execute in parallel). In the case of overflow, for example, this creates no problems, since an overflow attention can easily be associated with the task in which the overflow occurred. Any overflow occurring will raise the attention only in the task in which the overflow occurred, and only in this task will the on-unit be executed. On the other hand, if an attention is associated with an interrupt from an online process controller of some sort, when a task with this condition enabled spawned subtasks, each would inherit the main task's on-unit. Suppose several tasks are active, each with this condition enabled, and the condition arises: Which task's on-unit would be raised? Would they all? In sequence? In parallel? Which first?
6. Extended Attention Handling
Following is a discussion of a possible high level language facility that would permit specification of attention handling algorithms. Although similar in form and syntax to the style of PL/I, the concepts involved are general and could be applied to languages such as ALGOL, COBOL, or FORTRAN. While this is not necessarily the only way to achieve the desired end, the statements and options described below seem to provide a reasonable way to go. Two different approaches to a PL/I extension are described: a fairly complex but powerful facility making use of on-units and permitting multiprocessing, and a somewhat limited facility that makes use of multiprocessing alone. A possible FORTRAN extension is also described.

6.1 The On-Unit Approach to Attention Handling
In order for a programmer to write programs to process asynchronous attentions, it will be necessary to provide a new data type: the attention. New statements to operate on attention data are also required. The following sections describe the new data type and the statements that operate on it. The syntax of the new statements may be found in Appendix 1. (Wherever a list of names is mentioned below, it is assumed that commas are used to separate names in the list.)

6.1.1 The Attention Data Type
A name declared with the ATTENTION data attribute will be associated with an external condition and will primarily be used in an on-statement to identify the code to be executed when the attention is raised. Attention data have much in common with conditions in the current PL/I language. Attention handling code specified in an on-unit will be permitted to execute in parallel with the task in which it is invoked. Following is an example of the declaration of an attention:

DECLARE A1 ATTENTION ENVIRONMENT(...);

Each attention is associated in some implementation-dependent manner with an external device, condition, or event. The environment option is used just as with a file declaration, to specify implementation dependent parameters. For example, the environment may contain the information that identifies the external source of attentions and specifies the maximum depth of the attention queue.
6.1.2 Attention On-Units
The code to be executed when an attention is raised will be specified as an on-unit in a manner similar to that illustrated in the examples of Sections 4 and 5. In order to increase facility, however, two options will be added to the on-statement: task and event. Either of these is sufficient to specify that the on-unit, when invoked, is to be executed as an independent subtask in parallel with the interrupted task. The event option requires the programmer to specify an event name that will be set incomplete at the start of the on-unit and set complete when that code is finished executing. The event variable may be used by the program in which the attention was raised to determine when the on-unit is complete and to synchronize computations. The task option permits the programmer optionally to specify a name that may be used to refer to the on-unit from another task; as, for example, when it is necessary to terminate an on-unit executing in parallel. Use of the task option alone permits independent execution when there is no need for an explicit name. If neither the task nor the event option is specified, raising the attention will cause the interrupted task to suspend execution until the on-unit completes. Following are some examples of on-statements for conditions:

ON ATTENTION(BREAK_KEY) STOP;

ON ATTENTION(BREAK_KEY) TASK BEGIN;
   ...
END;

ON ATTENTION(EXT1) EVENT(EXT_EVENT_1) BEGIN;
   ...
END;
The key word ATTENTION is used to differentiate between attentions and built-in conditions. While not absolutely necessary, its use makes it possible to extend the set of built-in conditions easily and improves program documentation by making it easy to identify programmer defined attentions. It should be noted that permitting task and event options in the on-statement would not be necessary if these options were permitted on the begin-statement, a smoother and more natural extension to a block-oriented language. It should also be noted that if the task and event options were permitted on neither the on-statement nor the begin-statement, the equivalent effect could be achieved by making the on-unit code an out-of-line subroutine and then invoking that subroutine through a call-statement in the on-unit. The call-statement may, of course, contain the task and event options. This last method would appear to be the smoothest way to add the facility discussed here to a language such as FORTRAN.
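As a sketch of what the task and event options buy, the following C fragment (my construction; the proposed extension was never part of any PL/I, so this is an analogue rather than a transcription) runs the on-unit body as an independent POSIX thread and treats joining that thread as waiting on its completion event:

#include <pthread.h>
#include <stdio.h>

/* The on-unit body, packaged to run as an independent subtask. */
static void *attention_on_unit(void *arg) {
    (void)arg;
    puts("handling the attention in parallel");
    return NULL;   /* returning plays the role of completing the event */
}

int main(void) {
    pthread_t on_unit_task;

    /* Raising the attention spawns the on-unit as a subtask (task option)... */
    pthread_create(&on_unit_task, NULL, attention_on_unit, NULL);

    /* ...while the interrupted task keeps running. */
    puts("main task continues");

    /* Waiting on the completion event (event option) before using results. */
    pthread_join(on_unit_task, NULL);
    return 0;
}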
A system default action is provided just in case an enabled attention is raised when there is no programmer defined on-unit for that attention. The action taken depends upon the particular implementation and will most likely be to print a message and return.
6.1.3 Values of Attention Data
Each attention datum takes on several status values:

a. an activity status: active or inactive;
b. an enablement status: enabled or disabled;
c. an access status: immediate, asynchronous, or queued.

An active attention is one which the program is prepared to process should the associated external condition occur. The result of the occurrence of the external condition associated with an inactive attention is not defined and may vary from implementation to implementation and from condition to condition. This situation may in some circumstances be classified as an error, while in others it may be ignored. An attention is activated when any statement in which it is referenced is executed and remains active in all tasks in which it is used. This activity has global scope in the sense that it is not meaningful for an attention to be active in one task and not in another task executing at the same time.

Enablement, on the other hand, is a status that is local to a task. If an attention is enabled within a given task, then the occurrence of any external condition associated with the attention would raise the attention in that task, interrupting the task and causing the on-unit for that attention to be executed. If the attention had been disabled, the occurrence of the condition would be ignored. The difference between activity and enablement is somewhat subtle, activity referring to an external phenomenon and enablement associated with the program state itself. The occurrence of the appropriate external event for an inactive attention might very well not be recognized by the hardware. If the attention were disabled, then this occurrence would indeed be noted by the environment, but no interrupt would occur and no on-unit would be invoked. An implementation might choose to consider the raising of an attention disabled in every active task to be an error.

The access status may have various interpretations, depending upon whether a priority interrupt system is under consideration. Initially it will be assumed that attentions do not have priorities. There are three distinct ways in which an attention may be enabled within a task:
i. Immediate: this attention must be processed immediately upon the occurrence of the corresponding condition. It may interrupt any existing task other than another immediate attention on-unit.
ii. Asynchronous: this attention will be processed in an asynchronous manner, interrupting any existing task other than an attention on-unit.
iii. Queued: this attention will not interrupt any task; rather, the occurrence will be noted and an appropriate entry made into an attention stack.

The access status of an attention is established when the attention is enabled, and asynchronous access is assumed as default if no status is specified explicitly. (In Section 6.1.5 an alternative formulation that replaces these three attention levels by a set of priorities is described.)

6.1.4 Attention Processing Statements
The access status and enablement of an attention are determined by the enable-statement and its complement, the disable-statement. The former both enables an attention and activates it if it was inactive. It may also be used to change the access status of an enabled attention. The disable-statement, as its name implies, is used to disable attentions. As an attention may be enabled independently in separate tasks, the enable-statement applies only to the task in which it is executed unless otherwise specified. Two further options for the enable-statement are described in Section 6.2. Following are some examples of simple enable- and disable-statements.

ENABLE ATTENTION(BREAK_KEY);
ENABLE ATTENTION(BREAK_KEY) ASYNCHRONOUS;
ENABLE ATTENTION(SENSOR1, SENSOR2) QUEUED,
       ATTENTION(LIMIT) IMMEDIATE,
       ATTENTION(FLOW, RESERVE) ASYNCHRONOUS;
DISABLE ATTENTION(BREAK_KEY, LIMIT, FLOW);

The first two statements are logically equivalent, as asynchronous is the default access class. The flowchart of Fig. 1 illustrates the logical sequence of operations in the execution of an enable-statement. (Figure 6, in Section 6.2, is a more detailed description of the enable-statement, reflecting the additional options presented in that section.) The occurrence of the external condition associated with an attention enabled for asynchronous access will cause that attention's on-unit to be invoked immediately. The task in which the attention was raised will
[Figure 1: flowchart] Fig. 1. Attention enabling, simple form. The sequence of decisions and operations involved in the execution of a simple enable-statement. This process is repeated for each attention named in the enable-statement.
either continue or be interrupted, depending upon the options used in the on-statement for that attention. If an attention is enabled for queued access, the occurrence of the associated external event will not interrupt the program or invoke the on-unit at once. Rather, the attention will be placed on a stack and remain there until the programmer explicitly interrogates the stack. Each task has its own attention stack. Upon executing a dequeue-statement for that attention, the entry will be removed from the queue and the associated on-unit invoked. For example:

DEQUEUE ATTENTION(SENSOR1);
If an attention for SENSOR1 is enqueued, that stack entry will be removed and the on-unit for the SENSOR1 attention will be invoked. If no such attention is enqueued, the statement will have no effect.
A built-in function, QUEUE, is available to test whether an attention is stacked. Given an attention name as argument, it returns a nonnegative integer that indicates the number of instances of that attention on the queue. In order to prevent the repeated occurrence of an external event from causing an attention on-unit to recurse upon itself, an immediate or asynchronous attention is treated as if it had been enabled for queued access when its on-unit is entered. Thus, further attentions will be stacked and may be accessed from within the on-unit itself. When the on-unit completes, the attention reverts to its former access status and, if the stack is not empty, the on-unit is immediately invoked again. Of course, the programmer is free to modify the access status within the on-unit. If a queued attention is already on the stack when that attention is enabled for asynchronous or immediate access, the on-unit will be invoked at that time. When an attention is disabled, any stacked interrupts for that attention are removed from the queue and lost.

The flowchart in Fig. 2 illustrates the logical sequence of operations that follows the raising of an attention within a single task. This operation occurs independently in each independent task active at the time the attention is raised. Figure 3 illustrates the action taken upon return from an attention on-unit.
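The queued-access semantics just described can be miniaturized in C. Everything below is my own naming: it models one task's attention stack, the dequeue-statement, and the QUEUE built-in, and deliberately omits enablement and multitask behavior:

#include <stdio.h>

#define QMAX 16   /* maximum depth of the attention queue */

typedef void (*on_unit_fn)(void);

typedef struct {
    on_unit_fn on_unit;   /* code established by the on-statement */
    int pending;          /* number of stacked occurrences */
} attention;

/* Raising a queued attention merely records the occurrence. */
static void raise_attention(attention *a) {
    if (a->pending < QMAX)
        a->pending++;
}

/* DEQUEUE ATTENTION(a): remove one entry and invoke the on-unit. */
static void dequeue_attention(attention *a) {
    if (a->pending > 0) {     /* no entry: the statement has no effect */
        a->pending--;
        a->on_unit();
    }
}

/* QUEUE built-in: how many instances are stacked? */
static int queue_depth(const attention *a) { return a->pending; }

static void sensor1_on_unit(void) { puts("SENSOR1 on-unit runs"); }

int main(void) {
    attention sensor1 = { sensor1_on_unit, 0 };
    raise_attention(&sensor1);
    raise_attention(&sensor1);
    printf("QUEUE(SENSOR1) = %d\n", queue_depth(&sensor1));
    dequeue_attention(&sensor1);   /* handles one stacked attention */
    return 0;
}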
6.1.5 Priorities

In order to program many practical real-time applications and to mirror many current types of process-control hardware, it might well be necessary to permit the programmer to specify relative "importance" among groups of attentions by associating a priority with an attention. Instead of the three access options described above, the enable-statement would permit a priority option that associates a nonnegative integer with that enablement of the given attention. An attention of some given priority would then be able to interrupt any on-unit for an attention of equal or lower priority, but would not be able to interrupt the on-unit of a higher priority attention. A task spawned from an attention on-unit executes at the priority of that attention unless otherwise specified. Any time an attention cannot interrupt due to the presence of a higher priority interrupt, the low priority attention would be stacked. With a general priority system there would be no need to differentiate between asynchronous and immediate attentions, and if the lowest priority were defined always to be stacked and never to interrupt until accessed, the priority option could replace the queued as well as the asynchronous and immediate options.
[Figure 2: flowchart] Fig. 2. Interrupt or enqueue? The action of the attention handler following the raising of an attention.

[Figure 3: flowchart] Fig. 3. After return. The action of the attention handler following the return from an attention on-unit.

Figures 4 and 5 are analogous respectively to Figs. 2 and 3 and illustrate actions taken when an attention is raised and upon return from an on-unit in a priority attention handling system.

6.1.6 Comparison of the Three-Level and Priority Access
The three-level approach is simpler in concept and potentially simpler in implementation than the more general priority system. Although this approach would probably be quite adequate for most online applications that do not require millisecond or microsecond response times (timesharing and information retrieval systems, for example), an online process control application that required instant interrupt response times would probably need the full priority facility. Both approaches are amenable to subsetting. Restriction of attentions to queued (or lowest priority) access would have the effect of converting all interrupts to type 4 (see Section 3).
[Figure 4: flowchart] Fig. 4. Priority handler: interrupt or enqueue? The action of a priority attention handler following the raising of an attention.
Permitting queued and asynchronous access or, alternatively, two priority levels would provide the full generality of interrupt type but is not really sufficient for applications in which there is a difference in “importance” between and among attentions.
[Figure 5: flowchart] Fig. 5. Priority handler: after return. The action of a priority attention handler following the return from an on-unit.

6.2 Multiprocessing Considerations
When a subroutine is called as a task and inherits the on-units of its progenitor, all attentions associated with those on-units will be passed in the disabled state. This will avoid the problem of multiple invocation mentioned in Section 5.2 above. The programmer may then enable the attention in the subtask and disable it in the calling task.

To permit the programmer to maintain complete control over attentions in his program, two additional options are available on the enable-statement. These may be used to synchronize attention enabling in a multiprocessing environment: (1) location, which specifies explicitly the task or tasks in which the attention is to be enabled, and (2) exclusion, which restricts the set of tasks in which an attention is enabled. The location option consists of the keyword IN followed by a list of one or more task names in parentheses. An asterisk may be used in lieu of the task names and implies that the enabling applies to all active tasks. If the location option is omitted, then the statement applies only to the task in which it is executed. For example,

ENABLE ATTENTION(A);
ENABLE ATTENTION(A) IN(T1);
ENABLE ATTENTION(A) IN(T1, T2) ; ENABLE ATTENTION (A) I N (") ; The first statement enables A only in the task in which that statement is executed. In the second example, A is enabled in the task named T1 while in the third, it is enabled in both task T1 and in task T2. The effect of the fourth statement is to enable A in all tasks currently active. The exclusion option consists of the keyword ONLY or the keyword LOCALLY. If ONLY appears then the attention is enabled in the task or tasks specified and simultaneously disabled in all other tasks. The LOCALLY keyword implies a more restricted enabling: At any time during the execution of a program, the set of active tasks may be represented as a tree with the initial task a t the top and each active task represented as a subnode of the task from which it was spawned. Tasks a t the same level in the tree and with the same predecessor are called sibling tasks. When a task is enabled LOCALLY, it is enabled in the specified task and simultaneously disabled in any of its subtasks, any sibling tasks and their subtasks and in the predecessor task. This applies independently to each task named in the location option if it is present. For example, ENABLE ATTENTION ( A ) I N ( T l ) ONLY ; ENABLE ATTENTION (A) I N (Tl) LOCALLY; I n both of these examples A is enabled. I n the first example, however, A is disabled in every task other then T1. In the second example, A is disabled in all tasks of the minimal subtree of the task tree that contains the predecessor of T1 except for T1 itself. If an atkention is enabled locally in a group of tasks which fall in the same subtree then the disabling applies only to the tasks not named in the location option. When an attention is enabled in a task exclusively, any attentions on the stack of that task's predecessor(s) are transferred to the stack of that task. Attentions stacked for other tasks being disabled a t that time are lost. The following fragment illustrates the use of some of these features. ENABLE ATTENTION(L1GHT-PEN) ; CALL SUB(X,Y,Z) TASK(TSUB); ENABLE ATTENTION(L1GHT-PEN) IN(TSUB) ONLY;
   ...

Initially, the LIGHT-PEN attention was enabled in the calling routine
asynchronously. When it was desired to transfer control for light pen attentions to a subroutine, the subroutine was first invoked as an independent task, and then the caller enabled the attention in the subtask, simultaneously disabling the attention in itself. Any light pen attentions that occurred during the transfer of responsibility and up to the completion of the enabling were handled by the caller. Any attentions that were pending on the stack of the caller when the transfer occurred were passed along to the subtask.

In the previous example, it was assumed that the on-unit for the LIGHT-PEN attention was passed along when the subtask was called. If the subtask is not to assume control of attention processing until it has established its own on-unit, responsibility for the transfer could be given to the subroutine. This example is a bit more complicated and is illustrated in the section on sample applications below. Figure 6 illustrates the actions taken during the execution of an enable-statement and is an extension of the flowchart of Fig. 1.

6.3 Attention Handling through Multiprocessing--An Alternative Approach
While the facilities described in Sections 6.1 and 6.2 could be quite smoothly added to a large language such as PL/I or COBOL, current trends in language design are toward smaller, more compact, and more modular languages. Although it is possible to formulate a subset of any language, it is often very difficult to define a subset that is functionally adequate yet still smooth and consistent. Furthermore, without some actual experience with an implementation of the full facility, it is impractical to try to define a subset. Thus, an alternative method of handling attentions--one that does not use on-units and may thus be better suited to small languages--will be presented.

In this formulation there is again an attention data type, but in this case the attention bears resemblance to the event rather than to the on-condition. As above, an attention will be associated in an implementation-defined manner with an external condition. Initially, the attention will have the value incomplete. When the condition is raised, the attention is set complete. If the value is already complete when the attention is raised, the attention will be stacked and kept on the queue until the value again becomes incomplete. A task may test if an attention has been raised through use of the COMPLETION built-in function, and a task may wait for an attention to be raised. These topics are discussed in more detail after some preliminary description of the environment in which they are to be used.
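As a minimal sketch of this event-like formulation, in the notation used elsewhere in this article (the attention name BREAK and the procedure NOTE-BREAK are hypothetical, introduced only for illustration):

   DECLARE BREAK ATTENTION ENV(. . .);
   ...
   IF COMPLETION(BREAK)      /* has the attention been raised?   */
      THEN CALL NOTE-BREAK;  /* yes: respond without waiting     */
      ELSE WAIT(BREAK);      /* no: give up control until raised */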
[Figure 6 flowchart omitted.]

FIG. 6. Complete attention enabling. The complete sequence of decisions and operations involved in the execution of the enable-statement. This process is repeated for each attention named in the enable-statement. Note 1. When an exclusion option in the enabling of an attention in a given task causes that attention to be disabled in the predecessor of that task, any queue entries for that attention in the predecessor's stack are transferred to the stack of the subtask. Note 2. The "simple enablement" referred to is that process illustrated in Fig. 1.
6.3.1 Cooperating Tasks

Consider a graphics application in which two routines are cooperating in the processing of the data. The purpose of one routine is to respond to each light pen attention by reading in a set of x-y coordinates. The other routine takes the consecutive coordinate pairs and generates an
appropriate display. Although the two routines may communicate through common variables, each is free to execute independently of the other within the bounds of certain constraints. Due to the nature of the physical device involved, it is important that response to a light pen attention be as rapid as possible. It is also necessary that the coordinate-reading routine not get so far ahead of the coordinate-using routine that the common area overflows, and it is necessary that the coordinate-using routine not get ahead of the coordinate-reading routine at all. The problem of synchronizing data access between these two processes is beyond the scope of this discussion. See Dijkstra (1968) for a full discussion of the topic. Synchronization facilities are included in the proposed Standard PL/I (ANSI/X3J1, 1973) and in ALGOL 68 (Lindsey and van der Meulen, 1971).

The synchronization of the task control and the processing of the attentions are, however, of direct interest. In the simplest formulation, control will reside with the task most recently invoked until that task terminates or until it voluntarily gives up control by invoking another task or by executing a wait-statement. Thus, in order to prepare to accept an attention, a program will invoke a subroutine as a task. After performing any necessary initializations, the subroutine will execute a wait-statement naming the attention and, if the attention has not been raised, give up control. (If the attention has already been raised, control will "fall through" the wait-statement immediately.) When the attention is raised while the subtask is waiting, the main task will be interrupted, the attention set complete, and the subtask will resume execution. Following the wait-statement, the attention is set incomplete again so as to prepare for the next raising. The subtask then does any computation necessary to respond to the attention and returns to the wait-statement to await another occurrence of the condition that raised the attention; a sketch of such a subtask appears below.

This formulation does not directly permit an attention to interrupt its own attention processing task. If such an interruption is desired, the attention processing task may call itself recursively as a task. Synchronization in this situation would be quite complex.
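As a rough sketch of the wait-loop discipline just described (the names READER and PEN and the body comments are hypothetical; the syntax follows the earlier examples):

   READER: PROCEDURE;
      DECLARE PEN ATTENTION ENV(. . .);
      ENABLE ATTENTION(PEN);
      DO WHILE('1'B);  /* repeat until the task is terminated           */
         WAIT(PEN);    /* give up control until the attention is raised */
         /* PEN is set incomplete again following the wait-statement    */
         /* read a coordinate pair and place it in the common area      */
      END;
   END READER;

The main program would invoke this routine with CALL READER TASK; and then proceed with the coordinate-using computation.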
6.3.2 Enabling, Access, and Priorities

So far nothing has been said about enabling and disabling when handling attentions through tasks. Depending upon the complexity of the facility desired, this form of attention handling could either permit the full complexity of the enable-statement, or it might be assumed that an attention is enabled from the beginning of execution of a block in which the attention is defined.
The concept of access level or priority is tied in with the method of task dispatching. Many simple multiprocessing systems use a one-level round-robin dispatching algorithm, which would not be sufficient for an attention handling environment. It may be necessary that tasks awaiting attentions get a higher priority than other tasks. If only a single level is available, it would be necessary for the programmer to make sure that all attention handling tasks gained control at periodic intervals. While this would perhaps be adequate, such a system very likely would never be able to give very high speed response, and it is likely that at least a two-level system would be necessary. Again, depending on the host environment, the system could provide either the three-level system or the full priority system. If the multiprocessing host system had its own form of priority dispatching control, there is no reason why this control could not be used with the attention handling tasks, provided that a task responding to an external condition of short duration--such as a light pen signal--be able to gain control sufficiently quickly. An example of attention handling through multiprocessing is given below in Section 7.

6.4 Extensions to FORTRAN
Syntactically, FORTRAN is a fairly simple language. In general, there is only a single keyword in each statement type, and excess punctuation and delimitation is minimized. There is no block structuring to speak of, and grouping is used only for iteration. What parts of the attention handling language would fit comfortably into FORTRAN? On the surface, the CPS language mentioned in Section 5.1 bears some resemblance to FORTRAN, and inspiration will be taken from there. Let the on-statement be added to FORTRAN in the following form:
   ON attention-name simple-statement

The simple-statement is one of the following:

   assignment
   call
   goto
   read/write
   stop

The attention-name is the identifier of an attention. In Standard FORTRAN, names containing alphabetic characters are either data variables or entry names. Files and statement labels are represented by integers. It would probably be more in keeping with the style of FORTRAN to make attention identifiers numeric also, but the need for mnemonic significance indicates that permitting alphanumeric names would be better.

Although there is some inconvenience in putting the attention handling code outside of the main program, the use of the call-statement in the on-statement would give the effect of a multiple statement on-unit. In this case, communication between main program and attention handling code would have to occur through COMMON storage.

The attention handling facility is somewhat restricted in power without multiprocessing. It does not, however, seem in the spirit of FORTRAN to permit multiprocessing. There is no reason why simple versions of the enable- and disable-statements could not be supplied, and two access levels would probably suffice. On the other hand, it would be possible to use the on-statement itself as the enabling statement and to disable by providing a null on-unit. (As FORTRAN has no explicit null-statement, the continue-statement could be used for this purpose.) The following program fragment is similar to the first CPS example in Section 5.1. A terminal break key is used to determine how far a do-loop has gone.
         ...
   C     ESTABLISH THE ON-UNIT
         ON KEY WRITE(6,30)L
      30 FORMAT(I8)
   C     EXECUTE THE LOOP
         DO 100 L = 1,2000
         ...
     100 CONTINUE
   C     "DISABLE" THE CONDITION
         ON KEY CONTINUE

7. Examples
The first example illustrates the treatment of an external attention associated with the light pen of a graphics terminal. It is assumed that an attention is raised when the user depresses a button on the light pen, and furthermore that as long as the button remains depressed, attentions continue to be raised at (possibly irregular) intervals. COORDS is an external function that returns a two-element vector:
the current x-y position of the light pen. Initially the attention is declared:

   DECLARE LIGHT-PEN ATTENTION ENV(. . . parameters . . .);

The parameters in the environment option are implementation dependent and, in addition to associating the declared name with the external device, may, for example, specify an expected maximum queue depth.

The main routine will process the coordinates recorded by the on-unit that is processing the attentions. A very large 2-by-many vector, CVEC, will be used to store the successive coordinate pairs, and two indices are used:

TAKE--an integer pointer to the coordinate pair in CVEC most recently processed by the main routine.
PUT--an integer pointer to the coordinate pair in CVEC most recently placed in the vector by the attention on-unit.

The on-unit is established by the following on-statement:

   ON ATTENTION(LIGHT-PEN) BEGIN;
      CVEC(PUT+1,*) = COORDS;
      PUT = PUT+1;
   END;

Thus, when the attention occurs, a new set of coordinates is recorded and the index is advanced. The main routine is ready to process the successive coordinate pairs as they become available. Whenever the value of PUT is greater than the value of TAKE, there are data to be processed. The code in the main routine might be:

   DO WHILE(PUT > TAKE);
      TAKE = TAKE+1;
      /*code to do something with the new coordinate pair*/
   END;
If attentions are arriving faster than the coordinates can be processed by the main routine, then the main program will loop while the PUT > TAKE test is true. It is necessary to provide a mechanism to permit the main program to wait for attentions when it has no data to process. Thus, an event, MORE, is defined which will be set incomplete by the main program when it has processed all of the data and set complete by the on-unit when data are available. The clear- and post-statements from the proposed Standard PL/I are used, respectively, to set the event incomplete and
complete. For reasons explained below, the attention is enabled for immediate access. A "toggle," ACTIVE, is set to true ('1'B, in PL/I) initially and then used to control the loop. The mechanism for terminating the loop is described below. The program now looks like:

   ON ATTENTION(LIGHT-PEN) BEGIN;
      CVEC(PUT+1,*) = COORDS;
      PUT = PUT+1;
      POST(MORE);
   END;
   PUT, TAKE = 0;
   CLEAR(MORE);
   ACTIVE = '1'B;
   ENABLE ATTENTION(LIGHT-PEN) IMMEDIATE;
   DO WHILE(ACTIVE);
      WAIT(MORE);
      CLEAR(MORE);
      DO WHILE(PUT > TAKE);
         TAKE = TAKE+1;
         /*code to process a coordinate pair*/
      END;
   END;
In order for the outer loop to terminate, it is necessary for the variable ACTIVE to be reset in some way to false. Assume another attention, PEN-OFF, is defined to be raised when the light pen button is released. The on-unit for this attention will set ACTIVE to false ('0'B, in PL/I) and also post MORE complete. (This latter action is necessary as the main loop may be waiting for MORE when the release occurs.) As it is necessary to make sure that all light pen attentions are processed before terminating the processing loop, the PEN-OFF attention must never preempt the LIGHT-PEN attention.

   ENABLE ATTENTION(LIGHT-PEN) IMMEDIATE,
          ATTENTION(PEN-OFF) ASYNCHRONOUS;
   ON ATTENTION(PEN-OFF) BEGIN;
      ACTIVE = '0'B;
      POST(MORE);
   END;

Although it would appear that the mechanism described so far would be sufficient to handle all situations in the simple graphics application used in this example, there is one critical area in the program where a
combination of unfortunately timed attentions may result in the loss of some data. If a LIGHT-PEN attention followed immediately by a PEN-OFF attention should occur between the end-statements of the two do-loops, the setting of ACTIVE to false would cause the outer loop to terminate without ever making use of the coordinates recorded in the final LIGHT-PEN on-unit. The most straightforward solution to this problem is to include in the outer loop's while-clause a test to make sure that all coordinates have been processed; that is, the loop continues while ACTIVE is true or while PUT is greater than TAKE. Putting all of this together in a subroutine:
   PROCESS-PEN: PROCEDURE;
      DCL (PUT, TAKE, CVEC(10000,2)) FLOAT,
          COORDS ENTRY EXT RETURNS((2) FLOAT),
          MORE EVENT,
          ACTIVE BIT(1),
          LIGHT-PEN ATTENTION ENV(. . .),
          PEN-OFF ATTENTION ENV(. . .);
      ON ATTENTION(LIGHT-PEN) BEGIN;
         CVEC(PUT+1,*) = COORDS;
         PUT = PUT+1;
         POST(MORE);
      END;
      ON ATTENTION(PEN-OFF) BEGIN;
         ACTIVE = '0'B;
         POST(MORE);
      END;
      PUT, TAKE = 0;
      CLEAR(MORE);
      ACTIVE = '1'B;
      ENABLE ATTENTION(LIGHT-PEN) IMMEDIATE,
             ATTENTION(PEN-OFF) ASYNCHRONOUS;
      DO WHILE(ACTIVE | PUT > TAKE);
         WAIT(MORE);
         CLEAR(MORE);
         DO WHILE(PUT > TAKE);
            TAKE = TAKE+1;
            /*process coordinate pair*/
         END;
         /*critical area referred to in text*/
      END;
      DISABLE ATTENTION(LIGHT-PEN, PEN-OFF);
   END PROCESS-PEN;

In this example, because of the use of PUT and TAKE in the main program and in the on-units, it would not be safe to permit the on-units to execute as tasks. Given suitable facilities for synchronization of access, it might be possible to gain efficiency through the use of the task option on the on-units.

Another example of attention handling will be taken from a time-sharing systems application. When the system is started up, a small "telephone answering" routine is initiated at a high priority. This routine is prepared to respond to attentions raised by a terminal interface each time a data connection is made. The response by the answerer is to call a copy of the terminal monitor routine to handle the user. It is assumed that a variable, LINE, contains an integer that identifies the line on which the connection is made.

   ANSWERER: PROCEDURE;
      DECLARE LINE FIXED,           /*phone line of connection*/
         FOREVER EVENT,             /*to permit program to wait*/
         RING ATTENTION ENV(. . .), /*raised when a phone connection is made*/
         MONITOR ENTRY EXT,         /*this will be called each time a connection is made*/
         USER(maxno) TASK;          /*to identify a monitor task*/
      CLEAR(FOREVER);
      ENABLE ATTENTION(RING) ASYNCHRONOUS;
      /*the ring attention on-unit will be invoked as an independent task, once for each connection*/
      ON ATTENTION(RING) TASK(USER(LINE)) BEGIN;
         /*preliminaries--see text*/
         ENABLE ATTENTION(RING) ASYNCHRONOUS;
         CALL MONITOR(LINE);
         /*termination--see text*/
      END;
      WAIT(FOREVER);
   END ANSWERER;

The main body of the program is easily described--after initializing the event, enabling the attention, and establishing the on-unit, the program goes into a wait state and remains that way except when actually
in the on-unit. Each time a connection is made, the on-unit is invoked, and that on-unit remains active until the user on the line disconnects. When the attention is raised, its access status is changed to queued. This permits the on-unit to execute safely any code that requires exclusive access to any critical data. The attention is then enabled for asynchronous access again to permit other connections to be made. At this point the MONITOR program is called that will take care of the user's requests. The monitor remains active until the user is finished and then, after any termination code that may be necessary, the on-unit terminates. It is assumed that the answerer itself will remain active, but in the wait state, until terminated by some outside action.

This example could be extended somewhat by assuming another attention that will be raised when a phone line disconnects. This action may occur due to a line fault or to a user hanging up without logging out. The following code would be added before the wait-statement in ANSWERER (D-LINE is the number of the disconnected line):

   DECLARE DISCONNECT ATTENTION ENV(. . .);
   ENABLE ATTENTION(DISCONNECT) IMMEDIATE;
   ON ATTENTION(DISCONNECT) TASK BEGIN;
      STOP TASK(USER(D-LINE));
   END;

The primary purpose of the on-unit is to terminate the instance of MONITOR that corresponds to the disconnected line. The stop-statement could be preceded or followed by any necessary clean-up code.

The last example illustrates a possible application of the attention handling through multiprocessing approach of Section 6.3. The attention in this case is assumed to be a terminal's break key, and it will be used to initiate a debugging printout during the run of a long program. Early in its execution, the main program will execute the following statement:

   CALL DEBUG-PRINT TASK;

The attention task itself will look like:

   DEBUG-PRINT: PROCEDURE;
      DECLARE KEY ATTENTION ENV(. . .);
      ENABLE ATTENTION(KEY);
      WAIT(KEY);
      /* debugging printouts */
   END DEBUG-PRINT;
The subroutine will enable the attention and then enter a wait state until the attention is raised. If that never happens, the subroutine will be terminated when the calling program terminates. When the attention is raised, the debugging print statements will be executed and then the subroutine will terminate. It would also have been feasible to have the DEBUG-PRINT program loop back to the wait-statement so that the print would be repeated each time the terminal break key was depressed.
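That looping variant might be sketched as follows (an illustration only; the endless loop relies on the termination of the calling program to terminate the subtask):

   DEBUG-PRINT: PROCEDURE;
      DECLARE KEY ATTENTION ENV(. . .);
      ENABLE ATTENTION(KEY);
      DO WHILE('1'B);  /*repeat until the calling program terminates*/
         WAIT(KEY);    /*KEY is set incomplete again after the wait*/
         /* debugging printouts */
      END;
   END DEBUG-PRINT;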
8. Conclusion
A primary goal of this article was to define and illustrate an attention handling facility embedded in a high level programming language. It is not the first such attempt--as early as 1969, a paper on a process-control language based on PL/I was published (Boulton and Reid, 1969), and in mid-1969 an informal presentation on attention handling in PL/I was made at a SHARE Meeting (Norman, 1969). Although many of the examples were based on extrapolations of PL/I, the concepts involved are not tied to any language in particular and could be applied to FORTRAN, COBOL, PL/I, or any ALGOL-like language. To avoid the impression of strong dependence on any particular application area, it should be noted that the underlying language concepts can find application in areas from time-sharing through systems programming and online process control. Attention handling through on-units has a certain elegance that seems well suited to the top-down structural approach to programming, allowing the programmer to specify his attention handling code in a compact modular fashion, almost as part of a block's declarations.
Appendix 1 Syntax of the Attention Handling Language

The syntax of the attention handling statements is described here in a notation that is an extension of Backus-Naur Form. The ::= operator separates the term being defined (on the left) from its definition (on the right). Square brackets indicate items that are optional. The ' operator separates options that may appear in arbitrary order. The vertical bar separates alternatives, no more than one of which may be used in a single statement. Upper case letters and punctuation other than that defined in the metalanguage represent characters from the statements being defined, while lower case names represent categories defined within the definition itself. Braces are used for grouping, and the ellipsis operator ( . . . ) is used to denote arbitrary repetition.
For purposes of this Appendix, some informal definitions appear as prose text between # signs. This notation is similar to that devised for the definition of COBOL and is fully defined in the draft PL/I Standard (ANSI/X3J1, 1973).
1. The Enable-Statement

   enable-statement ::= ENABLE { attention-part { [access-option] ' [location-option] ' [exclusion-option] } } . . .
   attention-part ::= ATTENTION(attention-list) | ATTENTION(*)
   access-option ::= IMMEDIATE | ASYNCHRONOUS | QUEUED | PRIORITY(unsigned-integer)
   location-option ::= IN(task-list) | IN(*)
   exclusion-option ::= LOCALLY | ONLY
   attention-list ::= # a list of attention names, separated by commas if more than one #
   task-list ::= # a list of task names, separated by commas if more than one #
   unsigned-integer ::= # an unsigned integer #
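For instance, a statement conforming to this grammar (the attention and task names are hypothetical) is

   ENABLE ATTENTION(LIGHT-PEN) PRIORITY(3) IN(T1, T2) ONLY;

in which the access, location, and exclusion options all appear; by the grammar above, any of them could be omitted or given in another order.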
2. The Disable-Statement

   disable-statement ::= DISABLE { attention-part [location-option] } . . .

3. The Dequeue-Statement

   dequeue-statement ::= DEQUEUE attention-part

4. The On-Statement (as used with attentions)

   on-statement ::= ON attention-part { [task-option] ' [event-option] } on-unit
   task-option ::= TASK[(task-name)]
   event-option ::= EVENT(event-name)
   on-unit ::= # a simple statement or begin-block as for a PL/I on-unit #
Appendix 2 Detectable Conditions in PL/I, COBOL, and FORTRAN
As described in Section 4, the general purpose languages PL/I and COBOL and implemented versions of FORTRAN all have at least some ability to take special action as a result of some potentially interrupting condition. The tables below list the conditions that can be detected in each of these languages and processed without terminating the program run.

1. PL/I
Fixed overflow, floating over- and underflow
Size--number too large for target
Division by zero
String truncation or reference outside of string bounds
Subscript out of range
Conversion error
I/O errors of all sorts
ERROR--a general catchall that is raised for any error condition not explicitly named. This includes errors in built-in functions, machine errors, etc.
FINISH--a condition raised in a routine just as it is about to return, permitting any last minute cleanup
2. COBOL

Overflow, division by zero--combined into a general purpose SIZE ERROR clause
End of file
Invalid key
Error on an I/O device
Start of tape volume (for label processing)
Start of section of Report Program
3. FORTRAN
(IBM FORTRAN IV)

Floating point under- and overflow
Division by zero
End of file
Error on input device
End of asynchronous I/O operation

(Honeywell Series 6000)

Integer overflow, floating point under- and overflow
Division by zero, integer and floating point
I/O errors of all sorts, sequential and random access
Format and conversion errors
Illegal and/or improper arguments to built-in functions (square root of a negative number, log of zero, etc.)
Appendix 3 Glossary of Terms

The definitions below are in no sense universal, but rather reflect the specialized usage above. (A number in brackets indicates the section in the text in which a term is defined.)
access--status of an attention enablement, determining whether it will interrupt or not. [6.1.3]
active--turned on; an active attention may be raised. [6.1.3]
asynchronous--an asynchronous attention may interrupt a program at any time except when the on-unit of an immediate attention is active. [6.1.3]
attention--the manifestation within a program of an external condition. [2]
condition--a situation, occurrence or event that causes an attention to be raised. [2]
delayed--an attention whose action does not immediately interrupt a program is said to be delayed.
dequeued--removed from a queue or stack. [6.1.4]
disabled--not prepared to be raised. [6.1.3]
enabled--prepared to be raised. [6.1.3]
enqueued--placed on a queue or stack.
event--1. an occurrence outside of a program that may cause an attention to be raised. [2] 2. a data item that takes on the values "complete" and "incomplete". [4.1.2]
immediate--an immediate attention may always interrupt a program, but its attention on-unit may be interrupted only by another immediate attention. [6.1.3]
inactive--turned off; the occurrence of the condition corresponding to an inactive attention may be ignored. [6.1.3]
interrupt--(See Section 2)
on-unit--a statement or block of code to be executed when a program is interrupted by an attention or condition. [4.1.1]
priority--an integer measure of the relative importance of a task or attention. [6.1.5]
queued--1. on a queue or stack. 2. raising a queued attention causes an entry to be made on a queue but does not cause the program to be interrupted. [6.1.3]
raise--an attention is raised when it is enabled and the external condition with which it is associated occurs. [2]
task--(See Section 2)
type (of interruption)--a classification of interruptions differentiated by the level of synchronization with the hardware or software environment. [3]

REFERENCES

ANSI. (1966). "USA Standard FORTRAN." Amer. Nat. Stand. Ass., New York.
ANSI/X3J1. (1973). "BASIS/1 (Working Document for PL/I Standard)." Copies available from CBEMA, 1828 L St. NW, Washington, D.C. 20036.
ANSI/X3J3. (1973). "FORTREV (Working Document for Revised FORTRAN Standard)." Copies available from CBEMA, 1828 L St. NW, Washington, D.C. 20036.
Bates, F., and Douglas, M. L. (1970). "Programming Language/One," 2nd ed. Prentice-Hall, Englewood Cliffs, New Jersey.
Boulton, P. I. P., and Reid, P. A. (1969). A process control language. IEEE Trans. Comput. 18, No. 11, 1049-1053.
Coddington, L. (1971). Quick COBOL. Computer Monographs Series (S. Gill, ed.), Vol. 16. Amer. Elsevier, New York.
Corbato, F. J., Daggett, M. M., Daley, R. C., et al. (1963). "The Compatible Time-Sharing System, A Programmer's Guide." MIT Press, Cambridge, Massachusetts.
Corbato, F. J., Clingen, C. T., and Saltzer, J. H. (1972). MULTICS: The first seven years. Honeywell Comput. J. 6, No. 1, 3-14.
Dijkstra, E. W. (1968). Cooperating sequential processes. In "Programming Languages" (F. Genuys, ed.), pp. 43-112. Academic Press, New York.
Honeywell. (1971). "FORTRAN," Doc. No. CBP-1686. Honeywell, Phoenix, Arizona.
IBM. (1970). "PL/I (F) Language Reference Manual," Form GC28-8201-3. IBM, Armonk, New York.
IBM. (1971). "PL/I Checkout and Optimizing Compilers: Language Reference Manual," Form SC33-0009-1. IBM, Armonk, New York.
IBM. (1972a). "Conversational Programming System (CPS)," Form GH20-0758-1. IBM, Armonk, New York.
IBM. (1972b). "FORTRAN IV Language," Form GC28-6515-9. IBM, Armonk, New York.
Lindsey, C. H., and van der Meulen, S. G. (1971). "Informal Introduction to ALGOL 68." North-Holland Publ., Amsterdam.
Norman, A. B. (1969). Attention handling in PL/I. Unpublished notes--Entry 2108 in the IBM PL/I Language Log.
Poetry Generation and Analysis
JAMES JOYCE
Computer Sciences Division
Department of Electrical Engineering and Computer Sciences
University of California
Berkeley, California
1. Introduction . . . 43
2. Computing Expertise . . . 44
3. Poetry Generation: The Results . . . 47
4. Poetry Analysis: Introduction . . . 52
5. Concordance-Making . . . 53
6. Stylistic Analysis . . . 58
7. Prosody . . . 61
8. Literary Influence: Milton on Shelley . . . 62
9. A Statistical Analysis . . . 63
10. Mathematical and Statistical Modeling . . . 64
11. Textual Bibliography . . . 67
12. Conclusion . . . 69
References . . . 70
1. Introduction
The progress in literary data processing has neither made human poets obsolete nor rendered literary criticism a matter of having a friendly chat with a HAL 9000. The application of computers to the writing and analysis of poetry is approximately 15 years old and shows great promise for achieving every new subdiscipline's ideal goal--ceasing to be a thing apart from what one normally accepts as the mainstream of activity in literary study. Results sometimes fall short of expectation in the application of computers to the generation and analysis of poetry, but the successful applications have encouraged computer technology to develop beyond the number-crunching stage toward the no less difficult area of natural language analysis.

Poetry generation and analysis have nothing to do with automatic translation (as far as I can determine); and although some studies of
authorship have been done, the bulk of poetry analysis by computer is to understand what is there in the poetry rather than who wrote it. Thus, this survey does not include authorship studies.

Poetry generation--the act of programming a computer to produce strings of output in imitation of human poetry--is less often done than poetry analysis, which is programming a computer to search for strings or patterns within human poetry; perhaps this is because most of us would rather read and talk about poems a human writes than about those written by a computer. However, as we shall see, there is at least one good reason for producing poetry by computer. There are also a number of good reasons for going to the effort of converting poems into computer-readable form and writing (or borrowing) programs to search the lines for strings of characters or patterns. Fortunately, the human effort involved in the computational analysis of poetry can be decreased as more material is available in computer-readable form, as programs for stylistic analysis are shared, and as human-oriented output is easier to produce on high-speed printers, terminals, computer-controlled typesetting equipment, and the like.

Because many of the points I make in this survey of the kinds of work going on relate not only to the analysis of poetry, but more generally to literary analysis, I will at times substitute "literary analysis" for "the analysis of poetry" to suggest the more general application. The examples, however, will be drawn from applications of the computer to the writing and study of poetry. The studies mentioned were selected because they are indicative of work done or going on, and I regret that it was not possible at least to mention every worthwhile study. Most of the examples are of applications to English poetry because I expect my readers are more familiar with English and American literature, not because work done in French or German, for example, is any less important.

2. Computing Expertise
Before one can do much data processing there must be some data to process. The task of encoding data for someone engaged in poetry analysis can become monstrous all too quickly, depending in part upon the complexity of the project. Optical readers have not yet progressed to the point where a book of poetry can be "fed in" to the computer--or anything near that desirable but distant state of direct data entry. The person wishing to do poetry generation can (and generally does) content himself with a limited input vocabulary, concentrating his energies on the program logic necessary to create lines of poetry as output. But the person wishing to do an analysis of, say, Milton's Paradise Lost must (if he cannot locate one of the several copies in computer-readable form)
arrange for someone to enter the poem into computer-readable form or do so himself. It is this single point that discourages many novices in literary data processing. Although there is a listing of machine-readable text published each January in Computers and the Humanities, a journal devoted to computer applications to the various fields in the humanities, either the investigator may find the material he desires is not yet in machine-readable form, or the person in possession of the machine-readable text for some reason is not willing to share it. Another source for poetry already in computer-readable form can be found in the articles and books written as a result of computer-aided scholarship; once again, some scholars apparently are not willing to share, whereas others are quite helpful. The situation is such that a group of scholars, among them this author, are at work on plans for a clearinghouse of literary materials in computer-readable form; this archive of material will hopefully eliminate much duplication of effort and make computer-readable materials as easy to obtain as their printed forms are through interlibrary loan. Unfortunately, much work needs to be done before the archive can be operational, and at present most literary investigators must encode their data as well as design and write (or have written for them) programs.

By and large, poetry generation and analysis projects are done in computer languages designed for other purposes, such as FORTRAN, PL/I, ALGOL, and SNOBOL. This is because the traditional lack of money in humanistic disciplines is continued in computer applications. The literary scholar who uses a computer in a project may be more likely to find funds than he was before computers were used in such tasks. But there is nowhere near enough money involved to encourage a major computer manufacturer to offer a supported compiler for a language more suitable for poetry generation or analysis.

However, I do not mean that the major computer languages are totally inappropriate for poetry generation and analysis. Although FORTRAN is still best used for numeric calculations, it can be made almost tame for literary data processing; the advantage of using FORTRAN is that it is the most widely available computing language and at a given installation tends to be kept the most problem-free of the installation's languages. In my opinion, the best language for natural language programming is PL/I, a language implemented for IBM, Burroughs, or CDC computers; PL/I allows both text processing and number processing without making the programmer work too hard to use them, although it is sometimes too easy to write inefficient code. Some poetry analysis projects are done in assembler-level languages, but writing in a language so machine-oriented takes someone who is very patient and as interested in computers as he is interested in literature. SNOBOL has been offered by several people as a good language for literary analysis, but I find it a bit too alien for a confirmed humanist; this objection may well be more personal than professional. One problem with SNOBOL is that it is not available everywhere in a fast version and can be costly to use. Milic uses SNOBOL for his experiments in poetry generation because he does not need to be concerned with programming efficiency as much as obtaining results quickly, which he can then devote his time to analyzing (Milic, 1971); for programs processing large amounts of data, however, I suspect it would behoove the investigator to choose another language. COBOL has been used several times for concordance-making programs and may be an overlooked language for natural language processing; COBOL's advantages are that it looks very much like English and is generally very fast in execution. Some researchers, unfortunately, speak ill of COBOL not because they know the language to be improper for literary data processing but because COBOL is the number one language in business data processing.

Since there is more activity in poetry analysis than in poetry generation, more programs have been written and made available by their authors so that the same program need not be rewritten by independent investigators. Computers and the Humanities publishes in its May and November numbers a "Directory of Scholars Active" which indicates the projects under way and whether the programs being used or developed in a project are available. The January 1973 issue of the journal includes a brief summary of humanities programs available, in addition to the useful registry of machine-readable text. Comments in some of the program summaries indicated machine-to-machine incompatibility problems, but also a sophistication about programming that may indicate a growing away from the technological ignorance which has hindered work in the field.

For a long time investigations into poetry using a computer were done by scholars trained in the humanities but who relied on programmers (generally not trained in the humanities) to instruct the computer. This led to the inevitable communications problems and misestimation of the computer's capabilities, both over and under what the machine could do. I do not know whether the percent of scholars engaged in literary data processing who can program has passed 50%, but I tend to doubt it. One reason it is important for literary scholars to know how to program is that computer center consultants do not generally understand the problems proposed by literary scholars, and there are not enough graduate students in literature who know programming to assume the programming chores that graduate students in physics, economics, and other such
disciplines perform. Another reason literary scholars need to know how to program is that effective and efficient literary data processing techniques can best be developed by people who understand the goals of literary data processing and the capabilities of computer programming. Curiously, those who generate poetry by computer all seem to know how to program (that is, there are none of the references to "my programmer" which one finds in literary data processing reports). Perhaps this can be accounted for by the fact that the person wishing to generate poetry by computer realizes that the experiment in creation requires a close relationship between literary person and computer; another reason may be, as one generator of poetry told me, the whole thing is being done as an exercise in programming.

3. Poetry Generation: The Results
Poetry generated by a computer has appeared from time to time in print and was on exhibit at the Association for Computing Machinery's 1969 Annual Conference in San Francisco, California. More recently two publications have appeared containing a number of computer-produced poems: Cybernetic Serendipity, and a volume devoted to poetry entitled Computer Poems. The poetry in these publications gives a good idea of what is being done in poetry generation, both the successes and the failures.

A most interesting and thought-provoking application of the computer to writing is by Marc Adrian, the first selection of computer poems and texts in Cybernetic Serendipity. Adrian's piece "CT 2 1966" is called a "computer text" in the book, and I do not know to what extent I am justified in including it in this discussion. But since the piece is similar to poetry by, for example, Ian Hamilton Finlay in Scotland, I decided I should include it (see Figs. 1 and 2). An IBM 1620 II was programmed to select the words or syllables at random for Adrian's poem; the program also selected at random a type size for each of the words chosen, and they were set in that size Helvetica type using no capitals. The question of whether the result is a poem is not to be brushed aside lightly. Adrian's (1969) comments on his work sound very much like an artist discussing his medium: "To me the neutrality of the machine is of great importance. It allows the spectator to find his own meaning in the association of words more easily, since their choice, size and disposition are determined at random." The question of whether the product is in the medium of poetry or of graphics is answered hastily at the peril of the reader. I do not believe it is poetry because it is not language, for I hold as an assumption that poetry is a subset
[Figure 1 omitted: randomly chosen syllables set in varying type sizes.]

FIG. 1. A computer text. (From Adrian, 1969.)
[Figure 2 omitted: the letters of the word "acrobats" scattered across the page in a visual pattern.]

FIG. 2. I. H. Finlay's poem "Acrobats." (From Smith, 1968.)
of language. Perhaps one value of Adrian's work is that as we judge whether it is poetry or not, we have the opportunity to reassess for ourselves just what the word poetry means to us.

Gaskins chooses the haiku form as the structure from which a poem is to emerge. Gaskins has a fine sense of haiku and has evidently programmed his computer well. His word-choice captures the Oriental spirit well, and the result is a number of believably Japanese haiku. In a conversation with Gaskins I learned he wrote the program to generate the haiku in the SNOBOL language, and, among other things, the program checks to maintain a consistency of seasons to be evoked in the computer-produced poem. The traditional haiku is a three-line poem having five syllables in the first line, seven in the second, and five in the last; the poem is to be evocative rather than discursive, and there should be some pointed reference to the season the haiku suggests in the last line. Though these rules do not seem very complex, a good haiku is not easy to write, whether helped by a computer or not. The following are some of Gaskins' haiku (1973):

Arriving in mist
Thoughts of white poinsettias
Snow leopards watching.
and

Unicorns drowsing
Thoughts of steep green river banks
Scented thorngrasses.
and

Pines among riders
Old man is speaking of frost
Mountain jays watching.
These haiku, while not great poems, are good. Other attempts at haiku generation by computer are not as good; their programmers apparently ignored basic haiku structure and generated random words under the constraint of syntax alone, hoping for output which (if the reader stares at it for a long time) may make sense. Gaskins' efforts make sense and have that sense of abbreviation that is in human-written haiku. The title--"Haiku Are Like Trolleys (There'll Be Another One Along in a Moment)"--suggests their machine origin. Gaskins explained to me that after he had written the haiku-generating program he had modified his computer's operating system so that in the event of machine idle time, haiku began appearing on a cathode ray terminal which was permanently online, the lines moving upward steadily on the screen as more lines appeared, and finally disappearing off the top of the screen, lost; the haiku
did remain on the screen long enough for someone to copy it if he happened to be watching the screen and desired to do so. Unfortunately, it is my understanding that the haiku-producing program is no longer such an integral part of the system.

Two people who generate poetry by computer, Kilgannon (1973) and Milic (1973), have goals beyond the basic goal of generating poems through a computer program. Kilgannon, using an ALGOL program running on an Elliott 4130 computer, generates "lyrics" and then develops his own poem from the one the computer generated. This fascinating symbiosis of man and machine suggests a new way of thinking about poetry--a way that may indeed become an influence on poetry itself. The computer has become a source of inspiration, a generator of something to revise and make into a poem. Poets for centuries have drawn inspiration from things, but drawing also the terms of inspiration--to share the process of inspiration and creation with the source--is a rather intriguing development. An example of this symbiotic relationship is illustrated in the pair of poems "Lyric 3205," written by the computer,

Lyric 3205

judy gotta want upon someone.
wanna sadly will go about.
sammy gotta want the thief him but the every reason.
real distance carry.
before god wanna remain.
private distance talk indeed baby. an.
diane likta tell the thief him but the every reason.
real distance carry.
before god wanna remain.
private distance talk indeed baby. an
and, Kilgannon’s poem developed from “Lyric 3205”: Restlessness judy needs to need someone sadly searching everywhere sammy finds his soul attached to travel, movement, free as air diane lusts communication every life is her domain private distance talks indeed and drives us all to search in vain
A poet writing poems based on words or phrases selected at random is not new. As Milic (1971) has brought to our attention, a childhood
friend of Dylan Thomas wrote that Thomas "carried with him a small notebook containing a medley of quite ordinary words." When he wanted to fill in a blank in a poem he would try one word after another from his dictionary, as he called it, until he found one that fit. His aim "was to create pure word-patterns, poems depending upon the sounds of the words and the effect made by words in unusual juxtaposition." Kilgannon's use of the words and ideas from "Lyric 3205" in his "Restlessness" is not unlike Thomas' practice as described by his friend. The difference between a Dylan Thomas poem and one by Kilgannon is not found in whether one used a notebook or a computer to suggest the stuff of which poems are made; the difference lies in the ability of the two poets.

Milic's collaboration with the computer is a pointedly systematic one; he seeks to understand more about what poetry is, and the poems his programs generate serve two functions: (1) they are poems, and (2) they are experiments to see what kinds of freedoms and constraints will produce poetry. He says, "in an important sense, strings of words are interpreted as poetry if they violate two of the usual constraints of prose, logical sequence and semantic distribution categories" (Milic, 1971, p. 169). This suggests that poetry does not have to make quite as much sense as prose does, a statement that is in its way true. As Milic points out, modern poets have exploited the reader's willingness "to interpret a poet, no matter how obscure, until he has achieved a satisfactory understanding" (1971, p. 172). Behind the reader's willingness is a seldom expressed but definite agreement that there is something there to understand; and, beyond that, there is someone to understand. Some computer-generated poetry shows a concern for the first part of the agreement; if any shows a concern for the second I am not aware of it. Marc Adrian's words that the "neutrality of the machine . . . allows the spectator to find his own meaning in the association of words more easily" can be viewed as so much bad faith on the part of the poet if one insists upon the agreement between reader and writer characterized above. Milic's computer-generated poems show a concern for the integrity of the poem: although poetry may violate some of the rules, it is not devoid of some basic sense. An example of this concern is in his "Above, Above":
There is a basic sense there, something the reader can enter into; i t is not terribly profound, but it i s not totally obvious (as expository prose is supposed to be) either.
4. Poetry Analysis: Introduction
The agreement between poet and reader that the poet has given the reader something to understand if he will but interpret it is a basic tenet of literary analysis and a major justification for the analysis of a poem by computer; indeed it may not be too much of an exaggeration that the analysis of poetry using a computer allows the reader the opportunity to understand the someone who wrote the poetry with a precision usually not attained through the use of noncomputational analysis alone. This is not to say that literary data processing is all one needs to do to understand poetry--far from it; using a computer to analyze poetry provides a reader with knowledge to be used in addition to more conventional methods of analysis. The analysis of poetry is no less difficult than the analysis of any other natural phenomenon.

But what can one analyze in poetry using a computer, a device so insensitive and unforgiving that the obvious misspelling of a variable name can produce pages of misunderstanding in the form of error messages or bad output? The answer to this, of course, is that it is the human who does the analysis; the computer is his instrument for handling large quantities of data quickly and consistently, especially if the task involves counting or searching.

A digital computer can process two kinds of data, numeric and string. Poetry can be viewed as string-type data, and thus the kinds of operations one performs on strings can be performed on poetry. For example, a line of poetry is generally mapped, character for character, into a fixed-length field and padded to the right with blanks. There may be other fields for identification information to make up the record, depending upon whether or not the project has a need for that information. Most records are card images, reflecting the most common input medium--cards. Most lines of poetry will fit comfortably on a single card along with, say, a line number or identifier to indicate the poem. Programs search the text field for strings or patterns, and occurrences that meet a desired criterion are counted or noted (a minimal sketch of such a search appears at the end of this section). Generally the kind of processing done is more like business data processing than scientific data processing, except for a few isolated projects.

A common product of literary data processing is a concordance, which can be defined for the moment as an alphabetical list of entries, each entry consisting of a word used in the text being concorded and each line in which the word was used (usually with some indication of the location within the text being concorded). An example of a concordance can be seen in Fig. 3.
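As a minimal sketch of such a string search in PL/I (the field name, the 80-character card-image layout, and the target word are assumptions for illustration only):

   DECLARE LINE CHAR(80),         /* card-image text field             */
           N FIXED BINARY INIT(0);
   ...
   IF INDEX(LINE, 'LOVE') > 0     /* does the line contain the string? */
      THEN N = N + 1;             /* count the line                    */

The INDEX built-in function returns the position of the first occurrence of one string within another, or zero if it does not occur. A real search would also have to guard against matches buried inside longer words ("BELOVED"), one of the complications that makes such programs less trivial than they first appear.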
5. Concordance-Making

Production of a concordance is not a computationally sophisticated task, although there are some aspects of concordance-making that as of now are not possible using a computer. The poetic line is searched, and each word found (if the concordance is not selective) is written as a record, containing also the poetic line and identifying information, to some intermediate storage device. If the concordance is selective, the word found is checked against a list of words to be omitted and, if it is on that list, the record for that word is not written out. The intermediate output is read from the temporary storage device and sorted into alphabetical order, the result later input to a print program which accumulates statistics about the number of different words and the distribution of relative frequency for the different words. This sounds simple, and basically is; far more difficult for the literary data processor is the task of entering the data to be concorded into computer-readable form, a task which includes the decisions of how to indicate characters not available on the encoding device or the printer, the fields to be included in the record, etc.

However, if the concordance project consists of only what I have described above, the resulting concordance may be a help to scholars (in that some help is better than none), but it will not really be satisfactory. Homographs, for example, "love" as a verb and "love" as a noun, are not two uses of the same word; they are different words and should be treated and counted as such. True, the human user of the concordance should be able to separate homographs, but to do the job correctly they should be separated in the finished concordance. This process at present requires the human to edit a preliminary form of the concordance, a task which is made considerably easier if one has an online text editor such as the WYLBUR system developed by Stanford University¹; a program designed to regroup lines according to an editor's indication is not complex to write in any event, and allows a number of other desirable regroupings to be made as well. For example, it is also desirable to have the various forms of a verb, such as "coming," grouped together under "come." One of the forms, "went," needs to be placed out of its strictly alphabetical arrangement, but that is a minor problem for the regrouping program sketched above.

¹ A description of Stanford's WYLBUR system is available from the Stanford Center for Information Processing, Stanford University, Stanford, Ca. 94305. Another version of the WYLBUR system is described by Fajman and Borgelt (1973).
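As an illustration of the first of the passes just described, a PL/I sketch of the word-splitting step follows (the record layout, field widths, and the file name SORTIN are assumptions for illustration; the sort and print passes would be separate programs):

   CONCORD: PROCEDURE OPTIONS(MAIN);
      DECLARE TEXT CHAR(72),         /* the poetic line              */
              IDENT CHAR(8),         /* poem and line identification */
              (WORD, REST) CHAR(80) VARYING,
              K FIXED BINARY,
              SORTIN FILE;           /* intermediate storage         */
      ON ENDFILE(SYSIN) STOP;
      DO WHILE('1'B);                /* one record per input card    */
         GET EDIT(TEXT, IDENT) (A(72), A(8));
         REST = TEXT;
         DO WHILE(LENGTH(REST) > 0); /* split the line into words    */
            K = INDEX(REST, ' ');
            IF K = 0 THEN K = LENGTH(REST) + 1;
            WORD = SUBSTR(REST, 1, K-1);
            IF LENGTH(WORD) > 0      /* skip runs of blanks          */
               THEN PUT FILE(SORTIN) EDIT(WORD, TEXT, IDENT)
                        (A(24), A(72), A(8));
            IF K >= LENGTH(REST) THEN REST = '';
            ELSE REST = SUBSTR(REST, K+1);
         END;
      END;
   END CONCORD;

A selective concordance would add, before the record is written, a test of WORD against the omission list; the homograph separation and regrouping discussed above remain editorial passes over the output.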
[Figure 3 omitted: a full page of concordance entries for the words LOUD through LOVE, each entry giving its context line, poem title, and line number.]

FIG. 3. A page from a concordance to the poems of Yeats. (From Parrish and Painter, 1963.)
POETRY GENERATION AND ANALYSIS
55
What is important, of course, is that when the entries for "went" are regrouped with "come," there be a cross-reference entry to indicate that practice. If the cross-reference text, "See under 'come'," is made the context for a single entry under "went," that problem is solved easily enough.

What is described here, it should be said, is not intended to represent the task of making a concordance as ridiculously easy, or to pretend there are arcane reasons for editorial decisions made in preparing a concordance which make complete automation impossible; rather, the production of a concordance is basically a computationally simple task which includes some hard work the computer is generally not asked to do. Another purpose for dwelling here on basic concordance-making is to acquaint the reader with the idea of a concordance, as well as the ideas of homographs and the different forms of words. These concepts will be important in understanding the other kinds of literary analysis done by computer.

The concordance is useful to literary scholars in that if a question arises over how a writer uses a particular word, the entry for that word can be looked up in a concordance and all instances in which the writer uses the word evaluated to synthesize what the writer means when he uses the word. By reference to a concordance rather than to the original work, individual words that suggest appeals to the senses can be traced in the writer's work; by finding the words in the concordance the investigator has a better chance of locating all the words he desires, rather than risk a mind-numbing search through a large body of work and having to consider each word every time it appears.

Approaches to concordance-making by computer have shown almost as much variety as the people preparing them. The most famous of the concordance-making enterprises is the Cornell Concordance series. Before the advent of computers, concordances were made at Cornell by teams of graduate students and local housewives working with 3 x 5 in. index cards; such a procedure was lengthy and tedious to those involved, but was an improvement over a single person doing the concordance alone (Smith, 1971). The Cornell Concordances, begun in the early 1960s under the general editorship of Stephen M. Parrish, have produced a series of concordances to the complete works of a number of writers. These concordances are produced camera-ready on the high-speed printer, which with the earlier concordances meant that punctuation marks were generally omitted and the letters were printed in all upper case. Subsequent refinements of the machinery involved provide punctuation and the distinction
between upper and lower case, recently seen in the concordance to the poetry of Jonathan Swift (Shinagel, 1972). The distinction between upper and lower case, while not always so important that its absence seriously faults the work, is always desirable. The users of concordances are people, and literary people at that; literary people are used to the upper- and lower-case distinction, as well as the distinction between roman and italic type. The makers of concordances who use a computer, faced with the limited character sets of high-speed print chains, compromised at first because they had to. The alternative was to have the computer output hand-set by linotype, which would have introduced errors and held up production of the concordance, as well as adding substantially to the cost.

Recently concordances have been appearing which, although they were produced by computer, have all the advantages and attractiveness of typeset books. These concordances have been filmset, a process by which material is composed on film guided by appropriate programs that sense in the input stream "escape codes" which indicate a change in type font, size, or special characters. As an attractive concordance Ingram and Swaim's (1972) A Concordance to Milton's English Poetry is a visual delight; not only are capitals and lower case present, but roman, italic, and boldface letters help guide the eye (see Fig. 4). High-frequency words are omitted, but such omissions are few, reflecting the trend in concordance-making to exclude as few words as possible. However, there is no consistency in treating homographs. The concordance also lacks what has become a standard item among computer-produced concordances: a frequency list for all the words. Professor Trevor Howard-Hill's Concordances to the First Folio of Shakespeare,² each volume of which is devoted to a different play, is filmset also. The obvious rightness of producing filmset computer concordances has yet to be universally acknowledged, but it is apparent that the trend is there.

One concordance influenced the text it was keyed to because the project for the concordance was carried out while the edition being concorded was in preparation (Spevack, 1968-1970). It was based on G. Blakemore Evans's The Works of Shakespeare. Concordance and edition were in simultaneous production, and several decisions about the text being edited were influenced by the preliminary output from the concordance. Perhaps it is an error to speak of "the concordance," for Spevack's work is really several concordances, one to each piece by Shakespeare and one to the total corpus. Interestingly, no lines of context are given with the entries in the concordance, which makes the distinction between upper and lower case unnecessary.
² Published one play per volume by Oxford University Press, 1970-1972.
FIG. 4. A page from a concordance to Milton. (From Ingram and Swaim, 1972, © Oxford Univ. Press, by permission of the Clarendon Press, Oxford.)
A nice feature of the work is a concordance for each character, indicating each word the character in the play speaks, the number of speeches, how many of these lines are in verse or prose, the total number of words a character speaks, and the number of different words a character uses. These concordances will undoubtedly help studies of individual characters in the plays; for example, a study of the relative frequency of the kinds of words Hamlet uses in Hamlet, Prince of Denmark may help us better understand this complex character. Are his words primarily self-abusive, despondent, uncertain, calculating, or what? Spevack's concordance also has word-frequency lists and reverse-spelling word indices to facilitate, for example, a study of Shakespeare's use of -ing words, and homographs are clearly indicated.

Misek (1972) shows what can be done in concordance-making with a little imagination and sensitivity toward intended users. The body of the concordance is augmented by an indication, for each line of the text, of the speaker of the line and to whom he is speaking, a tagging Professor Misek did by hand. This work gives every word in the poem with no omissions, and provides a number of excellent charts to summarize information. The work comes very close (through the charts) to computational stylistics, which is the next area of poetry analysis by computer we will investigate.

6. Stylistic Analysis
It may be overstating the point to claim that the single most important result of a computer-produced concordance project is that literary material is thus put into computer-readable form so that more sophisticated kinds of literary analysis can take place. That is, as valuable as the concordance is as a tool for the analysis of poetry, the other kinds of analysis done by computer generally rely on material put into computer-readable form so that a concordance can be made.

Stylistic analysis by computer can be an exercise in what is not characteristic of a poet's style, and although nonresults are actually quite valuable, the literary community does not quite know how to handle them, so they generally go unannounced. The expense of putting a large amount of material into computer-readable form is balanced against the risk of nonresults, and I have known several would-be investigators who felt the encoding of literary data too high a price to pay for the kinds of results they might get back. One reason for such an attitude on the part of literary data processors when the question of stylistic analysis comes up is that the discipline of literature has a rather fuzzy-set notion of what style means, but researchers in literature are generally not formally oriented enough to take
advantage of the work by Zadeh³ (1965, 1972) and others in characterizing their area of investigation. Elements of style in poetry belong by degrees to that sense of identity we find in the work of an author and which we call the author's style; within an author's lifetime (even one as short as Keats' five active years) critics may speak of changes in the poet's style, but with the usually implicit assumption that there is such a thing as an overall style. There have been notable attempts to establish precisely the concept of style and other attempts to encourage abandonment of the term altogether. Abandonment of the term "style" is of course no solution to the problem we feel which motivates us to use the word.

³ Fuzzy sets are "classes of objects in which the transition from membership to non-membership is gradual rather than abrupt." See Zadeh (1965, 1972).

Two investigators, Ross and Rasche (1972), have assembled a system of routines for the description of style they call EYEBALL. It is their belief that certain kinds of literary statistics are indicators of style and can be the data a literary investigator uses to develop statements about what a writer does. EYEBALL accepts the piece of poetry or prose (the system can handle both) in a form requiring only a minimum of marking: if the input is poetry, the ends of lines are marked with a slash, /. This is because EYEBALL treats the input as a stream of characters, and unless instructed to do otherwise the limitations of each physical record are irrelevant to EYEBALL. This decision regarding input is wise, for one problem in the discipline of literature has been the artificially different ways in which poetry and prose have been treated. My own work, described below, has convinced me that prose and poetry are tendencies in language and need to be studied both for those stylistic measures common to both as manifestations of verbal art and for what formal measures characterize their differentness from one another.

Ross and Rasche (1972) have provided a system which will find and list, for example, clauses with compound subjects, periodic sentences, or phrases containing more than one adjective. This information tells the investigator what the writer has selected to do within the context of language, and a study of such selection can lead us either to describe the individual writer's trademark (which is one thing we mean by "style"), or the presence in a writer's work of patterns characteristic of a group of writers (which is another meaning of the word "style").

This latter meaning of style as the presence of patterns characteristic of a group of poets motivated Green's study (1971) of formulas and syntax in Old English poetry. Old English poetry can roughly be said to be the poetry written in the English language before the Norman invasion, and in the interests of getting on with the discussion we will accept
that characterization. Old English poetry has a German look to it, as this example shows (the lines in parentheses are literal Modern English translations of the Old English lines they follow):

Caedmon's Hymn

Nu sculon herian heofon-rices Weard
(Now we shall praise heaven-kingdom's Guardian,)
Metodes meahta and his mod-gethanc,
(Creator's might and his mind-thought,)
weorc Wuldor-Faeder, swa he wundra gehwaes,
(work of the Glorious Father, as he wonders each one,)
ece Dryhten, or onstealde.
(eternal Lord, beginning established.)
He aerest scop ielda bearnum
(He first shaped for men's children)
Heofon to hrofe, halig Scyppend;
(Heaven as roof, holy Creator;)
tha middan-geard mon-cynnes Weard,
(then Middle-Earth, mankind's Guardian,)
ece Dryhten, aefter teode
(eternal Lord, afterwards made)
firum foldan Frea aelmihtig.
(for men earth Master almighty.)
The Modern English version of the lines is purposely a rather literal rendering of the Old English; in this way the two-part structure of the Old English line is most evident and it is easier to see that the hemistich (or half-line) is the unit of composition rather than the line. Green's study, by focusing on syntax rather than semantics, showed that the poems were constructed by techniques which are suited to and almost certainly grew out of oral composition. For example, in Caedmon's Hymn, the half-line "ece Dryhten" (eternal Lord) is a formula for expressing the concept of God that the poet is able to draw upon as he tells his poem, an idea that is associated with the frame Adjective + Noun. Green found 30 such frames in an examination of 12,396 hemistichs from the range of Old English poetry, the repeated frames accounting for nearly 32% of the poetry examined. Such a high instance of a limited number of syntactic frames suggests that these frames were memorized by the poets as forms for expression in much the same way that those of us who make up limericks remember that the pattern begins "There was . . . , / Who . . . ," etc. Notably, Green does not claim the computer has helped him decide once and for all that extant Old English poetry was composed orally and later written down; he carefully establishes repetitions of syntactic frames and then attempts to account for such a phenomenon in the poetry.
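The counting step at the heart of such a frame study can be suggested briefly. The sketch below is hypothetical Python, not Green's program; it assumes the hemistichs have already been tagged by hand with part-of-speech sequences, and the tag names are invented.

    # Count recurring syntactic frames among pre-tagged hemistichs (sketch).
    from collections import Counter

    def frame_census(tagged_hemistichs):
        # tagged_hemistichs: one tag sequence per half-line,
        # e.g. ('ADJ', 'NOUN') for a half-line like "ece Dryhten"
        frames = Counter(tuple(tags) for tags in tagged_hemistichs)
        repeated = {frame: n for frame, n in frames.items() if n > 1}
        share = sum(repeated.values()) / len(tagged_hemistichs)
        return repeated, share

    # Over Green's 12,396 hemistichs, 'share' would come out near 0.32 if
    # roughly 32% of them instantiate one of the repeated frames.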
Green's project involved considerable manual editing of the input data: the half-lines had to be analyzed for metrics and syntax, and this information entered into computer-readable form along with the text. Such manual editing of poetry to be processed by computer is fairly common, perhaps because existing systems such as EYEBALL are inappropriate to material such as Old English, or because the investigator did not have access to a machine on which an existing syntax analyzer worked. But there is another way to get from the limitations of the letters of the poem to the music of the poem.
7. Prosody
Dilligan's recent study (1972) employed a multistage approach which combined effective use of the computer with efficient use of the human investigator. His study was of quantitative verse, the basic rhythm of which is determined by the duration of sound in the utterance (long or short syllables) rather than by the traditional system of the accent (strong or weak) each syllable takes, characteristic of most English poetry. Dilligan entered a dictionary of British pronunciation based upon Daniel Jones' English Pronouncing Dictionary into machine-readable form, and entered Gerard Manley Hopkins' entire 4800 lines of poetry. A concordance was prepared, and from that output distinctions were made between homographs such as "present" (to offer to someone) and "present" (as in a gift). From the updated text and dictionary a program produced "a fairly broad phonetic transcription of the text which accurately reflects all consonant and vowel sounds, lexical stress, and sentence stress insofar as this last was indicated on the updated text (Dilligan, 1972, p. 40)." This effort did not require quite as much hand-editing as the Old English study, although it should be clear that true recognition of words by a computer is far from an established fact in literary data processing. Still, Dilligan's technique took the computer "just over six minutes to produce a phonetic text of Hopkins's" poetry, the six minutes being the machine's time after 150 hours of work by two research assistants whose job it was to "transcribe, keypunch, and proofread the pronouncing dictionary (1972, p. 40)." The transcribed text was then input to a scanning program, in which stress patterns were recognized and tabulated, and note taken as well of assonance (repetition of vowel sounds) and alliteration (repetition of consonant sounds). The computer was then used in its familiar role as tireless drudge, and the results sorted, cross-referenced, and listed in various tables. The same process was done for Robert Bridges' Ibant Obscuri, a translation into English quantitative verse of part of Book VI of Virgil's
Aeneid, and the purpose of the study was to use Hopkins' practice as a background against which to view Bridges' experiment with English quantitative verse. For readers whose interests are not prosody, Dilligan's results are not as easily summarized as Green's. After an interesting analysis of Bridges' practice, Dilligan pronounces the experiment with quantitative verse a success.
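Though Dilligan's programs are not reproduced in his article, the sort of bookkeeping his scanning pass performs for alliteration can be suggested in a few lines. The sketch below is hypothetical: the phonetic representation and the vowel class are invented, and assonance could be tallied the same way over vowel sounds.

    # Sketch: tally alliteration within one line of phonetically
    # transcribed verse; each word is a list of phoneme symbols.
    from collections import Counter

    VOWELS = {'a', 'e', 'i', 'o', 'u'}   # stand-in for a real vowel-phoneme class

    def alliterations(phonetic_line):
        initials = Counter(word[0] for word in phonetic_line
                           if word and word[0] not in VOWELS)
        # A consonant sound alliterates if it opens two or more words.
        return {sound: n for sound, n in initials.items() if n >= 2}

    # e.g. alliterations([['h','eh','v','n'], ['h','ow','p'], ['f','ih','r']])
    # returns {'h': 2}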
8. Literary Influence: Milton on Shelley
Joseph Raben and Philip H. Smith compared two authors to suggest the influence of one upon the other. The results were reported by Raben (1965). The study was a comparison of Milton's Paradise Lost and Shelley's Prometheus Unbound. Verbal correspondences between the two poems were sought "within the confines of single sentences. Even with such limitations, the program has produced tens of thousands of correspondences, most of them consisting of two words, but many ranging up as high as seventeen." Admitting that perhaps the lower orders of correspondence between the two poems "are probably of no great individual importance," Raben does point out that "the first sentence in both Paradise Lost and Prometheus Unbound contain God and world" and "the last correspondence located by the computer, at the end of each poem, is good and great (Raben, 1965, p. 242)."

The technique used was not reported in the Proceedings, however, but can be found in Smith (1971); the technique is of interest in that it illustrates another way in which the computer is taught to perform the kinds of analysis which previously were either very taxing to the human investigator or nearly impossible. Briefly, in Smith's technique a separate word index (indicating the word and sentence) was generated for Milton and for Shelley, and then the two were merged. The resulting list was then marked by Raben to indicate equivalents for each word which would actually be considered in the computer comparison of the poems. This marking is crucial, since if the scholar indicates an inappropriate word as an equivalent, the study is thus in error. For example, forms of the word "love" (such as "love," "loved," "lovely," etc.) were assigned "love" as an equivalent word, and a word such as "beloved" was also assigned "love" as an equivalent. The marking process served as an indication of the basic meaning of the word. Since common words (such as "and," "or," etc.) were omitted, they were assigned an asterisk (*) as their equivalent word to indicate that. The list of word equivalents, which Smith calls a "canonization dictionary," was processed against Milton's word index, and then against Shelley's word index, the output from each run being canonized forms of the words and the sentence number in which each word appears.
These lists were then matched, the output indicating the words and the number of the sentence within which each word appeared in Shelley and the number of the sentence in which it appeared in Milton. These records were then sorted by the sentence number for Milton, thus bringing together all words shared by the two poets in the same sentence of (say) Milton's Paradise Lost. This step should provide the evidence for verbal echoes of Milton in Shelley, for if a number of words Milton uses in sentence 4 are also used in sentence 237 of Shelley, there is the chance the shared words indicate Milton's influence on Shelley. Naturally if such a correspondence is very infrequent the best that can be said is that the influence is infrequent, and the worst that the shared words are a matter of chance. But Raben's results showed, as was mentioned earlier, tens of thousands of correspondences. Sad to say, however, the mere presence of numbers of correspondences can be a matter of chance, and had Raben only the evidence of the numbers his conclusions could only be very tentative indeed. The number of correspondences could best be termed indicative if there were some reason to believe Shelley had knowledge of Milton, and as Raben (1965, pp. 230-232) demonstrates, Shelley knew Milton's work and was fond of it. This terribly important step of motivating the literary data processing done is, of course, a basic part of experimental design. Investigations of poetry by computer have by and large had a rationale behind them; unfortunately, I cannot say that all investigations have, and a somewhat outstanding example of what can result from a lack of sufficiently motivated research will serve as the next topic considered here.
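Before turning to that, Smith's procedure lends itself to a compact restatement. The sketch below is hypothetical Python (the data shapes are invented, and the original programs were certainly not written in this language): it canonizes each poet's word index through the equivalence dictionary, drops words canonized to the asterisk, and groups the shared words by Milton's sentence numbers.

    # Sketch of Smith's "canonization dictionary" matching.
    from collections import defaultdict

    def canonize(word_index, canon):
        # word_index: list of (word, sentence_no); canon: word -> equivalent word
        out = defaultdict(set)             # equivalent -> set of sentence numbers
        for word, sentence in word_index:
            equivalent = canon.get(word, word)
            if equivalent != '*':          # '*' marks an omitted common word
                out[equivalent].add(sentence)
        return out

    def correspondences(milton_index, shelley_index, canon):
        milton = canonize(milton_index, canon)
        shelley = canonize(shelley_index, canon)
        by_milton_sentence = defaultdict(list)
        for word in milton.keys() & shelley.keys():
            for m in milton[word]:
                for s in shelley[word]:
                    by_milton_sentence[m].append((word, s))
        # Sentence 4 of Milton -> the shared words and where Shelley uses them.
        return by_milton_sentence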
9. A Statistical Analysis
The work I am about to discuss, I should like to point out, is not bad work; indeed, the technical aspect of the work seems quite good. What is unfortunate about the work is that it makes a reality of the fears of some opponents of the use of the computer in literary analysis. Central to the analysis of literature is the hope that the results shed some light on what is being analyzed. For this reason literary scholars, at the same time they may admire tables of carefully compiled figures, have a nagging question at the back of their minds which demands to be answered; that question can best be phrased as, "So what?" Sainte-Marie et al. (1973) present "a brief expository account of the technique [of principal component analysis], together with a summary of the rather striking results obtained from the pilot project." The application of the technique, if I understand what the authors are saying, is useful "for detecting differences and similarities [within an author's work] that are interesting from
a literary point of view, or for confirming the existence of such differences and similarities if they are already expected (Sainte-Marie et al., 1973, p. 137)." The article presents results graphically, and says they are suggestive, but does not attempt to answer the questions raised at all; the authors are pointedly not concerned "with offering conclusions about the work of Molière (Sainte-Marie et al., 1973, p. 137)." Statistics, after all, are not meaningful in and of themselves; they must be interpreted. Granted, their article is written based on the results of a "pilot project," but then so was Raben's article regarding Milton's influence on Shelley, and Raben did relate his research to the poetry. The work on Molière was to suggest a technique which would be a useful addition to the literary scholar's tools. The investigators in the Molière project, quite simply, have failed in the most important aspect of their work: to make their activity answer the question every aspect of literary analysis, including literary data processing, must speak to: So what?

A number of literary scholars are somewhat loath to excuse even concordance-making from the additional task of interpreting its results, although a concordance is an obvious aid to understanding what an author means by his words. Literary scholars are even critical of the kind of work bibliographers do in assembling a text which they hope most closely gives us the author's work, a text free of typographical and editorial errors which keep us from reading the author's words. The usefulness of principal component analysis as a statistical technique is not my objection; my objection is to the dangerous example of a presentation that stops short of truly demonstrating usefulness. Any methodology must be justified through its demonstrated usefulness in answering the question, So what? It must tell us, even within the limits of a pilot study, something about the literature being investigated. Such a stopping short of literary conclusions, if it continues and spreads, will surely render literary data processing purely an academic exercise in publication. I apologize to the authors of the Molière article for being so harsh on their work; I am personally very much a proponent of statistical applications to literary analysis and find what they did quite solid, but their lack of interpretation seems to me a most dangerous example, no matter how ultimately useful their method may be.
10. Mathematical and Statistical Modeling
Quite a different approach from Sainte-Marie's statistical inference within Molière is found in Kahn (1973b). Dr. Edward Kahn is a former graduate student in English and mathematics at the University of
California at Berkeley. His work, unlike most of the work described so far, does not involve the computer processing natural language data, but it may signal a rather important direction for literary data processing in particular and literary studies in general.⁴ Kahn's work is in mathematical modelling of narrative, specifically the aspect of dramatic allegory in Edmund Spenser's epic, The Faerie Queene. The various characters in the poem were grouped into equivalence classes based on their common qualities, the sets being named Faeries, Titans, Paynims (pagans), etc. The relation among the sets was that of dominance; that is, a Faerie dominates (wins a battle with) a Paynim, and so on. Kahn uses the term "plot" to mean "a network of abstract relations that define the universe of discourse in which the narrative is apprehended," one relation here being that of dominance. A plot is then represented by "a directed graph where each node is understood as an object on which the relations are defined," and the directed graph representation is canonically transformed into a finite state automaton. By identifying his automata as semigroups, Kahn is able to program a computer (in SNOBOL) to construct the semigroups, the interpretation of which indicates the accuracy and success of the simulation performed. The value of Kahn's work is not found only in what it says about Spenser's poem regarding the much-discussed allegory there, but more importantly, in the general applicability of the technique to any narrative for the purpose of characterizing or typing that narrative. The study of types of narratives is of value in helping us understand better just how literature works its spell upon us.

My own work in literary data processing, like Kahn's work, attempts to say something about literature in general, as well as the individual material examined. The work I refer to is on a theory of poetry which provides insight into the subtle yet important distinctions between poetry and prose.⁵ The theory basically focuses on the pauses in a poem generally known as caesuras and end-stops; these pauses are of the sort one finds marked by a slash (/) in the following examples:

1. The man who came to dinner/went home./
2. Bill,/ a healthy lad,/ laughed./
In poetry, it seems, a rhythm of expectation, a periodicity of pause, is set up by the pauses which occur at the ends of the lines. Not every line needs to end in a pause for this to be true, nor does every line have to contain the same number of syllables; being close is enough for the human ear.

⁴ See Kahn (1973a) for a more lengthy and technical treatment of the work summarized here.
⁵ Presented at the Computer Science Conference, Columbus, Ohio, February 20-22, 1973.
These same pauses set up a rhythm of expectation in prose as well, but the period established in most prose is approximately twice as long as the period for poetry. The shorter period of pause for poetry has nothing to do with whether the poetry is good or not; it has everything to do, however, with whether something is poetry or not. By characterizing the poetic line in this fashion one is better able to understand the development of poetry in English; and, even more interestingly, when they are taught the theory, students understand poetry better and have a better appreciation for it than when the importance of periodicity of pause in poetry is not demonstrated. The role the computer plays in the development of this theory is that of examining bodies of prose and poetry with pauses marked in the text and determining which periodicity (if any) is marked by the pauses, as well as accumulating other information about them. My results, though at an early stage, indicate that periodicity of pause is indeed a fruitful way of discussing poetry and prose alike. Poetry and prose can be viewed as tendencies within a continuum of language, since what basically separates them is the difference in period length for pauses. Further, the approach provides a way of accounting for especially poetic passages in a piece which is obviously prose; one would expect (and it seems to be true) that the periodicity of pause for the "poetical" section of a prose piece should be nearer that for poetry than for prose. Characterizations have also been developed for the adjectives "poetic" and "prosaic" (meaning, respectively, "like poetry" and "like prose"), which are used to discuss how regularly the periodicity of pause for the piece occurs; that is, how often the period is carried along by a pause being close to where one would expect it rather than at the exact location. This measure is the variance for the periodicity, and if that variance is near zero the piece is poetic; if not near zero, prosaic.

In addition to the theory of poetry just discussed I have been working on a computational model for stanzaic structures in poetry that has promise. The model, presented at the Second Computer Science Conference held in Detroit in February, 1974, is of the generic, or characteristic, stanza; the choice of the word generic is to suggest that particular stanzas in a given poem grow out of the characteristic stanza. Like the theory of poetry, the model of stanzaic structure depends heavily upon pauses in the poetry, this time confining itself to those at the ends of lines where one comes to a complete stop. By observing which lines end in full stops most frequently for a given stanza type one can derive a model for its substanzaic structure. For example, in a Shakespearean sonnet the full-stopped lines occur most often at lines four, eight, twelve, and fourteen, marking the places in the stanza where major shifts in the stanza take place.
Not every sonnet called Shakespearean has stops only at the ends of lines four, eight, twelve, and fourteen; in fact, few sonnets confine themselves so. But taken as a whole the sonnets clearly display the familiar pattern. This principle can be extended to any group of poems written in what one believes to be a single basic stanza form to yield similar results: Petrarchan sonnets, rhyme royal (as in Chaucer's Troilus and Criseyde), and Spenser's stanza in The Faerie Queene.

When one examines a poem composed of variable-length stanzas, as in the anonymous but important fourteenth-century poem Sir Gawain and the Green Knight, the strength and generality of the model shows itself most decidedly. Comparing variable lengths seems a fool's errand until a way is devised so the variable lengths are seen as being the same relative distance: all the way through the stanza. We can represent all the way through as the number 1, and if the stanza has L lines a full stop at the end of line E can be said to have occurred E/L of the way through. By computing E/L for all full-stopped lines in all stanzas of the poem we have a plethora of proportions that, if represented on the interval 0 to 1, show some groupings but apparently without cohesion. But if each proportion is multiplied by the arithmetic mean of the number of lines per stanza and the result grouped by closeness to lines, we can view the distribution of full-stopped lines for variable-length stanzas the same way we view them for fixed-length stanzas; significantly stopped lines then become generic lines of the generic stanza. Of course, when looking within particular stanzas for the lines identified as being significant, the lines of the model stanza are not true lines at all, but indicate the proportion of the way through a stanza the break occurs.

Applying the model outlined above to Sir Gawain and the Green Knight I was able to identify the overall stanza pattern for the poem as well as the various emphases given to the pattern in each of the four parts of the poem. Also, the stanza in the poem that seems to disrupt the orderliness of the poem's structure turns out to be the only stanza in which the audience is treated to an extended description of Morgan le Fay; Morgan le Fay is responsible for disrupting King Arthur's court by sending the Green Knight to challenge it. Gawain's response to the challenge engineered by Morgan le Fay is the heart of Sir Gawain and the Green Knight.
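The computation just outlined is simple enough to state exactly. The following hypothetical sketch, not the program actually used, scores each full stop at the end of line E of an L-line stanza as the proportion E/L, scales by the mean stanza length, and tallies by nearest generic line.

    # Sketch: locate the generic full-stopped lines of a variable-stanza poem.
    from collections import Counter

    def generic_lines(stanzas):
        # stanzas: list of (stanza_length, [lines ending in a full stop])
        mean_length = sum(length for length, _ in stanzas) / len(stanzas)
        tally = Counter()
        for length, stopped in stanzas:
            for e in stopped:
                proportion = e / length              # E/L of the way through
                tally[round(proportion * mean_length)] += 1
        return mean_length, tally                    # peaks mark the generic stanza

    # For a set of Shakespearean sonnets (L = 14 throughout), the peaks would
    # fall at generic lines 4, 8, 12, and 14.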
11. Textual Bibliography
The last area to be discussed here, in a way, cannot properly be called a topic in the analysis of poetry, although it is certainly the foundation
of good literary analysis. I am referring to that part of bibliography concerned with establishing the text of an author's work. Establishing a text is not the same as establishing the identity of the author of a text. Rather, it means that the scholar attempts to put together a text which represents what the poet produced as a poem. This kind of work takes place when there is reason to believe that a poem has been altered in some way, either through a printer's typographical mistake or an editor's changes over which the poet did not have control. These conditions are surprisingly frequent in literature; it is widely believed that few major poets are represented to us by texts which are what the poet wrote. What, then, do we have to read, and how inaccurate is it? Both are very good questions. Perhaps it is enough here to say that we do not have Keats' poem to read when we read the version of The Eve of Saint Agnes most often published as Keats' poem, or Shakespeare's plays (you pick your favorite, for it is true of all of them). Some poets, such as Emily Dickinson, had their punctuation "corrected" by well-meaning editors who felt it was for the best to do so; but the Dickinson Variorum with the original punctuation shows the dramatic energy in the poetry even better than conventional punctuation of her poetry shows.

Computers are being used in several of the steps a bibliographer must take to arrive at the best text of a poem, including collation, identification of differences among texts, and production of the text itself. Widmann's (1971) use of the computer to collate "some 80 to 120 editions" of A Midsummer Night's Dream is one approach to the collation problem. The various editions of the play were entered into computer-readable form and, after proofreading, were compared line for line and, within each line, word for word (including punctuation). Since Widmann's program evidently restricted itself to the environment of a line in the word-for-word comparison, the problem of determining automatically which words were omitted from a given line was greatly reduced. In producing the output of the collation Widmann printed all versions of the same line together, beginning with the line from a specific edition which serves as the basis from which other comparisons are made. In printing subsequent lines only those words which differed from those in the first version of the line were printed, making it considerably easier for the human editor to see where there were differences among the texts.

Being able to automate the collation process as described above is quite valuable, for it is only after all the editions have been compared that the literary detective work of constructing a critical edition can take place. Construction of a critical edition involves analysis of the corpus of variants produced by collation, an analysis which is frequently done by hand even after the computer has produced the list of variants.
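The heart of such a word-for-word collation can be suggested in a few lines. The sketch below is hypothetical (Widmann's program is not reproduced in the report, and punctuation handling here is cruder): it compares one line of a variant edition against the base edition and prints only the differing words.

    # Sketch: print-only-the-differences collation of one line, word for word.
    def collate_line(base_words, other_words):
        width = max(len(base_words), len(other_words))
        out = []
        for i in range(width):
            b = base_words[i] if i < len(base_words) else ''
            o = other_words[i] if i < len(other_words) else ''
            out.append(o if o != b else ' ' * len(b))
        return ' '.join(out)

    base = 'The course of true love never did run smooth.'.split()
    variant = 'The course of true loue neuer did runne smooth,'.split()
    print(' '.join(base))
    print(collate_line(base, variant))
    # Only "loue neuer ... runne smooth," survives, roughly aligned
    # under the base line, so an editor can spot the variants at a glance.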
Peavler (1974) is at work on a statistical analysis by computer of the corpus of variants for several of Chaucer's short poems, using material supplied to him by Professor George Pace, editor of the Variorum Edition of Chaucer's shorter poems. Pace had collated the various manuscripts of the short poems by hand and was curious whether the computer might be of help in showing the degree to which one manuscript was similar to another. Peavler's technique was to transform the manuscript readings into an array, the rows representing the manuscripts and the columns the individual readings, the array entries indicating whether the manuscript had that reading or not. A FORTRAN program then compared each manuscript against every other manuscript and indicated the number of shared readings; this output was both printed and sorted by the number of shared readings, so that the pairs of manuscripts which agreed the most were listed first, and so on, until the pairs of manuscripts which agreed the least were listed. These kinds of groupings could be useful in determining genetic relationships between manuscripts, such as "manuscript A is a direct copy, or descendant, of manuscript B." Peavler has not indicated the extent to which he has tried to have a program suggest genetic relationships among the manuscripts, although he does feel that the work of making editorial decisions should not be done by a program.
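Peavler's array comparison reduces to very little code. The sketch below is in Python rather than the FORTRAN actually used, and the manuscript data are invented; it counts shared readings for every pair of manuscripts and sorts the pairs from most to least agreement.

    # Sketch of the pairwise manuscript comparison.
    from itertools import combinations

    def agreement_table(manuscripts):
        # manuscripts: name -> tuple of 0/1 flags, one per possible reading
        pairs = []
        for (name_a, a), (name_b, b) in combinations(manuscripts.items(), 2):
            shared = sum(x & y for x, y in zip(a, b))   # readings both attest
            pairs.append((shared, name_a, name_b))
        pairs.sort(reverse=True)        # most-agreeing manuscript pairs first
        return pairs

    mss = {'A': (1, 1, 0, 1), 'B': (1, 1, 0, 0), 'C': (0, 1, 1, 0)}
    for shared, a, b in agreement_table(mss):
        print(a, b, shared)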
12. Conclusion
This survey of current work in poetry generation and analysis has attempted to show the spectrum of activities and to convey a sense of why such work is undertaken. To predict where literary data processing is going seems unnecessary and perhaps rash. There are a number of courses at least touching on literary data processing in universities throughout the country, and computer research sections are established within the framework of many literary conferences. In England last year (1973) the Association for Literary and Linguistic Computing was founded, with an East Coast branch (Professor Joseph Raben) and a West Coast branch (Professor Rudolf Hirschmann) in the U.S.A., and the University of Waterloo now has a special interest group, WATCHUM (Waterloo Computing in the Humanities), organized to further humanistic computing at that university. At Stanford recently an interactive, open-ended question-and-answer program has been developed to lead students in the freshman English program to write poetry: not computer-generated poetry, but poetry written by the students in response to conversations they carried out with the program. In none of the work reported above has
there been the least suggestion that computational analysis should replace a human either in writing or in reading the poem, and that is as it should be. The examples of poetry generation and analysis given here demonstrate that the computer can serve both as a medium for an artist and as a useful colleague, who does the repetitious shuffling and counting within poetry needed by investigators who want to know better what and how a poet means. Literary data processing is obviously growing, developing more computational techniques which will help the scholar and reader in the quest to understand, to hear, what the poet has to say.

REFERENCES

Adrian, M. (1969). In "Cybernetic Serendipity" (J. Reichardt, ed.), p. 53. Praeger, New York.
Dilligan, R. J. (1972). Ibant Obscuri: Robert Bridges' experiment in English quantitative verse. Style 6, 38-65.
Fajman, R., and Borgelt, J. (1973). WYLBUR: An interactive text editing and remote job entry system. Commun. ACM 16, 314-322.
Gaskins, R. (1973). Haiku are like trolleys (there'll be another one along in a moment). In "Computer Poems" (R. W. Bailey, ed.), pp. 16-19. Potagannissing Press, Drummond Island, Michigan.
Green, D. C. (1971). Formulas and syntax in Old English poetry: A computer study. Comput. Humanities 6, 85-93.
Ingram, W., and Swaim, K. (1972). "A Concordance to Milton's English Poetry." Oxford Univ. Press, London and New York.
Kahn, E. (1973a). Finite state models of plot complexity. Poetics 9, 5-20.
Kahn, E. (1973b). Algebraic analysis for narrative (unpublished).
Kilgannon, P. (1973). In "Computer Poems" (R. W. Bailey, ed.), pp. 22-31. Potagannissing Press, Drummond Island, Michigan.
Milic, L. T. (1971). On the possible usefulness of poetry generation. In "The Computer in Literary and Linguistic Research" (R. A. Wisbey, ed.), p. 170. Cambridge Univ. Press, London and New York.
Milic, L. T. (1973). In "Computer Poems" (R. W. Bailey, ed.), pp. 37-40. Potagannissing Press, Drummond Island, Michigan.
Misek, L. (1972). "Context Concordance to John Milton's 'Paradise Lost'." Andrew R. Jennings Computing Center, Case Western Reserve University, Cleveland, Ohio.
Parrish, S. M., and Painter, J. A. (1963). "A Concordance to the Poems of W. B. Yeats," p. 477. Cornell Univ. Press, Ithaca.
Peavler, J. M. (1974). Analysis of corpora of variations. Comput. Humanities 8, 153-159.
Raben, J. (1965). A computer-aided investigation of literary influence: Milton to Shelley. In "Literary Data Processing" (J. B. Bessinger et al., eds.), pp. 230-274. Materials Center, Modern Language Association, New York.
Ross, D., Jr., and Rasche, R. H. (1972). EYEBALL: A computer program for description of style. Comput. Humanities 6, 213-221.
Sainte-Marie, P., Robillard, P., and Bratley, P. (1973). An application of principal component analysis to the works of Molière. Comput. Humanities 7, 131-137.
Shinagel, M., ed. (1972). "A Concordance to the Poems of Jonathan Swift." Cornell Univ. Press, Ithaca, New York.
Smith, B. H. (1968). "Poetic Closure," p. 268. Univ. of Chicago Press, Chicago.
Smith, P. H., Jr. (1971). Concordances and word indexes. In "Literary Data Processing" (V. Dearing et al., eds.), IBM Publ. No. GE20-0383-0, pp. 14; 64-70. IBM, Yorktown Heights, New York.
Spevack, M. (1968-1970). "A Complete and Systematic Concordance to the Works of Shakespeare," 6 vols. Georg Olms, Hildesheim.
Widmann, R. L. (1971). The computer in historical collation: Use of the IBM 360/75 in collating multiple editions of A Midsummer Night's Dream. In "The Computer in Literary and Linguistic Research" (R. A. Wisbey, ed.), p. 57. Cambridge Univ. Press, London and New York.
Zadeh, L. A. (1965). Fuzzy sets. Inform. Contr. 8, 338-353.
Zadeh, L. A. (1972). "Outline of a New Approach to the Analysis of Complex Systems and Decision Processes," Electron. Res. Lab. Memo ERL-M342. University of California, Berkeley.

BIBLIOGRAPHY

The fine bibliographies which have appeared in Computers and the Humanities scarcely need reproduction here; those who wish to see a list of "everything that has been thought and said" can consult those bibliographies. Rather, I would like to indicate items for those interested in reading more on the topics of poetry generation and analysis. The items given here sometimes repeat items from the Reference List, but not all items cited in the article are given here, just those which would help a reader understand more in depth the variety of computer applications to poetry (and prose).

Bailey, R. W., ed. (1973). "Computer Poems." Potagannissing Press, Drummond Island, Michigan. Available from the Editor for $2.25 postpaid, 1609 Cambridge Road, Ann Arbor, Michigan 48104. Good collection of computer-produced or inspired poems.
Bessinger, J. B., Parrish, S. M., and Arder, H. F., eds. (1965). "Literary Data Processing Conference Proceedings." Materials Center, Modern Language Association, 4 Washington Place, New York, New York 10003. Good collection of papers illustrating various literary applications.
Dearing, V., Kay, M., Raben, J., and Smith, P. H., eds. (1971). "Literary Data Processing," IBM Publ. No. GE20-0383. IBM, Yorktown Heights, New York. A good nontechnical introduction to the computer as a tool in natural language research.
Doležel, L., and Bailey, R. W. (1969). "Statistics and Style." Amer. Elsevier, New York. Collection of articles concerning the application of mathematical models and statistical techniques to the study of literary style, not all studies computer-related.
Leed, J. (1966). "The Computer and Literary Style." Kent State Univ. Press, Kent, Ohio. Collection of papers reporting computer-assisted investigations of literary style.
Mitchell, J. L., ed. (1974). "Proceedings of the International Conference on Computers in the Humanities." Univ. of Edinburgh Press, Edinburgh (in press). Selected papers of the conference held July 20-22, 1973, at the University of Minnesota.
Reichardt, J., ed. (1969). "Cybernetic Serendipity." Praeger, New York. Interesting collection of computer-produced art.
Sedelow, S. Y. (1970). The computer in the humanities and fine arts. Comput. Surv. 2, 89-110. An overview of the roles the computer plays in art, architecture, music, literature, and language.
Wisbey, R. A., ed. (1971). "The Computer in Literary and Linguistic Research." Cambridge Univ. Press, London and New York. Covers applications to lexicography, textual editing, vocabulary studies, stylistic analysis, and language learning.
Principal Journals
Bulletin of the Association for Literary and Linguistic Computing. Quarterly. A new journal based in England.
Computers and the Humanities (J. Raben, ed.). Five issues a year. Devoted to the use of computers in the humanities. Articles range from surveys of developments to fairly technical applications. Contains an Annual Survey of Recent Developments, an Annual Bibliography of studies in the humanities using a computer, and, twice a year, a Directory of Scholars Active, which describes ongoing projects.
Computer Studies in the Humanities and Verbal Behavior (S. Y. Sedelow, ed.). Quarterly. Much more exclusively concerned with language than Computers and the Humanities, and the articles strike me as much more technical in nature.
Mapping and Computers
PATRICIA FULTON
U.S. Geological Survey
12201 Sunrise Valley Drive
Reston, Virginia
1. Introduction . . . 73
2. History . . . 74
3. What Is a Map? . . . 76
4. The Earth Ellipsoid . . . 77
5. The Geoid . . . 78
6. Geodetic Datum . . . 79
7. Geodetic Surveys . . . 80
   7.1 Horizontal Networks . . . 80
   7.2 Vertical Networks . . . 85
   7.3 Early Computations . . . 86
8. Satellite Geodesy . . . 87
9. Photogrammetry . . . 89
10. Projections . . . 92
11. Cartography . . . 98
12. Data Banks . . . 102
13. Future Trends . . . 103
14. Conclusions . . . 105
References . . . 106
1. Introduction
Computers have become such an integral part of the mapping process that it is now virtually impossible to consider mapping without computers. This is true of all areas involved: production of conventional maps, research on new applications, data processing for the auxiliary systems, and nontechnical or administrative utilization. Although this has happened within the span of a few years, this union of computers and mapping is deep rooted. It is a point of pride within the mapping agencies that the earliest use of computers was not only accepted but actually encouraged and promoted by their enthusiastic adoption of what was then an original and startling innovation.
2. History
The first map used by man unquestionably predates all written records.
It was in all likelihood a line drawing made with a sharp stick in soft sand or earth. It undoubtedly gave general information about major landmarks such as mountains and rivers and specific details on dry caves or the salt pans valued by a primitive tribe. As mankind evolved, so did his mapmaking skills, and this same information was conveyed in a more permanent form by being scratched or marked in some way on animal hides, clay tablets, and stone. Judging from some of the maps still in existence today which were made by primitive tribes, it can be assumed that there were symbols for different kinds of plants, animals, seasons of the year, or times of the month. These graphics are rather stylized and can be learned with very little effort. In fact, there is not too much difference between the basic idea of those very early primitive maps and the maps of today.

Maps are more important and more widely used today than they have ever been in the history of mankind. As man's social structure grew more complex, his need for information increased correspondingly. A part of this expanded need was the need for better and more precise maps. The ancient Egyptian civilization which grew up along the banks of the Nile is a prime example of the way this interrelationship grew. The yearly inundation that renewed the fertility of the fields also washed away the markers that identified the ownership of the fields and enabled the tax collectors to procure their revenue. And so, of necessity, surveying was born.

The Egyptians, however, were not the only ancient people who contributed to mapmaking. The various nations who occupied early Mesopotamia made a very substantial contribution to mapping. The Chaldeans, with their great interest in astrology, divided the circle into 360° and simultaneously established the sexagesimal system, which we still use today in mapping. Of equal or perhaps even greater importance, the old Sumerians used the same small set of symbols for all the numbers but indicated the exact value by position. This is essentially the same system in use today for computations with arabic numerals. These people also knew the so-called Pythagorean theorem, not only for special cases, but in general.

Pythagoras, born in approximately 582 B.C., considered the world to be a globe. Aristotle also argued that the earth was, of necessity, spherical. Eratosthenes (276-195 B.C.) did something practical about the theory and conducted experiments from which he derived the circumference of
He made use of the fact that in Syene, in upper Egypt, the rays of the sun shone vertically into a well at the summer solstice, while at the same moment they fell at an angle in Alexandria, Egypt. Measuring the angle and the distance between the towns, Eratosthenes arrived at a figure remarkably close to the circumference known today. His figure was too large by only about 16%.

After Columbus discovered the new world and Magellan circumnavigated the globe, there was no longer any question about the shape of the earth. In general, everyone agreed that it was spherical. It was not long, however, until men were attempting to define the exact kind of a sphere upon which they lived. In fact, by the 17th and 18th centuries there was a very lively dispute between the English and the French. Was the earth prolate, flattened at the equator, or was it oblate, flattened at the poles? The argument was settled in the 1700's by the French Academy of Sciences, which sent investigative expeditions to South America and to Lapland. The measurements made by these expeditions proved that the earth was an oblate spheroid.

In 1849 Aimé Laussedat, a French Army officer in the Corps of Engineers, decided that photography could be used to create maps. He worked for years on this project, and when he was finally done, in the late 1860's or early 1870's, he had defined many of the basic principles that still hold true today. The first application by Americans was probably during the Civil War, when balloons were sent up with cameras to photograph enemy positions. For the most part, balloons have since been replaced by airplanes as the vehicle for aerial cameras; however, a few balloons, which have the combined properties of a kite and a balloon, are still used for this task.

The history of computers as an everyday tool for mapping undoubtedly began with J. Presper Eckert and John W. Mauchly at the University of Pennsylvania, where they worked on the prototype of the electronic computer. This was the ENIAC (Electronic Numerical Integrator and Computer). It was funded through the Ballistics Research Laboratory and the U.S. Army. It took 30 months to complete, and when it was finished it was the only electronic computer in the world. It consisted of 47 panels, each 9 feet high, 2 feet wide, and 1 foot thick. The word size was 10 digits plus a sign. Input and output were chiefly via punched cards, although switches could be set and light displays could be read. Programs were hard wired into it. A stored program memory was later contributed by John von Neumann. Soon a second machine appeared, the EDVAC (Electronic Discrete Variable Computer). Eckert and Mauchly continued developing computers, including the Universal Automatic Computer, or UNIVAC. In time, this machine became available
commercially and was shipped to waiting customers. The first unit went to the U.S. Census Bureau and the second to the U.S. Army Map Service. The UNIVAC retained many of the features of its ancestor, the ENIAC. It was composed of a multitude of resistors, capacitors, and vacuum tubes. The memory was approximately 6 feet long, 5 feet wide, and 5 feet high inside. Repairs were made by simply walking inside the memory and replacing the necessary parts; it was actually a small room with its own door. There was a high console with flashing lights and a bank of 6 to 8 magnetic tape units for auxiliary storage. Input was from cards (round punches) and output on a printer. For most of its life the UNIVAC worked almost 24 hours a day on three shifts. By the time it was retired in 1962, it had made an indelible and irreversible impact on mapping. By that time it seemed that every government agency and every big laboratory of private industry had at least one large computer and dozens of smaller ones. The names of some of these computers have become household words, especially IBM; others, like the BRLESC at the Aberdeen Proving Ground, were known only to a handful of users.
3. What Is a Map?

A map is a graphic representation of a part of the earth's surface. It is also a selective representation, in that some features are emphasized to serve a particular purpose. The familiar road map is a ubiquitous example. There are many types of derivative product maps used by geographers and demographers, for example, maps that show population density by clusters of little black dots, or livestock production by pictures of cows and sheep. Such maps can illustrate a situation or condition that might otherwise be awkward to describe. Weather maps show storm centers, isobars, and isotherms for certain areas. Hydrographic maps are concerned with water: the courses of streams and rivers, and the outlines of lakes and ponds. Nautical charts delineate underwater features as well as shoreline details. These charts may show bays and harbors so that ships and boats may travel safely without going aground. They may show contours on the ocean's floor which depict mountain ranges and volcanoes that are still submerged. They may outline the oyster beds in Chesapeake Bay, or they may trace the routes of the Chinook salmon on its way to the spawning grounds in the Northwest.

One of the most widely used maps is the topographic map. A topographic map is also a graphic of the earth's surface, but it must be plotted to a definite scale. It represents the slope and character of the terrain; that is, it shows planimetry and terrain relief. Planimetric features are
roads, houses, the outlines of fields, and city boundaries. Terrain relief is depicted by contour lines with labeled elevations and spot elevations.
4. The Earth Ellipsoid
The best way to really appreciate the all-pervasive influence of computers on mapping is to look at each phase of the mapping process. It all begins when people need to examine part of the earth's surface in detail. The steps should be considered in the natural order of their occurrence, starting with a look at the earth itself.

The earth can be considered an ellipsoid of revolution, a figure obtained by rotating an ellipse about its shorter axis. The flattening is the amount by which the ellipsoid differs from a sphere. Thus, two dimensions will uniquely define an ellipsoid of revolution; by traditional usage, the semimajor axis and the flattening serve this purpose. An ellipsoid of revolution is an ideal mathematical figure. Many of the relationships can be expressed by equations in closed form, which facilitates the computations. The shape of the earth, as everyone is well aware, is far from this ideal figure. Yet to be able to compute distances and directions with any fidelity and still be manageable, the computations must be reduced to a figure like the ellipsoid. This fact is a primary and basic reason for the merging of computers into the mapmaking procedures. In earlier years, the simple expedient of using different ellipsoids for different parts of the earth served the purpose superbly well. Thus, for any one area of the earth, its chosen ellipsoid fits, computationally and practically, better than any other method so far devised. Let it now be emphasized that when computations are discussed, they are made on the ellipsoid, the idealized geometric figure of the earth. The following is a list of some of these ellipsoids and the areas in which they are used:

    Ellipsoid                       Area
    Clarke 1866                     North America
    Brazil                          South America east of the Andes
    Modified Airy                   Ireland
    Airy (British)                  Great Britain
    Clarke 1880                     Africa south of the Sahara
    World Geodetic System 1966      Soviet Union
    Everest                         India
    International                   Pacific and Atlantic Oceans
As can be seen from the dates, some of these ellipsoids were determined
a century ago. The methods used were quite primitive by modern standards. The newest methods only date from the 1950’s.
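Because an ellipsoid of revolution is fixed by just two constants, every other dimension follows from them by elementary formulas. A minimal sketch in Python (a modern notation, not the article's; the Clarke 1866 constants are standard published values):

```python
# Deriving the remaining dimensions of a reference ellipsoid from its two
# defining constants, the semimajor axis a and the flattening f.
import math

a = 6378206.4        # semimajor axis in meters (Clarke 1866)
f = 1 / 294.9787     # flattening (Clarke 1866)

b = a * (1 - f)      # semiminor axis
e2 = f * (2 - f)     # square of the first eccentricity

print(f"semiminor axis b = {b:.1f} m")          # about 6356583.8 m
print(f"first eccentricity e = {math.sqrt(e2):.6f}")
```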
5. The Geoid
As everyone knows from their own observations, the earth is not at all an idealized geometric solid. A map, to merit its name under the present definition, must relate to ground truth in some measurable dimensions. Thus, another surface must be defined and utilized; it is denoted as the geoid. The geoid is often likened to the mean sea-level surface. This means that if the oceans were free to flow over the entire surface of the earth and to adjust to the effects of gravity and centrifugal force, their resultant shape would be that of the geoid. It is also known that the earth's mass is not evenly distributed. For this reason, the surface of the geoid itself is irregular. When outlines of the ellipsoid and the geoid are superimposed one upon the other, there are many discrepancies. These are termed geoid separations, geoid heights, or geoid undulations. More precisely, the geoid is an equipotential surface: the gravity potential is everywhere equal on it, and the direction of gravity is always perpendicular to the surface. This last is particularly important in the positioning of measuring instruments; that is, the vertical axes of the instruments correspond to the direction of gravity. The difference between the perpendicular to the ellipsoid and the perpendicular to the geoid (or the commonly known plumb line) is measured by an angle called the deflection of the vertical.

FIG. 1. Relationship of ellipsoid and geoid.
The science of geodesy is the study of this figure, the geoid.
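The deflection of the vertical is conventionally split into a meridian component and a prime-vertical component, found by comparing astronomic coordinates (referred to the plumb line) with geodetic coordinates (referred to the ellipsoid). A minimal sketch, with invented station values:

```python
# Components of the deflection of the vertical: the angle between the
# perpendicular to the geoid (plumb line) and the perpendicular to the
# ellipsoid, split into a meridian part (xi) and a prime-vertical part (eta).
import math

phi_astro, lam_astro = 38.991403, -76.943275  # astronomic lat/long, degrees
phi_geod, lam_geod = 38.990000, -76.945000    # geodetic lat/long, degrees

xi = (phi_astro - phi_geod) * 3600.0          # meridian component, arc seconds
eta = (lam_astro - lam_geod) * 3600.0 * math.cos(math.radians(phi_geod))
total = math.hypot(xi, eta)                   # total deflection, arc seconds

print(f'xi = {xi:.2f}", eta = {eta:.2f}", total deflection = {total:.2f}"')
```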
6. Geodetic Datum
The horizontal datum is composed of the latitude and longitude of the initial point or origin of the datum and the geoid separation at that point. The datum also includes the azimuth, the angle of horizontal deviation measured clockwise from a standard direction (i.e., north or south), and the parameters (usually the radius and flattening) of the particular ellipsoid consigned to that datum. By this definition, the measurements are consistent within the system of any one datum. This means further that a datum is, by definition, a reference or base for other geodetic measurements.

The historical method of establishing a geodetic datum was to select a first-order triangulation station, preferably near the center of the network, and to designate this as the origin. Then the astronomical coordinates and azimuth at the datum origin were derived. By this method, the geoid and ellipsoid were defined as being coincident at the origin point. Or, to restate it, at the origin, the deflection of the vertical and the separation between the ellipsoid and the geoid were zero. Another way of describing this is to say that the plumb line at the origin was normal to both the geoid and the ellipsoid. In point of fact, this is not the case at all: neither the real deflection nor the undulation is usually zero. At any rate, the final situation is one where the points within any one system are correct with respect to each other. Furthermore, although the deflection and the separation could be zero at the origin, this is not the situation at other stations in the network. In fact it is quite possible, and happens more frequently than not, that rather large discrepancies appear between the geodetic latitude and longitude and the corresponding astronomical latitude and longitude of a point. It was quite impossible to handle these problems except by approximations.

Because these differences can become unmanageable in certain situations, a second type of datum orientation, called the astrogeodetic orientation, has been developed. This second method is possible only because of the introduction of computers. In this type of orientation, the deflection of the vertical is not arbitrarily set to zero. Instead, it is the quantity computed from the least squares solution applied to the astronomic station observations within the network. By this method, the discrepancies of all stations are minimized. That is, when computers do the processing,
the actual observations are reduced mathematically to give a reasonable and true model of the earth. Thus, geodesists no longer need to rely on a collage of approximations to estimate the earth's surface.
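The computational core of such an adjustment is an ordinary least-squares solution of an overdetermined system: many redundant observations, few unknowns, and the answer that minimizes the sum of squared residuals. A toy sketch of the mechanics (all numbers invented, and NumPy standing in for the adjustment programs of the era):

```python
# Least-squares adjustment in miniature: four noisy observations of
# linear combinations of two unknowns, solved for the best-fitting values.
import numpy as np

A = np.array([[1.0,  0.0],     # design matrix: one row per observation
              [0.0,  1.0],
              [1.0,  1.0],
              [1.0, -1.0]])
obs = np.array([2.01, 0.98, 3.05, 1.03])   # observed values with noise

x, residuals, rank, _ = np.linalg.lstsq(A, obs, rcond=None)
print("adjusted unknowns:", x)             # close to (2, 1)
```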
7. Geodetic Surveys

Surveys are the procedures for making measurements on the surface of the earth. Thus, any description of surveys should begin with an explanation of the different types of surveying that are carried out at present. The logical type to start with is geodetic surveying, for two reasons: (1) this is the type of surveying defined by law as the task of federal mapping agencies; and (2) perhaps more pertinently for this article, this is the first type of surveying that utilized the computer.

Geodetic surveying is the process of making measurements for the determination of the size and shape of the earth. The positions of points determined by these methods are of such a high degree of accuracy that they provide the basis for other surveys. In the United States, as in most countries, the government establishes the rules and regulations for maps. The responsibility for establishing geodetic control networks throughout the United States is charged to that branch of the National Oceanic and Atmospheric Administration (NOAA) which was formerly known as the Coast and Geodetic Survey and which is now known as the National Geodetic Survey (NGS).

Control surveys are of two main types: those concerned with horizontal positioning, and those concerned with leveling or the elevation of points. Horizontal survey control networks themselves can be established in any one of several ways. The preparation for surveying any control network requires a great deal of forethought and many technical decisions. The higher the order of control, the greater the accuracy demanded of it. Thus, first-order surveying necessitates the most care and attention. First-order triangulation must provide point positions which are accurate to one part in 1,000,000 in distance and orientation. A major first-order geodetic network spanning continents will, of course, cross all sorts of international political boundaries. The attendant problems are magnified and complicated in direct proportion to the size of the network.

7.1 Horizontal Networks
7.1.1 Astronomic Stations
The positions at which astronomic measurements are taken are called Laplace stations. Their purpose is to tie the geodetic networks together and fix them at specific points on the surface of the earth. These
measurements are astronomic latitude and longitude. The observations are made with optical instruments which are positioned perpendicular to the geoid; that is, the vertical axis of the instrument is coincident with the direction of gravity.

For geodetic work, astronomic longitude is the difference in time measured between the moment a specific star is directly over the Greenwich meridian and the moment the same star is directly over the station. The most accurate chronometers available are carried along to measure the time at the points in the network. The exact time at which a star is over the prime or Greenwich meridian can be found in the star catalogs, of which there are several. Then the difference in time, or astronomic longitude, can be determined. These catalogs have been generated by computer programs, and the computed results are available both in printout form and on magnetic tape.

Astronomic latitude is the angle between the perpendicular to the geoid and the plane of the equator. Geodesists who define and compute such parameters are concerned with the exact position of the North Pole (the axis of rotation) at the time of measurement. The situation is aggravated by the fact that the North Pole does not stay in the same spot, but roams around in the area contiguous to the point generally considered to be the North Pole.
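Since the earth turns through 15° per hour, the timed delay between the two star transits described above converts directly to longitude. A small illustrative sketch (the transit times are invented):

```python
# Astronomic longitude from timed star transits: the delay between the star
# crossing the Greenwich meridian and crossing the station's meridian,
# multiplied by the earth's rotation rate of 15 degrees per hour.
greenwich_transit = 2 * 3600 + 14 * 60 + 5.0    # seconds past a common epoch
station_transit = 7 * 3600 + 22 * 60 + 17.0

delay_hours = (station_transit - greenwich_transit) / 3600.0
longitude_west = delay_hours * 15.0             # degrees west of Greenwich
print(f"astronomic longitude: {longitude_west:.4f} degrees west")
```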
7.1.2 Triangulation

The triangulation method is probably the oldest type of survey. It depends upon the geometric proposition that if one side of a triangle and all the angles are known, or in this case measured, the values for the remaining sides can be computed. When this process is extended by adding more triangles to form a chain, the area to be surveyed can be covered by a network of these triangles. It should be emphasized that the measurement of an angle at a point will often include the azimuth, the direction of that point relative to the North Pole. When the included side and two angles of the triangle are known, the solution of the triangle is known. However, for triangulation, all three angles of the triangle are usually measured; multiple measurements are recorded to provide the most probable value. The stations of these triangles must be intervisible to the instruments taking measurements. Many times, determination of all three points by direct observation is not feasible; of course, with two angles and the included side, the third angle can be computed, and this is frequently done. The initial measurement of length in a triangulation network is known as the base line. From the earliest days of surveying until quite recently, it has always been easier and more economical to obtain accurate angular measurements than accurate linear measurements, hence the popularity of triangulation.
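The underlying computation for each triangle is the law of sines: with the base line and two measured angles, the remaining sides follow. A minimal sketch, with invented field values:

```python
# Solving one triangle of a triangulation net with the law of sines.
import math

baseline = 12_345.6                              # measured side c, in meters
A, B = math.radians(58.2), math.radians(61.7)    # measured angles
C = math.pi - A - B                              # third angle from the angle sum

side_a = baseline * math.sin(A) / math.sin(C)    # side opposite angle A
side_b = baseline * math.sin(B) / math.sin(C)    # side opposite angle B
print(f"computed sides: {side_a:.1f} m, {side_b:.1f} m")
```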
FIG. 2. Triangulation network. A, north base; B, south base; AB, base line. Preliminary high-precision data: length of base line AB; latitude and longitude of points A and B; and azimuth of line AB. Measured data: angles to new control points. Computed data: latitude and longitude of new points; and length and azimuth of new lines.
In flat terrain the requirement that the vertices of a given triangle be intervisible is met by building towers. The instruments are then placed on the tops of these towers, and the requisite measurements made. At present, this practice is seldom used for first-order surveys, and for the most part is reserved for second- and third-order surveys.

7.1.3 Trilateration
Another method used for the higher order surveys is called trilateration.
It also is expanded in the form of a network (Fig. 3). By this method, the angular values are obtained after measuring all the distances and then solving by the law of cosines. Multiple distance measurements are made at each station to provide the necessary accuracy and precision. Trilateration was not especially practical until the development of electronic distance-measuring equipment. Now it is considered to be the equal of triangulation; some of its proponents claim it even surpasses triangulation.

FIG. 3. Trilateration network. A, north base; B, south base; AB, base line. Preliminary high-precision data: length of line AB; latitude and longitude of points A and B; and azimuth of line AB. Measured data: length of each line. Computed data: latitude and longitude of new points; and length and azimuth between new points.
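Here the triangle solution runs in the opposite direction from triangulation: all three sides are measured, and the law of cosines recovers the angles. A minimal sketch, with invented distances:

```python
# Recovering the angles of a trilateration triangle from measured sides.
import math

a, b, c = 8_210.4, 9_875.2, 11_440.7   # measured sides, in meters

def angle_opposite(opposite, s1, s2):
    """Angle facing side `opposite`, enclosed by sides s1 and s2."""
    return math.degrees(math.acos((s1**2 + s2**2 - opposite**2) / (2 * s1 * s2)))

A = angle_opposite(a, b, c)
B = angle_opposite(b, a, c)
C = angle_opposite(c, a, b)
print(f"angles: {A:.4f}, {B:.4f}, {C:.4f} degrees (sum {A + B + C:.4f})")
```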
7.1.4 Traverse

A traverse is the easiest way of extending a control network. The procedure starts at a known point with a known azimuth (direction) to
another point. Then the surveyor measures the angles and distances between the points of this network. The direction of each line of the traverse is computed from the angular measurement. The position of each control point is then computed from the length measurement of the line. When the first station and the last station are coincident, it is called a closed traverse; the other type is the open traverse (Fig. 4). Until recently, a traverse was considered only good enough for secondary networks, and because it was an economical method, it was used extensively. However, where the new electronic distance-measuring instruments are used, a traverse can be as accurate as triangulation. In fact, the interior of Australia was measured by this means.

FIG. 4. Diagram of a traverse. The given traverse is closed. If it extended only to point C, either north or south of the dotted line, it would be an open traverse. A, north base; B, south base; AB, base line. Preliminary data: latitude and longitude of point B; and azimuth of line AB. Measured data: length of the lines; and angles between the lines. Computed data: latitude and longitude of new points; and length and azimuth of new lines.

So far, the surveying methods mentioned have been used exclusively for horizontal control. These have been triangulation, trilateration, and traversing. All the measurements have been made on the apparent or topographic surface of the earth. Note that this is the third separate and distinct surface that is involved.
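The traverse computation itself is a simple propagation: each leg's azimuth follows from the previous azimuth and the angle turned at the station, and each new position from that azimuth and the measured distance. A minimal open-traverse sketch, with invented field values:

```python
# Propagating an open traverse in plane coordinates.
import math

x, y = 1000.0, 5000.0          # starting coordinates, meters
azimuth = 45.0                 # starting azimuth, degrees clockwise from north

legs = [(121.35, 310.2), (-95.10, 244.8), (42.75, 402.1)]  # (angle turned, distance)
for turned, dist in legs:
    azimuth = (azimuth + turned) % 360.0
    x += dist * math.sin(math.radians(azimuth))   # east component
    y += dist * math.cos(math.radians(azimuth))   # north component

print(f"final position: ({x:.2f}, {y:.2f}), final azimuth {azimuth:.2f} degrees")
```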
7.1.5 Instruments

The instruments upon which the measurements are made are a vital part of the process and so require some description. The instruments used for measuring have changed drastically within the past few years. Perhaps the best known of the earlier instruments is the theodolite (Fig. 5), an optical tool which is really a high-precision telescope with a crosshair
for sighting and scales for determining horizontal and vertical angles. On a theodolite, provisions have been made for the exact leveling of the axes of the telescope. Accurate horizontal angular readings of a graduated circle can also be made. The instruments are usually mounted on the familiar tripods seen along the roadside before the start of a housing development or at the construction site of a large shopping center.

Some of the most successful examples of new tools are the electronic distance-measuring devices. One popular type, the geodimeter, emits pulses of ordinary light at controlled frequencies. Another, the tellurometer, uses radar to measure slant distances. More recently, laser beams have been used successfully. With this electronic equipment, greater distances than ever before can be measured; in fact, continent-spanning networks are in operation.

The first computations done on measurements taken in the field are done on site to screen out gross errors. To their undying credit, the surveyors have always acknowledged the possibility of errors. In fact, they attempt to pinpoint the possible sources of error and exert great care to control them. Among the situations cited in the instrument instruction manuals are details covering possible errors in sighting, mensuration, and instrument settings. All this care is necessary because an error in any part of a control network is propagated throughout the entire network.
FIG. 5. The theodolite, a high-precision telescope.

7.2 Vertical Networks
Just as there are several types of horizontal networks, there are also several types of vertical control networks. The one generally considered to be the most accurate is established by differential leveling. In this sort of network, two calibrated rods are held vertically at different locations along a planned route, and readings are made with an optical instrument positioned between them (Fig. 6). The reading is the difference in elevation between the points. As the optical instrument (telescope) is leveled by means of a bubble, gravity affects the instrument; therefore, the telescope and bubble are parallel to the geoid.

FIG. 6. A level rod being used in a field survey.

A second type of leveling is called trigonometric leveling. This is accomplished by using a theodolite or similar instrument to measure a vertical angle between two points having a known distance between them. The elevation of the desired point can then be computed. By this method, both horizontal control and vertical control can be established at the same time on the same network. Although this is a much more economical method, it is less accurate than differential leveling. In actual practice the high-order horizontal and vertical control networks are independent and separate one from the other.

Barometric leveling is the third type used. The differences in atmospheric pressure at the various elevation control stations are measured. These measurements, together with air temperature, are used to determine the elevations of various other points. The accuracy of barometric leveling is less than that of the other two methods. It is used a great deal, however, in preliminary surveys where later and more accurate measurements will be made by either trigonometric or differential leveling.

Just as there are horizontal-control networks and vertical-control networks, there are also horizontal and vertical datums.
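Reducing a line of differential levels is a running sum: at each instrument setup, the backsight reading minus the foresight reading gives the rise or fall to the new point. A minimal sketch with invented readings:

```python
# Differential leveling as a running sum of backsight minus foresight.
bench_mark = 125.432                   # starting elevation, meters
readings = [(1.402, 0.988), (1.115, 1.730), (0.842, 1.206)]  # (backsight, foresight)

elevation = bench_mark
for backsight, foresight in readings:
    elevation += backsight - foresight  # rise or fall for this setup

print(f"elevation of final point: {elevation:.3f} m")
```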
7.3 Early Computations
These survey networks were solved triangle by triangle, generally in the shape of a quadrilateral. There were printed forms for each phase of the computations with spaces for input, intermediate answers, and final results. Every step had to be done by two different mathematicians as
a check on the accuracy. Angular measurements in degrees, minutes, and seconds were laboriously converted to trigonometric functions by tables and interpolation. The reverse process was carried out for angular results. The task was performed by rooms full of mathematicians, each with a mechanical calculator on the desk. Even obtaining a square root was a time-consuming process requiring repetitive entries on the calculators. It took years before final adjusted values could be assigned to positions in only moderate-sized networks. It is no wonder that electronic computers with stored program capability were welcomed so eagerly. The mathematicians were at last free to solve problems in ways more compatible with the physical situation.

Consider the conditions for a geodetic survey network. The astronomic positions are recorded at Laplace stations and referenced to the geoid. Triangulation and/or trilateration measurements are taken on the topographic surface. Remember that all the measurements must be reduced to the ellipsoid for the computations. These observations comprise different combinations of known and unknown variables, sides, and angles, with repeated measurements. All these equations require a simultaneous solution involving thousands of measurements. Needless to say, matrix operations of this magnitude are still a formidable task even with the third- and fourth-generation computers of today. Now, however, by means of computers the solutions to the true situations are being achieved. Rigorous mathematical formulations are applied to the real observations for the earth in its entirety. These yield accurate worldwide solutions to replace the unsatisfying approximations, limited in area, of the precomputer years. Even now, programs are being readied for reducing the measurements of the latest high-precision traverse. A search is underway for the biggest and fastest computer available, and even this will probably not be completely adequate to the task. The detailed formulation and development of the requisite mathematics will be found in the literature cited in the bibliography.
8. Satellite Geodesy
The use of artificial earth satellites as research vehicles for obtaining geodetic information was promulgated in the 1950’s. The earliest versions were primitive modifications of the traditional triangulation and trilateration. These rapidly increased in sophistication and in the quality of the results. One of the most successful systems is a worldwide geocentric model with stations so distantly spaced that they are nonintervisible, yet most satellite observations are three-station events. When a satellite
is observed from the ends of a base line, the rays form a plane in space. The reasoning is that the spatial orientation of this plane can be determined from the measured direction cosines of the two rays. The direction of the base line can be computed as the line in which two such planes containing the base line intersect. When three stations form a triangle, five such planes are necessary and sufficient for a unique solution. All the triangles must have geometric strength. The final positions for points on the earth are expressed in a geocentric Cartesian coordinate system in three dimensions, a model of elegant simplicity.

However, consider the not-so-simple details. The practical application depends upon the successful development of suitable telemetry and advanced tracking algorithms. To an even greater degree, however, it depends upon high-speed computers to monitor the actual orbit and to predict the path of future orbits. Orbit-prediction programs are based on the extremely complicated equations of dynamic astronomy for the gravitational potential of the earth. Nearly always written in FORTRAN, they take years to become fully operational. Even on today's big computers they take hours to run. Once the programming is done, the most advantageous orbits are devised, and the satellites are shot into space (see Fig. 7).

The stellar cameras and all the other equipment require calibration before use. This demands repeated measurements fed as input to computer programs. The satellites are photographed against a star background, and the star positions are also used in the computations of the satellite position by referencing the star catalogs. All this computer processing occurs before the recording of the first image. After the data are acquired, as many as 3500 measurements are made on each photograph. The data reduction requires computer programs which reference the star catalogs, adjust the data for instrument deviations, transform them to the same reference system, and finally compute the most probable positions. It requires many years of programming and computer years of processing to perform all these tasks properly, but the results provide positional accuracy never before achieved.

Several other techniques are particularly appropriate for satellite geodesy. Some electronic systems measure distances by means of high-frequency signals, as described in Section 7.1.5. Others measure positions by utilizing the Doppler effect; this method takes advantage of the fact that the rate of change in the frequency of a constant signal can be measured as the satellite approaches and recedes from the station. Optical systems use light flashed from a satellite for positioning information. All systems require massive programming efforts and hours of run time to reduce the data.
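The geometry lends itself to a short sketch: each observed event gives two rays whose cross product is the normal of a plane containing the base line, and the base line direction is the intersection of two such planes. A minimal illustration (the direction cosines are invented):

```python
# Base line direction from two satellite observation events: each event's
# two rays span a plane containing the base line; the intersection of two
# such planes (cross product of their normals) is the base line direction.
import numpy as np

def unit(v):
    return v / np.linalg.norm(v)

# direction cosines of the rays from the two base line ends, events 1 and 2
ray_a1, ray_b1 = unit(np.array([0.30, 0.40, 0.87])), unit(np.array([0.10, 0.45, 0.89]))
ray_a2, ray_b2 = unit(np.array([-0.20, 0.50, 0.84])), unit(np.array([-0.35, 0.42, 0.84]))

normal1 = np.cross(ray_a1, ray_b1)   # normal of the plane from event 1
normal2 = np.cross(ray_a2, ray_b2)   # normal of the plane from event 2
baseline_dir = unit(np.cross(normal1, normal2))
print("base line direction cosines:", np.round(baseline_dir, 4))
```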
FIG. 7. A satellite orbit, showing the earth, the perigee, the argument of perigee, and the inclination i.
9. Photogrammetry
Originally, photogrammetry started out as "the science or art of obtaining reliable measurements by means of photography." By the latest definition, photogrammetry is "the art, science, and technology of obtaining reliable information about physical objects and the environment through processes of recording, measuring, and interpreting photographic images and patterns of electromagnetic and acoustical radiant energy and magnetic phenomena."

In its simplest form, the first phase of a photogrammetric system is composed of a certain part of the earth which is to be photographed, a camera with its lens and film, and a vehicle to carry the camera aloft. The flight paths require good design for the same reasons as the survey networks. As a plane travels along the flight path, a series of photographs is shot which overlap in the direction of flight. The next path produces pictures which overlap those of the first path; optimum coverage of the area to be mapped is thus insured. Map scale and flight height are directly correlated: low-altitude photography captures the detail required for large-scale maps, while high-altitude photography incorporates more area per picture, as needed for small-scale maps.

The translation of the information on the photograph to digital form is the next step. To perform the necessary operations, an entire series of photogrammetric instruments has been specially designed and built. The comparators can measure the position of a point on a photograph to within plus or minus 1 or 2 micrometers. The stereocomparators permit viewing the images of two overlapping photographs called a stereo pair; when the operator of such an instrument looks through the eyepieces, a three-dimensional scene appears. There are also stereoplotters (Fig. 8) with a "floating mark" mechanism built into the instrument. Viewing the dot through the optical system, the operator can maneuver the dot until it appears to be on the surface of the earth. In this way it is possible to trace out a contour, a line of equal elevation on the surface of the earth. These contours can be traced onto paper or film at the same time by means of a movable arm with a marking device attached to the stereoplotter. For added stability during the measuring process, the image is often transferred from the photographic film to a glass plate; thus the terms "photo" and "plate" are used interchangeably. Before an instrument's initial use, and periodically thereafter, test data are run through calibration programs to maintain accuracy standards. The coordinates of the plate points are usually recorded on punched cards, paper tape, or magnetic tape at the same time that they are measured.

FIG. 8. A stereoplotter.

Analytical photogrammetry is the data-reduction technique.
The simplest case is the solution for a single photograph. Next in complexity is the solution for a stereo pair of photographs (Fig. 9), termed a single model. Two types of analytical photogrammetry are used at present: one is derived from the principle of coplanarity; the other makes use of the principle of collinearity.

In the coplanarity model, B represents the air base and p1 and p2 are a stereo pair of exposures on which point P appears (Fig. 10). By the illustrated geometry, the rays or vectors A1, A2, and B lie in the same plane. The inclusion of condition equations assures the intersection of A1 and A2. The ground coordinates of the points are not implicit within the model; for this reason, a solution can be computed only after additional equations supply these coordinates. This same principle was described in Section 8.

The principle of collinearity is shown in Fig. 11. This model is based on the fact that the image of a point, the exposure station, and the point itself on the ground all lie on the same straight line.
FIG. 9. Stereo pair of photographs.
FIG. 10. Diagram illustrating the coplanarity model.
The collinearity condition equations contain all the elements needed for a solution: image coordinates, ground coordinates, and camera orientation. In practice, the condition equations are linearized by use of a Taylor or Maclaurin series. There are six camera orientation parameters for each photograph and three coordinates for each point. The number of equations is then equal to twice the number of photo points. The known values and some approximations are put in matrix form. These are overdetermined systems and are then handled iteratively within a rigorous least-squares solution.

From the time the mathematical formulation is translated into computing algorithms until the programs become operational, several years may elapse. This is true for the single-camera and single-model solutions, for which the computer run time and core requirements are negligible. In the next step, these models are expanded to permit the solution of a strip of photographs. At this point, computer run time and core limitations become additional program parameters. When these strips are built up so as to form a block solution, the usual expedients of program overlays and auxiliary disk and/or tape storage become mandatory for even the big computers. Similarly, the run time increases to hours.

FIG. 11. Diagram illustrating the collinearity model.
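In computational form, the collinearity condition projects a ground point through the exposure station onto the image plane. A minimal sketch of one such projection (camera attitude, focal length, and all coordinates are invented; sign and rotation conventions vary between texts):

```python
# One evaluation of the collinearity equations: a ground point, the
# exposure station, and its image lie on one ray through the lens.
import numpy as np

def rotation(omega, phi, kappa):
    """Rotation matrix from the three camera orientation angles (radians)."""
    co, so = np.cos(omega), np.sin(omega)
    cp, sp = np.cos(phi), np.sin(phi)
    ck, sk = np.cos(kappa), np.sin(kappa)
    Rx = np.array([[1, 0, 0], [0, co, -so], [0, so, co]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rz = np.array([[ck, -sk, 0], [sk, ck, 0], [0, 0, 1]])
    return Rz @ Ry @ Rx

f = 0.152                                    # focal length, meters (6-inch camera)
camera = np.array([1000.0, 2000.0, 1500.0])  # exposure station X, Y, Z
ground = np.array([1120.0, 2080.0, 310.0])   # ground point X, Y, Z

u = rotation(0.01, -0.02, 0.05).T @ (ground - camera)  # vector in camera frame
x_img, y_img = -f * u[0] / u[2], -f * u[1] / u[2]      # collinearity equations
print(f"image coordinates: x = {x_img * 1000:.3f} mm, y = {y_img * 1000:.3f} mm")
```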
10. Projections

One thing that has remained unchanged over the years is the basic problem of mapping: to show the earth, a curved solid, on a
plane, the flat piece of paper which is the map, with measurable fidelity. The problem is exemplified by the classic case of peeling an orange or apple so that the skin or rind can be laid out flat. Of course, this cannot be achieved without some tearing or distortion of the fruit peeling. The same is true for mapping: the earth cannot be represented on a flat sheet of paper without some distortion. Because it is impossible to transfer points from a curved surface to a plane surface without some distortion, projections have been designed to control these distortions. The most popular projections in use today are either conformal or equal area. Conformal means that the angles are preserved; in particular, lines of latitude and longitude that are at right angles on the earth spheroid are also at right angles on the map.
The equal-area projections, just as the name implies, show areas on the map true in relationship to the areas on the spheroid. In most conformal projections, the actual correspondence between points on the ellipsoid and points on the map is defined by means of mathematical equations. In many projections it is extremely difficult, if at all possible, to describe geometrically and to depict graphically the method of projection used. One of the most satisfactory ways of formulating some of these projections is by means of complex variables. An informative introduction to projections is achieved by use of a sphere and conics, shown graphically. On many occasions, actual computations are carried out using the sphere in preference to the ellipsoid. Sometimes this approximation is the most practical method, considering accuracy requirements and the savings in time and money. This simplified version of a conic section and a sphere can illustrate several of the more efficacious projections. By this means, the geometric representation of the Mercator projection can be envisioned as a sphere surrounded by a cylinder which is tangent at the equator (Fig. 12).

FIG. 12. An example of a Mercator projection map.
FIG. 13. Map illustrating a transverse Mercator projection.
The transverse Mercator projection is popularly depicted as a cylinder which is tangent to the earth at a given meridian. This meridian is designated as the central meridian for a given zone; the whole earth generally is divided into zones of a few degrees (2° to 6°) each. With this relationship, the scale is true only along the central meridian. The closer a point approaches to the edge of the zone, the greater the error in scale that will be found in its position. This error is then corrected by the application of the proper scale factor. The scale error can be minimized by reducing the size of the cylinder and allowing it to sink into the sphere. Now, instead of the tangent condition along the central meridian, the cylinder actually cuts into the sphere, and the result is that there are two circles on the cylinder where the scale is true. At the central meridian the scale varies in the direction perpendicular to the meridian, but at the edges of the zone the scale error has been greatly diminished.
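The effect of the secant cylinder can be made concrete with a standard spherical approximation for the point scale factor, which grows with the square of the distance from the central meridian. A sketch using the UTM-style central scale factor of 0.9996 (the quadratic formula is a common approximation, not taken from this article):

```python
# Point scale factor across a secant transverse Mercator zone: k0 applies
# on the central meridian, and k passes back through 1 along two lines,
# one on each side of the meridian.
import math

R, k0 = 6_371_000.0, 0.9996    # mean earth radius (m), central scale factor

def scale(x):
    """Approximate scale factor at distance x (meters) from the central meridian."""
    return k0 * (1 + x**2 / (2 * R**2 * k0**2))

for x_km in (0, 90, 180, 270):
    print(f"x = {x_km:3d} km  ->  k = {scale(x_km * 1000):.6f}")
# the scale is true (k = 1) near x = R * sqrt(2 * k0 * (1 - k0)), about 180 km
```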
This approximates very well what is done in mapping today. On a map, a graticule is either a series of intersecting lines or a series of tick marks which represent the parallels and meridians of latitude and longitude. A grid is a rectangular system defined on the particular map projection; it too is composed of intersecting lines or tick marks.

The transverse Mercator projection (Fig. 13) is generally used for those political areas whose dimension is greatest north to south. It is used officially for those States of the United States with such configurations and in other parts of the world, such as Great Britain and Norway. The Universal Transverse Mercator (UTM) grid came into popular use after World War II; it is in official use in all of the departments of the Defense Mapping Agency.

The other projection used officially for parts of the United States is the Lambert conformal conic projection (Figs. 14 and 15).
FIG. 14. Diagram of sphere and conic projections. At the upper and lower standard parallels the scale is exact; between them lies an area of compression, where the scale is too small, and beyond them, toward the limits of the projection, are areas of stretching, where the scale is too large.
FIG. 15. An example of a Lambert conformal conic projection map.
It is especially appropriate for areas that are narrow in the latitudinal direction (north-south) and much wider in the longitudinal direction (east-west). By its definition, the parallels and meridians are, respectively, arcs of concentric circles and the radii of those concentric circles. Many aeronautical charts are based upon it, and it is the official projection of several countries in South America.

The transverse Mercator and the Lambert conformal conic projections provide the legally sanctioned state plane grid systems. They are used for those latitudes between 80°N and 80°S; for the regions north of 80°N and south of 80°S the polar stereographic projection is generally used. Because of their extensive use and mathematical definitions, projections, along with survey adjustments, were among the first map elements programmed for computers. As each new generation of computers emerges, these essential programs are modified and entered into the systems.
Many are characterized by equations in closed form; others must be solved by means of series expansions. Going from the ellipsoid to the plane is called the direct solution, and from the plane to the ellipsoid, the inverse.
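As an illustration of a direct and inverse pair, the spherical Mercator projection has closed-form equations in both directions. A minimal sketch (a spherical earth is assumed, as the text notes is often acceptable):

```python
# Direct solution (latitude/longitude to plane coordinates) and inverse
# solution (plane coordinates back to latitude/longitude) for the
# spherical Mercator projection.
import math

R = 6_371_000.0   # mean earth radius, meters

def direct(lat_deg, lon_deg):
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return R * lon, R * math.log(math.tan(math.pi / 4 + lat / 2))

def inverse(x, y):
    lat = math.degrees(2 * math.atan(math.exp(y / R)) - math.pi / 2)
    return lat, math.degrees(x / R)

x, y = direct(38.9, -77.0)
print(inverse(x, y))   # recovers (38.9, -77.0) up to rounding
```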
11. Cartography
Cartography is described as the production of maps, including design, projection, compilation, drafting, and the actual reproduction. By this definition, cartography may be considered the oldest branch of mapmaking; indeed, much of the hand work that goes into creating a map had its beginning hundreds of years ago. Handmade items are excessively expensive in today's marketplace. Even though the finished product may be an example of superb craftsmanship, the time and expense involved are quite often prohibitive and are the main reasons why the old methods of cartography are under close scrutiny today. As a result of the investigations of the past few years, some of the newest and most exciting changes in the whole mapping field are taking place in the particular area of cartography. These innovations come under the general heading of digital mapping and automated cartography.

Digital mapping and automated cartography have evolved only because of the existence of computers. The basic premise of computer-aided mapping (CAM) is that a map in digital form on magnetic tape is as truly a map as the conventional form with contours and drainage printed on a sheet of paper. The translation of the information contained on a photograph or a published map into digital form has become an active project in all government mapping agencies, and various techniques are now being used. The photogrammetric instruments described in Section 9 tabulated discrete points. Many of these comparators and plotters have been modified to record continuous strings of x, y, and z coordinates. Now, when an operator completes a plate, there is the conventional contour tracing on paper and, in addition, a three-dimensional digital terrain model recorded on magnetic tape. Another version of these stereoplotters does the same thing but in a slightly different manner: instead of traveling around the contours, the movement is directed back and forth across the photograph, and x, y, and z coordinates are recorded at predetermined intervals. That is, profiles of the earth's surface are traced, but again the final result is the three-dimensional digital terrain model. The intervals at which the data are recorded are generally specified in one of two ways: either after the recording instrument has traveled over a fixed distance or after the elapse of a set time period.
Another method of data acquisition is the manual digitization of maps (Fig. 16). This is accomplished on instruments of the drafting-table type. A cursor, a pointing tool, is positioned over a point, and the coordinates are recorded. The cursor also can be traced along linear features such as roads or contours, and so record the three-dimensional coordinates. Again, the collection intervals can be set for either distance or time mode.

The development of automatic line followers is another approach to map digitization. These photooptical instruments carry light sensors (usually diodes) on a framework very similar to that of an automatic plotter suspended over the manuscript. The line followers work quite well on single lines, even when the lines follow a twisted, convoluted course. If a line should branch or intersect with another line, then human intervention, manual or preprogrammed, is required. Again, the digitized coordinates are written on some medium that can be fed into a computer for the actual data processing.

Scanner-digitizers are another answer to the problem. Those scanners that are currently operational have certain features in common. They all make use of some sort of photoelectric device for sensing the data. Most commonly, this
FIG. 16. Manual digitizer.
is either a deflectable-beam photomultiplier tube or an image dissector tube, both capable of sensing various gray levels. Ordinarily, a lower power of two such as 32 (2^5) or 64 (2^6) gray levels is found to be adequate. Some of these sensors can also distinguish colors, and this provides an additional way of identifying the various map features.

The scanners generate overwhelming amounts of data. A 3 x 5-inch film chip digitized at 0.001-inch resolution would produce 15 million pixels, or discrete picture elements. The data from a standard contour plate digitized at 50 μm fill five 10.5-inch tape reels with approximately 95 million bytes. This same contour plate digitized manually at 1-mm increments would not fill one reel, but would generate points with three dimensions each. Because of the tremendous amounts of data involved, a scanner-digitizer system usually has its own dedicated computer. This can be either a fairly large general-purpose computer devoted exclusively to this one project, or it can be a minicomputer, or sometimes a series of minicomputers, also dedicated to this one project. At present there are many diverse opinions on this very subject; the question of which is best, a very large general-purpose computer or a large bank of minicomputers, has yet to be resolved.

Regardless of the collection technique, the data are processed through error-correction routines where known errors are deleted immediately, and the rest of the data are plotted. The plots are then visually inspected for any remaining errors. This step is repeated until the data meet preset error criteria.

Commonly, the next step is to superimpose a grid upon the terrain model. Essentially, a mesh is created, and an elevation value is assigned to each node. Research is still in progress to determine the best mathematical method for this procedure. Many of the better programs make use of a weighted distance function. If a known point falls exactly on a grid intersection, that value is assigned to the intersection. In all other cases, for the known points that fall within the area being examined, a weight is computed as a function of the distance of each known point from the intersection point. The final value for the intersection is then the ratio of the sum of the weighted point elevations to the sum of the weights.
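That weighted-distance rule is easy to state in code. A minimal sketch (the sample points are invented, and the inverse-square weight is one common choice; the article does not specify the exact weighting function):

```python
# Gridding a terrain model with a weighted distance function: each grid
# node takes the known value if one falls exactly on it, otherwise the
# sum of weighted elevations divided by the sum of the weights.
import math

known = [(12.0, 7.5, 231.4), (14.2, 9.1, 228.0), (11.1, 10.0, 235.2)]  # (x, y, z)

def node_elevation(gx, gy):
    num = den = 0.0
    for x, y, z in known:
        d = math.hypot(x - gx, y - gy)
        if d == 0.0:
            return z                # a known point lies on the intersection
        w = 1.0 / d**2              # weight as a function of distance
        num += w * z
        den += w
    return num / den

print(f"gridded elevation: {node_elevation(12.5, 9.0):.2f}")
```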
After gridding, the data are ready for some mapping applications, one of which is the now popular orthophoto. The orthophoto is quite different from the conventional line map in several ways. It is an aerial photograph which has been subjected to differential rectification. This means that the image displacements caused by terrain relief and the tilt of the camera at the moment of exposure have been corrected. These corrections are made one at a time, each correction covering a very small part of the photograph. The areas are so small that they blend together and are not discernible on the finished orthophoto. The terrain data in profile form, digital or graphic (Fig. 17), guide the orthophoto instruments during the exposure process. Later, contour lines, grids, graticules, and place names are added just as they are on the traditional line map.

FIG. 17. Examples of terrain profiles.

At this point, after the labor and expense involved in obtaining a digital map in machine-readable form have been described, it is wise to ask why digital mapping is being done. Indeed, over the past 10 years (for digital mapping is no more than 10 years old), this valid question has arisen repeatedly. At some time during the course of debates on this question, someone will point to an existing line map or aerial photograph and ask, "Look at the vast amount of data recorded here. Does anyone really think there is a better storage medium?" This doubt is understandable. The amount of information on an aerial photograph or printed line map is virtually unlimited. The data are easily stored: a sheet of paper or film takes very little space. It is equally easy to retrieve. What reason can there be for going to the trouble and expense of putting a map into digital form and then storing it?

Until a few years ago the answer to this question was very much in doubt. Today the answer is apparent and generally accepted. Once a map is in digital form it can be retrieved and, most importantly, reproduced in whatever form the user desires. This is really the crux of the whole matter. The user needs a map that serves his particular needs, and more often than not he needs it quickly. In this decade, legislators and others concerned with the conservation and proper development of our natural resources need a variety of specialized maps. Some of these users are fortunate, for they can take their time in formulating long-range plans. Others are faced with emergencies arising from natural phenomena, and they must make vital and immediate decisions; the effects of some of these decisions will last a long time. In any sort of natural disaster, the necessary information can be presented to policy makers and decision makers quickly and cheaply by maps in digital form. Hence, the effort is worthwhile.
12. Data Banks
Even as map data are being digitized, the problem of their subsequent storage and retrieval arises. The exact form in which these data are to be stored is still in question. For many of the map features, such as buildings, bridges, and airports, no standard form has been agreed upon as yet. The matter of format, that is, the size and length of each record, is also undecided. Many persons believe that certain identifying features must be included but that the details should be left up to those creating each particular data base; the theory behind this is that it adds the needed flexibility for a new and growing field of endeavor. Data bases continue to proliferate even without general agreement on the details. Figures 12, 13, and 15, which illustrate the various map projections, were derived and plotted from such a geographic data base and a library of projection programs.

Display
All maps are graphics, and with increasing frequency they are becoming computer graphics. The significant reason for putting map data in digital form is that such data can be used to derive specific information with a speed and flexibility unknown in any other medium.
This can be as simple as a printout which identifies certain regions and provides statistical information such as the area, the perimeter distance, and the location of the centroid of each region. Many times this information is more useful when accompanied by a plot produced either online or offline from the data bases. Some digital map data can also be reproduced as ordinary text data for bibliographies; the production of bibliographies from computerized data is constantly increasing in both magnitude and importance.

The placement of names on maps involves printing the name of a city, town, lake, etc., close enough to the site that the site may be identified with the name, while placing the name so that it does not interfere with any other graphic information on the map. Because of these restrictions, name placement becomes a very difficult task. Indeed, some mapping agencies have both computers and plotters dedicated to this one application.

When plotters are mentioned in this section, the drum type or flatbed type which uses either a ballpoint pen or a liquid-ink pen is meant. These run the gamut from high-speed, medium-precision to low-speed, extremely high-precision plotters. The more demanding specifications can also be satisfied by coordinate plotters with photoheads. On these plotters, a light beam traces out a line or flashes a symbol through a template onto special film. Very fine, very smooth lines can be produced by these plotters. Many plotters also produce scribed copy. Scribing is the process of etching a groove on coated material, usually mylar. This involves positional accuracy, line-weight accuracy, and also line-depth control. Scribed sheets, like the film, are important because they can be used directly to produce the press plates that actually create the maps and other graphics.

Another useful and increasingly popular display device is the cathode ray tube (CRT). Most are black and white, but an increasing number of color sets are in use. These are especially useful for the interactive manipulation of data and for online error correction. They also provide a fast method of combining different kinds of data in one display. The CRT is usually accompanied by a printer so that the pictorial information may be retained in hard copy.
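The region statistics mentioned above follow directly from a digitized boundary. A minimal sketch using the standard shoelace formulas (the sample coordinates are invented):

```python
# Area, perimeter, and centroid of a region from its digitized boundary:
# shoelace formula for the signed area, summed segment lengths for the
# perimeter, and the standard polygon centroid.
import math

pts = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]   # sample closed boundary

area2 = cx = cy = perim = 0.0
for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
    cross = x1 * y2 - x2 * y1
    area2 += cross
    cx += (x1 + x2) * cross
    cy += (y1 + y2) * cross
    perim += math.hypot(x2 - x1, y2 - y1)

area = area2 / 2.0
cx, cy = cx / (6.0 * area), cy / (6.0 * area)
print(f"area {area}, perimeter {perim}, centroid ({cx}, {cy})")
```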
13. Future Trends
Information about the oceans and seas, the surface of the earth, and the atmosphere is being recorded and mapped. Long before the astronauts landed on the moon, a series of observational satellites had been sent
up to orbit it. From their various sensors, maps of the moon were created. These were accurate enough to assist in safe landings and takeoffs. All data acquisition is in some way involved with computers.

The moon was only the beginning of extraterrestrial mapping. Probes have been sent to Venus and to Mars. Even now the surface of Mars is being mapped (Fig. 18), and some of the data are being recorded in digital form and processed on computers. In nearly all terrestrial and extraterrestrial projects the data so gathered are stored in digital form in some computerized data bank. With increasing frequency, the pictorial output can be described as some form of computer graphic.

Aerial photography is now supplemented by satellite photography and utilizes a greater variety of sensors, including multispectral scanners and side-looking radar (Fig. 19). The use of digitized data from these sensors increases continually.
FIG. 18. Image of Mars. (NASA Photography.)
FIG. 19. Multispectral image from a satellite. (NASA Photography.)
The data are usually transmitted in digital form and undergo computer processing for error deletion and quality control before display.
14. Conclusions
The various steps that constitute the mapmaking process have been reviewed. Each step has involved the use of computers, and it has become obvious that their use is constantly expanding.
One additional point is of interest. When the programmable desk-top calculators became available, many mapping applications were withdrawn from the large general-purpose computers and placed on these small but powerful devices. In some respects this is the completion of a circle, inasmuch as the same computations which had been transferred from the old mechanical desk-top calculators to the electronic computers have now returned to desk-top instruments. From another viewpoint, however, particularly the viewpoint of the user, the only similarity is the desk-top location: long tedious strings of arithmetical manipulations are now performed by depressing one button.

This analogy holds true for mapmaking in general. Its basic purpose, to present pictorial information about the earth in a measurable format, remains unchanged. The methods and procedures have been irrevocably altered by data reduction via computer. The instrumentation of mapmaking has been redesigned by the same explosion in technology that has spawned generation after generation of computers within the span of a few years. The combination of mapping and computers is both happy and successful.

REFERENCES

Anderle, R. J. (1972). Pole position of 1971 based on Doppler satellite observations. U.S. Nav. Weapons Lab., Tech. Rep. TR-2734.
Bickmore, D. P. (1971). Experimental maps of the Bideford area. Commonw. Surv. Officers Conf., Proc.
Bickmore, D. P., and Kelk, B. (1972). Production of a multicolour geological map by automated means. Int. Geol. Congr., Proc. 24th, 1972, Sect. 16, pp. 121-127.
Bomford, Brigadier G. (1962). "Geodesy." Oxford Univ. Press (Clarendon), London and New York.
Boyle, A. R. (1970). Automation in hydrographic charting. Can. Surv. 24(5).
Boyle, A. R. (1971). Computer aided map compilations. Can. Nat. Res. Counc., 2nd Semin. Rep. Man/Mach. Commun.
Breward, R. W. (1972). A mathematical approach to the storage of digitized contours. Brit. Cartog. Soc. J. 9(2), 82-86.
Brown, D. (1969). Computational tradeoffs in a comparator. Photogramm. Eng. 35(2), 185-194.
Brown, D. (1971). Close-range camera calibration. Photogramm. Eng. 37(8), 855-866.
Burkard, R. K. (1964). "Geodesy for the Layman." U.S. Air Force Aeronaut. Chart and Inform. Cent., Chart Res. Div., Geophys. and Space Sci. Br.
Colvocoresses, A. P. (1973). Data referencing by map grid cell. Surv. Mapping 23(1), 57-60.
Connelly, D. S. (1971). An experiment in contour map smoothing with the ECU Automated Contouring System. Cartog. J., Exp. Cartography Unit 8(1).
Deutsch, E. S. (1970). Thinning algorithms on rectangular, hexagonal and triangular arrays. Univ. Computer Sci. Cent., Tech. Rep. 70-115 (NASA Grant NGL-21-002-008; U.S. Atomic Energy Commission Contract AT-(40-1)-3662).
Doyle, F. J. (1966). Analytical photogrammetry. In "Manual of Photogrammetry," 3rd ed., Vol. 1, pp. 461-513.
Practical Natural Language Processing: The REL System as Prototype
FREDERICK B. THOMPSON AND BOZENA HENISZ THOMPSON
California Institute of Technology
Pasadena, California
Introduction . 110
1. Natural Language for Computers . 110
2. What Constitutes a Natural Language? . 111
 2.1 The Importance of Context . 111
 2.2 The Idiosyncratic Nature of Practical Languages . 112
 2.3 Language Extension through Definition . 113
3. The Prototype REL System . 115
 3.1 The REL Language Processor . 116
 3.2 Base Languages . 118
 3.3 User Language Packages . 118
 3.4 Command Language and Metalanguage . 120
 3.5 REL Service Utilities . 121
 3.6 REL Operating System . 122
4. Semantics and Data Structures . 122
 4.1 The Importance of Data Structures . 122
 4.2 Are There Universal Data Structures for English? . 123
 4.3 Data Management for Relational Data Systems . 126
5. Semantics Revisited . 128
 5.1 Primitive Words and Semantic Nets . 128
 5.2 The Nature of the Interpretive Routines . 131
 5.3 The Unlimited Complexity of Data Structures . 133
6. Deduction and Related Issues . 135
 6.1 Extension and Intension . 135
 6.2 The Incorporation of Intensional Meaning . 137
 6.3 More Extensive Intensional Processing . 140
 6.4 Inductive Inference . 142
7. English for the Computer . 143
 7.1 Features . 144
 7.2 Case Grammars . 146
 7.3 Verb Semantics . 148
 7.4 The User vs. Linguistic Knowledge . 150
 7.5 Quantifiers . 151
 7.6 Computational Aspects of Quantifiers . 153
 7.7 Steps in Developing English for the Computer . 156
8. Practical Natural Language Processing . 158
 8.1 Natural Language Processors . 158
 8.2 Fluency and Language Learning . 160
 8.3 What Kinds of Systems Can We Expect? . 162
 8.4 Why Natural Languages for Communicating with Computers? . 166
References . 167
Introduction
Much has been written and much good work has been done on natural language as a means of communication with computers. Some of the problems involved in the development of systems with such a capability have been effectively solved; some others, more difficult, have been more clearly delineated. We are now at a point where it is worthwhile to assess the current state of the art and future directions in this area of research. Our own interest has been to achieve an early operational system for natural language communication with computers. Our assessment here will reflect this concern. We will constantly be focusing on the question: What can be done now? Our views are based upon direct and successful experience with the problems discussed. Our system for communicating with the computer in natural language, the REL (Rapidly Extensible Language) System, is in experimental operation and is nearing the transportable, prototype stage. In this paper, we call on examples from this work to make our points as clear and concrete as possible. The question we have sought to answer in our research is this: How can a qualified professional effectively use the computer for relevant research with minimum constraints on the language of communication?
1. Natural Language for Computers
The term “natural language” as used in this paper has the following meaning: a language that is natural for a specific user. This concern is not the only one that motivates research in natural language processing. Some other researchers are interested in the problem of how to build a system that understands language the way humans understand language. In this paper we will not comment on this approach, interesting as it is in itself. We proceed to assess the current state and future
directions for natural language processing from an essentially utilitarian, practical point of view. To that end, we will take up the following issues:

What is a natural language?
Semantics and data structures
Deduction and related issues
English for the computer
Practical natural language processing

2. What Constitutes a Natural Language?
In this section we focus on the nature of languages that are natural for communication with computers.

2.1 The Importance of Context
In the common use of the term, “natural language” refers to such languages as English, Polish, French. In some attempts to define natural language in this sense, linguists have introduced the notion of the “fluent native speaker” of a language. This notion is applied in ascertaining what constitutes an acceptable utterance in a natural language. Thus, an utterance is a well-formed sentence of English if it is recognized as such by a fluent native speaker of English. A definition of natural language along such lines may be adequate for some purposes of linguistic discussion of the nature of language in general. But specific uses of language by individuals in specific contexts show characteristics which are not easily subsumed under such a general definition. The language used by professionals in conversations with their colleagues concerning restricted areas of professional interest has a distinctly different character from general usage. Words take on special meanings, and even quite common words may be used in a highly technical way. Elliptical constructions carry meanings that would not be at all apparent or even reconstructible in a wider or more distant context. From the point of view of an individual user, language is almost always highly context dependent. This context is not only linguistic, i.e., the surrounding sentences, but in a very significant degree also extralinguistic; that is, the specific setting of the conversation in a given life situation. The following example, cited by Bross et al. (1972), illustrates the point. They found that surgeons often close their standard write-ups of operations with the sentence: “The patient left the operating room in good condition.” When it was pointed out to them that this sentence was ambiguous and could mean that the patient had mopped the floor,
put away the equipment, and indeed left the operating room in good condition, they tended to laugh and say that such an interpretation would never occur to anyone reading their reports. The problem of context goes far beyond the repression of ordinary or possible alternate meanings on the basis of context. Consider the question: “What is the size of chemistry?” Out of context, it seems meaningless. However, one could easily imagine it asked in a high school faculty meeting. One could also imagine it asked of a computer in the process of arranging class schedules for such a school. Further, “size of chemistry” would be interpreted in this latter context not as ‘current number of chemistry students,’ as it would be in the former context, but rather as ‘number of students who have registered for chemistry.’ In order that effective communication be achieved in specific situations, the interpretation of sentences has to be highly sensitive to specific contexts. Such interpretations may not be available to the mythical “fluent native speaker,” and therefore the sentences may appear meaningless or ill-formed when out of context. It is also true that those context-specific interpretations may not be available to the same individual speaker, for instance, a high school principal with regard to the above examples, in a different contextual situation. The typical professional works in a narrow, highly technical context, a context in which many interlaced meanings have developed in the course of his own work. The clues that can be found in any ordinary discourse are not sufficient to distinguish that highly specific context from all possible contexts. The idea of a single natural language processing system that will ingest all of the world’s technical literature and then produce on demand the answer to a specific request may be an appealing one. But at this time there are no useful insights into how context-delimiting algorithms would operate to meet the requirements of such a system.

2.2 The Idiosyncratic Nature of Practical Languages
From the practical point of view of computer applications in the near future it is quite necessary and advantageous to limit the context of the particular application language to the narrow universe of discourse involved in the user’s area of interest. There is a wide range of reasons why this is so, and many of them will become apparent in the later sections of this paper. The major reason why the universe of discourse of application languages has to be narrow is the idiosyncratic nature and the rapid rate of change that characterize the interests and fields of bona fide computer users. Suppose one had an online computer system for social sciences, with data banks containing the already vast files of data currently in
use by social scientists. The vocabulary of the common query language would presumably include the word “developed.” The system might be able to discern from context a difference in meaning between a “fully developed skill,” a “developed country,” and an “early developed child.” However, a particular researcher may not know which of the many meanings existing in the literature for the term “developed country” is used by the system. A sociologist interested in testing a theory of institutional evolution would not find acceptable an economist’s notion of development as a particular linear function of per capita GNP and the ratio of industrial capital to governmental expenditure. Who is to say how “developed” is to be defined for such a broadly based system? The situation is similar with applications of such systems to management. Certainly there are significant differences among firms in the use of such technical terms as “direct salary” or “inventory.” Whether “direct salary” is to include vacation salaries or not may depend on whether one is bidding on governmental or commercial contracts. To some, “product inventory” includes items already sold but awaiting shipment; others keep inventory in terms of dollar value rather than the number of items. Even in the same firm, different managers at different levels keep separate and incommensurable accounts. Computer scientists usually think in terms of “the” system of accounts for a firm, perhaps not aware of the fact that these accounts are kept largely for tax accounting purposes, and that management decisions are made on the basis of quite distinct accumulations of the detailed statistics. Further, changes in tax law, union contracts, pricing policy, etc., change the meaning of recorded statistics from year to year. Thus, a general set of semantic meanings built into an insensitive system can be worse than useless to a manager who has to live with change. To conclude, we state this emphatically: For the foreseeable future, natural language systems will be highly idiosyncratic in their vocabulary and the semantic processes involved.
2.3 Language Extension through Definition
These same considerations of the idiosyncratic and evolving nature of a given computer application give rise to a second property of systems that are natural. This is that they have to provide for easy and natural extensibility of languages on the part of the user himself. The introduction of new terms and changing definitions of old ones are ubiquitous aspects of language usage, especially in the technical, intensively worked areas in which computers are applied. In our computer applications today, where programs are written in FORTRAN or COBOL, language change is a matter of reprogramming.
Thus when a social scientist wishes to apply a standard data analysis routine to a new, previously unidentified subset of his data, he writes a short FORTRAN routine that first creates this new subset as a separate entity and then calls for the application of this and perhaps other analysis routines to it. But when he is in a position to communicate with the computer in natural language, he will be able to state the new subset of his data in descriptive terms. If the description turns out to be unwieldy, he may wish to simply name his new subset, thus defining it once and for all. This process can be illustrated from experience with our REL system on the part of the anthropologist Thayer Scudder (Dostert, 1971). In the course of his work this user defined the term “sex ratio” as follows:

def: sex ratio of “sample”: (number of “sample” who are male)*100/(number of “sample” who are female)

Subsequently he wanted to examine the structure and properties of families which had had all of their children. He decided to concentrate on the older women. He created the class of older women of the Mazulu village which he was studying in the following way:

def: Mazulu crone: Mazulu female who was born before 1920

He was then in a position to ask such questions as:

What is the sex ratio of the children of Mazulu crones?

Note that without the capability to form definitions, this question would have had to be stated this way:

What is the number of male children of Mazulu females who were born before 1920 times 100 divided by the number of female children of Mazulu females who were born before 1920?

The very essence of intellectual activity is the creation and testing of new conceptual arrangements of basic elements. This is clearly true in the case of research activities. But it is equally true of management, where new organizational groupings, new product lines, and new accounting policies are constantly evolving as means of organizing the business for more efficient operation and more effective market penetration. Changing and extending through definition and redefinition are the language means concomitant to this process. In terms of today’s state of the art in natural language systems and our goal of a language capability natural for the user, language extension by the user is also of considerable theoretical interest. When we supply a “natural language” system to a new user, even if we build a specialized
vocabulary around his own data base, he will find the result to be a new dialect with word usage somewhat stilted, with certain syntactic constructions interpreted semantically in ways that do not exactly coincide with his own usage. Our experience indicates that the new user initially engages in considerable language play, paraphrasing his questions in a variety of ways to check whether he gets the same answers and asking questions whose answers he knows to see whether the computer responds correctly. As he gains familiarity and confidence, he begins to change and extend the language in ways that make it feel more natural to him. Any language algorithmically defined is in some sense artificial. For some time to come, as we learn more about language itself, our natural language systems will initially feel artificial to the user. They will become natural for him as he makes them his own. A tool can feel natural to a user and still be quite limited in the capabilities he may desire. In the same way, the natural language systems we can now provide have many limitations, but they can still become natural tools for their users if these users can “fit them to their hand,” as it were, through language extension and language change.
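The definitional mechanism illustrated by the “sex ratio” example above is easy to picture in miniature. The following Python fragment treats a definition, purely for illustration, as a parameterized textual substitution applied before parsing; REL itself stores a compiled internal form of each definition (see Section 5.1), and the helper names here are ours, not REL's.

    defs = {}

    def define(name, params, body):
        # record a user definition: a name, its parameters, its body
        defs[name] = (params, body)

    def expand(name, args=()):
        # substitute each argument for its quoted parameter
        params, body = defs[name]
        for p, a in zip(params, args):
            body = body.replace('"%s"' % p, a)
        return body

    define('sex ratio of', ['sample'],
           '(number of "sample" who are male)*100/'
           '(number of "sample" who are female)')
    define('Mazulu crone', [],
           'Mazulu female who was born before 1920')

    # "sex ratio of the children of Mazulu crones" then expands to
    # the long paraphrase given in the text above:
    print(expand('sex ratio of',
                 ['the children of ' + expand('Mazulu crone')]))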
3. The Prototype REL System
In the previous section we stated that each user has to have his own idiosyncratic language package which he can extend and modify according to his needs. How can we design a system that will support these many language packages? How can facilities be provided in such a system so that new language packages may be quickly developed around the individual needs of new users? Each language package will require its own syntax and semantics, will need a parser and a semantic processor, language extension handling procedures, data structure processing utilities, and an operating system; i.e., an entire language/data processing system. How can such complete packages be factored into parts so that the main modules may be shared while the idiosyncratic aspects are isolated? What are the appropriate interfaces between these modules? In what form should the idiosyncratic parts be declared to the underlying common system? These are questions which we have answered in the implementation of REL. The REL system is a computer system for handling languages which are natural (Dostert, 1971). By designing and implementing the REL System, we have sought a solution to the above and other questions concerned with natural language processing. The early experience we can now get through observation of how the system actually works when used
by bona fide users is valuable, we believe, for the further development of natural language systems.
In this section we lay out the major architecture of the REL System. This architecture is diagrammed in Fig. 1. We will discuss separately the following six areas represented in that figure: 1. REL Language Processor, 2. Base Languages, 3. User Language Packages, 4. Command Language and Metalanguage, 5. REL Service Utilities, 6. REL Operating System.

FIG. 1. Architecture of the REL system. (The figure shows the system in layers: the computer hardware and operating system at the bottom; above it the REL operating system and the REL language processor; then the base languages with their data; and the user language/data base packages at the top.)

3.1 The REL Language Processor
A detailed description of this part of the REL system, which cannot be gone into here, is found in Thompson (1974). The REL system is designed to support a wide variety of user language packages. Usually a computer language is defined by its own language processor. Thus, for example, FORTRAN exists in the computer in the form of a FORTRAN compiler. We have taken quite a different approach, namely, to provide one language processor for all REL language packages. The REL language processor is a simple syntax-directed interpreter. The notion of syntax-directed interpretation is of considerable theoretical importance in language analysis. Syntax-directed interpretation assumes that the language is defined by its rules of grammar and, corresponding to each of those, associated interpretation rules. The essence of this notion is that when a segment of a sentence is recognized as a grammatical phrase according to an appropriate rule of grammar, the meaning of that phrase can be found by calling on the corresponding interpretation rule.
In view of the importance of syntax-directed interpretation, its operation will be illustrated by an example, i.e., how it would process such an arithmetical expression as:

    ((3 + 4)*(6 - 5)).

Let us suppose that the system had already recognized “3”, “4”, “5”, and “6” as numbers and that it had been supplied these syntax rules:

    R1: (number) → ((number) + (number))
    R2: (number) → ((number) * (number))
    R3: (number) → ((number) - (number))

Let us suppose also that for each of these rules it had been supplied with an interpretive routine, namely:

    with R1, routine T1, which adds two numbers
    with R2, routine T2, which multiplies two numbers
    with R3, routine T3, which subtracts two numbers

The REL language processor (indeed, any syntax-directed interpreter) first parses the sentence or expression; that is, it matches the syntax rules to the sentence to determine its grammatical structure. The result of the parsing for the above expression is shown in the diagram:

                   (number; T2)
        (number; T1)              (number; T3)
     (number)   (number)       (number)   (number)
     ((   3   +    4    )  *  (    6    -    5   ))
With each node in this parsing diagram an interpretive routine is associated which corresponds to the syntax rule being applied. After parsing, the semantic processor uses the parsing diagram to carry out the semantic interpretation of the sentence. It does this by systematically applying the indicated interpretive routines to the arguments supplied by the nodes below, starting always with the bottom nodes and working up toward the top. In the example, it would first carry out the addition and subtraction using the interpretive routines T1 and T3, respectively. It would then apply the interpretive routine T2 to the resulting two numbers, completing the evaluation of the given expression.
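The bottom-up scheme just described is compact enough to sketch directly. The following Python fragment illustrates syntax-directed interpretation in general, not the REL processor itself; the Node type and function names are ours.

    from dataclasses import dataclass
    from typing import Callable

    @dataclass
    class Node:
        routine: Callable   # the interpretive routine (T1, T2, or T3)
        children: tuple     # constituent phrases or recognized numbers

    def interpret(n):
        # Work bottom-up: interpret the constituents first, then apply
        # the routine associated with this node of the parsing diagram.
        if not isinstance(n, Node):
            return n        # a number already recognized
        return n.routine(*(interpret(c) for c in n.children))

    T1 = lambda a, b: a + b   # with R1: adds two numbers
    T2 = lambda a, b: a * b   # with R2: multiplies two numbers
    T3 = lambda a, b: a - b   # with R3: subtracts two numbers

    # The parsing diagram for ((3 + 4)*(6 - 5)):
    tree = Node(T2, (Node(T1, (3, 4)), Node(T3, (6, 5))))
    print(interpret(tree))    # prints 7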
The simple conceptual scheme has been refined in the REL language processor into highly efficient algorithms. The parser is our own refinement of the Kay (1967) algorithm. It is designed to handle any general rewrite rule grammar. It also has specific mechanisms for handling syntactic features, in the sense of transformational grammar (Chomsky, 1965), and certain simple but useful transformations. These will be discussed more fully below. The parser finds all possible parsings of the input sentence. The semantic processor, although simple in conception, has some rather unique features for handling definitions and bound variables. In these regards, it resembles conceptually the Vienna definition language (Wegner, 1972). It also handles ambiguity (more on that below). By describing the language processor we have also identified how a language is to be defined, namely, it is defined as a set of grammar rules and corresponding interpretive routines.

3.2 Base Languages
As stated above, each user should have a language package built around his individual needs. However, the independent development of a new language of any complexity for each user would still be a major task. Moreover, many of the linguistic and data management aspects are likely to be shared. Thus, technical English augmented by statistical nomenclature and associated routines can be effectively used in a great number of social science and management applications. Such applications will differ from one to another in their vocabulary and their data, but the general, common part of such a family of applied languages can exist. For this purpose, we have implemented “base languages.” The most prominent base language in the REL system is REL English (Dostert and Thompson, 1971, 1972, 1974). REL English contains no vocabulary other than the function words, e.g., “have,” “what,” “of,” and such operators as “maximum.” It also includes the nomenclature and processing routines of statistical analysis. Each application of REL English adds its own vocabulary and data and possibly some application-specific processing routines. Two other base languages that have been developed are the Animated Film Language (Thompson et al., 1974) for interactive graphics and the REL Simulation Language (Nicolaides, 1974) for designing, testing, and applying discrete simulation models. Other conceivable base languages are REL French (or some other natural language), a base language for applied mathematics, a base language for music composition, etc.

3.3 User Language Packages
The question dealt with here is how a typical user uses the REL system. Consider for example a social scientist who has in hand a body of field data he wants to analyze, say, economic statistics regarding world
trade. He would typically make use of REL English as a base language. To do this he would create his own language package, say, under the name TRADE, by typing from his terminal the simple command:

COPY TRADE FROM REL ENGLISH

The basic vocabulary for TRADE would arise naturally from his data; for instance, it might contain the names of all countries, names of economic groupings, e.g., “European Common Market,” and relation words associated with the variables in his data, e.g., “gross national product,” “governmental expenditures.” He could put in his data by typing simple declarative sentences, e.g.,

The population of Iran was 5000 in 1970.

This is one possibility. However, REL English, as a base language, has provisions for bulk data input from standard unit record sources. He could make direct use of these, submitting his boxes of data cards to the computing center operators. He might do this as a batch job overnight. The next morning he could begin interrogation of his data, defining new conceptual notions, etc. To do this, he would avail himself of a terminal, issue the command

ENTER TRADE

and proceed. He might ask, for example,

What is the mean and standard deviation of the populations of European countries?

He could contextually define the notion of “per capita”:

def: per capita “area”: “area”/population

and then ask:

What is the correlation between per capita gross national product and capital investment of European Common Market countries?

For the situation in the above example, the capabilities of REL English might suffice. For other situations, it may be that no available base language would do. Under such conditions the user would have to seek the help of an applications programmer to create a new language specifically for his needs. Aspects of this task will be discussed below. Suffice it to say here that our objective is to facilitate this task of creating such a new language so that it could be achieved in a matter of weeks, the major concern being the user’s needs, both in syntax and semantics, rather than the programming problems of the language processor design and implementation.
3.4 Command Language and Metalanguage
The command language is that simple language one uses to communicate one’s needs to the system, e.g., the creation of a new language, entering or deleting an existing language, invoking protection keys, etc. The data associated with the command language is the information concerning the various base languages and user language packages in the system. It appears to the language processor just as any other user language package. The research task of implementing a new REL language for a specific user is a technical task not too different from other research and management tasks. The REL system itself could be used to facilitate this task with a language that would be natural for language writing. The metalanguage is such an REL base language (Szolovits, 1974). Unlike other REL languages, it does not stand alone but in essence underlies each of the other REL languages. It is in the metalanguage that other languages are defined and extended. The metalanguage includes the capability to declare new syntax rules and new data structures, and to program associated interpretive routines. Since the metalanguage knows all about the REL language processor and the operating environment, it can perform a variety of diagnostics and carry out a limited amount of optimization. The metalanguage also contains a variety of debugging facilities such as standard break point, snapshot, and trace, and it also facilitates tasks directly related to the language implementation such as display of the parsing graph of a sentence. The following sequence illustrates the use of the metalanguage to examine the output of the parsing phase of language processing. Assume that one had entered a user language based upon REL English, and asked the question “What is 3+2?” After obtaining the answer, one could switch to the metalanguage and ask for the phrase marker and thus obtain the parsing graph. This is illustrated below.
WHAT IS 3+2?
5
METALANGUAGE
PMARKER
LANGUAGE
WHAT IS 3+2?

The phrase marker printed in response is a parsing graph with the sentence node SS at the top, VP nodes spanning the verb phrase, and the lexical categories NU, CV, NU, and NU over the words “WHAT,” “IS,” “3,” and “2.”
It is the metalanguage that facilitates the design and implementation of new user language packages. With it, new languages can be brought into being quickly and efficiently, and base languages can be augmented with specialized syntax and processing routines, tailoring them to the needs of particular users.

3.5 REL Service Utilities
A variety of service utility routines are provided to the language writer. They embody an answer to the question: Which facilities must be at the discretion of the language writer and which can be subsumed by the system? Two such services will be described to illustrate this. REL is designed to handle large data bases and the needs of a number of users; therefore it makes extensive use of disk storage through a paging subsystem. The allocation of new pages, addressing problems, and management of pages in the high speed memory are all handled by this subsystem. However, loading and locking of pages in high speed memory and any conventions concerning the contents of pages are left entirely to the language writer. He can also ascertain the number of unlocked page slots available in high speed memory. We are convinced that the time spent in moving material into and out of high speed memory is now, and will be for a considerable time into the future, a primary consideration for operationally effective systems. Major efficiencies can be realized by optimization; but such optimization can only be carried out by the language writer who is cognizant of the nature of the data structures he is manipulating. Most of this optimization will be done at the level of the data structure design, but it will depend on the programmer’s being aware of page boundaries and availability of paging slots, and his being able to lock pages in high speed memory. These considerations are reflected in the paging utility routines made available to him. A second set of service utilities concerns language extension. The language processing mechanisms for handling definitions are built into the parser and the semantic processor. They require that the various parts of definitions be placed in appropriate structures. However, the syntax for defining can vary from one user’s language package to another. Some user languages may have assignments similar to those in programming languages; others, such as REL English, may have a variety of syntax forms for the creation of verbs. Therefore the updating of and deleting from dictionaries and syntax tables is carried out by service utilities, thus allowing the language writer to adopt his own conventions in these regards while allowing strict system maintenance of internal tables. These two examples, paging and definition handling, illustrate not
only the nature of the service utilities provided, but the general approach to the implementation of special purpose applications languages represented by the REL System.

3.6 REL Operating System
The REL Operating System provides the interfaces with the underlying operating system environments. To round out the characterization of REL, the specifications of the environments in which it now operates will be given here. The REL system has been implemented on an IBM 370 computer. In light of the rapid evolution of computer systems and also of our goal of obtaining early operational experience, we have sought to make the system as transportable as possible.¹ The system is now operating in the following operating system environments: MFT, MVT, TSO, CP67/CMS, VM/CMS, VS2/TSO. The REL System requires 120K bytes of core memory. The minimum amount of disk space is approximately 10⁸ bytes; however, effective use of the system requires considerably more, depending on the size of data bases involved.

¹ As an example of its transportability, we were able to demonstrate the system on the University of Pisa computer at the 1973 International Meeting on Computational Linguistics in Pisa, Italy.

4. Semantics and Data Structures
There exists a wide variety of approaches to semantics from the fields of linguistics, logic, and computer science. Our problem here, however, is more limited, since our interest is in how computers are to be programmed to handle natural languages.

4.1 The Importance of Data Structures
It seems plausible to separate computer processing of language into three steps: (1) analyzing the input sentence structure, namely parsing; (2) processing this structure in conjunction with some internal data and thereby developing an appropriate interpretation of the input; and (3) preparing an appropriate response, for example, the forming of an answer or the movement of a robot. The second of these is semantic analysis. The internal data may be conceptualized in a variety of ways: as sequential files, as a relational data base, as sentences of a formal language, as a conceptual or semantic net, as a group of procedures. However one may think of it, it has some structure. The various words of the input sentence point, through the dictionary, into this data structure. The
semantic processing interrelates the structure of the input sentence with the structure of the data. Data structures, a prime consideration in all aspects of computer science today, are central to natural language processing. In natural language processing, input sentences tend to be short, interpretive routines tend to be complex, and data bases tend to be large and highly interrelated. This is in contrast to, say, FORTRAN “sentences,” which comprise entire programs, interpretive routines that are very short, and collections of otherwise independent data items. Because of these basic differences, natural language processors need to be considerably less sensitive to the optimization of working storage and considerably more sensitive to the complexities of data structures and their organization with respect to standard storage. Thus in discussing the semantics of natural language processing, a central topic is data structures.
A t first glance it would seem that in the design of a language such as REL English, a data structure set could be adopted that would somehow reflect a general English usage and thus be suitable for all, or essentially all, applications. Several user language packages based upon such a general English would differ only in vocabulary, specific data, and added definitions. Theoretically this can be done; for example, n-ary relations or list structures will do. However in the processing of large amounts of data, the inner loops of the interpretive routines which have to do with searching, merging, and sorting are performed many thousands of times. Savings of several orders of magnitude in computer time can be realized by a more judicious choice of data structures. This is particularly true because of the large differential in memory access time between main memory wherc processing is done and peripbcral memory where data is stored. The selection of data structures cannot be made in the abstract. For example, suppose one’s data is time dependent, a supposition true of many data bascs of importance. How is the time variable attached to each record to be handled in storage? We illustrate this particular problem from our REL experience. REL English is time oriented. Thus one could ask of a data base concerning family relationships and locations such questions as : Where was Stan Smith when ,Jill Jones lived in Los Angeles? How many Smiths lived in New York since June 16,1968? In order that time be handled, each item of data carries two times indi-
124
F. B. THOMPSON AND B. H. THOMPSON
eating the interval of time during which the data itern is assumed to hold. Thus, if Jill Jones lived in Los Angeles between May 6, 1963 and October 18, 1972, there would be this data entry in the location file: (Jill Jones, Los Angeles, May 6, 1963, October 18, 1972) This would be, of course, appropriately coded. But what does “appropriately coded” mean? Well, “May 6, 1963” would be translated into a single number. When talking about data structures in a concrete, impIemented system, as opposed to talking about data structures in the abstract, a field size for holding this internal number form of a specific time, together with the units of this number, has to be assigned. The convention we have adopted in REL English is that the unit of time is a day and in the file structure for a relation, two time fields will each be two bytes long. The immediate implication of this is that REL English can only handle time over a 180 year period and only down to units of a day. Therefore, clearly, it could not be immediately applied to airline schedules which require time in units of minutes, but which only cover a span of a week ; or, for that matter, to historic data spannnig centuries. To what degree are these conventions built into REL English? Consider only the output routines. Suppose one asks the question: When did Jill ,Jones arrive in Los Angeles? The answer should be a date--“May 6, 1963,” rather than some coded number representing May 6, 1963. Yet the internal form of this data must be amenable to rapid comparisons and arithmetical operations. The translation routines from internal form to external form have to be cognizant of the calendar, e.g., information about occurrences of leap years. Phrases like “the previous 5 months,” Yhree weeks after February 18, 1960” all require knowledge of the calendar for translation. If the conventions concerning time storage were to be modified in some application, interpretive routines and time conversion utilities would have to be reprogrammed accordingly. Could we have done better in our tasks? What time unit should we have picked? Perhaps minutes? Over what span? In historic time to a thousand years in the future, perhaps a span of 6000 years? That would require approximately 23 bytes per time field, multiplying space requirements by 5 and increasing computer time by even a larger factor because of additional paging, especially in conjunction with sort utilities. Perhaps a form of a base REL English could be designed that would allow a user to select the unit of time, and therefore the time span. But generalizations of this sort run into both a sharply rising implementation cost and a sharply falling marginal number of user-oriented applications.
PRACTICAL NATURAL LANGUAGE PROCESSING
125
The limitations of R E L English are certainly not only in regard to the units and spans of time. Context considerations have to go farther. Consider the following example. We have applied REL English, as it now stands, to analyze interrelationships among scientists in a given data base. For each scientist, the data gives his date of birth, educational background, the institutions and dates of his employment, and his publications. The user of this data, in the process of his investigations wants to use the terms occurring in the titles of papers as part of his query, for example, to inquire about all authors of papers whose titles include the word “radar.” The titles of papers are in the data base, but they are as literal strings in the lexicon. I n this form they are unavailable to the interpretive routines of REL English which know nothing about literal strings as a data structure. I n this particular case, new syntax and associated interpretive routines that apply to literal strings and know how to locate them in the lexicon can be added to REL English. But, such additions comprise language extension a t the application programmer level rather than the user level. Although it is clear that different applications have their own special requirements for interpretive routines, we still have to examine whether an R E L type system can be based upon a single data base management system with its own basic data structures. This is an important question, for there are a number of major efforts and community wide coordinating committees concerned with such general systems. I n particular, relational data systems are being pressed forward as being generally applicable to the organization of data. The data structures that underlie these systems are files of contiguously stored unit records, each record consisting of a sequence of numbers, which are the values of the variables that describe the class of individuals studied. Since the range of applications to be supported by a system such as R E L is broad, it is obvious that relational data structures are not adequate as the only structural form for the data. Deductive techniques, such as theorem proving, require distinctly different structuring of data, and they will be more and more a part of natural language systems. The REL Animated Film Language uses quite different data structures. I n that language a picture is defined in terms of a set of linear transformations on more primitive picture parts, in an iterative way. These linear transformations take the form of a 3 x 3 matrix of numbers. These, of course, can be processed as a lo-ary relation, the 10th component being a pointer to the picture part. However, the interpretive routines that involve presenting these pictures as a moving image on a graphic display require utmost efficiency, efficiency which is obtained by using storage structures and access algorithms specifically designed for this application.
126
F. B. THOMPSON AND B. H. THOMPSON
I n the future, even more than in the past, there will be families of applications which will dictate their underlying data structures. Natural language systems built for such applications will have to recognize these structure requirements on data storage and data management if they are to achieve the necessary efficicncy. Once this fact of computer system design is recognized, the supporting software of opcrating systems can be designed so as to facilitate such highly adaptive language programming on the part of applications programmers. The success of our R E L work in this regard is most encouraging, as evidenced above. 4.3 Data Management for Relational Data Systems
Let us now narrow the range of applications considerably to just those which are restricted to relational data files. R E L English, as an REL base language, is designed for such restricted applications. Is a single data mangement system sufficient in the case of those? This is a moot question. There are two classes of such applications that may well dictate separate data management utilities. These two classes of relational systems arc cxamplified by the United States census data, on the one hand, and the files of information on scientists, mentioned above, on the other. In the formcr, the number of individuals is of the order 106-10X,and typical queries involve an appreciable number of the variables in each record considered independently. I n the latter, the number of individuals is of the order 103-105,and typical queries involve only a few of the variables in each record, but intcrrelatc separate records in nonsequential ways. I n the former, processing can typically (though not exclusively) be done sequentially through the file; in fact it has to be done in this way, since file sizc is so large as to make random link following from one individual’s record to another prohibitive from the point of view of processing time. .Just consider querying the census data as to all persons who have a cousin living in Rlilwaukee! I n the other case, exemplified by the data on scientists, the nature of the investigation demands random link following in proccssing. For example, one might wish to run a cluster analysis which would tend to group scientists on the basis of common authorship of papers, or on thc basis of joint institutional affiliation. A general relational data managerncnt system could certainly handle both kinds of applications, but each requires quite distinct processing optimization that must be internal to the system. I n order to get a more concrete feel for these issues, consider the following data processing problem from our R E L experience. REL English data structures are essentially classes and tinary relations. Suppose that in dealing with the “scientist” data, a subclass of scientists had been formed consisting of all linguistics, numbering, say, 2100. Suppose the
PRACTICAL NATURAL LANGUAGE PROCESSING
127
relation of “institutional affiliation” was carried as a file, say, with 9000 entries. In the REL system, data is stored on “pages” of 2000 bytes each. REL English classes require 8 bytes per entry, a member field and two half-word time fields. Thus a class page holds approximately 250 entries; a relation page holds 150. Thus the “linguist” class uses 9 pages, the “institutional affiliation” uses 60. If we want to compute the meaning of the phrase “institutional affiliation of linguists” we can do i t in two ways; a. consider each linguist at a time, find all of his institutional affiliations In this method a page of the “linguist” class is brought into main memory, and then each “institutional affiliation” page is loaded. This is repeated for each linguist in turn, a total of over 120,000 page loadings. At roughly 60 msec per page, this would require over an hour and a half; b. first determine how many page slots in main memory are available, say, in this case the number is 9. Use one for the output page, one for relation pages and the remaining 7 for class pages. Having brought in the first seven class pages, lock them in main memory and go through each “institutional affiliation” page in turn, finding the institutional affiliations of all linguists that are in the first 7 class pages. Repeat for the remaining 2 class pages. I n this method, 129 pages are moved from disk memory to main memory consuming less than 3 seconds. Thus the ratio of computing time for method (a) relative to method (b) is 3 orders of magnitude. This makes the importance of the problem involved clear. REL English, of course, makes use of optimizing methods illustrated by method ( b ) . These methods were developed by Greenfield (1972). At this stage in the development of relational data base systems, two questions need clarification. (1) Are relational data base applications rather uniformly distributed between the relatively small, highly interdependent data bases and the huge data files with their completely independent records, or is the distribution of applications definitely bimodal? (2) Can internal sort, merge, and other basic file processing optimization techniques be so programmed as to meet the efficiency requirements for the whole class of relational data base applications or must two distinct relational data base management systems necessarily evolve? Basic hardware and operating system decisions are involved, especially as far as virtual memory philosophies are concerned. Because of the large and growing number and the importance of such applications, these are significant research questions.
128
F. B. T H O M P S O N AND B. H. T H O M P S O N
5. Semantics Revisited
What is the character of the basic decisions one must make in designing the semantics of a natural language system? To answer this question, we will examine in some detail the semantics of R E L English. This discussion will also be a useful foundation for the following section on deduction. 5.1 Primitive Words and Semantic Nets
A relational data base may abstractly be considered as referring to individuals, predicating certain relations among these, and assigning values to each of them for certain attributes. I n putting such data into a natural language system we must have ways of: (a) introducing new words into the lcxicon for referring to new individuals, classes, relations, and attributes; and (b) declaring that certain individuals are in certain classes and relationships and have certain values for attributes. Words introduced by process (a) and interlinked by process (b) will be called primitive. How nonprimitive words may be introduced by definitions will be shown shortly. First, what is provided in R E L English for the user to carry out ( a ) and ( b ) ? There are four ways for the user of REL English to introduce new primitive nouns into the lexicon, which are illustrated by the five expressions: John: = name Mary: = name male: = class parent: = relation age: = number relation Once words such as “Mary,” “male,” “parent,” and “age” are introduced into the lexicon, they may be interrelated by declarative sentences such as these: John is a male. John is a parent of Mary The age of John is 35. The computer interprets the expression: male: =class in the following way. ( 1 ) It allocates a new page in the disk memory, say, the page with disk address 01. It puts certain header information
PRACTICAL NATURAL LANGUAGE PROCESSING
129
a t the top of this page, including an indication that i t is a class, but otherwise leaving it blank. Second, it puts “male” into the dictionary with the definition: (noun phrase, a), indicating its part of speech and a pointer to its assigned page. The expression: John:
=
name
results in similar actions, allocating, say, page p. The sentence: John is a male. results in the pointer p being written on page a, thus indicating that John is a member of the class An alternate way of introducing a noun into the vocabulary is by definition. The following illustrates this: (Y.
def : father: male parent I n this case there are no new data pages assigned. Rather, “father” is put into the dictionary with the definition: (noun phrase, D) where D is an internal (compiled) form of the phrase “male parent.” In this internal form, there are pointers to the data pages for the primitive words “male” and “parent” or to their definitions if they are not primitive but are in turn defined. In natural language processing systems, there is a level of primitive data structures that correspond to primitive expressions of the language (words, phrases, or syntactic forms) which serve as the semantic atoms of the system. More complex expressions refer to complex relationships among these atoms. Data is carried in the system as linkages in one form or another between these atoms. It is the sum total of the semantic atoms of the system together with the linkages existing between them that has been given the name of the “semantic net.” The following conversation illustrates how a very simple semantic net is built up in R E L English : Who are males? eh? male: = class Who are males? none John: = name .John is a male. Who are males? John
130
F. B. THOMPSON AND B. H. THOMPSON
Bob: =name Bob is a male. Who are males? John Bob age: =number relation Bob’s age is 14. parent: =relation Who are Bob’s parents? insufficient data John is a parent of Bob. Who are Bob’s parents? John Sue: = name Sue is a parent of Bob. Who arc Bob’s parents? John Sue
It shows how the words “male,” “age,” and “parent” and the names of individuals “John,” “Bob,” and “Sue” are introduced. It also shows how data becomes part of the system. The data structure t h a t results from this conversation is illustrated by the semantic net in Fig. 2. The difference between a primitive noun and a defined noun is illustrated in Fig. 3a,h. (The two conversations in Fig. 3 a,b, are assumed to be continuations of the above conversation.) The same word, “boy” is defined in Fig. 3a, but is entered as a primitive noun in Fig. 3b. Figure 3b is disturbing. It illustratcs that the information held by the system concerning a primitive class is indeed strictly limited to that which is explicitly contained in its linkages to other primitive individuals, classes,
FIG.2. A semantic net in REL English
PRACTICAL NATURAL LANGUAGE PROCESSING def: boy: male whose age is less than 16 Who are boys? Bob Bill: =name Bill is male. BiIl’s age is 8. Who are boys? Bob Bill
boy: =class All males whose age is less than 16 are boys. Who are boys? Bob Bill: =name Bill is male. Bill’s age is 8. Who are boys? Bob
(a)
(b)
131
FIG.3. Introduction of a word by definition (a) and primitively (b) in REL English.
and relations. The issues involved here are the subject of Section 6 on deduction. 5.2 The Nature of the Interpretive Routines
The discussion of how nouns are introduced into the R E L English system has illustrated some aspects of the basic decisions that must be made in designing the semantics of a natural language system, namely, the identification of the atoms and linkages that constitute the data structures and how they are tied to the primitive words. We now turn to the character of the decisions one makes in designing the interpretive routines, the semantic counterparts of the rules of grammar. T o this end, let us consider a spccific rule of syntax from the REL English grammar: (noun phrase) -+(noun phrase)(noun phrase). Examples of phrases to which this rule is applicable are: Boston ships male parent male dogs author Scott We will speak of the two constituents as the left and the right nouns. Our task, in designing the semantics corresponding to this rule is to describe an interpretive routine which will operate on the data structures referenced by the left and the right nouns and produce a data structure which expresses the appropriate meaning of the whole phrase. But what are the data structures referenced by the left and the right nouns? As we have seen above, in REL English a noun refers either to a n individual, a class, or a relation. Thus we have the following nine cases to consider.
Case 1: class-class            "male dogs"
Case 2: class-individual       "biologist Jones"
Case 3: class-relation         "male student"
Case 4: individual-class       "Boston ships"
Case 5: individual-individual  -
Case 6: individual-relation    "Harvard students"
Case 7: relation-class         "student employee"
Case 8: relation-individual    "author Scott"
Case 9: relation-relation      "student owner"
Case 1: class-class, "male dogs". In this case the answer is very easy: the intersection of the two classes involved. "Male dogs" refers to the class of all things which are members of both the class "male" and the class "dogs."

Case 2: class-individual, "biologist Jones". This expression might be used if there are several individuals with the same name and the noun modifier, i.e., the left noun, is to be used for disambiguation. "Biologist Jones" is that one of the people called Jones who is a member of the class of biologists. Thus the semantic referent of the class-individual phrase is the individual itself if the individual is a member of the given class. If the individual is not a member of the class, the phrase is construed as meaningless.

Case 3: class-relation, "male student". We must distinguish such an expression as "male student of Stanford," which may be grouped: male (student of Stanford), and thus reduces to Case 1, from "male students will report to Room 16." It is the latter that illustrates the usage covered by this case. Clearly what is meant by "male students" is "those who are both male and students of some school." Thus we simply compute the range of the student relation, the class of things that are students, and go back to Case 1.
It is easy to see that in all cases where one or the other of the constituents refers to a relation, this same procedure applies. Thus:

Case 6 reduces to Case 4
Case 7 reduces to Case 1
Case 8 reduces to Case 2
Case 9 reduces to Case 3.
Cases 4 and 5 remain. We consider them in reverse order.
Case 5: individual-individual. Constructions of this type do not exist in common English usage. Thus any accidental occurrence of this case will be construed as meaningless.
Case 4: individual-class, "Boston ships". The phrase "Boston ships," on the face of it, appears to have a clear meaning: the subclass of ships that are in Boston. Surely that is its meaning in the sentence:

What Boston ships will leave for New York tonight?

However, consider the sentences:

What Boston ships will leave London for their homeport tonight?
Ships made in Brooklyn are fast but Boston ships last longer.

On the basis of these examples we define the notion of an intervening relation: R is an intervening relation between an individual and a class if there are members of that class which are related by R to the given individual. In the above three sentences, "Boston ships" is being interpreted in terms of three different intervening relations, namely "location," "homeport," and "place of manufacture." We will subdivide Case 4 into three subcases.

Case 4a: the individual and the class have no intervening relations. Under these circumstances, the phrase will be construed as meaningless.

Case 4b: the individual and the class have exactly one intervening relation R. Then the meaning of the phrase is the class of those elements of the given class related by R to the given individual.

Case 4c: the individual and the class have several intervening relations. Then the phrase is ambiguous, its several meanings corresponding as in Case 4b to the several intervening relations. Possible redundant subclasses, which have the same members even though arising from different intervening relations, are suppressed.

The above discussion of the (noun phrase) → (noun phrase) (noun phrase) rule identifies the semantic analyses performed by the corresponding interpretive routines in REL English.
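The dispatch these nine cases call for can be rendered compactly. The following Python sketch is ours, not the REL code, and keeps everything in memory: a referent is tagged as an individual, a class (a set of names), or a relation (a set of pairs whose second elements form its range):

    def interpret(left, right, primitives):
        """Interpretive routine for (noun phrase) -> (noun phrase)(noun phrase)."""
        lk, lv = left
        rk, rv = right
        if lk == "relation":                      # Cases 7, 8, 9: use the range
            return interpret(("class", {b for (_, b) in lv}), right, primitives)
        if rk == "relation":                      # Cases 3 and 6: use the range
            return interpret(left, ("class", {b for (_, b) in rv}), primitives)
        if lk == rk == "class":                   # Case 1: intersection
            return ("class", lv & rv)
        if lk == "class" and rk == "individual":  # Case 2: disambiguation
            return right if rv in lv else "meaningless"
        if lk == rk == "individual":              # Case 5
            return "meaningless"
        # Case 4: individual-class, via intervening primitive relations;
        # a frozenset per relation, so identical subclasses are merged (4c).
        meanings = {frozenset(m for (m, i) in rel if i == lv and m in rv)
                    for rel in primitives.values()}
        meanings.discard(frozenset())             # relations that do not intervene
        if not meanings:                          # Case 4a
            return "meaningless"
        if len(meanings) == 1:                    # Case 4b
            return ("class", set(meanings.pop()))
        return [("class", set(m)) for m in meanings]   # Case 4c: ambiguous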
5.3 The Unlimited Complexity of Data Structures

The basic role of data structures in the semantics of natural language processing systems is apparent from the above discussion. REL English is limited by the choice of individuals, classes, and binary relations as its basic data structures. It gains a significantly greater capability to reflect our normal usage of English by including with each entry in a class or relation two time fields, that is to say, fields that are specifically interpreted as time and thus structurally identified in semantic processing. This addition makes it possible to give semantic responses to all of the time-related syntax of English, from adverbs of time to the tense of verbs. REL English could be further augmented with a data
structure for literal strings and semantic routines that would manipulate them. Then it could deal with such phrases as:

city whose name is Boston
papers whose titles contain the word radar.

Another class of English expressions which we handle are expressions which arise from statistics, for example:

correlation between age and income of employees.

Here, the correlation calculation matches the age of an employee with the income of that employee. This kind of matching is required for the proper interpretation of a number of expressions. For these purposes, REL English, internal to its interpretive routines, uses a data structure we call a labeled number class. Thus each of the phrases:

age of employees
income of employees

gives rise, internally, to a class of numbers, the members of which are "labeled" with the corresponding employee. The correlation calculation matches on this employee label. Without the addition of such structural means, the processing efficiency for many phrases would be intolerably slow.
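In modern terms a labeled number class is simply a mapping from entity to number. The sketch below is ours, with invented data, and shows how a correlation can then be computed with the matching done on the common labels:

    age_of_employees = {"Jones": 34, "Smith": 51, "Brown": 46}        # label -> number
    income_of_employees = {"Jones": 9000, "Smith": 14000, "Brown": 11500}

    def correlation(xs, ys):
        """Pearson correlation of two labeled number classes, matched on labels."""
        labels = xs.keys() & ys.keys()      # pair the numbers by employee label
        n = len(labels)
        x = [xs[l] for l in labels]
        y = [ys[l] for l in labels]
        mx, my = sum(x) / n, sum(y) / n
        cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
        sx = sum((a - mx) ** 2 for a in x) ** 0.5
        sy = sum((b - my) ** 2 for b in y) ** 0.5
        return cov / (sx * sy)

    print(correlation(age_of_employees, income_of_employees))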
Surely there will be applications of natural language systems to domains of greater conceptual complexity than can be efficiently represented by individuals, classes, and binary relations, even with the additional structures built into REL English. If such systems are to respond effectively in their semantics to the subtle clues of English syntax that alert our minds to these conceptual complexities, astute design of complex data structures will be the key. A theoretical elaboration of these ideas is found in Thompson (1966).

The human mind appears to have at its disposal memory structures of arbitrary complexities. From all of these it chooses those which best give meaning to experience. At any given instant of time, we use our current cognitive structures, including the linguistic structures we have perceived in our speech community, to frame our actions and our verbal responses. As the flow of moment-to-moment experiences carries us along, it is imperative that these structures which we have imposed change so that we adapt to a changing world. This restructuring is not just an extension and reinforcement of our old structures. It is indeed the ability to form new conceptualizations which grasp more cogently what is significant that we revere as the quality of an able mind.

In a language for the computer in which the primitive data structures are fixed, there can be no transmutation to more significant forms. It is at the time when the application programmer selects those data structures that can best sustain his user's domain that man's ingenuity guarantees an effective language processing system.
6. Deduction and Related Issues
In this section we examine the extent to which inference-making capability can be usefully incorporated into practical natural language processing systems.

6.1 Extension and Intension
Semantic theory makes an important distinction between extensional meaning and intensional meaning. Synonymous pairs of terms are "denotation" and "connotation," and "referential" and "intensional" meaning. An expression in a language usually has both an extensional and an intensional meaning. Its extensional meaning is the object, relationship, action, or the like to which the expression refers. Its intensional meaning is the conceptual framework that relates this expression with other expressions of the language in a way that reflects the conditions under which it is used. Thus the extension of the word "city" is the class of all cities, of which Boston is a particular member. The intension of the word "city" includes the facts that cities are geographic locations, and that they are relatively dense and self-contained accumulations of domestic, commercial, and industrial human enterprises.

This distinction of extension and intension is also useful when considering natural language processing systems. However, the similarity between general semantic theory and computer language semantics is not as direct as some would have it to be. In the first place, a computer must find the meaning of a phrase in its own data structures. Consider the meanings of the word "city." We could say that "city" has an extensional meaning for the computer if it interprets the string CITY as referring to a file in the data base that consists of the internalized form of such strings as BOSTON, NEW YORK, etc. "City" would have an intensional meaning if the interpretation of the string CITY referred to a node in a complex list structure which would link it to nodes associated with the words "human," "geographical entity," etc., or if it referred to the internalized form of sentences affirming general characteristics of cities. But this distinction is by no means clear. In the "extensional" structure, we also find links to the population relation files,
etc.; the extensional files may have the same data structure as the intensional structures. As a matter of fact, the "semantic nets" of Quillian (1969) and the "conceptual nets" of Schank (1973), both of which are rightfully considered "intensional systems," have essentially the same linked structures as the ring structures of Sutherland's (1963) "Sketchpad" and earlier versions of REL English, both of which are "extensional systems."

An apparent difference between systems with "extensional" semantics and those with "intensional" or "conceptual" semantics is illustrated by looking at how these systems would answer such a question as:

Does any man have two wives?

An "extensional" system would process the file of "man" against the file for the "wife" relation and answer "yes" if some man was an argument for two entries in the "wife" file. Note that the meaning of "man" is construed as "entries actually existing in the man-file at the time of query." An "intensional" system would check whether the concept "wife" had the property of being a function on the class "man." It might do this by checking linkages and labels that could be found between nodes of a conceptual net, or by evoking a theorem-proving program, using sentences stored in some structural form as the data, and attempting to prove as a theorem the internalized form of the sentence: some man has at least two wives. If this sentence were shown to be a contradiction, it would presumably apply to all men, whether to be found elsewhere in the data base or not.

The distinction between "extensional" and "intensional" systems that is apparent in the example above is certainly a valid one when considering the system as a black box functioning in a specific application. However, it is a difficult one to characterize in terms of system operation. Extreme forms of each can be recognized; a system using theorem-proving techniques where the universe of discourse may be considered arbitrarily large is clearly intensional. In general, the distinction has little significance for how the system works internally. Perhaps that is how it should be, for in semantic theory "extensional meaning" and "intensional meaning" were never supposed to be mutually exclusive; rather, they are just two ways of looking at meaning. The notions of conceptual nets and of cognitive as opposed to syntactic systems reflect more on the orientation of the researcher than on the internal operations of the system he may develop. The block manipulating system of Winograd (1972) has a strictly extensional relationship with its universe of discourse, while externally having many features with an intensional feel. REL English, a highly extensional system, allows the user to impose and use an intensional structure through definitions.
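The extensional reading of the "two wives" question above can be sketched directly; an intensional system, by contrast, would reason about the "wife" concept without scanning these files at all. A minimal sketch, with invented data:

    man_file = {"Smith", "Jones", "Ali"}
    wife_file = {("Mary", "Smith"), ("Ann", "Jones"),   # (wife, husband) entries
                 ("Beth", "Jones")}

    def any_man_with_two_wives():
        """Scan the wife file against the man file and count entries per man."""
        counts = {}
        for wife, husband in wife_file:
            if husband in man_file:
                counts[husband] = counts.get(husband, 0) + 1
        return any(c >= 2 for c in counts.values())

    print("yes" if any_man_with_two_wives() else "no")  # -> yes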
Deduction, the main topic of this section, is closely related to intension. Deduction is the process by which one reasons on the basis of known facts to conclusions which follow logically from them. This is in contrast to checking all possible instances to see whether the conclusions hold whenever the statement of facts applies. In a formal logical system, deduction can be given a precise definition. However, in practice, when dealing with natural language processing systems, deduction can more usefully be construed as making use of intensional as well as extensional meaning.

6.2 The Incorporation of Intensional Meaning
Intensional meaning can enter into semantic processing in a variety of ways, and systems can vary all the way from purely extensional ones to intensional ones based upon theorem-proving algorithms that do not use extensional information at all. From the practical point of view, the problem with systems that use intensional information (deduction, if you will) is that computing time rises inordinately, indeed to the point where there is at this time no possibility of applying them to real life problems. We are confident that purely extensional natural language systems, such as REL English, can be effectively applied in the near future. We are also confident that systems incorporating general deductive capabilities are at least a decade away. What needs to be examined here is how far and in what directions we can now move to incorporate intensional information in semantic processing. To this end we will first examine in more detail the limitations of purely extensional systems, calling on REL English for illustration.

In a purely extensional system, the primitive words of the language are totally independent of one another as far as the internal semantics of the system is concerned. This is illustrated by the words "boy" and "male" in Fig. 3b. After introducing both "boy" and "male" as primitive classes, nothing prevents one from adding the statements:
Tom: =name
Tom is a male.
Tom's age is 12.

without adding, possibly by oversight:

Tom is a boy.
If one then were to ask:

Is Tom a boy?

one would get the answer: No. There is no way the system can make
use of the intensional information embodied in the statement: "All males whose age is less than 16 are boys." Adding definitional power may provide a limited intensional capability. For example, if we define "boy" as in Fig. 3a, then REL English would respond correctly to the questions:
Is Tom a boy?
Are boys male?

Internally, to answer the latter question, it would first construct the class of boys. It would do this by going through the class of males and picking
out those whose age is less than 16. It would then check to see whether each member of this constructed class of boys is also a member of the male class. Clearly the system should have concluded the question by making use of the obvious intensional meaning contained in the definition, that "boy" is a subclass of "male." Nevertheless, definitions can add significant deductive power to a natural language system. In a simple application of REL English to family relationships, one can quickly define the usual relationships in terms of the single primitive relation "parent" and the primitive classes "male" and "female":

def: child: converse of parent
def: sibling: child of parent but not identity
def: sister: female sibling
def: aunt: sister of parent

Information about a person's aunts can then be "deduced" even though only data about parents is included in the data base. Structural means can be added that will incorporate an essentially greater step. Suppose, for example, that we supply the additional lexical statement, here applied to the word "location":

location: =transitive relation.

The result would differ from a simple relation only in the setting of a flag in the data structure. However, the following deduction would be possible:

John's location is room 514.
Room 514's location is T building.
T building's location is ABC Company.
ABC Company's location is New York.
Is John in New York?
Yes
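A minimal sketch of the chain-following such a flag licenses (our Python; in REL only a flag in the data structure changes, and the interpretive routines do the chasing). It assumes each thing has at most one location; a visited set guards against cycles:

    location = {"John": "room 514", "room 514": "T building",
                "T building": "ABC Company", "ABC Company": "New York"}

    def located_in(x, place):
        """Follow location links upward until place is reached or the chain ends."""
        seen = set()
        while x in location and x not in seen:
            seen.add(x)
            x = location[x]
            if x == place:
                return True
        return False

    print(located_in("John", "New York"))   # -> True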
John is construed to be in New York if John's location is New York, or if the location of John's location is New York, etc. Techniques of this kind, some considerably more complex, provide the means of handling many aspects of ordinary English usage.

When one is interested in a particular domain of application, the interpretive routines of the language can reflect a great deal of the intensional knowledge of the user. Under these conditions, very powerful deductive capabilities can be built into the system itself. One can imagine a system for handling airline reservations where one could ask for a route from A to B, and the computer, in responding, would first look for direct flights; failing that, seek one-stop routes, two-stop routes, etc., maintaining a check to prevent cycles, ensure adequate transfer times, etc. The Navy has a computer program that computes the length of the shortest sea route between two points on the seas; this could be incorporated in the interpretation routine corresponding to a grammar rule that recognizes the phrase "sea distance." In the process of determining the sea distance from New York to Los Angeles, it would have to deduce that a ship would need to go through the Panama Canal.

Sophisticated and subtle use of procedures of this sort by Winograd (1972) has given his system an impressive capability to effectively handle intensional expressions of natural language concerning block structures. Woods (see Woods et al., 1972) has incorporated a good deal of intensional information concerning geology and chemistry into the procedures underlying the vocabulary and syntax in his Lunar Rocks query language. This, together with the general excellence of his system, makes it the most fluent of the natural language processing systems in operation today.

Providing the user with the capability to form definitions and using interpretive routines that incorporate intensional information concerning a specific universe of discourse can result in long response times to some questions. However, in the case of these two methods of using intensional meanings, the user himself is aware that his query will entail lengthy computations and thus is more willing to tolerate a long wait for his answer. This is our experience with the users of the REL System. When a definition is put into the system which obviously entails processing of an appreciable portion of the data base, or if a statistical analysis is invoked which abstracts a great deal of detailed data, the aware user is prepared for a delay in response. In fact, it has been suggested to us by our users that we incorporate the ability to "peel off" a question whose response is going to take some time and thus free the terminal, so that they can investigate in detail the results of the previous "long" query while waiting for the response.
6.3 More Extensive Intensional Processing
We now turn to the consideration of deductive methods that require computing times which, at the current state of the art, are too long to permit their employment in practical natural language processing systems. The paradigm, of course, is theorem-proving techniques. However, abstractly similar problems are encountered in much simpler intensional contexts. We will first illustrate the problems involved in such a simple context. Let us return to the consideration of phrases such as "Boston ships." Recall that we defined R to be an intervening relation between "Boston" and "ships" if there is some ship that is related by R to Boston. Now consider the following two possibilities:

a. A phrase of the form (individual noun)-(class noun) is meaningful if there is a primitive intervening relation between them.

b. A phrase of the form (individual noun)-(class noun) is meaningful if there is some intervening relation between them. Further, its meaning derives from the simplest such relation, i.e., the relation involving the fewest primitive relations.
For example, suppose that the Maru is a ship that is owned by Jones and that Jones lives in Boston. Then the Maru is related to Boston by the intervening relation "home of owner." If there are ships located in Boston, owned by the city of Boston, or with homeport Boston, that is to say, related to Boston by some direct, primitive relation, the "home of owner" relation will be ignored by (a). But if no primitive relation exists between any ship and Boston, rather than immediately construing "Boston ships" as meaningless, the computer would look further for some assignable meaning by (b).

In REL English we restrict ourselves to (a); that is, we consider only primitive intervening relations. A worrisome consequence is that the meaningfulness of a phrase depends on the initial selection of primitive relations for the language. For example, suppose someone wishes to investigate the relationships between students and courses at some university. Suppose that included in his data is the instructor of each course and the department of each instructor. In putting his data into the REL system, using REL English, it would be natural to use the words "instructor" and "department" to refer to primitive relations, for these correspond directly to relationships in the raw data. Now consider the phrase "mathematics course." Using (b) above, a mathematics course would be a course taught by someone in the mathematics department. Using (a) above, "mathematics course" would be meaningless unless further defined.
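Alternative (b) amounts to a search problem. The sketch below is ours, with relations shown as small in-memory sets of pairs where in REL each would be pages on disk; it finds the shortest chain of primitive relations from the individual to a member of the class, and the way every step fans out across all primitive relations foreshadows the computing-time problem discussed next:

    from collections import deque

    def shortest_chain(individual, cls, primitives):
        """Breadth-first search for the shortest chain of primitive
        relations linking the individual to some member of the class."""
        frontier = deque([(individual, [])])
        seen = {individual}
        while frontier:
            node, path = frontier.popleft()
            if node in cls and path:
                return path                         # names of the relations used
            for name, rel in primitives.items():    # every step fans out widely
                for (x, y) in rel:
                    for nxt in ((y,) if x == node else (x,) if y == node else ()):
                        if nxt not in seen:
                            seen.add(nxt)
                            frontier.append((nxt, path + [name]))
        return None                                 # meaningless under (b)

    ships = {"Maru"}
    primitives = {"owner": {("Maru", "Jones")},     # hypothetical data
                  "home": {("Jones", "Boston")}}
    print(shortest_chain("Boston", ships, primitives))   # -> ['home', 'owner']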
Now there are arguments on both sides, for if the computer has the ability to look far afield, it may find meanings quite unintended by the user. Since the user can define such notions, e.g.,

def: "mathematics" course: course that is taught by a "mathematics" instructor

we have accepted in REL English interpretation (a). This choice was greatly influenced by the fact that interpretation (b) incurs unacceptable computing time. Suppose there is no primitive relation between some ship and Boston. How should we proceed to look for a two-step relation? We could construct the class of all those things that are related by some primitive relation to Boston and then examine each of these to see if some ship is related to it by some primitive relation. The number of relation pages that would have to be brought from disk to main memory would be enormous. And if no two-step relation were found, the computing time would escalate exponentially.

It is this characteristic of more profound deductive methods that presents the primary problem in their incorporation in practical natural language processing. In each of these methods one is trying to find the simplest relationship between two entities. In trying to find such a pathway between them, one is faced at each step along the way with too many pathways to explore further. Research into deductive processes generally takes the form of finding means to more discriminately select the pathways one should follow. The work on semantic nets and conceptual processing, especially the work of Quillian (1969), explores in depth problems which, though more sophisticated, are closely similar to method (b) above for finding the meaning of an (individual)-(class) phrase.

Most theorem-proving techniques are based upon the resolution principle of Robinson (1968) and Green and Raphael (1968). One adds to the set of axioms of the system the negation of the theorem to be proved and seeks to derive a contradiction from this expanded set. Along the way, one generates a larger and larger family of statements that can be derived from this expanded set. Insightful methods have been developed to control the growing size of this family. But in practical applications where the set of axioms, or meaning postulates, is large, this intermediate family of statements becomes enormous, far too large to be handled in any reasonable computing time. We have heard it estimated by competent workers in the field that it will take an improvement of an order of magnitude in technique and two orders of magnitude in the speed of computer hardware to bring to
a practical level deductive techniques of reasonable sophistication; furthermore, that this will take a decade to accomplish. We substantially agree with this estimate.

6.4 Inductive Inference
There is another class of problems related to deduction which involve decisions concerned with intensional meaning, namely, the problems of inductive inference. By an inductive inference we mean arriving at a conclusion on the basis of insufficient evidence. Inferences of this kind are quite necessary if our natural language processing systems are going to be useful. Inductive inference occurs in REL English in connection with the interpretation of time-oriented data. Consider the following conversation:

John arrived in Boston on December 6, 1965.
John departed Boston on May 18, 1970.
Where was John in June 1968?
Boston
John was in New York on July 16, 1967.
Where was John in June 1968?
Insufficient data
It is assumed, of course, that no other information concerning John's location is in the data. After the first two statements, the system infers that John's location is Boston throughout the interval from December 6, 1965 to May 18, 1970. When other information is added, namely, that he was in New York on July 16, 1967, this inference is broken. In general, if an individual A is related by a relation R to an individual B at two times t1 and t2, and A is related by R to no other individual during the interval between t1 and t2, then it is inferred that A's relation to B holds throughout the interval.

Certainly, building inferences of this kind into the system is a worrisome thing. However, in practice, a user puts almost all of his data into his language package before he starts his detailed investigation, and he quite naturally establishes the intervals in which relations are to hold, e.g.,

The location of John was Boston from December 6, 1965 to May 18, 1970.

On the other hand, if this inference concerning time had not been built into the system, the usefulness of the system would be greatly reduced.
For example, suppose the system is applied to the personnel files of a typical business, and the general manager wants to know how many engineers are employed in each of his facilities. Suppose the data shows that engineer Smith was assigned to facility ABC the previous June. The manager would find the system useless if it answers his query:

How many engineers are assigned to each facility?

with:

Insufficient data.

Inductive inferences of a variety of sorts will be required in most applications of natural language processing systems. With regard to such general aspects as time, they can be built into the major base languages such as REL English. However, in narrower matters, they must be added with care and with knowledgeable appreciation for the application at hand. Such problems and methods are well known in the field of artificial intelligence, where they are referred to as heuristics. Experience from that field, especially with the heuristics involved in making complex decisions, will be useful in guiding the incorporation of inductive inferences in applied natural language processing systems.
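As a concrete rendering of the time inference stated above, here is a sketch (our code, with dates reduced to numbers for brevity) of inferring that a relation holds throughout the interval between two observations, and of how an intervening fact breaks the inference:

    facts = [("John", "location", "Boston", 1965.93),    # arrived Dec. 6, 1965
             ("John", "location", "Boston", 1970.38)]    # departed May 18, 1970

    def holds(a, r, b, t):
        """Does R hold between A and B at time t, by observation or inference?"""
        times = sorted(tf for (x, rr, y, tf) in facts if (x, rr, y) == (a, r, b))
        others = [tf for (x, rr, y, tf) in facts
                  if (x, rr) == (a, r) and y != b]
        for t1, t2 in zip(times, times[1:]):
            # infer R holds over [t1, t2] unless A relates to some other
            # individual strictly inside the interval
            if t1 <= t <= t2 and not any(t1 < tf < t2 for tf in others):
                return True
        return t in times                # exact observations still hold

    print(holds("John", "location", "Boston", 1968.45))      # True: "Boston"
    facts.append(("John", "location", "New York", 1967.54))  # July 16, 1967
    print(holds("John", "location", "Boston", 1968.45))      # False: insufficient data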
7. English for the Computer
So far we have stressed the idiosyncrasy of natural languages, their dependence on context, and the great variety in their function as tools for communicating with the computer. However, there does exist a common body of language mechanisms: a vocabulary of function words such as "and," "of," and "who," and a richly extended family of syntactic forms which we share as a ubiquitous part of our culture and refer to as English. To what extent can we build natural languages for computers on this English?

This is not a matter of all or nothing. At one extreme is the use of a few English words in a programming language like COBOL. At the other is the ability to handle highly elliptic constructions, indirect discourse, conditionals, and other subtle and complex forms of colloquial language. Somewhere along this continuum there is a threshold beyond which one would say "this is natural English." A few systems are now beyond, though not far beyond, this threshold. More will be said about this question in Section 8.2, Fluency and Language Learning. In the present section, we ask: What are the general techniques that have been used by the systems incorporating English?
In the first place, these systems handle most of the normal constructions found in technical English: complex noun phrases, subordinate and relative clauses, verbs and auxiliaries, tense, conjunctions, passives, question and negative forms. Here are some examples of sentences which have been processed by the REL System:

Were IBM's sales greater than the average sales of electronics companies in 1965?
The per capita gross national product of which South American nations exceeded 50 in the last three years?
What was the average number of employees of companies whose gross profits exceeded 1000 in 1970 and 1971?
How many students who take biology take each mathematics course?
Which language courses were taken by more than 10 students?

7.1 Features
Some computational linguistic techniques have been useful in implementing English syntax in natural language processing systems. One technique that is commonly used is features. In a grammar for English, one categorizes words and expressions into various parts of speech, e.g., noun phrase, verb phrase, conjunction. In writing computational grammar rules that express the structure of English constructions one needs to make more refined distinctions. How, for example, are we to allow "the boy" but exclude "the the boy," and indeed properly reflect the role of determiners like "the" in such phrases as "the big boy" but not "big the boy"? For these purposes we need to distinguish determiner-modified noun phrases from noun phrases not so modified. The role of features is to subcategorize parts of speech. Thus a word or phrase may have the part of speech "noun phrase" and the features "plural," "nominative," and "determiner modified."

Features in REL English are binary, that is, each feature may be plus or minus (on or off). Thus the plural feature (PLF) is on for plural noun phrases (+PLF) and off for singular noun phrases (-PLF). The following rule of grammar:

(noun phrase)1,+PLF → (noun phrase)-PLF s

allows the plural "s" to go on a singular noun phrase. The "1" means that the features of the first constituent of the right-hand side are also carried over and assigned to the resulting left-hand side. The determiner (DTF) and quantifier (QNF) features are set by such rules as:
(noun phrase)1,+DTF → the (noun phrase)-DTF-QNF
(noun phrase)1,+QNF-PLF → some (noun phrase)-DTF-QNF
(noun phrase)1,+QNF → all (noun phrase)+PLF-QNF
(noun phrase)1,+QNF → all of (noun phrase)+PLF+DTF-QNF
accounting for such phrases as:

some boy
some boys
all boys
all of the boys

but excluding:

the all boys
all boy
all of boys

The primary role of features in natural language processing systems is the ordering of the hierarchical organization of syntactic constituents with the aim of controlling syntactic ambiguity. Noun phrases, for example, are hierarchically structured, i.e., some constituents serve as modifiers of others. Some phrases are genuinely ambiguous; for instance, "Jane's children's books" can mean either "the books of Jane's children" or "a children's book which is in some relation to Jane, e.g., owned or authored by her." But in computational analysis, phrases which are normally unambiguous also turn out to have alternate analyses:

wealthy benefactor's statue

may parse "wealthy (benefactor's statue)" or "(wealthy benefactor)'s statue". Similarly for

crowded New York's subways.
To illustrate how features are employed in such cases, we use the adjectival (APF), possessive (POF), and possessive-modified (PSF) features and the rules:

(noun phrase)1,+POF → (noun phrase)-PLF-POF 's
    example: New York's
(noun phrase)2,+APF → (noun phrase)-APF-DTF-QNF-POF-PSF (noun phrase)-DTF-POF-QNF-PSF
    example: wealthy benefactor
(noun phrase)2,+PSF → (noun phrase)-DTF+POF (noun phrase)-DTF-QNF-POF-PSF
    example: New York's subways
These three rules allow, for example, the following phrases:

(wealthy benefactor)'s statue
(crowded New York)'s subways
(John's son)'s teacher
good (old uncle)
John's (old uncle)

and in each case disallow the alternative grouping. In the case of "good old uncle" we are not dealing with a genuine disambiguation, for either grouping would result in the same semantic value. Excluding one or the other grouping has the sole function of preventing multiple parsings. "John's old uncle" represents a construction where exclusion of one case, i.e., "(John's old) uncle," seems in line with good English usage. In the last two examples we see valid and important use of features to exercise control of syntactic ambiguity.

One is tempted to go farther, as indeed we have. The groupings:

John's (son's teacher)
wealthy (benefactor's statue)

are not allowed. As a result, we do not recognize the ambiguity of:

Jane's children's book
stout major's wife.

At some points we have chosen to exclude certain ambiguous forms, even at some loss in fluency, on the grounds of computational efficiency. We are not sure we are correct in these decisions. Experience with actual users and experimentation with alternate decisions will be invaluable for the improvement of REL English.
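Mechanically, a feature-checked rule is just a guarded rewrite. A minimal sketch in Python (our representation; in REL the features are bits checked by the parser):

    def apply_the(np):
        """Rule (noun phrase)1,+DTF -> the (noun phrase)-DTF-QNF:
        'the' attaches only to an NP that is neither determiner- nor
        quantifier-modified; the result carries +DTF."""
        if np["DTF"] or np["QNF"]:
            return None                      # rule blocked, e.g. "the all boys"
        out = dict(np)                       # features of constituent 1 carry over
        out["DTF"] = True
        out["text"] = "the " + np["text"]
        return out

    boy = {"text": "boy", "PLF": False, "DTF": False, "QNF": False}
    the_boy = apply_the(boy)
    print(the_boy["text"])                   # the boy
    print(apply_the(the_boy))                # None: "the the boy" is excluded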
7.2 Case Grammars

Another technique that is used in natural language processing systems is the application of case grammars, following the linguistic work of Fillmore (1968). The essential ideas can be grasped from the following illustration. Consider the sentences:

John gave Mary the book.
John gave the book to Mary.
Mary was given the book by John.
The book was given to Mary by John.
Fig. 4. Deep structures with case relationships in REL English grammar. (The diagram shows the verb linked to its noun phrases by the case arcs agent, object, dative, and time.)
Clearly, all four sentences express the same thought. Three nouns are involved, each noun playing the same role relative to the verb in the four sentences. We say that in each case, "John" names the agent, "the book" names the object, and "Mary" names the recipient of the action, which we call the dative. The notion of case grammar is that the nouns in a sentence are each related to the verb by a "case" relation. The cases that we recognize in REL English are:

agent            AG
instrument       AI
object           OJ
dative           DA
locative         LO
adverbs of time  AT
The task of grammatical analysis is to recognize these case relationships. Thus each of the above four sentences has the basic structure shown in Fig. 4. In REL English, we use certain verb features to mark a verb phrase as including noun phrases in case relationships. These features are shown in the above list of cases. Other verb features we will need in illustrations are passive (PA) and past participle (PP). The following rules specify the structure of the above four sentences:
R1: (verb phrase)1,+PA → was (verb phrase)+PP
    example: was given
R2: (verb phrase)1,+OJ → (verb phrase)-AG-AI-OJ-DA-LO-AT (noun phrase)-POF
    example: gave the book
R3: (verb phrase)1,+OJ+DA → (verb phrase)-AG-AI-OJ-DA-LO-AT (noun phrase)-POF (noun phrase)-POF
    example: gave Mary the book
R4: (verb phrase)1,+DA → (verb phrase)-AG-AI-DA-AT to (noun phrase)-POF
    example: (gave the book) to Mary
R5: (verb phrase)1,+AG+PA → (verb phrase)-AG-AI-AT+PP by (noun phrase)-POF
    example: was given by John
R6: (verb phrase)1,+AG → (noun phrase)-POF (verb phrase)-AG-AI-PA
    example: John (gave the book)
R7: (verb phrase)1,+OJ → (noun phrase)-POF (verb phrase)-OJ+PA
    example: the book (was given by John)
R8: (verb phrase)1,+DA → (noun phrase)-POF (verb phrase)+OJ+PA
    example: Mary (was given the book)

The results of the application of each of the above rules in the analysis of the sentence:

Mary was given the book by John.

are illustrated by the following diagram:
SS
VP (R8; PP,PA,OJ,AG,DA)  AG: John, OJ: the book, DA: Mary, AT: "past"
VP (R5; PP,PA,OJ,AG)  AG: John, OJ: the book, AT: "past"
VP (R2; PP,PA,OJ)  OJ: the book, AT: "past"
VP (R1; PP,PA)  AT: "past"
NP: Mary    CV: was    VP(PP): given    NP: the book    NP: by John
The product is now ready for semantic analysis. (The above eight rules are, of course, only illustrative, though similar in spirit to the verb phrase rules of REL English.)
7.3 Verb Semantics
Handling verb semantics poses a serious design problem in the development of a natural language processing system. Often, verbs may take the form of procedures which express their meanings in terms of the implied processing of the data structures referenced by nouns which are case-associated with the verbs. In this way, the verb "give" may be
associated, through the dictionary, with a procedure that looks for a change of ownership, between the agent and the dative, of the object; this procedure would also handle the processing of the time aspect of the verb.

In the case of REL English, we have sought to provide a language that can be easily extended by the user, indeed a base language where the meaning of substantive words is in no way preempted. Therefore the only verbs that are procedurally defined are the copulas, e.g., "is" and "has." The procedure associated with "is," "are," "was," etc., considers the following three cases for the referents of the agent and the object nouns, and returns the semantic value as indicated in each case:
Case 1: individual-individual  "Is John Mary's brother?"
    True if the two individuals are the same.
Case 2: individual-class  "Is John a boy?"
    True if the individual is a member of the class.
Case 3: class-class  "Are boys male?"
    True if the first class is a subclass of the second.

On the surface, this is simple. However, complications involving tense and time, classes, numbers, and quantifiers may arise. As a consequence, the "is" (etc.) routines are exceedingly complex. Since forms of "to be" are the only primitive verbs, they are ultimately at the center of every clause. The procedure that codifies their meaning must anticipate all of the complexities that may arise in the meanings of the noun phrases that may constitute their subject and predicate.

Verbs other than copulas are introduced by the user in REL English. He does so by means of a lexical statement of the form:
XXX: =verb(YYY)

where YYY is a paraphrase of the verb in terms of copula verbs or verbs previously defined. The paraphrase YYY also uses case indicators in place of nouns, e.g., (agent), (object). Introduction of verbs is illustrated by the following conversation:

owner: =relation
own: =verb((agent) is owner of (object))
John owns Ivenhoe.
Has Ivenhoe been owned by some male before 1969?
Yes
gave: =verb((agent) owned (object) before (time) and (dative) owned (object) after (time))
John gave Ivenhoe to Mary on June 6, 1970.
Who owns Ivenhoe now?
Mary
It can be argued that the definition of "gave" does not capture the entire meaning of the verb "give." But this is precisely the point in question. If the data base at hand, the user's data base, is rich enough to include information concerning motives, money, etc., these notions can be incorporated by the user in his definition of the verb "give." If the only information in the data base relevant to the everyday meaning of "give" concerns ownership, there is no way in which the computer can go beyond that information.

7.4 The User vs. Linguistic Knowledge
In the defining statement of a verb, the paraphrase, or contextual definition, is in terms of case identifiers, e.g., (agent). This formulation requires knowledge on the part of the user of linguistic phenomena which he cannot be expected to have or to acquire easily. In general, in systems that can be extended by the user, compromises must be made between the amount of linguistic knowledge that the user may be expected to have and the degree of fluency of the language. Whether to use (agent), etc., in the definition of verbs is a case in point. Another is whether to distinguish between animate and inanimate nouns, for example:

John: =name (animate)
Boston: =name.

The animate-inanimate feature allows the distinction between the agent and instrument noun case forms. Thus the following sentences are handled:

The door was opened by the key.
The door was opened by John.
The door was opened by John with the key.

and the following disallowed:

The door was opened with John.
The door was opened by the key by John.

This can be done since the rules handling passives know that "John" is an animate noun and thus can be agent, and that "key" is an inanimate noun and thus must be instrument. Details of these complex rules cannot be gone into here but are found in Dostert and Thompson (1973). The need to deal with such problems was pointed out by Fillmore (1968).

The situation is different, of course, in systems that do not allow the user to extend his language, systems where the language, including the
vocabulary, is completely provided by the language writer. There, refinements such as the animate-inanimate distinction can be added at will.
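A sketch of how the animate feature can steer such case assignment (ours, and much simpler than the actual passive rules, which for instance also bar doubled fillers such as "by the key by John"; see Dostert and Thompson, 1973):

    animate = {"John": True, "key": False}

    def attach(frame, preposition, noun):
        """Fill the agent case with an animate 'by' noun; fill the
        instrument case with a 'with' noun or an inanimate 'by' noun."""
        if preposition == "by" and animate[noun] and "agent" not in frame:
            frame["agent"] = noun
        elif preposition in ("by", "with") and not animate[noun] \
                and "instrument" not in frame:
            frame["instrument"] = noun
        else:
            return None                  # disallowed, e.g. "opened with John"
        return frame

    frame = {"object": "door"}
    attach(frame, "by", "John")          # The door was opened by John ...
    attach(frame, "with", "key")         # ... with the key.
    print(frame)   # {'object': 'door', 'agent': 'John', 'instrument': 'key'}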
7.5 Quantifiers

Another aspect of language that is important in practical natural language processing is that of quantifiers. REL English handles a number of quantifier expressions, including "all," "some," "less than 7," "at least 5," "all but 3," "no," "not all," "what," "each." Quantifiers, as is known, provide the principal means, other than statistical means, for abstracting and tabulating data. In REL English, the query:

What is the grade point average of each student who is taking at least 5 courses?

would produce a listing of the form:

John Jones  3.6
Sue Smith   3.8

Two important attributes of a quantifier are its range and scope. The range of a quantifier is the set of objects about which information is to be abstracted. In English, a quantifier usually modifies the noun phrase that names its range, e.g., "all boys," "some city." A quantifier phrase such as "which boy" or "all boys" asks or states something about the elements of its range. The smallest segment of a sentence that has to be considered in resolving what the quantified phrase asks for or states is the scope of the quantifier. The notions of range and scope may be clarified by the following examples:
Do authors who have written some books on violence drink heavily?
Authors who have written which books on violence drink heavily?

The first of these sentences contains the quantified noun phrase "some books on violence," the second the quantified phrase "which books on violence." In both cases, the range of the quantifiers is the class of all books on violence. In the first sentence, consider the phrase "authors who have written some books on violence." Semantic evaluation of this phrase produces a class, a subclass of the class of authors. Once this class has been constructed, it is used in the evaluation of the remainder of the sentence; the "some" quantifier plays no further role. Thus the scope of the "some" quantifier is the phrase:

authors who have written some books on violence.
The situation is quite different in the second sentence. The question posed by the "which" quantifier cannot be resolved by considering only the phrase: authors who have written which books on violence. The whole sentence must be evaluated before the subset of books that fulfill the conditions of the sentence can be determined. Therefore the scope of the "which" quantifier is the whole sentence. Quantifier scope is determined by phrase, clause, and sentence boundaries, and also by the type of quantifier involved.

The order in which quantified noun phrases are considered in the semantic analysis of the sentence also depends on syntactic clues, and does so in interesting ways. Consider the following two pairs of sentences:
S1: John loves Mary.
S2: Mary is loved by John.
S3: All boys love some girl.
S4: Some girl is loved by all boys.
The sentence S2 is the passive form of S1 and, with the exception of stylistic differences, has the same meaning as the first. The sentence S4 appears similarly to be the passive transformation of S3, but this is clearly not so, since the third and fourth sentences do not have the same meaning. Consider the following paraphrase of the third sentence:

For any boy there is a girl such that the boy loves the girl.

The correct passive transformation of this sentence is:

For any boy there is a girl such that the girl is loved by the boy.

the transformation being applied to the subordinate clause. It does not seem to have a paraphrase similar to S4.

The nesting of quantifiers for purposes of semantic processing does not follow the deep case structure of the sentence but rather the position of the quantifier in the original sentence. The quantifier of the subject of the sentence, whatever the subject's case relationship to the verb, is always the dominant quantifier. A transformation, such as the passive, that rearranges the nouns which indicate the range of quantifiers may make it impossible to signal the proper ordering of their evaluation. Therefore such transformations may not be permissible when quantifiers are involved. The scope of quantifiers and the order in which their evaluation is to take place in the semantic processing require careful attention in developing rules of grammar for natural language systems.
7.6 Computational Aspects of Quantifiers
We turn from matters of English syntax to take up certain computational aspects of quantifiers. We do so because the computational efficiency with which we handle quantification is the key to achieving an effective system. To see the problems involved, we will contrast two methods of handling quantifiers. We use as an illustration the sentence:

Each student's father attended what college?

The desired answer is of the form:

John Jones    Stanford
Sam Smith     Harvard
Bill Johnson  Ohio State

Expanding the definitions of "father" and "attend," the internal form of the sentence becomes:

The school of each student's male parent is what college?

The first method we will consider for handling the semantic processing of quantifiers will be called the method of generators. By a generator we mean a procedure which selects one element after another from a class and applies a given process to each element until the class has been exhausted. The following diagram illustrates the process.
Expanding the definition of “father” and ((attend,” the internal form of the sentence becomes this: The school of each student’s male parent is what college? The first method we will consider for handling the semantic processing of quantifiers will be called the method of generators. By a generator we mean a procedure which selects one element after another from a class and applies a given process to each element until the class has been exhausted. The following diagram illustrates the process. return
t--
to
ss
generators
VP
1
NP NP
NP “each” generator ......................
NP
NP
NP
N P N PVP -
studenti
CV N P -
“what? generator ...................................................................
collegej
NP The school
of
each student’s male parent is
NP what
college?
As can be seen from the diagram, a generator has been set up for each quantified phrase. The "each" generator generates a student, say studenti; then the "what" generator generates a college, say collegej. At that point the expression:

the school of studenti's male parent is collegej
is evaluated. If true, the pair (studenti, collegej) is set aside and another student is generated. If false, another college is generated, until the school of studenti's father is found. This is a conceptually straightforward way to consider quantifiers, each element of the range of each quantifier being considered in turn and the results of these "instantiations" evaluated, with necessary actions taken depending upon the type of quantifiers involved. It is also easy to implement. The interpretive routine for each quantifier rule calls the semantic processor recursively. At the end of the evaluation of each instantiation, control is returned to these quantifier routines for generation of their next elements. However, optimizing page loading would be extremely difficult because of the intervening processes involved in the evaluation of each instantiation. Without optimization, the amount of "thrashing" of pages from disk to main memory, and thus the computing time, can quickly become astronomical.

The second method we call the method of labeled classes. Instead of generating the elements of the range of a quantifier and evaluating each instantiation separately, the entire range is carried along and evaluated in parallel. The following diagram illustrates how the various classes and relations are processed along the way.
It is convenient in reading this diagram to start from the right. The noun "college" refers, in computer memory, to the class of all colleges, which is indicated by the notation (college). The "what" rule marks this class as what-quantified. This "labeled" class now becomes the object of the copula verb. Following to the left in the diagram, we see that the "parent" relation (child, parent) and the "male" class (male) are combined by the interpretive routine for the

(noun phrase) → (noun phrase) (noun phrase)

rule (as discussed in Section 5.2) into the "father" relation (child, father). This in turn is combined with the each-quantified class (student) to form a labeled class of fathers (father, student), where each father is labeled by his student son or daughter. Next a labeled class of schools is constructed, where each school has as a label the student whose father attended that school. This labeled class becomes the agent of the clause. The "is" routine now combines the agent and object classes to form the final each-what class: (student, college of father). The output routine is now in a position to use this information to send the right answer to the user's terminal.
[The original diagram shows the parse of "The school of each student's male parent is what college?" with each node annotated by the class or relation it produces: (college) and (child, parent) at the bottom, the labeled classes (father, student) and (school of father, student) above them, and the final each-what class (student, college of father) at the top. Asterisks mark the four points at which the data structures themselves are manipulated.]
Throughout this process the designation of the label fields and the type of quantifiers associated with them can be kept in main memory. Consequently there are only four places in the entire semantic processing where manipulation of the data structures, and therefore access to the disks, is required. These are indicated by an "*" in the diagram. In each of these four places the transfer of pages from disk to main memory can be optimized, as was discussed in Section 4.3. As was pointed out in that section, the savings in computing time are very significant, a difference in response times of hours to seconds.

What has been illustrated here is the handling of quantifiers in REL English. The issue, however, is a general one, namely, the computational efficiency of the semantic processing of quantifiers. This issue, narrow as it may seem, is a key issue in the development of practical natural language processing systems.
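The contrast between the two methods can be sketched in a few lines. The data and code below are ours; the real point, the control of page transfers, does not show at this scale, but the shape of the two computations does:

    students = {"John Jones", "Sam Smith"}
    colleges = {"Stanford", "Harvard", "Ohio State"}
    father = {"John Jones": "Jones Sr.", "Sam Smith": "Smith Sr."}   # child -> father
    school = {"Jones Sr.": "Stanford", "Smith Sr.": "Harvard"}       # person -> school

    def by_generators():
        answer = []
        for s in students:                     # "each" generator
            for c in colleges:                 # "what" generator
                # each instantiation is evaluated separately,
                # touching the data afresh every time
                if school.get(father.get(s)) == c:
                    answer.append((s, c))
        return answer

    def by_labeled_classes():
        # one pass: a labeled class of schools, each labeled by the student
        labeled = {s: school.get(father.get(s)) for s in students}
        return [(s, c) for s, c in labeled.items() if c in colleges]

    print(sorted(by_generators()) == sorted(by_labeled_classes()))   # True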
7.7 Steps in Developing English for the Computer

We have indicated some of the techniques that are used in the computational processing of English. How is the work to be organized in developing the syntax and semantics for computational English? An obvious precondition is that one brings to the task knowledge of linguistic theory and experience with other computational linguistic systems.

The first step is to ascertain the uses to which the system is to be applied, in particular the nature of the various universes of discourse and the general kinds of semantic processing involved. Major classes of application systems include (1) question answering systems concerned with relational data, (2) direction of robots and control of processes, (3) speech recognition and edited transcription or analysis, (4) text processing, (5) machine translation, and (6) deduction and problem solving. From the research point of view, what we have listed as distinct areas may overlap. However, from the point of view of practical natural language processing, those six areas will be quite distinct for at least the next decade.

Once the first step is taken, i.e., the general nature of the system is determined, the next step is to design the primitive data structures. There is an obvious difference between applied data structures and abstract data structures. The distinctions made in the first include those made in the second. However, in the development of a particular system, one has to be concerned with such details as page boundaries, the layout of page headers, field sizes, units of measure, etc. Many of the decisions concerning data structures will be made explicit by programming a family of utility routines that do the basic manipulations of the data, routines that
will subsequently be used in translating linguistic considerations into interpretive routines. This view, that the design of data structures is the second major step, may not be shared by many. One standard way to proceed is to choose an interlingua, a language that lies between the user's English and the underlying data management system. Some translate English into some form of the lower predicate calculus of symbolic logic. Some take as their task putting an English "front end" on a given data management system. Some, whose interests are more in the research aspects of language understanding, human concept attainment, or artificial intelligence, use sophisticated languages such as PLANNER (Winograd, 1972), since the problems of efficiency are not relevant for their purposes. In each of these courses of action, however, the choice of data structures is made, even though only implicitly. It is our conviction that in the development of practical systems for users, the explicit design of data structures will be a key step.

The next step is to proceed directly with the linguistic issues. Relative to the ultimate uses of the system and the family of data structures, there appears to be a body of English syntax that becomes the core of the system. The syntax and semantics for this core English are then to be designed and implemented. At the completion of this third step, one can begin actual communication with the computer in this core English. From then on the developing system itself becomes a primary tool for its improvement.

The fourth and final step consists of a variety of activities, all related to extending and improving the linguistic capabilities of the system. These activities continue in varying degrees until the completion of the system. First, there is a large number of English constructions that are easily implemented: variations of the core syntax, rules whose semantics constitute rather straightforward applications of the data manipulation utilities, rules that express simple grammatical transformations. Second, sympathetic users, who are willing to struggle with the limited nature and frequent failures of the system, are encouraged to use it; they make invaluable suggestions concerning system improvement. Finally, there are classes of linguistic problems that require serious linguistic research, yet demand solution if the system is to achieve fluency.

After the initial stages of development, improvement of a natural language processing system is a protracted iterative process. Both the system itself and the commentaries of sympathetic users provide invaluable inputs. But as this iterative process of improvement proceeds, the system of syntax and semantics becomes ever more complex and interrelated. The cognitive task of keeping all of this straight in one's mind,
of making the requisite checks of new work against past solutions, is the major limitation in the development of fluent systems. All of our experience dictates that understanding English structure poses no insurmountable problems that would prevent steady improvement, though sound linguistic research will be required. However, English structure in all its glory is of such immense complexity that we get caught up in our own myths long before our systems approach our own natural language.
8. Practical Natural Language Processing
The objective of this section is to answer as concretely as possible the question: Where are we now? We will take up in turn the following four topics: (1) natural language processors, (2) fluency and language learning, (3) what kinds of systems can we expect?, and (4) why natural languages for communicating with computers?

8.1 Natural Language Processors
The two major parts of a language processor are the parser and the semantic processor. Natural languages require parsers that will handle at least context-free and preferably general rewrite-rule grammars. They must also incorporate the checking of syntactic features and be able to handle limited forms of transformational rules. There are at this time two good parsing algorithms, namely, the Martin Kay “powerful” parser (1967) and the William Woods augmented transition network parser (1970). We know of no other parsing algorithms that deserve attention from the point of view of practical operational systems. Equally important as the algorithm itself is its actual implementation. Certainly if there is any part of the system that needs to be highly honed, it is the parser. Our own implementation of the Kay algorithm, written in IBM 370 assembly language, is particularly tight and efficient. Tests show that average parsing times for typical English sentences on an IBM 370-135 are of the order of a tenth of a second. For example, the parsing time for the sentence:
Do Cambridge girls who attend Yale love Harvard boys?

was 0.08 seconds. We understand from Simmons that he has an implementation of the Woods algorithm that is of equivalent speed. On balance, parser design for practical natural language systems is adequately solved.
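To make concrete what handling at least context-free grammars involves, here is a minimal CKY recognizer for a toy grammar in Chomsky normal form. It is purely illustrative: it is neither Kay's chart parser nor Woods' augmented transition network, and the grammar is invented.

```python
from itertools import product

# Toy grammar in Chomsky normal form: words map to preterminals, and a pair
# of adjacent categories maps to the nonterminals that can produce it.
LEX = {"girls": {"N"}, "love": {"V"}, "boys": {"N"},
       "Harvard": {"A"}, "Cambridge": {"A"}}
RULES = {("A", "N"): {"NP"}, ("V", "NP"): {"VP"},
         ("NP", "VP"): {"S"}, ("N", "VP"): {"S"}}

def cky(words):
    n = len(words)
    # table[i][j] holds the nonterminals deriving words[i:j]
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        table[i][i + 1] = set(LEX.get(w, ()))
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for left, right in product(table[i][k], table[k][j]):
                    table[i][j] |= RULES.get((left, right), set())
    return "S" in table[0][n]

print(cky("Cambridge girls love Harvard boys".split()))   # True
```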
One issue in regard to syntactic ambiguity in natural language processing seems to recur in the literature sufficiently often to deserve a brief paragraph here. Repeatedly it is suggested that instead of first parsing a sentence and then carrying out the indicated semantic processing, these should be done in parallel. When a phrase can be analyzed syntactically in two ways, one way may immediately be shown to be semantically meaningless and thus pruned from the parsing graph, simplifying further parsing. Thus the parsings:

(daughter of Chicago)'s mayor
mayor of (Chicago's daughter)

could be eliminated on semantic grounds. However, just as there may be segments of a sentence that are grammatical but semantically meaningless, there may be segments of the sentence that are grammatical and perhaps meaningful but which are not part of any complete parsing of the entire sentence. In the example:

The daughters of (some Harvard professors are attending Radcliffe),

the phrase in parentheses will parse into a clause, but this resulting clause is clearly not involved in any complete parsing of the sentence. Carrying out semantic processing on that segment would be a waste of time, since it will be excluded by syntactic analysis alone. Semantic processing, with its numerous references to the data base on disks, takes a great deal more time than analysis of the syntax of the sentence, which can be almost completely confined to main memory. Thus it pays to conserve on semantic analysis even at the price of higher costs in the parsing of the sentence. During the development of the REL system, experiments were conducted comparing serial and parallel methods of syntactic and semantic processing. For data bases of any size, serial processing was far more efficient.

The task of the semantic processor is to use the parsing diagram to schedule the execution of the interpretive routines. A number of alternative algorithms are available. The two aspects of semantic analysis that are at all complex are the binding of variables and the processing of loops. To illustrate, suppose the following definition is formulated:

def: sex ratio of “sample”: number of “sample” who are male *100/number of “sample” who are female

and the following question asked:

What is the maximum sex ratio of the children of each Mazulu crone?

In performing the semantic analysis, each member of the class of Mazulu crones must be considered, the class of her children formed and substituted in all instances of the free variable, indicated in the definition by “sample.” The resulting number, i.e., the sex ratio for the particular crone, must be put aside and the next member of the Mazulu crone class considered.
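A rough sketch of this control structure, with invented data and names, may help; only the binding of the free variable and the outer loop are modeled here, not the REL scheduling machinery.

```python
# Hypothetical data: each crone's children, and each child's sex.
CHILDREN = {"crone1": ["c1", "c2", "c3"], "crone2": ["c4", "c5"]}
SEX = {"c1": "male", "c2": "female", "c3": "female",
       "c4": "male", "c5": "female"}

def sex_ratio(sample):
    # Body of the definition, with the free variable "sample" now bound.
    males = sum(1 for x in sample if SEX[x] == "male")
    females = sum(1 for x in sample if SEX[x] == "female")
    return males * 100 // females   # assumes females != 0 in the demo data

# The outer loop the semantic processor supplies for "each Mazulu crone":
# bind, evaluate, set the result aside, move to the next member.
results = [sex_ratio(CHILDREN[c]) for c in CHILDREN]
print(max(results))   # the maximum sex ratio over the crones: 100
```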
An efficient and elegant style for organizing these functions has been developed in the Vienna Definition Language (Wegner, 1972). The REL Semantic Processor is a clean implementation of similar ideas. However, the details cannot be gone into here.

Another important and contentious issue in discussing natural language processing is ambiguity. Ambiguity is an important and useful aspect of natural language. Words and phrases inside a sentence, when taken out of the context of the sentence, may be highly ambiguous. The sentence context in which they are used often disambiguates them. For example, consider the sentence:

The subsection manager who has the smallest backlog will report first.

Considering its extensional meaning in a given data base, the word “manager” is highly ambiguous. “Subsection manager” is less so; “subsection manager who has the smallest backlog” will usually be unambiguous. A language in which each phrase must in isolation be unambiguous is very low in expressive power. The typical user of a natural language makes use of ambiguity constructively, usually ending up with all his phrases disambiguated within the context of his sentences or by the more general context of the situation.

Ambiguity poses many interesting and difficult problems for the linguist. These must be solved, at least in an ad hoc way, by the language writer in the development of a language such as REL English, as we have seen in the previous section. However, from the technical point of view of the language processor, there is no difficulty in handling ambiguity. An ambiguous phrase can be considered as a stack of its various meanings. The semantic processor itself can manage such stacks, maintaining calling sequences as it invokes the various interpretive routines involved. In this way, the interpretive routines need not be concerned at all with ambiguous inputs, though they may return ambiguous output in an appropriate form for the semantic processor to recognize. As long as the semantic processor is called recursively inside interpretive routines that set up internal loops, ambiguity is simply and automatically handled. If an entire sentence or query is ambiguous, the output routines present its various meanings in a straightforward way.
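A sketch of this bookkeeping, under assumed names and with a toy semantic test, illustrates how a stack of candidate meanings can flow through an interpretive routine that itself knows nothing about ambiguity.

```python
from itertools import product

def apply_ambiguous(routine, *arg_stacks):
    # An ambiguous phrase is carried as a list of candidate meanings; the
    # processor runs the routine on every combination and keeps the ones
    # that make sense. The result may itself still be ambiguous.
    results = []
    for args in product(*arg_stacks):
        try:
            results.append(routine(*args))
        except ValueError:        # routine rejects a meaningless reading
            pass
    return results

def mayor_of(x):
    if x != "Chicago":            # toy semantic rule: only cities have mayors
        raise ValueError("meaningless")
    return f"mayor({x})"

# Two candidate meanings for the inner phrase; one survives the pruning.
print(apply_ambiguous(mayor_of, ["Chicago", "daughter(Chicago)"]))
# ['mayor(Chicago)']
```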
8.2 Fluency and Language Learning

One learns FORTRAN by learning its syntax, i.e., by learning the relatively few forms statements may take and the strict rules for punctuation. In the early stages of writing FORTRAN programs, or after being away from it for a while, one keeps a manual at one's side.
However, even programming languages of greater complexity than FORTRAN, notably PL/I, are essentially impossible to learn in this way. Natural languages of the character we are discussing in this paper cannot be learned from a manual. On the other hand, learning to use REL English is not like learning a foreign language. The claim is that it is natural. If this is so, one should be able to use it effectively without any learning at all. Just where are we in achieving such a goal?

As was stated in the section on English, there are several systems that are beyond the threshold of being natural. The one that has gone farthest is the Lunar Rocks System of Woods et al. (1972). It is instructive to review his goals insofar as fluency is concerned and the means he required to achieve them. Woods sought, and to a commendable degree achieved, a system with these characteristics: geologists could approach it for the first time, sit at a terminal and type in questions concerning technical aspects of the lunar rocks, have their questions correctly understood by the computer, and of course be given correct responses. Certainly Woods' handling of ordinary English constructions is good. However, a real key to the success of his system is the attention that has been given to incorporating vocabulary, syntax, and intensional semantic aspects of the narrow and highly technical universe of discourse.

REL English is not yet as far along in its general English capability. Further, each application of REL English to a particular data base requires the addition of vocabulary and idioms, definitions incorporating intensional information and, from time to time, special interpretive routines and syntax. Typically, therefore, the user of REL English goes through a period of several hours of language play in which he formulates questions more to see whether and how the computer understands them than for their intrinsic interest to him. At this stage in the REL English development he will soon discover forms the computer does not understand, many of them of an elliptic nature where intensional knowledge is needed. However, our experience even at this stage of development is that the system responds in a sufficiently satisfactory manner to capture the user's interest. This period of language play, of intensive learning, tapers off relatively quickly as the user begins to settle down to serious work with the system. However, it never completely disappears, especially as he himself begins to extend his language through definitions.

Stepping back to appraise the level of fluency achieved today, and equivalently the naturalness of the language to the new user, it must be admitted that we are not far above the threshold where the language seems at all natural. There appear to be no essential barriers to continued and rather rapid improvement.
Practical systems for technical applications and serious research can to a large degree avoid the difficult semantic problems such as indirect discourse, modal sentences, and adverbials such as “near” and “quickly.” We believe that truly fluent systems can be achieved in the near future. As that goal is approached, the problem of language learning will become similar to the ubiquitous problem of picking up the dialect when one enters a new environment.

A system such as REL English, whose objective is a broadly applicable capability that can be subsequently specialized to a narrow application area, will never be found to be as fluent as desired. As ordinary speakers, we seem peculiarly unaware of the specialized nature of the language we use, indeed of the different languages we use as we move from context to context in the span of a day. In the development and application of natural language processing systems, many English languages will be designed for narrow universes of discourse, incorporating specialized vocabulary and interpretive routines that embody intensional meaning. These will soon be capable of an easy fluency. Others, built to be more generally applicable, will have to be approached as when hiring an inexperienced but bright assistant, one that at least will not have to be told twice.

8.3 What Kinds of Systems Can We Expect?

8.3.1 Data Analysis Systems
Researchers, especially in the areas of the social and environmental sciences, are applying standard statistical analysis methods to data bases whose size is of the order of 10⁴ to 10⁶ items of data. The present technology is to use available packages of statistical and related programs in conjunction with FORTRAN. Among the most successful packages are BMD, SPSS, OSIRIS, and the IBM Scientific Subroutine Package. To use these packages, one has to have a basic knowledge of FORTRAN. Further, each program in the package has particular, often complex, input and output requirements. It has been our observation, in conjunction with a course in data analysis and modeling, that with the data already in the computer as a data set, students experienced in using FORTRAN still require several days to familiarize themselves with the desired standard routines, write and debug their programs, which carry out rather straightforward data analysis assignments, and wait for their results.

This can all be bypassed in the immediate future by systems that allow complex data analysis to be carried out in response to straightforward statements in technical English. REL English comprises such a system nearing the prototype stage.
Data input is handled in REL English by bulk data input routines that accept data in customary forms from cards or unit record data sets. Once the data base has been built, the researcher can apply the statistical analyses he desires through English queries. For example, we have been using as a test data base the results of a questionnaire; the respondents were 1583 students of a liberal arts college; the 65 questions concerned attitudes and personal data such as age. Typical requests one can make are:

What is the correlation between the attitude on marijuana and the political position of the students of each class?
What is the mean and standard deviation of height of males and females?
Scatter plot personality type against religious beliefs.

Another data base on which we have tested the REL English system is the World Handbook Data from the Inter-University Consortium for Political Research. This data base reports statistics for each of four years concerning 135 United Nations countries. After having defined “per capita,” for example in this way,

def: per capita “area”: “area”/population

one can ask:

What are the regression coefficients for birth rate in terms of per capita governmental expenditures, per capita agriculture labor and percent Catholics of South American nations?

We have also had experience with other data bases, and so, of the data base developed by the anthropologist Thayer Scudder concerning the Tonga tribe in Zambia, one can ask, for instance:

What is the correlation between the age of first marriage and the number of wives of males whose age is greater than 35?

Of the data base concerning publications and affiliations of scientists, mentioned earlier, one can ask:

What are the institutional affiliations of physicists who were authors of papers whose author was John Smith?
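A hypothetical sketch of how such a definition might be applied: the quoted name plays the role of a free variable, and each use of the defined phrase substitutes the actual attribute before evaluation. The substitution mechanism shown is illustrative, not the REL implementation, and the figures are approximate.

```python
POPULATION = {"Chile": 9.5e6, "Peru": 13.2e6}    # approximate, early 1970s
AREA = {"Chile": 756_096, "Peru": 1_285_216}     # sq km, approximate

# def: per capita "area" : "area"/population
def per_capita(attribute):
    # "attribute" stands in for the quoted free variable: any mapping from
    # country to number may be substituted for it at the point of use.
    return {c: attribute[c] / POPULATION[c] for c in attribute}

print(per_capita(AREA)["Peru"])   # per capita area of Peru
```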
The REL English capability illustrated above can be viewed in a very mundane way as simply the next step in the orderly evolution of data analysis systems. From that point of view, we have put together in a single system: a grammar for a segment of technical English, a straightforward syntax-directed interpreter, an efficient data management system, and programs from a statistical subroutine package.
164
F. 6 . THOMPSON AND 6 . H. THOMPSON
We like to think that we have done so in an elegant way, and indeed have done more than just this. REL English as it now stands establishes the feasibility of a unified, effective natural language system for data analysis. Such systems should now move into the production phase.

8.3.2 Management Information Systems
The expression “management information systems” has come to mean a system for managing large data files made up of essentially independent records, with emphasis on report generation and file maintenance. For such tasks, natural language processing systems of the character of REL English are most likely not appropriate. We have raised the question in Section 4, Semantics and Data Structures, whether large file management systems will require a distinctly different implementation. In the near term, with latency times for accessing a page in disk memory of the order of 51 msec, response times of REL English for queries that require substantial data abstracting of large files (upwards of 10⁶ items of data) will not be compatible with conversational usage.

On the other hand, information systems for management should be of an entirely different character than the “management information systems” discussed today. The various offices in the management hierarchy of a modern large scale business each have their own idiosyncratic information requirements. Furthermore, these change; in fact their change is a manifestation of the constant ongoing reorganizing and reconceptualizing that is the very task of management at all levels in maintaining the firm in a responsive posture with regard to changing markets, resources, technology, and public controls.

Let us consider the present architecture of REL from the management information system's point of view. Limit the number of base languages to a single one. This single base language would be an extension of REL English to include the business terminology used generally in the particular firm. Each manager would have his own “user language package” containing his own records in terms of his own choosing. The normal staff meetings and communication between managers would establish categories of data that would be available for query from one manager's system to another. Modification of REL English data structures to include flags for controlling access would be made. The result would be a flexible, responsive natural language management information system.
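A rough sketch of this architecture, with invented names and structures (not the REL implementation): one shared base language, per-manager user language packages, and flags on data items controlling which other managers may query them.

```python
from dataclasses import dataclass, field

@dataclass
class Record:
    values: dict
    readable_by: set = field(default_factory=set)   # access-control flags

@dataclass
class UserLanguagePackage:
    owner: str
    vocabulary: dict = field(default_factory=dict)  # idiosyncratic terms
    records: dict = field(default_factory=dict)

    def query(self, requester, name):
        rec = self.records.get(name)
        if rec and (requester == self.owner or requester in rec.readable_by):
            return rec.values
        return None   # not present, or not released to this manager

sales = UserLanguagePackage("sales_mgr")
sales.records["backlog"] = Record({"units": 140}, readable_by={"prod_mgr"})
print(sales.query("prod_mgr", "backlog"))   # {'units': 140}
print(sales.query("hr_mgr", "backlog"))     # None
```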
The system concept we have sketched is only one of many open for exploration by natural language processing systems. The technology of natural language processing is now at a point where experimental systems can be designed and tested. Applications of such systems in limited areas of a firm should be possible in the near future.

8.3.3 Specialty Languages
Languages based upon English are not the only languages that are natural for communicating with the computer. There are several languages of quite different character that have been operating in the REL environment, some of which we have already mentioned: (i) the REL Animated Film Language for creating motion picture films of abstract forms in motion (Thompson et al., 1974), (ii) RELSIM, a language for designing, testing, and running discrete simulation models (Nicolaides, 1974), (iii) RELMATH, a language for numerical engineering problems, including the specification of new predictor-corrector type methods to solve differential equations (Bigelow, 1973), and (iv) the REL Language Writer's Language for implementing new, high level languages within the REL environment (Szolovits, 1974). These are only a few examples of the many applications areas that could be served by highly specialized languages designed for specific needs.

In the early days of computing, the few of us who used computers did our own programming, and in machine language at that. Then we learned to turn the actual implementation of routines over to programmers. Algebraic programming languages such as FORTRAN evolved, then business languages such as COBOL. Computer languages have proliferated, until now there are several hundred of them (Sammet, 1972), but by and large these languages were designed to be used by computer programmers and computer scientists, not by the typical user. A bifurcation in the class of programmers thus took place: on the one hand, (1) system programmers working on the underlying operating systems and language processors, and on the other, (2) application programmers who were using higher level languages to write the programs required by the ultimate users they were serving.

There has always been a quest for generality, and this was the case in applications programming. Often the programs written by application programmers contained parameters so that they could be used again and again with various changes in the parameter values. Subroutine packages became available. Note that through writing short programs that call existing subroutines to do the major computing tasks, rudimentary languages have evolved. The next step in this evolutionary development of programming is for the application programmer to turn to the writing of natural application languages (Thompson and Dostert, 1972; Bigelow et al., 1973).
Consider for example an application programmer in a large bank. Presumably he has become familiar with the business of the bank and the kinds of data analysis tasks that go on in its various offices. Suppose the Trust Officer would like to automate certain aspects of portfolio analysis. The applications programmer, recognizing that the Trust Officer will want to apply the automated analysis to portfolios of various customers, to various groupings of investments that may be of interest, to hypothetical changes in portfolios, etc., develops an extension of an existing base language such as REL English. This extension would embody syntax that would be natural for the Trust Officer, with interpretation routines that would efficiently carry out the desired forms of analysis. The Trust Officer would then be in a position to obtain the specific analyses he wants quickly and naturally from his own terminal.

Similarly, we can expect those organizations that support the major subroutine packages to integrate the programs in these packages underneath a natural syntax, making it unnecessary to write a FORTRAN program to call them. Perhaps the first amalgamation will use a rigid, formal language, hardly an improvement over FORTRAN. Ultimately it will certainly be a natural English language including the means for definition.

8.4 Why Natural Languages for Communicating with Computers?
This question could be approached from a number of points of view. One could argue the merits of English vs. traditional programming languages. One could debate whether the proliferation of so many specialized languages would destroy the ability to exchange programs. One could discuss the nature of problem solving and decision making and its implications for computer languages. However, the answer to the question rests on more down-to-earth, practical grounds-the economic forces of the market place. Here arc the facts of the matter. The cost per computer cycle has been falling a t the rate of approximately 25% annually, a rate that is likely to continue for the next decade. The rate at which programmers write debugged programs seems strangely constant a t about 10 statements per day, independent of the language in which they are writing. Thus, in order to maintain the present dollar value of the computer market, two factors must be increased sharply: the number of people who directly communicate with computers and the number of computer cycles per statement. Natural language systems, by making the computer directly accessible to the user bypassing the programmer, will break through the
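To put a rough number on these facts (a back-of-the-envelope estimate, not a figure from the text): at a 25% annual decline, the cost per cycle after a decade is

\[
(1 - 0.25)^{10} = 0.75^{10} \approx 0.056,
\]

i.e., less than one-seventeenth of today's cost, while the cost of a hand-written statement, at a constant 10 debugged statements per day, stays essentially flat.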
Natural language systems, by making the computer directly accessible to the user, bypassing the programmer, will break through the present programming bottleneck. If the language level is raised to that of natural languages and intensional knowledge in the interpretive routines is included, a computer response to a single statement will involve a great many more cycles. Therefore natural language processing systems are a potential solution to the economic pressures resulting from the continuing downward spiral of computing costs and the rapidly increasing relative costs of programming. These pressures, together with the rapid advance in our technical ability to implement efficient natural language processing systems, are the reasons why communicating with computers in natural languages will become available in the period ahead.

REFERENCES

Bigelow, R. H. (1973). Computer languages for numerical engineering problems (REL Rep. No. 5). Doctoral Dissertation, California Institute of Technology, Pasadena.
Bigelow, R. H., Greenfeld, N. R., Szolovits, P., and Thompson, F. B. (1973). “Specialized Languages: An Applications Methodology,” Nat. Comput. Conf., 1973, REL Rep. No. 7. California Institute of Technology, Pasadena.
Bross, I. D. J., Shapiro, P. A., and Anderson, B. B. (1972). How information is carried in scientific sub-languages. Science 176, 1303-1307.
Chomsky, N. (1965). “Aspects of the Theory of Syntax.” MIT Press, Cambridge, Massachusetts.
Dostert, B. H. (1971). “REL-An Information System for a Dynamic Environment,” REL Rep. No. 3 (Appendix B: Scudder Protocol). California Institute of Technology, Pasadena.
Dostert, B. H., and Thompson, F. B. (1971). How features resolve syntactic ambiguity. Proc. Symp. Inform. Storage Retrieval, 1971, pp. 19-32.
Dostert, B. H., and Thompson, F. B. (1972). Verb semantics in a relational data base system. Proc. ONR Symp. Text Process. Sci. Res., 1973, pp. 97-108.
Dostert, B. H., and Thompson, F. B. (1974). “The Syntax of REL English,” REL Rep. No. 1 (rev.). California Institute of Technology, Pasadena.
Fillmore, C. J. (1968). The case for case. In “Universals in Linguistic Theory” (E. Bach and R. Harms, eds.), pp. 1-90. Holt, New York.
Green, C., and Raphael, B. (1968). The use of theorem proving techniques in question answering systems. Proc. Nat. Conf. ACM, 23rd, 1968, pp. 169-181.
Greenfeld, N. R. (1972). Computer system support for data analysis (REL Rep. No. 4). Doctoral Dissertation, California Institute of Technology, Pasadena.
Kay, M. (1967). “Experiments with a Powerful Parser.” The Rand Corporation, Santa Monica.
Nicolaides, P. L. (1974). RELSIM-An on-line language for discrete simulation in social sciences. Doctoral Dissertation, California Institute of Technology, Pasadena.
Quillian, M. (1969). The teachable language comprehender: A simulation program and theory of language. Commun. ACM 12, 459-476.
Robinson, J. A. (1967). A review of automatic theorem-proving. AMS Symp. Appl. Math., 19th, 1967, pp. 1-18.
Sammet, J. E. (1972). Roster of programming languages. Comput. Autom. 21, 123-132.
Schank, R. C. (1973). Identification of conceptualizations underlying natural language. In “Computer Models of Thought and Language” (R. C. Schank and K. M. Colby, eds.), pp. 187-248. Freeman, San Francisco, California.
Sutherland, I. E. (1963). Sketchpad: A man-machine graphical communication system. Proc. Spring Joint Comput. Conf., 1963, pp. 329-346.
Szolovits, P. (1974). The REL Language Writer's Language: A metalanguage for implementing specialized applications languages. Doctoral Dissertation, California Institute of Technology, Pasadena.
Thompson, F. B. (1966). English for the computer. Proc. Fall Joint Comput. Conf., 1966, pp. 349-356.
Thompson, F. B. (1974). “The REL Language Processor,” REL Rep. No. 11. California Institute of Technology, Pasadena.
Thompson, F. B., and Dostert, B. H. (1972). The future of specialized languages. Proc. Spring Joint Comput. Conf., 1972, pp. 313-319.
Thompson, F. B., Bigelow, R. H., Greenfeld, N. R., Odden, J., and Szolovits, P. (1974). “The REL Animated Film Language,” REL Rep. No. 12. California Institute of Technology, Pasadena.
Wegner, P. (1972). The Vienna definition language. Comput. Surv. 4, 5-63.
Winograd, T. (1972). “Understanding Natural Language.” Academic Press, New York.
Woods, W. A. (1970). Transition network grammars for natural language analysis. Commun. ACM 13, 591-606.
Woods, W. A., Kaplan, R. M., and Nash-Webber, B. (1972). “The Lunar Sciences Natural Language Information System.” Bolt, Beranek & Newman, Cambridge, Massachusetts.
Artificial Intelligence-The
Past Decade
.
B CHANDRASEKARAN Deportment of Computer ond Information Science The O h i o Stofe University Columbus. Ohio
1. Introduction . 170
2. The Objectives of the Review . 173
3. Language Processing . 176
   3.1 Question-Answering Systems . 176
   3.2 Winograd: A Procedural Model for Understanding . 177
   3.3 Semantic Networks . 186
4. Some Aspects of Representation, Inference, and Planning . 195
   4.1 General Remarks . 195
   4.2 STRIPS . 196
   4.3 PLANNER . 199
5. Automatic Programming . 202
   5.1 General Remarks . 202
   5.2 A Program Synthesis System . 203
6. Game-Playing Programs . 205
7. Some Learning Programs . 208
   7.1 General Remarks . 208
   7.2 Structural Learning . 209
8. Heuristic Search . 213
   8.1 Hill Climbing . 213
   8.2 Search in Graphs . 213
   8.3 Application to Chemical Synthesis . 216
9. Pattern Recognition and Scene Analysis . 217
   9.1 Formal Approaches . 217
   9.2 Heuristic Techniques . 218
10. Cognitive Psychology and Artificial Intelligence . 220
    10.1 General Remarks . 220
    10.2 ANALOGY Program . 222
    10.3 Belief Systems . 224
11. Concluding Remarks . 224
References . 225
1. Introduction

Artificial Intelligence has gone through a sober process of realizing that human beings are cleverer than it supposed. It has turned to a more cautious and diversified strategy of accumulating “know-how” rather than mounting frontal assaults.
This remark, taken from a recent review (Williams, 1973) of a book (Dreyfus, 1972) attacking the claims and objectives of Artificial Intelligence (AI), fairly summarizes the thrust of the most recent approaches and achievements of the field. In a personal survey by this author of many workers active in the field, almost all conceded that their estimates of the complexity of the processes involved in creating intelligence have gone up rather than down in the past few years. There has also been a perceptible difference in the emphases given by the workers in the field. One eminent investigator has suggested that the activities are shaping the field into a broader “science of intelligence” rather than the more narrowly defined task of creating artificial intelligence. These shifts in emphases, aims, and general strategy testify to both a greater appreciation of the complexity of the task and a greater maturity, corresponding to a shift in preoccupation from product to process.

The enterprise nevertheless remains controversial. Dreyfus, who was originally a critic of the excessively optimistic claims of some of the practitioners in the field, has recently emerged with the book referred to earlier, in which the basis of his objections becomes startlingly clear as a kind of mysticism of the mind. He ends up raising objections to the very way, in fact the only way, in which science can deal with external reality: namely, that there exist objects quite independent of human interests, and among these objects are the human organism and its associated behavior. Dreyfus rejects what he calls the “Platonic” assumption, which can be roughly summarized as the belief that phenomena and objects of experience can be studied by explicit and determinate rules of procedure. This leads him to dismiss the question, “How does man produce intelligent behavior?” with the remark, “the notion of ‘producing’ behavior is already colored by the (Platonic) tradition. For a product must be produced in some way; and if it is not produced in some definite way, the only alternative seems to be that it is produced magically.” Williams' response to this strange suggestion is worth quoting: “. . . [If] the thought is that a given piece of behavior can appear on a given occasion, and not be produced on that occasion in some definite way-then yes, indeed, it would be produced magically. That is the magic Dreyfus is calling to us from his counter-Platonic cavern. But however depressed we may sometimes be by the threats and promises of the machine man, we are not forced in there yet.”
The point is that no sensible attack on AI can be launched from a position which regards the scientifically eminently valid question, “How does man produce intelligent behavior?” as “colored” in any way.¹ Dreyfus' position is representative of a class of opinions which appears to speak with philosophic rigor (as opposed to arguments based on revelation) but actually concludes by proclaiming a kind of mysticism about the human mind.

It by no means follows that all the philosophical, social, and moral ambiguities vanish simply by invoking the scientific world view. While the prospect of a subversive HAL is most probably farther away than 2001, if ever, there are variants to the moral question, not all related to robots. If human intelligence can be equalled by a machine, there is apparently no reason why it cannot be surpassed. There exists a small minority which looks upon the prospect of a mechanical species superior to man, as man is superior to a cow, with either bovine acquiescence or with the excitement of a devout midwife during the birth of a heralded prophet. A variant of the moral question is raised by Weizenbaum (1972), himself an early worker in AI, who maintains that the metaphors used to view man control the direction man fashions for himself, and the metaphor of the machine which is at the root of AI can be a most damaging one. The rapist, says Weizenbaum, will taunt the victim with “it is your dream, lady!”

At the opposite end to a mysticism of the mind, there has been a persistent tendency among many workers in the field to ascribe to the mind a degree of simplicity as almost an obvious fact. This would not be of consequence except for the fact that it sometimes leads to the pursuit of models which would be rejected as absurdly inadequate had the investigator had a proper appreciation of the subtlety and complexity of the mind. We have discussed elsewhere (Chandrasekaran and Reeker, 1974) some of the issues relating to the complexity of both the structure of the human mind and the identification experiments necessary to unravel that structure.

More recently, in Britain, there has been a storm born out of a report critical of the prospects of AI. The report, written by Sir James Lighthill (1973), classifies AI activities into three categories:

Category A: Standing for Advanced Automation, in which category he places optical character recognition, speech recognition, automatic theorem proving, inference of chemical structure from mass spectrometry and other data, machine translation, product design and assembly, problem solving, decision making, etc.

¹ Unless one means by “produce,” “effectively produce.” In that case, it is “colored,” since it is not self-evident that all the mental functions are effectively computable (see Chandrasekaran and Reeker, 1974).
Category B: Building robots, mimicking some special functions that are highly developed in man: eye-hand coordination, use of natural language, “common sense” problem solving within some limited universe of discourse such as games and puzzles. Sir James views this as a “bridge-building” activity, concerned with building robots for various purposes including the feeding of information into work of categories A and C.

Category C: Computer-based central nervous system (CNS) research, which includes modelling various parts of the CNS, scene analysis, memory, the biological basis of learning, psycholinguistics, etc.

He regards progress in A and C as slow but definite, but B has no triumphs, major or minor, to show for it. Further, he holds that work in A and C is more defensible from the viewpoint of its own stated objectives, but thinks that activity in B will trail off to nothing due to lack of success. Needless to say, the report has generated very strong negative feelings from the leaders in AI, both in Britain and the United States.

While the spirit of Sir James' classification might have some merit to it, his assignment of the various research activities to the categories is highly idiosyncratic and misleading. There seems to be a misunderstanding due to the fact that many problems with a great deal of scientific content have been, for funding and other reasons, posed in the context of building robots, and certain problems which are closely connected with making robots work have, because of their abstract content, often been presented in a non-robotic context. Be that as it may, there are indeed three main lines of enquiry, sometimes closely connected, sometimes not. Most of the rebuttals (printed in the same volume in which the report appears) take issue with one of the conclusions, that work in B is not especially worth supporting. This is where the distinction between the spirit of the classification and the details of the assignment of activities to categories becomes important, since even those who might agree that building robots might not be that important would doubtless consider many of the projects placed by Sir James in category B worthy in terms of scientific content.

While AI still has some distance to travel before it can claim to be a coherent discipline, and will indeed inevitably redefine itself as work progresses, its subject matter, the simulation of the human thought process, will undoubtedly be illuminated by the power of the computer and computer-based theories. As long as one keeps clear of purely metaphysical objections, and views AI as no more and no less than the science of intelligence, much of the emotional controversy that AI engenders can be avoided.
2. The Objectives of the Review
The first, and still useful, review of work in AI is by Minsky (1961). A review published a few years later (Solomonoff, 1966) caught up with the main currents until 1966. Minsky and Papert (1972) is a very readable review of AI work at the Massachusetts Institute of Technology. As the years have gone by, the body of work in AI has grown enormously and in many directions. Thus a review such as this has to be selective, and highly idiosyncratic at that. We have therefore been forced to omit coverage of many subareas owing to limited space. Further, almost any categorization attempting to impose some order on the material is bound to be unsatisfactory. Minsky had organized part of his review on the basis of what he perceived to be the various elements of an intelligent program: search, pattern recognition, learning, planning, and induction. Our aim is to give a feel for the way some of the major programs work, and thus our organization is somewhat different.

A few general observations about the evolution of AI seem worth making at this point. Formal mathematical models play an increasingly diminishing role, and the vogue for “general purpose machines” or “self-organizing machines” seems to be past, because their paucity of structure results in an inability to do anything interesting at the level of interest to AI. However, this does not imply that formal mathematical models do not have any interest to the student of AI. They can say very interesting things about some foundational questions (see, e.g., Minsky and Papert, 1969; Banerji, 1969; see also Kugel, 1973, for a mathematical treatment of philosophical foundations). It is simply that these models do not directly lead to programs to perform complex tasks. Also, there seems to be a great deal of mystique associated with search as a major paradigm. Search is absolutely necessary in some situations, but in general it is a necessary evil until the structure of the underlying system is understood properly and the necessary structural heuristics² are extracted. A dominant school of thought, with which we have sympathy and which this review reflects, regards AI as almost exclusively defined by heuristic programming.

² A heuristic is a rule of thumb, strategy, method, or trick used to improve the efficiency of a system which tries to discover the solutions of complex problems. A heuristic program is a computer program that uses heuristics (Slagle, 1971). Heuristics are obtained from an intimate knowledge of the problem domain. Heuristics often give great insight into the corresponding psychological mechanisms in the human faced with the same problem.
In all difficult fields, important advances do not occur too often. A review such as this might be useful if at least an attempt is made to concentrate on the important ideas rather than to try to be an exhaustive list of references. Of course the risk is taken that it is one man's idea of important advances, but we think that the risk is worth taking. While most of the major centers of AI research are in the United States, a few in Europe, a few emerging ones in Japan, and almost certainly some in the U.S.S.R., we have concentrated mainly upon the work in America, with reasonable attention paid to Great Britain. This is not due to any intentional chauvinism, but that is what the ease of accessibility and availability of literature and the limitations of our familiarity dictated.

In our evaluation, the following developments of the past decade have been significant. Computer semantics for natural language analysis has emerged as a significant subfield of activity, and ideas on how syntax and semantics interact in an intimate way have resulted in an interesting language understanding system. Sophisticated and convenient programming languages such as PLANNER promise to aid greatly in problems of representation of knowledge and deduction. Central to this development is the insight that knowledge is really vacuous unless it is related to action, and thus knowledge should be represented as procedures. Automatic programming, which is really the development of very high level programming languages, seems to be opening up as an area with potential for interesting and practical advances. The late 1960's saw the development of the Greenblatt program for chess, doing quite well in the middle game. Several other games have been programmed with varying degrees of success. In the study of learning systems, the most useful progress has been made in structural learning, with emphasis on proper training sequences. Heuristic search has been studied mathematically and some interesting theorems have been discovered. Analysis of scenes containing polyhedra has become quite sophisticated over the years. Finally, the relevance of AI to cognitive psychology is becoming more and more apparent; in fact, as the science of intelligence, AI can be legitimately looked upon as falling squarely within the field of psychology.

We have organized the material in the following sections: Language Processing; Some Aspects of Representation, Inference, and Planning; Automatic Programming; Game-Playing Programs; Some Learning Programs; Heuristic Search; Pattern Recognition and Scene Analysis; and Cognitive Psychology and AI. We have selected areas in which we believe significant progress has been made in the past decade and which show promise of yielding useful results in the near future. In each of the areas, we select the programs or approaches which seem fundamental, describe them in some detail, and give references to others of related interest.
As mentioned earlier, we mainly emphasize work with a strong heuristic content to it. The references have been selected neither with a view to being exhaustive, nor to being fair in assigning credit or priority. More often, recent references containing the less basic ideas have been quoted rather than the seminal articles themselves, since the later paper is likely to contain its own list of references. Also, an arbitrarily exercised notion of “typicality” has governed the choice of references. We have generally omitted detailed discussion of work which has already found its way into easily accessible textbooks. Among other glaring omissions is hardware for AI (e.g., robot systems, how they work, what they consist of, etc.). This is partly because we believe that the intellectual aspects of it can be adequately covered under other headings, but mainly because of our ignorance of it. We believe that with all the omissions, the following pages should still give the reader a reasonable feeling for the kind of progress that has been made in AI in the past decade.

For the reader who is interested in following on-going activities in AI, the following may be recommended: publications of the Association for Computing Machinery (ACM), especially the Communications, and occasionally the Journal; SIGART News, the newsletter of the ACM Special Interest Group on AI; ACM Computing Reviews for an annotated bibliography; the journal Artificial Intelligence published by Elsevier; Proceedings of the National Computer Conferences; the Transactions on Systems, Man and Cybernetics and the Transactions on Computers, both published by the Institute of Electrical and Electronic Engineers; a series of volumes entitled Machine Intelligence, which are proceedings of the annual workshops on AI held at the University of Edinburgh, published in the United States by American Elsevier; and the proceedings of the International Joint Conferences on AI, which are biannual affairs. Among the more recent collections of articles, we mention Findler and Meltzer (1971), Pylyshyn (1970), Simon and Siklossy (1972), Minsky (1968), Banerji and Mesarovic (1970), and Schank and Colby (1973). The textbooks by Slagle (1971), Nilsson (1971), and Jackson (1974) can be recommended as introductions to the subject, while Duda and Hart (1973) covers scene analysis adequately and Chang and Lee (1973) concentrates on theorem proving in the first-order calculus. The series of memoranda published by the MIT AI Laboratory, the Stanford AI Laboratory, and the Stanford Research Institute are valuable sources of information about current research. Of course, there exist numerous universities, both in the United States and abroad, which carry on a spectrum of AI activities; in fact, too many to list.
3. Language Processing

3.1 Question-Answering Systems
Since the early mechanical translation attempts failed, most of the AI activity dealing with natural language has been in the area of question-answering systems. Winograd (1972, Chapter 2) provides a brief overview of many of them. Simmons (1970) provides a more detailed survey. Since we believe that much of the progress in understanding natural language, progress of the sort that will be used as a foundation in the years to come, has been made by Winograd (1972), by Schank (1973), and by Simmons (1973), we discuss them in detail later in this section. Here, we content ourselves with a brief summary, mainly following Winograd (1972) and Simmons (1970), of the activity in the area of question-answering systems.

In the category of special format systems are included programs such as BASEBALL (Green et al., 1963), SAD SAM (Lindsay, 1963), STUDENT (Bobrow, 1968), and ELIZA (Weizenbaum, 1966), the first three dealing, respectively, with answering questions about baseball results, kinship structures in people, and algebra problems, and the last with maintaining a somewhat disinterested conversation. They are special format because the only information in the input that they use is that which fits their particular format. Among these, ELIZA is still popular in demonstrating some of the capabilities of the computer in language processing in a very superficial manner. It maintains a conversation which goes something like this (the italicized portion is the output of the computer): “well, my boy friend made me come here,” “Your boy friend made you come here?,” “He says I'm depressed much of the time,” “I am sorry to hear you are depressed,” etc. Its basic idea is simple: the pattern-matching part of the program finds certain keywords in the input, and this triggers the substitution of a suitable phrase for a specified part of the input sentence. Both the keywords and the substitution instructions are provided by a script which can be changed for different subjects of discourse. For example, if the input is “You are X,” the pattern-matching and substitutions would result in “What makes you think I am X?”.
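A minimal sketch of this keyword-and-substitution scheme; the script below is invented for illustration and is not Weizenbaum's original.

```python
import re

# The "script": keyword patterns paired with response templates.
SCRIPT = [
    (re.compile(r"you are (.*)", re.I), "What makes you think I am {0}?"),
    (re.compile(r"my (.*) made me (.*)", re.I), "Your {0} made you {1}?"),
    (re.compile(r"i'?m (.*)", re.I), "I am sorry to hear you are {0}."),
]

def respond(sentence):
    # Find the first keyword pattern present and substitute the matched
    # fragments of the input into the corresponding response template.
    for pattern, template in SCRIPT:
        m = pattern.search(sentence)
        if m:
            return template.format(*m.groups())
    return "Please go on."   # default when no keyword matches

print(respond("well, my boy friend made me come here"))
# Your boy friend made you come here?
```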
In the category of text-based systems, Winograd includes Protosynthex I and a question-answering system based on Quillian's semantic nets (see the later subsection on semantic nets). Protosynthex I operated mainly by indexing the locations of content words in a body of text and, when a question was given, reproduced with some modifications the sentences which had in some sense most in common with the question.
So-called limited logic systems use some inference from the data base to answer questions. SIR (Raphael, 1965), DEACON (Thompson, 1968), Protosynthex II and III (Simmons, 1966; Simmons et al., 1968), and CONVERSE (Kellogg, 1968) fall in this category. Slagle's DEDUCOM (1965) and the fact-retrieval system of Elliott (1965) would also fall in this category.

The general deductive systems category includes systems which use the power of first-order predicate calculus, or something close to it, to provide a formalism in which inferences can be naturally made, without the need to provide context-oriented inference rules as in the limited logic systems. Green and Raphael (1968) and Coles (1968), etc., are examples of this approach. Sandewall (1971a) also gives some ideas on how first-order predicate calculus can be used for question-answering systems. With the discovery of the resolution technique for proving theorems in first-order predicate calculus, these systems looked promising for a while. In fact, the uniformity of representation and inference that were features of predicate calculus-oriented systems were sometimes thought of as advantages, in the sense that the inference algorithm did not have to know anything about the subject matter, whether it be answering questions about airline schedules or proving the correctness of programs. Later it turned out that the uniformity was obtained at great cost: inefficiency and an inability to represent strategies appropriate to the specific subject matter. We discuss later some recent attempts to impart semantically oriented strategies to a resolution-based theorem prover. However, it is apparent that the outlook for uniform proof procedure-oriented natural language question answerers with a respectably sized data base is not very bright.

Finally we come to the procedural deductive systems, of which the most successful example is the system of Winograd, which we discuss in detail later in this section. Woods (1968) is also a limited example of this approach. The main idea is that procedural information, i.e., “how-to” knowledge, can be imparted to these systems in a language which does not depend upon the subject matter.
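To make the flavor of such uniform proof procedures concrete, here is a toy propositional resolution prover. It is illustrative only; the systems cited used first-order resolution with unification, which is considerably more involved.

```python
from itertools import combinations

# Clauses are frozensets of literals; "~p" is the negation of "p".
def negate(lit):
    return lit[1:] if lit.startswith("~") else "~" + lit

def resolve(c1, c2):
    # Yield every resolvent of two clauses.
    for lit in c1:
        if negate(lit) in c2:
            yield frozenset((c1 - {lit}) | (c2 - {negate(lit)}))

def refutes(clauses):
    # Saturate under resolution; deriving the empty clause means the
    # clause set is unsatisfiable, i.e., the query is entailed.
    clauses = set(clauses)
    while True:
        new = set()
        for c1, c2 in combinations(clauses, 2):
            for r in resolve(c1, c2):
                if not r:
                    return True
                new.add(r)
        if new <= clauses:
            return False
        clauses |= new

# Does {p implies q, p} entail q?  Add ~q and look for a refutation.
kb = [frozenset({"~p", "q"}), frozenset({"p"}), frozenset({"~q"})]
print(refutes(kb))   # True
```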
3.2 Winograd: A Procedural Model for Understanding
The most celebrated triumph in artificial intelligence in recent years is the program by Winograd to understand natural language (English) in the limited domain of discourse corresponding to the world of a (simulated) robot manipulating blocks of various colors, sizes, and shapes. While the domain of discourse is limited, the complexity of the sentences it can “understand” is very high indeed, as we shall soon see.
One must be careful to distinguish between the limited domain of discourse of Winograd's program and the limited kind of understanding that programs such as ELIZA (Weizenbaum, 1966), STUDENT (Bobrow, 1968), SAD SAM (Lindsay, 1963), etc., achieve. While Winograd's program is very famous, and justly so, it is also testimony to the usefulness of much work in language processing, less spectacular to be sure, that went on previously. Slow but steady progress had previously been made in parsing (Woods, 1969). The high-level programming languages LISP, MICROPLANNER, etc., were essential to the success of Winograd's program. When one reads Winograd or Schank, it is difficult to resist the feeling that something very similar must go on in human language processing, at least in some global-strategic aspects. Of course, it is virtually impossible to validate these models from this viewpoint.

One of the basic principles used in the design of Winograd's program is that the program must use the syntactic, semantic, and inferential components in an interlaced manner; i.e., rather than take a sentence, parse it by syntactic processes, then attempt to assign meaning by means of semantic routines, etc., the program must use available semantics to disambiguate at the syntactic level, use information based on partial parsing to disambiguate meaning, and if necessary use the inferential component to enable what has gone on before in the discourse and the properties of the program's “world” to select possible meanings and parsings.

Since a general language understanding program to deal with the real world would have to be awfully complex, because of the enormous previous knowledge that any participant in such a discourse has to have, a world in which “deep knowledge” is not very complex has to be chosen to test out the proposed organization: the world of a toy robot with an arm which can manipulate toy blocks on a table, build stacks, etc. As the computer changes its world, its representation of the world changes, while it remembers some aspects of the past world to locate the context of the present dialogue. The noteworthy aspect of the program is that its world can be changed in a relatively simple manner; the syntactic analyzer is complicated enough, and the inference mechanism is powerful enough, that the ability of the program in a real sense is not dependent upon the severity of the constraints imposed by the choice of the robot world to demonstrate the ideas involved. Another important programming feature is that there are special languages appropriate for each major component: the grammar of the language is given to the program in the high-level programming language PROGRAMMAR; the “facts” of the robot's world, including deductive aspects of it, are represented in a language called PLANNER.
We delve in some detail into PLANNER later in this article, when we consider inferential systems for AI.

One of the basic viewpoints underlying the work is that all language use can be thought of as a way of activating procedures within the hearer, i.e., understanding something is doing something, if not explicitly, then at least to the extent of changing something in the cognitive structure of the hearer. Thus each sentence interpreted by the program is converted into a set of instructions: to manipulate blocks, to change the data assertions, etc. Thus Winograd's program is based on a procedural model of language understanding. In this connection it is to be noted that the idea that procedures and data ought to be handled in a uniform way has been around in AI for some time, and in fact has substantially influenced the development of AI-oriented programming languages such as LISP.

To illustrate the capability of the program as well as to note various aspects of it, we give below excerpts from a sample dialogue. The numbering follows that in Winograd (1972). Note that the world of the robot at the beginning of the dialog consists of a desk, two green cubes (one on top of a larger red block and the other having a red pyramid on top of it), a blue block, a small red cube surmounted by a small green pyramid, and a blue pyramid in a box. The program answers “OK” when it executes a command.

1. pick up a big red block³
OK

Since a green block is on top of the red one, the system finds a space for the green block and moves it.

3. find a block which is taller than the one you are holding and put it into the box
The program has to decide what “one” and “it” refer to. It has to remember, by deduction, that it is holding the red block. There are two possible choices for “it,” the one selected and the red block. It chooses, by means of heuristics, the former as the intended referent, but communicates this choice to the human participant in the dialog.

5. what is the pyramid supported by?

THE BOX

³ The dialog excerpts are taken from “Computer Models of Thought and Language” (R. C. Schank and K. M. Colby, eds.). W. H. Freeman and Company. Copyright © 1973.
9. can the table pick up blocks?
NO
It uses the semantic rule that "picking up" is done by an animate object.
10. can a pyramid be supported by a block?
YES
11. can a pyramid support a pyramid?
I DON'T KNOW
Question (10) is answered by finding an actual example, while the system cannot find any example for (11) and does not have any extraneous knowledge to resolve the question.
13. the blue pyramid is mine
I UNDERSTAND
This information is stored in its data base.
17. will you please stack up both of the red blocks and either a green cube or a pyramid?
OK
A command is executed even if it looks like a question. Then follows a discourse at the end of which the robot is asked to put a small block onto the green cube which supports a pyramid.
21. put the littlest pyramid on top of it
It knows that "it" refers to the small block, and interprets "littlest" from the root form "little." When the heuristics and deduction from context are not sufficient, the system asks the user for more information. In this case it is not necessary. It can analyze complex tenses of verb groups, can deduce from memory what event is being referred to, and can relate other events to it in time. It can to some extent answer questions about its own motives, such as by replying "TO GET RID OF IT" to "why did you pick it up?" It does this by keeping track of selected parts of the original subgoal tree. It can accept new definitions, such as: a "steeple" is a stack which contains two green cubes and a pyramid. It can accept names for objects and can locate the newly named object in its memory at a time when
the object was not yet named. Thus, the sequence: call the biggest block "superblock"; OK; "have you picked up superblock since we began?"; YES.
3.2.1 Representation of Knowledge
Next, we consider some of the details of the program. The world model is a symbolic description, which abstracts those aspects of the world relevant to working with it and discussing it. The PLANNER language is used both for representing meaning and to give procedural information, i.e., "how-to" knowledge. The data base contains items such as (IS B1 BLOCK), (AT B2 (LOCATION 1 2 3)), (SUPPORT B1 B2), (MANIPULABLE B1), (COLOR OF B1 RED), (IS BLUE COLOR), etc.⁴ Note that it deals with two kinds of "concepts": concepts such as object B1, event E20, etc., and concepts such as BLOCK, BLUE, etc., which are actually conceptual categories. In a sense, MANIPULABLE and SUPPORT are also concepts, but they are more primitive, i.e., they are needed to describe other concepts.
⁴ The corresponding natural language representations are "B1 is a block," "B2 is at coordinate (1, 2, 3)," "B1 supports B2," "B1 is manipulable," "color of B1 is red," and "blue is a color."
At this point a few remarks, which pertain also to semantic networks of various kinds that we shall discuss shortly, are relevant. In formal semantics such as that for first-order predicate calculus, there is the underlying notion that meanings can be reduced to a set of pure elements. However, it is clear that in natural language concepts are quite often defined in a mutually dependent sort of way, i.e., the meaning of a concept resides as much in the interconnections with many other concepts as in itself. In semantic network models, this idea is represented in the form of a complicated set of relational paths that exist between "atomic" concepts. In Winograd's program, the system's knowledge which involves the interconnections between concepts is in the form of procedures written in the PLANNER language, especially suited for deduction in the form of satisfying some goal by setting up successive subgoals, as familiarized by GPS (Ernst and Newell, 1969) of a few years ago. For instance, the concept CLEARTOP X is represented by the procedure (check if X SUPPORTS any object Y; if so, GET RID OF it), involving concepts SUPPORT and GET RID OF. Each again is a procedure, involving other concepts such as PICKUP and GRASP.
To obtain a feeling for the goal-subgoal structure, let us examine GRASP. Suppose object B1 is in location K1, the robot has object B2
in its hand and is told to GRASP B1. The main goal is GRASP B1. It cannot, since it has B2 in its hand. The subgoal is then GET RID OF B2. To do this it has to put B2 some place on the table, corresponding to the subgoal PUTON B2 TABLE, where PUTON activates a procedure to look for suitable empty space. Let the space be denoted by K2; then the subgoal is PUT B2 K2. Finally the subgoal is MOVEHAND K1. This subgoal structure is retained in memory, and thus the program can answer questions such as "why did you put B2 on the table?" by looking into the subgoal structure to locate the subgoal previous to PUT B2 K2 and come up with "TO GET RID OF IT." (Exactly how the natural language response is generated is another question, which will be considered later.)
It is not necessary that a sentence should be a command for it to be represented as a procedure. All "concepts" are so represented. Consider the concept denoted by the phrase "a red cube which supports a pyramid." When this phrase is encountered, essentially a procedure to locate such an object in its world is activated. This procedure consists of goals to decide if an object is a block, is red, and has all its sides equal, and when an object satisfying these goals is found, a further goal to check whether any pyramid is supported by the cube.
While meanings are represented internally as procedures, we still have to specify how a program (or a set of appropriately modified procedures) is put together corresponding to an interpreted English sentence. First of all, simple words have both semantic markers and procedural representation as well as the syntactic category in the chosen grammatical system. For instance the word "cube" has in its dictionary definition "noun" for specifying the syntactic category, and the semantics specify that it is an "object," "manipulable," and "rectangular" (semantic markers), "is block" and "equidimensional" (procedural). The semantic markers aid in efficiency in the sense that quick checks to eliminate inapplicable meanings can be made by using them. As a further example, the word "contain," which has two meanings, one in the sense of a jug containing water and the other of a book containing pages, has in its dictionary definition the following information: transitive verb, relation, meaning either (if X1 is a "container" and X2 a "physical object," then "X1 contains X2") or (if X1 is a "construct" and X2 a "physical object," then "X2 is part of X1"). Of course, the definition is written in the programming language PLANNER. The definitions are calls to programs, in these examples to OBJECT and RELATION, which build the semantic structures. Thus again meanings are programs or procedures. However, as one gets beyond simple words, the programs are more complicated and start incorporating checks to past discourse (for example, in the case of
pronouns or "the"), and actions to look for specific objects in the data base, and so on.
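Purely as an illustration of this procedural view of meaning, the following minimal sketch in Python (our own invention, far removed from the LISP and PLANNER of the actual system; every name in it is hypothetical) shows a dictionary entry as a procedure and a remembered subgoal tree used to answer "why" questions:

    # Toy world model: a set of assertions, as in (SUPPORT B1 B2).
    world = {("IS", "B1", "BLOCK"), ("EQUIDIMENSIONAL", "B1"),
             ("SUPPORT", "B1", "B2"), ("HOLDING", "B2")}

    def is_cube(x):
        # The "meaning" of the word "cube" as a procedure: a cube is a
        # block with all its sides equal.
        return ("IS", x, "BLOCK") in world and ("EQUIDIMENSIONAL", x) in world

    why = {}  # goal -> parent goal: the retained subgoal tree

    def achieve(goal, parent=None):
        # Achieve a goal, recording its parent so "why did you ...?"
        # can later be answered from the subgoal tree.
        why[goal] = parent
        verb = goal[0]
        if verb == "GRASP":
            for fact in list(world):
                if fact[0] == "HOLDING":            # hand is full: empty it
                    achieve(("GET-RID-OF", fact[1]), goal)
            world.add(("HOLDING", goal[1]))
        elif verb == "GET-RID-OF":
            achieve(("PUTON", goal[1], "TABLE"), goal)
        elif verb == "PUTON":
            world.discard(("HOLDING", goal[1]))     # crude "set it down"

    def why_did_you(goal):
        parent = why.get(goal)
        return "TO " + " ".join(parent) if parent else "BECAUSE YOU ASKED ME TO"

    print(is_cube("B1"))                            # True
    achieve(("GRASP", "B1"))
    print(why_did_you(("PUTON", "B2", "TABLE")))    # TO GET-RID-OF B2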
3.2.2 Syntactic Processing
Next we come to the organization of the syntactic processor. The grammar used for English is based on the systemic grammar of Halliday (1970). Since the primary task of the system is "understanding," the processor is based on identifying syntactic units that are useful in determining meaning. In systemic theory, for each of the units making up a syntactic structure, features describing its form and the functions it serves in context need to be specified. For instance, a noun group has features such as determined, singular, etc., and functions such as subject, etc. The syntactic features are basic to the description of the semantic rules, since part of the input to the process of semantic interpretation is the syntactic structure of the sentence, clause, or other syntactic units. The grammar used in the system is mainly an interpretation grammar for accepting grammatical sentences, rather than a generative grammar. This emphasis on interpretation and acceptance demands a grammar which is substantially different from traditional grammars like transformational grammars, which are highly biased toward generation. The systemic grammar, whose syntactic units are organized on the basis of meaning, is such a grammar.
Perhaps a short detour describing more traditional parsers will be appropriate here. In earlier artificial intelligence programs involving natural language, some kind of pattern matching was used in the place of any elaborate parsing. Systems like STUDENT, SIR, and ELIZA could make do with that, mainly because the kind of understanding sought as well as the domain was highly restricted. On the other hand, systems such as the Harvard Syntactic Analyzer (Kuno, 1965), whose main objective was simply parsing rather than analysis along the way to doing something else, experimented with context-free grammars and corresponding parsing algorithms for English. Apart from the fact that they cannot, in principle, handle some important aspects of natural language (Chomsky, 1957), such systems are not very amenable to interfacing with semantic routines in the interlaced manner in which Winograd's program works. Parsers based on transformational grammar (Petrick, 1965; Zwicky et al., 1965; Friedman et al., 1971) become very complex when applied in an acceptance-based system because of the combinatorial explosion in the inverse transformational process.
More recently, a parser based on "augmented transition networks" (Woods, 1969) has been developed that seems capable of handling the
complexity of natural language. This parser and others closely related to it (Thorne, 1969; Bobrow and Fraser, 1969) have the theoretical power of Turing machines, thus having inherently as much power as any mechanical parser (see also Woods et al., 1972). For regular grammars, it is well known that there exists a finite-state transition network in which the parser transits to another state depending upon the input symbol being examined. Eventually, when the sentence terminates, if the parser is in an accepting state, the sentence is accepted. Instead of transition based on the current input symbol (or word), suppose transition to another state is on the basis of a condition associated with the arrow, such as NP, Aux, etc., and one network is allowed to make recursive calls to other networks or to itself. For instance, from state S to q1, let the transition condition be NP, and let there be a corresponding network for NP. The parser transits to the NP network, and when it reaches a terminating state for that network, the parser reverts to the original network but goes to state q1. Thus corresponding to a grammar there will be a set of transition networks. Now, if, in addition to recursive calls, the parser can use a set of registers associated with the networks, if the contents of the registers can be changed, and if the transitions are made conditional on the contents of these registers, we get the augmented transition networks. The conditions can be interpreted as tests for syntactic or semantic agreement, the presence or absence of lexical features, etc., and the changing of the contents of the registers can be viewed as operations or subroutines whose arguments might be the type of sentence, the value of the AUX, of the VERB, of the VP, etc.
Winograd's parser has similarities to the augmented transition network parser. Networks and programs are essentially similar kinds of things, as persuasively argued by Winograd. The backup features are similar in some ways and different in others. The augmented transition networks are modifications of nondeterministic transition diagrams, and thus when an original choice does not lead to an accepting state the network has to revise its choice by an automatic backup mechanism. (It is actually implicit in the operation of the network; either that, or all the possibilities are simultaneously kept track of.) In Winograd's parser, however, the backup mechanism can keep track of the reasons for failure and guide itself accordingly. Another difference is that the network systems are still basically transformational grammar-oriented, while the Winograd parser, as mentioned earlier, uses a version of the systemic grammar, with its "ability to examine the features of the constituents anywhere on the parsing tree, and to manipulate the feature descriptions of nodes." Further,
the interrupt features of PROGRAMMAR and other special additions add extra efficiency.
The systemic grammar recognizes three ranks of syntactic units: the CLAUSE, the GROUP, and the WORD. The groups are of several types: NOUN GROUP (NG), VERB GROUP (VG), PREPOSITION GROUP (PREPG), and ADJECTIVE GROUP (ADJG). The WORD is the basic building block, and it exhibits features, e.g., PLURAL. The various GROUPS have specific functions in conveying meaning, e.g., NG's describe objects. The CLAUSE is the most complex unit: it can be a QUESTION, DECLARATIVE, or an IMPERATIVE, and within each category there are several possibilities. Clauses can be parts of other clauses. The advantage of grouping the constituents into units is that the units have features which are important in conveying meaning, and in systemic grammar they can be explicitly mentioned. In traditional grammars these features and others which are not relevant to meaning are all implicit in the rules, and there is no efficient way to isolate and identify the semantically useful features.
3.2.3 The System
We mentioned earlier that the three components, syntactic, semantic, and inferential, work in an interlaced manner. To quote Winograd,
As soon as a piece of syntactic structure begins to take shape, a semantic program is called to see whether it might make sense, and the resultant answer can direct the parsing. In deciding whether it makes sense, the semantic routine may call deductive processes and ask questions about the real world. As an example, in the sentence "put the blue pyramid on the block in the box" of the dialog, the parser first comes up with "the blue pyramid on the block" as a candidate for a noun group. At this point, semantic analysis is begun, and since "the" is definite, a check is made in the data base for the object being referred to. When no such object is found, the parsing is redirected to find the noun group "the blue pyramid." It will then go on to find "on the block in the box" as a single phrase indicating a location. In other examples the system of semantic markers may reject a possible interpretation on the basis of conflicting category information.⁶
⁶ From "Computer Models of Thought and Language" (R. C. Schank and K. M. Colby, eds.). W. H. Freeman and Company. Copyright © 1973.
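The flavor of this interlacing can be suggested by a small sketch of our own in Python (the miniature grammar, data base, and names are invented for exposition, not taken from Winograd): each candidate noun group is submitted to a semantic check against the data base, and failure redirects the parse to a shorter reading.

    # Toy data base: there is a blue pyramid, but it is not on any block.
    world = {("PYRAMID", "P1"), ("COLOR", "P1", "BLUE"),
             ("BLOCK", "B1"), ("IN", "B1", "BOX1")}

    def satisfies(obj, feature):
        # Each feature of a noun group is a small semantic test.
        if feature[0] == "ISA":
            return (feature[1], obj) in world
        if feature[0] == "COLOR":
            return ("COLOR", obj, feature[1]) in world
        if feature[0] == "ON":                      # resting on some block?
            return any(("ON", obj, f[1]) in world
                       for f in world if f[0] == "BLOCK")
        return False

    def referent_exists(description):
        # The semantic routine: is there an object meeting every test?
        objects = {f[1] for f in world}
        return any(all(satisfies(o, feat) for feat in description)
                   for o in objects)

    # Candidate parses of "the blue pyramid on the block", longest first.
    candidates = [
        [("ISA", "PYRAMID"), ("COLOR", "BLUE"), ("ON", "BLOCK")],
        [("ISA", "PYRAMID"), ("COLOR", "BLUE")],
    ]
    for noun_group in candidates:
        if referent_exists(noun_group):             # semantics accepts it
            print("accepted reading:", noun_group)
            break                                   # parsing is redirected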
Finally, we briefly indicate how the system generates natural language responses. First, there is the patterned response, e.g., "OK," "I understand," "sorry, I don't know the word . . . ." Answering questions involves more complex routines. The syntactic classification of questions
that is done during parsing is used for this purpose. For instance, one of the types is the WH question, and corresponding to the various possibilities for it, more or less specific answers can be generated. A "why" question requires searching the memory for the sequence of subgoals used. Yes-no questions tend to require even more complex answer generation schemes, since questions can be "loaded," and other complexities arise. In answering questions, the system needs to name an object or describe an event. A set of PLANNER and LISP functions to examine the data base is used for this purpose. Fluent dialog can be produced with sentences whose complexity is not as high as that of sentences it will be called upon to understand. Still, when the answers are extracted from the data base, it might end up with something like "a blue block and a red cube and a red cube." In order to make it sound more natural, this ought to be converted to "a blue block and two red cubes," by looking for identical descriptions and combining them with the appropriate number and with appropriate singular-plural transformations. Similarly, when the same object is being referred to more than once in the answer, the system substitutes "it" or "that" for the second and later references to that object. The above are simply examples of the kind of problems that are faced and handled by the system in generating its portion of the dialog; the system handles substantially the entire range of problems that would arise in its BLOCKS world.
We have devoted a great deal of attention to Winograd's program because it exemplifies a point of view toward language that has enormous potential, at least for the purposes of artificial intelligence. It is, needless to say, not the final word, but it is certainly testimony to the potentials of one approach to language. In Winograd's program, the syntactic analysis generates a possibility for interpretation, and the semantics decide whether that possibility makes sense or alternative possibilities ought to be considered. On the other hand, semantic networks start, in a sense, with the semantics, and the syntactic checks enable a decision to be made on the appropriateness of the semantic representation. Next we turn our attention to the concept of semantic networks.
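As an illustration of the tidying step just described, here is a minimal sketch of our own in Python (not the system's code): identical descriptions are combined and given the appropriate number word and plural form.

    from collections import Counter

    def phrase(count, description):
        if count == 1:
            return "a " + description
        numbers = {2: "two", 3: "three", 4: "four"}   # enough for a toy demo
        return numbers.get(count, str(count)) + " " + description + "s"

    def describe(items):
        counts = Counter(items)            # identical descriptions combined
        return " and ".join(phrase(n, d) for d, n in counts.items())

    print(describe(["blue block", "red cube", "red cube"]))
    # -> a blue block and two red cubes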
3.3 Semantic Networks
3.3.1 Some Semantic Network Models
Semantic networks are, as the name indicates, structures in which meaning is represented in the form of nodes and connections between nodes. In artificial intelligence work, this concept was introduced by Quillian (1966). Quillian's model dealt with semantic memory, i.e., a
psychologically based model to explain storing and retrieval of concepts. This model is essentially a mass of nodes interconnected by different kinds of associative links. If we think of each node as being named by an English word (corresponding to a concept), then the node corresponding to the definition of the word (there should be only one such node) is called a type node, and all other occurrences of the concept in the definition of other concepts will correspond to token nodes. In what follows, we understand by a node the word that is the name of the node. The associative links are of various types, corresponding to whether one node is a subclass of another node, whether one node modifies another, whether a group of nodes form a disjunctive or conjunctive set, or, in a more complicated fashion, for three nodes A, B, and C, whether B, a subject, is related to C, an object, in the manner specified by A, the relation. A word's full word concept is the set of all nodes that can be reached by an exhaustive parsing process, originating at its initial patriarchal type node, together with the total sum of relationships among these nodes specified by the token-to-token links. Quillian illustrates the idea by the following example:
Suppose a subject were asked to state everything he knows about the concept "machine." Each statement he makes in answer is recorded, and when he decides he is finished, he is asked to elaborate further on each thing he has said. [This is repeated until some sort of end is reached.] This information will start off with the more "compelling" facts about machines, such as that they are usually man-made, involve moving parts, and so on, and proceed "down" to less and less inclusive facts, such as the fact that typewriters are machines, and eventually . . . that a typewriter has a stop which prevents its carriage from flying off each time it is returned. We are suggesting that this information can all usefully be viewed as part of the subject's concept of "machine."⁷
The memory modelled is recognition memory rather than recall memory. There are no word concepts as such that are primitive; everything is simply defined in terms of some ordered configuration of other things in the memory. Typical of the output of the simulation of this memory model is the following. In response to "compare: CRY, COMFORT," the program's answer is: "Intersect: SAD. (1) CRY2 IS AMONG OTHER THINGS TO MAKE A SAD SOUND. (2) TO COMFORT3 CAN BE TO MAKE2 SOMETHING LESS2 SAD." (CRY2, for instance, means the second sense of the word CRY.)
⁷ Reprinted from "Semantic Information Processing" (M. Minsky, ed.) by permission of the M.I.T. Press, Cambridge, Massachusetts. Copyright © 1968 by the Massachusetts Institute of Technology.
Thus comparisons of the concepts are accomplished by looking for the intersections in the tracing of the concepts. Expressing findings in English, because of the restricted nature of the kinds of questions the program can deal with, does not involve anything like the mastery of the language that, say, Winograd's program has. Quillian makes some remarks concerning the relevance of the model to linguistics, the most important of which, for our purposes, is that "sentence understanding should be treated as a problem of extracting a cognitive representation of a text's message; that until some theoretical notion of cognitive representation is incorporated into linguistic conceptions, they are unlikely to provide either powerful language-processing programs or psychologically relevant theories."⁷ This fits in nicely both with one of the basic viewpoints of Winograd, whose work was described earlier, and of Schank, whose work we consider later in this section.
Quillian's Teachable Language Comprehender (TLC) (Quillian, 1969) is a program for learning from natural language text, and the idea relates to how the semantic network of a machine can be expanded; hence the word "teachable."
The Protosynthex III system of Simmons et al. (1968) can be viewed as a semantic net structure. It is based on the concept of the C-R-C triple, where C represents a conceptual entity and R represents some relation. Each triple represents one unambiguous "sense" of a word. For example, the phrase "the angry pitcher" would be represented by the triple (pitcher MOD angry). Such triples can be nested to form very complex structures, and have been used to answer simple questions, using a children's encyclopedia as a data base.
Shapiro's semantic net called MENS (Shapiro, 1971) has been used similarly to store and retrieve information from natural language, as a vehicle for experimenting with various theories of semantic structures, and as a memory management portion of a natural language question-answering system.
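The intersection technique underlying Quillian's "compare" output quoted above can be suggested by a short sketch of our own in Python (the miniature network is invented for exposition): activation spreads breadth-first from the two type nodes until a common node is found.

    from collections import deque

    links = {                       # node -> associatively linked nodes
        "cry":     ["sound", "sad"],
        "comfort": ["make", "less", "sad"],
        "sad":     ["unhappy"],
        "sound": [], "make": [], "less": [], "unhappy": [],
    }

    def intersect(a, b):
        visited = {a: set(), b: set()}
        frontier = deque([(a, a), (b, b)])     # search from both nodes
        while frontier:
            origin, node = frontier.popleft()
            if node in visited[origin]:
                continue
            visited[origin].add(node)
            other = b if origin == a else a
            if node in visited[other]:
                return node                    # first intersection found
            frontier.extend((origin, n) for n in links[node])
        return None

    print(intersect("cry", "comfort"))         # -> sad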
3.3.2 Simmons' Semantic Networks
As perhaps the most advanced semantic network model with interesting and substantive linguistic aspects to it, and as a bridge to a discussion of Schank's conceptual structures, we consider briefly the work of Simmons and his associates on using such net structures for understanding English sentences.
⁷ Reprinted from "Semantic Information Processing" (M. Minsky, ed.) by permission of the M.I.T. Press, Cambridge, Massachusetts. Copyright © 1968 by the Massachusetts Institute of Technology.
An immediate question that arises in semantic network representation is: At what level of depth should the representation be made? For instance, one level of depth might be that which regards "John sold the book to Mary" and "The book was sold by John to Mary" as having identical conceptual structures (the sentences are simply syntactic paraphrases), but that conceptual structure being different from that for "Mary bought the book from John," though we know that they mean the same thing. This is the depth of conceptualization adopted by Simmons' network, while Schank's, as we shall see later, would provide identical conceptualizations for all three sentences. Simmons, while conceding that the deeper structure might be a more psychologically valid model, believes that the shallower structure provides a useful definition of a distinction between syntactic and semantic transformations.
Simmons (1973) provides an excellent overview of the work dealing with the computational aspects of semantic nets and their use for understanding English sentences, and our summary borrows heavily from that paper. The semantic representation of sentence meanings is based on the deep case grammar of Celce-Murcia (1972), which we explicate by the following examples. The sentence "Mary wore a sweater" would be represented by "wear: LOCUS Mary, THEME a sweater." The sentence "John broke the window with a hammer" will have a more complicated representation, "break: CAUSAL ACTANT 1 John, THEME the window, CAUSAL ACTANT 2 a hammer." The idea is this: the arguments of the verb can be classified as members of deep case relations such as causal actant, theme, locus, etc. Hopefully the examples illustrate how the classification is made. The above representations, called propositional representations, are only partial. We need the so-called modality representations, which for the second of the above two examples would be "TENSE past, VOICE active, FORM simple, ESSENCE positive, MOOD declarative." We can see that the various classifiers are oriented toward the meaning of the sentence and that the modality information is more concerned with those aspects that relate to the surface structure of the sentence.
With this case grammar at hand, let us see how the semantic representations are constructed. The sentence "could John have been courting Mary falsely last year?" would have the structure:

    C1: TOKEN court, CA1 (John), THEME (Mary), MODALITY C2
    C2: TENSE past, VOICE active, . . . MOOD interrogative, . . .
        MANNER (falsely), TIME (last year) . . .

The dots represent those aspects we will not refer to in this discussion.
The parenthesized elements have their own semantic structures. We can see that node C1 is mainly derived from the propositional representation and node C2 from the modality representation. The above type of representation is for sentences. For phrases, we can illustrate the representation with:

    all seven windows
    C1: TOKEN window, NBR plural, DET def, COUNT 7, QUANTIFIER all

    the red barn, the barn is red
    C1: TOKEN barn, NBR singular, DET def, MOD C2
    C2: TOKEN red, DEG positive

Note how the verb "be" is treated. Similar ideas apply for the verb "have." The relationships between network representations and first-order predicate calculus representations are discussed in Sandewall (1971b) and in Simmons and Bruce (1971), who also provide an algorithm for translating from semantic net structure notation into predicate calculus.
Simmons then discusses the process of obtaining from natural language sentences their semantic representation. The method uses a lexicon and, as the grammar, a variant of the augmented transition network of Woods that we discussed earlier. We direct the reader to Simmons (1973) both for this and for the generation of English sentences from the network representation, especially since we have covered the basics of these processes for the Winograd program. Simmons has also used the system as a question-answering device. For a useful survey of various question-answering systems, see Simmons (1970).
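By way of illustration, the deep case representation discussed above might be rendered in Python roughly as follows (a sketch of our own; Simmons' actual data structures differ):

    # "John broke the window with a hammer"
    representation = {
        "C1": {                                # propositional node
            "TOKEN": "break",
            "CAUSAL-ACTANT-1": "John",
            "THEME": "the window",
            "CAUSAL-ACTANT-2": "a hammer",
            "MODALITY": "C2",
        },
        "C2": {                                # modality node
            "TENSE": "past", "VOICE": "active", "FORM": "simple",
            "ESSENCE": "positive", "MOOD": "declarative",
        },
    }

    # A syntactic paraphrase ("The window was broken by John with a
    # hammer") would share C1 and differ only in the modality node,
    # which is what makes this depth a convenient boundary between
    # syntactic and semantic transformations.
    passive_modality = dict(representation["C2"], VOICE="passive")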
3.3.3 Schank's Conceptualization Networks
For a net representation of semantics at a level deeper than that of Simmons and associates, we turn to the work of Schank (1973), whose aim is representation in an unambiguous, language-free manner. We shall here describe the basis of the conceptualizations worked out by Schank and direct the reader to the following references describing implementations: Riesbeck (1973) dealing with a parser, Rieger (1973) dealing with the memory and inference strategy, and Goldman (1973) for the generator of discourse. We mainly follow Schank (1973) for our summary.
In parsing, it is well known that both pure bottom-up and top-down methods are tremendously inefficient. The notion of prediction of syntactic categories, which corresponds to a top-down approach, has been used
for parsing, as in the work of Kuno and Oettinger (1962), for instance. Schank claims that prediction should be based not only on purely syntactic criteria, but on semantic ones as well; i.e., in addition to a syntactic category, a conceptual structure type is predicted. (One can see some convergence between Winograd and Schank in insisting that the syntactic and semantic processors ought not to be isolated entities, but should be interlaced.) The idea is to proceed in a bottom-up manner until syntactic and conceptual predictions can be made, and then to proceed in a top-down manner.
What are these conceptual structures? They are unambiguous, interlingual representations of conceptual content. The representation is at a level higher than the sentential, i.e., at the conceptual level. The basic unit of the conceptualization is the concept, which can be a nominal, an action, or a modifier. A nominal concept is called, in Schank's system, a PP (for picture producer), and it is a concept of a general thing (a book) or a specific thing (Mary). An action, denoted ACT, is what a nominal can be said to be doing, and it must be something that an animate nominal can do to an object. In "John hits Bill," "hit" is an ACT, but in "John loves Mary" "love" is not an ACT; it is more a state of mind John is in when he thinks of Mary. A modifier relates to and modifies or specifies an attribute of an ACT or a PP. A modifier of a nominal is a PA (for picture aider) and of an action is an AA (action aider). These conceptual categories relate in specified ways to each other; these relations are called dependencies, and the related categories will either be dependent or governing. A dependent cannot exist at the conceptual level without the governor and thus predicts the existence of the governor. However, a governor can also be a dependent: in "John hit the tall man hard," "hit" is a governing ACT, "hard" is the dependent AA, "man" is a governing PP with "tall" the corresponding PA, but both these governors are dependents of the governor "John." A conceptualization is a set of concepts along with their relationships, and the representation of the conceptualization is the conceptual dependency network, called the C-diagram. In the example "John hit the tall man hard," both "John" and "hit" are governors, and they need each other for the conceptualization to exist. This is a two-way dependency, denoted by PP ⇔ ACT. The governor "man" is a dependent of "hit," and this form of dependency, named objective dependency, is denoted ACT ← PP. The PA "tall" modifies "man," and the relation, attributive dependency, is denoted by PA ⇑ PP. The AA "hard" modifies "hit," and the relation is represented by AA ⇑ ACT. (This assignment of ACT to "hit" is temporary. Schank shows that if one delves deeper into exactly what concept is represented, the
underlying phenomenon is more closely represented by "propelling something.") Thus, the C-diagram looks like:

    John ⇔ hit ←(o) man
            ⇑        ⇑
          hard      tall
The two-way link can be used to reference the entire conceptualization; by an arrow from "yesterday" to the ⇔ in the above, we have the C-diagram for "John hit the tall man hard yesterday." The conceptualization rules can be summarized as follows:

    1. PP ⇔ ACT
    2. PP ⇔ PA
    3. PP ⇔ PP
    4. PP ← PA
    5. PP ⇐ PP
    6. ACT ← PP
An example of (2) would be: John ⇔ tall. Now what is the difference between this and (4), John ← tall? It depends on whether an attribute of a PP is being predicated or has already been predicated. In the first case, it is a two-way dependency; thus rule (2) would apply. Examples of (5) would be "book ⇐ table" for "book on the table" and "dog ⇐ (POSS-BY) John" for "John's dog." As Schank puts it, "it is the responsibility of the conceptual level to explicate underlying relationships that speakers know to exist." Thus, at first, when one considers sentences such as "the man took a book" and "I gave the man a book," one obtains C-diagrams pretty much as we outlined before, with the addition of
    ←(R) to:   man
         from: X (I)
to the term "book," to indicate that, whether made explicit or not, in both "give" and "take" something changes hands from one person to another. On further examination, one discovers that actually "give" and "take" can be viewed as an ACT, say "TRANS," in which, when the actor and originator (the person from whom something was transferred) are identical, one gets "give," and when the actor and the recipient are identical, one gets "take." Once this is seen, then it can be realized that "steal," "sell," "want," etc., all have meanings having to do with "TRANS" as their main ACT. Thus paraphrases become possible at the conceptual level.
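A minimal sketch of our own in Python may help fix the TRANS analysis: "give" and "take" become one primitive ACT with different bindings of the actor to the originator or the recipient, so that paraphrase recognition reduces to comparing role fillers.

    def trans(actor, obj, source, recipient):
        return {"ACT": "TRANS", "actor": actor, "object": obj,
                "from": source, "to": recipient}

    # "John gave Mary a book": the actor and the originator coincide.
    give = trans(actor="John", obj="book", source="John", recipient="Mary")

    # "Mary took a book from John": the actor and the recipient coincide.
    take = trans(actor="Mary", obj="book", source="John", recipient="Mary")

    # At the conceptual level both describe the same transfer.
    same = all(give[k] == take[k] for k in ("ACT", "object", "from", "to"))
    print(same)    # -> True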
Schank develops the notion of "conceptual cases" for the dependents that are required by the ACT. Rule 6 in the list of conceptual rules is the case of objective dependency, and the R-net tacked onto "book" in the previous example is an example of recipient dependency. Instrumental dependency is a bit more complicated. In the example "John ate the ice cream with a spoon," one might be tempted to regard "spoon" as an instrument, and thus have an I-link to "spoon" tacked onto "ate" or whatever the ACT is. However, the instrumental dependent itself should be a conceptualization, since on analysis it turns out that a single PP cannot be a conceptual instrument, and can only be an object of an action conceptually. Thus, the instrumental dependency should look more like:
    ←(I) John ⇔ do ←(o) spoon
Here, "do" is left unspecified. One could perhaps elaborate this by noting that probably John transferred, by means of the spoon, ice cream from a container, etc. In fact, Schank shows that every ACT requires an instrumental case. Directive dependency is similar to the recipient dependency, but applicable in a situation like "fly from New York to San Francisco," and appears in the representation:
    fly →(D) to:   San Francisco
             from: New York
Conceptualizations can relate to other conceptualizations, and thus we have the idea of conceptual relations. One of the most important is causality, a special case of which is intentional causality. Other conceptual relations deal with time and location. One of the consequences of defining action as Schank has done is that not all verbs have corresponding actions. For instance, in "John likes Mary," "like" is a verb but is not an ACT. What happens is that the existence of Mary produces in John a state of mind which one might call "pleased." In "I like books," it is not simply the existence of books that pleases me; it is more probably that reading them produces a pleasant state. Thus,

    I ⇔ do (read) ←(o) books
        ⇑
    I ⇐ pleased
Schank goes through a detailed analysis of complex actions to isolate the underlying ACTs, and the additional conditions imposed upon the context to correspond to one verb or another. Thus, in "Brutus killed Caesar," however tempting it is, "kill" is not an ACT. Brutus did something with something, which resulted in Caesar changing from being "alive" to being "dead." Now if Brutus had simply stabbed Caesar but Caesar did not die, then the underlying ACT that Brutus performed would remain the same, but the change of state of Caesar would not have come to pass. Schank analyzes various classes of verbs (prevent and instigate; frighten, comfort, console, kill, and hurt; threaten, advise, complain, and ask; love and hate) and isolates the underlying acts and accompanying conceptual relations. A set of ACTs which describe a whole collection of physical action verbs is also arrived at, resulting in about fourteen ACTs which, together with a number of states, seem to be adequate to represent the information underlying English verbs. We emphasize that this representation aims to be language-independent, thus carrying the representation of meaning to its deepest level so far. We have no space to go into the programs which convert from natural language to C-diagrams and back, and thus we simply refer the reader to the papers cited at the beginning of this section for further details. It is interesting to note that, in analyzing the primitive actions underlying verbs, all those actions turn out to be simply the moving about of ideas or physical objects, producing changes in the states of the objects in motion or in emotional states.
With the emergence of computer semantics and carefully conceived attempts to link these semantic systems to other aspects of language processing, we see that artificial intelligence in the specific problem of handling natural language has moved from the naive overexpectations of the early 1950's, through the general gloom of the early sixties, to healthy, thriving, and realistic attempts which, if nothing else, at least promise to reveal some of the secrets of language and its organization. Even in the field of mechanical translation from one natural language to another, an enterprise generally abandoned because of the spectacular failure of early attempts, there are some new approaches being pursued. Wilks (1973) is one such approach. In this paper, he makes many insightful remarks on the relative merits of "logical" vs. "linguistic" approaches to the problem of translation, i.e., approaches in which the dominant technique is coding into an appropriate formal logic and coding back into another language from the formal logic formulas, vs. approaches in which a mapping from natural language to "conceptualization" (like that of Schank's we described earlier) and back to another natural language is made. We refer the reader to this and other papers of Wilks referenced in Wilks (1973) for further details of his system.
4. Some Aspects of Representation, Inference, and Planning
4.1 General Remarks
One of the attributes that, according to Minsky (1961), an intelligent system might possess is that of planning, so as to "obtain a really fundamental improvement by replacing the originally given search by a much smaller, more appropriate exploration." As it turned out, unlike other attributes such as search, induction, and learning, each of which has been investigated with respect to its mathematical and other properties, the idea of "planning" as such has not been separately explored. Many AI programs work well to the extent that they can be viewed as having improved planning capabilities. Thus in a sense heuristics are part of planning. The programming language PLANNER makes the task of communicating and incorporating plans in AI systems easier; and CONNIVER, a successor to PLANNER, incorporates improvements in the form of improved data structures.
The form in which knowledge is represented is crucial for the ease with which it can be retrieved, used, and manipulated by an intelligent system. Since whatever plans one gives a machine are also knowledge, in a sense more fundamental knowledge than that contained in a data base of facts, and since inference mechanisms are easy or difficult, natural or unnatural, sometimes even possible or impossible, all depending upon the representation of knowledge, the connection between representation, inference, and planning can be seen to be a strong one.
The first-order predicate calculus has always been a temptress for those looking for a formalism to represent knowledge, especially with the invention of the resolution technique (Robinson, 1970) for theorem proving in that logic. In fact, there has been a tremendous explosion of research on all aspects of resolution theory in the past few years, based on a certain possibility that formal logical systems offer: that of a uniformity of representation and inference irrespective of the subject matter. It has become apparent that this very uniformity itself is where the problem lies. It is virtually impossible to incorporate problem-specific heuristics into a resolution-oriented system, though some gallant attempts are being made now and then to incorporate semantically oriented strategies (Minker et al., 1973). Some research has concentrated on logics other than first-order predicate calculus, e.g., McCarthy and Hayes' modal logic (1969) and higher order calculi (Robinson, 1969). These have the same problems introduced by generality, and in addition, they lack the semidecidability property of the first-order logic and do not offer computationally attractive decision algorithms like the resolution algorithm. Thus, notwithstanding the attractive fact that problems of question-answering,
information retrieval, pattern recognition, inference, etc., can all be cast in the form of proving theorems in a logical calculus, formal logic does not seem to be the most promising route to follow for representation. See in this connection Anderson and Hayes (1972).
Nevertheless, we take a brief detour into the literature on theorem proving and indicate the direction of recent research. Nilsson (1971) has a concise and readable chapter on the resolution technique, which is also dealt with in the recent text by Chang and Lee (1973). The essence of the idea is this: Given a set of axioms and a theorem to be proved from those axioms, attempt to show that the conjunction of the axioms and the negation of the theorem is an unsatisfiable formula. Resolution is a rule of inference which is suited to the task of proving the unsatisfiability of a set of formulas. While the resolution procedure is sound and complete and can be realized in the form of a computational algorithm,⁸ when the size of the set of axioms becomes large it takes longer and longer to conclude the proof; i.e., it is not especially intelligent used as it is. So one needs various strategies, mainly heuristics, to increase the probability of concluding the proof rapidly. Many have been thought up, beginning with the so-called unit preference strategy, which in resolving formulas (more precisely, clauses) prefers those which are shorter, through set-of-support, which considers only a subset of the set of clauses, to various other refinement strategies. The texts we mentioned above cover them adequately. However, what is important to note is that all these strategies are syntactic in nature, i.e., they concern themselves with the form of the clauses rather than with the meaning. Thus heuristics which have to do with the subject matter at hand, i.e., what the predicate letters stand for, are not easily introduced into the system. Slagle (1967) attempts to do some of this, as do Minker et al. (1973). But these seem more like special cases than general extensible techniques for introducing arbitrary semantically oriented heuristics.
⁸ Note, however, that there is no decision procedure for the first-order predicate calculus. It is semidecidable, i.e., all valid formulas can be so detected, but if a formula is not valid, then there is no guarantee that this fact can be discovered in a finite number of steps.
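For concreteness, here is a small sketch of our own in Python of resolution refutation in the ground (propositional) case, using the "Turing is human, all humans are fallible" example that reappears in the PLANNER discussion below; it is an illustration, not a serious theorem prover.

    def resolve(c1, c2):
        # All resolvents of two clauses; "~" marks a negated literal.
        out = []
        for lit in c1:
            comp = lit[1:] if lit.startswith("~") else "~" + lit
            if comp in c2:
                out.append((c1 - {lit}) | (c2 - {comp}))
        return out

    def unsatisfiable(clauses):
        clauses = set(map(frozenset, clauses))
        while True:
            new = set()
            for a in clauses:
                for b in clauses:
                    if a != b:
                        for r in resolve(a, b):
                            if not r:
                                return True     # empty clause derived
                            new.add(frozenset(r))
            if new <= clauses:
                return False                    # nothing new: satisfiable
            clauses |= new

    # Axioms plus negated theorem: {H}, {~H, F}, {~F}.
    print(unsatisfiable([{"H"}, {"~H", "F"}, {"~F"}]))   # -> True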
4.2 STRIPS
In spite of these inadequacies, theorem-proving techniques are being applied in a variety of problem-solving situations, especially with the addition of other techniques such as STRIPS (Fikes and Nilsson, 1971) and other extensions (Fikes et al., 1972). These additions, while implemented
in the context of resolution-based theorem provers, have theoretical content which is independent of the theorem prover used. We give a brief summary of the ideas behind STRIPS, which can be most usefully discussed in the context of the problem for which it was originally implemented, namely, planning strategies for simple robots to do simple, nontrivial things like moving a box through a door. The robot vehicle has a set of primitive actions available, such as GOTHRU (D1, R1, R2), which causes the robot to go from room R1 to room R2 through door D1. The robot world is represented in the form of a collection of predicate calculus well-formed formulas. Let us consider the example world model used in Fikes et al. (1972):
M₀: INROOM (ROBOT, R1)
    CONNECTS (D1, R1, R2)
    CONNECTS (D2, R2, R3)
    BOX (BOX1)
    INROOM (BOX1, R2)
    (∀x)(∀y)(∀z) [CONNECTS (x, y, z) ⊃ CONNECTS (x, z, y)]

Operators:
    GOTHRU (d, r1, r2)
        precondition: INROOM (ROBOT, r1) ∧ CONNECTS (d, r1, r2)
    PUSHTHRU (b, d, r1, r2)
        precondition: INROOM (b, r1) ∧ INROOM (ROBOT, r1) ∧ CONNECTS (d, r1, r2)

G₀: (∃x) [BOX (x) ∧ INROOM (x, R1)]
A few words of explanation: The world model is indexed M₀ because the robot's world will change as things are moved around, and we want to correspondingly update the model. The wffs have their obvious meaning. The goal is given in the form of an existentially quantified statement to be proved, and we assume that the reader is familiar with the fact that a constructive proof of "there exists a box such that it is in room R1" (i.e., a proof that exhibits such a box in order to prove the wff) can then be used to directly locate that box. In general, "What is the x such that P(x)?" can be answered by giving the theorem prover the wff "∃x P(x)" to prove. In the list of operators, the lower case arguments are variables. The precondition associated with each operator has to be satisfied for the operator to be applicable. (This is quite natural, since "push the box from room r1 to r2" cannot be applied unless the box is in room r1 in the first place.) As it turns out, the goal wff as given cannot be proved in the world model as it is, as one can check directly. However, if the world
model were different, then the goal wff might be provable. How does one make the appropriate change in the world model? By applying operators in a certain sequence. The problem of generating the appropriate operator sequence is precisely the problem of organizing robot plans. There is another aspect of the operator list that is important. When the operators are applied to change the world model, the changes can be represented as a set of wffs to be added and a set of wffs to be deleted from the world model. Thus associated with PUSHTHRU (b, d, r1, r2) we have the delete list INROOM (ROBOT, ·), INROOM (b, ·), where the · stands for an arbitrary variable (i.e., all predicates asserting the locations of the robot and the box are to be deleted), and the add list INROOM (ROBOT, r2) and INROOM (b, r2). Similarly for the operator GOTHRU: the delete list is INROOM (ROBOT, ·) and the add list is INROOM (ROBOT, r2).
The problem of organizing the sequence of operators is approached in the manner of the general problem solver (GPS) (Ernst and Newell, 1969). If the goal wff cannot be proved in a given world model, STRIPS chooses that operator which is likely to produce a model in which the "distance" is reduced. If the precondition is not satisfied, a subgoal of satisfying the precondition is set up, and so on. When finally the precondition is satisfied, that operator is applied, and the goal wff is tried again in the changed world model. Sometimes the selected operator may not lead to the appropriate sequence of world models, so STRIPS ought to be able to back up. The search for relevant operators can be viewed as a tree search. Notice further that this search is quite different from the search associated with proving or disproving the goal wff itself in a given world model; there the search strategies would be those applicable for resolution theorem proving. In this particular problem, this GPS-like search is quite straightforward, and we omit the details. STRIPS would come up with the operator sequence GOTHRU (D1, R1, R2) and PUSHTHRU (BOX1, D1, R2, R1).
The paper Fikes et al. (1972) contains major additions to STRIPS, especially in the matter of generalizing the plans produced for specific cases and storing them in the form of "macroactions." Further, the generalized plan is also used to supervise the execution of plans in such a manner that the robot can react intelligently to unexpected situations. For instance, suppose a robot has to re-execute a portion of its plan due to some failure. This use of generalization and macroactions then can help the robot to re-execute the actions with different arguments rather than repeating the previous sequence with identical arguments.
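The following sketch of our own in Python (a toy, not the STRIPS implementation) captures the skeleton of the method for the two-room example: states are sets of assertions, operators carry add and delete lists, and a simple breadth-first search stands in for the GPS-like means-ends analysis.

    from collections import deque

    start = frozenset({("INROOM", "ROBOT", "R1"), ("INROOM", "BOX1", "R2")})
    # CONNECTS closed under the symmetry axiom of the world model.
    CONNECTS = {("D1", "R1", "R2"), ("D1", "R2", "R1"),
                ("D2", "R2", "R3"), ("D2", "R3", "R2")}

    def successors(state):
        for d, r1, r2 in CONNECTS:
            if ("INROOM", "ROBOT", r1) in state:     # precondition of GOTHRU
                yield (("GOTHRU", d, r1, r2),
                       (state - {("INROOM", "ROBOT", r1)})
                       | {("INROOM", "ROBOT", r2)})
                if ("INROOM", "BOX1", r1) in state:  # precondition of PUSHTHRU
                    yield (("PUSHTHRU", "BOX1", d, r1, r2),
                           (state - {("INROOM", "ROBOT", r1),
                                     ("INROOM", "BOX1", r1)})
                           | {("INROOM", "ROBOT", r2),
                              ("INROOM", "BOX1", r2)})

    def plan(state, goal):
        seen, queue = {state}, deque([(state, [])])
        while queue:
            s, steps = queue.popleft()
            if goal(s):
                return steps
            for op, nxt in successors(s):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, steps + [op]))

    print(plan(start, lambda s: ("INROOM", "BOX1", "R1") in s))
    # -> [('GOTHRU', 'D1', 'R1', 'R2'), ('PUSHTHRU', 'BOX1', 'D1', 'R2', 'R1')]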
We now leave representation in a formal logic and proving theorems by resolution techniques for problem solving, both because textbooks dealing with it are readily available and because of our lack of enthusiasm for it, and turn to specially designed languages in which knowledge can be represented and strategies specified.
4.3 PLANNER
We now present a description of the PLANNER (Hewitt, 1970) language. Our presentation is based on Chapter 3 of the above reference and Chapter 6 of Winograd (1972). Take the deduction, "Turing is a human; all humans are fallible; so Turing is fallible." In first-order predicate calculus this will be, with appropriate interpretation of symbols: H(Turing), (∀x)[H(x) ⊃ F(x)], hence F(Turing). The deduction can be shown valid by standard techniques in first-order logic. In PLANNER, which is a language embedded in LISP, the above deduction would appear as

    ASSERT (HUMAN TURING)
    (DEFINE THEOREM 1
        CONSEQUENT (X) (FALLIBLE $?X)
        GOAL (HUMAN $?X))

The ASSERT expression is used to add to the data base of facts. Ignoring those aspects of the above which concern the particular implementation of PLANNER, essentially what the rest of the lines say is that a theorem is being defined, the theorem is of the CONSEQUENT type, the variable is X, and the theorem states that if you want to show X is fallible, show that X is human. The proof would be generated by asking PLANNER to evaluate the expression {GOAL (FALLIBLE TURING)}. The system first checks the data base to see if FALLIBLE TURING is there; it is not. It then finds a theorem which is relevant to the goal (by pattern matching), that theorem being THEOREM 1 defined above; it instantiates the variable X to TURING and is told that it should try the goal HUMAN TURING, which is accomplished immediately by checking the data base (the ASSERT statement having just added it to the base). The point to be noticed is that in PLANNER part of the knowledge is represented in the form of procedures: if you want to do this, do that. Another important aspect of PLANNER is the backup capability in case of failure. This can be illustrated by the following example. We add to the data base {ASSERT (HUMAN SOCRATES)} and {ASSERT (GREEK SOCRATES)} and suppose we ask, "Is there a fallible Greek?"
The way the question is input in PLANNER will correspond to asking the system to find an X such that X is fallible and X is Greek. The first goal is GOAL (FALLIBLE $?X) and, by something like the protocol mentioned earlier, it will instantiate X to TURING. The second goal will be GOAL (GREEK TURING) which, evidently, cannot be attained. The problem is of course that in our data base HUMAN TURING was ahead of HUMAN SOCRATES, and the system has to go back and look further in the data base to reinstantiate X to SOCRATES. This backup can go to the last place where a decision of any sort was made. Here the decision was to choose an item in the data base. Other decisions, such as the choice of a theorem to satisfy a goal, could have been made, and the backup mechanism in PLANNER keeps sufficient information to change any decision and send evaluation back down a new path.
PLANNER evaluates functions. It is a goal-directed programming language, and if theorems can be thought of as subroutines, it provides a mechanism by which subroutines can be called not by name, but rather by something like, "Call a subroutine to obtain this result at this point." PLANNER can make use of imperative information, stating how to proceed in order to attain a goal. It has the full power to evaluate expressions which can depend upon both the data base and the subgoal tree, and to use its results to control the further proof by making assertions, deciding what theorems are to be used, and specifying a sequence of steps to be followed. In PLANNER, one can add to GOAL statements a recommendation list of theorems; the list may contain theorems which are the only ones to be tried, or theorems which are to be tried in a certain order, and so on.
There is always a problem of deciding what should be in the data base: which deductions ought to be kept there for later use, and which ought to be deduced as one needs them. It obviously depends on the subject, i.e., certain facts about it are important and certain others not. ANTECEDENT-type theorems are of use here. For example, suppose we want the system, whenever it learns that X likes Y, to immediately assert that X is human (say), rather than have to deduce it later (this saves time, and quite probably it will be useful for the system to know this immediately). Then the theorem

    (DEFINE THEOREM 2
        ANTECEDENT (X Y) (LIKES $?X $?Y)
        ASSERT (HUMAN $?X))

does the job. Thus the programmer can introduce as much heuristic knowledge as he deems necessary about a subject matter.
Another useful feature of the language is the ERASE feature, which erases assertions in the data base. If we view the data base as the "state
of the world," then in the process of doing things the state of the world might change, and this should be immediately reflected in the data base. This is the frame problem originally considered by McCarthy and Hayes (1969). The ERASE feature effects a great simplification in keeping track of the state of the world, since the system would otherwise have to retrace an entire sequence of operations to decide what changes have taken place. An example would be: If you want to put X on Y, find out the Z on which X is now (current state of the world), erase from the data base the fact that X is on Z, and assert that X is on Y. This can get pretty complex, but the basic idea is the same. In traditional theorem provers such as the resolution-based ones, this would create problems, since the axioms of the system have a certain immutable status. Attempts to handle such changing states have resulted in tagging state variables which are kept track of (Green, 1969). STRIPS, which we discussed earlier, with its add and delete lists, has some of the features of this ERASE instruction.
PLANNER also has the ability to accept construction of local data bases called states, relative to which PLANNER evaluations can take place. Thus two incompatible states of the world can be studied. Further potentials of PLANNER include the ability for the program itself to create or modify PLANNER theorems. Of course, PLANNER has none of the mathematically satisfying properties of completeness or soundness, the latter since a careless programmer might have introduced contradictory heuristics. But then neither do humans have them. The power of PLANNER lies in the fact that while the language itself is independent of any subject matter, very detailed and arbitrary procedural heuristics can be readily supplied by means of the language.
While we have concentrated on one language, PLANNER, it must be pointed out that there have been other attempts, such as POP-2 (Burstall et al., 1971), QA4 (Rulifson et al., 1971), and SAIL (Swinehart, 1973), which have some of the attractive properties of PLANNER, such as varied data structures, associative memory, pattern matching, automatic and intelligent backup, and procedural embedding of knowledge. Another language which deserves some attention is REF-ARF (Fikes, 1970). Its language REF is algebraic-oriented, with features which make it convenient to state certain kinds of problems. Finally, CONNIVER (Sussman and McDermott, 1972) is a successor to PLANNER, with much improved backup features. Both CONNIVER and QA4 have, instead of a data base to which assertions are added or deleted, a data base in the form of a tree. Modification of the data base takes place only in the data associated with the current node of the context tree, and contexts can be changed under program control. Thus alternatives
to a course of action can be considered by changing contexts instead of making extensive changes to the data base. Eventually, when a permanent change to the data base must be made, the bookkeeping for this is taken care of by the implementation in CONNIVER.
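To suggest how goal-directed search with backup feels in practice, here is a minimal sketch of our own in Python for the "fallible Greek" example above; generators stand in for PLANNER's backtracking machinery, and all the names are invented.

    facts = [("HUMAN", "TURING"), ("HUMAN", "SOCRATES"),
             ("GREEK", "SOCRATES")]
    # A CONSEQUENT-style theorem: to show (FALLIBLE x), show (HUMAN x).
    theorems = {"FALLIBLE": "HUMAN"}

    def solve(pred, x=None):
        # Yield every binding satisfying (pred x), data base first.
        for p, who in facts:
            if p == pred and x in (None, who):
                yield who
        if pred in theorems:                 # fall back on a theorem
            yield from solve(theorems[pred], x)

    def fallible_greek():
        for candidate in solve("FALLIBLE"):  # TURING is bound first
            if any(w == candidate for w in solve("GREEK", candidate)):
                return candidate             # reached only after backup
        return None

    print(fallible_greek())   # -> SOCRATES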
5. Automatic Programming
5.1 General Remarks
Our interest in automatic programming is mainly due to the fact that artificial intelligence techniques seem relevant to some of the problems in that area. Feldman (1972a) provides a brief survey of some of the activities in automatic programming, which he classifies into two types: Type 1, whose goal is automation of the production of programs in a particular domain, with considerable knowledge of the domain built in; and Type 2, concerned with the fundamental theoretical problems in program synthesis. The problem of automatic program synthesis has many elements in common with program verification; it would also be nice to be able to verify that a synthesized program is the correct one, though this is hardly ever done for most programs, computer-synthesized or written by humans. Again, there is a substantial amount of literature reporting work of both types: practical, special purpose approaches and theoretical, general approaches. London (1970) contains a bibliography of the literature on proving the correctness of programs.
The work on automatic strategy generation for robots has implications for automatic programming, according to Feldman, who also expects these two efforts to diverge. However, concepts underlying languages such as PLANNER, which we discussed in Section 4, POP-2 (Burstall et al., 1971), QA4 (Rulifson et al., 1971), and SAIL (Swinehart, 1973) have, as mentioned earlier, features such as varied data structures, associative memory, pattern matching, automatic and intelligent backup, and procedural embedding of knowledge, which might prove useful to the automatic programming effort. Approaches based on these languages are in a sense competitors to uniform procedures based on predicate calculus (Manna and Waldinger, 1971), which have the attractive property that program synthesis can be posed as a theorem-proving problem in these systems. One suspects that the same considerations which led us in our section on PLANNER to prefer such languages over the predicate calculus for inference and deduction will also militate against uniform procedures ever producing very useful systems. Simon's work on the heuristic compiler (Simon, 1963) has insights which might be useful for automatic program synthesis.
Green (1969) gives an example where a simple program is produced by theorem-proving techniques. Manna and Waldinger (1971) discuss the sort of theorems that a theorem prover might prove to synthesize programs with recursion or iteration; these theorems use various formulations of the principle of mathematical induction. Schubert (1973, 1974) discusses some aspects of program synthesis at a rather abstract level.

5.2 A Program Synthesis System
Another approach to automatic program synthesis is the notion of inferring a program from example computations. Biermann (1972) gives an example of this kind of work, which has its theoretical base in grammatical inference (Biermann and Feldman, 1972; Feldman, 1972b). While the above work is concerned with abstract programs, i.e., synthesis of Turing machines, some recent work by Biermann and associates (1973) has a more practical bent and is strongly oriented towards the AI paradigm of search. Work by Raulefs (1973) and Barzdin (1972) also seems to have the flavor of program synthesis from computational traces. Such work must be contrasted with that of Amarel (1971), Balzer (1973), Waldinger and Lee (1969), Manna and Waldinger (1971), and others, who aim to design systems which synthesize algorithms "from very weak input information such as input-output pairs or a formal specification of the desired performance." Here the goal is not automatic synthesis of the algorithm, but a method by which the user's concept of the algorithm can be transmitted to the machine. Thus in a sense the user has the algorithm in his mind and wishes to use the machine to output the code for the algorithm by leading the machine through the flow chart, so to speak, for example computations. To quote Biermann et al. (1973):

We are currently using a computer display system on which appear data structures declared by the user and the commands available for doing a calculation. The user executes an example calculation by referencing the commands and their operands (among the data structures) with a light pen in scratch pad fashion. The contents of the data structures are continuously updated as the calculation proceeds. While the example is being carried out, the machine records the sequence of commands, and later, after one or several examples have been completed, it constructs the shortest computer program which is consistent with these computation traces. As an illustration, the user might spend several minutes sorting by hand with the light pen the list of integers (3, 2, 1, 4) using some algorithm. After a fraction of a second of computing, the machine would type out a general program for the algorithm which sorts N integers. . . . If the synthesized program should exhibit some shortcomings, the individual could input additional computations which would force the system to appropriately revise the generated program.
A trace is a sequence of instructions executed, such as "add register 1 to register 2." Sometimes an instruction might be executed after a test on the condition of some register, such as "register R1 is positive," is performed. Thus, in an example calculation, a trace might be {I2, C1:I1, I3}. This would mean that the first instruction executed was I2; then, condition C1 being fulfilled, instruction I1 was executed; and finally instruction I3 (which might be the halt instruction). We will refer to I2, C1:I1, and I3 as the first, second, and third elements of the trace. The general idea behind the synthesis algorithm can be illustrated by the following example. Suppose the trace sequence is {I1, I2, I2, I2, I2, C1:I1, I2, C1:I2, I2, C1:I1, C2:I3}, with I3 the halt instruction. Notice that each instruction Ii, except I3, may occur several times in the trace. Let kIi refer to the kth occurrence of Ii. If we set L = 4 as the limit on the number of instructions in the final program (excluding tests and branches), then the possible instruction sets will be {1I1, 1I2, 1I3}, {1I1, 2I1, 1I2, 1I3}, etc. Elements 1, 2, and 3 of the trace result directly in the tentative synthesis:

1. I1
2. I2. Go to 2
(We are not using any specific programming language. We shall write the algorithms so that the corresponding flow charts are obvious.) The above program, so to speak, forces the following elements of the trace to be I2 (obviously not correct). The fourth and fifth elements are in agreement with the synthesis so far, except that the sixth is C1:I1. There is a choice between introducing line 3 with I1 or going back to line 1, and in such cases the tentative choice is to go back, in order to arrive at the shortest possible program. Thus the tentative program is

1. I1
2. I2. If C1, go to 1; else to 2
The seventh element agrees with the synthesis, but element eight, which is C1:I2, contradicts the synthesis. When such a contradiction is found, the strategy is to go back to the previous unforced move, which in this case is the I1 corresponding to element 6, and a new line corresponding to 2I1 is added. The program now is
1. I1
2. I2. If C1, go to 3; else to 2
3. I1. Go to 2

Element 7 of the trace agrees with it, but there is trouble with the eighth. Again back up to the previous unforced move, which is element seven,
I2. A new line corresponding to 2I2 is added. The program is

1. I1
2. I2. If C1, go to 3; else to 2
3. I1. Go to 4
4. I2. If C1, go to 2; else to 4
Now, however, the upper limit of 4 on the number of instructions is violated, since there must be at least one occurrence of I3. Thus once again, a backup to the last unforced move is made. Note that if L = 5 instead of 4, a modification of the last tentative synthesis, namely "3. I1. If C2, go to 5; else to 4" (with a new line 5. I3), would produce a correct program, at least as far as this trace is concerned. However, since L = 4, the synthesis proceeds in the above-mentioned fashion, finally to produce the program

1. I1. If C2, go to 4; else continue
2. I2. If C1, go to 2; else continue
3. I2. If C1, go to 1; else go to 2
4. I3

Since this synthesis algorithm is basically enumerative, it is not very efficient. Biermann et al. (1973) describe various pruning techniques to reduce the search space. These techniques seem to be quite efficient, and programs of significant complexity have been synthesized in reasonable time. Another significance of these synthesis results seems to be that they provide a lower bound on the difficulty of synthesis; i.e., other systems with very weak input information can be expected to have considerably greater difficulty in synthesis than this approach. That is, if synthesis is difficult for some problems with this approach, it will be virtually impossible, in practical terms, for other approaches based on less complete information.
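The backtracking enumeration just described can be sketched as follows (illustrative Python, not Biermann's implementation; the trace encoding, with None marking the absence of a satisfied test, is our own assumption). Preferring to reuse an existing line before opening a new one biases the depth-first search toward short programs, though unlike Biermann's system it does not guarantee the very shortest, and it uses none of his pruning:

    def synthesize(trace, max_lines):
        """Search for a branching program consistent with `trace`.
        trace: list of (cond, instr) pairs; cond is None when no satisfied
        test preceded the instruction (the first element is assumed
        unconditional).  Returns (ops, branch) or None, where ops[i] is
        the instruction at line i and branch[(i, cond)] is the next line."""
        ops = [trace[0][1]]            # the first element fixes line 0
        branch = {}

        def place(k, cur):
            if k == len(trace):
                return True            # every trace element accounted for
            cond, instr = trace[k]
            key = (cur, cond)
            if key in branch:          # successor already forced
                nxt = branch[key]
                return ops[nxt] == instr and place(k + 1, nxt)
            # Unforced move: try existing lines holding this instruction
            # first, then a new line if the limit L allows.
            targets = [i for i, op in enumerate(ops) if op == instr]
            if len(ops) < max_lines:
                targets.append(len(ops))
            for nxt in targets:
                new = nxt == len(ops)
                if new:
                    ops.append(instr)
                branch[key] = nxt
                if place(k + 1, nxt):
                    return True
                del branch[key]        # contradiction downstream: back up
                if new:
                    ops.pop()
            return False

        return (ops, branch) if place(1, 0) else None

    trace = [(None, "I1"), (None, "I2"), (None, "I2"), (None, "I2"),
             (None, "I2"), ("C1", "I1"), (None, "I2"), ("C1", "I2"),
             (None, "I2"), ("C1", "I1"), ("C2", "I3")]
    print(synthesize(trace, 4))        # finds a 4-line consistent program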
6. Game-Playing Programs
We assume that the reader is familiar with the concept of finite games; the fact that checkers and chess are finite games is not especially useful for programming a computer to play them, mainly because of the enormous size of the search space. In this section, we confine ourselves to indicating what sort of progress has been made in the past few years in game-playing programs, rather than attempting to give detailed discussions of the theoretical foundations. There has been an increasing awareness that early methods based on a numerical evaluation function to be minimaxed and backed up so as
to enable a move to be chosen are inadequate for complex games, because the single backed-up number cannot represent useful structural information; there is too much averaging going on. Of course, this recognition has been far more important to chess than to any other game. Confining ourselves to this basic evaluation-function model for now, the basic technique has been to consider all alternative moves for a given depth, search all continuations to a fixed depth, evaluate each of the board positions to obtain a number indicating the "promise" of that position (by using the evaluation function), minimax back to the successors of the node which corresponds to the state of the game, and thus choose the best move. The part of the above scheme which is specific to the game is the evaluation, in which the knowledge and expertise of the programmer (or those borrowed from an expert) come into play. There are numerous variations of the above basic technique. For instance, instead of searching to a fixed depth, one might search until a so-called dead position is reached, i.e., the board position is stable in some sense; one could use more complicated backup procedures; and so on. There are interesting techniques for computational efficiency, such as the alpha-beta pruning procedure. Restrictions of space forbid a detailed examination of these and other very important techniques. We refer the reader to a text such as Slagle (1971) for more details. In checkers, an evaluation function may be of the following form (Slagle, 1971): 6k + 4m + u, where k is the king advantage, m is the (plain) man advantage, and u is the undenied mobility advantage. The undenied mobility of a player in a position is the number of moves he has such that the opponent can make no jumps in the successor positions. Thus, this static evaluation function measures in some sense the "promise" of a board configuration. Instead of searching to a fixed depth, one might use some more involved termination criterion such as "stop searching if k levels have been reached and if the position is dead (i.e., no immediate jumps are available)."
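A minimal sketch of this evaluation-and-minimax scheme follows (illustrative Python, not Samuel's program; the feature extractors, move generator, and dead-position test are assumed to be supplied elsewhere):

    def evaluate(board):
        # Static evaluation 6k + 4m + u from Slagle (1971): king advantage,
        # (plain) man advantage, and undenied mobility advantage.
        return (6 * king_advantage(board) + 4 * man_advantage(board)
                + undenied_mobility(board))

    def minimax(board, depth, computer_to_move):
        # Search continuations to a fixed depth (or to a dead position)
        # and back the static values up the tree.
        if depth == 0 or is_dead(board):
            return evaluate(board)
        values = [minimax(b, depth - 1, not computer_to_move)
                  for b in successors(board)]
        if not values:                 # no legal moves: game over
            return evaluate(board)
        return max(values) if computer_to_move else min(values)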
In 1967, Samuel described a very powerful checkers program. The termination criteria included game over, minimum depth, maximum depth, and dead position. There are many other heuristics controlling search. The features are man advantage, king advantage, mobility advantage, and a large number of others. Samuel formalized these features in his program, but they were obtained informally from checker experts. There are two aspects of "learning" incorporated into the program: generalization and rote learning. In generalization learning, the program improves its evaluation function (i.e., the weighting of features) by experimenting with it until its moves agree with the moves recommended by experts a large percentage of the time. The rote learning simply keeps in memory book moves (those recommended by experts) as well as positions which have already been evaluated by the program and which occur often in games, thus giving it an edge over the ordinary player and saving time. While a programmer might be able to come up with a set of features, how they should be combined to form the evaluation function is always an open question. Samuel originally used a simple linear weighting and let the learning program come up with the appropriate weights. However, quite often it might happen that interactions between some features are as important as the features themselves. This could in principle be handled by extending the evaluation to include all possible combinations of features, pairs, triples, etc., and letting the learning program find coefficients for all these added components. However, in practice, the number of coefficients would become enormous. Samuel has come up with a so-called signature table technique, which is a practical solution to this problem and which is general and not specific to checkers. Because of its potential usefulness, we describe this technique briefly. On heuristic grounds it is often possible to come up with subsets (not necessarily disjoint) of the set of features, such that the subsets contain features whose interactions are likely to be important. The subsets are called signature types. Let us further assume that each of the features has been quantized to a small number of values, say n. If a signature type contains, say, m features, then n^m possibilities exist for interaction between the features in a signature type. In the learning phase the relative desirabilities of combinations of feature values can be estimated, and a function which takes as argument one of the n^m possibilities and outputs an evaluation could serve to characterize the interactions between that subset of features. For greater efficiency, each of these functions can be quantized to a small number of values. A further advantage of doing that is that the outputs of the various functions corresponding to different signature types can now be regarded as features at a higher level, and the signature table procedure can be repeated. A good feel for which features ought to be collected together in each signature type and how finely they should be quantized is essential for the success of this approach. Samuel considers reducing the memory requirements by taking into account some symmetries, but we content ourselves with this brief introduction to the signature table technique. It must be added that this technique has significantly improved the Samuel checker-playing program.
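In outline, a two-level signature table arrangement might look like the following (an illustrative Python sketch, not Samuel's code; the grouping of features into signature types, and the table contents that learning would fill in, are placeholders):

    def table_lookup(table, values, default=0):
        # A signature table maps a tuple of quantized feature values to a
        # learned, quantized desirability.
        return table.get(tuple(values), default)

    def signature_evaluate(features, signature_types, tables, top_table):
        # features: dict of feature name -> quantized value (say 0..n-1).
        # Each first-level table covers one signature type (a subset of
        # features with important interactions); its quantized outputs
        # are treated as higher-level features combined by a second level.
        outputs = [table_lookup(t, [features[name] for name in names])
                   for names, t in zip(signature_types, tables)]
        return table_lookup(top_table, outputs)

With n values per feature and m features per type, each first-level table has at most n^m entries, which is far fewer coefficients than enumerating all pairs, triples, etc., of the full feature set.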
Computer programs have been written to play a variety of other games, such as Kalah, different kinds of card games, Qubic, etc., and references to these programs can be found in Slagle (1971). However, computer chess has the most magical attraction among all such games. The present author's knowledge of chess is very meager, and thus we can do no more than mention some of the relevant papers in the area. The program by Greenblatt et al. (1967) is, among the programs that have been published, the most successful by common agreement. Its one big distinction has been that it beat Dreyfus, who, as we have had occasion to mention earlier, is a severe critic of artificial intelligence and who had especially scoffed at computer chess. Berliner (1973) has discussed "some necessary conditions for a master chess program." He believes that the structure of today's most successful programs cannot be extended to play master-level chess. He also believes that tree-searching models for games like chess have a basic weakness: the horizon effect, which causes unpredictable evaluation errors due to an interaction between the static evaluation function and the rules for search termination. He presents an outline of an approach which he believes avoids these difficulties. Berliner's comments on the horizon effect are closely related to the degradation and loss of structural information in purely numerical measures that we alluded to earlier. The book by Botvinnik (1970) holds a special interest in view of the fact that the author is both a chess grand master and an engineer. The paper by Gillogly (1972) describes a chess program based on a brute-force search of the move tree with material as the only evaluation function, and with no forward pruning. The idea is that because of its "transparent structure," this program can be used as a benchmark, providing a lower bound on performance for a given number of "plys," i.e., levels to which search is conducted. To give an idea of the quality of computer chess, Greenblatt's program won the Massachusetts Class D amateur trophy in 1967. It is an honorary member of the U.S. Chess Federation, and has been given a tournament rating of 1,400 (Jackson, 1974).

7. Some Learning Programs

7.1 General Remarks
The term "learning" has many meanings, from the simple to the profound. Systems where parameters are adjusted as a function of past inputs and outputs can be considered learning systems. On the other hand, the processes involved in learning by humans, ranging from simple skills to complex concepts, are generally far more complex than this, even though it is undoubtedly true that for any learning phenomenon there does exist some abstract space in which some parameters have been updated. Since we are looking for insights for building intelligent systems, we will not discuss various reinforcement models of learning, which seem
to us of only peripheral importance for AI. Similarly, in the pattern recognition literature (see the text by Duda and Hart, 1973) considerable work has been reported on "learning" the probability densities of the various classes. These learning algorithms are based on statistical theories of estimation and consist of various refinements for different a priori knowledge and structural assumptions. Rosenblatt's (1960) Perceptron incorporated a model of learning by reinforcement, and some interesting results were obtained for simple Perceptron configurations; but as the configurations get more complex, analysis becomes intractable, and thus interest in Perceptrons as learning devices has leveled off. However, more recently Minsky and Papert (1969) have obtained very interesting results on Perceptrons viewed as parallel computers, but the content of those results is not of direct interest to us here. An example of a direct application of learning concepts is the previously discussed program of Samuel for playing checkers. There is a proposed learning section to the program MULTIPLE (Slagle, 1971, Chapter 7), and here learning seems to be of a simple kind. A general problem-solving system with some learning features is that of Horman (1964). We have done no more than refer to the above learning programs, mainly because of the self-imposed constraint in the Introduction to deal with programs of a strong heuristic programming content. The most important such program is that of Winston (1970) for "learning structural descriptions from examples."

7.2 Structural Learning
There are two major themes in Winston's work: (1) good descriptions are essential, and (2) learning (or teaching) is a process that should involve carefully chosen examples of concepts as well as near misses. His program operates in the world of blocks (cubes, wedges, bricks, etc.), and the sorts of structures about which the machine learns are exemplified by "the arch," "the house," "the pedestal," etc. The arch, for instance, is a block of some kind supported by two bricks not touching each other. The data structure Winston uses for describing the scene is a network, where the nodes can be objects, scenes, or concepts, and the links are the different kinds of relations that the nodes may have to each other. Note that relations themselves may, in their capacity as concepts, be represented by nodes. For example, in "The WEDGE is SUPPORTED-BY the BLOCK," SUPPORTED-BY is a relation. In "SUPPORTED-BY is the opposite of NOT-SUPPORTED-BY," SUPPORTED-BY (as well as NOT-SUPPORTED-BY) is a concept. As an example, consider a scene composed of two bricks, A and B, one standing to the left of
the other. In the following network representation, each node is listed with its relations following a colon. (The reader is urged to convert our representation into network form for ease of comprehension. For example, {A: R1 B, R2 C; B: R1 C} will represent a network with three nodes A, B, and C, with A connected to B by relation R1 and to C by relation R2, etc.) The scene would have the network {SCENE: ONE-PART-IS A, ONE-PART-IS B; A: HAS-PROPERTY-OF STANDING, A-KIND-OF BRICK, LEFT-OF B; B: HAS-PROPERTY-OF STANDING, A-KIND-OF BRICK, RIGHT-OF A}. The way we write the network automatically incorporates the direction of the arrow in the relational links. For those nodes that are both concepts and relations, a network structure might be {A: LEFT-OF B; B: RIGHT-OF A; LEFT-OF: OPPOSITE RIGHT-OF; RIGHT-OF: OPPOSITE LEFT-OF}. Some other relations that are used in description are IN-FRONT-OF, ABOVE, ABUTS, and HAS-PROPERTY-OF. Given a scene, there is a cluster of preprocessing programs which goes through the following sequence: find lines, classify vertices, use Guzman's (1968) algorithm for grouping various regions as belonging to the same object, decide on the relations that exist among objects (these work on the basis of a collection of heuristics, which are interesting in themselves), roughly classify objects on the basis of size, group objects in a scene for a more useful hierarchic description later on, and describe groups in terms of a typical member (if appropriate). To give an idea of how the group analysis contributes to the network description, consider a group composed of three standing bricks arranged in a tower. Let node A stand for this group as an entity. In order to represent the group, the notion of a "typical member" is helpful, which in this case would be a brick most likely standing on top of another brick (the bottommost one obviously does not, but then it is not "typical"). Let the entity typical member be denoted by B, and let the nodes in this group be denoted M1, M2, and M3. Then the network representation for the group would be: {A: ONE-PART-IS M1, ONE-PART-IS M2, ONE-PART-IS M3, A-KIND-OF GROUP, TYPICAL-MEMBER B, FORM SEQUENCE; B: SUPPORTED-BY ANOTHER-MEMBER, HAS-PROPERTY-OF STANDING, A-KIND-OF BRICK}. Once these preliminary programs grind through and produce network descriptions for different scenes, a powerful network matching program is used to compare them. Given two networks, it decides which nodes of the two are "linked," i.e., correspond in the sense that they have the same function in their respective networks.
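Rendered in present-day notation, the two-bricks scene above might be stored as a dictionary mapping each node to its outgoing relational links (an illustrative Python sketch only; Winston's program was written in Lisp, and the variable name is our own):

    two_bricks = {
        "SCENE": [("ONE-PART-IS", "A"), ("ONE-PART-IS", "B")],
        "A": [("HAS-PROPERTY-OF", "STANDING"), ("A-KIND-OF", "BRICK"),
              ("LEFT-OF", "B")],
        "B": [("HAS-PROPERTY-OF", "STANDING"), ("A-KIND-OF", "BRICK"),
              ("RIGHT-OF", "A")],
        # Relations can themselves be nodes, i.e., concepts:
        "LEFT-OF": [("OPPOSITE", "RIGHT-OF")],
        "RIGHT-OF": [("OPPOSITE", "LEFT-OF")],
    }

The direction of each relational arrow is implicit in which node's entry the pair appears under, just as in the textual notation.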
For instance, if we have two networks, one {D1: ONE-PART-IS M1, ONE-PART-IS M2; M1: SUPPORTED-BY M2, A-KIND-OF WEDGE; M2: A-KIND-OF BRICK} and the other identical to this except that the nodes D1, M1, and M2 are represented D1', M1', and M2', then D1-D1', M1-M1', and M2-M2' are linked pairs and will be so declared by the matching program. Now, if the scenes are not identical, then the program links pairs which do not have identical descriptions, but are "most similar." If there are two networks {A: Q G1, P C1, P C2; C1: P H; C2: Q G2, P E1, P J, P K; E1: T C2, U M, R F1; F1: R N} and {A': Q G2', P C1', P C2'; C1': P H; C2': P J, P K, P E1', P E2'; E1': T C2', L F1'; E2': U M; F1': R N}, where P, Q, etc., are relations and L is a don't-care (for our purposes) relation, then the matching program would come up with the linked pairs A-A', G2-G2', C1-C1', C2-C2', E1-E1', and F1-F1'. Readers might find the above somewhat difficult to visualize, and a net diagram might be helpful. A skeleton is then formed, which is something like a copy of the structure common to the two networks. Each linked pair is a node in the skeleton, with the pointers between the nodes in the skeleton corresponding to the pointers on the basis of whose identity the linking of pairs was done. Attached to the skeleton is a second group of nodes, called C-notes, for comparison notes. An example is the intersection C-note, which corresponds to the situation in which both elements of the linked pair have the same pointer to the same concept. For the networks {A1: A-KIND-OF WEDGE} and {A2: A-KIND-OF WEDGE}, A1 and A2 are a linked pair, the skeleton would be {A: A-KIND-OF WEDGE}, and the C-note network whose entry point is, say, B would be {B: C-NOTE C; C: A-KIND-OF INTERSECTION, DESTINATION WEDGE, POINTER A-KIND-OF}. A more complicated C-note is the supplementary-pointer C-note, an example of which applies to two scenes in one of which a cube A1 is on top of another A2, and in the other, a cube A1' is to the left of the other cube A2'. One way to describe the difference is that in one scene, one cube is supported by the other, and in the other this is not true of the two cubes. If both the scenes represented one cube supported by the other, then the skeleton would be {A: SUPPORTED-BY B}. As it is, the C-note is {A: C-NOTE C; C: A-KIND-OF SUPPLEMENTARY-POINTER, POINTER-DESTINATION B, MISSING-POINTER SUPPORTED-BY}. Other kinds of C-notes are EXIT, for one network containing more concepts than another; NEGATIVE-SATELLITE-PAIR, for the two networks being identical except for a pointer in one representing the opposite of the relation that the
corresponding pointer in the other does; and similarly MAY-BE-SATELLITE-PAIR, etc. Then there are C-notes corresponding to the case where linked pairs represent different but closely related concepts. We have taken pains to cover the description mechanism in some detail, because it is powerful and uniform and perhaps capable of being used in many different applications. Next we briefly go through the basis of the program for learning structural descriptions. The idea is to show the machine various examples of a particular structure, carefully chosen so that the set contains all the different ways a structure might be realized, and also to show it near misses, so that it can realize what properties, essential but missing, caused the near miss not to be an example of the structure. The program builds a model from the descriptions of the various examples by comparing the networks by means of its repertoire of C-notes. Consider the structure "house," an example of which is given by the net {SCENE: ONE-PART-IS 01, ONE-PART-IS 02; 01: A-KIND-OF WEDGE, SUPPORTED-BY 02; 02: A-KIND-OF BRICK}. This is also the tentative model of the "house." Next a near miss is presented: {SCENE: ONE-PART-IS 01, ONE-PART-IS 02; 01: A-KIND-OF WEDGE; 02: A-KIND-OF BRICK}. The difference is that the near miss is missing the pointer SUPPORTED-BY. Now the model is {SCENE: ONE-PART-IS 01, ONE-PART-IS 02; 01: A-KIND-OF WEDGE, MUST-BE-SUPPORTED-BY 02; 02: A-KIND-OF BRICK}. Then two further near misses, {SCENE: ONE-PART-IS 01, ONE-PART-IS 02; 01: A-KIND-OF BRICK, SUPPORTED-BY 02; 02: A-KIND-OF BRICK} and {SCENE: ONE-PART-IS 01, ONE-PART-IS 02; 01: A-KIND-OF WEDGE, SUPPORTED-BY 02; 02: A-KIND-OF WEDGE}, are shown. The final model now is {SCENE: ONE-PART-IS 01, ONE-PART-IS 02; 01: MUST-BE-A-KIND-OF WEDGE, MUST-BE-SUPPORTED-BY 02; 02: MUST-BE-A-KIND-OF BRICK}, which is correct. In this case, the model was built by a suitable conversion into MUST-BE relations (a sketch of this updating step appears at the end of this section). There have been many other programs that have learning components with a strong heuristic programming flavor, of course. The paper by Fikes et al. (1972), which we mentioned earlier and which deals with a program that enables a robot to generalize its plans, can be considered a learning program in a sense. Waterman (1970) discusses a method of representing heuristics as production rules and of dynamically manipulating them (the latter is essential if heuristics are to be "learnt"), and in some sense his program creates, evaluates, and modifies its own heuristics. The learning is of a limited type; it is not structural, but rather consists of changing certain parameters in heuristics of a prespecified structure.
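Returning to Winston's house example, a minimal sketch of the MUST-BE conversion step follows (illustrative Python; Winston's actual program works through the skeleton and C-notes described earlier, so this is a considerable simplification): a pointer present in the model but absent from a near miss is taken to be essential and is converted to its MUST-BE form.

    def refine_with_near_miss(model, near_miss):
        # model, near_miss: dicts mapping a node to its list of
        # (relation, destination) pairs, as in the network sketch above.
        refined = {}
        for node, pointers in model.items():
            refined[node] = []
            for relation, dest in pointers:
                if (relation, dest) in near_miss.get(node, []):
                    refined[node].append((relation, dest))   # shared: keep
                else:
                    # Essential but missing from the near miss.
                    refined[node].append(("MUST-BE-" + relation, dest))
        return refined

Applied to the tentative house model and the first near miss, the SUPPORTED-BY pointer becomes MUST-BE-SUPPORTED-BY while the shared A-KIND-OF pointers survive unchanged; the later near misses, which substitute BRICK or WEDGE for the right object kinds, similarly force the MUST-BE-A-KIND-OF pointers of the final model.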
8. Heuristic Search

8.1 Hill Climbing
It is by now a truism that artificial intelligence is really intelligent search in a space of possible solutions to a problem. It is interesting that as early as 1947, Turing viewed the problem of designing intelligent machines in the context of search (see the reprint of that article, Turing, 1970). Minsky (1961) discusses some of the issues involved in search, including the so-called "mesa phenomenon," where a small change in the value of some of the parameters of the system leads either to practically no change at all or to a very large change in the value of the goal function, during the process of searching for the optimum parameters for the system. If the machine is searching in a flat region of the parameters-goal function space, then much aimless wandering could result. This early paradigm, which is called hill climbing, may also operate at different levels arranged in hierarchies or recursive structures. Minsky suggested that perhaps what may be hill climbing at one level may have the appearance at a lower level of a sudden jump or "insight."
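A minimal sketch of the hill-climbing paradigm (illustrative Python; the neighborhood and goal functions are assumptions) makes the mesa phenomenon visible: on a flat region, no neighbor improves the goal function and the search stalls.

    def hill_climb(x, neighbors, goal):
        # Repeatedly move to the best neighbor as long as it improves the
        # goal function; stop at a local optimum or on a flat "mesa".
        while True:
            best = max(neighbors(x), key=goal, default=x)
            if goal(best) <= goal(x):
                return x
            x = best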
In recent work in AI, however, the mathematical formulation of search is viewed more in terms of searching for a path in an abstract graph.

8.2 Search in Graphs

Abstractly, a problem-solving task can be posed as one of transforming an initial state to a goal state by means of a sequence of permissible operators; i.e., there exists a graph in which the top node is the initial state, one or more of the tip nodes are goal states, and a path is to be found connecting the initial node to one of the goal nodes. For most nontrivial problems, exhaustive enumeration of all possible paths is out of the question due to the enormous size of the graph, thus producing a need for intelligent enumeration. (There is of course another, more important problem: what is the intelligence required to convert a problem posed in a different language into the domain of searching for a path in a graph? This problem is nontrivial, but the important thing is that, for many problems, even if a suitable representation is given, the search for a solution path is still nontrivial.) In the case of application of this concept to games, a slight modification is required.
Because of the minimax aspect, the problem becomes one of searching for a subgraph in a so-called AND-OR graph. The basic idea behind AND-OR graphs can be understood by considering a finite game tree. The tip nodes of the graph may be associated with win or loss (say) for the computer. Thus the general procedure is to minimax backwards. Consider a node and its successors. If that node corresponds to a move by the computer, then the node can be assigned the value "win" if any one of the successors has the value "win." This node will be called an OR node, for obvious reasons. On the other hand, if that node corresponds to a move by the opponent, the computer, in minimaxing, can assign the value "win" to that node only if all the successors have the value "win." This node then becomes an AND node. The solution graph in this case would consist of all the paths that are possible at an AND node and the best path at each OR node (a small sketch of this labeling follows this paragraph). While we have posed the game-playing problem in this fashion, many other tasks, particularly tasks that can be viewed as collections of subtasks, each of which can in turn be viewed similarly, and so on, until the subtasks can be directly solved or recognized as unsolvable, can be cast in the form of finding an AND-OR solution graph in the space of subtasks. The textbook by Nilsson (1971) deals with the search paradigm in AI in a thorough and readable manner, so we omit any detailed discussion of how tasks can be converted into path- or subgraph-search problems. For our purposes, it is sufficient to consider the abstract model as given. Given any state, its successors can be generated by applying all the applicable operators to the task configuration corresponding to that state. There might be a cost associated with each path, say f(m, n) from node m to node n. Let the start node be s and the set of goal nodes be G. Then the problem is: find a path from s to g ∈ G such that f(s, g) is minimum. Standard breadth-first and depth-bounded depth-first methods are guaranteed to find a solution if there is one. However, suppose the designer has some problem-specific information (heuristic information) which enables him to estimate, for any given node, the cost of a minimal-cost path from that node to a goal node, i.e., he has some idea of how difficult or easy it is to complete the task at any point in the task configuration; then, essentially, he can use this information to possibly effect some savings in the number of paths he should consider (which, for a tree, would grow exponentially with the distance of the goal node from the initial node) before finding the optimal path. Obviously, how much saving is made depends upon the goodness of the estimate. If the estimate is bad enough, he might end up missing the optimum path altogether. However, it is possible to obtain conditions on this estimate guaranteeing finding the optimal path, if such a path exists.
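The win/loss labeling just described can be written down directly (an illustrative Python sketch; the helper functions children and is_win are assumptions):

    def win_value(node, computer_to_move):
        # Label a finite game tree with win/loss for the computer.
        kids = children(node)
        if not kids:
            return is_win(node)          # tip node: win or loss
        if computer_to_move:             # OR node: any winning move will do
            return any(win_value(k, False) for k in kids)
        return all(win_value(k, True) for k in kids)   # AND node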
The cost f(n) of an optimal path constrained to go through any node n can be written as the sum g(n) + h(n), where g(n) is the cost of the optimal path from s to n and h(n) is the cost of the optimal path from n to g. For the node n, if a path from s has been obtained, an estimate f̂(n) of f is ĝ(n) + ĥ(n), where ĝ(n) is the cost of the best path so far from s to n, and ĥ(n) is the heuristic estimate of h(n), based on problem knowledge. Consider the following algorithm:

(1) Place the start node in a set A and compute f̂(s).
(2) If A is empty, exit with failure; else, continue.
(3) Take the node in A with the smallest f̂ value and check whether it is a goal node. If it is, trace the path back to s and exit with success. If it is not a goal node, find its successors, compute their f̂ values (if a node is regenerated with a smaller ĝ value, keep the smaller value and the associated path), put the successors in A, and put the parent in another set B.
(4) Go to (2).

If ĥ = 0 for all n, essentially there is no heuristic information and there is no saving. However, if ĥ = h (perfect knowledge), then only the nodes on the optimal path get selected every time the algorithm goes to (2), thus obtaining the optimal path in the shortest possible time. Intuitively, if 0 < ĥ < h, then some efficiency would be obtained, and the optimal path will still be guaranteed. On the other hand, if ĥ > h, then there is no guarantee that the optimal path would eventually be produced. The above statements, made more precise, are in fact the substance of the Hart-Nilsson-Raphael theorem (see Nilsson, 1971).
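The algorithm enumerated in steps (1)-(4) above is, in present-day terms, essentially the A* procedure; a minimal sketch follows (illustrative Python; the successor generator and the heuristic ĥ are assumptions):

    import heapq
    from itertools import count

    def a_star(start, is_goal, successors, h):
        tie = count()                    # break f-hat ties without comparing nodes
        open_set = [(h(start), next(tie), start)]   # the set A, ordered by f-hat
        g = {start: 0}                   # best g-hat found so far
        parent = {start: None}
        closed = set()                   # the set B
        while open_set:
            _, _, n = heapq.heappop(open_set)
            if is_goal(n):
                path = []                # trace the path back to s
                while n is not None:
                    path.append(n)
                    n = parent[n]
                return path[::-1]
            if n in closed:
                continue                 # a stale, superseded entry
            closed.add(n)
            for m, cost in successors(n):
                tentative = g[n] + cost
                if tentative < g.get(m, float("inf")):
                    g[m] = tentative     # regenerated with a smaller g-hat
                    parent[m] = n
                    heapq.heappush(open_set, (tentative + h(m), next(tie), m))
        return None                      # set A empty: failure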
While the condition ĥ ≤ h guarantees the generation of the optimal path, in practice quite often there is a bias toward efficiency at the risk of not finding the optimal path, as long as some reasonable path is found. As a matter of fact, practical memory constraints often demand pruning the set A, so that some of the nodes with the largest f̂ values, possibly including a node that is on the optimal path, may have to be discarded just so more memory space is released and the search can continue. A similar algorithm, with similar f, g, and h functions, is applicable to the case of AND-OR graphs also. In connection with game-playing programs, for whose search space the abstract models are AND-OR graphs, we already mentioned the alpha-beta heuristic and other techniques to improve efficiency. For details on these and related matters, see Nilsson (1971) or Slagle (1971). More recently, Pohl (1973) has suggested an innovation in the form of explicit dynamic weighting of the heuristic information, the weight being inversely proportional to its depth in the search tree, thus producing a narrower search tree. However, because the weighted estimate is still a lower bound on h, this modification still guarantees finding an optimal path. Harris (1973) introduces the notion of a "bandwidth" condition on the heuristic evaluation function. This condition is equivalent to demanding that h − d ≤ ĥ ≤ h + e for nonnegative d and e. When d > h and e = 0, this condition becomes equivalent to the lower bound condition on ĥ. Harris shows that for general values of d and e, this is a convenient method for coping with the practical constraints on time and space. Most commonly seen textbook applications of these concepts have been to puzzles and games. Montanari (1970) has applied the Hart-Nilsson-Raphael theorem, and the model in which the theorem is embedded, to a very practical problem: chromosome matching. Consider a two-dimensional pattern space in which measurements made on a set of chromosomes are plotted. Since it is known that chromosomes occur in pairs, chromosomes that belong to a pair should be in some sense close together in the pattern space. The problem is to automate the process of deciding which chromosomes in the pattern space constitute pairs. This problem can be seen to be an instance of the "weighted matching" problem, for which there are algorithms available, but they are very inefficient. Montanari shows how the weighted matching problem can be converted to a shortest path problem. Then he uses the fact that typically the chromosomes are found in clusters in the pattern space to obtain a lower bound on h, and thus directly uses the heuristic search algorithm with a guaranteed optimal solution. However, there is always a question about whether a certain representation for the search space and the operators is a natural or efficient one. Hall (1971) considers the Montanari solution labored and holds that a direct branch-and-bound approach is the natural way to tackle the problem.

8.3 Application to Chemical Synthesis
An interesting example of the application of AND-OR graph search concepts is that of organic chemical synthesis (Sridharan, 1973). Here AND-OR graphs come about in the following way. The problem is to find a synthesis of an organic chemical molecule from available chemical compounds, taking into account feasibility and valid reaction pathways. Thus the top node will be the target molecule; each of the other nodes, except the tip nodes, will be a potential candidate for a molecule from which the top node can be synthesized. A node is an AND node if all of its successors are needed in the generation of the molecule corresponding to that node, and it is an OR node if that molecule can be synthesized with any of the successor molecules. The arcs correspond to valid reaction paths. Now, it can be readily seen that the task of synthesis is one of searching for a solution subgraph in an AND-OR graph. However, instead of using the expertise of the chemist in the form of estimates of an h function, the
search strategy is more complex and more structured. While this may not lead to interesting theorems, a more structured search strategy, which attempts to keep track of information in an intelligent way instead of in the form of one numerical measure, is bound to be a more valuable component of an intelligent system. Incidentally, as long as we are considering chemical applications, we should mention the Heuristic DENDRAL program (Buchanan et al., 1969), which attempts to come up with a molecular structure from the mass spectrum and the empirical formula of the molecule. There is some heuristic search involved, but it does not use any of the formalisms we have referred to. The strong point of the program is the extensive knowledge of that specific domain of chemistry which has been programmed in.
9. Pattern Recognition and Scene Analysis

9.1 Formal Approaches
In the general category of work called pattern recognition, there have been two main lines of approach: (1) those based on statistical classification procedures applied in a feature space, where the features are attributes whose values are measured for all patterns and, depending upon the choice of the features, the members of the various classes occupy regions in the feature space which are easily separable; and (2) those based on the structure of the patterns, i.e., those in which the logic of the juxtaposition of the elements constituting the pattern becomes decisive. There is an extensive literature on the first approach, by now codified in various textbooks, an example of which is Duda and Hart (1973). We will not concern ourselves with this approach here. The second approach, viz., structural pattern recognition, has also had two different subdivisions. The first of these deals with formal grammars to describe and recognize patterns, while the second uses heuristic techniques to analyze scenes. The basic idea in the first approach is that in many situations there is an underlying basic, simple mechanism generating the elements of a possibly infinite pattern class. An example might be the line patterns observed in cloud chamber photographs. The formal system describing the rich variety of patterns of the traces of particles is governed by the relatively simple rules of particle physics. The rules expressing the generation of admissible cloud chamber traces can be considered a set of production rules in a grammar whose sentences are the traces. This insight is due to Narasimhan (1965). The usefulness of the idea is that a rather small, possibly recursive, description can be used
to characterize a potentially infinite set of objects. The grammar written by Watt (1966) to describe the generation of Nevada cattle brands is one of the more interesting examples of the application of these morphological notions. Miller and Shaw (1968) is a useful survey of the grammatical approach to pattern recognition. Clowes is an important and prolific contributor to this area; a typical reference to his work is Clowes (1969). Kanal and Chandrasekaran (1972) deals with the interaction between statistical and grammatical approaches. Watanabe (1971) offers a critique of syntactic approaches to pattern recognition. In fact, for real-life pattern recognition, such grammar-oriented approaches seem quite often ill-suited, and we shall not discuss them further.

9.2 Heuristic Techniques
The most useful pattern recognition approach, for artificial intelligence purposes, has been that based on heuristic techniques. Most of this work has centered around robot vision projects. A report by Feldman et al. (1969) summarizes some of the general problems and approaches pertaining to the Stanford hand-eye project. Other useful references are Nilsson (1969) and Popplestone (1969), which deal with the general approaches to computer vision in other robot projects. The first problem in any program for computer vision is the preprocessing required to get an approximate line drawing corresponding to the edges of the objects in the scene. The state of the art in scene analysis is such that most of the work pertains to scenes with simple objects, i.e., objects with straight-line edges, formed by polyhedra of various kinds. Because of the effects of shadows and the general quality of the pictures, the problem of obtaining object edges is nontrivial. Changes in illumination might eliminate lines that should be there or introduce extraneous lines. We shall not be concerned here with this problem, except to indicate that Duda and Hart (1973) and Rosenfeld (1969) contain an exhaustive list of techniques for edge detection, contour following, picture smoothing, isolating regions in pictures by gray-scale analysis, high- and low-pass filtering operations, and descriptions of line and shape. Some of the techniques are purely mathematical, while others have heuristic components to them. We will consider here, briefly, some approaches to a problem of perception: obtaining three-dimensional models from two-dimensional scenes. For instance, in a scene which is a picture of various polyhedra, some resting on top of others and some obscuring others, all that the two-dimensional scene would consist of would be various lines. From this level of description, the object is to proceed to the identification of the polyhedra that
give rise to those lines. Notice that some parts of some of the edges might be hidden. The notion of line semantics has been very useful in analyzing scenes containing polyhedra. The idea is that each of the lines in a scene can have only one of a few possible meanings. For instance, if the objects are all trihedral solids (exactly three plane surfaces intersect at each vertex), then any line can have one of the following meanings: it might be a line representing a concave edge (like that formed by the bottom line of a brick resting on the floor), a convex edge which does not hide another part of the scene, or a convex edge which hides some other part. Similarly, the vertices of the polyhedra have some properties which do not depend upon the particular pictorial representation (i.e., they are properties of the polyhedra) and some which depend upon it. Among the former is the type of the vertex, which depends upon the angles subtended by the plane faces meeting at that vertex. Among the latter is how the edges meeting at a vertex look in a particular pictorial representation, i.e., whether they form a Y, a T, a V, or an arrow. With this information, it is possible to arrive at a decision about which vertices and edges belong to which polyhedron. Huffman (1971) and Clowes (1971) give procedures to label the lines for this purpose. Guzman (1968) gives a collection of heuristic rules to enable the program to decide which closed regions in a two-dimensional scene ought to be grouped together as representing one object. The rules are very powerful, and they can deal with very complicated arrangements of polyhedra in a scene. However, it is not a straightforward matter to extend these techniques to the case where the object boundaries can be curved. Guzman's program requires that the edge extraction process work perfectly. Falk (1972) gives a procedure which can handle imperfect line data in its grouping of regions into three-dimensional objects. Mackworth (1973) describes a program which has objectives similar to those of Guzman's, but uses a more mathematical approach which enables the program to have somewhat more knowledge than Guzman's; thus, while Guzman's program tends to see holes in objects as separate objects, neither Clowes's nor Mackworth's program has this inadequacy. In a sense, Mackworth's approach can be seen as a theory underlying Guzman's heuristics concerning the relationship between vertex types and connectivity of faces. Our recapitulation of the above polyhedral scene analysis techniques, as well as the various preprocessing techniques, has been rather brief, mainly because the easily accessible text of Duda and Hart (1973, Part II) synthesizes the various strands of investigation very nicely. For the problem of determining which edges of solids will be hidden in a particular projection, Loutrel (1970) and Bouknight (1970) are useful references.
To give a flavor of the application of heuristic techniques to other problems dealing with visual patterns, we briefly describe an attempt at the generation of human facial imagery on a CRT by using a heuristic strategy (Gillenson and Chandrasekaran, 1974). Sketching a human face is a task which involves subtle spatial decisions and a knowledge of the aspects of the face that are important in recognition. These are talents non-artists lack. The interactive computer system is such that a non-artist can create, on a graphic display, a facial image from a photograph in front of him. The computer system contains prestored facial features, uses an "average" face as a starting image, and employs a heuristic strategy which presents the user with a series of choices for feature selection from a predetermined set of features or for simple adjustments of size and location. In this sense, the computer can be compared to the "police artist."

10. Cognitive Psychology and Artificial Intelligence

10.1 General Remarks
Newell (1970, 1973) has been the most eloquent writer on the interplay between cognitive psychology and AI. As we pointed out in our Introduction, AI is more and more evolving into a science of intelligence, and our report of work on computer semantics makes clear that there is a great deal of emphasis on how the mind might, if not does, go about doing some of the intellectual tasks involved in resolving meaning. There is a lot of work in AI that has nothing to do with human processes, of course. Nevertheless, the information processing model of intelligence has itself been a stimulant to psychology. At the very least, cognitive psychology and AI share many metaphors. Clowes (1972) discusses the relationship between artificial intelligence and psychology by considering as an example one of the central problems in AI: computer vision. He points out that the machine makes possible an empirical study of sophisticated, process-oriented models of intelligence and perception with the observational data of everyday experience, such as perceiving visual scenes. Psychology, faced with the individual quality of human cognitive performance, has skirted the problem, so to speak, by stressing experiments minimizing individual differences and maximizing repeatability. However, Clowes points out, the knowledge base with which an individual operates "makes possible the extremely complex yet integrated form of everyday experience and at the same time gives it an individual character. The AI programs begin to show both how this arises and the logical necessity for it."
There is a commonly held belief in the AI community that the ideal way to theorize in psychology is to produce theories in the form of computer programs, since the need to write programs to perform processes forces the theorist to eschew vagueness and gives a degree of operational concreteness. This seems to us overstated and misleading. True, there is nothing vague about a computer program. But sometimes it can be too concrete. Over and beyond those aspects of the process that the investigator wants to theorize about, computer programs require a level of detail and explicitness that has nothing to do with psychology: many ad hoc details and details to do with implementation. Which part of the code is theory, and which is merely ad hoc detail? When one attempts to validate such theories, these doubts become more insistent. If one remembers that information processing models in psychology are themselves a liberation from the narrow bondage of behaviorism, there seems to be every reason to be cautious about embracing another potential tyranny: instead of the only valid theories being in the form of stimulus and response, the only valid theories become computer programs. But interpreted properly, no doubt, computer program theories are, as Newell says, a sort of mental hygiene, helping to avoid the fuzziness of mentalistic terms, if one can do it. Early enthusiasts like Miller (1962; Miller et al., 1960) seem to have modified some of their enthusiasm for this approach recently (Miller, 1974). The most useful contribution made by AI to psychology seems to be that many AI programs can be viewed as "theoretical psychology" (Newell's term). This aspect of AI work ought to be distinguished from so-called cognitive simulation, which might borrow some metaphors here and there from AI, but does not concern itself with mechanism with the insistence that AI does. Newell makes out a case that symbolic systems (of which AI systems involved with heuristic programming constitute a subdomain) might provide a successful theory of human behavior. This view can be summarized as follows (see also Simon, 1970, for related notions). The discovery that human memory, short-term memory at any rate, appears to hold symbols, or chunks as Miller (1956) called them, leads to a view of the human as an information processing system with a small, constant short-term memory capacity. Work in psycholinguistics, based both on the symbolic models of Chomsky, e.g., Smith and Miller (1966), and on more directly AI-inspired work, also indicates the importance of symbolic system models. Recent experimental research on problem solving and concept formation (Restle, 1962; Haygood and Bourne, 1965; Reitman, 1965; Feigenbaum and Feldman, 1963) all testifies to a shift toward viewing the mind as a symbol processor in experimental psychology. Newell quotes extensively from recent work in immediate memory to establish
the point that in this area at least there is a recognition that in order to study it, one needs some notion of an immediate (symbol) processor. What looks at first as if it deals with, say, memory capacities and access characteristics turns out also to deal with the rest of the processor, the control structure, and the basic processing operations. Then Newell comes to AI per se. He quotes Quillian's work on semantic memory and Newell and Simon's work (1972) on cryptarithmetic problem solving as the kind of direct contributions of AI to cognitive psychology. We would perhaps add Winograd's and Schank's work to this list, as well as learning programs such as EPAM (Simon and Feigenbaum, 1964), the simulation of "probability learning" in binary choice experiments (Feldman, 1963), and concept learning (Hunt et al., 1966). The work by Winston on structural learning (Winston, 1970) has relevance to human learning of structural notions. We briefly consider some of these learning programs in Section 7.2. For a general introduction to the various aspects of computer simulation of cognitive behavior, some further references are Apter (1970), Newell and Simon (1972), and Norman (1969).
10.2 ANALOGY Program
While we are discussing cognitive simulation, a pioneering program to solve the kind of problems that often constitute IQ tests should be mentioned. This is the Geometric Analogy program of Evans (1963). [See also the introductory article (Minsky, 1966), which describes this and some other AI programs in a clear, readable fashion.] A typical question to this program is: Figure A is to Figure B as Figure C is to which one of five possible answer figures? Here A, B, C, and the answer figures are various patterns formed from simple geometrical objects. To give a trivial example, Figure A might be a triangle inside a square; Figure B, a triangle to the right of the square; Figure C, a circle inside a triangle; and the answer figures are various combinations of these geometrical objects, with, of course, one configuration of a circle to the right of a triangle. We can assume that the objects are normalized for size, etc. There are two parts to Evans' program. The first part of the program analyzes the figures (actually descriptions of figures; e.g., a circle would be described by the coordinates of the center and the radius), labels the objects, and provides relational descriptions holding among the geometrical objects in each figure. These relational descriptions are obtained by using analytic geometry subroutines.
For our Figure A, it will come up with the analysis: [triangle (location coordinates), square (coordinates), triangle to the right of square]. Then a "similarity calculation" is done on pairs of geometrical objects within each figure and between two figures. This is done by considering transformations of the following kinds: horizontal and vertical reflections, rotation, translation, size change, etc. Suppose one figure contains two triangles, the only difference between them being that one is twice as large as the other and rotated 90°. By searching in the space of possible transformations, the similarity calculator will discover this fact. Similar arguments apply to the similarity calculation for objects in two different figures. All these analyses are then handled by the second part of Evans' program. Thus the computations so far would reflect all objects, their locations, their relations in a figure, and whatever similarities might exist between objects. Next comes the generation of possible rules by which Figure A can be transformed into Figure B. The rule might look like this: "To get Figure B from Figure A, delete the small rectangle, and move the dot from inside the square to inside the circle." These rules must in some sense be minimal; otherwise a general rule of the following sort would be universally valid: "delete all objects in Figure A and place all objects in their respective positions in Figure B." Each rule is then applied to Figure C, and a check is made to see if one of the answer figures corresponds to the result of applying the rule to C. Sometimes the rules could be too minimal, in the sense that some sort of generalization of a minimal rule might be the one that would generate the answer figure. Of course, such generalizations need to be attempted only if none of the minimal rules produces one of the answer figures as the correct figure. Similarly, if two candidates for the correct answer are generated, then the program selects the figure corresponding to the "simpler" rule. The program does pretty well in most situations, but it is not capable of resolving the so-called gestalt aspect of figures; e.g., if one overlays a square on top of a triangle such that three small triangular portions stick out, with the lines of the triangle under the square hidden, then Evans' program would interpret it as three small triangles and a square. But this does not seem to be an insurmountable problem, and methods such as those of Canaday (1962) might be invoked to get around it. We have placed this description of the ANALOGY program in the section on cognitive psychology (even though, in its use of descriptions and analogical reasoning, it is a pioneering artificial intelligence program, quite apart from its interest for cognitive simulation) mainly because, as Minsky (1968) remarks in his Introduction, ". . . the details of the selection rules . . . amount, in effect, to Evans' theory of human behavior in such situations. . . . Something of this general character is involved in any kind of analogical reasoning."
10.3 Belief Systems
Before we conclude this section, a few remarks about various modelings of "belief systems" are in order. Colby (1973) remarks, ". . . belief is . . . a fundamental psychological process. Humans, as knowing systems, continually engage in judging whether their representations of experience are true or false, credible or incredible. How these credibility judgments can be simulated is the subject of this enquiry." While any operational belief system will have to have some way of handling natural language, almost all investigations in this area have concerned themselves more with the structure and processes beyond the natural language interface. Thus, while there are some superficial similarities to question-answering systems (both of them "answer questions," for instance), the methods as well as the aims of the two areas are somewhat different. In AI, representation, manipulation, and transformation of knowledge in the machine is naturally one of the main areas of investigation. In the simulation of cognitive processes, "belief" plays a role in the manner of an epiphenomenon, consequent to the possession of knowledge. Thus, the relation between knowledge and belief has a natural interest. Colby and his associates have simulated the belief structure of a paranoid patient, i.e., the computer's responses, judged by a jury of qualified psychiatrists, could not readily be distinguished from those of genuinely paranoid humans. [Weizenbaum (1974) is recommended therapy for excessive faith in the validity of simulations of this kind.] Simulation of "normal" belief structures is more difficult, but an understanding of the relations between knowledge and belief is bound to be of use to AI systems of the future. In addition to this potential relation, work in AI dealing with natural language has been of use to modelers of belief systems; the overlap lies in the common area of semantics and of the inferences that a particular semantic system permits. Thus Colby's work has a network structure, and Abelson (1973) uses Schank's conceptualizations as the basis of belief modeling. Much work remains to be done in this area, and the reader is directed to the above references for further details.
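As a purely expository sketch of what a "credibility judgment" might mean computationally (the structure, the numbers, and the averaging rule below are invented for illustration and assume nothing about Colby's or Abelson's actual programs), one can attach a credibility value to each held belief and judge a derived proposition against the beliefs that support it:

    # Hypothetical credibility network: a held belief carries its own
    # credibility; a derived belief is judged from its supports.
    from statistics import mean

    beliefs = {
        "people watch me":    {"credibility": 0.9, "supports": []},
        "watchers mean harm": {"credibility": 0.8, "supports": []},
        "I am in danger":     {"credibility": None,
                               "supports": ["people watch me",
                                            "watchers mean harm"]},
    }

    def credibility(name):
        b = beliefs[name]
        if b["credibility"] is not None:   # held (axiomatic) belief
            return b["credibility"]
        # Derived belief: judged against the beliefs it rests on.
        return mean(credibility(s) for s in b["supports"])

    print(round(credibility("I am in danger"), 2))   # prints 0.85

Even this toy makes the epiphenomenal character of belief visible: the credibility of "I am in danger" is stored nowhere; it is a consequence of the knowledge the system already holds.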
11. Concluding Remarks
The year 1974 finds the field of artificial intelligence in a somewhat sober upswing. The remarks of Williams quoted at the beginning of this article describe the mood well. The current generation of models is tough-minded: able and willing to accept constraints that are closer to real life. It must be added, however, that the problems are extraordinarily
difficult, in many ways close to the limits of the knowable. Whether current techniques will carry the field only to a plateau in the quality of its programs, or whether programs of undisputed intelligence will emerge, only the future can tell. For some limited tasks, mechanization is all but complete; e.g., MATHLAB (Martin and Fateman, 1971) shows a remarkable proficiency in certain kinds of formula manipulation. AI is also having an effect on education, as in Papert's (1971) work on "teaching children thinking," which is rooted in AI. See also Wickelgren (1974) for an attempt, so to speak, to teach people to be intelligent by a conscious adaptation of some AI techniques. A new emphasis is beginning to be laid on speech recognition (Newell et al., 1971), and many AI techniques are likely now to be tested on this difficult problem, just as the previous generation of techniques used robotics as a springboard. In fact, as Woods and Makhoul (1973) remark, "the speech understanding problem is almost a complete microcosm of the general robot planning problem and in some ways more difficult."

ACKNOWLEDGMENTS

The preparation of this review was supported by AFOSR Grant 72-2351. I wish to thank Dr. Peter Kugel of Boston College and Dr. Larry Reeker of the University of Oregon for their comments on an earlier draft of the manuscript.

REFERENCES

Abelson, R. P. (1973). The structure of belief systems. In "Computer Models of Thought and Language" (R. C. Schank and K. M. Colby, eds.), p. 287. Freeman, San Francisco, California.
Amarel, S. (1971). Representation and modeling in problems of program formation. Mach. Intell. 6, 411.
Anderson, D. B., and Hayes, P. J. (1972). "The Logician's Folly," AISB Bull., British Computer Society Study Group on Artificial Intelligence and Simulation of Behavior.
Apter, M. J. (1970). The computer modeling of behavior. In "The Computer in Psychology" (M. J. Apter and G. Westby, eds.). Wiley, New York.
Balzer, R. (1973). A global view of automatic programming. Proc. Int. Joint Conf. Artif. Intell., 3rd, 1973, p. 494.
Banerji, R. B. (1969). "Theory of Problem Solving: An Approach to Artificial Intelligence." Amer. Elsevier, New York.
Banerji, R. B., and Mesarovic, M. D., eds. (1970). "Theoretical Approaches to Non-Numerical Problem Solving." Springer-Verlag, Berlin and New York.
Barzdin, J. (1972). "On Synthesizing Programs Given by Examples." Computing Center, Latvian State University, Riga, U.S.S.R.
Berliner, H. (1973). Some necessary conditions for a master chess program. Proc. Int. Joint Conf. Artif. Intell., 3rd, 1973, p. 77.
Biermann, A. W. (1972). On the inference of Turing machines from sample computations. Artif. Intell. 3, 181-198.
Biermann, A. W., and Feldman, J. A. (1972). A survey of results in grammatical
inference. In "Frontiers of Pattern Recognition" (S. S. Watanabe, ed.), p. 31. Academic Press, New York.
Biermann, A. W., Baum, R., Krishnaswamy, R., and Petry, F. E. (1973). "Automatic Program Synthesis," TR-73-6. Computer and Information Science Research Center, Ohio State University, Columbus.
Bobrow, D. G. (1968). Natural language input for a computer problem solving system. In "Semantic Information Processing" (M. Minsky, ed.), p. 146. MIT Press, Cambridge, Massachusetts.
Bobrow, D. G., and Fraser, J. B. (1969). An augmented state transition network analysis procedure. Proc. Int. Joint Conf. Artif. Intell., 1st, 1969, pp. 557-568.
Botvinnik, M. M. (1970). "Computers, Chess and Long-Range Planning." Springer-Verlag, Berlin and New York.
Bouknight, W. J. (1970). A procedure for generation of three-dimensional half-toned computer graphics presentations. Comm. Ass. Comp. Mach. 13, 527-536.
Buchanan, B., Sutherland, G., and Feigenbaum, E. A. (1969). Heuristic DENDRAL: A program for generating explanatory hypotheses in organic chemistry. Mach. Intell. 4, 209.
Burstall, R. M., Collins, J. S., and Popplestone, R. J. (1971). "Programming in POP-2." Edinburgh Univ. Press, Edinburgh.
Canaday, R. H. (1962). The description of overlapping figures. Unpublished M.S. Thesis, MIT, Cambridge, Massachusetts.
Celce-Murcia, M. (1972). English comparatives. Ph.D. Dissertation, UCLA Dept. of Linguistics, Los Angeles.
Chandrasekaran, B., and Reeker, L. H. (1974). Artificial intelligence: A case for agnosticism. IEEE Trans. Syst., Man & Cybernetics 4, No. 1, 88-103.
Chang, C.-L., and Lee, R. C.-T. (1973). "Symbolic Logic and Mechanical Theorem Proving." Academic Press, New York.
Chomsky, N. (1957). "Syntactic Structures." Mouton, The Hague, Netherlands.
Clowes, M. B. (1969). Pictorial relationships: a syntactic approach. Mach. Intell. 4, 361.
Clowes, M. B. (1971). On seeing things. Artif. Intell. 2, 79-116.
Clowes, M. B. (1972). "Artificial Intelligence as Psychology," AISB Bull., British Computer Society.
Colby, K. M. (1973). Simulations of belief systems. In "Computer Models of Thought and Language" (R. C. Schank and K. M. Colby, eds.), p. 251. Freeman, San Francisco, California.
Coles, L. S. (1968). An on-line question-answering system with natural language and pictorial input. Proc. Nat. Conf. Ass. Comput. Mach., 23rd, 1968.
Dreyfus, H. (1972). "What Computers Can't Do: A Critique of Artificial Reason." Harper, New York.
Duda, R., and Hart, P. E. (1973). "Pattern Classification and Scene Analysis." Wiley (Interscience), New York.
Elliott, R. W. (1965). A model for a fact retrieval system. Ph.D. Dissertation, University of Texas Computation Center, Austin.
Ernst, G. W., and Newell, A. (1969). "GPS: A Case Study in Generality and Problem Solving." Academic Press, New York.
Evans, T. G. (1963). A program for the solution of a class of geometric-analogy intelligence-test questions. In "Semantic Information Processing" (M. Minsky, ed.), p. 271. MIT Press, Cambridge, Massachusetts.
Falk, G. (1972). Interpretation of imperfect line data as a three-dimensional scene. Artif. Intell. 3, 101-144.
Feigenbaum, E. A., and Feldman, J., eds. (1963). "Computers and Thought." McGraw-Hill, New York.
Feldman, J. (1963). Simulation of behavior in binary choice experiments. In "Computers and Thought" (E. A. Feigenbaum and J. Feldman, eds.), p. 329. McGraw-Hill, New York.
Feldman, J. A. (1972a). "Automatic Programming," Stanford Artificial Intelligence Project Memo AIM-160. Stanford University, Stanford, California.
Feldman, J. A. (1972b). Some decidability results on grammatical inference and complexity. Inform. Contr. 20, 244.
Feldman, J. A., Feldman, G. M., Falk, G., Grape, G., Pearlman, J., Sobel, I., and Tenenbaum, J. M. (1969). The Stanford hand-eye project. Proc. Int. Joint Conf. Artif. Intell., 1st, 1969, p. 521.
Fikes, R., and Nilsson, N. J. (1971). STRIPS: A new approach to the application of theorem proving to problem solving. Artif. Intell. 2, 189-208.
Fikes, R., Hart, P., and Nilsson, N. J. (1972). Learning and executing generalized robot plans. Artif. Intell. 3, 251-288.
Fikes, R. F. (1970). REF-ARF: A system for solving problems stated as procedures. Artif. Intell. 1, 27-120.
Findler, N. V., and Meltzer, B., eds. (1971). "Artificial Intelligence and Heuristic Programming." Amer. Elsevier, New York.
Friedman, J., Bredt, T. H., Doran, R. W., and Pollack, B. W. (1971). "A Computer Model of Transformational Grammar." Amer. Elsevier, New York.
Gillenson, M., and Chandrasekaran, B. (1974). A heuristic strategy for developing human facial images on a CRT. In preparation.
Gillogly, J. J. (1972). The technology chess program. Artif. Intell. 3, No. 3, 145-163.
Goldman, N. (1973). The generation of English sentences from a deep conceptual base. Ph.D. Dissertation, Stanford University, Stanford, California.
Green, B. F., Wolf, A. K., Chomsky, C., and Laughery, K. (1963). BASEBALL: An automatic question answerer. In "Computers and Thought" (E. A. Feigenbaum and J. Feldman, eds.), p. 207. McGraw-Hill, New York.
Green, C. (1969). Application of theorem proving to problem solving. Proc. Int. Joint Conf. Artif. Intell., 1st, 1969, p. 219.
Green, C., and Raphael, B. (1968). The use of theorem-proving techniques in question answering systems. Proc. Annu. Conf. Ass. Comput. Mach., 23rd, 1968.
Greenblatt, R., Eastlake, D., III, and Crocker, S. (1967). The Greenblatt chess program. Proc. AFIPS Fall Joint Comput. Conf., 1967, pp. 801-810.
Guzman, A. (1968). Decomposition of a visual scene into three-dimensional bodies. Proc. Fall Joint Comput. Conf., 1968, Vol. 33.
Hall, P. A. (1971). Branch-and-bound and beyond. Proc. Int. Joint Conf. Artif. Intell., 2nd, 1971, p. 641.
Halliday, M. A. K. (1970). Functional diversity in language as seen from a consideration of modality and mood in English. Found. Lang. 6, 322-361.
Harris, L. R. (1973). The bandwidth heuristic search. Proc. Int. Joint Conf. Artif. Intell., 3rd, 1973, p. 23.
Haygood, R. C., and Bourne, L. E., Jr. (1965). Attribute- and rule-learning aspects of conceptual behavior. Psychol. Rev. 72, No. 3, 175-195.
Hewitt, C. (1970). "PLANNER: A Language for Manipulating Models and Proving
Theorems in a Robot," AI Memo 168, Project MAC. Massachusetts Institute of Technology, Cambridge, Massachusetts.
Horman, A. (1964). How a computer system can learn. IEEE Spectrum 1, 110-119.
Huffman, D. A. (1971). Impossible objects as nonsense sentences. Mach. Intell. 6, 295.
Hunt, E. B., Marin, J. K., and Stone, P. J. (1966). "Experiments in Induction." Academic Press, New York.
Jackson, P. C., Jr. (1974). "Introduction to Artificial Intelligence." Petrocelli Books, Mason & Lipscomb, New York.
Kanal, L., and Chandrasekaran, B. (1972). On linguistic, statistical and mixed models for pattern recognition. In "Frontiers of Pattern Recognition" (S. S. Watanabe, ed.), p. 163. Academic Press, New York.
Kellogg, C. (1968). A natural language compiler for natural language data management. Proc. Fall Joint Comput. Conf., 1968, pp. 473-492.
Kugel, P. (1973). Logics of discovery. Ph.D. Dissertation, Harvard University, Cambridge, Massachusetts.
Kuno, S. (1965). The predictive analyzer and a path elimination technique. Comm. Ass. Comp. Mach. 8, 687-698.
Kuno, S., and Oettinger, A. (1962). Multiple path syntactic analyzer. In "Information Processing." North-Holland Publ., Amsterdam.
Lighthill, J. (1973). Artificial intelligence: A general survey. In "Artificial Intelligence: A Paper Symposium." Science Research Council, London.
Lindsay, R. K. (1963). Inferential memory as the basis of machines which understand natural language. In "Computers and Thought" (E. A. Feigenbaum and J. Feldman, eds.), p. 217. McGraw-Hill, New York.
London, R. L. (1970). Bibliography on proving the correctness of computer programs. Mach. Intell. 5, 569.
Loutrel, P. P. (1970). A solution to the hidden-line problem for computer-drawn polyhedra. IEEE Trans. Comput. 19, 205-213.
McCarthy, J., and Hayes, P. (1969). Some philosophical problems from the standpoint of artificial intelligence. Mach. Intell. 4, 463.
Mackworth, A. K. (1973). Interpreting pictures of polyhedral scenes. Artif. Intell. 4, 121-137.
Manna, Z., and Waldinger, R. (1971). Toward automatic program synthesis. Comm. Ass. Comp. Mach. 14, 151-166.
Martin, W. A., and Fateman, R. J. (1971). The Macsyma System. In "Symbolic and Algebraic Manipulation." Ass. Comput. Mach., New York.
Miller, G. A. (1956). The magical number seven, plus or minus two. Psych. Rev. 63, 81-97.
Miller, G. A. (1962). The study of intelligent behavior. Ann. Comput. Lab. Harvard Univ. 31.
Miller, G. A. (1974). Needed: A better theory of cognitive organization. IEEE Trans. Syst., Man & Cybernetics 4, 95-97.
Miller, W. F., and Shaw, A. C. (1968). Linguistic methods in picture processing: a survey. Proc. Fall Joint Comput. Conf., 1968, pp. 279-290.
Miller, G. A., Galanter, E., and Pribram, K. H. (1960). "Plans and the Structure of Behavior." Holt, New York.
Minker, J., Fishman, D. H., and McSkimmin, J. R. (1973). The Q* algorithm: a search strategy for a deductive question answering system. Proc. Int. Joint Conf. Artif. Intell., 3rd, 1973, p. 31.
Minsky, M. (1961). Steps toward artificial intelligence. Proc. IRE (reprinted in Feigenbaum and Feldman, 1963).
Minsky, M. (1966). Artificial intelligence. Sci. Amer. 215, No. 3, 246.
Minsky, M., ed. (1968). "Semantic Information Processing." MIT Press, Cambridge, Massachusetts.
Minsky, M., and Papert, S. (1969). "Perceptrons." MIT Press, Cambridge, Massachusetts.
Minsky, M., and Papert, S. (1972). "Artificial Intelligence Progress Report," Memo No. 252. MIT AI Lab., Cambridge, Massachusetts.
Montanari, U. (1970). Heuristically guided search and chromosome matching. Artif. Intell. 1, 227-245.
Narasimhan, R. (1965). Syntax-directed interpretation of classes of events. Comm. Ass. Comp. Mach. 8, 162-172.
Newell, A. (1970). Remarks on the relationship between artificial intelligence and cognitive psychology. In "Theoretical Approaches to Non-Numerical Problem Solving" (R. B. Banerji and M. D. Mesarovic, eds.), p. 363. Springer-Verlag, Berlin and New York.
Newell, A. (1973). Artificial intelligence and the concept of mind. In "Computer Models of Thought and Language" (R. C. Schank and K. M. Colby, eds.), p. 1. Freeman, San Francisco, California.
Newell, A., and Simon, H. (1972). "Human Problem Solving." Prentice-Hall, Englewood Cliffs, New Jersey.
Newell, A., Barnett, J., Forgie, J., Green, C., Klatt, D., Licklider, J. C. R., Munson, J., Reddy, R., and Woods, W. (1971). "Final Report of a Study Group on Speech Understanding Systems." Amer. Elsevier, New York.
Nilsson, N. J. (1969). A mobile automaton: An application of artificial intelligence techniques. Proc. Int. Joint Conf. Artif. Intell., 1st, 1969, p. 509.
Nilsson, N. (1971). "Problem Solving Methods in Artificial Intelligence." McGraw-Hill, New York.
Norman, D. A. (1969). "Memory and Attention: An Introduction to Human Information Processing." Wiley, New York.
Papert, S. (1971). "Teaching Children Thinking," Memo No. 247. MIT AI Lab., Cambridge, Massachusetts.
Petrick, S. (1965). A recognition procedure for transformational grammars. Ph.D. Dissertation, MIT, Cambridge, Massachusetts.
Pohl, I. (1973). The avoidance of relative catastrophe, heuristic competence, genuine dynamic weighting and computational issues in heuristic problem solving. Proc. Int. Joint Conf. Artif. Intell., 3rd, 1973, p. 12.
Popplestone, R. J. (1969). Freddy in Toyland. Mach. Intell. 4, 455.
Pylyshyn, Z. W., ed. (1970). "Perspectives on the Computer Revolution." Prentice-Hall, Englewood Cliffs, New Jersey.
Quillian, M. R. (1966). Semantic memory. In "Semantic Information Processing" (M. Minsky, ed.), p. 227. MIT Press, Cambridge, Massachusetts.
Quillian, M. R. (1969). The teachable language comprehender. Comm. Ass. Comp. Mach. 12, 459-475.
Raphael, B. (1965). SIR: A computer program for semantic information retrieval. In "Semantic Information Processing" (M. Minsky, ed.), p. 33. MIT Press, Cambridge, Massachusetts.
Raulefs, P. (1973). "Automatic Synthesis of Minimal Algorithms from Samples of
the Behavior," Rep. INF 1-73-2. Institute for Informatics, Universitat Karlsruhe, Germany.
Reitman, W. (1965). "Cognition and Thought." Wiley, New York.
Restle, F. (1962). The selection of strategies in cue learning. Psychol. Rev. 69, 328-343.
Rieger, C. (1973). Conceptual memory. Ph.D. Thesis, Stanford University, Stanford, California.
Riesbeck, C. (1973). Computer analysis of natural language in context. Ph.D. Thesis, Stanford University, Stanford, California.
Robinson, J. A. (1969). Mechanizing higher order logic. Mach. Intell. 4, 151.
Robinson, J. A. (1970). An overview of mechanical theorem proving. In "Theoretical Approaches to Non-Numerical Problem Solving" (R. B. Banerji and M. D. Mesarovic, eds.), p. 2. Springer-Verlag, Berlin and New York.
Rosenblatt, F. (1960). Perceptron experiments. Proc. IRE 48, 301-309.
Rosenfeld, A. (1969). "Picture Processing by Computer." Academic Press, New York.
Rulifson, J. F., Waldinger, R., and Derksen, J. A. (1971). A language for writing problem-solving programs. Proc. IFIP Congr., 1972, pp. 111-116.
Samuel, A. (1967). Some studies in machine intelligence using the game of checkers. II. IBM J. Res. Develop. 11, 6.
Sandewall, E. (1971a). Formal methods in the design of question answering systems. Artif. Intell. 2, 129-145.
Sandewall, E. J. (1971b). Representing natural language information in predicate calculus. Mach. Intell. 6, 255.
Schank, R. C. (1973). Identification of conceptualizations underlying natural language. In "Computer Models of Thought and Language" (R. C. Schank and K. M. Colby, eds.), p. 187. Freeman, San Francisco, California.
Schank, R. C., and Colby, K. M., eds. (1973). "Computer Models of Thought and Language." Freeman, San Francisco, California.
Schubert, L. K. (1973). "Iterated Limiting Recursion and the Program Minimization Problem," TR 73-2. Dept. of Computer Science, University of Alberta.
Schubert, L. K. (1974). Representative samples of programmable functions. Inform. Contr. 25, 30-44.
Shapiro, S. C. (1971). A net structure for semantic information storage, deduction and retrieval. Proc. Int. Joint Conf. Artif. Intell., 2nd, 1971, p. 512.
Simmons, R. F. (1966). Storage and retrieval of aspects of meaning in directed graph structures. Comm. Ass. Comp. Mach. 9, 211-214.
Simmons, R. F. (1970). Natural language question answering systems: 1969. In "Theoretical Approaches to Non-Numerical Problem Solving" (R. B. Banerji and M. D. Mesarovic, eds.), p. 108. Springer-Verlag, Berlin and New York.
Simmons, R. F. (1973). Semantic networks: Their computation and use for understanding English sentences. In "Computer Models of Thought and Language" (R. C. Schank and K. M. Colby, eds.), p. 63. Freeman, San Francisco, California.
Simmons, R. F., and Bruce, B. (1971). Some relations between predicate calculus and semantic net representations of discourse. Proc. Int. Joint Conf. Artif. Intell., 2nd, 1971, p. 524.
Simmons, R. F., Burger, J. F., and Schwarcz, R. M. (1968). A computational model of verbal understanding. Proc. AFIPS Fall Joint Comput. Conf., 1968, pp. 441-456.
Simon, H. (1963). Experiments with a heuristic compiler. J. Ass. Comp. Mach. 10, 482-506.
Simon, H. (1970). "Sciences of the Artificial." MIT Press, Cambridge, Massachusetts.
Simon, H., and Feigenbaum, E. A. (1964). An information processing theory of some effects of similarity, familiarization and meaningfulness in verbal learning. J. Verb. Learn. Verb. Behav. 2.
Simon, H., and Siklossy, L., eds. (1972). "Representation and Meaning: Experiments with Information Processing Systems." Prentice-Hall, Englewood Cliffs, New Jersey.
Slagle, J. R. (1965). Experiments with a deductive question answering program. Comm. Ass. Comp. Mach. 8, 792-798.
Slagle, J. R. (1967). Automatic theorem proving with renamable and semantic resolution. J. Ass. Comp. Mach. 14, 687-697.
Slagle, J. R. (1971). "Artificial Intelligence: The Heuristic Programming Approach." McGraw-Hill, New York.
Smith, F., and Miller, G. A., eds. (1966). "The Genesis of Language." MIT Press, Cambridge, Massachusetts.
Solomonoff, R. J. (1966). Some recent work in artificial intelligence. Proc. IEEE 56, 1687-1697.
Sridharan, N. S. (1973). Search strategies for the task of organic chemical synthesis. Proc. Int. Joint Conf. Artif. Intell., 3rd, 1973, p. 95.
Sussman, G. J., and McDermott, D. W. (1972). "Why Conniving is Better than Planning," Memo No. 203, Project MAC. AI Lab., MIT, Cambridge, Massachusetts.
Swinehart, D. (1973). A multiple process approach to interactive programming systems. Ph.D. Thesis, Stanford University, Stanford, California.
Thompson, F. B. (1968). English for the computer. Proc. Fall Joint Comput. Conf., 1968.
Thorne, J. (1969). A program for the syntactic analysis of English sentences. Comm. Ass. Comp. Mach. 12, 476-480.
Turing, A. (1947). Intelligent machinery. Reprinted in Mach. Intell. 5, 3 (1970).
Waldinger, R. J., and Lee, R. C. T. (1969). PROW: A step toward automatic program writing. Proc. Int. Joint Conf. Artif. Intell., 1st, 1969, p. 241.
Watanabe, S. (1971). Ungrammatical grammar in pattern recognition. Pattern Recognition 3, 385-408.
Waterman, D. A. (1970). Generalization learning techniques for automating the learning of heuristics. Artif. Intell. 1, 121-170.
Watt, W. C. (1966). "Morphology of the Nevada Cattlebrands and their Blazons," Part One, Rep. 9050. National Bureau of Standards, Washington, D.C.
Weizenbaum, J. (1966). ELIZA. Comm. Ass. Comp. Mach. 9, 36-45.
Weizenbaum, J. (1972). On the impact of the computer on society. Science 176, 609-614.
Weizenbaum, J. (1974). Automating psychotherapy. Comm. Ass. Comp. Mach. 17, No. 7, 425.
Wickelgren, W. A. (1974). "How to Solve Problems." Freeman, San Francisco, California.
Wilks, Y. (1973). An artificial intelligence approach to machine translation. In "Computer Models of Thought and Language" (R. C. Schank and K. M. Colby, eds.), p. 114. Freeman, San Francisco, California.
Williams, B. (1973). How smart are computers? N. Y. Rev. Books 20, No. 18, 36-40.
Winograd, T. (1972). "Understanding Natural Language." Academic Press, New York.
Winston, P. C. (1970). "Learning Structural Descriptions from Examples," Project MAC TR-76. MIT, Cambridge, Massachusetts.
Woods, W. A. (1968). Procedural semantics for a question-answering machine. Proc. Fall Joint Comput. Conf., 1968, pp. 457-471.
Woods, W. A. (1969). "Augmented Transition Networks for Natural Language Analysis." Aiken Computational Lab., Harvard University, Cambridge, Massachusetts.
Woods, W. A., and Makhoul, J. (1973). Mechanical inference problems in continuous speech understanding. Proc. Int. Joint Conf. Artif. Intell., 3rd, 1973, p. 200.
Woods, W. A., Kaplan, R. M., and Nash-Webber, B. (1972). "The Lunar Sciences Natural Language Information System," Final Report, BBN Rep. No. 2378. Bolt, Beranek & Newman, Cambridge, Massachusetts.
Zwicky, A., Friedman, J., Hall, B. C., and Walker, D. E. (1965). The MITRE syntactic analysis procedure for transformational grammars. Proc. Fall Joint Comput. Conf., 1965, pp. 317-326.
Author Index

Numbers in italics refer to the pages on which the complete references are listed.

A
Abelson, R. P., 224, 225
Adrian, M., 47, 48, 70
Amarel, S., 203, 225
Anderle, R. J., 106
Anderson, B. B., 111, 167
Anderson, D. B., 196, 226
Apter, M. J., 222, 226
Arder, H. F., 71

B
Bailey, R. W., 71
Balzer, R., 203, 225
Banerji, R. B., 173, 175, 225
Barnett, J., 225, 229
Barzdin, J., 203, 226
Bates, F., 6, 40
Baum, R., 203, 205, 226
Berliner, H., 208, 226
Bessinger, J. B., 71
Bickmore, D. P., 106
Biermann, A. W., 203, 205, 226
Bigelow, R. H., 118, 165, 166, 167, 168
Bobrow, D. G., 176, 178, 184, 226
Bomford, Brigadier G., 106
Borgelt, J., 53, 70
Botvinnik, M. M., 208, 226
Bouknight, W. J., 219, 226
Boulton, P. I. P., 37, 40
Bourne, L. E., Jr., 221, 227
Boyle, A. R., 106
Bratley, P., 63, 64, 70
Bredt, T. H., 183, 227
Breward, R. W., 106
Bross, I. D. J., 111, 167
Brown, D., 106
Bruce, B., 190, 230
Buchanan, B., 217, 226
Burger, J. F., 177, 188, 230
Burkard, R. K., 106
Burstall, R. M., 201, 202, 226

C
Canaday, R. H., 223, 226
Celce-Murcia, M., 189, 226
Chandrasekaran, B., 171, 218, 220, 226, 227, 228
Chang, C.-L., 175, 196, 226
Chomsky, C., 176, 227
Chomsky, N., 118, 167, 183, 226
Clingen, C. T., 6, 41
Clowes, M. B., 218, 219, 220, 226
Coddington, L., 6, 10, 40
Colby, K. M., 175, 179, 185, 224, 226, 230
Coles, L. S., 177, 226
Collins, J. S., 201, 202, 226
Colvocoresses, A. P., 106
Connelly, D. S., 106
Corbato, F. J., 6, 15, 41
Crocker, S., 208, 227

D
Daggett, M. M., 15, 41
Daley, R. C., 15, 41
Dearing, V., 71
Derksen, J. A., 201, 202, 230
Deutsch, E. S., 106
Dijkstra, E. W., 29, 41
Dilligan, R. J., 61, 70
Dolezel, L., 71
Doran, R. W., 183, 227
Dostert, B. H., 114, 115, 118, 150, 166, 167, 168
Douglas, M. L., 6, 40
Doyle, F. J., 106, 107
Dreyfus, H., 170, 226
Duda, R., 175, 209, 217, 218, 219, 226
E
Eastlake, D., III, 208, 227
Elliott, R. W., 177, 226
Ernst, G. W., 181, 198, 226
Evans, I. S., 107
Evans, T. G., 222, 226

F
Fajman, R., 53, 70
Falk, G., 218, 219, 227
Fateman, R. J., 225, 228
Feigenbaum, E. A., 217, 221, 222, 226, 231
Feldman, G. M., 218, 227
Feldman, J., 221, 222, 227
Feldman, J. A., 202, 203, 218, 226, 227
Fikes, R. F., 196, 197, 198, 201, 212, 227
Fillmore, C. J., 146, 150, 167
Findler, N. V., 175, 227
Fischer, I., 107
Fishman, D. H., 195, 196, 228
Forgie, J., 225, 229
Fraser, J. B., 184, 226
Friedman, J., 183, 227

G
Galanter, E., 221, 228
Gambino, L. A., 107
Gasgins, R., 49, 70
Gillenson, M., 220, 227
Gillogly, J. J., 208, 227
Goldman, N., 190, 227
Grape, G., 218, 227
Green, B. F., 176, 201, 227
Green, C., 167, 177, 203, 225, 227, 229
Green, D. C., 59, 70
Greenblatt, R., 208, 227
Greenfeld, N. R., 118, 127, 165, 166, 167, 168
Guzman, A., 210, 219, 227

H
Hall, B. C., 232
Hall, P. A., 183, 216, 227
Halliday, M. A. K., 183, 227
Hamilton, W. C., 107
Harris, L. R., 215, 227
Hart, P. E., 175, 196, 197, 198, 209, 212, 217, 218, 219, 226, 227
Hayes, P. J., 195, 196, 201, 226, 228
Haygood, R. C., 221, 227
Helava, U. V., 107
Hewitt, C., 227
Horman, A., 209, 228
Huffman, D. A., 219, 228
Hunt, E. B., 222, 228

I
Ingram, W., 56, 57, 70

J
Jackson, P. C., Jr., 175, 208, 228
Jancaitis, J. R., 107
Johnson, P. H., 107
Junkins, J. L., 107

K
Kahn, E., 64, 65, 70
Kanal, L., 218, 228
Kaplan, R. M., 139, 161, 168, 184, 232
Kaula, W. M., 107
Kay, M., 71, 117, 158, 167
Kelk, B., 106
Keller, M., 107
Kellogg, C., 177, 228
Kilgannon, P., 50, 51, 70
Klatt, D., 225, 229
Krishnaswamy, R., 203, 205, 226
Kugel, P., 173, 228
Kuno, S., 183, 190, 228

L
Laughery, K., 176, 227
Lee, R. C.-T., 175, 196, 203, 226, 231
Leed, J., 71
Licklider, J. C. R., 225, 229
Lighthill, J., 171, 228
Lindsay, R. K., 176, 178, 228
Lindsey, C. H., 29, 41
London, R. L., 202, 228
Loutrel, P. P., 219, 228
Lyddan, R. H., 107
M
McCarthy, J., 195, 201, 228
McDermott, D. W., 201, 231
McEwen, R. B., 107
Mackworth, A. K., 219, 228
McSkimmin, J. R., 195, 196, 228
Makhoul, J., 225, 232
Mancini, A., 107
Manna, Z., 202, 203, 228
Marin, J. K., 222, 228
Martin, W. A., 225, 228
Meltzer, B., 175, 227
Merrill, R. D., 107
Mesarovic, M. D., 175, 226
Milic, L. T., 46, 50, 51, 70
Miller, G. A., 218, 221, 228, 231
Miller, W. F., 218, 228
Minker, J., 195, 196, 228
Minsky, M., 173, 175, 195, 209, 213, 222, 223, 229
Misek, L., 58, 70
Mitchell, J. L., 71
Montanari, U., 216, 229
Mueller, I. I., 107
Munson, J., 225, 229

N
Narasimhan, R., 217, 229
Nash-Webber, B., 139, 161, 168, 184, 232
Newell, A., 181, 197, 220, 222, 226, 229
Nicolaides, P. L., 118, 165, 167
Nilsson, N. J., 175, 196, 197, 198, 212, 214, 215, 218, 227, 229
Norman, A. B., 37, 41
Norman, D. A., 229

O
Odden, J., 118, 165, 168
Oettinger, A., 190, 228

P
Painter, J. A., 55, 70
Papert, S., 173, 209, 225, 229
Parrish, S. M., 55, 70, 71
Pearlman, J., 218, 227
Peavler, J. M., 68, 69, 70
Petrick, S., 183, 229
Petrie, G., 107
Petry, F. E., 203, 205, 226
Peucker, T., 107
Pohl, I., 215, 229
Pollack, B. W., 183, 227
Popplestone, R. J., 201, 202, 218, 226, 229
Pribram, K. H., 221, 228
Pylyshyn, Z. W., 175, 229

Q
Quillian, M. R., 136, 141, 167, 186, 188, 229

R
Raben, J., 62, 63, 70, 71
Raphael, B., 167, 177, 227, 229
Rasche, R. H., 59, 70
Raulefs, P., 203, 229, 230
Reddy, R., 225, 229
Reeker, L. H., 171, 226
Reid, P. A., 37, 40
Reitman, W., 221, 230
Restle, F., 221, 230
Richardt, J., 71
Rieger, C., 190, 230
Riesbeck, C., 190, 230
Robillard, P., 63, 64, 70
Robinson, J. A., 141, 167, 195, 230
Rosenblatt, F., 209, 230
Rosenfield, A., 107, 218, 230
Ross, D., Jr., 59, 70
Rulifson, J. F., 201, 202, 230

S
Sainte-Marie, P., 63, 64, 70
Saltzer, J. H., 6, 41
Sammet, J. E., 165, 167
Samuel, A., 206, 230
Sandewall, E. J., 190, 230
Sanford, V., 107
Schank, R. C., 136, 168, 175, 176, 179, 185, 190, 222, 230
Schmid, E., 107
Schmid, H. H., 108
Schubert, L. K., 203, 230
Schwarcz, R. M., 177, 188, 230
Sedelow, S. Y., 71
Shapiro, P. A., 111, 167
Shapiro, S. C., 188, 230
Shaw, A. C., 218, 228
Shinagel, M., 56, 70
Siklossy, L., 175, 231
Simmons, R. F., 176, 177, 188, 189, 190, 230
Simon, H., 175, 202, 221, 222, 229, 230, 231
Slagle, J. R., 173, 175, 177, 196, 206, 207, 209, 215, 231
Smith, B. H., 48, 72
Smith, F., 221, 231
Smith, P. H., Jr., 55, 62, 71
Sobel, I., 218, 227
Solomonoff, R. J., 173, 231
Spevack, M., 56, 72
Sridharan, N. S., 216, 231
Stone, P. J., 222, 228
Struck, D. J., 108
Sussman, G. J., 201, 231
Sutherland, G., 217, 226
Sutherland, I. E., 136, 168
Swaim, K., 56, 57, 70
Swinehart, D., 202, 231
Szolovits, P., 118, 120, 165, 166, 167, 168

T
Tenenbaum, J. M., 218, 227
Tewinkel, G. C., 107
Thompson, F. B., 116, 118, 134, 150, 165, 166, 167, 168, 177, 231
Thorne, J., 184, 231
Turing, A., 213, 231
Tyler, D. A., 107

V
van der Meulen, S. G., 29, 41

W
Waldinger, R. J., 201, 202, 203, 228, 230, 231
Walker, D. E., 183, 232
Watanabe, S., 218, 231
Waterman, D. A., 212, 231
Watt, W. C., 218, 231
Wegner, P., 118, 160, 168
Weizenbaum, J., 171, 176, 178, 224, 231
Whipple, J. M., 108
Whitmore, G. D., 108
Wickelgren, W. A., 225, 231
Widmann, R. L., 68, 71
Wilks, Y., 194, 231
Williams, B., 170, 231
Winograd, T., 136, 139, 157, 168, 176, 179, 199, 222, 231
Winston, P. C., 209, 222, 231
Wisby, R. A., 72
Wolf, A. K., 176, 227
Woods, W. A., 139, 158, 161, 168, 177, 178, 183, 184, 225, 229, 232

Z
Zadeh, L. A., 59, 71
Zwicky, A., 183, 232
Subject Index

A
ACM Computing Reviews, 175
Aerial photography, mapping and, 104-105
AI, see Artificial intelligence
ALGOL
  in literary analysis, 45
  machine instructions in, 4-5
  poetry generation with, 50
ANALOGY program, 222-223
AND-OR graph, 214
AND-OR graph search concept, 216-217
Animated Film Language, 118
Artificial intelligence, 169-225
  ANALOGY program and, 222-223
  automatic programming and, 202-205
  belief systems and, 224
  cognitive psychology and, 220-224
  game-playing programs in, 205-208
  heuristic programming in, 173
  heuristic search and, 213-217
  language processing in, 176-194
  learning programs in, 208-213
  macroactions in, 198
  mathematical models in, 173
  pattern recognition and scene analysis in, 217-220
  PLANNER program and, 199-202
  planning in, 195
  Platonic assumption in, 170
  problem solution in, 213
  procedural model in, 177-186
  program synthesis system in, 203-205
  publications on, 175
  question-answering systems in, 176-177
  representation, inference, and planning in, 195-202
  representation of knowledge in, 181-183
  review of, 173-175
  Schank's conceptualization networks in, 190-194
  and search in graphs, 213-216
  semantic networks and, 186-194
  syntactic processing in, 183-185
  upswing of, 224-225
Artificial Intelligence, 175
Association for Computing Machinery, 175
Association for Literary and Linguistic Computing, 69
Asynchronous (adj.), defined, 40
Asynchronous attention, 13-15
Asynchronous interrupt, defined, 3
Asynchronous program interrupts, programmed control in, 1-40; see also Program interrupts
Attention
  active, 18
  asynchronous, 13-15
  complete, 27
  cooperating tasks and, 28-29
  dequeue-statement and, 20
  dequeuing of, 3
  disabled, 18
  disable-statement and, 19
  external device and, 15-16
  high priority, 4
  inactive, 18
  incomplete, 27
  interrupt vs. enqueue in, 22
  light pen and, 26-27, 31-35
  machine language and, 5-10
  multiprocessing and, 15-16, 27-30
  on-units and, 17-18, 25-26
  priorities in, 21-23
  in programming, 3
  queued access and, 23-25
  stacked or queued, 3
  synchronism and, 4-5
  two-level, 15
Attention data, values of, 18-19
ATTENTION data attribute, 16
Attention handling
  extended, 16-31
  in FORTRAN, 30-31
  on-unit approach to, 16-25
Attention handling language, syntax of, 37-38
Attention on-units, 17-18, 25-26
Attention processing statements, 19-21
Automatic programming, 202-205
  program synthesis system in, 203-205
  synthesis algorithm in, 205
B
"Bandwidth" condition, in heuristic search, 215
Barometric leveling, in mapping, 86
BASEBALL program, 176
Belief system, artificial intelligence and, 224
Bibliography, in poetry analysis, 67-69
BMD package program, 162

C
Cartography, 98-102; see also Mapping
Cathode ray tube, in mapping, 103
Chaldean maps, 74
Checkers program, learning in, 206
Chemical synthesis, heuristic research in, 216-217
COBOL language, 46, 113-114, 143, 165
  in asynchronous program interrupts, 1-2
  detectable conditions in, 39
  SIZE ERROR clause in, 10-11
  USE procedure in, 11
Cognitive psychology, artificial intelligence and, 220-224
Collinearity model, in mapping, 91-93
Communications, 175
Computer
  card and game playing by, 206-208
  English for, 128-135, 143-158
  literary analysis with, 44, 62-63
  literary influence and, 62-63
  mapping and, 73-106
  natural language for, 110-111, 166-167
  poetry analysis with, 44, 52-53, 58-61, 63-69
  representation of knowledge in, 181-183
  as robot, 171
  star catalogs and, 81
  stylistic analysis with, 58-61
Computer-aided mapping (CAM), 98; see also Mapping
Computer Poems, 47
Computers and the Humanities, 46-40
Conceptualization networks, Schank's, 190-194
Concordance
  defined, 52
  of Hopkins' poetry, 61
  production of, 53-58
  to Shakespeare's plays, 56-58
  stylistic analysis and, 58-61
Concordance to Milton's English Poetry, A, 56
Conformal projections, 94
CONNIVER program, 195-196, 201
Control networks, in mapping, 85
CONVERSE program, 177
CONVERSION condition, in PL/I language, 6
Cooperating tasks, attention and, 28-29
COORDS function, 31-32
Coplanarity model, in mapping, 92
Cornell Concordances, 55
Coroutines, in programming, 3
CPS dialect, of PL/I, 13-14
CTSS time sharing system, 15
Cybernetic Serendipity (poem), 47
D
Data analysis systems, natural language and, 162-164
Data banks, in mapping, 102-103
Data processing, literary, 43-44; see also Computer
Data structures
  complexity of in REL English, 133-135
  for English, 123-126
  REL System and, 122-127
DEACON program, 177
DEDUCOM program, 177
Deduction
  in natural language processing systems, 135-143
  in REL system, 141-142
Deductive system, in artificial intelligence, 177
Defense Mapping Agency, 96
Definitions
  extension of natural language through, 113-114
  REL system and, 139
DENDRAL program, 217
Dequeue-statement, 20, 38
Desk-top calculators, programmable, 106
Digital computer, numeric and string data in, 52; see also Computer
Digital mapping, reasons for, 101-102
Digitizer, in mapping, 99-100
Disabled attention, 3
Disable-statement, 19, 38
E
Earth
  geodetic datum for, 79-80
  geodetic surveys and, 80-87
  as geoid, 78
Earth ellipsoid, mapping of, 77-78; see also Mapping
EDVAC, mapping and, 75
Egyptian maps, 74
ELIZA program, 176, 178
Ellipsoid, earth as, 77-78
Enable-statement, 19, 38
  task and, 29
ENDFILE condition, in PL/I language, 6-7
English language
  see also REL English
  development of for computer, 156-158
  as natural language, 111
  for computer, 143-158
  syntactic processor and, 183
English Pronouncing Dictionary (Jones)
ENIAC, mapping and, 75
Enqueued (adj.), defined, 40
Equal-area projections, 94
Eve of St. Agnes, The (Keats), 68
Extension, in natural language, 135-137
EYEBALL routine, in poetry analysis, 59
F Faerie Queene, The (Spenser), 65, 67
First Folio of Shakespeare, concordance to, 56
Floating point overflow, in machine language, 10-12
FORTRAN, 113-114, 160-162, 165-166
  asynchronous I/O and, 12
  in asynchronous program interrupts, 2
  attention handling in, 30-31
  detectable conditions in, 39
  error detection in, 11-12
  in literary analysis, 45, 69
  wait-statement in, 12
FORTRAN IV, 12
Fuzzy sets, in stylistic analysis, 58-59

G
Games, computer playing of, 206-208 Geodetic surveys astronomic stations in, 80-81 early computations in, 86-87 horizontal positions in, 80-84 instruments in, 83-84 leveling in, 85-86 mapping and, 80-87 triangulation in, 81-82 Geoid, earth as, 78-79 Grammar, in REL English, 146-148
H Haiku poetry, computer and, 48-49 HAL, fictional computer, 171 Hart-Nilsson-Raphael theorem, 216 Harvard Syntactic Analyzer, 183 Heuristic search, in artificial intelligence, 213-217 Homographs, in concordance, 53-56
I IBM Scientific Subroutine Package, 162 Inductive inference, in REL system, 142-143
Institute of Electrical Engineers, 175
Intelligence, artificial, see Artificial intelligence
Intension, in natural language, 135-137
Intensional meaning, 137-139
Intensional processing, 140-141
Interpretive routines, in REL English, 131-133
Interrupt
  asynchronous or immediate, 3
  vs. enqueue, 22-24
  in programming, 3
  synchronous or delayed, 3
Interrupt levels, hardware/software and, 5
I/O conditions, attentions and, 4
I/O operation, asynchronous, 8-9
K Knowledge learning and, 208-209 representation of in artificial intelligence, 181-183
L
Lambert conformal conic projection, 97
Language
  algorithmic definition of, 114-115
  machine, see Machine languages
  natural, see Natural language
Learning, structural, 208-213
Learning programs, artificial intelligence and, 208-213
Leveling, in geodetic survey, 85-86
Light pen, attention and, 26-27, 31-35
Linguistic knowledge, in REL English, 150-151
LISP program, 178, 186
Literary analysis
  computer and, 44, 62-63
  mathematical modeling in, 64-67
  poetry analysis and, 44, 52-53, 58-61, 63-69
  textual bibliography in, 67-69
  WYLBUR system in, 54-55
Literary data processing
  concordance in, 52
  progress in, 53-55
Literary influence, computer analysis of, 62-63
M Machine error conditions, attentions and, 4 Machine instructions, in high level language, 4-5 Machine Intelligence, 175 Machine languages in asynchronous program interrupts, 1-2 facilities in, 5-10 floating point overflow in, 10-12 in poetry generation and analysis, 45 Management information systems, natural language and, 164-165 Map defined, 76-77 name placement on, 103 topographic, 76-77 Mapping see also Cartography BERLESC and, 76 cartography and, 98-102 cathode ray tube in, 103 collimating model in, 93 computers and, 73-106 coplanarity model in, 92 data banks in, 102-103 desk-top calculators in, 106 earth ellipsoid and, 77-78 EDVAC and, 75 ellipsoid and geoid in, 77-80 ENIAC and, 75 future trends in, 103-105 geodetic surveys and, 8-7 history of, 74-76 instruments in, 83-84 manual digitization of, 99-100 Mercator projection in, 94-97 orthoplotter in, 100-101 projections and, 92-98 photogrammetry in, 89-92 photography and, 75 satellite geodesy and, 87-88 scanner-digitizers in, 99-100 scribing in, 103 stereoplotters in, 98-99
SUBJECT INDEX transverse Mercator projection in, 94-96 traverse in, 82-83 triangulation in, 81-82 trilateration in, 82 UNIVAC and, 75-76 vertical networks in, 85-86 Mars probe, 104 MATHLAB program, 225 MENS semantic net, 188 Mercator projection, 94-96 Mesa phenomenon, 213 Metalanguage, command language and, 120-121 MICROPLANNER, 178 Midsummer Night’s Dream, A (Shakespeare), 68 MIT A1 Laboratory, 175 MULTIPLE program, 209 Multiprocessing attention control and, 25-27 attention handling through, 27-30 Multiprogramming, tasks in, 2
N National Computer Conferences, 175 Natural English for computer, 143-158 REL system and, 109-169 Natural language see also REL system as communicating medium for computers, 166-167 computers and, 110-111 computer understanding of, f77-186 constituents of, 111-115 context in, 111-112 data analysis systems and, 162-164 deduction and related issues in, 135-143 defined, 110-111 English as, 111 extension and intension in, 135-137 extension of through definition, 113-115 fluency and learning in, 160-162 idiosyncratic vocabulary and semantic processes of, 113 intensional processing in, 140-142
24 1
management information systems and, 164-165 parser and, 183-184 procedural model for understanding of, 177-186 processing with, 158-167 resolution principle in, 141 semantics for, 174 specialty languages and, 165-166 system of syntactic, semantic, and inferential components in, 185-186 Noninterruptable operations, in PL/I language, 5 0
Old English poetry, analysis of, 59-60 On-statement, 38 On-unit attentions associated with, 25 defined, 40 PL/I language and, 7 Orthoproto, in mapping, 100-101 OSIRIS package program, 162 OVERFLOW condition, in PL/I language, 6
P Paradise Lost (Milton), 44, 62 Parser, for natural language, 183-184 Pattern recognition artificial intelligence and, 217-220 heuristic techniques and, 218-220 Photogrammetry analytical, 90 defined, 89 mapping and, 89-92 PLANNER language or program, 157, 174, 179, 181, 186, 195, 199-202 PL/I language asynchronous attention in, 13-15 asynchronous 1/0 operation and, 8-9 attention handling in, 6-8 block scope in, 9 CPS dialect of, 13-14 detectable conditions in, 38-39 enabling and disabling in, 9-10 literary analysis and, 45 machine instructions in, 4-5
242
SUBJECT INDEX
noninterruptable operations in, 5 on-statement in, 6 on-unit and, 7 tasks, attentions, and interrupts in, 5-10
Poetry generic or characteristic stanza in, 66-67
as prose, in computer analysis, 66 as string-type data, 52 by students, 69-70 stylistic analysis of, 58-61 as subset of language, 47-48 words and ideas in, 50-51 Poetry analysis, 52-53 EYEBALL routine in, 59 literary influence and, 63-64 mathematical modeling in, 64-87 vs. poetry generation, 44 statistical analysis and, 63-64 stylistic analysis and, 58-61 textual bibliography in, 67-69 Poetry generation, 43-51 computer languages and, 4546 results in, 47-51 Practical natural language processing,
Programming automatic, 202-205 task in, 2 Program synthesis system, in automatic programming, 203-205 Projections Lambert conformal conic, 97-98 Mercator, 94-97 Prometheus Unbound (Shelley), 62 Prosody, computer analysis of, 61-62 Protosynthex I, 176 Protosynthex 11, 177 Protosynthex 111, 177, 188 PUT-TAKE integer pointers, 32-35 Pythagorean theorem, 74
Q Quantifiers computational aspects of, 153-156 in REL English, 151-152 Question-answering systems, in artificial intelligence, 176-177 Queued access attention and, 23-25 priority level and, 24-25
109-167
see aLo Natural language; REL English; REL system idiosyncratic nature of, 112-113 Primitive words, in REL English, 128-129
Priority, defined, 40 Priority level of attention, 21-25 see also Attention Problem solving, artificial intelligence in, 213-214
Procedural models, in artificial intelligence, 177-186 Program Conditions, attentions and, 4 Program interrupts see also Interrupt (n.) attention handling in, 1 6 3 1 COBOL and, 10-11 defined, 3 FORTRAN and, 11-12 PL/I and, 6-10 wait-state in, 2 PROGRAMMAR, 178, 185
R Rapidly Extensible Language (REL) system, see REL system REL Animated Film Language, 125, 165 Relational data systems, data management for, 126-127 REL English, 118 see also REL (Rapidly Extensible Language) system binary features in, 144 case grammar in, 146-148 complexity of data structures in, 133-135
for computer, 143-158 data analysis systems and, 162-164 data structures for, 123-135 development of for computer, 156-158 feature8 of, 144-146 fluency and learning in, 161-162 intensional meaning in, 137-139 interpretive routines in, 131-133 limitations in, 125
  management information systems and, 164-165
metalanguage and, 120 primitive intervening relationships in, 140
Satellite, multispectral image from, 105 Satellite geodesy, 87-88 Satellite photography, in mapping, 104 Scene analysis, in artificial intelligence, 217-220
  primitive words in, 128-131
  quantifiers in, 151-156
  for relational data systems, 126-127
  semantic nets in, 128-131
  semantics of, 128-135
  syntax in, 121, 131
  time span in, 124
  user vs. linguistic knowledge in, 150-151
  verb semantics in, 148-150
REL French, 118
REL Language Writer's Language, 165
RELMATH, 165
REL Operating System, 122
RELSIM (REL Simulation Language), 118, 165
REL (Rapidly Extensible Language) System, 109-167 architecture of, 116 base languages and, 118 command language and metalanguage in, 120-121 data storage in, 127 deduction and induction in, 141-143 defined, 115-116 language processor in, 116-118 metalanguage in, 120 paging and definition handling in, 121-122
parsing diagram for, 117 prototype, 115-122 resolution principle in, 141 semantic and data structures
in,
122-127
service utility routines in, 121-122 specialty languages in, 165-166 syntax-directed interpretation in,
Semantic memory, 186-187 Semantic net(work), 186-194 defined, 186 models in, 186-188 in REL English, 128-131 research in, 141 Simmons’, 188-190 Semantics for computer language, 174 data structures and, 122-127 of REL English, 128-135, 148-150 SIGART News, 175 Sir Gawain and the Green Knight, 67 SIR program, 183 SIZE ERROR clause, in COBOL, 1&11 SNOBOL haiku generation in, 49 in literary analysis, 4546, 65 Specialty languages, natural language and, 165-166 SPSS package program, 162 Stanford A1 Laboratory, 175 Stanza, generic or characteristic, 6 6 6 7 Star catalogs, computer and, 81 Stereoplotters, in mapping, 98-99 STRIPS program, 196-199 Structural learning, 209-213 STUDENT program, 176-178, 183 Style, EYEBALL routine and, 59 Survey networks, mapping and, 85-87 Synchronous interrupt, 3 Syntactic processing, in artificial intelligence, 183-185 Syntax, in REL English, 131 Syntax-directed interpretation, in REL system, 116-117
116117
user language packages and, 118-119 Robots, 171-172 see also Artificial intelligence automatic strategy generation for, 202
T Task enabling, access, and priorities in, 29-30
S SAD SAM program, 176-178
on-unit of, 15, 25-26 in programming, 2 subroutine as, 29
Teachable Language Comprehender, 188
Topographic map, 76-77
Transactions on Computers, 175
Transactions on Systems, Man and Cybernetics, 175
Transverse Mercator projection, 95-97
Traverse, in mapping, 82-83
Triangulation, in geodetic surveys, 81-82
Trigonometric leveling, 85
Trilateration, in mapping, 82
Troilus and Criseyde (Chaucer), 67
U Underflow/overflow, attentions and, 4 UNIVAC, mapping and, 75-76 Universal Transverse Mercator (UTM) grid, 96
User language packages, in REL system, 118-1 19
v
Venus probe, mapping and, 104
Verb semantics, in REL English, 148-150

W
WATCHUM group, 69
Works of Shakespeare, The (Evans), 56
WYLBUR system, in poetry analysis or concordance, 53-54

Z
ZERODIVIDE condition, in PL/I language, 6
Contents of Previous Volumes

Volume 1
General-Purpose Programming for Business Applications
CALVIN C. GOTLIEB
Numerical Weather Prediction
NORMAN A. PHILLIPS
The Present Status of Automatic Translation of Languages
YEHOSHUA BAR-HILLEL
Programming Computers to Play Games
ARTHUR L. SAMUEL
Machine Recognition of Spoken Words
RICHARD FATEHCHAND
Binary Arithmetic
GEORGE W. REITWIESNER

Volume 2
A Survey of Numerical Methods for Parabolic Differential Equations
JIM DOUGLAS, JR.
Advances in Orthonormalizing Computation
PHILIP J. DAVIS AND PHILIP RABINOWITZ
Microelectronics Using Electron-Beam-Activated Machining Techniques
KENNETH R. SHOULDERS
Recent Developments in Linear Programming
SAUL I. GASS
The Theory of Automata, a Survey
ROBERT MCNAUGHTON

Volume 3
The Computation of Satellite Orbit Trajectories
SAMUEL D. CONTE
Multiprogramming
E. F. CODD
Recent Developments of Nonlinear Programming
PHILIP WOLFE
Alternating Direction Implicit Methods
GARRETT BIRKHOFF, RICHARD S. VARGA, AND DAVID YOUNG
Combined Analog-Digital Techniques in Simulation
HAROLD F. SKRAMSTAD
Information Technology and the Law
REED C. LAWLOR

Volume 4
The Formulation of Data Processing Problems for Computers
WILLIAM C. MCGEE
All-Magnetic Circuit Techniques
DAVID R. BENNION AND HEWITT D. CRANE
Computer Education
HOWARD E. TOMPKINS
Digital Fluid Logic Elements
H. H. GLAETTLI
Multiple Computer Systems
WILLIAM A. CURTIN

Volume 5
The Role of Computers in Election Night Broadcasting
JACK MOSHMAN
Some Results of Research on Automatic Programming in Eastern Europe
WLADYSLAW TURSKI
A Discussion of Artificial Intelligence and Self-Organization
GORDON PASK
Automatic Optical Design
ORESTES N. STAVROUDIS
Computing Problems and Methods in X-Ray Crystallography
CHARLES L. COULTER
Digital Computers in Nuclear Reactor Design
ELIZABETH CUTHILL
An Introduction to Procedure-Oriented Languages
HARRY D. HUSKEY

Volume 6
Information Retrieval
CLAUDE E. WALSTON
Speculations Concerning the First Ultraintelligent Machine
IRVING JOHN GOOD
Digital Training Devices
CHARLES R. WICKMAN
Number Systems and Arithmetic
HARVEY L. GARDER
Considerations on Man versus Machine for Space Probing
P. L. BARGELLINI
Data Collection and Reduction for Nuclear Particle Trace Detectors
HERBERT GELERNTER

Volume 7
Highly Parallel Information Processing Systems
JOHN C. MURTHA
Programming Language Processors
RUTH M. DAVIS
The Man-Machine Combination for Computer-Assisted Copy Editing
WAYNE A. DANIELSON
Computer-Aided Typesetting
WILLIAM R. BOZMAN
Programming Languages for Computational Linguistics
ARNOLD C. SATTERTHWAIT
Computer Driven Displays and Their Use in Man/Machine Interaction
ANDRIES VAN DAM
Volume 8
Time-shared Computer Systems
THOMAS N. PYKE, JR.
Formula Manipulation by Computer
JEAN E. SAMMET
Standards for Computers and Information Processing
T. B. STEEL, JR.
Syntactic Analysis of Natural Language
NAOMI SAGER
Programming Languages and Computers: A Unified Metatheory
R. NARASIMHAN
Incremental Computation
LIONELLO A. LOMBARDI

Volume 9
What Next in Computer Technology?
W. J. POPPELBAUM
Advances in Simulation
JOHN MCLEOD
Symbol Manipulation Languages
PAUL W. ABRAHAMS
Legal Information Retrieval
AVIEZRI S. FRAENKEL
Large Scale Integration-an Appraisal
L. M. SPANDORFER
Aerospace Computers
A. S. BUCHMAN
The Distributed Processor Organization
L. J. KOCZELA

Volume 10
Humanism, Technology, and Language
CHARLES DECARLO
Three Computer Cultures: Computer Technology, Computer Mathematics, and Computer Science
PETER WEGNER
Mathematics in 1984-The Impact of Computers
BRYAN THWAITES
Computing from the Communication Point of View
E. E. DAVID, JR.
Computer-Man Communication: Using Computer Graphics in the Instructional Process
FREDERICK P. BROOKS, JR.
Computers and Publishing: Writing, Editing, and Printing
ANDRIES VAN DAM AND DAVID E. RICE
A Unified Approach to Pattern Analysis
ULF GRENANDER
Use of Computers in Biomedical Pattern Recognition
ROBERT S. LEDLEY
Numerical Methods of Stress Analysis
WILLIAM PRAGER
Spline Approximation and Computer-Aided Design
J. H. AHLBERG
Logic per Track Devices
D. L. SLOTNICK

Volume 11
Automatic Translation of Languages Since 1960: A Linguist's View
HARRY H. JOSSELSON
Classification, Relevance, and Information Retrieval
D. M. JACKSON
Approaches to the Machine Recognition of Conversational Speech
KLAUS W. OTTEN
Man-Machine Interaction Using Speech
DAVID R. HILL
Balanced Magnetic Circuits for Logic and Memory Devices
R. B. KIEBURTZ AND E. E. NEWHALL
Command and Control: Technology and Social Impact
ANTHONY DEBONS

Volume 12
Information Security in a Multi-User Computer Environment
JAMES P. ANDERSON
Managers, Deterministic Models, and Computers
G. M. FERRERO DIROCCAFERRERA
Uses of the Computer in Music Composition and Research
HARRY B. LINCOLN
File Organization Techniques
DAVID C. ROBERTS
Systems Programming Languages
R. D. BERGERON, J. D. GANNON, D. P. SHECTER, F. W. TOMPA, AND A. VAN DAM
Parametric and Nonparametric Recognition by Computer: An Application to Leukocyte Image Processing
JUDITH M. S. PREWITT