Advances
in COMPUTERS VOLUME 34
Contributors to This Volume

J. K. AGGARWAL
TED J. BIGGERSTAFF
LAWRENCE CHISVIN
R. JAMES DUCKWORTH
RALPH DUNCAN
WILLIAM I. GROSKY
RUDY HIRSCHHEIM
HEINZ K. KLEIN
RAJIV MEHROTRA
N. NANDHAKUMAR
Advances in COMPUTERS
VOLUME 34
EDITED BY
MARSHALL C. YOVITS
Purdue School of Science, Indiana University-Purdue University at Indianapolis, Indianapolis, Indiana
ACADEMIC PRESS, INC.
Harcourt Brace Jovanovich, Publishers
Boston  San Diego  New York  London  Sydney  Tokyo  Toronto
This book is printed on acid-free paper.

Copyright © 1992 by Academic Press, Inc. All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher.

ACADEMIC PRESS, INC., 1250 Sixth Avenue, San Diego, CA 92101-4311
United Kingdom Edition published by ACADEMIC PRESS LIMITED, 24-28 Oval Road, London NW1 7DX

Library of Congress Catalog Card Number: 59-15761
ISBN 0-12-012134-4
Printed in the United States of America
92 93 94 95   9 8 7 6 5 4 3 2 1
Contents
Contributors . . . vii
Preface . . . viii
An Assessment and Analysis of Software Reuse
Ted J. Biggerstaff
1. Introduction . . . 1
2. Software Reusability Successes . . . 10
3. Examples of Reuse Implementation Technologies . . . 30
4. Effects of Key Factors . . . 38
5. Futures and Conclusions . . . 53
6. References . . . 54
Multisensory Computer Vision
N. Nandhakumar and J. K. Aggarwal
1. Introduction . . . 59
2. Approaches to Sensor Fusion . . . 63
3. Computational Paradigms for Multisensory Vision . . . 86
4. Fusion at Multiple Levels . . . 99
5. Conclusions . . . 105
6. References . . . 107
Parallel Computer Architectures
Ralph Duncan
1. Introduction . . . 113
2. Terminology and Taxonomy . . . 115
3. Synchronous Architectures . . . 118
4. MIMD Architectures . . . 129
5. MIMD Execution Paradigm Architectures . . . 139
6. Conclusions . . . 149
Acknowledgments . . . 152
References . . . 152
Content-Addressable and Associative Memory
Lawrence Chisvin and R. James Duckworth
1. Introduction . . . 160
2. Address-Based Storage and Retrieval . . . 162
3. Content-Addressable and Associative Memories . . . 164
4. Neural Networks . . . 174
5. Associative Storage, Retrieval, and Processing Methods . . . 176
6. Associative Memory and Processor Architectures . . . 184
7. Software for Associative Processors . . . 212
8. Conclusion . . . 225
Acknowledgments . . . 228
References . . . 229
Image Database Management
William I. Grosky and Rajiv Mehrotra
1. Introduction . . . 237
2. Image Database Management System Architecture . . . 239
3. Some Example Image Database Management Systems . . . 249
4. Similarity Retrieval in Image Database Systems . . . 266
5. Conclusions . . . 283
Acknowledgments . . . 283
References and Bibliography . . . 283
Paradigmatic Influences on Information Systems Development Methodologies: Evolution and Conceptual Advances
Rudy Hirschheim and Heinz K. Klein
1. Introduction . . . 294
2. Evolution of Information Systems Development Methodologies . . . 295
3. Methodologies and Paradigms . . . 305
4. Paradigms and the Continued Evolution of Methodologies . . . 325
5. Conclusion . . . 366
Acknowledgments . . . 367
6. Appendices: Summaries of the Methodologies . . . 367
References . . . 381
AUTHOR INDEX . . . 393
SUBJECT INDEX . . . 405
Contents of Volumes in this Series . . . 413
Contributors

Numbers in parentheses refer to the pages on which the authors' contributions begin.

J. K. Aggarwal (59), Computer and Vision Research Center, College of Engineering, The University of Texas, Austin, Texas 78712
Ted J. Biggerstaff (1), Microelectronics and Computer Technology Corporation, Austin, Texas 78759
Lawrence Chisvin (159), Digital Equipment Corporation, Hudson, Massachusetts 01749
R. James Duckworth (159), Department of Electrical Engineering, Worcester Polytechnic Institute, Worcester, Massachusetts 01609
Ralph Duncan (113), Control Data, Government Systems, Atlanta, Georgia 30328
William I. Grosky (237), Computer Science Department, Wayne State University, Detroit, Michigan 48202
Rudy Hirschheim (293), College of Business Administration, University of Houston, Houston, Texas 77204
Heinz K. Klein (293), School of Management, State University of New York, Binghamton, New York 13901
Rajiv Mehrotra (237), Computer Science Department, Center for Robotics and Manufacturing Systems, University of Kentucky, Lexington, Kentucky 40506
N. Nandhakumar (59), Department of Electrical Engineering, University of Virginia, Charlottesville, Virginia 22903
Preface

The publication of Volume 34 of Advances in Computers continues the in-depth presentation of subjects of both current and continuing interest in computer and information science. Contributions have been solicited from highly respected experts in their fields who recognize the importance of writing substantial review and tutorial articles in their areas of expertise. Advances in Computers permits the publication of survey-type articles written from a relatively leisurely perspective. By virtue of the length of the chapters included, authors are able to treat their subjects both in depth and in breadth. The Advances in Computers series began in 1960 and now continues in its 33rd year with this volume. During this period, in which we have witnessed great expansion and dynamic change in the computer and information fields, the series has played an important role in the development of computers and their applications. The continuation of the series over this lengthy period is a tribute to the reputations and capabilities of the authors who have contributed to it.

Included in Volume 34 are chapters on software reuse, multisensory computer vision, parallel computer architecture, associative memory, image databases, and paradigms for information systems development.

In the first chapter Ted Biggerstaff states that software reusability is an approach that under special circumstances can produce an order of magnitude improvement in software productivity and quality, and under more common circumstances can produce less spectacular but nevertheless significant improvements in both. His chapter examines several aspects of reuse. He concludes that software reuse provides many opportunities for significant improvements to software development productivity and quality within certain well-defined contexts. If one understands where it works well and why, it can be a powerful tool in one's arsenal of software development tools and techniques.

Nandhakumar and Aggarwal in Chapter 2 consider that computer vision broadly includes a variety of sensing modes. They conclude that the advantages of multisensory approaches to computer vision are evident from their discussions. The integration of multiple sensors or multiple sensing modalities is an effective method of minimizing the ambiguities inherent in interpreting perceived scenes. The multisensory approach is useful for a variety of tasks including pose determination, surface reconstruction, object recognition, and motion computation, among others.

In the third chapter Ralph Duncan indicates that the term parallel processing designates the simultaneous execution of multiple processors to solve a single computational problem cooperatively. Parallel processing has
attracted a great deal of recent interest because of its potential for making difficult computational problems tractable by significantly increasing computer performance. He further states that parallel processing must be supported by architectures that are carefully structured for coordinating the work of many processors and for supporting efficient interprocessor communications. His chapter’s central aim has been to show that, despite their diversity, parallel architectures define a comprehensible spectrum of machine designs. Each of the major parallel architecture classes included represents a fundamental approach to supporting parallelized program execution effectively. Chisvin and Duckworth in the fourth chapter state that associative memory has finally come of age. After more than three and a half decades of active research, industry integrated circuit design and fabrication ability has finally caught up with the vast theoretical foundation built up over that time. In the past five years, in particular, there has been an explosion in the number of practical designs based upon associative concepts. Advances in very largescale integration technology have allowed many previous implementation obstacles to be overcome. Their chapter describes the field of contentaddressable memory and associative memory, and the related field of associative processing. Compared to conventional memory techniques, contentaddressable and associative memory are totally different ways of storing, manipulating, and retrieving data. In the next chapter Grosky and Mehrotra discuss database management systems for images. Although database management systems were originally developed for data processing applications in a business environment, there has recently been much interest expressed in the database community for devising databases for such nonstandard data as graphics, maps, images, video, and audio, as well as their various combinations. Much of the initial impetus for the development for such nonstandard databases originated in the scientific community concerned with the type of data that was to be managed. Grosky and Mehrotra convey an appreciation for the continuing development of the field of image databases. They believe that since researchers in the database community have shown a mutual interest in its development, the field of image database management should experience much growth. This field is still in its infancy and not yet on a firm footing; the correct questions are just starting to be asked, let alone answered. Hirschheim and Klein, in the final chapter, state that the subject of computer-based information systems development has received considerable attention in both the popular and academic literature over the past few decades. One area that continues to have a high profile and where a remarkable amount of interest can easily be observed is in the approaches or methodologies for developing information systems. It is likely that hundreds of
different methodologies exist. In this chapter, the authors explore the emergence of alternative information systems development methodologies, placing them in their historical context and noting where and why they differ from each other. Hirschheim and Klein believe that the history of methodologies appears to be driven more by fashionable movements than by theoretical insights. They conclude that, from the beginning, methodologies were influenced primarily by functionalism, but more recently the inspiration has come from alternative paradigms. They have also shown that methodologies can be improved by systematically importing fundamental concerns and principles inspired by different paradigms. I am pleased to thank the contributors to this volume. They have given extensively of their time and effort to make this book an important and timely contribution to their profession. Despite the considerable time and effort required, they have recognized the importance of writing substantial review and tutorial contributions in their areas of expertise; their cooperation and assistance are greatly appreciated. Because of their efforts, this volume achieves a high level of excellence and should be of great value and substantial interest for many years to come. It has been a pleasant and rewarding experience for me to edit this volume and to work with the authors.
MARSHALL C. YOVITS
An Assessment and Analysis of Software Reuse

TED J. BIGGERSTAFF
Microelectronics and Computer Technology Corp.
Austin, Texas
1. Introduction . . . 1
   1.1 Hyperboles of Reuse . . . 2
   1.2 Key Factors Fostering Successful Reuse . . . 4
2. Software Reusability Successes . . . 10
   2.1 Fourth-Generation Languages (LSR to VLSR) . . . 10
   2.2 Application Generators (VLSR) . . . 13
   2.3 Forms Designer Systems (LSR to VLSR) . . . 15
   2.4 Interface Developer's Toolkits . . . 18
   2.5 The Software Factory (MSR to LSR, Process-Oriented Reuse) . . . 20
   2.6 Emerging Large-Scale Component Kits (LSR) . . . 22
   2.7 User-Oriented Information System (LSR to VLSR) . . . 23
   2.8 Application-Specific Reuse (LSR to VLSR) . . . 25
   2.9 Designer/Generators (LSR to VLSR) . . . 27
3. Examples of Reuse Implementation Technologies . . . 30
   3.1 Classification and Library Systems . . . 30
   3.2 CASE Tools . . . 31
   3.3 Object-Oriented Programming Systems . . . 33
4. Effects of Key Factors . . . 38
   4.1 Relationships among the Reuse Factors . . . 38
   4.2 A Quantitative Model of the Relative Amount of Integration Code . . . 41
5. Futures and Conclusions . . . 53
   5.1 Futures . . . 53
   5.2 Conclusions . . . 54
6. References . . . 54
1. Introduction
Software reusability (Biggerstaff and Perlis, 1984; Biggerstaff and Richter, 1987; Freeman, 1987; Tracz, 1987, 1988; Biggerstaff and Perlis, 1989; Weide et al., 1991) is not a "silver bullet"* (Brooks, 1987), but is an approach that under special circumstances can produce an order of magnitude improvement in software productivity and quality, and under more common circumstances can produce less spectacular but nevertheless significant improvements in both. This chapter will examine several aspects of reuse: (1) reuse hyperboles that lead to false expectations, (2) examples of reuse successes, (3) the factors that make these examples successful, (4) the relationships among these factors, (5) in particular, the relationship between reuse technologies and their potential for productivity and quality improvement, and (6) the quantitative relationship between the key factors and the resultant reuse benefits.

* The phrase "silver bullet" is jargon that refers to a panacea for software development.
1.1 Hyperboles of Reuse
After listening to a series of speakers, each promising additive cost decreases that were summing suspiciously close to 100%, one wag was heard to comment, "If this keeps up, pretty soon our internal software development activities will be so efficient that they will start returning a profit." As in this story, software reusability hyperboles often strain credulity. Unfortunately, software reusability hyperbole is more seductive than software reusability reality. There are several major reuse hyperboles that reflect some measure of truth but unfortunately overstate the profit of reuse or understate the required qualifications and constraints.

• Reuse technology is the most important factor to success. This is an aspect of the silver bullet attitude and is typified by statements like: "If I choose Ada, or Object-Oriented programming, or an application generator, then all other factors are second- and third-order terms in the equation that defines the expected improvement. Success is assured." However, this is seldom completely true. While the technology can have very high impact (as with application generators, for example), it is quite sensitive to other factors such as the narrowness of the application domain, the degree to which the domain is understood, the rate of technology change within the domain, the cultural attitude and policies of the development organizations, and so forth. Yes, the technology is important, but it is not always primary nor even a completely independent factor.

• Reuse can be applied everywhere to great benefit. This is another aspect of the silver bullet attitude: that one can apply reuse to any problem or application domain with the same expectation of high success. The reality is that narrow, well-understood application domains with slowly changing technologies and standardized architectures are the most likely to provide a context where reuse can be highly successful. For example, well-understood domains like management information systems (MIS) and business applications, user interfaces, narrowly defined product lines, numerical computation, etc. all, to a greater or lesser extent, have these qualities, and reuse has flourished in these environments. Reuse has failed in new, poorly understood domains.

• Reuse is a hunter/gatherer activity. Making a successful reuse system is largely an intellectual activity of finding the right domain, the right domain standards, the infrastructure, and the right technical culture. It is not simply a matter of going out into the field and gathering up components left and right. Casually assembled libraries seldom are the basis of a high-payoff reuse system. Successful reuse systems are crafted to accomplish a set of well and narrowly defined company or organizational goals. Too general a set of goals (e.g., we need a reuse system) or too general a domain (e.g., we need components that support all of our functional needs) usually leads to a low payoff. The hidden truth in this attitude is that populating a reuse library is largely fieldwork and that the "gold" is in the domain. But the success comes through problem-driven harvesting, establishing domain standards to enhance component interconnectability, and careful adaptation of the harvested components to those interconnection standards.

• We can have reuse without changing our process. Reuse is sensitive to many cultural, policy, and environmental factors. An anti-reuse attitude within an organization, a process that is inconsistent with reuse, or a weak, unsupportive infrastructure (software and process) can doom a potentially successful reuse effort.
Given that we reject these hyperboles, let us look at the reality of software reuse. In the broadest sense, software reuse is the formalization and recording of engineering solutions so that they can be used again on similar software developments with very little change. Hence, in one sense, the software reuse process institutionalizes the natural process of technology evolution. Consider the evolution of commercial software products. Successful companies often maximize their competitiveness by focusing on product niches where they can build up their technological expertise and thereby their product sets and markets, in an evolutionary fashion. For example, over a period of years, a company might evolve a line editor into a screen editor and then evolve that into a word processor and finally evolve that into a desktop publishing system. Each generation in such an evolution exploits elements of the previous generations to create new products and thereby build new markets. In an informal sense, such a company is practicing reuse within a product niche. The companies that formalize and institutionalize this process are truly practicing reuse. Since this definition of reuse
is independent of any specific enabling technology (e.g., reuse libraries or application generators), it allows us to take a very broad view of reuse, both in the range of potential component types that can be reused (e.g., designs, code, process, know-how, etc.) as well as in the range of technologies that can be used to implement reuse. The success of a reuse strategy depends on many factors, some of them technical and some of them managerial. While we will attempt to point out management factors that foster or impede reuse, we will largely focus on the technology of reuse. In the next subsection, we hypothesize a number of factors or properties that we believe foster successful software reuse. Then in the following sections of the chapter, we will examine several reuse successes and the role that these factors played in those successes. Finally, we attempt to build a qualitative model that describes the interrelationship among the factors and a quantitative model that describes the effects of two of the key independent technology factors on the payoff of software reuse. In the end, we hope to leave the reader with a good sense of the kinds of reuse approaches and technologies that will lead to success and those that will not.
1.2 Key Factors Fostering Successful Reuse
Some of the key factors that foster successful reuse are:

• Narrow domains
• Well-understood domains/architectures
• Slowly changing domain technology
• Intercomponent standards
• Economies of scale in market (opportunities for reuse)
• Economies of scale in technologies (component scale)
• Infrastructure support (process and tools)
• Reuse implementation technology
Narrow domains: The breadth of the target domain is the one factor that stands out above all others in its effect on productivity and quality improvement. Typically, if the target domain is so broad that it spans a number of application areas (often called horizontal reuse) the overall payoff of reuse for any given application development is significantly smaller than if the target domain is quite narrow (often called vertical reuse). The breadth of the target domain is largely discretionary, but there is a degree to which
the reuse implementation technology may constrain the domain breadth. There is a range of implementation technologies, with broad-spectrum technologies at one end and narrow-spectrum technologies at the other. Broadspectrum technologies (e.g., libraries of objects or functions) impose few or no constraints on the breadth of the target domain. However, narrow-spectrum technologies, because of their intimate relationship with specific domain niches, do constrain the breadth of the target domain, and most often constrain target domains quite narrowly. In general, narrow-spectrum implementation technologies incorporate specialized application domain knowledge that amplifies their productivity and quality improvements within some specific but narrow domain. As an example, fourth-generation languages (4GLs) assume an application model that significantly improves the software developer’s ability to build MIS applications but is of no help in other domains such as avionics. Even though there is a restrictive relationship only at one end of the spectrum (between narrow target domains and narrow implementation technologies), in practice there seems to be a correlation between both ends of the spectrum. Not only do narrow-spectrum technologies, perforce, correspond to narrow target domains but broad-spectrum technologies often (but not always) correspond to broader domains. The key effect of domain breadth is the potential productivity and quality improvement possible through reuse. Reuse within very narrow domains provides very high leverage on productivity and quality for applications (or portions of applications) that fall within the domain but provides little or no leverage for applications (or portions of applications) that fall outside the domain. For example, an application generator might be used to build MIS applications and it would give one very high leverage on the data management portion of the application but it would not help at all in the development of the rest of the application. Luckily, MIS applications are heavily oriented toward data management and therefore, such reuse technologies can have a significant overall impact on MIS applications. Broad-spectrum technologies, on the other hand, show much less productivity and quality improvement on each individual application development but they affect a much broader class of applications. Generally speaking, the broad-spectrum technologies we are going to consider can be applied to virtually any class of application development. In the succeeding sections, we will often use the general terms narrowspectrum reuse and broad-spectrum reuse to indicate the breadth of the domain without any specific indication of the nature of the implementation technology being used. If the breadth of the implementation technology is important to the point, we will make that clear either explicitly or from context.
Well-understood domains/architectures: The second key factor affecting the potential for reuse success is the level of understanding of problem and application domains, and the prototypical application architectures used within those domains. Well-understood domains and architectures foster successful reuse approaches and poorly understood domains and architectures almost assure failure. Why is this? Often as a domain becomes better and better understood, a few basic, useful, and successful application architectures evolve within the domain. Reuse systems can exploit this by reusing these well-understood architectural structures so that the software developer does not have to recreate or invent them from scratch for each new application being developed. However, if such application architectures have not yet evolved or are not known by the implementing organization, it is unlikely they will be discovered by a reuse implementation project. The fact that the problem domains in which narrow-spectrum reuse has been successful are well-understood domains is not coincidental. In fact, it is a requirement of a narrow-spectrum reuse technology. This observation points up a guideline for companies that intend to build a narrow-spectrum reuse system to support application development. To successfully develop a narrow-spectrum reuse technology, say an application generator or a domain-specific reuse library, the developer must thoroughly understand the problem and application domain and its prototypical architectures in great detail before embarking on the development of a reuse system for that domain.
There is a three-system rule of thumb-if one has not built at least three applications of the kind he or she would like to support with a narrowspectrum technology, he or she should not expect to create a program generator or a reuse system or any other narrow-spectrum technology that will help build the next application system. It will not happen. One must understand the domain and the prototypical architectures thoroughly before he or she can create a narrow-spectrum reuse technology. Hence, the biggest, hardest, and most critical part of creating a narrow-spectrum technology is the understanding of the domain and its prototypical architectures. Slowly changing domain technology: Not only must one understand the domain but the domain needs to be a slowly changing one if it is to lend itself to reuse technology. For example, the domain of numerical computation is one in which the underlying technology (mathematics) changes very little over time. Certainly, new algorithms with new properties are invented from time to time (e.g., algorithms allowing high levels of parallel computation) but these are infrequent and the existing algorithms are largely constant.
Thus, if an organization makes a capital investment in a reuse library or an application generator for such domains, they can amortize that investment over many years. Rapidly changing domains, on the other hand, do not allow such long periods of productive use and, therefore, d o not offer as profitable a return on the initial investment. Intercomponent standards: The next factor is the existence of intercomponent standards. That is, just like hardware chips plug together because there are interchip standards, software components, and especially narrowspectrum technology components plug together because there are analogous intercomponent standards. These standards arise out of an understanding of the problem domains and the prototypical architectures. The narrower the domain, the narrower and more detailed the intercomponent standards. In very broad domains, these standards deal with general interfaces and data (e.g., the format of strings in a string package), whereas in a narrow domain the standards are far more narrowly focused on the elements of that domain (e.g., in an “input forms” domain, the standards might specify the basic data building blocks such as field, label, data type, data preskntation form, and so forth). This factor suggests that certain narrow spectrum reuse technology strategies will not work well. For example, if one intends to build a library of reusable software components, the strategy of creating a library and then filling it with uncoordinated software components, will lead to a vast wasteland of components that do not fit together very well. Consequently, the productivity improvement will be low because the cost to adapt the components is high. The analogy with hardware manufacturing holds here. If two software components (or chips) are not designed to use the same kinds of interfaces and data (signals), extra effort is required to build interface software (hardware) to tie them together. This reduces that payoff gained by reuse and also tends to clutter the design with Rube Goldberg patches that reduce the resulting application’s maintainability and limit its ability to evolve over time. Economies of scale in market: Another important factor is the economies of scale in the “market,” where we are using the term market in the broadest sense of the word and intend to include the idea that the total coalition of users of a component, regardless of the means by which they acquire it, is the market for that component. Thus, economies of scale in the market means that any reuse technology should be driven by a large demand or need. One should be able to identify many opportunities to apply the reuse technology to justify its development (or purchase) and maintenance. If you
are only going to develop one or two applications, it seldom pays to develop (or purchase) a reuse technology for the target application. This is not to say that informal, ad hoc or opportunistic reuse, which is not organizationally formalized, should not be exploited. The point is that if an institutionalized reuse technology costs a company a lot to develop and maintain, it should return a lot more in savings to that company. One way to gauge that return beforehand is to consider the opportunities for reuse.

Economies of scale in technologies: There are also economies of scale in the technologies themselves, in the sense that the larger the prefabricated component that is used by the reuse technology, the greater the productivity improvement for each use. And it is this increase in size of components that tends to force the narrowing of the technology domain. Thus, the size of the prefabricated component, the narrowness of the application domain, and the potential productivity improvement are all positively correlated. Because the scale of the components is so important, and because scale correlates with other important properties of reuse technologies, we introduce some broad terminology that draws on the hardware component analogy. Small-scale components are defined to be from 10 to 100 lines of code, i.e., O(10^1) LOC; medium-scale components are those from 100 to 1000 lines, i.e., O(10^2) LOC; large-scale from 1000 to 10,000 lines, i.e., O(10^3) LOC; very large-scale from 10,000 to 100,000 lines, i.e., O(10^4) LOC; and hyper-scale above 100,000 lines, i.e., greater than O(10^5) LOC. The sizes that we choose are somewhat arbitrary and flexible because we are most interested in the relative properties of the reuse systems that rely on the different scales of components. Therefore, the numbers should not be taken too literally but rather should provide a loose categorization of component sizes. Carrying the hardware analogy further, we use the term SSR (small-scale reuse) to refer to those technologies that tend to use small-scale components on the average. SSR is motivated by the hardware term SSI (small-scale integration). Similarly, MSR, LSR, VLSR, and HSR are medium-scale, large-scale, very large-scale, and hyper-scale reuse technologies. While reuse technologies are not, strictly speaking, limited to a particular scale, they seem to most easily apply to a characteristic scale range. For example, libraries of functions tend toward small scale and medium scale not because it is impossible to build large and very large function-based components, but rather because of the lack of formal support for large-scale design structures (e.g., objects or frameworks) in functionally based programming languages. Any such large-scale design structure falls outside of the functional language formalism and must be manually enforced. Experience has shown that manual enforcement tends not to be very successful. It is generally
easier to use other reuse implementation technologies (e.g., true object-based languages) that provide formal mechanisms to enforce and manage these larger-scale structures.

Infrastructure support: Another important factor is an organization's infrastructure. Most reuse technologies (and especially the narrow-spectrum technologies) pay off best when they are coordinated with an existing, well-defined, and mature software development infrastructure (process). For example, an organization that uses computer-aided software engineering (CASE) tools is better positioned to exploit the reuse of design information than one that does not. CASE tools provide a (partially) formal notation for capturing such designs. And if an organization is already trained and using CASE tools, the additional effort to integrate a library of reusable designs into the process is significantly less than it would be otherwise.

Reuse implementation technologies: One factor that can affect the degree of success of a reuse approach is the implementation or enabling technology that one chooses. For many narrow-spectrum approaches to reuse, the technology is intimately tied to the approach and it makes more sense to discuss these technologies in the context of the discussions of the specific approaches. We will do this in the following section. On the other hand, broad-spectrum implementation technologies are not tied to any specific reuse approach, even though they are quite often used for broad-spectrum reuse, and so we will mention a few instances of these technologies here and discuss their values.

• Libraries: Library technology is not a primary success factor but its value lies largely in establishing a concrete process infrastructure that fosters reuse by its existence more than by its functionality. If an organization's first response to a reuse initiative is to build a library system, then they probably have not yet thought enough about the other more important factors.
• Classification systems: The main value of classification systems is that they force an organization to understand the problem and application domain.
• CASE tools: Their value lies in establishing a representation system for dealing with designs and thereby including reusable components that are more abstract (and therefore, more widely reusable) than code.
• Object-oriented programming languages: Their main value is in the perspicuity of the representation and its tendency to foster larger and more abstract reusable components (i.e., classes and frameworks) than
in earlier languages (i.e., functions). Further, the object-oriented representation tends to lead to clearer, more elegant, and more compact designs. (A small sketch of such a class-based component appears at the end of this section.)

In summary, reuse success is not a result of one technology or one process model or one culture. It is a result of many different mixtures of technologies, process models, and cultures. We can be guided by a few general principles that point in the direction of success and warn us away from surefire failures, but in the end, the details of success are defined by hard technical analysis and a strong focus on the application and problem domains. I suspect that there is an 80/20 rule here: the domain has an 80% effect and all of the rest has a 20% effect.
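As a small, hypothetical illustration of this point (the class and method names below are invented for this sketch and do not come from any system discussed in the chapter), a class can package state, a fixed workflow, and an explicit extension point into a single reusable design unit, where a function library would leave that larger-scale structure to manual convention:

    class ReportFramework:
        """A reusable framework component: fixed workflow, overridable step."""

        def __init__(self, title):
            self.title = title

        def run(self, records):
            # The framework fixes the overall workflow (the large-scale design);
            # a reuser specializes only the format_record step below.
            lines = [self.title] + [self.format_record(r) for r in records]
            return "\n".join(lines)

        def format_record(self, record):
            raise NotImplementedError("subclasses supply the domain-specific step")

    class PayrollReport(ReportFramework):
        def format_record(self, record):
            name, hours, rate = record
            return f"{name}: {hours * rate:.2f}"

    print(PayrollReport("Weekly payroll").run([("Lee", 40, 12.5), ("Kim", 35, 14.0)]))

The reusable unit here is the framework class rather than any single function, which is the scale difference the chapter attributes to object-oriented representations.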
2. Software Reusability Successes
Now let us consider some cases of successful reuse and analyze them in the light of these success factors.
2.1 Fourth-Generation Languages (LSR to VLSR)
Among the earliest rapid software development technologies to appear, and ones that can be bought off the shelf today, are fourth-generation languages (4GLs) (Gregory and Wojtkowski, 1990; Martin, 1985; Martin and Leben, 1986a, b). These are quite narrow technologies that apply most specifically to the domain of MIS and business applications. The entities that are being reused in these cases are the abstract architectural structures (i.e., design components) of MIS applications. The typical 4GL system provides the end user with some kind of high-level capability for database management. For example, a high-level query from the end-user is often translated into an application database transaction that generates a report. The report may be a business form, a text-based report, a graph, a chart, or a mixture of these elements (see Fig. 1). 4GLs are typically very high-level languages that allow you to talk to the database system without all of the overhead that you would have to use if you were writing an equivalent COBOL program. In a COBOL program you might have to allocate memory and buffers to handle the results from the query. You might have to open the database, initiate the search, and so forth. In contrast, 4GL languages typically do all of those things for you. They provide a language that requires you to talk only about the essential database operations. For example, Fig. 2 shows a structured query language (SQL) query that selects a part number from a table of all parts, such that the weight of the associated part is less than 700 pounds.
FIG. 1. Fourth-generation languages (4GLs): a high-level query or processing request, together with direct input, is applied against the application database to produce generated reports.
SELECT Part# FROM PART WHERE Partweight < 700
FIG. 2. Typical 4GL Query (in SQL).
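To make the contrast concrete, the short sketch below (not from the chapter) uses Python's sqlite3 module purely as a stand-in for the host-language bookkeeping that a conventional program must spell out around the one-line query of Fig. 2; the table and column identifiers are hypothetical adaptations of the names in the figure.

    import sqlite3

    # Stand-in for the application database of Figs. 1 and 2 (identifiers
    # adapted: part_no for Part#, part_weight for Partweight).
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE part (part_no INTEGER, part_weight REAL)")
    conn.executemany("INSERT INTO part VALUES (?, ?)",
                     [(101, 350.0), (102, 900.0), (103, 125.5)])

    # The query itself is the single line a 4GL user writes (cf. Fig. 2);
    # everything else here is the overhead a 4GL hides: opening the database,
    # allocating a cursor for the result set, iterating, and closing.
    cursor = conn.execute("SELECT part_no FROM part WHERE part_weight < 700")
    for (part_no,) in cursor:
        print(part_no)      # a 4GL would typically format a report from this

    conn.close()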
These languages provide you with quite an increase in productivity because of the reduction in the amount of information necessary to perform an operation. Figure 3 illustrates this reduction by comparing the number of bytes required to express a benchmark query in COBOL against the number of bytes required in various 4GLs (Matos and Jalics, 1989). Of course, the exact number of bytes needed to express any given query will vary but the relative sizes represented in this chart are pretty typical. The typical proportions are from 11 to 22 times more characters to express a query in COBOL than in a 4GL. Since the number of bytes required is directly proportional to the amount of work required to create the query, it is an order of magnitude easier to perform database queries and generate reports in 4GLs than in COBOL or other high-level languages. Now let us look at this example of reuse against the properties that we proposed : 0
Narrow domains: clearly, the domain is quite narrow in that it applies to the data management and user interface aspects of MIS and business systems in general. Importantly, this domain is a large part of each such application and therefore, the overall payoff for each application
12
TED J. BIGGERSTAFF
40k 35
30 25
20 15 10
5
0 COBOL
CONDOR
dBASE
FOCUS
INFORMIX ORACLE
PARADOX
REASE
FIG. 3. Comparison of source code volumes.
0
0
can be quite large. Over all business applications, the variance is quite large but one can expect the reduction in effort to range typically between 25% and 90%. It is not atypical for 90% or more of the application to be directly handled by the 4GL, thereby allowing the application to be created for one tenth of the cost of building the system with a conventional high-level language. Defects are similarly reduced.
• Well-understood domains/architectures: the data management and user interface architectures within this application domain have been increasingly better understood and standardized for the last 25-35 years, and consequently they have evolved into standard subsystems that are common to many if not most of the application programs in the domain. DBMSs (database management systems) and 4GLs are two of the concrete manifestations of that ongoing understanding and standardization process.
• Slowly changing domain technology: the underlying hardware and software technologies have changed slowly enough that they can be largely hidden by lower-level system layers, e.g., DBMSs.
• Intercomponent standards: the DBMSs form a set of hardware-hiding standards and the 4GLs impose an additional set of application logic-hiding standards. If we looked inside of various 4GL systems we would likely find other finer-grained standards that allow the subsystems of the 4GL to fit together easily.
• Economies of scale in market: the MIS and business system market is probably one of the largest application markets that exist today. Virtually every business of any size at all requires some set of computer
applications such as payroll, accounts receivable, etc., and these are only the tip of the iceberg for large companies. DBMSs, 4GLs, application generators, and the like are simply the evolutionary answer to these huge market pressures. It is the huge pressures and the advanced age of the market that explain why these systems were among the first examples of narrow-spectrum reuse technologies and why they are currently at the advanced level of maturity and productivity improvement.
• Economies of scale in technologies: the components, i.e., subsystems within the 4GLs, being reused are very large-grained pieces and this correlates with the level of productivity and quality improvement.
• Infrastructure support: the infrastructure support is largely in place when these tools are introduced because the tools are built to fit into an existing MIS shop. They are fitted to the kinds of hardware, operating systems, computer languages, and typical process models of MIS shops. This makes their adoption quite easy.

2.2 Application Generators (VLSR)
Application generators form another class of reuse technology that is similar to the 4GL class but varies in the following ways: 1. Generators are typically used to generate application systems that will be used many times whereas 4GLs generate programs or queries that are often one-of-a-kind. 2. Application generators tend to be more narrow than 4GLs, often focusing on a narrow application family, whereas 4GLs tend to focus on a broader application domain containing many application families. For example, compiler builders, like YACC and Lex, are application generators for building applications in the parser and lexical analyzer families. While it is a research prototype rather than a production system, the GENESIS system (Batory, 1988; Batory et al., 1989) is a good example of an application generator that is pushing the productivity and quality improvement limits. GENESIS (see Fig. 4) is for DBMSs what compiler builders are for compilers. GENESIS generates database management systems. While many application generators can be purchased off the shelf today, GENESIS is still in its research phase but, nevertheless, is interesting because it illustrates how far generator technology can be pushed. How does GENESIS work? The GENESIS user specifies a set of requirements that characterize the kind of DBMS desired. In a typical specification, the user specifies (1) a data
FIG. 4. The GENESIS application generator system: the user selects a data language, database link implementation, file mapping, file structure, recovery method, and data type schemas, and GENESIS generates a DBMS to that specification.
language, e.g., Sequel and/or QBE; (2) the database link implementation, e.g., a ring list; (3) the file mapping, e.g., secondary indexes and encryption; (4) the file structures, e.g., B-trees, ISAM, unordered files, etc.; (5) the recovery methods, for example, logging, shadowing, etc. ; and ( 6 ) the data type schemas, e.g., ints and reds and strings, etc. GENESIS then generates a database management system to those specifications. So if one wants to generate a DBMS that has exactly the same functionality as Ingress, that can be done by specifying the particular requirements of Ingress. Typically, application generators provide productivity that is one or two orders of magnitude better than hand coding. While the only problem that GENESIS solves is the creation of database management systems, it is highly productive at this. I can generate a 40,000-plus line DBMS in about 30 minutes of work. So, application generators can give you very high productivity for very narrow domains. What is more, the quality of the code is very high. Typically, a bug that is due to the generator turns up about as frequently as bugs in a good, mature compiler. Now let us look at this example of reuse against the factors: Narrow domains : this is one of the narrowest domains-DBMS-and the productivity and quality improvements over hand coding from scratch are consequently exceptionally high. In this case, we can experience several orders of magnitude improvement in productivity and quality40,OOO lines of debugged code in less than 1 hour of work by using GENESIS versus four or five people for several years to build an equivalent target system from scratch. Well-understood domains/architectures: DBMSs have the advantage that hundreds of researchers have been working for over 20 years to
work out the theoretical basis of these systems. That background work has turned an impossible task (i.e., building a GENESIS system 20 years ago) into one that is just hard (i.e., building GENESIS today). Slowly changing domain technology : DBMS technologies are relatively stable overtime although they do seem to go through periodic technology shifts such as moving from hierarchical to relational and more recently to Object-Oriented DBMSs. However, within any given DBMS model, the change is relatively slow and within the older technologies (hierarchical and relational) fundamental advances now seem almost nonexistent. Intercomponent standards : GENESIS would be impossible without the development of a set of well-defined module interconnection standards that allow the system to plug together various modules with different operational properties but having the same standardized connection interface. Economies of scale in market: since GENESIS is a research project, it is not yet clear whether or not there really are economies of scale for GENESIS per se. Nevertheless, the typical application generator arises because of a “market pressure” (sometimes within a single company) for the facility. Economies of scale in technologies: the prefabricated GENESIS components are typically several thousand lines of (parameterized) code and if one considers the additional generated code, GENESIS is in the VLSR technology range. Infrastructure support: as with 4GLs, application generators are fitted to the kinds of hardware, operating systems, computer languages, and typical process models that already exist within MIS shops, making their adoption quite easy.
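To make the kind of declarative specification described for GENESIS concrete, here is a minimal sketch; the field names, accepted values, and the generate_dbms stub are hypothetical illustrations of the idea, not GENESIS's actual interface.

    # Hypothetical specification mirroring the six requirement categories
    # discussed above (data language, link implementation, file mapping,
    # file structures, recovery method, data type schemas).
    dbms_spec = {
        "data_language": ["SEQUEL", "QBE"],
        "link_implementation": "ring_list",
        "file_mapping": ["secondary_indexes", "encryption"],
        "file_structures": ["b_tree", "isam", "unordered"],
        "recovery": "logging",
        "data_types": ["int", "real", "string"],
    }

    def generate_dbms(spec):
        # A real generator composes prefabricated components that share
        # standardized interconnection interfaces; this stub only validates
        # the specification and reports what would be assembled.
        required = {"data_language", "link_implementation", "file_mapping",
                    "file_structures", "recovery", "data_types"}
        missing = required - spec.keys()
        if missing:
            raise ValueError(f"incomplete specification: {sorted(missing)}")
        return f"DBMS plan: {len(spec)} subsystems selected from the component library"

    print(generate_dbms(dbms_spec))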
2.3 Forms Designer Systems (LSR to VLSR)

Another kind of reuse technology is forms designers, which are variously called screen painters or designers. These systems attack the problem of developing a forms-based user interface for a business application. Most businesses are rife with forms, e.g., invoice forms, that are used as an integral part of the business operation. Consequently, they are ubiquitous in many business applications and are therefore a prime candidate for a reuse technology. Forms designers and screen painters allow one to quickly generate a forms-based user interface by using a set of predefined building block components. Figure 5 presents a conceptual overview of forms designers. A forms designer's form representation is visually like a paper-based form and it is used by the application program to request input from, and present
FIG. 5. Creation of a form-based application interface: a form/screen design, drawn through a direct-manipulation interface, is turned into a form/screen schema specifying labels, boxes and boundaries, positions and sizes, edit modes, groupings, edit order, formulas, and functions.
output to the end-user. Users create forms by a direct manipulation interface permitting them to draw the form on the screen, complete with labels, fields, borders, and so forth, exactly the way they want it to look to the application user. Then, the form design is turned into an internal schema that specifies the form's labels, boxes, boundaries, colors, fields, and their positions. The schema may also specify editing modes. For example, numbers may be the only valid input for a date or price field. It specifies the edit order of the fields, i.e., the order in which the cursor sequences through the fields as the end user presses the tab or return key. The schema may also allow formulas that functionally relate certain fields to other fields. For example, the gross pay field in a work record form could be calculated as the product of the field containing the salary per hour times the field containing the number of hours worked. (A minimal sketch of such a schema appears at the end of this section.) Once these forms are created, they are used by an application program as a way to request input from or present output to the user, as shown in Fig. 6. In the use phase, the application program loads the schema and a runtime library that manages the form's data and user interaction. The runtime library handles the presentation of the form to the user, the interaction with the user, the editing of the fields, and the transformation of the form's data into some kind of transaction or internal record that can be used by the application program. Once the data are entered into the form by the end-user, the form's data fields are typically converted into a data record or database transaction, which may produce a variety of business side effects, e.g., inventory being ordered, an invoice generated, etc.

FIG. 6. Operation of a form within an application program: the program loads the schema and a runtime library that manages the form's data and the user interaction.

The properties of this domain are:
• Narrow domains: this domain (forms-based user interfaces) is quite narrow but constitutes a smaller portion of the application (i.e., only the user interface portion) than 4GLs typically do. Therefore, it leads to a somewhat smaller but by no means inconsequential developmental cost and defect reduction. Depending on the overall complexity of the application, one might expect a typical developmental cost and defect reduction to be in the 5-25% range. When a forms designer is incorporated into a 4GL, which is common, the overall improvement jumps significantly and an order of magnitude decrease in total developmental cost and number of defects is common.
• Well-understood domains/architectures: like 4GLs, this technology has been evolving for years and the methods and architectures are well and widely known.
• Slowly changing domain technology: this technology has been largely stable for years with only minor evolutionary changes arising from advances in monitor technology (e.g., high resolution, bitmapped, color, etc.) and the associated interface software (e.g., graphical user interfaces (GUIs) and windowing interfaces). Much of this evolutionary change can be and has been isolated by software layers within the screen designers that abstract out the essential properties of the monitor hardware and the interface software.
• Intercomponent standards: the screen designer tool establishes a wide range of standards including what types of data can be accommodated in fields, how the field and label information is encoded for the runtime routines, what kinds of editing operations are available to the
user, and the nature of the data and operations that result from a completed form.
• Economies of scale in market: like the 4GLs, this is a huge marketplace that includes most MIS and business systems.
• Economies of scale in technologies: the reusable components (in the run-time library) are medium- to large-scale code components.
• Infrastructure support: like the 4GLs and application generators, this technology fits the existing infrastructure and therefore accommodates easy inclusion into the existing software development environment.
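To ground the schema and runtime-library discussion above, the following minimal sketch shows one way a form schema, its edit order, and a derived-field formula might be represented and interpreted; the field names and layout are hypothetical, not any particular product's format.

    # Hypothetical form schema: labels, field types (edit modes), edit order,
    # and a formula relating fields (gross pay = hours worked x hourly rate).
    work_record_form = {
        "fields": {
            "employee":     {"label": "Employee",  "type": str},
            "hours_worked": {"label": "Hours",     "type": float},
            "hourly_rate":  {"label": "Rate",      "type": float},
            "gross_pay":    {"label": "Gross pay", "type": float, "computed": True},
        },
        "edit_order": ["employee", "hours_worked", "hourly_rate"],
        "formulas": {"gross_pay": lambda f: f["hours_worked"] * f["hourly_rate"]},
    }

    def run_form(schema, raw_input):
        """Stand-in for the runtime library: edit the fields in order, apply
        the formulas, and return a record the application can turn into a
        transaction (e.g., a database update)."""
        record = {}
        for name in schema["edit_order"]:                  # tab/cursor order
            field = schema["fields"][name]
            record[name] = field["type"](raw_input[name])  # crude edit-mode check
        for name, formula in schema["formulas"].items():
            record[name] = formula(record)                 # derived fields
        return record

    print(run_form(work_record_form,
                   {"employee": "Lee", "hours_worked": "40", "hourly_rate": "12.50"}))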
2.4 Interface Developer's Toolkits (VLSR)

Early forms and screen generation systems were usually built directly on the operating system. More recently, a new layer, the window manager, has been introduced, and applications are now built on top of these windowing systems. This has given rise to another kind of reuse facility: the interface developer's toolkit. Interface toolkits are analogous to forms designers and screen painters but address a broader range of applications. More to the point, they allow one to build applications with GUIs. Interface toolkits provide libraries of facilities out of which one can build a rich variety of user interfaces, including interfaces for form designers and screen painters. Figure 7 presents a conceptual overview of the interface developer's toolkit. Like the form and screen designers, interface toolkits are designed for developing user interfaces. They tend to be built on top of whatever standard window interface is supplied with the operating system. They also provide
FIG. 7. The interface developer's toolkit: an application interface design includes calls to widgets from a widget library (active regions, windows, scrollbars, menus, icons, dialog boxes, buttons, composites), built on the windowing interface and oriented toward direct manipulation of the user interface.
a number of direct manipulation widgets (Nye and O’Reilly, 1990)-to use the X windows (Heler, 1990; Hinckley, 1989; Nye, 1988; Scheifler et al., 1988; Young, 1989) terminology-that can be included in the application interface via calls. So a typical widget library would provide active regions, a region sensitive to mouse events; window objects with all of the necessary window management functionality; scrollbar objects so that windows can show canvases that are actually much larger than the windows themselves and allow the user to scroll to the unseen portions of the canvas; menus of various kinds, such as pull-down, pop-up, sticky, etc. ; dialog boxes for data input ; buttons to invoke functions ; icons to represent suspended programs ; etc. An advantage of an interface toolkit is that it ensures a uniform look and feel for all applications built with it. This allows the end-user to have a pretty good idea of how a new application operates based only on his or her previous experience with other applications. One of the first uses of this idea was in Xerox PARC’s Alto personal computer system (Xerox, 1979). Later, the same idea was used in the Xerox Star Workstation (Xerox, 1981). The first widespread, commercialization was in the Apple computer’s MacApt interface builder’s kit. More recently, such toolkits have been built for various windowing systems. X-Windows appears to be the emerging window standard for the Unix/workstation world and several X-based toolkits are available, such as Interviews, Motif, and Openwindows. One can also purchase similar toolkits for other operating systems and machines types. Toolkits for the PC (personal computer) market are built on top of Microsoft Windows (TM Microsoft), OS/2 Presentation Manager (TM IBM), etc. The market for interface toolkits is growing rapidly at this time. The properties of this approach are much the same as the forms designers with a few differences. First, because the applications using this approach tend to cover a broader range of domains than just MIS or business systems, the user interface is typically a much smaller part of the overall target application program and therefore, while the payoff in absolute terms is large, the decrease in developmental costs and defect levels is often proportionally smaller over the whole developmental effort than with 4GLs and screen designers. While this technology is reasonably well understood and standards are being formalized, it is not as mature as the forms interface, and therefore it is still evolving new subdomains. One subdomain is the recent emergence of the GUI generator, the analogue of the forms designers. GUI generators, which recently appeared on PCs and are just beginning to appear on workstations, allow one to design the screen interface through a direct manipulation editor (Sun Microsystems Corporation, 1990). They allow one to
graphically construct the interface design using high-level objects like menus, panels, dialog boxes, and events. These tools are more complicated than forms designers because so much more of the application code must come from and be customized to the application by the software engineer rather than just being standard run-time functions loaded from a library. Thus, these tools must allow a lot more custom programming to occur. As these interface designers emerge and evolve, we can expect more and more of the application creation to be taken over by them and consequently, a further decrease in development costs and defect levels.

2.5 The Software Factory (MSR to LSR, Process-Oriented Reuse)

Another reuse approach is the software factory, a process that treats software development more like conventional manufacturing than design. Consequently, reuse plays a large role. The software factory concept has been perfected to a high art by a number of Japanese companies. Toshiba's software factory (Fig. 8) (Cusumano, 1989, 1991; Matsumoto, 1989) is a typical example of this kind of reuse. Their domain is real-time process control software for industrial applications, e.g., heavy steel rolling mills. This is MSR (with the potential to evolve into LSR), where the components are not just code, but are artifacts drawn from across the full life cycle. That is, they include requirements, design, and code. In Toshiba's case, these components are specified in formal languages, stored in a reuse repository, and reused in the course of developing customized versions of the products in the product family that they serve. Because the domain is narrow (i.e., a
FIG. 8. Toshiba's software factory: real-time process control applications; development by copy and edit; methodology-based standards; tool-enforced standards; output per person improved for 10+ years; tool-set incrementalism; religious support of the DB.
product family), each new version of the product represents only a modest variation on the stored components. That is, every heavy steel rolling mill is different in small ways, such as equipment kind and numbers, mill dimensions, and equipment placement. Such differences can be accommodated with only small changes in the requirements, designs, and code components. This is accomplished by a "copy and edit" approach. Interrelated requirements, design, and code components are retrieved from the repository and manually modified to accommodate each new process control system. Because the process is so highly formalized (e.g., through the existence of design languages), it is easy and natural for standards to arise. Further, these standards are enforced by the existence of tools. Both the tools and the associated standards grow and evolve together in an incremental way over the years. Finally, the software factory has a strong commitment to the support and maintenance of the repository system. In Toshiba's case, the formalized languages, supporting tools, and associated standards form the foundation of the formalized software factory process and provide significant opportunities for leverage on productivity and quality of the code. Between 1976 and 1985 the Toshiba software factory increased its software development productivity from an equivalent of 1390 lines of assembly source code per month to 3100 per month, thereby achieving a cumulative productivity increase of approximately 150%. During the same period, they were able to reduce the number of faults to between one quarter and one tenth of the number that they were experiencing at the beginning of the period (Cusumano, 1989). Other Japanese companies with software factory models (e.g., NEC and Fujitsu) have shown similar improvements in productivity and quality. What key properties of the software factory model foster reuse success?
• Narrow domains: in this case, the domain is extremely narrow (i.e., a product family), leading to the opportunity for reusing very large-scale pieces. However, the measured payoff is more modest, suggesting that the degree of customization required for each such component may mitigate the improvement.
• Well-understood domains/architectures, slowly changing domain technology, intercomponent standards, and economies of scale in market: these properties all favor reuse. The domain is a product family that has been perfected over the years, leading to a stable, well-understood architecture with well-developed intra-application standards. The very nature of the business establishes an inertia that slows the change of the problem domain's technology. While this is not a huge market, it is clearly a sufficiently large market to make investment in reuse technology worthwhile. In short, these companies have determined that
reuse makes business sense, which is the best measure of the value of applying this technology.
• Infrastructure support: the operational character of these companies provides a nurturing context for such techniques. The strong emphasis on process and the inclination to cast the software development into a manufacturing metaphor provide an infrastructure in which this approach to reuse has a strong opportunity for success.

2.6 Emerging Large-Scale Component Kits (LSR)
Now let's do a little bit of prediction and look at a set of development technologies that are just beginning to emerge. You cannot buy these technologies today, but in a few years you probably will be able to. I believe that interface toolkits will spawn the development of other complementary toolkits that contain larger-scale components, components that are much more oriented to specific application domains and more complex than widgets. This is an example of the emerging field of vertical reuse. In some sense, large-scale components are an extension of the widget notion but one that is more specialized to particular application domains. (Domain specialization is an inevitable consequence of the growth of component sizes.) For example, desktop publishing is an application domain that is mature enough to supply such components, e.g., fonts, pixel images, graphs, and various kinds of clip art. Spreadsheets are another kind of component that may be included in various applications. What is left to do is to establish standards for their representation that transcend their particular application niche. Once this is done, clip art and spreadsheet frameworks can be imported into and used together within a single application program. Today such integration would be difficult. As transcendent standard representations emerge for these and similar component classes, it will become relatively easy. The large-scale component notion is enabled by object-oriented (Cox, 1986; Ellis and Stroustrup, 1990; Goldberg and Robson, 1983; Meyer, 1988; Saunders, 1989; Stroustrup, 1986, 1988) technology in that objects represent a good implementation mechanism. They are finer grained than programs but larger grained than functions or subroutines. Another important characteristic is that objects hide the details of their implementations, thereby allowing them to be more easily moved into a new application context without introducing conflicts between the implementation of the object and its surrounding context. Thus, this property makes them more like black boxes that can be plugged in where needed. This proposed approach has many of the same properties as interface toolkits adjusted to account for bigger components in narrower domain
niches. We would expect that the component library would be a compilation of components from a number of mostly independent subdomain niches and the average payoff for each application developed using that library would reflect the degree to which the subdomains addressed the typical functionality in the application programs being developed.

2.7 User-Oriented Information System (LSR to VLSR)
Another kind of toolkit seems likely to emerge, one which is more specialized than DBMSs but less specialized than 4GLs or forms designers. This toolkit, illustrated in Fig. 9, is likely to be a combination of hypertext systems (Bigelow, 1987; Biggerstaff and Richter, 1987; Conklin, 1987; Gullichsen et al., 1986; Smith and Weiss, 1988); frame systems (Brachman and Schmolze, 1985; Fikes and Kehler, 1985; Finin, 1986a, b); object-oriented programming languages (Cox, 1986; Ellis and Stroustrup, 1990; Meyer, 1988; Saunders, 1989; Stroustrup, 1986, 1988); and object-oriented databases (Kim and Lechovsky, 1989). Once again, while you can purchase packages that provide some of the characteristics of the desired technology, you cannot purchase a technology with all of the properties that I foresee. Nevertheless, I believe that in a few years you will be able to. What is happening in this area is a convergence of these four technologies. First, hypertext technologies allow one to deal with various kinds of unstructured data and link those data together in arbitrary ways. Hypertext systems are extending the nature of the user interface, but in another sense, they are also extending database technology.
FIG. 9. Emerging user-oriented information systems: hypertext (also called hypermedia) systems; AI frame and rule based systems; object-oriented programming systems; object-oriented DBMSs.
The second technology that is part of this evolution is frame- or rule-based systems. These systems organize related sets of information into frames, which are very much like objects. They often provide a rule subsystem whereby the end-user can express inferencing operations on the frames. The third technology is object-oriented programming systems, which provides an elegant, disciplined, and lucid way to organize programs and their data. And finally, object-oriented DBMSs are beginning to address the problem of "persistence" of an application's object-oriented data. Persistence addresses the following question: "How does one store large sets of object-oriented data such that it persists from application execution to application execution, and so that many applications can share it in a database-like mode?" Figure 10 summarizes the properties that such systems will have. These systems will have a powerful graphical interface that operates on a rich hypermedia-based information subsystem. This subsystem will be a toolkit with lots of facilities that can be quickly and easily integrated into individual application programs. It will have browser-navigators available so that one can navigate through the information. They will be highly flexible browser toolkits for building browsers that can be customized to the needs of specific application programs. In some sense, this technology is really a generalization of the forms designer and GUI generator concepts where the toolkit allows complex information to be projected into a greater variety of output forms and media. That is, we have moved beyond business forms and interfaces, and project the information into arbitrary graphical forms. Similarly, we will be offered a wide variety of output media including graphics, sound,
FIG. 10. Properties of user-oriented information systems: powerful graphical interface; browser/navigator toolkits; generalization of the forms designer concept; arbitrary objects (i.e., irregular data such as text, graphics, etc.); arbitrary linkages; inference; very large scale databases of objects; OODBMSs with persistence and sharing.
full motion video, etc. We can already see elements of multimedia beginning to emerge in the PC marketplace. These systems allow one to operate with arbitrary objects and irregular data. Thus, one can intermix text, graphics, speech, animation, etc., and they can link this information together in rather arbitrary ways. That is, one can take a graphics diagram and put a link from any place in the diagram to any other node (i.e., object) in the database. It is not the kind of thing that typical databases allow one to do very well, because they are designed to deal with regular data; i.e., fixed-length items that fit nicely into predefined tables with predefined relationships, i.e., tables and relationships that are processed in a very regular fashion. Hypertext data are not regular, are not of fixed length, do not fit well into predefined tables or predefined relationships, and are not processed in a regular fashion. Another property of such systems is that they will allow you to do inferencing on the information. For example, one might want to write a rule that checks to see if there is a link between two frames or objects and if there is, execute an operation on the database. The merging of object-oriented programming environments and object-oriented databases will allow large systems of objects or frames to persist from one application execution to the next. Thus, applications will be dealing with and sharing large sets of objects over months or years. Such object sets are typically too large to be loaded completely in memory with any given application. Therefore, applications must have the capability to "page" some subset of the object network into memory to be operated on. The object-oriented DBMSs must keep a faithful correspondence between the object images that are in an application program's memory and the object images that reside in the database. The properties of this technology are likely to be a combination of the properties of the individual technologies. That is, it is likely to have many of the properties of user interface toolkits and 4GLs. However, it is likely that the leverage of these technologies will be proportionally much less because the applications developed are likely to grow in size and complexity. Thus, we would not expect order of magnitude productivity improvements, but rather midrange (20%-50%) improvements. The Parkinsonian growth of the application-specific portion of the target programs is likely to significantly reduce the overall profit from the user-oriented information system reuse.
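To make the inferencing idea concrete, here is a minimal C++ sketch of such a rule over a small in-memory object graph: it checks whether one object is linked to another and, if so, executes an operation. The Frame and ObjectBase types, the link representation, and the rule itself are hypothetical illustrations rather than the API of any actual hypertext or frame product.

```cpp
#include <iostream>
#include <map>
#include <set>
#include <string>

// Hypothetical frame/object with arbitrary named links to other frames.
struct Frame {
    std::string name;
    std::set<std::string> links;   // names of frames this frame links to
};

// Hypothetical object base holding all frames by name.
using ObjectBase = std::map<std::string, Frame>;

// A simple rule: if frame 'a' links to frame 'b', execute an operation.
void linkRule(const ObjectBase& db, const std::string& a, const std::string& b) {
    auto it = db.find(a);
    if (it != db.end() && it->second.links.count(b) > 0) {
        // The "operation on the database" is just a report here.
        std::cout << a << " is linked to " << b << "; executing operation\n";
    }
}

int main() {
    ObjectBase db;
    db["diagram-1"] = Frame{"diagram-1", {"spec-7"}};
    db["spec-7"]    = Frame{"spec-7", {}};
    linkRule(db, "diagram-1", "spec-7");   // rule fires
    linkRule(db, "spec-7", "diagram-1");   // rule does not fire
}
```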
2.8 Application-Specific Reuse (LSR to VLSR)

The narrowest kind of reuse is the kind that is focused on a specific application family. The software factory is an example of one implementation of this idea.
In some sense, the application-specific reusable components concept is an extension of the large-scale component concept that we talked about earlier. The main difference is that application-specific reusable components tend to be larger in scale and oriented toward a narrower class of applications, often a family of closely related products. We would not expect to find application-specific reusable components in the commercial marketplace. They are just too specialized. However, we would expect companies to develop such components internally as a way to enhance their ability to deliver variants of products quickly and cheaply. As a consequence of the increased scale and focus, these components typically provide greater leverage or payoff than large-scale components. But application-specific reusable components are only feasible in well-understood domains where there already exists a high level of domain competence in an organization. That is, if an organization has developed enough avionics systems to thoroughly understand the typical architectures of these systems, then it might be able to create a set of application-specific reusable components for avionics. If not, it would be almost impossible to create such a set because of the large amount of expertise that must be acquired. If an organization is going to develop a set of application-specific reusable components, it must analyze the domain. The organization must determine what set of components will be of greatest benefit to its product sets. One way is to look at the components of previously developed systems and harvest them for the component library. Of course, some energy will have to be invested to generalize these components and make them more reusable, but that is easy in comparison to the overwhelming alternative of creating all of the components from scratch. The results of this domain analysis should include (1) a vocabulary of terms that cover all important problem domain concepts and architectural concepts, (2) a set of standard data items that range across the whole domain and serve as the inputs and outputs for the components, and (3) a set of abstracted designs for the reusable components that will be used to construct the target applications. These results are generalizations of the concepts, data, and components found in existing systems and establish a framework of intercomponent standards that are important to component reusability. For a reuse library to be successful, it must be established on a rich and well-defined set of intercomponent standards. That is, one must make sure that the set of components derived from the domain analysis will plug together easily and can be reused in new applications without a lot of effort to adapt them. The data items, which are standard across all of the components in the library, are the key concrete manifestation of these intercomponent standards. Without such a framework of intercomponent standards, a reuse library has a high probability of failing. With such a framework, the chances of success increase significantly.
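A minimal sketch of why the standard data items matter, assuming a hypothetical domain: because both components below accept and produce the same domain-wide record type, the output of one plugs directly into the input of the other with no adapter code. The FlightPlan record and the two component functions are invented for illustration and are not drawn from any real avionics library.

```cpp
#include <iostream>
#include <string>
#include <vector>

// A standard data item shared by every component in the (hypothetical) library.
struct FlightPlan {
    std::string origin;
    std::string destination;
    std::vector<double> waypointAltitudesFt;   // one agreed-upon unit: feet
};

// Component 1: produces the standard record.
FlightPlan buildPlan() {
    return {"AUS", "DFW", {5000.0, 12000.0, 9000.0}};
}

// Component 2: consumes the same standard record; no conversion is required.
double cruiseAltitudeFt(const FlightPlan& plan) {
    double maxAlt = 0.0;
    for (double a : plan.waypointAltitudesFt)
        if (a > maxAlt) maxAlt = a;
    return maxAlt;
}

int main() {
    FlightPlan plan = buildPlan();                 // output of one component ...
    std::cout << cruiseAltitudeFt(plan) << "\n";   // ... is input to the next
}
```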
Such a framework of intercomponent standards is critical to all reuse efforts, but such standards become a more and more important factor as the scale of the components increases. Hence, application-specific reuse, with its large and very large components, amplifies the importance of such standards. This need to analyze domains via the analysis of existing programs is spawning a new class of tools, which we call design recovery tools. These are really a generalization of reverse engineering tools. While reverse engineering tools are largely aimed at porting or cloning existing programs, design recovery tools are aimed, in addition, at helping human beings to understand programs in human-oriented terms. Operationally, reverse engineering tools are largely concerned with the extraction of the formal information in programs (that information expressible via programming language formalisms) whereas design recovery tools are more concerned with establishing a mapping between the formal, implementation structures of a program and the semiformal, domain-specific concepts that humans use to understand computational intentions. For example, consider the mapping from an array of C structures to the architectural concept process table. I have coined the term "the concept assignment problem" to describe the problem of creating such mappings, and the concept assignment problem is the central problem being addressed by design recovery tools. The understanding developed with the aid of design recovery tools serves several purposes beyond simply porting application programs, purposes such as re-engineering, maintenance, domain analysis, and reuse library population. While the subject of design recovery and the related subjects of re-engineering, maintenance, reverse engineering, and domain analysis are highly important to reuse, they are beyond the scope of this chapter. Suffice it to say that these are all critically important subjects to organizations engaged in reuse.

2.9 Designer/Generators (LSR to VLSR)

Another class of reuse facilities currently being developed in research laboratories is the designer/generator system. These systems add abstract design components to the reusable libraries to push the reuse activities back into the design cycle. Further, they add rules plus a rule-driven shell around the reuse libraries to allow some automation of the design process. The rules define how to specialize (i.e., add the missing details to) the design components and how to interconnect various design components to form a complete target program. Designer/generator systems are typically mixed-initiative systems with the software engineer providing requirements, missing designs, and a large dose of intelligent decision making. By this technique, the systems go from simple requirements and specifications directly to code.
In essence, they emulate the kind of design construction process that human designers perform. If the libraries are well populated in the domain of interest to the software engineer, the development of a system is like a game of 20 questions, with the user giving an initial set of requirements and specifications and then after that only participating when the system gets stuck or needs more information. The end product is executable code for the desired application. The ROSE reuse system (Lubars, 1987; Lubars, 1990; Lubars, 1991) is an example of a designer/generator system (see Fig. 11). It is a prototype that was developed to experiment with this kind of semiautomated reuse system. ROSE has two libraries, one of design schemas and one of algorithms, both of which are expressed in forms more abstract than code. ROSE takes a specification of the target system in the form of a data flow diagram built from abstract data types and abstract operations. It attempts to design the target system from this specification by choosing design schemas from its reuse library to fill out the lower levels of the design. The specifications that it starts with are ambiguous in the sense that most details of the target system are not determined by the initial specifications. Thus, the system develops the details of the design by four mechanisms: (1) choosing candidates for lower-level design schemas from the design library; (2) inferring design details via constraints attached to the designs; (3) transforming and specializing pieces of the developing design by using transformation rules (i.e., design rules) from the library; and (4) soliciting information from the software engineer when it gets stuck. Once the design has been worked down to atomic design elements, it is still more abstract than code and goes through another step which maps (i.e., compiles) the design into algorithms specified in some specific programming language.* If the library is reasonably well populated within the target domain, much of the target program's development is automated and a working program of a few hundred lines of code can be produced in 10-15 minutes of work. If the library is incompletely populated, then the process becomes progressively more manual depending on the level of design library population. With a completely empty library, the system behaves much like a conventional CASE system and requires about the same level of effort as developing the target program with a CASE system. In the case of designer/generator technologies, most of the key factors that we have identified with successful reuse systems are defined more by the nature of the library components than by the designer/generator technology itself. In theory at least, one can populate the libraries with elements of any scale and populate them completely enough to build large percentages of the target applications out of reusable parts.
* The experimental version of ROSE produces target application programs in three languages: C, Pascal, and Ada.
To date, the technology has not been tested with large-scale and very large-scale components, and we speculate that this technology may have problems with big components within a regimen of nearly full automation because such a situation may impose large inference requirements on the system. Therefore, to date, designer/generators have been shown to work reasonably well only for components between medium and large scale within a well-defined framework of domain/architecture standards and constraints. It remains to be seen how well this technology will scale up. This technology is best suited for very narrow and well-understood domains because of the large amount of effort necessary to populate the reuse libraries. In fact, the large effort to populate ROSE's design library led to the creation of a project to build a design recovery system called DESIRE (Biggerstaff, 1989; Biggerstaff et al., 1989).
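The following sketch caricatures the kind of mixed-initiative refinement loop described above: unresolved design elements are refined from a schema library when possible and referred to the software engineer otherwise. It is a schematic of the general designer/generator idea, written with invented types and names, and is not a rendering of ROSE's actual algorithms or data structures.

```cpp
#include <iostream>
#include <optional>
#include <string>
#include <vector>

// Hypothetical design element: either refined to an atomic schema or still open.
struct DesignElement {
    std::string conceptName;
    bool atomic = false;
};

// Hypothetical library lookup: returns a refinement if one is stored.
std::optional<DesignElement> lookupSchema(const std::string& conceptName) {
    if (conceptName == "sort-stream") return DesignElement{"merge-sort-schema", true};
    return std::nullopt;   // library is not populated for this concept
}

// Hypothetical fallback: ask the software engineer when the system is stuck.
DesignElement askEngineer(const std::string& conceptName) {
    std::cout << "please supply a design for: " << conceptName << "\n";
    return DesignElement{conceptName + "-manual-design", true};
}

int main() {
    std::vector<DesignElement> design = {{"sort-stream"}, {"report-writer"}};
    for (auto& element : design) {
        while (!element.atomic) {
            if (auto refined = lookupSchema(element.conceptName))
                element = *refined;                       // library-supplied schema
            else
                element = askEngineer(element.conceptName); // mixed initiative: ask the user
        }
    }
    for (const auto& e : design) std::cout << e.conceptName << "\n";
}
```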
3. Examples of Reuse Implementation Technologies

This section considers generic technologies that are not strictly speaking reuse technologies but are implementation technologies that enable reuse: (1) classification and library systems, (2) CASE tools, and (3) object-oriented programming systems. These enabling technologies are themselves broad-spectrum or horizontal technologies in that they can be used to enable reuse in virtually any application domain and enable either narrow- or broad-spectrum reuse. Nevertheless, because of their inherent generality and the fact that they easily allow the specification of small reusable components, they tend to orient toward broad-spectrum or horizontal reuse in their application.

3.1 Classification and Library Systems
The classification system for reusable components and the library (repository) system used to hold those components are two elements of a reuse infrastructure. These elements largely help define a logical structure for the application domain, which simplifies the job of finding reusable components and identifying components that need to be added to the library. A key classification research problem is how to organize the overall library. For any given domain, there often is no single canonical or ideal hierarchical classification for a given reusable component. If a component is classified under the function (or functions) it implements, then it becomes difficult to access based on other properties such as the kind of data that it operates on. Since it was recognized that one may want to find the same
component based on different properties, classification schemes have evolved that take the library science approach of allowing a component to be found based on any of a number of its properties (called “facets”) (Prieto-Diaz, 1989). The library system itself is a rather minor, though conspicuous, element of the reuse infrastructure. Its role can be viewed in two ways : (1) as a piece of technology that is key to the success of the reuse effort, or (2) as a piece of infrastructure whose main value is in establishing a process context and thereby enhancing the value of associated reuse technology. The author tends to believe that the second view is closer to the truth and that too much emphasis on the technical importance of the library system can lead one to focus too little on other more critical elements of the reuse project. To put the above notion in concrete terms, when a company is setting up a reuse effort, it is often easier to build a library system than to try to understand exactly what is really needed and what kind of technology best fits the company’s environment. In some cases, a manual library system may be a perfectly acceptable solution initially and the key technical innovations may lie in choosing and analyzing the appropriate domains. If a reuse proposal is only focused on the design of the library system, then it is quite possible that too little thought has been given to other more important aspects of the problem such as the specific domain and components to be reused. The library is not unimportant. It is just not the first thing to think about when planning and creating a reuse system.
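A minimal sketch of the faceted approach, assuming a simple in-memory index: each component is registered under several facets (its function, the data it operates on, and so on) so that the same component can be retrieved through any of them. The facet names and component identifiers are invented for illustration.

```cpp
#include <iostream>
#include <map>
#include <set>
#include <string>
#include <utility>

// Facet index: (facet name, facet value) -> set of component identifiers.
using FacetIndex = std::map<std::pair<std::string, std::string>, std::set<std::string>>;

void classify(FacetIndex& index, const std::string& component,
              const std::string& facet, const std::string& value) {
    index[{facet, value}].insert(component);
}

int main() {
    FacetIndex index;
    // The same component is reachable through more than one facet.
    classify(index, "quicksort.c", "function", "sort");
    classify(index, "quicksort.c", "data", "array");
    classify(index, "btree.c", "function", "search");
    classify(index, "btree.c", "data", "tree");

    for (const auto& id : index[{"data", "array"}])   // lookup by the data facet
        std::cout << id << "\n";                      // prints quicksort.c
}
```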
3.2 CASE Tools

Figure 12 characterizes CASE systems (Chikofsky, 1988, 1989; Fisher, 1988). Most CASE systems focus largely on the problem of drafting, i.e., providing engineering drawings that describe the design of software systems. They are most effective when used to design large-scale systems being developed by a team of designers.
FIG. 12. Characterization of CASE systems: mostly automated drafting systems; diagrams (data model, data flow, procedure); shared repository; document generation; consistency checking (weak); prototyping (e.g., screen layout); code generation (but NOT creation).
CASE systems provide several kinds of diagrams that support various design methodologies. For example, they typically provide diagrams that describe data relationships, e.g., data flow diagrams that show how data flow through the various parts of the system, and procedural representations from which the source code can be derived, sometimes semiautomatically. In addition, they typically provide a shared repository and help in managing the diagrams; document generation capabilities for including design diagrams in design documents; and various kinds of analyses that report on the state of the design. They often do some weak consistency checking, e.g., verifying that the right kind of boxes are connected to the right kind of arrows. Some CASE tools provide limited prototyping capabilities such as screen layout facilities. With these tools, one can design the screen interface to the target system and then generate the interface code, much like the forms designers discussed earlier. The major benefit of using a CASE tool is that the evolving design is recorded formally, e.g., in data flow diagrams, statecharts, predicate calculus, etc. The real value of CASE tools arises out of using these design representations as a working model during the development process. The act of using design formalism forces many design issues out into the open that would otherwise remain hidden until late in the design process. Moreover, it uncovers omissions in the design. But the most important effect is the migration of the design model into the heads of the designers. After all, it is the in-head knowledge that one uses during the whole developmental process. Productivity improvement with CASE tools is often modest. Some savings result because design updates are easy with CASE tools and because the design and the code are integrated and often managed by the CASE system. But overall, the direct savings are modest. The major, but indirect, benefits of CASE systems come during the testing and maintenance phases. Because the details of the target design are expressed early, the errors and the defects can be seen and detected early. This tends to lead to a higher-quality system with fewer defects to correct after the system has been delivered. The productivity improvement arises largely because postdelivery defects cost two orders of magnitude more to correct than those corrected during the design phase. It is difficult to evaluate CASE tools against our proposed set of reuse properties because these properties are more sensitive to the nature of the reuse application than to the use of CASE tools. Consequently, the productivity and quality improvement that result strictly from the reuse aspects of CASE is usually quite modest and is often overshadowed by the productivity and quality improvement due to early defect detection. An inherent value of CASE tools to reuse applications is the infrastructure support that CASE tools provide to the reuse process. Another inherent
value of CASE tools is that they tend to foster reuse of software designs in addition to reuse of code. Since designs are more abstract than code, they tend to have a higher opportunity for reuse and thereby have a higher payoff potential.
3.3 Object-Oriented Programming Systems

Object-oriented systems (Cox, 1986; Ellis and Stroustrup, 1990; Goldberg and Robson, 1983; Meyer, 1988; Saunders, 1989; Stroustrup, 1986, 1988) impose a structure on a program by developing a graph of related classes of objects (see Fig. 13). For example, one could define a rectangle as the class of displayable, graphic objects that are rectangular and relate it to its superclass of "graphic object," i.e., the class of all displayable, graphic things. Further, one could define a subclass of rectangle called a window, i.e., a displayable, graphical rectangle that has additional properties and behaviors over those of a rectangle. For example, a window is a displayable rectangle that can accept input from the keyboard and mouse, and produce output within its rectangular aperture. One could design other subclasses (i.e., specializations) of graphic objects such as circle, ellipse, and so forth. As shown in Fig. 14, each such class corresponds to some set of real-world objects. For the user interface classes, the real-world objects might be graphical manifestations that are drawn on a computer screen. For example, a rectangle could be part of a line drawing; or with certain additional characteristics, it might be a window; or with even more specialized characteristics, it might be a tiled window, i.e., a window with panes in it; or it could be a browser, i.e., a window that knows how to display graphs; and so forth.
FIG. 13. Example class hierarchy: Graphic Object (superclass); Rectangle and Circle (classes); Window, Tiled Window, and Browser (subclasses).
FIG. 14. Classes and real-world objects: Graphic Object, Rectangle, Ellipse, Window, Tiled Window, Browser.
Each class has two kinds of information associated with it, as shown in Fig. 15. One is state information that defines an instance of the class. For example, a rectangle class would have instance variables x and y that define the position of its upper-left corner. Further, it would have instance variables that define its length and width. The second kind of information associated with a class is a set of so-called methods that define the behavior of that class. These methods manage the
FIG. 15. Structure of classes: instance variables (state) plus methods such as display, origin, corner, center, border, and fill.
state information defined by the instance variables. Examples of such methods are display, which draws the rectangle on the screen ; origin, which returns the (x, y ) position of the rectangle on the screen; and so forth. One of the important object-oriented concepts is inheritance, which is also called subclassing and is illustrated in Fig. 16. The idea is that if I already have the definition of a rectangle and want to define something that is a specialized instance of rectangle, like a window, all I have to do is specify the additional data (i.e., instance variables) and behavior (i.e., methods) of a window over that of a rectangle. In other words, a window is something that has all of the same state information as a rectangle but, in addition, has some state specific to it. For example, a window might have a canvas containing a set of pixels to be displayed. Further, it might have a list of control facilities like buttons and scrollbars. In addition to the extra state information, a window may have additional methods. And it might also replace some of the rectangle’s methods (e.g., the display method of rectangle is replaced by the display method of window in Fig. 16). To put it more abstractly, classes represent the definition of the state and the behavior of the set of all objects of a given type. An individual member of that set of objects is called an instance of the class or alternatively, an object.
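A small C++ sketch of the hierarchy just described: the subclass inherits the superclass's state and behavior, adds instance variables and methods of its own, and replaces the display method. The member names loosely follow the figures and are illustrative only.

```cpp
#include <iostream>
#include <utility>
#include <vector>

class GraphicObject {                    // superclass
public:
    virtual ~GraphicObject() = default;
    virtual void display() const { std::cout << "graphic object\n"; }
};

class Rectangle : public GraphicObject { // class
public:
    Rectangle(int x, int y, int length, int width)
        : x(x), y(y), length(length), width(width) {}
    void display() const override { std::cout << "rectangle\n"; }
    std::pair<int, int> origin() const { return {x, y}; }
protected:
    int x, y, length, width;             // instance variables (state)
};

class Window : public Rectangle {        // subclass: inherits state and behavior
public:
    Window(int x, int y, int length, int width)
        : Rectangle(x, y, length, width) {}
    void display() const override {      // replaces Rectangle's display method
        std::cout << "window with " << scrollbars << " scrollbars\n";
    }
    void open() { std::cout << "window opened\n"; }
private:
    int scrollbars = 2;                  // additional state specific to windows
    std::vector<int> canvas;             // e.g., pixels to be displayed
};

int main() {
    Window w(0, 0, 200, 100);            // an instance (object) of class Window
    w.display();                         // uses Window's own method
    std::cout << w.origin().first << "\n"; // inherited from Rectangle
}
```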
FIG. 16. Subclassing and inheritance: Window inherits create, destroy, and dump (and the instance variable object-number) from Graphic Object, and origin, corner, center, border, and fill (and x, y, length, width) from Rectangle, while adding its own methods (display, title, scrollbars, move, expand, open, close, mouse, ...) and instance variables (canvas, buttons, bars, ...).
An instance is implemented as a data record that contains all of the state information that describes the individual object. This is illustrated in Fig. 17. Thus, a tiled window instance record would contain all of the state information unique to tiled windows, plus all the state information inherited from window, plus all the information inherited from rectangle, plus all the state information inherited from graphic object. The record containing all of that state information represents an instance of a tiled window. Now, when a method is called, it performs some operations that use or change the state information in the instance record. Examples of messages are display yourself, change your size, move the canvas under the window aperture, and so forth. One of the most important properties of object-oriented systems is that they impose an extra layer of design discipline over conventional languages. They allow one to formally express additional information about the architectural organization of a system beyond what one can express in a typical high-level language such as C or FORTRAN. More to the point, they insist on that architectural information. They insist that one cast the design of a system in terms of a set of related classes that correspond in a natural way to the real-world entities that the system is dealing with. This discipline helps one to develop a cleaner and more elegant design structure, in the main, because it forces the designer to explicitly think about the real-world entities and their interrelationships, and this enhances the reusability of the resulting classes. Another valuable property of object-oriented design is the fact that classes are natural reusable components. Because much of the state information is "hidden" (i.e., accessible only to the class's methods), the classes have fewer constraints that tie them to their original context and they can be easily
FIG. 17. Instances of classes: a tiled window instance record contains the state inherited from Graphic Object (object-number), Rectangle (x, y, length, width), and Window (canvas, buttons, bars), plus its own.
relocated and reused in new contexts in other programs. And because classes are conceptually larger-grained components than functions, their reuse tends to provide better productivity and quality improvement than reuse of functions. On the other hand, from a reuse perspective, classes are still relatively small-grained components and one would really like even larger-scale reusable components. Fortunately, object-oriented systems also provide a platform for creating larger-scale reusable components, called frameworks. A framework is a set of classes that taken together represent an abstraction or parameterized skeleton of an architecture. The final benefit of object-oriented development is inheritance. It reduces the amount of programming effort necessary. Because one already has some functionality defined in an existing class, building a specialization of that class is much simpler than starting from scratch. Some of the data and some of the functions are already written and can be inherited from the superclass. The reuse benefits of object-oriented programming systems are analogous to the reuse benefits of CASE systems. That is, the productivity and quality benefits are more sensitive to the existence of a reuse infrastructure than to the fact that object-oriented programming is involved. Nevertheless, we must admit that object orientation tends toward somewhat larger-scale reuse than function orientation and, therefore, there is a tendency toward improvements in productivity strictly due to the object orientation. Even so, object-oriented languages are inherently broad spectrum and tend to most easily enable small- or medium-scale component reuse. Therefore, the productivity and quality gains due strictly to the object orientation tend to be modest. It would be a guess substantiated only by intuition, but these gains would probably be in the 5-10% range. Additional productivity and quality benefits are derived from the reduction in defects that accrue from the cleaner designs that object-oriented programming styles foster. Still further benefits can be derived from domain-specific facilities that particular object-oriented languages or environments provide, for example, the rich user interface building blocks in languages such as SmallTalk. As with CASE systems, the infrastructure provided by object-oriented languages is of significant value in the course of implementing reuse libraries. Although significant additional work is required to implement a complete and useful reuse system, an object-oriented development environment provides a head start.
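Returning to the framework idea mentioned above, here is a minimal, hypothetical C++ sketch of a parameterized skeleton: the framework class fixes the overall control flow and leaves the domain-specific steps to be supplied by subclasses. It is far smaller than any real framework and all of the names are invented.

```cpp
#include <iostream>

// Framework: a skeleton of an architecture with the variable parts left abstract.
class ReportFramework {
public:
    virtual ~ReportFramework() = default;
    void run() {                 // fixed control flow supplied by the framework
        header();
        body();
        footer();
    }
protected:
    virtual void body() = 0;     // the application-specific part to be filled in
    virtual void header() { std::cout << "--- report ---\n"; }
    virtual void footer() { std::cout << "--- end ---\n"; }
};

// Reusing the framework means specializing only the missing pieces.
class InventoryReport : public ReportFramework {
protected:
    void body() override { std::cout << "42 widgets on hand\n"; }
};

int main() {
    InventoryReport report;
    report.run();
}
```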
In summary, there are a number of different reuse technologies that can improve productivity and quality in software development. Not all approaches are right for every organization, but among the approaches, it is very likely that most organizations can find something that will fit their culture and needs. There are no magic wands or silver bullets that will give an organization orders of magnitude improvement over all the software that it develops. But there are a number of individual approaches which, if used conscientiously within receptive contexts, will provide significant increases in productivity and quality.
4. Effects of Key Factors
The objective of this section is to explore the relations among the reuse success factors and, in the course of this exploration, to develop an analytical model that quantifies the relationship between some of the key factors and the productivity and quality benefits that they produce. We will also explore, in an intuitive manner, the relationship between specific reuse technologies and their potential for productivity and quality improvement.
4.1 Relationships among the Reuse Factors
Cost is arguably the most important metric in software development and it can be viewed as the sum of the following costs:
• Cost of developing new custom components for the target software.
• The taxes on the reused components (i.e., the amortized costs to develop and maintain the reusable components).
• Cost to assemble the components into a system (i.e., integration or "plumbing" costs).
• Cost to remove the defects from the target system, which breaks down into two costs: (1) the cost of removing defects from the component software and (2) the cost of removing defects from the integration software, i.e., the plumbing software.
Figure 18 shows how these various costs are affected by the key factors that we used to characterize successful reuse systems. We can see that among the independent factors, the degree to which the domain is understood, the breadth of the domain chosen, and the specific kind of reuse technology have an effect on three key, dependent factors: (1) the amount of reuse within the target application, ( 2 ) the scale of the components being reused, and (3) the intercomponent connection standards. These in turn affect several elements of the total cost. The larger the amount of reuse (i.e., the larger the proportion of the application built out of reusable components), the less one has to spend on developing new components for the target application. Similarly, the larger the proportion of reused components in an application,
FIG. 18. Relationships among key factors and cost: well-understood domains, narrow domains, the reuse implementation technology, stable technologies, and economies of scale in the market influence intercomponent standards, the scale of components, and the amount of reuse in the application, which in turn reduce plumbing, defect removal cost, the component reuse tax, and total software cost.
the less one has to spend on removing defects, because the reused components have fewer defects to start with. The number of defects in a reusable component generally decreases the more the component is reused. The scale of components typically affects the cost to assemble the components. Assembly of larger-scale components requires less plumbing and introduces fewer plumbing errors, both of which reduce costs. This is the same kind of cost reduction phenomenon seen in hardware: it is cheaper to build a device out of very large-scale integration (VLSI) components than out of small-scale integration (SSI) components. Finally, intercomponent standards reduce plumbing costs mainly by reducing the amount of specialized code that must be developed to hook components together. The more highly standardized the interconnections, the less effort it requires to assemble the applications out of components. The following section will examine this phenomenon analytically. Figure 18 should make it clear that the final effect on the software cost is wrought by a mixture of technology and business decisions. While it is important to carefully consider exactly what reuse technology is right for the organization and problem at hand, one must keep in mind that the effects of the best reuse technology can be nullified by ill-considered business decisions. For example, a poor choice of an application domain (e.g., one that the organization knows little about or one that is rapidly evolving), or a decision to accommodate too broad a part of the application domain, can overwhelm any productivity or quality improvement potential provided by the reuse technology. Therefore, while we focus much of our attention in this chapter on reuse technologies, successful reuse can only be achieved through good technology and business decisions.
The choice of reuse technology significantly affects two of the most important cost-influencing factors: the scale of the components and the percent of the application that can be built out of reused parts. One would like to know the answer to the following question: Is there a simple relationship between the reuse technology chosen and the productivity and quality benefits to be expected? The simple answer is no. The relationship is not really independent of the other factors (e.g., intercomponent standards). For example, one can make ill-conceived choices that result in poor intercomponent standards, which in turn lead to interconnection costs that overwhelm all other costs. Similarly, choosing too broad a domain can easily reduce the total amount of reuse to the point where the profit from reuse is minuscule compared to the costs to develop new application code. Nevertheless, intuition suggests that there is a relationship, given the assumption that choices for other factors are reasonable. We will assume a reasonably good domain choice with stable technology and components that have a high potential for reuse. Given these assumptions, there does seem to be a rough relationship between the technology chosen, the scale of the components implied by that technology, and the percent of the target application that is constructed out of reused components. And since the dollar savings to be realized from reuse correlates directly with the percent of the target application that is constructed out of reused components, we will express the benefits of reuse in terms of the potential percent of reuse in the target applications rather than dollars. Figure 19 is the author's perception of the relationship among technologies, component scale, and the percent of the target application that can potentially be built out of reusable components.* It is intended solely as a conceptual description and is not for estimation purposes. To this writer's knowledge, no one has yet done the empirical research necessary to establish a relationship between technology choices and the productivity and quality improvements. All other things being equal, technologies that fall in the upper right-hand portion of the diagram have the potential to provide large improvements in productivity and quality; i.e., generally more than 50% cost reduction. Those in the lower left-hand portion can provide 0-20% cost reduction, and those elsewhere in the chart are probably somewhere in between. However, let me remind the reader once again that this is at best an intuitionally based relationship that suggests potential, not one that guarantees specifics. It is easy in specific cases to manipulate the other factors to completely confound the relationship. Now let us take an analytical look at the relationship between some of the reuse factors.

* Since each of the technologies shown in Fig. 19 allows quite a bit of implementation flexibility, they are drawn as boxes to indicate ranges along both axes.
FIG. 19. Productivity and quality improvement estimating heuristic: component scale (SSR, MSR, LSR) versus the percent of reused code in the target application (toward 50% and 99%), with libraries of functions and libraries of objects at the low end and libraries of designs, designer/generators, user-oriented information systems, forms designers, and application-specific reuse ranging toward the upper right.
4.2 A Quantitative Model of the Relative Amount of Integration Code
This section introduces an analytical model to predict the effects of component scale and intercomponent standards on the plumbing costs and thereby, on the eventual profit wrought by reuse. We will do this by examining the amount of code required to connect the reused components into a target application program for various levels of component scale and intercomponent standards.

4.2.1 Definitions
Figure 20 defines the key model variables. Specifically, we want to determine PAC, the proportion of the total code in the application that is committed to connecting the reused components, because PAC is proportional to the overhead costs associated with reuse. The desired situation is where PAC is a very small proportion of the total code, ideally near zero. As predicted by our earlier qualitative analysis, this ideal is approached in the case of large components with good intercomponent standards. We will see that in the case of poor intercomponent standards, PAC can exceed 0.7 (i.e., 70%) of the total code in the target application, whereas with good intercomponent standards and relatively large components, PAC approaches
FIG. 20. Divisions of program containing reused components: RLOC, NLOC, and CLOC lines of code.
zero. However, even with good intercomponent standards, if the components are too small, PAC can be up to 0.5 (i.e., 50%) of the total code.
Table I contains the qualitative definitions of the model variables with the dimensions of each variable shown in parentheses. By convention, we will often use LOC in the text as an abbreviation for "lines of code."

TABLE I
DEFINITIONS OF VARIABLES IN MODEL

Inputs characterizing library
  ACC    Average connectivity complexity (LOC/connection)
  SC     Average scale (LOC)
Inputs characterizing target application
  AFI    Average Fan-In (connections)
  NLOC   Number of lines of new code (LOC)
  RLOC   Number of lines of reused code (LOC)
Outputs
  CLOC   Number of lines of connection code (LOC)
  NRC    Number of reused components in target application (no. of components)
  P      Ratio of new LOC to reused LOC (dimensionless)
  PAC    Proportion of connection code in target application (dimensionless)
  PAN    Proportion of new code in target application (dimensionless)
  PAR    Proportion of reused code in target application (dimensionless)
  TLOC   Total number of lines of code in target application (LOC)

ACC characterizes the interconnection standards of a reuse library.* It is the average number of lines of code that must be written to make a single
use of a component in the target application. It is a measure of the code that wires together the new code and data structures with the reused components and data structures, as well as the reused components and data structures with each other. If the data structures for conceptually equivalent domain entities are standard across the whole library and there is a pattern of standardization among the function interfaces, object protocols, etc. then ACC is small. As a trivial example, consider a set of routines that operate on strings where all strings used by all of the functions are stored in a standard format. The amount of code needed to use a string output by one of the functions as input to another function will be small. If the string formats required are different for the two functions, the amount of code to interface them will be significantly larger. While this example is trivial in comparison to real components, it illustrates the nature of the standards that we are discussing. In the best case, ACC is a single statement, which is the statement used to invoke the component. In the real world, this is seldom the case. Usually, the calling interface requires different forms of the data or requires data that is not readily available but must be computed before the component is invoked. Typically, the plumbing code characterized by ACC includes such things as computation of required data ; reorganization of existing data structures (e.g., transforming a zero-end-marker string into a length-tagged string); the creation of new data structures required for input, output, or operation ; database creation ; database operations ; file operations ; and so forth. This connectivity code can be extensive if the various data-related components hew to widely different standards. Although the ideal for ACC is one, it is often not achieved. An example serves to illustrate this. In order to reuse an existing parser, one often has to write a postprocessor that transforms the parse tree computed into a new form that fits the context of a different system. Often other computational extensions also need to be made for the new context. All of this code must be written from scratch and contributes to the average connectivity complexity for each of the components within the reuse library. The second model input variable that characterizes the reuse library is SC, the average scale (i.e., size in LOCs) of the components in the library. The other key inputs are defined by the target application program. AFI is the average number of connections required for a typical component. Each such connection requires ACC lines of code on the average. An example is in order to clarify the true nature of and relationship between ACC and AFI. Even though in the model we are considering average connections, a concrete example using individual connections and plumbing code will make the relationship clearer. Let us suppose that
f(x, y, z) is a reusable function. The plumbing code required to integrate f into a program consists of two parts: (1) the set of code that packages and formats the inputs to f (for example, x, y, and some global data structure g) and later unpackages and reformats any outputs of f (e.g., z and the data within g); and (2) the code that makes the data transfers happen, e.g., a call statement or a process spawn. If the packaging/unpackaging code is the same for every use of f in the program, then one can write functions to do the packaging and unpackaging, and amortize that code over the many invocations of f in the new program. On the other hand, if we have several distinct kinds of uses of f, each requiring packaging/unpackaging code that is so different that we cannot use a single set of functions to do the packaging/unpackaging, then we must amortize each distinct set of packaging/unpackaging code over its set of uses and use the average of those to compute ACC. Thus, only in the simplest case do the lines of code counted by ACC correspond to a specific programming artifact (e.g., a subroutine or function) within a target program. More generally, ACC represents some proportion of such artifacts averaged over many uses. The next two input variables define the number of lines of code in a target application program that are reused (RLOC) and new (NLOC). From these model input variables, we calculate CLOC, the number of lines of code required for connection of the reused components into the target application. TLOC (the total number of lines of code in an application) can also be calculated from these inputs, as can the various proportions of code types in the application: PAR (reused), PAC (connection), and PAN (new). The average number of components in a target application, NRC, can be computed from these variables. We are most interested in how PAC changes as we vary our assumptions about the degree of intercomponent standardization and the relative scale of the components. We introduce another variable P, which is the ratio of new code to reused code. This ratio is useful because we are less interested in the absolute magnitudes of NLOC, RLOC, and CLOC than in the relative proportions of these quantities and how those proportions change under differing sets of assumptions. The variables ACC, SC, AFI, RLOC, CLOC, and NLOC are ripe for empirical studies to characterize various reuse libraries and compare the library characterizations against the results of reusing components from those libraries in real application programs. This would provide some measure of goodness for reusable libraries and eventually result in standards against which reuse libraries could be measured.

* The ACC characterizes the expected number of lines of code needed to make a connection to a component. It is an average computed over many uses of a set of data-related components within different applications and is a convenient characterization of expected connectivity properties. It is not meaningful with respect to specific individual components or specific individual applications. It is the computed average over all components in a library and over a large number of reuse experiences of the total lines of connectivity code required in those applications divided by the total number of connections required in those applications.
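A hedged C++ sketch of the packaging and unpackaging just described: the reusable function expects its own data format, so each use pays some plumbing lines to convert to and from the host program's native format. The component, the two string formats, and the wrapper are invented purely to show where the lines counted by ACC come from.

```cpp
#include <cstring>
#include <iostream>
#include <string>

// Hypothetical reusable component: expects a zero-terminated C string.
std::size_t reusableLength(const char* s) { return std::strlen(s); }

// The host program keeps length-tagged strings, so plumbing code is needed.
struct TaggedString {
    std::size_t length;
    std::string chars;
};

std::size_t useComponent(const TaggedString& in) {
    // Packaging: convert the program's format to the component's format.
    std::string zeroTerminated = in.chars.substr(0, in.length);
    // The actual transfer (one line), then any unpackaging of the result.
    std::size_t result = reusableLength(zeroTerminated.c_str());
    return result;        // every line above the call contributes to ACC
}

int main() {
    TaggedString s{5, "hello world"};
    std::cout << useComponent(s) << "\n";   // prints 5
}
```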
4.2.2 The Model

The following equations define the relations among the variables:

TLOC = NLOC + RLOC + CLOC     (4.1)

PAR = RLOC / TLOC     (4.2)

NRC = (PAR * TLOC) / SC = RLOC / SC     (4.3)

CLOC = NRC * AFI * ACC = (RLOC / SC) * AFI * ACC     (4.4)

P = NLOC / RLOC     (4.5)

PAC = CLOC / TLOC.     (4.6)
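Since these relations are purely arithmetic, they are easy to experiment with. The short Python sketch below simply evaluates the definitions given above; the function name and the sample input values are invented for illustration and are not drawn from any measured reuse library.

    # A minimal sketch of the reuse-cost bookkeeping defined above.
    # Library inputs: ACC (average connectivity complexity, LOC per
    # connection) and SC (average component scale, LOC). Application
    # inputs: AFI (average fan-in, connections per reused component),
    # RLOC (reused LOC), and NLOC (new LOC). Values are illustrative.

    def reuse_model(acc, sc, afi, rloc, nloc):
        nrc = rloc / sc                    # number of reused components
        cloc = nrc * afi * acc             # connection ("plumbing") code
        tloc = nloc + rloc + cloc          # total application size
        return {
            "NRC": nrc,
            "CLOC": cloc,
            "TLOC": tloc,
            "PAR": rloc / tloc,            # proportion of reused code
            "PAC": cloc / tloc,            # proportion of connection code
            "PAN": nloc / tloc,            # proportion of new code
            "P": nloc / rloc,              # new-to-reused ratio
        }

    if __name__ == "__main__":
        # Hypothetical, well-standardized library of medium-scale components.
        print(reuse_model(acc=1, sc=500, afi=3, rloc=20000, nloc=20000))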
Now we work PAC into a form that is more amenable to approximation:

PAC = (NRC * AFI * ACC) / TLOC.

Using the first form of Eq. (4.3), we get

PAC = ((PAR * TLOC / SC) * AFI * ACC) / TLOC,

which allows us to cancel out the absolute quantity TLOC, leaving

PAC = (PAR * AFI * ACC) / SC.     (4.7)
We want Eq. (4.7) in a form that involves only AFI, ACC, SC, and P, so we reformulate PAR:

PAR = RLOC / TLOC = RLOC / (RLOC + NLOC + CLOC).
Using Eq. (4.4) for CLOC, we get

PAR = RLOC / (RLOC + NLOC + (RLOC * AFI * ACC) / SC)
    = (RLOC * SC) / (RLOC * SC + NLOC * SC + RLOC * AFI * ACC)
    = (RLOC * SC) / (RLOC * (SC + SC * P + AFI * ACC))
    = SC / (SC + SC * P + AFI * ACC)
    = SC / (SC * (1 + P) + AFI * ACC).     (4.8)
Substituting Eq. (4.8) into Eq. (4.7), we get

PAC = (SC / (SC * (1 + P) + AFI * ACC)) * (AFI * ACC) / SC.

Canceling out SC, we have a form that is good for approximation analysis:

PAC = (AFI * ACC) / (SC * (1 + P) + AFI * ACC).     (4.9)

Now let us consider three cases:

1. a library with poor interconnection standards,
2. a library with good interconnection standards but small components, and
3. a library with good interconnection standards and relatively large components.
For case 1, we define a library with poor standards as one in which ACC is equal to SC. In other words, it takes about as much code to make an interconnection to a reused component as is in the reused component itself, on the average. Substituting SC for ACC in Eq. (4.9) and canceling SC gives us

PAC = AFI / ((1 + P) + AFI).     (4.10)
Notice that the size of the component does not appear because of our destandardization assumption. This is not just an interesting theoretical case. Anecdotal evidence suggests that it often happens that SC and ACC are nearly the same in libraries of casually assembled components. Figures 21a and 21b show two views of how PAC is affected by various values of AFI and P for case 1. Notice in Fig. 21b in particular that where P = 0, in the limit, PAC approaches 1.0 as AFI approaches infinity. However, for all P, we see that the proportion of connection code grows as AFI grows. While for large Ps the relative amount of connection code decreases, it does so only because the relative amount of reused code is diminishing. This relative decrease in PAC is not cause for rejoicing, because the absolute amount of work may still be substantial. More to the point, the amount of work necessary to reuse code can be more than the amount of work required to rewrite the reused code from scratch. Looking at the PAC/PAR ratio,

PAC / PAR = AFI = Fan-In,

we see that since the fan-in must be at least 1, we always have to do at least as much work to connect the reused components as we would do to rewrite the reused code from scratch, and if the fan-in is greater than 1, we have to do more. Admittedly, case 1 is a boundary case, but we must remember that there is a neighborhood around this case where reuse does not really pay off, and one needs to structure one's strategy to avoid this neighborhood.

FIG. 21. Proportion of connection code for libraries with poor standards: (a) PAC versus P (new/reused ratio) for AFI = 1 to 4; (b) PAC versus AFI (average fan-in) for P = 0 to 4.

Case 2 is a library with good standards but relatively small components. We define good standards to mean that ACC = 1. Thus, Eq. (4.9) becomes

PAC = AFI / (SC * (1 + P) + AFI).     (4.11)
We define small components to mean that SC = AFI, or in other words, the size of the connection network for a component is about the same as the size of the component. This produces

PAC = 1 / (2 + P),     (4.12)

which is the same curve as that defined by AFI = 1 in Fig. 21 (case 1). Thus, the relative (and usually the absolute) amount of connection code is high. In fact, if we look at the ratio of the connectivity code to reused code, we see that we are writing as much new connectivity code as we are reusing:

PAC / PAR = AFI / SC = SC / SC = 1.
This is not a good deal. We might as well rewrite the components from scratch. However, the payoff significantly improves in the case of larger components, as in case 3. For case 3, we assume good library interconnection standards (i.e., ACC = 1) and relatively large components in comparison to their interconnections. Large components relative to their interconnections will be taken to mean SC >> AFI, and more specifically

SC = 10^n * AFI.

This is a convenient approximation because it provides a simple if approximate relationship between PAC and component scale. That is, for AFI near 1, n is approximately log10 (average component size), and for AFI near 10, n + 1 is approximately log10 (average component size), and so forth. Thus, n is a relative gauge of the component scale. If one makes a few simplifying assumptions about AFI's range, we have an independent variable that ranges over the reuse scale, namely, SSR, MSR, LSR, VLSR, etc. Thus, we can easily relate the approximate (average) amount of work involved in connection of reused components to the scale of those components.
Using this approximation, Eq. (4.9) becomes

PAC = AFI / (10^n * AFI * (1 + P) + AFI).

Canceling out AFI, we get

PAC = 1 / (10^n * (1 + P) + 1).     (4.13)

For n = 1, 2, 3, . . . , we get

PAC (n = 1) = 1 / (10 * P + 11)
PAC (n = 2) = 1 / (100 * P + 101)
PAC (n = 3) = 1 / (1000 * P + 1001)

and so forth. Thus, for n > 0,

PAC_approx = 1 / (10^n * (1 + P)).     (4.14)
We can see that for at least one order of magnitude difference between the component scale (SC) and the average number of connections (AFI), the amount of total connection code is below 10% (for n = 1 and P = 0) and well below that for larger n's. Thus, for libraries with good interconnection standards and large components, the amount of work involved in interconnection is small relative to the overall development. The payoff of reuse is seen quite clearly in this case by examining the ratio of connection code to reused code, which is approximately the inverse of the component scale for small AFI:

PAC / PAR ≈ 1 / 10^n.

Thus, the connection overhead is relatively small for MSR components and inconsequential for LSR components and above. Figure 22 summarizes the results of this analysis.

FIG. 22. Summary of case analysis. Poorly standardized libraries (case 1): PAC = AFI / ((1 + P) + AFI) and PAC/PAR = AFI = fan-in. Well standardized libraries with relatively small components (case 2): PAC = 1 / (2 + P) and PAC/PAR = 1. Well standardized libraries with relatively large components (case 3): PAC = 1 / (10^n * (1 + P) + 1), which is below 0.10 for all n >= 1 and all P >= 0, and PAC/PAR = 1 / 10^n.

4.2.3 Proportion of Reuse Code (Actual and Apparent)

If rather than just examining the proportion of interconnection code, we would like to know the proportion of reused code (and by implication the proportion of code to be developed), we can perform a similar set of algebraic manipulations to derive the formulas for PAR in each of the three
cases considered earlier. The results of these derivations are:

CASE 1:  PAR = 1 / ((1 + P) + AFI)

CASE 2:  PAR = 1 / (2 + P)

CASE 3:  PAR = 10^n / (10^n * (1 + P) + 1).
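For readers who want to explore these formulas numerically, the following sketch evaluates them side by side. It is a direct transcription of the three PAR formulas above; the sample values of AFI, n, and P are arbitrary and chosen only to show how the proportions behave.

    # Illustrative evaluation of the three PAR case formulas.

    def par_case1(afi, p):                 # poor standards: ACC = SC
        return 1.0 / ((1.0 + p) + afi)

    def par_case2(p):                      # good standards, small components
        return 1.0 / (2.0 + p)

    def par_case3(n, p):                   # good standards, large components
        return 10.0**n / (10.0**n * (1.0 + p) + 1.0)

    for p in (0, 1, 2):
        print(p,
              round(par_case1(afi=3, p=p), 3),
              round(par_case2(p), 3),
              round(par_case3(n=2, p=p), 3))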
The formula for case 3 is fairly complex to compute and it would be convenient to have a simpler approximation to PAR for case 3. The apparent proportion of application reuse (APAR) is a useful approximation. APAR is defined as
APAR = RLOC / (RLOC + NLOC),

which can be expressed as

APAR = 1 / (1 + P).
In other words, APAR ignores the connection code, assuming it to be small. Obviously, this approximation only works in some situations. The question is, under what circumstances is this a good approximation of PAR? Figure 23 shows the APAR curve in comparison with the PAR curves for case 2 and several parameterizations of case 1. It is clear from this figure that APAR is generally not a good approximation for either case 1 or case 2. However, for case 3, APAR is a pretty good approximation under most parameterizations. For n >= 2, the connectivity does not significantly alter the percentage of reused code and APAR is a good approximation. For n = 1, the worst case is when P = 0, and even in this case, the difference is only about 0.08. The remaining integral values of P (greater than 0) differ by no more than 0.02. For n = 0, the formula reduces to case 2.

FIG. 23. Proportion of reuse code (apparent and real), plotted against P (new/reused ratio); the AFI = 1 curve coincides with case 2.

This leads
to the following rule of thumb: If, on the average, the component scale (SC) is one or more orders of magnitude greater than AFI (the average interconnection fan-in) and the reuse library is well standardized (ACC is near 1), the connectivity code has no appreciable effect on the reuse proportions and APAR is a good approximation for PAR.
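The rule of thumb is easy to check numerically. The sketch below compares APAR = 1 / (1 + P) against the exact case-3 PAR for a few values of n; the grid of P values is arbitrary and only meant to expose the worst-case gap, which occurs at P = 0 and is on the order of 0.1 for n = 1.

    # Compare the apparent reuse proportion with the exact case-3 value.

    def apar(p):
        return 1.0 / (1.0 + p)

    def par_case3(n, p):
        return 10.0**n / (10.0**n * (1.0 + p) + 1.0)

    for n in (1, 2):
        worst = max(abs(apar(p) - par_case3(n, p)) for p in range(0, 8))
        print(f"n = {n}: largest APAR-PAR gap over P = 0..7 is {worst:.3f}")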
4.2.4 Effects on Defect Removal

In the previous sections, we have focused largely on the excessive plumbing costs that arise from poorly standardized libraries and small components. The analytical model also has cost avoidance implications with respect to defect removal that may be as great as or greater than the cost avoidance that accrues from well-designed reuse regimes. The important facts to note are:

- Since reused code has significantly fewer defects than new code, defect removal from reused code is usually significantly cheaper than from new code. It is not unusual for there to be anywhere from several times to an order of magnitude difference between these costs.
- Since connective code is new code, it will exhibit higher defect rates and, therefore, higher defect removal costs than reused code.
When considering the effects of reuse regimes on defect removal, the conclusions are the same as when considering the effects of reuse regimes on basic development, i.e., make the connective code be as small as possible, thereby making PAR as large as possible. Each line of reused code will cost several times (and perhaps even an order of magnitude) less for defect removal than a line of new code or connective code. Therefore, the less connective code we have, the better. Thus, we are drawn to the same conclusions as above: to make defect removal as inexpensive as possible, we need to standardize our libraries and use large components.
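The argument can be made concrete with a small cost model. In the sketch below, the relative per-line defect-removal cost of new and connective code versus reused code is captured by a single hypothetical factor k; the line counts and the value of k are placeholders, not measured data.

    # Rough defect-removal cost comparison for two hypothetical applications
    # of equal size but with different amounts of connective code.

    def defect_removal_cost(rloc, cloc, nloc, cost_per_reused_line=1.0, k=5.0):
        # k = relative cost of removing a defect from a new/connective line
        # versus a reused line (the text suggests several times up to ~10x).
        return (rloc * cost_per_reused_line
                + (cloc + nloc) * cost_per_reused_line * k)

    print(defect_removal_cost(rloc=20000, cloc=1000, nloc=20000))   # well standardized
    print(defect_removal_cost(rloc=20000, cloc=15000, nloc=20000))  # poorly standardized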
4.2.5 Conclusions from the Model

In summary, the conclusions drawn from our analytical model confirm those that we reached by qualitative argument and case study observations:

- Library standards (most often expressed in terms of application domain data structure and protocol standards) are effective in promoting reuse.
- Large components reduce the relative effort to interconnect reusable components in all but those libraries with the poorest level of standardization.
Therefore, the conclusion must be to develop large components (which tend toward domain specificity) and use a small set of (domain-specific) data structures and protocols across the whole library of components.
5. Futures and Conclusions

5.1 Futures

If I try to predict the future evolution of reuse, I see two major branches, vertical reuse and horizontal reuse, that break into several minor branches. In the vertical reuse branch, I see large-scale component kits becoming the "cut-and-paste" components for end-user application creation. That is, more and more applications will be constructed by using folders of facilities that are analogs of the clip art that is so widespread in today's desktop publishing. Of course, such end-user programming will have inherent limitations and therefore will not replace professional programming, only change its span and focus. The other major evolutionary branch within vertical programming evolution will be the maturation of application-specific reuse, which will evolve toward larger-scale components and narrower domains. This technology will be used largely by the professional programmer and will probably focus mostly on application families with a product orientation. Even though the productivity and quality improvements will be high, as with all vertical reuse technologies, the motivation in this case will be less a matter of productivity and quality improvement and more a matter of quick time to market. More and more software companies are succeeding or failing on the basis of being early with a product in an emerging market. As they discover that reuse will enhance that edge, they will evolve toward reuse-based product development. Interestingly, I doubt that any of the vertical reuse approaches will long retain the label "reuse"; more likely, the technology will be known by application-specific names, even though, in fact, it will be reuse. The second major evolution of reuse technologies will be in the area of horizontal reuse, and here I see two major branches: systems enhancements and enabling technologies. As technologies like interface toolkits, user-oriented information systems, and 4GL-related technologies mature and stabilize, they will become more and more part of the operating system facilities. This is not so much a statement of architecture, in that they will probably not be tightly coupled with the operating system facilities, but more a matter of commonly being a standard part of most workstations and PCs. In fact, a litmus test of the maturity of these technologies is the degree to which they
are considered a standard and necessary part of a delivered computer. One can see this kind of phenomenon currently happening with the X windows system. Within 10 or so years, it will probably be difficult and unthinkable to buy a workstation or PC that does not have some kind of windowing interface delivered with it. The other major branch of horizontal reuse is the set of reuse enabling technologies. More and more, these technologies will merge into a single integrated facility. The object-oriented language systems and their associated development environments (i.e., the integrated debuggers, editors, profilers, etc.) will be integrated with the CASE tools such that the design and source code become an integral unit. The CASE tools themselves will be enhanced by designer/generator systems to allow them to do increasingly more of the work for the designer/programmer by using reuse technologies and libraries. Finally, I expect to see both the CASE tools and programming language development environments merge with reverse engineering, design recovery, and re-engineering tools and systems. These reverse engineering, design recovery, and re-engineering tools all support the population of reuse libraries as well as the analysis, understanding and maintenance of existing systems. Without such systems, the reuse libraries will largely be empty and the technology impotent. These are the systems that allow an even more primitive kind of reuse, that of bootstrapping previous experience into formal reusable libraries and generalized reusable know-how. Thus, while horizontal reuse and vertical reuse will evolve along different paths, both will move from independent tool sets to integrated facilities, and consequently their leverage will be amplified.

5.2 Conclusions
There are no silver bullets in software engineering, and reuse is not one either, although it may come as close as anything available today. While not a silver bullet or cure-all, it does provide many opportunities for significant improvements to software development productivity and quality within certain well-defined contexts. If one understands where it works well and why, it can be a powerful tool in one's arsenal of software development tools and techniques.
Multisensory Computer Vision

N. NANDHAKUMAR*
Department of Electrical Engineering
University of Virginia
Charlottesville, Virginia

J. K. AGGARWAL†
Computer and Vision Research Center
College of Engineering
The University of Texas
Austin, Texas

* Supported in part by the Commonwealth of Virginia's Center for Innovative Technology under contract VCIT INF-91-007, and in part by the National Science Foundation under grant IRI-91109584.
† Supported by Army Research Office under contract no. DAAL-03-91-G-0050.

1. Introduction . . . 59
2. Approaches to Sensor Fusion . . . 63
   2.1 The Fusion of Multiple Cues from a Single Image . . . 63
   2.2 The Fusion of Information from Multiple Views . . . 68
   2.3 The Fusion of Multiple Imaging Modalities . . . 71
3. Computational Paradigms for Multisensory Vision . . . 86
   3.1 Statistical Approaches to Multisensory Computer Vision . . . 86
   3.2 Variational Methods for Sensor Fusion . . . 90
   3.3 Artificial Intelligence Approaches . . . 91
   3.4 The Phenomenological Approach . . . 94
4. Fusion at Multiple Levels . . . 99
   4.1 Information Fusion at Low Levels of Processing . . . 100
   4.2 The Combination of Features in Multisensory Imagery . . . 102
   4.3 Sensor Fusion During High-Level Interpretation . . . 103
   4.4 A Paradigm for Multisensory Computer Vision . . . 103
5. Conclusions . . . 105
   References . . . 107
1. Introduction
Automated analysis of digitized imagery has been an active area of research for almost three decades. Early research in this area evolved from signal processing schemes developed for processing one-dimensional signals. The science of describing and analyzing one-, two-, and three-dimensional
signals quickly became an established area of research. The area grew rapidly and was propelled by new theories and experimental findings in areas as diverse as cybernetics, artificial intelligence, mathematical modelling, human psychophysics, and neuro-physiological investigation. Moreover, the concomitant advances in technology made available increasingly sophisticated imaging sensors and greater computational power, which facilitated the implementation and verification (or refutation) of these new ideas. The development of automated image analysis techniques has also been driven by the urgent need for automating a variety of tasks such as equipment assembly, repair and salvage in hazardous environments, routine and repetitive inspection and monitoring, complex assembly operations that require sensing and interpretation of a scene, guidance and navigation of vehicles and projectiles, analysis of remotely sensed data, and so forth. All of these factors have provided great impetus to research in digital image analysis and have made possible the large and useful collection of knowledge that exists today in this exciting specialization of science and technology. Research in the automated analysis of digitized imagery may be grouped into three broad, loosely defined categories:

- Image processing: The development of digital signal processing techniques to restore, enhance and compress images. Several books have been published on this subject, including the ones by Gonzalez and Wintz (1987), Rosenfeld and Kak (1982), and Jain (1989).
- Pattern Recognition: The development of mathematical (typically statistical and structural) models for representing or modelling classes of patterns and optimal algorithms for classifying patterns. The books by Duda and Hart (1973), Fukunaga (1990), and Therrien (1989) contain detailed discussions of important aspects of this approach.
- Computer Vision: The development of scene and world models involving a hierarchy of representations, and algorithms for interpreting scenes based on computational models of the functional behavior of biological perceptual systems. The books by Marr (1982), Ballard and Brown (1982), Horn (1986), and Schalkoff (1989) describe important results established in this area of research.
These categories overlap considerably. For example, problems such as image segmentation have been addressed from various perspectives, and such research may be classified into any of the preceding categories, depending on the particular approach that is followed. While the term computer vision has been construed by some to mean the investigation of computational models of only the human visual system, its usage in current literature includes a variety of sensing (perceptual) modes such as active range imaging
and thermal imaging. Moreover, computational models developed for computer vision rely on a variety of formalisms such as computational, differential, or analytic geometry and Markov random field models, among others. In the following discussion, the term computer uision is used with the latter, broader definition in mind. It is well known that the human visual system extracts a greal deal of information from a single gray-level image. This fact motivated researchers to devote much of their attention to analyzing isolated gray-scale images. However, research in computer vision has made it increasingly evident that formulation of the interpretation of a single image (of a general scene) as a computational problem results in an underconstrained task. Several approaches have been investigated to alleviate the ill-posed nature of image interpretation tasks. The extraction of additional information from the image or from other sources, including other images, has been seen as a way of constraining the interpretation. Such approaches may be broadly grouped into the following categories: (1) the extraction and fusion of multiple cues from the same image, e.g., the fusion of multiple shape-from-X methods; (2) the use of multiple views of the scene, e.g., stereo; and more recently (3) the fusion of information from different modalities of sensing, e.g., infrared and laser ranging. Various researchers have referred to each of these approaches as multisensory approaches to computer vision. The order in which the approaches have been listed indicates, approximately, the chronological order in which these methods have been investigated. The order is also indicative of the increasing amount of additional information that can be extracted from the scene and that can be brought to bear on the interpretation task. Past research in computer vision has yielded analytically well-defined algorithms for extracting simple information (e.g., edges, 2-D shape, stereo range, etc.) from images acquired by any one modality of sensing. When multiple sensors, multiple processing modules, or different modalities of imaging are to be combined in a vision system, it is important to address the development of (1) models relating the images of each sensor to scene variables, (2) models relating sensors to each other, and (3) algorithms for extracting and combining the different information in the images. No single framework is suitable for all applications and for any arbitrary suite of sensors. The choice of a computational framework for a multisensory vision system depends on the application task. Several computational paradigms have been employed in different recent multisensory vision systems. The paradigms can be categorized as (1) statistical, (2) variational, (3) artificial intelligence, and (4) phenomenological approaches. Statistical approaches typically involve Bayesian schemes that model multisensory information using multivariate
probability models or as a collection of individual (but mutually constrained) classifiers or estimators. These schemes are appropriate when the domain of application renders probabilistic models to be intuitively natural forms of models of sensor performance and the state of the sensed environment. An alternative, deterministic, approach is based on variational principles wherein a criterion functional is optimized. The criterion functional implicitly models world knowledge and also explicitly includes constraints from multiple sensors. Adoption of this approach results in an iterative, numerical relaxation approach that optimizes the criterion functional. The complexity of the task sometimes precludes simple analytical formulations for scene interpretation tasks. Models relating the images of each sensor to scene variables, models relating sensors to each other, and algorithms for extracting and combining the different information in the images usually embody many variables that are not known prior to their interpretation. This necessitates the use of heuristic and empirical methods for analyzing the images. The development of complex interpretation strategies and knowledge representational mechanisms for using such methods has been intensively researched in the field of artificial intelligence (AI). Many of these ideas can be employed in the design of a multisensory vision system. Recently, research has been directed at using phenomenological models for multisensory vision. The models are based on physical laws, e.g., the conservation of energy. Such models relate each of the sensed signals to the various physical parameters of the imaged object. The objective is to solve for the unknown physical parameters by using the known constraints and signal values. The physical parameters then serve as meaningful features for object classification. This chapter highlights the different ideas mentioned previously that are currently being investigated. The chapter is not meant to be an exhaustive compendium of such work. In keeping with this objective, a comparison and review of some recently reported work is presented while describing briefly some rccent and popular approaches to sensor fusion. Section 2 provides a brief description of specific systems that adopt multiple sensors for vision. The systems described are broadly classified into three groups: (1) those that combine the outputs of multiple processing techniques applied to a single image of the scene, (2) those that combine information extracted from multiple views of the same scene using the same imaging modality, and (3) those that combine different modalities of imaging, different processing techniques, or multiple views of the scene. Section 3 discusses some general computational paradigms used for implementing multisensory scene perception. It also discusses typical applications of each of the approaches. Section 4 discusses issues pertaining to the hierarchical processing of multisensory imagery and levels of sensory information fusion. It presents a paradigm for a
model-based vision system incorporating fusion at multiple levels of processing. The paradigm described in Section 4 is not prescribed as a general paradigm for multisensory vision since a general paradigm does not, as yet, exist for all applications. Finally, Section 5 contains concluding remarks.

2. Approaches to Sensor Fusion

The term multisensor fusion has many connotations as described in the previous section. Approaches to combining multisensory imagery may be grouped into three broadly defined categories: (1) fusion of multiple cues from a single image, (2) integration of information from different views of a single scene, and (3) integration of different imaging modalities. Recent contributions in each of these three areas are discussed in this section.

2.1 Fusion of Multiple Cues from a Single Image

A great deal of past and current research in computer vision has focused on the extraction of information from a single image. Different techniques, such as texture analysis, contour analysis, and shape analysis, among others, were developed and applied separately to an image. These techniques offered specific solutions to artificially constrained problems that could be solved in the laboratory or with synthesized imagery. The complexity of real-world scenes limits the usefulness of these techniques for imagery acquired from real scenes. In general, each of these problem formulations is typically underconstrained, yielding ambiguous results. This motivated researchers to combine the output of several different operations on an image in an attempt to constrain the interpretation. Such efforts have been directed by engineering applications and have also been motivated by results of psychophysical investigations. The latter have shown that various biological perceptual systems combine the outputs of multiple processing modules to produce an interpretation of the scene, e.g., blob, terminator, and crossing detection modules are integrated to perceive texture (Julesz and Bergen, 1987). Presented in the following are examples of recent computer vision systems that follow this approach.

2.1.1 Visible Discontinuity Detection
Discontinuities in the intensity, texture, and orientation of surfaces imaged in a scene provide important information for scene segmentation, object classification, motion computation, etc. The reliable detection of visible discontinuities is, therefore, an important problem. A project that seeks to achieve this goal by combining the output of multiple discontinuity detecting
modules is the MIT Vision Machine (Poggio et al., 1988). Parallel modules compute zero crossings of the Laplacian of Gaussian filtered image, Canny's edge detection scheme, and texture. Other information extracted from stereoscopic analysis, optic flow computation, and color segmentation is also integrated. The approach is based on the argument that, at discontinuities, the coupling between different physical processes and the image data is robust. Hence, discontinuities are argued to be "ideal" for integrating information from different visual cues, and the system is motivated by psychophysical findings that support this position. The approach seeks to refine the initial estimates of discontinuities using information from several cues. The different discontinuity cues are combined in the MIT Vision Machine using a Markov random field (MRF) model. The MRF model facilitates sensor fusion. Consider a surface f and sparse observation g for this surface. Let f_i and g_i denote the corresponding values at site i in the image. The prior probability P(f) can be shown to be Gibbsian; i.e.,

P(f) = (1/Z) exp(-U(f)/T),     (1)

where Z is a normalizing constant, T is known as the temperature, and U(f) = Σ_i U_i(f) is the sum of contributions from every local neighborhood. Knowing the conditional probability of g given f, the posterior distribution is given by the Bayes theorem as

P(f | g) = (1/Z') exp(-U(f | g)/T),     (2)

where the energy function U(f | g) is given by

U(f | g) = Σ_C U_C(f) + Σ_i γ_i (f_i - g_i)²,     (3)

where C denotes the cliques defined for the neighborhood of site i that contain site j, and γ_i = 1 at sites where data are available. The problem is to search for the f that maximizes the posterior probability for the entire image. One solution strategy involves the application of simulated annealing and stochastic relaxation techniques (Geman and Geman, 1984). The prior energy function can be modified to include other sources of information, such as intensity edge information, texture, orientation, etc. For example, let l_ij be the output of a line detector that has output 1 if a linear edge exists between sites i and j and has value 0 otherwise. The energy function can then be
modified to be

U_C(f) = (f_i - f_j)² (1 - l_ij) + β V_C(l_ij),     (4)

where V_C is an operator that supports specified configurations of line edges. This operator may also be defined to support discontinuities detected from other sources of information, such as texture and orientation. Defining U_C(f) to include information from multiple sources of information is thus a convenient and popular way to exploit MRF models for multisensory vision. The limitations of the approach are many, as listed by the system's proponents (Poggio et al., 1988). Information integration may require goal-directed processing, which the current MRF-based approach does not provide. Also, the probabilistic formulation of MRF is too general and therefore may be too inefficient. Deterministic algorithms, such as regularization techniques, are preferred for this reason. A discussion of the advantages of deterministic approaches over stochastic approaches for visual reconstruction can be found in recent literature (Blake, 1989).
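To make Eqs. (3) and (4) concrete, the following toy sketch evaluates the posterior energy for a one-dimensional signal with a line process. It is only an illustration of the form of the energy: the clique potential V_C is simplified to a constant penalty for turning a line element on, and all values (the signal, the sparse data, and the weight beta) are invented.

    # Toy evaluation of the MRF energy with a line process (1-D version).
    # f: surface estimate; g: sparse data (None = no observation at a site);
    # lines[i] = 1 switches off smoothing between sites i and i+1.

    def energy(f, g, lines, beta=0.5):
        u = 0.0
        for i in range(len(f) - 1):
            # Smoothness term, disabled across a detected line, plus the
            # simplified line penalty (stands in for beta * V_C).
            u += (f[i] - f[i + 1]) ** 2 * (1 - lines[i]) + beta * lines[i]
        for fi, gi in zip(f, g):
            if gi is not None:             # gamma_i = 1 where data exist
                u += (fi - gi) ** 2
        return u

    f = [0.0, 0.1, 0.2, 2.0, 2.1]          # a surface with a large jump
    g = [0.0, None, 0.2, None, 2.1]
    print(energy(f, g, lines=[0, 0, 1, 0]))    # line at the jump: lower energy
    print(energy(f, g, lines=[0, 0, 0, 0]))    # no line process: higher energy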
2.1.2 Computing Shape

Many researchers address the use of shading information to compute the shape of an imaged surface. This problem is inherently underconstrained since brightness information at any pixel provides only a single constraint while surface orientation constitutes two degrees of freedom, i.e., (p, q), which denote the surface gradients along the x- and y-axes, respectively. Integrating other sources of information to constrain the solution has been an active area of research. One such commonly used piece of information is the assumption of smoothness (continuity) of the surface. This constraint allows the derivation of a method to grow a surface from points of known surface depth and orientation. The growth of the surface occurs along characteristic strips that are given by the solution of a system of five ordinary differential equations (Horn, 1986). This method is sensitive to noise, and it cannot use constraints from boundaries of the strip. An alternative to the characteristic strip expansion method that overcomes these limitations and also allows occluding contour information to be integrated as boundary conditions is based on a variational approach (Ikeuchi and Horn, 1981; Horn and Brooks, 1986). This approach seeks to minimize the deviation from smoothness and also the error in the image-irradiance equation. The stereographic plane is used instead of the gradient space. The conformal stereographic projection of the gradient space is defined as

f = 2p / (1 + √(1 + p² + q²)),   g = 2q / (1 + √(1 + p² + q²)).     (5)
Functions f(x, y) and g(x, y) are sought that minimize

∫∫ [ (f_x² + f_y² + g_x² + g_y²) + λ (E(x, y) - R_s(f, g))² ] dx dy,     (6)

where λ ≥ 0, E(x, y) is the image brightness, and R_s(f, g) is the reflectance map. The Euler equations for the preceding formulation consist of a pair of partial differential equations, the discrete forms of which specify an iterative relaxation approach for computing f and g. One drawback to the approach is that the resulting surface slopes may not be integrable. If z(x, y) is the surface being solved for, then integrability is defined by

z_xy(x, y) = z_yx(x, y),     (7)

viz., the second partial derivatives are independent of the order of differentiation. Methods for enforcing integrability in the solution of the surface are discussed by Frankot and Chellappa (1988) and Simchony and Chellappa (1990). The variational approach described previously, which uses the method of Lagrange multipliers to solve the constrained minimization problem, is also termed the regularization approach. The main objective of regularization is to transform ill-posed problems into well-posed ones. The variational approach is a convenient computational framework for incorporating multiple constraints and, hence, is an attractive strategy for implementing a multisensory computer vision system. The integration of the output of multiple texture analysis modules has been investigated by Moerdler and Kender (1987). Two shape-from-texture methods are integrated: (1) shape from uniform texel spacing, and (2) shape from uniform texel size. Their motivation for using multiple shape-from-texture modules is that a single module can be applied only to a very limited range of real images while the combination of different modules allows surface orientation estimation for a wider class of textured surfaces. In shape-from-uniform-texel size, two texels T1 and T2 are detected whose sizes are S1 and S2, respectively. If F1 is the distance from the center of texel T1 to the vanishing point (Fig. 1), then the measured sizes relate the two distances (Eq. (8)), where F2 = F1 - D. Since D can be measured from the image, we can solve for F1. In shape-from-uniform-texel spacing, three texels are detected (Fig. 2). The distance between the first texel and the vanishing point is given by a similar relation (Eq. (9)) involving the measured spacings between the texels.
FIG.1. Computing vanishing points using uniformly sized texels.
Each vanishing point circumscribes a great circle on the Gaussian sphere. Vanishing points extracted from different choices of texels and from applying multiple shape-from-texture approaches to the same surface patch contribute multiple great circles, the intersections of which specify two unique surface orientations corresponding to the visible and invisible sides of the surface. The integration of multiple surface orientation estimates from the different approaches is designed to yield a “most likely orientation” for each texel path. An “augmented texel” is used for the integration process. This is a data structure containing a 2-D description of a texel patch and a list of orientation constraints. A hierarchical representation consisting of multiple Gaussian spheres tessellated at different scales of resolution is used to fuse multiple orientation information. A Waltz-type algorithm computes the most likely orientation for each texel patch. Surface segments are then generated from this information. Performance of the system on real data has been reported (Moerdler and Kender, 1987).
FIG.2. Computing vanishing points using uniformly shaped texels.
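A highly simplified sketch of the fusion step is given below: each vanishing-point direction constrains the surface normal to be (nearly) perpendicular to it, and candidate orientations on a coarsely tessellated Gaussian sphere accumulate votes from all available constraints. The tessellation, tolerance, and sample directions are illustrative and far cruder than the hierarchical, multiresolution scheme described above.

    # Vote for a surface normal on a coarse Gaussian-sphere tessellation.
    import math

    def sphere_samples(step_deg=5):
        for az in range(0, 360, step_deg):
            for el in range(-90, 91, step_deg):
                a, e = math.radians(az), math.radians(el)
                yield (math.cos(e) * math.cos(a),
                       math.cos(e) * math.sin(a),
                       math.sin(e))

    def most_likely_normal(vanishing_dirs, tol=0.05):
        best, best_votes = None, -1
        for n in sphere_samples():
            # A candidate normal is supported by a constraint if it is
            # nearly perpendicular to that vanishing-point direction.
            votes = sum(abs(sum(a * b for a, b in zip(n, v))) < tol
                        for v in vanishing_dirs)
            if votes > best_votes:
                best, best_votes = n, votes
        return best, best_votes

    # Two hypothetical vanishing-point directions from two texture modules.
    print(most_likely_normal([(1.0, 0.0, 0.0), (0.0, 0.70711, 0.70711)]))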
2.2 The Fusion of Information from Multiple Views

Although the extraction and integration of multiple visual cues from an image does yield more information about the imaged scene, the extra information produces sufficient constraints for unique solutions in only a limited number of situations. This is especially true for the problem of reconstructing the three-dimensional structure of the imaged scene using techniques such as shape-from-shading or shape-from-multiple-texture modules. The problem of 3-D scene reconstruction benefits greatly from the use of multiple views of a scene. The extra information available from additional views is entirely due to the geometrical constraints that arise from the motion of the camera and object. The simplest example of integrating multiple views is stereoscopic depth perception. Figure 3 illustrates the main principle of using two cameras C1 and C2; their positions, orientations, and focal lengths are calibrated with respect to a fixed coordinate system (perhaps centered on one of the cameras). Consider an object P that projects onto image plane points P1 and P2 in C1 and C2, respectively. Since the cameras are calibrated, the vectors O1P1 and O2P2 are known, and hence, the intersection of these vectors can be computed to determine the 3-D coordinates of point P. The main problem in stereoscopic depth perception is to search for P2 in C2 given P1 in C1 such that both P1 and P2 correspond to projections of the same point P in 3-D space. This problem is termed the correspondence problem. The primary constraint used to solve this problem is that P2 must lie on the epipolar plane containing P1, where the epipolar plane is defined to be the plane
FIG.3. Stereoscopic depth reconstruction.
containing the two centers of projection, O1 and O2, and the point P. The intersection of the epipolar plane containing P1 and the image plane of C2 determines the epipolar line l2 on which P2 may be found. Additional constraints, such as the uniqueness of a match and smoothness of the imaged surface, are required to further constrain the establishment of correspondence. Several techniques have been developed for constraining the correspondence task. Dhond and Aggarwal (1989a) present a review of such techniques. A recently developed approach to facilitate correspondence relies on the use of a third camera C3 to create a trinocular imaging system. The image point P1 now specifies epipolar lines l2 as well as l3 as shown in Fig. 4. A candidate match P2 in C2 specifies another epipolar line l3' in C3. If P2 is a valid match, then a point P3 in C3 that is at (or very near) the intersection of l3 and l3' will have a similar intensity distribution when compared with P1 and P2. This condition signals a valid correspondence. Dhond and Aggarwal (1989b) analyze in detail the contribution of the third camera in aiding the correspondence process. The computation involved in establishing correspondence can be simplified further by rectifying the trinocular images (Ayache and Hansen, 1988; Ayache and Lustman, 1991), which involves applying linear image transformations to produce parallel, horizontal/vertical epipolar lines in the transformed images.
FIG.4. Trinocular imaging system.
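The geometric core of stereoscopic depth recovery, intersecting the two calibrated viewing rays, can be sketched in a few lines. The sketch below uses the common midpoint construction for two nearly intersecting rays; the camera positions and ray directions are made-up values, and a real system would work from full calibrated camera models and a solved correspondence.

    # Recover a 3-D point from two calibrated viewing rays (midpoint method).
    # o1, o2: projection centers; d1, d2: unit rays toward the matched points.

    def triangulate(o1, d1, o2, d2):
        dot = lambda u, v: sum(a * b for a, b in zip(u, v))
        diff = [o2[i] - o1[i] for i in range(3)]
        # Solve for ray parameters s, t minimizing the gap between the rays.
        a11, a12 = dot(d1, d1), -dot(d1, d2)
        a21, a22 = dot(d1, d2), -dot(d2, d2)
        b1, b2 = dot(diff, d1), dot(diff, d2)
        det = a11 * a22 - a12 * a21
        s = (b1 * a22 - a12 * b2) / det
        t = (a11 * b2 - b1 * a21) / det
        p1 = [o1[i] + s * d1[i] for i in range(3)]
        p2 = [o2[i] + t * d2[i] for i in range(3)]
        return [(p1[i] + p2[i]) / 2.0 for i in range(3)]

    # A point at (0, 0, 5) seen from two cameras placed on the x-axis.
    print(triangulate([-1, 0, 0], [0.19612, 0, 0.98058],
                      [1, 0, 0], [-0.19612, 0, 0.98058]))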
A generalization of the preceding problem is to compute the 3-D scene structure and relative motion given 2-D images from unknown positions. Solutions to these problems rely on geometric and projective constraints that typically yield a system of nonlinear equations. A vast amount of literature is available on these topics and hence they are not discussed here. For example, Aggarwal and Nandhakumar ( 1988) review techniques for estimating 3-D motion from a sequence of 2-D images. It is worth noting that the integration of information in such approaches is truly synergistic. In the case of stereoscopic analysis, for example, the integration of simple cues (such as 2-D coordinates of edges) extracted from each image via identical processing modules yields 3-D information that cannot be otherwise obtained. Research has also been conducted on the integration of information from multiple views as well as from multiple processing modules that analyze these views. For example, Krotkov and Kories ( 1 988) discuss the combination of focus ranging methods and stereo ranging techniques. An agile, servomotor driven camera system is controlled autonomously to orient and focus cameras and to adjust the illumination. The focus ranging and stereo processes cooperate to yield more reliable estimates of the depth of objects from the cameras. The integration of depth estimates from the two processes is based on a statistical framework that seeks to reduce the variance of the final estimate. Another system that integrates multiple cues extracted from an image with information extracted from multiple views is the MIT Vision Machine (Poggio et ul., 1988), mentioned in Section 2.1. The MRF formulation also is used to integrate range data extracted from stereoscopic analysis, as well as optic flow extracted from a sequence of images, with the output from other early vision modules. Aloimonos and Basu (1988) discuss the fusion of stereo, retinal motion, contour, shading, and texture cues for computing 3-D structure and motion information of the scene with minimal assumptions. They explore issues regarding the uniqueness and stability of solutions for different pairwise combinations of these sources of information. Moerdler and Boult (1 988) discuss the fusion of stereo and multiple shapefrom-texture modules for recovering three-dimensional surface information. Their objective for information fusion is to enhance the robustness of surface reconstruction. Information fusion occurs in two stages. The combination of multiple shape-from-texture modules is similar to that described by Moerdler and Kender (1987) and is termed intra-process integration. Moerdler and Kender argue that it is easier to heuristically combine data from similar processes. A regularization-based approach combines the output of this stage with stereo range data to produce smooth object surfaces. This latter process is termed interprocess integration. A blackboard scheme is proposed for interaction between the computational modules and the integration modules.
2.3 The Fusion of Multiple Imaging Modalities

It has been observed that the human visual system and other biological perceptual systems combine information from multiple monochrome visual images and from multiple processing modules operating on these images to produce a rich interpretation of the scene (Marr, 1982, Chapter 3). Research in computer vision, however, has shown that emulating this behavior functionally by using artificial means is a very difficult task. The approaches discussed in the previous sections continue to yield ill-conditioned formulations and produce very sparse interpretations. These problems may be lessened by using additional sensory information acquired via disparate sensing modalities that further limit the ambiguities in the interpretation. Such an approach has been motivated by two very different factors: (1) the recent availability of new sensing modalities, e.g., laser range and infrared; and (2) neurobiological findings that establish ways in which disparate sensory information is fused in natural perceptual systems, e.g., infrared and visual image fusion in snakes (Newman and Hartline, 1982) and the fusion of acoustic and visual imagery in barn owls (Gelfand, Pearson, and Spence, 1988). We present the salient features of several different research projects that combine multiple imaging modalities and that are motivated by either or both of these factors. We discuss the approaches used in research projects that are mature and integrate information in a nontrivial manner.
2.3.1 Different Components of Laser Radar Imagery

Chu, Nandhakumar, and Aggarwal (1988, 1990) developed a system that combines information from range, intensity, and velocity components of laser radar (ladar) imagery. The objective of the research is to detect and classify man-made objects in outdoor scenes. Each component of the ladar imagery is processed by different modules, and the resulting segmentation maps are fused to produce a composite segmentation map. The different modules process the image components based on the specific nature of the information contained in each image component. For example, the range image is segmented by using geometric analysis, i.e., by growing planar surfaces in the scene. Also, surface roughness parameters are extracted to help detect whether or not the region corresponds to a man-made object. Intensity imagery is analyzed to yield statistical properties of the speckle noise in the image. Different types of surfaces yield different types of speckle noise. Characterizing speckle helps distinguish between different types of surfaces.
The segmentation map and features extracted by the various modules are fed to an expert system for classification. The KEE expert system shell has been used for developing the rules for classification. The system has been tested on a large set of real multisensory ladar imagery obtained from outdoor scenes. The segmentation results compare favorably with those obtained by manual segmentation. Preliminary attempts at classifying man-made objects show promising results. Other modalities of imaging, such as infrared and millimeter-wave radar, are also being incorporated into the system. The block diagram of the system is shown in Fig. 5.
2.3.2 Structured Lighting and Contour Imagery
Wang and Aggarwal (1987, 1989) describe a system that combines information from both structured lighting and silhouettes (occluding contours) of the imaged object to reconstruct the three-dimensional structure of the object. A parallel projection imaging geometry is assumed. Multiple silhouettes from multiple views are rasterized in the direction parallel to the base plane. Each rasterized line segment is backprojected along the horizontal plane to intersect with backprojected line segments from other views. These intersections define a polygon on a plane parallel to the base plane. The stack of polygons corresponding to different parallel planes define the bounding volume description of the object (Fig. 6). The bounding volume description is then refined by using surface structure information computed from structured lighting. The computation of surface structure from light striping does not require correspondence to be established between the projected and sensed lines. Two orthogonal patterns are projected onto the object. Each pattern is a set of equally spaced stripes marked on a glass plate. Geometrical constraints are used to recover local surface orientations at the intersections of the two sets of mutually orthogonal grid lines. These local orientations are propagated along the lines to determine the global structure. Let the world coordinate axes be chosen such that the x-y plane is the base (horizontal) plane. Let the pan angle and elevation angle of the image plane normal be denoted by θI and ψI, respectively. Similarly, let the normal to the plane containing the grid lines (which are to be projected onto the object) make pan and elevation angles of θg and ψg, respectively. Also, let the orientation of the plane, which is tangential to the object surface at the point of interest, be denoted by (θo, ψo). Let v1 and v2 be orientations of the sensed stripes in the image plane reflected off the base plane. Let p1 and p2 be orientations of the sensed stripes in the image plane reflected off the point of interest on the object surface.
FIG. 5. Integrated analysis of the different components of ladar imagery. (Block diagram: signal processing server in C; collection of data statistics and data format conversion; estimating segment characteristics; knowledge base built from integrated segmentation map; symbolic reasoning in KEE and LISP; interpreted segmentation map and scene; proposed feedback loop.)
The first step involves computing (θ_I, γ_I) for the imaging configuration, with one constraint obtained from the first stripe pattern and another from the second stripe pattern (Wang and Aggarwal, 1987). Each constraint defines a curve on the Gaussian sphere. Four intersections of these curves provide four possible interpretations. The correct interpretation can be easily discerned by using a distinguishable marking (Wang and Aggarwal, 1987). The second step involves computing (θ_o, γ_o) for each point on the object where the grid lines intersect. Again, for the first stripe pattern

A sin γ_o + B sin θ_o cos γ_o = 0     (12)

and for the second stripe pattern

C cos θ_o cos γ_o + D sin θ_o cos γ_o + E sin γ_o = 0     (13)

where A, B, C, D, and E are known functions of (ρ_1, ρ_2), (θ_I, γ_I), and (θ_g, γ_g) (Wang and Aggarwal, 1987). Note that (θ_I, γ_I) and (θ_g, γ_g) are known a priori while (ρ_1, ρ_2) can be measured in the image plane. Each constraint defines a curve on the Gaussian sphere and intersections of these curves correspond to the solutions. A unique solution is available since the image plane orientation is known and since mirror reflections can be discarded. The orientation of the tangent plane at each stripe junction is propagated along the stripe lines using cubic spline interpolation. This allows the change in depth to be computed along sensed grid lines, thus providing surface structure. Note, however, that this process does not fix the position of the computed partial surface structure at a unique point in space. Occluding contour information from multiple views is used to position the partial surface structure in space. The partial surface structure computed from each view is used to refine the bounding volume computed from the occluding contours. The surface structure computed from light striping can be slid along the contour generating lines for that view. In order to constrain its position, the contour generating lines of a different view are used, along with additional geometrical
constraints as illustrated in Fig. 7 (Wang and Aggarwal, 1989). Radial lines are drawn from the centroid of the object to intersect the contour. Depending on the type of contour intersected, i.e., partial surface structure(s) or contour generating lines, different surface averaging operations are executed to coalesce the information into a single surface description. Hu and Stockman (1987) describe a more qualitative approach that uses grid coding as well as the intensity image. The stripes projected on the object yield qualitative surface shape information such as planar, convex, concave, etc. Correspondence between projected and sensed stripes is assumed and triangulation is used to compute the depth along the stripes. The intensity image yields boundary information. Boundaries are assumed to be one of five possible types, e.g., extremum, blade, fold. A rule-based system uses physical constraints between adjacent region types and separating contour types to label surface regions as well as the dividing contours.
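As a numerical illustration of the orientation recovery in Eqs. (12) and (13) (a sketch, not the procedure of Wang and Aggarwal), the two constraints can be intersected by a coarse search over the Gaussian sphere; the coefficients A through E below are hypothetical placeholders, since in practice they are computed from the measured stripe orientations (ρ_1, ρ_2) and the known angles (θ_I, γ_I) and (θ_g, γ_g).

import numpy as np

def solve_surface_orientation(A, B, C, D, E, steps=720, tol=1e-2):
    # Grid search for (theta_o, gamma_o) satisfying Eqs. (12) and (13).
    solutions = []
    for th in np.linspace(-np.pi, np.pi, steps):
        for ga in np.linspace(-np.pi / 2, np.pi / 2, steps // 2):
            eq12 = A * np.sin(ga) + B * np.sin(th) * np.cos(ga)
            eq13 = (C * np.cos(th) * np.cos(ga)
                    + D * np.sin(th) * np.cos(ga)
                    + E * np.sin(ga))
            if abs(eq12) < tol and abs(eq13) < tol:
                solutions.append((th, ga))
    return solutions

# Hypothetical coefficient values, for illustration only.
print(solve_surface_orientation(A=0.3, B=-0.5, C=0.2, D=0.4, E=-0.1))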
2.3.3 Color Imagery
Color may also be considered to be multisensory information since irradiation in three different spectral bands is sensed. Baker, Aggarwal, and Hwang (1988, 1989) address the problem of detection and the semantic interpretation of large stationary man-made objects, such as concrete bridges, in monocular color images of nonurban scenes. Their system consists of an expert system in which the higher-level interpretation stage is tightly coupled with the lower-level image analysis modules. Initial segmentation feeds cues to the higher level. Hypotheses are generated and the low-level modules are directed in an incremental segmentation that uses color and geometric information to verify the existence of instances of three-dimensional models. The color image of the scene is first converted to a monochrome gray-level image. A Laplacian of Gaussian (LoG) filter is applied to the image and the zero-crossings of the output are detected to form an edge map. Since the edge map yields closed boundaries, each closed region is assigned a distinct label. Straight line segments exceeding a predetermined threshold are detected. Appropriate pairs of parallel lines are then selected to detect rectilinear structures. Strict mathematical parallelism, however, cannot be used since this includes both collinearity, as a degenerate case, and lines with a common orientation in two distinctly separate and unrelated parts of the scene. Also, line pairs that are strictly parallel in 3-D are often not parallel in their 2-D perspective projection. This leads to the notion of perceptually parallel lines, that is, lines accepted as parallel for scene interpretation purposes. The perceptual grouping rule for parallel lines is defined as the following: find all line pairs with a similar orientation and significant overlap that
FIG. 7. Positioning partial surface structure using bounding volume and occluding contour, first from a single view and then using an additional view (Wang and Aggarwal, 1989).
are separated by a perpendicular distance less than half the average length of the two lines. Rectilinear structures are extracted by identifying those subregions of the intensity (gray-scale) image bounded, in part, by parallel line segments. Each pair of perceptually parallel line segments defines a rectangle, called an intrinsic rectangle. There are two categories of intrinsic rectangles, called atomic and nonatomic rectangles. An atomic rectangle is derived from perceptually parallel line segments bounding a single region. A nonatomic rectangle encompasses more than one region in the image, as shown in Figs. 8 and 9. If the intrinsic rectangle contains more than one label, the rectangle covers multiple regions and is rejected. The color of each atomic rectangle is then sampled and used to reject rectangles occurring in natural, i.e., not man-made, portions of the scene. The color representation scheme used in the system is the CIE (1978) recommended L*a*b* color space, which defines a uniform metric space representation of color so that unit perceptual distances can be represented by unit spatial distances. For each atomic rectangle, the average values of luminance L, chroma C, and hue H are computed from the red, green, and blue values as specified by the CIELAB transformation. The material composition of each region is estimated based on color characteristics. Each atomic rectangle
FIG. 8. An outdoor scene containing a concrete bridge (Baker et al., 1989).
is associated with each material and a confidence factor is assigned to that linkage. The confidence factor for a particular association between a rectangle and a material type is obtained from a color confidence function associated with that material type. Each confidence function for each material type is stored as a three-dimensional array indexed by color coordinates. The confidence functions may be considered to define volumes (of constant confidence values) in three-dimensional color space. Confidence factors are returned in the range [0, 1.0] and the Dempster-Shafer formalism is used for updating belief in classification (Shafer, 1976). This approach allows belief in other material types to reduce the belief that the material type of a region is concrete. Color constancy and fine brightness constancy control are handled within the encoding of the color confidence functions. The confidence functions are determined from training sets under supervised learning. The training phase involves the specification of the 3-D volumes of constant confidence values. The superquadric family of parametric volume representations is chosen for this purpose. The (L, H, C) data are obtained from training data consisting of intrinsic rectangles. Superquadric volumes are fit to these data to define the confidence functions. Values of the function are specified by a heuristic rule. Incremental segmentation is then performed. First, all obviously joinable regions are merged. Joinable regions are those that have central axes that are approximately parallel and that also have overlapping (artificially created) line segments on the nearer ends. The initial hypothesis generation is data driven from the material list in the segmenter graph. The interpreter attempts to instantiate as many instances of each bridge model as there are vertically oriented rectilinear concrete surfaces in the graph. The interpreter infers missing pieces in a complete model that has been instantiated. The missing piece is first instantiated, thus forcing a local (incremental) resegmentation of the scene and the creation of a new region. Verification of missing pieces is based on color information. During the verification process, the color confidence function is weakened to be able to accept a larger region of the color space as acceptable color values for the hypothesis being verified. Belief in the overall model is adjusted based on this additional information. Belief could be withdrawn if the model is later found to be inconsistent. The interpreter uses various constraints, including geometrical relationships between structural aggregates as well as the presence of shadows and the spatial relationships between the structural aggregates and shadows. A truth maintenance mechanism implemented within KEE retracts portions of the belief network that depend on assertions no longer believed. The interpreter cycles through the hypothesize and verify cycles until a complete model acquires a high measure of belief. Having detected a concrete bridge in the scene, the system then explores other structural aggregates in the image that
FIG. 10. Atomic rectangles corresponding to the concrete material type.
have not been associated with the verified model. Figure 10 shows the atomic rectangles, with color indicating the concrete material type. Figure 11 shows the results of the interpretation after the incremental segmentation and verification. Joinable regions have been appropriately joined and verified based on the instantiated model of a bridge. In Fig. 11, the interpreter has detected two bridges, the first partially occluding a second. Three structural aggregates on the extreme right were not joinable to the other bridge structures because of the occluding telephone pole in front of the bridges. Levine and Nazif (1985a, 1985b) describe a rule-based image segmentation technique that processes color imagery. Their approach consists of first partitioning the image into a set of regions to form a region map. Edges are also extracted to form a map of lines. The regions are then repeatedly split and merged, and lines are repeatedly added, deleted, and joined. An important aspect of their system is a focus of attention mechanism. This mechanism identifies "interesting phenomena" in the image, e.g., a group of large adjacent regions that are highly uniform, highly textured, etc. The focus of attention mechanism chooses the order in which data are selected on which rules are to be applied. The system thus incorporates a feedback mechanism in which the data specify the rules to be applied and the order in which the rules are to be applied.
FIG. 11. Final interpretation shows that two instantiated models have been verified. One bridge partially occludes another, which is behind the former. Joinable aggregates are joined by appropriate or verified structures. Several concrete material structures at the extreme right remain separated from either instantiated model.
Klinker, Shafer, and Kanade (1988) discuss the segmentation of objects using physical models of color image generation. Their model consists of a dichromatic reflection model that is a linear combination of surface reflection (highlights) and reflection from the surface body. The combined spectral distribution of matte and highlight points forms a skewed T-shaped cluster (in red-green-blue space) where the matte points lie along one limb of the T and the highlight points lie along the other limb. Principal component analysis of color distributions in small nonoverlapping windows provides initial hypotheses of the reflection type. Adjacent windows are merged if the color clusters have similar orientations. These form “linear hypotheses.” Next, skewed T-shaped clusters are detected. This specifies the dichromatic model used to locally resegment the color image via a recursive region merging process. Thus a combination of bottom-up and top-down processing segments images into regions corresponding to objects of different color. More recently, Healey (1991) reports on a color segmentation approach that uses a reflection model that includes metallic as well as dichromatic surfaces. The
segmentation algorithm considers the color information at each pixel to form a Gaussian random vector with three variables. Segmentation is achieved by a recursive subdivision of the image and by the analysis of resulting region-level statistics of the random vector. Jordan and Bovik (1988) developed an algorithm that uses color information to aid the correspondence process in stereo vision algorithms. Their work is motivated by psychophysical findings that indicate the secondary role of color information in human stereo vision.
2.3.4 Infrared and Visual Imagery
Nandhakumar and Aggarwal (1987, 1988a-c) present a technique for automated image analysis in which information from thermal and visual imagery is fused for classifying objects in outdoor scenes. A computational model is developed that allows the derivation of a map of heat sinks and sources in the imaged scene based on estimates of surface heat fluxes. Information integration is implemented at the different levels of abstraction in the interpretation hierarchy, i.e., at the pixel and the symbolic levels. Pixel-level information fusion yields a feature based on the lumped thermal capacitance of the objects, which quantifies the surface's ability to sink/source heat radiation. Region-level fusion employs aggregate region features in a decision tree classifier to categorize imaged objects as vegetation, building, pavement, or vehicle. Real data are used to demonstrate the approach's usefulness. The approach classifies objects based on differences in internal thermal properties and is tolerant to changes in scene conditions, occlusion, surface coatings, etc. The approach is suitable for applications such as autonomous vehicle navigation, surveillance, etc. The multisensory vision system Nandhakumar and Aggarwal (1987, 1988a-c) describe is largely a data-driven system. Oh, Nandhakumar, and Aggarwal (1989) and Karthik, Nandhakumar, and Aggarwal (1991) develop a unified modeling scheme that allows the synthesis of different types of images. In particular, they describe the generation of thermal and visual imagery as well as the prediction of classifier features used by the multisensory vision system of Nandhakumar and Aggarwal (1987, 1988a-c) for object recognition. The development of specific strategies for using the developed unified models for model-based multisensory vision is under investigation.
2.3.5 Range and Intensity Imagery
The integration of registered laser range and intensity imagery has been intensively researched. Gil et al. (1983, 1986) explore the extraction of edge
information by combining edges separately extracted from the range and intensity images. A more complete edge description of the scene is obtained by merging edges extracted from the two types of images. The combination of intensity edge information and 3-D information from range imagery is used to recognize objects (Magee and Aggarwal, 1985; Magee et al., 1985). Lines and curves are extracted from the intensity edge imagery. Range information corresponding to these features is used to specify their positions in 3-D space. A graph-matching approach is used to recognize objects where the nodes of the graph correspond to features and edges correspond to geometric relationships. The intensity-guided range-sensing approach is also extended for computing the motion of imaged objects (Aggarwal and Magee, 1986).
2.3.6 Range, Visual, and Odometry
Research in autonomous navigation at CMU has focused on the use of laser range sensors, color cameras, inertial navigation systems, and odometry for interpreting scenes, finding roads, and following roads (Stentz and Goto, 1987; Kanade, 1988). A software system called CODGER integrates the tasks of perception, planning, and control functions. The system implements three types of sensory functions: (1) competitive fusion occurs when sensors are of the same modality, e.g., vehicle position; (2) complementary fusion occurs when sensors are of different modality, e.g., stairs are identified by using color and range maps; (3) sensors are used independently, e.g., landmark recognition by using only the color camera.
2.3.7 Radar and Optical Sensors
Shaw, de Figueiredo, and Kumar (1988) discuss the integration of visual images and low-resolution microwave radar scattering cross sections to reconstruct the three-dimensional shapes of objects for space robotic applications. Their objective is to "combine the interpreted output of these sensors into a consistent world-view that is in some way better than its component interpretations." The visual image yields contours and a partial surface-shape description for the viewed object. The radar system provides an estimate of the range and a set of polarized radar scattering cross sections, which is a vector of four components. An "intelligent decision module" uses the information derived from the visual image to find a standard geometrical shape for the imaged object. If this is possible, then a closed-form expression is used to predict the radar cross section. Otherwise, an electromagnetic model uses the sparse surface description to compute the radar cross section
by using a finite approximation technique. The unknown shape characteristics of the surface are then solved for iteratively, based on minimizing the difference between the predicted and sensed radar cross sections. This technique is illustrated by a simulation reported by Shaw et al. (1988).
2.3.8 Sonar and Stereo Range Sensors
Mathies and Elfes (1988) discuss the integration of sonar range measurements and stereo range data for mobile robot applications. Occupancy grids are used for each ranging modality to represent the sensed information. The 2-D plane containing the sensors is tessellated into cells and each cell can have one of two states: occupied or empty. Sensor data update the probabilities of the states from multiple views of the scene. The probability updates are based on a Bayesian scheme where the prior probabilities of a sensor reading given the state of a cell are obtained from a probabilistic sensor model. The probabilistic model for the sonar sensor is defined by the beam pattern. The behavior of range error for a given disparity error defines the probabilistic model for the stereo range sensor. The integration of the two occupancy grids is based on the same Bayesian update scheme used for the individual occupancy grids. Experimental results illustrate the performance of this method using real data (Mathies and Elfes, 1988).
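A minimal sketch of this kind of cell-wise Bayesian update, written in the log-odds form that is commonly used for occupancy grids; the inverse sensor model p_occ (the probability that a cell is occupied given the reading) is assumed to come from the beam pattern or the disparity-error model mentioned above and is only a placeholder here. The same update also serves to merge the finished sonar and stereo grids.

import numpy as np

def update_occupancy(log_odds, p_occ, prior=0.5):
    # Fold one sensor reading (an inverse-sensor-model probability per cell)
    # into an occupancy grid stored as log-odds.
    p = np.clip(p_occ, 1e-3, 1.0 - 1e-3)
    return log_odds + np.log(p / (1.0 - p)) - np.log(prior / (1.0 - prior))

grid = np.zeros((50, 50))                 # log-odds 0.0 corresponds to P = 0.5
reading = np.full((50, 50), 0.5)          # uninformative everywhere ...
reading[20:25, 20:25] = 0.9               # ... except cells the beam marks occupied
grid = update_occupancy(grid, reading)
probabilities = 1.0 / (1.0 + np.exp(-grid))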
2.3.9 Multispectral Imagery
Bhanu and Symosek (1987) describe a knowledge-based system for interpreting multispectral images. The system uses five spectral channels of a 12-channel scanner. The channels are chosen based on a priori knowledge of their ability to discriminate between classes of objects, such as sky, forest, field, and road. Each of the five spectral images is processed by a texture boundary detector. The outputs are combined to form a single gradient image. Edge segments are detected by labeling local maxima of the gradient image. These edge segments are then grown to form closed contours. Statistics (mean and standard deviation) of each channel are computed for each region. Features based on region location and adjacency are computed. During interpretation, spectral and local features are used to first detect the sky. Then the remaining regions are analyzed using a pseudo-Bayesian approach based on relational, spectral, and location features. It is evident from the preceding that researchers are investigating a variety of sensing modalities and a variety of strategies for integrating multiple sensors. In the next section we describe general classes of techniques used to integrate multisensory information.
3. Computational Paradigms for Multisensory Vision
The previous section discussed specific systems, each of which incorporates a specific suite of sensors and attempts a particular vision task. We discussed ways in which multisensory information is fused in each system. This section discusses a more general issue, i.e., computational frameworks, each of which is suitable for a variety of multisensory vision tasks. The development of a single framework general enough to be applicable to different suites of sensors and to different vision applications has been considered in the past. However, the realization of this goal has yet to be achieved. Several specific approaches have been adopted for designing multisensory vision systems. The popular computational approaches may be categorized into the following broadly defined classes: (1) statistical integration, (2) variational approaches, (3) artificial intelligence (AI) techniques, and (4) phenomenological approaches. The basic principles in each of these approaches are presented.
3.1 Statistical Approaches to Multisensory Computer Vision
Several distinct statistical approaches have been explored for multisensory computer vision. The most straightforward approach utilizes Bayesian decision theory based on multivariate statistical models. Such techniques are especially widespread in the analysis of multispectral remote-sensing data. This approach typically consists of first forming a feature vector wherein each variable corresponds to the signal value (e.g., pixel gray level) from each sensor. This feature vector is then classified by a statistical decision rule. Other features, such as the mean intensity level in a neighborhood, contrast, second- and higher-order moments, entropy measures, etc., which are computed for each sensor, have also been used as elements of the feature vector; e.g., see Lee, Chin, and Martin (1985). In some techniques, linear or nonlinear combinations of signal values from different sensors form a feature, several of which are then fed to a classifier, e.g., Rosenthal, Blanchard, and Blanchard (1985). Other extensions to the standard statistical approach are reported, e.g., Di Zenzo et al. (1987) report a fuzzy relaxation labeling approach for image interpretation wherein a Gaussian maximum likelihood classifier provides initial probability estimates to the relaxation process. Different optimal classification rules have been developed for interpreting multisource data for each of a variety of statistical models assumed for the data. For example, consider s_i(x, y) to be the signal (feature) from the ith sensor at image location (x, y), and the feature vector S(x, y) to be defined as (s_1(x, y), ..., s_N(x, y))^T, where the number of sensors (features) is N. Let
P_k be the prototypical feature vector for class k. A simple classifier based on the minimum-distance rule will choose class c for pixel (x, y) if

[S(x, y) - P_c]^2 ≤ [S(x, y) - P_k]^2,   for all k ≠ c.     (14)
It is well known that the preceding classifier is optimal (maximizes the likelihood ratio) when S(x, y) are Gaussian random vectors, the s_i(x, y) are independent and identically distributed, the class covariance matrices are equal, and the cost associated with each possible misclassification is equal. It is possible to derive optimal classifiers for other choices of statistical models. Classifiers derived in such a manner, however, do not address the problem of choosing sufficiently discriminatory features from the infinite number of available features. Such approaches therefore suffer from the disadvantage that the global optimality of the feature set is impossible to guarantee. Also, the training of such classifiers is difficult since very large training data sets are warranted for achieving a reasonable error rate. It is also not clear what physical properties of the imaged objects are being utilized by the classifier during the discrimination process.
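Equation (14) translates directly into code; the sketch below classifies every pixel of an H x W image with N per-sensor features against K class prototypes. The prototype values are hypothetical.

import numpy as np

def min_distance_classify(S, prototypes):
    # S: (H, W, N) multisensor feature image; prototypes: (K, N) class means.
    # Returns an (H, W) map of the class index minimizing ||S - P_k||^2.
    diffs = S[..., None, :] - prototypes[None, None, :, :]   # (H, W, K, N)
    d2 = np.sum(diffs ** 2, axis=-1)                         # squared distances
    return np.argmin(d2, axis=-1)

S = np.random.rand(4, 4, 2)                         # two sensors, toy image
P = np.array([[0.1, 0.2], [0.5, 0.5], [0.9, 0.8]])  # three hypothetical classes
labels = min_distance_classify(S, P)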
3.1.1 Markov Random Field Models
MRF models provide a convenient framework for making local decisions in the context of those made in a local neighborhood. Appropriate forms of the prior probability density functions also allow the integration of different sources of information in making such contextual decisions. Consider the classification problem of assigning a label/state l(x, y) to a pixel at location (x, y). Let L denote the state assignment to the entire image. Let Y denote a specific set of multisensory data. The problem is to find the L that maximizes the posterior probability P(L | Y). Applying the Bayes theorem, the problem is equivalent to maximizing P(Y | L)P(L). The MRF assumption states that the conditional probability of l(x, y) given all the remaining labels L'(x, y) equals its conditional probability given only the labels in a local neighborhood of (x, y). This assumption renders the prior joint probability density function P(L) to be of the Gibbs form, i.e.,

P(L) = (1/Z) exp[-U(L)/T]     (15)

where Z is a normalizing constant, T is known as the temperature, and U(L) is known as the energy function, a sum of potential functions over the cliques of the neighborhood system,

U(L) = Σ_c F_c(W_c)     (16)

where F_c(W_c) is a function of the states W_c of the pixels in clique c. The image model is a two-dimensional analog of a one-dimensional hidden Markov model. While optimal solutions can easily be computed for the latter, searching for the optimal solution of the two-dimensional problem is computationally prohibitive (Geman and Geman, 1984; Therrien, 1989). Hence, suboptimal solutions that yield good results are typically used. One solution strategy involves the application of simulated annealing and stochastic relaxation techniques (Geman and Geman, 1984). An important feature of the MRF model that makes it suitable for multisensory computer vision is that the prior energy function U(L) can be modified to include other sources of information. For example, one of the potential functions constituting the prior energy function may be defined as
F_c(x, y) = [l(x, y) - l(k, l)]^2 - Σ_{i=1}^{N} β_i V_i[l(x, y), s_i(x, y)]     (17)

where (k, l) indexes a neighboring pixel in the clique, and the operator V_i measures the support provided by sensor s_i to the state/label l(x, y). The MIT Vision Machine implements a specific instance of this approach for integrating image discontinuities detected by different processing modules (Poggio et al., 1988).
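One common way to use a potential of the form of Eq. (17) in practice is an iterated-conditional-modes (ICM) sweep, a simpler suboptimal alternative to the simulated annealing cited above. The sketch below picks, at each pixel, the label that minimizes disagreement with its four neighbours minus the weighted sensor support; the support function and the weights β_i are placeholders.

import numpy as np

def icm_sweep(labels, sensors, support, betas, num_labels):
    # labels: (H, W) integer label map; sensors: list of (H, W) images;
    # support(label, sensor_value) returns a value in [0, 1].
    H, W = labels.shape
    new = labels.copy()
    for r in range(H):
        for c in range(W):
            nbrs = [labels[rr, cc]
                    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= rr < H and 0 <= cc < W]
            best_label, best_energy = labels[r, c], np.inf
            for l in range(num_labels):
                clique = sum((l - n) ** 2 for n in nbrs)          # label disagreement
                supp = sum(b * support(l, s[r, c])                # multisensor support
                           for b, s in zip(betas, sensors))
                energy = clique - supp
                if energy < best_energy:
                    best_label, best_energy = l, energy
            new[r, c] = best_label
    return new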
3.1.2 Multi-Bayesian Techniques
When a suite of sensors is used to collect and merge partial and uncertain measurements of the environment into a single consistent description, the sensors may be considered as a team that makes a joint decision by using complementary, competitive, and cooperative information (Durrant-Whyte, 1987, 1988). Having chosen appropriate probabilistic models for the sensors and the state of the environment, the interpretations from multiple sensors can be merged by using the Bayes decision theory. First, consider the case of multiple sensors sensing the geometric structure of the environment. If the environment contains known objects, then a network can be used as the model of the environment wherein nodes are geometric features (lines, surfaces, etc.) and sensor coordinate frames, and edges are geometric (uncertain) relations between nodes. Thus, the parameter vector p (e.g., the intercepts of straight lines) of the features/nodes is considered to be uncertain. Consider a set of observations {z_1, ..., z_n} of the environment where p and the z_i are Gaussian random vectors, p ~ N(p̄, Λ_p), z_i = p + v_i, and v_i is zero-mean Gaussian noise with covariance Λ_i. The posterior probability distribution f(p | z_1, ..., z_n) is jointly Gaussian, with mean given by the combination of the prior and the observations weighted by their inverse covariances, and covariance matrix given by the inverse of the sum of the inverse covariances.
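The explicit expressions for this mean and covariance are not reproduced in the chapter; the following sketch gives the standard information-form (inverse-covariance-weighted) combination for the model z_i = p + v_i with a Gaussian prior, which is what the description above amounts to. All numerical values are hypothetical.

import numpy as np

def fuse_gaussian(prior_mean, prior_cov, observations, obs_covs):
    # Posterior mean and covariance of p given z_i = p + v_i, v_i ~ N(0, Lambda_i).
    info = np.linalg.inv(prior_cov)
    vec = info @ prior_mean
    for z, cov in zip(observations, obs_covs):
        cov_inv = np.linalg.inv(cov)
        info += cov_inv
        vec += cov_inv @ z
    posterior_cov = np.linalg.inv(info)
    return posterior_cov @ vec, posterior_cov

# Two sensors observing a 2-D line-parameter vector.
mean, cov = fuse_gaussian(np.zeros(2), 10.0 * np.eye(2),
                          [np.array([1.0, 0.4]), np.array([0.8, 0.6])],
                          [0.5 * np.eye(2), 0.2 * np.eye(2)])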
When the observations are not all Gaussian, a clustering and filtering operation can be used to reject the outlying measurements to arrive at a consensus estimate of the parameter vector (Durrant-Whyte, 1988). Given the network world model that expresses geometric constraints, fusing a new set of sensor observations into the network requires uncertainties to be updated throughout the network. Durrant-Whyte (1988) describes a rational update policy that maintains Bayesianity and geometric consistency. Pearl (1987) also describes methods for propagating and updating belief in specific classes of distributed Bayesian networks. When the environment is unknown, the multisensor system can be considered a team of multi-Bayesian observers. Consider two sensors making observations z_1 and z_2 of two disparate geometric features p_1 and p_2. If a geometric relationship exists between the two features, then the local estimates δ_1(z_1) and δ_2(z_2) made by the sensors constrain each other. A utility function u_i[·, δ_i(z_i)] is required to compare local decisions. An appropriate choice of the individual utility function is the posterior likelihood
where p_i is a single feature being estimated. The team utility function may be chosen to be the joint posterior likelihood.
The team utility function may have either a unique mode or be bimodal. The former convexity property indicates that the sensors agree with the team consensus; the latter condition indicates that they disagree. Consider the transformation of the scene geometry p to individual features p_i by the transformation p_i = h_i(p). Denote the hypothesis of the scene geometry generated from a single feature p_i as p = h_i^{-1}(p_i). The inverse transformation is, in general, indeterminate. If each sensor makes individual estimates δ_i(z_i) of possible features, the sensor fusion task is to find p such that the joint
posterior density given by

F{p | h_1^{-1}[δ_1(z_1)], ..., h_n^{-1}[δ_n(z_n)]} = Π_{i=1}^{n} L{p | h_i^{-1}[δ_i(z_i)]}     (22)
is convex. Durrant-Whyte (1988) describes a recursive algorithm that implements a pair-wise convexity analysis to cluster agreeing hypotheses into different groups.
3.2 Variational Methods for Sensor Fusion
The integrated analysis of multiple sensors can sometimes be formulated as an optimization problem subject to multiple constraints. For example, depth information may be provided by multiple sensing techniques and the problem is to fit a surface while minimizing the deviation from smoothness. Analysis techniques available in the calculus of variations are typically applied to such problems. The method of Lagrange multipliers is used to integrate the constraints from the multiple sensors to form a new functional to be optimized (extremized). Consider the problem of solving for functions f_i(x), i = 1, ..., n, which have specified values at the boundaries x = x_1 and x = x_2. Given a criterion to be satisfied, e.g., smoothness, the approach consists of formulating an error functional to be minimized of the form

e = ∫_{x_1}^{x_2} F(x, f_1, ..., f_n, f_1', ..., f_n') dx.     (23)

The minimization is subject to the constraints

u_i(x, f_1, ..., f_n) = 0,   i = 1, 2, ..., m.     (24)
For example, if multiple range-sensing methods yield multiple estimates of depth z_k(x) and if f(x) is the required surface, then an appropriate form for u_k(x, f) is u_k(x, f) = [f(x) - z_k(x)]^2. Using the method of Lagrange multipliers, a new error functional of the form

e* = ∫_{x_1}^{x_2} Ψ dx,   with   Ψ = F + Σ_{i=1}^{m} λ_i(x) u_i     (25, 26)

is minimized, where the λ_i(x) are known as the Lagrange multipliers. Applying the variational principle, it can be shown that Eq. (25) is minimized by the solution to the following Euler equations (Courant and Hilbert, 1953):

∂Ψ/∂f_i - (d/dx)(∂Ψ/∂f_i') = 0,   i = 1, ..., n.     (27)
Discrete approximations of the Euler equations specify an iterative numerical solution for the unknown functions f_i(x). A very simple error functional is presented in the preceding for the sake of illustration. More useful formulations comprise multiple independent variables (multiple integrals), a functional Ψ expressed in terms of second- and higher-order derivatives of the f_i, and constraints that may be expressed in integral forms. The two-dimensional formulation is commonly used for combining multiple constraints and multiple sources of information in tasks such as surface extraction (Ikeuchi and Horn, 1981; Moerdler and Boult, 1988) and motion computation (Aggarwal and Nandhakumar, 1988). Euler equations are unavailable for all general forms of the error functional, and in general, they have to be derived for specific cases by using the variational principles. Note that the variational approach is a deterministic approach. One advantage of this approach is that it does not require the knowledge of prior probability models, as in the case of statistical approaches. However, a priori information is required in the variational approach and is implicit in the form of the specific error functional chosen, e.g., C1 smoothness of the surface.
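As a concrete discrete instance of Eqs. (23)-(27), the sketch below fuses two noisy one-dimensional depth profiles z_1(x) and z_2(x) into a single smooth profile f(x) by gradient descent on a functional containing a smoothness term and the quadratic data constraints u_k = [f(x) - z_k(x)]^2. The weights and step size are arbitrary illustrations.

import numpy as np

def fuse_depth_profiles(z_list, lambdas, smooth=1.0, step=0.1, iters=500):
    # Minimize sum_x smooth*(f[x+1]-f[x])^2 + sum_k lambda_k*(f[x]-z_k[x])^2.
    f = np.mean(z_list, axis=0).astype(float)
    for _ in range(iters):
        grad = np.zeros_like(f)
        grad[1:-1] += 2.0 * smooth * (2.0 * f[1:-1] - f[:-2] - f[2:])  # smoothness
        grad[0] += 2.0 * smooth * (f[0] - f[1])
        grad[-1] += 2.0 * smooth * (f[-1] - f[-2])
        for lam, z in zip(lambdas, z_list):                            # data terms
            grad += 2.0 * lam * (f - z)
        f -= step * grad
    return f

x = np.linspace(0.0, 1.0, 100)
true_depth = np.sin(2.0 * np.pi * x)
z1 = true_depth + 0.10 * np.random.randn(100)    # e.g., stereo estimate
z2 = true_depth + 0.05 * np.random.randn(100)    # e.g., structured-light estimate
fused = fuse_depth_profiles([z1, z2], lambdas=[1.0, 2.0])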
3.3 Artificial Intelligence Approaches
The complexity of the task sometimes precludes simple analytical formulations for scene interpretation. Models relating the images of each sensor to scene variables, models relating sensors to each other, and algorithms for extracting and combining the different information in the images usually embody many variables unknown prior to interpretation. This necessitates the use of heuristic and empirical methods for analyzing the images. Typically, the appropriate choices of techniques for processing the imagery are also not known a priori. Hence, the strategy for interpreting the images tends to be very complex. The nature of the task demands the use of iterative techniques that search for interpretations consistent with known analytical models as well as common-sense knowledge of the scene. These strategies are typically implemented as hypothesize-and-verify cycles of processing. A combination of data-driven and goal-driven processing is therefore required. Another complication involved in interpreting multisensory imagery is that
different kinds of information extracted from each of the sensors and information derived from combining the information are best represented and stored using different schemes. Maintaining these representations, as well as the explicit relationship between them, is difficult. The issues raised earlier, including those of complex strategies, knowledge representation, and the application of heuristic and empirical techniques, have been intensively researched in the field of artificial intelligence (AI). Such research has focused on general theories regarding these issues as well as on solutions to specific problems in which such issues are addressed (Nandhakumar and Aggarwal, 1985). The recent progress in artificial intelligence research has made available many useful computational tools for sensor fusion. The development of large data bases of heuristic rules and complex control strategies for combining multisensory data has been explored. Research issues focus on (1) developing new representational schemes for modeling the world and sensed information in a common framework that support reasoning and decision making, and (2) developing new interpretation strategies for specific sensor suites and applications. Typically, a specific set of sensing modalities is chosen for an application. Useful features are identified and algorithms for evaluating them are implemented. Rules are then used for examining the collection of these features to arrive at a consistent interpretation. An important aspect of the interpretation strategy is to decide on which area of the scene or subset of features to focus at some intermediate stage of processing, viz., the focus of attention mechanism. The given task, the choice of the features, and the interpretation strategy are usually instrumental in suggesting an appropriate world representation. No single AI framework has been shown to be optimal for a general collection of sensors and for all tasks. Hence, we present a survey of multisensory vision systems that are representative of different approaches to different specific tasks. A rule-based system that combines information from ladar range, ladar intensity, ladar Doppler, millimeter-wave radar, and passive infrared imagery for detecting and classifying man-made objects in outdoor scenes is being developed using KEE, a commercially available expert system shell (Chu et al., 1988, 1990). Frames are used in a hierarchical organization to represent individual regions and scene objects that are collections of regions (see Fig. 12). Slots in the frames correspond to region parameters and the attributes of objects. Rules are applied to the segmented input images to evaluate slot values for low-level frames. Rules are then applied to these frames to form groupings of frames corresponding to objects in the scene. In addition to this forward-chaining approach, it is also possible to implement different control strategies, such as backward chaining and truth maintenance. The KEE expert system shell has also been used for implementing a system
FIG. 12. Representation of regions and objects using frames in KEE.
that identifies structures in color images of outdoor scenes (Baker et al., 1988, 1989). Low-level processing yields cues that instantiate models. Model-driven processing refines the partial segmentation and extracts geometric and color features in the image to verify the instantiated model. A representation termed the multisensor kernel system (MKS) is proposed for a robot equipped with various types of sensors (Henderson and Fai, 1983). The representation of three-dimensional objects is built from information provided by "logical sensors," which provide 2-D and 3-D features extracted from visual and range images of the object. The logical sensor outputs are combined to form a feature vector of dimension k, where k is the number of logical sensor outputs. These vectors are nodes of a "spatial proximity graph." This representation is built by first ordering the collection of vectors into a tree structure based on a measure of distance between vectors and then linking nearest neighbors of the vectors to each other. Although the representation is argued to be general, it has been developed specifically for fusing visual and tactile data. It is unclear how suitable this approach is for a suite of highly disparate sensors. A schema-based approach for sensor fusion is proposed, based on experience gained by the researchers in developing the VISIONS system (Belknap, Riseman, and Hanson, 1986; Arkin, Riseman, and Hanson, 1988). The system is used to integrate information from sonar sensors and visual cameras and has been argued to be a useful test bed for experimenting with different perceptual strategies for robot navigation. The schema-based system allows top-down and bottom-up analyses. Initially discovered cues generate hypotheses. Focus of attention mechanisms then direct processing to
verify or discard these hypotheses. The system is described in detail for interpreting scenes based on combining the output of line-detecting and region-finding modules. A distributed blackboard approach has been proposed for sensor fusion in an autonomous robot (Harmon and Solorzano, 1983; Harmon, 1988). The blackboard is organized into a class tree. This hierarchical representation allows inheritance mechanisms, which are useful, for example, in maintaining geometric reference frames of various objects in the scene. Control statements, which are extended forms of production rules, are stored in the blackboard as separate objects and activated by a monitor that detects when condition field values of the rules are changed. The distributed system includes tools for performance monitoring and debugging. The system does not consider any specific algorithms for sensor interpretation and fusion. Applications of the system for autonomous welding and autonomous terrain-based navigation are reported. Hutchinson, Cromwell, and Kak (1988) describe a system that dynamically plans optimal sensing strategies in a robot work cell. An augmented geometric CAD model of the object is used. In addition to representing the object's 3-D structure, the model also includes a table of features that can be observed by each of the sensors, as well as an aspect graph of the object (Ikeuchi and Kanade, 1988). Sensors include a laser ranging device, fixed and manipulator-held video cameras, a force-torque sensor mounted on the robot's wrist, and the manipulators, which measure the distance between the robot's fingers. A wide variety of 3-D and 2-D features are extracted separately from each of these sensors. The initial set of features extracted from the imaged object form hypotheses of the object's possible positions and attitudes. The aspect graph is searched for the best viewing position to disambiguate the hypotheses. This viewing position is then chosen and the sensor(s) appropriate for sensing the features in the predicted aspects are applied. Hutchinson et al. (1988) describe the application of this technique to one object in the work cell. Luo and Lin (1987) proposed a system for fusing a wide variety of sensors for a robot assembly cell. Analysis and control of sensing is divided into four phases: "far away," "near to," "touching," and "manipulating." A probabilistic framework is used to fuse 3-D feature location estimates using measurements made in each of these phases. Experimental results illustrating the application of this approach to a real task are unavailable.
3.4 The Phenomenological Approach
The phenomenological approach is a recently developed computational approach for integrating multisensory information (Nandhakumar and
Aggarwal, 1987, 1988a-c). This approach relies on phenomenological or physical models that relate each of the sensed signals to the various physical parameters of the imaged object. The models are based on physical laws, e.g., the conservation of energy. The objective is to solve for the unknown physical parameters by using the known physical constraints and signal values. The physical parameters then serve as meaningful features for object classification. Denote sensed information as s_i. Each imaging modality (viz., physical sensor) may yield many types of sensed information s_i. For example, we may have s_1 = "thermal intensity," s_2 = "stereo range," s_3 = "visual intensity," s_4 = "visual edge strength," etc. Let I_{s_i}(x, y) denote the value of sensed information s_i at any specified pixel location (x, y). For the sake of brevity, I_{s_i} will be used instead of I_{s_i}(x, y) in the following. Each source of information is related to object parameters and ambient scene parameters, collectively denoted by p_i, via a physical model of the following form:

I_{s_i} = f_i(p_1, p_2, ..., p_N)     (28)
where N is the total number of scene and object parameters. Note that for each f_i, only a subset of the entire set of parameters has nonzero coefficients. Examples of p_i include visual reflectance of the surface, relative surface orientation, material density, and surface roughness. In addition to the preceding, various natural laws may be applied to interrelate the physical properties of the objects, e.g., principles of rigidity and the law of the conservation of energy. These lead to additional constraints of the following form:

g_j(p_1, p_2, ..., p_N) = 0.     (29)
Let K denote the set of all p_i known a priori, either by direct measurement (e.g., ambient temperature) or directly derivable from an image (e.g., surface temperature from a thermal image). Let U denote the set of all p_i not directly measurable. Obviously N = #(U) + #(K), where #(U) denotes the cardinality of set U. To solve for the unknown parameters, we need a total of at least #(U) independent equations in the form of (28) or (29) that contain elements of U. Note that, in general, the equations are nonlinear, and hence solving them is not straightforward. Also, it may be possible to specify a larger number of equations than required, thus leading to an overconstrained system. An error minimization approach may then be used to solve for the unknowns. Consider the integration of spatially registered and calibrated thermal and visual imagery using such an approach (Nandhakumar and Aggarwal, 1987, 1988a-c). The gray-level L_t of the thermal image provides information
regarding surface temperature T_s. The relation is of the form of the blackbody radiation law integrated over the sensor band,

L_t = K_1 ∫_{λ_1}^{λ_2} [C_1 λ^{-5} / (exp(C_2/(λ T_s)) - 1)] dλ + K_2     (30)
where K_1, K_2, C_1, and C_2 are constants, and [λ_1, λ_2] is the spectral bandwidth of the sensor. Assuming Lambertian reflectance in the visual spectrum, the gray-level L_v of the visual image is related to the surface reflectance ρ and the incident angle θ as
L_v = K_3 W_s ρ cos θ + K_4     (31)

where K_3 and K_4 are constants, and W_s is the intensity of irradiation (W/m²) on a surface perpendicular to the direction of irradiation. The principle of the conservation of energy applied to the surface equates the absorbed energy W_abs (in the visual spectrum) to the sum of the conducted, convected, and radiated energies (W_cd, W_cv, and W_rad, respectively; see Fig. 13). This energy balance constraint is expressed as

R = (a_2 q_2 + a_3 q_3) / (q_1 - 1)     (32)

where
q_1 = W_abs / W_cd
q_2 = (T_s - T_amb) / W_abs
q_3 = σ(T_s^4 - T_amb^4) / W_abs
a_2 = h, the convection coefficient
a_3 = ε, the surface emissivity
W_abs = K(1 - ρ) cos θ.
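Once the scene parameters are fixed, these relations can be evaluated at every pixel; the sketch below computes the conducted-to-absorbed flux ratio W_cd/W_abs (the feature called R in the discussion that follows) directly from the energy balance. The ambient temperature, convection coefficient, emissivity, and irradiation value are hypothetical, and the actual system estimates several of these quantities from the imagery.

import numpy as np

SIGMA = 5.67e-8   # Stefan-Boltzmann constant (W m^-2 K^-4)

def heat_flux_ratio(T_s, rho, cos_theta, T_amb=295.0, h=10.0,
                    emissivity=0.9, W_s=900.0):
    # R = W_cd / W_abs from the surface energy balance.
    W_abs = (1.0 - rho) * W_s * cos_theta                   # absorbed irradiation
    W_cv = h * (T_s - T_amb)                                # convected flux
    W_rad = emissivity * SIGMA * (T_s ** 4 - T_amb ** 4)    # radiated flux
    W_cd = W_abs - W_cv - W_rad                             # conservation of energy
    return W_cd / np.maximum(W_abs, 1e-6)

# Per-pixel maps: surface temperature from the thermal image, reflectance and
# relative orientation from the visual image (hypothetical values).
T_s = np.full((4, 4), 305.0)
rho = np.full((4, 4), 0.3)
cos_theta = np.full((4, 4), 0.8)
R = heat_flux_ratio(T_s, rho, cos_theta)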
FIG. 13. Surface energy exchange (Nandhakumar and Aggarwal, 1988c).
FIG. 14. Equivalent thermal circuit of the imaged surface (Nandhakumar and Aggarwal, 1988c).
From these equations, it is possible to compute R at each pixel. R is an estimate of the ratio W_cd/W_abs and, therefore, is a measure of the object's relative ability to act as a heat sink/source. The value of R is closely related to that of the object's lumped thermal capacitance (Fig. 14). Hence, R is a physically meaningful feature for object classification. Figure 15 shows a block diagram of the sensor integration scheme. Figure 16 shows the visual image of a scene. Figure 17 shows the thermal image. Figure 18 shows the mode of the values of R computed for each region. Figure 19 shows the output of a decision tree classifier that uses R and other image-derived features. This phenomenological approach is extended for analyzing a temporal sequence of thermal and visual imagery (Nandhakumar, 1990). Nandhakumar (1991) discusses robust methods of solving for the parameters occurring in Eq. (32). The phenomenological approach is also suitable for a variety of other sensor suites and domains of application. For example, the interpretation of underwater visual and sonar imagery described by Malik and Nandhakumar
FIG. 15. Overview of approach for integrated analysis of thermal and visual imagery (Nandhakumar and Aggarwal, 1988c).
FIG. 16. Visual image of the scene (Nandhakumar and Aggarwal, 1988c).
(1991) follows such an approach. The phenomenological model used for this application is based on the conservation of acoustic energy propagating through the interface between two fluids. Roughness information extracted from visual imagery is used, along with acoustic backscatter information, to estimate the physical parameters of the imaged surface, such as compressional wave speed and material density ratios. These parameters are shown to be useful features for material classification. The integrated analysis of radar and optical sensors, described by Shaw et al. (1988), is also based on a phenomenological approach. The principal difference between the phenomenological approach and the others is that the former seeks to establish physically meaningful features for classification. The other approaches seek to establish optimal classification strategies without regard to the optimality of the feature set. An emerging technique yet to be explored in any detail relies on connectionist ideas (Bolle et al., 1988) and on principles of artificial neural networks (Pearson et al., 1988; Gelfand et al., 1988). Very little work is reported on
FIG. 17. Thermal image of the scene (Nandhakumar and Aggarwal, 1988c).
the use of such approaches for sensor fusion. Neural mechanisms for sensor fusion discovered in several primitive natural perceptual systems are likely candidates for emulation by such approaches, although at this point the problem remains a very difficult one to solve.
4. Fusion at Multiple Levels
A computer vision system that uses single or multiple sensors to classify objects in a scene typically implements the following sequence of operations: (1) segmentation of image(s) and detection of features, (2) evaluation of feature attributes and values, and (3) classification/interpretation of features. Many variations of course do exist to this sequence of operations. For example, segmentation may be incomplete and partial segmentation may be iteratively refined based on interpretation. The computation of feature values may also be iteratively enhanced, and higher-level models may guide these operations. These modifications do not drastically change the approach to
FIG. 18. Mode of the heat flux ratio for each region (Nandhakumar and Aggarwal, 1988c).
the interpretation task, and the preceding paradigm is generally followed in most vision systems discussed in the literature and in the previous section of this chapter. It is obvious that one could use multiple sources of information in each of these operations to improve the performance of each module and, thus, that of the entire system. This aspect, i.e., the fusion of multisensory information at different levels of analysis, is discussed here. Examining recently reported systems from this perspective engenders a new paradigm for fusing information at different levels of analysis in a multisensory vision system.
4.1 Information Fusion at Low Levels of Processing
Asar, Nandhakumar, and Aggarwal (1990) describe an example of a technique that combines information at the lowest levels of analysis. Their technique segments scenes by using thermal and visual imagery. Image pyramids are grown separately for the thermal and visual images. Regions are grown in the thermal image at a reduced image resolution. Contrast information extracted from the visual image is used to control this region-growing process. The labels are propagated to the highest resolution image by using links in the visual pyramid to form the final segmentation.
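A much-simplified sketch of the idea (not the linked-pyramid implementation of Asar et al.): at a coarse resolution, neighbouring pixels are merged when their thermal values are close and the co-registered visual image shows low contrast across the boundary, so that visual edges stop thermal region growing. The thresholds are arbitrary.

import numpy as np

def grow_regions(thermal, visual, t_thresh=2.0, v_thresh=15.0):
    # Union-find region growing on a coarse thermal image, vetoed by visual contrast.
    H, W = thermal.shape
    parent = list(range(H * W))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    def union(i, j):
        parent[find(i)] = find(j)

    for r in range(H):
        for c in range(W):
            for dr, dc in ((0, 1), (1, 0)):
                rr, cc = r + dr, c + dc
                if rr < H and cc < W:
                    t_diff = abs(float(thermal[r, c]) - float(thermal[rr, cc]))
                    v_diff = abs(float(visual[r, c]) - float(visual[rr, cc]))
                    if t_diff < t_thresh and v_diff < v_thresh:
                        union(r * W + c, rr * W + cc)

    return np.array([find(i) for i in range(H * W)]).reshape(H, W)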
FIG. 19. Output of decision tree classifier (Nandhakumar and Aggarwal, 1988c).
Duncan, Gindi, and Narendra (1987) describe another approach to segmenting scenes using multisensory data. The different noise characteristics of the two sensors used are exploited to yield the best estimate of edges in the images. A deterministic hill-climbing approach is adopted in a sequential search for the next edge pixel. The approach chooses one image over the other depending on the noise present at each location. The metric used is image variance about a candidate edge or boundary pixel. The method has been demonstrated on one-dimensional signals, and extensions to two-dimensional images are discussed. However, no results are shown for two-dimensional images. Also, it is unclear whether the technique works in cases where occlusions exist. Duncan and Staib (1987) discuss a model-driven approach for the segmentation of multisensory images. A probabilistic framework is employed. Edges and edge segments are extracted from the different images. Trial contours are generated and ψ-s curves are computed for each contour. Disagreements between the trial contours extracted from the different images prompt the application of the model in searching the images for better trial contours. The search, however, consists of a local monotonic optimization approach and is susceptible to failure in the presence of local minima.
The composite gradient image extracted by Bhanu and Symosek (1987) from five channels of a multispectral imager is also a case where low-level sensor fusion is exploited for improved scene segmentation. The many segmentation methods that rely on color features may also be grouped under this category.
4.2 The Combination of Features in Multisensory Imagery
A great deal of research in multisensory computer vision has dealt with combining features extracted from the different sensors' outputs. Each sensor's output is processed separately to detect features. The extracted features are combined with one of two objectives in mind: (1) to produce new features different in type from those extracted from each sensor, or (2) to increase the reliability of the features extracted from each imaging modality. These two approaches are illustrated with examples. A typical example of the former approach is stereoscopic perception, where intensity edge locations are integrated to yield depth estimates. The computed 3-D information is different in nature from the 2-D information extracted from each image. The extraction of structure and motion parameters from a sequence of monocular intensity images also belongs to the former class of approaches. The images need not be produced by the same sensing modality. An example of such a system is the one described by Nandhakumar and Aggarwal (1987, 1988b). In this system, surface temperature values extracted from a thermal image are combined with surface shape and reflectivity values extracted from the corresponding visual image of the scene to estimate values of internal thermal object properties used as features for object classification. The other approach, which is distinct in its objective from that just described, integrates multiple values of one type of feature as sensed by different sensors to improve the accuracy of the final estimate. Typical examples of such an approach are systems that compute more reliable surface reconstructions by combining the surface estimates produced by different methods, e.g., the fusion of shape-from-texture and stereo outputs using a blackboard scheme combining the information (Moerdler and Boult, 1988). The combination of structured lighting techniques to compute surface shape with contour analysis to determine the location of the computed surface is another example of the latter approach (Wang and Aggarwal, 1989). An analogous approach is that followed by Shaw et al. (1988) in which surface shape is hypothesized by the visual image and radar cross-section scattering models verify and refine the reconstructed object shape. The MIT Vision Machine also conforms to this approach by integrating edge information in
the form of edges or discontinuities detected in the outputs of various modules, such as optic flow, texture analysis, etc. The objective is to produce a denser and more reliable map of discontinuities in the scene. In contrast to these examples where images were strictly visual, Chu et al. (1990) describe a technique for segmenting registered images of laser radar range and intensity data, and for combining the resultant segmentation maps to yield a more reliable segmentation of outdoor scenes into natural and man-made objects. The combination of the segmentation maps involves first partitioning regions in one map with those in the other and then using various heuristic rules to merge regions, as sketched below.
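The first step of such a combination, partitioning the regions of one segmentation by those of the other, can be written compactly; every unique pair of co-occurring labels becomes a new region, after which merge heuristics of the kind Chu et al. (1990) describe could be applied. The relabelling scheme below is just one convenient choice, and the example maps are hypothetical.

import numpy as np

def intersect_label_maps(labels_a, labels_b):
    # Partition the regions of one segmentation by those of the other:
    # each unique (label_a, label_b) pair becomes a new region label.
    pairs = labels_a.astype(np.int64) * (int(labels_b.max()) + 1) + labels_b
    _, inverse = np.unique(pairs.ravel(), return_inverse=True)
    return inverse.reshape(labels_a.shape)

range_seg = np.array([[0, 0, 1], [0, 1, 1]])   # range-based segmentation
inten_seg = np.array([[0, 1, 1], [0, 1, 2]])   # intensity-based segmentation
print(intersect_label_maps(range_seg, inten_seg))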
4.3 Sensor Fusion During High-Level Interpretation

Features extracted by separately processing the different images, and also those computed based on combining information at low and intermediate levels of analysis, as discussed earlier, may be combined at the highest levels of analysis during the final stages of interpretation. The system described by Nandhakumar and Aggarwal (1988a, 1988c) discusses the fusion of information at the intermediate and higher levels of analysis. Aggregate features for each region in the image are evaluated separately for the thermal and visual images of outdoor scenes. A feature based on integrating information from the thermal and visual images at an intermediate level of analysis is also computed, and an aggregate value of this feature for each region is computed. All these features are then considered together, during the final interpretation, by a decision tree classifier that labels regions in the scene as vegetation, buildings, roads, and vehicles.

The CMU NAVLAB project also implements the fusion of information at higher levels of processing (Kanade, 1988). The range image is segmented into surface patches. The reflectance image is processed to yield lines. This information is combined to detect road edges for navigation. The colors and positions of the regions are used to further classify regions in the scene using an expert system. Dunlay (1988) adopted a similar approach wherein color imagery is processed separately using a simple color metric to extract road boundaries. The 3-D locations of the road boundaries are computed assuming a planar road surface. These are overlaid on the range image to limit the search for obstacles on the road, which are detected as blobs in the range image.
4.4 A Paradigm for Multisensory Computer Vision

We now outline a model-based paradigm for multisensor fusion. We illustrate this paradigm by outlining a recently reported system that combines thermal and visual imagery for classifying objects in outdoor scenes. Information fusion from the different imagery occurs at different levels of analysis in the system (see Fig. 20).
FIG. 20. Sensor fusion at various levels of analysis.
At the lowest levels, thermal and visual imagery are combined to extract meaningful regions in the scene (Asar et al., 1990). A pyramidal approach is adopted for segmentation, as outlined in Section 4.1. The thermal image is then analyzed to produce estimates of surface temperature while the visual image produces estimates of surface shape and reflectivity. This information is combined at the intermediate stages of analysis via a phenomenological scene model, which is based on the law of the conservation of energy. Scene variables, such as wind speed, wind temperature, and solar insolation, are used in the model to relate surface temperature, shape, and reflectivity to an internal thermal object property, i.e., thermal capacitance (Nandhakumar and Aggarwal, 1987, 1988b). The physical model allows the estimation of heat fluxes at the surface of the imaged objects. A feature based on these surface fluxes yields insight into the relative ability of the object to act as a heat sink or heat source. This feature is evaluated at each pixel of the registered thermal and visual image pair. Thus information fusion at this intermediate level is synergistic and results in a new feature useful in identifying scene objects (Nandhakumar and Aggarwal, 1987, 1988b).

A representative value of this feature based on surface heat fluxes is chosen for each region by computing the mode of the distribution of this feature value for each region. Other aggregate features from each imaging modality, for each region, are also computed separately. These include the average region temperature and surface reflectivity. These features are used in a decision tree classifier to assign labels to the regions. The labels are vehicle, vegetation, road, and building. Thus, information from the two imaging modalities is again combined during this high-level interpretation phase (Nandhakumar and Aggarwal, 1988a, 1988c).

Another important component of the system is the object modeling approach, which consists of a unified 3-D representation of objects that allows the prediction of the thermal image and the visual image as well as the surface heat fluxes and, hence, the features used in classification (Oh et al., 1989; Karthik et al., 1991). The model is constructed from multiple silhouettes of objects, and the model can be "edited" to include concavities, internal heat sources, and inhomogeneities. Currently, the models used in each of the levels of analysis are different, and the classification task is based on feature values lying in fixed ranges. The system is being extended to use the predictions provided by the unified object models to guide the interpretation phase.
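The following sketch illustrates the flavor of this final classification stage. The feature names, thresholds, and branching order are invented for illustration; the actual feature ranges and decision-tree structure of the reported system are not reproduced here.

    import numpy as np
    from dataclasses import dataclass

    @dataclass
    class RegionFeatures:
        heat_flux_mode: float      # mode of the surface-heat-flux feature over the region
        mean_temperature: float    # aggregate thermal feature for the region
        mean_reflectivity: float   # aggregate visual feature for the region

    def feature_mode(values, bins=32):
        """Representative per-region value: the mode of the feature histogram."""
        hist, edges = np.histogram(values, bins=bins)
        i = int(hist.argmax())
        return 0.5 * (edges[i] + edges[i + 1])

    def classify_region(f: RegionFeatures) -> str:
        """Toy decision tree over the fused features; thresholds and branching
        order are illustrative only, not those of the reported system."""
        if f.heat_flux_mode > 0.75:                  # strong heat sink/source behavior
            return "vehicle" if f.mean_temperature > 30.0 else "building"
        if f.mean_reflectivity < 0.2:
            return "road"
        return "vegetation"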
5. Conclusions

The advantages of multisensory approaches to computer vision are evident from the discussions in the previous sections. The integration of multiple sensors or multiple sensing modalities is an effective method of minimizing the ambiguities inherent in interpreting perceived scenes. The multisensory approach is useful for a variety of tasks including pose determination, surface reconstruction, object recognition, and motion computation, among others. Several problems that were previously difficult or even impossible to solve because of the ill-posed nature of the formulations are converted to well-posed problems with the adoption of a multisensory approach. We discussed specific formulations that benefit from such an approach.

The previous sections presented an overview of recent ideas developed in multisensory computer vision and a comparison and review of some recently reported work. We classified existing multisensory systems into three broadly defined groups: (1) those that combine the output of multiple processing techniques applied to a single image of the scene, (2) those that combine information extracted from multiple views of the same scene by using the
same imaging modality, and (3) those that combine different modalities of imaging, different processing techniques, or multiple views of the scene. We presented examples of several systems in each category. We discussed several commonly used computational frameworks for multisensory vision and presented typical applications of such approaches. The chapter categorized computational frameworks as statistical, variational, artificial intelligence, and phenomenological. We discussed issues pertaining to the hierarchical processing of multisensory imagery and the various levels at which sensory information fusion may occur.

Finally, we presented a paradigm for a model-based vision system incorporating the fusion of information derived from different types of sensors at low, intermediate, and higher levels of processing. We discussed the specific case of integrating thermal and visual imagery for outdoor scene interpretation. However, the principles embodied in this approach can be generalized to other combinations of sensor types and application domains. At the lowest levels of analysis, multisensory information is combined to segment the scene. At intermediate levels of analysis, a phenomenological scene model based on physical principles, such as the conservation of energy, is used to evaluate physically meaningful features. These features are combined at the highest levels of analysis to identify scene objects. This paradigm emphasizes the optimality and physical significance of features defined for object recognition. Such an approach simplifies the design of classifiers and yet ensures the required performance. The phenomenological approach has been applied to a limited number of application domains. Its advantages in other application areas remain to be verified. We cited recent research in the fusion of sonar and visual imagery for underwater scene classification as another successful implementation of this paradigm. The paradigm was not presented as the preferred paradigm for all vision tasks. Instead, it was meant to illustrate the various issues that need to be addressed in designing a multisensory vision system. Another paradigm, based on a connectionist or artificial neural network approach to multisensory vision, also remains to be investigated in detail.

Recent and continuing developments in multisensory vision research may be attributable to several factors, including (1) new sensor technology that makes affordable previously unexplored sensing modalities, (2) new scientific contributions in computational approaches to sensor fusion, and (3) new insights into the electrophysiological mechanisms of multisensory perception in biological perceptual systems. Most of the progress to date may be attributed to the second factor. The development of new, affordable sensors is currently an important and active area of research and may be expected to have a significant future impact on the capabilities of vision systems. For example, the availability of low-cost imaging laser ranging sensors, passive
infrared sensors, and high-frequency radar imagers would provide significant impetus to research in developing multisensor-based autonomous navigation, object recognition, and surface reconstruction techniques. Many lessons from nature are yet to be learned from neurophysiological and psychophysiological studies of natural perceptual systems. Such studies may provide useful clues for deciding what combinations of sensing modalities are useful for a specific task, and they may also provide new computational models for intersensory perception.

Many multisensory vision tasks are very computation intensive. Hence, while significant milestones have been established in multisensory computer vision research, the development and application of practical multisensory vision systems in industry, defense, and commerce have not, as yet, been completely successful. The continual increase in performance of available computational hardware may be expected to provide additional impetus to the development of practical multisensory vision systems for "real-world" applications. Highly parallel computer architectures may also meet the computational demands placed on multisensory strategies. The development of such architectures, the automatic identification of parallelism inherent in multisensory vision tasks, and strategies for exploiting this parallelism are other topics of research yet to be addressed. Therefore, a highly interdisciplinary approach to research in multisensory vision is expected in the future in order to realize practical and robust real-time vision systems.

REFERENCES

Aggarwal, J. K., and Magee, M. J. (1986). Determining Motion Parameters Using Intensity Guided Range Sensing, Pattern Recognition 19(2), 169-180.
Aggarwal, J. K., and Nandhakumar, N. (1988). On the Computation of Motion from a Sequence of Images, Proceedings of the IEEE 76(8), 917-935.
Aggarwal, J. K., and Nandhakumar, N. (1990). Multisensor Fusion for Automatic Scene Interpretation - Research Issues and Directions, in "Analysis and Interpretation of Range Images," ed. R. C. Jain and A. K. Jain. Springer Verlag, New York, pp. 339-361.
Aloimonos, J., and Basu, A. (1988). Combining Information in Low-Level Vision, "Proceedings of the DARPA Image Understanding Workshop," Cambridge, MA, pp. 862-906.
Arkin, R. C., Riseman, E., and Hanson, A. (1988). AURA: An Architecture for Vision-Based Robot Navigation, "Proceedings of the DARPA Image Understanding Workshop," Cambridge, MA, pp. 417-431.
Asar, H., Nandhakumar, N., and Aggarwal, J. K. (1990). Pyramid-Based Image Segmentation Using Multisensory Data, Pattern Recognition.
Ayache, N., and Hansen, C. (1988). Rectification of Images for Binocular and Trinocular Stereovision, "Proceedings of the International Conference on Pattern Recognition," Rome.
Ayache, N., and Lustman, F. (1991). Trinocular Stereovision for Robotics, IEEE Trans. Pattern Analysis and Machine Intelligence 13, 73-85.
Baker, D. C., Aggarwal, J. K., and Hwang, S. S. (1988). Geometry-Guided Segmentation of Outdoor Scenes, "Proceedings of the SPIE Conference on Applications of Artificial Intelligence VI," Vol. 937, Orlando, FL, pp. 576-583.
Baker, D. C., Hwang, S. S., and Aggarwal, J. K. (1989). Detection and Segmentation of Man-Made Objects in Outdoor Scenes: Concrete Bridges, Journal of the Optical Society of America A 6(6), 938-950.
Ballard, D. H., and Brown, C. M. (1982). "Computer Vision." Prentice-Hall, Inc., Englewood Cliffs, NJ.
Belknap, R., Riseman, E., and Hanson, A. (1986). The Information Fusion Problem and Rule-Based Hypotheses Applied to Complex Aggregations of Image Events, "Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition," pp. 227-234.
Bhanu, B., and Symosek, P. (1987). Interpretation of Terrain Using Hierarchical Symbolic Grouping from Multi-Spectral Images, "Proceedings of the DARPA Image Understanding Workshop," Los Angeles, pp. 466-474.
Blake, A. (1989). Comparison of the Efficiency of Deterministic and Stochastic Algorithms for Visual Reconstruction, IEEE Trans. PAMI 11(1), 2-12.
Bolle, R., Califano, A., Kjeldsen, R., and Taylor, R. W. (1988). Visual Recognition Using Concurrent and Layered Parameter Networks, "Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition," San Diego, CA.
Chu, C. C., Nandhakumar, N., and Aggarwal, J. K. (1988). Image Segmentation and Information Integration of Laser Radar Data, "Proceedings of the Conference on Pattern Recognition for Advanced Missile Systems," Huntsville, AL.
Chu, C. C., Nandhakumar, N., and Aggarwal, J. K. (1990). Image Segmentation Using Laser Radar Data, Pattern Recognition 23(6), 569-581.
CIE (1978). "Recommendation on Uniform Color Spaces, Color Difference Equations, Psychometric Color Terms," Technical Report Supplement No. 2 to CIE Publication No. 15, Commission Internationale de l'Eclairage, Paris.
Courant, R., and Hilbert, D. (1953). "Methods of Mathematical Physics," Interscience Publishers, New York.
Dhond, U. R., and Aggarwal, J. K. (1989a). Structure from Stereo - A Review, IEEE Trans. Systems, Man and Cybernetics 19(6), 1489-1510.
Dhond, U. R., and Aggarwal, J. K. (1989b). A Closer Look at the Contribution of a Third Camera Towards Accuracy in Stereo Correspondence, "Image Understanding and Machine Vision," Technical Digest Series 14, Optical Society of America, pp. 78-81.
Di Zenzo, S., Bernstein, R., Degloria, S. D., and Kolsky, H. G. (1987). Gaussian Maximum Likelihood and Contextual Classification for Multicrop Classification, IEEE Trans. on Geoscience and Remote Sensing GE-25(6), 805-814.
Duda, R. O., and Hart, P. E. (1973). "Pattern Classification and Scene Analysis," John Wiley and Sons, New York.
Duncan, J. S., and Staib, L. H. (1987). Shape Determination from Incomplete and Noisy Multisensory Imagery, "Proceedings of the AAAI Workshop on Spatial Reasoning and Multi-Sensor Fusion," St. Charles, IL, pp. 334-344.
Duncan, J. S., Gindi, G. R., and Narendra, K. S. (1987). Multisensor Scene Segmentation Using Learning Automata, "Proceedings of the AAAI Workshop on Spatial Reasoning and Multi-Sensor Fusion," St. Charles, IL, pp. 323-333.
Dunlay, R. T. (1988). Obstacle Avoidance Perception Processing for the Autonomous Land Vehicle, "Proceedings of the IEEE International Conference on Robotics and Automation," Philadelphia, pp. 912-917.
Durrant-Whyte, H. F. (1987). Sensor Models and Multi-Sensor Integration, "Proceedings of the AAAI Workshop on Spatial Reasoning and Multi-Sensor Fusion," St. Charles, IL, pp. 303-312.
Durrant-Whyte, H. F. (1988). "Integration, Coordination, and Control of Multi-Sensor Robot Systems," Kluwer Academic Publishers, Boston.
Frankot, R. T., and Chellappa, R. (1988). A Method for Enforcing Integrability in Shape from Shading Algorithms, IEEE Trans. Pattern Analysis and Machine Intelligence 10, 439-451.
Fukunaga, K. (1990). "Introduction to Statistical Pattern Recognition." Academic Press, San Diego, CA.
Gelfand, J. J., Pearson, J. C., and Spence, C. D. (1988). Multisensor Integration in Biological Systems, "Proceedings of the Third IEEE Symposium on Intelligent Control," Arlington, VA.
Geman, S., and Geman, D. (1984). Stochastic Relaxation, Gibbs Distribution and the Bayesian Restoration of Images, IEEE Trans. Pattern Analysis and Machine Intelligence 6, 721-741.
Gonzalez, R. C., and Wintz, P. (1987). "Digital Image Processing." Addison-Wesley Publishing Company, Reading, MA.
Hager, G., and Mintz, M. (1987). Searching for Information, "Proceedings of the AAAI Workshop on Spatial Reasoning and Multi-Sensor Fusion," St. Charles, IL, pp. 313-322.
Harmon, S. Y. (1988). Tools for Multisensor Data Fusion in Autonomous Robots, "Proceedings of the NATO Advanced Research Workshop on Highly Redundant Sensing for Robotic Systems," Il Ciocco, Italy.
Harmon, S. Y., and Solorzano, M. R. (1983). Information Processing System Architecture for an Autonomous Robot System, "Proceedings of the Conference on Artificial Intelligence," Oakland University, Rochester, MI.
Healey, G. (1991). Using Color to Segment Images of 3-D Scenes, "Proceedings of the SPIE Conference on Applications of Artificial Intelligence," Vol. 1468, Orlando, FL, pp. 814-825.
Henderson, T. C., and Fai, W. S. (1983). A Multi-Sensor Integration and Data Acquisition System, "Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition," Washington, DC, pp. 274-279.
Horn, B. K. P. (1986). "Robot Vision." MIT Press, Cambridge, MA.
Horn, B. K. P., and Brooks, M. J. (1986). The Variational Approach to Shape from Shading, Computer Vision, Graphics and Image Processing 33, 174-208.
Hu, G., and Stockman, G. (1987). 3-D Scene Analysis via Fusion of Light Striped Image and Intensity Image, "Proceedings of the AAAI Workshop on Spatial Reasoning and Multi-Sensor Fusion," St. Charles, IL, pp. 138-147.
Hutchinson, S. A., Cromwell, R. L., and Kak, A. C. (1988). Planning Sensing Strategies in a Robot Work Cell with Multisensor Capabilities, "Proceedings of the IEEE International Conference on Robotics and Automation," Philadelphia, pp. 1068-1075.
Ikeuchi, K., and Horn, B. K. P. (1981). Numerical Shape from Shading and Occluding Contours, Artificial Intelligence 17, 141-184.
Ikeuchi, K., and Kanade, T. (1988). Modeling Sensors and Applying Sensor Model to Automatic Generation of Object Recognition Program, "Proceedings of the DARPA Image Understanding Workshop," Cambridge, MA, pp. 697-710.
Jain, A. K. (1989). "Fundamentals of Digital Image Processing." Prentice-Hall, Englewood Cliffs, NJ.
Jordan, J. R., and Bovik, A. C. (1988). Computational Stereo Using Color, Cover Paper of Special Issue on Machine Vision and Image Understanding, IEEE Control Systems Magazine 8(3), 31-36.
Julesz, B., and Bergen, J. R. (1987). "Textons, The Fundamental Elements in Preattentive Vision and Perception of Textures," in "Readings in Computer Vision: Issues, Problems, Principles, and Paradigms," ed. M. A. Fischler and O. Firschein. Morgan Kaufmann Publishers, Los Altos, CA, pp. 243-256.
Kanade, T. (1988). CMU Image Understanding Program, "Proceedings of the DARPA Image Understanding Workshop," Cambridge, MA, pp. 40-52.
Karthik, S., Nandhakumar, N., and Aggarwal, J. K. (1991). Modeling Non-Homogeneous 3D Objects for Thermal and Visual Image Synthesis, "Proceedings of the SPIE Conference on Applications of Artificial Intelligence," Orlando, FL.
Klinker, G. J., Shafer, S. A., and Kanade, T. (1988). Image Segmentation and Reflection Analysis through Color, "Proceedings of the DARPA Image Understanding Workshop," Cambridge, MA, pp. 838-853.
Krotkov, E., and Kories, R. (1988). Adaptive Control of Cooperating Sensors: Focus and Stereo Ranging with an Agile Camera System, "Proceedings of the IEEE International Conference on Robotics and Automation," Philadelphia, pp. 548-553.
Lee, B. G., Chin, R. T., and Martin, D. W. (1985). Automated Rain-Rate Classification of Satellite Images Using Statistical Pattern Recognition, IEEE Trans. on Geoscience and Remote Sensing GE-23(3), 315-324.
Levine, M. D., and Nazif, A. M. (1985a). Dynamic Measurement of Computer Generated Image Segmentations, IEEE Trans. PAMI 7(2), 155-164.
Levine, M. D., and Nazif, A. M. (1985b). Rule-Based Image Segmentation - A Dynamic Control Strategy Approach, Computer Vision, Graphics and Image Processing 32(1), 104-126.
Luo, R. C., and Lin, M.-H. (1987). Multisensor Integrated Intelligent Robot for Automated Assembly, "Proceedings of the AAAI Workshop on Spatial Reasoning and Multi-Sensor Fusion," St. Charles, IL, pp. 351-360.
Magee, M. J., and Aggarwal, J. K. (1985). Using Multi-Sensory Images to Derive the Structure of Three-Dimensional Objects: A Review, Computer Vision, Graphics and Image Processing 32, 145-157.
Magee, M. J., Boyter, B. A., Chien, C.-H., and Aggarwal, J. K. (1985). Experiments in Intensity Guided Range Sensing Recognition of Three-Dimensional Objects, IEEE Trans. on Pattern Analysis and Machine Intelligence 7(6), 629-637.
Malik, S., and Nandhakumar, N. (1991). Multisensor Integration for Underwater Scene Classification, "Proceedings of the IEEE International Conference on Systems, Man, and Cybernetics," Charlottesville, VA.
Marr, D. (1982). "Vision." W. H. Freeman and Co., New York.
Matthies, L., and Elfes, A. (1988). Integration of Sonar and Stereo Range Data Using a Grid-Based Representation, "Proceedings of the IEEE International Conference on Robotics and Automation," Philadelphia, pp. 727-733.
Mitiche, A., and Aggarwal, J. K. (1986). Multiple Sensor Integration/Fusion Through Image Processing: A Preview, Optical Engineering 25(3), 380-386.
Mitiche, A., Cil, B., and Aggarwal, J. K. (1983). Experiments in Combining Intensity and Range Edge Maps, Computer Vision, Graphics and Image Processing 21, 395-411.
Moerdler, M. L., and Boult, T. E. (1988). The Integration of Information from Stereo and Multiple Shape-from-Texture Cues, "Proceedings of the DARPA Image Understanding Workshop," Cambridge, MA, pp. 786-793.
Moerdler, M. L., and Kender, J. R. (1987). An Approach to the Fusion of Multiple Shape from Texture Algorithms, "Proceedings of the AAAI Workshop on Spatial Reasoning and Multi-Sensor Fusion," St. Charles, IL, pp. 272-281.
Nandhakumar, N. (1990). A Phenomenological Approach to Multisource Data Integration: Analyzing Infrared and Visible Data, "Proceedings of the NASA/IAPR TC7 Workshop on Multisource Data Integration in Remote Sensing," College Park, MD.
Nandhakumar, N. (1991). Robust Integration of Thermal and Visual Imagery for Outdoor Scene Analysis, "Proceedings of the IEEE International Conference on Systems, Man and Cybernetics," Charlottesville, VA.
Nandhakumar, N., and Aggarwal, J. K. (1985). The Artificial Intelligence Approach to Pattern Recognition - A Perspective and an Overview, Pattern Recognition 18(6), 383-389.
Nandhakumar, N., and Aggarwal, J. K. (1987). Multisensor Integration - Experiments in Integrating Thermal and Visual Sensors, "Proceedings of the First International Conference on Computer Vision," London, pp. 83-92.
Nandhakumar, N., and Aggarwal, J. K. (1988a). A Phenomenological Approach to Thermal and Visual Sensor Fusion, "Proceedings of the NATO Advanced Research Workshop on Highly Redundant Sensing for Robotic Systems," Il Ciocco, Italy, pp. 87-101.
Nandhakumar, N., and Aggarwal, J. K. (1988b). Integrated Analysis of Thermal and Visual Images for Scene Interpretation, IEEE Trans. on Pattern Analysis and Machine Intelligence 10(4), 469-481.
Nandhakumar, N., and Aggarwal, J. K. (1988c). Thermal and Visual Information Fusion for Outdoor Scene Perception, "Proceedings of the IEEE International Conference on Robotics and Automation," Philadelphia, pp. 1306-1308.
Newman, E. A., and Hartline, P. H. (1982). The Infrared "Vision" of Snakes, Scientific American 246(3), 116-127.
Oh, C. H., Nandhakumar, N., and Aggarwal, J. K. (1989). Integrated Modelling of Thermal and Visual Image Generation, "Proceedings of the IEEE Computer Vision and Pattern Recognition Conference."
Ohta, Y. (1985). "Knowledge-Based Interpretation of Outdoor Natural Color Scenes," Pitman Publishing Inc., Massachusetts.
Pearl, J. (1987). Distributed Revision of Composite Beliefs, Artificial Intelligence 33, 173-215.
Pearson, J. C., Gelfand, J. J., Sullivan, W. E., Peterson, R. M., and Spence, C. D. (1988). Neural Network Approach to Sensory Fusion, "Proceedings of the SPIE Conference on Sensor Fusion," Vol. 931, Orlando, FL, pp. 103-108.
Poggio, T., Little, J., Gillett, W., Geiger, D., Wienshall, D., Villalba, M., Larson, N., Cass, T., Bulthoff, H., Drumheller, M., Oppenheimer, P., Yang, W., and Hurlbert, A. (1988). The MIT Vision Machine, "Proceedings of the DARPA Image Understanding Workshop," Cambridge, MA, pp. 177-198.
Rodger, J. C., and Browse, R. A. (1987). An Object-Based Representation for Multisensory Robotic Perception, "Proceedings of the AAAI Workshop on Spatial Reasoning and Multi-Sensor Fusion," St. Charles, IL, pp. 13-20.
Rosenfeld, A., and Kak, A. C. (1982). "Digital Image Processing." Academic Press, New York.
Rosenthal, W. D., Blanchard, B. J., and Blanchard, A. J. (1985). Visible/Infrared/Microwave Agriculture Classification, Biomass and Plant Height Algorithms, IEEE Trans. on Geoscience and Remote Sensing GE-23(2), 84-90.
Schalkoff, R. J. (1989). "Digital Image Processing and Computer Vision," John Wiley and Sons, New York.
Shafer, G. (1976). "A Mathematical Theory of Evidence," University Press, New York.
Shaw, S. W., deFigueiredo, R. J. P., and Kumar, K. (1988). Fusion of Radar and Optical Sensors for Space Robotic Vision, "Proceedings of the IEEE Robotics and Automation Conference," Philadelphia, pp. 1842-1846.
Simchony, T., and Chellappa, R. (1990). Direct Analytical Methods for Solving Poisson Equations in Computer Vision Problems, IEEE Trans. Pattern Analysis and Machine Intelligence 12, 435-446.
Stentz, A., and Goto, Y. (1987). The CMU Navigational Architecture, "Proceedings of the DARPA Image Understanding Workshop," Los Angeles, pp. 440-446.
Therrien, C. W. (1989). "Decision, Estimation and Classification." John Wiley and Sons, New York.
Wang, Y. F., and Aggarwal, J. K. (1987). On Modeling 3-D Objects Using Multiple Sensory Data, "Proceedings of the IEEE International Conference on Robotics and Automation," Raleigh, NC, pp. 1098-1103.
Wang, Y. F., and Aggarwal, J. K. (1989). Integration of Active and Passive Sensing Techniques for Representing 3-D Objects, IEEE Trans. Robotics and Automation 5(4), 460-471.
Parallel Computer Architectures

RALPH DUNCAN
Control Data Government Systems
Atlanta, Georgia

1. Introduction
2. Terminology and Taxonomy
   2.1 Interrelated Problems of Terminology and Taxonomy
   2.2 Low-level Parallelism
   2.3 Flynn's Taxonomy
   2.4 Definition and Taxonomy
3. Synchronous Architectures
   3.1 Pipelined Vector Processors
   3.2 SIMD Architectures
   3.3 Systolic Architectures
4. MIMD Architectures
   4.1 Distributed Memory Architectures
   4.2 Shared Memory Architectures
5. MIMD Execution Paradigm Architectures
   5.1 MIMD/SIMD Architectures
   5.2 Data-Flow Architectures
   5.3 Reduction Architectures
   5.4 Wavefront Array Architectures
6. Conclusions
Acknowledgments
References
1. Introduction
The term “parallel processing” designates the simultaneous execution of multiple processors to solve a single computational problem cooperatively. Parallel processing has attracted a great deal of recent interest because of its potential for making difficult computational problems tractable by significantly increasing computer performance.

Two basic kinds of computational problems are encouraging research in parallel processing through their need for orders-of-magnitude improvements in computer processing speed. First, problems characterized by inordinate size and complexity, such as detailed weather or cosmological modeling, often require hours or days of conventional processing. This
hinders developing conceptual models and discourages researchers from modeling the phenomena of interest at a desirable level of detail. Real-time problems, which require computations to be performed within a strictly defined time period and are typically driven by external events, also need significant performance improvements. Real-time systems are being taxed by shorter times for processing and by demands for more processing to be performed before a time deadline. For example, real-time systems in military aircraft are being stressed by increased sensor input speeds and by the need for additional processing to provide more sophisticated electronic warfare functionality.

These computational problems call for vast performance increases that conventional, single-processor computers are unlikely to provide. Although developers have achieved impressive increases in uniprocessor speed, continued advances are constrained by fundamental physical laws. The primary barriers to achieving this kind of performance improvement through parallel processing, however, are conceptual ones: finding efficient ways to partition a problem among many processors and to orchestrate multiple processors executing in a cooperative fashion. Since the difficulty of surmounting conceptual obstacles is less formidable than overcoming fundamental physical laws (such as the speed of light), parallel processing is a promising means for achieving significant computer performance advances.

Clearly, parallel processing must be supported by architectures that are carefully structured for coordinating the work of many processors and for supporting efficient interprocessor communications. The many parallel architectures that have been developed or proposed define a broad and quite diverse spectrum of architectural possibilities. There are several reasons for this variety; these include the many possible responses to the fundamental conceptual challenge, the divergent characteristics of problems amenable to parallelization, and the practical limitations of alternative technologies that can be used for inter-processor communications. The parallel architecture discipline has been further enriched by the introduction of a host of new parallel architectures during the 1980s.

The sheer diversity of parallel processing architectures can be daunting to a nonspecialist. Thus, this chapter attempts to provide a tutorial that surveys the major classes of parallel architecture, describing their structure and how they function. In addition, this chapter correlates parallel architecture classes with references to representative machines, in order to steer the interested reader to the vast literature on individual parallel architectures. Although this chapter's primary intent is not taxonomic, a high-level parallel architecture taxonomy is presented in order to structure the discussion and demonstrate that the major architecture classes define a coherent spectrum of design alternatives.
2. Terminology and Taxonomy
2.1 Interrelated Problems of Terminology and Taxonomy
A coherent survey of parallel architectures requires at least a high-level architecture taxonomy in order to show that the diversity of extant architectures springs from different approaches to supporting a small number of parallel execution models, rather than from ad hoc approaches to replicating hardware components. A parallel architecture taxonomy, in turn, requires a definition of "parallel architecture" that carefully includes or excludes computers according to reasonable criteria. Specifying a definition for parallel architectures that can serve as the basis for a useful taxonomy is complicated by the need to address the following goals:

• Exclude architectures incorporating only low-level parallel mechanisms that have become commonplace features of modern computers
• Maintain elements of Flynn's useful taxonomy (Flynn, 1966) based on instruction and data streams
• Include pipelined vector processors and other architectures that intuitively seem to merit inclusion as parallel architectures, but that are difficult to gracefully accommodate within Flynn's scheme.

2.2 Low-level Parallelism
How a parallel architecture definition handles low-level parallelism is critical, since it strongly influences how inclusive the resulting taxonomy will be. Our definition and taxonomy will exclude computers that employ only low-level parallel mechanisms from the set of parallel architectures for two reasons. First, failure to adopt a more rigorous standard could make the majority of modern computers "parallel architectures," rendering the term useless. Second, architectures having only the features listed below do not offer the explicit framework for developing high-level parallel programming solutions that will be an essential characteristic of our parallel architecture definition.

• Instruction pipelining: the decomposition of instruction execution into a linear series of autonomous stages, allowing each stage to simultaneously perform a portion of the execution process (e.g., decode, calculate effective address, fetch operand, execute, store)
• Multiple central processing unit (CPU) functional units, providing independent functional units for arithmetic and Boolean operations that execute concurrently.
• Separate CPU and input/output (I/O) processors, freeing the CPU from I/O control responsibilities by using dedicated I/O processors.

Although these features are significant contributions to performance engineering, their presence alone does not make a computer a parallel architecture.

2.3 Flynn's Taxonomy
Flynn's taxonomy for computer architectures enjoys such widespread usage that any proposed parallel architecture taxonomy must take it into account. The Flynn taxonomy classifies architectures on the presence of single or multiple streams of instructions and data, yielding the following four categories.

1. SISD (single instruction stream, single data stream): defines serial computers
2. MISD (multiple instruction streams, single data stream): would involve multiple processors applying different instructions to a single datum; this hypothetical possibility is generally deemed impractical
3. SIMD (single instruction stream, multiple data streams): involves multiple processors simultaneously executing the same instruction on different data
4. MIMD (multiple instruction streams, multiple data streams): involves multiple processors autonomously executing diverse instructions on diverse data
Although these distinctions provide a useful shorthand for characterizing architectures, they are insufficient for precisely classifying modern parallel architectures. For example, pipelined vector processors merit inclusion as parallel architectures, since they provide both underlying hardware support and a clear programming framework for the highly parallel execution of vector-oriented applications. However, they cannot be adequately accommodated by Flynn's taxonomy, because they are characterized by neither processors executing the same instruction in SIMD lockstep nor the asynchronous autonomy of the MIMD category.

2.4 Definition and Taxonomy
In order for a definition of parallel architecture to serve as the basis for a useful taxonomy, then, it should include appropriate computers that the
Flynn schema does not handle and exclude architectures incorporating only low-level parallelism. The following definition is therefore proposed: a parallel architecture provides an explicit, high-level framework for expressing and executing parallel programming solutions by providing multiple processors, whether simple or complex, that cooperate to solve problems through concurrent execution.

Figure 1 shows a taxonomy based on the imperatives discussed earlier and the proposed definition. This informal taxonomy uses high-level categories to delineate the principal approaches to parallel computer architecture and to show that these approaches define a coherent spectrum of architectural alternatives. Definitions for each category are provided in the section devoted to that category.

This taxonomy is not intended to supplant efforts to construct fully articulated taxonomies. Such taxonomies usually provide comprehensive subcategories to reflect permutations of architectural characteristics and to cover lower-level features. In addition, detailed parallel architecture taxonomies are often developed in conjunction with a formal notation for describing computer architectures. Significant parallel architecture taxonomies have been proposed by Dasgupta (1990), Hockney and Jesshope (1981), Hockney (1987), Kuck (1982), Schwartz (1983), Skillicorn (1988), and Snyder (1988).
FIG. 1. High-level taxonomy of parallel computer architectures: synchronous (processor array, associative memory, systolic), MIMD (distributed memory, shared memory), and MIMD-based paradigms (MIMD/SIMD, data-flow, reduction, wavefront). © 1990 IEEE.
3. Synchronous Architectures
The initial category in our high-level taxonomy consists of synchronous parallel architectures, which coordinate concurrent operations in lockstep by using global clocks, central control units, or vector unit controllers. Our survey of synchronous architectures next examines pipelined vector processors, SIMD architectures, and systolic arrays.

3.1 Pipelined Vector Processors

Vector processor architectures were developed to directly support massive vector and matrix calculations. Early vector processors, such as Control Data's Star-100 (Lincoln, 1984) and Texas Instruments' Advanced Scientific Computer (Watson, 1972), were developed in the late 1960s and early 1970s and were among the first parallel architectures to be offered commercially. Vector processors are characterized by multiple, pipelined functional units that can operate concurrently and that implement arithmetic and Boolean operations for both vectors and scalars. Such architectures provide parallel vector processing by sequentially streaming vector elements through a functional unit pipeline and by streaming the output results of one unit into the pipeline of another as input (a process known as "chaining"). Although data elements for a vector operation enter a given functional unit's pipeline in sequential fashion, parallelism is achieved by concurrently executing different stages of the vector operation on different data elements (or element pairs). Additional parallelism is provided by having the various functional units execute simultaneously.

A representative architecture might have a vector addition unit consisting of six pipeline stages (Fig. 2). If each pipeline stage in the hypothetical architecture shown in the figure has a cycle time of 20 nsec, then 120 nsec elapse from the time operands a1, b1 enter stage 1 until result c1 is available. When the pipeline is filled, however, a result is available every 20 nsec. Thus, start-up overhead of pipelined vector units has significant performance implications. In the case of the register-to-register architecture depicted, special high-speed vector registers hold operands and results. Efficient performance for such architectures (e.g., Cray-1, Fujitsu VP-200) is obtained when vector operand lengths are multiples of the vector register size. Memory-to-memory architectures, such as the Control Data Cyber 205 and Texas Instruments Advanced Scientific Computer, use special memory buffers instead of vector registers.

Recent vector processing supercomputers (e.g., the Cray Y-MP/4 and Nippon Electric Corporation SX-3) typically unite 4 to 10 vector processors through a large shared memory. Since such architectures can support task-level parallelism by assigning individual tasks to different CPUs, they could
FIG. 2. Register-to-register vector architecture operation. © 1990 IEEE.
arguably be classified as MIMD architectures. However, since pipelined vector processing units remain the backbone of such multihead architectures, they are categorized in this discussion as vector processors for clarity's sake.

It was argued previously that an architecture's utilizing multiple functional units or instruction pipelining, per se, is insufficient to merit classifying the architecture as parallel. Since multiple units and pipelining are the underlying mechanisms for vector architectures' concurrent execution, one might question their inclusion as parallel architectures. However, such architectures' vector instructions, as well as the language extensions and subroutine libraries that facilitate their use, do provide the user with a high-level framework for developing parallel solutions. Thus, the combination of a vector-level framework for expressing application parallelism with the effective exploitation of multiple units and pipelining to support that parallelism makes it reasonable to classify vector machines as parallel architectures.

Figure 3 shows some representative vector processor architectures. Only two of Cray Research's many models are depicted. In addition, the figure suggests both the current preference for register-to-register approaches and the introduction of recent models by Japanese manufacturers.
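The timing behavior described above is easy to state in code. The sketch below assumes the hypothetical six-stage, 20-nsec addition unit of Fig. 2 and simply counts cycles; it illustrates pipeline fill versus steady-state throughput and does not model any particular machine.

    def pipelined_vector_time(n_elements, n_stages=6, cycle_ns=20):
        """Completion time for one vector operation in a single pipelined
        functional unit: n_stages cycles to fill the pipe, then one result
        per cycle thereafter."""
        return (n_stages + n_elements - 1) * cycle_ns

    # First result after 6 * 20 = 120 nsec; a 64-element vector completes in
    # (6 + 63) * 20 = 1380 nsec, i.e., close to one result every 20 nsec once
    # the pipeline is full, which is why start-up overhead matters for short vectors.
    print(pipelined_vector_time(1))    # 120
    print(pipelined_vector_time(64))   # 1380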
3.2 SIMD Architectures

SIMD architectures (Fig. 4) typically employ a central control unit, multiple processors, and an interconnection network (IN) for either processor-to-processor or processor-to-memory communications. The distinctive
FIG. 3. Example vector processor architectures. Register-to-register: Cray-1 (Russell, 1978), Cray Y-MP (Reinhardt, 1988), Fujitsu VP-200 (Miura & Uchida, 1984), Hitachi S-820 (Wada et al., 1988), NEC SX-2 (Hwang, 1984b), P.R.O.C. Galaxy YH-1 (Hwang, 1984b). Memory-to-memory: CDC Star-100 (Control Data, 1976), Cyber 205 (Lincoln, 1984), TI ASC (Watson, 1972).
FIG. 4. SIMD execution.
aspect of SIMD execution consists of the control unit broadcasting a single instruction to all processors, which execute the instruction in lockstep fashion on local data. The IN allows instruction results calculated at one processor to be communicated to another processor for use as operands in a subsequent instruction. SIMD architectures often allow individual processors to disable execution of the current broadcast instruction. As the subsections below will show, the SIMD architecture category encompasses several distinctive subclasses of machine, including processor arrays for word-sized operands, massively parallel machines composed of 1-bit processors, and associative memory architectures.
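A compact way to picture SIMD lockstep execution is as a single array operation gated by per-processor enable bits. The sketch below uses NumPy arrays to stand in for the PEs' local data; the operation and mask values are arbitrary examples, not the instruction set of any machine discussed here.

    import numpy as np

    def simd_step(local_data, operand, enabled, op=np.add):
        """One broadcast step: every PE applies the same instruction to its own
        local datum; PEs whose enable bit is clear simply keep their old value."""
        return np.where(enabled, op(local_data, operand), local_data)

    data    = np.array([3, 7, 1, 9], dtype=np.int32)    # one element per PE
    enabled = np.array([True, False, True, True])       # per-PE enable bits
    print(simd_step(data, 10, enabled))                 # [13  7 11 19]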
3.2.1 Processor Array Architectures
Processor arrays geared to the SIMD execution of numerical instructions have often been used for computation-intensive scientific problems, such as nuclear energy modeling. The processor arrays developed in the late 1960s (e.g., Illiac-IV) and their more recent successors (e.g., the Burroughs Scientific Processor) utilize processors that accommodate word-sized operands. Operands are usually floating-point (or complex) values and typically range in size from 32 to 64 bits. Various IN schemes have been used to provide processor-to-processor or processor-to-memory communications, with mesh and crossbar approaches being among the most popular.

One variant of processor array architecture uses a large number (thousands) of 1-bit processors. This machine organization was employed by several significant SIMD architectures of the 1980s and is one of the architectural approaches sometimes characterized as constituting "massive" parallelism. Various approaches to constructing SIMD architectures with 1-bit processors have been explored. In bit-plane architectures, the array of processors is arranged in a symmetrical grid (e.g., 64 x 64) and associated with multiple "planes" of memory bits that correspond to the dimensions of the processor grid (Fig. 5). The processor situated at location (x, y) in the grid operates on the memory bits at location (x, y) in all the associated memory planes. Usually, operations are provided to copy, mask, and perform arithmetic operations on entire memory planes, as well as on columns and rows within a plane. Loral's Massively Parallel Processor (Batcher, 1980) and the Distributed Array Processor exemplify this kind of architecture, which is often used for image processing applications by mapping pixels to the memory's planar structure. An alternative approach to 1-bit processor organization is exemplified by Thinking Machines Corporation's Connection Machine (Hillis, 1985), which
FIG. 5. Bit-plane array processing. © 1990 IEEE.
organizes as many as 65,536 one-bit processors as sets of four-processor meshes united in a hypercube topology.

Figure 6 reflects the recent commercial emphasis on 1-bit processor SIMD architectures. Although SIMD arrays based on word-oriented processors continue to be developed, reduced interest in this traditional approach is currently evident.
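The bit-plane organization can be mimicked in software by slicing an image into per-bit boolean planes and operating on whole planes at a time, as in the following sketch (an illustration of the data layout only, not of any particular machine's instruction set).

    import numpy as np

    def to_bit_planes(image, n_bits=8):
        """Split an n_bits-deep image into boolean memory planes; plane k holds
        bit k of every pixel, mirroring the bit-plane memory organization."""
        return [((image >> k) & 1).astype(bool) for k in range(n_bits)]

    def from_bit_planes(planes):
        return sum(plane.astype(np.uint32) << k for k, plane in enumerate(planes))

    img = np.array([[0, 255], [128, 5]], dtype=np.uint8)
    planes = to_bit_planes(img)
    planes[7] = np.zeros_like(planes[7])     # a whole-plane operation: clear bit 7
    print(from_bit_planes(planes))           # [[  0 127] [  0   5]]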
3.2.2 Associative Memory Processor Architectures

Associative memory processors (Kohonen, 1987) constitute a distinctive type of SIMD architecture. These architectures use special comparison logic to effect parallel operations on stored data on the basis of data content. Research in constructing associative memories began in the late 1950s with the obvious goal of being able to search memory in parallel for data that matched some specified datum. Associative memory processors developed in the early 1970s, such as Bell Laboratories' Parallel Element Processing Ensemble (PEPE), and more recent architectures (e.g., Loral's ASPRO) have often been geared to such database-oriented applications as tracking and surveillance.

Figure 7 shows the functional units that characterize a typical associative memory processor. A program controller (serial computer) reads and executes instructions, invoking a specialized array controller when associative memory instructions are encountered. Special registers enable the program controller and associative memory to share data. Most current associative memory processors use a bit-serial organization, which involves concurrent operations on a single bit-slice (bit-column) of
FIG. 6. Example SIMD processor array architectures: ICL DAP (Reddaway, 1973), Loral MPP (Batcher, 1980), Connection Machine (Hillis, 1985), Illiac IV (Barnes et al., 1968), Burroughs BSP (Kuck & Stokes, 1984), IBM GF11 (Beteem et al., 1987), Motorola T-ASP (Lang et al., 1988).
all the words in the associative memory. Each associative memory word, which usually has a very large number of bits [e.g., 32 kilobytes (32K)], is associated with special registers and comparison logic that functionally constitute a processor. Hence, an associative processor with 4K words effectively has 4K processing elements (PEs).

Figure 8 depicts a row-oriented comparison operation for a generic bit-serial architecture. A portion of the comparison register contains the value to be matched. All of the associative PEs start at a specified memory column and compare the contents of 4 consecutive bits in their row against the comparison register contents, setting a bit in the A register to indicate whether or not their row contains a match. In Fig. 9, a logical OR operation is performed on a bit-column and the bit-vector in register A, with register B receiving the results. A zero in the
FIG. 7. Functional units of a typical associative memory processor (program memory, program controller, ALU and special registers, array controller, associative memory).
FIG. 8. Associative memory comparison operation. © 1990 IEEE.
FIG. 9. Associative memory logical OR operation. © 1990 IEEE.
Mask register indicates that the associated word is not included in the current operation.

Figure 10 shows example associative memory architectures. In addition to the bit-serial architecture category discussed above, the figure uses several other categories of architecture defined by Yau and Fung (1977) in their older, but still useful, article. In fully parallel architectures, all bits (or groups of bits) in a given column of memory are accessed by an instruction, and multiple columns can be accessed simultaneously. This functionality can be implemented by a distributed logic approach, in which the columns of concurrently accessed memory are several bits wide, and typically contain enough bits to constitute a character. Lesser known variants of associative memory architecture have included word-serial machines, which use hardware to implement loop constructs for searching, and block-oriented architectures, which use rotating memory devices as the associative memory. These latter approaches are included primarily for historical interest. In recent years, interest in these traditional approaches to associative memory architecture seems to have lessened, with much of the work in content-addressable memory passing to the neural network field.
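The comparison and masked-OR operations described above can be sketched as word-parallel array operations. The memory size, search pattern, and register layout below are illustrative assumptions rather than the organization of any specific associative processor.

    import numpy as np

    # Associative memory modelled as a (words x bits) array of 0/1 values,
    # with per-word response registers A and B and a per-word Mask bit.
    memory = np.random.randint(0, 2, size=(8, 16), dtype=np.uint8)
    A = np.zeros(8, dtype=np.uint8)
    B = np.zeros(8, dtype=np.uint8)
    mask = np.ones(8, dtype=np.uint8)     # 0 = word excluded from the operation

    def compare(memory, start_col, pattern):
        """Every word compares the same bit field against the comparison
        register in parallel; the per-word match result lands in register A."""
        field = memory[:, start_col:start_col + len(pattern)]
        return (field == np.asarray(pattern, dtype=np.uint8)).all(axis=1).astype(np.uint8)

    def masked_or(memory, col, A, B, mask):
        """Word-parallel logical OR of one bit-column with register A; words
        whose Mask bit is zero keep their old B value."""
        return np.where(mask == 1, memory[:, col] | A, B)

    A = compare(memory, start_col=4, pattern=[1, 0, 0, 1])
    B = masked_or(memory, col=2, A=A, B=B, mask=mask)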
3.3 Systolic Architectures

In the early 1980s H. T. Kung of Carnegie-Mellon University proposed systolic architectures to solve the problems of special-purpose systems that
FIG. 10. Example associative memory architectures.
must balance intensive computations with demanding I/O bandwidths (Kung, 1982). Systolic architectures (systolic arrays) are pipelined multiprocessors in which data is pulsed in rhythmic fashion from memory and through a network of processors before returning to memory (Fig. 11). A global clock and explicit timing delays synchronize this pipelined data flow, which consists of data items obtained from memory that are to be used as operands by multiple processors within the array. In some schemes, this pipelined data flow may include partial results computed by the array's processors. Modular processors united by regular, local interconnections are typically used in order to provide basic building blocks for a variety of special-purpose systems. During each time interval, these processors transmit and receive a predetermined amount of pipelined data, and execute an invariant sequence of instructions.
FIG. 11. Systolic flow of data from and to memory. © 1990 IEEE.
Systolic arrays address the performance requirements of special-purpose systems by achieving significant parallel computation and by avoiding I/O and memory bandwidth bottlenecks. A high degree of parallelism is obtained by pipelining data through multiple processors, most often in two-dimensional fashion. Systolic architectures maximize the computations performed on a datum once it has been obtained from memory or an external device. Hence, once a datum enters the systolic array it is passed along to the various processors that need it, without an intervening store to memory. According to H. T. Kung's definition, only processors at the topological boundaries of a systolic array perform I/O to and from memory. Figures 12a-e show how a simple systolic array could calculate the outer product of two matrices.
The zero inputs shown moving through the array represent explicit timing delays used for synchronization. Each processor in this tightly synchronized scheme is expected to accept/send operands and execute a code sequence during each time-step period. Thus, if the operands needed by a given processor have not yet become available by passing through antecedent processors, timing delay operands are sent to that processor to ensure its computations are appropriately delayed. In the example, each processor begins with an accumulator set to zero and, during each cycle, adds the product of its two inputs to the accumulator. After five cycles the matrix product is complete. Figure 13 shows example systolic arrays developed by industry, academia, and government. The examples suggest that systolic array architectures have rapidly become commercially viable, particularly for algorithm-specific systems that perform military signal processing applications. In addition, programmable (reconfigurable) systolic architectures, such as the iWarp and Saxpy Matrix-1, have been constructed that are not limited to implementing
FIG. 12. Systolic matrix multiplication. © 1990 IEEE.
FIG. 13. Example systolic array architectures, including the Carnegie-Mellon Warp (Annaratone et al., 1987), the Hughes Systolic/Cellular System (Nash et al., 1987), iWarp (Borkar et al., 1988), and NOSC SLAPP, among others (Leeland, 1987; Drake et al., 1987; Lopresti, 1987; Hein et al., 1987).
a single algorithm. Although systolic concepts were originally proposed for very large-scale integration (VLSI)-based systems to be implemented at the chip level, recent systolic architectures have been implemented at a variety of physical levels.
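A cycle-level software sketch of the matrix-multiplication example follows. It mirrors the general scheme described above (skewed operand streams, zero timing-delay inputs, one multiply-accumulate per PE per cycle); the data-staging details are assumptions for illustration, and for two 2x2 matrices the loop indeed completes the product after five cycles.

    import numpy as np

    def systolic_matmul(A, B):
        """Cycle-level sketch of the scheme in Fig. 12: rows of A enter from the
        left, columns of B from the top, skewed by one cycle per row or column
        (the explicit timing delays); every PE multiply-accumulates and passes
        its operands on unchanged."""
        n, k = A.shape
        k2, m = B.shape
        assert k == k2
        acc = np.zeros((n, m))
        a_reg = np.zeros((n, m))     # operand currently held by each PE
        b_reg = np.zeros((n, m))
        for t in range(n + m + k - 1):
            a_reg[:, 1:] = a_reg[:, :-1].copy()      # shift operands rightwards
            b_reg[1:, :] = b_reg[:-1, :].copy()      # shift operands downwards
            for i in range(n):                       # inject skewed input streams
                a_reg[i, 0] = A[i, t - i] if 0 <= t - i < k else 0.0
            for j in range(m):
                b_reg[0, j] = B[t - j, j] if 0 <= t - j < k else 0.0
            acc += a_reg * b_reg                     # one multiply-accumulate per PE
        return acc

    A = np.array([[1., 2.], [3., 4.]])
    B = np.array([[5., 6.], [7., 8.]])
    print(systolic_matmul(A, B))    # [[19. 22.] [43. 50.]], completed in 5 cycles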
4. MIMD Architectures
MIMD architectures employ multiple processors that can execute independent instruction streams. Thus, MIMD computers support parallel solutions that require processors to operate in a largely autonomous manner. Although software processes executing on MIMD architectures are synchronized by passing messages through an IN or by accessing data in shared memory, MIMD architectures are asynchronous computers, characterized by decentralized hardware control. In this exposition, MIMD architectures are treated as being synonymous with asynchronous architectures. The impetus for developing MIMD architectures can be ascribed to several interrelated factors. MIMD computers support higher-level parallelism (subprogram and task levels) that can be exploited by “divide and
conquer” algorithms organized as largely independent subcalculations (e.g., searching, sorting). MIMD architectures may provide an alternative to depending on further implementation refinements in pipelined vector computers to provide the significant performance increases needed to make some scientific applications tractable (e.g., three-dimensional fluid modeling). Finally, the cost effectiveness of n-processor systems over n single-processor systems encourages MIMD experimentation.

Both major categories of MIMD architecture, distributed memory and shared memory computers, are examined in the following text. First we discuss distributed memory architectures and review popular topological organizations for these message-passing machines. Subsequent sections consider shared memory architectures and the principal interconnection technologies that support them.
4.1 Distributed Memory Architectures
Distributed memory architectures (Fig. 14) connect processing nodes (consisting of an autonomous processor and its local memory) with a processor-to-processor IN. Nodes share data by explicitly passing messages through the IN, since there is no shared memory. Significant developments in distributed memory architecture occurred during the 1980s, often spurred by the desire to construct a multiprocessor architecture that would "scale" (i.e., accommodate a large increase in processors without significant performance degradation) and would satisfy the processing requirements of large scientific applications characterized by local data references.
FIG. 14. MIMD distributed memory architecture structure. © 1990 IEEE.
Various IN topologies have been proposed to support architecture expandability and provide efficient performance for parallel programs with differing interprocessor communication patterns. Figure 15 depicts some common topologies. Although the suitability of these IN topologies for a given architecture is partly determined by the cost and performance characteristics of a particular implementation, several more abstract characteristics can be used to judge topologies‘ relative merits. First, a topology’s scalability is strongly influenced by the number of connections that are required for each node (the node’s “degree”), since physical constraints limit the number of connections one can feasibly implement. It is desirable, therefore, for the number of connections per node to remain fixed or to grow logarithmically as the number of system nodes increases.
FIG. 15. MIMD interconnection network topologies: (a) ring; (b) mesh; (c) tree; (d) hypercube; (e) tree mapped to a reconfigurable mesh. © 1990 IEEE.
Another important consideration is a topology’s inherent fault tolerance. This involves the degree of disruption that a single node’s failure causes and the overhead involved in routing messages around a failed node. A third abstract measure of topology suitability is communication diameter, which can be defined as the maximum number of communication links that a message must traverse between any source and any destination node, while taking the shortest available path (Bhuyan, 1987). In an informal sense, it is the best routing solution for the worst case pairing of source and destination nodes. The following subsections review several popular topologies in terms of these considerations.
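As a quick illustration of how these worst-case hop counts compare, the short program below evaluates the diameter formulas quoted in the subsections that follow (ring N/2, n x n mesh 2(n - 1), n-level binary tree 2(n - 1), hypercube log2 N). The specific system sizes are arbitrary; only the formulas come from the text.

#include <stdio.h>

/* Communication diameters for the topologies reviewed below, assuming an
   N-node ring, an n x n mesh without wrap-around, an L-level complete
   binary tree, and a hypercube of dimension d (2^d nodes).              */
static int ring_diameter(int N)   { return N / 2; }
static int mesh_diameter(int n)   { return 2 * (n - 1); }
static int tree_diameter(int L)   { return 2 * (L - 1); }
static int cube_diameter(int d)   { return d; }              /* log2 N */

int main(void) {
    printf("16-node ring:        %d hops\n", ring_diameter(16));
    printf("4 x 4 mesh:          %d hops\n", mesh_diameter(4));
    printf("4-level binary tree: %d hops\n", tree_diameter(4));
    printf("16-node hypercube:   %d hops\n", cube_diameter(4));
    return 0;
}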
4.1.1 Ring Topology Architectures

A major benefit of a ring topology is that each node's degree (number of interconnections) remains constant as processors are added to form a larger ring. Significant drawbacks to a simple ring topology include the large communication diameter (N/2 for N processors) and low fault tolerance (a single failure disrupts communications). Ring-based architectures' communication diameter, however, can be improved by adding chordal connections. Both chordal connections and the use of multiple rings can increase a ring-based architecture's fault tolerance. Typically, fixed-size message packets are used that include a node destination field. Ring topologies are most appropriate for a small number of processors executing algorithms that are not dominated by data communications. Control Data Corporation has built several specialized architectures that use ring topologies for pipelining. These are hybrid architectures, however, that have both shared memory and message-passing capabilities. Such architectures include the Advanced Flexible Processor (Allen, 1982), the Cyberplus (Ray, 1985), and the Parallel Modular Signal Processor (Colestock, 1988).

4.1.2 Mesh Topology Architectures
A symmetrical two-dimensional (2-D) mesh, or lattice, topology has n^2 nodes, each connected to their four immediate neighbors. Wrap-around connections at the edges are sometimes provided to reduce the communication diameter from 2(n - 1) to 2 * (integer part of n/2). Increasing the mesh size does not alter node degree. Meshes with simple, four-neighbor connections are relatively fault tolerant, since a single fault results in no more than two additional links being traversed to bypass the faulty node. A mesh's communication diameter can be reduced and its fault tolerance
increased by providing additional diagonal links or by using buses to connect nodes by rows and columns.

4.1.3 Tree Topology Architectures
Tree topology architectures have been proposed to support the parallel execution of algorithms for searching and sorting, image processing and other algorithms amenable to a divide-and-conquer approach. Although a variety of tree-structured topologies have been suggested, the complete binary tree topology is the most analyzed variant and is the one discussed below. Node degree is not a barrier to binary tree topology scalability, since it remains fixed as tree size increases. Communication diameter and fault tolerance, however, are significant limitations for a binary tree unadorned with additional communications links. For example, the communication diameter for such a tree with n levels and 2^n - 1 processors is 2(n - 1). Furthermore, disrupted communications links at a single node would sever communications between all that node's descendents and the rest of the tree. For these reasons, various additional communications links have been proposed for binary tree topologies, such as buses or point-to-point links that unite all nodes at the same tree level. Well-known parallel architectures based on tree topologies include the DADO (Stolfo, 1987) and Non-Von architectures (Shaw, 1981) developed at Columbia University.

4.1.4 Hypercube Topology Architectures
Since hypercube topologies are not likely to be as familiar to readers as rings or trees, we define the topology in some detail, before considering its relative merits. A Boolean n-cube or hypercube topology uses N = 2^n processors arranged in an n-dimensional cube, where each node has n = log2 N bidirectional links to adjacent nodes (Fig. 15). Individual nodes are uniquely identified by n-bit numeric values that range from 0 to N - 1 and that are assigned in a manner that ensures adjacent nodes' values differ by a single bit. Messages contain the destination node's bit-value and a label initialized to the source node's bit-value. When a processor routes a message, it selects an adjacent node that has a bit in common with the destination value that the routing node lacks, corrects that bit of the message label, and sends the message to the selected node. As a result of these conventions, the number of links traversed by a message traveling from node A to node B is equal to the number of bits that differ in the two nodes' bit-values. Since the source and destination node
labels can at most differ in each of the n bits in their respective labels, the communication diameter of such a hypercube topology is n = log2 N. Similarly, hypercube node degree grows in proportion to log2 N. Thus, the total number of processors can be doubled at the cost of increasing the number of interconnections per node by a single communications link. These properties make hypercube topologies attractive as the basis for message-passing architectures that can "scale up" to a large number of processors (i.e., on the order of 1024) in order to meet demanding scientific application requirements. In practice, hypercube topology fault tolerance is likely to be as much influenced by the sophistication of the message routing system as by the topology's abstract properties. For example, if a node in a log2 N dimension hypercube (where log2 N > 2) possesses a message that it should forward to a node other than its immediate neighbors, and a single neighbor node has failed, at least one optimal-length pathway to the destination is available. In order to cope with multiple faults, the message routing mechanism could be enhanced to use suboptimal alternative paths when faults block the optimal-length pathways. Interest in hypercube topologies was stimulated by the development of the Cosmic Cube architecture at the California Institute of Technology (Seitz, 1985). Commercial architectures based on hypercube topologies have included the Ametek Series 2010, Intel Personal Supercomputer, and NCUBE/10. Research is continuing on generalized hypercubes where N is not restricted to being an integral power of 2 (Bhuyan and Agrawal, 1984).
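The bit-correction routing rule just described can be sketched in a few lines of C. Always correcting the lowest-order differing bit is an illustrative policy (the text leaves the choice open); any differing dimension yields a shortest path, so the hop count equals the number of differing bits.

#include <stdio.h>

/* One hop of hypercube routing: flip one bit in which the current node's
   label differs from the destination, and forward to that neighbor.      */
static unsigned next_hop(unsigned current, unsigned dest) {
    unsigned diff = current ^ dest;
    if (diff == 0)
        return current;                  /* message has arrived */
    unsigned bit = 1;
    while (!(diff & bit))                /* find the lowest differing dimension */
        bit <<= 1;
    return current ^ bit;                /* neighbor differing in that dimension */
}

int main(void) {
    unsigned src = 0x0, dst = 0xB;       /* nodes 0000 and 1011 of a 4-cube */
    int hops = 0;
    for (unsigned node = src; node != dst; node = next_hop(node, dst))
        hops++;
    printf("hops from node %u to node %u: %d\n", src, dst, hops);   /* prints 3 */
    return 0;
}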
4.1.5 Reconfigurable Topology Architectures
An interconnection network embodies a single topology in the sense that its physical implementation in hardware is fixed. A reconfigurable topology architecture can, however, provide mechanisms, such as programmable switches, that effectively allow the user to superimpose various interconnection patterns onto the physical IN. Recent research architecture prototypes have implemented IN topology reconfigurability with diverse approaches. For example, Lawrence Snyder's CHiP (Configurable Highly Parallel computer; Snyder, 1982) allows the user to superimpose different topologies onto an underlying mesh structure. Another approach, which is exemplified by H. J. Siegel's PASM (Partitionable SIMD/MIMD system; Siegel et al., 1987), allows the user to partition a base topology into multiple interconnection topologies of the same type.
A significant motivation for constructing reconfigurable topology architectures is that such an architecture can act as many special-purpose architectures that efficiently support the communications patterns of particular algorithms or applications. Figure 16 shows example distributed memory architectures that utilize reconfigurable topologies and the common topologies discussed in previous subsections.
FIG. 16. Example MIMD distributed memory architectures: ring topology: CDC AFP* (Allen, 1982), CDC Cyberplus* (Ray, 1985), CDC PMSP* (Colestock, 1988), where * = hybrid distributed/shared memory; tree topology: DADO2 (Stolfo, 1987), NON-VON (Shaw, 1981); reconfigurable topology: CHiP/PRINGLE (Snyder, 1982; Kapauan et al., 1984), PASM (Siegel et al., 1987), TRAC (Lipovski & Malek, 1987).

4.2 Shared Memory Architectures

As befits their name, the defining characteristic of shared memory architectures is a global memory that each processor in the system can access. In such an architecture, software processes, executing on different processors, coordinate their activities by reading and modifying data values in the shared memory. Our discussion defines these architectures, which involve multiple general-purpose processors sharing memory, as parallel architectures, while
excluding architectures in which a single CPU only shares memory with I/O processors. A significant number of shared memory architectures, such as Encore Computer Corporation's Multimax and Sequent Computer Systems' Balance series, were commercially introduced during the 1980s. These shared memory computers do not have some of the problems encountered by message-passing architectures, such as message sending latency as data is queued and forwarded by intermediate nodes. However, other problems, such as data access synchronization and cache coherency, must be solved. Using shared memory data to coordinate processes executing on different processors requires mechanisms that synchronize attempts to access this data. The essential problem is to prevent one processor from accessing a datum while another process' operation on the datum is only partially complete, since the accessed data would be in an indeterminate state. Thus, one process must not read the contents of a memory location while another process is writing a new value to that location. Various mechanisms, such as test-and-set primitives, fetch-and-add instructions, or special control bits for each memory word, have been used to synchronize shared memory access (Dubois et al., 1988). These mechanisms can be implemented through microcoded instructions, sophisticated memory controllers, and operating system software. For example, the test-and-set/reset primitives shown below can be used to grant a processor sole access to a shared variable when the test-and-set primitive returns a zero value.
TEST-AND-SET (lock-variable)
    temp := lock-variable;
    lock-variable := 1;
    RETURN (temp);
END;

RESET (lock-variable)
    lock-variable := 0;
END;
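The same protocol can be expressed with the atomic operations available in modern C. The sketch below uses C11's atomic_flag (whose test-and-set returns the previous value) rather than the microcoded primitives the text describes; the busy-waiting loop corresponds to the "spin-lock" behavior discussed next.

#include <stdatomic.h>
#include <stdio.h>

/* atomic_flag_test_and_set plays the role of TEST-AND-SET, returning the
   old value; atomic_flag_clear plays the role of RESET.                  */
static atomic_flag lock_variable = ATOMIC_FLAG_INIT;

static void acquire(void) {
    while (atomic_flag_test_and_set(&lock_variable))
        ;                                /* busy wait: the spin-lock case */
}

static void release(void) {
    atomic_flag_clear(&lock_variable);
}

static long shared_counter = 0;          /* datum protected by the lock */

int main(void) {
    acquire();
    shared_counter++;                    /* critical section */
    release();
    printf("%ld\n", shared_counter);
    return 0;
}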
Processors that receive a value of one when invoking the primitive are prohibited from accessing the variable; they typically enter a busy waiting state as they repeatedly invoke the primitive ("spin-lock") or wait for an interrupt signaling that the lock variable's value has been reset ("suspend-lock"). Other approaches to synchronizing shared data access are described in Kuck et al. (1986), Gottlieb et al. (1983), and Jordan (1984). Each processor in a shared memory architecture may have a local memory that is used as a cache. Multiple copies of the same shared memory data, therefore, may exist in various processors' caches at a given time. Maintaining a consistent version of such data is the cache coherency problem, which is caused by sharing writable data, process migration among processors, and
I/O activity. Solutions to this problem must ensure that each processor uses the most recently updated version of cached data. Both hardware-based schemes, such as write-invalidate and write-update protocols for "snoopy caches," and software-based schemes, such as predetermining data cacheability or time-stamping data-structure updates, have been proposed (Stenstrom, 1990). Although systems with a small number of processors typically use hardware "snooping" mechanisms to determine when cached memory data has been updated, larger systems often rely on software solutions to minimize performance impact. Useful overviews of cache coherence schemes are presented in Dubois et al. (1988) and Stenstrom (1990). Figure 17 illustrates some major alternatives for connecting multiple processors to shared memory outlined below.
FIG. 17. MIMD shared memory interconnection schemes: (a) bus interconnection; (b) 2 x 2 crossbar; (c) 8 x 8 omega MIN routing a P3 request to M3. © 1990 IEEE.
4.2.1 Bus Interconnections
Time-shared buses (Fig. 17a) offer a fairly simple and relatively inexpensive way to give multiple processors access to a shared memory. Many of the commercial parallel architectures introduced during the 1980s were bus-based, shared memory machines. However, a single, time-shared bus can effectively accommodate only a moderate number of processors (4-20), since only one processor can access the bus at a given time. In order to accommodate more processors or to increase communications bandwidth, bus-based architectures sometimes utilize multiple buses and hierarchical interconnection systems (Mudge et al., 1987). The experimental Cm* architecture, for example, employs two kinds of buses: a local bus linking a cluster of processors, and a higher-level system bus that links dedicated service processors associated with each cluster. The Hector architecture (Vranesic et al., 1991) exhibits an alternative approach, using a hierarchy of "rings" (bit-parallel, point-to-point connections) to interconnect short buses that each serve a small number of processors.

4.2.2 Crossbar Interconnections
Crossbar interconnection technology uses a crossbar switch of n^2 crosspoints to connect n processors to n memories (Fig. 17b). Processors may contend for access to a memory location, but crossbars prevent contention for communication links by providing a dedicated pathway between each
possible processor/memory pairing. Crossbar interconnections offer high communications performance but are a relatively expensive IN alternative. Power, pinout, and size considerations typically limit crossbar architectures to using a small number of processors (i.e., 4 to 16). The Alliant FX/8, which uses a crossbar scheme to connect processors and cache memories, is an example of a commercial parallel architecture using crossbar interconnections.
4.2.3 Multistage Interconnection Networks
Multistage interconnection networks, or MINs (Bhuyan, 1987; Kothari, 1987; Siegel, 1985), offer a compromise between the relatively high-price/high-performance alternative of crossbar INs and the low-price/low-performance alternative offered by buses. An N x N MIN connects N processors to N memories by deploying multiple "stages" or banks of switches in the IN pathway. When N is a power of 2, a popular approach is to employ log2 N stages of N/2 switches, using 2 x 2 switches. A processor making a memory access request specifies the desired destination (and pathway) by issuing a bit-value that contains a control bit for each stage. The switch at stage i examines the ith bit to determine whether the input (request) is to be connected to the upper or lower output. Figure 17c illustrates MIN switching with an omega network connecting eight processors and memories, where a control bit equal to 0 indicates a connection to the upper output. Since the communication diameter of such MINs is proportional to log2 N, they can support a large number of processors (e.g., 256). Since MIN technology offers a moderate price/performance IN alternative with a high degree of scalability, it has received a great deal of research attention, leading to proposals for variations such as the omega, flip, SW-banyan, butterfly, multistage shuffle-exchange, baseline, delta, and generalized cube networks. Similarly, many fault-tolerant MINs have been proposed, including the extra stage cube, multipath omega, dynamic redundancy, merged delta, and INDRA networks (Adams et al., 1987). Figure 18 shows an example of MIMD shared memory architectures categorized by the IN technologies discussed above.
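Destination-tag routing through an omega network such as the one in Fig. 17c can be traced with a short program. The perfect-shuffle formulation below is the standard construction for omega networks and is an assumption about the figure's exact wiring; the control-bit convention (0 = upper output, examined most-significant bit first) follows the text.

#include <stdio.h>

#define N    8          /* processors and memories (a power of 2) */
#define BITS 3          /* log2 N: number of switch stages        */

/* Trace a request from processor `src` to memory `dest`: each stage
   applies a perfect shuffle and then a 2 x 2 switch whose setting is the
   corresponding destination bit.                                        */
static void route(unsigned src, unsigned dest) {
    unsigned line = src;
    printf("P%u", src);
    for (int stage = 0; stage < BITS; stage++) {
        line = ((line << 1) | (line >> (BITS - 1))) & (N - 1);  /* shuffle   */
        unsigned bit = (dest >> (BITS - 1 - stage)) & 1u;       /* ctrl bit  */
        line = (line & ~1u) | bit;                              /* switch    */
        printf(" -> stage %d line %u", stage, line);
    }
    printf(" -> M%u\n", line);
}

int main(void) {
    route(3, 3);        /* the example of Fig. 17c: P3 requests M3 */
    return 0;
}

Running the trace shows the request leaving on lines 6, 5, and 3 at successive stages and arriving at M3, which matches the property that the final line number equals the destination tag.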
5. MIMD Execution Paradigm Architectures

MIMD/SIMD hybrids, data-flow architectures, reduction machines, and wavefront array processors all pose a similar difficulty for an orderly taxonomy of parallel architectures. Each of these architectural types is predicated on MIMD principles of asynchronous operation and concurrent
FIG. 18. Example MIMD shared memory architectures: bus interconnection: Cm* (Jones & Schwartz, 1980), ELXSI 6400 (Hays, 1986), Encore Multimax (Encore Comput., 1987), FLEX/32 (Manuel, 1985), Hector (Vranesic et al., 1991); crossbar interconnection: Alliant FX/8 (Perron & Mundie, 1986), S-1 (Widdoes & Correll, 1979); MIN interconnection: BBN Butterfly (BBN Lab., 1985), BBN Monarch (Rettberg et al., 1990), CEDAR (Kuck et al., 1986), IBM RP3 (Pfister et al., 1987), Ultracomputer (Gottlieb et al., 1983).
manipulation of multiple instruction and data streams. However, each architecture type is also structured to support a distinctive parallel execution paradigm that is as fundamental to its overall design as MIMD characteristics. For example, the data-flow execution paradigm exemplifies a distinctive form of processing, in which instruction execution is triggered by operand availability. Although data-flow architectures can be implemented using diverse MIMD technologies, their design features coalesce around the central concept of supporting data-flow execution. This dualism poses several taxonomic problems. Placing these architectures in MIMD subcategories solely on the basis of their memory structure and interconnection characteristics obscures the most fundamental aspect of their design: supporting a distinctive kind of parallel program execution. Simply adding special MIMD subcategories for these architectures, however, results in undesirable asymmetry and imprecision. First, having MIMD subcategories at the same taxonomic level be based on both supported execution
models (e.g., data-flow) and structural characteristics (e.g., shared memory, bus-based) makes the subcategorization asymmetrical and somewhat arbitrary. Second, the MIMD architectures discussed in Section 4 can typically support multiple parallel execution models. One can implement a message-passing application using shared memory for the messages, or can implement an application using data-flow principles on a distributed memory hypercube architecture. Thus, if one subcategorizes MIMD architectures on the basis of supported execution models, one would have many architectures grouped under an imprecise category for "other models" or "multiple models." Our taxonomy, therefore, creates a separate, high-level category: MIMD Execution Paradigm Architectures. This inelegant term emphasizes that these MIMD architecture types are structured to support particular parallel execution models.

5.1 MIMD/SIMD Architectures
A variety of experimental hybrid architectures have been constructed during the 1980s that allow selected portions of an MIMD architecture to be controlled in SIMD fashion (e.g., DADO, NON-VON, PASM, and the Texas Reconfigurable Array Computer, or TRAC) (Lipovski and Malek, 1987). These architectures employ diverse mechanisms for reconfiguration and SIMD execution control. One promising approach, based on tree-structured, message-passing computers, such as DADO2 (Stolfo and Miranker, 1986), will be used here to illustrate hybrid MIMD/SIMD operation. The master/slave relation of a SIMD architecture’s controller and processors can be mapped onto the node/descendents relation of a subtree (Fig. 19). When the root processor node of a subtree operates as a SIMD controller, it transmits instructions to descendent nodes that each executes
FIG. 19. MIMD/SIMD operation. © 1990 IEEE.
the instructions on data in its local memory. In a true message-passing architecture, this instruction transmission process differs from that of the classic SIMD model of simultaneously broadcasting instructions to each processor, since instructions can be first transmitted to the controlling processor's descendents, and then transmitted down the tree to their descendents. The flexibility of MIMD/SIMD architectures obviously makes them attractive candidates for further research; specific incentives for recent development efforts include supporting image processing applications (PASM; Siegel et al., 1987); studying scalable, reconfigurable architectures (TRAC; Lipovski and Malek, 1987); and parallelizing expert system execution (NON-VON; Shaw, 1981; DADO; Stolfo and Miranker, 1986). Figure 20 shows some example MIMD/SIMD architectures.
5.2 Data-Flow Architectures
The fundamental characteristic of data-flow architectures is an execution paradigm in which instructions are enabled for execution as soon as all of their operands become available. Hence, the execution sequence of a data-flow program's instructions is based on data dependencies. Data-flow architectures can be geared to exploiting concurrency at the task, routine and instruction levels. A major incentive for data-flow research, which dates from J. B. Dennis's pioneering work in the mid-1970s, is to explore new
FIG. 20. Example MIMD/SIMD architectures: DADO (Stolfo & Miranker, 1986), TRAC (Lipovski & Malek, 1987), NON-VON (Shaw, 1981).
computational models and languages that can be effectively exploited to achieve large-scale parallelism. Programs for data-flow architectures can be expressed as data-flow graphs, such as the program fragment depicted in Fig. 21. Graph nodes may be thought of as representing asynchronous tasks, although they are often single instructions. Graph arcs represent communications paths for tokens that carry either execution results needed as operands in subsequent instructions or control information. Some of the diverse approaches used to implement data-flow computing are outlined below. Static implementations load all program-graph nodes into memory during initialization and allow only one instance of a node to be executed at a time; dynamic architectures allow node instances to be created at run-time and multiple instances of a node to be concurrently executed (Srini, 1986). Some architectures directly store token information containing instruction results into a template for the instruction that will use the results as operands ("token storage"). Other architectures use token matching schemes, in which a matching unit collects result tokens and tries to match them with instructions' required operands. When a complete set of tokens (all required operands) is assembled for an instruction, an instruction template containing the relevant operands is created and queued for execution (Treleaven et al., 1982b). Proposed instruction formats for data-flow architectures differ considerably (Srini, 1986). Significant differences result from varying constraints on the number of input and output arcs that may be associated with a graph node and from alternative approaches to representing control information.
FIG. 21. Data-flow graph program fragment. © 1990 IEEE.
A typical scheme, however, might allow operand data to be written into instruction fields as either literals or (result) memory addresses by using control bits to identify which data format is being used. Figure 22 shows how a simplified token matching architecture might process the program fragment shown in Fig. 21. At step 1, the execution of (3 * a) results in the creation of a token that contains the result (15) and an indication that the instruction at node 3 requires this as an operand. Step 2 shows the matching unit that will match this token and the result token of (5 * b) with the node 3 instruction. The matching unit creates the instruction token (template) shown at step 3. At step 4, the node store unit obtains the relevant instruction opcode from memory. The node store unit then fills in the relevant token fields (step 5), and assigns the instruction to a processor. The execution of the instruction creates a new result token to be used as input to the node 4 instruction. Figure 23 shows some examples of data-flow architectures, and categorizes them on the basis of the static and dynamic architecture distinction discussed above. Readers interested in detailed discussions of data-flow architecture
FIG. 22. Data-flow token matching example. © 1990 IEEE.
FIG. 23. Example data-flow architectures: dynamic architectures: EDFG System (Srini, 1985), Irvine D-F Machine (Arvind & Gostelow, 1975), Manchester Data-Flow Computer (Watson & Gurd, 1979), M.I.T. Tagged Token Data-Flow Computer (Arvind & Kathail, 1981), Newcastle JUMBO (Treleaven et al., 1982a), Utah Data-Driven Machine (Davis, 1978); static architectures: CERT LAU System (Plas et al., 1976), M.I.T. Data-Flow Computer (Dennis & Misunas, 1975), TI Distributed Data Processor (Cornish, 1979).
characteristics and taxonomy can consult Treleaven et al. (1982b) and Srini (1986).
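The matching-unit behavior described above can be sketched in software. The three-node graph below is a made-up example (it is not the fragment of Fig. 21), and the sequential recursion simply stands in for the parallel matching and execution hardware: a node fires only when both of its operand slots have received tokens.

#include <stdio.h>

/* A toy token-matching data-flow interpreter.  Each node waits until both
   operand slots are filled, then fires and sends its result to a successor
   slot (or emits the final result).                                        */
enum { NNODES = 3 };

struct node {
    char   op;            /* '+' or '*'                       */
    int    have[2];       /* operand-present flags            */
    double opnd[2];       /* matched operand values           */
    int    dest_node;     /* where the result token goes ...  */
    int    dest_slot;     /* ... and into which operand slot  */
};

static struct node graph[NNODES] = {
    { '*', {0,0}, {0,0}, 2, 0 },     /* node 0: product -> node 2, slot 0 */
    { '*', {0,0}, {0,0}, 2, 1 },     /* node 1: product -> node 2, slot 1 */
    { '+', {0,0}, {0,0}, -1, 0 },    /* node 2: sum     -> program result */
};

static void send_token(int node, int slot, double value) {
    graph[node].opnd[slot] = value;
    graph[node].have[slot] = 1;
    if (graph[node].have[0] && graph[node].have[1]) {       /* match complete */
        double r = (graph[node].op == '*')
                 ? graph[node].opnd[0] * graph[node].opnd[1]
                 : graph[node].opnd[0] + graph[node].opnd[1];
        if (graph[node].dest_node < 0)
            printf("result token: %g\n", r);
        else
            send_token(graph[node].dest_node, graph[node].dest_slot, r);
    }
}

int main(void) {
    /* initial data tokens; execution order is driven purely by availability */
    send_token(0, 0, 3); send_token(1, 0, 5);
    send_token(1, 1, 7); send_token(0, 1, 4);
    return 0;
}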
5.3 Reduction Architectures

Reduction, or demand-driven, architectures (Treleaven et al., 1982b) implement an execution model in which an instruction is enabled for execution when its results are required as operands for another instruction that is already enabled for execution. Most reduction architecture research began in the late 1970s in order to explore new parallel execution models and to provide architectural support for applicative (functional) programming languages. Reduction architectures execute programs that consist of nested expressions. Expressions are recursively defined as literals or as function applications on arguments that may be literals or expressions. Programs may reference named expressions, which always return the same value (i.e., have the property of "referential transparency"). Hence, reduction programs are function applications constructed from primitive functions. Reduction program execution consists of recognizing reducible expressions, then replacing them with their calculated values. Thus, an entire
reduction program is ultimately reduced to its result. Since the general execution model only enables an instruction for execution when its results are needed by a previously enabled instruction, some additional rule is needed to enable the first instruction(s) and begin computation. Practical challenges for implementing reduction architectures include synchronizing instruction result demands and managing copies of evaluation results. Demands for an instruction's results must be synchronized, because preserving referential transparency requires that an expression's results be calculated only once. Copies of expression evaluation results must be maintained, since an expression result could be referenced (needed) more than once and a single copy could be consumed by subsequent reductions upon first being delivered. Reduction architectures employ either string-reduction or graph-reduction to implement demand-driven execution models. String-reduction involves manipulating literals and copies of values, which are represented as strings that can be dynamically expanded and contracted. Graph-reduction involves manipulating literals and references (pointers) to values; thus, a program is represented as a graph and garbage collection is performed to reclaim dynamically allocated memory as the reduction proceeds. Figures 24 and 25 show a simplified version of a graph-reduction architecture that maps the program below onto tree-structured processors and passes tokens that demand or return results. Figure 24 depicts all the demand tokens produced by the program, as demands for the values of references propagate down the tree. In Fig. 25, the last two result tokens produced are shown as they are passed to the root node. The program fragment used in Figs. 24 and 25 is:

a = fbc;
b = +de;
c = *fg;
d = 1. e = 3. f = 5. g = 7.
Figure 26 shows reduction machine architectures, categorized according to whether they implement the string or graph reduction mechanisms discussed previously.
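A demand-driven evaluation of an expression graph can be sketched in a few lines. The graph below mirrors the shape of the fragment above (two sub-expressions combined at the root) but uses made-up operators and values, and the result cache plays the role of reducing each expression exactly once, as referential transparency requires.

#include <stdio.h>

/* A toy demand-driven (graph-reduction) evaluator.  A node is either a
   literal or the application of a primitive to two referenced nodes.  A
   node's value is computed only when demanded and is then cached.        */
enum kind { LIT, APP };

struct node {
    enum kind k;
    char      op;            /* '+' or '*' when k == APP   */
    int       left, right;   /* operand node indices        */
    double    value;         /* literal value, or the cache */
    int       reduced;       /* 1 once the value is known   */
};

static struct node g[] = {
    /* 0: a = b + c */ { APP, '+', 1, 2, 0, 0 },
    /* 1: b = d + e */ { APP, '+', 3, 4, 0, 0 },
    /* 2: c = f * g */ { APP, '*', 5, 6, 0, 0 },
    /* 3: d */ { LIT, 0, 0, 0, 1, 1 },
    /* 4: e */ { LIT, 0, 0, 0, 3, 1 },
    /* 5: f */ { LIT, 0, 0, 0, 5, 1 },
    /* 6: g */ { LIT, 0, 0, 0, 7, 1 },
};

static double demand(int i) {                 /* propagate a demand token */
    if (!g[i].reduced) {
        double l = demand(g[i].left), r = demand(g[i].right);
        g[i].value = (g[i].op == '+') ? l + r : l * r;
        g[i].reduced = 1;                     /* reduce once, then reuse  */
    }
    return g[i].value;                        /* return a result token    */
}

int main(void) {
    printf("a reduces to %g\n", demand(0));   /* (1 + 3) + (5 * 7) = 39 */
    return 0;
}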
5.4 Wavefront Array Architectures

Wavefront array processors (Kung et al., 1987) combine the data pipelining of systolic arrays with an asynchronous data-flow execution paradigm. In the early 1980s, S. Y. Kung proposed wavefront array concepts to address
FIG. 24. Reduction architecture demand token production. © 1990 IEEE.
the same kind of problems that stimulated systolic array research. Thus, wavefront array processors are intended to provide efficient, cost-effective architectures for special-purpose systems that balance intensive computations with high I/O bandwidth. Wavefront and systolic architectures are both characterized by modular processors and regular, local interconnection networks. Both kinds of arrays read data from external memory (using PEs at their topological boundaries), pulse data from neighbor to neighbor through a local IN, and write results to external memory using boundary PEs. Wavefront arrays, however, replace the global clock and explicit time delays used for synchronizing systolic data pipelining with asynchronous handshaking as the mechanism for coordinating inter-processor data movement. Thus, when a processor has performed its computations and is ready to pass data to its successor, it informs the successor, sends data when the successor indicates it is ready, and receives an acknowledgment from the successor. The handshaking mechanism makes computational wavefronts
FIG. 25. Reduction architecture result token production. © 1990 IEEE.
pass smoothly through the array without intersecting, as the array's processors act as a wave propagating medium. In this manner, correct sequencing of computations replaces the correct timing of systolic architectures. Figure 27 depicts wavefront array operation, using the matrix multiplication example used earlier to illustrate systolic operation (Fig. 12). The simplified example shows an array that consists of processing elements (PEs) with one-operand buffers for each input source. Whenever a boundary PE's buffer associated with external memory is empty and the memory still contains inputs, the PE immediately reads the next available operand from memory. Operands from other PEs are obtained by using a handshake protocol. Figure 27a shows the situation after memory input buffers are initially filled. In Fig. 27b PE(1,1) adds the product ae to its accumulator and transmits operands a and e to neighbors; thus, the first computational wavefront is shown propagating from PE(1,1) to PE(1,2) and PE(2,1). Figure 27c shows the first computational wavefront continuing to propagate, as a second wavefront is propagated by PE(1,1).
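The ready/send/acknowledge exchange that coordinates neighboring PEs can be illustrated with two software threads sharing a one-operand buffer. This is only a sketch of the protocol (real wavefront arrays implement the handshake in hardware), and the three operand values are arbitrary.

#include <pthread.h>
#include <stdio.h>

/* The producer waits until the consumer has emptied the one-operand
   buffer, deposits a value, and the consumer acknowledges by emptying it. */
static pthread_mutex_t m  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cv = PTHREAD_COND_INITIALIZER;
static int buffer_full = 0;
static double buffer;

static void *producer(void *arg) {
    (void)arg;
    for (int i = 1; i <= 3; i++) {
        pthread_mutex_lock(&m);
        while (buffer_full)                   /* wait until successor is ready */
            pthread_cond_wait(&cv, &m);
        buffer = i * 1.5;                     /* send the operand              */
        buffer_full = 1;
        pthread_cond_signal(&cv);             /* "data available"              */
        pthread_mutex_unlock(&m);
    }
    return NULL;
}

static void *consumer(void *arg) {
    (void)arg;
    for (int i = 0; i < 3; i++) {
        pthread_mutex_lock(&m);
        while (!buffer_full)
            pthread_cond_wait(&cv, &m);
        printf("PE received %g\n", buffer);   /* consume and compute           */
        buffer_full = 0;
        pthread_cond_signal(&cv);             /* acknowledgment to producer    */
        pthread_mutex_unlock(&m);
    }
    return NULL;
}

int main(void) {
    pthread_t p, c;
    pthread_create(&p, NULL, producer, NULL);
    pthread_create(&c, NULL, consumer, NULL);
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    return 0;
}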
FIG. 26. Example reduction machine architectures: graph reduction machines: Cambridge SKIM (Clarke et al., 1980), ALICE (Darlington & Reeve, 1981), Utah AMPS (Keller et al., 1978); string reduction machines: GMD R-Machine (Kluge & Schlutter, 1980), Newcastle R-Machine (Treleaven & Mole, 1980), N.C. Cellular Tree Machine (Mago, 1979), SERFRE (Villemin, 1982); graph and string reduction machines: Indiana APSA (O'Donnell et al., 1988).
S. Y. Kung argues (Kung et al., 1987) that wavefront arrays enjoy several advantages over systolic arrays, including greater scalability (since global clock skewing is not a problem), increased processing speed when nodes' processing times are not uniform, simpler programming (since computations need not be explicitly scheduled), and greater run-time fault tolerance (since a single processor can be independently interrupted for testing). Wavefront arrays constructed by the Applied Physics Laboratory of Johns Hopkins University (Dolecek, 1984) and by the British Standard Telecommunications Company and Royal Signals and Radar Establishment (McCanny and McWhirter, 1987) should facilitate further assessment of wavefront arrays' proposed advantages.

6. Conclusions
This discussion’s central aim has been to show that, despite their diversity, extant parallel architectures define a comprehensible spectrum of machine designs. Each of the major parallel architecture classes that we have reviewed represents a fundamental approach to effectively supporting parallelized program execution. Although these approaches range from providing networks
FIG. 27. Wavefront array matrix multiplication. © 1990 IEEE.
of general-purpose processors to supporting specific parallel programming philosophies and languages, this conclusion attempts to characterize the direction in which the field of parallel architecture research was moving in early 1991. Recent accomplishments in “scalable” architectures are likely to strongly shape research efforts in the immediate future. The concern for building systems that can be significantly increased in size without performance
degradation has become an important aspect of designing message-passing topologies (e.g., hypercube architectures), interconnection networks (e.g., MINs, hierarchical bus systems), and execution paradigms (e.g., wavefront array processing). The commercial viability of Thinking Machines Corporation’s Connection Machine, Loral’s Massively Parallel Processor, and various hypercube architectures is spurring interest in massively parallel computers that use thousands of processors. The diversity of mature parallel architecture types suggests that there are many viable ways to structure parallel processing systems. This breadth of alternatives encourages researchers to select possible system components and component integration strategies from a wide range of alternatives. Such a concern with system component selection may encourage research attention to be more equally divided among processor, memory, and interconnection technologies, rather than focusing primarily on performance engineering for specialized processors. For example, recent years have seen many research efforts directed to multistage interconnection networks and to organizing cached memory hierarchies. One of the last decade’s most stimulating developments has been the introduction of new architecture types that are geared to supporting a specific parallel execution model. Such architectures have included systolic and wavefront array processors, data-flow architectures, reduction machines, and the massively parallel, bit-oriented SIMD machines. This increased concern with conceptualizing parallel execution models is a departure from the concerns of the vector architecture approach in its maturity, which has tended to emphasize successive engineering refinements to highly specialized components. The first prototypes of execution model-oriented architectures are often constructed using standard microprocessors, buses and memory chips. This combination of emphasizing parallel execution paradigms and of using standard components as system building blocks has significant implications. First, these trends make it easier for researchers to contribute to the field, since the enormous financial investment needed to develop architectures like the classic vector computers can be avoided by using standard components for prototyping. Hence, a professor at a relatively ill-funded Third World university, who is armed with a promising conceptual model of parallel execution and some standard components, has a reasonable chance of constructing a novel parallel architecture. By making it easier to experiment with new parallel architecture approaches, these trends are likely to result in an even greater variety of proposed parallel architecture approaches. Parallel processing is firmly established as a viable mechanism for solving computational problems that are characterized by intensive calculations and demanding processing deadline requirements. By providing diverse architectures that are well suited to different kinds of computational problem, the
parallel architecture subdiscipline has made parallel processing a useful tool for many application domains. Current research concerns, such as scalability, interconnection network and hierarchical memory refinement, and parallel execution paradigm support, suggest that the number and variety of parallel architectures under active development will continue to increase.
Acknowledgments
The author thanks the following individuals for providing research papers, descriptions of their architectures, and insights: Theodore Bashkow, Laxmi Bhuyan, Joe Cavano, Jack Dongarra, Paul Englehart, Scott Fahlman, Dennis Gannon, H. T. Kung, S. Y. Kung, G. J. Lipovski, Richard Lott, David Lugowski, Miroslaw Malek, Susan Miller, Wayne Ray, Malcolm Rimmer, Douglas Sakal, Howard J. Siegel, Charles Seitz, Lawrence Snyder, Vason Srini, Kent Steiner, Salvatore Stolfo, Philip Treleaven, David Waltz, and Jon Webb.
REFERENCES Adams, G. B., Agrawal, D. P., and Siegel, H. J. (1987). A Survey and Comparison of FaultTolerant Multistage Interconnection Networks. Computer 20(6), 14~-27. Allen, G. R. (1982). A Reconfigurable Architecture for Arrays of Microprogrammable Processors. In “Special Computer Architectures for Pattern Processing” (K. s. Fu and T. Tchikawa, eds.), pp. 157 189. CRC Press, Boca Raton, Florida. Anderson, G. A,. and Kain, R. Y. (1976). A Content-Addressed Memory Design for Data Basc Applications. Pror. hi.Conference on Parallel Processing, pp. 19 1-195. Annaratone, M., Amould, E., Gross, T., Kung, H. T., Lam, M., Menzilcioglu, O., and Webb, J. A. (1987). The Warp computer; Architecture, Implementation and Performance. IEEE Trans. Comput. C-36(12), 1523.~1538. Arvind and Gostelow, K. P. (1975). A New Interpreter for Data Flow and its Implications for Computer Architecture. Rep. No. 72, Department Information and Computer Science, University of California, Irvine. Arvind and Kathail, V. (1981). A Multiple Processor that Supports Generalized Procedures. Proceeding 8th Annual Symposium Computer Architecture, Minneapolis, pp. 291 -302. Barnes, G. H., Brown, R. M., Kato, M., Kuck, D. J.. Slotnik, D. L., and Stokes, R. A. (1968). The Illiac 1V Computer. IEEE Trans. Comput. C-17(8), 746-757. Batcher, K . E. (1972). Flexible Parallel Processing and STARAN. 1972 WESCON Technical Papers, Session I-Parallel Processing Systems, pp. 115.1-1 15.3. Aatcher, K. E. (1980). Design of a Massively Parallel Processor. IEEE Transactions Comput. C-29(9), 836-844. BBN Laboratories ( 1985). “Butterfly Parallel Processor Overview.” BBN Laboratories, Cambridge, Massachusetts. Beteem, J., Denneau, M., and Weingarten, D. (1987). The GFl1 Parallel Computer. In “Experimental Parallel Computing Architectures” (J. J. Dongarra, ed.), pp. 255-298. Elsevier, Amsterdam.
Bhuyan, L. N., and Agrawal, D. P. (1984). Generalized Hypercube and Hyperbus Structures for a Computer Network. IEEE Trans. Comput. C-33(4), 323-333. Bhuyan, L. N. ( 1987). Interconnection Networks for Parallel and Distributed Processing. Computer 20(6), 9-12. Borkar, S., Cohn, R., Cox, G., Gleason, S., Gross, T., Kung, H. T., Lam, M., Moore, B., Peterson, C., Pieper, J., Rankin, L., Tseng, P. S.. Sutton, J., Urbanski, J., and Webb, J. (1988). iWARP: an Integrated Solution to High-speed Parallel Computing. Proceeding Supercomputing 88, Orlando, Florida, pp. 330-339. Briggs, F., and Hwang, K . (1984). “Computer Architectures and Parallel Processing.” McGraw-Hill, New York. Clarke, T. J. W., Gladstone, P. J. S., Maclean, C . D., and Norman, A. C. (1980). SKIM-the S,K,I Reduction Machine, Proceedings LISP-80 Conf., Stanford, California, August, pp. 128 135. Colestock, M. (1988). A Parallel Modular Signal Processor. Proceeding 8th Conference Digital Auionics Systems, San Jose, California, October 17 -20, pp. 607-613. Control Data Corp. (1976). “Control Data STAR-I00 Computer System.” Control Data Corp., Minneapolis, Minnesota. Cornish, M. (1979). The TI Data Flow Architectures: the Power of Concurrency for Avionics. Proceedings Third Conference Digital Avionics Systems, Fort Worth, Texas, pp. 19-25. Couranz, G. R., Gerhardt, M. S., and Young, C. J. (1974). Programmdbk Radar Signal Processing Using the RAP. Proceedings Sagamore Computer Conference on Parallel Processing, 37-52. Crane, B. A,, Gilrnartin, M. J., Huttenhoff, J. H., Rux, P. T., and Shiveley, R. R. (1972). PEPE Computer Architecture. Proceedings IEEE COMPCON, pp. 57-60. Darlington, J., and Reeve, M. (1981). ALICE: a Multiprocessor Reduction Machine for the Parallel Evaluation of Applicative Languages. Proceedings Int. Symposium on Functional Programming Languages and Computer Architecture, Goteborg, Sweden, pp. 32-62. Dasgupta, S. (1990). A Hierarchical Taxonomic System for Computer Architectures. Computer 23(3), 64 74. Davis, A. L. (1978). The Architecture and System Method of DDMl : a Recursively Structured Data Driven Machine. Proceedings 5th Annual Symposium Computer Architecture, pp. 210215. Dennis, J. B., and Misunas, D. P. (1975). A Preliminary Architecture for a Basic Data Flow Processor. Proceedings 2nd International Symp. Computer Architecture, January 20-22, pp. 126-132. Dolecek, Q. E. (1984). Parallel Processing Systems for VHSIC. Tech. Report, Applied Physics LabOI‘dtOry, Johns Hopkins University, Laurel, Maryland, pp. 84-1 12. Dongarra, J. J., ed. (1987). “Experimental Parallel Computing Architectures.” North-Holland, Amsterdam. Drake, B. L., Luk, F. T., Speiser, J. M., and Symdnski, J. J. (1987). SLAPP: a Systolic Linear Algebra Parallel Processor. Computer 20(7), 45-49. Dubois, M., Scheurich, C . , and Briggs, F. A. (1988). Synchronization, Coherence, and Event Ordering in Multiprocessors. Computer 21(2), 9-21. Encore Computer Corp. (1987). “Multimax Technical Summary,” Publication no. 726-01759 Rev. D. Encore Computer Corp., Marlboro, Massachusetts. ETA Systems, Inc. (1987). “ETA10 Supercomputer Series,” Brochure no. 205326. ETA Systems, Inc., St. Paul, Minnesota. Finnila, C. A,, and Love, H. H. (1977). The Associative Linear Array Processor. IEEE Transactions Comput. C-26(2), 112-125. Flynn, M. J. (1966). Very High Speed Computing Systems. Proceedings IEEE, 54, pp. 19011909.
Foulser, D. E., and Schreiber, R. (1987). The Saxpy Matrix-I : a General-Purpose Systolic Computer. Computer 20(7), 35- 43. Gajski, D. D., Lawrie, D. H., Kuck, D. J., and Sameh, A. H. (1987). CEDAR. In “Parallel Computing: Theory and Comparisons” (G. J. Lipovski and M. Malek, eds.), pp. 284 291. Wiley, New York. Goodyear Aerospace Corp. (1984). “Functional Description of ASPRO, the High Speed Associative Processor,” document no. GER 16868. Loral Systems Group, Akron, Ohio. Gottlieb, A,, Grishman, R., Kruskal, C. P., McAuliffe, K. P., Rudolph, L., and Snir, M. (1983). The NYU Ultracomputer: Designing an MlMD Shared Memory Parallel Computcr. IEEE Transuctiuns Cornpui. C-32(2), 175 189. Hays, N. (1986). New Systems Offer Near-Supercomputer Performance. Computer 19(3), I04 -107. Hein, C. E., Zieger, R. M., and Urbano, J. A. (1987). The Design of a GaAs Systolic Array for an Adaptive Null Stcering Beamforming Controller. Computer 20(7), 92 93. Higbie, L. C. (1972). The OMEN Computers: Associative Array Processors. Proceedings IEEE COMPCON, pp. 287-290. Hillis, W. D. (1985). “The Connection Machine.” MIT Press, Cambridge, Massachusetts. Hockney, R. W., and Jesshope, C. R. (1981). “Parallel Computers: Architecture, Programming, and Algorithms.” Adam Hilger, Ltd., Bristol, England. Hockney, R. W. (1987). Classification and Evaluation of Parallel Computer Systems. In “Springer-Verlag Lecture Notes in Computer Science,” No. 295, pp. 13-25. Hwang, K., ed., ( 19844. “Tutorial Supercomputers: Design and Applications.” IEEE Computer Society Press, Silver Spring, Maryland. Hwang, K . ( 1984b). Evolution of Modern Supercomputers. In “Tutorial Supercomputers: Design and Applications” (K. Hwang, ed.), pp. 5 8. IEEE Computer Society Press, Silver Spring, Maryland. Jones, A. K., and Schwarz, P. (1980). Experience Using Multiprocessor Systems: a Status Report. ACM Compui.Surveys 12(2), 121 165. Jordan, H. F. (1984). Experience with Pipelined Multiple Instruction Streams. In “Tutorial Supercomputers: Design and Applications” (K. Hwang, ed.), pp. 239 249. IEEE Computer Society Press, Silver Spring, Maryland. Kandle, D. A. (1987). A Systolic Signal Processor for Signal-Processing Applications. Cumpuler 20(7), 94 95. Kapauan, A,, Wang, K-Y., Cannon, D., and Snyder, L. (1984). The PRINGLE: an Experimental System for Parallel Algorithm and Software Testing. Proceedings International Conference un Parullel Processing, pp. 1 6. Keller, R. M., Patil, S., and Lindstrom, G . (1978). An Architecture for a Loosely Coupled Parallel Processor. Technical Report No. UUCS-78- 105, Department of Computer Science, University of Utah, Salt Lake City. Kluge, W. E., and Schlutter, H. (1980). An Architecture for the Direct Execution of Reduction Languages. Proceedings International Workshop on High-Level Language Computer Archiiecfure, Fort Lauderdale, Florida, pp. 174 180. Kohonen, T. (1987). “Content-addressable Memories-2nd ed.” Springer-Verlag, New York. Kothari, S. C. (1987). Multistage Interconnection Networks for Multiprocessor Systems. In “Advances in Computers-Vol. 26” (M. C. Yovits, ed.), pp. 155 199. Academic Press, New York. Kozdrowski. E. W., and Theis, D. J. (1980). Second Generation of Vector Supercomputers. Computer, 13(11), 71 83. Kuck, D. J. (1982). High-speed Machines and their Compilers. In “Parallel Processing Systems” (D. Evans, ed.). Cambridge University Press, Cambridge, England.
Kuck, D. J . , and Stokes, R. A. (1984). The Burroughs Scientific Processor (BSP). In “Tutorial Supercomputers: Design and Applications” (K. Hwang, ed.), pp. 90- 103. IEEE Computer Society Press, Silver Spring, Maryland. Kuck, D. J., Davidson, E. S., Lawrie, D. H., and Sameh, A. H. (1986). Parallel Supercomputing Today and the Cedar Approach. Science 231, 967-974. Kung, H. T. (1982). Why Systolic Architectures? Computer 15(1), 37-46. Kung, S. Y., Lo, S. C., Jean, S. N., and Hwang, J. N. (1987). Wavefront Array ProcessorsConcept to Implementation. Computer 20(7), 18-33. Lang, G. R., Dharsai, M., Longstaff, F. M., Longstaff, P. S., Metford, P. A. S., and Rimmer, M. T. (1988). An Optimum Parallel Architecture for High-speed Real-Time Digital Signal Processing. Computer 21(2), 47-57. Leeland, S. B. (1987). An Advanced DSP Systolic Array Architecture. Computer 20(7), 95 96. Lincoln, N. R. (1984). Technology and Design Tradeoffs in the Creation of a Modern Supercomputer. In “Tutorial Supercomputers: Design and Application” (K. Hwang, ed.), pp. 3245. IEEE Computer Society Press, Silver Spring, Maryland. Lipovski, G. J., and Malek, M. (1987). “Parallel Computing: Theory and Comparisons.” Wiley, New York. Lopresti, D. P. (1987). P-NAC: a Systolic Array for Comparing Nucleic Acid Sequences. Computer 20(7), 98-99. Mago, G . A. (1979). A Cellular, Language Directed Computer Architecture. Proceedings Conference on Very Large Scale Integration, Pasadena, California, January, pp. 447 452. Manuel, T. (1985). Parallel Machine Expands Indefinitely. Electronics Week, May 13,49-53. McCanny, J. V., and McWhirter, J. G. (1987). Some Systolic Array Developments in the United Kingdom. Computer 20(7), 51-63. Miura, K., and Uchida, K. (1984). FACOM Vector Processor System: VP-IOO/VP-200. In “Tutorial Supercomputers: Design and Applications” (K. Hwang, ed.), pp. 59-73. IEEE Computer Society Press, Silver Spring, Maryland. Mudge, T. N., Hayes, J. P., and Winsor, D. C. (1978). Multiple Bus Architectures. Computer 20(6), 42-48. Nash, J. G., Przytula, K. W., and Hansen, S. (1987). The Systolic/Cellular System for Signal Processing. Computer 20(7), 96-97. O’Donnell, J. T., Bridges, T., and Kitchel, S. W. (1988). A VLSI Implementation of an Architecture for Applicative Programming. Future Generation Computer Systems 4(3), 245.254. Paddon, D. J., ed. (1984). “Super-Computers and Parallel Computation.” Clarendon Press, Oxford. Perron, R., and Mundie, C. (1986). The Architecture of the Alliant FX/8 Computer. In “Digest of Papers, COMPCON, Spring 1986” (A. G. Bell, ed.), pp. 390-393. IEEE Computer Society Press, Silver Spring, Maryland. Pfister, G . F., Brantley, W. C., George, D. A,, Harvey, S. L., Kleinfelder, W. J., McAuliffe, K. P., Melton, E. A,, Norton, V. A,, and Weiss, J. (1987). An Introduction to the IBM Research Parallel Processor Prototype (RP3). In “Experimental Parallel Computing Architectures” (J. J. Dongarra, ed.), pp. 123-140. Elsevier, Amsterdam. Plas, A., Comte, D., Gelly, O., Syre, J. C., and Durrieu, G. (1976). LAU System Architecture: a Parallel Data Driven Processor Based on Single Assignment. Proceedings International Conference Parallel Processing, August 24-27, pp. 293-302. Ray, W. A. (1985). CYBERPLUS: a High Performance Parallel Processing System. Proceedings 1st Intercontinental Symposium Maritime Simulation, Munich, pp. 24 29. Reddaway, S. F. (1973). DAP-a Distributed Array Processor. Proceedings 1st Annual Symposium Computer Architecture, pp. 61-65. Reinhardt, S. (1988). 
Two Parallel Processing Aspects of the Cray Y-MP Computer System. Proceedings International Conference Parallel Processing, August 15-19, pp. 31 1-314.
Rettberg. R. D., Crowther, W. R., Carvey, P. P., and Tomlinson, R. S. (1990). The Monarch Parallel Processor Hardware Design. Computer 23(4), 18 30. Rudolf, J . A. (1972). A Production Implementation of an Associative Array Processor: STARAN. Proceedings AFIPS Fall Joint Computer Conference, 41( l ) , 229-241. Russell, R. M. (1978). The Cray-l Computer System. Communications ACM 21( l), 63-72. Schwartz, J. (1983). “A Taxonomic Table of Parallel Computers, Based on 55 Designs.” Courant Institute, New York University, New York. Seitz, C. L. (1985). The Cosmic Cube. Communications ACM 28(1), 22-33. Shaw, D. E. (1981). Non-von: a Parallel Machine Architecture for Knowledge Based Information Processing. Proceedings 7th International Joint Conference on Art$cial Intelligence, 961 963. Sicgel, H. J. ( 1985). “Interconnection Networks for Large-Scale Parallel Processing: Theory and Case Studies.” Lexington Books, Lexington, Massachusetts. Siegel, H. J., Schwederski, T., Kuehn, J. T., and Davis, N. J. (1987). An Overview of the PASM Parallel Processing System. In “Tutorial: Computer Architecture” (D. D. Gajski, V. M. Milutinovic, H. J. Siegel, and B. P. Furht, eds.), pp. 387~407.IEEE Computer Society Press, Silver Spring, Maryland. Skillicorn, D. B. (1988). A Taxonomy for Computer Architectures. Computer 21(11), 46 57. Snyder. L. (1982). Jntroduction to the Configurable Highly Parallel Computer. Computer 15( I), 47-56. Snyder, L. ( 1988). A taxonomy or synchronous parallel machines. Proceedings 17th Intrrnational Conjirence P~iraIleIProcessing, University Park, Pennsylvania, 28 1-285. Srini, V. (1985). A Fault-Tolerant Dataflow System. Computer 18(3), 54-68. Srini, V. (1986). An Architectural Comparison of Dataflow Systems. Computer 19(3), 68--88. Stenstrom, P. (1990). A Survey of Cache Coherence Schemes for Multiprocessors. Computer 23(6), 12 24. Stolfo. S. (1987). Initial Performance of the DADO2 Prototypc. Computer 20(1), 75-83. Stolfo, S. J., and Miranker, D. P. (1986). The DADO Production System Machine. Journal Parallel and Distributed Computing 3(2). 269-296. Treleaven, P. C., and Mole, G. F. (1980). A Multi-Processor Reduction Machine for UserDefined Reduction Languages. Proceedings 7th International Symposium on Computer Archifrcturr, pp, 121 130. Treleaven. P. C., Brownbridge, D. R., and Hopkins, R. P. (1982b). Data-Driven and DemandDriven Computer Architecture. A C M Cumput. Surueys 14(1), 93 143. Treledven, P. C., Hopkins, R. P., and Rautenbach, P. W. (1982a). Combining Data Flow and Control Flow Computing. Computer Journal 25(2), 207-217. Villemin, F. Y . (1982). SERFRE: A General-Purpose Multi-Processor Reduction Machine. Proceedings Internutionnl Conference Purallel Processing, August 24 27, pp. 140- 141. Vranesic, Z., Stumm, M., Lewis, D., and White, R. (1991). Hector: a Hierarchically Structured Shared-Memory Multiprocessor. Computer 24(1), 72-79. Wada, H., Ishii, K., Fukagawa, M., Murayuma, H., and Kawabe. S. (1988). High-speed Processing Schemes for Summation Type and Iteration Type Vector Instructions on Hitachi Supercomputer S-820 System. Proceedings International Conference Supercomputing, St. Malo, France, pp. 197 206. Watson, W. J. (1972). The ASC-a Highly Modular Flexible Super Computer Architecture. Proceedings AFIPS Fall Joint Computer Corference, 221-228. Watson, I . , and Gurd, J. (1979). A Prototype Data Flow Computer with Token Labeling. Proceedings National Computer Conference, New York, 48, 623 628. Widdoes. L. C., and Correll, S. (1979). 
The S-l Project: Developing High-Performance Digital Computers. Energy and Technology Reuiew, Lawrence Livermore Laboratory Publication UCRL-52000-79-9, September 1-1 5.
Wiley, P. (1987). A Parallel Architecture Comes of Age at Last. IEEE Spectrum 24(6), 46-50. Yau, S. S., and Fung, H. S. (1977). Associative Processor Architecture-a Survey. A C M Computing Surveys 9( I), 3-27.
Content-Addressable and Associative Memory*

LAWRENCE CHISVIN
Digital Equipment Corporation
Hudson, Massachusetts
R. JAMES DUCKWORTH
Department of Electrical Engineering
Worcester Polytechnic Institute
Worcester, Massachusetts

1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 160
2. Address-Based Storage and Retrieval . . . . . . . . . . . . . . 162
3. Content-Addressable and Associative Memories . . . . . . . . . . 164
   3.1 Nomenclature . . . . . . . . . . . . . . . . . . . . . . 164
   3.2 Materials . . . . . . . . . . . . . . . . . . . . . . . . 165
   3.3 Associative Storage and Retrieval in a CAM . . . . . . . . . 166
   3.4 Multiple Responses . . . . . . . . . . . . . . . . . . . . 167
   3.5 Writing into a CAM . . . . . . . . . . . . . . . . . . . . 168
   3.6 Obstacles and Advantages of Content-Addressable and Associative Memories . . 168
   3.7 Applications that Benefit from a CAM . . . . . . . . . . . . 170
   3.8 New Architectures . . . . . . . . . . . . . . . . . . . . 173
4. Neural Networks . . . . . . . . . . . . . . . . . . . . . . . 174
   4.1 Neural Network Classifiers . . . . . . . . . . . . . . . . 175
   4.2 Neural Network as a CAM . . . . . . . . . . . . . . . . . 176
5. Associative Storage, Retrieval, and Processing Methods . . . . . . 176
   5.1 Direct Association . . . . . . . . . . . . . . . . . . . . 177
   5.2 Indirect Storage Method . . . . . . . . . . . . . . . . . . 178
   5.3 Associative Database Systems . . . . . . . . . . . . . . . 178
   5.4 Encoding and Recall Methods . . . . . . . . . . . . . . . . 180
   5.5 Memory Allocation in Multiprocessor CAMs . . . . . . . . . . 182
   5.6 CAM Reliability and Testing . . . . . . . . . . . . . . . . 183
6. Associative Memory and Processor Architectures . . . . . . . . . 184
   6.1 Associative Memory Design Considerations . . . . . . . . . . 186
   6.2 Associative Processors . . . . . . . . . . . . . . . . . . 187
   6.3 CAM Devices and Products . . . . . . . . . . . . . . . . . 198
7. Software for Associative Processors . . . . . . . . . . . . . . 212
   7.1 STARAN Software . . . . . . . . . . . . . . . . . . . . . 213
   7.2 DLM Software . . . . . . . . . . . . . . . . . . . . . . . 215
   7.3 ASP Software . . . . . . . . . . . . . . . . . . . . . . . 216
   7.4 Patterson's PL/1 Language Extensions . . . . . . . . . . . . 218
   7.5 PASCALIA . . . . . . . . . . . . . . . . . . . . . . . . . 219
   7.6 LUCAS Associative Processor . . . . . . . . . . . . . . . . 220
   7.7 The LEAP Language . . . . . . . . . . . . . . . . . . . . . 221
   7.8 Software for CA Systems . . . . . . . . . . . . . . . . . . 223
   7.9 Neural Network Software . . . . . . . . . . . . . . . . . . 223
8. Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . 225
   8.1 Additional References . . . . . . . . . . . . . . . . . . . 225
   8.2 The Future of Content and Associative Memory Techniques . . . 228
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . 228
References . . . . . . . . . . . . . . . . . . . . . . . . . . . 229

* Based on "Content-Addressable and Associative Memory: Alternatives to the Ubiquitous RAM" by Lawrence Chisvin and R. James Duckworth, which appeared in IEEE Computer, Vol. 22, No. 7, pages 51-64, July 1989. Copyright © 1989 IEEE.
1. Introduction
The associative memory has finally come of age. After more than three and a half decades of active research, including scores of journal papers, conference proceedings, book chapters, and thesis treatments, the industry's integrated circuit design and fabrication ability has finally caught up with the vast theoretical foundation built up over that time. The past five years in particular have seen an explosion in the number of practical designs based upon associative concepts. Advances in very large-scale integration (VLSI) technology have allowed many previous implementation obstacles to be overcome, and there seems to be a more general recognition that alternative approaches to the classic method of computing are necessary to produce faster and more powerful computing systems.

This chapter describes the field of content-addressable memory (CAM) and associative memory, and the related field of associative processing. Content-addressable and associative memory are a totally different way of storing, manipulating and retrieving data compared to conventional memory techniques.

The authors' work in this area started in 1984, when it became obvious that a faster, more intelligent memory solution was required to efficiently accommodate a highly parallel computer system under development (Brailsford, 1985). Although tremendous improvements had been made in the speed and capability of both microprocessors and peripherals, the function of memory had changed very little. We realized that a more intelligent memory could off-load some of the data processing burden from the main processing unit, and furthermore, reduce the volume of data routinely passed between the execution unit and the data storage unit.
This chapter is a review and discussion of the kind of intelligent memory that would solve the problems we recognized. It spans the range from content-addressable memory (CAM), which can retrieve data based upon the content rather than its address, and extends into associative processing, which allows inexact retrieval and manipulation of data. The field of neural networks is covered as well, since they can be considered a form of associative processor, and because some researchers are using neural networks to implement a CAM.

Throughout the text, recent content-addressable and associative system examples are used to support the authors' contention that such systems are now feasible. The size and versatility of actual devices has been increasing rapidly over the last few years, enabling the support of new kinds of parallel and AI architectures. The paper by Kadota et al. (Kadota, 1985), for example, describes an 8-kbit device they call a CARM (content-addressable and reentrant memory), designed to provide a high-speed matching unit in data flow computers. Also, a project in England called SCAPE is involved with the design of an associative parallel processor which has been optimized for the support of image processing algorithms (Jones, 1988); a 20-kbit CMOS associative memory integrated circuit design for artificial intelligence machines is described by Ogura et al. (Ogura, 1986); and recently a machine called ASCA was developed which executes Prolog at high speed using CAMs (Naganuma, 1988). The largest CAM device built at this time appears to be the DISP (dictionary search processor) chip (Motomura, 1990) which, with a 160-kb CAM, is over ten times larger than previously reported CAMs. A number of commercial content-addressable memory devices have recently been introduced by Advanced Micro Devices, Coherent Research Inc., Music Semiconductors, and Summit Microsystems. These devices are described in more detail in later sections.

An interesting idea that takes into account the inherent fault-tolerant capabilities of a CAM has also recently been reported by a number of researchers. In a conventional memory system every addressable memory cell must function correctly, otherwise the device is useless. However, if faulty cells in a CAM can be found and isolated, then a perfect device is not essential since the actual storage of data does not have to relate to a specific memory location. Another interesting development that has recently been published is a proposal to construct an optical content-addressable memory (Murdocca, 1989).

We start this chapter with a brief overview of the traditional address-based storage method which pervades all our present-day computer systems, and describe some of its deficiencies and inherent weaknesses. We then introduce the concept of content-addressable and associative storage and explain some of the terminology that abounds in this area. Next, we explain some of the obstacles that face the development of these intelligent memory
systems and explain the potential advantages that can be obtained if the obstacles can be overcome. We then describe how CAM techniques are presently being used in both traditional computers and some newer highly parallel computer systems. We also introduce the technique of hash coding that has been used by computer system designers in the past in an attempt to implement the CAM functionality using software. We follow this with a discussion of the use of neural networks as associative processing systems.

In the next section we describe the storage, retrieval, and processing of data using associative techniques, and then we describe the associative memory and processor architectures of devices that have either been used or are in active use today. This section also describes the design and use of CAM devices that are commercially available or have been produced in research laboratories. The issues of software for associative processors, including software for neural networks, are discussed next. Finally, in order to place our chapter in historical context, we summarize the major milestones and the articles that have been published over the last 25 years, and we conclude with some thoughts on the future prospects for the field of intelligent memory systems.

We hope that this chapter will explain the associative concepts in enough detail to interest new people to study existing problems, and that it will motivate the incorporation of some of the ideas discussed into new designs, thus accelerating the exciting progress already underway.

2. Address-Based Storage and Retrieval
Traditional computers rely on a memory architecture that stores and retrieves data by addressing specific memory locations, as shown in Fig. 1.
FIG. 1. A conventional location-addressed memory: an address selects a stored word and the data are read out.
Every accessed data word must travel individually between the processing unit and the memory reservoir through a communications medium, often a shared bus, one word at a time. The elegance and simplicity of this approach has ensured its success, evidenced by the ubiquitous nature of the computer today. However, there are some inherent drawbacks to a word-at-a-time, location-addressed memory.

One major problem of address-based memory is that the memory access path becomes the limiting factor for system performance. This has come to be known as the "von Neumann bottleneck" (Backus, 1978). Much of the traffic on the communications medium is involved with sending information back and forth merely to calculate the effective address of the necessary data word.

A second important drawback to the location-addressed approach is the serial nature of the processing, where each piece of information manipulated by the computer must be handled sequentially. This approach is particularly slow in search and compare problems, for example, where many items must be inspected to determine the outcome. If the items are each distinct and unrelated to one another, then the only reason they must be processed sequentially is that the architecture is limited to handling them in that manner. All the records could be inspected simultaneously if the system allowed it.

A linear search operation for an exact match on a conventional computer finds the match, on average, halfway down the search list. The search time increases at the same rate as the list size. The performance penalty increases if a more complex comparison is necessary while searching, such as correlating or sorting the data. Techniques such as hash coding and hardware execution pipelines attempt to alleviate the problems by reducing the search time and overlapping the functions. However, improvement using conventional methods is limited.

Addressing by location is particularly inefficient when:

• Data are associated with several sets of reference properties
• Data elements are sparse relative to the values of the reference properties
• Data become dynamically disordered in memory during processing (Hanlon, 1966)
The serious disadvantages inherent in location-addressed memories become more obvious when multiple processing units are introduced into the computing system. Modern parallel processing architectures, such as data flow machines, exploit application parallelism to increase their execution performance. Systems that rely on data flow properties do not execute efficiently using a traditional memory, where each data word can only be
accessed serially by its location in a large sequential array. The use of CAM devices in these types of parallel computers is discussed in more detail in Section 3.8, "New Architectures."

Conventional memory systems are also too slow in some applications. For example, the bridge between high-speed local area networks is readily implemented with a CAM. The bridge provides transparent communication between workstations on different networks. The problem with a bridge is the short time in which it must recognize that messages are for a station on the other network and route them accordingly. There may be many thousands of stations on the networks, and the bridge must check the destination address to determine whether to accept the message and pass it on to the other network. Sequentially comparing an incoming address with addresses stored in the bridge may take many cycles, and slows down the overall message transfer in the system. Ideally the search and comparison should be done in parallel so that the search time remains constant irrespective of the number of addresses that must be compared. Commercial content-addressable memory devices manufactured by Advanced Micro Devices and MUSIC Semiconductors, and described in more detail in Section 6.3, "CAM Devices and Products," can carry out this search action and require less than 1 μs to find a match.
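To make the bridge example concrete, the following C fragment (an illustration of ours, not taken from any product mentioned above) models the forwarding decision in software. The table layout and routine name are invented for this sketch; the point is that a CAM evaluates the body of the loop against every stored station address at once, so the decision time does not grow with the number of stations.

    #include <stdint.h>
    #include <stddef.h>

    #define TABLE_SIZE 1024                     /* stations known to be local      */

    struct station_entry {
        int      valid;                         /* 1 if this slot holds an address */
        uint64_t address;                       /* 48-bit station address          */
    };

    static struct station_entry local_stations[TABLE_SIZE];

    /* Return 1 if the destination is on the local network (no forwarding needed),
     * 0 otherwise.  A CAM performs all TABLE_SIZE comparisons simultaneously,
     * so its decision time is independent of the table size.                      */
    int destination_is_local(uint64_t destination)
    {
        for (size_t i = 0; i < TABLE_SIZE; i++)
            if (local_stations[i].valid && local_stations[i].address == destination)
                return 1;
        return 0;
    }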
3. Content-Addressable and Associative Memories
The basic problems with conventional address-based systems have led researchers to investigate the potential benefits of CAMs, where information is stored, retrieved, or modified based upon the data itself, rather than by its arbitrary storage location. In some ways, we can view such a memory as a representation of the information it contains, rather than as a consecutive sequence of locations containing unrelated data (Kohonen, 1987).
3.1 Nomenclature
Actual implementations of CAMs have been reported since at least 1956 (Slade and McMahon, 1956), and much of the research has used its own definitions. The people who surveyed the early progress of this field showed the problems associated with keeping track of what research was being conducted. Hanlon, in his 1966 paper (the first comprehensive survey of the field) (Hanlon, 1966), defined content-addressable memory as a storage mechanism where the data are located by content. He defined an associative memory as “a collection or assemblage of elements having data storage capabilities, and which are accessed simultaneously and in parallel on the
basis of data content rather than by specific address or location." In this definition, the term "associative" referred to the interrelationships between the data elements, rather than the specific method of storage.

Minker (1971) used the International Federation of Information Processing definition of an associative store as "a store whose registers are not identified by their name or position but by their content." Parhami (1973) defined an associative memory as a "storage device that stores data in a number of cells," where the cells "can be accessed or loaded on the basis of their contents." This was similar to what Hanlon had called a content-addressable memory. Parhami further defined an associative processor as a system that exhibited sophisticated data transformation or included arithmetic control over the contents of a number of cells, depending upon their content. An associative computer was defined as "a computer system that uses an associative memory or processor as an essential component for storage or processing."

Foster (1976) defined a CAM to be "a device capable of holding information, comparing that information with some broadcast information, and indicating agreement or disagreement between the two." A content-addressable parallel processor (CAPP) was defined as "a CAM with the added ability to write in parallel into all those words indicating agreement."

In more recent literature, the term "associative memory" is used to describe a general storage and retrieval system that can access or modify cells based on their content, but does not necessarily need an exact match with a data key. This is similar to Hanlon's definition, and is the more generic description. Content-addressable memory has come to represent the mechanism that is used to implement the associative system. However, many research papers still refer to them seemingly interchangeably, and both terms must be used to effectively find information on the topic.
3.2 Materials

Most associative memories today are constructed using silicon in the form of VLSI circuits, and the examples in this chapter are mainly drawn from that wealth of experience. There are, however, systems in various stages of experimentation that are built using other methods, including Josephson memory cells (Morisue et al., 1987) and optical or optoelectronic principles (Farhat, 1989; Murdocca et al., 1989; White, 1988). The field of optics in particular shows excellent promise, and it is likely that someday large, complex, and powerful associative engines will be designed and produced using optical techniques (Berra, 1987). This area is still in its infancy, however, and the systems being produced and suggested are more useful as small research vehicles than as commercially viable products (Berra et al., 1990).
3.3 Associative Storage and Retrieval in a CAM

The concepts of storage and retrieval in a CAM are straightforward, and are described here in canonical form. The basic functions of a CAM are:

1. Broadcast and comparison of a search argument with every stored location simultaneously
2. Identification of the matching words
3. Access to the matching words

Figure 2 is a simple block diagram of a CAM. A particular record can be found by matching it with a known pattern. This involves a key word, a mask word, and matching logic. The key word is used to input the pattern for comparison, while the mask word enables only those parts of the key word that are appropriate in the context of the request. The key and mask word combination is provided to the tag memory and matching logic, where the actual data comparison takes place. After a match has been found, the appropriate data words can be output to the requesting program or modified, depending upon the capabilities of the system architecture and the requirements of the application.

Figure 3 shows a common CAM data word arrangement, where each data word is partitioned into fixed segments. In this scheme, there are three fields containing specific information. The tag bits signify the type of location, and are used to show whether the location is empty or used. If the location is used, this field identifies the type of information stored, such as temporary data or program code.
FIG. 2. Content-addressable memory block diagram.
FIG. 3. Common bit arrangement for content-addressable memory.
The balance of the word is split into label and data fields. The label field is used to match any incoming key word requests, and the data field holds the information that will be returned or modified. If more flexibility is desired, or if the label information is embedded in the data, these two conceptual segments are treated as one entity. In this implementation, the entire data field is compared with the properly masked search key.
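The following C fragment is a purely illustrative software model of the search operation just described; the structure and routine names (cam_word, cam_search) are our own inventions for this sketch, not the interface of any real device. Each stored label is compared with the key under the mask, and a match vector records the responders; a hardware CAM performs all of these comparisons simultaneously rather than in a loop.

    #include <stdint.h>
    #include <stddef.h>

    #define CAM_WORDS 256

    struct cam_word {
        int      used;        /* tag: 1 if the slot holds valid information    */
        uint32_t label;       /* label field, matched against the masked key   */
        uint32_t data;        /* data field, returned or modified on a hit     */
    };

    /* Compare the masked key against every stored label, recording each hit
     * in the match vector.  Returns the number of responders.  A real CAM
     * carries out all CAM_WORDS comparisons at once.                          */
    size_t cam_search(const struct cam_word cam[], uint32_t key, uint32_t mask,
                      int match[])
    {
        size_t hits = 0;
        for (size_t i = 0; i < CAM_WORDS; i++) {
            match[i] = cam[i].used && ((cam[i].label & mask) == (key & mask));
            if (match[i])
                hits++;
        }
        return hits;
    }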
3.4 Multiple Responses

Since it is possible (in fact, likely in a complex system) that a search will identify more than one matching stored record, some method of sorting through or outputting multiple matches must be provided. The two main problems with multiple responses are (1) identifying the number of responders, and (2) selecting each member from the set of responses.

As an example of these problems, assume that a 24-bit CAM contains the entries shown in Fig. 4. This figure shows separate label and data fields, and for simplicity contains no tag field. When the key word "3AH" is applied to the CAM, three labels respond after the matching operation. These responders have corresponding data items containing 386CH, ABCDH, and 9732H. Each of the multiple matches might be selected at random or in some predefined priority order. Assuming some priority, the matching words could be presented as they are found in an ordered array, or they could be sorted by an algorithmic selection process. In the example of Fig. 4, they might be sorted alphabetically or numerically.
    LABEL    DATA
    25H      1234H
    3AH      386CH
    65H      4287H
    3AH      ABCDH
    80H      5624H
    3AH      9732H
FIG. 4. A section of a CAM.
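As an illustration (again invented for this chapter rather than taken from any particular design), a software multiple-response resolver can scan the match vector produced by a search and hand back the responders one at a time in a fixed priority order, which is essentially what the hardware resolver in a CAM does combinationally.

    #include <stddef.h>

    /* Return the index of the next responder at or after 'start' in the
     * match vector, or -1 if none remain.  A hardware multiple-response
     * resolver implements this priority scan combinationally.            */
    long next_responder(const int match[], size_t words, size_t start)
    {
        for (size_t i = start; i < words; i++)
            if (match[i])
                return (long)i;
        return -1;
    }

    /* Typical use, visiting the three responders of Fig. 4 in storage order
     * (cam, match and process are the illustrative names used earlier):
     *
     *     for (long i = next_responder(match, CAM_WORDS, 0); i >= 0;
     *          i = next_responder(match, CAM_WORDS, (size_t)i + 1))
     *         process(cam[i].data);
     */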
3.5 Writing into a CAM

After the matching operation identifies the associated data, it is often necessary to write new information into the CAM. This brings with it decisions unique to associative storage.

The first difficulty is deciding where in the memory to write the information. Since the data words are usually not addressable by location, some other method of identifying currently available memory areas must be employed. This method must take into account the likelihood that certain data words are related to one another and should be stored for efficient retrieval. The free areas can be identified by their content, by a separate tag bit, or by a pattern in a special field. The word might be placed at random within the free area, or a more predictable choice might be made. For example, the new data word might be stored in the first free memory area found (if this had any meaning in the particular architecture), or it might be placed close to other related data. The algorithm would depend upon the intended application. Changing only the partial contents of a data word is a desirable function, and writing to an arbitrary number of cells within different words is a potentially powerful operation (Parhami, 1973).

Once the location has been determined, the memory system must also confront the decision of exactly how the word is to be written. Since a content-addressable architecture relies on the relationship of the data, it is not sufficient to merely dump the new information into its appointed storage location. The data currently residing in that location might have to be merged with the incoming data, and the label field will almost certainly have to reflect the new information now contained in that area of the memory.
3.6 Obstacles and Advantages of Content-Addressable and Associative Memories
There have been a number of obstacles to commercially successful associative memories. Some of these are listed below:

• Relatively high cost for reasonable storage capacity
• Poor storage density compared to conventional memory
• Slow access time due to the available methods of implementation
• Functional and design complexity of the associative subsystem
• Lack of software to properly utilize the associative power of the new memory systems
An associative or content-addressable memory is more expensive to build and has lower storage density than a conventional address-based memory
because of the overhead involved in the storage, comparison, manipulation, and output selection logic. Some current large-scale integration (LSI) versions of CAMs are discussed later, where this characteristic can be clearly seen.

Content-addressable and associative memories are always more complex than location-addressable memories. Manipulation of data based upon the contents, in association with the contents of other locations, entails design decisions that do not exist when information can be saved merely by address. In a location-addressed memory, the usual design considerations include the word length, the number of words, the base technology (e.g., CMOS vs. ECL), the internal architecture (e.g., dynamic vs. static), and the speed. The CAM has all the decisions above, and adds some significant tradeoffs of its own. The internal fields have to be determined (e.g., whether a special "index" field is required or any arbitrary search capability is to be allowed), what the internal architecture will be (e.g., bit-serial vs. word-serial), how to interface to the memory, how much internal interconnection is required between the various cells, how to handle multiple CAM hits, how to detect and correct errors, what language will best utilize the architecture and hardware (it may even be necessary to write a new language), and how this more expensive system compares to the traditional way of solving the target problem. The unfamiliarity with associative concepts that hampers many designers aggravates the situation, but even a widely understood CAM technology involves the extra tasks of storage and retrieval by content.

Very little software is currently available for CAMs. At some very high level of hierarchy, the program differences can be made transparent, but the lowest programming levels will always have to adapt to the underlying architecture (and in some cases the hardware) to extract the power of content-based storage efficiently.
3.6.1 Motivating Factors
The motivation to overcome these obstacles is that a combination of highly parallel processing techniques and associative storage lends itself to certain classes of applications (Murtha, 1966; Thurber and Wald, 1975). For example, a large part of the software in administrative and scientific data processing is related to the searching and sorting necessary for data arrangement in a sequentially addressed memory. This is especially true in tasks such as compiling, scheduling, and real-time control. This type of housekeeping is unnecessary in a CAM because the data are used for retrieval and can be output already sorted by whatever key is specified. A database consisting of unordered list structures is a perfect candidate for content-addressable treatment (Hurson et al., 1989). Because the CAM
searches and compares in parallel, the time to extract information from the storage medium is independent of the list length. There is no need to sort or compact the information in the memory, since it can be retrieved easily based on its contents. This has immediate implications for common data manipulations such as collating, searching, matching, cross-referencing, updating, and list processing. An associative approach can help any problem where the information is stored based on an association with other data items.

Lea (1975) provides an excellent illustration of the type of benefits obtainable through the use of a CAM. He discusses how one would access and update a company telephone directory. Using a location-addressable memory would involve some special technique for accessing the data, such as "hash-coding" or "inverted-listing" on the name field (Kohonen, 1987). This works fine until it is necessary to retrieve the information by a field other than the name. If one wanted to find out who was in room X, for example, it would still be necessary to go through the entire list looking for the "room" field. One could, of course, provide an access key for other fields, but this method of cross-retrieval quickly becomes cumbersome for a great number of possible keys and a large database. Moreover, if one wanted to allow for access based upon a name that was "almost" right, the design of the retrieval key would have to be such that this was possible. Updating such a database involves other problems, and the more flexible the retrieval mechanism, the longer and more complex the job of storage.

A CAM solves all these problems. Retrieval is based upon the actual contents, and in this way every field is a "key" to the entire entry. Since the database is inspected in parallel, access to any specific data cell is fast and efficient. The storage process is greatly simplified since the location of the entry is irrelevant. Once the input fields are stored, their actual contents provide the links to the rest of the database.
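A software sketch makes the point about field-independent retrieval. The record layout and lookup routine below are assumptions made purely for this illustration; an empty argument plays the role of a masked-out field, so the same routine answers a query by name, by room, or by telephone number, and a CAM would examine every entry in parallel instead of looping.

    #include <stddef.h>
    #include <string.h>

    struct directory_entry {
        char name[32];
        char room[8];
        char phone[12];
    };

    /* Retrieve by any field: a NULL or empty argument acts as a masked-out
     * ("don't care") field.  The first responder is returned; a CAM would
     * inspect every entry simultaneously.                                   */
    const struct directory_entry *
    directory_lookup(const struct directory_entry dir[], size_t n,
                     const char *name, const char *room, const char *phone)
    {
        for (size_t i = 0; i < n; i++) {
            if (name  && *name  && strcmp(dir[i].name,  name)  != 0) continue;
            if (room  && *room  && strcmp(dir[i].room,  room)  != 0) continue;
            if (phone && *phone && strcmp(dir[i].phone, phone) != 0) continue;
            return &dir[i];
        }
        return NULL;    /* no entry satisfies all of the given fields */
    }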
3.7 Applications that Benefit from a CAM

This characterization suggests a vast array of applications that can potentially benefit from associative treatment. As just a few recent examples, content-addressable and associative memories have been suggested for list processing system garbage collection (Shin and Malek, 1985a), graph traversal (Shin and Malek, 1985b), pattern classification (Eichmann and Kasparis, 1989; Suzuki and Ohtsuki, 1990), pattern inspection (Chae et al., 1988), text retrieval (Hirata et al., 1988; Yamata et al., 1987), signal and image processing (Lea, 1986), speech processing (Cordonnier, 1981), image analysis (Snyder and Savage, 1982; Lee, 1988), parallel exhaustive search for NP-complete problems (Yasuura et al., 1988), digital arithmetic through
truth table lookup processing (Cherri and Karim, 1988; Mirsalehi and Gaylord, 1986; Papachristou, 1987), logic simulation (Sodini et al., 1986), probabilistic modeling (Berkovich, 1981), and characterization of ultrasonic phenomena (Grabec and Sachse, 1989). CAMs are especially appropriate for computer languages, such as LISP (Bonar and Levitan, 1981; Ng et al., 1987) and PROLOG (Chu and Itana, 1985), that use list structures as their building block and tend to fragment memory with their execution. The Conclusions section at the end of this chapter provides a review of some important previous surveys on content-addressable and associative systems. Most of the literature mentioned there has its own list of applications.

The performance improvement in many of the above areas can be dramatic using an associative memory, especially when the database being manipulated is large, and search time on a conventional computer becomes significant. To see this improvement, consider a sorting problem, where each data item must be compared to all the other data items to ascertain its sort position. In the general case, the search/sort time grows at the rate of O(n log n), where n is the number of items on the list. An application that needed only the maximum value would grow in execution time at least as fast as the increase in the list size. With an appropriate associative memory or associative processor, the sorting could be done while the access is occurring, growing only as the list grows. A problem that needed only the largest value could inspect all the data items simultaneously, and the performance would be the same for any list size that fit within the memory.

One novel application for a CAM is the processing of recursively subdivided images and trees (Oldfield et al., 1987). An example of this is a binary tree, used to represent the pixels in a drawing. If each node in the tree has to be visited for the desired operation, then a conventional location-addressed memory can be made efficient. If, however, only a few of the nodes need to be either inspected or changed (for example, if a small portion of the picture needs modification), a CAM is a clear winner. With a traditional memory, the entire tree must be rewritten for each local change, and a search entails looking at all the previous nodes for the one of interest. A CAM allows a single node to be easily found (by using an appropriate key pattern as a search argument), and provides for local changes in constant time.

CAMs have been suggested to enhance the performance of logic programming systems. One such system implements a version of the PROLOG language, and uses a CAM for the variable environment and the database (Nakamura, 1984). In this system, a more traditional serial depth-first search, and a heuristic (best-first) concurrent evaluation, can both be accommodated. In the depth-first method, the bindings are stored in the CAM, and are referred to by the appropriate keys. Concurrent program evaluation
is obtained by having the execution processors share the common associative memories. These memories contain the environments, the database, and the search operation context table.

More recently, researchers at Syracuse University have been investigating the use of CAMs to increase the speed of logic programming (Kogge et al., 1988; Oldfield, 1986; Ribeiro, 1988; Ribeiro et al., 1989). The SUMAC machine (Syracuse University Machine for Associative Computation) uses advanced CAMs and an instruction set well suited for logic programming execution (Oldfield, 1987b). The logic expressions in their system are represented using the previously described CAM-implemented tree structures (Oldfield et al., 1987).

Operations related to unification and data structure manipulation are the most frequent and time-consuming parts of executing logic programs. The unification operation involves finding substitutions (or bindings) for variables which allow the final resolution of the user's goal. The backtracking operation describes the process by which a program continues a search by examining alternate paths. Both of these operations can be improved by using a CAM to store the information. An even better idea is to reduce the number of such operations, and a CAM will help in this area, too. An index stored in the CAM can filter the clauses needing to be matched against a goal. This can reduce the number of blind alleys in the search operation, and thereby increase the efficiency of the program.

The other major operation that can be improved by a CAM is the maintenance and updating of the data structures. One example of this is the creation of, search for, and deletion of a binding. Other examples are garbage collection (easily implemented by an "in use" bit defined in the CAM word) and compaction (totally unnecessary in a CAM). Many conventional computer implementations of PROLOG spend an inordinate amount of time searching through linear lists. As already discussed, an associative treatment of this function has potentially significant performance improvement capability.

The technique of content addressing has been used for many years in the design of cache memory. A cache memory is a high-speed memory placed in the path between the processor and the relatively slow main memory. The cache memory stores recently accessed data and code with the idea that if the processor needs to access this data again it can be retrieved from the faster cache rather than main memory and therefore speed up the rate of execution. For more information see, for example, Stone's work (Stone, 1990).

CAM techniques have also been used for many years in memory management units to carry out the virtual-to-physical address translations. The AT&T WE-32201 Integrated Memory Management Unit/Data Cache
(IMDC) is reported to be the first device to include a content-addressable-memory-based Memory Management Unit (MMU) and a large instruction/data cache on a single chip (Goksel, 1989).
3.8 New Architectures

Another important use of CAMs and associative memories is in the implementation of new computer architectures. Massively parallel computer systems cannot depend upon a serial memory bottleneck, but must instead have concurrent access to much of the computer database. Data flow machines, for example, rely on data words that have tags to route them through the system. The matching of the data tags to the proper operation nodes requires an associative operation. In the data flow computational model, instructions can only be executed if all their operands are available, but the pairing and routing of the operands for instructions is one of the most critical parts of the whole system.

An example of an early data flow computer is the Manchester data flow machine (Gurd et al., 1985). At the time the Manchester machine was constructed (1978), the largest commercially available CAM was only 64 bits in size, making the cost of building and implementing a true content-addressable matching store prohibitive. Instead, a pseudo-content-addressable matching store was implemented by manipulating data in conventional random access memory through a hardware hashing function unit (Silva and Watson, 1983). [For more information on hash coding see, for example, Knuth (1973).]

Technology improvements have now made larger-sized CAMs feasible. Two papers have recently been published that describe devices developed using content-addressable memory techniques to improve the performance of data flow systems. Takata et al. (1990) describe a high-throughput matching memory that uses a combination of a small amount of associative memory (32 words by 50 bits) with a hashing memory (512 words by 42 bits). The paper by Uvieghara (1990) describes a smart memory for the Berkeley dataflow computer. The content-addressable and reentrant memory (CARM), which is described in Section 6.3.1.1, is also suitable to construct a high-speed matching unit (Kadota et al., 1985).
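The following C fragment sketches, purely for illustration and only in the general spirit of the hashed matching store, how operand pairing can be approximated in conventional memory; the token format, table size, and routine names are our own assumptions and do not describe the Manchester hardware. A token either finds its waiting partner (so the instruction can fire) or is parked until the partner arrives.

    #include <stdint.h>
    #include <stdlib.h>

    #define BUCKETS 1024

    struct token {
        uint32_t tag;                   /* identifies the destination instruction */
        double   value;                 /* operand value carried by the token     */
    };

    struct wait_node {
        struct token      tok;
        struct wait_node *next;
    };

    static struct wait_node *bucket[BUCKETS];   /* waiting tokens, hashed by tag */

    /* Present a token to the matching store.  If a token with the same tag is
     * already waiting, remove it and return it through *partner (result 1);
     * otherwise park the new token (result 0).  The hash table stands in for
     * a hardware hashing unit; a true CAM would compare the tag against every
     * waiting token at once.                                                  */
    int match_token(struct token in, struct token *partner)
    {
        struct wait_node **pp = &bucket[in.tag % BUCKETS];

        for (; *pp != NULL; pp = &(*pp)->next) {
            if ((*pp)->tok.tag == in.tag) {
                struct wait_node *hit = *pp;
                *partner = hit->tok;
                *pp = hit->next;        /* unlink the waiting partner */
                free(hit);
                return 1;               /* pair complete: instruction can fire */
            }
        }

        struct wait_node *n = malloc(sizeof *n);
        if (n != NULL) {
            n->tok  = in;
            n->next = bucket[in.tag % BUCKETS];
            bucket[in.tag % BUCKETS] = n;
        }
        return 0;                        /* wait for the matching operand */
    }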
3.8.1 Computer Networks

Associative memory is also appropriate in an intelligent computer network routing system. It could utilize the association of message content and routing information to efficiently package and dispatch the network intercommunication. Duplicate messages within local routing areas could be sent between the global routing nodes as a single message, which would be
decoded at the receiving node by content-addressable techniques. This would reduce the traffic between the global nodes.

In multistage networks such as an omega network (see, e.g., Almasi and Gottlieb, 1989; Decegama, 1989) it is very advantageous to reduce network traffic by a technique known as message combining. This technique combines messages together that are to be sent to the same memory location. If message combining is not performed then hotspots (Pfister and Norton, 1985) can occur, degrading the performance of the system. Message combining is implemented in the New York University (NYU) Ultracomputer (Almasi and Gottlieb, 1989). Using content-addressable techniques to compare and match the destination addresses may result in substantial performance improvement.
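As a small illustration of the idea of message combining (a sketch of our own, not the Ultracomputer design), a switch node can keep a table of pending fetch requests and merge any new request whose destination address matches a queued one; an associative pending-request table would perform that address comparison in a single step.

    #include <stdint.h>
    #include <stddef.h>

    struct pending_request {
        uint32_t address;      /* destination memory location                  */
        int      requesters;   /* how many processors have joined this request */
    };

    /* Combine a new fetch request with any pending request for the same
     * address, or open a new entry (the caller guarantees the table has
     * room).  Returns the slot used.  An associative table would perform
     * the address comparison against all queued requests at once.         */
    size_t combine_request(struct pending_request pending[], size_t *count,
                           uint32_t address)
    {
        for (size_t i = 0; i < *count; i++) {
            if (pending[i].address == address) {
                pending[i].requesters++;   /* piggyback on the earlier request */
                return i;
            }
        }
        pending[*count].address    = address;
        pending[*count].requesters = 1;
        return (*count)++;
    }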
4. Neural Networks
The field of neural networks (1989a; Lippmann, 1987) has in recent years gone from a research curiosity to commercial fruition. In some ways, neural networks represent the entire field of associative memories. This is true for two reasons. First, the concepts behind neural networks were understood long before the technology was available to implement them efficiently. Second, a neural network is in every way an associative processing engine.

It is ironic that John von Neumann, whose word-at-a-time architecture has become so prevalent in the computer field, was one of the early proponents of the associative memory discipline (Gardner, 1990), long before there was any possibility of implementing a feasible system. This great man's own work helped to establish the associative systems that are now making inroads into his previously unchallenged computer architecture.

The field of neural networks has grown substantially in recent years due to improvements in VLSI technology (Goser et al., 1989; Graf et al., 1988; Treleaven et al., 1989). The number of groups actively involved in artificial neural network research has increased from about 5 in 1984 to about 50 in 1988 (Murray, 1990). Neural network business has gone from about 7 million dollars in 1987 to around 120 million dollars today (Gardner, 1990).

The basis for neural networks is massive parallelism, simple fine-grained execution elements, and a highly connected intercommunication topology. The network explores many competing hypotheses in parallel, arriving at the best solution based upon the input stimulus and the links and variable weights already in the system. The neural network has as its biological model the human brain, and it attempts to solve the same types of problems that humans can solve so well. They happen to be problems that conventional computers struggle with, mostly unsuccessfully.
Current targets of neural networks are the fields of speech and pattern recognition, process control, signal processing, nondestructive testing, and stress analysis. Despite advances in conventional computer technology, where computers have been designed that are significantly more powerful than anything available just a few years ago, speech and pattern recognition remains elusive. Pure computation speed does not seem to be an advantage for these problems. The human brain, for example, is relatively slow (about 1000 pulses per second) (Treleaven et al., 1989), yet people can recognize entities even when obscured. The current thought is that it is the parallelism that contributes to this skill, and that is where neural networks come in (Lerner, 1987).

4.1 Neural Network Classifiers
Figure 5 shows a neural network classifier (Lippmann, 1987). This system accepts input patterns and selects the best match from its storage database. The input values are fed into the matching store, where matches are made based upon the currently available data. The intermediate scores are then passed to the selection block, where the best match is filtered for output. The selection information is returned to modify the matching store data, and thus train the associative network. The data interconnections that are formed during this training session become the associative store, and provide the basis for later content-driven output selection. As selections are made, the feedback modifies the storage information such that the correct associative interconnections exist within the neural network. The weights that are used to interconnect the neurons thus change over time. This allows the network to “learn” what is appropriate behavior for various input conditions. This learning process is accomplished during a set of supervised training trial runs. Different sets of input stimuli are presented
FIG. 5. Neural network classifier.
to the network, and at the end of the sessions the network is either told how well it performed or it is given what the correct answers should have been. In this way, the neural network becomes iteratively better at its task until the point at which it is ready for real, non-training input stimulus.

The detailed underlying operation of the neural network is oriented to the higher-level function of selecting from its stored database the pattern that is "most like" the input stimulus. The definition of "most like" varies, depending upon the required operation of the network. If the neural network is created to recognize speech, for example, the comparison might depend upon some encoded version of the raw input stimuli, extracting the frequency content over some time snapshot. On the other hand, a vision recognition system might break a picture into pixels and use a gray scale level to represent and compare two images. Teuvo Kohonen, of the Helsinki University of Technology, has developed an associative memory using a neural network that does an astounding job of recognizing partially obscured facial images (Kohonen et al., 1981).
4.2 Neural Network as a CAM
A neural network can also be used as a CAM (Boahen et al., 1989; Verleysen et al., 1989a, b), and provide the correct output when only part of an input pattern is available. One example of this use is a bibliographic search subsystem. A partial citation could be input to the neural network, and the entire bibliographic entry would be found and output. This could be handled by training the neural network to recognize any piece of the bibliographic reference exactly, or by recognizing which internal reference most closely matches the input data, even if no field has an exact match. The input data might be encoded in some way and stored as a data pattern that has no easily recognizable association with the actual information. Encoding the data, however, might make classification and recognition easier or more accurate.
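A minimal sketch of this idea is the classic Hopfield-style network below, written in C for illustration only; it is not the system of any of the authors cited above, and the sizes and names are assumptions made for this sketch. Bipolar patterns are stored with an outer-product (Hebbian) rule, and recall iterates the network from a partial or corrupted pattern until it settles on the closest stored pattern, which is exactly the content-addressable behavior described.

    #define N 16                          /* neurons = bits per stored pattern */

    static int weight[N][N];              /* connection weights                */

    /* Hebbian (outer-product) storage of one bipolar pattern of +1/-1 values. */
    void store_pattern(const int p[N])
    {
        for (int i = 0; i < N; i++)
            for (int j = 0; j < N; j++)
                if (i != j)
                    weight[i][j] += p[i] * p[j];
    }

    /* Recall: starting from a partial or corrupted bipolar pattern, update
     * each neuron in turn from the weighted sum of the others until the
     * state stops changing.  The state settles on the stored pattern it
     * most resembles.                                                       */
    void recall_pattern(int s[N])
    {
        int changed = 1;
        for (int pass = 0; changed && pass < 100; pass++) {
            changed = 0;
            for (int i = 0; i < N; i++) {
                int sum = 0;
                for (int j = 0; j < N; j++)
                    sum += weight[i][j] * s[j];
                int next = (sum >= 0) ? 1 : -1;
                if (next != s[i]) {
                    s[i] = next;
                    changed = 1;
                }
            }
        }
    }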
5. Associative Storage, Retrieval, and Processing Methods
In an associative memory, we must assign state variables to conceptual items and the connections between them. The associative recall takes the form of a response pattern obtained on the output when presented with a key pattern on the input. A further input in the form of a mask pattern includes context information that selects closeness of recall. In this way, a broad search might turn up enough information to allow a more narrow
search using a new key or mask context. Many relationships can be considered when manipulating the data within an associative memory, and the intersection of the relevant items can be used for specific recall.

An example of this might be a search for all the people on a particular street who had incomes above 20,000 dollars per year. Depending upon how the information was stored, this might take one pass through the CAM or two. In the two-pass method, the first pass could provide the name of the street as the key (or key and mask combination), and the output would be a list of names. This list would be buffered somewhere and the second pass would provide a key and mask combination that only matched people of incomes greater than 20,000 dollars. The intersection of the two lists (names of people on the street and people who made more than 20,000 dollars) is the target in this example.

One way to further process the two lists would be to feed the fully extracted information from each associative pass into a standard computer, where they would be combined sequentially. This would not provide the performance of a totally parallel CAM, but would still be faster than doing the entire operation on that same sequential computer. A two-pass associative strategy could be implemented by loading the results of the first pass (including name and income) into a CAM buffer, and providing the income match as the second-pass key. The second-pass search key would be applied to the new buffer CAM which contained the retrieved list of names. This would provide a list of matches that already contained the intersection of street name and salary.

If the information was properly structured in the initial CAM, a one-pass solution to this problem is possible. For example, if the entry for each name included the person's street address and income, a key and mask combination could be formulated which only matched those entries falling into the appropriate intersection.

As the previous example hints, information in an associative memory can be arranged in different ways. The various elements in the memory can be linked by direct or indirect association.
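For illustration, the two-pass strategy can be expressed in software as follows; the record layout and routine names are assumptions made for this sketch. Pass one selects by street, pass two filters the responders by income, and the surviving indices are the desired intersection.

    #include <stddef.h>
    #include <string.h>

    struct person_record {
        char name[32];
        char street[32];
        long income;
    };

    /* Pass 1: select every record on the given street.  In a CAM this is one
     * parallel compare against the street field; here the responders' indices
     * are collected into 'hits'.                                              */
    size_t select_by_street(const struct person_record db[], size_t n,
                            const char *street, size_t hits[])
    {
        size_t count = 0;
        for (size_t i = 0; i < n; i++)
            if (strcmp(db[i].street, street) == 0)
                hits[count++] = i;
        return count;
    }

    /* Pass 2: keep only the responders whose income exceeds the threshold.
     * The surviving indices are the intersection described in the text
     * (people on the street AND above 20,000 dollars).                       */
    size_t filter_by_income(const struct person_record db[], const size_t hits[],
                            size_t count, long threshold, size_t out[])
    {
        size_t kept = 0;
        for (size_t i = 0; i < count; i++)
            if (db[hits[i]].income > threshold)
                out[kept++] = hits[i];
        return kept;
    }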
5.1 Direct Association
Direct association makes a logical link between the stored items. Using one of the items as a key causes the memory to present the associated item. This method of association is limited to systems where the interconnection fields of all the data items are known at the time the information is stored. A direct association storage mechanism can have more than two items in the link, as long as they are fixed and specific.

In the example at the start of this section about street names and income levels, it was mentioned that
certain methods of storage could provide the ability to retrieve the intersection in one pass. A direct association system might allow this. If the name, street, and income were all combined directly then one key and mask combination could be used to query the database and pick off the matching entries. The drawback to this, of course, is that every conceivable link must be known and fixed from the start. The larger and more subtle the interconnections become, the more cumbersome this method is. If we wanted to add religion, political affiliation, and marital status to the list, it would soon be impossible to provide a one-pass answer to any reasonably useful query. Beyond that, it would be impossible to query the database using any link that was not understood during the storage.

5.2 Indirect Storage Method

The indirect storage method involves the use of inferences to save information, giving an object a certain value for an attribute. In a simple case, three pieces of information can be stored for each set. By providing one or two of the pieces of information as a key, the entire triple can be accessed.
FIG. 6. Indirect association.
Consider an apple with the color red, as shown in Fig. 6. The object here is "apple," the attribute is "color," and the value is "red." This can be represented by the triple (apple, color, red) (Kohonen, 1977; Stuttgen, 1985). By providing the value and the attribute (X, color, red), we extract the name of the object (X = apple). Alternatively, we could present the object along with the attribute (apple, color, X) to extract the value of the color (X = red). If we present only the object (apple, X, Y), the given response is both the attribute and the value (X = color, Y = red). This returns general information about the object. Relational structures such as this can be built up to create complex concepts (Kohonen, 1977). In this example, the database could be expanded to include information about other attributes (taste, feel, etc.) and contain other objects with separate or overlapping values.
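The triple store lends itself to a compact software illustration. In the sketch below (our own, with invented names), a NULL argument plays the role of the unknown X or Y, so the same routine answers any of the three queries just described.

    #include <stdio.h>
    #include <string.h>

    struct triple {
        const char *object;
        const char *attribute;
        const char *value;
    };

    /* Query the store with any combination of known fields; a NULL argument
     * is the unknown (the X or Y of the text).  Every matching triple is
     * printed; a CAM would deliver all of the responders in one operation.  */
    void query(const struct triple store[], int n,
               const char *obj, const char *attr, const char *val)
    {
        for (int i = 0; i < n; i++) {
            if (obj  && strcmp(store[i].object,    obj)  != 0) continue;
            if (attr && strcmp(store[i].attribute, attr) != 0) continue;
            if (val  && strcmp(store[i].value,     val)  != 0) continue;
            printf("(%s, %s, %s)\n",
                   store[i].object, store[i].attribute, store[i].value);
        }
    }

    /* Example: with the triple (apple, color, red) in the store,
     *   query(store, n, "apple", "color", NULL)  recovers the value "red";
     *   query(store, n, NULL, "color", "red")    recovers the object "apple". */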
5.3 Associative Database Systems

The ideas presented above can be embodied in an associative database system using the general concepts of database theory (Gillenson, 1987, 1990). Information is stored as objects, or entities, with descriptive
characteristics called attributes. Within the database, the entities have associations with one another. Combining associations leads to a relationship. For example, a mechanic might have taken a particular course, and the two entities of "mechanic" and "course" form an association. The mechanic's grade in that course would be an attribute of the relationship.

The three database models in use today are the hierarchical, network and relational systems (Bic and Gilbert, 1986; Holbrook, 1988). A relational database model, the newest of the three structures, overcomes the record-oriented limitation of the hierarchical and network descriptions. It allows many-to-many relationships without the pointer overhead. Since the information is not stored with a predefined structural relationship, the relational model is most appropriate for implementation as an associative processing system.

The traditional relational model can be referred to as complete. Complex relationships can be described, but each piece of information is explicitly represented. Large knowledge bases can also be stored and processed using an incomplete relational model (McGregor et al., 1987). In this model, each entity is designated to be a member of a particular class, which is a category or type identification. Class information is stored at the level of the class, and each member is assumed to have the characteristics of its associated class. This storage method allows a limited inference mechanism. A type lattice can be created from this definition, where entities can be considered part of more than one set. This relationship is shown in Fig. 7.

The generic relational model (GRM) has been created to precisely define this method of database processing. The GRM consists of objects, or subsections, which communicate by means of a message-passing protocol. A query will normally describe an implicit tuple, which will then be translated
FIG. 7. Relationship between classes in a type lattice.
through model expansion, or inference, to a set of explicit tuples. Section 6.2.12 describes a system that implements the GRM in a real computer.

At the user level, the informational query to a database system is obviously not formed in tuples or sets. Rather, it takes the form of a specific question. The query can ask for an exact match ("Retrieve the record whose identification number is 1357"), a partial match ("Retrieve the records whose first name is Smith and who are Republicans"), or an orthogonal range ("Retrieve the records for all identification numbers between 1000 and 2000") (Wu and Burkhard, 1987). The storage and retrieval hardware/software mechanism must be matched to the underlying architecture to obtain satisfactory performance at reasonable cost. Many examples of this symmetry are provided in the sections which describe the associative systems that have been conceived and implemented.
5.4 Encoding and Recall Methods
Various recall methods are possible when using an associative memory for search and match operations (Kohonen, 1977).

5.4.1 Hamming Distance
One of the earliest methods of best match retrieval was based on the Hamming distance devised by R. W. Hamming (Hamming, 1980). Many computer systems now use "Hamming" error-correcting codes which are designed with the assumption that data and noise (error information) are random. In other words, no statistical basis exists for assuming that certain patterns will be more prevalent than others. We can construct Hamming codes to allow correction of any number of random error bits. They work by using only a portion of all the data patterns possible given a certain number of bits in the word. In practice, extra check bits are added to a number of data bits, and the combination of check and data bits forms the data word. The code's error detection ability is symmetrical, in that an error in either a data bit or a check bit will be handled properly. The number of extra check bits necessary for the data bits depends upon the size of the needed usable data word and the necessary capability for correction. For example, to detect and correct 1 error bit in 32 bits of usable data, 7 check bits are needed; for a 64-bit data word, 8 check bits are required.

A geometric analysis of the same coding technique introduces the concept of a "Hamming distance." If all the possible patterns of data and check bits are enumerated, only a subset are used for actual stored information. The legitimate information patterns are stored such that a data word with errors
is geometrically closer to one correct data pattern than any other. We can picture this as a distance between legitimate data words, as shown in Fig. 8 for a 3-bit code. The two correct data patterns in this example are (0, 0, 0) and (1, 1, 1). The other corners of the cube are possible data patterns with at most a single bit error. The circled corners are the data patterns that will be corrected to (0, 0, 0) and the corners with squares will be corrected to (1, 1, 1). The dotted lines form planes that separate the two data domains.

FIG. 8. Three-dimensional model of Hamming distance.

An associative memory can use the concept of Hamming distance for inexact retrieval by allowing the correction process to respond with the closest data pattern for a given query key. In this case, a single response is possible, and any input will respond with some output. By choosing an appropriate Hamming distance between legitimate data words, we can construct a robust associative memory. More legitimate data patterns can be accommodated for the same number of memory bits if the system will accept multiple responses. In this case, there might be many memory words at the same Hamming distance in relation to the key word, and each of them is accounted for in the memory response.
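The best-match rule is easy to state in software. The following fragment, included only as an illustration (the names are our own), returns the stored word nearest to the key in Hamming distance; a hardware realization would evaluate all of the distances in parallel rather than looping, but the retrieval behavior is the same: every key produces a response, namely the closest legitimate pattern.

    #include <stdint.h>
    #include <stddef.h>

    /* Number of bit positions in which two words differ. */
    static int hamming_distance(uint32_t a, uint32_t b)
    {
        int d = 0;
        for (uint32_t x = a ^ b; x != 0; x >>= 1)
            d += (int)(x & 1u);
        return d;
    }

    /* Best-match retrieval: return the index of the stored word closest to
     * the key in Hamming distance (n must be at least 1; ties go to the
     * earlier entry), so every query key produces some response.           */
    size_t nearest_word(const uint32_t mem[], size_t n, uint32_t key)
    {
        size_t best   = 0;
        int    best_d = hamming_distance(mem[0], key);
        for (size_t i = 1; i < n; i++) {
            int d = hamming_distance(mem[i], key);
            if (d < best_d) {
                best_d = d;
                best   = i;
            }
        }
        return best;
    }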
5.4.2 Flag Algebra

A data transformation method using a concept called "flag-algebra" (Tavangarian, 1989) has been suggested to enhance the parallel processing of associative data in a uniprocessor hardware system. This approach replaces complex searching, arithmetic, and logical operations with simple Boolean functions. The system consists of three major components. First, the word-oriented data must be transformed into flag-oriented data. This new representation identifies each word as a flag in a bitvector; the data is processed by manipulating the flags. The second part of the system processes the flag-oriented data using a new algebra based on set theory, Boolean algebra, and the special characteristics of the flagvectors. This new processing method, called "flag-algebra," is used to manipulate all the flags simultaneously.
Finally, the flag-oriented resulting bitvectors must be converted back to word-oriented data.

A flag-oriented associative processor has been suggested for the implementation of the above method. The program and word-oriented data are stored in the sequential PD memory (program/data memory). The word-oriented data coming from the PD memory or the input/output (I/O) units are converted to flag-oriented data during the program execution and stored in the flag memory and flag registers. Parallel, associative techniques are used to manipulate the flag-oriented data. A sequential control unit directs the operation of the processor, and obtains its instructions from the PD memory.
5.5 Memory Allocation in Multiprocessor CAMs

Multiprocessor systems can also make use of content-addressable memories, but care must be taken in such cases to allocate the CAM appropriately. In particular, the memory must be allocated so as to reduce the conflicts that arise when more than one processor needs access to the same associative data. An example of this is a system incorporating several CAMs to store the tables used in a multiprocessor relational database engine. In a traditional, location-addressed memory, noninterference can be guaranteed by giving each processor a unique address range. When the data is instead manipulated by content, the solution to this problem is less obvious.

To make the best use of the higher cost associated with the CAM hardware, an allocation strategy should provide for minimal overhead to prevent access contention. That is, most of the storage should be used for real data. Furthermore, all memory allocations should be conflict free. Finally, during one revolution of a circulating-type memory, the contents of each CAM should be capable of supporting a complete fetch by a set of noninterfering processors.

Kartashev and Kartashev (1984) discuss a method for supporting such a system in terms of minimal and nonminimal files. A minimal file includes a set of data words that can be accessed by the same processor (let us say, processor P) in consecutive revolutions without the possibility of another processor changing them between accesses. Those data words are "connected" to processor P for the consecutive revolutions. A nonminimal file describes a set of data words that do not have this property. Both minimal file allocation (where each processor accesses only its minimal file during a memory revolution) and nonminimal file allocation (where each processor can access a nonminimal data file during a memory revolution) are described in the paper referenced.
5.6 CAM Reliability and Testing

When we discuss retrieval of information, it is assumed that the information occupying the CAM cells is correct. That is, the data stored in the memory is what we intended to put there. Even if we ignore the possibility that some software flaw put erroneous data into the memory, we cannot guarantee that the information is correct, since real hardware devices do fail on occasion. Given that CAM-based hardware is likely to grow in both storage capacity and importance in the years ahead, the reliability of such systems is a legitimate concern for designers (Grosspietsch, 1989).

Testing a content-addressable memory is far from trivial, and current RAM test methods cannot be directly applied to this new technology. From a fault perspective, the CAM can be viewed as a combination of traditional random access memory (RAM) storage with extra logic to perform the comparison, masking, selection, and so on (Grosspietsch et al., 1986). So we can expect the same kinds of stuck-at, coupling, pattern-sensitive, and leakage problems already familiar to conventional memory designers. Furthermore, CAM faults can be classified into two broad categories: (1) a word that should match the search key misses, and (2) a word that should not match the search key hits.

To detect errors on the fly, an error-detecting code can be added to the CAM storage section. The false "miss" fault (the first type above) can be detected during CAM verification by loading a pattern into all the CAM words and presenting that pattern as a search argument. Every word should "hit" in the CAM, and any word that does not show a match can be assumed to have a problem. Diagnosis can be made even better by providing the ability to read (as well as write) the search, mask, and match (hit) registers. One proposed solution to detect a false "hit" (the second fault type above) is to add extra, strategically located hardware. For each retrieval of a matching word (based upon the "hit" register), an extra comparison can be made between the search/mask register combination and the output match register. If the selected output data word does not match the properly masked input search register, a false hit is confirmed.

Another method suggested to test CAMs is to gang entire groups of cells together and to look for any words that do not follow an expected retrieval pattern (Mazumder and Patel, 1987). For example, all the odd lines can be grouped into one class and the even lines into another class. If all the even lines react the same (expected) way, then they are all assumed to be good. If at least one line does not provide the same matching output as the rest, a single error line can present the evidence of this error. In this way, a good CAM is tested quickly, since the error line will show that all the words reacted as expected.
When employing any of the above methods, the use of proper test patterns can speed the verification process and allow more specific diagnosis when an error is detected. Once an error has been detected in the CAM, the question remains of how to deal with it (Grosspietsch et al., 1987). The entire memory can be rendered unusable, but this becomes less attractive as the memory grows in size and is implemented in ever more dense technologies. Some better methods to deal with bad cells are:
1. Swap a spare (good) cell with a bad one (cell redundancy).
2. Mark the bad word locations as unusable (graceful degradation by shrinking the number of words).
3. Triplicate each search and use a voting mechanism to correct errors.
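As a sketch of the false-"miss" check described above (this code is illustrative only; the CAM is modeled as a Python list, and the word width and stuck-at fault are hypothetical), every location is loaded with the same pattern, the pattern is presented as the search key, and any word that fails to report a match is flagged as suspect.

    PATTERN = 0xA5A5

    def load_all(cam, value, stuck_at=None):
        # Write the same value to every word; 'stuck_at' models a hypothetical
        # cell whose bit 0 is stuck at zero.
        for i in range(len(cam)):
            cam[i] = value
        if stuck_at is not None:
            cam[stuck_at] &= ~1

    def search(cam, key):
        # A parallel compare in hardware; sequential here for illustration.
        return {i for i, word in enumerate(cam) if word == key}

    cam = [0] * 256
    load_all(cam, PATTERN, stuck_at=7)
    missing = [i for i in range(len(cam)) if i not in search(cam, PATTERN)]
    print(missing)        # -> [7]: word 7 gave a false "miss"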
6. Associative Memory and Processor Architectures
The data in an associative memory can be grouped by bits, bytes, words, variable fields, or blocks. Furthermore, the architecture can identify the information by discrete data fields (such as words), or distribute the information throughout the memory (a neural net is one example). The architectural trade-offs that must be considered when deciding the orientation of the memory are (Thurber, 1976):

• The storage medium
• The communication between cells
• The type of retrieval logic
• The nature and size of external logic (such as registers and input/output ports).
The ultimate goal of a word-oriented CAM is to compare each data word with a key word, appropriately modified by a mask word. Every word in the memory is inspected in parallel, and any matching words are identified by a "match" bit (of which there is one per data word). The words can then be retrieved by cycling through the match bits. This tends to carry a high cost premium because of the comparison circuitry necessary for every single bit in the memory. In practice, some serialization is ordinarily used to get close to this ideal of totally parallel operation.
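A minimal software model of this masked compare (illustrative only; a real CAM performs the comparison in parallel hardware, and the word values here are hypothetical):

    def cam_search(words, key, mask):
        # A match bit is set for every word whose unmasked bits equal the key.
        # Bits set in 'mask' are excluded from the comparison.
        care = ~mask
        return [((w ^ key) & care) == 0 for w in words]

    words = [0x1234, 0x12FF, 0xABCD]
    match_bits = cam_search(words, key=0x1200, mask=0x00FF)
    print(match_bits)      # -> [True, True, False]: the low byte is masked off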
A typical method of simplification is called word-parallel, bit-serial (or bit-slice). The data in a bit-serial associative memory (Fig. 9) is inspected one bit at a time, with the same bit in every word in the memory examined simultaneously. The bit position in each memory word is handled as a slice running from the first memory word to the last. The search time through such a memory is related to the word width rather than the memory depth (number of words), because the entire memory depth is searched in parallel for each bit in the word. An example of a word-parallel, bit-serial design is the one developed by Blair (1987). A word-parallel, byte-serial memory can also be constructed on the same principle, only the "slice" is a byte rather than a bit. This reduces the number of shifts necessary to compare the entire data word, at the cost of more comparison circuitry (since an entire byte is compared in parallel for each data word). Although less useful in practice, a word-serial, bit-parallel associative memory can also be created (Kohonen, 1987). In this CAM architecture, one whole word is read in parallel and compared to the key/mask combination. The memory cycles through all the words sequentially with each associative access. This reduces the number of parallel comparison circuits at the cost of an increased cycle time, which grows as the number of word entries grows. The advantages of this method over simply using a normal address-based memory (and programming the comparison) are simplicity and access speed. The programmer can simply specify the key word and the hardware handles the actual work. Having the sequential comparison happen at a very low hardware level makes this operation much faster than if a program were to execute it one word at a time.
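The bit-serial search can be sketched as follows (illustrative only; the word width and data are hypothetical). A set of candidate match bits is narrowed one bit slice at a time, so the number of iterations depends on the word width, not on the number of stored words.

    def bit_serial_search(words, key, width):
        # Start with every word marked as a potential match.
        match = [True] * len(words)
        for bit in range(width):
            key_bit = (key >> bit) & 1
            # One pass per bit position: every word is examined "in parallel"
            # for that single bit slice.
            for i, w in enumerate(words):
                if match[i] and ((w >> bit) & 1) != key_bit:
                    match[i] = False
        return match

    words = [0b1010, 0b1011, 0b0110]
    print(bit_serial_search(words, key=0b1010, width=4))   # -> [True, False, False]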
FIG. 9. Bit-serial associative memory block diagram.
The hardware designer can use the limited access types and distances to optimize the cycling of the data words, perhaps using block-mode mechanisms or other technology-specific cycling abilities.

An associative memory architecture that has been considered useful, especially for large database systems, is called block-oriented (Su, 1988). The systems that use this architecture are largely derived from Slotnick's "logic per track" system, described in his classic 1970 paper (Slotnick, 1970). The simplest way to understand this type of memory is to envision a rotating disk, a commonly used method of implementing a block-oriented CAM (Smith and Smith, 1979). Each cylinder of the disk is broken into blocks, and there are multiple processing elements (a disk read/write head, perhaps with some filtering logic), one for each cylinder. The information on the rotating medium serially passes by each of the parallel processing elements. The amount of information that can be accommodated with such an architecture can vary dramatically, depending upon the clocking scheme chosen. For example, Parhami (1989) has shown that a 70-90% capacity improvement can be realized at minimal cost by moving from a single scheme to one where the tracks are divided into equal-capacity groups.

It is, of course, not necessary to view the information in the memory as a series of words at all. A distributed logic memory (DLM) (Lee, 1962) places the comparison and manipulation logic into each cell, and thus performs the comparison function on every cell truly simultaneously. The information content of the memory is not made up of discrete information locations. Rather, it consists of information distributed throughout the entire memory. This system is described further in Section 7.2.

6.1 Associative Memory Design Considerations
As with a conventional memory, the three most important design considerations for the construction of an associative memory are speed, cost, and density. The speed of a content-addressable or associative memory depends upon (Stuttgen, 1985) the access time of the individual storage elements, the cycle time of the comparison system, and the degree of parallelism in the operation. This can be seen most clearly by example. A bit-serial associative memory has less inherent parallelism than a distributed logic memory. If the word width is (for example) 10 bits, then it will take at least 10 cycles through the comparison system for every lookup. The distributed logic memory (as well as the all-parallel word-oriented CAM) inspects every cell totally in parallel, and reaches its matching decision in one cycle. However, a DLM has a longer cycle time for each cell operation than a bit-serial memory because of the extra logic in each cell and the interconnection hardware required.
The useful speed of a particular associative memory thus depends upon the characteristics of the application and how they relate to the architecture of the memory. Word-oriented (including bit- and byte-serial) CAMs perform better when executing data and arithmetic computation problems, where the processing information is naturally in word format. Fewer cycles (than on a distributed logic memory) should be needed to read the operands, perform the operation, and store the results. A distributed logic architecture, on the other hand, works very well for equality comparisons, since operations of that kind are naturally performed in parallel and simultaneously over all the bits in the memory (Thurber, 1976). The task of input/output is a problem in a bit-serial CAM, since a conventional location-addressable, word-oriented computer is likely to be the interface to the world of humans. Some method of accessing the information by word is necessary, which entails the ability to read and write in a direction orthogonal to the normal processing operations. A totally distributed logic memory has the same problem, since no word access is even necessarily defined. However, a more structured DLM (such as an all-parallel but word-oriented CAM) can provide I/O in a format easily used by a normal computer.

The cost of an associative memory is controlled by the price of the storage elements, the cell interconnection expense, and the amount (and per-gate cost) of logic associated with each element. In general, a bit- or byte-serial architecture will be less expensive than a distributed logic architecture because of the smaller amount of logic for each storage element. The density of a memory system is related to the size of the storage elements, the overhead associated with the comparison, and the amount of interconnection between the elements. A bit-serial system will be more dense than a DLM due to a reduced comparison overhead (remember that the DLM has the comparison in every bit). The DLM is also likely to have more interconnection than the bit-serial CAM, especially if the storage is very unstructured.
6.2 Associative Processors

6.2.1 The STARAN
The most notable truly associative processor is the STARAN (Batcher, 1974), developed by the Goodyear Corporation in the early 1970s. What gives it such prominence is that it was successfully offered for sale to paying customers, and it can thus be considered the first practical associative processor ever produced. The STARAN architecture was designed so that an off-the-shelf implementation was feasible, and this cost-reduction principle coupled with some clever features probably contributed to its success (Feldman and Fulmer, 1974). There are many good descriptions of the machine in
varying degrees of depth (Foster, 1976; Thurber and Wald, 1975; Thurber, 1976), so only an overview is provided here. The description here will be used in a later section to give one example of associative processing software. The STARAN machine consists of an associative array, a PDP-11 sequential controller unit, and a sequential program memory. The controller executes operations on the associative array based upon normal sequential instructions residing in the program memory. The associative array can be made up of several array modules, each of which is described next. In the STARAN associative module there is one processing element (PE) for each word, and the set of PEs forms a bit slice across the data words. The associative processing is performed in a bit-serial manner, but I/O is performed in word-serial form (where a data word is accessed with all its bits in parallel). These two access methods (bit and word) are accommodated through a multidimensional access (MDA) memory, which is 256 bits by 256 bits. The MDA allows the memory information to be accessed by word, by bit slice, or by other fixed formats (such as a number of records with 8-bit bytes in each record).
6.2.2 A Hierarchical Associative Memory System

A hierarchical associative memory system has been described by Stuttgen (1985) to take advantage of the trade-offs available between performance and cost. The first level of the hierarchy would be a fast, flexible, distributed, relatively expensive (but still cost-effective) associative memory that would operate on local data quickly. The second level would be a larger, slower (perhaps bit-serial), less expensive associative store containing the entire database or program. This two-level approach is conceptually similar to having a cache buffer between a traditional computer and its location-addressed memory. The analogy to a cache extends to the user's perspective in the design, since the high-speed memory must be architecturally invisible to the programmer (except for the performance increase, of course). As far as the user is concerned, the associative memory is one large, powerful storage unit. Therefore, dedicated hardware would control the interaction between the various levels of storage. The local "cache" (first-level memory) would process the currently active subset of data items, only going to the larger (but slower) second-level storage when it ran out of room. Lest we carry this analogy with a traditional memory too far, let us look at some major and important differences. In an associative computer, much of the processing happens in the memory itself. The parallel associative matching and manipulating can
therefore take place on either level of storage. This is totally unlike a location-addressed system, where all the processing must eventually travel to some specific processing element [central processing unit (CPU) or smart peripheral]. The question of when to send information between the two storage (and processing) levels is thus made more difficult, but also more flexible. One way to handle the problem of data transfer is to split the main storage (second level of the hierarchy) into sections. The system would determine which sections to transfer to the first level based upon a trade-off between the overhead of the interlevel data transfer and the expected performance gain (a standard cost/benefit analysis). Another solution to the data transfer problem is to perform an associative search of the main storage with a particular key and context mask. The number of responses from the main memory could be counted, and the mask or key could be modified until the number of responding cells was able to fit into the first-level memory. In either of these scenarios, and in any formulation of level management, fast information interchange between the levels is important. We presented an earlier example in Section 5, where we wished to query a database and have returned to us all the people on a particular street who made more than a certain income level. The hierarchical system above could be used to good advantage for this by embarking on a two-pass search. The first pass would retrieve all the people on the street from the large, slower database storage. The information from this search could be loaded into our fast, local CAM, and the income level could then be used to query the local memory.
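A sketch of that two-pass search (illustrative only; the record fields and threshold are hypothetical, and both levels are modeled here as simple Python collections):

    records = [                                   # second-level (bulk) store
        {"name": "Ames",  "street": "Elm", "income": 48000},
        {"name": "Baker", "street": "Oak", "income": 91000},
        {"name": "Cole",  "street": "Elm", "income": 75000},
    ]

    # Pass 1: associative search of the large, slow store by street.
    local_cam = [r for r in records if r["street"] == "Elm"]

    # Pass 2: query the small, fast first-level memory by income.
    answer = [r["name"] for r in local_cam if r["income"] > 60000]
    print(answer)                                  # -> ['Cole']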
6.2.3 The HYTREM Database Search System

A different hierarchical approach was taken by the designers of the HYTREM (Hybrid Text-REtrieval Machine) database search system (Lee, 1990), as shown in Fig. 10. In this system there are actually three levels of hierarchy, the top two of which are associative. Before describing the system in more detail, it is important to understand its intent, so that the elegant operation of the various pieces can be appreciated. The HYTREM is meant to store and retrieve large text databases efficiently. It uses a text signature to screen the entire database initially, then does a final selection on the remaining entries in a more accurate manner. The first level of hierarchy is a relatively small but fast bit-serial associative memory that stores a hashed signature of the database. The compressed signature information is typically only 10-20% as large as the entire database, and can thus be searched quickly. The first screen eliminates all the text records that cannot possibly match the query. The remaining records are likely to match, but there may be some false-positive indications (what the designers call false drops).
FIG. 10. Diagram of the HYTREM system.
A multiple-response resolver (MRR) is included to retrieve the qualified pointers and send them to the next stage. The next level of hierarchy is a text processor with more complex pattern-matching capabilities, called the associative linear text processor (ALTEP). The ALTEP does a more thorough matching operation on the text that is delivered from the signature file, and makes a final determination about the appropriateness of the text to the query. It is implemented as a linear cellular array, optimized for signature file access. The text in a signature file system is broken into fixed-length blocks and is loaded into the ALTEP cells on demand. At that point the array functions as an associative matching processor. The ALTEP also has an MRR to deliver the information to a sequential controlling element. The final, lowest level of hierarchy is a mass storage system. This contains the entire database and delivers the text both for the ALTEP matching operation and for whatever further information is requested after a successfully matched query. The designers of the HYTREM envision this storage level as a set of magnetic disks, with a cache memory somewhere in the path to add a performance boost. Even with the relatively slow access time of a disk, they believe performance can be kept respectable through parallel and overlapping operation of the various system components.
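A rough sketch of signature screening (this is not HYTREM's actual hashing scheme; the hash, signature width, and documents are hypothetical) shows why the first level can reject most records cheaply while occasionally passing a false drop on to the exact-match stage:

    SIG_BITS = 64

    def signature(text):
        # Superimpose a hashed bit per word into one compact signature.
        sig = 0
        for word in text.split():
            sig |= 1 << (hash(word) % SIG_BITS)
        return sig

    def may_contain(record_sig, query_sig):
        # All query bits must be present; otherwise the record cannot match.
        return (record_sig & query_sig) == query_sig

    docs = ["the cat sat", "dogs bark loudly", "a cat naps"]
    sigs = [signature(d) for d in docs]
    query = signature("cat")

    candidates = [d for d, s in zip(docs, sigs) if may_contain(s, query)]
    # Final, exact pass over the (few) surviving candidates.
    answer = [d for d in candidates if "cat" in d.split()]
    print(answer)      # -> ['the cat sat', 'a cat naps']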
6.2.4 Syracuse University Database System

An efficient data/knowledge base engine has been suggested by researchers at Syracuse University (Berra, 1987b), targeting huge systems comprising
hundreds of gigabytes. As with the HYTREM system, they suggest the use of a hashed signature or descriptor file, which they call a surrogate file, to greatly compress the access information contained in the full (or extensional) database (EDB). The surrogate file can be used to index the EDB, and even provides the ability to perform some relational operations on the data without access to the larger EDB. An associative memory would be used to implement the surrogate file, and any retrieval that depends upon only a partial match would be performed directly at this level. The memory would perform functions such as exact match, maximum, minimum, and Boolean operations. Since the surrogate file is compact and regular, the major drawbacks of associative memories (prohibitive cost for large storage and data format rigidity) are no longer issues. At some point the entire reference must be obtained from the full EDB, and the hashing function must be chosen so as to ensure efficient data transfer.

The surrogate file can be created using a superimposed code word (SCW) mechanism, where the individually hashed values are ORed together. The index into the EDB would be guaranteed to retrieve any existing facts that match the query, but with the SCW method there might also be unwanted facts retrieved (false drops). These must be eliminated after retrieval, which means more data transfer from the EDB and more postretrieval processing. An alternative hashing function, called concatenated code words (CCW), concatenates the individually hashed entities before storage in the surrogate file. This makes it far more likely that all the retrieved facts are desired (few false drops), but it necessitates a longer word length to accommodate the extra information. Using a CCW should reduce the amount of data transfer traffic between the EDB and the front-end processing unit. Simulation of the two hashing schemes described above has shown that the surrogate file can easily be less than 20% of the size of the entire EDB, and that the hashing function must be chosen based upon the characteristics of the information contained in the database. The amount of redundancy in the database must be analyzed to determine which of the hashing functions will provide the smaller surrogate file.
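The difference between the two surrogate-file encodings can be sketched as follows (illustrative only; the per-field hash width, hash function, and sample fact are hypothetical). The SCW ORs the hashed fields into one short word, which is compact but can alias; the CCW concatenates them, which needs a longer word but is far less ambiguous.

    FIELD_BITS = 8                     # hypothetical per-field hash width

    def h(value):
        return hash(value) % (1 << FIELD_BITS)

    def scw(fields):
        # Superimposed code word: OR the hashed fields together.
        code = 0
        for f in fields:
            code |= h(f)
        return code

    def ccw(fields):
        # Concatenated code word: place each hashed field in its own slot.
        code = 0
        for i, f in enumerate(fields):
            code |= h(f) << (i * FIELD_BITS)
        return code

    fact = ("smith", "elm_street", "engineer")
    print(bin(scw(fact)))              # one 8-bit word; may cause false drops
    print(bin(ccw(fact)))              # 24-bit word; few false drops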
6.2.5 CAM-Based Hierarchical Retrieval Systems

Hashizume et al. (1989) discuss the problem of data retrieval and also suggest a hierarchical architecture to obtain good performance for a reasonable cost. Although modern integrated circuit technology has provided the ability to create a truly useful CAM, they argue that it will always be more expensive than a large bulk memory device. Their model of the proper associative system consists of a parallel VLSI CAM as the local high-speed
search engine, and a block-oriented mass storage device as the main memory. The blocks in the mass storage device would correspond to the capacity of the local CAM, and data would be transferred back and forth from the mass storage in single-block packets. Their paper evaluates the performance of the proposed system by making assumptions about the data characteristics and changing various parameters to determine the outcome.
6.2.6 The LUCAS System

Another interesting associative processor is the LUCAS (Lund University content addressable system) (Fernstrom et al., 1986). It was built during the early 1980s in order to study highly parallel systems, specifically their architectural principles, programming methodology, and applicability to various problems. The LUCAS system contains four major blocks, as shown in Fig. 11, the most interesting of which is the associative processor array. The processor array is interfaced to the outside world through a standard sequential master processor, which sends instructions to the associative array through a control unit. The processor array is composed of 128 processors, each of which has a 4096-bit memory module and a PE. The processors are configured in a bit-serial organization. Data in a memory module is connected to the PE, and can be used in the same memory module or routed to a different one. A typical operation for the LUCAS is to compare the processor array contents (configured so that all the memory modules are accessed as one associative memory) with some template. Matching words are accessed by a multiple-match resolving circuit. Data is routed among PEs by an interconnection network that allows eight possible sources for each PE input. One of the inputs is dedicated to the path between the PE and its memory, and the other seven can be configured to suit a particular application. This scheme has several important ramifications. The communication links are fixed (not data dependent), and the transfer of information happens in parallel between the source/destination pairs. The data are permuted as they are transferred.
FIG. 11. Diagram showing the LUCAS system blocks.
If there is no direct link between two PEs, multiple passes through the network must be made to route information between them. The LUCAS architecture makes it a reasonable engine for certain classes of problems, and the developers suggested several applications for it. It can be used effectively for matrix multiplication, fast Fourier transforms, and graph-theoretic problems (such as the shortest distance between points, minimal spanning tree, etc.). Furthermore, LUCAS was suggested as a back-end processor for relational database processing, and as a dedicated processor for image processing. (The software for this system is discussed in Section 7.6.)

6.2.7 Matching Hardware and Software
A novel computer architecture has been suggested by Blair and Denyer (1989) that uses the power of content addressability to attack certain classical data structures and algorithms. Among their examples are vectors, lists, sets, hash tables, graphs, and sorting problems. The CAM is bit-serial and word-parallel to provide a constant speed unrelated to the number of words, to minimize the pin count and internal buses, and to keep the memory bit size reasonable. Each word has a tag set associated with it that is used to manipulate the contents of that word. There are two tags that keep the status of a comparison operation, and one tag that identifies the data word as empty (the "empty tag"). An empty tag status means the associated data word is undefined and available for new data. Empty locations also take no part in any comparison. The CAM operates as follows. A masked comparison is performed on the entire memory, and the appropriate tags are left "active" after this operation. Each matching word is inspected one at a time, and it is either manipulated somehow or marked as empty (by setting the "empty tag" for that word). The matching tag for that word is then made passive (cleared), which brings up the "next" active tag (from the original matching operation). When all the words from the last match have been inspected and made passive, the CAM signals that there are no more responders. The smaller modular CAM described above can be cascaded to form large CAMs. A group "lookahead" signal, formed by a NOR function on the tag values, can be used to enhance performance by bypassing groups that have no active words. (The software for this system is discussed in Section 7.8.)
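The match-and-iterate discipline can be sketched as follows (illustrative only; the tag handling is simplified and the data values are hypothetical): one masked comparison marks all responders active, and the responders are then visited one at a time by clearing each tag in turn.

    EMPTY = None

    def masked_match(words, key, mask):
        # An active tag is set for every nonempty word whose unmasked
        # bits equal the key; empty locations never participate.
        return [w is not EMPTY and ((w ^ key) & ~mask) == 0 for w in words]

    def visit_matches(words, key, mask):
        tags = masked_match(words, key, mask)
        while any(tags):
            i = tags.index(True)      # the "next" active responder
            yield i, words[i]
            tags[i] = False           # make this tag passive and move on

    words = [0x10, EMPTY, 0x13, 0x10]
    for index, value in visit_matches(words, key=0x10, mask=0x00):
        print(index, hex(value))      # -> 0 0x10, then 3 0x10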
6.2.8 The DBA System

The DBA (database accelerator) system (Sodini et al., 1986; Wayner, 1991) at the Massachusetts Institute of Technology (MIT) is a research vehicle
to discover and evaluate appropriate uses for content-addressable memories. The DBA can be viewed as a single-instruction, multiple-data (SIMD) (Flynn, 1972) content-addressable processor, with the additional ability to store and manipulate "don't care" conditions, to selectively enable subsections of the memory, and to process sequences through a finite state machine. Each 32-bit CAM word in the DBA has its own 1-bit microprocessor, a set of four single-bit registers that can be used to store the results of the current operation or feed into the next operation, a matching-pattern result latch, a selection circuit that enables the word, and a priority encoder to provide a serial output when more than one CAM word hits. The DBA system, which is a combination of the above basic cells, is organized as a set of 1-bit data paths connected in a linear, nearest-neighbor topology.

As an example of the power such a system can provide, the designers suggest its use to enhance the performance of logic simulation. To use the DBA in this fashion, the network under simulation should be viewed as a clocked system, where the logic description can be represented as a sum of products feeding some latching element (most modern synchronous designs meet these constraints). The simulation is carried out by taking the logic network, represented as a Boolean function of its inputs, and computing in parallel the results of a set of functions over a set of variables. This is done in several steps. Before starting the actual simulation, the input variables are assigned unique bit positions in the CAM word. As a simple example, assume that the CAM consists of four words, each word having a width of 4 bits. Consider the expression
D = (A × B̄) + (Ā × C)
which contains three input variables (A, B, C). These could be assigned to the leftmost three bit positions in the CAM word. The DBA's ability to store and manipulate "don't care" conditions is crucial here. The minterms can be represented by the words 10X and 0X1 (ordered as ABC, where the "X" term means "don't care"). Each minterm is assigned its own CAM word, and one "in-use" bit per word specifies whether that term is to take part in the simulation. If there are more inputs than a single CAM word can accommodate, the simulation must employ a multiple-pass technique, making use of the DBA's sophisticated logical ability and internal storage; the general procedure does not change. The simple example above uses only 2 CAM words to determine D, so the "in-use" bit (we assume it is the rightmost bit) is set only in the two significant words. The 4 words would be 10X1, 0X11, XXX0, XXX0. The actual simulation is carried out in three phases. The first phase evaluates the minterms of all the equations, the second phase performs the AND function on the results of the minterms, and the last phase returns the logic values.
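A small sketch of how minterms with don't-care bits can be matched against an input assignment and combined into the output D (illustrative only; the bit assignment follows the example above, but the (value, care) encoding of each word is an assumption of this sketch, not the DBA's internal format):

    # Hypothetical encoding: each stored word is a pair (value, care);
    # a bit whose 'care' bit is 0 is an "X" (don't care).
    # Variables A, B, C occupy bits 3, 2, 1; bit 0 is the "in-use" flag.
    minterms = [
        (0b1001, 0b1101),   # 10X1 : A=1, B=0, C=don't care, in-use=1
        (0b0011, 0b0111),   # 0X11 : A=0, B=don't care, C=1, in-use=1
        (0b0000, 0b0001),   # XXX0 : unused word
        (0b0000, 0b0001),   # XXX0 : unused word
    ]

    def evaluate_D(a, b, c):
        inputs = (a << 3) | (b << 2) | (c << 1) | 1     # in-use bit forced to 1
        hits = [((inputs ^ value) & care) == 0 for value, care in minterms]
        return int(any(hits))                           # OR of the minterm hits

    print(evaluate_D(1, 0, 1))   # -> 1  (the A·B̄ term matches)
    print(evaluate_D(1, 1, 0))   # -> 0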
6.2.9 The CAAPP System
Another recent bit-serial associative processor, the content addressable array parallel processor (CAAPP) (Shu et al., 1988), has been designed from basic principles to efficiently support an image understanding system. It is one component of the overall architecture, and provides pixel-level and symbolic processing. The basic building block of the CAAPP is the PE, which consists of an ALU, support circuitry, and memory. Each PE has a 320-bit on-chip memory store and an external 32K-bit "backing store" designed with dual-port video RAMs (VRAMs). The backing store has growth capability if larger VRAMs are used. The PEs are interconnected through a nonmultiplexed full mesh network, providing a compact and efficient topology. The CAAPP has several interesting architectural features. These include an activity bit that controls PE responses to a query, a some/none response feedback ability, and a method to count responders. The activity bit in a particular PE might be set after an initial query, and would thus leave only a subset of PEs for further processing. The some/none lines from all the processing elements are wired together so that only a single output line needs to be monitored by the system to determine whether any matching responses were obtained. An example of where this type of system might prove useful is in the creation of a histogram that exhibits the informational content of an image. The controller would broadcast data that describes a particular pixel intensity range. Any matching PE would set its some/none line, and the response count circuitry would quickly determine how many active lines matched that range.
6.2.10 The CAFS

Another associative subsystem, called the content addressable file store (CAFS) (Burnard, 1987), has been created by ICL and is available on all their mainframes. It is designed to search efficiently through great quantities of unstructured text data, replacing cumbersome and often inadequate software measures such as tagging and indexing. The CAFS hardware, built into ICL's disk controllers, consists of four major sections. The logical format unit identifies logical records within the byte stream. The retrieval unit converts the input byte stream into records for transmission to the CPU. The search evaluation unit does the actual search based upon the data and mask information supplied; this unit determines whether a particular record should be retrieved. The retrieval processor accumulates the matching responses ("hit" records) and can perform other simple arithmetic operations on those records.
The CAFS system has the ability to process a special tagged structure called the self-identifying format (SIF). Properly tagged tokens can be stored in records of arbitrary length (fixed or variable). Using the SIF, the CAFS engine can identify tokens of any type, independently specifying the types to be searched for and retrieved, even applying a mask to each tag as it is processed. So, for example, the CAFS can search for any name, a name of some particular type, any surname, a surname of one type, and so on. This search and processing ability is limited only by the length of the chosen tag.
6.2.11 The GAPP
An interestingly unique content-addressable memory approach has been chosen for the geometric arithmetic parallel processor (GAPP) (Wallis, 1984). A systolic array architecture is used to create a bit-serial CAM. It can search for data based upon content, optionally performing logical and arithmetic operations on that data. Each array chip contains 72 single-bit processing elements, each element having access to 128 bits of RAM. The PEs operate in parallel as an SIMD machine, and more than one GAPP chip can be combined for greater capacity and flexibility.
6.2.12 The GAAP
A large knowledge base machine is currently being designed by researchers at the University of Strathclyde, Glasgow, Scotland, in collaboration with Deductive Systems, Ltd. (McGregor, 1986; McGregor et al., 1987). This machine will implement the generic relational model (GRM), briefly described in Section 5.3. The major associative component of this computer is the generic associative array processor (GAAP). This processor allows the hardware mechanism to inferentially expand the implicit query tuples into a set of explicit ones. In the GAAP architecture, a traditional sequential processor controls an array of custom VLSI associative chips. The controller also has its own dedicated RAM to coordinate the interchip information. Connections among lines within one cell and between lines in different cells provide the information about set membership. The intercell and intracell communication matrix can be used to perform the operations needed in the GRM. These operations include set membership insertion and deletion, upward closure to determine the sets to which an entity belongs, and downward closure to ascertain the members of a set. Set operations such as union, intersection, and selection are also implemented.
6.2.13 The ASP
Lea has written extensively about his associative string processor (ASP) (Lea, 1986a, b, c; 1987a, b), offered as a cost-effective parallel processing engine, the architecture of which is shown in Fig. 12. His unfortunate use of the acronym ASP can lead to confusion, since Savitt et al. used the same three letters to refer to their 1967 "Association-Storing Processor" specification (Savitt et al., 1967). The original ASP is examined in the software section of this chapter. Lea's ASP, described in the following section, is not particularly related to that classic machine, though he does give at least conceptual credit to the original ASP. The associative string processor architecture (Lea's ASP) describes a reconfigurable and homogeneous computing foundation, designed to take advantage of the inexorable technological migration from VLSI to ultra-large-scale integration and then on to wafer-scale integration. Its goal is to efficiently support such operations as set processing, string processing, array processing, and relational data processing. The building block of the ASP is the substring. Many substrings operate in parallel, and are supported by an ASP data buffer (ADB), a controller, and a data communications network.
FIG. 12. Diagram of the ASP architecture.
Each substring incorporates a string of identical APEs (associative processing elements), which communicate through an inter-APE network. During operation of the ASP, all the APEs simultaneously compare their stored data and activity registers to the information broadcast on the data and activity buses in the substring. Any APEs that find a match are either directly activated themselves or indirectly activate other APEs. Activation in this context means that the APE's activity register is updated. Once an APE has been activated, it executes local processing operations in parallel with other active APEs. Four basic operations are supported by the APE: match, add, read, and write. The match operation affects the M (matching) and D (destination) flags, either setting or clearing them based upon the APE registers and the broadcast information. In the add operation, a bit-serial addition or subtraction is performed, and the outcome is stored in the M (sum) and C (carry) flags. A read operation drives the data bus with a wire-AND of all the activated APEs. A write operation updates the data and activity registers of all active APEs with the information on the data and activity buses. The ASP supports both bit-parallel, single-APE data transfer through the shared data bus, and bit-serial, multiple-APE information flow through the inter-APE communication network. The inter-APE communication path is restricted to high-speed transfer of activity signals or M-flag patterns between APEs. This communication architecture, coupled with the ability to activate each APE by its data content, allows efficient control of data movement. The LKL and LKR ports maintain continuity by allowing information to be sensed by the external ASP controller. The inter-APE communication network allows the ASP substring to effectively emulate such common data arrangements as arrays, tables, trees, and graphs.
6.3 CAM Devices and Products

Over the last few years a number of content-addressable memory integrated circuits have been designed and built. Most of these have been constructed in research laboratories, but in recent years several commercial CAM devices and board-level products have become available. This section first describes the devices and architectures developed at various research institutions and then describes commercially available CAM devices.

6.3.1 CAM Devices Being Developed
6.3.1.1 The CARM. Kadota et al. (1985) describe an 8-kbit content-addressable integrated circuit organized as 256 words by 32 bits. Their device
is called a content-addressable and reentrant memory (CARM) and was fabricated with 2-μm CMOS technology and two-layer metallization. The basic structure is similar to that of a static RAM in that it has address decoders, memory cell arrays, bit lines, word lines, write drivers, and sense amplifiers. To provide the added functionality required in the CARM, data and mask registers are added, along with sense lines for each word, an address encoder (as well as a decoder), and an address pointer. A block diagram of this device is shown in Fig. 13. Thirty-two-bit-wide data are applied to the device through the data pins and transferred to the data register. These data are then masked according to the condition of the bits set in the mask register and applied to all the memory words in parallel. If the masked data coincide with the data stored in the memory words, a match occurs, and the sense lines for those words are activated and propagated to the matching sense amplifier. This amplifier activates the sequential address encoder, and the address of the corresponding word is output through the address bus. When more than one word matches the data applied to the device, the corresponding addresses are output one after another from the sequential address encoder.
FIG. 13. A block diagram of the CARM (256 words by 32 bits).
LAWRENCE CHlSVlN AND R . JAMES DUCKWORTH
6.3.1.2 CAM Device Expansion. One of the problems with earlier CAM deviccs was their limited size and lack of expandability. Even with current technology it is not possible to conceive that a single integrated circuit could be dcsigned and built that would satisfy all the memory requirements of a content addressable memory. A modular design is therefore essential whereby the capacity of the memory can be easily increased by the suitable addition of extra identical integrated circuits. In conventional RAM systems it is relatively easy to increase both the memory width and depth of the system by adding extra memory devices. Increasing the size of a CAM system is not as simple. The most difficult expansion feature to implement is concerned with the label and tag fields, which are the fields where we require a content-addressable search. The tag field may be considered an extension of the label field but we still have to allow for expansion of the width of this combined field which is also called horizontulexpansion. Also, we must be able to increase the number of entries that may be placed in the memory which corresponds to an increase in the depth, or vertical expansion of the memory. Ogura et ul. (1985) describe a 4-kbit CAM integrated circuit organized as 128 words by 32 bits which may be interconnected to produce larger sizes of CAM. A garbage flag register is used to indicate whether a word location is empty or not and during write operations a multiple-response resolver selects from among empty word locations. To implement a memory with more than 128 words, a number of these CAMs can be connected together by sharing a common data bus. Each CAM has an inhibit in and out signal ( Pe,,,,, and Pcxl,).The P,,,,, is generated from the multiple-response resolver flag register outputs. To extend the word count (depth) of this memory, two memory chips can be connected together on a common data bus, as shown in Fig. 14, to produce a memory system with 256 thirty-two-bit words. The use of the inhibit signals assign a priority in a daisy-chain fashion to each of the modules to resolve contention for the data bus. When data are to be stored or retrieved from the system, each of the CAM modules identifies an empty word location by referring to the garbage flag register. If an empty word is located in a module then the inhibit signal ( PeX,,,) is set to “1” which is then transferred to the last chip in a ripple through operation. After the highest priority module has accessed the common bus its (P,,,,,) signal is set to ‘‘0” allowing the next integrated circuit in the sequence to have priority. To allow multiple integrated circuits (ICs) to be connected together to increase the word “width,” existing designs of CAMs, e.g., the two devices mentioned above, generate an address when a match occurs and use this to allow for expansion. As an example, assume that it is required to carry out a matching operation on 96 bits but the actual width of the individual devices
CONTENT-ADDRESSABLE A N D ASSOCIATIVE MEMORY
FIG. 14. Increasing the word count of CAMs (vertical expansion).
is only 32 bits. The 96 bits would be split into three groups of 32 bits and one group applied to each of the three devices. One of the devices acts as the master and the rest are programmed to act as slaves, as shown in Fig. 15. If the master detects a match on the 32 bits that it receives, it outputs the address corresponding to the location of the matched entry, and this address is then used as an input to all the slaves. The slaves compare their 32 bits of input data with the data stored at that particular address, and if the stored data is the same, an output bit will be set. A match on the whole 96 bits occurs only if all the slaves indicate that their part of the word matches.
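A sketch of this master/slave width expansion (illustrative only; the device contents and the 96-bit key are hypothetical, and each device is modeled as a simple list of 32-bit values; multiple matches in the master are not handled here):

    def match_address(device, value):
        # The master's CAM lookup: return the first matching address, if any.
        for addr, stored in enumerate(device):
            if stored == value:
                return addr
        return None

    def wide_match(master, slaves, key_96):
        # Split the 96-bit key into three 32-bit groups; the master gets the top one.
        parts = [(key_96 >> shift) & 0xFFFFFFFF for shift in (64, 32, 0)]
        addr = match_address(master, parts[0])
        if addr is None:
            return None
        # Each slave checks its own 32 bits at the address the master supplies.
        if all(slave[addr] == part for slave, part in zip(slaves, parts[1:])):
            return addr
        return None

    master = [0x00000001, 0x12345678]
    slave1 = [0xAAAAAAAA, 0x9ABCDEF0]
    slave2 = [0x55555555, 0x0F0F0F0F]
    key = (0x12345678 << 64) | (0x9ABCDEF0 << 32) | 0x0F0F0F0F
    print(wide_match(master, [slave1, slave2], key))    # -> 1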
FIG. 15. A CAM arrangement that increases the width of the label field.
6.3.2 The Database Accelerator Chip

Most of the LSI versions of CAMs currently available, such as the devices described in the previous section, use about 10 transistors for each cell. The memory cells for these devices are all based on a static memory cell; by comparison, a dynamic CAM cell that requires only five transistors has been designed as part of the Smart Memory Project at MIT (Wade and Sodini, 1987). These devices may be compared with the single transistor and capacitor required to store one bit of information in a conventional dynamic random access memory. The DBA chip was briefly described as a complete system in Section 6.2.8. The DBA chip has two major sections: a memory section consisting of a CAM array used for associative operations, and a processing section consisting of many simple, general-purpose, single-bit computing elements. In this section we wish to concentrate specifically on the design of the CAM, which uses a unique concept called trits (ternary digits). A trit can be the usual zero or one, but also a don't care ("X") value. This CAM is also called a ternary CAM (Brown, 1991; Hermann, 1991; Wade, 1989). Traditionally, CAMs have implemented the don't-care function with a mask register separate from the data register, but the ternary CAM allows the three states to be stored directly. As opposed to the static CAM cell designs of the integrated circuits mentioned above, the DBA cell is a five-transistor CAM cell (Wade and Sodini, 1987). This dynamic cell was used because of its small size and ability to store three states.
6.3.3 The Dictionary Search Processor

As far as the authors are aware, the largest CAM produced to date is the DISP integrated circuit, which has a 160-kbit content-addressable memory (Motomura, 1990a). The DISP was developed to aid in the dictionary search operations required for natural language processing. Two of the most important requirements of a dictionary search system are increasing the vocabulary (which may be several tens of thousands of words) and speeding up the search of this large dictionary. To complicate matters, the input words to the system may contain misspellings. It is therefore necessary not only that the system search for a stored word that exactly matches the input word, but also that it be able to search for stored words that approximately match the input word. Previous dictionary search systems have used conventional memory and software to iteratively read out stored words and compare them with the input word. This process may take many thousands of cycles, especially when the nearest match to the input word is required. However, CAMs are
able to compare an input word simultaneously with all the stored words in one cycle, and so the DISP was developed to enable large and fast practical dictionary search systems to be constructed. The DISP contains a 160-kbit data CAM (DCAM), organized as 20 CAM arrays of 512 rows by 16 columns, and a high-speed cellular automaton processor. A block diagram of the DISP is shown in Fig. 16. In order to reduce the number of comparisons between the input word and the stored words, the DISP classifies the stored words into a number of categories. The control code CAM shown in the figure is responsible for storing indexes into the DCAM based on the classification scheme used. For example, the categories could be selected using the first character of the stored word. The DISP can store a maximum of 2048 words classified into 16 different categories. As mentioned earlier, a dictionary search system should respond with the correct word even when the input word contains spelling errors. Similar to the Hamming distance described previously in Section 5.4.1, the cellular automaton processor of the DISP calculates the distance, based on character substitutions, insertions, and deletions, between the input word and the closest stored words. Only stored words with distances less than or equal to 2 are treated as matched words; a word with a distance greater than 2 is treated as a mismatched word. Once a matched word or words are found, the priority encoder serially outputs the addresses of the matched words, starting with the closest match. The DISP can store a maximum of 2048 words, but multiple DISPs may be connected in parallel to increase the vocabulary of the system. For example, a 50,000-word dictionary search system could be constructed using 25 DISPs.
FIG. 16. A block diagram of the DISP.
Further details and additional references on this device can be found in a report by Motomura et al. (1990).
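The approximate-match criterion (a distance of at most 2 under substitution, insertion, and deletion) can be sketched in software as follows (illustrative only; the dictionary contents are hypothetical, and the DISP computes this distance in parallel hardware rather than with a sequential loop):

    def edit_distance(a, b):
        # Classic dynamic-programming Levenshtein distance.
        prev = list(range(len(b) + 1))
        for i, ca in enumerate(a, 1):
            cur = [i]
            for j, cb in enumerate(b, 1):
                cur.append(min(prev[j] + 1,                # deletion
                               cur[j - 1] + 1,             # insertion
                               prev[j - 1] + (ca != cb)))  # substitution
            prev = cur
        return prev[-1]

    def dictionary_search(dictionary, word, limit=2):
        scored = [(edit_distance(word, w), w) for w in dictionary]
        return sorted((d, w) for d, w in scored if d <= limit)

    print(dictionary_search(["memory", "method", "memorize"], "memroy"))
    # -> [(2, 'memory')]: the misspelled input is within distance 2 of "memory"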
6.3.4 Fault-Tolerant Content-Addressable Memory Devices
In a conventional memory system, unless every addressable location can be accessed and correctly manipulated, the memory is usually useless. To increase the yield of memory devices, some manufacturers include extra spare capacity that is automatically switched in if faulty memory locations are detected. A CAM is naturally fault tolerant, since there is no concept of an absolute storage location and the physical location of data can be arbitrary. So long as faulty cells can be detected and isolated, a CAM will still function, albeit with reduced capacity. The articles by Grosspietsch et al. (Grosspietsch et al., 1986, 1987; Grosspietsch, 1989) cover the issues of testability and fault tolerance in general and are a good introduction to this area.

A number of researchers have incorporated these concepts into actual designs. Blair (1987) describes a device that exploits the natural fault tolerance of a CAM at the cost of one extra latch per word. During a test cycle this latch is set if a fault is detected, ensuring that the faulty CAM locations are not used for actual data storage and retrieval. An 8-kbit CAM (128 words by 64 bits) that is fault tolerant under software control is described by Bergh et al. (1990). A faulty word location in the memory can be made inaccessible by on-chip circuitry. This device was developed for a real-time multiprocessor system, but the authors also describe its use in telecommunications systems and as a matching unit for a data-flow computer (see Section 3.8 for more details). An additional feature of this device is a 12-bit counter that contains the number of valid words stored in the CAM. Each time a word is stored or deleted, the counter is incremented or decremented accordingly.

A self-testing reconfigurable CAM (RCAM) is described by McAuley and Cotton (1991). This device is organized as 256 words by 64 bits and was designed for general high-speed table look-up applications. The design is similar to the CARM device described previously, but it has two additional transistors for the self-test. During the self-test cycle a number of test patterns are automatically generated to test all the CAM words. If a fault is found, that word is disabled from future selection. This self-test reconfiguration is typically carried out when power is first applied. The RCAM thus has 256 usable words, less the number of bad locations found during the self-test.
The RCAM is also interesting because it is an example of an addressless CAM. It does not contain the usual address encoder to identify the matching locations in the CAM, but instead outputs the matching word directly. To explain this concept further, an example of the use of the RCAM for address translation in a high-speed packet switching network is described by McAuley and Cotton (1991). When packets arrive at the switch they must be routed to the correct output port. The correct port is chosen on the basis of the destination address, so the switch must translate the packet's destination address into an output port number. The RCAM may be used for this purpose by splitting the 64-bit words into a 48-bit packet address field and a 16-bit port address field, as shown in Fig. 17. After a packet is received, its destination address is applied to the RCAM with the masking register set so that the top 16 bits are don't cares. Any RCAM location whose bottom 48 bits match will respond, and the 16-bit port address will be output (along with a duplicate copy of the 48-bit address).

FIG. 17. The RCAM used for address-to-port translation.
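A sketch of this table look-up (illustrative only; the packet addresses and port numbers are hypothetical) stores each entry as a 64-bit word with the port in the top 16 bits, masks those bits out of the comparison, and returns the matching word rather than an address:

    PORT_SHIFT = 48
    ADDR_MASK = (1 << 48) - 1

    def make_entry(packet_addr, port):
        return (port << PORT_SHIFT) | (packet_addr & ADDR_MASK)

    def translate(rcam_words, dest_addr):
        # Compare only the bottom 48 bits; the top 16 (the port field) are masked.
        for word in rcam_words:
            if (word & ADDR_MASK) == (dest_addr & ADDR_MASK):
                return word >> PORT_SHIFT          # the stored port number
        return None

    table = [make_entry(0x0800200A1B2C, 3), make_entry(0x0800200A9999, 7)]
    print(translate(table, 0x0800200A9999))        # -> 7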
6.3.5 Commercially Available CAM Products
The devices and architectures mentioned above have all been produced in research laboratories. Over the last few years a number of companies have started producing products that utilize content-addressable techniques. These products range from individual CAM integrated circuits to complete associative processing board products. This section provides brief details and references for what the authors believe are the main products that are currently commercially available.
6.3.5.1 Advanced Micro Devices. One of the first commercially available CAM devices was the Am99C10 from Advanced Micro Devices (1990a). This device is organized as 256 words by 48 bits and has been optimized for address decoding in local area networks and bridging applications, although it could also be used effectively in database machines, file servers, image processing systems, neural networks, and other applications. A block diagram of the Am99C10A is shown in Fig. 18. Each of the 256 words consists of a 48-bit comparator and a 48-bit register. When the data (comparand) is presented to the CAM array, a simultaneous single-cycle compare operation is performed between the comparand and all 256 stored words in about 100 ns. Any of the 48 bits of the comparand may be selectively masked, disabling those bits from participating in the compare operation and thereby allowing comparisons to be made on only a portion of the data word. If the comparand matches a stored word, the on-chip priority encoder generates an 8-bit address identifying the matched word location in the array. If there are multiple matches in the array, the priority encoder generates the address of the lowest matched location. The addresses of other matching words may be selected individually by setting the skip bit in the CAM word.
FIG. 18. A block diagram of the Am99C10A. (Reprinted with permission from Advanced Micro Devices. Copyright © 1990.)
Some of the applications that AMD suggest for the Am99C10A are (1990a):

• Local area network (LAN) bridge address filtering
• LAN ring message insertion and removal
• Database machine support - search and support accelerators
• Pattern recognition - string search engines, etc.
• Image processing and machine vision - pattern recognition, image registration, etc.
• Neural net simulation
• AI language support - (LISP, etc.) garbage collection support, PROLOG accelerators, etc.

The main intended use for this device is as a LAN address filter. This application is also mentioned for the other commercial devices described in the following sections, and it therefore seems appropriate to elaborate on this example of CAM use.

6.3.5.2 LAN Bridge Address Filtering. A LAN bridge should provide transparent communication between two networks. An example of a bridge between an FDDI network and an Ethernet network is shown in Fig. 19, and a block diagram of the FDDI-Ethernet bridge is shown in Fig. 20. To allow workstations on the different networks to communicate with each other, the bridge must pass the appropriate messages from one network to another.
FIG. 19. An FDDI-Ethernet network system. (Reprinted with permission from Advanced Micro Devices. Copyright © 1990.)
FIG. 20. A block diagram of the FDDI-Ethernet bridge. (Reprinted with permission from Advanced Micro Devices. Copyright © 1990.)
For example, assume that the workstation with address 562C sends a message to workstation 34E5. For this to occur the bridge must recognize that the address 34E5 belongs to a workstation on the other Ethernet network and pass the message accordingly. The FDDI-Ethernet bridge must compare the destination addresses of all the transmitted messages to see if any of the messages should be routed to its Ethernet network. The problem is that there may be hundreds or even thousands of workstations on the LANs, and the bridge therefore has to compare the message destination address with many stored addresses as quickly as possible. A simple sequential search approach would be too slow, but a CAM device such as the Am99C10A can carry out the required message address comparison in a single cycle. The 48-bit word size of the Am99C10A corresponds to the 48-bit address length of the network messages. More information on LAN address filtering and bridge implementations using the Am99C10 can be found in Wilnai and Amitai (1990) and Bursky (1988), as well as in the Advanced Micro Devices data sheet (1990a).
6.3.5.3 The MUSIC Semiconductors LANCAM
The MUSIC (Multi-User Speciality Integrated Circuits) Semiconductors Company in Colorado introduced in 1990 a content-addressable memory also targeted at address filtering in LANs and routers (1991a). The name of the device is LANCAM
(the part number is MU9C1480), and it is capable of storing 1024 64-bit fields. The device may be used for destination and source address recognition, and also for general associative data storage in systems such as database accelerators, code converters, machine vision systems, and target acquisition systems. The MU9C1480 is very similar to the AMD Am99C10 described previously, but it has additional functionality and more than four times the capacity. Figure 21 shows a block diagram of the LANCAM. Although the internal data path is 64 bits wide, the external interface is multiplexed to allow communication with the device over a 16-bit bus (labeled DQ15-0). This bus conveys data, commands, and status to and from the MU9C1480. The four signals shown in the bottom right of the figure are flags to allow the device to be vertically cascaded. These four signals have the following meanings:
/MF Match Flag: This output goes low when a valid match occurs during a comparison cycle.
/MI Match Input: This input is used in vertically cascaded systems to prioritize devices.
/FF Full Flag: This output, when low, indicates that all the memory locations within a device contain valid contents.
/FI Full Input: This input is used in vertically cascaded systems to generate a CAM memory system full condition.
Using these four signals it is easy to increase the number of words in the memory by connecting together a number of LANCAMs. This vertical expansion is shown in Fig. 22. For bridge or other applications that require more than 1024 entries, the LANCAM can be easily cascaded without the need for external priority encoders or address decoders. Figure 22 shows the vertical cascading of the LANCAM, and it can be seen that the flag signals are simply daisy-chained together.
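A minimal sketch (the handshake shown is an assumption for illustration, not taken from the MUSIC data sheet) of how daisy-chained full flags let each cascaded device in turn accept new entries once the device above it is full:

/* Sketch of a full-flag daisy chain for vertical cascading: a device may
 * accept a new entry only when its /FI input is asserted (the chain above
 * it is full) and it still has free locations; its own /FF state then
 * drives the /FI input of the next device down the chain. */
#include <stdio.h>
#include <stdbool.h>

#define DEVICES  4
#define CAPACITY 1024          /* entries per cascaded CAM device */

int main(void)
{
    int used[DEVICES] = {0};

    for (int entry = 0; entry < 2500; entry++) {      /* load 2500 addresses */
        bool fi = true;                               /* first device always enabled */
        for (int d = 0; d < DEVICES; d++) {
            bool full = (used[d] == CAPACITY);        /* this device's /FF state */
            if (fi && !full) {                        /* this device stores the entry */
                used[d]++;
                break;
            }
            fi = full;                                /* /FF drives the next /FI */
        }
    }
    for (int d = 0; d < DEVICES; d++)
        printf("device %d holds %d entries\n", d, used[d]);
    return 0;
}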
6.3.5.4 The National Semiconductor SONIC
National Semiconductor also has a device targeted at high-speed LANs called the systems-oriented network interface controller (SONIC) (Wheel, 1990). This device employs a CAM architecture to perform address filtering and has seventeen 48-bit entries to store destination addresses.
6.3.5.5 The Summit Microsystems CAM Board
Summit Microsystems is an example of a company that produces board-level products containing a number of CAM integrated circuits.
FIG. 21. A block diagram of the LANCAM. (Reprinted by permission of MUSIC Semiconductors. Copyright © 1991.)
FIG. 22. Vertically cascading the LANCAM. (Reprinted by permission of MUSIC Semiconductors. Copyright © 1991.)
Their SM4k-GPX board contains an array of AMD Am99C10 CAM chips to provide a 4096-word by 48-bit matching memory (1989b). An input pattern up to 48 bits wide can be compared against all the CAM array words in a single 100-ns cycle. The board responds with the address of the CAM word that found an exact match with the input pattern. This board plugs into a standard PC/AT bus and is supplied with menu-driven software to provide a CAM development system. It is possible to daisy-chain up to 15 additional boards to expand the CAM capacity to 64k words. The boards contain a 16-bit address register, effectively expanding the 8-bit addressing range of the individual Am99C10 devices. The boards also have a special area for user prototyping and personalization. Some of the applications that Summit Microsystems state are suitable for this board are:
LAN interconnect address filtering (Wilnai and Amitai, 1990a)
File servers
Database management
Disk caching
Radar and SONAR (sound navigation ranging) signature recognition
Image processing
Neural networks
Cache memories
6.3.5.6 Coherent Research, Inc.
Coherent Research is a company that provides both individual devices and board-level products. The CRC32256
is a CMOS associative processor with a capacity of a 256-word by 36-bit content-addressable memory (1990b). The Coherent Processor [(a); Stormon, 1989; Wayner, 1991] is a card for the PS/2 Model 70/80 that uses up to 16 of the CRC32256 chips to provide a 4096-word by 36-bit associative parallel processing array. The Coherent Processor development system provides hardware and software support for writing, debugging, and running parallel application programs in the C language. The company also has a software tool called the Coherent Processor simulator, which runs under MS-DOS or Sun Unix and which simulates a parallel associative processor. Programs developed on the simulator can be run on the Coherent Processor board simply by recompiling. Coherent Research has a number of application notes that describe the use of their products in such fields as neural networks, LANs, relational databases, pattern recognition, and radar multiscan correlation.
7. Software for Associative Processors
Storage, retrieval, and manipulation concepts for associative memories and processors differ significantly from those used on traditional sequential address-based computers. Consider a typical address-based computer instruction such as STORE X, which presumably takes a value from an internal register and stores it in a memory location with address “X.” We cannot directly compare this with its exact counterpart in an associative processor because the CAM storage contains no location addressed by “X.” All we can reference are locations differentiated by their contents. This is true even if the underlying hardware in the associative computer does allow such address-based location selection at the lowest hardware level, since this user-available reference method would completely circumvent the whole point of the associative processing architecture. Depending upon the exact implementation of the associative system, this type of instruction might refer to a pattern “X,” and it might store the pattern in some subset of associative memory storage fields. Associative software will be discussed in more detail in the following sections, and several interesting implementations of associative languages will be provided. The software programmer writing applications for a content-addressable processing system does not necessarily need to have detailed knowledge of the underlying hardware data storage mechanism. It is possible to allow the user to specify the program in terms of associative interconnections, and let the system handle the exact implementation. So, for example, the user could write a program in some associative language and that program could be executed on either a fully associative computer, or on a partially associative
computer (e.g., bit-serial), or even on an entirely traditional address-based computer. The hardware and operating system (or run-time library routines, microcode, etc.) could shield the programmer from the details. It might even make sense to debug the program on a serial location-dependent machine before letting it loose on a parallel associative hardware engine, since troubleshooting the code would be simpler without the parallel aspect of computation. However, for the programmer to fully exploit the power of fast, parallel, content-related data management, it is necessary that he or she comprehend the underlying associative architecture. At some level, the software architecture must contain the proper syntax, program flow control, and data structures to efficiently execute applications on the associative hardware (Potter, 1988). The emerging field of parallel processing is one area where an associative memory can substantially enhance a computing system. For example, the Linda parallel processing language relies on "tuple space" to coordinate the multiple execution threads. The tuples formed in this language are matched by associative comparison rather than by address. This language obviously benefits from an associative component. Many new programming languages in the field of artificial intelligence are nonprocedural in nature. It is hoped that these languages will more closely model how people deal with problems. Languages such as Prolog and Smalltalk identify objects and their interrelationships, and are good prospects for an associative processor. Content-addressable systems show good promise as fast, efficient database engines. The software necessary to implement this function must provide easy, context-sensitive searching capability on sometimes unstructured data, and the user interface must be accessible to the average programmer (Berra and Troullinos, 1987). Some important associative computers have been formulated with that task as their goal, and several are described in this section to show the nature of such systems.
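As a small illustration of that content-based matching, here is a minimal sketch in plain C (not Linda's actual primitives; the tuples and field layout are invented) of retrieving a tuple from a tuple space by template, where some fields are fixed and others are wildcards:

/* Sketch of tuple-space retrieval by associative comparison: the first
 * tuple whose fixed fields agree with the template is selected. */
#include <stdio.h>
#include <string.h>

#define WILD -1                 /* wildcard marker for an integer field */

struct tuple { const char *tag; int a; int b; };

static struct tuple space[] = {
    {"sum", 3, 4}, {"sum", 10, 20}, {"result", 7, 0},
};
static const int nspace = sizeof space / sizeof space[0];

/* Return the index of the first tuple matching the template, or -1. */
static int in_match(const char *tag, int a, int b)
{
    for (int i = 0; i < nspace; i++) {
        if (strcmp(space[i].tag, tag) != 0) continue;
        if (a != WILD && space[i].a != a) continue;
        if (b != WILD && space[i].b != b) continue;
        return i;
    }
    return -1;
}

int main(void)
{
    int i = in_match("sum", 10, WILD);      /* like in("sum", 10, ?x) */
    if (i >= 0)
        printf("matched tuple (%s, %d, %d)\n",
               space[i].tag, space[i].a, space[i].b);
    return 0;
}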
7.1 STARAN Software
The STARAN associative parallel computer architecture was described previously in Section 6.2.1. This section will provide some of the details about the STARAN software (Davis, 1974; Thurber, 1976) as an example of an associative machine language and system support combination. Since the machine can be operated in address-mode as well as associative-mode, there are obviously going to be many language facets that are common to
other non-associative computers. This section deals only with the special language constructs and implementations that are unique to an associative computer. The assembly language for the STARAN processor is called APPLE (Associative Processor Programming LanguagE). Each APPLE instruction is parsed by the hardware into a series of microcoded execution operations, and as such the underlying hardware may not always perform the entire operation in parallel. From the vantage point of the programmer, however, each assembly language instruction may be viewed as performing the entire operation as if it were a totally parallel machine. The array instructions in the APPLE language are unique to the associative STARAN computer. These instructions are loads, stores, associative searches, parallel moves, and parallel arithmetic operations. They operate on the MDA memory arrays and the PEs that are associated with them. The load array instructions load the PE registers or the common register with data from the MDA. The load function can also perform logical operations on the data before it is finally saved in the PEs. The store array instructions perform just the opposite function. They move data from the PE or common register (with logical operations and a possible mask enable) to the memory arrays. The associative search instructions search the words in the array enabled by the mask register. The search can take several formats, and the comparisons can be made by many of the nonexact methods already listed in a previous section (such as greater/less than, greater/less than or equal, maximum, minimum). By combining different searches, even more powerful comparisons such as between limits and next higher can be obtained. This group also contains special instructions to resolve multiple responses from a search. The parallel move instructions move masked fields within an array to other fields within the same array. Permutations of the data as it travels from the source to the destination fields are possible, such as complement, increment, decrement, and move the absolute value. Finally, the parallel arithmetic array instructions provide the ability to perform masked parallel operations such as add, subtract, multiply, divide, and square root within the associative arrays. These instructions have several formats, but in each case they use all the array words in parallel (or whichever ones appropriately match) as potential targets. The STARAN also offers a macro definition language, which allows the programmer to add other useful operations. These would include other arithmetic, logical, relational, and string manipulation operators. Although the APPLE language is unique to the STARAN system, it does provide an excellent example of the kinds of low-level machine instructions that make sense on an associative computer.
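A minimal sketch (assumed semantics and invented data, not APPLE syntax) of the combination those array instructions support: an associative search marks the responding words, and a subsequent parallel arithmetic instruction updates only the responders, conceptually all at once.

/* Sketch of search-then-parallel-arithmetic over an associative array. */
#include <stdio.h>

#define WORDS 8

int main(void)
{
    int word[WORDS]    = {12, 55, 7, 90, 55, 3, 55, 41};
    int respond[WORDS] = {0};

    /* associative search: mark every word equal to the comparand */
    int comparand = 55;
    for (int i = 0; i < WORDS; i++)
        respond[i] = (word[i] == comparand);

    /* parallel add: every responding word has 100 added to it */
    for (int i = 0; i < WORDS; i++)
        if (respond[i])
            word[i] += 100;

    for (int i = 0; i < WORDS; i++)
        printf("%d ", word[i]);     /* 12 155 7 90 155 3 155 41 */
    printf("\n");
    return 0;
}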
7.2 DLM Software
The distributed logic memory (DLM) as a string retrieval engine was proposed by Lee in his classic paper on the subject (Lee, 1962). In this memory scheme, there are no fixed words at all. Rather, the cells are laid out as a string from start to end, each cell communicating with its neighboring cells (next and previous) in the string. Each cell contains a symbol for storage, and a tag field to identify the active state of the cell. A comparison string is entered through an I/O port, and this string is compared to the entire associative memory. When the comparison operation is complete, only the matching fields identify themselves and the retrieval process can begin. The comparison is done as follows. The first symbol in the input string is compared to all the symbols in the memory that reside in the first position of a field (the start of the field is identified by a special symbol). If any of the cells match, they send a propagate signal to the next cell, which sets that new cell active. The next symbol in the input string is then compared to any still-active cells, which behave in the same way as the first cell if a match is made. In this way, only the cells that represent a complete string match will show an active status, since any nonmatching symbol will not propagate an active signal to the next neighbor cell. This description has been brief, and is only meant to familiarize the reader enough so that the instructions presented next have some context. Several incarnations of this basic architecture have been developed, and more information is available in other reference works (Kohonen, 1987; Thurber, 1976; Thurber and Wald, 1975). The most basic instructions in the DLM are match, propagate left/right, store, and read. The match command causes all currently active cells to compare their data symbols to the reference symbol input. This is done associatively and in parallel. A side effect of the matching operation is a clearing of the matched cell's active state and a propagation of that active state to the next cell (left or right, depending upon the propagation direction control). The propagate instruction causes a transfer of the activity state from each cell to its neighbor (left or right, again depending upon the direction control). For example, if a cell is currently active and a propagation command is given with the control to the left, the cell will become inactive but the previous cell will become active (the previous cell being the one to the left of the current cell). Every cell is affected in parallel for this command, so each cell transfers its information simultaneously to the next (or previous) cell. The store and read instructions operate only on the active cells. An active cell, as already described, contains a tag in its activity field identifying it as ready to accept or provide a symbol when the appropriate command is given.
FIG. 23. Directed graph.
The store command instructs every active cell to simultaneously save the symbol provided from the input port. The active state is then propagated to the next cell, and the current state is made inactive. In this way, a string can be saved one symbol at a time. The read command sends the symbol from any active cell to the output port. If there is more than one active cell, some multiple match resolution mechanism must be provided. By combining the four instruction groups above, the defined search ability already described can be obtained. As well, it is possible to perform other types of transactions, such as arithmetic operations and proximity searches, on the information contained in the DLM.
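A minimal sketch (assumed cell behaviour, invented contents) of the tag-propagation matching just described: each cell holds one symbol, the input string is presented one symbol per step, and an "active" tag ripples from cell to cell so that only cells ending a complete match remain active.

/* Sketch of Lee-style distributed-logic string matching. */
#include <stdio.h>
#include <string.h>

int main(void)
{
    const char *cells   = "$CAT$DOG$CATALOG$";   /* '$' marks the start of a field */
    const char *pattern = "CAT";
    size_t n = strlen(cells);
    char active[64] = {0};
    char next[64];

    /* step 0: cells following a field-start symbol become candidates */
    for (size_t i = 1; i < n; i++)
        active[i] = (cells[i - 1] == '$');

    /* one comparison step per input symbol; a match passes activity right */
    for (size_t k = 0; k < strlen(pattern); k++) {
        memset(next, 0, sizeof next);
        for (size_t i = 0; i + 1 < n; i++)
            if (active[i] && cells[i] == pattern[k])
                next[i + 1] = 1;
        memcpy(active, next, sizeof active);
    }

    /* cells still active sit just past a complete occurrence of the pattern */
    for (size_t i = 0; i < n; i++)
        if (active[i])
            printf("match ending before position %zu\n", i);
    return 0;
}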
7.3 ASP Software
Another interesting associative language is the ASP (association-storing processor) specification (Savitt et al., 1967). This language (and the architecture for its suggested implementation) was designed to simplify the programming of nonarithmetic problems. The basic unit of data in the ASP language is the relation. It relies on the ordered triples of indirect association already described in Section 5.2. In this architecture, the reader may recall, the triple (A, R, B) states that A is related to B through the relation R. This association of two items is what the ASP language calls a relation. The relations in ASP can be expressed as above, or as a directed graph as shown in Fig. 23. Each item in the relation must be distinct, and no item can appear without a relation. A compound item is formed when the association between two other items is itself an item with its own association, as shown in Fig. 24. The ASP language transforms the data based upon conditions. There are two main components of this transformation. First, the language provides a search capability where the existing database is inspected for a match with one set of relations. Any matching data is replaced with data described by another set of relations.
FIG. 24. A compound item.
Furthermore, the instruction identifies the next instruction to be executed based upon the success or failure of the current match operation (conditional branch). ASP instructions are expressed as structures of relations, and linked together to form programs. One interesting aspect of this representation is that one ASP program can be processed by another ASP program. The best way to show how this would work in practice is to provide an example. The ASP description might read:
LOCATE items X1 which are examples of airman, and have the jobclass of arm spec, and are stationed at items X2 (which are located in Europe), and have the status of items X3.
REPLACE items X1 which have the status of X3 by items X1 which have the status of ALERT.
This language statement would take all the military personnel identified in the above description and change their status to “alert.” It was expected that a machine would be constructed based upon the ASP specification, and that the language described would be executed on this hardware. So the language was first designed, then the hardware architecture was formulated. A major component in this hardware was a distributed logic associative memory called the context-addressed memory. This highly interconnected memory would have the capability to perform the global searches on all the items (and their relations) in parallel.
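A minimal sketch (invented items and relation names, written in C rather than ASP) of the search-and-replace half of that LOCATE/REPLACE pair, simplified to the stationing condition, over a store of (item, relation, item) triples:

/* Sketch of relation matching and replacement over a triple store. */
#include <stdio.h>
#include <string.h>

struct triple { const char *a; const char *r; const char *b; };

static struct triple store[] = {
    {"smith", "example-of", "airman"}, {"smith", "stationed-at", "ramstein"},
    {"ramstein", "located-in", "europe"}, {"smith", "status", "normal"},
    {"jones", "example-of", "airman"}, {"jones", "stationed-at", "kadena"},
    {"kadena", "located-in", "pacific"}, {"jones", "status", "normal"},
};
static const int n = sizeof store / sizeof store[0];

static int has(const char *a, const char *r, const char *b)
{
    for (int i = 0; i < n; i++)
        if (!strcmp(store[i].a, a) && !strcmp(store[i].r, r) && !strcmp(store[i].b, b))
            return 1;
    return 0;
}

int main(void)
{
    for (int i = 0; i < n; i++) {                     /* LOCATE ... */
        if (strcmp(store[i].r, "status") != 0) continue;
        const char *x = store[i].a;
        if (!has(x, "example-of", "airman")) continue;
        /* is X stationed at some place located in Europe? */
        for (int j = 0; j < n; j++)
            if (!strcmp(store[j].a, x) && !strcmp(store[j].r, "stationed-at")
                && has(store[j].b, "located-in", "europe"))
                store[i].b = "alert";                 /* REPLACE the status */
    }
    for (int i = 0; i < n; i++)
        printf("(%s, %s, %s)\n", store[i].a, store[i].r, store[i].b);
    return 0;
}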
7.3.1 RAPID System Software
Parhami has suggested a CAM-based system for reference and document retrieval (Parhami, 1972). Called RAPID (rotating associative processor for information dissemination), it includes an underlying architecture and a machine language specification for that architecture. Like the DLM, this system performs its operation on variable-length string records rather than fixed-length words. Special tag symbols are specified so that records can be identified. The system is envisioned to be a byte-serial CAM, where a byte is inspected in one low-level hardware operation. For the purpose of understanding the language, we can assume that the hardware is a circulating disk with enough heads to read and write one byte simultaneously. This is how Parhami envisioned the hardware implementation. Other hardware mechanisms could be used, of course, but by viewing the hardware as a disk the reader can gain insight into how the language interacts with the processing engine. We
can further assume that each machine language instruction is performed on at least one full rotation of the disk, so that subsequent instructions will operate on the information left by the previous instruction. As the instructions are described, the similarity to the DLM will become apparent. The first instruction type is of the search variety. A single character or string (strings would take several rotations, one byte comparison per rotation) would be searched for. When found, an "active" marker would be set such that a subsequent rotation could recognize the results of the search. The character search could take the form of equal, not equal, greater/less than, and greater/less than or equal. Other search instructions would look for marked characters or strings and perform some operation on them, such as setting new active markers. The propagate instruction would transfer active markers to other characters. The currently active characters would have their markers cleared, and the target of the propagation would have their markers set. This propagation would happen to every marker occurrence in one revolution, and would appear to happen in parallel to the programmer. The expand instruction would take active markers and propagate them without clearing the currently active markers. So, when a marker was found, the next "x" characters would be marked active ("x" depending upon the exact instruction parameter) but the currently active marker would remain active. The contract instruction would reset markers in a row. The add instruction would add a numerical value to all marked characters and replace the new sum into that character. Finally, the replace instruction would replace every marked character with a new character specified in the instruction. The language just described can be used to write programs that would search for patterns and modify them based upon subsequent instructions. So an initial search could be made for a character string, then the marked characters would be expanded or propagated until just the right combination of characters would remain marked for retrieval or modification.
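A minimal sketch (assumed instruction semantics, invented data) of RAPID-style marker handling on a circulating character store: one search rotation marks the characters that match, and an expand rotation then marks the next few characters while keeping the original markers active, leaving just the wanted substrings marked for readout.

/* Sketch of search and expand over a circulating character track. */
#include <stdio.h>
#include <string.h>

int main(void)
{
    const char *track = "ALPHA DELTA ECHO DOZER";
    size_t n = strlen(track);
    char mark[64] = {0}, expanded[64];

    /* search rotation: mark every occurrence of the symbol 'D' */
    for (size_t i = 0; i < n; i++)
        if (track[i] == 'D')
            mark[i] = 1;

    /* expand rotation: also mark the next 4 characters after each marker,
     * without clearing the original markers */
    memcpy(expanded, mark, sizeof expanded);
    for (size_t i = 0; i < n; i++)
        if (mark[i])
            for (size_t k = 1; k <= 4 && i + k < n; k++)
                expanded[i + k] = 1;

    /* retrieval: read out only the marked characters */
    for (size_t i = 0; i < n; i++)
        putchar(expanded[i] ? track[i] : '.');
    putchar('\n');              /* only DELTA and DOZER remain visible */
    return 0;
}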
7.4 Patterson's PL/1 Language Extensions
Patterson has suggested extensions to the PL/1 language to allow associative structures to be easily manipulated (Patterson, 1974). He believes that an extension of an existing language is more reasonable than an entirely new language, since most problems have both associative and sequential components. His language would add a declaration for an associative procedure, similar to the already-included PL/1 declaration for a recursive procedure. This procedure would include a parameters field that the programmer could use to specify the appropriate entry length for the application.
Variables would be differentiated by their nature (sequential or associative) upon their initial declaration. The two variable types would be static (for a normal sequential variable) and associative. An associative variable declaration would define a field in every associative word. Comparisons would be made in parallel and associatively between an input from the sequential machine and the associative words, or between two fields in all associative words. The associative words taking part in any operation ("active" words) would generally be a subset of the total words. The associative function would allow relational operators (greater/less than and equal) and logical operators (AND, OR, NOT) to execute. This operation would be simultaneously carried out on all currently active words, and would potentially reduce their number if all the words do not respond. Two special statements would find the minimum or maximum values in a particular field, and activate each word that contains these values. The activate statement would perform the same relational and logical operations in parallel on the associative memory, but would be executed on all the words rather than the currently active ones. This could be used to set all the words active, or it could activate the first matching word found. The for statement could select a subset of the active words for the operation, and the else could be used to perform some operation on the active words that did not meet the selection criteria. Assignment statements in the PL/1 extension would look similar to normal assignment statements in many languages (X = Y). However, the outcome would be different if an associative variable was involved. If both variables were associative, the statement would move data from one field to another in all active words. If the source Y was a common sequential variable, then it would be loaded simultaneously into the X field of all active associative words. If the X field was common and the Y field was associative, then the first active word in the associative memory would be loaded into the destination variable.
7.5 PASCAL/A
Another language suggested for extension is PASCAL. By adding some associative concepts to that language, PASCAL/A is formed (Stuttgen, 1985). PASCAL/A has only one more data structure than standard PASCAL, and that is the table. The table is similar to the “relation” in database terminology, but it provides more flexibility in that row uniqueness is not mandated (although the programmer can provide row uniqueness in the table if he wishes). A table declaration contains fields called attributes, which describe the information contained within it.
Associative procedures in the PASCAL/A language operate on "active" rows in the table (the concept of "active" shows up often in associative languages). Generic instructions (such as emp.salary := emp.salary + 1000) operate associatively and in parallel on all currently active rows. The statement above would add 1000 dollars to the salary field in every active row in the database. Active rows are those that have matched a selection criterion, as set forth in specially defined content-addressed statements for the language. For example, a statement such as
WHERE emp [salary < 20,000] DO salary := salary + 1000
would first search the database for employees currently earning less than 20,000 dollars, activating all the rows where this was true. Every row that matched this criterion would then have the salary increased by 1000 dollars. The associative data structures (tables) in PASCAL/A are interfaced by several special statements. The insert procedure writes a row into the table. Tables can be read by either the retrieve (nondestructive read) or the readout (erases the row in the table after reading) procedure. In each case, some arbitrary row would be copied into a buffer for other processing. Finally, the delete statement would erase all the active rows in the database. The PASCAL/A statements described above can be made into powerful programs that query and modify complex databases. The author of the language suggests the language would be especially strong in the areas of artificial intelligence, array processing, database systems, pattern recognition, numerical analysis, compilers, operating systems, and graph algorithms.
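A minimal sketch (invented table contents, written in C) of the two-phase semantics of that WHERE statement: the content-addressed selection activates rows, and the assignment is then applied to every active row as one parallel operation.

/* Sketch of select-by-content followed by a parallel update. */
#include <stdio.h>

struct emp { const char *name; int salary; };

int main(void)
{
    struct emp table[] = {
        {"ada", 18000}, {"bob", 25000}, {"cab", 19500}, {"dee", 31000},
    };
    const int rows = sizeof table / sizeof table[0];
    int active[4];

    for (int i = 0; i < rows; i++)            /* WHERE emp[salary < 20,000] */
        active[i] = (table[i].salary < 20000);

    for (int i = 0; i < rows; i++)            /* DO salary := salary + 1000 */
        if (active[i])
            table[i].salary += 1000;

    for (int i = 0; i < rows; i++)
        printf("%s %d\n", table[i].name, table[i].salary);
    return 0;
}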
7.6 LUCAS Associative Processor
PASCAL was also chosen as the base language for the LUCAS associative processor (Fernstrom et al., 1986), previously described in terms of its hardware and architecture (see Section 6.2.6). PASCAL was chosen for this system over APL and FORTRAN due to its structured language with powerful control, its strong typing of variables, and its excellent error detection offered at both compile and run time. The language PASCAL/L (as in PASCAL/LUCAS) adds several important extensions to standard PASCAL, including special constructions to allow parallel operations and special variable declarations for data that is allocated to the associative array. The two kinds of parallel variables are selectors and parallel arrays. Selector variables control the parallelism of operations on the PEs. This can be
used to operate simultaneously on a defined subset of the PEs, while excluding operation on the unselected subset. A parallel array variable describes a fixed number of elements, all of which have the same type characteristic (e.g., integer, Boolean, character, etc.). The parallel array can also restrict the operation to a subset of the total PEs. Assignment statements operate based upon the variable types on the left and right side of the assignment. Sequential to sequential assignments operate just as they do in standard PASCAL. If the left-hand side (destination) is a parallel variable and the right-hand side (source) is a sequential variable, every component reference in the associative array will be loaded with the scalar expression contained in the sequential variable. If the destination is a sequential variable and the source is a parallel expression, the parallel source must indicate just one element in the array (e.g., S:=P[5]), and that is transferred to the sequential variable. Finally, if the destination is a parallel variable and the source is a parallel expression, the referenced components on the left are loaded with their corresponding elements on the right. The control structure for PASCAL/L also accommodates parallel associative processing. As many as 128 PEs may be operated on in parallel in each statement. So a statement such as
WHERE <selector expression> DO <true-statement> ELSEWHERE <false-statement>
operates on both paths in parallel. The true clause is executed on one set of PEs using one set of data, while the false clause is executed on another set of PEs using another data set. The CASE statement operates on all the paths in parallel, using different data on the different PEs. There is also a WHILE AND WHERE statement, which repeats as long as the selector statement is true for any element, with the selector determining which PEs are active for any particular repetition.
7.7 The LEAP Language
Algol was extended to include some associative concepts (such as associations and sets) by Feldman and Rovner (1969) to create their LEAP language. Their language aims at striking the right balance between ease of use and efficiency of execution. They provide an interesting view of RAM as a special form of a CAM, with one field of the CAM reserved for the address of the word. By relaxing the need for a special field, the CAM provides more flexible retrieval capability. However, fixed and static fields (direct association) provide no ability to have complex interrelationships in the data. If we look at the example of a telephone directory (often used to show the benefits of an associative memory), the drawback of direct association
becomes obvious. What if one person has two telephone numbers, or if one number must be associated with two people sharing an office? The LEAP language relies on the ordered triple concept (already described in Section 5.2 as indirect association) to create a more useful associative description. Thus, the language syntax treats each association as the 3-tuple (a, o, v), representing the attribute, object, and value, respectively. A 3-tuple of this sort forms an association, and the items are the components of the association. Four new data type declarators were added to the standard ALGOL language: item, itemvar, local, and set. An item is similar to a LISP atom. An item that is stored in a variable is called an itemvar. Items may also be members of a set, or be associated to form the 3-tuples described above. A LEAP expression can be used to create new items or associations (construction expression), or to retrieve information about existing items or associations (retrieval expression). Items are obtained during execution by using the function new. The count operator returns the number of elements in the specified set. The istriple predicate returns a value that represents whether the specified argument is an association item. There are several set operators identifying the standard set manipulations such as NOT, AND, and OR. A few extra program statements are added to make ALGOL into LEAP. The put statement performs a union operation (e.g., put tom in sons will insert the item tom into the set sons), while the remove statement does the opposite (removes the element from a set). The delete statement destroys an item that was previously created. The make statement places the specified "triple" into the universe of associations, whereas the erase statement removes the association from that universe. The most important addition in the LEAP language is the loop statement, exemplified by the foreach statement. This statement must perform its operation over a set of simultaneous associative equations in order to determine the loop variables. The best way to show how this works is by example. The expression we will work with is:
foreach father · x ≡ bill do put x in sons
In this expression, father and bill are items, x is a local data type, and sons is a set. This expression would first determine the set of all items who match the condition that the father attribute is "bill." In other words, who are the people that have "bill" as their father? Each time the condition is met, the current value of "x" is added to the set "sons." More complex expressions can be created that use more local variables and include Boolean functions in the set search space. The LEAP data system was used in a more recent artificial intelligence language called SAIL (Feldman et al., 1972).
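A minimal sketch (invented items, plain C rather than LEAP) of what that foreach computes over a universe of attribute-object-value associations:

/* Sketch of a LEAP-style foreach: collect every x with father·x ≡ bill. */
#include <stdio.h>
#include <string.h>

struct assoc { const char *attr; const char *obj; const char *val; };

static const struct assoc universe[] = {
    {"father", "tom",  "bill"},
    {"father", "anne", "bill"},
    {"father", "joe",  "fred"},
    {"age",    "tom",  "12"},
};

int main(void)
{
    const char *sons[8];
    int nsons = 0;

    for (size_t i = 0; i < sizeof universe / sizeof universe[0]; i++)
        if (!strcmp(universe[i].attr, "father") && !strcmp(universe[i].val, "bill"))
            sons[nsons++] = universe[i].obj;      /* put x in sons */

    for (int i = 0; i < nsons; i++)
        printf("%s is in sons\n", sons[i]);       /* tom, anne */
    return 0;
}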
7.8 Software for CA Systems
The section on associative architectures introduced a computing system designed (and prototyped) by Blair and Denyer (1989). The system they envisioned has a careful matching between the software and its underlying hardware architecture. We will provide more details on the software for that system in this section. Their architecture was called a "triplet," and contained a CAM-CPU-RAM combination. The CPU and RAM were viewed as a normal sequential (von Neumann) computer, and the CAM was connected to the CPU and accessed by an address group. Blair and Denyer chose (as did many before them) to extend an existing high-level language rather than create a new language from scratch, or count on an intelligent compiler to recognize when associative processing was possible. The C language was used as their starting point. Before describing the language extensions, we will explain an important underlying concept in this content-addressable system. After an associative comparison is performed using the CAM, a "bag" of words is formed from the matching entries. This bag contains the group of words whose tags show them to be active (the bag might be empty if no matches were made). The field statement identifies which fields within the CAM are to be used, and what type of information they contain (char, int, etc.). The bag is defined by the reserved word define, which describes the comparison fields and specifies their values. The function can also return a value that specifies whether the pattern describes an empty bag. The first operation type is the simple Boolean function empty, which is true if and only if the bag is empty. This can be used to determine when to stop looping (e.g., while (!empty)). The next operation returns the next value in the bag, and returns status to show when the bag is empty (there are no more values). The remove operation deletes an entry from the bag, and also returns similar status. Special language constructs are provided to perform common manipulations, such as loops that operate on each member of a bag. The foreach statement first defines the bag based upon a pattern, and loops (similar to a while and next combination) until the bag is empty (i.e., the entire bag has been operated upon). The fromeach loop defines a bag and then performs a remove operation on each member of the bag. The repeat with statement performs a new define function on each iteration of the loop, so the members of each bag depend upon the operations of the last iteration.
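A minimal sketch in ordinary C (the actual extensions add keywords such as define and foreach; the CAM contents here are invented) of the bag idea: an associative comparison yields a bag of matching entries, which a foreach-style loop then works through until it is empty.

/* Sketch of forming and draining a "bag" of CAM matches. */
#include <stdio.h>

struct entry { int key; int data; };

enum { N = 4 };
static struct entry cam[N] = { {7, 100}, {3, 200}, {7, 300}, {9, 400} };

/* "define": build the bag of locations whose key field matches */
static int define_bag(int key, int bag[])
{
    int n = 0;
    for (int i = 0; i < N; i++)
        if (cam[i].key == key)
            bag[n++] = i;
    return n;                       /* n == 0 means the bag is empty */
}

int main(void)
{
    int bag[N];
    int n = define_bag(7, bag);

    /* foreach-style loop: operate on each member until the bag is empty */
    for (int i = 0; i < n; i++)
        printf("member: key=%d data=%d\n", cam[bag[i]].key, cam[bag[i]].data);

    if (n == 0)
        printf("empty bag\n");
    return 0;
}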
7.9 Neural Network Software
Neural networks have been referred to several times in this chapter, and the underlying hardware for such a system is obviously significantly different from that of a normal sequential computer. The programming of a neural network
consists of modifying the connections between the primitive storage elements, and the systems created for this purpose have been called connectionist (Fahlman and Hinton, 1987). The programming of a connectionist computing system is more akin to teaching than to what is normally considered programming. The learning process is accomplished by entering initial input data, and feeding the selected outputs back into the teaching inputs. The associative neural network decides which stored pattern most closely matches the input vector, and selects an output based upon the best match. The programming of connectionist systems, still a topic of much research and controversial debate, is dependent upon the associated hardware. The information in the neural system can be stored as a local or distributed representation. In a local representation, each discrete packet of data is localized to a section of the hardware. This is the easiest to program, since there is very little interaction between most of the neural nodes. It is also more familiar to most programmers, in that the individual chunks of information can be stored and validated without regard to the rest of the stored data. However, it allows many single points of failure to exist. If the local area used to store a piece of information is broken, that information is no longer available for recall. A distributed representation completely spreads the data throughout the available hardware. In this system, every neural node in the processing structure is potentially activated when any data is input. This eliminates any single point of failure, and is the most reliable associative system possible in terms of hardware. The disadvantage to the completely distributed representation is the programming and validation obstacle it presents to a software engineer. Since every execution and storage unit is conceivably involved with every recall attempt, unexpected connections can influence the decision process. Languages that can be used to describe the operations generally performed by neural networks have been called neurosoftware (Hecht-Nielsen, 1990). The goal of neurosoftware is to free the programmer from having to deal with the underlying mechanisms involved in storage and retrieval. In other words, let the hardware (or some firmware/operating system combination) map the programmer's conceptual statements into meaningful execution routines, freeing the user from tedious and difficult housekeeping chores. Let the programmer describe what associative operations need to be performed at some high level. This theme is seen repeatedly in the examples of software written for associative processing systems. The following description shows how it applies to neural network software. It is assumed here that there is a traditional address-based computer acting as the user interface to the neural network. Since most problems have
sections that are best handled sequentially, this approach seems to provide the most efficient use of scarce resources. The initial function in this scenario must be to have some kind of network load command, which takes a description of the network and transfers it to the underlying network hardware. Once the network description is loaded, an instruction is required to define the run-time constants such as learning rates and thresholds. An instruction to define the initial state of each processing element is necessary, and there must be another instruction to provide a new input value (and its weight) to the processing elements. Another instruction should be included to monitor the current state of each processing element. Once the network has been initialized, an instruction to actually run the system must be included. This would cause the neural network to activate the underlying hardware and perform its execution. After the execution is done (or after some amount of time if the network runs continuously), an instruction to save the state of the network is necessary. This saved state can be used to restore the network at some later date. These primitive instructions can be used to completely control the neural network under program control of the host (traditional) computer. An example of a general-purpose neural network description language is AXON. This language is described in detail in Robert Hecht-Nielsen's (1990) book on neurocomputing.
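A minimal sketch (hypothetical function names, not any real neurosoftware package) of host-side primitives of the kind just listed, driving a toy associative network that recalls whichever stored pattern lies closest to the presented input:

/* Sketch of a host program controlling a simple associative network. */
#include <stdio.h>

#define PATTERNS 3
#define BITS     8

struct net {
    int stored[PATTERNS][BITS];   /* the "network description"      */
    int input[BITS];              /* current input vector           */
    int best;                     /* index of best-matching pattern */
};

static void net_load(struct net *n, const int p[PATTERNS][BITS]) {
    for (int i = 0; i < PATTERNS; i++)
        for (int j = 0; j < BITS; j++) n->stored[i][j] = p[i][j];
}
static void net_input(struct net *n, const int x[BITS]) {
    for (int j = 0; j < BITS; j++) n->input[j] = x[j];
}
static void net_run(struct net *n) {      /* recall by smallest Hamming distance */
    int bestd = BITS + 1;
    for (int i = 0; i < PATTERNS; i++) {
        int d = 0;
        for (int j = 0; j < BITS; j++) d += (n->stored[i][j] != n->input[j]);
        if (d < bestd) { bestd = d; n->best = i; }
    }
}
static void net_save(const struct net *n) {   /* report / save the result */
    printf("recalled pattern %d\n", n->best);
}

int main(void) {
    const int patterns[PATTERNS][BITS] = {
        {1,1,1,1,0,0,0,0}, {0,0,0,0,1,1,1,1}, {1,0,1,0,1,0,1,0} };
    const int probe[BITS] = {1,1,0,1,0,0,0,0};   /* noisy version of pattern 0 */
    struct net n;
    net_load(&n, patterns);
    net_input(&n, probe);
    net_run(&n);
    net_save(&n);                                /* prints: recalled pattern 0 */
    return 0;
}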
8. Conclusion
This chapter has given a broad overview of content-addressable and associative systems. Important terms were defined, associative concepts were explained, and recent examples were provided in this rapidly progressing area of data processing and retrieval by content. In this chapter we concentrated on providing information on the most recent advances in content-addressable and associative systems. We conclude this chapter with information that places our article in historical context and shows the vast amount of research that has been lavished on the subject. There have been a significant number of major reviews of the topic during the last 30 years, and they are described briefly here. Most of these reviews contain a bibliography of their own and provide additional, older references. Finally, we conclude with our thoughts on what we believe the next few years will bring in this area of intelligent memory systems.
8.1 Additional References
The first comprehensive survey of CAMs and associative memories was by Hanlon (1966). The CAM had been around for about 10 years at the
time, and his motivation was to summarize previous research and suggest interesting areas for further development. Hanlon's survey described the concepts of this emerging field, and provided an excellent state-of-the-art (for 1966) tour of the topic. This included some details on the materials and architectures considered promising at the time. That same year the Advances in Computers series published a chapter by Murtha (1966) that discussed highly parallel information processing systems. Although the chapter was not entirely dedicated to the subject of associative systems, there was a good discussion of associative processors and their ramifications. The next major review of the research literature was done by Minker (1971). His paper was mostly a comprehensive bibliography with a very brief description of some interesting developments since the Hanlon survey. As with the Hanlon paper, Minker listed some applications of associative memories, as well as a few interesting implementation materials and memory organizations. He concluded that, as of 1971, "associative memory hardware technology has not yet come of age." Parhami (1973) was the next major reviewer of the subject. His primary thesis was that "associative processing is an important concept that can be employed to enhance the performance of special-purpose and general-purpose computers of the future." His article was not a tutorial, but rather a newer survey of associative processing techniques with a new bibliography for those interested in reading about it all first hand. His report described the architectural concepts inherent to associative storage and processing, detailed some interesting hardware implementations, briefly touched upon a few software considerations, and (as usual) provided some potential applications. P. Bruce Berra (1974) provided a discussion of associative processors and their application to database management in his presentation at the 1974 AFIPS National Computer Conference and Exposition. He discussed most of the implementations attempted to that time, and showed the advantages and disadvantages of the associative approach in such applications. The ACM journal Computing Surveys published an article by Thurber and Wald (1975) that discussed associative and parallel processors. This article presented an excellent genealogy of associative SIMD machines, then went on to discuss at some length associative processors and their design issues and trade-offs. Several actual machines were highlighted. Two major books were published in 1976 that covered the topic of content-addressable systems. Foster (1976) dealt with the subject of content-addressable parallel processors. His book discussed the basics of content-addressable computers, included some useful algorithms for such machines, detailed several applications, presented some CAM hardware,
and described the STARAN associative system in some detail. In that same year, Thurber (1976) published a book about large-scale parallel and associative computers. This book dealt with similar subject matter to the 1975 Thurber and Wald report, but was able to provide more details on the associative computers mentioned. In 1977 Yau and Fung surveyed associative processors for ACM Computing Surveys. During 1979, both IEEE Computer (1979a) and IEEE Transactions on Computers (1979b) featured special issues on database machines. Each issue was dedicated to articles describing hardware and software for database applications. Kohonen (1987) put the subject all together in his 1980 book on content-addressable memories (updated to briefly survey new information in 1987). He attempted to include a complete description of the field by "presenting most of the relevant results in a systematic form." His book included information about CAM concepts, CAM hardware, and content-addressable processors. In 1985 Stuttgen (1985) provided a review of associative memories and processors as part of his book on hierarchical associative processing systems. The review section listed a number of different taxonomies for associative systems, including his own view. He then discussed several different past architectures in a way that allowed direct comparison of their benefits and drawbacks. Also, in 1985 Lea wrote a chapter called "Associative Processing" in his book Advanced Digital Information Systems (1985). The 1987 proceedings of COMPEURO contained an article by Waldschmidt (1987) that summarized the fields of associative processors and memories. In 1988, Su dedicated a chapter of his book on database computers to associative memory systems. The chapter presents an excellent overview of the topic, with descriptions of some major content-addressable architectures and their application to database management systems. Zeidler reviewed the topic of content-addressable mass memories (Zeidler, 1989) in his 1989 report. Mass memories are defined as those having large storage capacities (gigabytes), and are targeted for use in database and information systems. His paper was one of several in a special issue of the IEE Proceedings (1989) that concentrated on associative processors and memories. There are numerous other papers and books on the subject, including an earlier paper by the current authors (Chisvin and Duckworth, 1989). Many of them are mentioned in this report in reference to specific associative concepts. We have attempted to concentrate on recent developments in this chapter, and only refer to old references where they provide the classical description of some aspect of associative computing. We believe that the references above provide a reasonable historical review of the topic.
8.2 The Future of Content and Associative Memory Techniques
The concepts and techniques of content-addressable and associative systems, already making their appearance in the commercial world, will become more important in time. This will happen as the technology used to build the devices reduces the size and cost of the final system, and as more people become familiar with the systems thus created. The development of inherently fault-tolerant CAM devices should help to produce very large devices, and the availability of optically based CAMs in a few years seems an exciting possibility. We seem to be at an interesting crossroads in this field of intelligent or smart memory systems. The technology is now available to implement devices that are of reasonable size. However, the problem seems to be whether enough semiconductor manufacturers will support and produce enough general devices that system designers will consider using them. It is the classic chicken-and-egg situation: engineers will not incorporate new parts into their products unless they are well supported and are second sourced by at least one other manufacturer; on the other hand, manufacturers will not commit to an expensive introduction of a major new part unless they perceive that a sizable market for that part is available. As with any new and exciting field of knowledge, the success of the systems will depend on the availability of bright, motivated people to program and apply these systems to both current problems and problems not yet imagined.
Acknowledgments
The motivation for this work started at the University of Nottingham in England with the MUSE project (Brailsford and Duckworth, 1985). This project involved the design of a structured parallel processing system using a mixture of control and data flow techniques. The use of CAM to improve the performance of the machine was investigated by a research student who demonstrated the potential of this approach (Lee, 1987). We acknowledge his contributions to this field. We thank Worcester Polytechnic Institute for providing the resources to produce this chapter. We would also like to thank Gary Styskal, an M.S. student at Worcester Polytechnic Institute, who performed a literature and product search of CAM architectures and devices. This chapter was based on a report published in IEEE Computer (Chisvin and Duckworth, 1989) and we wish to thank the Institute of Electrical and Electronics Engineers (IEEE) for permission to use that material in this chapter.
REFERENCES
(a). Coherent Processor 4,096 Element Associative Processor. Data Sheet, Coherent Research, East Syracuse, New York.
(1979a). IEEE Computer 12(3).
(1979b). IEEE Transactions on Computers C-28(6).
(1986a). Memory Update for Computers. New Scientist 109(1492), 36.
(1989b). SMC4k-GPX: A General Purpose IBM PC/AT Add-on Content Addressable Memory Board. Data Sheet, Summit Microsystems Corporation, Sunnyvale, California.
(1989c). Special Section on Associative Processors and Memories. IEE Proceedings, Part E 136(5), 341-399.
(1989a). Special Issue on Neural Networks. IEEE Microsystems.
(1990a). Am99C10A 256 x 48 Content Addressable Memory. Publication no. 08125, Advanced Micro Devices, Sunnyvale, California.
(1990b). CRC32256 CMOS Associative Processor with 256 x 36 Static Content Addressable Memory. Coherent Research, Syracuse, New York.
(1991a). MU9C1480 LANCAM. Data Sheet, MUSIC Semiconductors, Colorado Springs, Colorado.
Almasi, G. S., and Gottlieb, A. (1989). Highly Parallel Computing. Benjamin/Cummings, Redwood City, California.
Backus, J. (1978). Can Programming Be Liberated from the von Neumann Style? A Functional Style and Its Algebra of Programs. Communications of the ACM 21(8), 613-641.
Batcher, K. E. (1974). STARAN Parallel Processor System Hardware. Proceedings of AFIPS NCC 43, 405-410.
Bergh, H., Eneland, J., and Lundstrom, L.-E. (1990). A Fault-Tolerant Associative Memory with High-Speed Operation. IEEE Journal of Solid-State Circuits 25(4), 912-919.
Berkovich, S. Y. (1981). Modelling of Large-Scale Markov Chains with Associative Pipelining. Proceedings 1981 International Conference on Parallel Processing, 131-132.
Berra, P. B. (1974). Some Problems in Associative Processor Applications to Data Base Management. Proceedings of AFIPS NCC 43, 1-5.
Berra, P. B., and Troullinos, N. B. (1987a). Optical Techniques and Data/Knowledge Base Machines. IEEE Computer 20(10), 59-70.
Berra, P. B., Chung, S. M., and Hachem, N. I. (1987b). Computer Architecture for a Surrogate File to a Very Large Data/Knowledge Base. IEEE Computer 20(3), 25-32.
Berra, P. B., Brenner, K.-H., Cathey, W. T., Caulfield, H. J., Lee, S. H., and Szu, H. (1990). Optical Database/Knowledgebase Machines. 29(2), 195-205.
Bic, L., and Gilbert, J. P. (1986). Learning from AI: New Trends in Database Technology. IEEE Computer 19(3), 44-54.
Blair, G. M. (1987). A Content Addressable Memory with a Fault-Tolerance Mechanism. IEEE Journal of Solid-State Circuits SC-22(4), 614-616.
Blair, G. M., and Denyer, P. B. (1989). Content Addressability: An Exercise in the Semantic Matching of Hardware and Software Design. IEE Proceedings, Part E 136(1), 41-47.
Boahen, K. A., Pouliquen, P. O., Andreau, A. G., and Jenkins, R. E. (1989). A Heteroassociative Memory Using Current-Mode MOS Analog VLSI Circuits. IEEE Transactions on Circuits and Systems 36(5), 747-755.
Bonar, J. G., and Levitan, S. P. (1981). Real-Time LISP Using Content Addressable Memory. Proceedings 1981 International Conference on Parallel Processing, 112-119.
Brailsford, D. F., and Duckworth, R. J. (1985). The MUSE Machine - An Architecture for Structured Data Flow Computation. New Generation Computing 3, 181-195, OHMSHA Ltd., Japan.
Brown, C. (May 13, 1991). Chip Doubles as Data Cruncher. Electronic Engineering Times 43, 46.
Burnard, L. (1987). CAFS: A New Solution of an Old Problem. Literary and Linguistic Computing 2(1), 7-12.
Bursky, D. (1988). Content-Addressable Memory Does Fast Matching. Electronic Design 36(27), 119-121.
Chae, S.-I., Walker, T., Fu, C.-C., and Pease, R. F. (1988). Content-Addressable Memory for VLSI Pattern Inspection. IEEE Journal of Solid-State Circuits 23(1), 74-78.
Cherri, A. K., and Karim, M. A. (1988). Modified-Signed Digit Arithmetic Using an Efficient Symbolic Substitution. Applied Optics 27(18), 3824-3827.
Chisvin, L., and Duckworth, J. (1989). Content-Addressable and Associative Memory: Alternatives to the Ubiquitous RAM. IEEE Computer 22(7), 51-64.
Chu, Y., and Itano, K. (1985). Execution in a Parallel, Associative Prolog Machine. Technical Report TR-1471, University of Maryland, College Park.
Cordonnier, V., and Moussu, L. (1981). The M.A.P. Project: An Associative Processor for Speech Processing. Proceedings 1981 International Conference on Parallel Processing, 120-128.
Davis, E. W. (1974). STARAN Parallel Processor System Software. Proceedings of AFIPS NCC 43, 11-22.
DeCegama, A. L. (1989). The Technology of Parallel Processing - Volume I. Prentice-Hall, Englewood Cliffs, New Jersey.
Eichmann, G., and Kasparis, T. (1989). Pattern Classification Using a Linear Associative Memory. Pattern Recognition 22(6), 733-740.
Fahlman, S. E., and Hinton, G. E. (1987). Connectionist Architectures for Artificial Intelligence. IEEE Computer 20(1), 100-109.
Farhat, N. H. (1989). Optoelectronic Neural Networks and Learning Machines. IEEE Circuits and Devices, 32-41.
Feldman, J. A., and Rovner, P. D. (1969). An Algol-Based Associative Language. Communications of the ACM 12(8), 439-449.
Feldman, J. A., Low, J. R., Swinehart, D. C., and Taylor, R. H. (1972). Recent Developments in SAIL - An Algol-Based Language for Artificial Intelligence. Proceedings of AFIPS FJCC 41, Part II, 1193-1202.
Feldman, J. D., and Fulmer, L. C. (1974). RADCAP - An Operational Parallel Processing Facility. Proceedings of AFIPS NCC 43, 7-15.
Fernstrom, C., Kruzela, I., and Svensson, B. (1986). "LUCAS Associative Array Processor." Springer-Verlag, Berlin.
Flynn, M. J. (1972). Some Computer Organizations and Their Effectiveness. IEEE Transactions on Computers C-21(9), 948-960.
Foster, C. C. (1976). "Content Addressable Parallel Processors." Van Nostrand Reinhold Company, New York.
Gardner, W. D. Neural Nets Get Practical. High Performance Systems, 68-72.
Gillenson, M. L. (1987). The Duality of Database Structures and Design Techniques. Communications of the ACM 30(12), 1056-1065.
Gillenson, M. L. (1990). "Database Design and Performance." In "Advances in Computers" - Volume 30, pp. 39-83.
Goksel, A. K., Krambeck, R. H., Thomas, P. P., Tsay, M.-S., Chen, C. T., Clemens, D. G., LaRocca, F. D., and Mai, L.-P. (1989). A Content-Addressable Memory Management Unit with On-Chip Data Cache. IEEE Journal of Solid-State Circuits 24(3), 592-596.
Goser, K., Hilleringmann, U., Rueckert, U., and Schumacher, K. (1989). VLSI Technologies for Artificial Neural Networks. IEEE Micro, 28-44.
CONTENT-ADDRESSABLE A N D ASSOCIATIVE MEMORY
231
Grabec, I., and Sachse, W. (1989). Experimental Characterization of Ultrasonic Phenomena by a Learning System. Journal of’Applied Physics 66(9), 3993-4000. Graf, H. P., Jackel, L. D., and Hubbard, W. E. (1988). VLSI Implementation of a Neural Network Model. IEEE Computer 21(3), 41L49. Grosspietsch, K. E., Huber, H., and Muller, A. (1986). The Concept of a Fault-Tolerant and Easily-Testable Associative Memory. FTCS-16, Digest of Papers, The 16th Annual International Symposium on Fuult-Tolerant Computing Systems, 34- 39. Grosspietsch, K. E., Huber, H., and Muller, A. (1987). The VLSI Implementation of a FaultTolerant and Easily-Testable Associative Memory. Proceedings of Compeuro ’87, 47 50. Grosspietsch, K. E. (1989). Architectures for Testability and Fault Tolerance in ContentAddressable Systems. TEE Proceedings, Part E, 136(5), 366-373. Curd, J. R., Kirkham, C. C., and Watson, I. (1985). The Manchester Prototype Dataflow Computer. Communications of’the ACM 28( I ) , 34-52. Hamming, R. W. (1 980). “Coding and Information Theory.” Prentice-Hall, Englewood Cliffs, New Jersey. Hanlon, A. G. (1966). Content-Addressable and Associative Memory Systems. IEEE Transactions on Electronic Computers EC-15(4), 509-521. Hashizume, M., Yamamoto, H., Tamesadd, T., and Hanibuti, T. (1989). Evaluation of a Retrieval System Using Content Addressable Memory. Systems and Computers in Japan 20(7), 1-9. Hecht-Nielsen, R. (1990). “Neurocomputing.” Addison-Wesley, Reading, Massachusetts. Hermann, F. P., Keast, C. L.. Ishio, K., Wade, J . P., and Sodini, C. G. A Dynamic ThreeState Memory Cell for High-Density Associative Processors. IEEE Journal of Solid-Stare Circuits 26(4), 537-541. Hirata, M., Yamada, H., Nagai, H., and Takahashi, K. (1988). A Versatile Data-String-Search VLSI. IEEE Journal of Solid-State Circuits 23(2), 329- 335. Holbrook, R. (1988). New RDBMS Dispel Doubts. Perform OLTP Applications. Computer Technology Review 8(6), 1I - 15. Hurson, A. R., Miller, L. L., Pakzad, S. H., Eich, M. H., and Shirazi, B. (1989). Parallel Architectures for Database Systems. In “Advances in Computers”-Volume 28, pp. 107151. Jones, S. (1988). Design, Selection and Implementation of a Content-Addressable Memory for a VLSI CMOS Chip Architecture. IEE Proceedings Part E. Computers and Digital Techniques 135(3), 165 172. Kadota, H., Miyake, J., Nishimichi, Y., Kudoh, H., and Kagawa, K. (1985). An 8-kbit Content-Addressable and Reentrant Memory. IEEE Jotrrnal of Solid-State Circuits SC-20(5), 951-957. Kartashev, S. P., and Kartashev, S. I. (1984). Memory Allocations for Multiprocessor Systems That Incorporate Content-Addressable Memories. IEEE Transactions on Computers C-33( I), pp. 28 ~ 4 4 . Knuth, D. E. (1973). “The Art of Computer Programming-Volume 3: Sorting and Searching.” Addison-Wesley, Reading, Massachusetts. Kogge, P., Oldfield, J., Brule, M., and Stormon, C. (1988). VLSI and Rule-Based Systems, Computer Archirecrure News 16(5), 52 65. Kohonen, T. (1977). “Associative Memories: A System-Theoretical Approach.” Springer-Verlag, New York. Kohonen, T., Oja, E., and Lehtio, P. (1981). Storage and Processing of Information in Distributed Associative Memory Systems. In “Parallel Models of Associative Memory” (Anderson, J. A,, ed.), pp. 105-143. Lawrence Erlbaum, Hillsdale, New Jersey. Kohonen, T. ( 1987). “Content-Addressable Memories.” Springer-Verlag, New York.
232
LAWRENCE CHlSVlN AND R . JAMES DUCKWORTH
Lea, R M. (1975). Information Processing with an Associative Parallel Processor. IEEE Computer, 25-32. Lea, R. M. (1985a). Associative Processing. In “Advanced Digital Information Systems” (Aleksander, I., ed.), pp. 531-585. Prentice Hall, New York. Lea, R. M. (1986b). VLSI and WSI Associative String Processors for Cost-Effective Parallel Processing, The Computer Journal, 29(6), 486-494. Lea, R. M. (1986~).VLSI and WSI String Processors for Structured Data Processing. IEE Proceedings, Part E 133(3), 153-161. Lea, R. M. (19x64. SCAPE: A Single-Chip Array Processing Element for Signal and Image Processing, IEE Proceedings, Pt. E 133(3), 145-151. Lea, R. M. (1988a).The ASP, A Fault-Tolerant VLSI/ULSI/WSI Associative String Processor for Cost-Effective Systolic Processing. Proceeding3 1988 IEEE Internastional Conference on Systolic, Arrays, 5 15-524. Lea, R. M. (1988b). ASP: A Cost-Effective Parallel Microcomputer. IEEE Micro 8(5), 10-29. Lee, C. Y. (1962). Intercommunicating Cells, Basis for a Distributed Logic Computer. FJCC 22, 130 136. Lee, D. L.. and Lochovsky, F. H. (1990). HYTREM-A Hybrid Text-Retrieval Machine for Large Databases. IEEE Transactions on Computers 39( I), 111-123. Lee, J . S. J., and Lin, C. (1988). A Pipeline Architecture for Real-Time Connected Components Laheling. Proceedings ojthe S P l E 1004, 195 201. Lee, W.-I-’.(1987). Thc Development of Associative Memory for Advanced Computer System, M.Phil. Thesis, IJniversity of Nottingham. Lerncr, E. J. (1987). Connections: Associative Memory for Computers. Aerospace America, 12-13. Lippmann, R. P. (1987). An Introduction to Computing with Neural Nets. IEEE ASSP Muguzine 4(2), 4 21. Mazumder, P., and Patel, J. H. (1987). Methodologies for Testing Embedded Content Addressable Memories. FTCS-17, Digex! of Pupers, Tire 17th Intrrnurional Symposium on FaultTolerant Gompuling, 201-275. McAuley, A. J., and Cotton, C. J. (1991). A Self-Testing Reconfigurable CAM. IEEE Journal qf Soliii-State Circuits 26(3), 257-261. McGregor, D., McInnes, S., and Henning, M. (1987). An Architecture for Associative Processing of Large Knowledge Bases (LKBs). The Computer Journal 30(5), 404-412. Minker, J. (1971 ). An Overview of Associative or Content-Addressable Memory Systems and a KWIC Index to the Literature: 1956 1970. A C M Computing Reviews 12(10), 453-504. Mirsalehi, M. M., and Gaylord, T. K. (1986). Truth-Table Look-Up Parallel Data Processing Using A n Optical Content-Addressable Memory. Applied Optics 25( 14), 2277-2283. Morisue, M., Kaneko, M., and Hosoya, H. (1987). A Content-Addressable Memory Circuit Using Josephson Junctions. Transactions on Mugnetics MAG-23(2), 743-746. Motomura. M.. Toyoura, J., Hirdta, K., Ooka, H., Yamada, H., and Enomoto, T. (1990). A I .2-Million Transistor, 3-MHz, 20-b Dictionary Search Processor (DISP) ULSI with a 160kb CAM. IEEE Journal of Solid-State Circuits 25(5), 1158-1165. Murdocca, M., Hall, J., Levy, S., and Smith, D. (1989). Proposal for an Optical Content Addressable Memory. Optical Computing 1989 Technical Digesst Series 9, 210 213. Murray, J. P. ( 1990). The Trade-offs in Neural-Net Implementations. High Performance Systems, 74 78. Murtha, J. C. (1966). Highly Parallel Information Processing Systems. In “Advances in Computers”-Volume 7, pp. 2--116. Academic Press, New York. Naganuma. J.. Ogura, T., Yamada, S., and Kimura, T. (1988). High-speed CAM-Based Architecture for a Prolog Machine (ASCA). IEEE Transactions on Computers 37( l l ) , 1375-1383.
CONTENT-ADDRESSABLE A N D ASSOCIATIVE MEMORY
233
Nakamura, K. (1984). Associative Concurrent Evaluation of Logic Programs. Journal ofLogic Programming 1(4), 285-295. Ng, Y. H., and Glover, R. J. (1987). The Basic Memory Support for Functional Languages. Proceedings of COMPEURO ‘87, 35 40. Ogura, T., Yamada, S., and Nikaido, T. (1985). A 4-kbit Associative Memory LSI. IEEE Journal of Solid-State Circuits SC-20(6), 1277-1282. Oura, T., Yamada, S., and Yamada. J. (1986). A 20kb CMOS Associative Memory LSI for Artificial Intelligence Machines. Proceedings IEE International Conference on Computer Design: VLSI in Compulers, 574-571. Oldfield, J. V. (1986). Logic Programs and an Experimental Architecture for their Execution. IEE Proceedings, Part I133(3), 123-127. Oldfield, J. V., Williams, R. D., and Wiseman, N. E. (1987a). Content-Addressable Memories for Storing and Processing Recursively Subdivided Images and Trees. Electronics Letters 23(6), 262. Oldfield, J. V., Stormon, C. D., and Brule, M. (1987b). The Application of VLSI Contentaddressable Memories to the Acceleration of Logic Programming Systems. Proceedings of COMPEURO ’87, 27-30. Papachristou, C. H. ( 1987). Associative Table Lookup Processing for Multioperand Residue Arithmetic. Journal of the ACM 34(2), 376-396. Parhami, B. (1 972). A Highly Parallel Computing System for Information Retrieval. Proceedings of AFIPS FJCC 41(Part 11), 681-690. Parhami, B. (1973). Associative Memories and Processors: An Overview and Selected Bibliography. Proceedings of the IEEE 61(6), 722-730. Parhami, B. (1989). Optimal Number of Disc Clock Tracks for Block-Oriented Rotating Associative Processors. IEE Proceedings, Part E, 136(6), 535-538. Patterson, W. W. (1974). Some Thoughts on Associative Processing Language. Proceedings of AFIPS NCC 43, 23-26. Pfister, G. F., and Norton, V. A. (1985). Hot Spot Contention and Combining in Multistage Interconnection Networks. IEEE Transactions on Computers C-34( lo), 943-948. Potter, J . L. (1988). Data Struclures for Associative Supercomputers. Proceedings 2nd Symposium on the Frontiers of Massively Parallel Computations, 77-84. Ribeiro, J . C. (1988). “CAMOAndOr: An Implementation of Logic Programming Exploring Coarse and Fine Grain Parallelism.” CASE Center Technical Report No. 88 15, Syracuse University, Syracuse, New York. Ribeiro, J. C. D. F., Stormon, C. D., Oldfield, J. V., and Brule, M. R. (1989). ContentAddressable Memories Applied to Execution of Logic Programs. IEE Proceedings, Part E 136(5), 383 388. Savitt, D. A,, Love, H. H., Jr., and Troop, R. E. (1967). ASP: A New Concept in Language and Machine Organization. Proceedings of AFIPS SJCC 30, 87-102. Shin, H., and Malek, M. (1985a). Parallel Garbage Collection with Associative Tag. Proceedings 1985 International Conference on Parallel Processing, 369-375. Shin, H., and Malek, M. (1985b). A Boolean Content Addressable Memory and Its Applications. Proceedings of the IEEE 73(6), 1142-1 144. Shu, D., Chow, L.-W., Nash, J. G., and Weems, C. (1988). A Content Addressable, Bit-Serial Associative Processor. VLSI Signal Processing 111, 120-128. da Silva, J. G. D., and Watson, I. (1983). Pseudo-Associative Store with Hardware Hashing, IEE Proceedings, Part E 130(1), 1 9 24. ~ Slade, A. E., and McMahon, H. 0. (1956). A Cryotron Catalog Memory System. Proceedings UfEJCC, 115 120. Slotnick, D. L. (1970). Logic per Track Devices. Advances in Computers 10, 291 -296.
234
LAWRENCE CHlSVlN AND R. JAMES DUCKWORTH
Smith, D. C. P., and Smith, J. M. (1979). “Relational Database Machines.” IEEE Computer 12(3), 28 38. Snyder, W. E., and Savage, C. D. (1982). Content-Addressable Read/Write Memories for Image Analysis. IEEE Transactions on Cornpulers C-31( lo), 963~-968. Sodini, C., Zippel, R,, Wade, J., Tsai, C., Reif, R., Osler, P., and Early, K . The MIT Database Accelerator: A Novel Content Addressable Memory. WESCON/86 Conference Record 1214, 1-6. Stone, H. S . ( 1990). “High Performance Computer Architecture.” Addison-Wesley, Reading, Massachusetts. Stormon, C. D. (1989). “The Coherent Processor. An Associative Processor for A1 and Database.” Technical Report, Coherent Research, Syracuse, New York. Stuttgen, H. J. ( 1985). “A Hicrarchical Associative Processing System.” Springer-Verlag, Berlin. Su, S. Y . W. (1988). “Database Computers: Principals, Architectures, and Techniques,” pp, 180-225. McGraw-Hill, New York. Suzuki, K., and Ohtsuki, T. (1990). CAM-Based Hardware Engine for Geometrical Problems in VLSI Design. Electronics and Communications in Japan, Part 3 (Fundamental Electronic Science) 73(3), 57- 67. Takata, H., Komuri, S., Tamura, T., Asdi, F., Satoh, H., Ohno, T., Tokudua, T., Nishikawa, H., and Terada. H. (1990). A 100-Mega-Access per Second Matching Memory for a DataDriven Miroprocessor. IEEE Journul of Solid-State Circuits 25( I), 95-99. Tavangarian, D. (1989). Flag-Algebra: A New Concept for the Realisation of Fully Parallel Associative Architectures. IEE Proceedings, Part E 136(5), 357-365. Thurber, K. J., and Wald, L. D. (1975). Associative and Parallel Processors. ACM Computing Surveys 7(4), 21 5-255. Thurber, K. J. ( 1976). “Large Scale Computer Architecture: Parallel and Associative Processors.” Hayden, Rochelle Park, New Jersey. l’releaven, P., Pdcheco, M., and Vellasco, M. (1989). VLSI Architectures for Neural Networks. IEEE Micro, 8-27. Uvieghara, G . A,. Nakagome, Y., Jeong, D.-K., and Hodges, D. A. (1990). An On-Chip Smart Memory for a Data-Flow CPU. IEEE Journcrl of Solid-State Circuits 25(1), 84 94. Verleysen, M., Sirletti, B., Vandemeulebroecke, A,, and Jespers, P. G . A. (1989a). A HighStorage Capacity Content-Addressable Memory and Its Learning Algorithm. IEEE Transactions on Circuits und Systems 36(5), 762 766. Verleysen, M., Sirletti, B., Vandemeulebroecke, A,, and Jespers, P. G. A. (1989b). Neural Networks for I ligh-Storage Content-Addressable Memory: VLSI Circuit and Learning Algorithm. IEEE Journul af Solid-Stnte Circuits 24(3), 562-569. Wade, J . P., and Sodini, C. G. (1987). Dynamic Cross-Coupled Bit-Line Content Addressable Journal of Solid-State Circuits SC-22( I ) , 119Memory Cell for High Density Arrays. I 121. Wade, J. P., and Sodini, C. G. (1989). A Ternary Content Addressable Search Engine. IEEE Journul oj’ Solirl-Stcite Circuits 24(4), 1003 1013. Waldschmidt, K. (1987). Associative Processors and Memories: Overview and Current Status. Proceedings of COMPEURO ’87, 19 26. Wallis, I.. (1984). Associative Memory Calls on the Talents of Systolic Array Chip. Electronic Design. 217 226. Wayner, P. (1991). Smart Memories. Byte 16(3), 147-152. Wheel, L. (1990). LAN Controller Goes Sonic. Elecfronic Engineering Times 571, 25, 30. White, H. J., Aldridge, N. B., and Lindsay, I. (1988). Digital and Analogue Holographic Associative Memories. optical Engineering 27( I), 30 37.
CONTENT-ADDRESSABLE A N D ASSOCIATIVE MEMORY
235
Wilnai, D., and Amitai, Z. (1990). Speed LAN-Address Filtering with CAMS. Electronic Design, 75 78. Wu, C. T., and Burkhard, W. A. (1987). Associative Searching in Multiple Storage Units. ACM Transactions on Database Systems 12( I ) , 38 64. Ydmada, H., Hirata, M., Nagai, H., and Takahashi, K. (1987). A High-speed String Search Engine. IEEE Journal of Solid-State Circuits 22(5), 829-834. Yasuura, H., Tsujimoto, T., and Tamaru, K. (1988). Parallel Exhaustive Search for Several NP-complete Problems Using Content Addressable Memories. Proceedings of I988 IEEE International Symposium on Circuiis and Sysiems 1, 333 336. Yau, S. S., and Fung, H. S. (1977). Associative Processor Architecture-A Survey. ACM Computing Surveys 9( I), 3-27. Zeidler, H. Ch. (1989). Content-Addressable Mass Memories. IEE Proceedings, Part E 136(5), 351-356. ~
This Page Intentionally Left Blank
Image Database Management

WILLIAM I. GROSKY
Computer Science Department
Wayne State University
Detroit, Michigan
RAJIV MEHROTRA
Computer Science Department
Center for Robotics and Manufacturing Systems
University of Kentucky
Lexington, Kentucky

1. Introduction
2. Image Database Management System Architecture
   2.1 Classical Database Architecture
   2.2 Classical Data Models
   2.3 A Generic Image Database Architecture
   2.4 A Generic Image Data Model
3. Some Example Image Database Management Systems
   3.1 First-Generation Systems
   3.2 Second-Generation Systems
   3.3 Third-Generation Systems
4. Similarity Retrieval in Image Database Systems
   4.1 Shape Similarity-Based Retrieval
   4.2 Spatial Relationship-Based Retrieval
5. Conclusions
Acknowledgments
References and Bibliography
1. Introduction
Contemporary database management systems are devised to give users a seamless and transparent view into the data landscape being managed. Such programs give users the illusion that their view of the data corresponds to the way that it is actually internally represented, as if they were the only users of the software. Although such systems were originally developed for data processing applications in a business environment, there has recently been much interest in the database community in devising databases for such nonstandard data as graphics (CAD/CAM in a manufacturing environment), maps (geographic information systems), statistics (scientific-experimental data
management), rules (deductive databases and expert systems), images, video, and audio (image, document, and multimedia databases), as well as their various combinations. Much of the initial impetus for the development of such nonstandard databases originated in the scientific community concerned with the type of data that was to be managed. In this survey chapter, we hope to convey an appreciation for the continuing development of the field of image databases. The initial impetus for image databases originated with the image interpretation community. Most of the proposals from this community, however, were quite narrowly conceived and hence, after a brief flurry of activity in the late 1970s and early-to-mid 1980s, interest in this activity decreased drastically, even resulting in the dropping of this area from the title of the IEEE-sponsored workshop previously known as the Workshop on Computer Architecture for Pattern Analysis and Image Database Management. It is now known as the Conference on Computer Architecture for Pattern Analysis and Machine Intelligence. In our opinion, interest could not be sustained in this area due to its unsophisticated conception. The image interpretation community, or more accurately for the time, the pattern recognition and image processing community, conceived of an image database management system as just a way of managing images for image algorithm development test beds. Images were retrieved based on information in header files, which contained only textual information. At this time, the database community largely ignored such nonstandard applications due, we believe, to the unsophisticated nature of the database management systems then available. It has only been since the development of various object-oriented approaches to database management that the field has expanded into these areas. In the last half of the 1980s, however, the situation had largely been reversed. The database community had expressed much interest in the development of nonstandard database management systems, including image databases, due, as mentioned earlier, to the development of the object-oriented paradigm as well as various data-driven approaches to iconic indexing. However, the interest of the image interpretation community had wavered. Only in the decade of the 1990s have the two communities been converging on a common conception of what an image database should be. This is due to the acceptance of the belief that image and textual information should be treated equally. Images should be able to be retrieved by content and should also be integral components of the query language. Thus, image interpretation should be an important component of any query processing strategy. A revived interest in the field from this perspective is shown by the publication of Grosky and Mehrotra (1989a). As is becoming increasingly apparent, moreover, the experience gained from this view of what an image database should be will generalize to other modalities, such as voice and touch, and
will usher in the more general field of study of what we call sensor-based data management. An area related to image database management, and even considered a subarea by some researchers, is that of geographic or spatial database management. While there are many common issues between these two fields, notably those in data representation, data modeling, and query processing, the intent of geographic or spatial database researchers is quite different from what we consider the intent of image database management to be. Researchers in spatial data management are concerned with managing map data, which is largely graphics, or presentation, oriented. With the exception of satellite data interpretation issues, there is no notion of interpreting a map that has just been acquired by some sensor. Interpretation issues are largely bypassed by having the interpreted information entered into the system by the users or the database administrators. The system already knows that, say, a lake exists in a particular region or that a particular road connects two cities at specific geographic coordinates. The research issues of spatial data management concern how to represent and query such nonstandard information in a database environment. The relation between this field and image database management lies in the fact that map data and image feature data are related and can be represented and modeled in similar ways. Similarly, spatial query language design gives insight into the design of query languages that encompass images. Although we discuss certain papers from the spatial data management field where they bear on issues common to image data management, this chapter largely bypasses that field. The reader is referred to Samet (1990a; 1990b) for a good survey of this area. There are many interesting problems in the field of image database management. Those that will be discussed in this chapter concern data modeling, sensor data representation and interpretation, user interfaces, and query processing. The organization of the rest of this chapter is as follows. In Section 2, we discuss a generic image database architecture after familiarizing the reader with various classical database architectures. Section 3 covers various implementations and proposals for image database management systems. We have divided these systems into three generations. The very important topic of similarity retrieval and query processing in this relatively new environment is then discussed in Section 4. Finally, we offer our conclusions in Section 5.
2. Image Database Management System Architecture
2.1 Classical Database Architecture

The architecture of a standard database management system, as shown in Fig. 1, is usually divided into three different levels, corresponding to the ANSI/SPARC standard (Tsichritzis and Lochovsky, 1978). These levels are the physical database level, the conceptual database level, and the external database (view) level.

FIG. 1. The architecture of a database management system: the external database level (individual user views), the conceptual database level (community view), and the physical database level (storage view).

The physical database resides permanently on secondary storage devices. This level is concerned with actual data storage methods. The conceptual database is an abstracted representation of a real world pertinent to the enterprise that is using the database, a so-called miniworld. The external database or view level is concerned with the way in which the data is viewed by individual users. In other words, an external database is an abstracted representation of a possibly transformed portion of the conceptual database. These levels of abstraction in data representation provide two levels of data independence. The first type of independence, called physical data independence, follows from the relationship between the physical database level and the conceptual database level. This permits modifications to the physical database organization without requiring any alterations at the conceptual database level. The second type of independence, which follows from the relationship between the conceptual database level and the external database level, is called logical data independence. This allows modifications to the conceptual level without affecting the existing external database, which also extends to any application programs that have access to the database. A database management system provides a data definition language (DDL) to specify the definition of the conceptual database in terms of some data model (the conceptual schema), as well as to declare views or external databases (the external schema). There is also a data manipulation language (DML) to express queries and operations over external views.
In Section 2.3, we will see how the classical database architecture should be modified in order to support image data.

2.2 Classical Data Models
The implementation-independent framework that is employed to describe a database at the logical and external level is called a data model. These models represent the subject database in terms of entities, entity types, attributes of entities, operations on entities and entity types, and relationships among entities and entity types. There is a growing literature on various types of data models (Hull and King, 1987; Peckham and Maryanski, 1988). The most important ones that will be discussed here are the entity-relationship data model, the relational data model, the functional data model, and the object-oriented data model. Each of these data models has been used and extended to support image data.

2.2.1 The Entity-Relationship Data Model
This approach represents the entities under consideration as well as relationships between them in a generic fashion (Chen, 1976). Each entity has a set of associated attributes, each of which can be considered to be a property of the entity. Relationships among entities may also have associated attributes. As an example, suppose we are trying to represent the situation where students are taking courses in a university environment. The entities would be student, course, faculty, and department. The student entity would have the attributes name (string), address (string), and social security number (string); the course entity would have the attributes name (string), number (integer), and description (string); the faculty entity would have the attributes name (string), address (string), social security number (string), and salary (integer); and the department entity would have the attributes name (string) and college (string). Among the relationships might be one between student, course, and department with associated attributes date (date) and grade (character), which indicates when and how successfully a given student took a given course; another between faculty, course, and department with associated attributes date (date), building (string), and room number (integer), which indicates when and where a particular faculty member taught a particular course; another between course and department with no attributes, indicating that the given course belongs to the given department; another between faculty and department with associated attributes rank (string) and hire date (date), which indicates that a particular faculty member belongs to a particular department and is of a certain rank; and another between faculty
and department with associated attributes from-date (date) and to-date (date), which indicates that a given faculty member was chair of the given department for a given period of time. Various extensions to this model have added such enhancements as integrity conditions to the basic model (Teorey, 1990).
2.2.2 The Relational Data Model

This model was motivated largely by the design of various file processing systems. Just as a file consists of records, each record consisting of various fields, a relation consists of tuples, each tuple consisting of various attributes. Thus, from a naive point of view, file corresponds to relation, record corresponds to tuple, and field corresponds to attribute. The beauty of the relational approach is that it is a mathematically precise model. There exist precisely defined operators (union, difference, selection, projection, join) that, using relational algebra, can be combined to retrieve any necessary information the user requires. It has also been shown, through the use of relational calculus, that relational algebra is quite powerful in its querying capabilities. Until the mid-1980s, most applications could be modeled using this approach with no loss in semantics. However, such modern applications for database management systems as graphics, rule-based systems, multimedia, and geographic information systems have experienced difficulty in using relational systems without a loss of semantics. This loss can be overcome only by adding some nonrelational components to the system. This disadvantage is perhaps the main reason why object-oriented database management systems are becoming the systems of choice. As an example, let us consider the miniworld discussed in Section 2.2.1. In our notation, we will write R(a, b) to denote the fact that we have a relation called R with attributes a and b. We then have the following relations with their associated attributes:

student (name, address, social-security-number)
course (name, number, description)
faculty (name, address, social-security-number, salary)
department (name, college)
takes (student-social-security-number, course-name, course-number, course-description, department-name, date, grade)
taught (faculty-social-security-number, course-name, course-number, course-description, department-name, date, building, room-number)
works (faculty-social-security-number, department-name, rank, hire-date)
belongs (course-name, course-number, course-description, department-name)
chair (faculty-social-security-number, department-name, from-date, to-date)

These attributes are defined over the same domains as in Section 2.2.1.
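To make concrete the claim that the relational operators can be combined to retrieve any required information, consider a request such as "Find the names of all students who received a grade of A in a course offered by the Computer Science department." The following is a minimal sketch in SQL-like notation over the relations just listed; the hyphenated attribute names follow this chapter's conventions, and the literal values are purely illustrative.

SELECT student.name
FROM   student, takes
WHERE  student.social-security-number = takes.student-social-security-number
  AND  takes.department-name = 'Computer Science'
  AND  takes.grade = 'A'

Here the WHERE predicates correspond to selection, the SELECT list to projection, and the equality on the shared social security number attribute expresses the join.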
2.2.3 The Functional Data Model

In this approach, attributes and relations are represented by functions whose domain consists of entities and whose range consists of entities or sets of entities (Shipman, 1981). The preceding miniworld can then be represented by the following functions:
student() →→ entity
name (student) → string
address (student) → string
social-security-number (student) → string
courses (student) →→ course × date × character

course() →→ entity
name (course) → string
number (course) → integer
description (course) → string
home (course) → department

faculty() →→ entity
name (faculty) → string
address (faculty) → string
social-security-number (faculty) → string
salary (faculty) → integer
works (faculty) →→ department × string × date

department() →→ entity
name (department) → string
college (department) → string
chairs (department) →→ faculty × time-period
time-period() →→ entity
begin (time-period) → date
end (time-period) → date

We note that a set-valued function is denoted by →→.
2.2.4 The Object-Oriented Data Model

The weakness of the relational model to fully capture the semantics of such recent database application areas as computer-aided design (CAD), computer-assisted software engineering (CASE), office information systems (OIS), and artificial intelligence has, over the years, become quite apparent. Due to its semantic richness, various object-oriented approaches to database design have been gaining in popularity (Zdonik and Maier, 1990). There are disadvantages to the object-oriented approach, however. These include the lack of agreement on a common model, the lack of a firm theoretical foundation, and the inefficiency of many currently implemented object-oriented database management systems as compared to the various existing relational implementations. An object-oriented database management system can be behaviorally object-oriented as well as structurally object-oriented. Behavioral object-orientation relates to the notion of data encapsulation, the concept of methods, and the notion of the is-a type hierarchy and its associated concept of inheritance. Structural object-orientation relates to the notion of complex objects, that is, objects whose attribute values are themselves objects rather than simple data types such as integers and strings; this gives rise to the is-part-of hierarchy. An object-oriented definition of our example miniworld is as follows:

class Person
  superclasses: none
  attribute name: string
  attribute address: string
  attribute social-security-number: string
class Student
  superclasses: Person
  attribute transcript: set of Course-History
class Course-History
  superclasses: none
  attribute class: Course
  attribute when: date
  attribute grade: character
class Faculty
  superclasses: Person
  attribute salary: integer
  attribute works-for: set of Position
class Position
  superclasses: none
  attribute place: Department
  attribute rank: string
  attribute hired: date
class Course
  superclasses: none
  attribute name: string
  attribute number: integer
  attribute home: Department
class Department
  superclasses: none
  attribute name: string
  attribute college: string
  attribute chairs: set of Regime
class Regime
  superclasses: none
  attribute person: Faculty
  attribute date-range: Time-Period
class Time-Period
  superclasses: none
  attribute begin: date
  attribute end: date
  method length: date × date → integer
2.3 A Generic Image Database Architecture

With the advent of image interpretation and graphics technologies, a wide variety of applications in various areas have evolved that require an application-dependent abstraction of the real world in terms of both textual and image data. It has thus become essential to develop or extend existing database management systems to store and manage this data in an integrated fashion. The type of information required to be managed can broadly be classified into five categories:
Iconic: This information consists of the images themselves, which are stored in a digitized format.
Image-Related Data: This is the information found in the header and trailer files of the images.
Feature Information Extracted from the Images: This information is extracted by processing the images in conjunction with various world models.
Image-World Relationships: This information consists of the relationships between various image features and the corresponding real-world entities. This information may be known a priori or obtained through analyzing the image.
World-Related Data: This is conventional textual data describing the abstracted world pertinent to the application.

Any image database management system must facilitate the storage and management of each of these five types of information. The advantages of data independence, data integrity, data sharing, controlled redundancy, and security offered by conventional database management systems for textual data are required here for both textual and image data. Such a system should perform query operations on iconic information by content. Generalizing from image data management to sensor-based data management and using satellite data as an example, this type of retrieval would include one or more, in combination, of the following simple cases:
1. The retrieval of image data from textual data. An example would be to find the spatio-temperature data distribution taken over a specific geographical area on a given day by a specific sensor.
2. The retrieval of textual data from image data. An example would be to find the particular sensor that measured a given spatio-temperature data distribution.
3. The retrieval of image data from image data. An example would be to find the visual image of the particular hurricane that manifested a given pattern of spatio-pressure readings.
4. The retrieval of textual data from textual data. An example would be to find the type of sensor residing on a particular type of satellite at a particular time.

As is obvious, some of the above-mentioned image data retrieval requires the use of image representation (modeling) and recognition techniques. An efficient system will no doubt use model-based recognition techniques whose management will support the efficient insertion, deletion, and updating of given models. This is extremely important in a database environment. In light of the preceding discussion, we can say that an image database management system will consist of three logical modules, as shown in Fig. 2.
FIG. 2. The logical structure of an image database management system: the user interacts through a user interface system, which invokes a textual data management system (backed by textual data storage) and an image understanding system (backed by image storage).
The image understanding module handles image storage, processing, feature extraction, decomposition, and matching. The textual data management module is a conventional database management system. It manages the textual data related to images, textual data extracted from the images, and other textual data. Recent research in knowledge-based systems design has advocated the use of conventional database management systems to store models and production-inference rules (Dayal, Buchmann, and McCarthy, 1988). In an image database management system, the models and the matching processes are nothing but the knowledge that the image understanding module needs to perform its task. Therefore, one can employ the textual data management module of an image database management system to manage the image models and the information related to the matching process (rules) that are needed by the image understanding module. The user interface module interprets the input commands, plans the processing steps, and executes the plans by invoking the appropriate subsystems at the proper time.
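To make the retrieval cases enumerated above concrete, the following SQL-like sketch assumes a hypothetical relation sensor-reading (sensor-id, satellite, reading-date, region, temperature-image) in which conventional attributes are paired with a stored image. The relation, its attributes, the literal values, and the MATCHES operator (standing in for whatever content-based matching facility the image understanding module provides) are all our own illustrative assumptions and are not drawn from any system described in this chapter.

sensor-reading (sensor-id, satellite, reading-date, region, temperature-image)

Case 1 (image data from textual data):
SELECT temperature-image
FROM   sensor-reading
WHERE  region = 'Gulf of Mexico' AND reading-date = '15 August 1991' AND sensor-id = 'S-7'

Case 2 (textual data from image data):
SELECT sensor-id
FROM   sensor-reading
WHERE  temperature-image MATCHES given-distribution

Cases 3 and 4 combine these forms; the essential point is that the image-valued attribute participates in the query on the same footing as the textual attributes.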
2.4 A Generic Image Data Model

We believe that an image data model must represent the following types of information. The conceptual schema should consist of four parts (Mehrotra and Grosky, 1985): the model base, the model-base instantiation, the instantiation-object connection, and the object information repository, as shown in Fig. 3.

FIG. 3. Another view of the proposed design for an image database management system: user views at the external level, the four-part conceptual schema at the conceptual level, and the physical level beneath.

The model base consists of hierarchical descriptions of generic entities that the system is expected to manage as well as descriptions of the processing
that must occur for image interpretation. The model-base instantiation contains detailed hierarchical descriptions of the processed image data. These descriptions are detailed in the sense that all components and their relationships are described as an associated set of attributes. The description of an image will be in one-to-one correspondence with the associated model-base information. Each image entity corresponds to a real-world entity with given semantics. This correspondence is defined in the instantiation-object connection. Finally, the object information repository consists of textual information concerning these real-world entities. To use the system as a purely standard database management system or as an integrated image database management system, only the object information repository would be made available to the users for the definition of external views. In other words, the users would not have to worry about the iconic entity description and processing aspects of the system. The hierarchical descriptions of the generic objects and the image interpretation methods would be inserted in the model base by the database administrator. The information in the model-base instantiation would be stored by the system itself as the required information is obtained through processing the input images. On the other hand, to use the system for purely image interpretation or graphics applications, the entire conceptual schema would be made available to the user for the definition of external views. Thus in this case, the users can define and maintain their own models and image interpretation or graphics
functions. In the former case, the model-base instantiation would be generated and stored by the system itself, whereas in the case of graphics applications, it would be inserted by the users. This system will be general enough to be used in one of the previously mentioned modes or in any combination of these. To achieve this generality and still allow the sharing of information among various types of users, however, one should not be allowed to change the information generated and stored by the system.
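One rough way to picture the four parts of this conceptual schema is as a set of relations; the table and attribute names below are our own expository assumptions and are not drawn from any particular system described in this chapter. The sample query reports the real-world objects recorded as appearing in a given image.

model-base (model-id, entity-name, component-name, component-type, relation-type)
model-base-instantiation (image-id, instance-id, model-id, attribute-values)
instantiation-object-connection (instance-id, object-id)
object-information-repository (object-id, object-name, object-description)

SELECT o.object-name
FROM   model-base-instantiation i, instantiation-object-connection c, object-information-repository o
WHERE  i.image-id = 'frame-17'
  AND  i.instance-id = c.instance-id
  AND  c.object-id = o.object-id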
3. Some Example Image Database Management Systems
In this section, we will give the reader a flavor of the different types of image database management systems that have been designed over the years. In order to accomplish this in a meaningful fashion, we divide the development of such systems into three generations. Systems in the first generation are characterized by being implemented relationally. As such, any image interpretation task associated with their use is either nonexistent or hardwired into the system and, if under user control, is so in a very rudimentary fashion. There is no notion of new image feature detectors being composed during run time by the user. Other standard database issues, such as the nature of integrity conditions in this new environment and potentially new notions of serializability, are also left unexamined. While relational systems are still being designed today, mainly for geographic information systems (Orenstein and Manola, 1988), the main thrust of the first generation lasted from the late 1970s until the early 1980s. Second-generation systems are characterized by being designed either in a more object-oriented fashion or by utilizing a semantically rich extension of the relational model. In this approach, image interpretation routines are, more or less, the methods and, as such, are packaged along with their respective objects. Still, there is no notion of the user composing new image feature detectors in a user-friendly and interactive fashion during run time. However, such database issues as integrity conditions are being examined in this new environment (Pizano, Klinger, and Cardenas, 1989). The second generation began in the mid-1980s and is still ongoing. The third generation of image database systems is just beginning. These systems, when fully implemented, will allow the user to manage image sequences as well as to interact with the image interpretation module and compose new image feature detectors interactively and during run time. This interaction will be conducted at a high level and in a very user-friendly fashion. That is, the user will have available a toolbox of elementary features and their associated detectors (methods) as well as connectors of various sorts that will allow him or her to build complex features and detectors from
more elementary ones through an iconic user interface. The only system with which we are familiar that discusses this concept in the context of images is that of Gupta, Weymouth, and Jain (1991), although Orenstein and Manola (1988) discuss this concept in a geographical information context.
3.1 First-Generation Systems
The early systems of this generation have been of two major types. There are those systems specifically designed for pattern recognition and image processing applications. These systems were concerned mainly with images (Chang, 1981a). The textual data in these systems consists mostly of textual encodings of the positional information exhibited in the images. There are also those systems that are similar to conventional database management systems and that have images as part of a logical record. These systems, however, are not capable of handling the retrieval of image data by content. They cannot be called integrated image database systems as they do not treat images equally with text. The only two attempts towards the design of integrated image database systems are described in Grosky (1984) and Tang (1981).

The pioneering work in this area was done in 1974 by Kunii, Weyl, and Tennenbaum (1974). In their system, a relational database schema is utilized to describe images. A relation snap (snap#, date, place, subject, negative#, frame#) is used to store the image-related data. The relations objectab1 (snap#, object#, object-name), objectab2 (object-name, superposition-order), and part (object#, part#, part-name, part-superposition-order) are used to describe the images as superimposed objects and the objects as superimposed parts. Some additional relations are used to describe the color, texture, and regions of the objects. This approach satisfies the requirements of compatibility of textual data, data independence from hardware, and data independence from the viewpoints of the information and of the user. However, it does not address the issues concerning methods of extracting information from images and mapping them into the description schema, nor the design of a data manipulation language for data input, update, retrieval, and analysis.

The next system we discuss is the graphics-oriented relational algebraic interpreter (GRAIN), developed by Chang and his colleagues (Chang, Reuss, and McCormick, 1977; 1978; Chang, Lin, and Walser, 1980; Lin and Chang, 1979; 1980). The organization of the GRAIN system is shown in Fig. 4. This system consists of RAIN, the relational algebraic interpreter, to manage the relational database for retrieval use, and ISMS, the image storage management system, to manage the image store. The main characteristic of this system is the distinction of logical images from physical images.
FIG. 4. System organization of GRAIN: a display device communicates with the database machine (RAIN), which manages the relational database, and with the image store processor, which manages image storage.
This distinction leads to the design of a versatile and efficient image data storage and retrieval system. Logical images are a collection of image objects that can be considered masks for extracting meaningful parts from an entire image. These are defined in three tables: the picture object table, the picture contour table, and the picture page table. Each physical image is stored as a number of picture pages that can be retrieved from image storage using ISMS commands. A relational query language called GRAIN provides the means to retrieve and manipulate the image data. The concepts of generalized zooming and picture algebra have also been explored. Vertical zooming corresponds to a more detailed view of an image, whereas horizontal zooming is with respect to a user-supplied selection index, such as the degree of similarity. In this case, horizontal zooming corresponds to continuously changing this similarity degree and viewing the corresponding retrieved images. Picture algebra is an image version of relational algebra. This system meets the requirements of compatibility of textual data, data independence, and a manipulation language for image data. However, no methods have been developed to transform an image into its corresponding tuples in the above relational tables; the image description is manually entered. Also, the system has been used mainly in a geographical information context. A system designed recently for map information retrieval that has similar concepts is discussed in Tanaka and Ichikawa (1988).

Another important first-generation system is the relational database system for images (REDI) developed by Chang and Fu (1980b; 1980c; 1981). REDI was designed and implemented for managing LANDSAT images and digitized maps. Figure 5 illustrates the system organization of REDI. In this approach, the database management system is interfaced
FIG. 5. System organization of REDI: a command interpreter links the display device to the database management component (interpreter and relational database) and to the image understanding and image processing component (recognition routines and image storage). © 1980 IEEE.
with an image understanding system. The image features are extracted from images and image descriptions are obtained by using image processing operators supported by the system. Image descriptions and registrations of the original images are stored in the relational database. Original images are stored in a separate image store. A query language called query-by-pictorialexample (QPE) is part of the system. QPE is an extended version of the predicate-calculus-based relational symbolic data manipulation language Query-by-Example (Zloof, 1977). This system made the first effort .to manage the image processing routines as well as the image data. It did this through the use of so-called image processing sets. Each image processing set is an ordered sequence of image processing operations that accomplishes recognition tasks for various domains. There were processing sets for roads, rivers, cities, and meadows. All processing sets were packaged together into the LANDSAT processing package. This concept anticipated the emerging concepts of object-oriented design and is interesting for that reason. This
system also included support for image-feature-relation conversion, introduction of pictorial examples that enabled effective pictorial queries utilizing terminals, and a simple similarity retrieval capability. An example road and city database consists of the following tables:

roads (frame, road-id, x1, y1, x2, y2)
road-name (frame, road-id, name)
position (frame, xsize, ysize, xcenter, ycenter, location)
cities (frame, city-id, x1, y1, x2, y2)
city-name (frame, city-id, name)
The position relation holds the registration information of an image, where location indicates where the image is stored. Figure 6 shows how the data manipulation command, 'Apply the Road processing set to the image whose frame number is 54 and insert the processing results into the roads relation,' would be stated in query-by-pictorial-example, while Fig. 7 similarly illustrates the query, 'Find one image frame whose road network pattern is most similar to that of the image shown on the display terminal.' For Fig. 7, the value * of location denotes a default display terminal location for the given image. The road processing set is applied to this image and the intermediate results are inserted into a relation temp. The image operator SIM-LL finds lines similar to given lines.

FIG. 6. A query-by-pictorial-example data manipulation statement: the entry I.(Road) against the position tuple for frame 54 applies the Road processing set and inserts its results into the roads relation. © 1981 IEEE.

FIG. 7. A query-by-pictorial-example similarity query: the Road processing set is applied to the image at location *, its intermediate results go to the relation temp, and the most similar frame in the roads relation is printed (P.) using the operator SIM-LL.(temp).

Tang (1981) extended the relational data model to allow an attribute of a relation to have a data type of picture or device. The picture data type is characterized by three numbers: m, n, and h. The size of the image is m × n and the maximum allowed number of gray levels is h. The device data type can take as values only operating system recognizable I/O device names. The device type is introduced in order to manage the complicated I/O system in an integrated image database system through the use of the concept of a logical I/O system. The language SEQUEL, a forerunner of SQL, is
extended to serve as an interface between the users and the system. An example database is the following:

employee (name, id-number, face(pic), department-number)
employee-feature (id-number, feature-name, feature(pic))
department (department-number, location, manager)
monitors (name(device), department-number, person-in-charge)
scanners (name(device), department-number, person-in-charge)

A sample query over this database is, 'Exhibit the face and name, on monitor A, of the employee whose nose has been scanned by the scanner in department 5.' This would be expressed in SEQUEL as follows:

SELECT employee.name, employee.face ("monitor A")
FROM   employee, employee-feature, scanner
WHERE  scanner.department-number = 5
  AND  employee-feature.feature-name = 'nose'
  AND  employee-feature.feature = scanner.name
  AND  employee.id-number = employee-feature.id-number
The weakness of this approach is that an image cannot stand by itself in the database and an entity cannot have more than a single associated image. Grosky (1984) proposed a logical data model for integrated image databases that overcomes these weaknesses. He proposed three entity sets: one consisting of individual analog images, another consisting of individual digital images, and the last consisting of digital subimages. The relationships among various entities are represented by three tables: Analog-Digital, connecting an analog image to its various digitized counterparts; Digital-Subdigital, connecting digital subimages to the digital images in which they occur; and Appearing-In, connecting a digital subimage to the subject entities appearing in it. In this approach, the query 'Display the names and addresses of all persons on file who were photographed together with employee Joseph Smith,' would be

SELECT name, address
FROM   employee
WHERE  employee.id-number IN
   SELECT subject-id
   FROM   Appearing-In
   WHERE  Subdigital.id IN
      SELECT Subdigital.id
      FROM   Digital-Subdigital
      WHERE  Digital-id IN
         SELECT Digital-id
         FROM   Digital-Subdigital
         WHERE  Subdigital.id IN
            SELECT Subdigital.id
            FROM   Appearing-In
            WHERE  Subject.id IN
               SELECT employee.id
               FROM   employee
               WHERE  name = 'Joseph Smith'

Also discussed is the need for pictorial as well as textual indices. The last first-generation system we discuss is the picture database management system (PICDMS) of Chock, Cardenas, and Klinger (1981; 1984) and the associated query language PICQUERY (Joseph and Cardenas, 1988). This system was initially designed for geographical applications, but it has some quite interesting features that can profitably be used in generic image database management systems. Its most interesting architectural property is
how image information is represented. At each point in an image, different attributes are generally recorded. For geographical applications, these attributes could be spectral data, elevation data, or population data, while in a generic image, these attributes could comprise such data as region segmentation data, boundary data, or optic flow data. Rather than record this information for each point, however, an image is subdivided into a gridlike pattern, where each grid element is of some equal small area, and the preceding attributes are recorded for each entire grid element. Rather than store particular attribute values for the entire image in an individual record, however, here a record consists of all the attribute values for the same grid element. Thus, if an image consists of g grid cells, each grid cell having a attributes, rather than having an image file consisting of a records, each record having g fields, this approach has an image file consisting of g records, each record having a fields. The associated query language PICQUERY allows the user to request such operations as edge detection, different kinds of segmentation, and similarity retrievals of various sorts.
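A rough relational rendering of this record layout, using table and attribute names that are our own illustrative assumptions rather than PICDMS notation, would give each tuple one grid cell together with all of that cell's attribute values:

grid-record (image-id, cell-row, cell-column, spectral-value, elevation, population)

SELECT cell-row, cell-column
FROM   grid-record
WHERE  image-id = 'frame-12' AND elevation > 1000

Each of the g grid cells of an image contributes one such record carrying all a of its attributes, matching the record organization described above.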
3.2 Second-Generation Systems
Systems in this generation are characterized by using more powerful data modeling techniques. Either various semantically rich extensions to the relational model are used or a model of the object-oriented variety is adopted. The system REMINDS (Mehrotra and Grosky, 1985) discusses a generic image data model that also includes aspects related to image interpretation tasks. Although relational in implementation, it has many structural object-oriented aspects to it. Based on the image data model discussed in Section 2.4, the model-base consists of two parts: the generic entity descriptions and the functional subschema. The former consists of hierarchical descriptions of generic entities that the system is expected to manage. A model of an entity consists of descriptions of its parts and their interrelationships. In a hierarchical description, each component entity is further broken down into subentities, down to the level of primitive entities, with recursion being supported. As an example, the following tables capture the generic entity shown in Fig. 8.
Primitive
PrimitiveId   AttributeName   AttributeValue
C             Type            Circle
C             Radius          1
FIG. 8. A generic entity.

ComplexPart

ComplexPart   | ComponentPart | InstanceOf    | Scaling
BearUpperBody | LeftEar       | C             | 1
BearUpperBody | RightEar      | C             | 1
BearUpperBody | Face          | BearFace      | 1
BearFace      | LeftEye       | C             | 0.9
BearFace      | RightEye      | C             | 0.9
BearFace      | Skull         | C             | 5
BearLowerBody | Stomach       | C             | 5
BearLowerBody | LeftLeg       | C             | 3.5
BearLowerBody | RightLeg      | C             | 3.5
Bear          | UpperBody     | BearUpperBody |
Bear          | LowerBody     | BearLowerBody |

Relation

ComplexPart   | ComponentPart1 | ComponentPart2 | RelationType
BearUpperBody | LeftEar        | RightEar       | LeftOf
BearUpperBody | LeftEar        | Face           | Above
BearUpperBody | LeftEar        | Face           | Touch
BearUpperBody | RightEar       | Face           | Above
BearUpperBody | RightEar       | Face           | Touch
BearFace      | LeftEye        | RightEye       | LeftOf
BearFace      | LeftEye        | Skull          | Inside
BearFace      | RightEye       | Skull          | Inside
BearLowerBody | LeftLeg        | RightLeg       | LeftOf
BearLowerBody | Stomach        | LeftLeg        | Above
BearLowerBody | Stomach        | LeftLeg        | Touch
BearLowerBody | Stomach        | RightLeg       | Above
BearLowerBody | Stomach        | RightLeg       | Touch
Bear          | UpperBody      | LowerBody      | Above
Bear          | UpperBody      | LowerBody      | Touch
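To illustrate how such tables drive interpretation, the following Python sketch (an illustration of ours, not the REMINDS implementation) recursively expands a complex part into its primitive components using the ComplexPart table above.

    # Sketch: expand a complex part into its primitive components using the
    # ComplexPart table.  "C" denotes the primitive circle entity.
    complex_part = {
        "Bear":          [("UpperBody", "BearUpperBody"), ("LowerBody", "BearLowerBody")],
        "BearUpperBody": [("LeftEar", "C"), ("RightEar", "C"), ("Face", "BearFace")],
        "BearFace":      [("LeftEye", "C"), ("RightEye", "C"), ("Skull", "C")],
        "BearLowerBody": [("Stomach", "C"), ("LeftLeg", "C"), ("RightLeg", "C")],
    }

    def primitives(entity):
        """Return the component names of an entity that are instances of primitives."""
        result = []
        for component, instance_of in complex_part.get(entity, []):
            if instance_of in complex_part:          # a complex sub-entity: recurse
                result.extend(primitives(instance_of))
            else:                                    # an instance of a primitive ("C")
                result.append(component)
        return result

    print(primitives("Bear"))
    # ['LeftEar', 'RightEar', 'LeftEye', 'RightEye', 'Skull', 'Stomach', 'LeftLeg', 'RightLeg']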
The hierarchical structure of the generic entity shown in Fig. 8 is exhibited in Fig. 9. Methods are also objects. The functional subschema logically manages the descriptions of all the image interpretation procedures available in the system. For each image interpretation task, a control structure describing how a set of procedures combine to perform that task resides here. This feature makes the image interpretation system highly modular and, in turn, easily modifiable: procedures can be shared among various tasks, new procedures can easily be added, and old procedures can easily be replaced or removed. Thus, duplication of effort in the development of new image analysis techniques can be avoided, making this a highly desirable environment in which to carry out image analysis experiments. Interaction with the image interpretation module should be possible for the user at runtime as well as, of course, for the database administrator. The following tables illustrate a simplified functional subschema for recognizing the generic entity shown in Fig. 8.
FIG. 9. The hierarchical structure of the bear entity.
The table Functions lists the given operators along with their associated addresses, whereas the table FunctionHierarchy exhibits the partial order of the various operations involved. We note that the detectors in this latter table perform such tasks as verifying properties of and relationships among the recognized subcomponents.

Functions

FunctionId | FunctionName
F1         | EdgeOperator
F2         | ThresholdOperator
F3         | ThinningOperator
F4         | LinkingOperator
F5         | LineDetector
F6         | CircleDetector
F11        | FindEdge
F12        | FindLine
F13        | FindCircle
F14        | FindLeftEye
F15        | FindRightEye
F16        | FindSkull
F17        | FindFace
F18        | FindLeftEar
F19        | FindRightEar
F20        | FindUpperBody
F21        | FindStomach
F22        | FindLeftLeg
F23        | FindRightLeg
F24        | FindLowerBody
F25        | FindBear
F26        | LeftEyeDetector
F27        | RightEyeDetector
F28        | SkullDetector
F29        | FaceDetector
F30        | LeftEarDetector
F31        | RightEarDetector
F32        | UpperBodyDetector
F33        | StomachDetector
F34        | LeftLegDetector
F35        | RightLegDetector
F36        | LowerBodyDetector
F37        | BearDetector
FunctionHierarchy

Command       | Predecessor ComponentFunction | Successor ComponentFunction
FindEdge      | EdgeOperator                  | ThresholdOperator
FindEdge      | ThresholdOperator             | ThinningOperator
FindEdge      | ThinningOperator              | LinkingOperator
FindLine      | FindEdge                      | LineDetector
FindCircle    | FindEdge                      | CircleDetector
FindLeftEye   | FindCircle                    | LeftEyeDetector
FindRightEye  | FindCircle                    | RightEyeDetector
FindSkull     | FindCircle                    | SkullDetector
FindLeftEar   | FindCircle                    | LeftEarDetector
FindRightEar  | FindCircle                    | RightEarDetector
FindStomach   | FindCircle                    | StomachDetector
FindLeftLeg   | FindCircle                    | LeftLegDetector
FindRightLeg  | FindCircle                    | RightLegDetector
FindFace      | FindLeftEye                   | FaceDetector
FindFace      | FindRightEye                  | FaceDetector
FindFace      | FindSkull                     | FaceDetector
FindLowerBody | FindStomach                   | LowerBodyDetector
FindLowerBody | FindLeftLeg                   | LowerBodyDetector
FindLowerBody | FindRightLeg                  | LowerBodyDetector
FindUpperBody | FindLeftEar                   | UpperBodyDetector
FindUpperBody | FindRightEar                  | UpperBodyDetector
FindUpperBody | FindFace                      | UpperBodyDetector
FindBear      | FindLowerBody                 | BearDetector
FindBear      | FindUpperBody                 | BearDetector
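The FunctionHierarchy table can be read as a set of precedence constraints among procedures. The following Python sketch is an illustration of ours, not the REMINDS control structure; it derives an execution order for a single command by topologically sorting a few of the rows above.

    # Sketch: treat (predecessor, successor) rows of FunctionHierarchy as precedence
    # edges and derive an execution order for one command via topological sorting.
    from graphlib import TopologicalSorter

    # A few (command, predecessor, successor) rows taken from the table above.
    rows = [
        ("FindEdge",   "EdgeOperator",      "ThresholdOperator"),
        ("FindEdge",   "ThresholdOperator", "ThinningOperator"),
        ("FindEdge",   "ThinningOperator",  "LinkingOperator"),
        ("FindCircle", "FindEdge",          "CircleDetector"),
        ("FindFace",   "FindLeftEye",       "FaceDetector"),
        ("FindFace",   "FindRightEye",      "FaceDetector"),
        ("FindFace",   "FindSkull",         "FaceDetector"),
    ]

    def execution_order(command):
        graph = {}
        for cmd, predecessor, successor in rows:
            if cmd == command:
                graph.setdefault(successor, set()).add(predecessor)
        return list(TopologicalSorter(graph).static_order())

    print(execution_order("FindEdge"))
    # ['EdgeOperator', 'ThresholdOperator', 'ThinningOperator', 'LinkingOperator']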
The next few systems we discuss concern themselves with managing geographic information, but in each approach there are interesting ideas that can easily be applied to generic image database management systems. The system PROBE (Orenstein and Manola, 1988) has been designed by researchers in the database management area and, as such, raises some quite interesting issues. PROBE uses a functional data modeling approach and represents spatial data as collections of points along with associated operations. One important issue concerns the nature of the operations packaged with each object class. Packaging all necessary application-based operations with the corresponding object class will make it difficult for implementers, who must then be familiar with database issues as well as application issues. Thus, the authors leave it to the database system kernel to implement general basic operations and to the object class to handle the more specialized operations. In turn, these specialized operations should be written in such a generalized fashion that they rely on the database system's implemented operations as much as possible. An example of this occurs in their discussion of query processing, where the concept of a geometry filter is also introduced. This is a collection of procedures that iterate over various collections of objects in one or more nested loops, choosing candidates that might satisfy certain query criteria and then verifying that they indeed do satisfy the criteria. As an example, consider the query, 'Find all pairs of objects x and y, such that x and y are close to each other.' This command would be
expressed in their PDM algebra notation as
candidates := spatial-join(x, y)
result := select(candidates, close)

Spatial join is implemented in the database system kernel and chooses pairs of objects likely to be close to one another. In the application, each candidate is examined by the associated method close, where the notion of two objects being close to one another is more precisely defined. To show the applicability of the authors' approach to a generic image database application, we exhibit an example schema from their paper:
type image is entity
    pixels (image, x, y) → pixel
    place (image) → box                 (* Bounding box giving bounding latitudes and longitudes *)
    time (image) → time                 (* When the image was taken *)
    frequency (image) → float           (* Spectral band *)
    feature (image) → set of feature    (* Set of notable features, extracted by an image interpreter *)

type feature is entity
    type (feature) → feature-type
    location (feature) → (latitude, longitude)    (* Real-world coordinates *)
    occurrences (feature) → set of (image, x, y)  (* Describes the occurrence of a feature in each image containing the feature, and gives the position of the feature within the image *)
    near (feature) → set of feature               (* A set of nearby features *)

type road is feature
    name (road) → string
    crosses (road) → set of road
    length (road) → real

type bus-stop is feature
    buses (bus-stop) → bus-line
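Returning to the spatial-join/close example, the division of labor it embodies is a filter-and-refine strategy. The Python sketch below is our own illustration of that idea rather than PROBE's PDM algebra; the bounding-box test and the distance threshold are assumptions.

    # Sketch: a geometry-filter style "close pairs" query.  The kernel-level filter
    # uses cheap bounding-box tests; the application-level predicate refines them.
    import math

    objects = {
        "a": {"bbox": (0, 0, 2, 2),     "center": (1.0, 1.0)},
        "b": {"bbox": (1, 1, 3, 3),     "center": (2.0, 2.0)},
        "c": {"bbox": (10, 10, 12, 12), "center": (11.0, 11.0)},
    }

    def spatial_join(objs, slack=1.0):
        """Filter step: pairs whose (slightly enlarged) bounding boxes overlap."""
        pairs = []
        names = sorted(objs)
        for i, x in enumerate(names):
            for y in names[i + 1:]:
                ax0, ay0, ax1, ay1 = objs[x]["bbox"]
                bx0, by0, bx1, by1 = objs[y]["bbox"]
                if ax0 - slack <= bx1 and bx0 - slack <= ax1 and \
                   ay0 - slack <= by1 and by0 - slack <= ay1:
                    pairs.append((x, y))
        return pairs

    def close(pair, objs, threshold=2.0):
        """Refine step: the application's precise notion of 'close'."""
        p, q = objs[pair[0]]["center"], objs[pair[1]]["center"]
        return math.dist(p, q) <= threshold

    candidates = spatial_join(objects)
    result = [p for p in candidates if close(p, objects)]
    print(candidates, result)   # [('a', 'b')] [('a', 'b')]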
Another system designed by database researchers is that constructed around the query language PSQL (Roussopoulos, Faloutsos, and Sellis, 1988). This language is a spatial data management-based extension of SQL and was first formulated in Roussopoulos and Leifker (1984). The system follows the philosophy of having an extensible language with embedded specialized, application-dependent commands, the latter being implemented by a separate application processor. See Fig. 10 for the architecture of this system. At present, PSQL supports points, line segments, and regions, and it supports numerous specialized operators for these entities. An example command in their system is
SELECT hwy, section
FROM highways, cities
WHERE city = 'Detroit' and
      distance(location, segment) =
      min( SELECT distance(location, segment)
           FROM highways, cities
           WHERE city = 'Detroit' and hwy-name = 'I80')

which finds the highway section of I-80 closest to Detroit. Query processing makes use of the specialized data structures of R-trees and R+-trees (Guttman, 1984; Sellis, Roussopoulos, and Faloutsos, 1987). These indexing mechanisms, or structures like them, can also be used in generic image database management systems.
FIG. 10. The architecture of the image database system for PSQL. © 1988 IEEE.
With respect to the above two systems, the functional data model is much more natural than SQL for various spatial operations and queries. However, image interpretation is quite a bit more complex than spatial operations, and it is unclear from these articles how a real image interpretation task would proceed. Goodman, Haralick, and Shapiro (1989) overcome this shortcoming by indicating, for a particular image interpretation task, not only the image modeling that is necessary but also the associated processing steps, what we previously called the functional subschema. The problem discussed is that of pose estimation, which is determining the location and orientation of an object from its 2-D image. The data model used is CAD-like and hierarchical. Primitive features are considered to be of level 0, while, in general, level-k features represent relationships between features of level less than k. As an example, consider the line drawing shown in Fig. 11. To describe this line drawing, the authors use a data structure called a relational pyramid. This data structure is hierarchical and describes higher level features in terms of features at lower levels. Conceptually, this data structure captures the following information:

Level-0 Features
    Straight: L1, L2, L3
    Curve: C1, C2

Level-1 Features
    Three-Line Junctions
        J2: {(straight, L2), (straight, L3), (curve, C2)}
        J3: {(straight, L1), (straight, L2), (curve, C1)}
    Four-Line Junctions
        J1: {(straight, L1), (straight, L3), (curve, C1), (curve, C2)}

Level-2 Features
    Junction Adjacency
        {(four-line, J1), (three-line, J2)}
        {(four-line, J1), (three-line, J3)}
        {(three-line, J2), (three-line, J3)}
FIG. 11. A sample line drawing. © 1989 IEEE.
For rapid feature matches, another data structure, called the summary pyramid, is constructed based on the relational pyramid. This data structure
captures the number of different types of features. Such a structure based on the relational pyramid is

Level-0 Features
    Straight: 3
    Curve: 2

Level-1 Features
    Three-Line Junctions: [(straight, straight, curve), 2]
    Four-Line Junctions: [(straight, straight, curve, curve), 1]

Level-2 Features
    Junction Adjacency: [(four-line, three-line), 2], [(three-line, three-line), 1]
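The relationship between the two pyramids can be sketched directly: the summary pyramid simply counts, per level, how many features of each kind appear in the relational pyramid. The following Python fragment is an illustration of ours; the dictionary encoding is an assumption, not the authors' representation.

    # Sketch: derive a summary pyramid (type counts) from a relational pyramid.
    from collections import Counter

    relational_pyramid = {
        0: [("straight", "L1"), ("straight", "L2"), ("straight", "L3"),
            ("curve", "C1"), ("curve", "C2")],
        1: [("three-line", ("straight", "straight", "curve")),
            ("three-line", ("straight", "straight", "curve")),
            ("four-line",  ("straight", "straight", "curve", "curve"))],
        2: [("junction-adjacency", ("four-line", "three-line")),
            ("junction-adjacency", ("four-line", "three-line")),
            ("junction-adjacency", ("three-line", "three-line"))],
    }

    def summary_pyramid(relational):
        """Count, at each level, how many features of each kind occur."""
        return {
            level: Counter(
                kind if level == 0 else (kind, detail)
                for kind, detail in features
            )
            for level, features in relational.items()
        }

    for level, counts in summary_pyramid(relational_pyramid).items():
        print(level, dict(counts))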
A functional subschema is then developed that utilizes these data structures. This encompasses creating a 2-D wire-frame representation from the associated image, building the relational and summary pyramids, using an associated index structure into the CAD model database, and finally determining the correct pose. Finally, an interesting use of images has been studied by Pizano, Klinger, and Cardenas (1989). In this paper, the notion of using images for expressing integrity constraints in a spatial database environment is explored. Each image represents an unacceptable database state. For example, in Fig. 12 an image is shown that conveys the fact that automobiles and people cannot be in a crosswalk simultaneously. These constraint images are automatically translated to predicate logic formulas and then to a form more amenable to whichever database management system is at hand.
FIG. 12. An example image constraint description. © 1989 IEEE.
3.3 Third-Generation Systems

In all previous systems discussed, the user could formulate a standard database schema related to the aspect of the world to be modeled. This schema could, of course, include images. However, the user has no control over the module of the system that performs the actual image interpretation. Third-generation systems allow the user some control over this module. There will be some sort of functional subschema that the user can formulate. The only papers of which we are aware that have put some flesh on this concept are those of Jagadish and O'Gorman (1989) and Gupta, Weymouth, and Jain (1991a, b). In Jagadish and O'Gorman (1989), derived image feature types can be built on top of particular base types. This customization is not in terms of a fixed set of operations, however, and whether it can be done dynamically is unclear. There is the notion of a physical hierarchy and a logical hierarchy as part of image data modeling. The physical hierarchy starts at the pixel level, advances to the chain level, the line level, the composite level, the structure level, and finally to the entire image level. In parallel with this, the logical hierarchy provides the semantics of the corresponding physical hierarchical structures. As an implementation of this general concept, the authors introduce the TLC image model, which is an acronym for thin line code. Entities at each level have their own associated attributes and methods. Different notions of inheritance are discussed due to the nature of the application. As an example, a polygon's constituent lines are part of the polygon but are not subtypes of the type polygon. However, these lines may still inherit such attributes as color and thickness from the given polygon. The discussion in Gupta, Weymouth, and Jain (1991a, b) is extremely comprehensive with respect to data model design as part of the implementation of a very general image database management system called VIMSYS (Visual Information Management System). This is the only prototype system in which managing information from image sequences has also been addressed. VIMSYS has a layered data model that is divided into an image representation and relation layer, an image object and relation layer, a semantic object and relation layer, and a semantic event and relation layer, each layer being implemented via object-oriented techniques. In the image representation and relation layer, each image object has multiple representations that are mutually derivable from each other. The image object and relation layer concerns itself with image features and their organization. Examples of such features are those of texture, color, intensity, and geometry. New features can easily be formed from given features. Using supplied constructors, one can define such features as an intensity histogram by the expression graph_of(intensity, integer) as well as a texture field by the
expression matrix_of(append(orientedness, point)). The latter definition illustrates the process of combining two existing features into a composite feature through the use of the operator append. The semantic object and relation layer is used to connect real-world entities with various objects in the preceding two layers. Finally, the semantic event and relation layer is used to construct so-called temporal features, a collection of features over an image sequence. An example of a temporal feature is that of a rotation. The authors' design of a user interface is also quite interesting. The query specification is done through a graphical user interface in an incremental manner. The authors recognize that specifying a query over an image domain is not as straightforward as other researchers have presented it and have given the user much freedom to specify exactly what he or she wants. As an example, the user may want to search for a greenish object of a particular shape. The system allows the user to specify what is meant by the term greenish by manipulating hue, saturation, and lightness scrollbars via a mouse until the shade of green that the user feels is appropriate is exhibited. The user can use similar methods to specify the shape. Thus, the query style is more navigational than in other systems.

4. Similarity Retrieval in Image Database Systems
In image database systems, we often want to retrieve images whose contents satisfy certain conditions specified in an iconic query (i.e., a query that involves input images and conditions on them). In other words, an image database management system must support the retrieval of image data by content (Grosky and Mehrotra, 1989b; 1990). Two types of image data retrieval (or commands) involve input images:

Shape Similarity-Based Retrieval: In these queries, the spatial relationships among the objects in an image are not important. The specified conditions are based on the similarity of shapes. An example is, 'Find all images that contain one or more objects present in the input image or in the view of camera C1.'

Spatial Relationship-Based Retrieval: In these queries, constraints on the similarity of shapes as well as on the similarity of their spatial relationships are specified. For example, 'Find all images containing the object in the view of camera C1 to the left of the object in the view of camera C2,' or 'Find all images having the same objects and same relationships among them as in the input image.'
To process iconic commands, the query image data must be analyzed to identify its contents. In other words, image representation and interpretation
should be components of a viable query processing strategy. The query image(s) as well as the stored images must be efficiently and reliably interpreted. This requires an efficient organization of the model base as well as of the model-base instantiation. The model base has to be searched to interpret the contents of the query images. The model-base instantiation has to be searched to identify the stored images or the model instantiations that meet the conditions specified in the query. We believe that an image database should have one index for organizing the model base and separate indexes for the instantiations of each of the models. In this case, image command processing can be considered a two-phase process. First, the model-base index is searched to analyze the content of the query images. This phase yields the matching or most similar models found in the query images as well as the values of various required model parameters (such as size, location, or various relationships). Then, the instantiation indexes corresponding to the retrieved models can be searched to identify the instantiations or images meeting the query conditions, possibly through the use of various model parameters. Since images are usually corrupted by noise or distortions, the search for similar shapes or images must also be capable of handling corrupted query images. Thus, efficient noise-insensitive and distortion-insensitive index structures based on robust representations of images are essential to achieve image data retrieval in an image database system. Traditional index structures are not directly applicable to these two classes of image retrieval. Several index mechanisms have been proposed to retrieve geometric objects that intersect a given spatial range (Guttman, 1984; Orenstein and Manola, 1988; Samet, 1990a; 1990b; Sellis et al., 1987). These mechanisms are useful for spatial database systems, but they are not useful for the previously mentioned types of image information retrieval. As far as image information retrieval is concerned, the key issues to be handled in the design of an image information retrieval system are the following.

Shape and Image Representation: How can the useful information present in an image be described in terms of the features or properties of the shapes of the objects or their spatial relationships? An important point is that these representations should be extracted automatically by processing the images. For the first type of retrieval, an image is represented as a set of shapes or regions present in that image. Each shape is represented in terms of its properties or primitive structural features. It is generally assumed that all the shapes that could appear in the images to be managed are known a priori. Therefore, a representation of each of the known shapes-objects is usually compiled and stored in the model base. For the second type of image information retrieval, an image is represented by an ordered or partially ordered set of shapes or by a graph structure. The ordering is determined by the spatial relationships of interest.

Similarity Measure: What measures or criteria should be employed to automatically determine the similarity or dissimilarity of two shapes or of the spatial relationships among objects? The similarity measure used by a system depends on the type of features or properties used to represent shapes or spatial relationships.

Index Structures: How should the shape and spatial relationship representations be organized so as to enable an efficient search for similar shapes or spatial relationships based on a predefined similarity measure? Since a large set of known models or images has to be searched to select a subset of models or images that satisfies certain conditions, model and image data must be organized in some index structure to facilitate efficient search.

There are two main classes of approaches to image information retrieval. One class of approaches deals with the design and manipulation of indexes for shape similarity-based retrieval. In other words, these are data-driven techniques for shape recognition. The other set of techniques is concerned with the representation of image spatial knowledge in order to retrieve images based on the similarity of spatial relationships among the various objects appearing in the given images. Some of these techniques are reviewed in the following subsections.
4.1 Shape Similarity-Based Retrieval
Shape matching or object recognition is an important problem in the area of machine vision. A number of approaches have been proposed for interpreting images containing two-dimensional (2-D) objects. Most of the existing techniques are model-based. The goal of a model-based system is to precompile the description of each known object, called a model, and then to use these models to identify any objects present in the input image data and to determine their locations. A model for an object is developed using features extracted from one or more prototypes of that object. In general, the overall functioning of a model-based recognition system can be divided into two phases: the training phase and the recognition phase. In the training phase, the system builds the models of the known objects, stores the models in a database, called the model base, and collects or generates information useful for the recognition of unknown objects. In the recognition phase, the models and other useful information acquired during the
training phase are utilized to analyze the input images. Figure 13 shows the main functional components of a model-based object recognition system. The matching process of the recognition phase of most of the existing model-based object recognition systems can be divided into two component processes: hypotheses generation and hypotheses verification. The hypotheses generation component is responsible for hypothesizing the identities and locations of objects in the scene, whereas the hypotheses verification component performs some tests to check whether a given hypothesis is acceptable or not. This mode of operation is called the hypothesize-and-test paradigm. Several shape matching or object recognition techniques have been proposed. One approach is to use each of the precompiled models, in turn, as a test model. Hence, the object's identity is assumed to be known. The image data is searched for one or more features of the model under consideration. If matching features are found, then an instance of an object is assumed to be present and the location parameters are estimated, if possible or desired. The presence of an object at the estimated location may be verified later. We call this the model-by-model approach to shape recognition (Ayache and Faugeras, 1986; Bolles and Cain, 1982; Turney, Mudge, and Volz, 1985). The main disadvantage of this approach is that the cost of shape matching is usually high, because the image data is exhaustively searched for a selected feature belonging to the test model. Another approach, which we call feature-by-feature (Knoll and Jain, 1986), forms a collection of features from all the models in the training phase and associates with each feature a list containing where and in which objects that feature is found.
FIG. 13. The main functional components of a model-based object recognition system: the training phase and the recognition phase.
Each of these features is then searched for in the image data. If a particular feature is found, the list associated with that feature is used to hypothesize and verify the identities and locations of the possible objects. The main limitation of this approach is that, to achieve a higher speed of recognition, only features that appear in a certain proportion of the models should be used to form the model feature collection (Knoll and Jain, 1986; Turney et al., 1985). To find such features, complex and expensive methods are usually used, and these must be repeated each time a model is deleted or inserted. The fundamental difference between these two approaches (see Fig. 14) is that the model-by-model approach uses a feature belonging to a given model, whereas the feature-by-feature approach uses a feature belonging to a collection of features obtained from the database of models.
FIG. 14. The model-driven approaches to object recognition: (a) the model-by-model approach; (b) the feature-by-feature approach.
These two approaches are model driven in the sense that the image data is searched for model-related feature data, either belonging to a specified model or to a collection of features obtained from the model database, in order to generate hypotheses. The various model-driven techniques are not suitable for database retrieval because a linear search is conducted to find matching models. Therefore, a desirable response time for processing the retrieval requests may not be attainable. Alternatively, the model database can be searched for an image-related feature in order to find which models have that image feature. Once this information is available, the identities and locations of the objects can be hypothesized and verified. In other words, a data-driven approach (Grosky and Mehrotra, 1990; Mehrotra, 1986) to the recognition of objects is another possibility. One way of finding the identity and location of an object that contains a given image feature is to search each model, in turn, for this feature: a data-driven, model-by-model approach. However, another possibility is to form a collection of features belonging to the models and search this collection for the given image feature. Since high speed is one of the desirable characteristics of an object recognition system in a database environment, the search for a given feature in the feature collection must be conducted with a minimum of effort. The efficiency of such a search can be increased by the use of such heuristic search procedures as A* (Grebner, 1986). However, this approach also employs a linear search and is thus not desirable for similarity retrieval in an image database system. The conventional data management approach to speeding up search is to organize the data in a particular way and then employ some appropriately tailored search procedures. For example, binary search can be used with a sorted set of numerical data. If, in addition to the search operation, insertion and deletion operations are also required, the data can be organized in an index structure such as a binary search tree, kd-tree, 2-3 tree, hash table, or B-tree. Since an object recognition system in an image database environment may be required to identify additional objects and may no longer be required to identify some of the previously existing objects, the insertion and deletion of models must also be efficiently handled by such a system. Earlier approaches to data-driven, model-based object recognition (Agin, 1980; Gleason and Agin, 1979) cannot handle complex image data containing overlapping, partially visible, and touching objects, due to the limitations of the features used for building models. Recently, a few data-driven techniques capable of handling complex image data have been proposed (Grosky and Mehrotra, 1990; Lamdan, Schwartz, and Wolfson, 1988; Mehrotra and Grosky, 1989; Stein and Medioni, 1990). In these techniques, as in traditional databases, iconic index structures are employed to store the image
and shape representation in such a way that searching for a given shape or image feature can be conducted efficiently. Some of these techniques handle the insertion and deletion of shapes or image representations very efficiently and with very little influence on the overall system performance. The general functioning of an index-based, data-driven object recognition technique is depicted in Fig. 15. Index-based, data-driven techniques are highly suited for similarity retrieval in an image database management system because they offer efficient shape matching and also the possibility of inserting and deleting models. The existing iconic index structures for shape similarity-based retrieval can be classified into two different classes based on the types of features used to represent shapes: global feature-based indexes and local feature-based indexes.

4.1.1 Global Feature-Based Indexes
These techniques utilize primitive structural features or properties that are derived from the entire shape. Examples of such features are area, perimeter, or a set of rectangles or triangles that cover the entire shape, among others. Since the entire shape is required to extract these features, however, techniques based on these methods cannot handle images containing overlapping or touching shapes.
FIG. 15. The general functioning of an index-based, data-driven object recognition technique: the training phase and the recognition phase.
One of the earliest indexed, structure-based object recognition systems, called the SRI Vision Module, was proposed by Gleason and Agin (Agin, 1980; Gleason and Agin, 1979). This system uses global feature-based shape representations. The regions of a 2-D shape are represented by a vector of numerical attributes (or features) such as area, moments, perimeter, center of mass, the extent in the x and y directions, number of holes, area of holes, aspect ratio, and thinness ratio. Several images of each shape are taken to obtain average values of the various shape attributes. After building representations of all the known shapes, a binary tree-based attribute index of the type shown in Fig. 16 is created as follows. The two feature values with the largest separation for a given attribute and the corresponding pair of shapes are selected to reside at the root node of the index tree. A threshold is then selected for this attribute that distinguishes between the two shapes. Next, two subtrees of the root node are formed so that all shapes whose given attribute value is less than or equal to the threshold become members of the left subtree and all other shapes (i.e., those whose given attribute value is greater than the threshold) become members of the right subtree. This procedure is applied recursively to the two subtrees. This recursion terminates when the size of a subtree becomes one. Insertion or deletion of models requires a complete reconstruction of the decision tree for the new set of models. No secondary storage implementation has been proposed for this index. If N attributes are used to represent a shape, it becomes a point in an N-dimensional feature space. In this case, any multidimensional point indexing technique can be used.
FIG. 16. An example of a decision tree classifier.
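A rough Python sketch of this construction follows. It is our own reading of the description above, not the SRI Vision Module code; the shapes and attribute values are hypothetical, and the separation criterion is simplified to the extreme pair of values for each attribute.

    # Sketch: build a binary decision tree over global shape attributes by repeatedly
    # picking the attribute with the largest separation between two shapes and
    # thresholding halfway between the extreme values.
    shapes = {
        "washer": {"area": 40.0, "perimeter": 30.0, "holes": 1},
        "bolt":   {"area": 25.0, "perimeter": 45.0, "holes": 0},
        "plate":  {"area": 90.0, "perimeter": 50.0, "holes": 4},
    }

    def build(names):
        if len(names) <= 1:
            return names[0] if names else None
        best = None
        for attr in next(iter(shapes.values())):
            values = sorted(shapes[n][attr] for n in names)
            sep = values[-1] - values[0]
            if best is None or sep > best[0]:
                best = (sep, attr, (values[0] + values[-1]) / 2.0)
        _, attr, threshold = best
        left = [n for n in names if shapes[n][attr] <= threshold]
        right = [n for n in names if shapes[n][attr] > threshold]
        if not left or not right:       # attributes cannot separate these shapes
            return names
        return {"attr": attr, "threshold": threshold,
                "left": build(left), "right": build(right)}

    print(build(list(shapes)))

As the text notes, inserting or deleting a model invalidates the chosen splits, so the whole tree must be rebuilt for the new set of models.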
Grosky and Lu (1986) propose a boundary code-based iconic index for shape recognition. In their approach, a shape is represented by the boundary code of its boundary. The similarity of two shapes is then based on the length of a particular type of longest common subsequence, called the longest q-generalized common subsequence (LqGCS), of the boundary codes of the two shapes, based on a generalized pattern matching technique for two strings. An index is designed by packing the boundary codes into a superstring. Each character in the superstring contains a set of votes for the individual strings to which it belongs. To classify an input string (or boundary code), the LqGCS of this string with the superstring is found. The votes of the matching and nonmatching characters are used to determine the quality of the match between the input string and each of the strings in the database. Insertion or deletion of models again requires the complete redesign of the superstring for the new set of models. Recently, Jagadish proposed a retrieval technique for similar rectilinear shapes (Jagadish, 1991). A rectilinear shape is represented by a set of rectangles that cover the entire shape. One of the rectangles is selected as the reference rectangle to normalize the locations and sizes (each represented by a pair of values) of the other rectangles. The location of a rectangle before normalization is represented by the coordinates of the center of the line segment joining the lower-left and upper-right corners of that rectangle. The size of a rectangle before normalization is represented by the pair (x_ur - x_ll, y_ur - y_ll), where (x_ur, y_ur) and (x_ll, y_ll) are the coordinates of its upper-right and lower-left corners, respectively. A shape is described by a vector (t_x, t_y, s, d) for the reference rectangle and a vector (c_x, c_y, s_x, s_y) for each of the other rectangles. Here (t_x, t_y) is the location of the reference rectangle, s is the product of the x and y components of the size of the reference rectangle, d is the ratio of the y and x components of the size of the reference rectangle, (c_x, c_y) is the center of the given rectangle normalized with respect to (t_x, t_y), and s_x and s_y are the x and y components of the size of the given rectangle normalized with respect to the size of the reference rectangle. Thus, a shape covered by k rectangles becomes a point in 4k-dimensional space. Therefore, any multidimensional point indexing method can be used. The similarity of two shapes (or two rectangles) is then determined by the sum of the areas of the various nonintersecting regions, if any, when one shape is placed on the other. Since all these techniques rely on global feature-based shape representations, they cannot handle images with overlapping or touching shapes or objects. We now describe some index-based techniques that permit shape similarity-based retrieval even when input images contain such overlapping or touching shapes.
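Before turning to those techniques, the rectangle-cover normalization just described can be sketched as follows. This is our own illustration of the mapping to a 4k-dimensional point; the choice of the first rectangle as the reference and the use of subtraction for location normalization are assumptions.

    # Sketch: map a rectilinear shape, given as rectangles (x_ll, y_ll, x_ur, y_ur),
    # to the vector (tx, ty, s, d) for the reference rectangle followed by
    # (cx, cy, sx, sy) for each remaining rectangle.
    def shape_vector(rectangles):
        def center(r):
            x0, y0, x1, y1 = r
            return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)

        def size(r):
            x0, y0, x1, y1 = r
            return (x1 - x0, y1 - y0)

        ref, rest = rectangles[0], rectangles[1:]
        tx, ty = center(ref)
        w, h = size(ref)
        vector = [tx, ty, w * h, h / w]                    # (tx, ty, s, d)
        for r in rest:
            cx, cy = center(r)
            rw, rh = size(r)
            vector += [cx - tx, cy - ty, rw / w, rh / h]   # (cx, cy, sx, sy)
        return vector

    # A shape covered by k = 2 rectangles becomes a point in 4k = 8 dimensions.
    print(shape_vector([(0, 0, 4, 2), (4, 0, 6, 6)]))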
4.1.2 Local Feature-Based Indexes
These techniques utilize primitive local structural or relational features to represent shapes and images. Local features are those that do not depend on the entire shape and therefore can be extracted by processing local segments of a shape or an image. Examples of local features are line and curve segments of the object boundary and points of maximal curvature change. Mehrotra and Grosky proposed a data-driven object recognition approach based on local feature-based iconic index structures (Mehrotra, 1986; Mehrotra and Grosky, 1989). They proposed that, given any structural feature-based shape representation technique and a quantitative method to measure the similarity (or difference) between any two features, a feature index tree having the following properties can be created (Grosky and Mehrotra, 1990):

1. The model features are stored at the leaf nodes.
2. Each of the interior nodes contains a feature, called the reference feature. This feature can be either a member of the model feature collection or an artificial feature.
3. The members of any subtree are more similar to the reference feature stored at the root of that subtree than to the reference feature stored at the root of the sibling subtree.

Given a feature of the input image, the best matching feature in the feature index tree can be easily found using the following algorithm:

1. Let the root of the feature index tree be at level 0. Find which of the two reference features at level 1 of the index tree is more similar to the given feature.
2. Search the subtree whose root has the more similar reference feature and ignore the other subtree.
3. Recursively apply this procedure until a leaf node is reached. The feature stored at the leaf node is then taken to be the best matching feature.
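A minimal Python sketch of this descent is given below; it is an illustration of ours, and the feature representation and similarity measure are placeholders rather than those used by Mehrotra and Grosky.

    # Sketch: descend a binary feature index tree, at each level following the child
    # whose reference feature is more similar to the query feature.
    def similarity(f, g):
        """Placeholder similarity: smaller means more similar (Euclidean distance)."""
        return sum((a - b) ** 2 for a, b in zip(f, g)) ** 0.5

    # Internal nodes hold a reference feature per child; leaves hold a model feature
    # and its shape-location list.
    index_tree = {
        "children": [
            {"ref": (0.0, 1.0),
             "node": {"leaf": ((0.1, 0.9), [("bolt", "upper-left corner")])}},
            {"ref": (5.0, 5.0),
             "node": {"leaf": ((4.8, 5.2), [("washer", "hole boundary")])}},
        ]
    }

    def best_match(node, query):
        if "leaf" in node:
            return node["leaf"]
        child = min(node["children"], key=lambda c: similarity(query, c["ref"]))
        return best_match(child["node"], query)

    feature, locations = best_match(index_tree, (4.5, 5.0))
    print(feature, locations)   # (4.8, 5.2) [('washer', 'hole boundary')]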
Associated with each feature stored at a leaf node of the feature index is a list of shape-location information that tells where and in which shapes that feature appears. The shape-location list associated with the best matching feature is used to hypothesize the identities and locations of possible shapes. These hypotheses are later verified. The average time complexity of recognition in this case is O(log2 N) for a feature set of size N. This index structure permits efficient insertion and deletion of models. The index tree could be
developed by incrementally adding the features of each model one at a time or by recursively subdividing the entire collection of all model features. A prototype system based on this feature index is presented in Mehrotra and Grosky (1989). In this system, a shape is modeled as an ordered set of vertices of the polygonal approximation of its boundary. Each vertex is described by a set of attributes that consists of a length, an angle, and its coordinate values. The length attribute gives the distance of the given vertex from the previous vertex, and the angle attribute gives the angle at the given vertex. In other words, a shape is described by an attributed string. Finally, fixed-size subsets (disjoint or nondisjoint) are used as features for building the feature index. Figure 17 shows an example of a feature. An edit-distance-based similarity measure was proposed to determine the similarity of two attributed strings (or features). This similarity measure computes the cost to transform one attributed string into another. It attains a value of zero for exactly matching features and increases with the dissimilarity between the features. Grosky, Neo, and Mehrotra (1989; 1991) extended their binary tree-based feature index to an m-way tree for secondary memory implementation. This generalized index has the following properties:

1. Each internal node has the structure shown in Fig. 18. The value of Ref is a reference feature used to determine key values, while s represents the current out-degree of the node and is restricted to lie in the range [2, m]. The P_i point to subtrees that are also m-way search trees. The K_i are key values that divide the underlying features into intervals.
FIG. 17. An example of a feature.
FIG. 18. Structure of an internal node: Ref | s | P0 | K1 | P1 | K2 | P2 | ... | K_{s-1} | P_{s-1}.
2. The key values in an internal node are in ascending order; i.e., K_i < K_{i+1} for 1 ≤ i ≤ s - 2.
3. All key values in nodes of the subtree pointed to by P_i are less than or equal to the key value K_{i+1}, for 0 ≤ i ≤ s - 2.
4. All key values in nodes of the subtree pointed to by P_{s-1} are greater than the key value K_{s-1}.
5. A typical leaf node is shown in Fig. 19. The value n represents the current number of features in the node. Each F_i is a feature with an associated list L_i containing where and in which models F_i is found. In their implementation, this list has a maximum declared size; any list that gets larger than this bound is chained in an overflow area. Each leaf node can contain from 1 to r features.

The key values of an internal node are similarity values between the reference feature in that node and the features in its subtrees. A good match for an input feature is a feature in the index whose similarity with the input feature is less than some threshold value. A two-phase index search process was proposed to find a good match for an input feature. The first phase, called the external search, searches for a leaf node containing the potentially matching feature. The second phase, called the internal search, searches the data associated with that leaf node for the best matching feature. Two cutoff criteria are used to eliminate some subsets from the search for the best match. Suppose that b is the current best-match key found so far, q the query key, and ξ = sim(q, b) the similarity between q and b, where sim is a metric similarity measure. If sim(q, x) < ξ, then b is updated with x and ξ is updated with sim(q, x), as x is a closer match. The following cutoff criteria provide sufficient conditions for eliminating a subset Y of the key space X if it is known a priori that sim(q, y) > ξ for all y in Y:

1. Suppose Y ⊆ X, x ∈ X, and for every y ∈ Y we have that sim(x, y) ≤ k. Then, if sim(q, x) - k ≥ ξ, we can eliminate subset Y from consideration. That is, no key in Y is closer to the query key than b.
FIG. 19. Structure of a leaf node: n | F0 | L0 | F1 | L1 | ... | F_{n-1} | L_{n-1}.
2. Suppose Y ⊆ X, x ∈ X, and for every y ∈ Y we have that sim(x, y) ≥ k. Then, if k - sim(q, x) ≥ ξ, we can eliminate subset Y from consideration.
The external search starts by traversing the path of the tree to the leaf L that ostensibly contains the exact match. Hence, a better estimate of ξ is obtained, resulting in a possible exclusion of various subtrees from the search. If a good match is not found, the two cutoff criteria are applied while alternately searching the left and the right siblings of L. Once a cutoff criterion is met, further search of siblings in that direction is unnecessary, since the key values in the tree are in ascending order. Another class of data-driven shape matching techniques is based on the concept of geometric hashing. These methods store the model-base information in a hash table that is indexed to search for a given shape or a feature of a shape. Lamdan and Wolfson (1988) represent a shape by a similarity-invariant set of interest points. This is done by defining an orthogonal coordinate frame using an ordered pair of points, called the basis pair, and representing all other points with respect to this frame. Multiple representations of an object using different basis pairs are then obtained. For each basis pair, the transformed coordinates of all other points are hashed into a table that stores all (shape, basis pair) tuples for every coordinate. To analyze given image data, a basis frame is selected from the set of image interest points and the coordinates of all other points are computed with respect to the selected basis. For each transformed point, the hash table is indexed and votes are gathered for the (model, basis pair) tuples appearing there. The number of votes for a (model, basis pair) tuple indicates the quality of the similarity. The transformation parameters are hypothesized using the point correspondence between the model points and the image points. The hypothesized transformation is then verified. Stein and Medioni (1990) propose another hash-based shape matching technique. They represent a shape by the polygonal approximation of its boundary. A set of adjacent line segments of the polygonal approximation, called a super segment, is used as a basic feature for creating a hash table. A super segment is characterized by a set of numerical attributes. The representation of each super segment is gray coded and hashed into a table where (super segment, object) tuples are stored. To analyze a query image, gray codes of the super segments of the input are used to index the hash table to generate and verify hypotheses regarding the identity and location of the shape. This technique also permits the efficient insertion and deletion of models.
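The voting scheme behind geometric hashing can be sketched compactly. The following Python fragment is a toy illustration of ours, not the authors' implementation: the interest points are hypothetical, only one image basis pair is tried, and exact coordinates are hashed rather than quantized bins.

    # Sketch: a toy geometric-hashing index.  Coordinates are treated as complex
    # numbers so that expressing a point in the frame of a basis pair (a translation,
    # rotation, and scale sending b0 -> 0 and b1 -> 1) is a single division.
    from collections import defaultdict
    from itertools import permutations

    def frame_coords(points, b0, b1):
        return [complex(round(((p - b0) / (b1 - b0)).real, 2),
                        round(((p - b0) / (b1 - b0)).imag, 2)) for p in points]

    def build_table(models):
        table = defaultdict(list)
        for name, pts in models.items():
            for b0, b1 in permutations(pts, 2):
                for coord in frame_coords(pts, b0, b1):
                    table[coord].append((name, (b0, b1)))
        return table

    def vote(table, image_points):
        votes = defaultdict(int)
        b0, b1 = image_points[0], image_points[1]     # one basis pair chosen from the image
        for coord in frame_coords(image_points, b0, b1):
            for entry in table.get(coord, []):
                votes[entry] += 1
        return max(votes.items(), key=lambda kv: kv[1]) if votes else None

    models = {"wrench": [0 + 0j, 2 + 0j, 2 + 1j, 0 + 1j]}
    table = build_table(models)
    # The "image" is the wrench rotated by 90 degrees and translated; because the
    # frame is similarity invariant, the model still accumulates the most votes.
    image = [(p * 1j) + (5 + 5j) for p in models["wrench"]]
    print(vote(table, image))   # the model 'wrench' receives the most votes (4)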
Some other data-driven shape matching techniques that are suitable for shape similarity-based retrieval in a database environment are described in Hong and Wolfson (1988); Kalvin et al. (1986); Mehrotra, Kung, and Grosky (1990); and Sethi and Ramesh (1989a; 1989b; 1991).

4.2 Spatial Relationship-Based Retrieval
To retrieve images that meet shape identity and spatial relationship constraints requires efficient representation and organization of spatial relationship knowledge, sometimes called relational models. Very limited research activity has been reported on this type of image data retrieval. Generally, two types of image representation models are used: graphs and strings. These methods assume that any given input image is first processed to obtain the identities and locations of the objects/shapes present in that image. In a graph-based method, an image representation or relational model is defined by a graph whose nodes represent objects and whose edges represent relationships. Shapiro and Haralick (1982) proposed two organizations for graph-based relational models. One of these two organizations is based on the concept of clustering, whereas the other is based on the concept of binary trees. They defined two distance measures to quantify the similarity of two representations. According to their first measure, the distance D(G1, G2) for a pair of graphs (G1, G2), each of size s, is given by
D(G1, G2) = min_f ||f(G1) - G2||,
where f is a permutation of s and ||·|| represents any norm. G1 and G2 are considered similar if D(G1, G2) is less than or equal to some threshold d. The second distance measure is a generalization of the first distance measure. Let M1 = {R1, . . . , Rk} and M2 = {S1, . . . , Sk} be two relational models. For any N-ary relation R ⊆ A^N and association f ⊆ A × B, the composition R ∘ f is defined as

R ∘ f = {(b1, . . . , bN) ∈ B^N | ∃(a1, . . . , aN) ∈ R with (an, bn) ∈ f for 1 ≤ n ≤ N}.

The distance between M1 and M2 is then defined in terms of two types of errors: the structural error and the completeness error of the association f. The structural error of an association f ⊆ A × B with respect to N-ary relations R ⊆ A^N and S ⊆ B^N is

E_S(f) = |R ∘ f - S| + |S ∘ f^(-1) - R|.

The structural error is a measure of the tuples found in R but not in S, or found in S but not in R. The completeness error of an association f ⊆ A × B with respect to N-ary relations R ⊆ A^N and S ⊆ B^N is

E_C(f) = |S - R ∘ f| + |R - S ∘ f^(-1)|.

The completeness error is a measure of the tuples in S that no tuples in R map to, and vice versa. The combined error is then given by

E_{R,S}(f) = C1 E_S(f) + C2 E_C(f).
The total error of f with respect to the relational models M1 and M2 is then given by

E(f) = Σ_{i=1}^{k} E_{R_i,S_i}(f).

The distance between M1 and M2 is given by

GD(M1, M2) = min_f E(f).
The clustering-based approach forms clusters of relational models using one of the previously mentioned distance measures for comparing two relational models or graphs. For each cluster, a representative is selected such that every member of a cluster is more similar to its representative than to the representatives of the other clusters. To retrieve matching images/models, the input relational model is matched against each of the cluster representatives. The clusters whose representatives are closely similar to the input model are then searched for the best matching or closely matching images/models. The binary tree-based relational model organization has the same properties as the binary tree-based feature index structure of Mehrotra and Grosky discussed earlier. For a given set of relational models S, a binary tree is recursively generated. At each level of recursion, for every large enough set of relational models L, two models A and B belonging to L are selected so as to minimize

Σ_{G ∈ L} min[D(G, A), D(G, B)],

where D(R, X) denotes the distance between models R and X. The remaining models of set L are split into two groups P_A and P_B so that every model in P_A is more similar to A than to B and every model in P_B is more similar to B than to A. The search for the best matching relational model starts with the comparison of the input model with the two representatives at level 1, where the root is at level 0. If an acceptable match is found, then the search terminates; otherwise, the subtree with the more similar representative is recursively searched and the other subtree is ignored. No secondary storage implementation has been proposed for any of these methods. Other treatments of relational matching may be found in Haar (1982); Mulgaonkar, Shapiro, and Haralick (1982a, 1982b); Shapiro and Haralick (1981); and Shapiro et al. (1984). Chang, Shi, and Yan (1987) have proposed a two-dimensional string representation for modeling the spatial relationships among the objects in an image. In their approach, the input is regarded as a symbolic image that preserves the spatial relationships among the objects of the original image. This
symbolic image can be obtained by recognizing the identities and the spatial locations in the x and y directions of the objects present in the original image. A symbolic image is encoded as a two-dimensional string. Formally, let V be the set of symbols representing the pictorial objects and let R be the set {=, <, :}, where the symbols in R are not in V. The symbols in R are used to denote the spatial relationships among objects: < denotes the left-right or below-above relationship, = denotes the same spatial location relationship, and : denotes the same set relationship. A two-dimensional string (S1, S2) over V is defined as
(u1 r_1x u2 r_2x . . . r_(n-1)x u_n, u_p(1) r_1y u_p(2) r_2y . . . r_(n-1)y u_p(n)),
where u1 . . . un is a one-dimensional string over V, p: {1, . . . , n} → {1, . . . , n} is a permutation over {1, . . . , n}, and r_1x . . . r_(n-1)x and r_1y . . . r_(n-1)y are one-dimensional strings over R. For example, the symbolic picture shown in Fig. 20 can be represented as (A:F < K = J < R:T, R:T = J < A:F < K). The relational symbols : and = may be omitted, but this can cause ambiguity if the symbolic picture is reconstructed from the two-dimensional string representation. To avoid ambiguity in the reduced two-dimensional string representation, the permutation function is added to the representation. This representation is called the augmented two-dimensional string. The augmented two-dimensional string for the symbolic picture of Fig. 20 is (AF < KJ < RT, RTJ < AF < K, 564123). To retrieve a symbolic picture f, represented by (u, v, p), which contains the symbolic picture f', represented by (u', v', p'), u' must be a subsequence of u and v' must be a subsequence of v. Three different types of subsequences have been defined, based on the positions of symbols and the number of instances of < present in the one-dimensional string.
FIG. 20. A symbolic picture.
A string α is a type-i 1-D subsequence of a string β if α is contained in β and, whenever a1 w1 b1 is a substring of α, a1 matches a2 in β, and b1 matches b2 in β, then

(type 0) r(b2) - r(a2) ≥ r(b1) - r(a1), or r(b1) - r(a1) = 0;
(type 1) r(b2) - r(a2) ≥ r(b1) - r(a1) > 0, or r(b2) - r(a2) = r(b1) - r(a1) = 0;
(type 2) r(b2) - r(a2) = r(b1) - r(a1);

where r(x) is the rank of a symbol x, which is defined as 1 plus the number of instances of < preceding the symbol x. We say (u', v', p') is a type-i subsequence of (u, v, p) if u' is a type-i one-dimensional subsequence of u and v' is a type-i one-dimensional subsequence of v. Let (a, b, t) and (a', b', t') be the two-dimensional strings of two symbolic pictures f and f', respectively. To determine if picture f' is a type-i subpicture of picture f, we need to check whether (a', b', t') is a type-i two-dimensional subsequence of (a, b, t). For example, consider the symbolic pictures shown in Fig. 21 (Chang et al., 1987). The two-dimensional string representations of these pictures are
f:  (pq < r < s, p < rs < q, 1342)
f1: (p < r, p < r, 12)
f2: (p < s, p < s, 12)
f3: (pr < s, p < rs, 123)
In this example, f1, f2, and f3 are all type-0 subpictures of f; f1 and f2 are type-1 subpictures of f; and f1 is a type-2 subpicture of f. Another concept of subsequence-based similarity of symbolic pictures is defined in Lee, Shan, and Yang (1989). In general, the two-dimensional string-based approaches do not utilize any index structure: to retrieve matching images, the query image is matched against all the stored images/models. Thus, in the case of large image databases, this technique may not achieve desirable response times.
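For concreteness, the reduced 2-D strings of a symbolic picture can be computed mechanically from object coordinates. The sketch below is an illustration of ours; it produces the reduced strings only, omits the ':' and '=' symbols and the permutation component, and the coordinates for Fig. 20 are inferred from the strings quoted above.

    # Sketch: derive the reduced 2-D strings (u, v) of a symbolic picture from each
    # object's (column, row) position.  '<' separates successive coordinate values;
    # objects at the same coordinate ("same set") are simply written adjacently.
    def one_d_string(objects, axis):
        other = 1 - axis
        groups = {}
        for name, pos in objects.items():
            groups.setdefault(pos[axis], []).append((name, pos[other]))
        parts = []
        for key in sorted(groups):
            members = sorted(groups[key], key=lambda m: (-m[1], m[0]))
            parts.append("".join(name for name, _ in members))
        return " < ".join(parts)

    # The symbolic picture of Fig. 20 as (column, row) pairs, rows counted from the bottom.
    picture = {"A": (0, 1), "F": (0, 1), "K": (1, 2), "J": (1, 0),
               "R": (2, 0), "T": (2, 0)}

    u = one_d_string(picture, axis=0)
    v = one_d_string(picture, axis=1)
    print(u, "|", v)   # AF < KJ < RT | RTJ < AF < K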
FIG. 21. An example of symbolic picture matching. © 1987 IEEE.
5. Conclusions
Now that researchers in the database community as well as the image interpretation community have shown a mutual interest in its development, the field of image database management should experience much growth. This field is still in its infancy and not yet on a firm footing: the correct questions are just starting to be asked, let alone answered. Among these questions are those concerned with what users require of such a system. How will an interactive session with such a system proceed? Just what queries can users be expected to ask? How do this field and its generalization to sensor-based data management mesh with the other emerging disciplines of multimedia systems, visualization, and artificial reality? In order for this field to develop, much experimentation and system building remains to be done on third-generation image database management systems. General methods for supporting interactive feature construction by the user especially need to be more fully explored. This is the way that users will be able to define particular 'hooks' into sensor data. Such development is more likely to occur in particular narrow domains of application rather than as a general system. This is the reason that the development of image database management systems for particular applications must be encouraged.
Acknowledgments
The authors would like to acknowledge the Institute for Manufacturing Research at Wayne State University, the Center for Robotics and Manufacturing Systems at the University of Kentucky, NIH Grant BRSG-S07RR07114-21, and NASA-Langley Research Center Grant NAG-1-1276.
REFERENCES AND BIBLIOGRAPHY

Agin, G. J. (1980). Computer Vision Systems for Industrial Inspection and Assembly, IEEE Computer 13(5), 11-20.
Ayache, N., and Faugeras, O. D. (1986). HYPER: A New Approach for the Recognition and Positioning of Two-Dimensional Objects, IEEE Trans. on Pattern Analysis and Machine Intelligence PAMI-8(1), 44-54.
Bernstein, R. (1980). Data Base Requirements for Remote Sensing and Image Processing Applications, in "Data Base Techniques for Pictorial Applications," ed. A. Blaser, pp. 319-346. Springer-Verlag Publishing Company, New York.
Bhargava, B. (1980). Design of Intelligent Query Systems for Large Databases, in "Pictorial Information Systems," ed. S. K. Chang and K. S. Fu, pp. 431-445. Springer-Verlag Publishing Company, New York.
Billingsley, F. C. (1980). Data Base Systems for Remote Sensing, in "Data Base Techniques for Pictorial Applications," ed. A. Blaser, pp. 299-318. Springer-Verlag Publishing Company, New York.
Blaser, A., ed. (1980). "Data Base Techniques for Pictorial Applications." Lecture Notes in Computer Science, Vol. 81, Springer-Verlag Publishing Company, New York.
Bolc, L., ed. (1983). "Natural Language Communication with Pictorial Information Systems." Springer-Verlag Publishing Company, New York.
Bolles, R. C., and Cain, R. A. (1982). Recognizing and Locating Partially Visible Objects: The Local-Feature-Focus Method, International Journal of Robotics Research 1(3), 57-82.
Brolio, J., Draper, B. A., Beveridge, J. R., and Hanson, A. R. (1989). ISR: A Database for Symbolic Processing in Computer Vision, IEEE Computer 22(12), 22-30.
Boursier, P. (1985). Image Databases: A Status Report, "Proceedings of the Workshop on Computer Architecture for Pattern Analysis and Image Database Management." Miami Beach, November 1985, pp. 355-358.
Castelli, E. (1989). Symmetry Based Approach to Mechanical Drawing Retrieval, An AI Application, in "Visual Database Systems," ed. T. L. Kunii, pp. 405-414. North-Holland Publishing Company, Amsterdam.
Chang, N. S., and Fu, K. S. (1980a). A Relational Database System for Images, in "Pictorial Information Systems," ed. S. K. Chang and K. S. Fu, pp. 288-321. Springer-Verlag Publishing Company, New York.
Chang, N. S., and Fu, K. S. (1980b). Query-by-Pictorial-Example, IEEE Trans. on Software Engineering SE-6(6), 519-524.
Chang, N. S., and Fu, K. S. (1981). Picture Query Languages for Pictorial Data-Base Systems, IEEE Computer 14(11), 23-33.
Chang, S. K., ed. (1981a). Pictorial Information Systems, Special Issue of IEEE Computer 14(11).
Chang, S. K. (1981b). Pictorial Information Systems, IEEE Computer 14(11), 10-11.
Chang, S. K. (1982). A Methodology for Picture Indexing and Encoding, in "Picture Engineering," ed. K. S. Fu and T. L. Kunii, pp. 33-53. Springer-Verlag Publishing Company, New York.
Chang, S. K. (1984). Image Information Systems, "Proceedings of the IEEE Workshop on Visual Languages," Hiroshima, Japan, December 1984, pp. 213-221.
Chang, S. K. (1989). "Principles of Pictorial Information Systems Design." Prentice-Hall Publishing Company, Englewood Cliffs, NJ.
Chang, S. K., and Fu, K. S., eds. (1980). "Pictorial Information Systems." Lecture Notes in Computer Science, Vol. 80, Springer-Verlag Publishing Company, New York.
Chang, S. K., and Kunii, T. L. (1981). Pictorial Data-Base Systems, IEEE Computer 14(11), 13-21.
Chang, S. K., and Liu, S. H. (1982). Indexing and Abstraction Techniques for a Pictorial Database, "Proceedings of the International Conference on Pattern Recognition and Image Processing." Las Vegas, June 1982, pp. 422-431.
Chang, S. K., and Liu, S. H. (1984). Picture Indexing and Abstraction Techniques for Pictorial Databases, IEEE Trans. on Pattern Analysis and Machine Intelligence PAMI-6(4), 475-484.
Chang, S. K., Reuss, J., and McCormick, B. H. (1977). An Integrated Relational Database System for Pictures, "Proceedings of the IEEE Workshop on Picture Data Description and Management," Chicago, April 1977, pp. 49-60.
Chang, S. K., Reuss, J., and McCormick, B. H. (1978). Design Considerations of a Pictorial Database System, International Journal on Policy Analysis and Information Systems 1, 49-70.
Chang, S. K., Lin, B. S., and Walser, R. (1980). A Generalized Zooming Technique for Pictorial Database Systems, in "Pictorial Information Systems," ed. S. K. Chang and K. S. Fu, pp. 257-287. Springer-Verlag Publishing Company, New York.
IMAGE DATABASE MANAGEMENT
285
Chang, S. K., Shi, Q. Y., and Yan, C. W. (1986). Iconic Indexing by 2-D Strings, “Proceedings of the IEEE Computer Society Workshop on Visual Languages.” Dallas, June 1986, pp. 12 21. Chang, S. K . , Shi, Q. Y., and Yan, C. W. (1987). Iconic Indexing by 2-D Strings, ZEEE Trans. on Pattern Analysis and Machine Intelligence PAMI-9(3), 41 3-428. Chang, S. K., Yan, C. W., Dimitroff, D. C., and Arndt, T. (1988). An Intelligent Image Database System, IEEE Trans. on Software Engineering SE-14(5), 681--688. Chen, P. P. S. (1976). The Entity-Relationship Model: Toward a Unified View of Data, ACM Trans. on Database Management Systems 1(1), 9-36. Cheng, Y., Iyengar, S. S., and Kashyap, R. L. (1988). A New Method of Image Compression Using Irreducible Covers of Maximal Rectangles, IEEE Trans. on Software Engineering SE-14(5), 651-658. Chien, Y. T. (1980). Hierarchical Data Structures for Picture Storage, Retrieval and Classification, in “Pictorial Information Systems,” ed. S. K. Chang and K. S. Fu, pp. 39-74. SpringerVerlag Publishing Company, New York. Chock, M., Cardenas, A. F., and Klinger, A. (1981). Manipulating Data Structures in Pictorial Information Systems, ZEEE Computer 14( 1 l), 43-50. Chock, M., Cardenas, A. F., and Klinger, A. (1984). Database Structure and Manipulation Capabilities of a Picture Database Management System (PICDMS), IEEE Trans. on Pattern Analysis and Machine Intelligence PAMI-6(4), 484-492. Chuang, P. J., and Springsteel, F. (1989). ERDDS-The Intelligent E-R-Based Database Design System, in “Visual Database Systems,” ed. T. L. Kunii, pp. 57-86. North-Holland Publishing Company, Amsterdam. Corre, J. L., and Hegron, G. (1989). Unified Data Structures for Mixed 3-D Scene Management, in “Visual Database Systems,” ed. T. L. Kunii, pp. 319-338. North-Holland Publishing Company, Amsterdam. Danielsson, P. E., and Levialdi, S. (1981). Computer Architectures for Pictorial Information Systems, IEEE Computer, 14(1 I ) , 53-67. Davis, L. S., and Kunii, T. L. (1982). Pattern Databases, in “Data Base Design Techniques 11: Physical Structures and Applications,” pp. 357- 399. Lecture Notes in Computer Science, Vol. 133, Springer-Verlag Publishing Company, New York. Davis, W. A,, and Hwang, C. H. (1986). Organizing and Indexing of Spatial Data, “Proceedings of the Second International Symposium on Spatial Data Handling.” Seattle, July 1986, pp. 5-1 5. Dayal, U., Buchmann, A. P., and McCarthy, D. R. (1988). Rules Are Objects Too: A Knowledge Model for an Active Object-Oriented Database System. “Proceedings of the Second International Workshop on Object-Oriented Database Systems,” pp. 129--143.Lecture Notes in Computer Science Vol. 334, Springer-Verlag Publishing Company, New York. Dyer, C., and Chin, R. T. (1986). Model-Based Recognition in Robot Vision, ACM Computing Surveys 8(1), 67 108. Fu, K. S. (1980). Picture Syntax, in “Pictorial Information Systems,” ed. S. K. Chang and K. S. Fu, pp. 104-127. Springer-Verlag Publishing Company, New York. Fu, K. S., and Kunii, T. L., eds. (1982). “Picture Engineering.” Springer-Verlag Publishing Company, New York. Gleason, G. J., and Agin, G. J. (1979). A Modular System for Sensor-Controlled Manipulation and Inspection, “Proceedings of the Ninth SPIE International Symposium on Industrial Robots.” Washington DC, March 1979, pp. 57-70. Goodman, A. M., Haralick, R. M., and Shapiro, L. G. (1989). Knowledge-Based Computer Vision-Integrated Programming Language and Data Management System Design, IEEE Computer 22( 12), 43-54.
286
WILLIAM I. GROSKY AND RAJlV MEHROTRA
Grcbner, K . (1986). Model-Based Analysis of Industrial Scenes, “Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.” Miami, June 1986, pp. 28 33. Grosky, W. I. (1984). Toward a Data Model for Integrated Pictorial Databases, Computer Vision, Grcrphics, and Image Processing 25(3), 371 382. Grosky, W. I., and Lu, Y. (1986). Iconic Indexing Using Generalized Pattern Matching, Computer Vision, Graphics, und Image Processing 35(3), 383-403. Grosky, W. I., and Mehrotra, R., eds. (1989a). Image Database Management, Special Issue of IEEE Computer, 22( 12). Grosky, W. I., and Mehrotra, R . (1989b). Image Database Management, IEEE Computer 22(12), 7 8. Grosky, W. I . , and Mehrotra, R. (1990). Index-Based Object Recognition in Pictorial Data Management, Computer Vision, Graphics, and Image Processing 52(3), 416- 436. Grosky, W. I., Neo, P., and Mehrotra, R. (1989). An Iconic Index for Model-Based Matching, “Proceedings of the Fifth International Conference on Data Engineering.” Los Angeles, February 1989, pp. 180-187. Grosky, W. I., Neo, P., and Mehrotra, R. (1991). A Pictorial Index Mechanism for ModelBased Matching, Data and Knowledge Engineering, to appear. Gupta. A,, Weymouth, T. E., and Jain, R. (1991a). An Extended Object-Oriented Data Model for Large Image Bases, “Proceedings of the Second Symposium on the Design and Implementation of Large Spatial Databases.” Zurich, August 1991, to appear. Gupta, A., Weymouth, T. E., and Jain, R. (1991b). Semantic Queries with Pictures: The VIMSYS Model, “Proceedings of the Seventeenth International Conference on Very Large Databases.” Barcelona, August 1991, pp. 69-79. Guttman, A. (1984). R Trees: A Dynamic Index Structure for Spatial Searching, “Proceedings of the ACM International Conference on Management of Data.” Boston, June 1984, pp. 47 57. Haar, R. L. ( 1982). Sketching: Estimating Object Positions from Relational Descriptions, Computer Graphics and Image Processing 19(3). 227 247. Ilong, J., and Wolfson, H. J. (1988). An Improved Model-Based Matching Method Using Footprints, “Proceedings of the Ninth International Conference on Pattern Recognition.” Rome, November 1988, pp. 72-78. Hooley, A., Kibblewhite, E. J., Bridgeland, M. T., and Horne, D. A. (1980). Aspects of Handling Data from Astronomical Images, in “Data Base Techniques for Pictorial Applications,” ed. A. Blaser, pp. 413 426. Springer-Verlag Publishing Company, New York. Huang, H. K., Shiu, M., and Suarez, F. R. (1980). Anatomical Cross-Sectional Geometry and Density Distribution Data Base, in “Pictorial Information Systems,” ed. S. K. Chang and K. S. Fu, pp, 351 367. Springer-Verlag Publishing Company, New York. Hull, R., and King, R. (1987). Semantic Database Modeling: Survey, Applications, and Research Issues, ACM Computing Surveys 19(3), 201-260. Iyengar, S . S., and Kashyap, R. L. (1988a). Image Databases, IEEE Trans. on Software Engineerinx SE-14(5), 608 610. lyengar, S. S., and Kashyap, R. L., eds. (1988b). Image Databases, Special Issue of IEEE Truns. on Software Engineering SE-14(5 ) . Jagadish, H. V. (1991). A Retrieval Technique for Similar Shapes, “Proceedings of the ACM International Conference on the Management of Data.” Denver, May 1991, pp. 208-217. Jagadish. H. V., and O’Gorman, L. (1989). An Object Model for Image Recognition, IEEE Computer 22(12), 33 41. Joseph, T., and Cardenas, A. F. (1988). PICQUERY: A High Level Query Language for Pictorial Database Management, IEEE Trans. on Sojfware Engineering SE-14( S ) , 630-638.
IMAGE DATABASE MANAGEMENT
287
Jungert, E. (1984). Conceptual Modeling of Image Information, “Proceedings o f the IEEE Workshop on Visual Languages.” Hiroshima, December 1984, pp. 120-123. Jungert, E., and Chang, E. (1989). An Algebra for Symbolic Image Manipulation and Transformation, in “Visual Database Systems,” ed. T. L. Kunii, pp. 301-318. North-Holland Publishing Company, Amsterdam. Kalvin, A., Schonberg, E., Schwartz, J. T., and Sharir, M. (1986). Two-Dimensional, ModelBased Boundary Matching Using Footprints, International Journal of Robotics Research 4 (Winter), 38-55. Kasturi, R., and Alemany, J. (1988). Information Extraction from Images o f Paper-Based Maps, IEEE Truns. on Software Engineering SE-14(5). 671-1575, Kasturi, R.,Fernandez, R., Amlani, M. L., and Feng, W. C . (1989). Map Data Processing in Geographic Information Systems, IEEE Computer 22( 12), 10 21. Kato, T., and Fujimura, K. (1990). TRADEMARK: Multimedia Image Database System with Intelligent Human Interface, Systems and Compurers in Japan 21(1 I), 33-46. Kato, T., Kurita, T., Shimogaki, H., Mizutori, T., and Fujimura, K. (1991a). A Cognitive Approach to Visual Interaction, “Proceedings of the International Conference on Multimedia Information Systems ’91 .” Singapore, January 1991, pp. 109-120. Kato, T., Kurita, T., Shimogaki, H., Mizutori, T., and Fujimura, K. (1991b). Cognitive View Mechanism for Multimedia Database System, “Proceedings o f the First International Workshop on Interoperability in Multidatabase Systems.” Kyoto, April 1991, pp. 179-186. Kato, T., Kurita, T., and Shimogaki, H. (1991~).Intelligent Visual Interaction with Image Database Systems-Toward the Multimedia Personal Interface, Journal of Information Processing, to appear. Klinger, A,, and Pizano, A. (1989). Visual Structure and Data Bases, in “Visual Database Systems,” ed. T. L. Kunii, pp. 3 -25. North-Holland Publishing Company, Amsterdam. Klinger, A,, Rhode. M. L., and To, V. T. (1978). Accessing Image Data, International Journal on Policy Analysis and Information Systems 1, 171-1 89. Knoll, T. F., and lain, R. C. (1986). Recognizing Partially Visible Objects Using Feature Indexed Hypotheses, IEEE Journal of Robotics and Automation RA-2(1), 3-1 3. Kobayashi, I. (1980). Cartographic Databases, in “Pictorial Information Systems,” ed. S.K. Chang and K. S. Fu, pp. 322-350. Springer-Verlag Publishing Company, New York. Kunii, T. L., ed. (1 989). “Visual Database Systems.” North-Holland Publishing Company, Amsterdam. Kunii, T., Weyl, S.,and Tenenbaum, J. M. (1974). A Relational Database Schema for Describing Complex Pictures with Color and Texture, “Proceedings of the Second International Joint Conference on Pattern Recognition.” Lyngby-Copenhagen, Denmark, August 1974, pp. 310-316. Kunt, M. (1980). Electronic File for X-Ray Pictures, in “Pictorial Information Systems,” ed. S. K. Chang and K . S. Fu, pp. 368-415. Springer-Verlag Publishing Company, New York. Lamdan, Y., and Wolfson, H. J. (1988). Geometric Hashing, A General and Efficient ModelBased Recognition Scheme, “Proceedings of the IEEE International Conference on Computer Vision.” Tampa, FL, December 1988, pp. 238 -249. Lamdan, Y., Schwartz, J. T., and Wolfson, H. J. (1988). Object Recognition by Affine Invariant Matching, “Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition.’’ Ann Arbor, MI, June 1988, pp. 335-344. Lee, E. T. (1980). Similarity Retrieval Techniques, in “Pictorial Information Systems,” ed. S. K. Chang and K. S. Fu, pp. 128-176. Springer-Verlag Publishing Company, New York. 
Lee, S. Y., and Hsu, F. J. (1990). 2-D C-String: A New Spatial Knowledge Representation for Image Database Systems, Pattern Recognition 23( lo), 1077-1087. Lee, S. Y., Shan, M. K., and Yang, W. P. (1989). Similarity Retrieval of Iconic Image Database, Pattern Recognition 22(6), 675-682.
288
WILLIAM I . GROSKY AND RAJlV MEHROTRA
Lee, Y. C., and Fu, K. S. (1983). Query Languages for Pictorial Database Systems, in “Natural Language Communication with Pictorial Information Systems,” ed. L. Bolc, pp. 1 142. Springer-Verlag Publishing Company, New York. Lien Y. E., and Harris, S. K. (1980). Structured Implementation of an Image Query Language, in “Pictorial Information Systems,” ed. S. K. Chang and K. S. Fu, pp. 416 430. SpringerVerlag Publishing Company, New York. Lien, Y. E., and Utter, D. F., Jr. (1977). Design of an Image Database, “Proceedings of the IEEE Workshop on Picture Data Description and Management.” Chicago, April 1977, pp. 131 136. Lin, B. S.. and Chang, S. K. (1979). Picture Algebra for Interface with Pictorial Database Systems, “Proceedings of COMPSAC ’79” Chicago, November 1979, pp. 525-530. Pictorial Database Interface, “Proceedings Lin, B. S., and Chang, S . K. (1980). GRAIN-----A of the IEEE Workshop on Picture Data Description and Management, Asilomar, CA, August 1980, pp. 83-88. Imhman, G. M., Stoltzfus, J. C., Benson, A. N., Martin, M. D., and Cardenas, A. F. (1983). Remotely-Sensed Geophysical Databases: Experience and Implications for Generalized DBMS, “Proceedings of the ACM International Conference on the Management of DataEngineering Design Applications.” San Jose. CA, May 1983, pp. 146 160. Makkuni, R. (1989). A Diagrammatic Interface to a Database of Thangka Imagery, in “Visual Database Systems.” ed. T. L. Kunii, pp. 339-370. North-Holland Publishing Company, Amsterdam. Manola, F., and Orenstein, J. A. (1986). Toward a General Spatial Data Model for an ObjectOriented Database, “Proceedings of the Twelfth International Conference on Very Large Databases.” Kyoto, August 1986, pp. 328 335. McKeown, D. M., Jr. (1979). Representations for Image Databases, “Proceedings of the DARPA Image Understanding Workshop.” Los Angeles, November 1979, pp. 109-1 11. McKeown, D. M., Jr. (1982). Concept Maps, “Proceedings of the DARPA Workshop on Image Understanding.” Palo Alto, CA, September 1982, pp. 142-1 53. McKeown, D. M., Jr., and Reddy, D. J . (1977). A Hierarchical Symbolic Representation for lmagc Databases, “Proceedings of the IEEE Workshop on Picture Data Description and Managcment.” Chicago, April 1977, pp. 40-44. Mehrotra. R. ( 1986). Recognizing Objects Using Data-Driven Index Hypotheses. Ph.D. Dissertation, Computer Science Department, Wayne State University, Detroit. Mehrotra, R., and Grosky, W. I. (1985). REMINDS: A Relational Model-Based Integrated Image and Text Database Management System, “Proceedings of the Workshop on Computer Architecture for Pattern Analysis and Image Database Management,” Miami Beach, November 1985, pp. 348 354. Mehrotra, R., and Grosky, W. I. (1989). Shape Matching Utilizing Indexed Hypotheses Generation and Testing, IEEE Journal of Robotics and Automution RA-5(1), 70-77. Mehrotra, R., Kung, F. K., and Grosky, W. 1. (1990). Industrial Part Recognition Using a Component Index, Image und Vision Computing 8(3), 225 232. Meyer-Wegener, K., Lum, V. Y., and Wu, C. T. (1989). Image Management in a Multimedia Database System, in “Visual Database Systems,” ed. T. L. Kunii, pp.497 524. NorthHolland Publishing Company, Amsterdam. Miller, S. W., and lyengar, S. S. (1983). Representation of Regions of Map Data for Efficient Comparison and Retrieval, “Proceedings of the 1983 Conference on Computer Vision and Pattern Recognition.” Washington, DC, January 1983, pp. 102-107. Mohan, L., and Kashyap, R. L. (1988). 
An Object-Oriented Knowledge Representation for Spatial Information, IEEE Trans. on Software Engineering SE-14(5), 675-681. Mulgaonkar, P. G . , Shapiro, L. G., and Haralick, R. M. (1982a). Recognizing Three Dimensional Objects from Single Perspective Views Using Geometric and Relational Models,
IMAGE DATABASE MANAGEMENT
289
“Proceedings of the IEEE Conference on Pattern Recognition and Image Processing.” Las Vegas, June 1982, pp. 479-484. Mulgaonkar, P. G., Shapiro, L. G., and Haralick, R. M. (1982b). Using Rough Relational Models for Geometric Reasoning, “Proceedings of the IEEE Workshop on Computer Vision : Representation and Control.” Rindge, NH, August 1982, pp. 116-124. Nagdta, M. (1984). Geographical Interface for Image Database Retrieval of Remote Sensing, “Proceedings of the IEEE Workshop on Visual Languages.” Hiroshima, December 1984, pp. 94-100. Nagata, M., and Oonishi, Y. (1985). Video Image Manipulation in Multimedia Pictorial Database Management, “Proceedings of the Workshop on Computer Architecture for Pattern Analysis and Image Database Management.” Miami Beach, November 1985, pp. 340-347. Nagy, G. (1985). Image Database, Image and Vision Computing 3(3), 111-117. Omolayole, J. O., and Klinger, A. (1980). A Hierarchical Data Structure Scheme for Storing Pictures, in “Pictorial Information Systems,” ed. S. K. Chang and K. S. Fu, pp. 1-38. Springer-Verlag Publishing Company, New York. Orenstein, J. A. (1988). Can We Meaningfully Integrate Drawings, Text, Images, and Voice with Structured Data?-A Position Paper, “Proceedings of the Fourth International Conference on Data Engineering.” Los Angeles, February 1988, p. 603. Orenstein, J. A., and Manola, F. A. (1988). PROBE Spatial Data Modeling and Query Processing in an Image Database Application, IEEE Trans. on Software Engineering SE-14(5), 6 1 1-629. Palermo, F., and Weller, D. (1980). Some Data Base Requirements for Pictorial Applications, in “Data Base Techniques for Pictorial Applications,” ed. A. Blaser, pp. 555--568.SpringerVerlag Publishing Company, New York. Pavlidis, T. (1980). Structural Descriptions and Graph Grammars, in “Pictorial Information Systems,” ed. S. K. Chang and K. S. Fu, pp. 86-103. Springer-Verlag Publishing Company, New York. Peckham, J., and Maryanski, F. (1988). Semantic Data Models, ACM Computing Surveys, 20(3), 153-190. Peuquet, D. J. (1984). A Conceptual Framework and Comparison of Spatial Data Models, Cartogruphia 21, 66 113. Phillips, B. (1 988). Multimedia Systems and Text-A Position Statement, “Proceedings of the Fourth International Conference on Data Engineering.” Los Angeles, February 1988, p. 601. Pizano, A., Klinger, A,, and Cardenas, A. (1989). Specification of Spatial Integrity Constraints in Pictorial Databases, IEEE Computer 22( 12), 59-70. Rabitti, F., and Stanchev, P. (1989). GRIM-DBMS: A GRaphical IMage DataBase Management System, in “Visual Database Systems,” pp. 41 5-430. North-Holland Publishing Company, Amsterdam. Reuss, H. L., Chang, S. K., and McCormick, B. H. (1980). Picture Paging for Efficient Image Processing, in “Pictorial Information Systems,” ed. S. K. Chang and K. S. Fu, pp. 228-256. Springer-Verlag Publishing Company, New York. Roussopoulos, N., and Leifker, D. (1984). An Introduction to PSQL: A Pictorial Structured Query Language, “Proceedings of the IEEE Workshop on Visual Languages.” Hiroshima, December 1984, pp. 77-87. Roussopoulos, N., Faloutsos, C., and Sellis, T. (1988). An Efficient Pictorial Database System for PSQL, IEEE Trans. on Software Engineering SE-14(5), 639-650. Sakauchi, M. (1989). Two Interfaces in Image Database Systems, “Proceedings of the IEEE International Workshop on Industrial Applications of Machine Intelligence and Vision.” Tokyo, April 1989, pp. 22~~27. Samet, H. (1990a). “Applications of Spatial Data Structures.” Addison-Wesley Publishing, Reading, MA.
290
WILLIAM I. GROSKY AND RAJlV MEHROTRA
Samct, H. (1990b). “The Design and Analysis of Spatial Data Structures.” Addison-Wesley Publishing, Reading, MA. Schmutz, H. (1980). The Integrated Data Analysis and Management System for Pictorial Applications, in “Data Base Techniques for Pictorial Applications,” ed. A. Blaser, pp. 475494. Springer-Verlag Publishing Company, New York. Scllis, T., Roussopoulos, N., and Faloutsos, C. (1987). The R + Tree: A Dynamic Index for Multidimensional Objects, “Proceedings of the Thirteenth Conference on Very Large Databases.” Brighton, England, September 1987, pp. 507-518. Sethi, 1. K., and Ramesh, N. (1989a). A Flexible 2-D Shape Recognition Approach Through Hashing, “Proceedings of ROBEX ’89” Palo Alto, CA, August 1989, pp. 185 ~188. Scthi, 1. K., and Ramesh, N. (1989b). 2-D Shape Recognition Using Redundant Hashing, “Proceedings of the SPIE Conference on Intelligent Robots and Vision.” Philadelphia, November 1989, pp. 477-486. Sethi, I. K., and Ramesh, N. (1992). Local Association Based Recognition of Two Dimensional Objects, Muchine Vision & Applications, in press. Shapiro, L. G., and Haralick, R. M. (1981). Structural Descriptions and Inexact Matching, IEEE Trans. on Pattern Anrilysis and Machine Intelligence PAM1-3(5), 504-5 19. Shapiro, L. G . , and Haralick, R. M. (1982). Organization of Relational Models for Scene Analysis, IEEE Trans. on Partern Analysis and Machine Intelligence PAMI-4(6), 595 602. Shapiro, L. G . , Moriarty. J. D., Ftaralick, R. M., and Mulgaonkar, P. G. (1984). Matching Three-Dimensional Objects Using a Relational Paradigm, Partern Recognition 17(4), 385 405. Sheth, A. (1988). Managing and Integrating Unstructured and Structured Data: Problems of Representation, Features, and Abstraction-A Position Paper, “Proceedings of the Fourth International Conference on Data Engineering.” Los Angeles, February 1988, pp. 598 599. Shipman, D. (1981). The Functional Data Model and the Data Language DAPLEX, ACM Trans. on Datahuse Management Systems 6( I ) , 140- 173. Shouxuan, Z. (1981). An Approach to Image Database Organization, “Proceedings of the Workshop on Computer Architecture for Pattern Analysis and Image Database Management.’’ Hot Springs, VA, November 1981, pp. 242 249. Stein, F., and Medioni, G. (1990). Efficient Fast Two Dimensional Object Recognition, “Proceedings of the Tenth International Conference on Pattern Recognition,” Vol. 1. Atlantic City, NJ, June 1990, pp. 13 17. Sties, M., Sanyal, B., and Leist, K . (1976). Organization of Object Data for an Image Information System, “Proceedings of the Third International Joint Conference on Pattern Recognition.” Coronado, CA, November 1976, pp. 863 869. Stucki, P., and Menzi, U. (1989). Image-Processing Application Generation Environment: A Laboratory for Prolotyping Visual Data-Bases, in “Visual Database Systems,” ed. T. L. Kunii, pp. 29-40. North-Holland Publishing Company, Amsterdam. Takao, Y.,Itoh, S., and Iisaka, J. (1980). An Image-Oriented Database System, in “Data Base Techniques for Pictorial Applications,” ed. A. Blaser, pp. 527-538. Springer-Verlag Publishing Company, New York. Tamura, H. (1980). Image Database Management for Pattern Information Processing Studies, in “Pictorial Information Systems,” ed. S. K. Chang and K. S. Fu, pp. 198-227. SpringerVerlag Publishing Company, New York. Taniura. H., and Naokazu, Y. ( 1984). Image Database Systems: A Survey, Patrern Recognition 17(1), 29-43. Tanaka, M., and Ichikawa, T. (1988). A Visual User Interface for Map Information Retrieval Based on Semantic Significance. 
IEEE Trans. on Su/tware Engineering SE-14(5), 666-670.
IMAGE DATABASE MANAGEMENT
291
Tang, G. Y. (1981). A Management System for an Integrated Database of Pictures and Alphanumerical Data, Computer Gruphics and Image Processing 16(3), 270-286. Teorey, T. ( 1 990). “Database Modeling and Design, The Entity-Relationship Approach,” Morgan-Kaufmann Publishers, Palo Alto, California. Tsichritzis, D., and Lochovsky, F., eds. (1978). “The ANSI/X3/SPARC DBMS Framework.” AFIPS Press, Arlington, VA. Turney, J. L., Mudge, T. N., and Volz, R. A. (1985). Recognizing Partially Occluded Parts, IEEE Truns. on Pattern Analysis and Machine Intelligence PAMI-7(4), 410-421. Unnikrishnan, A., Shankar, P., and Venkatesh, Y. V. (1988). Threaded Linear Hierarchical Quadtrees for Computation of Geometric Properties of Binary Images, ZEEE Trans. on Softwure Engineering SE-14(5), 659-665. Walter, I. M., Lockemann, P. C., and Nagel, H. H. (1987). Database Support for KnowledgeBased Image Evaluation, “Proceedings of the Thirteenth International Conference on Very Large Data Base.” Brighton, England, September 1987, pp. 3-1 1. Ward, M., and Chien, Y. T. (1979). A Pictorial Database Management System Which Uses Histogram Classification as a Similarity Measure, “Proceedings of COMPSAC ’79.” Chicago, November 1979, pp. 153 156. Wiederhold, G., Brinkley, J., Samadani, R., and Clauer, C. R. (1989). Model-Driven Image Analysis to Augment Databases, in “Visual Database Systems,” ed. T. L. Kunii, pp. 159180. North-Holland Publishing Company, Amsterdam. Yamaguchi, K., and Kunii, T. L. (1982). PICCOLO: A Data Model for Picture Database Computers, in “Picture Engineering,” ed. K . S. Fu and T. L. Kunii, pp. 2-23. SpringerVerlag Publishing Company, New York. Yamaguchi, K., Ohbo, N., Kunii, T. L., Kitagawa, H., and Harada, M. (1980). ELF: Extended Relational Model for Large, Flexible Picture Databases, “Proceedings of the IEEE Workshop on Picture Data Description and Management.” Asilomar, CA, August 1980, pp. 95-100. Yamamura, M., Kamibayashi, N., and Ichikawa, T. (1981). Organization of an Image Database Manipulation System, “Proceedings of the Workshop on Computer Architecture for Pattern Analysis and Image Database Management.” Hot Springs, VA, November 1981, pp. 236-241. Yang, C. C., and Chang, S. K. (1978). Encoding Techniques for Efficient Retrieval from Pictorial Databases, “Proceedings of the IEEE Computer Society Conference on Pattern Recognition and Image Processing.” Chicago, June 1978, pp. 120-125. Yang, C. C., and Chang, S. K. (1980). Convex Polygons, Hypercubes and Encoding Techniques, in “Pictorial Information Systems,” ed. S. K. Chang and K. S. Fu, pp. 75 85. Springer-Verlag Publishing Company, New York. Yang, Y. K., and Fu, K. S. (1984). A Low-Cost Geometry-Preserving Image Database System, “Proceedings of the International Conference on Data Engineering.” Los Angeles, April 1984, pp. 604-610. Yost, R. A. (1988). Can Image Data be Integrated with Structured Data?-A Position Paper, “Proceedings of the Fourth International Conference on Data Engineering.” Los Angeles, February 1988, p. 602. Zdonik, S., and Maier, D. (1989). “Readings in Object-Oriented Databases.” Morgan-Kaufmann Publishers, Palo Alto, CA. Zloof, M. M. (1977). Query-by-Example: A Database Language, IEM Systems Journal 16(4), 324-343. Zobrist, A. L., and Bryant, A. L. (1980). Designing an Image Based Information System, in “Pictorial Information Systems,” ed. S. K. Chang and K. S. Fu, pp. 177-197. SpringerVerlag Publishing Company, New York. Zobrist, A. L., and Nagy, G. (1981). 
Pictorial Information Processing of Landsat Data for Geographic Analysis, IEEE Computer 14(1I), 34 41.
This Page Intentionally Left Blank
Paradigmatic Influences on Information Systems Development Methodologies: Evolution and Conceptual Advances*

RUDY HIRSCHHEIM
College of Business Administration
University of Houston
Houston, Texas
HEINZ K. KLEIN
School of Management
State University of New York
Binghamton, New York

1. Introduction
2. Evolution of Information Systems Development Methodologies
   2.1 Background and Premethodology Era
   2.2 Seven Generations of ISD Methodologies
3. Methodologies and Paradigms
   3.1 Four Paradigms of Information Systems Development
   3.2 Comparison of Differences Between the Four Paradigms
   3.3 Methodologies and Paradigmatic Influences
   3.4 The Relationship Between Paradigms and Generations of ISD Methodologies
4. Paradigms and the Continued Evolution of Methodologies
   4.1 Information Systems Planning (ISP) and Structured Approaches
   4.2 Prototyping and Evolutionary Systems Development
   4.3 Soft Systems Methodology (SSM)
   4.4 Ordinary Work Practices Approach
   4.5 Conclusion and Outlook
5. Conclusion
   Acknowledgments
6. Appendices: Summaries of Methodologies
   6.1 Structured Methodologies
   6.2 Prototyping
   6.3 Soft Systems Methodology
   6.4 UTOPIA
   6.5 ETHICS Methodology
   References
* The research for this paper was in part carried out at the Department of Mathematics and Computer Science at the University Center of Aalborg, Denmark, and supported by a grant from the Danish Research Council.
1. Introduction
The subject of computer-based information systems development (ISD) has received considerable attention in both the popular and academic literature over the past few decades. Given the great influence that information systems (IS) can have on the successful operation of organizations, such attention is hardly surprising. One needs only to remember the horror stories of several infamous failed information systems to realize just how important these systems have become to today's organizations and people's expectations about them.

One area that continues to have a high profile and where a remarkable amount of interest can easily be observed is in the approaches or methodologies for developing information systems. It is likely that hundreds of ISD methodologies exist. An indication of the strong interest in this area is the series of IFIP WG8.1 conferences on the theme Comparative Review of Information Systems Design Methodologies (CRIS), which undertook an analysis of various ISD approaches (Olle et al., 1982; 1983; 1986; and 1988). Other examples of this interest can be found in the books of Couger, Colter, and Knapp (1982), who trace the historical development of a number of important systems analysis and design approaches; Maddison et al. (1983), who compare the tools and techniques adopted by a number of ISD methodologies; Avison and Fitzgerald (1988), who discuss the differences among and limitations of various methodologies; and Cotterman and Senn (1991), who explore the directions future research in information systems development approaches ought to take.

An assumption, which appears common in much of the information systems literature, is that information systems development can be considered a largely rational process, undertaken with the help of various tools and techniques that are founded on the tenets of classical science. To put it differently, there exists a prevailing, accepted mode of information systems development that does not fundamentally differ from one particular approach to the next. It is possible, therefore, to speak of an ISD orthodoxy, one that shares a common underlying philosophy, a shared paradigm as it were (cf. Banville and Landry, 1989; Hirschheim and Klein, 1989). Recently, however, it is possible to note the emergence of some fundamentally different ISD approaches, ones that do not share the same paradigm and differ in their underlying philosophy (cf. Lyytinen, 1986; Bjerknes, Ehn, and Kyng, 1987; Ehn, 1988; Bogh-Andersen and Bratteteig, 1988; Bjerknes et al., 1990; and Greenbaum and Kyng, 1991). These alternative approaches pose a challenge to the ISD orthodoxy, and it is to this challenge that we turn our attention.

In this chapter, we wish to explore the emergence of alternative information systems development methodologies, placing them in their historical
context and noting where and why they differ from each other. In so doing, it is shown that the history of ISD methodologies appears to be driven more by fashionable movements than by theoretical insights. But this should not be taken in a pejorative sense. Nor does this mean that much cannot be learned from an historical analysis of the methodologies. On the contrary, we contend that the past points the way forward, and we seek to show this in the chapter. More specifically, we will attempt to indicate the direction in which mainstream research on ISD methodologies appears to be headed and balance this with an exploration of some interesting and challenging avant-garde approaches. In addition, we will examine the paradigmatic assumptions that underlie the methodologies. It will be seen that through an analysis of the paradigmatic assumptions, not only can considerable insight into the methodologies be gained, but also ways in which they could be improved will become apparent.

The chapter proceeds as follows. Section 2 provides historical background on ISD methodologies, presenting their evolution in terms of seven generations. The purpose of Section 3 is to explain the relationships between paradigms and methodologies. We discuss the paradigmatic assumptions that are most characteristic for each generation. We also summarize our approach to associating a methodology with a particular paradigm. In Section 4, we explore some recent research on ISD methodology development, analyzing the methodologies' problem focus and strengths and weaknesses from a paradigmatic assumptions perspective. We also show how insights gained from multiple paradigms might offer the methodologies a direction for future growth. Section 5 concludes by suggesting how paradigmatic assumptions analysis can serve as a valuable source of inspiration for concrete improvements to methodologies.
2. Evolution of Information Systems Development Methodologies
The purpose of this section is to give an overview of the history of ISD methodologies that can serve as background for the more detailed treatment of four specific methodologies in Section 4. Whereas methodologies are an important part of the evolution of IS as a field of study, a historical introduction to IS would have to cover many other aspects that are not touched upon here [for an interesting introduction to these see Dickson (1981) in Volume 20 of Advances in Computers, which covers the emergence of methodologies briefly on p. 14 under the “Take-off and Maturity” stage]. In order to keep the historical review concise, the following uses a broad brush to highlight the fundamental features that characterize important directions among the hundreds of methodologies that have been proposed over the
years. History is, of course, too complex for any single taxonomy to do justice to its richness. As will be seen in Section 3, the grouping of the methodologies into "generations" is inspired by certain theoretical principles suggesting a possible, but by no means absolute, ordering, and the placement of specific methodologies is open to debate.

2.1 Background and Premethodology Era
It is possible to conceive of the emergence of different ISD methodologies in evolutionary terms. Couger (1973) was perhaps the first to document the evolution of what he called "systems analysis techniques." He described their evolution in terms of four distinct generations. Nine years later, he expanded the evolution to include five generations (Couger, 1982). Others have also tried to document the emergence of new methodologies, not so much from a historical perspective but rather as a comparative review or feature analysis (cf. Cotterman et al., 1981; Olle et al., 1982; 1983; Maddison et al., 1982; Avison and Fitzgerald, 1988). In this section we explore the emergence of ISD methodologies in a fashion similar to that of Couger, but instead of focusing on "techniques" we specifically look at "methodologies," so it is not surprising that our categorization is quite different from his. A methodology in this paper means a codified set of procedures the purpose of which is to guide the work and cooperation of the various parties (stakeholders) involved in the development of information systems. Typically these procedures are supported by a set of preferred methods and tools.* Briefly sketched, we see the evolution of ISD methodologies to have taken the form of eight overlapping stages or generations: The first stage is not really a generation of methodologies. It could be referred to as the era of the "seat-of-the-pants approaches." During this era of information systems development there were no formal methodologies to speak of, only "rules of thumb." (We consider this a premethodology era and reserve the term "generation" for approaches that were actually codified.) In the premethodology era, system developers used a variety of techniques to help them develop computer-based information systems. New techniques

* Our use of the term methodology is derived from Checkland's (1981) distinction between methodology and method, and from Welke's (1983) analysis of how methodologies partition the problem setting of systems development into object systems, perceived images, stages and tasks. Checkland contends that a methodology has an intermediate status between a "philosophy" and a "technique." More precisely he states that a "methodology will lack the precision of a technique but will be a firmer guide to action than a philosophy. Where a technique tells 'how' and a philosophy tells you 'what,' a methodology will contain elements of both 'what' and 'how'" (Checkland, 1981, p. 162). Techniques may be learned by "apprenticeships." They may or may not be fully specifiable. If a technique can be described so that it can be followed by others, for example by following a manual, then we speak of a method.
were invented as needed, and they were usually very hardware dependent. Those techniques that seemed to work in previous development projects were subsequently used again. They became the developer's "rules of thumb" and, in a sense, the "methodology" (cf. Episkopou, 1987). They were typically passed on to other system developers, often by word of mouth. These rules or techniques were typically not codified and sometimes not even written down, although techniques such as flow charting were fairly well documented. Systems development was considered a technical process to be undertaken by technical people. In this era, systems development was all art and no science. Little is known about the success rate of systems development. Even though some very large systems were successfully implemented in the military (such as SAGE: Semi-Automatic Ground Environment) and industry (SABRE), we can guess that many less-ambitious projects failed. The end of this era can be roughly dated to the late 1960s, when several influential treatises on methods, tools and general principles of system development appeared. Before that time, Canning (1956) was likely the very first treatment of how to develop computer-based information systems (cf. Agresti, 1986). Others refer to Rosove (1967) as the first textbook source.
2.2 Seven Generations of ISD Methodologies

In 1968 the term software engineering was coined by a NATO-convened workshop to assess the state of software production and suggest possible avenues to improvement (Shaw, 1990). The use of this term gained popularity in the 1970s and is now often used to refer to the fairly well-structured methods and tools of program design, implementation and testing, under the assumption that systems requirements are given. From an IS perspective, the most difficult problems have already been solved by the time requirements are specified and program design can start. In 1969 an influential treatment of MIS development appeared (Blumenthal, 1969). It focused on the front end of systems development and presented an information systems framework of generic information processing modules. In part these modules would be shared by major information subsystems and in part they were unique to a single information subsystem. Blumenthal also noted the importance of information systems planning for determination of requirements and priorities of projects. He suggested "planned evolution" as a methodology for orderly, organization-wide IS development based on his experiences with the System Development Corporation (cf. Rosove, 1967).
2.2.1 Generation 1: Emergence of Formal Life-Cycle Approaches
It became clear that, for the field of information systems development to grow and be taken seriously, it needed to codify its techniques. Far too many
development projects were failures, and it became necessary to formalize the practice of systems development so that the successful lessons of the past could be documented and passed on. Codified techniques proliferated, and the work of piecing them together into more formal "methodologies" began. Organizations grew up to help in the codification process (for example, the National Computing Center in the United Kingdom, cf. Daniels and Yeates, 1969, 1971). Courses in systems analysis and design became commonplace in both public and private institutions. More and more methodologies emerged (e.g., Glans et al., 1968; Burch and Strater, 1974; Millington, 1978; Lee, 1978). Systems were built from the requirements elicited by the systems analyst from the users. User requirements elicitation was considered a difficult but largely noncontroversial exercise: users had to be asked what information they needed in their jobs. This formed the basis of user requirements. Additionally in this generation, a more rational strategy was taken for the entire exercise of systems development, from the initial stage when a system was considered, through to its implementation and maintenance. This became known as the "systems development life-cycle" (SLC). It divided systems development into distinct stages that allowed the development process to be better managed. It also gave rise to advancements in project management (Cleland and King, 1975) and information systems planning (McLean and Soden, 1977). The ISD methodologies or approaches of generation 1 have been described as the "traditional approaches" by Wood-Harper and Fitzgerald (1982), "the classic approaches" by Hirschheim, Klein and Newman (1991) and "second and third generation systems analysis techniques" by Couger (1982). The rate of successful systems development increased but was still fairly low. Systems development continued to be viewed as a technical process to be undertaken by technical experts. In this era, systems development started to be viewed less as an art and more as a science. With the increasing codification of a technical "orthodoxy" of information systems development, there also appeared the first critical analyses. They either drew attention to the organizational problems causing many "IS failures" (Argyris, 1971; Lucas, 1975) or called for consideration of fundamentals, insisting that IS development and use be seen in the broader context of human inquiry and its limits (Ulrich, 1977; Kirsch and Klein, 1977).
2.2.2 Generation 2: Emergence of the Structured Approaches
While the methodologies of generation 1 helped the developer overcome the limitations of having no codified set of procedures by which to build systems, and offered a set structure for development (i.e., the systems life-cycle), they failed to adequately deal with two perennial problems: changing user
requirements and understandable system designs. From the developers' point of view, the users constantly changed their requirements, which meant that it was difficult, if not impossible, to design the system. There was a need to freeze user requirements so the development could be undertaken. From the users' perspective, it was difficult to know in advance what the implemented system was going to look like. Analysts, it was claimed, failed to adequately describe what the system would embody in its finished form. Computer jargon was often viewed as the culprit. And users often felt the systems developers could not, or would not, speak in a way that was comprehensible to them. These obstacles were thought to be overcome by the development of two techniques associated with the so-called structured methodologies of generation 2; viz., the "sign off" and the "structured walk-through." The former permitted the analyst to work to an agreed specification that was signed off by the users. The latter (at least in theory) permitted the users to better understand what the finished product would look like as they were formally "walked through" the details of the system design during these formal sessions. (Sadly, the structured walk-through has become more a technical procedure for programmers and systems development staff than a joint communicating tool with the user.) More recently (August 1991), joint application development (JAD) has emerged to allow for even better interaction between the analysts and users. The structured methodologies embraced the features of structured analysis and structured design and grew from the enormously successful structured programming languages (see Colter, 1982, for an excellent overview of the evolution of these methodologies). Structured methodologies such as SADT, SSADM and SA proliferated (cf. DeMarco, 1978; Gane and Sarson, 1979; Yourdon and Constantine, 1979; Weinberg, 1980). The methodologies of this generation are referred to by Couger (1982) as "fourth generation techniques." They also facilitated the handling of important issues such as user-friendly interfaces, user involvement and ergonomically sound design. The latest enhancements to structured methodologies typically involve the use of automated tools to assist the analyst (i.e., CASE and integrated CASE tools). Systems development was still perceived as a technical process, but one that had social consequences which had to be considered. The success rate of systems development improved markedly, and development was seen more as a science than an art, or perhaps more accurately, as a form of engineering: from software engineering (Boehm, 1976) to information engineering (Land, 1989).
2.2.3 Generation 3: Emergence of Prototyping and Evolutionary Approaches
Several particularly thorny problems with systems development became ever more pressing in the late 1970s and 1980s: as organizations’
environments continued to change at an increasing pace due to increased competition, internationalization and the like, so too did user requirements. No longer could users wait two to three years for their systems to be developed, nor could they wait that long to find out that the system eventually delivered no longer met their needs. An equally serious problem was that the communication gap between professional analysts and users continued to grow as computer-based information systems addressed ever more complicated applications. Hence the idea emerged that users needed applications software that could be delivered quickly and with which they could experiment to better understand what the final system would be like. This was the purpose behind evolutionary (Lucas, 1978, p. 44) or adaptive (Keen, 1980) systems development and prototyping (Earl, 1978; Naumann and Jenkins, 1982; Alavi, 1984; Budde et al., 1984). Simply stated, a prototype is an experimental version of a system that is used to improve communication and user feedback (and sometimes to demonstrate technical feasibility or efficiency of new software designs; Floyd, 1984, p. 3). A prototype is a scaled-down variant of the final system that exhibits some of its salient features and thereby allows the users to understand the interfaces or computational power. When prototyping first emerged, no clear distinction was drawn between it and evolutionary systems development. Following Iivari (1982), in Section 4.2 we shall speak of evolutionary systems development in which the prototype continues to be improved until "it becomes the system" (Lantz, 1986; early examples, cf. Keen and Scott Morton, 1978). Early prototyping was generally thought to contain five phases (cf. Appendix): identify the basic requirements; develop a design that meets those requirements; implement the design; experiment with the prototype, noting good and bad features; and revise and enhance the prototype accordingly. Through prototyping, a number of the problems associated with life-cycle methodologies could be overcome. Users could tell much earlier on if the system under development would meet their requirements. If not, it could be modified then rather than waiting until it was finished. Additionally, prototyping allowed users who may have had difficulties in formulating and articulating their requirements to work with a prototype, thereby allowing them a much better chance to accurately specify their requirements. And all this without the delays typically associated with life-cycle methodologies. In this generation, prototyping and "evolutionary development" were seen as an advancement over standard life-cycle approaches (Hawgood, 1982). Systems development through prototyping, like the previous generation, was still perceived as a technical process, but one that had social consequences that had to be considered. The success rate of systems development improved markedly, and development was seen more as a science (experimental problem solving) than an art.
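The iterative control flow of these five phases can be made explicit with a short sketch. The following fragment is a minimal, purely hypothetical illustration written in Python; none of the names are drawn from the methodologies cited above, and the "user feedback" is simulated, since the real activity is carried out with people rather than function calls. It is meant only to show how phases 4 and 5 feed back into the prototype until requirements stabilize, not to prescribe an implementation.

    # A deliberately simplified, hypothetical sketch of the five-phase
    # prototyping cycle; all names are illustrative placeholders.

    def identify_basic_requirements():
        # Phase 1: elicit an initial, possibly incomplete set of requirements.
        return {"display order total", "print invoice"}

    def develop_and_implement(requirements):
        # Phases 2 and 3: design and build a scaled-down prototype that
        # exhibits the salient features.
        return {feature: "implemented" for feature in requirements}

    def experiment_with_users(prototype, round_number):
        # Phase 4: users exercise the prototype and report missing or
        # unsatisfactory features.  Simulated here: one new requirement
        # surfaces in the first round only.
        return {"show delivery date"} if round_number == 0 else set()

    prototype = develop_and_implement(identify_basic_requirements())
    round_number = 0
    while True:
        missing = experiment_with_users(prototype, round_number)
        if not missing:
            # Requirements have stabilized; in evolutionary development
            # the prototype at this point "becomes the system".
            break
        # Phase 5: revise and enhance the prototype in light of the feedback.
        prototype.update(develop_and_implement(missing))
        round_number += 1

    print(sorted(prototype))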
2.2.4 Generation 4: Emergence of Socio-Technical, Participative Approaches
The methodologies of generations 2 and 3 progressed the field greatly, but there were a number of issues with which many in the IS community still felt uncomfortable. For one, the level of user involvement permitted in the structured approaches was not considered sufficient by researchers who also had practical experience, such as Bostrom and Heinen (1977), Mumford (1981), DeMaio (1980) and Land and Hirschheim (1983). They felt that sign offs and structured walk-throughs were potentially helpful but fundamentally misguided as a means of eliciting true user involvement. A second concern was with the focus of development. System development approaches had traditionally focused on the technical system rather than the social system. This led to information systems that might have been technically elegant, but were not ideal from a social or work standpoint. They produced work environments that were at best no better than before the system was introduced. This was perceived to be a missed opportunity by the socio-technical community, which suggested such interventions should lead to an improved social as well as technical system. In contrast then, generation 4 approaches used systems development as a vehicle to rethink the social work environment in which the new system would be implemented. Issues such as job satisfaction, learning and the development and use of new skills rose to the fore. In this generation we see the emergence of the participative systems development approaches, e.g., ETHICS (Mumford, 1983); PORGI (Oppelland and Kolf, 1980); and Pava's (1983) STS approach. These methodologies all focus on (1) having the users not only be involved in systems development but take control of the process; and (2) having systems development be used to redesign the work situation, leading to an improved social and technical system. The number of systems developed using participative approaches such as ETHICS was not that great, so it is difficult to assess their success rate. However, research on the use of participative systems design methodologies has reported positive results (cf. Hirschheim, 1983; 1985; 1986). Clearly another difference with this generation of methodologies is the movement away from viewing systems development as a technical process. Instead, systems development is viewed jointly as a social and technical process. In this era, systems development is seen as part art and part science.
2.2.5 Generation 5: Emergence of Sense-Making and Problem-Formulation Approaches
At approximately the same time that participative methodologies were emerging, other approaches were being developed to overcome a number of
shortcomings in the structured approaches. One significant concern surrounded the issue of problem formulation. Earlier generations adopted the position that, while problem formulation might not have been easy, it could, nonetheless, be tackled in a relatively straightforward way by adapting the scientific approach to problem solving (cf. Newell and Simon's 1972 theory of problem solving). Not everyone agreed with this. Checkland (1981), for example, presented his soft systems methodology (SSM) as an alternative that insisted upon a richer interpretation of the "problems of problem formulation." Checkland felt that prior methodologies conceived of the problem that the system was to overcome in too narrow a view. Problems, or perhaps more precisely user requirements, were not easily articulated; in fact it may be misleading to assume that a problem "exists"; rather, one is constructed between various "stakeholders" adhering to differing perspectives. According to Checkland, SSM tools such as rich pictures, and concepts like root definitions and conceptual modeling, allow for successful problem constructions more so than the formal problem definitions advocated in management science and kindred schools of thought. Using SSM as a base, two additional methodologies emerged that attempted not only to extend SSM, but to include insights from other methodologies: MULTIVIEW (Wood-Harper, Antill and Avison, 1985; Avison and Wood-Harper, 1990) and FAOR (Schafer et al., 1988). Each of these embraced the need for "multiple perspectives" and adopted vehicles for implementing it. Others in the IS community, arguing along similar lines, felt the need to develop approaches that would cater to a better mutual understanding between the users and the developers. The term generally used to denote this was "sense making" (cf. Boland and Day, 1982; Banbury, 1987). More specifically, "sense making" can be defined as "the modes in which a group interacts to interpret their environment and arrive at socially shared meanings" (overview in Klein and Hirschheim, 1987, p. 288). Capitalizing on these conceptual developments, a number of system development projects were initiated focusing specifically on vehicles and tools to facilitate sense making; for example, FLORENCE (Bjerknes and Bratteteig, 1984; 1985) and MARS (Mathiassen and Bogh-Andersen, 1985). The latter in fact has grown into a systems development approach that could be termed "the ordinary work practices based approach" (Andersen et al., 1990) (see Section 4.4). While these projects cannot be called methodologies in their own right, they have nevertheless produced a number of methods and tools that could be used in the development of a methodology for sense making, e.g., diary keeping (Jepsen, Mathiassen and Nielsen, 1989), mappings (Lanzara and Mathiassen, 1984) and use of metaphors (Madsen, 1989). As many of these methodologies are fairly new or currently being developed, we cannot say they are more effective in producing successful systems. Clearly
another difference with this generation of methodologies is the movement away from viewing systems development as a purely technical process; it is conceived as mostly a social process. In this era, too, systems development is seen as part art and part science, but the reasons for the art part are more explicitly grounded on a philosophical basis. The foundations of this basis were laid by the later Wittgenstein in his "Philosophical Investigations" and further developed by the revival of the phenomenological and hermeneutic tradition (cf. Boland, 1985; 1991).

2.2.6 Generation 6: Emergence of the Trade-Union Led Approaches
Somewhat concurrent with the development of generation 3 methodologies, and to a large extent as an antithetical reaction to the ideological underpinnings and negative social effects of the generation 1 and 2 approaches, a trade-union based strategy to ISD was proposed (Kubicek, 1983). It focused on the interests of the work force, or more specifically on the trade union representatives of the work force, and how they could control systems development. One segment of the IS community, spearheaded by a group of Scandinavian researchers, saw the need to embark on system development projects that put control in the hands of the work force rather than management. They felt that socio-technical approaches were a form of manipulation to reduce worker resistance to systems that served mostly the interests of managers and owners and offered little to improve the position of the workers. Using action research (Sandberg, 1985) they developed a set of guidelines, tools and techniques that would allow the trade unions to dictate the direction and outcome of the systems development exercise and escape entrapment in systems thinking and methodologies laden with managerial biases. The four most prominent projects that implemented this strategy were the Norwegian "Iron and Metal Project" (Nygaard, 1975); DEMOS (Carlson et al., 1978; Ehn and Sandberg, 1983); DUE (Kyng and Mathiassen, 1982) and UTOPIA (Ehn, Kyng and Sundblad, 1983; Howard, 1985; Bodker et al., 1987). The first three have been called "first generation projects" and the latter a "second generation project" by Ehn and Kyng (1987) as a way of distinguishing their main thrusts: first generation projects focused on "supporting democratic planning" while second generation projects added in the idea of "designing tools for skilled workers." As in the sense-making approaches, these projects have not produced a particular methodology, but rather a set of tools, techniques and principles that could form the basis of a methodology. Taken as a whole, the loose assembly of these tools, techniques and principles has been termed the "collective resource approach" by Ehn and Kyng (1987). Recently, the approach has
evolved to include what is termed "cooperative design" (Kyng, 1991; Greenbaum and Kyng, 1991). While this name is suggestive of a movement closer to the participative approaches, cooperative design does not negate its primary goal of keeping control of systems development in the hands of the trade unions under the rubric of "democratic planning." Little research has been done to evaluate how effective this approach is in developing successful systems. Some proponents such as Ehn (1988) claim positive results, while others shed some doubt on its efficacy (Kraft and Bansler, 1988). Systems development is felt to be part art, part science and part class politics; it is fundamentally conceived as very much a social process rather than a technical one.

2.2.7 Generation 7: Emergence of Emancipatory Approaches
This latest generation is very much in the making, with no examples of methodologies available. It focuses on emancipation and adopts features of the previous generations. It takes its motivation from Habermas's (1984) Theory of Communicative Action. It too conceives of systems development as a social process and sees the need for sense making (what is called mutual understanding), but where it differs is in its orientation toward emancipation, which is striven for through the use of rational or emancipatory discourse. Communication comes to the fore in this approach, and hence vehicles are developed to overcome obstacles to free and undistorted communication. The goal of systems development is a system that would support not only emancipatory discourse but also mutual understanding for all its users. Some progress has been made in this direction in the development of projects and tools to support the emancipatory ideal. The SAMPO project (Lehtinen and Lyytinen, 1983; Auramaki et al., 1988; 1991) provides an approach based on discourse analysis that is supportive of the emancipatory theme. Other work suggests how the emancipatory ideal might be applied in the context of ISD (e.g., Lyytinen and Klein, 1985; Lyytinen, 1986; Ngwenyama, 1987; Lyytinen and Hirschheim, 1988; Hirschheim and Klein, 1989; 1991a; 1991b; Ngwenyama, 1991; Klein and Hirschheim, 1991; and Hirschheim, Klein and Lyytinen, 1991). But as of yet, progress has been primarily on the conceptual front, and there are no approaches that implement this emancipatory theme nor specific systems development projects that have adopted it. As there are no concrete examples of its application, it is not possible to evaluate how effective it would be in the development of successful systems. Its social and philosophical basis suggests that systems development must be much more art than science because system development relies on understanding the users' work language and
other experiential knowledge that can be acquired only through participation in a community's forms of life. System development is a science insofar as work practices can be "rationally reconstructed," which puts them on a clear conceptual foundation as a prerequisite for their "rationalization." However, as systems development means changing forms of life, it is invariably bound up with organizational politics that threaten its rationality.

3. Methodologies and Paradigms
The preceding description of the evolution of methodologies in terms of seven generations emphasized a time line. Although loosely based around time (chronology), there are deeper conceptual connections that tie the generations together. If we focus on the conceptual base of each generation, then methodologies that at first sight have different purposes and look very different are revealed to belong to the same family by virtue of their common underlying assumptions and values. Consider, for example, business systems planning, structured methods and classical life-cycle approaches. Business systems planning's purpose is to develop an organization-wide information systems architecture. Its methods and tools are very different from structured methodologies with their single application focus. Yet both can be assigned to the same "paradigm" as classical life-cycle approaches because of shared assumptions and core beliefs.

We define "paradigm" as the most fundamental set of assumptions adopted by a professional community that allows its members to share similar perceptions and engage in commonly shared practices. In this sense professional communities are communities of shared assumptions. As professional communities are never in full agreement, paradigms easily split into subparadigms, and each of these may mushroom into its own networked subset of kindred spirits. In order to keep the analysis of paradigms manageable, one must single out the philosophically most fundamental assumptions held by a sufficiently large community. These most fundamental assumptions tend to relate to the nature of what exists (ontology) and to the nature of knowledge and the appropriate ways of inquiry for obtaining knowledge (epistemology). Therefore we define a paradigm to consist of assumptions about the constitution or construction of reality and about the nature and origins (sources) of knowledge. Knowledge is simply that which contributes to the improvement of human understanding of oneself, culture, fellow human beings and nature. As systems developers must conduct inquiry as part of design and have to intervene into the social world as part of implementation, it is natural to distinguish two types of related assumptions: those associated with inquiry to obtain the knowledge needed for design and those associated with the
nature of society. Both types of assumptions have affected methodologies and both types are beginning to change in the recent research literature on methodologies. The new focus on mutual understanding reflects different types of assumptions about the nature of knowledge and how it is acquired. The concern for emancipation reflects different assumptions about the nature of society. The following discussion of the nature of these assumptions will prepare the way for presenting our analysis of some of the more recent literature on methodologies in more detail. In the modern world, these assumptions have been deeply influenced by the prevailing canons of science. Hence we would expect connections between the assumptions held by professional communities about science and what these communities consider good practice (the set of assumptions about what defines good practice of medicine or law and what is quackery is a good example of this). Maddison et al. (1983) noted that methodologies came about by practitioners trying to upgrade their standards. If they are correct, it is to be expected that the assumptions about the nature of science in part became embedded in the description and practice of methodologies. Insofar as the assumptions about science vary between different times or different societies, this should in due time be reflected in different types of methodologies. It is our contention that this has indeed happened. But in order to understand the relationship between assumptions and methodologies we need to further elaborate on the notion of "paradigm" and how it applies to ISD.

3.1 Four Paradigms of Information Systems Development
According to Burrell and Morgan (1979) the assumptions about the nature of human knowledge and inquiry can be broken down into four fundamental sets of beliefs: ontological (beliefs about the nature of the world around us); epistemological (beliefs about how knowledge is acquired); methodological (beliefs about the appropriate mechanisms for acquiring knowledge); and human nature issues (beliefs about whether humans respond in a deterministic or nondeterministic, i.e., voluntaristic, fashion). Depending on the precise nature of these beliefs, one can distinguish a "subjectivist-objectivist" dimension, which is more commonly seen as the two extremes of philosophical inquiry. Objectivists hold that the world exists independent of our observation of it and that there is one method for knowledge acquisition, which is the same for both the natural and social world. Subjectivists, on the other hand, hold that the world is socially constructed and not independent of the individual observing it. Moreover, the method for knowledge acquisition in the natural world is not necessarily appropriate for the social world. Clearly, from a subjectivist viewpoint, knowledge arises
from human interaction. No set of observations can replace the sharing of ideas from which arise informed opinions. Hence the importance of sense making and the achieving of mutual understandings throughout the system's life cycle.

The second set of assumptions is associated with the nature of society. Two basic positions can be identified depending on whether one tends to believe that society is best conceived in terms of order or of conflict and radical change. The "order" or "integrationist" view of society emphasizes stability, integration, functional coordination and consensus. The "conflict" or "coercion" view of society stresses change, conflict, disintegration and coercion. Both sets of assumptions, those about knowledge (objectivism vs. subjectivism) and those about society (order vs. conflict), can be combined. They identify two dimensions that when mapped on to one another yield four paradigms of social science that are also manifest in information systems development: functionalism, social relativism, radical structuralism, and neohumanism (cf. Hirschheim and Klein, 1989).

The functionalist paradigm is concerned with providing explanations of the status quo, social order, social integration, consensus, need satisfaction and rational choice. It seeks to explain how the individual elements of a social system interact together to form an integrated whole. The social relativist paradigm seeks explanation within the realm of individual consciousness and subjectivity and within the frame of reference of the perspective: "social roles and institutions exist as an expression of the meanings which men attach to their world" (Silverman, 1970, p. 134). The radical structuralist paradigm has a view of society and organizations that emphasizes the need to overthrow or transcend the limitations placed on existing social and organizational arrangements. It focuses primarily on the structure and analysis of economic power relationships. The neohumanist paradigm seeks radical change, emancipation and potentiality and stresses the role that different social and organizational forces play in understanding change. It focuses on all forms of barriers to emancipation: in particular, ideology (distorted communication), power and psychological compulsions and social constraints; and seeks ways to overcome them.
3.2 Comparison of Differences Between the Four Paradigms

As these paradigms were discussed at some length in Hirschheim and Klein (1989), we prefer to only summarize their key features here and offer some limited comments about them that are necessary for the discussion in Section 4. These summaries and comments are presented in the following tables. Table I provides a summary of the four paradigms. The paradigms are contrasted in terms of ontological and epistemological assumptions (underlying principles that guide inquiry), deficiencies (weaknesses of the paradigm) and the implications for legitimation of systems objectives (how system goals are legitimized).
TABLE I
SUMMARY OF THE PARADIGMS

Functionalism
  Ontological assumptions: System requirements and constraints exist independent of theories or perceptions. They can be described by an empirical base of observations formulated in a neutral language free of distortions. This is realism and it tends to reify system requirements by suppressing their human authorship.
  Epistemological assumptions: Empirical-analytical methods of observation, measurement and induction.
  Deficiencies: Cannot explain how users associate meanings with measurement, how goals are set; resistance is interpreted as failure to comprehend the systemic needs and is irrational. It cannot explain the origination of subjective meanings, conflicting goals, and the like.
  Implications for legitimation of systems objectives: Goals are dictated by a "technological imperative"; i.e., only those goals consistent with the ideal of technical economic rationality are legitimate.

Social relativism
  Ontological assumptions: System requirements and constraints are socially constructed; they change as perceptions change, and perceptions change through continuous social learning and evolution of language and culture.
  Epistemological assumptions: Interpretivist reflection and hermeneutic cycles; raising of consciousness and dissemination of ideas through social interaction.
  Deficiencies: Unable to distinguish justified, informed consensus from social conventions and cultural stereotypes; tendency toward relativism and anarchy.
  Implications for legitimation of systems objectives: Any goals or values are legitimate that are consistent with social acceptance; but there is no way to critically validate the acceptance.

Radical structuralism
  Ontological assumptions: Only the objective economic conditions of the social mode of production exist. These in turn determine the "ideological superstructure." System requirements and constraints exist independent of theories or perceptions. The existence of an independent social reality is denied.
  Epistemological assumptions: Empirical-analytical methods for physical reality. Physical reality is perceived to include the "objective relations of ownership of the means of production." One's vested interest in either maintaining the status quo or its revolutionary change determines what one recognizes as truth in the social realm.
  Deficiencies: Cannot explain the notion of community of interests and social differentiation on the basis of criteria other than economic status; postulates that conflict will vanish if all become members of the working class.
  Implications for legitimation of systems objectives: All objectives other than those that further the class interests of the workers are considered illegitimate and reactionary.

Neohumanism
  Ontological assumptions: Differentiates physical from social reality; the former is similar to the ontology adopted in Functionalism, the latter to Social Relativism.
  Epistemological assumptions: Postulates the need for multiple epistemologies. To gain knowledge about physical nature, approaches similar to those of Functionalism are adopted. The only difference is that the correspondence of truth claims is established through critical debate; truth is "warranted assertability." Extends this notion to knowledge acquisition about social reality, where consensus is the key. Consensus may be fallacious, but is "correctable" through critical debate. Through such critical debate, it is possible to escape the prison of our prejudices.
  Deficiencies: Fails to explain why a rational consensus by the mere "force of the better argument" will occur. And if it does occur, how does one know that the consensus is "authentic"; it might just be another social consensus, better informed perhaps, but still historically contingent.
  Implications for legitimation of systems objectives: Extends the notion of "warranted assertability" to the establishment of norms and values; those system objectives are legitimate that survive maximal criticism and thus are shown to serve generalizable human interests.
Table II presents an overview of the principal concepts and ideas associated with each paradigm along with some representative references. These relate to the paradigms in general and are not specific to IS. The table then compares the paradigms specifically to IS along the following dimensions: nature of information systems application (what the purpose of the information system is); objectives for design and use of information systems (what the goals of information systems development need to be); and role of designer (what the IS developer's function should be). Table III compares the four paradigms in terms of their implications for the various functions of systems development. More specifically, the comparison considers six (fairly standard) functions of systems development along with how each paradigm perceives information and information systems development. In the case of the latter, each paradigm is depicted in terms of its "preferred metaphor for defining information" (its basic view of the concept of information) and "preferred metaphor for defining information systems development" (its basic view of what ISD does). In the case of the former, the six functions explored are problem finding and formulation, analysis, logical design, physical design and technical implementation, organizational implementation and maintenance.
3.2.1 Differences Relating to Human Interests
The paradigms also differ in terms of how they deal with human interests; i.e., what interests are sound and proper. In radical structuralism only the workers' interests are seen as legitimate. In social relativism all interests are seen as legitimate. In neohumanism all interests are considered legitimate as long as they are generalizable, i.e., arguable in a rational discourse; however, no interest is privileged. Functionalism also has the potential for considering all interests. For example, it could support the realization of workers' interests. However, in practice it tends to favor the interests of the societal elites, and most of its methods and tools are biased towards these interests. It is therefore difficult to see functionalism supporting these other interests without a major effort requiring large resources, which are unlikely to be granted by those in power (cf. Klein and Lyytinen, 1985, and the conclusion in Klein and Hirschheim, 1987). In realizing human interests, functionalism can appeal to the common interests and common understandings by incremental, evolutionary reforms of the status quo. Social relativism appeals to all common interests through
the search for consensual norms and common understandings. Radical structuralism denies the existence of a common interest between workers and the owners of capital. The appeal to common interests is seen as a strategic plot for controlling the workers by manipulating and distorting communication and not addressing the root cause of conflict; namely, social injustice. Neohumanism appeals to the common interests but explicitly recognizes that distortions may prevent their realization. It therefore argues for removing the barriers to common understanding that would then allow the generalizable interests to be realized based on nondistorted communication.

3.2.2 Differences in Ontology
In addition to the description of ontological differences offered in Table I, some other differences are worth noting. For example, both radical structuralism and functionalism postulate an independent reality and therefore favor development methods and tools that are similarly objectivist in nature; i.e., they reflect a given reality. Radical structuralism postulates the existence of an inevitable conflict in the social domain that is unresolvable except by revolutionary change. Whereas functionalism denies radical conflict, it does allow for conflict, but in a different sense from radical structuralism: functionalism treats conflict as a phenomenon that does not challenge the fundamental basis of society, but is potentially productive because it contributes to innovation and change. The ontology of neohumanism and social relativism postulates the primacy of language as the only reality we may have, which leads to an epistemology that sees reality as socially constructed.

3.2.3 Differences Related to User Control and the Kind of System Produced
Differences in the kind of system produced relate to the differences arising from the application of the four paradigms; i.e., how the systems that each paradigm produces differ from the others. The differences in developed systems relate to the output and control of systems development and include the following eight features: technology architecture, kind of information flows, control of users, control of systems development, access to information, error handling, training and raison d'être. See Table IV.
1. Technology architecture refers to the way in which specific hardware and software components are configured and matched with the structural units of the organization. The structural differentiation supported by alternative technology architectures, for example, has a considerable impact on the opportunities and privileges afforded various user groups.
TABLE II
PRINCIPAL CONCEPTS AND IDEAS AND THEIR IMPLICATIONS FOR THE NATURE AND THEORY OF ISD

Functionalism
  Principal concepts and ideas: Unity of scientific method: there is no separate mode of inquiry for the cultural sciences; social theory is concerned with the "middle range." Value statements express intentions or emotions and cannot be falsified.
  Representative references: Russell (1929); Popper (1962, 1972); Nagel (1961); Alexander (1985). In IS, the Minnesota School and their descendants; e.g., Dickson et al. (1977); Ives et al. (1980).
  Role of IS designer: The expert; similar to an artisan who masters the means for achieving given ends.
  Nature of information system application: IS is built around deterministic laws of human behavior and technology to gain optimal control of the socioeconomic environment.
  Objectives for design and use of information systems: ISD is concerned with fitting technology; i.e., IS design is a means to better realize predefined objectives. IS use is aimed at overcoming human computation limits and improving productivity.

Social relativism
  Principal concepts and ideas: Individual value judgments are determined by the social institutions and general conditions of human existence (e.g., agrarian vs. postindustrial, etc.). Locally, value judgments are rational if based on an interpretive understanding of the totality of these conditions; e.g., child labor was necessary in the early industrial phase in England, but it is unacceptable now.
  Representative references: Typical of interpretive sociology and theories of the social construction of reality (e.g., Berger and Luckmann, 1967). In IS, Boland (1985); Andersen et al. (1990).
  Role of IS designer: A catalyst who smooths the transition between evolutionary stages for the social system of which he or she is a part.
  Nature of information system application: IS is concerned with the creation and sharing of meaning to legitimate social action whatever it may be; overcoming of tension due to transition from one set of conditions to another.
  Objectives for design and use of information systems: To elicit the design objectives and modes of use that are consistent with the prevailing conditions; to help others to understand and accept them. To develop systems that implement "the prevailing Zeitgeist" (spirit of the times).

Radical structuralism
  Principal concepts and ideas: Those values are rational that lead to social progress. They can be studied as part of the dynamics of the social totality, the development goal of society. There is no other reality than matter; mind is a form of appearance of matter; hence if the material basis of society is changed, the collective consciousness will change with it. If the actual beliefs are different, they represent "false consciousness," which is to be unmasked by studying the deterministic conditions of social evolution.
  Representative references: Marxist writers and many, but not all, of their descendants; see Braverman (1974). In IS, the Marxist position on ISD is not well developed; but see Briefs (1983); Ehn and Sandberg (1979); Ehn and Kyng (1987); Sandberg (1985).
  Role of IS designer: A warrior on the side of the forces of social progress.
  Nature of information system application: IS can contribute to the evolution of society by overcoming the inherent social contradictions; use of IS should be to achieve emancipation of the working class. This involves aggressive application of the natural sciences, which is a force of progress.
  Objectives for design and use of information systems: ISD must be a process of better understanding the requirements set by the current evolutionary stage of society and the place of the organization within it. The IS designer must be on guard not to work into the hands of vested interests; in particular, the use of IS must further the class interest and not the exploitation of the common person.

Neohumanism
  Principal concepts and ideas: At any given time there are constraints on people. Some of these are natural, others are due to the current limits of technology or imperfection of social conditions. These only seem natural. Humankind can emancipate itself through approximating a rational discourse; i.e., an informed debate among equally well-informed peers.
  Representative references: Most of the writings of the Frankfurt School of Critical Social Theory; see McCarthy (1978) and Habermas (1984). In IS, not developed, but see Lyytinen (1986); Ngwenyama (1987); Lyytinen and Hirschheim (1988).
  Role of IS designer: An emancipator from social and psychological barriers.
  Nature of information system application: Understanding of the options of social action and free choice; IS is to create a better understanding of these by removing bias and distortions.
  Objectives for design and use of information systems: IS development must be concerned with removing bias and distortion due to seemingly natural constraints; external (power) and internal (psychopathological) barriers to rational discourse must be removed.
TABLE III
PARADIGMATIC IMPLICATIONS FOR ISD FUNCTIONS
(Activities in ISD compared across Functionalism, Social Relativism, Radical Structuralism and Neohumanism)

Preferred metaphor for defining information
  Functionalism: Information as a product; it is produced, traded and made available at will, like a commodity.
  Social relativism: Information as a journey with a partner; information emerges from reflection, interaction and experience.
  Radical structuralism: Information as a means of manipulation and a weapon in ideological struggle.
  Neohumanism: Information as a means for control, sense making and argumentation.

Preferred metaphor for framing ISD
  Functionalism: ISD is like engineering, with the systems developer being the expert of methods and tools.
  Social relativism: ISD is like a journey to an uncertain destination, with the systems developer acting as the facilitator.
  Radical structuralism: Information systems development is like a form of rationalization directed against worker interests. Or a counterstrategy by the workers to deflect exploitation.
  Neohumanism: Information systems development is like an opportunity to improve the control over nature and to overcome unwarranted barriers to communication.

Problem finding and formulation
  Functionalism: Improve prediction and control of the various entities in the business functions through maintaining and analyzing data; identify misfits between organization mission and IS; align structure of IS with business strategy; seek opportunities for competitive advantage.
  Social relativism: Improve conditions for learning and cooperation; identify means to support the improvement of mutual understanding and the creation of new meanings; facilitate interaction and the exchange of information.
  Radical structuralism: Improved productivity of the workers. Or improve the position and enhance the craft and skills of the workers.
  Neohumanism: Improve institutional tools and organizational arrangements for prediction and control, mutual understanding and discourse, and emancipation of all stake holders.

Analysis
  Functionalism: Determine how the key processes of the organization contribute to the intended performance outcomes and which data they need for their effective functioning. For a good review of possible requirements determination strategies, see Davis (1982).
  Social relativism: Understand and investigate the existing basis of interaction and communication, such as differing horizons of meanings of various stake holders.
  Radical structuralism: Identify how IS can increase competitiveness and productivity by increasing work intensity, division of labor and control. Or identify alternative forms of IS that improve the wages and general conditions of work.
  Neohumanism: Identify existing technical, social and linguistic barriers for optimal prediction and control, mutual understanding and emancipation from unwarranted constraints.

Logical design
  Functionalism: Model the portion of organizational reality relevant for the system, using tools such as process modeling and object modeling, and demonstrate functionality through prototyping.
  Social relativism: Reconstruct user language to support interaction to more effectively capture meanings as conveyed in ordinary speech (Boland and Day, 1982).
  Radical structuralism: Construct systems models that enhance productivity and competitiveness. Or use prototypes to experiment with technology that will retain and enhance the skills and tradition of the craft.
  Neohumanism: Reconstruct the technical, linguistic and organizational basis for improving prediction and control, mutual understanding and discourse, and learning and emancipation.

Physical design and technical implementation
  Functionalism: Find cost-effective hardware and software solutions to implement the logical design.
  Social relativism: Not discussed in the literature.
  Radical structuralism: Find cost-effective hardware and software solutions that will improve the quality of work life.
  Neohumanism: Realize changes in technology, language and organization to improve control, mutual understanding and discourse, and emancipation.

Organizational implementation
  Functionalism: Develop strategies to seek compliance by the users to avoid resistance and implementation games (Keen, 1981).
  Social relativism: No implementation strategy needed since ISD supports the ongoing evolutionary change.
  Radical structuralism: Develop strategies to seek compliance by the workers to avoid resistance so as to maximize productivity. Or consider structural changes of control in work organization to enhance the position of the workers.
  Neohumanism: Anticipate potential impact of changes in organization, language and technology on each other; develop strategies to mitigate unwanted side effects.

Maintenance
  Functionalism: Monitor environmental changes and continued functionality of IS.
  Social relativism: No difference between maintenance and continuing evolution of IS.
  Radical structuralism: Monitor the realization of the system objectives regarding productivity and competitiveness. Or monitor the continued use of IS to support the interests of the workers.
  Neohumanism: Monitor the actual performance of IS with regard to control and prediction, mutual understanding, and emancipation, and make adjustments accordingly in the domains of technology, language or organization.
TABLE IV
DIFFERENCES IN DEVELOPED SYSTEMS PRODUCED BY THE FOUR PARADIGMS

Technology architecture
  Functionalism: Technology is fit to the existing organizational structure, respecting departmental boundaries and spheres of influence and authority; replacement of human input, which is seen as error prone and unreliable, by automatic devices (sensors, scanners, etc.) whenever possible.
  Social relativism: Technology is distributed to facilitate free flow of information to all forms of symbolic interaction, taking care not to encroach on tasks that provide opportunities for exercising human judgment, sense making and interpretation.
  Radical structuralism: Technology is used as a means to radically change the boundaries of control and spheres of influence, so as to enhance the power of the work force and retain control of their work products.
  Neohumanism: Technology architectures are designed with several purposes: to serve the technical interest, as in Functionalism; to serve the interest in human understanding, as in Social Relativism; and to overcome unwarranted uses of power and vested interests of any privileged group. As these may be in conflict, technology architecture must be decided upon by free and open negotiation.

Kind of information flows
  Functionalism: Emphasis on objectively measurable quantities and automatic sensoring, under the control of management, favoring top-down instructions and bottom-up control flows.
  Social relativism: Emphasis on judgmental quality of perceiving, and leaving the original data input to humans, who are encouraged to bring to bear their "life world" experience to assure common sense and meaningful inputs.
  Radical structuralism: Emphasis on objectively measurable quantities, but under the control of the worker force. Information is not seen as entirely neutral, but as carrying ideological bias to serve vested class interest.
  Neohumanism: Providing checks and balances on judgmental human input; integrity and consistency checks using multiple channels and dialectics for cross-checking.

Control of users
  Functionalism: Consistent with existing power structures; tends to be top down.
  Social relativism: Eliminate all controls except those implied by evolutionary and egalitarian checks and balances in social interaction.
  Radical structuralism: Reversing the direction of existing power structures.
  Neohumanism: Eliminate unwarranted controls; balances between necessary controls and freedom.

Control of systems development
  Functionalism: Under the control and direction of management, but mediated by professional expert interest groups.
  Social relativism: Emergent from internal group interaction and evolution of peer norms and values.
  Radical structuralism: Under the control of the work force.
  Neohumanism: Rationally justified rules and norms; legitimized by open and free discourse.

Access to information
  Functionalism: Dictated by formal organizational hierarchy: the "need to know" standard.
  Social relativism: Open to all.
  Radical structuralism: As dictated by the need to implement worker control over the productive forces.
  Neohumanism: Open to all, with safeguards against self-delusion and distorted communication (from bias or ideology).

Error handling
  Functionalism: Detection through the reification of statistical data; correction by procedures that can be followed either by the users themselves or consultants.
  Social relativism: Errors are uninterpretable phenomena; they prompt the initiation of a hermeneutic cycle to improve the group understanding. An example is the reinterpretation of a program bug as a special feature (Markus, 1984, p. 2).
  Radical structuralism: Errors are an opportunity for the workers to reaffirm the need for their skills and thereby can provide a power base. This leads to the continuing attempt by management to wrestle error control away from users.
  Neohumanism: Errors in the technical domain are handled in a fashion similar to that of Functionalism; errors in the mutual understanding domain are handled by the initiation of a hermeneutic cycle accompanied by clarifications; errors in the emancipatory domain are challenges to the assumptions underlying the system design regarding power and capabilities and must be handled by opening a discourse about them.

Training
  Functionalism: Determined by division of labor and productivity concerns; instrumental orientation dominates. Education is a source of power and control for management.
  Social relativism: Emphasis is placed on creative sense making and oriented towards furthering shared human understanding.
  Radical structuralism: Education is a means of destabilizing the authority and power of management. It is also used as a source of enhancing the position of the workers vis-a-vis management.
  Neohumanism: Education is a source of emancipation from self-deception and ideology; it has an egalitarian orientation. It should be free and open to all. It also includes skills and creativity training as in Functionalism and Social Relativism.

Raison d'être
  Functionalism: Maximizing savings, minimizing costs, and improving competitive advantage.
  Social relativism: Improving creativity and shared sense making.
  Radical structuralism: Placing in the hands of the work force the control of the productive resources; reducing individual alienation by enhancing the craft of each worker, and reaffirming the control over the results of his or her labor.
  Neohumanism: Improved technical control, human understanding, and emancipation from unwarranted physical and social constraints.
2. Kind of information flows refers to the intended meanings of the information dealt with by the IS. For example, the meaning of the information of one system might be to formalize a particular user group's diagnostic skills so as to leave them out of the diagnostic loop, whereas in another, it might be used to improve the diagnostic capabilities of the users.
3. Control of users refers to how the information system would contribute to or diminish opportunities for one group exercising power, authority or other forms of social influence over another.
4. Control of systems development refers to the locus of influence over the systems development process. In principle this can lie with the people affected by the system or some external group or a mixture.
5. Access to information refers to who would have access to the information provided by the IS and, with it, who stands to benefit from improved information. Control of the access to information can dramatically alter the power structure of an organization.
6. Error handling refers to the arrangement for detecting errors and who would deal with them. Depending on how errors are viewed, they can be used as a basis for external sanctions and rewards, as a means of subjugation or, more positively, as a challenge to creativity, a source of learning and the creation of new meanings.
7. Training refers to the role that education plays as part of system change, who will be selected for training, and whether it is seen as a means to enhance the individual and his or her social position or whether it is confined to mechanical skills for operating the system.
8. Raison d'être refers to the primary reason for the existence of the information system. For example, is it seen as a means for overcoming social barriers, for improving policy formation and competitive advantage, for enhancing management control over workers, for achieving cost savings by replacing labor, etc.?

It should be noted that the eight features chosen for comparing systems differences were derived from analysis of the systems development literature. They are by no means exhaustive, as others could have been chosen, nor are they necessarily mutually exclusive. (1) Technology architecture was derived from Ciborra (1981), who notes the importance of technology architecture for lowering the costs of organizational transactions. (2) Kind of information flow was derived from the language action view of information systems (Goldkuhl and Lyytinen, 1982), which focuses on the purposes of information flow. (3) Control of users was derived from Kling (1980), who notes
that it "is often assumed that when automated information systems become available, managers and line supervisors exploit them to enhance their own control over different resources, particularly the activities of their subordinates" (p. 77). (4) Control of systems development was derived from Briefs, Ciborra and Schneider (1983), who note the importance of internal and external control of the actors who participate in systems development. (See also Mathiassen, Rolskov and Vedel's (1983) critique of both traditional management strategies of ISD and trade union agreements "primarily aiming at controlling the development process from outside" (p. 262) either to minimize costs or to predetermine fixed points for participative decisions.) (5) Access to information was derived from Markus (1983), who vividly shows through her FIS case that the access to information could change the balance of power between different interest groups. A similar point is made in Newman and Rosenberg (1985). (6) Error handling was derived from Markus's (1984) case, where an error was treated as a feature. (7) The importance of training was derived from Kubicek's (1983) observation that worker-sponsored production and distribution of information technology-related knowledge should involve learning activities based on the previous experience of the workers (cf. Ehn et al., 1983). (8) Raison d'être was suggested by studying the goals of information systems in the four paradigms.
3.3 Methodologies and Paradigmatic Influences

Even our highly selective review of the evolution of information systems development methodologies in Section 2 demonstrates that numerous methodologies exist. We cannot know how many methodologies exist, but most likely there are hundreds. A key contention underlying this chapter is that the bewildering variety of methodologies can be reduced to a few major types by relating them to paradigms. Whereas the classification of paradigms is itself problematic, we have proposed four paradigms (based on Burrell and Morgan, 1979), but this classification could easily be replaced if a better one emerges.

Given some classification of paradigms, how does one detect the paradigmatic influences in existing methodologies? This is a particularly thorny problem if the authors of the methodology were not even aware of being influenced by a paradigm. Sometimes it is possible to ask the author(s) (for example, cf. Andersen et al., 1990, p. 302), and some authors have either by hindsight or in advance specified their paradigmatic connections. For example, Goldkuhl, one of the originators of the ISAC methodology, has critically evaluated its underlying paradigmatic assumptions; and Checkland explicitly reveals the phenomenological foundation of SSM (in Checkland, 1981, ch. 8, esp. p. 277).
TABLE V
METHODOLOGY PLACEMENT

Structured methods
  Key building blocks or features (paradigmatic representation):
    1. Functional decomposition (F)
    2. Physical vs. logical model (F)
    3. Self-defining primitives (F)
    4. Consistency checking (F)
    5. Validation and walk-throughs (F)
    6. Nominalism in data dictionary definitions (SR)
    7. Avoiding political conflicts (F)
  Resulting paradigmatic placement: Functionalism (F)

Prototyping
  Key building blocks or features (paradigmatic representation):
    1. Use of data base language, code generators and screen painters (F)
    2. Emphasis on measurable costs (F)
    3. Link with data modeling, entity life-cycle analysis, etc. (F)
    4. Complete and fully formalized specification (F)
    5. "Cooperative design" (SR)
    6. Facilitates learning and communication (SR/NH)
    7. Improves user understanding (SR)
  Resulting paradigmatic placement: Ambiguous; Functionalism is dominant but can support Social Relativism

SSM
  Key building blocks or features (paradigmatic representation):
    1. Rich pictures (SR)
    2. Weltanschauung (SR)
    3. Conflicting root definitions (SR)
    4. Social community (ecology vs. human activity system) (SR)
    5. Linkages to phenomenology (SR)
    6. Functions of human activity system (F)
    7. Conceptual modeling (F)
  Resulting paradigmatic placement: Social Relativism (SR)

Ordinary work practices
  Key building blocks or features (paradigmatic representation):
    1. Emphasis on user learning and reflection (SR)
    2. Respect for significance of the actual work practices and seeking their understanding (SR)
    3. Change of working practices (F/NH)
    4. Disclosure and debate of project interests and values (NH)
    5. Democratic project organization (NH)
    6. Preference for field experiments with emerging methods and tools (SR)
    7. Learning through interaction with objects (concrete experiences) (SR)
    8. Little concern for resource constraints and organizational control (SR)
  Resulting paradigmatic placement: Social Relativism (SR)

UTOPIA
  Key building blocks or features (paradigmatic representation):
    1. Focus on labor-management conflict (RS)
    2. Union leadership in SD (RS)
    3. Recognition of economic necessities by the market (F)
    4. Focus on enhancing position of members of the craft (RS)
    5. Technology can lead to progress for workers (F/RS)
  Resulting paradigmatic placement: Radical Structuralism (RS)

ETHICS
  Key building blocks or features (paradigmatic representation):
    1. Job diagnosis (F)
    2. Social and technical objectives and alternatives have equal weight (SR)
    3. Ranking of alternatives by synthesis based on group consensus (SR)
    4. Emphasis on discussion between proponents of technical and social viewpoint (NH)
    5. Expansion of solution space by improved communication (NH/F)
    6. Emphasis on measurable cost, resources, constraints (F)
    7. Joint optimization-fit (F)
  Resulting paradigmatic placement: Very ambiguous; Functionalism (F) is dominant but with Neohumanist (NH) and Social Relativist (SR) influences

Notes to Table V:
Neohumanism requires a "normative consensus" based on certain standards that serve as a safeguard against bias, groupthink, and other forms of distorted communication that cause a "fallacious" consensus. ETHICS does not emphasize such safeguards; it does not even recognize the issue of fallacious consensus.
Functionalism would emphasize empirical means tests or fit to power structures. Opinion cannot decide on matters of truth by mere discussion, and value issues are decided by those who have the authority to do so, i.e., managerial fiat.
According to DeMarco (1979), this applies to data dictionary definitions. It is the Humpty Dumpty nominalism notion: "When I use a word," Humpty Dumpty said, "it means just what I choose it to mean-nothing more nor less" (p. 143).
In some cases we have sought personal conversations with authors to clarify their paradigmatic position.

When classifying methodologies we assume that one can identify the key principles and building blocks for each methodology. It is then possible to associate each of these features with one particular paradigm. If a feature appears to have multiple paradigm connections, it should be decomposed further into simpler components until it can be clearly related to a distinguishing paradigmatic assumption. Table V illustrates this kind of procedure. It summarizes six information systems development methodologies, noting how their key building blocks (or features) relate to paradigmatic assumptions. Each of the six is then placed in its paradigmatic form; i.e., related to either functionalism, social relativism, radical structuralism or neohumanism. It should be noted that Table V has been kept concise by singling out the most important features for each methodology, but it is not claimed to be complete. Further paradigmatic features are identified in Section 4, where the respective methodologies are discussed in some detail.

When interpreting Table V, it may be helpful to keep in mind how we resolved the following two sources of disagreement about methodology placement. First, there is the issue of multiple paradigmatic influences. As will be seen in the next section, some methodologies do not fit easily into a single paradigm. Soft systems methodology (Checkland, 1981), for example, while largely influenced by the social relativist paradigm, also adopts features of functionalism. The same is true for the ETHICS methodology (Mumford, 1983). It is largely functionalist but it does possess a number of characteristics of social relativism and neohumanism. Prototyping too seems to adopt influences from more than one paradigm. Some of its features are clearly functionalist, while others are consistent with social relativism. (When a methodology is influenced by more than one paradigm, it can be applied in a spirit that is more or less consistent with a given paradigm and thereby grants a great deal of "autonomy" to the practitioner to follow his or her own predilections; this point is taken up later.) On the other hand, some of the methodologies are very heavily influenced by a single paradigm. Structured methodologies, for example, are virtually totally informed by functionalism, while the UTOPIA project (Ehn and Kyng, 1987) is strongly informed by the radical structuralist paradigm. In cases where methodologies are strongly influenced by more than one paradigm, one needs to be sensitive to one's biases and identify all significant influences by seeking out features belonging to different paradigms. Admittedly this does not totally eliminate subjectivity, but by consulting with knowledgeable colleagues we have attempted to minimize any oversights. The scoring of ETHICS and prototyping in Table V illustrates how we attempted to cope with ambiguities. In principle it should be possible to relate key methodological principles and
objectives to one paradigm. The overall allocation is then a matter of judgment about which features one should focus on for a particular purpose.

A second type of ambiguity arises because to some extent the paradigmatic nature of a methodology's feature is in the eye of its beholder. Baskerville (1991) uses the term "philosophical attribution" to call attention to the "fact that a method or tool does not possess an independent philosophy" (p. 678). Either the originator, the researcher or the user interprets a methodology by imposing his or her framework. To some extent, practitioners are "autonomous" and can make a methodology fit their own paradigmatic predilections by "casting" it into some framework (p. 682). If those applying a methodology find themselves uncomfortable with its underlying assumptions, they may bend them to better fit their own orientations. For example, it is possible to give an interpretivistic meaning to some of the features of a functionalist methodology and vice versa (cf. the examples in Baskerville, 1991). Our approach to methodology placement attempts to score the philosophical foundation of methodologies as they are discernible from their published form or are known to us from personal conversations with their originators. Baskerville notes that "the philosophical 'tendencies' of a methodology could flex, if not entirely alter the set of knowledge assumptions ordinarily reflected by an investigator (unless the investigator carefully guards these assumptions)" (p. 678). Here and in Hirschheim and Klein (1989) we have tried to reveal these inherent "philosophical tendencies" of methodologies by identifying the influences of alternative paradigms on their principal features. Baskerville (1991) pinpoints an interesting research issue arising from a paradigmatic assumption analysis; namely, that one cannot simply assume that methodologies are applied as intended. Rather, one needs to examine very carefully to what extent methodologies are applied concordant or discordant with their "underlying tendencies." This requires a careful study of the actual working practices of professional system developers, and we shall return to this specific point in Section 4.4.

3.4 The Relationship Between Paradigms and Generations of ISD Methodologies

We can now return to the question of how the chronological organization of the evolution of methodologies relates to a deeper, conceptual connection among the members of each generation. It is our contention that, from the beginning, information systems development methodologies were influenced primarily by the paradigmatic assumptions of functionalism. This is true not only of the classical systems life-cycle and structured methods that copied the known methods of data collection from social science (observation, interviewing, questionnaires, etc.) into the tool kit for the system analyst,
but also prototyping and the methodologies associated with socio-technical system design. The functionalist nature of prototyping can be seen in its emphasis on cost-effective design, while the socio-technical approaches seem to borrow from functionalism the goal of optimization, which is apparent from the emphasis on "joint optimization," i.e., a synthesis between social and technical concerns of systems design. This presumes that order and regulation will prevail over fundamental change. As the discussion progressed and the rigidity of the functionalist approaches to information systems development became more apparent, influences from other paradigms became more and more effective in interpreting the methodologies or modifying them. In the most recent research literature on methodologies (e.g., Ehn, 1988; Mathiassen and Nielsen, 1989; Lyytinen and Hirschheim, 1988; Hirschheim and Klein, 1989; Greenbaum and Kyng, 1991; Nissen, Klein and Hirschheim, 1991; Floyd, Budde and Zuellighoven, 1991), the limitations of strictly functionalist approaches to systems development have become quite prominent. For example, the communication gap between technical staff and users has become very visible (Bansler and Havn, 1991), the emergent nature of system requirements is now obvious to almost anyone (cf. Kling and Scacchi, 1980; 1982; Truex and Klein, 1991) and the role of power in organizations has become more clearly understood and can no longer be denied (Pettigrew, 1973; Pfeffer, 1981). This has produced important shifts in our thinking, opening the minds of more and more systems developers to ideas from alternative paradigms. Power has become recognized as one of the principal obstacles to genuine participation (cf. Briefs et al., 1983; Mulder, 1971). Wherever there is the use of power, conflicts are to be expected, and the nature of conflict and its legitimacy have become serious research concerns (Keen, 1981; Newman and Noble, 1990); emergent sense making has been researched as an important theme in the institutionalization of computer-based packages (cf. the web model and interactionist theories in Kling, 1987; Kling and Iacono, 1984; Kling and Scacchi, 1982). Shifts like these in current thinking are reflected in a number of research efforts into the development of alternative ISD methodologies. The relationship between the paradigmatic shifts in our thinking about these general social issues and approaches to systems development is taken up in greater detail in the next section.

To summarize, it is our contention that the first two generations of information systems development methodologies (i.e., generations 1 and 2) reflect the paradigm of functionalism. Generation 3 approaches (i.e., classical prototyping) are largely informed by functionalism, as, to a lesser extent, are the socio-technical approaches (generation 4). Generation 5 methodologies associated with sense making and problem formulation approaches reflect the social relativist paradigm, as to a large extent do the generation 4
methodologies (i.e., the socio-technical, participative approaches). Generation 6 methodologies associated with the trade-union-led approaches reflect the radical structuralist paradigm. And the last generation, associated with emancipatory approaches, reflects a neohumanist paradigm.
4. Paradigms and the Continued Evolution of Methodologies
At present one can identify several research communities (cf. Kuhn, 1970) that have coalesced around different paradigmatic beliefs regarding systems development that inform their research efforts to advance the state of the art. Some have also been attracted by the emergence of alternative theoretical foundations and are using these to inspire new ways of thinking about ISD. Examples of such theoretical foundations for ISD are speech act theory, activity theory, critical social theory, self-referential systems theory, semiotics or, with a reactionary turn to the classics, Mao's theory "On Contradiction."

Our plan is to explore, via a paradigmatic assumptions analysis, four of the more promising or theoretically appealing ISD methodologies along with some that are more widely known. Four aspects will be discussed for each methodology. We start by providing an overview of the methodology, concentrating on its problem focus and key features. Next we explore its strengths and weaknesses from the perspective of different paradigms. Finally, we consider possible directions for future improvements in the methodology that emerge from our previous analysis.

The methodologies for our analysis were chosen with the desire to provide a broad overview of the domain of ISD approaches. Our initial goal was to choose four methodologies, each of which reflected the assumptions of one of the four paradigms. But, as was noted in Section 2, there are no existing ISD methodologies for either the radical structuralist or neohumanist paradigm. Systems development projects like DEMOS, DUE and UTOPIA, which reflect many of the ideas of radical structuralism, have not transformed themselves into methodologies, although they have led to the "collective resource approach" (Ehn and Kyng, 1987), which originally was motivated by radical structuralist principles. Even in its newest form it has managed to retain some of the radical structuralist political values, albeit in a democratically refined (some would say "revisionist") format that brings it somewhat close to neohumanism. The practical lessons from these projects seem to have moved the researchers' philosophical thinking away from the strict radical structuralist notions more towards the position of "cooperative design" (cf. Ehn, 1988; Kyng, 1991), which has emancipatory implications. Because of the historical significance of these types of projects, an overview of the UTOPIA project is presented in the Appendix to this chapter. As a whole, the collective resource approach and its successor are too complex to
be adequately condensed here; indeed it could be the basis of a separate paper.* Moreover, such an analysis would be incomplete unless it is broadened to relate the collective resource approach to the "labor process perspective" on the actual problems and practices of software development (cf. Kraft, 1977; Bansler and Havn, 1991, p. 146). Thus, as there were no clear examples of neohumanist and radical structuralist methodologies, we could not choose any to specifically analyze in this section (but this has not stopped us from attempting to convey some key ideas on the significance of neohumanism for advancing the art of ISD; cf. Hirschheim and Klein, 1991b). Yet neohumanist and radical structuralist ideas have been influential, particularly in Scandinavia, and we do consider their influences (actual and potential) on other methodologies. Indeed, many of the four chosen methodologies' weaknesses are identified through insights gleaned from neohumanism and radical structuralism. We have therefore chosen four methodologies (or approaches) that exhibit influences primarily from either functionalism or social relativism.

The first one, information systems planning, reflects the principles of functionalism. It was chosen because of (1) its large literature base and available methodologies such as IBM's BSP; (2) the importance attached to it by the IS community; and (3) its broad focus, which, from an information engineering perspective, includes both structured methodologies and prototyping, automated development and object orientation.

The second one chosen is prototyping, because of its potential as a standalone methodology. While prototyping may not currently be considered a formal methodology but rather an approach to systems development, it has made an important historical contribution to broadening our understanding of the process of information systems development. Indeed, much of the current thinking on ISD embraces prototyping in one form or another. Moreover, it is unique in that it seems to change its essence depending upon the paradigm from which it is interpreted.

The third one chosen is soft systems methodology. SSM was chosen for four reasons: (1) it has been influential through contributing a number of general ideas to the advancement of ISD, especially in Britain; its influence may also have been aided by its own journal (Journal of Applied Systems Analysis), where the discussion about SSM and its foundations can be easily followed. (2) While it has strong ties to social relativism, it embraces much that is consistent with other paradigms (cf. Mingers, 1981). (3) It has inspired the development of additional methodologies, e.g., MULTIVIEW and FAOR, and it has also been applied to many information systems development projects (cf. Checkland and Scholes, 1990). (4) Moreover, the explicit philosophical grounding of its structure, tools and techniques in phenomenology makes it a methodology that is not only interesting, but also important to analyze.

The fourth one chosen, which originates from Scandinavia, is a relatively unknown one: the ordinary work practices approach. It is not a methodology per se, but an approach to systems development strongly based on social relativism. While it has a number of principles and tools, it is not nearly as structured as the other methodologies considered. It was chosen because of both its novelty and its interesting notions, many of which may well find themselves in the toolkits of tomorrow's analysts.

Our investigation of the principles which drive current research on ISD methodologies will reveal significant cross-paradigmatic influences in a number of the chosen methodologies. Such influences manifest themselves in attempts to draw on ideas originating from different paradigms to overcome the limitations that became apparent as ISD methodologies evolved.

* The newest publications no longer use the term "collective resource approach" (as is summarized in Ehn and Kyng, 1987), but speak of "cooperative" or "work-oriented design" (cf. Ehn, 1988; Greenbaum and Kyng, 1991). They broaden the theoretical foundations of this line of research on ISD considerably by interpreting design from theoretical foundations other than radical structuralist (Marxist) philosophical lenses. Ehn (1988) considers implications of the social relativist paradigm by drawing on Heidegger and Wittgenstein's philosophy of language games. Emancipation is discussed on the basis of a Marxist epistemology of practice. This contributes "to identification with oppressed groups and support of their transcendence in action and reflection" (Ehn, 1988, p. 96), but fails to see the broader implications of emancipatory concerns in ISD. Of great emancipatory potential are attempts to provide design support environments that could make design choices more accessible to all underprivileged groups (cf. Kyng, 1989) and thereby remove an important obstacle to meaningful participation. In summary, much work is in progress that is transforming the collective resource approach, and currently it speaks with many conflicting voices. A paradigmatic analysis could contribute to sorting out the competing lines of thought. In order to do justice to the current status of this line of research and the long history of its ideas (dating back to trade-union cooperation with Kristen Nygaard in the late 1960s), we prefer to reserve its discussion for some later time. A paradigmatic analysis of one of the research projects of the collective resource approach, the UTOPIA project as briefly described in the Appendix, is presented in Hirschheim and Klein (1989).
Information Systems Planning (ISP) and Structured Approaches
The field of IS emerged in response to the business opportunities and risks that the advent of computers created for users and corporate executives. Until the advent of computers, information technology evolved relatively slowly. For example, it took over two centuries for desk calculators to become widely used in business after Pascal built the first one in the 1640s to aid his father with his administrative duties. With the advance of computers, new systems have to be created every few years or even months. But it is
not only the pace of change that creates a challenge. Much larger and much more complicated systems than ever before require a new profession dedicated to building computer-based information systems. In order to cope with the complexities of ISD, practitioners looked to science and its offspring, engineering, for inspiration. Early methodologies were adaptations of engineering and scientific management approaches that emphasized preplanned and well-defined procedures. Following the precedent set by the engineering of complex electromechanical machines like tool production systems or airplanes, three approaches were adopted for information systems development. The first of these is global planning of the complete IS application portfolio of an organization. It has an organization-wide focus and its result is not a working system, but an architecture for IS that produces priorities and guidelines for creating individual applications. The architecture is the basis for an orderly information engineering approach to systems development. The second approach is to ignore the complexities of organization-wide planning and treat each system request separately. In response to this, methodologies were developed with a single-application focus. The family of life-cycle approaches using structured methods and tools is a case in point. The third approach (which is discussed in more detail in Section 4.2) involves evolutionary systems development and prototyping. In situations that are too complex to be handled by design calculations, engineers often build prototypes, and this practice was also adapted to IS design. An important difference is that a physical prototype is often made from a different material (e.g., a car body made of clay) or built at a different scale at which it cannot really be used (a scale model of a building). Closest to IS prototypes are early complete versions of a product that are actually tested to make improvements for the production version. Note that the prototype is distinct and separate from the production version.
4.1.1 Problem Focus and Overview
Structured methods were proposed to overcome the following productivity problems with the early SLC approaches:

1. The early development approaches had no standard format for
researching and documenting system requirements. This made it simply impossible to create requirement specifications that were complete, comprehensible, consistent and cost effective, because not even the criteria for consistency or completeness were defined. As there was no standard format with an agreed-upon interpretation, the same specifications document was subject to multiple interpretations by different
audiences: users, developers, programmers. This led to many specification errors that were extremely costly to correct, if they were detectable at all before the system was installed.
2. It was impossible to obtain meaningful user input because both the analysts and users were overwhelmed by the sheer volume of unstructured specifications.
3. The specifications were out of date as soon as they were created, because it was impossible to track down all the places in the documentation where a change in the specifications had to be reflected.
4. It was difficult to train new analysts except through a long apprenticeship with many costly mistakes.

Much progress with these problems has been made by the well-defined format of structured business systems analysis and specifications. A brief description of these is provided in Section 1 of the Appendix. The historical evolution of structured methods indicates that their primary focus has been on standardizing the system development process in three ways: (1) standardizing the type of data that are needed to define the logical functions of the system independent of their physical implementation (logical design); (2) standardizing templates for system descriptions that supposedly result in consistent, concise, cost-effective, nonredundant yet comprehensive system specifications and documentation; and (3) standardizing the diagramming methods and tools supporting the systems development process.
Further advances in the cost effectiveness of structured methods can be expected from the computer-supported maintenance of structured system descriptions in CASE tools. At its best, a CASE tool will check the consistency of different parts of a specification (e.g., comparing the definition of file contents with the contents of the inputs to a process; the latter must be a subset of all files accessed), alert the analysts to missing parts, maintain system descriptions both in graphic and other formats (tables, text) and generate code from the CASE data repository. The code can be used for generating prototypes and may also increase programmer productivity when implementing the final system by providing schemata, program module skeletons, or screen designs.
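To make this kind of cross-checking concrete, the following is a minimal sketch, in Python, of the consistency rule just mentioned: every data store read by a process must be among the data stores declared in the specification. The store and process names are invented for illustration, and no particular CASE product is implied.

```python
# Illustrative only: a toy consistency check of the kind a CASE tool
# might run over its specification repository. All names are hypothetical.

# Data stores declared in the (toy) repository
declared_stores = {"CUSTOMER", "ORDER", "INVOICE"}

# Processes from a data flow diagram, each with the data stores it reads
process_inputs = {
    "Validate order": {"CUSTOMER", "ORDER"},
    "Print invoice": {"ORDER", "INVOICE", "PRICE-LIST"},  # PRICE-LIST is not declared
}

def check_consistency(declared, processes):
    """Report every process input that is not a declared data store."""
    errors = []
    for name, inputs in processes.items():
        missing = inputs - declared
        if missing:
            errors.append(f"Process '{name}' reads undeclared store(s): {sorted(missing)}")
    return errors

if __name__ == "__main__":
    for error in check_consistency(declared_stores, process_inputs):
        print(error)
```

A real CASE repository holds many more description types (data dictionaries, diagrams, module skeletons), but the principle is the same: because the descriptions share one standardized format, such checks can be automated.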
As a whole, structured methods have not been able to resolve the software crisis. One very intractable issue is continuing user communication problems. Many systems are developed that do not truly meet the business needs, and sometimes whole departments need "to work around them" rather than with them. Two tangible symptoms of this crisis are that an ever-larger portion of the information resource management budget has to be dedicated to maintenance and that there is a rising applications backlog. In part this backlog is invisible, as many requests for computer support are not even made because users know they might have to wait for months (even years). The two principal causes of backlog are system incompatibilities (locking out users from retrieving data that are stored in diverse formats) and the ripple effects of adaptive maintenance: the attempt to modify one major module of a system to meet a new need causes a multitude of secondary changes to keep the remainder of the system working. It is then very time consuming to test all these changes.
Table VI lists a variety of approaches that have been proposed to deal with these issues. This table makes no attempt to identify the paradigmatic foundations of the various approaches; these will become apparent as some of the approaches are discussed in more detail in the subsequent sections of this chapter. After briefly introducing the key concepts of the table, we shall focus on what we consider the most critical advance in functionalist approaches: computer-aided information systems planning and information engineering (ISP-IE). From a functionalist perspective, an orderly development of the IS applications portfolio is paramount for addressing the software crisis, and only the linking of structured systems development with an ISP-IE approach provides a hope of getting the growing anarchy of IS under control. The issue of whether the apparent "anarchy" is required to maintain flexibility (Hedberg, Nystrom and Starbuck, 1976), "healthy confusion" and organizational checks and balances against "intelligence failures" (Wilensky, 1967) has received comparatively little consideration in the information systems literature (cf. Hedberg and Jonsson, 1978, for a notable exception).

TABLE VI
A PARTIAL CLASSIFICATION OF SYSTEM DEVELOPMENT METHODOLOGIES
(grouped by the principal locus of control of ISD)

Iterative-Evolutionary Approaches
  Internal control (users are in control):
    End-user developed systems (see Section 4.1)
    Cooperative prototyping (see Section 4.2)
    Ordinary work practices (prefers "experimental" approaches, but is also compatible with SLC; see Section 4.4)
    ETHICS (see Section 2)
  External control (system experts are in control):
    Evolutionary systems development (see Section 4.2)
    Most forms of prototyping (see Section 4.2)
    Executable specifications (see Section 4.2)

Approaches with a Preplanned Life Cycle
  Internal control (users are in control):
    Information systems analysis and construction (ISAC; see Section ...)
    PORGI (planning organizational implementation; see Section 2)
  External control (system experts are in control):
    Classical system life cycle (SLC; see Section 2)
    Structured analysis, specification and design (SASD) approaches (see Section 4.1)
    Information systems planning and information engineering (ISP/IE; see Section 4.1)

The rows of Table VI divide the approaches to ISD into two categories depending on whether or not they rely on the preplanned stages of a formal life cycle with milestones and sign-offs after each stage. Approaches that handle systems development in an emergent rather than preplanned fashion, by improving a given system incrementally through experimental changes in response to user feedback, are called "prototyping" in the broadest sense. The columns divide known approaches into two extremes depending on where they place the principal authority and responsibility for information systems development. If the "users are in the driver's seat," so to speak, then we have end-user computing or methodologies in which the role of analysts is that of advisors or assistants. Control is internal to the user community or the eventual owners of the system. Alternatively, control may rest with the systems developers. It should be noted that not all known approaches can easily be placed in this table. For example, while SSM certainly favors genuine participation, in practice it gives the analyst and the systems sponsor much opportunity to control the development effort. Checkland (1972) himself has noticed this; hence the SSM locus of control is ambiguous.
One widely discussed approach to the applications backlog is end-user computing (EUC). EUC, or more appropriately, end-user developed systems, is implemented by providing easy-to-use programming tools that allow users to do some of the applications development themselves. Examples are spreadsheets or high-level data base languages like SQL or QBE. Effective EUC depends, however, on a well-organized, company-wide IS architecture (ISA) such as could be defined by an enterprise scheme (more on ISA definition later). Without such an ISA and some general guidance and support, EUC will likely further add to existing incompatibilities and ultimately increase the maintenance burden. The reason for this is that EUC cannot remove system incompatibilities; hence data already stored will simply be reentered when inaccessible, proliferating redundancies and inconsistencies. If allowed to grow unchecked, EUC could ultimately result in a total anarchy of stand-alone systems. Hence we believe that the most critical advances in functionalist approaches will come from integrating information systems planning, conceptual schema development and structured methods in a unified methodological framework. To put this framework into practice it will need to be well supported in its principal activities of ISD by powerful CASE tools that allow a fair amount of experimentation with various types
of prototyping in order to overcome the rigidities and bureaucratic tendencies of any planning approach. Within a well-planned information systems architecture, EUC can thrive and will make an important contribution to reducing the backlog and the cost of ISD. The idea of developing individual information systems applications in an orderly fashion by matching them against a global ISA (information systems architecture) may be called "information engineering" (IE). An ISA is a high-level map of the information needs of an organization, relating them to the principal business functions and components of the organizational structure. Various formats have been proposed for describing such a map (cf. the literature survey in Grant, 1991), but the key components are always a model of business functions and their data needs as describable in a data model (cf. Martin, 1983; Finkelstein, 1989; Brancheau, Schuster and March, 1989). We think that a more refined version of an ISA needs multiple models. This is a crucial frontier for the future of functionalism (see Section 4.1.4). ISP is the effort of developing and maintaining an ISA (cf. King and Srinivasan, 1987, for a possible SLC approach to ISP). The product of ISP is not a working system, but a broad statement of the extent to which the current or future portfolio of IS applications meets the needs of the organization and its individual participants. Because of the complexities involved, ISP usually first needs to document the existing applications portfolio. IBM's business systems planning (BSP) is a possible approach to deriving the current ISA through an organizational self-study. From the current ISA one can glean an understanding of the current IS application portfolio, its principal applications, their strategic interconnections, its strengths (if any) and weaknesses (usually many). A small ISA may have 20 to 30 principal entity classes. A large one might consist of 20 to 30 subject data bases, each composed of several dozen major data classes (entity groupings). In very large organizations, ISP may be limited to the divisional level. Based on an understanding of the current applications portfolio, ISP derives an improved ISA and thereby defines priorities for developing the next state of the applications portfolio over a horizon of two to four years. This includes major modifications and new developments. The new (normative) ISA usually requires major restructuring of processes and data classes. A currently influential school of thought holds that data are more stable than processes, and at the core of the ISA therefore is a stable data model. An approach to structuring the data classes is to use subject database classifications, each of which is documented in a canonical (third normal form) conceptual scheme (cf. Martin, 1983; Brancheau et al., 1989; Finkelstein, 1989). The preceding characterization of ISP-IE presumed that IS would be divided into data definitions and programs, as has conventionally been the
case. At the frontier is the question of whether this is, indeed, the best representational format and which alternative software architectures should be considered for defining and documenting an ISA. A proposal to use a knowledge-representation-based, modular, open systems ISA has been made by Kaula (1990).
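To give the flavor of what a very small fragment of such an ISA looks like in the conventional data-definition style, the sketch below groups a few entity classes into subject data bases and records which business functions use them. All names are invented for illustration; a real ISA would contain dozens of subject data bases and would normally live in a planning repository rather than in program code.

```python
# Illustrative fragment of an information systems architecture (ISA):
# entity classes grouped into subject data bases. All names are hypothetical.
from dataclasses import dataclass, field

@dataclass
class EntityClass:
    name: str
    attributes: list          # attributes of the (normalized) entity
    identifiers: list         # primary-key attributes

@dataclass
class SubjectDatabase:
    name: str
    entities: list = field(default_factory=list)

customer_db = SubjectDatabase("CUSTOMER", [
    EntityClass("Customer", ["customer_no", "name", "credit_limit"], ["customer_no"]),
    EntityClass("Contract", ["contract_no", "customer_no", "start_date"], ["contract_no"]),
])

order_db = SubjectDatabase("ORDER", [
    EntityClass("Order", ["order_no", "customer_no", "order_date"], ["order_no"]),
    EntityClass("OrderLine", ["order_no", "line_no", "product_no", "qty"],
                ["order_no", "line_no"]),
])

# The architecture also maps business functions to the subject data bases they
# use; development priorities are read off such a map, not off program code.
isa = {
    "subject_databases": [customer_db, order_db],
    "business_functions": {"Order entry": ["CUSTOMER", "ORDER"],
                           "Credit control": ["CUSTOMER"]},
}
```

The point of the sketch is only that an ISA is a map rather than a working system; whether this data-plus-programs style or an alternative architecture such as Kaula's knowledge-representation proposal is the better documentation format is exactly the open question raised above.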
4.1.2 Paradigmatic Analysis of Strengths

The principal strength of functionalism is that it has greatly refined its concepts and instruments to predict and control complex technical systems. No other approach can currently match it in this regard, and it is therefore not surprising that all large projects have relied on functionalist approaches (even though some participants voiced significant concern afterwards; see Brooks, 1975, for a classic account). Examples are airline reservation systems, the Apollo space missions, and large operating systems. When it comes to developing large-scale systems, there is little alternative. Because of its Cartesian vision of clear, concise and well-formulated methods, functionalism has greatly succeeded in rationalizing its foundations into a well-articulated body of concepts. It therefore ranks high in understandability and lends itself well to teaching and knowledge transfer. The functionalist assumptions about the nature of reality and the appropriate ways to test human knowledge are consistent with widely held beliefs as taught throughout educational institutions. This further helps to communicate the approach and motivate newcomers to believe in it and "make it work." Functionalism is naturally oriented towards efficiency and effectiveness, and this helps to conserve valuable resources. Because of the relative rigor and internal coherence of its conceptual basis, functionalist system development methodologies lend themselves well to computer support with CASE tools. This promises to further strengthen their cost effectiveness. To the best of our knowledge, all CASE tools have embraced functionalist methods, and even some nonfunctionalist conceptual approaches have been "functionalized" (e.g., Winograd's The Coordinator, which is based on speech act theory). A final strength of functionalism is that it has shown itself to be rather flexible, able to learn from its critics and absorb some of the key insights from other paradigms. Hence it may overcome some of the weaknesses that are discussed in the next section.
4.1.3 Paradigmatic Analysis of Weaknesses

Whereas functionalism emphasizes efficiency and effectiveness, it is very poor at helping to formulate and legitimate the ultimate goals that system
development should serve. In this context we need to mention both an unwarranted and a valid criticism of functionalism. Under the label of "positivist research," functionalists have often been accused of contributing to an exploitative ideology of capitalist entrepreneurs. This argument has taken a moderate and a more radical form (for a summary and references to the key ideas, see Bleicher, 1982): (1) functionalist theories contribute to stabilizing the status quo, thereby perpetuating possibly inequitable and unjust social conditions; (2) functionalism has developed theories that make exploitation more effective. From this perspective the ideas of science, which in the Age of Enlightenment were a liberating force from religious doctrine and absolutist forms of government, have been pressed into the service of a rationalist economic elite that applies natural and social scientific theories for the control of nature and people, respectively. While this analysis may have some historical merit, it need not carry forward to the future. There is no reason why functionalist research cannot be put in the service of justifiable interests. There is no reason why the obstacles to this, both financial and social (cf. Wilensky, 1967), could not be overcome. To put it more simply, functionalism may serve managerial interests just as well as those of disadvantaged groups. A worker perspective need not necessarily be informed only by radical structuralist ideas, and neither must a managerial perspective be limited to functionalism.
This issue leads into an explicit discussion of ultimate goals for systems development. Functionalism does not have difficulty with admitting conflicting alternative values into its discourse, but it is deficient in dealing with the meanings of any value statements, regardless of whose interests they serve. Historically, functionalism recognized only those statements in the domain of science that either have direct empirical content or can be related to a base of observational knowledge through predictions: any speculative concepts are admitted provided they have predictive power. Even though values could be used in predicting people's behavior, in the past functionalism espoused an ideal of value neutrality and considered value statements devoid of empirical content. Discussing the pros and cons of morals and values was considered unscientific. It is therefore not surprising that functionalism up to now has failed to develop an adequate approach to dealing with value issues. Consequently, by following a functionalist approach, developers may subtly be steered away from carefully evaluating system goals, which tend to be stated poorly under the best of circumstances. This may lead a development team to efficiently spend resources on a project that should not be conducted in the first place, or to effectively design a system that fails to meet the real needs of the work environment. For the same reason, they could easily fail to consider the ethical and social implications of a system development project. The always dominant efficiency value (in terms of keeping within budget) often further reinforces this tendency.
In order to improve the functionalist approach to value issues, one needs to understand what is most likely the central flaw of the approach: an inadequate concept of meaning and human language. Functionalism has insufficiently recognized the nature and active role of language in the social construction of reality (for a good theoretical treatment, see Berger and Luckmann, 1967). Practical applications of functionalism do not deal well with the ways in which humans create, negotiate and understand "meaning," because functionalist approaches tend to adhere to some version of a denotational theory of meaning. Typically they tend to define meaning as a correspondence relationship between real-world objects and their representations. This relationship, called "reference," defines the propositional content of a statement. In another version, the meaning of some linguistic expression (like an utterance, a program or some program output) is identified with the behavioral reactions that it produces or is intended to produce (see Bariff and Ginzberg, 1982). This version appears particularly appealing for the meaning of rules, but both versions confuse meaning with naming; this is so because one can name meaningful objects (such as Pegasus or mermaids) that need not exist (and hence have no reference), but are still meaningful (see Quine, 1963, p. 9). Making matters worse, functionalism treats correspondence relationships as being relatively fixed, assuming that the objects referred to exist a priori and independent of their representations. In contrast to this, humans treat meanings as context dependent, negotiable and emergent (see Truex and Klein, 1991, for the system development implications). The same sentence, word or rule can mean different things depending on who uses it in what context. Hence wit, irony or metaphor become meaningful. Humans have intentions and will fix the meanings of words so that they serve to realize their intentions. On the spur of the moment they can say that we will interpret this letter in such and such a way, or that in this meeting we shall treat x as if it were y, and everybody knows what is meant. A functionalist approach that assumes that meanings are fixed by correspondence rules tends to create systems that would work only if the past were to repeat itself.
The denotational theory of meaning leads to inadequate analysis and hence to numerous problems. First, it leads to misinterpreting IS as mere repositories or processors of given meanings (see Boland, 1987). Second, it prevents a deeper understanding not only of the formulation of system goals and requirements or the evaluation of the likely effects of system changes, but also of the problems with system descriptions in general, as used in maintaining documentation of past and present systems. Functionalists will study how people have reacted to certain design options in the past and then conclude that the same designs or system descriptions will create similar effects in the future. They forget that the link between designs and the effects they produce is the meaning attributed to them, and this can
change from one day to the next. Winograd and Flores (1986) have pointed out that the whole notion of an internal representation may not be appropriate for capturing meaning. The denotational theory of meaning also fails to account adequately for the active role of language in creating shared meanings. This is of particular importance for understanding why the best plans will have little meaning if they are not created by common sense making through some form of interaction. From the denotational viewpoint, language is seen as a neutral medium of description. As long as one understands the syntactic and semantic rules of the language, any description can be meaningfully transferred. Contrary to this, language has a constructive role, because it shapes our thoughts. The terms that we use create the reality that we see. The Sapir-Whorf hypothesis states this succinctly. It holds "that all observers are not led by the same physical evidence to the same picture of the universe, unless their linguistic backgrounds are similar, or can in some way be calibrated" (Chase, 1956, p. v). If planners, developers and users do not interact and find ways of sharing their concerns and conceptions, the discourse at the planning level creates meanings and interpretations that are felt to be of little relevance and meaning at the implementation level.
Another weakness of functionalist approaches is their lack of a critical perspective on the connections between the control of data, the definition of data meanings through control of the language, and organizational politics and power. It is not that data in some neutral form of description support or contradict some organizational policy; rather, the policy defines which kinds of data are meaningful in the first place (see Klein and Lyytinen, 1991, for a detailed analysis). The consequence of this is that it prevents functionalism from adequately addressing the close linkages between system development and policy issues. Most functionalist methodologies aim at avoiding politics. Thereby they either define irrelevant IS or unwittingly become the political instrument of some prominent interest faction, and then are bewildered if the results of their efforts are strongly resisted or "sabotaged" by the opposing political faction (Hirschheim and Newman, 1988). Functionalist approaches are, however, quickly changing. In the past, when faced with a choice between rigor and relevance, functionalists have tended to pursue rigor at the expense of relevance. This could easily change if the key points discussed here become widely believed. In the future it might be easier for functionalism to relax its standards of rigor in order to become more relevant than for other approaches to support their richer theoretical frames of reference with clearly defined work practices in order to become more efficient and effective.
4.1.4 Possible Directions for Future Improvements
Progress with ISP-IE is currently hindered by some common misconceptions lingering from earlier eras of data processing. Contrary to some claims (i.e., Brancheau et al., 1989, p. 9), an ISA is to a large degree personnel, organization and technology dependent. Unless the architecture matches the organizational culture in general and the personal needs of executives in particular, it will have little effect. Moreover, the ISA needs to be in a representational medium and format that is easily adapted by key personnel. An ISA can cause the organizational structure to change, along with shared views of what information technology can do for career aspirations and for the organization. In that sense, ISP should contribute significantly to "double loop" organizational learning (Argyris, 1982). Contrary to overly optimistic viewpoints (i.e., Finkelstein, 1989), comprehensive computer-aided IE is at this point more a concept than a practicable approach. However, it is a concept with a history of at least 25 years, and progress will be made if some of the key ideas of the founding fathers on the pitfalls of ISP are remembered.
In addition to overcoming past misconceptions, ISP-IE needs to meet the following challenges through future research: retaining flexibility and achieving constructive representation; accommodating a social constructivist concept of meaning and language; and addressing communicative rationality issues in organizational discourse. Prototyping, alternative software architectures, and multiperspective modeling methods and tools could also help to address these issues. In order to retain flexibility in an ISA, ISP must be merged with prototyping. The fundamental idea of combining global planning with limited prototyping experimentation to improve the knowledge base for planning was suggested by Sackman's (1967) concept of evolutionary planning. Essential progress can be expected if the challenge of how to accomplish this in practice is met, either by alternative forms of software design or by taking advantage of the many refinements of prototyping (see Section 4.2 of this chapter). Two alternatives that promise greater flexibility than the conventional process-data software architecture are object-oriented modeling and knowledge representation (Bubenko, 1986, for example). Kaula (1990) has shown how a knowledge-based architecture could be modularized. These are technical approaches that could help to implement flexibility in an ISA. They are not replacements for prototyping, but different software support environments for it. Prototyping is essential because, together with a multiperspective modeling approach, it could redress some of the sense-making and communicative rationality deficiencies of functionalism. If functionalism is to overcome the weaknesses that follow from its reliance on denotative concepts of meaning, it needs to implement representations that allow the construction of new
meanings through computer-mediated social interaction. We propose the label "constructive representation" for this and believe that prototyping in an appropriate software support environment and multiperspective rapid systems modeling are possible avenues to realize this concept. Instead of prototypes, mock-up simulations could also be used.* Prototyping, together with Chaffee's (1985) notion of an "interpretive strategy,"† points to a promising research avenue for improving the sense-making aspect of functionalist system development approaches, because it allows different communities to interact with "experimental futures of the organization"; i.e., those described in the ISA. For example, constructive ISA representations could help to overcome the difficulties of linking ISP-IE to the other principal sense-making processes at all levels of the organization: if this linking does not happen, the horizon of meaning of the planners will not overlap with the horizons of the various "work language communities" that carry on the day-to-day business of the organization. In that case, the notion of a "work language" explains why ISP will have no real effect. A work language differs from a professional language in that it is more task and organization specific. Typically the functions of a work language are (see Holmqvist and Bogh-Andersen, 1987, pp. 328 and 348) to negotiate the organization of work and establish specific types of cooperation; to address the work tasks proper; and to maintain social relations, shared knowledge and a supportive climate (work solidarity and collegiality). A work language is different from a sociolect (the language of a socio-economic class, such as that of a blue-collar neighborhood), and it typically overlaps with several professional vocabularies (as taught in higher education) or national languages (see Holmqvist and Bogh-Andersen, 1987, p. 348). If the work language of the planners remains distinct from that of the remainder of the organization, ISP-IE will not affect the organization of work and forms of cooperation, nor contribute to the other functions just stated. This is like saying that ISP-IE will have no influence on the organization.
Multiperspective modeling (MPM) is a possible analysis approach for overcoming the language barriers of communicating with different work language communities. At present the principal focus of MPM is on realist aspects of information systems modeling. A brief outline of these
* The discussion of mock-ups is in Section 4.2.3.
† The following ideas on the interpretive model of strategy also appear applicable to interpretive ISP. "The interpretive model of strategy . . . assumes that reality is socially constructed . . . reality is defined through a process of social interchange. . . . Strategy in the interpretive model might be defined as orienting metaphors or frames of reference that allow the organization and its environment to be understood by organizational stakeholders" (Chaffee, 1985, p. 93).
may provide some hints on how this approach could be expanded to capture the key aspects of different work languages and thereby address the semantic-linguistic barriers of organizational communication. From a realist perspective, a multiple-model or multiperspective approach to IE needs to capture the following aspects, in addition to the activity and data structure descriptions of structured methods (data flow diagramming) and the entity-relationship models that are now widely used: (1) the environment of the business or business unit for which the ISA is developed (called a "high-level context model"); (2) a documentation of the organizational structure and the key responsibilities of the positions and how they relate to the overall business strategy (a functional-structural model of the organization); and (3) the principal resources consumed by each business unit (entity life-cycle models). Grant (1991) and Grant, Klein and Ngwenyama (1991) have focused on exploring a multiperspective approach for ISP-IE. Avison and Wood-Harper (1990) have developed, through their action research, a phase structure for a multiview approach to ISD that is based on SSM. While their approach is single-application oriented, it gives considerable food for thought for expanding it into a multiperspective approach to ISP-IE. The same is also true for another multiperspective approach, FAOR (Schafer et al., 1988).
Presently the vocabulary of ISP appears to be shaped by the work language of the planners. This creates numerous communication problems. A critical issue is how to link the results of ISP to subsequent application-oriented systems development. The transformation of an ISA into well-defined, structured specifications of an application system is ill-defined and requires a great deal of creative imagination. As it is to be expected that the team creating the ISA is different from those in charge of fitting applications to the ISA, a serious communication gap can be expected between the planners and the application developers. There is a good chance that the application developers will simply ignore the ISA plan because they have not participated sufficiently in the discourse that made it meaningful, and will prefer to start from scratch.*
* This in fact happened in a pilot ISP-IE project. Under the guidance of one of the authors (HKK), an analysis class in one semester was charged with developing an ISA for a small organization involving 12 departments. Another class was given the ISA and then charged with providing prototypes to illustrate the feasibility and implications of the ISA. In spite of some briefing by the instructor, who had helped with formulating the ISA in the first case, and some subtle pressure to use the ISA, the prototype developers preferred to do their own analysis and barely considered the documentation of the ISA that had been created. Hence it is mandatory that the ISA team also be involved with some prototyping or that there is overlapping team membership.
In light of the previous discussion it is unrealistic to expect that the plan descriptions as such can carry all the meaning necessary for their proper interpretation, let alone persuade others of their validity and usefulness for future application development. Progress with this can be made if the planning is highly participatory, so that the ISA enters the organizational discourse. Furthermore, the planners themselves need to provide some prototypes that support key ideas of the ISA. The application developers and users can then get some hands-on experience with the kind of systems that the ISA inspires. If the prototypes produce promising results, this will carry a great deal more weight than thick manuals. Finally, the ISA with its supporting prototypes should be documented in a format that is easily accessible to all, such as in a central project data repository of an IE workbench, a CASE environment for ISP. Developers should be able to call up pertinent descriptions and work with them rather than study them and reenter them. Through a semiotic approach to work language analysis, rapid systems modeling and prototyping could be extended to multiperspective modeling and prototyping, so that the ISA can speak to different user groups in their work languages about the topics of interest to them, just as IS models and descriptions now speak (at best) to IS planners and developers. This is because the ISA enters the planning discourse by relying primarily on derivatives of the planners' and developers' work languages. Some progress has been made in this direction by designing interfaces that are directly derived from a work language analysis (see Andersen et al., 1990).
The preceding presumes that all want to communicate, if only they could overcome the "Tower of Babel" of organizational work languages. But this assumption is unrealistic. Improving communicative rationality requires addressing both the improvement of sense making and the improvement of the conditions that shape the general arena of communication. Consider an analogy with transportation: if communication is like getting products and services to people, then improving sense making is like improving the transportation vehicles' speed, capacity and riding comfort (so that people want to travel and the vehicles reach their destination quickly and comfortably), while improving the conditions of communication is like improving the roads and waterways so that everyone can be reached and people can get to where they really want to be, and not where the roads happen to take them (see Klein and Hirschheim, 1991, for an in-depth analysis of how different methodologies currently realize communicative rationality). In order for ISP-IE to effectively improve communicative rationality, it needs to explicitly address the typical organizational barriers to communication. While these are case specific, Wilensky's (1967) classification of communication barriers may give some initial pointers. He discusses the effects of hierarchy, power and secrecy, "officialdom," prevailing myths, taboos and other biases. Wilensky's analysis can be complemented with ideas from
critical social theory. For example, planners need to understand how their planning process is put at risk by existing distortions in organizational communication (see Albrecht and Lim, 1986, p. 126; and Forester, 1990). By the same token, those responsible for creating an ISA need to understand how communication barriers and biases affect the ISP process and whether the resulting ISA will mitigate existing communication barriers or possibly introduce new ones. The preceding discussion has illustrated that substantial efforts are needed to make ISP-IE workable, but the theoretical, human sciences-based knowledge and technology to implement it already exists to some extent.

4.2 Prototyping and Evolutionary Systems Development
The ideas inspiring prototyping in the broadest sense (including rapid, evolutionary systems development) can be traced back to optimistic speculations about human-machine communication or human-computer symbiosis and cooperation (see, for example, Licklider, 1960; 1968; or Carroll, 1965; 1967). Since real prototyping applications were first described and tested with interesting results (e.g., Scott Morton, 1967; 1971; Earl, 1978; Keen and Scott Morton, 1978; Courbon and Bourgeois, 1980), a rich variety of differing concepts and interpretations of prototyping has been proposed in recent years. Because of the confusion surrounding the meaning of prototyping, we shall build on Iivari (1982, 1984), Davis (1982), Floyd (1984) and Groenbaek (1989) to review the principal types of prototyping and evolutionary systems development.

4.2.1 Problem Focus and Overview
Early evolutionary systems development was seen as competing with "linear" SLC approaches and therefore as incompatible with information systems planning (cf. Courbon and Bourgeois, 1980; Naumann and Jenkins, 1982). Evolutionary systems development is described in Section 2 of the Appendix and could be considered the first version of prototyping in a very broad sense. However, in the following we shall make some finer distinctions. We shall look upon prototyping in a narrower sense, primarily as a strategy to collect information for system development through a form of experimentation (Iivari, 1984, speaks of prototypes as "means of producing information" for the decision makers in systems development, who include the users) and to facilitate communication and learning by interacting with a real system, which cannot be achieved as effectively by using abstract system descriptions (this differs from Iivari, 1984). Following Iivari (1982; 1984) we assume that a prototype implements an abstract, simplified model of a
future, more comprehensive system. The prototype implementation consists of hardware and software; its operation and use require people; and the prototype is able to exhibit essential features of the future, final system. The use of a prototype is intended primarily for experimentation and for gaining feedback through hands-on experience. Hence with prototyping there is no commitment that the "prototype becomes the system," even though nothing precludes reusing all or parts of the prototype in the implementation of the final system. The strategy of developing a final system through a series of experimental changes will be called "evolutionary systems development" (Hawgood, 1982) rather than prototyping. (Floyd, 1984, p. 10, proposes the terms "evolutionary prototyping" or "versioning.")
The narrower sense of prototyping just defined is consistent with the use of the term in a very influential research article by Davis (1982). It describes prototyping as a "strategy" for "discovering requirements through experimentation" in situations of high risk and uncertainty. Based on this, many authors have suggested using prototyping as one of several methods during the requirements determination stage in the SLC (see Senn, 1989; Kendall and Kendall, 1988; Yourdon, 1989). From this vantage point, prototyping is restricted to the elicitation and validation of requirement specifications that are then frozen for the remainder of the SLC. Prototyping is one of several methods in the analyst's tool kit for requirement determination and quality assurance of system requirements. This second version of prototyping might be called "specifications prototyping." It should be noted that our first definition of prototyping (following Iivari, 1984) is broader in that nothing prevents using a prototype to collect information about important concerns of systems development other than requirements testing. For example, prototypes can be used to demonstrate the technical feasibility or efficiency of a software design concept or the workability of a new kind of technology (like object-oriented design and programming). Floyd (1984, p. 8) calls this "experimental prototyping" (which is a pleonasm if one accepts the narrow definition of prototyping as stated previously).
Others have reversed the relationship between the SLC and prototyping by suggesting that prototyping should be improved by incorporating some steps of the SLC. In particular, the formulation of the initial problem should be done more carefully by using some semiformal requirements determination method before building the first prototype. This is seen as cutting out some unnecessary iterations. Almost any of the requirements specification approaches introduced elsewhere could be used for this purpose as long as they do not unduly stretch out the process of problem analysis: rich pictures, root definition analysis, data modeling, sketches of entity life-cycle analysis, data flow diagrams and so forth. This third version of prototyping is really a variant of evolutionary systems development and illustrates that there need
not necessarily be a sharp boundary between conventional system development and evolutionary approaches. Developing systems with the SLC is simply evolutionary systems development in slow motion (with maintenance providing the different versions). This is not to deny that the speed-up of the SLC in evolutionary systems development produces very significant effects. In this version, the notion that the preliminary system eventually becomes "the system" is retained, and we shall refer to this third version as "evolutionary systems development with explicit problem formulation."
Executable specification languages have been proposed as a kind of accelerator for the SLC (analogous to time-lapse photography with a motion camera). If the specification can be used to generate a working system, it is possible to experiment with alternative formal specifications. We shall refer to this fourth approach as "formal specifications based prototyping." It should be noted that in this version the users need not be involved in writing the specifications. Rather, they can evaluate the specification by working with its result: the automatically generated system. This approach ultimately aims at a complete and fully formalized specification, which classical prototyping tried to avoid altogether. As there is no commitment to reuse any of the experimental versions of the formal specifications, the label "prototyping" rather than evolutionary systems development appears appropriate.
A fifth view of prototyping emphasizes its learning and communication capabilities. We shall refer to this as "rapid and expansive prototyping." Rapid prototyping attempts to overcome the limitations of horizontal prototypes, in which most of the user interfaces are implemented but very little of the computation. Horizontal prototypes can easily be generated with screen editors and linked with a state-transition diagram editor so that the users can get a feel for the interaction with the system. This prevents the user from having any real "human-machine communication" by which new insights can be gained. Consequently, with horizontal prototypes there is no great motivation for users to explore the prototype in depth, because it fails to produce meaningful responses that give proper feedback. In applications that involve more than fact retrieval, prototypes without computations fail to provide a sound basis for judgments about whether the prototyped application will really be useful. Vertical prototypes, on the other hand, which implement a few critical functions in depth, may be too expensive to develop. These difficulties are overcome by rapid prototyping, which combines some of the capabilities of both horizontal and vertical prototypes. This is achieved through the use of tools with screen and code generation capabilities, such as fourth-generation languages. In spirit, this approach is not different from classical evolutionary systems development; only more effective tools are used to save time and effort in generating more comprehensive preliminary systems.
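To make the distinction concrete, the following is a deliberately minimal sketch of a purely horizontal prototype: a few screens linked by a state-transition table and nothing else. The screen names and menu options are invented for illustration; the point is that there is no computation behind the screens, which is exactly the limitation discussed above.

```python
# A toy "horizontal" prototype: screens linked by state transitions,
# with no computation behind them. All names are hypothetical.

screens = {
    "MAIN MENU":    "1) Enter order   2) Customer inquiry   3) Exit",
    "ORDER ENTRY":  "Customer no: ___  Product: ___  Qty: ___   (M)ain menu",
    "CUSTOMER INQ": "Customer no: ___                          (M)ain menu",
}

# (current screen, user input) -> next screen
transitions = {
    ("MAIN MENU", "1"): "ORDER ENTRY",
    ("MAIN MENU", "2"): "CUSTOMER INQ",
    ("ORDER ENTRY", "M"): "MAIN MENU",
    ("CUSTOMER INQ", "M"): "MAIN MENU",
}

def run():
    state = "MAIN MENU"
    while True:
        print(f"\n--- {state} ---\n{screens[state]}")
        choice = input("> ").strip().upper()
        if state == "MAIN MENU" and choice == "3":
            break
        # Unrecognized input simply leaves the user on the same screen
        state = transitions.get((state, choice), state)

if __name__ == "__main__":
    run()
```

A vertical prototype would do the opposite: implement, say, the order-pricing computation in full and leave most of the screens out. Rapid prototyping tools aim to give a usable amount of both within the same preliminary system.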
If this fifth strategy gradually turns preliminary versions of a system into a final one, no new word is needed. One may simply speak of "computer-aided" rapid (evolutionary) systems development or computer-aided rapid prototyping, depending on whether or not the code is primarily experimental. Computer-aided, rapid evolutionary systems development can also address the following problem. Some business situations are changing so quickly (or at least the human perceptions and understanding of them are) that any formal system life-cycle effort would hopelessly lag behind the quickly evolving user demands (Truex and Klein, 1991). Formal specifications-based approaches might be applicable, but their goal of finding a valid and complete specification is defeated. In this kind of situation, each version of the "emergent" system is no longer seen as a preliminary, experimental version of a better system to be delivered later; rather, each version "is the system." It is immediately put to use with the understanding that it is a "quick and dirty solution" to the current state of the problem (a spreadsheet for grading a course that changes each semester is a simple example). If the solution is found wanting, it is upgraded or changed (but not necessarily upgraded and better documented) as the situation evolves. This approach emphasizes that through evolutionary systems development users may learn more about some important aspect of the total situation, which then changes their comprehension of their roles and tasks. This is a revival of the old idea of problem-solving synergy through human-machine communication. Computational cycles and knowledgeable human judgments interact to produce results that could not be achieved without interactive computational power.
The notion that prototyping should be primarily a learning experience is at the core of the sixth and seventh approaches, called "expansive prototyping" and "cooperative prototyping," respectively (see Groenbaek, 1989; Greenbaum and Kyng, 1991). As the emphasis here is on experimentation and learning feedback rather than use as a workable system, the label "prototyping" is in order. Expansive prototyping adopts the idea that by experimenting with evolving software, users can "expand" their capabilities and learn new ways of seeing themselves and their work situation. This should ultimately lead to "working smarter." Expansive prototyping activates and enhances the users' tacit expert knowledge. This occurs not only through using the prototype (which was seen much earlier in classical prototyping), but also through intimate participation in building the prototype. Hence, expansive prototyping adds to rapid prototyping the idea that users either develop the prototype themselves or at least are deeply involved in defining the prototype changes. In this way users can acquire a deeper understanding of their work situation, and based on this, they can better tell what kind of system support they really need. Based on the information gained in this way, a good system
can then be implemented. We may suspect that many first-time spreadsheet users used them as expansive prototypes (that were never effectively implemented, due to the backlog of user requests at data processing or information centers), but note that spreadsheets are a very limited type of software, and expansive prototyping should not be limited by the simplicity of the software. The focus of expansive prototyping is on the user or user group and the work situation for which the prototype is being developed.
If expansive prototyping is extended to a broader range of software that is too sophisticated to be run by non-computer professionals, it requires cooperation with professional system builders. This seventh version is called "cooperative prototyping," a term suggested by Bodker and Groenbaek (1989). Cooperative prototyping attempts to implement the idea that users can take a proactive role in prototype design, much as a future house owner may build a model of his or her dream home and bring it to the architect. The homeowner is sufficiently handy to manipulate the model directly, but cannot finalize the plans for the end product. In a similar fashion, cooperative prototyping depends on "direct manipulation tools" (Schneidermann, 1980; Groenbaek, 1989), but should not be limited to prototyping tools that are sufficiently simple to be easily mastered by end users. Users still participate in actually building the prototype but have computer experts on their side to help with the implementation using sophisticated software or other approaches: "Cooperative prototyping is meant to combine the ideas of using computer-based tools for exploratory prototyping with approaches to design that allow users to participate in the modification of wood and paper mock-ups as described in Bodker et al. (1987)" (Groenbaek, 1989, p. 228). Bodker et al. (1987) discuss a set of requirements for tools to support cooperative design. In addition to this, Groenbaek (1989, p. 230) emphasizes the requirement for direct manipulation support. He suggests that animation, simulation, design by example and related approaches could be used to supply some of the functionality needed in cooperative prototyping when generating the computational model is too time consuming. In this case prototyping merely serves creative learning purposes, and the prototype does not become the system, which is quite different from neoclassical rapid prototyping.
4.2.2 Paradigmatic Analysis of Strengths

The rich variety of purposes and orientations associated with prototyping begs for clarification. We suggest that it can be explained by the observation that some forms of prototyping are informed by different paradigms. This
is easily checked by studying the terms and reference literature that researchers use to describe their prototyping concept. If we assume that prototyping spans at least two paradigms, each of which allows for some evolution of the prototyping concept within it, then we can easily expect half a dozen forms of prototyping.
From the functionalist perspective, prototyping's principal strengths are that it (1) sustains the motivation of users to participate in system development, thereby providing the most reliable information on requirements, (2) overcomes some of the rigidity of the system life cycle, and (3) allows determining and validating system specifications by conducting experiments with an evolving system. Functionalists will therefore tend to view prototyping either as a step in a system life cycle or as an approach for situations that put a premium on speedy and flexible system development. Prototyping versions two, three and to some extent four and five (insofar as the emphasis is on cost-effectively matching systems to changing requirements) appear to address these issues. The functionalist underpinning of executable specifications prototyping is particularly apparent in the assumptions that a full specification is possible and that through iterations the correct specification can be found, in that successive attempts at specification will converge on a correct solution. Also consistent with functionalism is the use of prototyping as a means to predict user reactions by providing a quick and dirty version of the system and thereby control the total cost of systems development. So too is its heavy use of productivity tools such as database languages, code generators, screen painters and the like.
From the perspective of social relativism, the principal strength of prototyping is its concrete support of human interaction, sense making and the creation of new meanings. Prototyping provides not only intense interaction between users and designers through "cooperative design" (Greenbaum and Kyng, 1991), but also a concrete object to which both users and developers can relate. It is not an exaggeration to say that common meaning is created through manipulation of a shared work "cult" object: the prototype. From a phenomenological perspective, it is only through direct manipulation of "things" in the shared life world that the common situation of "being in the world" (in the prototyping literature called the "work situation") can be experientially understood, and with it the needs that arise from "being" in the work situation. To the extent that all prototyping forms support more interaction than the SLC, they contribute to better communication based on close interaction. Classical prototyping was formulated flexibly enough to allow easy association with an extremely broad range of theoretical ideas (see, for example, Courbon and Bourgeois, 1980, who associate the principal steps
of prototyping with the Kolb model of the learning process).* Neoclassical rapid prototyping has retained some of the emphasis on user understanding and learning, focusing however more narrowly on the technical and cognitive feedback aspects than on a deeper theoretical foundation of human communication and understanding. These foundations come most clearly into focus with expansive and cooperative prototyping. Their paradigmatic strengths derive from their explicit foundation in hermeneutic and phenomenological principles regarding human understanding, forms of expert knowledge (see Dreyfus and Dreyfus, 1986) and learning through the creation of shared meanings. Their theoretical grounding is obvious from some of their terminology. For example, they explicitly refer to system descriptions as social constructions that fail to support genuine participation because they “only made sense to us the systems designers” and “to the users they were literally nonsense” (Ehn and Sjogren, 1991, p. 248). Cooperative and other forms of prototyping can help to overcome the “meaninglessness” of system descriptions by producing “breakdowns” (which means that some unexpected event in using a prototype in the real work situation forces a reflection, such as may be caused by “a bad or incomplete design solution,” see Bodker and Groenbaek, 1991, p. 200), so that a situation “ready to hand” (one that is handled smoothly and routinely) is transformed into one that is “present at hand” (consciously experienced and reflected upon; see the fuller treatment in Winograd and Flores, 1986; see also Madsen, 1989). The paradigmatic connections of the latest forms of prototyping are also apparent from the extensive discussion of a new philosophical basis for system design (see, for example, Ehn, 1988; 1990). The sense in which a neohumanist perspective on prototyping reaches beyond what already has been said can be summed up in one key word: emancipation. At this point we can only speculate in what way prototyping might be emancipatory, as this is unexplored territory. Emancipatory prototyping would have to open up new opportunities to overcome either (1) personality deficiencies or (2) organizational deficiencies while at the same time (3) implementing some checks and balances against introducing
* Kolb’s learning theory introduces feedback between four interrelated learning stages. In clockwise order beginning at noon these are concrete experience, reflexive observation, conceptual abstraction, and active experimentation. The individual moves through these four stages and at each revolution reaches a higher level of personality development. Courbon and Bourgeois (1980) suggest that prototyping (which they call “the evolutionary approach”) can be given an interpretation in which implementation corresponds to experience, consciousness (the design stage in which people become conscious of change and relate it to developing new norms of behavior for dealing with the outcome of design) with reflexive observation, analysis with conceptual abstraction and norms (as explained in the previous parentheses) with active experimentation. This makes clear that prototyping can be interpreted as a social learning model. This view has also been developed by Kerola (1985; 1987).
new kinds of obstacles and biases to rational communication. Regarding (1), several points come to mind. Prototyping can be introduced in such a way that it provides a nonthreatening, supportive environment encouraging learning and growth of employees with low status and self-esteem. It does so by tying new ideas and abstract concepts to concrete circumstances in the workplace. This should help to overcome anxieties and defensive reactions, thereby aiding the learning process of users with little formal education. To the same end, prototyping can also be used to encourage serious exploration in a “playful manner,”* thereby further encouraging self-confidence and creativity. These ideas find some theoretical support in Freire (1971). Prototyping is probably not very well suited to addressing the second kind of deficiency: barriers to emancipation rooted in the organization or institutional environment. As will become apparent in the analysis of paradigm weaknesses, prototyping has a bias towards individuals or face-to-face groups and appears to assume rather idealistic organizational conditions.
4.2.3 Paradigmatic Analysis of Weaknesses
From a functionalist perspective, prototyping lacks controls for project management and reliable outcome measures. Because of the emergent nature of prototyping solutions it is difficult to plan milestones, delivery dates and clear budget figures, and in many situations it is unsatisfactory that no reliable milestones and budget figures can be given. (Proponents of prototyping will no doubt counter that the milestones and budget figures of system life-cycle approaches are notoriously unreliable, and that prototyping can work with upper budget limits.) In addition, there is a lack of clear rules for deciding when the prototyping process has reached its goals. To put it simply, the user’s appetite for changes could grow continuously, and there is no guarantee that the changes made are worth the expenditures. Particularly in the literature on rapid and cooperative prototyping available to us there appears to be little or no concern for cost control and tests of the effectiveness of the approach except for user enthusiasm (Bodker and Groenbaek, 1989, p. 22). From a social relativist perspective, the “high tech” nature of prototyping is of concern for two reasons. First, there is the danger that communication and learning are biased by the ideas and values underlying the latest technological fashion and not by genuine social concerns. Also, technical glitches
* The importance of play to improve participation in design is recognized in Ehn and Sjogren (1991). In their paper the authors use situation cards and icons for work-related artifacts like notice boards to organize a scenario play. In this play, the participants design new professional roles and a new work organization to cope with changes induced by desktop publishing technology. It is easy to see how prototypes could be used in circumstances where the new technology is still under development or to make the simulation more realistic.
and other frustrations common in a high tech environment could interfere with the essential aspects of sense making and the sharing of meanings. Second, and more fundamentally, an inescapable limitation of prototyping is that it treats information systems as technical systems that can be discontinued without further consequence if they are deemed deficient. This is fallacious. Social relativism suggests that IS are symbolic interaction systems and any intervention in the living organization produces irreversible effects in the minds of the affected people. In that sense, there can be no prototyping, because prototyping is a form of social intervention whose effects are irreversible. From a neohumanist point of view, prototyping fails to take note of institutional barriers to the rationality of communication. Discussion of the possibilities and effects of prototyping reaches at best up to the group level. The organization as the sponsoring environment is presumed to supply limitless resources. There are no vested interests, hidden agendas, ideological biases, interdepartmental warfare and the like. Moreover, one stern institutional lesson from the UTOPIA project has been forgotten in the enthusiasm for the high tech gadgetry in most if not all of the prototyping literature: that all technology is developed with clear interests in mind and hence reflects values and trade-offs that limit the degrees of freedom to design the content and organization of work (see Ehn, Kyng and Sundblad, 1983, as quoted in Ehn, 1988, p. 328). As an alternative to prototyping, Kyng (1989) examines several tools and approaches, most notably mock-ups, simulations, workshops and workplace visits. This is important so that resource-weak groups gain access to adequate means to support their design and evaluation of information systems. The ideal is to arrive at good support environments that facilitate high-quality “Designing for a Dollar a Day” (Kyng, 1989; cf. Ehn and Kyng, 1991, for a fuller discussion of various generations of mock-ups and their comparison to prototypes).
4.2.4 Suggested Directions for Future Improvements
The review of current approaches to prototyping shows prolific conceptual growth. This growth is highly desirable, but it lacks clear direction and order so that different research groups can build on kindred results. We shall briefly raise four points. The first item high on the research agenda for the further development of prototyping is more conceptual clarification. Second, this should be followed by more extensive field tests. Third, the results of these field tests need to be critically reflected upon in cross-paradigm analysis to gain better insights into the potential significance of different research strategies. Fourth, ISP-IE needs to pay more attention to incorporating the advantages of prototyping.
Conceptual clarification should aim at formulating the theoretical foundations and implications of different versions of prototyping more clearly. Having reviewed earlier attempts at conceptual clarification, we reach the conclusion that the nature and significance of prototyping can be adequately assessed only if the assumptions made about the nature of systems development are clearly stated. Essentially this requires an explicit framework or meta-theory about systems development. An interesting proposal in this direction has been made by Iivari (1982). A clear statement of the paradigmatic foundations on which different research approaches build could significantly contribute to conceptual clarification. In addition to conceptual development, we need more field work with prototyping. This field work should also be inspired by different paradigms so that the strengths and limits of different approaches become more clearly visible. Among the different research methodologies that should be considered for this are both action research and more observationally oriented approaches to real-world implementations: grounded theory, cognitive mapping, content analysis of design records, case studies and exploratory laboratory experiments all have their place. The danger exists that this may simply add more details to the confusion. Therefore careful cross-relating of results is necessary to achieve any real progress. An interesting and unresolved theoretical question is whether the paradigm underlying the research methodology should be consistent with the paradigm underlying the prototyping approach that is being investigated. Here, too, there is much room for new kinds of research. For example, very interesting insights could be revealed by studying a functionalist prototyping approach with an interpretive research methodology such as cognitive mapping (cf. Banville, 1990). Prototyping played a prominent role in the examination of possible future directions for functionalism. The proposals made in the context of functionalism need to be interpreted in light of the cross-paradigmatic connections of prototyping as revealed in this section of the paper. As functionalism has shown great flexibility in absorbing insights from different paradigms, significant advances could be expected if ISP-IE methodologies and structured methods were enlarged by experimenting with different versions of prototyping. The high tech nature of all prototyping approaches should make it relatively easy to integrate them into a functionalist frame of reference.
4.3 Soft Systems Methodology (SSM)
Soft systems methodology (Checkland, 1981; Checkland and Scholes, 1990) is not a methodology limited to information systems development, but a very general approach to address many different kinds of political, social
and organizational problems. However, SSM has served as the basis for at least two methodologies specific to information systems development, Multiview (Wood-Harper et al., 1985; Avison and Wood-Harper, 1990) and FAOR (Schafer et al., 1988). As we are concerned here with the fundamental principles underlying different methodologies, it is appropriate to deal directly with SSM rather than with its derivatives.
4.3.1 Problem Focus and Overview
SSM may perhaps be more aptly termed a meta-methodology in that it is more concrete than a general philosophy for analyzing the world, but broader and more flexible than a specific professional method that is usually limited to a predefined problem domain. SSM focuses on broad problem-solving situations, involving what Checkland calls “human activity systems.” In human activity systems, problems are manifestations of mismatches between the perceived reality and that which it is perceived might become actuality. SSM is different from the traditional systems development approaches in that it does not prescribe tools and methods that are specific to information systems, but only general problem formulating principles and methods. The classical version of the methodology (Checkland, 1981) as described in Section 3 of the Appendix distinguished seven steps within the methodology that were categorized as “real-world” activities and “systems-thinking” activities. The steps in the former may involve the people in two ways: one is “express the problem situation” through “rich pictures”; the other, implementing “feasible and desirable changes.” The steps in the latter attempt to provide conceptual models of the denoted problem situation, which are then brought back into the real world for comparison. In the newer version of the methodology (Checkland and Scholes, 1990) this distinction is replaced by “two interacting streams of structured enquiry which together lead to an implementation of changes to improve a situation” (p. 28). These two streams are termed “cultural analysis” and “logic-based analysis.” The stream of cultural analysis involves three analysis phases: analysis of the intervention; social system analysis; and political system analysis. The stream of logic-based enquiry involves four phases: selecting relevant systems; naming relevant systems; modeling relevant systems; and comparing models with perceived reality. An interesting aspect of this is that SSM incorporates an explicit inquiry model from which other methodologies could also benefit substantially. All system development methodologies pay attention to appropriate approaches for collecting the information and data that provide the knowledge basis needed for analysis and design. Whereas the seat-of-the-pants approaches
relied mostly on user interviews and the analyst’s common-sense investigation skills, modern methodologies have become much more sophisticated in their approach to inquiry. Inquiry includes system modeling (deriving requirements from an existing system; see Davis, 1982, for a survey of alternative approaches to requirements determination), walk-throughs, discussion of system specifications in personal meetings, counseling with users by demonstrating prototypes and a myriad of other ways to collect needed information. We can analyze any system from the perspective of inquiry (cf. Churchman, 1971), and this applies to methodologies as well. This viewpoint suggests the need to define the inquiry model as the range of procedures recommended by a methodology to collect data and create the knowledge that becomes the basis of analysis and design. The assumptions made about what type of knowledge is important for system development, and how it could be obtained, are referred to as the epistemology of a methodology or its inquiry model. Among existing methodologies SSM has the most explicit inquiry model, with ETHICS closely following. Through the division of inquiry into cultural analysis and logic-based analysis, SSM hints at the possibility of dialectic inquiry, but does not fully implement this idea.
4.3.2 Paradigmatic Analysis of Strengths
SSM embraces much that is consistent with the assumptions intrinsic to social relativism. In fact, it appears to be primarily informed by the social relativist paradigm (see Table V). For example, the use of “rich pictures” for describing the problem situation in an imaginative and creative fashion is an interesting vehicle for supporting sense making. Similarly, the ability to construct alternative “root definitions,” which are sensible descriptors of the problem situation, offers another vehicle to encourage the creation of new meanings. A key component of the root definition is the philosophical notion of “Weltanschauung,” which allows multiple perspectives to be recognized in the development process. Last, Checkland’s adoption of Vickers’s “social community” concept is consistent with Checkland’s call for SSM to embrace phenomenology, because the social community is essentially seen as emergent and unpredictable. Social community is contrasted with human activity systems that abstract the purposeful, predictable and hence designable aspects of social systems. The methodology is also somewhat informed by the functionalist paradigm, but to a much lesser extent. Clearly the notion of a human activity system itself is functionalist (Checkland himself notes the engineering connection of this term). Furthermore, the attempt to determine the functions
of human activity systems and its conceptual modeling step is clearly consistent with the tenets of functionalism. So too is its emphasis on improving the goal achievement of human activity systems. Usually this is done by improving a human activity system’s ability to predict and control its environment.
4.3.3 Paradigmatic Analysis of Weaknesses
In terms of paradigmatic assumptions, SSM is most closely aligned with the “integrationist” dimension (as described in Section 3): it favors stability, coordination and integration. Its seeking a consensus during problem formulation involving multiple perspectives through the use of a variety of sense making techniques is clearly integrationist. Thus, the radical structuralist and neohumanist paradigms appear the most in opposition to SSM. Both radical structuralists and neohumanists will be critical of SSM’s failure to reflect on whether it can be misused in realizing the goals of one group in the organization at the expense of another. Neohumanists would also be critical of SSM on the grounds that it does not attempt to analyze and mitigate potential distortions and communication barriers while seeking a consensus (or at least accommodation) on root definitions. Hence the cognitive basis of the conclusions may be flawed by undetected bias. This also leads to ignoring the nature and influence of organizational power, and it fails to be sensitive to the issue of whether the new system will strengthen emancipation of all organizational participants or continue domination. An emancipatory system design would lead to a more equitable distribution of rights and duties (potentially shifting unwarranted use of power from those in possession of it already). A system contributes to existing forms of domination and social control if it continues the inequities of the status quo. For this reason SSM has been criticized along similar lines as functionalist approaches in that it is likely to do little to alter the plight of the workers. An important advance of SSM is the recognition of alternative perspectives in the inquiry process by which it collects information for analysis and design through building root definitions and conceptual models. This focuses on improving the collective understanding of the problem situation. Yet here is also an essential weakness. In the classical version of SSM, the inquiry process was seen to involve two “domains” or “worlds”: a conceptual domain, the world of systems concepts, and the real world. It failed to address the issue of how one can be independent of the other. A reasonable amount of independence was assumed, because a crucial step was to validate the models by comparing them with the actual situation. If one takes the position that concepts shape the world that we experience (i.e., interpretivism), then no real comparison is possible. If concepts come
first, a comparison of the conceptual models with the real world is self-confirming: “experience only confirms what the concept teaches” is a well-known dictum from Hegel that sums up an essential insight of the interpretivist position. Radical structuralists will insist that systems thinking is the ideological vehicle by which the dominant elite will seek to rationalize and legitimize primarily those designs that do not threaten their privileges and vested interests. The comparisons are phony, because no attempt is made to broaden the discourse and use elaborate checks and balances against self-delusion or “cooking the data.” The emancipatory potential of SSM is lost for want of critical reflection on the connection between the social-institutional boundary conditions of systems development and epistemology, the validity of the premises and ideas on which design solutions are based. Emancipatory systems can be designed only through genuine democratic participation. Neither in the classical nor the recent version of SSM is there an explicit discussion of the meaning and significance of participation, let alone of the difficulties of implementing a participatory design approach. The failure to address participation is perhaps surprising, yet indicative of the direction in which to improve SSM. It is surprising because social relativism strongly reminds us through the concept of the hermeneutic cycle that interaction is the very basis of human understanding and sense making. Given that SSM is heavily influenced by social relativist ideas, one wonders why it failed to consider the implications for participation.
4.3.4 Suggested Directions for Future Improvements
The previous paradigmatic assumption analysis reveals several directions in which SSM could be improved. Because of the ambiguities in the paradigmatic foundations of SSM (which adds to its richness), several lines of improvement are possible, although not necessarily totally consistent with each other. First, SSM needs better modeling methods and support tools. Its conceptual modeling method is too simplistic even when compared to standard structured process modeling (i.e. leveled data flow diagramming), let alone when compared to object-oriented organizational modeling. Its communication with users could also benefit from incorporating prototyping features. Second, SSM should incorporate principles to check on social conditions that may bias the cultural stream of inquiry and thereby invalidate the consensus. Through insisting on a cultural and a logic-based stream of inquiry, SSM opens itself up to critical discourse, and it should explicitly realize this potential. But in its current form SSM lacks, perhaps purposefully, a clear focus on the possibilities of realizing the emancipatory responsibilities of systems design. To accomplish this it needs to reconstruct its
inquiry model so that it more directly aims at approximating a rational discourse. The replacement of the distinction between systems thinking and real-world thinking by a logic-based and cultural stream of analysis cannot accomplish this. It is no substitute for an approach to cross-check and validate assumptions and intermediate results. Without a critical component, SSM is subject to the same kind of weaknesses as were hinted at earlier in the discussion of the paradigmatic weaknesses of prototyping (Section 4.2.3). Currently, SSM appears to be content with harnessing the hermeneutic (communicative, interpretive) potential of participation. From a neohumanist perspective it could be strengthened by also recognizing the important emancipatory roles of participation. In summary, if “improvement” of SSM means the realization of neohumanist, emancipatory concerns, it could substantially benefit from the same kind of suggestions as proposed for ETHICS in Hirschheim and Klein (1991b). This implies the following: (1) giving careful attention to the larger institutional context of systems development, i.e., setting the stage, which involves considering the impact of the organizational culture and policies on the outcomes of systems development; (2) recognizing the various roles of participation for symbolic interaction and assisting people to develop active, nonservile and democratic personality structures, thus overcoming the alienation caused by the rigid division of labor; (3) implementing a critical model of inquiry approximating a rational discourse.
4.4 Ordinary Work Practices Approach
In the early 1980s many researchers thought that significant practical improvements in ISD could be achieved by providing new methods and tools for practical development. The term “methodology” came to mean an assembly of tools and methods into a systematic approach covering the complete life cycle from problem formulation to implementation, evaluation and maintenance. Lars Mathiassen (1981), the chief architect of the ordinary work practices approach, defined a methodology as consisting of the definition of an application domain, which is approached with a certain perspective and preferred principles of organization and cooperation (a preferred structure of division of labor and coordination) that guide the application of specific design principles, methods and tools. A common belief was that better methodologies would improve the effectiveness of practitioners by displacing poor approaches to ISD and shorten the learning cycle of beginners by transferring the knowledge that supposedly is encoded in advanced methods and tools. A small research group in Denmark challenged these popular assumptions by studying the actual working habits of system development practitioners.
This research group originated from the trade union projects that were conducted in Scandinavia during the 1970s and early 1980s (as reported in Nygaard and Haandlykken, 1981; Kyng and Mathiassen, 1982; Ehn and Kyng, 1987; and Bander, 1989). They found that the more experienced analysts were, the less they followed documented methodologies. This applied even if the organization had introduced a specific method of its own as a development standard. Methods were at best crutches for beginners, to be tossed aside after a period of apprenticeship. More important for understanding how systems are developed are the actual working practices in the organizations that the beginners joined. The implication of this is that the usual academic training of software professionals, with its heavy emphasis on computer science and software engineering, needs to be complemented with practical experience and knowledge from actual development projects if the goal is to train effective system development professionals.
4.4.1 Problem Focus and Overview
The ordinary work practices approach builds on the early insights of the MARS project. In Danish, MARS stands for Methodical work habits (the “AR” comes from the Danish arbejdsformer) in System development. The systems development approach that grew out of MARS is described in Andersen et al. (1990). This project shared with mainstream research on methodologies the commitment to improve the processes by which systems are developed. This is, however, interpreted in a somewhat broader sense than the rational design perspective functionalism suggests. Building on Floyd (1987), a process view of ISD means that software is seen in its connection with human learning and communication as an emergent phenomenon “taking place in an evolving world with ever changing needs. Processes of work, learning and communication occur both in software development and use. During development, we find software to be the object of such processes. In use, software acts both as their support and their constraint” (Floyd, 1987, p. 194). In contrast to this, “the product oriented perspective abstracts from the characteristics of the given base machine and considers the usage context of the product to be fixed and well understood, thus allowing software requirements to be determined in advance” (ibid.). The ordinary work practices approach to software development draws from this the conclusion that system development research must be anchored in a thorough understanding of the actual work habits of practicing system developers. Hence research on ISD must investigate the ways in which actual projects have been conducted, including the institutional constraints surrounding them. One important component of this is organizational culture,
which may encourage or hinder a change of existing work practices in information systems development through prevailing attitudes to change (i.e., the “not invented here” syndrome), incentive systems, peer pressures, and the like. The focus of the ordinary work practices approach is on the work practices of information system developers (including the aspect of project management and better management of developers’ group work practices) rather than users. Because of this priority the approach so far has failed to adequately consider user relations and participation (see Section 4.4.3). The term “ordinary work practices” is intended to refer to both the unsystematic, evolutionary connotation of work habits and the methodical connotation of systematically designed work standards and procedures. In this sense, working practices can be learned by a combination of study (i.e., documentation of new methods and tools) and experience (i.e., by working with accomplished masters of the art; see Andersen et al., 1990, p. 60). Moreover, the principal aim of ISD research is to help practitioners improve their professional learning skills and their ability to manage their learning processes. A catch phrase for this is “being a professional.” Several avenues are proposed for its achievement:
1. The emphasis is on helping practitioners to design and maintain their own ways of learning. This also includes trying new methodologies or frameworks of analysis. For example, the approach experimented with structured analysis and design, Boehm’s spiral model of system development and risk management and more recently with SSM and object-oriented analysis (Coad and Yourdon, 1990).
2. A second important avenue to achieving professionalism is to help practitioners improve their capabilities to reflect upon their experiences and then change their behavior accordingly. In this way both successes and failures become important assets for improving the practice. A special tool proposed to support this is keeping a diary (see Lanzara and Mathiassen, 1984; Jepsen et al., 1989).
3. A third avenue to improving learning is building open-minded professional attitudes that encourage practitioners to actively seek out pertinent information by interacting with the professional community and through the study of the literature. Andersen et al. (1990, p. 9) state that an environment and tradition for professional systems development should contain elements such as the following: active system developers, sufficient resources, exchange and evaluation of experience and study of literature.
Following these basic ideas, in 1983 Andersen et al. began studying the working practices in eight development projects that took place in four large
organizations (two projects in each) for one year. In the next phase, they attempted to influence the working practices by introducing new methods and tools. One of these organizations withdrew after the first year. Hence, in the second year, six projects from three of the four organizations experimented with new work practices as suggested by the researchers. The solutions that resulted from this were carefully evaluated (references to the detailed Danish reports can be found in Andersen et al., 1990, under MARS 2 to MARS 10). In the third year, the researchers reported their experiences and disseminated what was learned in three ways: (1) in book form (written in Danish in 1986 and translated into English as Andersen et al., 1990), (2) through university courses and (3) through professional courses for practitioners. The experiences from these projects confirmed the following five maxims (or basic principles) that characterize the ordinary work practices approach in its present form (as well as many other Scandinavian research projects):
1. Research on ISD must be committed to achieving and facilitating change in the workplace. Abstract theories or experiments reported in scientific journals read by a small elite are insufficient to accomplish this. Instead a change of working practice through better learning must take place.
2. Each research project in ISD should consciously select and define the interests that it is to serve through the knowledge change it produces. This is a generalization of the principle in the MARS project that stated that improvement in ISD must serve the professional interests of the participating practitioners.
3. The research project organization should be democratic so that practitioners and researchers form a communication and cooperation community of peers. Researchers possess general theories and methods. Often they are more articulate than practitioners in analyzing problems with abstract ideas on how to approach a solution. But theories are empty unless they are appropriately applied to the specific setting. For this, the researchers depend on the experience of the practical system developers. Only the latter are familiar with the details of system development and the use of the methods that are specific to their organizations.
4. The preferred research approach is experimental, because theories and methods are subject to ambiguous interpretation, and their true consequences cannot be understood from logical deduction, but only by experimenting with them in concrete working situations. For example, cognitive maps were used to better understand the problems of a specific project and to analyze their causes and consequences (Lanzara and Mathiassen, 1984).
5. Each research project must plan for the collection, reporting and dissemination of its results. The dissemination must take more than one form to be effective in practice; i.e., academic publication outlets are one important form but insufficient by themselves. They must be accompanied by professional presentations, courses or other forms of disseminating professional knowledge.
4.4.2 Paradigmatic Analysis of Strengths
This approach has been able to operationalize many of the important insights of social relativism, especially those that were borrowed from hermeneutics and phenomenology. Clearly the emphasis on reflection is an application of the basic concept of the hermeneutic understanding cycle. The approach recognizes the importance of metaphors as an interpretative vehicle for understanding design situations and identifying design options, building on Lakoff and Johnson (1980) and Schoen (1983). Because of the theoretical connection to the “reflective practitioner,” the approach could potentially build on a deep understanding of the problematical relationship between human action and thought (Argyris and Schoen, 1978; Argyris, 1982). In its current practice this is partly realized by its insistence on learning through self-reflection and partly through the emphasis on interaction as a basis for dialogue (later we shall note that much of this is implicit in the research practice, and the theoretical connections need to be realized more clearly along with their implications). Moreover the approach also recognizes that dialogue must be informed by experiential feedback from shared practice. The theoretical rationale for this comes from the phenomenology of life worlds (Schutz, 1967; Schutz and Luckmann, 1974; Madsen, 1989). All this adds up to a sound theoretical basis for reaching a deeper understanding of the problematical nature of human communication and of the significance of genuine participation as a basis for symbolic interaction. As system development is an interpretive process that tries to “read” the organizational text in requirements definition and create new meanings in the design of a solution, it follows that developers cannot and should not impose these new meanings “from the outside.” Consequently, the ordinary work practices-based approach gives analysts primarily a facilitator role in the users’ sense making and learning processes (see Hirschheim and Klein, 1989, p. 1205). Another strength is that the hermeneutic tradition checks “blind action” by assigning reflection a proper role. The ordinary work practices-based approach takes note of this by seeking various means to encourage and support it (e.g., systematic diary keeping; Lanzara and Mathiassen, 1984). All of these points are consistent with the recent philosophical ideas on
human understanding and the outlines of a general theory of communicative action (see Apel, 1982; Habermas, 1984). An approach based purely on hermeneutic theories of understanding runs the risk of withdrawing into contemplation and reflection. To counteract this, the ordinary work practices approach appears to be influenced by paradigmatic principles that favor human action. This is evident from at least four of the five maxims stated previously (the possible exception being 5). In addition to its social relativist influences, there are both functionalist tendencies in its emphasis on experimental prototyping and neohumanist tendencies in its commitment to reveal the interests and values of applying knowledge. It also draws on the ideas of neohumanism in its democratic project organization for facilitating the sharing of ideas. (Of course, functionalists might be inclined to argue that one cannot “vote on what is correct.”) The democratic project organization can be seen as an approximation of the rational discourse model and the focus on pertinent interests as an attempt to realize what Habermas calls the “generalizable interest” (Habermas, 1973). The two are not unconnected. Generalizable interests are those that emerge as legitimate from a maximal critique of all needs that should be served by a project. Insisting that the interests are made explicit and chosen consciously in a democratically organized group will most likely force a critical debate of these interests. Other devices to bring about critical discussion are also considered. For example, Mathiassen and Nielsen (1989) suggest that a consultant report on the key problem aspects can be written to initiate a debate about conflicting interests in a development project. However, the actual support of rational communication in system development needs to be better developed (as is noted in our critique later). The approach balances the emphasis on communicative aspects with the need for direct action through the emphasis on practical experimentation and the commitment to achieve real change in the practice of system development. From a phenomenological perspective, the emphasis on experimentation is important for three reasons. First, it strengthens the basis of shared meanings. The results of the experiments equally inform the user-subject and the developer-scientist. This is different from the functionalist concept of experiment, where an experimenter controls the conditions for the subjects, who in turn do not benefit from the results. Second and equally important, shared experiments help to acquire the tacit knowledge basis that is the sign of true professionalism (see the analysis of expert knowledge by Dreyfus, 1982). Third, the experimental component of the approach is also a vehicle to deal with the problem of unanticipated consequences of ISD, which is consistent with the literature on the subject. Here the literature points out that ISD means a change of organizational life forms, and the totality of such changes cannot be predicted by theory but at best partially anticipated and felt by experience.
A further connection to both neohumanist and social relativist ideas is the process orientation of the approach. (Earlier in this section the process view of systems development was contrasted with the product-oriented view.) If requirements are seen as emerging from the interaction between users and developers as both try to “read” the organizational situation and make sense of it, then the process view is more appropriate than the product view. A process approach to systems development is also suggested by the neohumanist paradigm because neohumanism focuses us on improving the rationality of communication and discourse so that the participants themselves can determine what is appropriate, just and correct in a given situation. This advocates process characteristics not specific outcomes. In line with such philosophical principles a process-oriented approach to ISD should seek to improve the processes by which an organization “reads” problematic situations ; for example, by improving cooperation, communication and knowledge bases. This is exactly what the ordinary work practices approach does. In line with this it also suggests that system developers should act as facilitators, which is consistent with the tenets of the social relativist paradigm (see Hirschheim and Klein, 1989, p. 1205). This facilitator principle is implemented in the ordinary work practices-based approach through the idea that system developers should help the organization to better manage its learning processes so as to better understand itself, the problems and general situation that together determine how IS are used. This is similar to the double loop learning concept of Argyris (1982).
4.4.3 Paradigmatic Analysis of Weaknesses
When viewed from a functionalist or neohumanist perspective, several weaknesses of the ordinary work practices-based approach stand out. From a functionalist perspective, the approach lacks clarity and structure. There is no explicit overall theory, only fragments of theories and singular experiences. This is by no means a necessary consequence of the interpretivist leanings of the approach. On the contrary, the recognition of the importance of clear communication should be an incentive for the approach to document itself systematically, such as has been done with the project management aspects of the approach (see Andersen et al., 1990). Both the theoretical foundations and the experience necessary for doing this exist. The project diaries record much experience. They could provide valuable information on what to emphasize in making the approach more accessible and thereby also “teachable.” Regarding structure, one would not necessarily expect the procedural type of documentation in terms of methods, tools and life-cycle structure that is so effective for functionalist methodologies. Instead, a systematic
discussion and enumeration of fundamental theoretical principles and how they have been realized in different cases might be more appropriate. The previous discussion has provided numerous pointers to this. Other possible starting points to review the theoretical foundations exist as well (e.g., Boland, 1985; Winograd and Flores, 1986; Lyytinen, 1986; Ngwenyama, 1987; Oliga, 1988; and Ulrich, 1988). The description of the theoretical basis should then be followed by examination of a list of common mistakes or misunderstandings with some friendly advice on the dos and don’ts. Again, the extensive project work and the documentation of participants’ behaviors and thoughts in the diaries should provide valuable hints for this. The documentation structure chosen by the PORGI project (see Kolf and Oppelland, 1980, p. 68) could point the way to an appropriate, yet clearly organized format. PORGI documented itself through five aspects: (1) the terminological basis (in PORGI called the “descriptional framework”); (2) a discussion of general design principles and the interventions to which they typically lead (called the “pool of design concepts”); (3) typical sequences of well-known design activities (procedural scheme); (4) a pool of recommended methods and tools; and (5) a pool of informative problems or case vignettes. Both functionalists and neohumanists will also be critical of the lack of attention given to the cost and effectiveness of the approach. Clearly it is very cost and labor intensive, and both neohumanists and functionalists will insist that the cost be justified by results. Hence one misses a discussion of potential positive and negative outcomes of the approach and how they can be understood (functionalists would likely call for “measures,” but a broader concept of keeping score is applicable) and how it compares with other approaches to ISD. From a neohumanist point of view, the approach is weak in recognizing the theoretical and practical difficulties of improving established working practices primarily through communication. It gives little consideration to the role of authority and power in organizational change, which is rather unrealistic. This is somewhat surprising, because the need for a change strategy is recognized in Andersen et al. (1990, p. 238). Among the pertinent issues recognized under the heading of “strategies for changing working practices” are the following: who should take responsibility for change, whether corporate culture impedes or encourages change, the need for determining qualification requirements and appropriate ways to meet them, and the need to foster learning and initiative. For the most part the ordinary work practices approach appears to draw on a functionalist perspective of “planned organizational change.” In so doing it fails to explicitly analyze the multiple personal, cultural and linguistic barriers to learning and communication that may exist both within and beyond the immediate work situation. Of particular concern is the failure to
recognize that work practices are connected to different work languages, and to change either one means a change in “forms of life.” For example, the examination of the relationship between language and phenomenon as well as between system description and reality (see Andersen et al., 1990, pp. 209 and 212) misses some of the most fundamental insights of social relativism. From a social relativist perspective, the ordinary work practices approach fails to address the fundamental role that bias and “prejudice” play in all human understanding. In particular, “preunderstandings” in the form of precedence and tradition would seem to apply to the analysis of work practices. Each “language game” entails a bias, a limited “horizon” into which we are bound by various forms of tradition and domination. The concept of change strategy needs to be broadened to encompass the issue of how the participants can possibly escape the “prison” of their tradition-bound work languages. To say that “the participants together should build up words and concepts increasing their understanding of similar phenomena” and “the objective may be that this understanding should be a shared platform improving the quality of the design process” (Andersen et al., 1990, pp. 209 and 210) fails to recognize how communities get entangled in the illusions and misconceptions of their own languages and how certain language patterns and associated forms of life are stabilized by subtle forms of social control and domination. A more penetrating analysis of the issues at stake would require an explicit discussion not only of how work languages could be changed (some of this is provided), but also of how they could be changed for the better (i.e., in what sense improved) through repeated hermeneutic cycles. This would have to explain how different “horizons of meaning” can be fused in a new work language and practice, both overcoming the limitations of the old (any language use implies a practice, according to the later Wittgenstein). Some of the key issues with this surfaced in the Gadamer-Habermas debate on the limits of human understanding (see the review in McCarthy, 1982, p. 162 and esp. p. 171; and Held, 1980). Habermas makes the point that communicative distortions are not the exception (as in propaganda, advertising or adversarial situations) but the rule in everyday life. One misses a clear recognition of how working practices can truly be changed for the better; i.e., be more than conforming to social norms that happen to prevail at a certain time in a certain place. Second, one misses a clarifying word on what “improvement” of working practices really means in practice (for example, by interpreting a good case example). Typically one will interpret “improvement” as a synonym for “progress” (the irreversible change of working practices for the better), and there are some difficulties with this notion (see the analysis in Alvarez and Klein, 1989) which the literature on the ordinary work practices approach so far has failed to address. If this
was done by way of a theoretical foundation of the approach, it would be possible to give a more penetrating analysis of the improvement of work practices (and work languages) along with the difficulties to be expected in such an enterprise. In our opinion the fundamental issues are the same as were noted in the section on information systems planning: there are different levels and communities of discourse, and it is not clear how the meanings of different languages can become shared and in what sense of the word this constitutes improvement. Addressing this issue squarely will amount to a research program on the “rational reconstruction of the conditions of speech and action” (see summary in McCarthy, 1982, p. 272) in the context of systems development. Implicitly the ordinary work practices approach has dealt with this through its emphasis on interaction and experience. From a practical perspective, explicit, theoretically well-founded principles of participation and project organization will be critical for rational change of working practices, and the treatment of this (see Andersen et al., 1990, pp. 171, 239 and 250) falls behind Mumford’s analysis of participation and the extensive empirical work on principles of participation in the PORGI project (see the summary in Oppelland, 1984): who should participate? under which conditions? with which qualifications? for which purposes? what are the barriers to effective participation (e.g., Mulder, 1971)? etc. Given the generally strong emphasis on interaction, one cannot help but be surprised about the neglect of the pertinent literature.
4.4.4 Suggested Directions for Future Improvements
Earlier, the need for better documentation was noted. The analysis of weaknesses points to two further fruitful avenues of research that could considerably strengthen the approach. One has to do with language and the other with the effects of power and distortions. Given the emphasis on understanding in the approach, it is surprising that no attention is paid to the role of language, in particular work language, in ordinary work practices. By incorporating some of the principles and methods of a semiotic approach to IS design (see Bogh-Andersen, 1991), some of the vagueness and lack of structure of the ordinary work practices approach could be addressed. The second issue concerns the interrelationships between power, communication and language. One misses a clear discussion of a situation and interest analysis as is, for example, proposed in Iivari (1989). This should include the institutional context with a focus on goals, hidden agendas and overt and covert uses of unwarranted power. A first approach to this can be found in Mathiassen and Nielsen (1990). In general, however, the primary focus of the ordinary work practice approach is on change of working
practices through creative learning and well-intentioned cooperation. This overlooks the following causes of development problems with users that are very difficult to deal with: perceived inequities of proposed changes, or lingering issues of frustration and social injustice imported into the organization from the surrounding society (see Newman and Rosenberg, 1985; Braverman, 1974; Bjorn-Andersen et al., 1982). Perceived inequities of organizational change tend to arise from shifts in the distribution of duties and rewards at the workplace. These types of problems are more easily addressed in a project than the second problem type: frustration. General frustration among users is often revealed by examining the social contract between different socio-economic groupings in a region or society from which the organization draws its employees. Major imbalances in social justice cause large numbers of people to feel that they “never get a square deal,” and the resulting latent frustration surfaces when things change in the organization. If a change of ordinary work practices is to be successful across a large number of different organizations, it must incorporate principles that allow the detection of such issues. In order to address this, classical functionalist theories of the nature of organizational incentives and power (e.g., Harsanyi, 1962) could be combined with the neohumanist analysis of distorted communication. Neohumanism emphasizes the connection of communication problems to macro-societal issues. The synthesis of such a conceptual base drawn from functionalism, hermeneutics and neohumanism, combined with the emphasis on working practices, might be helpful for two purposes: first, to address participation and project organization as was already noted; second, to place the ordinary work practices approach on a realistic socio-theoretic foundation so that organizations and analysts can more clearly understand in which kinds of situations it might produce promising results and in which it is unlikely to be applicable.
4.5 Conclusion and Outlook
In the previous characterization several references were made to emancipatory features. Yet, a fully developed emancipatory approach to ISD does not exist in any concrete form. Such an approach could be constructed on the grounds of the neohumanist paradigm. No methodologies as yet implement this neohumanist view, but a literature base is coalescing around the theme of applying critical social theory (CST) to the development of information systems (see Lyytinen and Klein, 1985 ; Lyytinen, 1986; Lyytinen and Hirschheim, 1988; Ngwenyama, 1987; 1991). The ideas are still somewhat embryonic but enough substantive material is available to suggest the nature and direction of an emancipatory approach to information systems
development. Elsewhere (Hirschheim and Klein, 1991b) we take up this theme and sketch in some detail how emancipatory concerns could be realized in a methodology that originally did not focus on emancipatory concerns. We believe that the ETHICS methodology* could be developed further from a critical social theory perspective to incorporate not only emancipatory but also communicative aspects.
5. Conclusion
In the preceding analysis, we have attempted to highlight two key themes. The first dealt with the evolution of ISD methodologies and their paradigmatic influences. The second dealt with how methodologies could evolve in the future by adopting alternative paradigmatic influences. It has been our contention that from the beginning, methodologies have been influenced primarily by functionalism, but more recently, the inspiration has come from alternative paradigms. We have also attempted to show that methodologies can be improved by systematically importing fundamental concerns and principles inspired by different paradigms. Through our detailed analysis of four methodologies, two key points were made. First, it was revealed how some of the recent methodologies incorporate principles from more than one paradigm. We can therefore speak of recent research on ISD methodologies as “mixing” paradigmatic influences. Second, research on methodologies could substantially benefit by systematically adopting maxims and principles from more than one paradigm. In order to demonstrate this conjecture we explore elsewhere (Hirschheim and Klein, 1991b) how a specific methodology, ETHICS, could be improved by drawing on ideas from the neohumanist paradigm, specifically CST. The preceding analysis could, of course, be broadened considerably by including more methodologies. In order to keep the length of this paper within the bounds of one chapter, we omitted several potentially important recent developments. Among these are the application to information systems development of activity theory (Kuutti, 1991; Bodker, 1991); emergent systems thinking (Truex, 1991; Truex and Klein, 1991); postmodernism (Achterberg, van Es and Heng, 1991); speech act theory (Auramaki et al., 1988; 1991); Mao’s theory of contradictions applied to issues of project management (Ogrim, 1990); and last but not least, the successor to the collective resource approach (e.g., Kyng, 1991; Ehn, 1988). It is our belief that the kind of paradigmatic assumption analysis applied in this chapter to methodologies such as information systems planning (and structured methods), prototyping, SSM,
* Because the ETHICS methodology has been referenced so often in this paper, suggesting its importance to the field of information systems development, we felt it necessary to provide
an overview of the methodology. It can be found in Section 6.5 of the Appendix.
and the ordinary work practice approach is applicable to these recent developments. We therefore believe that all methodologies can be critically investigated in this manner. Furthermore, we suggest that future research on ISD methodologies could profit considerably from a paradigmatic assumption analysis of research plans. The insights gained from such an analysis could help the different research groups that contribute to the advancement of methodologies to determine which issues are the most critical and where the leverage of scarce research resources is greatest.

Acknowledgments
Several conversations with Jan Stage and Lars Mathiassen from the University of Aalborg, Denmark, helped us to better understand the historical background of the MARS project and the current directions of the ordinary work practices approach. Both Pertti Jaervinen, Tampere University, and Juhani Iivari, University of Oulu, Finland, took the time to thoroughly check an earlier version of this manuscript. Their comments identified missing literature and caused us to reconsider the wording in a number of places. In particular, Juhani's insistent questioning contributed to clarifying our ideas on several important issues.

6. Appendices: Summaries of the Methodologies

In these summaries of methodologies, it should be noted that, in the case of the structured methodologies and prototyping, we are really describing a "family of methodologies" rather than a single approach. In order to provide a sense of consistency throughout these summaries, the methodology descriptions consist of four parts: (1) an analysis of the reasons why the particular methodology was proposed (its purpose and rationale); (2) a concise examination of the key ideas underlying the methodology by which it hopes to achieve its stated purpose (its focus); (3) a characterization of the methodology's principal stages and their sequence (its phase structure); and (4) a list of its special methods and tools with their intended purpose. In order to highlight the special features of each methodology, we, in some cases, relate these features to the classical system life cycle (see the systems analysis textbooks of Kendall and Kendall, 1988; Yourdon, 1989).

6.1 Structured Methodologies
6.1.1 Purpose and Rationale
The most common forms of structured methodologies can be traced to the classic works of Gane and Sarson (1979), DeMarco (1978) and Yourdon
(1989). They were introduced to cope with the complexity of analysis and design and the problems that this complexity caused with the descriptions used in the classical life-cycle approach. The latter lacked a clear focus and organization for the analysis and predefined formats for describing and filing the myriad of details that a development exercise collects during the systems development process, beginning with vague descriptions of the problem and ending (ideally) with detailed program specifications and user documentation. As a result, system description formats were ad hoc, very verbose and as difficult to read as a "Victorian novel." There was no assurance that the documentation was in any sense of the word complete or consistent. This produced many specification errors through omission and multiple interpretations. In addition, it is practically impossible to keep conventional life-cycle documentation up to date, because it is too time consuming and costly to track down all changes in a lengthy document that is redundant and uses ambiguous, often inconsistent terminology. Classical life-cycle approaches also failed to clearly separate functional requirements from implementation options and constraints. This led to confusion between logical and physical design details, further compounding the system documentation problem.
6.1.2 Focus
Structured methodologies tried to resolve these issues by proposing specification and diagramming standards and a new organization of the life cycle (see the phase structure section). In order to more effectively organize system documentation, four documentation standards were proposed:

1. A set of leveled diagrams documenting work processes with their inputs and outputs (data flows) and intermediate data storage needs (i.e., files, directories and the like) in a set of interlinked data flow diagrams.
2. The details of the processes, data flows and data stores are recorded in a project data dictionary. For each type of entity in the system documentation (e.g., data flows, data stores, external entities and processes) a standard format is proposed to record the details. This results in a special, concise, semiformal notation for system description; hence the notion of a special system description language arises (a small illustrative sketch of such a record follows this list).
3. Data flow diagrams break processes down until they can be documented on one page. Again, semiformal notation like decision tables or Nassi-Shneiderman diagrams are used to describe the processes that during implementation provide the input to defining program modules. Process documentations are part of the project data dictionary.
4. Program designs are documented via the tools of structured design, such as structure charts, that allow one to formulate modules with strong cohesion that are only loosely coupled.
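As a rough illustration of the kind of record such a project data dictionary might hold, the sketch below shows a data-flow entry as a small structured record. The `DataFlowEntry` class, its field names and the example content are our own assumptions for the purpose of illustration, not part of any particular structured method or tool.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class DataFlowEntry:
    """One hypothetical project data dictionary record for a data flow.

    Mirrors the idea that every data flow, data store, process and
    external entity gets a standard, semiformal description.
    """
    name: str                     # unique name used on the data flow diagrams
    source: str                   # process or external entity the flow leaves
    destination: str              # process or data store the flow enters
    composition: List[str] = field(default_factory=list)  # data elements carried
    notes: str = ""               # informal remarks, volume estimates, etc.

# Example: a "customer order" flow from a hypothetical order-entry diagram
order_flow = DataFlowEntry(
    name="customer_order",
    source="Customer",
    destination="Validate Order",
    composition=["customer_id", "order_date", "line_items"],
    notes="Arrives by phone or mail; roughly 200 per day.",
)
print(order_flow.name, "->", order_flow.destination)
```

Keeping every such entry in one repository is what later makes it possible to check the documentation for completeness and consistency.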
6.1.3 Phase Structure
The specification phase of the fully structured life cycle has six stages, which are then followed by coding, testing and implementation. In each of the stages a special system model is built. In stage 1 the problem is formulated by building a model of the existing system. Based on this, a logical model of the existing system is built in stage 2. In the more recent Yourdon (1989) version, an "essential model" describes the key functional requirements and replaces the physical and logical models of the existing system. By now the requirements are thoroughly understood (at least in theory) and a logical model of the new system is built in stage 3. The results of the logical specification stages are also used to identify hardware options. Stage 4 deals with the physicalization of this model, describing implementation details. In stage 5 the specification is "packaged" by filling in deferred details (such as processing error messages and performance considerations) and key user interfaces. Stage 6 uses this information to create program and hardware specifications. This last stage of structured analysis and design is then followed by coding, testing and implementation.
6.1.4 Special Methods and Tools
The structured approach was instrumental in recognizing the importance of information systems modeling and the special methods and tools that support it. Methods include effective conventions for representing systems (i.e., data flow or state transition diagrams) and maintaining descriptions (i.e., data dictionary formats for processes and data). Most recently, the ideas underlying the original methods and tools have been refined and incorporated into computer-aided software environments: complex computer packages that support all phases of the structured life cycle. They include graphic editors, a central data repository, word processing and document generation facilities. In the most advanced versions, software tools can use the information collected in the data repository to generate database schemas and a skeleton program, i.e., to support the generation of structured program code.
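To make the code-generation idea concrete, the following minimal sketch shows how a repository entry for a data store might be turned into a skeleton SQL table definition. The dictionary format and the `generate_table` function are assumptions made for this example; they do not describe the interface of any actual CASE package.

```python
# Minimal sketch: deriving a skeleton SQL table from a data-store entry
# held in a (hypothetical) central data repository.

data_store = {
    "name": "customer",
    "elements": [("customer_id", "INTEGER"),
                 ("name", "VARCHAR(80)"),
                 ("credit_limit", "DECIMAL(10,2)")],
    "primary_key": "customer_id",
}

def generate_table(entry: dict) -> str:
    """Emit a CREATE TABLE statement from one repository entry."""
    cols = [f"    {col} {sqltype}" for col, sqltype in entry["elements"]]
    cols.append(f"    PRIMARY KEY ({entry['primary_key']})")
    return f"CREATE TABLE {entry['name']} (\n" + ",\n".join(cols) + "\n);"

print(generate_table(data_store))
```

The point is only that a single, consistently maintained repository can drive several downstream artifacts (schemas, skeleton programs, documentation) rather than each being written by hand.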
6.2 Prototyping

6.2.1 Purpose and Rationale
Prototyping, sometimes referred to as the "iterative-adaptive" (Keen, 1980) or "evolutionary" (Hawgood, 1982; Budde et al., 1984) approach to systems development, specifically attempts to address the difficulties with formulating and articulating requirements reliably and completely before proceeding with system design and implementation. It seems to have emerged at about the same time in a number of different countries (see Earl, 1978; Keen and Scott Morton, 1978; Courbon and Bourgois, 1980; Naumann and Jenkins, 1982). It was an attempt to overcome several limitations of the system life-cycle methods that were a direct consequence of the life cycle's rigidity. In situations that are very volatile or full of uncertainties, it simply is impossible to specify reliable requirements in advance, as all parties learn the "real" requirements only as work proceeds. Often users do not know or cannot articulate their requirements, and modeling the existing system is impossible or prohibitively difficult. In particular, this is the case with systems that support creative problem solving and planning. In such cases neither users nor analysts can specify the requirements in advance. Modeling the existing system, even if possible, may be undesirable because it perpetuates unsatisfactory ways of information processing. The only way for improvement may be through experimentation with a working prototype; i.e., a system that delivers some promising functions, but is not developed in full detail. Basically, prototyping allows the user to experiment with partially completed systems that can be "patched together" quickly. The prototype is a scaled-down version of the full system that delivers a sufficient subset of the functions to allow the users to understand and modify the full system's capabilities. Two principal forms are usually considered. A horizontal prototype provides the interfaces to give the user a "feel" for the interaction patterns without much computational functionality. In cases where computational power is essential to judge the potential of a new system, a vertical prototype provides some of the functionality by implementing a connected subset of functions that solves some real problems, but is constricted in scope. In either case, one of the main benefits of prototyping is that it permits the users to get a feel for the system through hands-on experience without the delays of systems development normally associated with life-cycle methodologies.

6.2.2 Focus
Prototyping forgoes the idea that detailed specifications must be developed before coding starts. Rather it aims at identifying some essential, but basic
user requirements, such as can be elicited by an informal conversation and documented in some casual notes. The prototype, a system exhibiting the essential features of a possible solution to the initial requirements, is then implemented quickly, typically within a few days. This initial solution is then presented to the user in the spirit of an experiment. The analyst assists the user in running the prototype and the user is encouraged to experiment with it. In this way the user learns what the initial prototype can do and in which ways it is unsatisfactory. On the basis of what the analyst and the user learn from using the prototype, the requirements are refined and a second prototype is developed. This process is repeated until (ideally) the user is satisfied. The last version of the prototype is documented and implemented more carefully to make it run efficiently and provide the user with a better interface.

6.2.3 Phase Structure
Prototyping as originally proposed (Naumann and Jenkins, 1982; Courbon and Bourgois, 1980) essentially involves five phases leading to stepwise refinement so that "the prototype becomes the system." (1) Identify some of the basic requirements without any pretensions that these are either complete or not subject to drastic changes. (2) Develop a design that meets these requirements and implement it. (3) Have the user experiment with the prototype (possibly with the developer acting as a chauffeur if the implementation is not robust), noting good and bad features. (4) Revise and enhance the prototype accordingly, thereby redefining and gradually completing the requirements and also improving the interface and reliability. (5) Repeat steps 3 and 4 until the user is satisfied or time and money foreclose on further revisions. At this point "the prototype becomes the system" (Lantz, 1986, p. 5).

6.2.4 Special Methods and Tools
As noted above, there are two primary types of prototypes (Floyd, 1984, p. 4): "horizontal," which is a mock-up of the user-system interfaces that allows the users to develop an appreciation of the interaction patterns of the system, although without much computational functionality; and "vertical," which offers full computational power but limited functionality by implementing a carefully selected subset of functions in their intended final form, solving some real problems but constricted in scope. As prototyping requires rapid implementation, very high-level code generators, on-line screen design programs and the like are needed. Such tools also keep the cost of frequent recoding manageable. Often they are more
suited for horizontal prototypes and less so for vertical prototypes that need a high degree of computational power. An exception to this is Iverson's APL. It is an early, very high-level language that was used for vertically prototyping mathematically oriented applications. More recently, database languages like SQL, query by example, spreadsheets, HyperCard and graphic packages have been used. Some CASE tools have special prototyping features such as screen generators and journaling of screen designs.
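The iterative refinement at the heart of prototyping (Section 6.2.3) can be summarized as a simple loop. The sketch below is only a schematic rendering under our own assumptions: `build_prototype` and the `pending_feedback` list stand in for whatever high-level tools and hands-on user sessions a real project would use.

```python
# Schematic of the prototyping cycle: build, let the user experiment,
# refine the requirements, repeat until satisfied or resources run out.

pending_feedback = [                   # stands in for what users discover hands-on
    ["show running totals on the order screen"],
    ["flag overdue orders in red"],
    [],                                # empty list: user is satisfied
]

def build_prototype(requirements):
    # Placeholder: a real project would use screen generators, SQL,
    # spreadsheets or similar high-level tools here.
    return {"implements": list(requirements)}

requirements = ["enter orders", "list open orders"]        # rough initial needs
for round_no, feedback in enumerate(pending_feedback, start=1):
    prototype = build_prototype(requirements)
    print(f"round {round_no}: prototype covers {len(prototype['implements'])} requirements")
    if not feedback:                                        # user is satisfied
        break
    requirements.extend(feedback)                           # requirements emerge in use

# The last prototype is then documented and reimplemented carefully,
# i.e., "the prototype becomes the system."
```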
6.3 Soft System Methodology

6.3.1 Purpose and Rationale
Checkland's (1981) goal was to develop a general methodology, termed "soft system methodology" (SSM), that could be used for information systems analysis but is not specific to IS. He views SSM as a meta-methodology that should guide a systematic approach tailored to a specific situation. This kind of meta-methodology is more concrete than a general philosophy for analyzing the world, but broader and more flexible than a specific professional method that is usually limited to a predefined problem domain. Checkland states that SSM was designed for use in broad problem-solving situations, but its relevance to systems analysis is clear. He sees information systems within the realm of social systems, which he states are composed of rational assemblies of linked activities and sets of relationships. He refers to these as "human activity systems." SSM adopts the viewpoint that human activity systems possess goals that are not quantifiable. In human activity systems, problems are manifestations of mismatches between perceived reality and that which, it is perceived, might become actuality.
6.3.2 Focus

Checkland's methodology is different from the traditional approaches in that it does not prescribe specific tools and techniques, only general problem-formulating approaches. It is a framework that does not force or lead the systems analyst to a particular "solution," but rather to an understanding. The seven steps (or "stages") within the methodology are categorized as "real-world" activities and "systems thinking" activities. The steps in the former are executed by the people in the real world or problem situation; the steps in the latter attempt to provide a conceptual model of the real world, which is in turn modified by discussion with the concerned people. It is therefore a highly participative approach.
6.3.3 Phase Structure
Although the characterization of the general form of SSM has recently been somewhat modified (Checkland and Scholes, 1990), as noted earlier in Section 4 of the chapter, the following refers to the version presented in Checkland (1981).

Stages 1 and 2: Expressing the Problem Situation. The function of these stages is to outline the situation so that a range of possible and relevant views can be revealed. The first stage is concerned with the problem situation unstructured, and the second attempts to "express the problem situation" such that many different perceptions of the problem situation, from people with a wide range of roles, are collected. The views will, ideally, reveal patterns of communication (both formal and informal), power, hierarchy and the like. These early stages should not attempt to "define" the problem, only to build up an understanding of the "situation in which there is perceived to be a problem" without imposing any particular structure on it or assuming any structure exists.

Stage 3: Root Definitions of Relevant Systems. By the conclusion of stage 2, different perceptions of the problem situation should have become sufficiently clear to enable "notional systems" to be named. These are systems that are relevant to the problem. Stage 3 is concerned with choosing the appropriate notional systems from the systems stated in stage 2. The choice is made with the help of what is known as a "root definition," a concise description of a human activity system that captures a particular view of it. To put it differently, it is a definition of the basic nature or purpose of the system thought by a particular actor to be relevant to the problem at hand. The construction of the root definition is at the core of the methodology. Checkland offers a six-element checklist that all root definitions should explicitly contain, known conveniently by the acronym CATWOE, which stands for: customer, the clients, beneficiaries, or victims of the system's activities; actors, the agents who carry out, or cause to be carried out, the transformation activities of the system; transformation, the process by which defined inputs are transformed into defined outputs; Weltanschauung, an outlook, framework, image, etc. that makes the particular root definition meaningful; ownership, the agents who have a prime concern for the system, the system's owners, system control or sponsorship; and environmental constraints, the environmental impositions, features of the environment or wider system that have to be taken as given.
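The six CATWOE elements of a root definition can be thought of as one small structured record. The sketch below (the `RootDefinition` class and the example content are our own illustration, not part of Checkland's notation) shows one way to keep the elements together while drafting and comparing alternative root definitions.

```python
from dataclasses import dataclass

@dataclass
class RootDefinition:
    """The six CATWOE elements of one candidate root definition."""
    customer: str          # clients, beneficiaries or victims of the activity
    actors: str            # who carries out the transformation
    transformation: str    # defined inputs turned into defined outputs
    weltanschauung: str    # outlook that makes the definition meaningful
    ownership: str         # who sponsors or could stop the system
    environment: str       # constraints taken as given

# Hypothetical example: a library loan service viewed as a human activity system
loans_view = RootDefinition(
    customer="borrowers and circulation staff",
    actors="circulation desk staff",
    transformation="requests for titles turned into recorded, time-limited loans",
    weltanschauung="wide access to the collection is worth the cost of tracking loans",
    ownership="the municipal library board",
    environment="budget limits, data-protection rules, existing catalogue system",
)
print(loans_view.transformation)
```

Writing several such records for the same problem situation makes it easier to see how different actors' views (different Weltanschauungen) lead to different relevant systems.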
Stage 4: Making and Testing Conceptual Models. Conceptual models are those that will accomplish what is defined in the root definition. If the root definition is viewed as an account of what the system is, then the conceptual model is an account of the activities that the system must perform in order to be the system named in the definition. Conceptual models are not descriptions of any part of the real world, only the structured set of activities sufficient to account for the system as defined by the root definition.

Stage 5: Comparing Conceptual Models with Reality. At this stage, conceptual models are compared with the problem situation analyzed in stage 2. There are four approaches for making such a comparison: (1) use the conceptual model as a basis of ordered questioning, in particular, to open up a debate about change; (2) trace through the sequence of events that have produced the present problem situation and compare what had happened with what would have happened if the conceptual models had been implemented; (3) ask what features of the conceptual models are particularly different from the present reality and why; and (4) develop an alternative conceptual model of the problem situation with the same "form" as the first, then "overlap" one onto the other, revealing mismatches, which are the source of discussion of change. The purpose of the comparison stage is to generate debate about the possible changes that might be made within the perceived problem situation.

Stages 6 and 7: Implementing "Feasible and Desirable" Change. Checkland states that the changes likely to be recommended through his methodology will probably be more modest in nature than what would be recommended by a "hard" methodology. (The latter is likely to advocate the creation and implementation of a new system.) Changes should be desirable as a result of the insight gained from building and selecting root definitions and creating conceptual models. They should be culturally feasible, taking into account the characteristics of the situation, the people in it, and their shared experiences and beliefs. Changes generated from this stage are debated by those people in the problem situation who care about the perceived problem. Once changes have been agreed, they can be implemented. This process may be straightforward or problematic, in which case the methodology itself may be used to help in the implementation process.

6.3.4 Special Methods and Tools
Checkland offers a number of methods and tools that are specific to SSM. For example, the notion of CATWOE as a vehicle for making sure
alternative root definitions are sensible descriptors of the problem situation; rich pictures for describing the problem situation in an imaginative and creative fashion; making and testing conceptual models that abstract from the real world; human activity systems that describe the nature of what is studied; and what might be thought of as SSM's dualistic ontology that allows the separation of the conceptual and real-world spheres.

6.4 UTOPIA
As UTOPIA is not a methodology but rather a specific project, it cannot be written up in the previous structure. The UTOPIA project was an attempt at developing a computer-based system for typesetters in the newspaper industry in Sweden. The project invoked a number of tools and techniques in the systems development exercise. Here, we briefly describe the nature of the industry and then how the systems development proceeded. Traditional newspaper production involves four major processes: writing, editing, typesetting and printing. Reporters and columnists write individual copy, which is then edited by the editorial staff. Typesetters take the edited copy, including the relevant pictorial material, and lay out pages. Printers take the results and print the newspapers. Typical systems designs focus on rationalizing newspaper production by combining tasks that can logically be done on the same electronic device, such as editing and formatting. Page layout is conceived as a natural extension of formatting. A requirements analysis along these lines suggests that the editorial staff can perform the typesetting function because computers already aid the editors with handling page layout as well as editing. Editors can embed typesetting commands directly in the final copy. Page layout is done on screen and sent to phototypesetting equipment. The editors become responsible not only for editing but also for page makeup. High-resolution screens, electronic cut, paste and scaling facilities and previewing apparatus permit the typesetting function to be assigned to the editors. In the UTOPIA project (Ehn et al., 1983; Howard, 1985) an alternative approach was tried at one newspaper company. The system development team consisted of union representatives and typesetters. Their goal was the establishment of an electronic typesetting support system that enhanced the position of the typesetting craft in the newspaper industry. Union involvement in the design team was required because the typesetters by themselves possessed neither the know-how nor the resources necessary to complete such a systems development project. Union resources were used to employ computer scientists and consultants to educate and advise the project team. The newspaper's management was not included in the design team so that typesetters' interests were given primacy in all design decisions. Existing
turn-key systems were considered inappropriate because they had built-in design constraints and management biases that did not take into account the unique requirements of the typesetting craft. These management biases emphasized cost savings, efficiency and control, leading to deskilling, job losses and an aesthetically inferior end product. Data processing specialists assumed an advisory role serving the typesetters' interest. In the requirements analysis, the design team viewed typesetting as an essential task requiring specialist skills that would be lost by its integration with editing. Two types of requirements were established: (1) the transforming of edited texts into made-up pages; and (2) the creating of an aesthetically pleasing product, the newspaper. Typesetting skills were different from editorial skills, which were responsible for content. (Editors are in charge of substance, typesetters are in charge of form.) The craft's interest was to retain the quality of typesetting and to possibly enhance the productivity of typesetters. To retain quality, systems design options focused on providing the flexibility and diversity of the traditional tools of the typesetting trade by electronic means. For meeting this objective, the team found it necessary to use hardware mock-ups in order to overcome the limitations of the then available technology. While this is similar to prototyping, the use of hardware mock-ups overcame the bias inherent in the technology used for prototyping. The available prototyping tools were unable to accommodate the craft skills used to meet the aesthetic requirements of newspaper page layout. To enhance the quality of typesetting output, additional system capabilities were added, such as fine tuning the contrast of pictures, scaling of pictures, etc., which were not available before. The UTOPIA approach resulted in an electronic typesetting support system that enhanced the typesetters' skills and productivity.
6.4.1 Development Process Differences

Process differences relate to the decisions made during system development. In the UTOPIA project, the development team made a conscious decision to retain and enhance the craft, not to include management representatives and not to be bound by the then-available page layout technology. The rationale for these decisions can be traced back to the paradigmatic assumptions that guided the development team. For example, the assumption of the radical structuralist paradigm that conflict is endemic to society motivated the project team to focus on the conflict between typesetters and management. The denial of the possibility of the system developer being a neutral expert committed them to bolstering the position of the worker in the perceived social struggle and to enhancing the craft of the typesetters. This led to an emphasis on union leadership, which put control of system
development in the hands of a single group. It also heightened the sensitivity to the effects of ideological, managerial bias, in that the existing typesetting systems would make the craft largely redundant, thereby enhancing management control over workers. Moreover, they believed the ideological bias was manifest in the components of the technology itself: the social neutrality of technology was denied. As Kubicek (1983) notes: "This approach is based on the assumption that EDP-knowledge is not impartial to the interests of capital and labor but rather biased by the perspective of capital and management" (p. 9). If available technology had limitations that would not allow the enhancement of the craft's future, then it would not be in the interest of the workers to accept existing technology as a design constraint. In the words of Ehn et al. (1983, p. 439): "The trade unions' ability . . . is limited in an increasing number of situations to a choice between yes or no to the purchase of 'turn-key packages' of technology and organization."
6.5 ETHICS Methodology
6.5.1 Purpose and Rationale

ETHICS, an acronym for effective technical and human implementation of computer systems, is a methodology developed by Enid Mumford at the Manchester Business School and has been evolving over the past twenty years (Mumford, 1983). It is quite different from the traditional approaches to information system development in that it is based on the ideals of sociotechnical systems (STS). A key aspect of the methodology is that it views participation not only as a necessary device to obtain valid requirements or stimulate commitment, but as an intrinsic right or end in itself. Consequently, users play a very large and important role in systems development. While user involvement is important in any methodology, it is absolutely vital in ETHICS.
6.5.2 Focus

While the participative nature of ETHICS is often written about in the literature, this facet of the methodology should not be overemphasized. The methodology is a serious attempt at operationalizing the key aspects of the socio-technical systems philosophy. In particular, two design teams are formed, each with a different focus of attention. One is concerned with the technical design of the system, the other with its social design. Another important aspect of ETHICS is its focus on the job satisfaction needs of the system users. ETHICS' social orientation is clearly visible throughout the methodology.
6.5.3 Phase Structure

The ETHICS methodology contains six stages that are further divided into twenty-five steps (cf. overview in Fig. 1).

Stage 1: Essential Systems Analysis. Stage 1 is the preliminary phase of the ETHICS methodology. The procedures carried out here have much in common with the more conventional methodologies. For example, in this stage the problem to be solved is identified, its boundaries are noted, the current system is analyzed and described, and key objectives and tasks are identified. After the establishment of key objectives and tasks, it is necessary to pinpoint key information needed to accomplish these objectives and tasks. Subsequently, a diagnosis is made of efficiency needs and job satisfaction needs and a future forecast ("future analysis") is undertaken. The final step in the first stage is an exercise in which all interest groups rank the list of objectives on a scale of 1-5. (Stage 1 includes steps 1-11 in Fig. 1.)

Stage 2: Socio-Technical Systems Design. Stage 2 tries to reconcile the social side with the technical side of systems design. In this stage, the technical and business constraints are set out as well as the social constraints. Two different groups are formed (one focusing on the social aspects of the system; the other, on the technical aspects), whose job it is to find technically and socially desirable design options. After identification of the social and technical constraints, the resources available for both the technical and social system are identified and examined. The objectives and tasks set in stage 1 for the technical and business side and the social side are set out in priority style. The objectives (in ranked order) are then checked for compatibility before actual technical and social systems decisions are taken. Revision may be necessary before this final step is completed. (Stage 2 includes steps 12-20 in Fig. 1.)

Stage 3: Setting out Alternative Solutions. In stage 3, an examination of any alternative technical and social solutions is undertaken. These are set out in matrix form, evaluating possible advantages and disadvantages as well as overall compatibility with the established objectives. As in the previous stage, each will be evaluated against three criteria: priority, constraints and resources. Once doubtful solutions are eliminated, a short list of technical solutions and one of social solutions is drawn up. (Stage 3 includes steps 21-22 in Fig. 1.)
FIG. 1. Schematic of the stages of the ETHICS methodology (Steps 1-25, from identifying the problem and its system boundaries through preparing the detailed work design).
Stage 4: Setting out Compatible Solutions. Stage 4 merges the short lists set out in stage 3 to see which solutions are most compatible. Incomplete solutions are discarded. Technical and social solutions found to operate well together are entered into an evaluation matrix for the next stage. (Stage 4 includes step 23 in Fig. 1.)

Stage 5: Ranking Socio-Technical Solutions. The matrix set up in the previous stage is now ranked, using information generated in stage 3, while still ensuring all socio-technical solutions meet the criteria outlined in stages 1 and 2. (Stage 5 includes step 24 in Fig. 1.)

Stage 6: Prepare a Detailed Work Design. A detailed list and description of all tasks people would perform under a particular socio-technical solution's implementation is drawn up. Tasks are ranked in terms of simplicity and attempts are made to provide a balanced spread of required skills and complexity of tasks. Checks are made to ensure that created jobs are as interesting and satisfying as possible, using a set of "issues of concern." If the highest ranking socio-technical solution scores high on these issues while achieving the technical objectives, it is accepted as the final solution. If this is not the case, another short-listed solution is tried in the same manner. (Stage 6 includes step 25 in Fig. 1.)
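Stages 3 to 5 amount to building and ordering a small evaluation matrix of compatible technical and social solutions. The sketch below is a hedged illustration of that idea only; the solution names, the additive scoring and the incompatibility check are invented for the example and are not prescribed by ETHICS.

```python
# Illustrative pairing and ranking of compatible socio-technical solutions
# (Stages 4 and 5). Scores stand in for ratings against the objectives
# set in Stages 1 and 2.

technical = {"central database": 4, "departmental PCs": 3}
social = {"autonomous work groups": 5, "enriched individual jobs": 3}
incompatible = {("central database", "autonomous work groups")}  # found in Stage 4

pairs = [
    (tech, soc, t_score + s_score)
    for tech, t_score in technical.items()
    for soc, s_score in social.items()
    if (tech, soc) not in incompatible          # discard incompatible pairs
]

for tech, soc, score in sorted(pairs, key=lambda p: p[2], reverse=True):
    print(f"{tech} + {soc}: combined score {score}")

# The top-ranked pair would then be checked against the job-satisfaction
# "issues of concern" before a detailed work design is prepared (Stage 6).
```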
6.5.4 Special Methods and Tools
ETHICS adopts a number of special methods for systems development. For example, there is a special diagramming method used for describing work layout. There is also a job diagnostic questionnaire, which is used to elicit views on the job situation. More important perhaps, ETHICS employs a facilitator who seeks to find a consensus on the systems development exercise using special questionnaire instruments. Another special feature is the use of dialectics to stimulate the generation of socially and technically desirable alternatives. Mumford notes that managers often vary technical solutions after the fact at the implementation stage. She observes that much more could be accomplished to meet social requirements if they were considered at a stage when the design was not yet frozen. At that time, social objectives could often be met with little or no extra cost. This gives rise to the split of the development team described in stage 2 and the explicit consideration of social and technical design objectives as described in stages 3 and 4.
REFERENCES Achterberg, J., van Es, G., and Heng, M. (1991). Information Systems Research in the Postmodern Period, in H.-E. Nissen, H. K. Klein and R. Hirschheim, eds., “Information Systems Research : Contemporary Approaches and Emergent Traditions,” pp. 281-294. North-Holland, Amsterdam. Agresti, W. (1981). The Conventional Software Life-Cycle Model: Its Evolution and Assumptions. In W. W. Agresti, ed., “New Paradigms for Software Development.” IEEE Computer Society Order Number 707. Albrecht, J., and Lim, G.-C. (1986). A Search for Alternative Planning Theory, Use of Critical Theory, J . Arch Plan Res. 3, 117-131. Alavi, M. (1984). An Assessment of the Prototyping Approach to IS Development, Comm. ACM 27(6), 556-563. Alvarez, R., and Klein, H. K. (1989). Information Systems Development for Human Progress? in H. K. Klein and K. Kumar, eds., “Systems Development for Human Progress,” pp. 119. North-Holland, Amsterdam. Andersen, N., Kensing, F., Lundin, J., Mathiassen, L., Munk-Madsen, A,, Rasbech, M., and Sorgaard, P. (1990). “Professional Systems Development: Experience, Ideas and Action.” Prentice-Hall International, Hemel Hempstead, England. Apel, K. (1980). “Towards a Transformation of Philosophy.” Routledge and Kegan Paul, London. Argyris, C. (1971). Management Information Systems: The Challenge to Rationality and Emotionality, Munagement Science 17, B 215 292. Argyris, C. ( 1 982). “Learning and Action : Individual and Organizational.” Jossey-Bass, San Francisco. Argyris, C., and Schoen, D. (1 978). “Theory in Practice. Increasing Professional Effectiveness.” Jossey-Bass, San Francisco. August, J. (1991). “Joint Application Design: A Group Session Approach to System Design.” Prentice-Hall, Englewood Cliffs, NJ. Auramaki, E., Lehtinen, E., and Lyytinen, K. (1988). A Speech-Act Based Office Modeling Approach, ACM Truns. on Ofice Information Systems 6(2), 126-152. Aurarnaki, E., Hirschheim, R., and Lyytinen, K. (1991). Modeling Offices Through Discourse Analysis: A Comparison and Evaluation of SAMPO with OSSAD and ICN, The Computer Journal, forthcoming. Avison, D., and Fitzgerald, G. (1 988). “Information Systems Development: Methodologies, Techniques and Tools.” Blackwell Scientific, Oxford. Avison, D. E., and Wood-Harper, A. T. (1990). “Multiview: An Exploration in Information Systems Development.” Blackwell Scientific, Oxford. Banbury, J. (1987). Towards a Framework for Systems Analysis Practice, in R. J. Boland and R. A. Hirschheim, eds., “Critical Issues in Information Systems Research,” pp. 79-96. John Wiley & Sons, Chichester, England. Bansler, J., Systemudvikling. (1989). Teori og Hitorie i Skandinavisk Perspectiv. Studenttliteratur, as referenced and summarized in J. Bansler, Systems Development Research in Scdndinavia : Three Theoretical Schools, Scandinavian Journal of Information Systems 1, 3-20. Bansler, J., and Havn, E. (1991). The Nature of Software Work, in P. Vanden Besselaar, A. Clement, and P. Jarvinen, eds., “Information System Work and Organizational Design,” pp. 145-152. North-Holland, Amsterdam. Banville, C. (1990). Legitimacy and Cognitive Mapping: An Exploratory Study of a Social Dimension of Organizational Information Systems. Ph.D. Thesis, Faculte Des Sciences De L’Administration, Universite Laval, Quebec.
Banville, C.. and Landry, M. (1989). Can the Field of MIS Be Disciplined? Communications of’ the AC‘M 32( 1) : 48-60. Bariff, M., and Ginzberg, M. (1982). MIS and the Behavioral Sciences, Data Base 13(1), 19 26. Baskerville, R. (1991). Practitioner Autonomy and the Bias of Methods and Tools, in H.-E. Nissen, H. K. Klein and R. Hirschheim, eds., “Information Systems Research: Contemporary Approaches and Emcrgent Traditions,” pp. 673-697. North-Holland, Amsterdam. Berger. P., and Luckmann, T. (1967). “The Social Construction of Reality.” Doubleday, New York. Bjerknes, G. (1989). “Contradiction a Means for Understanding Situations in System Development [in Norwegian]. Doctoral Dissertation. Dept. of Informatics, University of Oslo, Norway. Bjerknes, G., and Bratteteig, T. (1984). The Application Perspective-Another Way of Conceiving Systems Development and EDP-Based Systems, in M. Saaksjarvi, ed., “Proceedings of the Seventh Scandinavian Research Seminar on Systemeering,” Helsinki. Bjerknes, G., and Bratteteig, T. (1985). FLORENCE in Wonderland- Systems Development with Nurses, paper presented at the Conference on Development and Use of Computer Based Systems and Tools, Aarhus. Bjerknes, G., Dahlbom, B., Mathiassen, L., Nurminen, M., Stage, J., Thoresen, K., Vandebo, P.,and Aaen, I., eds. (1990). “Organizational Competence in Systems Development: A Scandinavian Contribution.” Studentlitteratur, Lund. Sweden. Bjerknes, G., Ehn, P., and Kyng, M., eds. (1987). “Computers and Democracy: A Scandinavian Challenge.” Avebury, Aldershot, England. Rjorn-Andersen, N., Earl, M., Holst, J., and Mumford, E., eds. (1982). “Information Society: For Richer For Poorer.” North-Holland, Amsterdam. Rleicher, J. ( 1982). “The Hermeneutic Imagination.” Routledge, London. Blumenthal, S. (1969). “Management Information Systems, A Framework for Planning and Development.” Prentice-Hall, Englewood Cliffs, NJ. Bodker, S. (1991). Activity Theory as a Challenge to Systems Design, in H.-E. Nissen, H. K. Klein and R. Hirschheim, eds., “Information Systems Research : Contemporary Approaches and Emergent Traditions,” pp. 55 1-564. North-Holland, Amsterdam. Bodker. S.. and Groenbaek, K. (1989). Cooperative Prototyping Experiments-Users and Designers Envision a Dentist Case Record System, in J. Bowers and S. Benford, eds., “Proceedings of the First European Conference on Computer-Supported Cooperative Work,” ECCSCW, pp. 343- 357. London. Bodker, S., and Groenbaek. K. (1991). Design in Action: From Prototyping by Demonstration to Cooperative Prototyping, in J. Greenbaum and M. Kyng, eds., ”Design at Work: Cooperative Design of Computer Systems,” pp. 197 218. Lawrence Erlbaum, Hillsdale, NJ. Bodker, S., Ehn, P., Kammersgaard, J., Kyng, M., and Sundblad, Y. (1987). A UTOPIAN Experience: On Design of Powerful Computer-Based Tools for Skilled Graphic Workers, in G. Bjerknes, P. Ehn and M. Kyng, eds., “Computers and Democracy: A Scandinavian Challenge,” pp. 251 278. Avebury, Aldershot, England. Boehm, B. ( 1976). Software Engineering, IEEE Trans. on Computers (December), 225-240. Bogh-Andersen, P.(1991). A Semiotic Approach to Construction and Assessment of Computer Systems, in H.-E. Nissen, H. K. Klein and R. Hirschheim, eds., “Information Systems Research: Contemporary Approaches and Emergent Traditions,” pp. 465 514. North-Holland, Amsterdam. Bogh-Andersen. P.. and Brattetieg, T., eds. (1988). Computers and Language at Work: The Relevance of Language and Language Use in the Development and Use of Computer Systems. 
SYDPOL Working Group 2. Institute of Informatics, University of Oslo.
Boland, R. (1985). Phenomenology: A Preferred Approach to Research in Information Systems, in E. Mumford, R. Hirschheim, G. Fitzgerald and T. Wood-Harper, eds., “Research Methods in Information Systems,” pp. 193-201. North-Holland, Amsterdam. Boland, R. (1987). The In-Formation of Information Systems, in R. Boland and R. Hirschheim, eds., “Critical Issues in Information Systems Research,” pp. 363-380. John Wiley & Sons, Chichester, England. Boland, R. (1991). Information System Use as a Hermeneutic Process, in H.-E. Nissen, H. Klein and R. Hirschheim, eds., “Information Systems Research: Contemporary Approaches and Emergent Traditions,” pp. 439-458. North-Holland, Amsterdam. Boland, R., and Day, W . (1982). A Phenomenology of System Design, in M. Ginzberg and C. Ross, eds., “Proceedings of the Third International Conference on Information Systems,” pp. 31 46. Ann Arbor, MI. Bostrom, R., and Heinen, S. (1977). MIS Problems and Failures: A Sociotechnical Perspective-Part I: The Causes, MIS Quarterly 1(3), 17-32. Brancheau, J , , Schuster. L., and March, S. (1989). Building and Implementing an Information Architecture, Data Base. 9-17. Braverman, H. (1974). “Labor and Monopoly Capital.” Monthly Review Press, New York. Briefs, U., Cihorra, C., and Schneider, L., eds. (1983). “Systems Design for, with, and by the Users.” North-Holland, Amsterdam. Brooks, F. (1975). “The Mythical Man-Month.” Addison-Wesley, Reading, MA. Bubenko, J. (1986). Information Systems Methodologies-A Research View, in T. W. Olle, H. Sol and A. Verrijn-Stuart, eds., “Information Systems Design Methodologies: Improving the Practice,” pp. 289 31 8. North-Holland, Amsterdam. Budde, R., Kuhlenkamp, K., Mathiassen, L., and Zullighoven, H., eds. (1984). “Approaches to Prototyping.” Springer-Verlag, Berlin. Burch, J., and Strater, F. (1974). “Information Systems: Theory and Practice.” John Wiley & Sons, New York. Burrell, G., and Morgan, G. ( 1979). “Sociological Paradigms and Organizational Analysis.” Heinemann, London. Canning, R. (1956). “Electronic Data Processing for Business and Industry.” John Wiley & Sons, New York. Carlson, J., Ehn, P., Erlander, B., Perby, M., and Sandberg, A. (1978). Planning and Control from the Perspective of Labor: A Short Presentation of the DEMOS Project, Accounting, Organizations, and Society 3(3 4). Carroll, D. C. (1965). Man-Machine Cooperation on Planning and Control Problems, paper presented at the International Symposium on Long Range Planning for Management, UNESCO, Paris, September 20-24. Carroll, D. C. (1967). Implications of On-Line, Real Time Systems for Management Decision Making, in C. A. Myers, ed., “The Impact of Computers on Management.” MIT Press, Cambridge, MA. Chaffee, E. (1985). Three Models of Strategy, Academy of Management Review 10(1), 89-98. Chase, S. (1956). Foreword to J. B. Carroll, “Language, Thought, and Reality, Selected Writings of Benjamin Lee Whorf,” pp. v-x. MIT Press, Cambridge, MA. Checkland, P. (1972). Towards a Systems-Based Methodology for Real-World Problem Solving, Journal of Systems Engineering 3(2), 9-38. Checkland, P. ( 1981). “Systems Thinking, Systems Practice.” John Wiley & Sons, Chichester, England. Checkland, P., and Scholes, J. (1990). “Soft Systems Methodology in Action.” John Wiley & Sons, Chichester. Churchman, C. W. (1971). “The Design of Inquiry Systems.” Basic Books, New York.
Ciborra, C. ( I98 I). Information Systems and Transactions Architecture, International Policy Analysis Information Systems 5(4), 305-323. Cleland, D.. and King, W. (1975). “Systems Analysis and Project Management.” McGrawHill, New York. Coad, P., and Yourdon, E. (1990). “Object-Oriented Analysis.” Prentice-Hall, Englewood Cliffs, NJ. Colter, M. (1982). Evolution of the Structured Methodologies, in J. D. Couger, M. Colter and R. Knapp, eds., “Advanced System Development/Feasibility Techniques,” pp. 73-96. J. Wiley & Sons, New York. Cotterman, W., and Senn, J., eds. (1991).“Systems Analysis and Design: A Research Agenda.” John Wiley & Sons, Chichester, England. Cotterman, W., Couger, J. D., Enger, N., and Harold, F., eds. (1981). “Systems Analysis and Design : A Foundation for the 1980s.” North-Holland, Amsterdam. Couger, J. D. (1973). Evolution of Business System Analysis Techniques, ACM Computing Surueys 5(3), 167 198. Couger, J. D. (1982). Evolution of System Development Techniques. In J. D. Couger, M. Colter, and R. Knapp, eds., “Advanced Systems Development/Feasibility Techniques.” John Wiley & Sons, New York. Couger, J. D., Colter, M., and Knapp, R. (1982). “Advanced Systems DevelopmentlFeasibility Techniques.” John Wiley & Sons, New York. Courbon, J. C., and Bourgois, M., (1980). The IS Designer as a Nurturing Agent of a SocioTechnical Process, in H. Lucas, F. Land, T. Lincoln and K. Supper, eds., “The Information Systems Environment,” pp. 139-148. North-Holland, Amsterdam. Daniels, A., and Yeates, D., eds. (1969, 1971). “Systems Analysis,” American ed. Palo Alto, 1971. Original title: “Basic Training in Systems Analysis,” National Computing Centre, England, 1969. Davis, G . ( 1982). Strategies for Information Requirements Determination, IBM Systems Journal 2 l ( l ) , 4-30. DeMaio, A. (1980). Socio-Technical Methods for Information Systems Design, in H. Lucas, F. Land, T. Lincoln and K. Supper, eds., “The Information Systems Environment,” pp. 105122. North-Holland, Amsterdam. DeMarco, T. (1978). “Structured Analysis and Systems Specification.” Yourdon Press, New York. Dennis, A. R., George, J. F.. Jessup, L. M., Nunamaker, J. F. and Vogel, D. (1988). Information Technology to Support Electronic Meetings. Munagement Infbrmation Systems Quarterly. 12(4): 591 624. Dickson, Gary W. (1981). Management information Systems: Evolution and Status. In “Advances in Computers,” Vol. 20, pp, 1-37. Academic Press, Boston. Dickson, G., Senn, J., and Chervany, N. (1977). Research in Management Information Systems: The Minnesota Experiments. Manugement Science 23(9), 91 3-923. Dreyfus, H. (1982). “What Computers Can’t Do.” Harper & Row, New York. Dreyfus, H., and Dreyfus, S. (1986). “Mind over Machine-The Power of Human Intuition and Expertise in the Era of the Computer.” Basil Blackwell, Oxford. Earl, M. (1978). Prototype Systems for Management Information Systems and Control, Accounting, Orgunizations and Society 3(2), 161-170. Ehn, P. ( 1988). “Work-Oriented Design of Computer Artifacts.’’ Arbetslivscentrum, Stockholm. Ehn, P. (1990). Scandinavian Design-on Participation and Skill, pp. 28-30. Invited paper to the Conference on Technology and the Future of Work. Ehn, P., and Kyng, M. (1987). The Collective Resource Approach to Systems Design, In G. Bjerknes, P. Ehn and M. Kyng, eds., “Computers and Democracy: A Scandinavian Challenge,” pp. 17-57. Avebury, Aldershot, England.
Ehn, P., and Kyng, M. (1991). Card Board Computers: Mocking-It-up or Hands-on the Future, in J. Greenbaum and M. Kyng, eds., “Design at Work: Cooperative Design of Computer Systems,” pp. 169-195. Lawrence Erlbaum Associates, Hillsdale, NJ. Ehn, P., and Sandberg, A. (1979). Systems Development: Critique of Ideology and the Division of Labor in the Computer Field, in A. Sandberg, ed., “Computers Dividing Man and Work.” Arbetslivcentrum, Swedish Center for Working Life, Demos project report no. 13. Ehn, P., and Sandberg, A. (1983). Local Union Influence on Technology and Work Organization: Some Results From the DEMOS Project, in U. Briefs, C. Ciborra and L. Schneider, eds., “Systems Design for, with and by the Users,’’ pp. 427-437. North-Holland, Amsterdam. Ehn, P., and Sjogren, D. (1991). From System Descriptions to Scripts of Action, in J. Greenbaum and M. Kyng, eds., “Design at Work: Cooperative Design of Computer Systems, pp. 241-268. Lawrence Erlbaum Associates, Hillsdale, NJ. Ehn, P., Kyng, M., and Sundblad, Y. (1983). The UTOPIA Project: On Training, Technology, and Products Viewed from the Quality of Work Perspective, in U. Briefs, C. Ciborra and L. Schneider, eds., “Systems Design for, with and by the Users.” pp. 439-449. North-Holland, Amsterdam. Episkopou, D. (1987). The Theory and Practice of Information Systems Methodologies: A Grounded Theory of Methodology Evolution. Unpublished Ph.D. Thesis, University of East Anglia. Floyd, C. (1984). A Systematic Look at Prototyping, in R. Budde et al., eds., “Approaches to Prototyping,” pp. 1-18. Springer-Verlag, Berlin. Floyd, C. (1987). Outline of a Paradigm Change in Software Engineering, in G . Bjerknes, P. Ehn and M. Kyng, eds., “Computers and Democracy ---AScandinavian Challenge,” pp. 19121 2. Avebury, Aldershot, England. Floyd, C., Budde, R., and Zuellighoven, H., eds. (1991). “Software Development and Reality Construction.” Springer, Berlin. Forester, J. (1989). “Planning in the Face of Power.” University of California Press, Berkeley. Freire, P. (1971). “Pedagogy of the Oppressed.” Herder & Herder, New York. Cane, C., and Sarson, T. (1979). “Structured Systems Analysis: Tools and Techniques.” Prentice-Hall, Englewood Cliffs, NJ. Glans et al. (1968). “Management Systems.” Holt, Rinehart and Winston, New York. Goldkuhl, G., and Lyytinen, K. (1982). A Language Action View of Information Systems, in M. Ginzberg and C. Ross, eds., “Proceedings of the 3rd International Conference on Information Systems,” pp. 13-30. Ann Arbor, MI. Grant, D. (1991). Towards an Information Engineering Approach for Computer Integrated Manufacturing Information Systems. Ph.D. Dissertation, State University of New York, Binghamton. Grant, D. A,, Klein, H. K, and Ngwenyama, 0. (1991). Modeling for CIM Information Systems Architecture Definition : An Information Engineering Case Study. School of Management, SUNY-Binghamton, Work Paper. Greenbaum, J., and Kyng, M., eds. (1991). “Design at Work: Cooperative Design of Computer Systems.’’ Lawrence Erlbaum Associates, Hillsdale, NJ. Groenbaek, K. (1989). Extending the Boundaries of Prototyping-Towards Cooperative Prototyping, in S . Boedker, ed., “Proceedings of the 12th IRIS-Part I,” DAIM PB-296-1, pp. 219-238. Aarhus. Habermas, J. (1 973). “Legitimation Crisis.” Heinemann, London. Habermas, J. (1984). “The Theory of Communicative Action.” Beacon Press, Boston. Harsanyi, J. C. (1962). Measurement of Social Power, Opportunity Costs and the Theory of Two-Person Bargaining Games, Behavioral Science, 67. 
Hawgood, J., ed. (1982). “Evolutionary Systems Development.” North-Holland, Amsterdam.
Hedberg, B., and Jonsson, S. ( 1978). Designing Semi-confusing Information Systems for Organizations in Changing Environments, Accounting, Organizations and Society 3( I), 47 64. Hedberg, R.,Nystrom, P., and Starbuck. W. (1976). Camping on See-saws: Prescriptions for Self-Designing Organizations.” A S Q 24, 40-63. Held, D. (1980). “Introduction to Critical Theory: Horkheimer to Habermas.” University of California Press, Berkeley. Hirschheim, R. (1983). Assessing Participative Systems Design : Some Conclusions from an Exploratory Study, Inf’ormaliun & Management 6(6), 317 327. Hirschheim, R. ( I 985). User Experiences with and Assessment of Participative Systems Design, MISQ 9(4), 295-303. Hirschheim, R. (1986). Participative Systems Design: User Experience, Evaluation and Conclusions, Australian Compurer Journal 18(4), 166-1 73. Hirschheim, R., and Klein, H. (1989). Four Paradigms of Information Systems Development, Communications uf’the A C M 32(10): 1199- 1216. Hirschheim, R., and Klein, H. (1991a). A Research Agenda for Future Information Systems Development Methodologies, in W. Cottermann and J. Senn, eds., “Systems Analysis and Design: A Research Agenda.” John Wiley & Sons, Chichester, England. Hirschheim, R., and Klein, H. (1991b). Implementing Critical Social Theory Principles in Information Systems Development: The Case of ETHICS. University of Houston, Information Systems Research Center Working Paper. Hirschheim, R., and Newman, M. (1988). Information Systems and User Resistance: Theory and Practice, The Computer Journal31(5), 1-11. Hirschheim, R., Klein, H., and Lyytinen. K. (1991). Control, Sense-Making and Argumentation in Information Systems. Informations Systems Research Center (ISRC) Working Paper Series, University of Houston. Hirschheim, R., Klein, H., and Newman, M. (1991). Information Systems Development as Social Action : Theoretical Perspective and Practice, OMEGA 19(6), 587 608. Holmqvist, B., and Bogh-Andersen, P. (1987). Work Language and Information Technology, Journal of’ Pragmatics 1 , 327 357. Howard, R . (1985). UTOPIA: Where Workers Craft New Technology, Technology Review. 28(3), 43 49. Iivari, J. (1982).Taxonomy of the Experimental and Evolutionary Approaches to Systemeering, in J. Hawgood, ed., “Evolutionary Information Systems,” pp. 105-1 19. North-Holland, Amsterdam. Iivari, J. (1984). Prototyping in the Context of Infomation Systems Design, in R. Budde, K. Kuhlenkam, L. Mathiassen and H. Zullighoven, eds., “Approaches to Prototyping,” pp. 261 277. North-Holland, Amsterdam. livari, 1. (1989). A Methodology for IS Development as Organizational Change: A Pragmatic Contingency Approach, in H. Klein and K. Kumar, eds., “Systems Development for Human Progress,” pp. 197-2 17. North-Holland, Amsterdam. Ives, B., Hamilton, S., and Davis, G. (1980). A Framework for Research in Computer-Based Management Information Systems, Management Science 26(9), 910-934. Ives, B., and Ledrmonth, G. (1984). The Information System as a Competitive Weapon, Communications of’rhe ACM 27(12), 1193-1201. Jepsen, L., Mathiassen, L., and Nielsen, P. (1989). Back to Thinking Mode: Diaries for the Management of Information Systems Development Projects, Behauior and Information Technology 8(3), 207-217. Kaula, R. (1990). Open System Architecture. Ph.D. Dissertation, State University of New York, Binghamton.
Keen, P. (1980). Adaptive Design for Decision Support Systems, Data Base 1-2 (Fall), 12-25. Keen, P. (198 I). Information Systems and Organizational Change, Communications of the ACM 24(1), 24- 33. Keen, P., and Scott Morton, M. (1978). “Decision Support Systems: An Organizational Perspective.” Addison-Wesley, Reading, MA. Kendall, K., and Kendall, J. (1 988). “Systems Analysis and Design.” Prentice-Hall, Englewood Cliffs, NJ. Kerola, P. (1985). On the Fundamentals of a Human-Centered Theory for Information Systems Development, “Report of the Eighth Scandinavian Seminar on Systemeeing,” Part I, pp. 192210. Aarhus, Aug. 14-16 Kerola, P. (1987). Integration of Perspectives in the Conception of Office and Its Systems Development, in P. Jarvinen, ed., “The Report of the 10th Scandinavian Seminar,” pp. 369392. University of Tampere, Finland. King, W. R., and Srinivasan, A. (1987). The Systems Life Cycle and the Modern Information Systems Environment, in J. Rabin and E. M. Jackowski, eds., “Handbook of IS Resource Management,” pp, 325 343. Basil Marcel, New York. Kirsch, W., and Klein, H. K. (1977). “Management Information Systems 11: On the Road to a New Taylorism?’ Kohlhammer Urban-Taschenbuecher, Berlin (in German : Managementinformationssystems I1 : Auf dem Weg zu einem neuen Taylorismus?). Klein, H., and Hirschheim, R. (1985). Fundamental Issues of Decision Support Systems, Decision Support Systems 1(1), 5-23. Klein, H., and Hirschheim, R. (1987). Social Change and the Future of Information Systems Development, in R. Boland and R. Hirschheim, eds., “Critical Issues of Information Systems Research,” pp. 275-305. John Wiley & Sons, Chichester, England. Klein, H. K . , and Hirschheim, R. (1991). Rationality Concepts in Information Systems Development Methodologies. Accounting, Management and Information Technologies 1(2), 157 187. Klein, H. K., and Lyytinen, K. (1985). The Poverty of Scientism, in E. Mumford, R. Hirschheim, G. Fitzgerald and T. Wood-Harper, eds., “Research Methods in Information Systems,” pp. 131 ~162.North-Holland, Amsterdam. Klein, H. K., and Lyytinen, K. (1991). Data Modeling: Four Meta-Theoretical Assumptions, in C. Floyd, R. Budde and H. Zuellighoven, “Software Development and Reality Construction.” Springer, Berlin. Kling, R. (1980). Social Analyses of Computing: Theoretical Perspectives in Recent Empirical Research, ACM Computing Surueys 12( I), 61-1 10. Kling, R . (1987). Defining the Boundaries of Computing Across Complex Organizations, in R. Boland and R. Hirschheim, eds., “Critical Issues in Information Systems Research,” pp. 307-362. John Wiley & Sons, Chichester, England. Kling, R., and Iacono, S. (1984). The Control of Information Systems After Implementation, Communications of the ACM 27(12), 1218-1226. Kling, R., and Scacchi, W. (1980). Computing as Social Action: The Social Dynamics of Computing in Complex Organizations, in “Advances in Computers,” Vol. 19. Academic Press, New York. Kling, R., and Scacchi, W. (1982). The Web of Computing: Computing Technology as Social Organization, in “Advances in Computers,” Vol. 21. Academic Press, New York. Kolf, F., and Oppelland, H. (1980). Guidelines for the Organizational Implementation of IS: Concepts and Experiences with the PORGI Implementation Handbook, in N. BjornAndersen, ed., “The Human Side of Information Processing,” pp. 69-87. North-Holland, Amsterdam. Kraft, P. (1977). “Programmers and Managers.” Springer, New York.
Kraft. P., and Bander, J. (1988). The Introduction of New Technologies in Danish Workplaces, paper presented at the Third Technological Literacy Conference, Arlington, VA. Kubicek, H. (1983). User Participation in Systems Design: Some Questions About Structure and Content Arising from Recent Research from a Trade Union Perspective, in U. Briefs, C. Ciborra and L. Schneider, eds., “Systems Design for, with and by the Users,” pp. 3-18. North-Holland, Amsterdam. Kuhn, T. (1970). “The Structure of Scientific Revolutions,” 2nd ed. University of Chicago Press, Chicago. Kuutti, K . (1991). Activity Theory and Its Applications to Information Systems Research and Development, in H.-E. Nissen, H. K . Klein and R. Hirschheim, eds., “Information Systems Research : Contemporary Approaches and Emergent Traditions,” pp. 529 549. North-Holland, Amsterdam. Kyng, M. (1989). Designing for a Dollar a Day, Ofice: Technology and People 4: 51-170. Kyng, M. (1991). Cooperative Design: Bringing Together the Practices of Users and Designers, in H.-E. Nissen, H. Klein and R. Hirschheim, eds., “Information Systems Research: Contemporary Approaches and Emergent Traditions,” pp. 405-41 6. North-Holland, Amsterdam. Kyng, M., and Mathiassen, L. (1982). Systems Development and Trade Unions Activities, in N. Bjorn-Andersen, M. Earl, J. Holst and E. Mumford, eds., “Information Society: For Richer, For Poorer,” pp. 247 260. North-Holland, Amsterdam. Lakoff, G., and Johnson, M. (1980). “Metaphors We Live By.” University of Chicago Press, Chicago. Land, F. (1989). From Software Engineering to Information Systems Engineering. In K. Knight, ed., “Participation in Systems Development,” pp. 9 ~ ~ 3 Kogan 3. Page, London. Land, F., and Hirschheim, R. (1983). Participative Systems Design: Rationale, Tools and Techniques, Journul of’ Applied Systems Analysis 10, 91- 107. Lantz, K. (1986). “The Prototyping Methodology.” Prentice-Hall, Englewood Cliffs, NJ. Lanzara, G. (1983). The Design Process: Frames, Metaphors and Games. In U. Briefs et ul., eds., 1983, pp. 29 40. Lanzara, G., and Mathiassen, L. (1984). Mapping Situations Within a Systems Development Project. DIAMI PB-179, MARS Report 6, Department of Computer Sciences, Aarhus University. Lee, B. (1978). “Introducing Systems Analysis and Design,” Vols. 1 and 2. National Computing Centre, Manchester, England. Lehtinen, E., and Lyytinen, K. (1983). The SAMPO Project: A Speech-Act Based Information Analysis Methodology with Computer Based Tools. Report WP-2, Department of Computer Science. Jyvaskyla University, Jyvaskyla, Finland. Licklider, J. C. R. (1960). Man-Computer Symbiosis, IEEE Trans. on Human Factors in Efecfronics HFE-1, 4. Licklider, J. C. R. (1968). Man-Machine Communication, in C. A. Cuadra, ed., “Annual Review of Information Science and Technology,” 201. Chicago. Lucas, H. (1975). “Why Information Systems Fail.” John Wiley & Sons, New York. Lucas, H. (1978). The Evolution of an Information System: From Key-Man to Every Person, SIoan Mgmt. Rev. 19(2), 39-52. Lyytinen, K. (1986). Information Systems Development as Social Action: Framework and Critical Implications. Unpublished Ph.D. Thesis, University of Jyvaskyla, Finland. Lyytinen, K., and Hirschheim, R. (1988). Information Systems as Rational Discourse: An Application of Habermas’ Theory of Communicative Rationality, Scanilinauian Journal of Management Studies 4( 1 2), 19-30.
Lyytinen, K., and Klein, H. (1985). The Critical Social Theory of Jurgen Habermas (CST) as a Basis for a Theory of Information Systems, in E. Mumford, R. Hirschheim, G . Fitzgerald and T. Wood-Harper, eds., “Research Methods in Information Systems,” pp. 219-232. North-Holland, Amsterdam. Maddison, R., Baker, G., Bhabuta, L., Fitzgerald, G . , Hindle, K., Song, J., Stokes, N., and Wood, J . (1 983). “Information System Methodologies.” Wiley Heyden, Chichester, England. Madsen, K. H. (1989). Breakthrough by Breakdown, in H. Klein and K. Kumar, eds., “Information Systems Development for Human Progress in Organizations,” pp. 41-53. NorthHolland, Amsterdam. Markus, M. L. (1983). Power, Politics and MIS Implementation, Communications ofthe ACM 26(6), 430-444. Markus, M. L. (1984). “Systems in Organizations: Bugs and Features.” Pitman Press, Marsfield, MA. Martin, J. C. (1983). “Managing the Database Environment.” Prentice-Hall, Englewood Cliffs, NJ. Mathiassen, L. ( I 981 ). Systemudvikling and Systemudviklingmetode. Dissertation, Dept. of Computer Science, University of Aarhus, DAIMI PB-I 36. Mathiassen, L., and Bogh-Andersen, P. (1985). Systems Development and Use: A Science of the Truth or a Theory of the Lie, in “Proceedings of Working Conference on Development and Use of Computer-Based Systems and Tools, Part 11,” pp. 351-382. Aarhus, Denmark. Mathiassen, L., and Nielsen, P. A. (1989). Soft Systems and Hard Contradictions-Approaching the Reality of Information Systems in Organizations, Journal of Applied Systems Analysis 16.
Mathiassen, L., and Nielsen, P. (1990). Surfacing Organizational Competence: Soft Systems and Hard Contradictions, in G. Bjerknes, B. Dahlbom, L. Mathiassen, M. Nurminen, J. Stage, K. Theoreson, P. Vandebo and I. Aaen, eds., "Organizational Competence in System Development: A Scandinavian Contribution." Studentlitteratur, Lund.
Mathiassen, L., Rolskov, B., and Vedel, E. (1983). Regulating the Use of EDP by Law and Agreements, in U. Briefs, C. Ciborra and L. Schneider, eds., "Systems Design for, with and by the Users," pp. 251-264. North-Holland, Amsterdam.
McCarthy, T. (1982). "The Critical Theory of Jurgen Habermas." MIT Press, Cambridge, MA.
McLean, E., and Soden, J. (1977). "Strategic Planning for Management Information Systems." John Wiley & Sons, New York.
Millington, D. (1978). "Systems Analysis and Design for Computer Application." Ellis Horwood, Chichester, England.
Mingers, J. C. (1981). Towards an Appropriate Social Theory for Applied Systems Thinking: Critical Social Theory and Soft Systems Methodology, Journal of Applied Systems Analysis 7, 41-49.
Mulder, M. (1971). Power Equalization Through Participation? Administrative Science Quarterly 16(1), 31-38.
Mumford, E. (1981). Participative Systems Design: Structure and Method, Systems, Objectives, Solutions 1(1).
Mumford, E. (1983). "Designing Human Systems - The ETHICS Method." Manchester Business School, Manchester, England.
Mumford, E. (1984). Participation - From Aristotle to Today, in T. Bemelmans, ed., "Beyond Productivity: Information Systems Development for Organizational Effectiveness," pp. 95-104. North-Holland, Amsterdam.
National Computing Centre (1977). "Data Processing Design Standards." National Computing Centre, Manchester.
Numann, J. D., and Jenkins, A. M. (1982). Prototyping: The New Paradigm for Systems Development, M I S Quarterly, 29-44. Naur, P. (1985a). Intuition in Software Development, in H. Ehling et al., eds., “Formal Methods and Software Development.” Springer, Berlin. Naur, P. (1985b). Programming as Theory Building, Microprocessing and Microprogramming 15, 253 261. Newell, A,, and Simon, H. ( 1972). “Human Problem Solving.” Prentice-Hall, Englewood Cliffs, NJ. Newman, M. (1985). Managerial Access to Information: Strategies for Prevention and Promotion, Journal of’Munagement Studies (March), 193 -21 1. Newman, M., and Noble, F. (1990). User Involvement as an Interaction Process: A Case Study, Informalion Systems Research 1( l), 89-1 13. Newman, M., and Rosenberg, D. (1985). Systems Analysts and the Politics of Organizational Control, OMEGA 9, 127 143. Ngwenyama, 0. (1987). Fundamental Issues of Knowledge Acquisition: Toward a Human Action Perspective of Knowledge Acquisition. Ph.D. Dissertation, Watson School of Engineering, State University of New York, Binghamton. Ngwenyama, 0. (1991). The Critical Social Theory Approach to Information Systems: Problems and Challenges, in H.-E. Nissen, H. Klein and R. Hirschheim, eds., “Information Systems Research: Contemporary Approaches and Emergent Traditions,” pp. 267-280. North-Holland, Amsterdam. Nissen, H.-E., Klein, H. K., and Hirschheim, R., eds. (1991). “Information Systems Research: Contemporary Approaches and Emergent Traditions.” North-Holland, Amsterdam. Nygaard, K . (1975). The Trade Unions New Users of Research, Personnel Review 4(2) as referenced on p. 94 in K. Nygaard, “The Iron and Metal Project: Trade Union Participation,’’ in A. Sandberg, ed., “Computers Dividing Man and Work.” Arbetslivcentrum, Stockholm, Sweden. Nygaard, K., and Haandlykken, P. (1981). The System Development Process-Its Setting, Some Problems and Needs for Methods, in H. Hunke, ed., “Software Engineering Environments,” pp. 157-172. North-Holland, Amsterdam. Oegrim, L. (1990a). Re- and De-Construction of What? in R. Hellman er al., eds., “Proceedings of the 13th IRIS Conference,” pp. 547-557, Reports on Computer Science and Mathematics no. 108. Abo Akadenii University, Turku, Finland. Oegrim, L. (1990b). Mao’s Dialectics as a Basis for Understanding Information Systems Development, paper presented at the Aalhorg Workshop, University Center of Aalborg, Denmark. Oliga, J. C. (1988). Methodological Foundations of System Methodologies, Systems Practice 1(1), 87 112. Olle, T. W., Sol, H. G., and Verrijn-Stuart, A. A,, eds. (1982). “Information Systems Design Methodologies: A Comparative Review.” North-Holland, Amsterdam. Olle, T. W., Sol, H. G., and Tully, C. J., eds. (1983). “Information Systems Design Methodologies : A Feature Analysis.” North-Holland, Amsterdam. Olle, T. W., Sol, H. G., and Verrijn-Stuart, A. A,, eds. (1986). “Information Systems Design Methodologies: Improving the Practice.” North-Holland, Amsterdam. Olle, T. W., Verrijn-Stuart, A. A,, and Bhabuta, L., eds., (1988). “Information Systems Design Methodologies: Computerized Assistance During the Information Systems Life-Cycle.’’ North-Holland, Amsterdam. Oppelland, H. ( 1984). Participative Information Systems Development, in T. Bemelmans, ed., ‘‘Beyond Productivity : Information Systems Development for Organizational Effectiveness,” pp. 105 125. North-Holland, Amsterdam. Oppelland, H., and Kolf, F. (1980). Participative Development of Information Systems, in H. Lucas, F. Land, T. Lincoln and K. 
Supper, eds., “The Information Systems Environment,” pp. 238-249. North-Holland, Amsterdam.
Pava, C. (1983). “Managing New Office Technology: An Organizational Strategy.” Free Press, New York. Pettigrew, A. ( 1973). “The Politics of Organizational Decision Making.” Tavistock, London. Pfeffer, J. (1981). “Power in Organizations.” Pitman Press, Boston. Popper, K. (1972). “Objective Knowledge.” Clarendon Press, Oxford. Porter, M., and Millar, V. (1985). How Information Gives You Competitive Advantage. Harvurd Business Review. 63(4), July/August. 149 160. Quine, W. V. (1963). “From a Logical Point of View.” Harper Torch Books, New York. Rosove, P., ed. (1967). “Developing Computer-Based Information Systems.” John Wiley & Sons, New York. Sackman, H. (1967). “Computers, System Science, and Evolving Society.” John Wiley & Sons, New York. Sandberg, A. (1985). Socio-Technical Design, Trade Union Strategies and Action Research, in E. Mumford, R. Hirschheim, G. Fitzgerald and A. T. Wood-Harper, eds., “Research Methods in Information Systems,” pp. 79-92. North-Holland, Amsterdam. Schafer, G., Hirschheim, R., Harper, M., Hansjee, R., Domke, M. and Bjorn-Andersen, N. (1988). “Functional Analysis of Office Requirements : A Multi-Perspective Approach.” John Wiley & Sons, Chichester, England. Schneiderman, B. (1980). “Software Psychology: Human Factors in Computer and Information Systems.” Winthrop, Cambridge. Schoen, D. (1983). “The Reflective Practitioner. How Professionals Think in Action.” Basic Books, New York. Schutz, A. (1967). “Collected Papers I : The Problem of Social Reality.” Martinus Nijhoff, The Hague. Schutz, A,, and Luckmann, T. (1974). “The Structures of Life World.” Heinemann, London. Scott Morton, M. S. (1967). Interactive Visual Display Systems and Management Problem Solving, Ind. Managemen1 Review 9, 69. Scott Morton, M. S. (1971). Strategy for the Design and Evaluation of an Interactive Display System for Management Planning, in C. H. Kriebel, R. L. van Horn and T. J. Heames, eds., “Management Information Systems: Progress and Perspectives.” Carnegie Mellon University Press, Pittsburgh. Senn, J. (1989). “Analysis & Design of Information Systems,” 2nd ed. McGraw-Hill, New York. Shaw, M. (1990). Prospects for an Engineering Discipline for Software, IEEE Computer 7(6), 15--24. Silverman, D. (1970). “The Theory of Organizations.” Heinemann, London. Truex, D. (1991). Emergent Systems Theory as a Basis for Information Systems Development. Dissertation draft manuscript, State University of New York, Binghamton. Truex, D., and Klein, H. (1991). A Rejection of Structure as a Basis for Information Systems Development, in R. Stamper et al., eds., “Collaborative Work, Social Communications and Information Systems,” pp. 21 3-235. North-Holland, Amsterdam. Ulrich, W. (1977). The Design of Problem-Solving Systems, Munagement Science 23( lo), 10991108. Ulrich, W. (1988). Systems Thinking, Systems Practice and Practical Philosophy: A Program of Research, Systems Practice 1(1), 137-163. Venable, J. (1991a). An Enhanced Conceptual Data Model with a Combined Object-Oriented and Relational Implementation. Working Paper, State University of New York, Binghamton. Venable, J. (1991b). A Method and Architecture for Tool Integration in a CASE Environment. Ph.D. Dissertation draft, State University of New York, Binghamton. Vogel, D., Nunamaker, J. F., Martz, B., Grohowski, R., and McGoff, C. (1990). Electronic Meeting System Experience at IBM, Journal of Management Information Systems 6(3), 2543.
Vonnegut, K. (1988). "Player Piano." Dell Publishing, New York.
Weinberg, V. (1980). "Structured Analysis." Prentice-Hall, Englewood Cliffs, NJ.
Welke, R. J. (1983). IS/DSS: DBMS Support for Information Systems Development, in C. Holsapple and A. Whinston, eds., "Data Base Management." Reidel, Dordrecht, The Netherlands.
Wilensky, H. L. (1967). "Organizational Intelligence." Basic Books, New York.
Winograd, T., and Flores, F. (1986). "Understanding Computers and Cognition." Ablex Publishers, Norwood, NJ.
Wood-Harper, A. T., and Fitzgerald, G. (1982). A Taxonomy of Current Approaches to Systems Analysis, The Computer Journal 25(1).
Wood-Harper, T., Antill, L., and Avison, D. (1985). "Information Systems Definition: The Multiview Approach." Blackwell Scientific, Oxford.
Yourdon, E. (1989). "Modern Structured Analysis." Prentice-Hall, Englewood Cliffs, NJ.
Yourdon, E., and Constantine, L. (1979). "Structured Design: Fundamentals of a Discipline of Computer Program and Systems Design." Prentice-Hall, Englewood Cliffs, NJ.
AUTHOR INDEX

Numbers in italics indicate the pages on which complete references are given.
A
Baker, G., 294,296,306,388 Bakard, D. H., 60,108 Banbury, J., 302,381 Bander, J., 304, 324, 326, 356,381,387 Banville, C., 294,350,381-382 Bariff, M., 335,382 Barnes, G. H., 123,152 Bamett, J. R., 13,54 Baru, C. K., 284 Baskerville, R., 323,382 Basu, A., 70,107 Batcher, K. E., 121, 123, 126,152, 187,229 Batory, D. S., 13,54 BBN Laboratories, 140, 152 Belknap, R., 93, 108 Benson, A. N., 288 Bergen, J. R., 63,109 Berger, P., 312,335,382 Bergh, H., 204,229 Berkovich, S . Y., 229 Bernstein, R., 284 Bernstein, S., 86,108 Berra, P. B., 109, 165, 213, 226,229 Beteem, J., 123,152 Beveridge, J. R., 284 Bhabuta, L., 294,296,306,388,390 Bhanu, B., 85,102,108 Bhargava, B., 284 Bhuyan, L. N., 132,134,139,153 Bic, L., 179,229 Bigelow, J., 23,55 Biggerstaff, T. J., 1,23,30,55 Billingsley, F. C., 284 Bjerknes, G., 294,302,382 Bjorn-Andersen, N., 302,339,351, 365, 382,391 Blair, G. M., 185, 193, 204, 223,229 Blake, A., 65, I08 Blanchard, A. J., 86,111 Blanchard, B. J., 86,111 Blaser, A., 284 Bleicher, J., 334,382
Aaen, I., 294,382 Achterberg, J., 366,381 Adams, G. B., 139,152 Aggarwal, J. K., 69-71,73-78,83-84, 91-105,107-111 Agin, G. J., 271,273,283,286 Agrawal, D. P., 134, 139, 152-153 Agresti, W., 297,381 Alavi, M., 300,381 Albrecht, J., 341,381 Aldridge, N. B., 165,234 Alemany, J., 287 Allen, G. R., 132,135, I52 Alrnasi, G. S., 174,229 Aloimonos, J., 70,107 Alvarez, R., 363,381 Amitai, Z., 208, 211,235 Amlani, M. L., 287 Amould, E., 129,152 Andersen, N., 302,312,319,340,356-358, 361-364,381 Anderson, G. A., 126,152 Andreau, A. G., 176,229 Annaratone, M., 129,152 Antill, L., 302, 392 Apel, K., 360,381 Arango, G., 54 Argyris, C., 298,337,359, 361,381 Arkin, R. C., 93,107 Arvind, 145,152 Asai, F., 173,234 Asar, H., 100, 104, I07 August, J., 381 Auramaki, E., 304, 366,381 Avison, D., 294,296,302,339,351,381,392 Ayache, N., 69,107,269,283
B Backus, J., 163,229 Baker, D., 76,78,93,107-108
Blumenthal, S., 297,382 Boahen, K. A,, 176,229 Bodker, S., 303,345,347-348,366,382 Boehm, B., 299,382 Bogh-Andersen, P., 294, 302, 338, 364,382, 386,389 Boland, R., 302-303,312,335,362,383 Bolc, L., 284 Bolle, R., 98, 108 Bolles, R. C., 269,284 Bonar, J. G., 171,229 Borkar, S., 129,153 Bostrom, R., 301,383 Boult, T. E., 70, 91, 102,110 Bourgois, M., 341,346-347,370-371,384 Boursier, P., 284 Bovik, A. C., 83,109 Boyter, B. A,, 84,110 Brachman, R. J., 23,55 Brailsford, D. F., 160, 228,229 Brancheau, J., 332,337,383 Brantley, W. C., 140,155 Bratteteig, T., 294, 302,382 Braverman, H., 313,365,383 Brenner, K.-H., 165,229 Bridgeland, M. T., 286 Bridges, T., 149,155 Briefs, U.,319,324,383 Briggs, F., 153 Briggs, F. A,, 136-137,153 Brinkley, J., 291 Brolio, J., 284 Brooks, F. P., 1,55, 333, 383 Brooks, M. J., 65,109 Brownhndge, D. R., 143,145,156 Brown, C., 202,230 Brown, C. M., 60,108 Brown, R. M., 123,152 Browse, R. A,, 111 Brule, M., 172,231,233 Bryant, A. L., 292 Bubenko, J., 337,383 Buchmann, A. P., 247,285 Budde, R., 300,325,370,383,385 Bulthoff, H., 64-65, 70, 88,111 Burch, J., 298,383 Burkhard, W. A,, 180,235 Burnard, L., 195,230 Bunell, G., 306,319,383 Bursky, D., 208,230
C Cain, R. A., 269,284 Califano, A., 98,108 Canning, R., 297,383 Cardenas, A. F., 249,255,264,285,287-289 Carlson, J., 303,383 Carroll, D. C., 341,383 Carvey, P. P., 140, I56 Cass, T., 64-65, 70, 88,111 Castelli, E., 284 Cathey, W. T., 165,229 Caulfield, H. J., 165,229 Chae, S . 4 , 170,230 Chaffee, E., 338,383 Chang, E., 287 Chang, N. S., 251,284 Chang, S. K., 250,280-282,284-285, 288-289,291 Chase, S., 336,383 Checkland, P., 296,302,319,322,327,331, 350-351,372-373,383 Chellappa, R., 66,109, 111 Chen, C . T., 173,230 Chen, P.P. S., 241,285 Cheng, Y., 285 Cherri, A. K., 171,230 Chien, C.-H., 84, I10 Chien, Y. T., 285,291 Chikofsky, E. J., 31,55 Chin, R.T., 86,110,285 Chisvin, L., 227-228,230 Chock, M., 255,285 Chow, L.-W., 195,233 Chu, C. C., 71,92,103,108 Chu, Y., 171,230 Chuang, P. J., 285 Chung, S. M., 190,229 Churchman, C. W., 352,383 Cihorra, C., 318-319,324,383-384 CIE, 78,108 Clarke, T. J. W., 149,153 Clauer, C. R., 291 Cleland, D., 298,384 Clemens, D. G., 173 Coad, P., 357,384 Cohn, R., 129,153 Colestock, M., 132, 135, 153 Colter, M., 294,384 Comte, D., 145, I55 Conklin, J., 23,55
AUTHOR INDEX Constantine, L., 299,392 Control Data Corp., 120, 253 Cordonnier, V., 170,230 Cornish, M., 145,153 Corre, J. L., 285 Correlf, S., 140,156 Cotterman, W., 294, 296,384 Cotton, C. J., 204-205, 232 Couger, J. D., 294,296,298-299,384 Courant, R., 91,108 Couranz, G. R., 126,153 Courbon, J. C., 341,346-347,370-371 ,.384 COX,B., 22-23,33,55 Cox, G., 129,153 Crane, B. A., 126,153 Cromwell, R . L., 94,109 Crowther,W. R., 140,156 Cusumano, M. A,, 20-21,55
D Dahlbom, B., 294,382 Daniels, A,, 298,384 Danielsson, P. E., 285 Darlington, J., 149,153 Dasgupta, S., 117,153 da Silva, J. G. D., 173,233 Davidson, E. S., 136, 140,155 Davis, A. L., 145,153 Davis, E. W., 213,230 Davis, G., 314,341-342,352,384 Davis, L. S . , 285 Davis, N. J., 134-135, 142,156 Davis, W. A,, 285 Day, W., 302,383 Dayal, U.,247,285 Decegama, A. L., 174,230 deFigieiredo, R. J. P., 84-85,98, 102,111 Degloria, S. D., 86,108 DeMaio, A,, 301,384 DeMarco, T., 299,321, 367,384 Denneau, M., 123, I52 Dennis, A. R., 384 Dennis, J. B., 145,153 Denyer, P. B., 193,223,229 Dharsai, M., 123,155 Dhond, U. R., 69, I08 Dickson, Gary W., 295,384 Dimitroff, D. C., 285 Di Zenzo, S., 86,108
Dolecek, Q. E., 149,153 Domke, M., 302,339,351,391 Dongana, J. J., 153 Drake, 9.L., 129,153 Draper, B. A., 284 Dreyfus, H., 347,360,384 Dreyfus, S., 247,384 Drumheller, M., 64-65, 70, 88, 111 D’Souza, D., 23,55 Dubois, M., 136-137,153 Duckworth, J., 227-228, 230 Duckworth, R.J., 228,229 Duda, R. O., 60,108 Duncan, J. S., 101,108 Dunlay, R. T., 103,108 Durrant-Whyte,H. F., 88-90,108 Durrieu, G., 145,155 Dyer, C., 285
E Earl, M., 300,341,365,370,382,384 Early, K., 171, 191,234 Ehn, P., 294,303-304,313,319,322,324-326, 345,347-349,356,366,315,371,382-385 Eich, M. H., 169,231 Eichmann, G., 170,230 Elfes, A., 85, 110 Ellis, M. A,, 22-23,33,55 Encore Computer Corp., 140,253 Eneland, J., 204,229 Enger, N., 296,384 Enomoto, T., 161,202,204,232 Episkopou, D., 297,385 Erlander, B., 303,383 ETA Systems, Inc., 253
F Fahlman, S. E., 224,230 Fai, W. S., 93, I09 Faloutsos, C., 262,267,290 Farhat, N. H., 165,230 Faugeras, 0. D., 269,283 Feldman, J. A., 221-222,230 Feldman, J. D., 187,230 Feng, W. C., 287 Fernandez, R., 287 Fernstrom, C., 192,220,230 Fikes, R., 23,55 Finin, T., 23,55
Finnila, C. A., 126,153 Fisher, A. S., 31,55 Fitzgerald, G., 294, 296, 298,306, 351,381, 388,392 Flores, F., 336,347, 362,392 Floyd, C., 300,325,341-342, 356, 371,385 Flynn, M. J., 115,153, 194,230 Forester, J., 341,385 Foster, C. C., 165, 188, 226,230 Foulser, D. E., 129,154 Frankot, R. T., 66, 109 Freeman, P., 55 Freire, P., 348,385 Fu, C.-C., 170,230 Fu, K. S., 251,284-286,288,291 Fujimura, K., 287 Fukagawa, M., 120, 156 Fukunaga, K., 60,109 Fulmer, L. C., 187,230 Fung, H. S., 125-126,157,235 G Gajski, D. D., 154 Gane, C., 299,367,385 Gannon, D., 135,154 Gardner, W. D., 174,230 Garza, J., 13,54 Gaylord, T. K., 171,232 Geiger, D., 6 4 6 5 , 70, 88,111 Gelfand, J. J., 71, 98, 109,111 Gelly, O., 145, 155 Geman, D., 64,238,109 Geman, S., 64, 88,109 George, D. A., 140,155 George, J. F., 384 Gerhardt, M. S., 126,153 Gettys, J., 19,56 Gil, B., 110 Gilbert, J. P., 179,229 Gillenson, M. L., 178,230 Gillett, W., 64-65, 70, 88,111 Gilrnartin, M. J., 126,153 Gindi, G. R., 101,108 Ginzberg, M., 335,382 Gladstone, P. J. S., 149,153 Glans, 298,385 Gleason, G. J., 271, 273,286 Gleason, S., 129, I53 Glover, R. J., 171,233
Goksel, A. K., 173,230 Goldberg, A., 22,33,55 Goldkuhl, G., 318,385 Gonzalez, R. C., 60,109 Goodman, A. M., 263,285 Goodyear Aerospace Corp., 126,154 Goser, K., 174,230 Gostelow, K. P., 145,152 Goto, Y . , 84,111 Gottlieb, A., 136, 140, 154, 174,229 Grabec, I., 171,231 Graf, H. P., 174,231 Grant, D., 332,339, 385 Grebner, K., 270,286 Greenbaum, J., 294,304,324,326,344,346, 385 Gregory, W., 10,55 Grishman, R., 136, 140,154 Groenbaek,K., 341,344-345,347-348,382, 385 Grohowski, R., 391 Grosky, W. I., 238, 247,250, 256,266,271, 272,274,276,279,286,288 Gross, T., 129,152-153 Grosspietsch, K., 183-184, 204,231 Gullichsen, E., 23,55 Gupta, A,, 250, 265, 284,286 Gurd, J., 145,156 Gurd, J. R., 173,231 Guttman, A., 262, 267,286
H Haandlykken, P., 356,390 Haar, R. L., 280,286 Habermas, J., 304,313,360,385 Hachem, N. I., 190,229 Hager, G., 109 Hall, J., 161, 165,232 Hamming, R. W., 180,231 Hanibuti, T., 191,231 Hanlon, A. G., 163-164,225,231 Hansen, C., 69,107 Hansen, S., 129,155 Hansjee, R., 302,339,351,391 Hanson, A. R., 93,107408,284 Harada, M., 291 Haralick, R. M., 263,279,280,286,289-290 Harmon, S. Y . ,94,109 Harold, F., 296,384
AUTHOR INDEX Harper, M., 302,339,351,391 Harris, S. K., 288 Harsanyi, J. C., 365,385 Hart, P. E., 60, 208 Hartline, P. H., 71,111 Harvey, S. L., 140, I55 Hashizume, M., 191,231 Havn, E., 324,326,381 Hawgood, J., 300,342,370,385 Hayes, J. P., 138, 255 Hays, N., 140,154 Healey, G., 82,109 Hecht-Nielsen, R., 224-225,231 Hedberg, B., 331,386 Hegron, G . ,285 Hein, C. E., 129,154 Heinen, S., 301,383 Held, D., 363,386 Heller, D., 19,55 Henderson, T. C., 93,109 Heng, M., 366,381 Henning, M., 179, 196,232 Hermann, F. P., 202,231 Higbie, L. C., 126,154 Hilbert, D., 91, 108 Hilleringrnann, U.,174,230 Hillis, W. D., 121, 123, 154 Hinckley, K., 19,55 Hindle, K., 294,296,306,388 Hinton, G . E., 224,230 Hirata, K., 161,202, 204,232 Hirata, M., 170,231,235 Hirschheim, R., 294,298, 301-302,304, 307, 310,313,323-324,326,336,339-340, 351,355,359,361,365-366,381, 386-388,390-391 Hockney, R.W., 117,154 Hodges, D. A,, 173,234 Holbrook, R., 179,231 Holmqvist, B., 338,386 Holst, J., 365,382 Hong, J., 279,286 Hooley, A,, 286 Hopkins, R. P., 143, 145,156 Horn, B. K. P., 60, 65,91, 209 Horne, D. A,, 286 Horowitz, E., 55 Hoskins, J., 30,55 Hosoya, H., 165,232 Howard, R., 303,375,386
397
Hsu, F. J., 288 Hu, G., 76,109 Huang, H. K., 286 Hubbard, W. E., 174,231 Huber, H., 183-184,204,231 Hull, R., 241,286 Hurlbert, A., 64-65, 70, 88,111 Hurson, A. R., 169,231 Hutchinson, S. A., 94,109 Huttenhoff, J. H., 126, 153 Hwang, C. H., 285 Hwang, J. N., 146,149,155 Hwang, K., 120,153-254 Hwang, S. S., 76,78,93,207-108
I Iacono, S., 324,387 Ichikawa, T., 251,291 Iisaka, J., 290 Iivari, J., 300, 341-342, 350, 364,386 Ikeuchi, K., 65,91,94,109 Ishii, K., 120, 156 Ishio, K., 202,231 Itano, K., 171,230 Itoh, S., 290 lves, B., 386 Iyengar, S. S., 285-286,289
J Jackel, L. D., 174,231 Jagadish, H. V., 265,274,287 Jain, A. K., 60,109 Jain, R., 250, 265,284,286 Jain, R. C., 270,287 Jalics, P. J., 11, 56 Jean, S. N., 146, 149,155 Jenkins, A. M., 300,341, 370-371,389 Jenkins, R. E., 176,229 Jeong, D.-K., 173,234 Jepsen, L., 302,357,386 Jespers, P. G . A,, 176,234 Jesshope, C. R., 117,154 Jessup, L. M., 384 Johnson, M., 359,388 Jones, A. K., 140,254 Jones, S., 161,231 Jonsson, S., 331,386 Jordan, H. F., 136,154 Jordan, J. R., 83,109
398
AUTHOR INDEX
Joseph, T., 255,287 Jules, B., 63, I09 Jungert, E., 287
K Kadota, H., 161, 173, 198,231 Kagawa, K., 161,173, 198,231 Kain, R. Y., 126,152 Kak,A. C., 60,94,109,111 Kalvin, A., 279,287 Kamibayashi, N., 291 Kammersgaard, J., 303,345,382 Kanade,T.,82,84,94, 103,109-110 Kandle, D. A,, 129,154 Kaneko, M., 165,232 Kant, E., 55 Kapauan, A,, 135,154 Karirn, M. A,, 171,230 Kartashev, S. I., 182,231 Kartashev, S. P., 182,231 Karthik, S., 83, 105, 109 Kashyap, R. L., 285-286,289 Kaspdris, T., 170,230 Kasturi, R., 287 Kathail, V., 145, 152 Kato, M., 123, 152 Kato, T., 287 Kaula, R., 333, 337,386 Kawabe, S., 120,156 Keast, C. L., 202,231 Keen, P., 300,315,324,341,370,386 Kehler, T., 23,55 Keller, R. M., 149,154 Kemper, A,, 55 Kendall, J., 342, 367,387 Kendall, K., 342,367,387 Kender, J . R., 66-67,70,110 Kensing, F., 302, 312, 319, 340, 356-358, 361-364,381 Kerola, P., 347,387 Kibblewhite, E. J., 286 Kim, W., 23,55 Kimura, T., 161,232 King, R., 241,286 King, W., 298,384 King, W. R., 332,387 Kirkhdm, C. C., 173,231 Kirsch, W., 298,387 Kitagawa, H., 291 Kitchel, S. W., 149,155
Kjeldsen, R., 98,108 Kleinfelder, W. J., 140, I55 Klein, H. K., 294,298, 302,304,307, 310, 323-324,326,335,336,339-340,344, 355,359,361,363,365-366,381, 385-388,390-391 Kling, R., 318,324,387 Klinger, A,, 249,255,264,285,287,289 Klinker, G. J., 82,110 Kluge, W. E., 149,154 Knapp, R., 294,384 Knoll, T. F., 270,287 Knuth, D. E., 173,231 Kobayashi, I., 287 Kogge, P., 172,231 Kohonen,T., 122,154,164,170,176,178,180, 185,215,227,231 Kolf, F., 301, 362,387,390 Kolsky, H. G., 86,108 Komuri, S., 173,234 Kories, R., 70, 110 Kothari, S. C., 139, 154 Kozdrowski, E. W., 154 Kraft, P., 304, 326,387 Krambeck, R. H., 173,230 Krotkov, E., 70,110 Kruskal, C. P., 136, 140,154 Kruzela, I., 192, 220,230 Kubicek, H., 303,319, 377,387 Kuck, D. J., 117,123, 136,140,152,154-155 Kudoh, H., 161,173,198,231 Kuehn, J. T., 134-135, 142,156 Kuhlenkamp, K., 300,370,383 Kuhn, T., 325,388 Kumar, K., 84-85,98, 102,111 Kung, F. K., 279,288 KUng, H. T., 126,129,152-153,155 Kung, S. Y., 146,149,155 Kunii, T. L., 250,284-287,291 Kunt, M., 287 Kurita, T., 287 Kuutti, K., 366,388 Kyng, M., 294,303-304,313,319,322, 324-326,344-346,349,356,366,377, 382,384-385,388 L Lakoff, G.,359,388 Lam, M., 129,152,153 Lamdan, Y., 271,278,287
AUTHOR INDEX Land, F., 299,301,388 Landry, M., 294,382 Lang, G. R., 123,155 Lantz, K., 300, 371,388 Lanzara, G., 302,357-359,388 LaRocca, F. D., 173,230 Larson, M., 64-65,70,88,111 Lawrie, D. H., 136,140,154-155 Lea, R. M., 170, 189,197,232 Learmonth, G., 386 Leben, J . , 10,56 Lechovsky, F. H., 23,55 Lee, B., 388 Lee, B. G., 86,110 Lee, C. Y., 186,215,232 Lee, D. L., 232 Lee, E. T., 288 Lee, J. S. J., 232 Lee, S. H., 165,229 Lee, S. Y., 282,288 Lee, W.-F., 228,232 Lee, Y. C., 288 Leeland, S. B., 129,155 Lehtinen, E., 304,366,381,388 Lehtio, P., 176,231 Leifker, D., 260,289 Leist, K., 290 Lerner, E. J., 175,232 Levialdi, S., 285 Levine, M. D., 81,110 Levitan, S. P., 171,229 Levy, S., 161, 165,232 Lewis, D., 138, 140, 156 Licklider, J. C. R., 341,388 Lien, Y. E., 288 Lim, G.-C., 341,381 Lin, B. S., 250,285,288 Lin, C., 232 Lin, M.-H., 94,110 Lincoln, N. R., 118, 120, 155 Lincoln, P., 23,55 Lindsay, I., 165,234 Lindstrom, G., 149,154 Lipovski, G. J., 141-142,155 Lippmann, R. P., 174-175,232 Little, J., 64-65, 70, 88, 111 Liu, S. H., 284 Lo, S. C., 146,149, I55 Lachovsky, F. H., 232, 240,291 Lockemann, P. C., 291
399
Lohman, G. M., 288 Longstaff, F. M., 123,155 Longstaff, P. S., 123,155 Lopresti, D. P., 129,155 Love, H. H., 126,153 Love, Jr., H. H., 197, 216,233 Low, J. R., 222,230 Lubars, M. D., 28,56 Lucas, H., 298,300,388 Luckmann, T., 312,335,359,382,391 Luk, F. T., 129,153 Lum, V. Y., 288 Lundin, J., 302, 312, 319,340, 356-358, 361-364,381 Lundstrom, L.-E., 204,229 Luo, R. C., 94,110 Lustman, F., 69,107 Lyytinen, K., 294, 304,310, 313, 318, 324, 336, 362,365-366,381,385-388
M Maclean, C. D., 149,153 Maddison, R., 294, 296,306,388 Madsen, K. H., 302,347,359,389 Magee, M. J., 84,107,110 Mago, G. A., 149,155 Mai, L.-P., 173,230 Maier, D., 291 Makkuni, R., 288 Malek, M., 141-142,155, 170,233 Malik, S., 97,110 Manola, F., 288 Manola, F. A,, 249-250, 260, 267,289 Manuel, T., 140, I55 March, S., 332,337,383 Markus, M. L., 317,319,389 Max, D., 60, 71,110 Martin, D. W., 86,110 Martin, J., 10, 56 Martin, J. C., 332,389 Martin, M. D., 288 Martz, B., 391 Maryanski, F., 241,289 Mathiassen, L., 294,300,302-303, 312, 319,324,340,355-364,370,381-383, 386,388-389 Matos, V. M., 11,56 Matsumoto, Y., 20,56 Matthies, L., 85,110 Mazumder, P., 183,232
400
AUTHOR INDEX
McAuley, A. J., 204205,232 McAuliffe, K. P., 136, 140, 154-155 McCanny, J. V., 149, I55 McCarthy, D. R., 247,285 McCarthy, TI,363-364,389 McCormick, B. H., 250,285,289 McGoff, C., 391 McGregor, D., 179, 196,232 Mclnnes, S., 179, 196,232 McKeown, Jr., D. M., 288 McLean, E., 389 McMahon, H. O., 164,233 McWhirter, J. G., 149,155 Medioni, G., 271, 278,290 Mehrotra, R., 238, 247, 256,266, 271,272,274, 276,279,286,288 Melton, E. A., 140, 155 Menzi, U., 290 Menzilcioglu, O., 129,152 Metford, P. A. S . , 123, I55 Meyer, B., 22-23,33,56 Meyer-Wegener, K., 288 Millar, V., 391 Miller, L. L., 169, 231 Miller, S . W., 289 Millington, D., 389 Mingers, J. C., 327,389 Minker, J., 165, 226,232 Mintz, M., 109 Miranker, D. P., 141-142,156 Mirsalehi, M. M., 171,232 Misunas, D. P., 145,153 Mitiche, A,, 110 Miura, K., 120,155 Miyake, J., 161, 173, 198,232 Mizutori, T., 287 Moerdler, M. L., 66-67, 70,91, 102,110 Mohan, L., 289 Mole, G. F., 149,156 Moore, B., 129,153 Morgan, G., 306,319,383 Moriarty, J. D., 280,290 Morisue, M., 165, 232 Motomura, M., 161, 202, 204 Moussu, L., 170,230 Mudge,T. N., 138,155,269-270,291 Mulder, M., 324, 364,389 Mulgaonkar, P. G., 280,289-290 Muller, A., 183-184, 204,231 Mumford, E., 301,322,365, 377,382,389
Mundie, C., 140,155 Munk-Madsen, A,, 302,312,319,340, 356358.361-364,381 Murayuma, H., 120,156 Murdocca, M., 161, 165,232 Murray, J. P., 174,232 Murtha, J. C., 169,226,232 N
Nagai, H., 170,231,235 Naganuma, J., 161,232 Nagata, M., 289 Nagel, H. H., 291 Nagy, G., 289,292 NdkdgOme, Y., 173,234 Nakamura, K., 171,233 Nandhakumar, N., 70-71,83,91-92, 94-105,107-111 Naokazu, Y . ,291 Narasimhan, B., 55 Narendra, K. S.,101,108 Nash, J. G., 129,155, 195,233 National Computing Centre, 389 Naumann, J. D., 300,341,370-371,389 Naur, P., 389 Nazif, A. M., 81,110 Neighbors, J. M., 56 Neo, P., 276,286 Newell, A,, 302,390 Newman, E. A,, 71,111 Newman, M., 298,319,324,336,365,386,390 Newman, R., 19,56 Ng, Y. H., 171,233 Ngwenyama, O., 304,313,339,362,364, 385,390 Nielsen, P., 302, 324, 357, 360, 364,386,389 Nikaido, T., 220,233 Nishikawa, H., 173,234 Nishimichi, Y., 161, 173, 198,231 Nisscn, H.-E., 324.390 Norman, A. C., 149,153 Norman, R. J., 56 Norton, V. A,, 140,155, 174,233 Nunamaker, J. F., 384,391 Nunamaker, Jr., J. F., 5 6 Nurminen, M., 294,382 Nye, A,, 19.56 Nygaard, K., 303,356,390 Nystrom, P., 386
AUTHOR INDEX
0 O’Donnell, J. T., 149,155 Oegrim, L., 366,390 Ogden, W. F., 1 , 5 7 O’Gorman, L., 265,287 Ogura, T., 161, 200,232-233 Oh, C. H., 81, 105,111 Ohbo, N., 291 Ohno, T., 173,234 Ohta, Y . ,111 Ohtsuki, T., 170,234 Oja, E., 176,231 Oldfield, J., 172,231,233 Oliga, J. C., 362,390 Olle, T. W., 294, 296,390 Omolayole, J. O., 289 Ooka, H., 161,202,204,232 Oonishi, Y.,289 Oppelland, H., 301, 362, 364,387,390 Oppenheimer, P., 64-65,70, 88,111 O’Reilly, T., 19, 56 Orenstein, J. A., 249-250,260, 267,288-289 Osler, P., 171, 191,234
P Pacheco, M., 174-175,234 Pakzad, S. H., 169,231 Palermo, F., 289 Papachristou, C. H., 171,233 Parhami, B., 165,168, 186,217,226,233 Parker, T., 56 Patel, J. H., 183,232 Patil, S., 149, 154 Patterson, W. W., 218,233 Pava, C., 301,390 Pavlidis, T., 289 Pearl, J., 89, 111 Pearson, J. C., 71,98,109,111 Pease, R. F., 170,230 Peckham, J., 241,289 Penedo, M. H., 5 7 Perby, M., 303,383 Perlis, A. J., 1, 55 Perron, R., 140,155 Peterson, C., 129,153 Peterson, R. M., 98,111 Pettigrew, A., 324,390 Peuguet, D. J., 289 Pfeffer, J., 324, 390
401
Pfister, G. F., 140,155, 174,233 Phillips, B., 289 Pieper, J., 129, I53 Pizano, A,, 249,264,287,289 Plas, A,, 145,155 Poggio, T., 64-65, 70,88,111 Popper, K., 312,390 Porter, M., 391 Potter, J. L., 233 Pouliquen, P. O., 176,229 Powell, J., 56 Pressman, R. S., 56 Prieto-Diaz, R., 31,56 Przytula, K. W., 129,155
Q Quine, W. V., 335,391
R Rabitti, F., 289 Ramesh, N., 279,290 Rankin, L., 129,153 Rasbech, M., 302,312,319,340,356-358, 361-364,381 Rautenbach, P. W., 145,156 Ravishankar, C. V., 284 Ray, W. A., 130,135, I55 Reddaway, S. F., 123,155 Reddy, D. J., 288 Reeve, M., 149,153 Reif, R., 171, 191,234 Reinhardt, S., 120,155 Rettberg, R. D., 140,156 Reuss, H. L., 289 Reuss, J., 250,285 Rhode, M. L., 287 Ribeiro, J. C. D., 172,233 Rich, C., 56 Richter, C., 23,55 Riley, V., 55 Rimmer, M. T., 123,155 Riseman, E., 93,107-108 Robson, D., 22,33,55 Rodger, J. C., 111 Rolskov, B., 319,389 Rosenberg, D., 319,365,390 Rosenfeld, A,, 60,111 Rosenthal, W. D., 86,111 Rosove, P., 297,391
402
AUTHOR INDEX
Roussopoulos, N., 260,262,267,289-290 Rovner, P. D., 221,230 Rowe, L. A,, 56 Roy, J., 13,54 Rudolf, J. A,, 126, 156 Rudolph, L., 136, 140,154 Rueckert, U., 174,230 Russell, R. M., 120,156 Rux, P. T., 126,153
S Sachse, W., 171,231 Sackman, H., 337,391 Sakauchi, M., 290 Samadani, R., 291 Sameh, A. H., 136,140,154-155 Samet, H., 239,267, 290 Sandberg, A,, 303,,313,383,385,391 Sanyal, B., 290 Sarson, T., 299,367,385 Satoh, H., 173,234 Saunders, J. H., 22-23,33,56 Savage, C. D., 170,234 Savitt, D. A,, 197, 216,233 Scacchi, W., 324,387 Schafer, G., 302,339, 351,391 Schalkoff, R. J., 60,111 Scheifler, R. W., 19,56 Scheurich, C., 136137,153 Schlutter,H., 149, 154 Schmolze, J. G., 23, 55 Schmutz, H., 290 Schneider,L., 319, 324,383 Schneiderman, B., 345,391 Schoen, D., 359,381,391 Scholes, J., 327, 350-351, 373,383 Schonberg, E., 279,287 Schreiber,R., 129,154 Schumacher, K., 174,230 Schuster, L., 332, 337,383 Schutz, A,, 359,391 Schwartz, J. T., 271,279,287 Schwarz, P., 140,154 Schwederski, T., 134-135,142,156 Scott Morton, M., 300, 341,370,387,391 Seitz, C. L., 134-135, 156 Selby, R. W., 56 Sellis, T., 262, 267, 290 Senn, J., 294, 342,384,391 Sethi, I . K., 279,290
Shafer, G., 80, 111 Shafer, S . A., 82,110 Shankdr, P., 291 Shan, M. K., 282,288 Shapiro, L. G., 263,279,280,286,289-290 Sharir, M., 279,287 Shaw, D. E., 133,135,142,156 Shaw, M., 297,391 Shaw, S. W., 84-85,98,102,111 Sheth, A., 290 Shi, Q. Y., 280-282,285 Shimogaki,H., 287 Shin, H., 170,233 Shipman, D., 243,290 Shirazi, B., 169,231 Shiu, M., 286 Shiveley, R. R., 126,153 Shoens, K. A,, 56 Shouxuan, Z., 290 Shu, D., 195,233 Siegel, H. J., 134-135, 139, 142, 152, 156 Silverman, D., 307,391 Simchony, T., 66,111 Simon, H., 302,390 Sirletti, B., 176, 234 Sjogren, D., 347-348,385 Skillicorn, D. B., 117,156 Slade, A. E., 164,233 Slotnick, D. L., 123,152, 186,233 Smith, D., 161, 165,232 Smith, D. C. P., 186,234 Smith, 3. B., 23,56 Smith, J. M., 186,234 Snir, M., 136, 140, 154 Snyder, L., 117,134-135,154,156 Snyder, W. E., 170,234 Soden, J., 389 Sodini, C., 171, 191,234 202,231,234 Sodini, C. G., Sol, H. G., 294,296,390 Solorzano, M. R., 94,109 Song, J., 294,296,306,388 Sorgaard, P., 302,312,319, 340,356-358, 361-364,381 Speiser, J . M., 129,153 Spence, C. D., 71,98,109,111 Springsteel,F., 285 Srini, V., 143,145,156 Srinivasan, A,, 332,387 Stage, J., 294,382
AUTHOR INDEX Staib, L. H., 101,108 Stanchev, P., 289 Starbuck,W., 386 Stein, F., 271, 278,290 Stenstrom, P., 137,156 Stentz, A,, 84,111 Sties, M., 290 Stockman, G., 76,109 Stokes, N., 294,296, 306,388 Stokes, R. A., 123,152,155 Stolfo, S., 133, 135, 141-142,156 Stoltzfus,J. C., 288 Stone, H. S., 172,234 Stormon, C. D., 172,212,231,233-234 Strater, F., 298,383 Stroustrup, B., 22-23,33, 55-56 Stucki, P., 290 Stumm, M., 138, 140,156 Stuttgen, H. J., 178, 186, 188, 219, 227,234 Su, S. Y .W., 186,234 Suarez, F. R., 286 Sullivan, W. E., 98,111 Sundblad, Y., 303,319,345,349,377,382,385 Sun Microsystems Corporation, 19,56 Sutton, J., 129,153 Suzuki, K., 170,234 Svensson, B., 192,220,230 Swinehart, D. C., 222,230 Symanski, J. J., 129,153 Symosek, P., 85,102,108 Syre, J. C., 145,155 Szu, H., 165,229
T Takahashi, K., 170,231,235 Takao, Y., 290 Takata, H., 173,234 Tamaru, K., 170,235 Tamesada, T., 191,231 Tamura, H., 291 Tamura, T., 173,234 Tanaka, M., 251,291 Tang, G. Y . , 253,291 Tavangarian, D., 181,234 Taylor, R. H., 222,230 Taylor, R. W., 98,108 Tenenbaum, J. M., 250,287 Teorey, T., 242,291 Terada, H., 173,234 The, K.-S., 23,55
403
Theis, D. J., 154 Therrien, C. W., 60, 88,111 Thomas, R. H., 173,230 Thoresen, K., 294,382 Thurber, K. J., 169, 184,187-188,213,215. 226-22 7,234 To, V. T., 287 Tokudua, T., 173,234 Tomlinson, R. S., 140,156 Toyoura, J., 161, 202, 204,232 Tracz, W., 1,56-57 Treleaven, P. C., 143, 145, 149,156, 174-175, 234 Troop, R. E., 197,216,233 Troullinos, N. B., 213,229 Truex, D., 324,335,344, 366,391 Tsai, C., 171, 191,234 Tsay, M.-S., 173,230 Tseng, P. S., 129,153 Tsichritzis, D., 240,291 Tsujimoto, T., 170,235 Tully, C. J., 294, 296,390 Turney, J. L., 269-270,291 Twichell, B. C., 13,54
U Uchida, K., 120,155 Ulrich, W., 298, 362,391 Unnikrishnan, A,, 291 Urbano, J. A., 129,154 Urbanski, J., 129,153 Utter, Jr., D. F., 288 Uvieghara, G. A,, 173,234 V Vandebo, P., 294,382 Vandemeulebroecke, A., 176,234 van Es, G., 366,381 Vedel, E., 319,389 Vellasco, M., 174175,234 Venable, J., 391 Venkatesh, Y . V., 291 Verleysen, M., 176,234 Verrijn-Stuart, A. A., 294,296,390 Villalba, M., 64-65, 70, 88,111 Villemin, F. Y . , 149,156 Vogel, D., 384,391 VOltz, R. A,, 269-270,291
404
AUTHOR INDEX
Vonnegut, K., 391 Vranesic, Z., 138, 140,156
W Wada, H., 120,156 Wade, J. P., 171, 191, 202,231,234 Wald, L. D., 169, 188,215,226,234 Waldschmidt, D., 227,234 Walker, T., 170,230 Wallis, L., 234 Walser, R., 250,285 Walter, I. M., 291 Wang, K-Y., 135,154 Wang, Y. F., 73-77,102,111 Ward, M., 291 Wartik, S. P., 5 7 Waters, R., 56 Watson, I., 145, 156, 173,231,233 Watson, W. J., 118, 120,156 Wayner, P., 193,212,234 Webb, J. A,, 129,152-153 Webster, D., 30, 55 Weems, C., 195,233 Weide, B. W., 1 , 5 7 Weinberg, V., 299, 391 Weingarten, D., 123,152 Weiss, J., 140, 155 Weiss, S. F., 23,56 Welke, R. J., 296,391 Weller, D., 289 Weyl, S., 250,287 Weymouth, T. E., 250, 265,286 Wheel, L., 209,234 White, H. J., 165,234 White, R., 138, 140,156 Widdoes, L. C., 140,156 Wiederhold, G., 291 Wienshall, D., 64-65, 70, 88,111 Wilensky, H. L., 331,334,340,392 Wiley, P., 135,157 Williams, R. D., 233 Wilnai, D., 208, 211,235 Winograd, T., 336, 347, 362,392
Winsor, D. C., 138,155 Wintz, P., 60, I09 Wiseman, N. E., 233 Wojtkowski, W., 10,55 Wolfson, H. J., 271, 278-279,286-287 Wood-Harper, A. T., 298,302,339,351,381, 392 Wood, J., 294,296,306,388 Wu, C. T., 180,235,288
X Xerox Corporation, 19,57
Y Yamada, H., 161, 170,202,204,231-232,235 Yamada, J., 161,233 Yamada, S., 161,200,232-233 Yamaguchi, K., 291 Yamamoto, H., 191,231 Yamamura, M., 291 Yan, C. W., 282,285 Yang, C. C.,291 Yang, W., 64-65,70,88,111 YdIlg, w. P.,280-282,288 Yang, Y. K., 291 Yasuura, H., 170,235 Yau, S. S., 125-126,157,235 Yeates, D., 298,384 Yost, R. A,, 291 Young, C. J., 126,153 Young, D. A,, 19,57 Yourdon, E., 299,342,357,367-369,384,392
Z Zdonik, S., 291 Zeidler, H. Ch., 227,235 Zieger, R. M., 129,154 Zippel, R., 171, 191,234 Zloof, M. M., 252,292 Zobrist, A. L., 292 Zullighoven, H., 300,325,370,383,385 Zweben, S. H., 1,57

SUBJECT INDEX
A Address-based memory, 162 drawbacks, 163 Advanced Micro Devices Am99C10, content-addressable memory, 206 ALGOL, similarity to LEAP language associative extensions, 222 Ambient scene parameters, 95 Am99C10, Advanced Micro Devices content-addressable memory, 206 APPLE (Associative Processor Programming Language), for STARAN, 214 Application generators, 13-15, 41 Application specific reuse, 25-27,41 Array process, see Processor array Artificial intelligence, computer vision, 62,91-94 Artificial neural networks, 98-99 ASP (associative string processor), 197 software, 216 Aspect graph, 94 Associative database system, see Database system Associative memory, 122-125 bit-serial, 122-123, 125 Associative memory, see Content-addressable memory Associative processing, see Content-addressable memory Assumptions, see also Paradigms, 305 Atomic rectangles, 78, 80-81 AXON, neural network programming language, 225
B Bayesian networks, 89 Bayes theorem, 64 Blackboard approach, multisensory computer vision, 94, 102 Bottom-up processing, 82, 93 Boundary, types, 76 Boundary code, matching, 274 Bounding volume description, 72, 74
Brightness, constancy, 80 Business systems planning, see Information systems architecture
C C, language extensions for associative processing, 223 CAAPP (content addressable array parallel processor), 195 CAD model, 94 CAFS (content addressable file store), 195 CAM, see Content-addressable memory Capacitance, 97, 104 CARM (content-addressable and reentrant memory), 199 CATWOE, 373 Classification systems, 9, 30-31 CMU NAVLAB, 103 COBOL, 10-12 CODGER, 84 Coherent Research CRC32256 contentaddressable memory, 211 Collective resource approach, 303, see also UTOPIA project Color confidence function, 80 constancy, 80 metric space, 78 Communication, see Development of information systems; Information systems architecture; Rationality Component libraries, 9, 30-31, 41 scale, effect on reuse profit and product quality, 41-53 Computer-aided software engineering (CASE), 9,28,30-33, 37,54 Computer vision, see Multisensory computer vision Concatenated code word (CCW), use in database system, 191 Confidence function, 80 Connectionist programming, 224
Conservation of acoustic energy, 98 Constructive representation, 338 Content-addressable memory (CAM), 160,164 applications, 169-170,193,207,211 database systems, 169, 191 LAN ( local area network), 206,209 logic programming, 171-172, 194 networks, computer, 173, 206,209 neural networks, 176 new architectures, 173 PROLOG, 171 architectures, 184 bit-serial, word-parallel, 185 block-oriented, 186 byte-serial, word-parallel, 185 distributed logic memory, 186 fully parallel, 184 trade-offs, 184 word-serial, bit-parallel, 185 ASP (associative string processor), 197 software, 216 association, 177 direct, 177 indirect, 178 bit-serial, 185 CAAPP (content addressable array parallel processor), 195 CAFS (content addressable file store), 195 CARM (content-addressable and reentrant memory), 195 C, language extensions for associative processing, 223 commercial products, 205 Advanced Micro Devices Am99C10,20h Coherent Research CRC32256,211 MUSIC Semiconductor Company LANCAM (MU9C1480), 209 National Semiconductor SONIC, 209 Summit Microsystems SM4k-GPX, 209 connectionist programming, 224 database system, 178 data word arrangement, 166 DBA, database accelerator at MIT, 193, 202 definitions, 164 design considerations, 169, 186 devices and products, 198 DISP (dictionary search processor), 202 DLM, distributed logic memory, 215 expansion, using multiple CAMS, 200 faults, 183
dealing with, 184 testing for, 183 tolerance, 204 GAAP (generic associative array processor), 196 GAPP (geometric arithmetic parallel processor), 196 GRM (generic relational model), 179, 196 Hamming distance, creating CAM with, 181 similarity to DISP character distance, 203 hierarchical associative memory system, 188,191 HYTREM (HYbrid Text-REtrieval Machine), 189 logic-per-track memory, 186 LUCAS (Lund University content addressable system), 192 software, 220 Manchester machine, 173 Massachusetts Institute of Technology (MIT) research, 193 massively parallel computers, 1.73 matching hardware and software, 193 materials, 165 memory allocation, 182 motivation for using, 169 applications, 170 performance improvement, 171 retrieval advantage, 170 multiple matches, resolving during retrieval, 167, 190 neural networks, use of CAM, see Neural networks obstacles, 168 RAPID (rotating associative processor for information dissemination), 217 RCAM (configurable CAM), 204 reliability, 183 signature file, hashed, 190 size, CAM cell, 202 software, for content addressable processing, 212 ALGOL, similarity to LEAP associative language extensions, 222 APPLE (Associative Processor Programming Language), for STARAN, 214 ASP, 216 C, language extensions for associative processing, 223
SUBJECT INDEX LEAP, 221 PASCALIA, 219 PASCAL/L, 220 PW1, language extensions for associative processing, 218 STARAN, 187 software, 213 storage and retrieval, 166, 176, 178, 183 survey of previous literature, 225 Syracuse University CAM research, 172 database system, 190 technology improvements, 173 testing, 183 writing into, 168 Contour imagery, 72, 74-77 occluding, 72, 75-77 Cooperative design, 304 Correspondence problem, 68-69 CRC32256, Coherent Research content-addressable memory, 211 Crossbar, 138-139 bus, 138 Cross-section scattering models, 102 Cues, fusion of multiple from single image, 63-67
D Database architecture generic, 239 image, generic, 245 Database management systems (DBMSs), 10-15,23-25 Database system, 178 flag-algebra, 181 relational database model, see Relational database model Data-flow architecture, 142-145 Data model entity relationship, 241 functional, 243 generic image, 247 object-oriented, 244 relational, 242 DBA, database accelerator at MIT, 193,202 Decision module, 84 Decision tree classifier, 83, 97, 101, 105 Designedgenerator, 27-30, 41 ROSE, an example of, 28-30
Design recovery, see also Reverse engineering, 27,30 DESIRE, a system for, 30 Development of information systems CASE tools for, 329, 331-332 control of, 316,318,331 emancipatory approach, 347-348,354, 365 evolutionary, 341 explicit model of inquiry, 351,353-354 language barriers, 338, 364 language. role of, 335-336 learning, in the development of, 347 ordinary work practices approach, 355-365 participatory, 301, 364 work practices, 357 Dichromatic reflection model, 82 Direct association, 177 drawback, 178 Directed graph, 216 DISP (dictionary search processor), 202 Distributed memory architecture, 130-135 DLM, distributed logic memory, 215
E Edge segments, 85 Emancipation, barriers to general, 307 institutional, 348-349 language, 339-341 Emancipatory ideal, 304,313,347-348 prototyping, 347 End user computing (EUC), 331 Energy balance constraint, 96 Energy function, 64-65, 87 Enterprise schema, see Information systems architecture ETHICS, 301,377 Euler equations, 66,91
F FAOR, 302,339 Fault tolerance, of CAM, 204 Feature vector, 86 Flag-algebra, 181 Focus of attention mechanism, 81,92-93 Focus ranging methods, 70 Forms designers, 15-18, 23,41
SUBJECT INDEX
Fourth generation languages (JGLs), 5, 10-13,17-19,23,25,41,53 Frame-and rule-based systems, 23-24
G GAAP (generic associative array processor), 196 CAPP (geometric arithmetic parallel processor), 196 Gaussian curve, 75 Gaussian random vectors, 88-89 Generic relational model (GRM), 179 GENESIS, 13-15 GRAIN, 250 Graphical user interface (GUI ), 17-20,41 GRM (generic relational model), 179, 196 GUI generators, 18-20,41, see also Interface development kits
H Hamming distance, 180 associative memory, creating with, 1x1 error-correcting codes, 180 geometric analysis, 180 Heat flux, 83,97,100,10.1105 Heat sink, 83, 97 High-level interpretation multisensor fusion, 103 Hill-climbing approach, 101 Hypercube architecture, 133-1 34 Hyper-scale reuse (HSR), 8 Hypertext (or hypermedia), 23-25 Hypothesize-and-verify cycles, 91 HYTREM (Hybrid Text-REtrieval Machine), 189
I Image database, characterization first generation, 250 second generation, 256 third generation, 265 Image processing, 60, 10&102 Image processing sets, 252 Incremental segmentation, 76, 80 Index-based data-driven object recognition, 271-272 global feature-based, 272 local feature-based
feature index tree, primary and secondary memory, 275-276 geometric hashing, 278 point access methods, 274 Indirect association, 178 triple, 178, 216, 222 Information integration, 83 engineering, 337, see also Information systems architecture fusion low levels of processing, 100-102 from multiple views, 68-70 Information systems development, see Development of information systems planning, see Information systems architecture; Methodologies Information systems architecture (ISA), 331-332,339 constructive representation, 337-338 flexibility, 337 organizational learning, 337 Infrared imagery, 83 Inquiry epistemological assumptions, 305-306 model, explicit, in informational systems development, 351,353-354 Integrability, 66 Intensity imagery, 83-84 Interconnection network, 130-135,137-139 hypercube topology, 133-134 mesh topology, 132-133 multistage interconnection network, 139 reconfigurable topology, 134-135 ring topology, 132 scalability, 131-132 tree topology, 133 Interface development kits, 18-20, 41, see also GUI generators Interpreter, 80-82 Interprocess integration, 70 Intrinsic rectangle, 78
J Joinable regions, 80-81 Joint posterior density, 89-90 Joint probability density function, 87
SUBJECT INDEX K KEE, 72,80,92-93
L L*a*b* color space, 78 Ladar imagery components, 71-73 segmenting registered images, 103 Lagrange multiplier method, 90 Lambertian reflectance, 96 LANCAM (MU9C1480), MUSIC Semiconductor Company CAM, 209 Laplacian of Gaussian filter, 76 Large-scale component kits, 22-23, 41 Large-scale reuse (LSR), 8, 10, 15,20,22-23, 25, 27, 41 Laser radar, see Ladar imagery LEAP, language for associative processing, 221 Learning, Kolb’s model, 347 Life cycle approaches, 297 Light striping, 72,75-76 Line edges, configurations, 65 Local orientations, 72 Logical sensors, 93 LUCAS (Lund University content addressable system), 192 software, 220
M Management information systems (MIS), 5, 10-13,19 Manchester machine, 173 Markov random fields, 64-65, 70, 87-88 MARS, 302,356 Massachusetts Institute of Technology (MIT) CAM research, 192 Smart Memory Project, 202 Meaning, denotational theory, and systems development, 335-336 Medium-scale reuse (MSR), 8, 20, 41 Method, 296 Methodologies, see also Specific methodologies classification, 330 concept, 296,355 emergence, historical, 295 evolution, 295, 323-325 generations, 297,323 information systems planning, 327 paradigmatic assumptions, 305
409
paradigmatic influences, 319, 324 participative (internal control), 301, 331, 364, 370 problem-formulation, 301, 350, 372 prototyping, 299,337-338,341-350,370 sense-making, 301,352 socio-technical, 301 soft systems methodology, 302,350-355 structured, 298, 327, 367 summaries, 367 trade-union led, 303,375 MIMD architecture, 129-139 MIMDiSIMD architecture, 1 4 1 4 2 Minimum-distance rule, 87 MIT Vision Machine, 64,70, 102-103 MKS, 93 Model-based object recognition, 268 data driven, 270, see also Index-based data-driven object recognition feature-by-feature, 269-270 hypothesize-and-test, 269 model-by-model, 269 model driven, 270 MRF model, 64-65,70,87-88 MU9C1480 LANCAM, MUSIC Semiconductor Company CAM, 209 Multi-Bayesian techniques, 88-90 Multiperspective modeling, 338 Multiple imaging modalities, 71 Multiple texture analysis modules, 66 Multisensor fusion, see also Ladar imagery computing shape, 65-67 3-D motion from 2-D images, 70 high-level interpretation, 103 information from multiple views, 68-70 infrared and visual imagery, 83 multiple imaging modalities, 71 multispectral imagery, 85 range and intensity imagery, 83-84 range, visual, and odometry, 84 sonar and stereo range sensors, 85 structure lighting and contour imagery, 72, 7677 variational methods, 90-91 visible discontinuity detection, 63-65 Multisensor kernel system, 93 Multisensory computer vision, 59-107, see also Multisensor fusion aggregate features, 103
SUBJECT INDEX
approaches, 61 categories, 60 classification, 105-106 combination of features, 102-103 computational framework choice, 61-62 computational paradigms, 86-99 artificial intelligence, 91-94 classification rules, 86-87 feature vector, 86 Markov random field models, 87-88 multi-Bayesian techniques, 88-90 phenomenological approach, 94-99 sensor fusion, variational methods, 90-91 statistical approaches, 86-90 data-driven, 91 extracted features, 102 fusion at multiple levels, 99-105 feature combination, 102-103 high-level interpretation, 103 low levels of processing, 100-102 paradigm, 103-105 gray-scale images, 61 integrated analysis, thermal and visual imagery, 97-98 model-based, 83, 103-106 model-based paradigm, 103-105 phenomenological approach, 62,9699, 106 stereoscopic perception, 102 thermal image, 95 Multispectral imagery, 85 Multistage interconnection network, 139 MULTIVIEW, 302,339 MUSIC Semiconductor Company LANCAM (MU9C1480) CAM, 209
N National Semiconductor SONIC contentaddressable memory, 209 Networks, see also Neural networks computer, use of CAM, 173,206, 209 Neural networks, 174 artificial, 98-99 AXON, neural network programming language, 225 CAM use, 176 classifier, 175 connectionist programming, 225 neurosoftware, 224 software, 223
Noise characteristics, 101 Nonatomic rectangles, 78-79
0 Object modeling approach, 105 Object-oriented, 23-25, 33-38,54 Object parameters, 95 Occluding contours, 72, 75-77 Occupancy grids, 85 Odometry, 84 Ontology, 31 1 Optical sensors, 84-85 Ordinary work practices approach to information systems development, 355-365
P Paradigms definition, 305 differences, 307, 310, 316 functionalist, 307 neohumanist, 307, 347, 349, 361 radical structuralist, 307-308, 312 social relativist, 307 summary, 308 system development, 306, 314,323 types, 306 Parallelism, 76, 107 PASCAL extensions for associative processing, 2 18 Pattern recognition, 60 Phenomenological approach, multisensory computer vision, 62, 94-99, 106 PICDMS, 255 PORGI, 301,362,364 Posterior likelihood, 89 Probabilities, Gibbsian, 64 PROBE, 260 Processor array, 121-123 bit-plane, 121 Prototyping, types, 341-345,347,371, see also Methodologies mock-ups, use as alternative to prototyping, 349 PSQL, 262 Pyramidal approach, segmentation, 104
Q Query-by-pictorial-example, 252
SUBJECT INDEX
R Radar, 84-85 Range imagery, 84 RAPID (rotating associative processor for information dissemination), 217 Rasterization, 72 Rationality, communicative, 340, 348, 361, 363 RCAM (reconfigurable CAM), 204 Reality construction, social, 335, 338 Rectilinear structures, 78 REDI, 25 1 Reduction machine architecture, 145-146 Regularization approach, 66 Relational database model, 179 complete, 179 generic relational model (GRM), see Generic relational model incomplete, 179 Reliability, content-addressable memory, 183 REMINDS, 256 Reusability apparent, 49-52 broad-spectrum, 5 horizontal, 4,53-54 hyperboles, 2-4 narrow-spectrum, 5 product quality effect, 52-53 success factors, 2-10, 39 implementation technology, 4, 9-10 infrastructure support, 4 , 9 intercomponent standards, 4, 7 market economies of scale, 4,743 narrow domains, 4-5 relationships among, 38-53 slowly changing technology, 4,&7 technology economies of scale, 4,8-9 well-understood domains, 4, 6 vertical, 4, 53-54 Reverse engineering, 27, see also Design recovery Root definition, 373 Roughness, 98 Rule-based image segmentation technique, 81
S
Sapir-Whorf hypothesis, 336
Screen painters, see Forms designers; GUI generators; Interface development kits
Segmentation, 76, 82-83
  pyramidal approach, 104
Shape, computing, 65-67
Shape-from-texture modules, 66, 70
Shape-from-uniform-texel, 66-67
Shape-from-X, 61
Shape similarity, retrieval, 266
Shared memory architecture, 135-139
  cache coherency, 136-137
  data access synchronization, 136
Signature file, hashed, 190
SIMD architecture, 125-129
Skewed T-shaped clusters, 82
Small-scale reuse (SSR), 8, 41
SM4D-GPX, Summit Microsystems content-addressable memory, 209
Soft systems methodology (SSM), 302, 350-355
Software
  for content-addressable memories, 212
  engineering, 297
  factories, 20-22, 41
Sonar sensor, 85, 97-98
SONIC, National Semiconductor content-addressable memory, 209
Spatial proximity graph, 93
Spatial relationships, retrieval
  graph-based, 279
  string-based, 281
SQL, 10, 14
SRI Vision Module, 273
Standards, intercomponent, 38-44, 46-53
STARAN associative processor, 187
  software, 213
Statistical approach, multisensory computer vision, 86-90
Statistical decision rule, 86
Stereographic projection, gradient space, 65-66
Stereo ranging techniques, 70, 85
Stereoscopic depth reconstruction, 68-69
Stereoscopic perception, 102
Strip expansion method, 65
Structural aggregates, 80-81
Structured lighting, 72, 74-77, 102
Summit Microsystems SM4k-GPX content-addressable memory, 209
Superimposed code word (SCW), use in database system, 191
Surface orientation, 65, 67, 72
Syracuse University CAM research, 172, 190
Systolic architecture, 125-129
T
Taxonomy, parallel architecture, 155-157
Technique, concept of, 296
Testing, content-addressable memory, 183
Thermal capacitance, 97
Thermal image, multisensory computer vision, 95, 97, 99-100
TLC, image model, 264
Top-down processing, 82, 93
Trade-union led approaches, 303, 313
Trinocular imaging system, 69
Triple, indirect association, 178, 216, 222
U
Underwater visual imagery, 97-98, 106
Unified modeling scheme, 83
User-oriented information systems, 41
Utility function, 89
UTOPIA project, 303, 375-377

V
Vanishing point, 66-67
Variational method, sensor fusion, 90-91
Vector architecture, 118-119
  pipelining, 118-119
Verification process, 80
Very-large-scale reuse (VLSR), 8, 10, 13, 15, 18, 23, 25, 27, 41
VIMSYS, 265
Visible discontinuity detection, 63-65
VISIONS, 93
Visual imagery, 83, 97-98, 100
W
Waltz-type algorithm, 67
Wavefront architecture, 146-149
Work language, 305, 338, 363

Contents of Volumes in this Series

Volume 1

General-Purpose Programming for Business Applications
    CALVIN C. GOTLIEB
Numerical Weather Prediction
    NORMAN A. PHILLIPS
The Present Status of Automatic Translation of Languages
    YEHOSHUA BAR-HILLEL
Programming Computers to Play Games
    ARTHUR L. SAMUEL
Machine Recognition of Spoken Words
    RICHARD FATEHCHAND
Binary Arithmetic
    GEORGE W. REITWIESNER

Volume 2

A Survey of Numerical Methods for Parabolic Differential Equations
    JIM DOUGLAS, JR.
Advances in Orthonormalizing Computation
    PHILIP J. DAVIS AND PHILIP RABINOWITZ
Microelectronics Using Electron-Beam-Activated Machining Techniques
    KENNETH R. SHOULDERS
Recent Developments in Linear Programming
    SAUL I. GASS
The Theory of Automata: A Survey
    ROBERT MCNAUGHTON

Volume 3

The Computation of Satellite Orbit Trajectories
    SAMUEL D. CONTE
Multiprogramming
    E. F. CODD
Recent Developments of Nonlinear Programming
    PHILIP WOLFE
Alternating Direction Implicit Methods
    GARRETT BIRKHOFF, RICHARD S. VARGA, AND DAVID YOUNG
Combined Analog-Digital Techniques in Simulation
    HAROLD F. SKRAMSTAD
Information Technology and the Law
    REED C. LAWLOR

Volume 4

The Formulation of Data Processing Problems for Computers
    WILLIAM C. MCGEE
All-Magnetic Circuit Techniques
    DAVID R. BENNION AND HEWITT D. CRANE
Computer Education
    HOWARD E. TOMPKINS
Digital Fluid Logic Elements
    H. H. GLAETTLI
Multiple Computer Systems
    WILLIAM A. CURTIN

Volume 5

The Role of Computers in Election Night Broadcasting
    JACK MOSHMAN
Some Results of Research on Automatic Programming in Eastern Europe
    WLADYSLAW TURSKI
A Discussion of Artificial Intelligence and Self-Organization
    GORDON PASK
Automatic Optical Design
    ORESTES N. STAVROUDIS
Computing Problems and Methods in X-Ray Crystallography
    CHARLES L. COULTER
Digital Computers in Nuclear Reactor Design
    ELIZABETH CUTHILL
An Introduction to Procedure-Oriented Languages
    HARRY D. HUSKEY

Volume 6

Information Retrieval
    CLAUDE E. WALSTON
Speculations Concerning the First Ultraintelligent Machine
    IRVING JOHN GOOD
Digital Training Devices
    CHARLES R. WICKMAN
Number Systems and Arithmetic
    HARVEY L. GARNER
Considerations on Man versus Machine for Space Probing
    P. L. BARGELLINI
Data Collection and Reduction for Nuclear Particle Trace Detectors
    HERBERT GELERNTER

Volume 7

Highly Parallel Information Processing Systems
    JOHN C. MURTHA
Programming Language Processors
    RUTH M. DAVIS
The Man-Machine Combination for Computer-Assisted Copy Editing
    WAYNE A. DANIELSON
Computer-Aided Typesetting
    WILLIAM R. BOZMAN
Programming Languages for Computational Linguistics
    ARNOLD C. SATTERTHWAIT
Computer-Driven Displays and Their Use in Man-Machine Interaction
    ANDRIES VAN DAM

Volume 8

Time-Shared Computer Systems
    THOMAS N. PIKE, JR.
Formula Manipulation by Computer
    JEAN E. SAMMET
Standards for Computers and Information Processing
    T. B. STEEL, JR.
Syntactic Analysis of Natural Language
    NAOMI SAGER
Programming Languages and Computers: A Unified Metatheory
    R. NARASIMHAN
Incremental Computation
    LIONELLO A. LOMBARDI

Volume 9

What Next in Computer Technology
    W. J. POPPELBAUM
Advances in Simulation
    JOHN MCLEOD
Symbol Manipulation Languages
    PAUL W. ABRAHAMS
Legal Information Retrieval
    AVIEZRI S. FRAENKEL
Large-Scale Integration - An Appraisal
    L. M. SPANDORFER
Aerospace Computers
    A. S. BUCHMAN
The Distributed Processor Organization
    L. J. KOCZELA

Volume 10

Humanism, Technology, and Language
    CHARLES DECARLO
Three Computer Cultures: Computer Technology, Computer Mathematics, and Computer Science
    PETER WEGNER
Mathematics in 1984 - The Impact of Computers
    BRYAN THWAITES
Computing from the Communication Point of View
    E. E. DAVID, JR.
Computer-Man Communication: Using Graphics in the Instructional Process
    FREDERICK P. BROOKS, JR.
Computers and Publishing: Writing, Editing, and Printing
    ANDRIES VAN DAM AND DAVID E. RICE
A Unified Approach to Pattern Analysis
    ULF GRENANDER
Use of Computers in Biomedical Pattern Recognition
    ROBERT S. LEDLEY
Numerical Methods of Stress Analysis
    WILLIAM PRAGER
Spline Approximation and Computer-Aided Design
    J. H. AHLBERG
Logic per Track Devices
    D. L. SLOTNICK

Volume 11

Automatic Translation of Languages Since 1960: A Linguist's View
    HARRY H. JOSSELSON
Classification, Relevance, and Information Retrieval
    D. M. JACKSON
Approaches to the Machine Recognition of Conversational Speech
    KLAUS W. OTTEN
Man-Machine Interaction Using Speech
    DAVID R. HILL
Balanced Magnetic Circuits for Logic and Memory Devices
    R. B. KIEBURTZ AND E. E. NEWHALL
Command and Control: Technology and Social Impact
    ANTHONY DEBONS

Volume 12

Information Security in a Multi-User Computer Environment
    JAMES P. ANDERSON
Managers, Deterministic Models, and Computers
    G. M. FERRERO DIROCCAFERRERA
Uses of the Computer in Music Composition and Research
    HARRY B. LINCOLN
File Organization Techniques
    DAVID C. ROBERTS
Systems Programming Languages
    R. D. BERGERON, J. D. GANNON, D. P. SHECHTER, F. W. TOMPA, AND A. VAN DAM
Parametric and Nonparametric Recognition by Computer: An Application to Leukocyte Image Processing
    JUDITH M. S. PREWITT

Volume 13

Programmed Control of Asynchronous Program Interrupts
    RICHARD L. WEXELBLAT
Poetry Generation and Analysis
    JAMES JOYCE
Mapping and Computers
    PATRICIA FULTON
Practical Natural Language Processing: The REL System as Prototype
    FREDERICK B. THOMPSON AND BOZENA HENISZ THOMPSON
Artificial Intelligence - The Past Decade
    B. CHANDRASEKARAN

Volume 14

On the Structure of Feasible Computations
    J. HARTMANIS AND J. SIMON
A Look at Programming and Programming Systems
    T. E. CHEATHAM, JR., AND JUDY A. TOWNELY
Parsing of General Context-Free Languages
    SUSAN L. GRAHAM AND MICHAEL A. HARRISON
Statistical Processors
    W. J. POPPELBAUM
Information Secure Systems
    DAVID K. HSIAO AND RICHARD I. BAUM

Volume 15

Approaches to Automatic Programming
    ALAN W. BIERMANN
The Algorithm Selection Problem
    JOHN R. RICE
Parallel Processing of Ordinary Programs
    DAVID J. KUCK
The Computational Study of Language Acquisition
    LARRY H. REEKER
The Wide World of Computer-Based Education
    DONALD BITZER

Volume 16

3-D Computer Animation
    CHARLES A. CSURI
Automatic Generation of Computer Programs
    NOAH S. PRYWES
Perspectives in Clinical Computing
    KEVIN C. O'KANE AND EDWARD A. HALUSKA
The Design and Development of Resource-Sharing Services in Computer Communication Networks: A Survey
    SANDRA A. MAMRAK
Privacy Protection in Information Systems
    REIN TURN

Volume 17

Semantics and Quantification in Natural Language Question Answering
    W. A. WOODS
Natural Language Information Formatting: The Automatic Conversion of Texts to a Structured Data Base
    NAOMI SAGER
Distributed Loop Computer Networks
    MING T. LIU
Magnetic Bubble Memory and Logic
    TIEN CHI CHEN AND HSU CHANG
Computers and the Public's Right of Access to Government Information
    ALAN F. WESTIN

Volume 18

Image Processing and Recognition
    AZRIEL ROSENFELD
Recent Progress in Computer Chess
    MONROE M. NEWBORN
Advances in Software Science
    M. H. HALSTEAD
Current Trends in Computer-Assisted Instruction
    PATRICK SUPPES
Software in the Soviet Union: Progress and Problems
    S. E. GOODMAN

Volume 19

Data Base Computers
    DAVID K. HSIAO
The Structure of Parallel Algorithms
    H. T. KUNG
Clustering Methodologies in Exploratory Data Analysis
    RICHARD DUBES AND A. K. JAIN
Numerical Software: Science or Alchemy?
    C. W. GEAR
Computing as Social Action: The Social Dynamics of Computing in Complex Organizations
    ROB KLING AND WALT SCACCHI

Volume 20

Management Information Systems: Evolution and Status
    GARY W. DICKSON
Real-Time Distributed Computer Systems
    W. R. FRANTA, E. DOUGLAS JENSEN, R. Y. KAIN, AND GEORGE D. MARSHALL
Architecture and Strategies for Local Networks: Examples and Important Systems
    K. J. THURBER
Vector Computer Architecture and Processing Techniques
    KAI HWANG, SHUN-PIAO SU, AND LIONEL M. NI
An Overview of High-Level Languages
    JEAN E. SAMMET

Volume 21

The Web of Computing: Computer Technology as Social Organization
    ROB KLING AND WALT SCACCHI
Computer Design and Description Languages
    SUBRATA DASGUPTA
Microcomputers: Applications, Problems, and Promise
    ROBERT C. GAMMILL
Query Optimization in Distributed Data Base Systems
    GIOVANNI MARIA SACCO AND S. BING YAO
Computers in the World of Chemistry
    PETER LYKOS
Library Automation Systems and Networks
    JAMES E. RUSH

Volume 22

Legal Protection of Software: A Survey
    MICHAEL C. GEMIGNANI
Algorithms for Public Key Cryptosystems: Theory and Applications
    S. LAKSHMIVARAHAN
Software Engineering Environments
    ANTHONY I. WASSERMAN
Principles of Rule-Based Expert Systems
    BRUCE G. BUCHANAN AND RICHARD O. DUDA
Conceptual Representation of Medical Knowledge for Diagnosis by Computer: MDX and Related Systems
    B. CHANDRASEKARAN AND SANJAY MITTAL
Specification and Implementation of Abstract Data Types
    ALFS T. BERZTISS AND SATISH THATTE

Volume 23

Supercomputers and VLSI: The Effect of Large-Scale Integration on Computer Architecture
    LAWRENCE SNYDER
Information and Computation
    J. F. TRAUB AND H. WOZNIAKOWSKI
The Mass Impact of Videogame Technology
    THOMAS A. DEFANTI
Developments in Decision Support Systems
    ROBERT H. BONCZEK, CLYDE W. HOLSAPPLE, AND ANDREW B. WHINSTON
Digital Control Systems
    PETER DORATO AND DANIEL PETERSEN
International Developments in Information Privacy
    G. K. GUPTA
Parallel Sorting Algorithms
    S. LAKSHMIVARAHAN, SUDARSHAN K. DHALL, AND LESLIE L. MILLER

Volume 24

Software Effort Estimation and Productivity
    S. D. CONTE, H. E. DUNSMORE, AND V. Y. SHEN
Theoretical Issues Concerning Protection in Operating Systems
    MICHAEL A. HARRISON
Developments in Firmware Engineering
    SUBRATA DASGUPTA AND BRUCE D. SHRIVER
The Logic of Learning: A Basis for Pattern Recognition and for Improvement of Performance
    RANAN B. BANERJI
The Current State of Language Data Processing
    PAUL L. GARVIN
Advances in Information Retrieval: Where Is That /#*&(a$ Record?
    DONALD H. KRAFT
The Development of Computer Science Education
    WILLIAM F. ATCHISON

Volume 25

Accessing Knowledge through Natural Language
    NICK CERCONE AND GORDON MCCALLA
Design Analysis and Performance Evaluation Methodologies for Database Computers
    STEVEN A. DEMURJIAN, DAVID K. HSIAO, AND PAUL R. STRAWSER
Partitioning of Massive/Real-Time Programs for Parallel Processing
    I. LEE, N. PRYWES, AND B. SZYMANSKI
Computers in High-Energy Physics
    MICHAEL METCALF
Social Dimensions of Office Automation
    ABBE MOWSHOWITZ

Volume 26

The Explicit Support of Human Reasoning in Decision Support Systems
    AMITAVA DUTTA
Unary Processing
    W. J. POPPELBAUM, A. DOLLAS, J. B. GLICKMAN, AND C. O'TOOLE
Parallel Algorithms for Some Computational Problems
    ABHA MOITRA AND S. SITHARAMA IYENGAR
Multistage Interconnection Networks for Multiprocessor Systems
    S. C. KOTHARI
Fault-Tolerant Computing
    WING N. TOY
Techniques and Issues in Testing and Validation of VLSI Systems
    H. K. REGHBATI
Software Testing and Verification
    LEE J. WHITE
Issues in the Development of Large, Distributed, and Reliable Software
    C. V. RAMAMOORTHY, ATUL PRAKASH, VIJAY GARG, TSUNEO YAMAURA, AND ANUPAM BHIDE

Volume 27

Military Information Processing
    JAMES STARK DRAPER
Multidimensional Data Structures: Review and Outlook
    S. SITHARAMA IYENGAR, R. L. KASHYAP, V. K. VAISHNAVI, AND N. S. V. RAO
Distributed Data Allocation Strategies
    ALAN R. HEVNER AND ARUNA RAO
A Reference Model for Mass Storage Systems
    STEPHEN W. MILLER
Computers in the Health Sciences
    KEVIN C. O'KANE
Computer Vision
    AZRIEL ROSENFELD
Supercomputer Performance: The Theory, Practice, and Results
    OLAF M. LUBECK
Computer Science and Information Technology in the People's Republic of China: The Emergence of Connectivity
    JOHN H. MAIER

Volume 28

The Structure of Design Processes
    SUBRATA DASGUPTA
Fuzzy Sets and Their Applications to Artificial Intelligence
    ABRAHAM KANDEL AND MORDECHAY SCHNEIDER
Parallel Architectures for Database Systems
    A. R. HURSON, L. L. MILLER, S. H. PAKZAD, M. H. EICH, AND B. SHIRAZI
Optical and Optoelectronic Computing
    MIR MOJTABA MIRSALEHI, MUSTAFA A. G. ABUSHAGUR, AND H. JOHN CAULFIELD
Management Intelligence Systems
    MANFRED KOCHEN

Volume 29

Models of Multilevel Computer Security
    JONATHAN K. MILLEN
Evaluation, Description and Invention: Paradigms for Human-Computer Interaction
    JOHN M. CARROLL
Protocol Engineering
    MING T. LIU
Computer Chess: Ten Years of Significant Progress
    MONROE NEWBORN
Soviet Computing in the 1980s
    RICHARD W. JUDY AND ROBERT W. CLOUGH

Volume 30

Specialized Parallel Architectures for Textual Databases
    A. R. HURSON, L. L. MILLER, S. H. PAKZAD, AND JIA-BING CHENG
Database Design and Performance
    MARK L. GILLENSON
Software Reliability
    ANTHONY IANNINO AND JOHN D. MUSA
Cryptography Based Data Security
    GEORGE I. DAVIDA AND YVO DESMEDT
Soviet Computing in the 1980s: A Survey of the Software and Its Applications
    RICHARD W. JUDY AND ROBERT W. CLOUGH

Volume 31

Command and Control Information Systems Engineering: Progress and Prospects
    STEPHEN J. ANDRIOLE
Perceptual Models for Automatic Speech Recognition Systems
    RENATO DE MORI, MATHEW J. PALAKAL, AND PIERO COSI
Availability and Reliability Modeling for Computer Systems
    DAVID I. HEIMANN, NITIN MITTAL, AND KISHOR S. TRIVEDI
Molecular Computing
    MICHAEL CONRAD
Foundations of Information Science
    ANTHONY DEBONS

Volume 32

Reusable Software Components
    BRUCE W. WEIDE, WILLIAM F. OGDEN, AND STUART H. ZWEBEN
Object-Oriented Modeling and Discrete-Event Simulation
    BERNARD P. ZEIGLER
Human Factors Issues in Dialog Design
    THIAGARAJAN PALANIVEL AND MARTIN HELANDER
Neurocomputing Formalisms for Computational Learning
    S. GULATI, J. BARHEN, AND S. S. IYENGAR
Visualization in Scientific Computing
    THOMAS A. DEFANTI AND MAXINE D. BROWN

Volume 33
Reusable Software Components
    BRUCE W. WEIDE, WILLIAM F. OGDEN, AND STUART H. ZWEBEN
Object-Oriented Modeling and Discrete-Event Simulation
    BERNARD P. ZEIGLER
Human-Factors Issues in Dialog Design
    THIAGARAJAN PALANIVEL AND MARTIN HELANDER
Neurocomputing Formalisms for Computational Learning and Machine Intelligence
    S. GULATI, J. BARHEN, AND S. S. IYENGAR
Visualization in Scientific Computing
    THOMAS A. DEFANTI AND MAXINE D. BROWN

Volume 34
An Assessment and Analysis of Software Reuse
    TED J. BIGGERSTAFF
Multisensory Computer Vision
    N. NANDHAKUMAR AND J. K. AGGARWAL
Parallel Computer Architectures
    RALPH DUNCAN
Content-Addressable and Associative Memory
    LAWRENCE CHISVIN AND R. JAMES DUCKWORTH
Image Database Management
    WILLIAM I. GROSKY AND RAJIV MEHROTRA
Paradigmatic Influences on Information Systems Development Methodologies: Evolution and Conceptual Advances
    RUDY HIRSCHHEIM AND HEINZ K. KLEIN