Computing and Information Technologies

INTERNATIONAL CONFERENCE ON m COMPUTING AND I EXPLORING EMERGING TECHNOLOGIES & * - Editors George Antoniou & Dor...

Author: George Antoniou | Dorothy Deremer

133 downloads 3253 Views 20MB Size Report

This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!

Report copyright / DMCA form

DOWNLOAD PDF

INTERNATIONAL CONFERENCE ON

m

COMPUTING AND

I

EXPLORING EMERGING TECHNOLOGIES

& * -

Editors

George Antoniou & Dorothy Deremer World Scientific

INTERNATIONAL CONFERENCE ON

COMPUTING AND INFORMATION TECHNOLOGIES EXPLORING EMERGING TECHNOLOGIES

INTERNATIONAL CONFERENCE ON

COMPUTING AND INFORMATION TECHNOLOGIES EXPLORING EMERGING TECHNOLOGIES Montclair State University, NJ, USA

12 Oct 2001

Editors

George Antoniou Dorothy Deremer Montdair State University

V|fe World Scientific wB

New Jersey • London • Singapore Sinqapore •• Hong Kong

Published by World Scientific Publishing Co. Pte. Ltd. P O Box 128, Farrer Road, Singapore 912805 USA office: Suite IB, 1060 Main Street, River Edge, NJ 07661 UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.

COMPUTING AND INFORMATION TECHNOLOGIES Exploring Emerging Technologies Copyright © 2001 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

ISBN 981-02-4759-1

Printed in Singapore by World Scientific Printers (S) Pte Ltd

V

PREFACE Four hundred years ago movable type technology changed European literacy from the privileged minority to the general population. Fifty-five years ago, ENIAC, the world's first general-purpose computer, solved relatively few scientific problems in hours rather than days. Forty years ago integrated circuits became commercially available ushering in the miniaturization and accessibility of computing technology. Pioneers in software and hardware considered more interactive devices and additional problem solving roles for computing in building and studying models, and searching for patterns. Thirty years ago supercomputing sought the highest performance possible while personal computers provided access to even home environments. Ten years ago, networking initiated the definition of a computer as anytime and anyplace. Computing technology has become ubiquitous. Society has changed radically and rapidly not only because of computing technology itself but also because technology transforms the way people think and communicate. This volume samples emerging technologies of the twentyfirst century through the presentation of recent research. Rather than focusing on one domain, the volume illustrates a spectrum of research areas with works in network, internet, and parallel computing, theoretical computing including learning algorithms and fuzzy sets, human computer interaction and computing education, and computing applications in biology, imaging applications, IT, and linguistics. We wish to thank our ICCIT2001 keynote speaker, Dr. Lotfi Zadeh, himself a pioneer in Fuzzy Set technology and its broad applications. We extend our appreciation to the authors whose results are contained in this volume. We planned ICCIT2001 as a forum for academics, computer and information scientists, engineers and applied mathematicians to share ideas about computing and to present their work to the scientific and technical community. We believe this volume is an implementation of that objective. Dorothy Deremer George Antoniou

VII

CONTENTS Preface

v

INTERNET APPLICATIONS

1

Analyzing search engine bias Akira Kawaguchi and Abbe Mowshowitz

3

Autonomous agents for web pages Fldvia Coimbra Delicato, Luci Pirmez andLuiz Fernando Rust Da Costa Carmo

filtering

9

Public key encryption and transparency in internet casinos H.M. Hubey andP.B. Ivanov

15

A review of evaluation tools for web content accessibility Marta Prim

23

A sharable instructable agent for information filtering K.R.K. Murthy, S.S. Keerthi and M.N. Murty

31

COMPUTING IN BIOLOGY

39

Entropy versus information: Is a living cell a machine or a computer ? Jack A. Tuszynski

41

Coupling a tubuloglomerular feedback nephron model with a myogenic afferent arteriole model Roman M. Zaritski, E. Bruce Pitman, Harold E. Layton and Leon C. Moore

55

VIM

A mathematical model of the inner and outer renal medulla M.E. Rosar

63

A natural codon space and metric KM. Hubey

71

HUMAN COMPUTER INTERFACE

79

A case study of a disabled man using a mouthstick and accessibility options in windows to do E-commerce Bruce Davis, Eamon Doherty, Gary Stephenson and Joann Rizzo Usability issues concerning the cyberlink mental interface and persons with a disability Eamon Doherty, Chris Bloor, Gilbert Cockton, Joann Rizzo, Dennis Benigno and Bruce Davis Robotics for the brain injured: An interface for the brain injured person to operate a robotic arm Paul Gnanayutham, Chris Bloor and Gilbert Cockton A computer interface for the "blind" using dynamic patterns David Veal andStanislaw Paul Maj

81

87

93

99

Designing for dyslexia — The development of mathematics software for students with specific learning difficulties Walter Middleton, Emma Lejk and Chris Bloor

105

Using hands-free technology in programs for profoundly disabled children H. ToddEachus and Andrew M. Junker

111

PARALLEL COMPUTING/TECHNIQUES

119

Dynamic load balancing in the parallel continuous global optimization problem by using interval arithmetic A. Benyoub andE.M. Daoudi

121

IX

Investigation of a low-cost high-performance shared-memory multiprocessor system for real-time application Constantine N. Manikopoulos, Sotirios G. Ziavras and Charalambos Christou

127

A linear algorithm to find path among obstacles using reconfigurable mesh Dadjn Wang

137

Dense wavelength division multiplexing for optically interconnected linear array processors Haklin Kimm

143

A two-level optimal distributed monitoring scheme for mesh Dajin Wang Non-linear clustering scheduling with two clusters, one being linear, is NP-hard Wingning Li and John Jingfu Jenq Formal verification of microinstruction sequencing Lubomir Ivanov

151

157

165

Dynamic block data distribution for parallel sparse Gaussian elimination EM. Daoudi, P. Manneback andM. Zbakh

177

All pairs shortest paths computation using Java on PCs connected on local area network John Jingfu Jenq and Wingning Li

185

COMPUTING EDUCATION

193

Enhancing student learning in e-classrooms Jerome Eric Luczqj and Chia Y. Han

195

X

Building up a minimal subset of Java for a first programming course Angel Gutierrez and Alfredo Somolinos

201

Completing a minimal subset of Java for a first programming course Angel Gutierrez and Alfredo Somolinos

207

DEWDROP: Educating students for the future of web development John Beidler

213

Boolean function simplification on a palm-based environment Ledion Bitincka and George Antoniou

221

Internet-based Boolean function simplification using a modified Quine-McCluskey method Sebastian P. Tomaszewski, Ilgaz U. Celik and George E. Antoniou

229

LEARNING ALGORITHMS

237

Autoassociative neural networks and time series filtering Jose R. Dorronsoro, Vicente Lopez, Carlos Santa Cruz and Juan A. Siguenza

239

Neural network architectures: New strategies for real time problems U. Ugena,, F. De Arriaga and M. El Alami

247

Evolving scoring functions which satisfy predetermined user constraints Michael L. Gargano, Ying He and William Edelson

255

Genetic algorithms for mining multiple-level association rules Norhana Bt. Abdul Rahman Araby and Y. P. Singh

261

A clustering algorithm for selecting starting centers for iterative clustering Angel Gutierrez and Alfredo Somolinos

269

XI

Dimension reduction in datamining H.M. Hubey, I. Sigura, K. Kaneko and P.Zhang

275

Process control of a laboratory combustor using neural networks T. Slanvetpan, R.B. Barat and John G. Stevens

283

COMMUNICATION SYSTEMS/NETWORKS

291

Investigation of self-similarity of Internet round trip delay Jun Li, Constantine Manikopoulos and Jay Jorgenson

293

Modified high-efficiency carrier estimator for OFDM communications with antenna diversity Ufuk Tureli and Patrick J. Honan

303

A comparison between two error detection techniques using arithmetic coding Bin He and Constantine N. Manikopoulos

311

An optimal invalidation method for mobile databases Wen- Chi Hou, Hongyan Zhang, Meng Su and Hong Wang Comparison of wavelet compression algorithms in network intrusion detection Zheng Zhang, Constantine Manikopoulos, Jay Jorgenson and Jose Ucles

317

333

INFORMATION TECHNOLOGY/LINGUISTICS

343

The emerging challenge of retaining information technology human resources Rick Gibson

345

XII

Hard-science linguistics as a formalism to computerize models of commnication behavior Bernard Paul Sypniewski

353

B-Nodes: A proposed new method for modeling information systems technology Stanislaw Paul Maj and David Veal

359

The Montclair electronic language learner database Eileen Fitzpatrick and Steve Seegmiller

369

COMPUTING FORMALISM/ALGORITHMS

377

Improvement of synthesis of conversion rules by expanding knowledge representation H. Mabuchi, K. Akama, H. Koike and T. Ishikawa

379

A blocks-world planning system Bhanu Prasad and Vorapat Chavananikul

385

Multi-computation mechanism for set expressions H. Koike, K Akama andH. Mabuchi

391

Proving termination of to rewriting systems Y. Shigeta, K Akama, H. Koike and T. Ishikawa

399

Semantics for declarative descriptions with referential constraints K. Akama, H. Koike and T. Ishikawa

405

Solving logical problems by equivalent transformation K Akama, H. Koike, Y. Shigeta andH. Mabuchi

411

Deciding the halting problem and preliminary applications to evolutionary hardware, and hybrid technology A.A. Odusanya

419

A new algorithm for the computation of invariant curves using arc-length parameterization K.D. Edoh and J. Lorenz

423

ALTUZZY SETS APPLICATION AND THEORY

431

Comparison of interval-valued fuzzy sets, intuitionistic fuzzy sets, and bipolar-valued fuzzy sets Keon-Myung Lee, Kyung-Mi Lee and KrzysztofJ. Cios

433

Introducing user centered design into a hybrid intelligent information system methodology Kate Ashton and Simon L. Kendal

441

Towards hybrid knowledge and software engineering S. Kendal andX. Chen

449

Dynamical computing, communication, development and hierarchical interface H.M. Hubey andP.B. Ivanov

455

IMAGING APPLICATIONS

463

Catadioptric sensors for panoramic viewing R. Andrew Hicks, Ronald K. Perline and Meredith L. Coletta

465

High-performance computing for the study of earth and environmental science materials using synchrotron X-ray computed microtomography Huan Feng, Keith W. Jones, Michael McGuigan, Gordon J. Smith and John Spiletic Author Index

471

481

Internet Applications

3 ANALYZING SEARCH ENGINE BIAS AKIRA KAWAGUCHI AND ABBE MOWSHOWITZ Department of Computer Science, The City College of New York, Convent Avenue at 138* Street, New York, NY 10031, USA E-mail: [email protected], [email protected] This paper is concerned with quantitative measurement of bias in search engines. Bias is measured by comparing the performance of a given search engine to that of a collection of comparable engines. To investigate bias using this measure, a computer-based system that allows for applying the measure to popular search engines on the World Wide Web, has been implemented. The system is accessible at http://wwvi-cs.engr.ccny.cuny.edv/~project. Details of the definition of the measure, the measurement procedure, design of the system, and a discussion of preliminary applications of the measure are reported in [1, 2].

1 Introduction This paper is concerned with measuring bias in search engines. Bias is defined as the degree to which the distribution of URLs, retrieved by a search engine in response to a query, deviates from an ideal or fair distribution for that query. This ideal is approximated by the distribution produced by a collection of search engines. If the collection includes search engines comparable to the one under study, the distribution produced by the collection is a reasonable approximation to the ideal. Like traditional measures of retrieval performance (i.e., recall and precision), bias is a function of a system's response to a query, but it does not depend on a determination of relevance. Instead, the ideal distribution of items in a response set must be determined. Using a collection of search engines to define the ideal makes the measurement of bias computationally feasible in real time. Tests of search engine bias conducted thus far suggest the possibility that bias may be subject-sensitive, i.e., that a search engine exhibiting a high degree of bias on queries related to, say real estate, might be relatively unbiased on questions concerning travel. Results of preliminary experiments designed to test this hypothesis are presented here. 2 Search Engine Bias Bias is one aspect of the quality of the information provided to users by search engines. It is concerned with emphasis, i.e., the balance and representativeness of items in a collection retrieved from a database for a set of queries. Search engines present biased results when some items in the database are selected too frequently and others not frequently enough, or some items are presented too prominently and others not prominently enough. Clearly, "too frequently/prominently" and "not frequently/prominently enough" are relative terms, so it is necessary to establish a norm approximating the ideal or fair distribution. Then bias can be operationalized as the degree to which the distribution of items in a retrieved collection deviates

4

from the ideal. A family of comparable search engines can be used to establish such a norm. This approach is computationally feasible and yields a reasonable approximation to the ideal distribution. [1, 2] The distribution is obtained by computing the frequencies of occurrence of the URLs in the collection retrieved by several search engines for given queries. Two variant measures of bias, one that ignores the order in which the retrieved URLs are presented, and one that takes account of order, are computed by the system. For either variant, the measurement of bias is based on a procedure for obtaining the collection of URLs corresponding to a set of queries processed by a set of search engines. 3 Measurement Procedure Suppose t queries q± (l i-e., the response sets listed in row major form. The number of times each URL occurs among the /?k (1< k < nf) must be tabulated. This can be done by computing atnxK matrix whose kl-th element is 1 if Sk,i contains ai and 0 otherwise. Clearly, the sum Pi of the /-th column is the number of times URL at occurs among the response sets Rlt ... , Rtn. To facilitate comparison, assume the URLs of A are given in non-increasing order of frequency. The vector X=(Ph P2, ... , PK) is called the response vector for the collection of search engines. Similarly, a response vector x = (pi, p2,..., pN)fi>r (a particular engine) E is determined. The bias of an engine E with respect to a set of search engines and a collection of queries (representing a given subject) can be measured as the dissimilarity between the vectors x and X. For purposes of comparison, it is useful to normalize the measure. Bias should be 0 when the results produced by E are 'essentially' the same as those produced by the collection; and 1 when there is 'no agreement'. 9 The following formula for the similarity s(v,w) of vectors v = (v iv .., v„) and w = (w,,..., w„) representing, respectively, queries and documents, has long been used in information retrieval research. [3, 4, 5] s(v,w) =

where all the summations are from i-\ to i=n. {Z(v,)2 X(w;)2}* Taking v and w as vectors passing through the origin in an n-dimensional Euclidean space, this measure can be interpreted as the cosine of the angle between them. It can also be interpreted as the correlation coefficient of v and w, taken as random variables with 0 means. This measure is invariant under normalization, i.e., J(V,W) =

5

$(v/Ev(i),w/Ew(0), where; 5(v,w) = 1 when v = c w, for any real number c\ and .s(v,w) = 1 when.v and w are -orthogonal, i.e., when vt- w, =0 for all i. Thus, defining the bias of an engine E (relativeto'-a collection Q by b(E; q/,..., f,; £j,..., !?«) == 1 s(x,X) gives the extreme values 0 (when^X is a scalar-multiple of x% and 1 (when A} or Xj is -0 for each i). Since users of search engines -tend to- pay more attention 'to-: URLs appearing at the top of a response sequence than they do to those at the bottom, Mas measures should be able to take account -of the order in which URLs appear. A simple way to take account of position is to increment -the-count of a URL-'by- an amount dependent on the position in which it occurs. Possible schemes- include the: following, where m represents the'number-of-positions in a "-response-.sequence.. Increment the-count of a URL in position i-by(l) (m+14)/m (the ,URL in the -first position counts 1, and the URL in the iw-th position count 1/w), or (2) m/i (the URL in the first .position counts m, while the last one counts only 1). 4 The Measurement System A system acting as a meta-search engine [6]-that automatically computes bias- for a set of queries has been developed to facilitate empirical investigation. Currently* the system
Figure 1. Current and Planned System Configuration

6 A servlet-based presentation layer for Web browsers interfaces with these search engines. The main features and options of the system (accessible at http://www-cs.engr.ccny.cuny.edu/~project) are shown in the left part of Figure 2. The results of a particular bias assessment are presented in a series of tables called "Cumulative Bias Measures: after searching [query x]". The bias values in each table are based on the combined results of all the queries shown in square brackets. Four independent values, divided into two groups, are computed. The first two (from left to right) include the engine being assessed in the collection defining the norm; the latter two exclude that engine. "Complete URL" means that the bias computation treats Web pages as the fundamental unit, whereas " Truncated URL" means Web pages from the same Website are taken to be identical. The last column "Average" is the numerical mean of the "Complete URL" for the " Engine Included" and "Engine Excluded" cases. Following the i-th cumulative bias table is a list by search engine of the (complete) URLs retrieved by that engine for the i-th query. A facility for graphing bias values for each engine as a function of the number of queries is also available. 5 Applications of the bias measurement system Preliminary experiments have centered on the ability of the measures to discriminate between search engines, and on the correctness and reliability of the measurement approach. Results obtained thus far show quite clearly that the dissimilarity-based bias measure discriminates effectively between search engines. In a set of experiments designed to explore the sensitivity of bias to subject domain ten search terms were selected for each of the subject areas listed below. A program was used to select words randomly from a Unix spell check utility file. The search terms were used individually as queries in searches performed by seventeen commercial search engines (About, AltaVista, AOLSearch, DirectHit, Excite, FastSearch, Google, Goto, HotBot, ICQIT, Inktomi, Lycos, MSN, NBCi, NorthernLight, Yahoo, WebCrawler). The first thirty URLs retrieved in each search were used in the computation of bias values. Note that the search terms for random-1,2, and 3 were selected at random from the dictionary and are not linked to any specific subject area. 1.

2.

random-1: chimney, explanation, griffin, necropsy, ova, puke, sandman, trapezoid, trio, wipe random-2: consulate, epic, fink, flame, Jacques, horsepower, kettle, Papua, scarf streamside random-3: expositor, frill, hotshot, impartial, liquid, neuroanatomy, nil, perpetuate, quarrymen, Slav euthanasia-1: euthanasia, disability rights, right to die, pro life, kevorkian, doctor death, suicide machine, physician assisted suicide, hemlock society, terminal illness euthanasia-2: mercy killing, voluntary euthanasia, assisted suicide, doctor assisted suicide, pain killer, death with dignity, choice in dying, deadly compassion, life ending treatment, euthanasia agents

7

The right side of Figure 2 shows the plot of the bias values obtained in the random-1 experiment. Three search engines (NorthemLight, Yahoo, and Excite) were chosen for detailed statistical analysis. The results for these three clearly show that the mean bias values of NorthemLight are uniformly larger that those of Excite, and similarly the mean bias values of Excite are larger than those of Yahoo. Analysis of variance was applied using the Minitab system to justify this observation. In the random-1 case the P value in the analysis indicates a measure of the credibility of the null hypothesis such that the differences of the samples are not statistically discernible. Since P is less than the statistical error level 5% (0.05), the null hypothesis is rejected. The confidence intervals in this case also show that the bias values are sampled from statistically different data populations. In other words, it appears that NorthemLight, for example, collects some URLs not found by others, whereas the set collected by Yahoo approximates the norm for all the engines. Similar results are obtained for the random-2 and random-3 cases.

S E A R O

Keyword I Related Set for Search; m

Search With} »select m^prffajt!) Atout AltaVista AClsearcft (BIzRoeket) (CommunifyPotf) DireetHIt (EuroSeek) Excite FastSearcti GartnfoSeek

%^mMm»Vmmm

H

euftianasfa ; rigtttedte pro life disability lights kevarWan -» doctor aealft suicidereserine physician assisted suioic riemtecKsoclety 84;.

3. A*A

4r • *•' ; V/Vr

0.3'.

.

^r-~*R,

%

\

..

A^ I

a M

trmM&x P©" Ett§'msei

%-

83%

^^5 Ready to Seard

-J

ai

20 '

-

-

«

•

i

$

l

r

»

Figure 2. Search on Subject Euthanasia and Bias values for random-1 search

a

tt

8 To investigate whether or not search results in a particular subject area are statistically different for different sets of search terms, bias values were computed for three different searches on the subject of euthanasia (i.e., euthanasia-1, euthanasia-2, and euthanasia-3). These searches were performed using AltaVista, Google, and Lycos. Statistical analysis shows that for each set of search terms the bias values computed for Google are uniformly smaller than those for Lycos, which in turn are uniformly lower than those for AltaVista. However, the results are inconclusive on the question of whether or not the different search-term sets representing a specific subject area significantly affect the bias values for a given search engine. In the case analyzed, the behavior of both AltaVista and Google differ with respect to the choice of the search terms, within the same subject domain. For instance, the analysis of variance for Google's results does not warrant concluding that there are statistically significant differences in bias corresponding to the three search-term sets. An extensive analysis of the bias measures on various subject areas is in progress. The research aims to produce bias profiles for major commercial search engines based on the results of tests using sets of search terms representing selected subject domains. The effect of cumulative search terms on bias values is also being examined. There is limited evidence to suggest that bias may be reduced under certain conditions by accumulating the URLs retrieved for each of several related search terms. Moreover, the cumulative results allow for differentiating more sharply between search engines.

References 1. Mowshowitz, A., and Kawaguchi, A. Bias in information retrieval systems. Proceedings of the Ninth Annual Workshop on Information Systems and Technologies (Charlotte, NC, December 11-12, 1999), 32-37. 2. Mowshowitz, A., and Kawaguchi, A. Assessing bias in search engines. Information Processing and Management, to appear. 3. Becker, B., and Hayes, R.M. (1963). Information Storage and Retrieval: Tools, Elements, Theories. Wiley, New York, 1963. 4. Salton, G. Automatic Information Organization and Retrieval. McGraw-Hill, New York, 1968. 5. Salton, G. and McGill, M.J. Introduction to Modern Information Retrieval. McGraw-Hill, New York, 1983. 6. Liu, J. Guide to meta-search engines. BF Bulletin (Special Libraries Association, Business and Finance Division) 707 (Winter 1998), 17-20.

9

AUTONOMOUS AGENTS FOR WEB PAGES FILTERING

FLAVIA COIMBRA DELICATO, LUCI PIRMEZ AND LUIZ FERNANDO RUST DA COSTA CARMO Nucleo de Computagao Eletronica, Federal University of Rio de Janeiro, Epitacio Pessoa 4476, 803 - Bl. 1, Lagoa, Rio de Janeiro, RJ BRAZIL E-mails: [email protected]; [email protected]; [email protected]

With the current growth of the information available in Internet, users are facing an information overload. This work proposes a multiagent system for Web pages personalized filtering. The system is composed of a set of autonomous and adaptive agents that automatically provide relevant documents to the user according to a preferences profile. The agents learn with the user feedback and attempt to produce better results over time. This work presents the system description and the promising results of tests performed in a simulated environment. The proposed system proved to be a useful tool to recommend successfully relevant information to a well-defined preferences user.

1

Introduction

The use of the Internet has been growing in the last years with the appearance of World Wide Web. Although the increase of the information available facilitates the spreading of knowledge and the acquisition of products and services, it also makes the search for relevant material a real challenge. Recent work that arises at the intersection on information retrieval and software agents offers some new solutions to this problem. Agents can be defined as softwares with the aim of performing tasks for their users, usually with autonomy, playing the role 6f personal assistants. The present work suggests the use of autonomous agents for the personalized information filtering. The proposed system is composed of a set of adaptive and non-mobile agents aiming to satisfy the user's needs for information. The agents receive the user's feedback about the relevance of the retrieved information and improve their search, obtaining better results over time. The set of agents is autonomous as it can perform its task without the user's presence, based on a preference profile previously built. The system is adaptive as it learns the user's preferences and adapts itself when these ones change over time. The main agent's learning mechanisms is the relevance feedback [7]. The use of genetic algorithms [4] as a complementary mechanism aiming to introduce diversity in the system's parameters is addressed. The information is represented by the vector space model [8]. The results presented were obtained through a series of

10

sessions with simulated users. The system's efficiency evaluation was made through the normalized distance performance measure (ndpm), suggested by Yao [9]. This paper is organized as follows. In section 2 there's a comparison with related works. Section 3 describes the proposed system. The analysis of results is presented in section 4 and some conclusions are drawn in section 5. 2

Related Works

In the domain of Web, WebWatcher [1] and Lira[2] are agents whose actions are interleaved with the user's browsing in Netscape. They require explicit interaction to indicate interest in topics or particular pages. MIT Media Laboratory's Letizia [5] is an autonomous interface agent designed to assist and provide personalization to the user while browsing the WWW by performing a breadth-first search on the links ahead and providing navegation recommendations. The main disadvantage of such approaches is that they are restricted to the sections of the Web visited by the user, recommending links starting from them. In contrast, the proposed system looks for new domains for information that can be of potential interest for the user. The user probably never saw before the presented topic. More similar to our work with regards to application domain and representation are the systems built by Balabanovic [3] and Amalthea, proposed by Moukas [6]. Balabanovic proposed a multiagent system that combines both content-based and collaborative techniques applied to the web pages recommendation. He adopts the vector-space model, relevance feedback as the learning method and he suggests the use of genetic algorithms as a possible solution for some of the problems found in the content-based filtering. Amalthea is a system that combines the concepts of autonomous agents and artificial life in the creation of an evolving ecosystem composed of competing and cooperating agents for web pages recommendation. 3

System Description

Fenix system was developed according to the object oriented approach. The system was implemented as a Java application to be running locally in the user's machine. Fenix is composed of various functional modules described below. 3.1

User Interface Module

This module presents a graphic interface to interact with the user. The user's interaction with Fenix system begins with his registration, where he must inform his personal data and choose a login and a password. After the identification the user can create a new agent, load an existing one or to activate the autonomous mode. When creating a new agent, the user must choose a name and provide the search parameters, which include the query expression. As a result of the initial

11

search, a series of retrieved documents is presented. After reading the chosen documents, the user can provide positive (+1) or negative (-1) feedback according to their relevance. The user can modify a document URL, if he finds a more interesting link starting from the initial page, through the button "Alters". He can also include a URL of interest manually, that has not been retrieved by the agent, by clicking the button "Includes". When saving a newly created agent, the references to the documents with positive feedback will be saved (their URLs) and the term vector and their weights will be created, building the initial profiles for that agent. The user can also choose some URLs to be constantly monitored. Certain URLs are frequently changed and updated, and the system can be scheduled to verify from time to time if their contents changed. When loading an existing agent, the user can read some retrieved document, provide feedback about some document or starting a new search for documents. 3.2

Filtering Module

The filtering process consists in translating documents to their vector representations, calculate the similarity between documents and profiles, and selecting the top-scoring documents for presentation to the user. The representation adopted in this work is based on the vector space model (VSM) [8]. To compute the content of a document, the system uses a keyword frequency measure, TFTDF (term frequency times inverse document frequency). This technique says that keywords that are relatively common in the document, but relatively rare in general are good indicators of the content [7]. A profile is a set of information about the retrieved documents as, for example, the documents URL, the score computed for the system and the user's feedback assigned to them. Besides, it contains the vector representation of all documents that received positive feedback. The adopted similarity measure was the cosine of the angle between die documents and the profiles vectors [8]. The filtering agents are responsible for gathering the documents generated by all the profiles, classifying them according to their similarity values, eliminating repetitions and presenting to the user. 3.3

Learning Module

The learning methods addressed in this work were relevance feedback and genetic algorithms. At the present stage, the relevance feedback sub-module was implemented and tested. The specifications of the genetic algorithm sub-module had already been done, but its implementation will be a matter of future works.

12 •

Relevance Feedback Sub-Module

For vector space representations, the method for query reformulation in response to user's feedback is vector adjustment. Since queries and documents are both vectors, the query vector is moved closer to vector representing documents with positive feedback, and further from vectors of the documents with negative feedback. The effect is that, for those terms already existing in the profile, the term weights are modified in proportion to the feedback. The terms not existing in the profile must be added to it. •

Genetic Algorithm Sub-Module

In the next stage of this work the genetic algorithm will be implemented as a complementary mechanism to introduce diversity to the search parameters as a goal. This goal will be achieved by recombining the contents of different vectors of terms belonging to the same user profile. In the proposed system, a population P is defined as a group, where each element is a pair of profile and its fittness. Each profile is converted for a binary representation, and it corresponds to an individual or chromosome of the population. The fittness is computed based on the average values of similarity between the documents and their respective profiles. The genetic operators of crossover and mutation update the population to each generation, introducing new members and taking advantage of the fittest ones. The final objective is to evolve the population in direction to a global optimization. 3.4

Other Modules

The search module is responsible for interacting with search engines existent in the web, gathering information from them about the chosen subject and saving them in a local database. The system database is composed of all information from the user, his agents and respective profiles, as well as the pages retrieved in searches. Besides these modules, Fenix has a controlling module, responsible for controlling all the other classes creation and their methods invocation. 4

Results

We adopted the performance measure proposed by Yao [9]. The ndpm measure ("normalized distance-based performance measure") is a distance, normalized to range from 0 to 1, between the user's classification for a set of documents and the system's classification for the same documents. A user is supplied with a list of documents and should classify it in agreement with his interests by a subject. The system also ranks the documents according to how well they match the profile previously built for that user. The expected result is for the ndpm distance between

13

the user and system classifications to decrease gradually over time, as the user's profile is adjusted. One hundred and twenty agents were created for thirty subjects of interest to a simulated user. For each subject, a number of simulated sessions of "user"-system interaction were accomplished. After an initial search, the agents classified the retrieved documents; the "user" evaluated the documents, providing their feedback values and classification. With the feedback, the agents profiles were adjusted to further searches and the classification is used to computer the ndpm. A progressive decrease of the ndpm distance along the sessions was observed, indicating that the agents were adapting themselves to the user's preferences and increasing the probability of retrieving a larger number of relevant documents while discarding the irrelevant ones. Several system configuration parameters were tested in the simulated sessions. To sum up the final results we can say the system reached the best performance when the terms of the query were more specific, the agents were composed of at least 4 (four) and in the maximum 10 (ten) profiles; and the term vectors of the documents had maximum size of 300 terms. Nine agents were tested along 20 (twenty) sessions, in order to compare with the work described in [3], where a multiagent system was implemented for the WWW pages recommendation. His system performance was also evaluated with the ndpm measure and the obtained curve had a behavior quite similar to the one presented in the tests with Fenix (Figure 1).

1

2

3

4

S

7

9

11

12

13

IS

17

19

21

Number ol sessions

Figure 1: Average ndpm distance between user and system rankings, over 20 sessions. 5 Conclusions Fenix is an autonomous agent that must be able to specialize to user interests, to adapt when they change and to explore the domain for potentially relevant information. The system proved to be a powerful tool of information filtering. The

presented results confirmed that the system, using the vector-space model with relevance feedback as the learning mechanism is able to successfully filter relevant documents for a well-defined preferences user. The performance values obtained in simulated tests based on the ndpm measure were similar to the ones found in another works of information filtering. Information filtering agents are a great promise to the management of extensive available information. References 1. Armstrong, R. et all., WebWatcher: A Learning Apprentice for the World Wide Web, in AAAI Spring Symposium on Information Gathering, Stanford, CA, March 1995. Available in: http://www.cs.cmu.edu/afs/cs/project/theo-6/webagent/www/project-home.htm]. 2. Balabanovic, M. and Shoham, Y., Learning Information Retrieval Agents: Experiments with Automated Web Browsing,, in AAAI Spring Symposium on Information Gathering, Stanford, CA, March 1995. Available in: http://flamingo.stanford.edu/ users/ marko/ bio.html. 3. Balabanovic , M. Learning to Surf: Multiagent Systems for Adaptive Web Page Recomendation Service. Dissertation submitted to the Department of Computer Science and the Committee on Graduate Studies of Stanford University. UMI Number: 9837173. UMI Company. 1998. 4. Goldberg, D. E. Genetic and Evolutionary Algorithms come of age. Communications of the ACM, 37(3):113-119, March 1994. 5. Lieberman, H., Letizia, an agent that assists web browsing. In Proceedings of IJCAI-95. AAAI Press, 1995. 6. Moukas, A., Amalthaea: Information Discovery and Filtering using a Multiagent Evolving Ecosystem. In proceedings of the Conference on Practical Applications of Agents and Multiagent Technology, London, April 1996 7. Rocchio, J.J. Relevance feedback in information retrieval. In: The Smart Retrieval System - Experiments in automatic Document Processing, p. 313323, Englewood Cliffs: Prentice-Hall, 1971. 8. Salton, G., Automatic Text Processing - The Transformation, Analysis and Retrieval of Information by Computer. Addison-Wesley Publishing Company, Inc., Reading, MA, 1989. 9. Yao, Y. Y. 1995. Measuring retrieval effectiveness based on user preference of documents. Journal of the American Society for Information Science 46(2): 133-145.

15

PUBLIC KEY ENCRYPTION AND TRANSPARENCY EN INTERNET CASINOS H.M. HUBEY Department of Computer Science, Montclair State University, Upper Montclair, NJ 07043 USA E-mail: [email protected] P. B. IVANOV International Science and Technology Center, 9 Luganskaya Street, P.O. Box 25, Moscow, 115516, Russia E-mail: [email protected] Different games have different characteristics. One of the standard characteristics is that of transparency of the state of the game; for example, chess is an open game; whereas card games usually are not. The second characteristics are the transparency of the protocols; In games such as the slot-machines, the players have no choice. Complete trust in the house is required. At the other end of the spectrum is craps where the rules for tossing the dice are quite explicit and it is virtually impossible to cheat. Backgammon belongs to the latter type except the Internet or the computer version.

1

Introduction

Backgammon is a game very easy to learn to play badly and hard to master. Probability calculations are much easier for a machine than a human, however, heuristics are quite easy. There are excellent players who cannot even read (literally) or

Home No Man's Land

Figure 1: Backgammon Board: Each player moves the pieces into his home board (moving in opposite directions) and bears off (takes them off). have only 5th grade educations. Effectively speaking players move their pieces in

16

opposite directions. The board is 'folded over' so that it takes the shape of Fig (1). The players try to bring all their pieces into their homeboards after which they bear off (pick them all up). Two or more pieces on a slot is called a point or making a point. A single piece occupying a slot is called a blot. One cannot put a piece on any of the opponents points. If a piece lands on a blot of the opponent that piece goes off the board (on the bar) where it goes into the opponents homeboard according to the toss of the dice. Entrance back into the game is not automatic.

Figure 2: An Example of a Miracle Roll: The checkered piece gets a 46 and hits the opponent's piece which effectively seals the fate of the game. There are many types of miracle rolls. The game is a stochastic game, and the computations are difficult for human beings, however the heuristics are quite easy. If we group the numbers on the die into two sets H (high) and L (low), then 50% of the time, we get HL or LH. Obviously HH and LL occur 25% of the time. The average roll moves a piece 8 pips (8 spaces). In live plays with actual players it is possible to exert control over what numbers come up when the dice are thrown. In actual play, a match can be anywhere from 5 to 11 points (or games) so as to make luck insignificant. Furthermore the rating system works over many matches. In the course of a single game, there occur what might be called bad luck streaks or good luck streaks, but they are generally of short duration. Someone who has been on the receiving end of a short bad luck streak may cheat only once, and reverse the course of the game. Indeed, it is possible even in fair games for the roll of the dice to have the same effect. For those who want to practice the effect of this, a rule similar to chess may be annunciated, the Hubey Rule1. The better player may spot the poor player a roll or two, similar to giving up a rook or a queen. In other words, when

17

the player is desperately stuck, he may opt to use his option to select any specific roll of the dice. In Internet games there are advantages and disadvantages. On the whole it is possible to fix up things so that the advantages outweigh the disadvantages. One of the advantages is that it is not necessary to keep account of the rolls, or to depend on the players to stay honest. The software keeps track of it all. There is no way the other player (unless it is the machine owned and operated by the house) can affect the dice roll. Secondly, there is no way for the player to bear off more pieces, or put pieces in the wrong place or disagree about the value of the dice. In games in which the dice are thrown by a computer program there are two main problems. The first problem is that the random numbers (actually pseudo random numbers) may be seriously correlated [1]. However this will have the general effect of making all games take on a certain coloring; there might be more streaks in computerized games because of the correlation, but it will be fair to every player over the long run. 2

Miracles and the Duck Rule

The second problem is more serious; there is no way for the player to know what algorithm is being used by the house and if the toss is really "fair", and this not only in the case of a player playing against the house machine but especially during such games. This is similar to playing the one-arm bandits in casinos. It is probably because people are already habituated to taking what is offered that this game persists. There is no way for a player to know if the one-arm bandit is being manipulated by someone or if it has a program that makes things look random. In other games such as craps or blackjack there is much more openness to the algorithm and rules of the game. There are very few ways for the house to manipulate the outcome. But in backgammon tournaments when the player plays against a machine there are serious trust and security holes. In such a case, the rule is; if it looks like a duck, walks like a duck, and quacks like a duck, it is probably a duck. There are many ways in which a software can manipulate the outcome of the game, and most of these are based on "magic rolls". These are "lucky rolls" that save the player. Unfortunately, if all the luck in its various manifestations is on one side, then the duck rule (see above) applies. It is difficult enough to test a sequence of integers for randomness, and it is even more difficult to show that for software that combines together to toss the die and play the game. In that case, the duck rule applies. There are many ways to manipulate the game.

18 1) Constant drain: this is when one side consistently throws bigger numbers without suffering any bad consequences. The "luck" here is spread out over the whole game. Someone has to win, so it's hard to tell if the rolls are really random. 2.1) Miracle rolls: In Figure 2, the CH (crosshatched) player rolls 46 and hits the LI blot, whereas LI might roll several times and not even be able to get out of the opponents homeboard, although his chances of getting out are greater than that of CH. If these miracle rolls do not occur to both sides, the duck rule applies. 2.2) Throw miracle doubles and overtake the opposition over and over. This can also be hidden cleverly by making the opponent also get doubles, but either low ones or even better, useless or harmful doubles. When the statistics are tallied, both players have the same average number of doubles. This is also a good technique if the doubles come right at the end, where they do little good, for example if the player bears off the last blot with a 66. 2.3) Another clever and nice trick is to lose every once in a while but only 1 point at a time, and to win big. This can be done in a variety of ways. 2.4) When the machine hits 70-80% of the player's pieces that land in the no man's zone when it should be in the neighborhood of 30% and the player can only hit the machine's no-man's-zone pieces only about 20% of the time, and at opportune moments the duck rule applies. 2.5) The Mexican Standoff miracle: Often, when the game reaches a critical point players' points are left in the other's zones and neither wants to run first and break his point and leave a blot. Whoever throws the first 6 (usually) is a victim of the Mexican Standoff, especially if he gets hit more than the average. 3.1) Bearing off: One trick that will surely ruin anyone is to let the player seem to be winning until almost the end. Arrange all the pieces in the machine's home board so that it is almost airtight, and then force the player to leave a blot while bearing off, and of course, hit it! 3.2) Throw bigger numbers at each roll, and doubles while bearing off. This must be done if the other tricks have been overused. 3.3) Throw big doubles near the end and take the game. It also helps if the player has a "hole" in his homeboard while bearing off, say at point 3. Then if every roll has a 3 in it, such as 31, 34, 36, the player slowly falls behind. 4) Finally, if all of these have been overused, it does not hurt to cheat once or twice directly select the best roll for that play.The best thing that Internet backgammon sites can do is to allow the Hubey rule to be implemented so players can

19 see for themselves that cheating only once per game vastly improves their chances. 5) And over all, the last trick, win by a score such as 9-8, 9-7, 9-8, etc. enough times and win many games with a single roll or a single blot to give the impression that the player only needs to improve an iota. After 40-50 matches like this, it becomes impossible for a player to trust in the laws of probability theory. In such a case, it is quite likely that laws of probability are not operating, and that the program is cheating, according to the duck rule. Even if it is not cheating, it is important not to give even an impression of cheating. If the probability for the machine to win p=0.9, the probability of winning 10 games in a row is only (0.9)10=0.3487, and probability of winning 30 straight games is (0.9)30=0.424. If p=0.95, then the results are (0.95)10=0.5987, and (0.95)30=0.2146. With even p=0.99 we obtain (0.99)10=0.9044 and (0.95) =0.7397. More sophisticated probability calculations can be made relatively easily. The only method of dispelling feelings that the machines are cheating is to employ open standards and protocols. First, there should be a set of robust random number generators to choose from [2,3,4,5]. There are many kinds of tests for random numbers [3]; there is no excuse for using ones that fail simple tests such as the serial correlation tests [1]. The best random number generators are congruential and usually have some seed or modulus that is a prime. This number can be set for every session. In the days of broadband access and 1 GHz CPUs there is no excuse for not using this. There are rules that even children employ to minimize cheating. For example, when choosing up sides, two players will simultaneously display 1, 2 or 3 fingers. They are all added up, and if the sum is even one player wins, and if odd the other. The key here is that the result is computed from both inputs not from a single one. A protocol resembling this can be created for Internet gaming. The key is the Diffie-Hellman key exchange. 3

The Diffie-Hellman Exchange 2

For any prime number p, a primitive root p is such that p mod p, p mod p,... p" ~ mod p are all distinct and consist of the integers 1 to p-1 in some permutation. With p a publicly known prime number, and an integer p that is a primitive

20 root of p, the human player selects a random integer Xh
and computes

x x Yh = p p mod p. The machine selects Xm

Simple Rules to make the game transparent, and open

This protocol can be used to generate several algorithms to make the game open to inspection by anyone and show that it is an honest game. One way would be to generate the integers {Xh,XmJ at each step (by the human player's CPU and the bot of the Internet game provider respectively) using a random number generator. Then they can use a simple algorithm to compute r = (K mod 5) + 1 , where the various parameters are explained below. Then using the exact same random number generator then can use r to generate the other die number. This can be repeated for every roll. Even if one side cheats in the generation of the random number set {Xh,Xm} they still cannot control the final output. 5

Other alternatives

Another way to proceed, after the initial exchange of the key, would be to make sure that the number generation is transparent to both the bots and the human players, (or if the day should arrive, to bots that compete against each other). There must be input into this process of random number generation from both the player (actually his/her CPU) and the Internet backgammon provider who also provides the bots. Random number sequences are generally of the type r

n + l = arn

mod

M

W

M should be a large prime. Good choices for the parameters a and M can be 31

found in [2,3]. A good choice for M, given by Lehmer, is M = 2 - 1 , a Mersenne prime number. From a possible choice of more than 2 million integers, there are only a handful for the multiplier a , that pass several tests of random-

21

ness [2], The minimal standard random number generator uses a = 16807 . This multiplier can be set as the default and the others can be given to the players as choices should they wish to change. The algorithm follows: both the player and the machine generate the initial random numbers r™, and r^, using some random generator function r0 = / ( a , x, k) where / is the jth acceptable random generator function. These are supplied by the Internet game service provider. These functions should be coded in an open language such as Java which can be examined by millions of competent people, and it should be certified to be working correctly by organizations devoted to Internet gaming. The seed a can be chosen in a number of ways. It can be entered by the player or it could be a function of the timer register, thus a = F(x), and k is some parameter chosen for this session. This parameter can be one of several to add more randomness and choices to the process. Then these random numbers are exchanged between the player and the machine. For security reasons, these exchanges should also use the Diffie-Hellman key exchange algorithm. After the exchange each side computes two new random numbers. In general the next set of random numbers are generated as r

n+i = « l ( C ^ . * )

3nd

r

n + 2 = Sl^"yn,k)

(2)

It should be noted that both machines (the player's and the bot's) use the same two functions gj and g2 and the same parameters to compute the same numbers. To make sure there are no errors, these numbers should be exchanged and compared although with the reliability of optical fiber connections the need for this decreases in time. The most important part of the algorithm as can be seen is that as in the childrens' algorithm of choosing up, both inputs determine the random numbers. For practical reasons the numbers generated internal by the player's CPU should be displayed separately and the two integers to be played should be displayed on the game board. It should be possible for the player at any time to check the calculations, and just as importantly, the algorithm should be coded in a language like Java which can be compiled by the player. There should be move toward developing a standard API so that the fairness of the game can be made obvious to millions of programmers worldwide. It is in the interest of the Internet game service providers to make this available to the

22

users. The Internet is a different kind of a world than other aspects of life. Millions of people might crank slot machine arms mindlessly but it won't work on the Internet, at least not for long. Indeed, it will even influence the future of slot machines in casinos. 6

Conclusions

The number of ways in which one can cheat at backgammon is even more than this especially in the days of AI and supercomputers. A simple random number generator checking program cannot spot sophisticated cheating techniques. What we need are either (i) a metric that determines how "lucky" one side is, or (ii) a clearly open version of how the random numbers are generated. The "luck" part can be worked out in the future, however the easiest way to approach this is to create an open environment. The concept of Open Gaming, is aan idea whose time has passed. It is time for games of chance to use open standards and fit into the general scheme of things on the Internet and partake in the future which the Internet has spawned. Notes 1. The Ivanov Rule says that "If you don't want it, just don't play". References 1. Hubey, H.M., and A. Gutierrez, Testing Random Numbers Via Cumulants, Proceedings of the Twenty-third Annual Pittsburgh Conference on Modeling and Simulation, May (1991), pp. 147-152. 2. Park, S. and K. Miller, Random Number Generators: Good Ones Are Hard to Find, CACM, vol. 31(1988), No.10,1192-1200. 3. Bratley, P., B. Fox, and L. Schrage (A Guide to Simulation, Springer-Verlag, 1983) 4. MacLaren, M and G. Marsaglia, Uniform Random Number Generators, J. ACM, vol.l2(1965), no.l, 83-89. 5. Marsaglia, G. Random Numbers Fall Mainly in the Planes, Proc. NAS, vol. 61(1968), 25-28. 6. Stallings, Wm, Cryptography and Network Security, (Prentice-Hall, Englewood Cliffs, 1999)

23 A REVIEW OF EVALUATION TOOLS FOR WEB CONTENT ACCESSIBILITY

MARTA PRIM Unitat Microelectronica, Dept. Informatica, ETSE, Universitat Autonoma de Barcelona.08193 Bellaterra, Barcelona, Spain E-mail: [email protected] This article makes a comparison between two applications whose aim is to determine the degree of accessibility of Web pages: Bobby and HTML Validator. At the same time, this document wants to emphasise the low accessibility of current Web pages. The purpose of this comparison is to identify the advantages and disadvantages of each one of them, for the later creation of a similar tool in Spanish. These two applications are Web pages analysers that follow the recommendations made by W3C (World Wide Web Consortium) reflected in norms WAI (Web Accessibility Initiative).

1

Introduction

The Internet is becoming more and more a commodity item in our lives. However, this statement is not true for every human being. Different reasons limit this generalisation. First, a country status, its economy together with its development can limit the expansion of the Internet. Secondly, one of the weak points of the Internet, leaving aside its speed, is its accessibility: not all of us are able to correctly interpret the content of the Web pages. There are people with certain visual, physical-motor and cognitive handicaps whose access to these pages is restricted when Web pages do not fulfill particular accessibility norms. In this article we will be dealing specifically with this second group, by focusing on the parameter of accessibility for people who are partially or completely blind, or have temporary difficulties in reading a computer document directly or indirectly by means of speech synthesizers, screen readers, refreshable Braille displays. In particular, we will report a study of two applications designed to obtain more accessible Web pages: Bobby and HTML Validator. We have chosen these two applications because of their user-friendliness and also because Web pages that are qualified as accessible have been certified as so by these two applications. However, there are other applications which have also been designed to solve accessibility problems, but their functions to help designers are more limited. For further information connect to the following Web page: The two applications considered in this article are based on the norms WAI created by the W3C [1]. These norms are a set of rules established to help Web page

24 designers create more accessible pages. Norms WAI have been designed to be stable, that is to say, they are intended not to be modified in the near future. Without entering in detail, these norms are [2-3]: • • •

• •

• • • • •

Images and animations: Use the ALT attribute to describe the function of each visual component. Maps of image: Use maps of clients and alternative text for the active zones. Multimedia: Provide captioning and transcripts of audio, and descriptions of video and accessible versions in the case of using non-accessible formats. Hypertext links: Use text that makes sense when read out of context. For example, avoid "click here". Page organisation: Use headings (HI, H2, H3...), lists, and consistent structure. Use Cascading Style Sheets (CSS) for layout and style where possible. Graphs and characters: Summarise or use the LONGDESC attribute. Scripts, applet and plug-ins: Provide alternative content in case active features are inaccessible or unsupported. Frame: Use NOFRAME and meaningful titles. Tables: Make line by line reading sensible. Include a summary. Check the work: Validate code HTML. Use check tools and only-text navigators to verify the accessibility.

In their last version these norms have been divided into three priority levels. In order to detail the comparison between these two Web pages analysers, this article has been structured in several sections. Sections II and III briefly explain each of the considered applications. Section IV presents their comparison. We have considered first how easy it is to interpret the Web page accessibility errors and second how this information is displayed. These two factors are essential to help Web masters. The last section summarises and concludes this report.

2

Bobby

Booby is a tool intended for Web pages designers. This application tries to help the designer to identify the necessary changes on Web pages so that they are accessible to handicapped users. Bobby was created by CAST (Centre Applied Technology), http://www.cast.org/Bobby/ . CAST is a non-profit organisation whose aim is to increase handicapped people 's opportunities in the technological field. At the

25 moment, it is working together with W3C to develop an evaluation tool, which employs their Web Content Accessibility Guidelines and provides developers with page and site evaluation supports. 3

HTML Validator

It checks HTML documents for conformance to W3C HTML and XHTML recommendations and other HTML standards. The W3C develops interoperable technologies (specifications, guidelines, software, and tools) to lead the Web to its full potential as a forum for information, commerce, communication, and collective understanding, http://validator.w3.org/ [5]. 4

Study

Next, we will indicate the characteristics of each application, to go on with the comparison in itself: 4.1

Bobby tool

The access Web page to the application Bobby 2,3 consists of two functions. The first function evaluates directly the page that the user requires and the second is informative, allowing the user to download the program to be able to later validate created Web pages on its own computer. Once the user has indicated the Web page to validate its accessibility, the application assesses this page and generates a new Web page with the results of the verification. These results are structured in three parts: First part: It displays the analysed Web page and it indicates by means of • (user check points) and the symbol "*Ksi (accessibility errors) which points display accessibility problems. For example, in the following figure (fig 1) the Universitat Autonoma of Barcelona Web site has been analysed. It can be seen that it has some accessibility problems (with this comment we do not mean to question analysed pages effectiveness)

26

Fig 1: Visual result of the assessed Web page by means of Bobby tool

Second part: It presents the errors of accessibility that has been detected. These errors are indicated according to three levels of priority. Within each priority level any HTML code line that does not fulfil the recommendations is specified and at the same time, a solution is suggested (see fig 2).

1

Priority 1 Accessibility

This page does not meet the requirements for Bobby Approved status. Below is a list of 1 Priority 1 accessibility errors found: 1.Provide alternative textfor all images. (15 instances) Line 24, Line 30, Line 31, Line 35, Line 83, Line 92, Line 95, Line 99, Line 128, Line 131 Fig 2: Result obtained by means of Bobby tool

Third part: It contains a list of browser compatibility errors Mid the download time. Browser compatibility errors help to determine when HTML tags aid their attributes we not compatible with certain web browser or HTML

27

specifications. Browser compatibility errors do not affect the accessibility rating of a page. The download time table gives download time statistics for the images, applets, and objects on this page, using a 28,800-baud modem. See the results in the following fig 3:

rne M-.rw.tti, ssttsoa ? -«tons a kit of 5 browser c-impabb&ty errors Browser cernps'uVity error J hd^ Jo ostennae whsr. HTML t%>s ar.d fear aar.but« * « r^i icavaubla with .-srtJsn web orc-cwers <>r HTML sper.featkn? Browser cornpiisbiSsw errors -fc not affect the acceisss&y raanj, of a page

Line 243 ihte264,'i^a"m!lLir.e

310. L k T B s J Lina" 366, Lina 394, Lais 422

Lin* 180

Download Time

:

:

URL

: Siz»

JitJp Oynrtt «»if orgft'Obby/sKdex. can J

lht^//wrarwv;astor^gra{jhics, 'ssteW!Je/d<>t_rfeararf

TiraH<set*)

!27 08 K =

7 3i!

004K;

001

Fig 3: Bobby tool: browser compatibility errors and download time table.

42

HTML Validator

W3C HTML Validation Service allows the user to input the Web page address to be analysed or to load a file by indicating its path. Also, the user can select outline, source listing or parse tree. As output, the application generates a new Web page with information of the Web site: last modification, type of server, type of document. Next, the application displays a listing indicating the line and the column, as well as the code where the error is. In addition, it offers information about how to solve the error (fig 4).

28

Line 1, column 1: A

Error: Missing DOCT YPE declaration at start of document (explanation...) Line 5, column 29: Fig 4: Result obtained by means of HTML Validator tool

4.3

Comparison

When making the comparison, we noticed that each application has its own way to focus on the question about the potentially troublesome information and how to help the designer of accessible Web pages. Bobby first provides visual information about any weak points on the Web page and then specifies the different priority levels by indicating the errors as well as the tags that should be reviewed. It is a very complete application; nevertheless the great amount of information provided may overwhelmed the designer, since it is not possible to select how the information is displayed. HTML Validator, on the other hand, does provide an option for selecting the information layout. However, it does not show the errors visually on the same Web page as Bobby does. We think that such an application should allow the designer first to see the most important errors from an overall point of view and then enable him to increase gradually the accessibility of the page. Furthermore, it should provide us with the possibility of its execution on the server, or of unloading the application for later verifications with no need of being connected to the Internet, as the Bobby tool does.

5

Conclusion

The main objective of this article is the study of two applications (Bobby and HTML Validator) which are intended for Web masters so that they can improve Web pages accessibility and increase the diffusion of the existence of this type of utilities. The analysed and assessed characteristics of these applications will enable

29 us to design a similar tool in Spanish, which will include the positive parts of each one of the applications we have studied in this article. References 1. Seminario de Iniciativas sobre Discapacidad y Accesibilidad en la Red del Real Patronato de Prevention y de Atencion a Personas con Minusvah'a. Spain. http://www.sidar.org 2. Albouy J., Noves tecnologies i integracio professional: les persones cegues", ESADE Associacio, 92, March-April 2001, pp. 72-78. 3. W3C (2001) Web Content Accessibility Guidelines 2.0 (online) W3C Recommendation 7- January-2001 Technical Reports and Publications. http://www.w3.org/WAI 4. Publicaciones de la Unidad de Acceso de la Universidad de Valencia http://accso.psievo.uv.es/unidad/pubs/index.html 5. Center for Applied Special Technology http://www.cast.org/bobby 6. HTML Validator Service http://validator. w3 .org

31

A S H A R A B L E I N S T R U C T A B L E A G E N T FOR I N F O R M A T I O N FILTERING K.R.K. MURTHY Dept. of Comp. Sci. and Automation, Indian Institute of Science, Bangalore - 560012, India E-mail: [email protected] S.S. KEERTHI Dept. of Mech. and Production Engg., National University of Singapore, 10 Kent Ridge Crescent, Singapore 119260 E-mail: [email protected] M.N. MURTY Dept. of Comp. Sci. and Automation, Indian Institute of Science, Bangalore - 560012, India E-mail: [email protected] In this paper we present an information filtering agent called sharable instructable information filtering agent (SI I FA). It adopted the approach of sharable instructable agents. SIIFA provides comprehensible and flexible interaction to represent and filter the documents. The representation scheme in SIIFA is personalized. It, either fully or partly, can be shared among the users of the stream while not revealing their interests and can be easily edited. SIIFA is evaluated on the comp.ai.neuralnets Usent newsgroup documents and compared with the vector space method.

1

Introduction

Information filtering [1] is the process of filtering out the irrelevant documents from the stream of documents reaching the user. This is necessitated because of the following reasons: (1) information overload which occurs because of the availability of relevant information in the huge heap of irrelevant or less relevant information; (2) offending or value-opponent or policy-opponent information which occurs because of the borderless nature of the Internet, this needs to be controlled mainly at the receiving end rather than at the distribution end; (3) a personalized agent whose domain is restricted to the documents of certain characteristics needs filtering of documents out of its domain; and, (4) information filtering profile of a user contains information about his interests which may be exploited in gathering information to fulfill his information need. Several software agents and techniques [2] have been developed to tackle these problems. Each one of these agents addresses certain information filtering task characterized by the stream, granularity of filtering, and the filtering problem at hand.

32

KPl KP2 KP3 KP4 KP5 KP6 KP7 KP8 KP9

Figure 1: CNW: Keyphrases, KP1...KP9; Concepts, C1...C6; Context-Filters, CtL.CtlO.

In this paper we describe an information filtering agent, called sharable instructable information filtering agent (SIIFA), developed based on the sharable instructable agents (SIAs) approach [3]. SIAs are instructable agents [4] along with the properties of glass-box comprehensibility, modularity, and add-on ability. This approach is suitable for information filtering because of the overlapping characteristics of the streams received by different users and interestrepresentation correlations of profiles of different users. These make collaboration to model or represent the document streams possible. SIIFA is realized based on the comprehensible document analysis and representation scheme developed using the novel acyclic directed graphical structure called concept network (CNW) [5]. SIIFA model defines the interaction and filtering mechanism based on the CNW based analysis and representation scheme. This helps in constructing and maintaining the personalized CNW and filtering documents. SIIFA is evaluated on the documents collected from comp.ai.neural-nets Usenet newsgroup and compared with the vector space approach to information filtering. It demonstrates the superiority of SIIFA over the vector space method. The rest of the paper is organized as follows. Section 2 describes the document analysis and representation scheme used in defining SIIFA. Section 3 introduces SIIFA. Section 4 presents the evaluation and comparison of SIIFA. 2

Concept Network: A n Introduction

A concept network (CNW) is a directed acyclic graph that can be used to analyze and represent textual documents. Its nodes are concepts and its edges pass through context filters. Figure 1 shows a sample sketch of CNW. It shows three distinct items namely keyphrases (KP1,...,KP9), concepts (C1,...,C6), and context filters (Ctl,...,CtlO). Decision lists are used for decision making both in concepts and context filter stages.

33

A concept is denned as follows: A keyphrase is a concept, a function of concepts is a concept. A keyphrase is a string of characters of any length with at least one non-blank character, i.e., a keyphrase does not have to be a valid linguistic phrase. Keyphrases are features observed in the documents and they are called fundamental concepts. A context filter removes the noise (or ambiguity) in its input with the help of the context of occurrence of the input in the document. The context of an occurrence of a concept in a document is defined in terms of the relative positions of the occurrences of the other concepts in that document. A context filter is a sequence of four stages. Each stage removes a certain noise contained in its input in the form of a certain type of ambiguity of its occurrence in the document. First stage handles implicit concept presence which deals with the absence of a concept when the topic highly dependent on it is present. Concept sense ambiguity [6] is handled by second and third stages. Fourth stage handles concept influence ambiguity which deals with the influence of a given concept in the neighborhood of its occurrence. Analysis and Representation: The given document is divided into lines of text. They are numbered in the ascending order starting from 0 from the beginning of the document. Each keyphrase is assigned a sequence of numbers which correspond to the lines in which it appeared. A concept in layers other than the Oth layer outputs a series of non-overlapping windows. Each context filter stage takes a series of windows or positions and processes them to output a series of non-overlapping windows. Each concept takes each line of the document, based on whether that line is present in one of the windows of its inputs, decides whether that line contains the concept. All lines which contain the concept are formed into a series of non-overlapping windows. This shows that the analysis of the document contains in terms of the windows output by various concepts and context filter stages. Hence, the graphical arrangement of these serieses of windows is taken as the representation of the document. 3

SIIFA

SIIFA is an interactive information filtering agent. It provides a comprehensible and flexible interaction to construct a personalized CNW and filter the documents using it. SIIFA has three modules: Filter module, Interaction module, and Co-ordination module. Filter module processes, represents, and filters the documents. Interaction module interacts with the user. Co-ordination module co-ordinates the activity between the above two modules and the database. Filter module: The filter module is the heart of SIIFA. This module analyzes the given documents to discover the presence of various concepts and

34

represents them. Then, based on the keyphrases and the concepts observed, it classifies the documents into appropriate categories. It contains three submodules: concept network (CNW), document processing and representation unit, and document classification unit. CNW indicates the keyphrases to be observed, concepts to be derived, and how they have to be derived. Document processing and representation unit observes various phrases required by the CNW along with their positions in the document and processes the observations according to the specifications in the CNW to generate the representation of the document. Document classification unit takes up the representation generated by the above unit and classifies it according to the classification specification. Document classifiers can be considered as highest layer concepts in the CNW. Hence, they are part of the CNW in SIIFA. Interaction module: Interaction module is the one which interfaces the agent with the user. This module is concerned with: presenting documents, rules, and results of queries; taking feedback and queries from the user, and, prompting and interacting with the user during the agent-user dialogue. Documents are presented along with the activations caused by them on the CNW. Rules are nothing but the decision lists in the concepts and the context filter stages. They are presented along with the prototype documents. SIIFA allows several varieties of feedback: (1) Classification feedback which allows addition/deletion/correction of windows output by the CNW concepts; (2) CNW correction feedback which involves corrections to the architecture of the CNW and the corrections to the decision lists of various concepts and context filter stages; and, (3) Hints include pointing out important passages, inputs, and concepts for a concept for document classification. Upon receiving feedback from the user, SIIFA prompts the user for potential inconsistencies in it and gets into dialogue with the user to make user acceptable changes to the CNW. This process updates several parameters related to the CNW and the validation documents which helps in better induction of decision lists in passive mode of learning. The dialogue can be stopped by the user at any stage of dialogue and the user can instruct SIIFA to learn the decision lists in passive mode. Co-ordination module: Co-ordination module is the mediator between the filter module, the interaction module, and the database. It takes care of the following: (1) Maintaining documents and their representation, CNWs, and parameters in the database. (2) Taking queries and feedback from the user through the interaction module, converting them into an appropriate form, then executing respective routines, and presenting results to the user through the interaction module.

35

4

Evaluation and Discussion

The document collection used for evaluation is obtained from comp.ai.neuralnets Usenet newsgroup. It is composed of 38 documents belonging to the period of 23-07-1994 to 27-07-1994 (called July documents) and 102 documents belonging to the period of 16-08-1994 to 7-09-1994 (called Aug-Sep documents). The documents are labeled by the user as relevant and irrelevant. The relevant portions of the documents are also supplied. Four methods are evaluated in this paper for comparison. They are Human, SIIFA, Community SIIFA (cSIIFA), and Vector space (VSpace) methods. In Human method, user filters out the documents based on the text available in the Subject: field of the documents. In cSIIFA method, SIIFA operates in a community and the CNW required for representation of the documents is immediately available. VSpace method is as followed in information retrieval [7]. In this, the term frequency of a word in a document is the number of times the word appears in the body, Subject: and Keywords fields of the document. Inverse document frequency (idf) of a word is given by {log( N~d?'5) -v- log(N + 1.0)) where N is the number of documents available for idf estimation and df is the number of documents of this collection that contain the word. A document is classified relevant if the scalar product of the profile and the document vectors crosses the threshold. Profile and the threshold are learnt using the relevance feedback. These methods are evaluated in three different scenarios. Each scenario represents certain practical limitations on the retrieval and reading of documents. They are: (1) infinite documents can be retrieved and all of them can be read; (2) only limited number of documents can be retrieved; and, (3) only limited number of documents can be read. In all these cases a user can read as many Subject: fields as the number of documents he can retrieve. Measurements: The following parameters are measured for these methods: (1) reading recall (Rr), it is the percentage of the relevant documents read by the user; (2) reading precision (Pr), it is the percentage of the documents read by the user are actually relevant; (3) reading effort (Er), number of irrelevant lines of the text to be read by the user to fulfill his information need; and, (4) feedback effort (Efb), the amount of effort needed to supply feedback. The E/6 measurement for SIIFA and cSIIFA is taken to be the total number of feedbacks supplied, whereas it is the number relevance feedbacks supplied for VSpace. Experimentation: The experimentation involves reading documents oneby-one and supplying feedback on them if learning is allowed. A document is read if it is classified to be relevant by either the profile in hand or by Human method which can use the information provided by the actual method under

36 Table 1: Results on July, 1st and 2nd sets of Aug-Sep documents respectively.

Approach Human VSpace SIIFA cSIIFA

July

1st set of Aug-Sep

2nd set of Aug-Sep

(Rr,Pr,Er,Efi)) (100,60.0,357,000) (100,53.1,396,032) (100,58.6,130,420) (100,85.0,040,040)

(Rr,Pr,Er,Efb) (100,22.9,516,000) (100,20.0,642,040) (100,61.5,245,155) (100,82.0,042,000)

(Rr,Pr,Er,Efb) (76.5,37.0,300,0) (76.5,37.1,329,0) (76.5,86.7,046,0) —

Table 2: Results for limited retrieval of 25 documents and limited reading effort of 20 documents scenarios.

Approach Human VSpace SIIFA

Limited Retrieval

Limited Reading

(Rr,Pr,Er) (47.1,47.1,127) (52.9,52.9,150) (64.7,61.1,080)

(Rr,Pr,Er) (47.1,40.0,180) (52.9,45.0,164) (76.5,65.0,080)

consideration (e.g. CNW activations in SIIFA) also. The experiment in the first scenario has three phases: (1) generate representation and profile using July documents with the help of the method under consideration, measure the performance; (2) starting with the profile available at the end of the first phase, learn and measure profile on the first set of AugSep documents; and, (3) the performance of the profile available at the end of the second phase is measured on the second set of Aug-Sep documents without any learning. The three columns of table 1 show the evaluation of SIIFA and the other methods on July documents, first and second sets of Aug-Sep documents. These results show that SIIFA is clearly superior to Human and VSpace methods on Aug-Sep documents in all respects. But on July documents, it is marginally inferior to Human method on Pr and significantly inferior on EfbInferiority on Pr is due to purposeful reading of some of the irrelevant documents to improve the CNW. But the inferiority of SIIFA on July documents is compensated by its performance on Aug-Sep documents which is the reward expected for the high initial Efb- The operation of SIIFA in a community sharply reduces this effort since Efb in it corresponds to the effort to build only the relevance classifier only. The community operation leads to all-round improvement as shown in the table. These methods are evaluated in the remaining two scenarios on the second set of Aug-Sep documents. The evaluation results are shown in table 2. They

37

show that SIIFA is clearly superior to the other two in all respects. The testing involves retrieval or reading, whichever is applicable, of the documents classified relevant. Then Human method to retrieval or reading along with the appropriate information available related to the actual method is applied on the documents classified as irrelevant. 5

Conclusions and Future Research

This paper presented the new information filtering agent called shamble instructable information filtering agent (SIIFA). It is evaluated and compared with two other methods using the collection of documents obtained from comp.ai.neural-nets Usenet newsgroup. The results have shown that SIIFA is much superior to the two methods used for comparison on the documents used. The future research is to be focussed on evaluating SIIFA on several datasets and for several users to study its average behavior. CNW and SIIFA model definitions need improvements to handle structured documents such as HTML and XML documents. The current implementation of SIIFA does not include search facility which has to be developed. References 1. Oard, D.W., The State of the Art in Text Filtering, User Modeling and User Adapted Interaction, 7(3):141-178, 1997. 2. Information Filtering Resources. Web: http://www.ee.umd.edu/medlab/filter/filter.html 3. Murthy, K.R.K., Keerthi, S.S., and Murty, M.N., Networked Distributed Collaboration for Information Activity Infrastructure, in the Proc. of Intl. Conf. on Knowledge Based Computer Systesm, 2000, India. 4. Maulsby, D., Instructible Agents, PhD dissertation, Dept. of Computer Science, The University of Calgary, 1994. 5. Murthy, K.R.K., Keerthi, S.S., and Murty, M.N., Concept Network: A Structure for Context Sensitive Document Representation, in the Proc. of Intl. Conf. on Knowledge Based Computer Systems, 2000, India. 6. Guthrie, L., Pustejovsky, J., Wilks, Y., and Siator, B.M., The Role of Lexicons in Natural Language Processing, Communications of the ACM, 39(l):63-72, January, 1996. 7. Salton, G., Automatic Text Processing: The Transformation, Analysis and Retrieval of Information by Computer, Addison-Wesley Series in Computer Science, 1989.

Computing in Biology

41

ENTROPY VERSUS INFORMATION: IS A LIVING CELL A MACHINE OR A COMPUTER? JACK A. TUSZYNSKI* Laboratoire de Physique, Ecole Normale Superieure de Lyon, 46 Allee dltalie, Cedex 07, France

69364 Lyon

^permanent address : Department of Physics, University of Alberta, Edmonton, Alberta, T6G 2J1, Canada E-mail: jtus@phys. ualberta. ca

In this paper we discuss the entropy and information aspects of the living state. Particular attention is paid to the information gain on assembling and maintaining a living cell. Numerical estimates of the information and entropy reduction are given and discussed in the context of the cell's metabolic activity. Difference between information and instruction are given. Finally, a brief overview is provided of the possibilities of bio-computing.

1

Introduction

The current explosion of interest in the future of computing is strongly motivated by an imminent approach of the limit of classical computing as extrapolated from Moore's Law stating that the number of transistors that can be fabricated on commercially available silicon integrated circuit doubles every 18-24 months. This amazingly fast trend towards miniaturization has been valid in the microelectronics sector for close to four decades. Today, the smallest available silicon chips contain up to 100 million transistors on a few cm2 of a wafer translating into linear dimensions on the order of 200 nm or less. To reach the dimensions of small clusters of several atoms, approximately 2 nm in length requires a 10,000-fold miniaturization of micro-circuitry, and according to Moore's Law, is expected to occur some time between 2019 and 2028. However, there is a growing concern heretofore unexplored technologies will have to be found even before this limit is reached. There is already substantial effort underway in an area referred to as quantum computation since nanometer-size objects reach into the realm of quantum mechanics and entirely different physical laws apply. Unfortunately, practical considerations such as the so-called entanglement of the system's wave function with the environment pose a serious challenge to any practical applications of quantum computing. Simultaneously with an effort to build the first quantum computer, a quest has been pursued to use biological materials provided to us by nature itself, or perhaps in combination with silicon-based technology, to come up with not only smaller

42

electronic devices but also with devices that are more flexible structurally and functionally. It is hoped that one day soon a biological computer will be built that is fast, small and evolvable. Of course, we are already intimately familiar with its prototype, our brain, which can be held as proof of concept. The human brain is composed of 10 billion nerve cells interconnected with as many as 1000 neighboring neurons communicating via signals in the form of electric potential differences that travel along axons, with speeds in the m/s range. We know that waves of electric activity are correlated with the brain's cognitive functions but do not know which structures and/or process are responsible for consciousness or even what constitutes memory. This is sometimes referred to as the mind-body problem. Without getting too far into this hotly debated topic these days, we might just add that there is a camp of researchers galvanized by the Oxford mathematician Sir Roger Penrose who believes that the fundamental nature of the mental processes lies in quantum mechanics. If true, this would, in a way, provide a neat conceptual link between the two routes towards a new type of computing: nano-scale in-silico computing and also nano-scale biological or in vivo computing. The process of nerve excitation involves a passage of the electrical signal from one nerve cell to another in the form of synaptic transmission. Synapses are connections between nerve cells, their axons or dendrites and form nanometer size gaps which are crossed by neurotransmitter molecules stored in vesicles that open when stimulated. Here again is a point of contact with quantum mechanics. In addition there is a very intricate structure of protein filaments filling both the nerve cell's body and its axons. Inside the axons one finds a parallel architecture of microtubular bundles interconnected with other proteins, a structure that resembles parallel computer's wiring (Hameroff, 1987) leading to the hypothesis that this microtubular structures may be involved in subcellular (nano-scale, possibly quantum) computation. Brown and Tuszynski (1999) demonstrated theoretically their feasibility as information storing and processing devices. There is a great promise that protein networks are strongly involved in both information processing and storage. This then suggests that living cells perform significant computational tasks. If so, can we reconcile laws of thermodynamics and information theory with our knowledge of cell biology? Furthermore, can we gain insights into the inner workings of the cell such that in the future hybrid computers can be designed that harness the power of biological computational elements being integrated with silicon technology? We address these issues in the remainder of this paper. 2

Cell Energetics - The Cell as a Machine

In order to function, every machine requires specific parts interconnected in an intelligent fashion in order to perform the desired function. In addition, a steady supply of energy must be provided to convert it, with some level of efficiency, into useful work. Likewise, all biological cells, like machines, must have many well-

43

engineered parts to work. Indeed, cells are constructed from yet smaller machines known as organelles. Cell organelles include mitochondria, Golgi complexes, endoplasmic reticulum and the protein filaments of the cytoskeleton. Even below this level there are machine-like parts of the cell, such as motor proteins and enzymes, that perform specific functions involving energy input and power output* e.g. transport (Alberts et al, 1994). A critically important macromolecule is ATP that is a complex nano-machine that serves as the primary energy currency of the cell. ATP is used to build complex molecules, provide energy for nearly all living processes such that it powers virtually every activity of the cell. Nutrients contain numerous low-energy covalent bonds but unfortunately these are not very useful to do most type of work in the cell. Thus, low energy bonds must be translated into high-energy bonds using ATP energy by removing one of the phosphate-oxygen groups, turning ATP into ADP. Subsequently, ADP is usually immediately recycled in the mitochondria where it is recharged and re-emerges again as ATP. At any instant each cell contains about one billion ATP molecules. Because the amount of energy released in ATP hydrolysis is very close to that needed by most biological reactions, little energy is wasted in the process. Generally, ATP is coupled to another reaction such that the two reactions occur nearby utilizing the same enzyme complex. Release of phosphate from ATP is exothermic while the coupled reaction is endothermic. The terminal phosphate group is then transferred by hydrolysis to another compound, via a process called phosphorylation, producing ADP, phosphate (Pi) and energy. Phosphorylation often takes place in cascades becoming an important signalling mechanism within the cell. Importantly, ATP is not excessively unstable, but it is designed so that its hydrolysis is slow in the absence of a catalyst. This insures that its stored energy is released only in the presence of an appropriate enzyme. The mitochondrion, where ATP is produced, itself functions to produce an electro-chemical gradient—similar to a battery—by accumulating hydrogen ions between the inner and outer membrane. This electro-chemical energy comes from the estimated 10,000 enzyme chains in the membranous sacks on the mitochondrial walls. As the charge builds up, it provides an electrical potential that releases its energy by causing a flow of hydrogen ions across the inner membrane into the inner chamber. The energy causes an enzyme to be attached to ADP which catalyses the addition of a third phosphorus to form ATP 3

Entropy Reduction in Living Systems

Living cells are dissipative, open and far-from-equilibrium systems that lower the entropy utilizing an influx of energy and molecular material in a multicompartment structure with specific functional characteristics. Entropy reduction was discussed early on by Schroedinger (1967) and it relies on both energy supply to create a metastable non-equilibrium state and electrical, pressure and chemical potential gradients across semi-permeable membranes. Electric potential differences also assist in the process. As an open system, a cell operates cyclically exchanging

44

material and heat with the environment. High-energy molecules are absorbed through pores in the membrane and their energy used to synthesize components of the cell and maintain ambient temperature. Heat is dissipated and waste products excreted so that excess entropy in the environment is balanced by structure- and information-production lowering the entropy inside the cell. This, of course, leads overall to a net entropy change in the cell fluctuating quasi-periodically close to the zero value. Cell death would manifest itself in the breakdown of structures and functions leading to a continuous entropy production. Overall, the entropy changes in the cell can be attributed to: (a) chemical reactions, (b) mass transport in and out of the cell, (c) heat generation and (d) information processing. Morowitz (1995) estimated that approximately 2 x 1011 bits of information are contained in the structure of E.coli bacteria, the simplest and best documented organism, a number which agrees with calorimetric data (Gilbert, 1966). However, the estimated information capacity in the E.coli's genome is only 107 (Johnson, 1970) which is at first surprising but on closer examination, to be expected, as will be argued below. Living cells, as all matter, must obey the energy conservation principle which in this case takes the form of the first law of thermodynamics, i.e. dU = DQ + DW

(1)

In this thermodynamic sense, cells can be viewed as a machine, exactly the way we discuss the thermodynamics of a combustion engine engaged in a Camot cycle, performing work and generating heat, thus requiring constant supply of energy and matter, more precisely, energy-giving molecules like glucose (see below). A more appropriate formulation of the energy balance is therefore through the Gibbs free energy that accounts for a change in the numbers of molecules and the presence of several molecular species. G=U-TS+PV=uN or dG=-SdT + V dP +pdN

(2)

Hence, the entropy differential can be written as: dS=dU/T+PdV/T-udN/T

(3)

which indicates that entropy changes can be achieved through heat production, change of volume or a flux of molecules. All of the above is relevant in the context of a cell. Since the entropy of an ideal gas of N particles with total energy E, of mass m each, is (Landau and Lifshitz, 1969) S/k=N {ln(V/N)+ 3/2 In (mE/3n h 2N) + 5/2}

(4)

45

this means that confining molecules within space, as is the case with building a cellular structure reduces the exploration volume V and thus reduces the entropy of the system accordingly. Conversely, mixing two molecular species with numbers Nj and N2 in a fixed volume V by opening a partition between their compartments V, and V2, increases the entropy by the amount given below: AS=N,ln(N/N1)+N2 ln(N/N2)

(5)

Therefore, keeping various molecular species separated in individual compartments (such as the mitochondria, the nucleus, the endoplasmic reticulum, etc) is an entropy reducing process since our information about the system's internal distribution is enhanced. En2ymatic catalysis against the energy barrier is a process that typically helps achieve such a deliberate separation of molecular species. In fact, a variety of solute molecules are contained within cells. The cellular fluid (cytosol) has a chemical composition of 140 mM K+, 12 mM Na+, 4 mM CI" and 148 mM A" where the symbol A stands for protein. Cell walls are semipermeable membranes and permit transport of water but not of solute molecules. We use Dalton's Law to determine the osmotic pressure inside a cell. A mixture of chemicals, with concentrations cl5 c2, c3,... dissolved in water has the total osmotic pressure equal to the sum of the partial osmotic pressures, n , of each chemical. The total osmotic pressure inside a cell, n,„, is therefore

n = n 1 +n 2 +n 3 +... = ^r(ci +c2+c3+...) , (l40 + 12 + 4 + 148)xlQ-3mo/ 1 liter 4 ; IIM=*7^ —-^ x — = 7.8xl(TPa 1 liter

(6)

1(T3WJ3

The cell exterior is composed of 4mM K+, 150 mM Na+, 120 mM CI" and 34 mM A". As a consequence the total osmotic pressure of the cell exterior, Yloul, is given by

"'

(4 + 150 + 120 + 34)xlO-Wx_ljyr I liter 10"3m3

4

Because Ylin and Tlout are quite close in values, the osmotic pressure difference between the exterior and interior part of the cell is very small, as it is the net pressure exerted on the cell wall that matters. For fragile animal cells, it therefore becomes vitally important to keep their interior and exterior osmotic pressures

46

closely matched. The cell therefore has a sophisticated control mechanism to do this. This can, again, be seen as an entropy reduction mechanism. Looking deeper into the issue of entropy reduction by the cellular process, in the production of macromolecules such as proteins, naturally the atoms that are assembled lose their degrees of freedom by being joined together. In the simplest case of a peptide chain viewed as a semi-flexible rod, each amino-acid prior to the assembly process possesses three translational and three rotational degrees of freedom, in addition to some internal degrees of freedom which by and large survive the assembly process. After a peptide has been assembled, only small rotations around the backbone are permitted effectively wiping out five degrees of freedom per amino-acid. Consequently, one can view this as an entropy reduction process. This negative entropy, call it structural for want of a better word, is created in addition to the combinatorial contribution that described the probability of selecting a particular sequence of amino-acids in the peptide, let's say k In (20") where n is the number of amino-acids in a peptide. The folding of a chain into a globular protein restricts the motion of its member groups eliminating some rotations altogether and limiting others. This, again, can be seen as a reduction of the phase space whose volume changes from Q to Q' with an attendant entropy reduction of AS=k ln(Q/£i'). For illustration purposes, we have used here a somewhat simplistic approach via a micro-canonical ensemble where all states in the phase space have the same probability while in reality, due to the interactions between molecules, a canonical ensemble should be used leading to a more accurate but also a more complicated formula, namely: S = k T ^ - l n Z + klnZ

(8)

_EL

e

Where Z=2\

kT

is the partition function of the system (treated as an isolated

i

one). Indeed, since the system is open (albeit the openness is not complete), a grand canonical ensemble technique should be used in the evaluation of the resultant entropy change. Much has been said about the effect of the second law of thermodynamics on living systems since the latter seem to defy it. However, the second law of thermodynamics is valid for closed systems and it basically states that in closed systems irreversible processes such as heat generation lead to entropy increases while and reversible processes involve no heat and no entropy change. No provision is made for entropy reduction but once again, a living cell is an open system and taken together with its surroundings the total entropy change will never be negative. In closed systems conditions for equilibria are expressed as either minima of the

47

appropriate thermodynamic potentials (e.g. Gibbs free energy) or maximum entropy requirements (Landau and Lifshitz, 1969). In open systems, there is no such rule except one looks for stability conditions of a given state, i.e. whether under a small perturbation the state will evolve or retain its equilibrium value. Another way of discussing entropy is in terms of order and disorder. The most pertinent physical transformations between states of matter that are ordered and disordered are called phase transitions. Continuous (second order) phase transitions involve no entropy change at the critical point and ordering in the system sets in gradually as seen through the bifurcation of an associated order parameter. In first order phase transitions, on the other hand, an entropy jump is always present at the transition point. Since Q = T AS is the latent heat of transition, this entropy jump is proportional to the latent heat of transition. Phase transitions with both positive and negative latent heats exist, i.e. entropy creation or reduction takes place in the system on supplying or withdrawing heat, but always AG =0 at the transition point. This does not violate the second law of thermodynamics since the system is not isolated thermally from the environment that may receive excess heat. This example is, of course, relevant to a living cell, if one were to speculate about jump-starting a living process by physical means. In non-equilibrium systems such as auto-catalytic chemical reactions of the Brusselator type (Prigogine, 1980), order is created and sustained spontaneously by means of non-linear interactions. Since these are open and driven systems, the second law of thermodynamics does not need to be invoked. Another important properties of non-linear systems is the possibility of self-assembly, for example in pattern forming crystal growth. This provides an example where there is no necessity for an instruction-driven creation of order and structure. Sometimes, when discussing the assembly of bio-matter, concern is unduly given to the need for instruction in putting the building blocks of matter together. While there are clear instructions for the amino-acid sequences in the genome, the details of higher order structure formation need no special encoding. They may emerge spontaneously as an attractor in non-linear dynamical system that we call a living cell as a result of biological self-organization (Kauffmann, 1993). As emphasized earlier, a living cell constantly consumes energy to maintain its structure and vital functions. The energy comes basically in two forms: photons (in plants) and glucose-containing compounds (in animals). Glucose is easily utilized to synthesize ATP that, together with its analog GTP, is the common currency of biological energy as discussed in Sec.2. An ATP molecule, under standard conditions carries 7.3 kcal/mol of energy and its less common analog GTP somewhat less. Each glucose molecule gives rise to approximately N=30 ATP molecules and the associated entropy production is given by the equation (Daut, 1987)

48

dS/dt=A G(glucose) J(ATP)/NT

(9)

Where A G(glucose)=3x 106 J/mol is the free energy of glucose oxidation and J(ATP)= 10"13mol/hr is the flux if resultant ATP for a single cell (Kim et al, 1991).Taking T=310 K results in roughly an entropy rate of change for a single cell in the range of 10"14 J/K s. This can be compared to only 0.7 x 10"17 J/K s of entropy reduction due to DNA transmitted information, i.e. less than one thousandth. This is not surprising since many other processes are at work to keep the cell in its metastable (low entropy) state. First of all, the membrane itself consisting of phospho-lipids comprises some 60% of the cell's mass and presents a highly ordered structure requiring an entropy reduction to be put in place. Likewise, proteins and peptides are composed of up to several thousand atoms each with fairly well specified positions leading to a net entropy drop compared to a non-living state. Finally, approximately 50% of the metabolic energy of a cell is utilized in the process of ion pumping across the membrane (Rolfe and Brown, 1997), mainly as a result of trans-membrane potential and the work of ion pumps. The latter resemble in its function the Maxwell demon, except of course, they do not possess a thinking function. Instead, they rely on molecular recognition mechanisms. These mechanisms, when a pump is activated, lower the entropy by binding the two molecules together. The subsequent placement of an ion or a macromolecule within the confines of a membrane permanently lowers the entropy by volume of exploration reduction as discussed above. A release of a waste product into the environment results in a precisely opposite effect. Chemical reactions may either absorb or release heat much like first order phase transitions, i.e. they may be either exothermic or endothermic. Since almost all living processes are in essence chemical reactions or cascades thereof, it is worth analysing them from the point of view of entropy. The most interesting reactions from the viewpoint of information theory are catalytic reactions of the special types of functional proteins called enzymes. Enzymes use a fine-tuned selection mechanism of molecules that have shapes complementary to a recognition pocket. This is called a lock-and-key mechanism and , by forcing particular orientations of the catalytically reacting molecules, enzymes increase the reaction rates by several orders of magnitudes. Consequently, this process can be viewed as information processing whereby the shapes of binding domains are recognized, the molecules are optimally positioned for binding and in some cases particular bonds are broken and others created. Some enzymes belonging to the class of allosteric proteins may adopt two or more stable conformations acting like switches, being activated in one conformation and inactive in some others. Following a binding and a catalysis event, enzymes return to their original conformation thereby participating in their active promotion of a particular reaction cyclically. From the point of view of information and entropy reduction, they do not overall decrease the entropy of the

49 cell (Lowenstein, 1999). At best, they break even since any piece of information that an enzyme invests in a catalytic reaction is re-collected at the end of a cycle. Furthermore, it is important to stress, that the information necessary to perform a particular function (molecular recognition) is not entirely contained in an enzyme. In order for an enzyme to be effective, it must be activated by the environment in which it resides: the water, the inorganic ions, etc. Thus, one may say that the information is contained in the entire system, i.e. the cell.

4

Biological Information

Shannon defined information as negative entropy given by the formula I=klnW=-kEpiln(pi)

(10)

Its introduction has enabled the resolution of the long-standing paradox referred to as Maxwell's demon. The problem involved a creature that operated a small door between two compartments of a container with two types of gas molecules, for example high- and low-energy ones. The end result would be a separation of the gas into hot and cold with no energy expenditure thus contradicting the second law of thermodynamics. Szilard's solution (Szilard, 1929) of the problem endowed the demon with information, i.e. Shannon's information which is negative entropy balancing out the changes in the entropy of the gas. In quantitative terms, the energy cost of 1 bit of information at physiological temperature is ln2 kT=3x 10'21J = 18.5 meV which should be kept in mind discussing biological functions that are replete with information content. Information about the structure and composition to be developed, such as protein sequences and folding patterns, is of major importance in this context.

Production of DNA takes place even in non-replicating cells. A typical mammalian cell polymerizes approximately 2x 108 nucleotides of DNA a minute into hnRNA (Brandhorst and McConkey, 1974) out of which only 5% end up in the cytoplasm coding for protein synthesis (Dreyfuss et al, 1993). Since there is redundancy in coding of nucleotide triplets for the 20 amino acids, the original 6 bits of information in DNA translate into log 2(20) = 4.2 bits in a protein. Consequently, on the order of 0.7 x 106 bits/s are transmitted form the nucleus to the cytoplasm. This is augmented by a small fraction of information due to mitochondrial DNA (Alberts et al 1994). As we have shown above, this is but a small fraction of the total information production (understood as negative entropy) of a living cell. The vast majority of information is contained in the organized structure of the cell and its components.

50

Since the Shannon information formula employs probabilities of particular states, there are inherent dangers of selecting these probabilities, especially when this is done purely combinatorially as is often the case, for example in an amino-acid or nuclei acid sequence determination. This is not necessarily a random choice situation akin to tossing a coin. This means that taking p=l/W for a single element selection, where W is the number of possible choices, may not be correct giving an excessively improbable (or high information) estimate. This would be the case if the choices of elements in a sequence are not of the same statistical weight, but instead are biased statistically, so that a more appropriate probability value is given by the canonical ensemble Boltzmann distribution formula Pi=p0exp(-Ei/kT). Of course, in order to make this estimate, one needs to know the energies E( and hence the Hamiltonian for the system. Therefore, the apparent information estimate of I=(klnN)n where n is the number of members in a string may be significantly larger than the true value of -S from thermodynamic estimates of a given state- a maximum entropy state for equilibrium and hence a minimum information content. For a string of choices (e.g. amino acid sequence in a peptide or a nucleic acid sequence in a DNA or RNA), this may lead to "basins of attraction" favoring some combinations strongly over others. There could be evolutionary retention of favored choices and the establishment of hierarchies of order. An immense number has been defined as 1=10110 and represents a clear computational barrier even from the point of view of cataloguing such an enormous number of objects. Immense numbers commonly appear in biology: both DNA and protein sequences are immense numbers arising from the sheer numbers of possible combinations in which these macromolecules may be formed. However, in view of the argument above, restricting the phase space by forming basins of attraction due to intra-molecular interactions may result in a hugely reduced number of combination one would encounter in practice. Another comment we wish to make is that regarding a clear distinction between information and instruction. While the former was introduced on purely statistical grounds as a measure of the number of choices possible when making a selection for a string of elements, instruction implies the existence of a message, a messenger and a reader who would the execute the message. A classic example of this would the synthesis of amino-acids contained in the triples of DNA and RNA base pairs. While there is the same information value in every triplet, namely k ln(43)=6 bits, some amino-acids are coded uniquely by a single triplet, some by two different ones and some triplets do not encode anything. This is obvious in view of the fact that there are 64 possible triplets of base pairs while only 20 distinct amino-acids, hence the redundancy. A similar difference between information and instruction can be found in the genome where in addition to the coding sequences of DNA, some of which are of vital importance to the very survival of a given organism, one finds socalled junk DNA that has apparently no coding value but represents a vast majority of the DNA sequence. The main thing to stress here is that information is often

51

confused with instructions. However, in the sense of Shannon's information, it is a more general concept that includes purely structural aspects such as combinatorial entropy reduction and chemical entropy changes. Instruction, on the other hand, implies a message and a message reader that executes a command. In this sense, this is similar to computer commands and operations. DNA and RNA are thought to be such biological messengers and so are hormones and various signaling molecules. However, as shown in previous sections, it appears that a vast majority of information content is not instructional in nature. This is akin to simple algorithms like the logistic map or fractal recursive relations which give rise to great mathematical complexity of the results that follow. Similarly, DNA can be viewed as an algorithm that spans an awe-inspiring complexity of a living cells. While coding for protein synthesis is contained in the genetic code, it is most improbable that details of structure formation need special coding. They most likely unfold due to self-organization inherent in the dynamics of the synthesized products In view of the discussion presented in this paper regarding information content and processing in a living cell, we wish to postulate the existence of two types of information in biological systems: (a) structural information- i.e. negative entropy and (b) functional information. The former is simply related to a neat and tight packing of the various molecules into macromolecules and macromolecules into organelles that comprise the cell. The latter on the other hand, pertains to the functioning of the cell and hence the rate and amount of chemical reactions taking place. The two forms are somewhat related but not identical. Imagine the construction of a car as an analogy. It may look perfectly good but if the gas line is cut, it won't function. The same holds true of a living cell. Some key reactions like synthesis of tubulin, if not properly executed, will lead to the cell death. While structural information should be maximized meaning entropy reduction by the cell, functional information is concerned with how rapidly information is being exchanged. Therefore, the cell's tendency should be to increase the speed of biochemical reactions and the amount of molecular interactions if possible. Living processes are cyclical in nature, so one is expecting to maximize the rate of information (or entropy) change over time for maximum functionality. In order to optimize both aspects: structure and functionality, a living system such as a cell, should strive to maximize the product of the two quantities, i.e. achieve: Max {I2(t) (dl(t)/dt)2}

(11)

where we have squared the quantities in the product due to the cyclicity of life's processes. Note that some aspects of this distinction between structural information and functional information in biological systems were already emphasized more than three decades ago by H. Froehlich (1968) who coined the term biological coherence to draw attention to both the holistic and functional integration of the information flow in living matter.

52

5

Conclusions

The various points of reference regarding the nature of the living state undoubtedly reflect the prevailing Zeitgeist of the period in which a given theory has been created. The viewpoint of representing the cell as a machine, or even a factory, closely mirrors the worldview of the industrial revolution of the 19th century. Likewise, the currently popular opinion that living cells are intensely engaged in some type of computation is closely linked with the technological revolution ushered into in the late 20th century as a result of the proliferation of computer technology. Both points of view have merits, i.e. the cell obeys the laws of physics such as the first law of thermodynamics and hence can be viewed as a thermodynamic machine and simultaneously it acts against the second law of thermodynamics by creating structural and functional order. In other words, it creates and maintains information. Furthermore, it most certainly processes information and engages in signaling thereby actively performing computation. It is safe to say that living cells can be viewed as both micro-factories (with nanomachines performing individual tasks) and biological computers whose nano-chips are the various proteins and peptides in addition to DNA and RNA. Most of the cell is what we might call hardware while a small fraction is software (for example the genetic code in the DNA that instructs for the synthesis of proteins). Probably only a small fraction of the cell can be seen as pure information content. Is there something else in living systems that neither machines nor computers possess? Probably yes. At least two properties distinguish animate matter from inanimate objects: procreation and autonomy expressed by free will (to move against the whims of thermal noise, in the very least) On a more practical note, can biomimetics be used to enhance our computational capabilities? The answer is yes, although progress in this area has been slow. In general terms, a chemical computer is one that processes information by making and breaking chemical bonds, and it stores logic states or information in the resulting chemical (i.e., molecular) structures. A chemical nano-computer would perform such operations selectively among molecules taken just a few at a time in volumes only a few nanometers on a side. An alternative direction has been to adapt naturally occurring biochemicals for use in computing processes that do not occur in nature. Important examples of this are: Adleman's DNA-based computer, Birge's bacteriorhodopsin-based computer memories as briefly discussed below. Adleman first used DNA, to solve a simple version of the "traveling salesman" problem where the task is to find the most efficient path through several cities. Adleman (1994) demonstrated that the billions of molecules in a drop of DNA contained significant computational power. Digital memory can be seen in the form of DNA and proteins. Exquisitely efficient editing machines navigate through the cell, cutting and pasting molecular data into the stuff of life. Has evolution

53

produced the smallest, most efficient computers in the world? Even if this is an exaggeration, the innate intelligence built into DNA molecules could help fabricate tiny, complex structures using computer logic not to crunch numbers but build things. Furthermore, DNA computers may use a billion times less energy than electronic computers, while storing data in a trillion times less space. Moreover, computing with DNA is highly parallel: in principle there could be billions upon trillions of DNA or RNA molecules undergoing chemical reactions, performing computations, simultaneously. Molecular biologists have already established a toolbox of DNA manipulations, including enzyme cutting, ligation, sequencing, amplification, and fluorescent labeling. The idea behind DNA computing springs from a simple analogy between the following two processes: (a) the complex structure of a living organism ultimately derives from applying sets of simple instructed operations (e.g. copying, marking joining, inserting, deleting, etc.) to information in a DNA sequence, (b) any computation, is the result of combining very simple basic arithmetic and logical operations. Eric Winfree intends to create nanoscopic building blocks out of DNA that are designed—to carry out mathematical operations by fitting together in specific ways. DNA is not the only candidate for a biological computer chip. Birge (1994) proposed to use of the lightsensitive protein dye bacteriorhodopsin, that is produced by some bacteria. He and his collaborators have shown that it could provide a very high density optical memory that could be integrated into an electronic computer to yield a hybrid device of much greater power than a conventional, purely electronic computer. In conclusion, electronic computers assembled using DNA and run on organic nutrients instead of electricity are another science-fiction idea that may soon become reality. 6

Acknowledgments

Research for this project has been supported by NSERC, MITACS-MMPD and PIMS. JT gratefully acknowledges hospitality during his stay at the ENS Lyon and financial support from the French Ministry of Education. References 1. E. Schroedinger, What is Life? The Physical Aspects of Living Cells, Cambridge University Press, Cambridge, 1967. 2. W. R. Loewenstein, The Touchstone of Life, Oxford University Press, Oxford, 1999. 3. B. Alberts, D. Bray, J. Lewis, M. Raff, K. Roberts and JD Watson, Molecular Biology of the Cell, Garland Publishing, New York, 1994 4. SA Kauffmann, The Origins of Order: Self-Organization and Selection in Evolution, Oxford University Press, New York, 1993.

54

5. CE Shannon and W Weaver, The Mathematical Theory of Communication, University of Illinois Press, Urbana, 1959. 6. L. Szilard, On the decrease of entropy in a thermodynamic state by the intervention of intelligent beings, Z. Phys. 53, 840 (1929). 7. S. Hameroff, The Ultimate Computing, Elsevier, Amsterdam, 1987. 8. J.A. Brown and J.A. Tuszynski, Dipole Interactions in Axonal Microtubules as a Mechanism of Signal Propagation, Physical Review E 56, 5834-5840 (1997) 9. Adleman, L., "Molecular Computation of Solutions to Combinatorial Problems," Science, Vol. 266, 11 November 1994, pp. 1021-1023 10. Birge, R., "Protein-Based Three-Dimensional Memory," American Scientist, July-August 1994, pp. 348-355. 11. Morowitz, HJ, Some order-disorder considerations in living systems, Bull. Math. Biophys. 17, 81-86 (1995). 12. Gilbert, EN, Information theory after 18 years, Science 152, 320-326 (1966). 13. Johnson, HA, Information theory in biology after 18 years, Science 168, 15451550(1970). 14. Brandhorst BP and McConkey EH, Stability of nuclear RNA in mammalian cells, J. Mol Biol. 85,451-563 (1974). 15. Dreyfuss, G, Swanson MS, Pinol-Roma S and Burd, CG, hnRNA Proteins and the biogenesis of mRNA, Annu. Rev. Biochem. 62,289-321 (1993). 16. Daut J, The living cell as an energy-transducing machine, Biochem. Biophys. Acta, 895,41-62 (1987). 17. Kim, HD, Koury MJ, Lee, SJ, Im, JH and Sawyer, ST, Metabolic adaptaion during erythropoietin-mediated differentiation of mouse erythroid cells, Blood, 77,387-3392(1991). 18. Rolfe DFS and Brown GC, Cellular energy utilization and molecular origin of standard metabolic rate in mammals, Physiol. Rev. 77, 731-758 (1997). 19. L.D. Landau and E.M. Lifshitz, Statistical Physics, Addison-Wesley, 1969. 20. I. Prigogine, From Being to Becoming: Time and Complexity in the Physical Sciences, Freeman and Co., San Francisco, 1980. 21. H. Frohlich, Int. J. Quantum Chem. 2, 641 (1968).

55 COUPLING A TUBULOGLOMERULAR FEEDBACK N E P H R O N MODEL WITH A MYOGENIC AFFERENT ARTERIOLE M O D E L

ROMAN M. ZARITSKI Computer Science Department, Montclair State University, Upper Montclair, NJ 07043 E-mail: [email protected] E. BRUCE PITMAN Department of Mathematics, State University of New York, Buffalo, NY 14214-3093 E-mail: [email protected] HAROLD E. LAYTON Department of Mathematics, Duke University, Box 90320, Durham, NC 27708-0320 E-mail: [email protected] LEON C. MOORE Department of Physiology and Biophysics, State University of New York, Stony Brook, NY 11794-8661 E-mail: [email protected] This paper presents numerical and computational aspects of coupling two complex dynamic mathematical models from renal physiology, the tubuloglomerular feedback (TGF) and the myogenic models. Although these two flow models are very different mathematically, they are physically coupled and can be incorporated into one model through a flow resistance approach. Here we present a computational algorithm to solve this combined model and some results of the numerical simulations. 1

Introduction

A nephron is the principal blood filtration unit of the kidney. Figure 1 gives a schematic diagram of a mammalian nephron. Blood enters the glomerulus (GL) through the afferent arteriole (AA) and the filtrate is forced by blood pressure out of the GL down a U-shaped system of tubules (PT, DL, AL). The glomerular filtration rate (GFR) of a nephron is the filtrate flow rate entering the P T . The first author's doctoral dissertation [1], which is the basis of this paper, provides an extensive overview of the importance of the GFR. GFR may be constant or it may exhibit regular (limit-cycle) oscillations,

56

TGF AA T /

GL

EA MD

AL

CRA

CD

DL

Figure 1. Schematic short-looped nephron (CRA: cortical-radial artery, AA: afferent arteriole, GL: glomerulus, EA: efferent arteriole, P T : proximal tubule, DL: descending limb, AL: ascending limb, MD: macula densa, T G F : tubuloglomerular feedback, CD: collecting duct).

irregular oscillations, or chaotic (or quasi-chaotic) oscillations. Constant flow and regular oscillations are largely explained by the TGF model (e.g., see [2]), which focuses on explicit Cl~ consideration in AL, and which is summarized below in section 2. Tubuloglomerular feedback (TGF) is a complex (nonlinear, delayed) negative feedback between the filtrate flow rate up the AL and the flow rate of blood entering GL through A A. Under certain conditions it leads to regular oscillations in GFR. However, to explain all the complexity in GFR dynamics, including irregular and quasi-chaotic behavior, other factors should be considered along with TGF. One such factor is the explicit consideration of the AA, whose walls exhibit interesting and complex dynamics that affects GFR. Couhterintuitively, increases in A A intramural pressure are followed by an active A A wall contraction, known as the myogenic response. A myogenic model is presented below in section 3. Because the AA delivers blood to the GL, the myogenic model is coupled with the TGF model through blood pressure and blood flow. In section 4 below a resistance model is presented that combines the TGF and myogenic models into one system. The TGF model (nonlinear hyperbolic PDE with delay) has been well

57

understood and numerical methods to solve it have been identified [3]. The myogenic AA model presented below (a PDE system) is very different from the TGF model mathematically and computationally. Given the differences in the two models, a numerical algorithm to solve the combined system is not straightforward. Section 5 below presents a computational algorithm, which was used successfully to compute the GFR dynamics numerically, as well as the discussion of the results. 2

T G F Model

This section gives a summary of the nondimensional form of the TGF mathematical model presented in [3], using the notations of [1]. The focus of this model is the explicit representation of the Cl _ concentration C(x, t) along the TAL. Conservation of Cl~ yields the equation ^C{x,t)

+ F{C(l,t-T))^C(x,t)

= -J(C(x,t)),

0<x
(1)

with the boundary condition C(0, t) = 1 (x — 0 corresponds to the loop bend, and x = 1 corresponds to the MD), where J(C) = ^

|

(2)

models the work of the ion pumps in the TAL walls, approximated by Michaelis-Menten kinetics, with Vmax = 2.0 and Km = 0.5. F in Eq. 1 stands for the fluid flow rate up the TAL (or velocity) and T is the TGF response delay. Note that F is assumed to be a fixed fraction (0.2) of the GFR. The negative TGF is modeled by letting F(C) = l + k1 tanh[fc2 (Cop - C)] ,

(3)

where fci = 0.3 and Cop is the operating (steady-state) chloride concentration at the MD, equal to 5(1), where S'(x) = -J(S(x))

,

(4)

with the boundary condition 5(0) = 1, is the steady-state (time-independent) form of Eq. 1. A detailed bifurcation and numerical analysis for this model is presented in [3] and is confirmed in [1]. According to the bifurcation analysis the stability of the steady-state GFR is determined by only two parameters: r (the nondimensional TGF delay time) and 7 (the TGF gain), defined as -fcifc25'(l). The n = 1 curve in the r — 7 parameter space on Fig. 2 separates the stable and unstable regions of the steady-state GFR.

58

0

0.2

0.4

0.6

0.8

1.0

Figure 2. The T G F model bifurcation curves (denoted by £ ( T ) ) . The position of parameter point (T, 7) with respect to the n = 1 curve determines the system stability. Below this curve the steady-state GFR is stable, and it is unstable above, i.e., small perturbations in GFR lead the system to sustained large-amplitude limit-cycle oscillations. Dashed box marks the approximate physiological range.

Numerical simulations, based on a second-order predictor-corrector ENO difference scheme, confirm the above analytical results. Above the n = 1 curve on Fig. 2, small perturbations to the steady-state GFR lead to sustained large-amplitude periodic GFR oscillations. The steady-state fluid transit time through the TAL (15.7 s) was used as a reference unit for time t and delay r throughout this model. 3

Myogenic A A Model

This section summarizes the nondimensional mathematical model of the AA and the myogenic response, as presented in [1], which is a modified version of the original, largely phenomenological model in [4]. The model explicitly represents the AA from the CRA (z = 0) to the part before GL (z = 1). The equations are: (5)

£*•*>=-?ehjW S

QJr(z> *) + k(z, t)r(z, t) = p(z, t) ,

and

(6)

59

d2 h-^k(z,t)

d + hg-tHz,t)

+ [1 - 0.3(p(M) -PsS{z))]k(z,t)

=p(z,t)

,

(7)

where p(z,t) is intramural AA pressure, r(z,t) is the AA radius, Q(t) is the blood flow rate through the AA, k(z,t) is the myogenic coefficient, and pss(z) is the steady-state AA pressure profile. The constants are: S = 0.05, 63 = 0.001, 62 = 0.2, and ksm = 0.6. With the physiological boundary conditions p(0, t) = Pinj = 3 and p(l,t) = p0ut,t = 2 this system has the steady-state solution Q{z,t) = r(z,t) = 1 and p(z,t) = pss(z) = 3 — z, which corresponds to straight AA walls and linear pressure drop along the AA. A simple explicit first-order difference scheme is used to solve this model numerically. The reference unit for time in this model was chosen to be the same as in the TGF model (15.7 s). 4

Resistance Model

To combine the TGF and myogenic models, we use a resistance approach, i.e., we assume that the blood flow rate through a thin vessel is equal to the pressure drop across the vessel divided by the vessel resistance [5]. This assumption is similar to Ohm's law for electric currents. Figure 3 shows such a combined model, as presented in [1]. In this combined AA-TGF model, the TGF submodel is the same as in section 2 above, except that now the feedback signal regulates RTGF resistance, and not directly the flow Q or QGFR- If the AA wall radius r(z,t) in the myogenic A A model from section 3 above is known, then the afferent resistance can be calculated as [1]: f1 = / Z T - ^ 47 dz • (8) Jo r(M) On the other hand, if RTGF and RAA are computed, because all the endpoint pressures p , n , poi = 0, P02 = 0 (boundary conditions) and resistances RE and Rj? are known constants, the flow Q and its fixed fraction QGFR can be computed. Thus, [1] shows that RAA

Q{t)

= R.Jt) RAA(t)

i f - 1+

3 — ' F{cMD(t-r))

This fact forms the basis of the numerical algorithm described below.

0)

60

Figure 3. AA-TGF Resistor Model. RESISTANCES: RAA- total AA resistance from z = 0 to z = 1, RTGFthe resistance of the distal AA portion, RE: total glomerular and efferent resistance to blood, RF: total glomerular and tubular resistance to nitrate. PRESSURES: pin: at z = 0 of AA, poutat z = 1 of AA, PGL- preglomerular, poi: venous (far down EA), po2'- tubular (far downstream). FLOW RATES: Q: through the AA, QE: efferent, QGFR- GFR. TGF (dashed arrow): changes RTGF based on CMD(t — T), which depends on the QGFR history.

5

Numerical Simulations and Discussion

At the top level the algorithm that was used to solve the combined AA-TGF model numerically consists of repeated iteration of the following four steps: 1. Updating the A A flow rate as a solution of the algebraic resistor model (Eq. 9); 2. Updating the AA pressure, radius, and myogenic coefficient k(z,t) based on Eqs. 5, 6, and 7, using an explicit first-order scheme; 3. Updating the AL flow rate as a solution of the algebraic resistor model (based on Eq. 9); 4. Updating the AL chloride concentration based on Eq. 1 (with a factor of Q(t) in place of F(C(l,t — r))), using a second-order predictor-corrector scheme. This computational algorithm was implemented in C++ in an objectoriented way. Both "Nephron" and "AA" objects in the code have "OneTimeStep()" method, which corresponds to updating the dependent variables of the TGF and AA models respectively to the next time step. One hundred grid points were used to represent both the AL and the A A, and the time step was 0.001. With these values the numerical solutions exhibited adequate, stable behavior. A typical simulation run starts with the steady-state variable

61 7.0 6.0 5.0 4.0 3.0 2.0 1.0 0.0

Figure 4. A : Dashed line: numerically determined bifurcation curve for the AA-TGF model. Solid curve: the primary bifurcation curve of the basic T G F model (n = 1 curve on Fig. 2). B : One sample GFR curve obtained as a numerical solution of the combined model, which exhibits oscillatory dynamics not observed in the basic T G F model. The time t is in nondimensional units.

values, and a perturbation is introduced to the feed pressure pin. Numerical simulations for the presented computational model allow one to investigate the AA-TGF model behavior, which cannot be deduced analytically due to the system complexity. This includes numerical bifurcation (stability) analysis, numerical solutions with new dynamic properties, and simulation of diseases (e.g., hypertension). Figure 4 gives two examples of information obtained from the numerical simulations for the AA-TGF model based on the above algorithm. Acknowledgment This research was supported in part by the National Institutes of Health, grant DK-42091. References 1. Zaritski, R. M. Models of complex dynamics in glomerular filtration rate. Ph.D. dissertation, State University of New York, Buffalo, New York, 1999. 2. Holstein-Rathlou, N.-H., and D. J. Marsh. Renal blood flow regulation and arterial pressure fluctuations: a case study in nonlinear dynamics. Physiol. Rev. 74: 637-681, 1994.

62

3. Layton, H. E., E. B. Pitman, and L. C. Moore. Bifurcation analysis of TGF-mediated oscillations in SNGFR. Am. J. Physiol. 261 (Renal Fluid Electrolyte Physiol. 30): F904-F919, 1991. 4. Arthurs, K. M., L. C. Moore, C. S. Peskin, E. B. Pitman, and H. E. Layton. Modeling arteriolar flow and mass transport using the immersed boundary method. J. Comput. Phys. 147: 402-440, 1998. 5. Kozlova, E. K., V. I. Badicov, A. M. Chernysh, and M. S. Bogushevich. Modeling blood flow in vessels with changeable caliber for physiology and biophysics courses. Am. J. Physiol. 272 (Adv. Physiol. Educ. 17): S26-S30, 1997.

63

A MATHEMATICAL M O D E L OF T H E I N N E R A N D O U T E R RENAL MEDULLA M. E. ROSAR Department of Mathematics, William Paterson University, Wayne, NJ 074 fO E-mail: [email protected] We have extended our existing n-nephron central core model of the inner medulla that treats NaCl, urea, and water movement to a four dimensional model in space and time that is based on the actual three dimensional architecture of the medulla as determined by microscopic section. We have achieved excellent qualitative and quantitative results for the concentration in the outer medulla. This new model has now been extended to the inner medulla. We now have a numerically stable mathematical model which, in the steady state, still gives excellent quantitative agreement with experimental data in the outer medulla. Results in the inner medulla are good as well, but must be further developed to achieve the high concentration levels observed in some species.

1

Introduction

Early models of kidney function typically considered single solute transport in a medulla characterized by a single nephron and a central core. These models were then refined to consider multiple, distinct nephrons, again with a central core. These one-dimensional, central core models have been further expanded and refined to consider the three-dimensional structure of the renal medulla [1]. Although good results have been obtained, further refinement is necessary. To this end, in this study, we set up a mechanism to use the actual three-dimensional anatomy of the kidney, to see if we can obtain enough urea cycling to drive the concentration mechanism in the inner medulla. We present a model which accommodates the structure of the outer stripe and inner stripe of the medulla as well as the interstitium. In the outer medulla, while water is extracted from the descending limb (DL) salt is reabsorbed from the thick ascending limb (AL) by an interstitium of higher concentration. Thus there must exist an active transport mechanism in the outer medulla. By contrast, the mechanism of concentration in the inner medulla is not understood as well. Although this mechanism works in a qualitative way, efforts to model the system quantitatively have been disappointing. Overall, it has been shown that, while the central core model is a good approximation to the functioning of the outer medulla, it is not so good a model for the inner medulla. We seek to incorporate the actual three-dimensional

64

architecture of the inner and outer medulla in order to get a complete model of urea cycling. Our first step towards this goal was to develop our four dimensional model for the outer medulla. Having done this, we proceeded to extend this model to the inner medulla. The division between the inner and outer medulla is physically characterized by the turning of the short DLs of the loops of Henle up into the short ALs. The remaining DLs entering the inner medulla will turn at different levels of the inner medulla, with ultimately only one approaching the papillary tip. In addition, the thickness of the tubules changes on entering the inner medulla, thus leading to changes in the tubule parameters' values. 2

The Model

The architecture of our four dimensional model in space and time is derived by integrating the tree like structure of the mammalian collecting duct (CD) system with the organization of the vascular bundles in the medulla. The CDs traverse the outer medulla without fusing, but in the inner medulla they fuse successively eight times in all (in the human) to give a final CD emptying into the renal pelvis with 2 8 outer medullary tributaries. These 256 tributaries drain between 1280 and 1536 nephrons. The differential equations describing solute and water flow in the medullary counterflow system, consisting of nephrons, descending vasa recta (DVR), and combined interstitial-ascending vasa recta-capillary-space (IVCS) are an extension of those used previously for medullary models [1], and we note that the same equations are used for both the inner and outer medulla. With the tube axis taken to be parallel to the z-axis, mass balance requires that the fcth solute in the ith flow tube, whether nephron, DVR, or CD, obeys the conservation equation

- ^

+ AiSik Jik

~

=

—dT-

(1)

where F^ is the axial flow of the fcth solute in the ith flow tube, At is the cross-sectional area of the tube, Sik is the average net rate per unit volume at which material is being produced or destroyed by chemical or physical reactions, usually zero, •/,& is the net outward transmural flux per unit length, Cik is the solute concentration, and t is time. The corresponding equation for volume flow in the ith tube is dFiv dz

Jiv

_dAi ~ at •

{}

65

Here F „ is the axial volume flow and Jiv is the outward transmural volume flux. The pressure drop along the tubes is determined by the equation of motion OT)'

-Q— = -RiFFiV

(3)

where pi is the hydrostatic pressure and RiF is the flow resistance. In all flow tubes, the solute flow is related to the volume flow by ddk d^i F%k = FiVCik — AiDik—^ AiUikCik—^— (4) oz oz where Dik is the diffusion coefficient of the fcth solute, mk is its mobility, and \I>j is the potential in the ith tube. In the tubules it is usually assumed that the diffusive and mobility terms are negligible relative to convective flow. Electroneutrality requires that

X>Cifc=0

(5)

where Zk is the valence of the fcth solute. Implicit in the above equations is the assumption that radial variations of concentration within individual tubes are negligible. The medullar volume is discretized in terms of Cartesian coordinates x, y, and z. At any z level (representing medullary depth) the xy grid is indexed by I. The section of IVCS located in this volume of medulla is indexed by (I,z). A section of tube, at this z depth, will be contained (all or in part) in one or more volume grid spaces, again with index of the form (I, z), and can exchange solute and/or fluid with the section of interstitium located in the same (I,z) grid space(s). We let On represent the fraction of volume of tube i which is in interstitial volume element J. Since we are assuming axial symmetry along the length of each tube, the amount of exchange between the ith tube at depth z and the Jth interstitial element at this same depth will be proportional to the percentage of the ith tube contained in the i t h volume element, viz. On. Therefore, we can conclude that the transmural volume flux can be given by Jiv = 2_^l^iI^iI,v

(6)

where JuiV represents the flux between the ith tube and the 7th interstitial element. Similarly, the transmural flux for the fcth solute can be expressed as Jik = 2^,01/Jt/,k-

CO

66 Transmural volume flux is given by

Y^ BT(Cik - Cik)aik + ^ - pi)

(8)

L k

and transmural flux of the feth solute is given by Jik = Jiv(l-trik)Cavg+2*nhik

[(Cik - Cjk) + Cavg(ZkF/Rr)(*i

-

*i)]+J?k (9) where r j is the radius of the ith tube, &ik is the Staverman reflection coefficient of the wall of the ith tube for the fcth solute, hiV is its hydraulic permeability coefficient per unit length, hik is its passive permeability for the Arth solute per unit length, Jfk is the metabolically driven transport of the fcth solute, R is the gas constant, T is the absolute temperature, F is the Faraday constant, Cavg is the average transmural concentration,

cavg = £*+£«.

(10)

The metabolically driven transport is usually assumed to be J- + Oik/Uik

where ctik is the maximum rate of transport and bik is the Michaelis constant. In the CDs and loop of Henle urea transport is assumed to be carrier mediated [2], k = urea: Jik = 2-KrihikKm ( 7 ; — ^

7;

-^7- ) •

(12)

The equations for the IVCS are the key to incorporating morphometric data into the model. Let piv be the fraction of a particular volume element AVj defined by the selected coordinate system that is interstitium, ascending vasa recta, or capillaries. For each volume element AVj, let v j be the volume flow in the volume element averaged over a unit area. We define pjk = pivCjk. Then the equation of continuity for IVCS volume flow is (AV/)-|p =

-{AVi)V\pivwJ)A-YjAzieiIJihv+'YjeDvR,iAnDVRFDVRyV i

i

(13) where the first term on the right accounts for fluid entering the volume element AVi from contiguous volume elements of the IVCS, the second term accounts for fluid entering from transmural volume flux from tubules or DVR; the summation is taken over all tubules. The third term accounts for fluid from

67

DVR that terminate at level z to supply the IVCS. This summation is taken over all DVR, with AnovR equal to the fraction of the DVR tubule in the 7th volume element at z = length of the ith DVR. The equation of continuity for flow of the fcth solute is

i

i

(14) The equation of motion for the space is Vp, = - R / i ; - v , .

(15)

Solute flow in the space is determined by the Nernst Planck equation v/fc = v//9 /fc - D/ fe • Vp/fc - pikvt-ik • V*/ f c .

(16)

In Eq. 16 G ik and ujf. are tensors defining diffusion and mobility along the coordinate axes. The set of equations for the IVCS is completed by the equation of electroneutrality

J2 ZkPk = 0.

(17)

k

3

N u m e r i c a l solution

We take as our functional unit one vascular bundle and associated nephrons and interstitium. We treat the DLs and the ALs as separate tubes. Thus we have a total of 256 tubes, with 80 DL, 80 AL, 80 DVR and 16 CD. To guarantee that this functional unit will "fit" into the larger picture, we impose periodic boundary conditions on the sides adjacent to other functional units. Boundary conditions at the "top" will agree with conditions in the cortex; conditions at the "bottom" will agree with conditions at the papillary tip. We consider the region of interest to be a rectangular cube (note that in actuality, it will not matter very much to the computation if we deform the region into a "prismatic" shape). We impose Cartesian coordinates on this system with the z-direction being in the direction along the nephrons with the positive direction being toward the papillary tip, and z = 0 at the cortical-medullary boundary. The model is designed to accept any number of solutes. In this investigation, we consider only urea and salt (NaCl). Since each solute is electrically neutral, each has a valence of 0, and thus Eqs. 5 and 17 are satisfied identically and therefore may be ignored. Then for each tubule, there are four equations

68

involving the volume flow and four equations for solute flow for each solute. Similarly, there are four equations for the interstitium. The first equation that we solve is Eq. 1. We will assume that we have Poiseuille flow, thus the tube walls may be considered to be rigid (although permeable), thus the cross sectional area of any tube will remain a constant in time. Substituting Eq. 4 into Eq. 1, and using the observed fact that diffusion is negligible in the tubes and that we are treating all our solutes as electrically neutral, Eq. 1 becomes a flux-conservative equation which can be solved for the concentration by a semi-implicit method. For the ith tube and the kth solute this leads to a discretized equation where we use upwind differencing. A von Neumann stability analysis shows that this scheme will be numerically stable in time. Equations 2 and 3 are solved using the trapezoid rule. Equations 4, 8, 9, and 12 are solved directly. Any needed derivatives are computed by a centered differencing scheme, with appropriate modifications at the boundaries. To solve the interstitium equations, we assume in the first approximation that the flow is one-dimensional, in particular, flowing up through the outer layer of the outer medulla and to the cortex. For lateral flow in the interstitium, we will incorporate Fourier transform methods. For any given volume element AV}, pjv is the fraction of that volume element which is interstitium, ascending vasa recta or capillaries, which is assumed to be known. We will assume that this fraction is uniform throughout the outer medulla and remains constant in time. Then Eq. 13, can be integrated for vjz. Eq. rhok interstitium becomes a combined advective diffusion equation which we solve with an explicit scheme for the advective term and an implicit Crank-Nicholson scheme for the diffusive term. In the first approximation, the equation for the pressure, Eq. 15, becomes a linear, first order differential equation, which can be integrated. Finally, v*. can be calculated directly from Eq. 16 and the solute concentrations in the interstitium from Plk = PlvClk4

Results and Conclusion

The outer and inner medulla have been successfully modeled in the manner described herein. The results for the outer medulla are in good agreement, both qualitatively and quantitatively, with experimental data found in the literature. Figure 1 shows the osmolality in the outer and inner medulla, as a function of (normalized) medullary depth, where 0. represents the corticalmedullary boundary and 1. represents the papillary tip. The junction of the outer and inner regions of the medulla is represented by 0.5. In each case we

69 Osmolality l_

/ T ^ / ^**\'~'~

/' /

/''

//,

/' f • / I

/

•'

'' t'/

....

, 0.0

0.2

0.4

0.6

0.8

1.0^ePth

Figure 1. Osmolality in mosm/L as a function of normalized medullary depth for the outer and inner medulla

see a continuous, smooth, transition between the outer medulla and the inner medulla, indicating that the model has been successfully implemented and remains stable throughout the computation, allowing the numerical calculations to reach a steady solution. This medullary model incorporates more detail of medullary structure than is found in previous models. The passive transport mechanism, which successfully models the concentration mechanism of the outer medulla, has been extended to the inner medulla. This passive mechanism, however, is not sufficient for the inner medulla. Results show a relatively flat concentration profile in the inner medulla, Figure 1, reaching concentrations of approximately 600 mosm/L near the papillary tip. This is substantially less than the 6000 mosm/L achieved by dessert rodents. Further methods of concentration in the inner medulla remain to be explored. This work was supported in part by a grant from the William Paterson University, College of Science and Health, Center for Research. References 1. J.L. Stephenson, Handbook of Physiology — Section 8, Renal Physiology, Vol. 2., ed. E. Windhager, (Oxford Univ. Press, NY 1992), 1349. 2. J.L. Stephenson, H. Wang, and R.P. Tewarson, Am. J. Physiol, 1995.

71

A NATURAL CODON SPACE AND METRIC H.M. HUBEY Department of Computer Science, Montclair State University, Upper Montclair, NJ 07043 E-mail: [email protected] The building blocks of life are the nucleutides adenine (A), cystosine (C), guanine (G) and thymine (T). These can be coded in binary and displayed in a table similar to the chemical table of elements. A particular form of coding, the Gray code, shows the relationship of these nucleotides in better ways than a simple binary code.

1

Introduction: the Nucleotides

Cells are the building blocks of life. Simple organisms like bacteria do not have cells with nuclei (prokaryotic cells). Most other cells have a nucleus distinct from the rest of the surrounding cytoplasm (eukarytic cells). DNA (the basis of life) is located in the nucleus (in eukoryates) and carries genetic material (genes) that determines the structures of proteins which are the building blocks of life. It is a an amazing fact that the DNA molecules (which are linear polymers) consist of only four types (bases of nucleotides): adenine (A), cystosine (C), guanine (G) and thymine (T). These are linked together by sugar-phosphates which form the 'backbone' [1]. The spatial structure of the DNA as is well-known is a double-helix of two complementary linear strands of nucleotides are held together by weak hydrogen bonds in which a G in one strand complements with a C in another and an A complements a T. DNA replication takes place via a process in which the two strands separate and then act as templates for new complementary strands. At this level DNA may be considered to be a language with four symbols in the alphabet. Enzymes, which are also proteins, are catalysts for cellular chemical reactions. 2

The Amino Acids

They are built up of only 20 different types of amino acids, linked to each other by peptide bonds. Such polypeptide molecules are not linear but have unique three-dimensional structures specific to their functions. The simplest part of the genetic code, then, is to map the four-letter alphabet code (the bases) of the linear DNA molecules into the 20 letter alphabet code (amino acids) of the threedimensional protein molecule. In the barest essentials, a triplet of bases (called a codon) codes for a distinct amino acid. In other words, one of the DNA strands read in a particular direction

72 (specified by its ends) three bases at a time, translates into a unique sequence of amino acids. A single strand of DNA can have many genes, each coding for a different protein (one gene for one protein). Because of the unique start and termination of the strands, there are codons which also code for start and termination of the amino acid sequence pertaining to a protein. This code was deciphered in 1966. A G C 11 00 01 GLY GLY GG 0000 GLY GA 0001 GLU GLU ASP GU 0010 VAL VAL VAL

U 10 GLY ASP VAL

GC 0011 ALA

ALA

ALA

AG 0100 ARG2 ARG2 SER2

SER2

AA0101 LYS

LYS

ASN

ASN

AU0110 MET

ILE

ILE1

AC 0111 THR

THR

G 00 GG0000 GLY GA 0001 GLU GC0011 ALA

A 01 GLY GLU ALA

U 10 GLY

C 11 GLY ASP

ASP

ALA

ALA

GU 0010 VAL AU0110 MET

VAL

VAL

VAL

ILE

ILE1

ILE1

AC 0111 THR AA 0101 LYS

THR

THR ASN

THR

ILE1

THR

THR

AG 0100 ARG2 ARG2 SER2 SER2

UG 1000 TRP STOP2 CYS

CYS

UG 1000 TRP STOP2 CYS

CYS

UA 100] STOP1 STOP1 TYR

TYR

UA 1001 STOP1 STOP1 TYR

TYR

UU 1010 LEU2 LEU2 PHE

PHE

UC 1011 SER1 SER1

SER1

ALA

UC 1011 SER1 SER1

SER1 SER1

CG 1100 ARG1 ARG1 ARG1 ARG1 HIS CA 1101 GLN GLN HIS CU 1110 LEU1 LEU1 LEU1 LEU1 CC 1111 PRO PRO PRO PRO Table 1 Plain Binary Coded Codon Table

LYS

SER1

ASN

UU1010 LEU2 LEU2 PHE

PHE

CU1110 LEU1 LEU1 LEU1

LEU1

cc mi

PRO PRO PRO PRO CA1101 GLN GLN HIS HIS CG1100 ARG1 ARG1 ARG1 ARG1 Table 2 KH-map Coded Codon Table

There are 64 possible codons; ( 4 x 4 x 4 ) . Of these three codons code for STOP or termination of the translation into amino acids. The mapping from the 61 codons to 20 amino acids determines the three-dimensional folding of the protein molecule. This process is called gene expression. A part of one strand of DNA is first transcribed into a molecule called messenger RNA or mRNA (ribonucleic acid). The mRNA is single-stranded and pos-

73 sesses bases complementary to the ones on the DNA with the exception that the adenine (A) in the DNA is paired with uracil (U) on the mRNA (instead of thymine, T). mRNA is an intermediary which carries the genetic information from the nucleus to the cytoplasm where the actual translation into protein occurs. The mRNA alphabet consists of A, C, G, and U, and it too can fold into distinctive three-dimensional structures along some of its parts via base pairing. Only a fraction of the DNA strand called exons actually code for proteins. The rest of it (called introns) are non-coding regions which are also called junk DNA, and which have to be removed. The edited version of the mRNA is the template that is translated into protein. This translation is carried out by two other types of RNA, ribosomes and tRNAs (transfer RNAs). The tRNA has a dual role; it is able to recognize both the amino acid and its corresponding codon. Because sometimes only the first two bases are read (the third base being irrelevant for tRNA selection), it is not necessary for there to be 61 types of tRNAs. 3

Binary Coding of Codons

These processes, DNA replication, DNA transcription into mRNA, and translation of mRNA into proteins is the basis of molecular biology and genetics. This article looks at the patterns not seen until now in the coding of the 64 codons into 20 amino acids. The genetic code can be said to be constructed from codons. It has been shown that the codon code is essentially a combination of a fixed-length and variable-length code [2], and the same authors have shown that there is a binary coding scheme for this. This code is a combination of fixed length code and a variable length code, and is given in Table 1. The coding is A=01, G=00, C = l l and U=10. The above authors' contribution is to determine the efficiency of the genetic code from an algorithmic complexity viewpoint. In this article, we look into remarkable patterns this code possesses. In Table 1 (Appendix), the binary code uses x as a 'don't care' condition of digital design. Therefore the first 8 amino acids use only a 4-bit code; the last two bits which represent one of {A,G,C,U} can be any of them. The second and third sets of amino acids in the table, are 5-bit codes. In the second set the third codon can be only one of {C,U}. In the third set, the third codon can be only A or G. Thererefore in the binary code, the last bit of the second set is always 1, and the last bit of the third set is always 0. The last set is a 6-bit code. Thus, using z as one of the four bases, and b as a bit, the codes are of the type zzbb, zzlb, zzOb, and zzz. We can put these in a table using this binary code as done in Table 1. Table 1 shows the various amino acids, at most one per row. In some cases there are two per row and in two cases, single occurrences are adjacent to doubles. The table shows the code graphically similar to a chemical table. However the relationships of the various amino acids are not clearly shown. An amino acid

74

that differs from another by a single nucleutide (such as the codons numbered 922) should somehow be shown next to each other.

MESD Figure 1: The Codon Torus: the KH-table can be wrapped on a torus as shown However the codon code exhibits other striking regularities. If we map out this binary code into a KH-map (which is a K-map of size greater than 4 x 4) [4] in a particular shape as shown in Table 2, we see a pattern in which every row is taken up by either a single codon, two codons or three codons. The coding scheme for this KH-map is created by ordering the binary codes in a specific way, called the Gray code, or the reflected code [2,3]. The binary codes are ordered in such a way that each cell differs from any of its neighbors by one bit. A neighbor is defined as a cell either vertically or horizontally adjacent. This map then is a very specific ordering of the codons. Because of its construction, the cells along the top are also neighbors of the cells along the bottom, and the cells along the left edge are neighbors of the cells along the right edge. Therefore, this planar map can be mapped onto a torus (as in Figure 1) in such a way that the cells are physically adjacent to the cells with which they are neighbors [2,3]. The number of nucleotides at each row can be expressed as 1211212322131121 (see Fig.2.) Other combinations could have also occured but there might be a reason why this particular pattern has evolved. We can shift the acids around on the table by reassigning the binary codes to G,A,C, and U, however it is best to think of the table as being wrapped on the surface of the torus shown in Figure 1. Then the row or column shifts will merely result in the rotation of the torus in different ways, one of them being rotating it so that the surface facing the inside circle is on the outside. The distance of the nucleotides to each other can then be read off the torus. From the table we can see that the 4 bit codes are the most robust in the sense that the last two bits can be anything and still represent the same nucleotide. The 6 bit codes are the most volatile in the sense that any change in any of the bits will result in turning the nucleotide into another one. Therefore the 'distance metric' is row dependent, along the length of the torus for all of them, and it is row and column dependent for the others.

75 G 00 GG 00Q0( GA 0001 (

A C u 01 11 10 GLY ) GLU " ) ( ASP )

GC 0011 (

ALA

)

GU OQio(

VAL

)

AU 0110(^(|LE)(JLEr) AC 0111 ( AA 0101 (

THR LYS ~ X

) ASN

)

AG 0100 ( ARG2^)( SER2 ) UG 1000(TRP)^TOP^( CYS )

UA 1001 (JTOP1_)(^JYR_J UC 1011 (

SER1 PHE

)

UU 1010 ( LEU2 X CU 1110 ( LEU1

) )

CC 1111 (

)

PRO

CA 1101 ( GLN ~X CG 1100 (

HIS )

ARG1

)

Figure 2: Another View of the Codons 4

Conclusion

In addition to the codon code being a combination of a fixed length code and a variable length code [4,5], the code also displays regularities in the form of Graycoding and thus can be displayed as a KH-map/table [2], and wrapped on a torus [3]. Thus the torus is the natural space for this code, and a metric can be constructed on this space.

76 References 1. Lewin, B, Genes V, (Oxford University Press, Oxford, 1995) 2. Hubey, H.M. A Complete Unified Method for Taming the Curse of Dimensionality in Datamining and Allowing Logical-ANDs in ANNs, submitted to Journal of Knowledge Discovery and Datamining, June 2001. 3. Hubey, H.M. Mathematical and Computational Linguistics, (Moscow, Mir Domu Tvoemu, 1994) 4. Naranan, S. and V.K. Balasubrahmanyan, Information Theory and Algorithmic Complexity: Application to Linguistics Discourses and DNA sequences as Complex Systems, Part I: Efficiency of the Genetic Code of DNA, Journal of Quantitative Linguistics, Vol 7 (2000), No 2, August, pp. 129-152. 5 Naranan, S. and V.K. Balasubrahmanyan, Information Theory and Algorithmic Complexity: Application to Linguistics Discourses and DNA sequences as Complex Systems, Part II: Complexity of DNA Sequences, Analogy with Linguistic Discourses, Journal of Quantitative Linguistics, Vol 7 (2000), No 2, August, pp. 153-183.

77

APPENDIX I Table 1: The Genetic Code #

Codo n Set

Legend Name

Abbrev

Binary Code

Threonine Proline Alanine Glycine Valine Arginine Leucine Serine

THR PRO ALA GLY VAL ARG1 LEU1 SER1

Ulllxx llllxx 001lxx OOOOxx 001lxx HOOxx lOllxx lOllxx

AA# CA# GA# UG# UU# AG# AU# UA#

Asparagine Histidine Aspartic Acid Cysteine Phenylalanine Serine Isoleucine Tyrosine

ASN HIS ASP CYS PHE SER2 ILE1 TYR

OlOllx HOllx 0001lx lOOOlx 10101X OlOOlx OllOlx lOOllx

17 18 19 20 21 22

AA? CA? GA? AG? UU? UA?

Lysine Glutamine Glutamic Acid Arginine Leucine Stop

LYS GLN GLU ARG2 LEU2 STOP1

OlOlOx HOlOx 0001Ox OlOOOx 10100X lOOlOx

23 24 25 26

AUG UGG AUA UGA

Methionine Tryptophan Isoleucine Stop

MET TRP ILE2 STOP2

011000 100000 011001 100001

1 2 3 4 5 6 7 8

AC* CC* GC* GG* GU* CG*

9 10 11 12 13 14 15 16

cu* uc*

A=01,G=00, C=ll, U=10 *=A,QC,or U #=C, or U (Pyrimidine) x= 0 or 1 ?=A or G (Purine)

Human Computer Interface

81 A CASE STUDY OF A DISABLED MAN USING A MOUTHSTICK AND ACCESSIBILITY OPTIONS IN WINDOWS TO DO E-COMMERCE BRUCE DAVIS B.A. (Former Chair of Advisory council on Disabilities in Morris County, N.J.) 9 Ridgedale Avenue, Cheshire Home, Florham Park,, New Jersey 07932, USA EAMON DOHERTY M.S., GARY STEPHENSON B.A. University of Sunderland, School of Computing, Info. Systems, and Engineering P.O. Box 299, Sunderland, SR6 OYN, England E-mail: [email protected], [email protected] JOANN RIZZO M.A. (Morris County Family Services, N.J.) 62 Elm Street, Morristown,

New Jersey 07960, USA

A quadriplegic man found that he was able to perform e-commerce using a mouthstick and the accessibility options in Windows 95/98/2000. The quadriplegic man took approximately 33% longer to perform an e-commerce task than a man with full use of his limbs who performed the same task with a mouse. The speed at which the motor impaired man's purchase was carried out was of less interest to the researcher than the fact that the man could now participate in the new economy from his home and independently purchase products he wanted. The disabled man, Bruce Davis, said it was his professional opinion that many quadriplegic institutionalized persons have transportation issues and cannot make timely purchases in person. Bruce Davis stated further that this population could become more empowered and independent by performing e-commerce with a mouthstick and the Windows accessibility options that are available in institutions. Bruce stated that the goal of this paper is to inform institutionalized persons, their caregivers, and family that there exists a feasible means of allowing some severely motor impaired persons a means to participating in the new economy by purchasing or selling items online.

1

Introduction

Bruce Davis has served many years as an advisor on disability issues for human services in Morris County, New Jersey. He has met numerous disabled people at meetings and found that almost all of the disabled people depended on others for transportation or making purchases for them. Another problem Bruce and many of the motor impaired people he met had was that their wheelchairs were not suitable for use in snow, ice, rain, or extremely cold weather. This meant that people could only independently buy

82

products during good weather if their health permitted outside travel or depend on others to make purchases for them. Bruce and many other institutionalized persons have obtained computers, Internet Service, and accessibility packages with Windows 95/98/2000. This is mostly due to a general decline in prices of computer hardware, software, and Internet Service Providers. Pu and Faltings [1] say that people with a PC, software, and Internet service have found that people can purchase competitively priced items and find the experience satisfying. E-commerce is the process that allows people select an online product and have it delivered in a timely manner to their place of residence. Bruce Davis said such a process was in his professional opinion very useful to people who need to purchase items but were unable to travel to a traditional store. 2

A Case Study of E-Commerce with a Severely Motor Impaired Person

There were numerous studies concerning e-commerce and quality of service [2] and issues concerning usability with people who had no disabilities. However, no literature was located on the e-commerce experience concerning a severely motor impaired (quadriplegic) person with some head movement. It was then decided that an investigation would be performed to investigate the length of time it would take to perform a purchase in an Internet environment by a severely motor impaired person and compare it against a person with a normal limb and hand. 3

The Study

The case study started and ended on November 10,2000. A personal computer was equipped with Windows 98 and a mouse. The accessibility functions were also enabled so that the numeric keypad could be used to navigate the cursor and perform a "clicking" function. A "sticky keys" function in the accessibility function was also enabled, so that chords - two keys pressed together - such as ALT-R could be performed by the motor impaired user. A Netscape browser was installed. Erols was the local Internet service provider. Bruce Davis is considered by his physical therapist to be a skilled user of a mouthstick and to be a good representative of a severely motor impaired quadriplegic user. Professor Doherty is considered by many in the academic community to be a skilled mouse user and a good representative of a person with functioning limbs.

83

Bruce Davis was fitted with his mouthstick and given the command to start. He used the mouse emulator and sticky keys in conjunction with the mouthstick to start a dialer, the browser, accept cookies, find the www.ebav.com auction site, find the third item in a list of US coins and place a bid. Professor Doherty performed the exact same steps using his hand to navigate a mouse. The results are available in Table 1 below. It can be seen that Bruce Davis performed the task in seven minutes and thirty-two seconds while Professor Doherty took four minutes and five seconds, which is 33% faster. Bruce Davis Dialed in to Erols on first try Started Netscape Typed in URL of Auction Answered three cookies Navigated 3 screens Typed in User Name and Password 7 Minutes, 32 seconds

Prof. Doherty .i.. it tt tt ti

«» it ti

Used Same User Password 4 Minutes, 5 seconds 33% faster

Name

and

3.1 - Table 1 4

Discussion

Bruce Davis was asked why it took 33% more time than Professor Doherty when no errors were made by either participant. Bruce pointed out that the mouse has been considered an optimum pointing device since first researched by Card, English and Burr in 1978 [3]. The mouse emulator is not as fast as the mouse nor is it intended to be. Its function is to be enabling and not optimal. Bruce said he was not aware that he could perform e-commerce before the experiment and felt intimidated by the whole process. Bruce said that it was easy to purchase items from the Internet and that private delivery companies such as FEDEX, UPS, and DHL EXPRESS offered him overnight express for high priority purchases. Bruce stated that three minutes either faster or slower to perform a purchase was not significant to people in an institutionalized setting who do not feel the same pressures as persons who are in the non-institutionalized world and involved in full time employment. Bruce also said that many institutions including his offer computer labs with a range of accessible input devices that allow severely motor impaired persons access to both the computer and the Internet. It was his opinion that

84

this method of commerce was feasible to cognitive persons with some head movement or ability to use a switch and scanning software. He stated further that e-commerce was convenient for those with severe transportation issues who cannot leave the institution to buy or sell items. 5

The Investigation of the Mouse Emulator

Professor Doherty performed an investigation with the mouse emulator and mouse that comes standard a PC and with Windows 95/98/2000. The purpose was to see the difference in performances when using a mouse and mouse emulator to choose icons and buttons when performing e-commerce. On November 10, 2000 an experiment was undertaken to understand the difference in time to acquire icons and buttons with both the mouse and the mouse emulator. Professor Doherty used a program called Clicktest [4] that presented eight targets, the size of a standard icon, in a serial fashion for acquisition. The targets appeared on forty-five degree azimuths seventy-five percent of the distance between the origin of the screen and the outer edge of the screen (Fig. 1). The results were that it took Prof. Doherty an average of 5.77 seconds to acquire each target with the mouse emulator while it took only .55 seconds to acquire with the mouse. Prof. Doherty then presented the results to Bruce who agreed that the mouse appeared over ten times faster to perform selections and clicks. He reiterated that the mouse emulator was about enabling a person, not about performing tasks in optimal times when compared to non-impaired persons. Bruce said his only other option to performing ecommerce was to use EZ-Keys [5], a program that allowed the scanning of the screen using a switch to indicate direction changes and clicks. Such a program was effective but required more vigilance to the task at hand and also required precise timing to push the switch. There was also the risk that performing the same action could lead to a repetitive stress injury. 6

Overall Conclusion

We have presented results here with one person performing one task. The results here can only be used as a suggestion of how e-commerce may be useful for a spinal chord injured intelligent man and may not apply to other cases of disability. The results and opinions in this paper should be encourage others to perform an in depth e-commerce study. Our limited results suggest that the mouse emulator and EZ Keys appear to be the only methods of computer control that allow severely motor impaired quadriplegic

85

persons like Bruce Davis an access method to performing e-commerce. Bruce's opinion is that EZ keys may provide an opportunity for a repetitive stress injury due to the repetition of hitting a switch with a mouth pointer. It also requires a high level of vigilance to the task at hand as well as precise timing with a switch to point and click. The mouse emulator on the other hand does not require precise timing. A motor impaired operator can touch a key, move the cursor, and rest if he or she is tired. Then he or she can resume the task. It does not require the same level of vigilance as EZ keys. The numeric keypad has over sixteen keys to provide navigation and selection tasks with the cursor. There is not the same level of repetition and from an ergonomic standpoint there is most likely less risk of a repetitive stress injury. The mouse emulator seems the most effective method for motor impaired persons such as Bruce to perform pointing and clicking to do e-commerce. The mouse may be an optimal device and ten times faster but is not an option to a severely motor impaired person to use in e-commerce. It was mentioned now many times that institutionalized people do not have the same time constraints as those living in a non-institutionalized world. The mouse emulator provides pointing and clicking ability for the motor impaired person in at least one representative's of that population's idea of an acceptable time. The number of keys needed for navigation and selection appears to make the task more interesting as well as reduce the risk of injury. The ability to rest the mouthstick and not fail the e-commerce task at hand seems a great asset. Lastly e-commerce offers the transportationally challenged person a viable means to participate in the new e-economy and purchase as well as obtain goods in a timely fashion.

Figure 1.0 - Clicktest and Target Simulating an Icon

86 References 1. 2.

3. 4.

5.

Pu, P., Faltings, B., "Enriching Buyers' Experiences: the SmartClient Approach" CHI 2000 (ACM), 1-6 April 2000, Amsterdam, Netherlands, p. 289-96 Bouch, A., Kuchinsky, A., Bhatti, N., "Quality is in the Eye of the Beholder: Meeting Users' Requirements for Internet Quality of Service, CHI 2000 (ACM), 1-6 April 2000, Amsterdam, Netherlands, p. 297-304 Douglas S., Mithal A., "The Ergonomics of Computer Pointing Devices", Springer - Verlag, Berlin Heidelberg New York, 1997, Page 65. Doherty, E., Bloor, C , Cockton, G.( Engel, W., Benigno, D., "Yes/No - A Mind Operated Device for Severely Motor Impaired Persons" (2000), Computers Helping People with Special Needs, ICCHP 2000, Proceedings of the 7th International Conference on computers Helping People with Special Needs, July 17-21, 2000 Karlsruhe, Germany Lawrence, P., "Ez Keys for Windows Uniquely Satisfies aac and other Assistive Technology Needs", Presented at CSUN 1998 Conference in Los Angeles, March 1998, available upon request from Phillip-Lawrence, Words+Inc.40015,SierraHighway, Building,B-145, Palmdale, CA 93550

87 USABILITY ISSUES CONCERNING THE CYBERLINK M E N T A L INTERFACE A N D PERSONS WITH A DISABILITY

EAMON DOHERTY M.S., CHRIS BLOOR PHD, GILBERT COCKTON PHD University of Sunderland, School of Computing, Info. Systems, and Engineering P.O. Box 299, Sunderland, SR6 OYN, England E-mail: [email protected] and {firstname.lastnam}sunderland.ac.uk JOANN RIZZO M.A., R.P.S. 62 Elm Street, Morristown, New Jersey 07960, USA D E N N I S B E N I G N O B.A. (Trustee, Neurological Institute of N.J.) 270 Hazel Avenue, Clifton, New Jersey, 07011, USA E-mail: [email protected] BRUCE DAVIS B.A. 9 Ridgedale Avenue.Cheshire Home, Florham Park,, New Jersey 07932, USA A mental interface device called the "Cyberlink" was tested with a group of persons with traumatic brain injury or cerebral palsy. All were able to play games but only some were able to perform rudimentary communication. The persons with cerebral palsy who also had a high level of spasticity experienced difficulty operating the Cyberlink because the headband containing the electrodes was often loosened or fell off during episodes of involuntary movements. Staff and parents were reluctant to leave such persons unsupervised with the device because such users would experience frustration if the headband became unsecured and they were unable to operate the device. A study was conducted to investigate some available methods to secure the headband and thus improve the ability of disabled persons to independently perform recreational and communicative tasks. Another Cyberlink usability issue was also examined. The institutional staff and parents reported that the configuration of settings of the Cyberlink were difficult to optimize. The parents and staff said further that settings often became invalid during the session as a person's physiological signals changed. The purpose of the paper is to report some of the shortcomings of the Cyberlink and postulate solutions that may be implemented and thus increase the usability for some disabled persons.

1

Introduction

The Cyberlink is an assistive technology device that uses signals collected from the forehead to operate a cursor. The signals are collected by the use of electrodes that are

88 placed upon the forehead and held in place with a headband. Signals from eye movements (EOG), muscle movements (EMG), and brain waves (EEG) are processed and drive the cursor around the screen [1]. The United States Congress has defined an assistive technology as any device that assists in allowing persons to perform activities of daily living such as communicating or operating appliances [2]. The definition was further expanded to devices that allow persons to recreate [3]. Research with a variety of brain-injured persons using the Cyberlink revealed that the device was some people's only means to communicate and recreate even if on an inconsistent basis due to medications, sustained injuries, or problems with either the Cyberlink software and/or hardware [4]. The device was also useful to persons with mental retardation and cerebral palsy for navigating a cursor through an onscreen maze and playing various video games [5].

2

Previous Studies

The results of previous studies using forty-four participants with the Cyberlink and interviewing approximately two hundred institutional staff, participants, guardians, and parents revealed that a "Yes / No" program (See Figure 1.) was needed to allow braininjured persons with total locked-in syndrome to communicate [6]. The program allowed some persons with total locked-in syndrome an opportunity to communicate some of the time.

1 NO

Yes

J

Figure 1. "Yes/No" Screen

89 3

A Parent's View

Dennis Benigno, a parent of a Cyberlink user with total locked-in syndrome, was happy to witness his son navigate the cursor four times to select yes and no responses. The son was instructed what four answers to select on June 12, 2000. The joy engendered by one deliberate yes or no response from his son is difficult for persons not in such a situation to comprehend [6]. Dennis wants to use the program independently with his son but finds the configuration of the Cyberlink confusing and time consuming. He says a settings agent is needed to simplify setting configuration. It was also established that settings need to change within a session because of physiological changes in the user [6]. Dennis feels a settings agent should regularly investigate the appropriateness of settings throughout a Cyberlink usage session. The agent should then make appropriate changes when necessary. Dennis would also like to see the Cyberlink rely more on brain power (EEG) and less on physically generated signals from eye movements because it might be easier for his son. On August 1,2000 it took 270 seconds for Dennis' son to answer a definite yes. The son had answered six out of seven yes and no answers correctly on June 26,2000 and averaged less than a minute on each response. The son's ability to control the device varies greatly due to technology limitations as well as his injuries. This frustrates the parents who would like to communicate regularly with their son. 4

A User's Viewpoint

Bruce Davis, a Cyberlink research participant, also agrees with Dennis that settings become inappropriate within a session and that a settings agent is required. Bruce also feels that a better method of fastening the electrodes to the head is needed because many disabled persons have varying degrees of involuntary movements that affect the placement of the Cyberlink headband. Such shifting of the headband has often made the Cyberlink unusable. Staff in group homes are often busy and cannot always attend to an assistive technology device as more serious care issues take precedence. Bruce says his head size is twenty-four inches and the headband should have more velcro to accommodate people with larger heads. Bruce knows of two other Cyberlink users with sixteen inch heads and lots of long hair who find the headband does not stay securely on their head. Bruce said a helmet with the electrodes inside would be a better solution.

90 5

Cyberlink Usability Issues

The above comments indicate the need for a 'settings agent', and that there may be a problem relating to the use of the headband. Dr. Junker, inventor of the Cyberlink, did attempt to address the former problem but it appeared to be very complex and was not solved before funding for the project ran out [1]. The headband moving did appear to be a problem at times with approximately twenty of the forty-four persons involved with the research. It was decided that three options could be pursued to secure electrodes to the forehead: • suction cup electrodes could be placed upon the forehead • the Cyberlink headband could be embedded within a bicycle helmet • The headband and electrodes could be fastened to the head with tape Conversations with health care professionals revealed that suction cups could leave unsightly marks and damage the self-esteem of the participants. Discomfort from the suction cups could also cause users to focus on the suction and not the task at hand. Health care professionals also indicated that tape could remove facial hair, cause pain, and irritate the skin of the participant. Neither participants, health care professionals, or guardians raised any objections to the option of having the Cyberlink headband embedded within a bicycle helmet. Consent was obtained and the helmet was tested with a group of disabled users with various impairments. 6

The Helmet Study

An investigation was undertaken into means of making the headband easier to fasten on the Cyberlink user's head and keep it secure. A standard cycle safety helmet was adapted by placing the three electrodes normally used with the Cyberlink on the internal surface of the helmet. The study started March 1,1999 and ended on March 4, 1999. A one-time twenty-minute session was performed with each participant. Participants were given instructions to place the helmet on their head if possible or let the helmet with embedded electrodes be fastened on their head. Each participant was to independently select and play as many games as possible from a hands-free menu. The participants' spastic movements and its effect on the headband were noted. An alpha wave in the range 8-10 Hz. was used for vertical control of the cursor while EOG signals in the 1-2 Hz. range was used to power horizontal movements of the cursor. A sign language interpreter was available for deaf participants. The participants included fifteen males and females with a variety of neurological and mental impairments including a high level of spasticity. Adequate electrical

91

connections were established for all participants. To achieve this, the quantity of foam block packing had to be adjusted as the head sizes of the participants varied from sixteen to twenty-four inches. It was necessary when fitting the helmet to pay careful attention so as not to disturb hearing aids and glasses. These measures considerably increased the time required to fit the helmet, although only the second point would need to be considered for an individually tailored system. Some participants who were prone to falling said the helmet provided a feeling of safety in case he or she fell. Participants all scored well on the Cyber-Pong but reported increased difficulty in performing a click in the hands-free menu. The helmet did keep the headband secure but at a cost of increasing the difficulty of clicking for a population with a history of problems using the Cyberlink to click. The participants reported at the end of the study that they liked the helmet and agreed it made them look like a racing car driver or bicycle enthusiast. However other institutional residents said that the users looked autistic wearing the helmet and therefore the users would not wear the helmet again. 6.1

Helmet Study Conclusion

It was concluded that the helmet further impeded clicking and also decreased the selfesteem of some of the participants. The helmet also did not save time because extra time had to be taken not to disturb hearing aids or eyeglasses. The helmet did provide a method of keeping the electrodes on the forehead and provided some with an initial feeling of security if he or she was prone to falling. Overall the helmet provided more negatives than positives and was discarded as a hardware improvement. 7

Overall Conclusion

The Cyberlink is a useful assistive technology for disabled persons but is difficult to configure and maintain relevant settings throughout a session. A settings agent is needed to quickly find usable settings and update them as the user's physiological signals change. A better method of fastening the electrodes to the forehead is needed for spastic individuals. Such changes could make the Cyberlink more usable and thus be more readily accepted by both institutions and private residences as an assistive technology.

Figure 2. Bruce Davis Operating a Laptop and Cyberlink and a Telephone I

References

1. Berg, C, Junker, A., Rothman, A., Leininger, R. (1998) The Cyberlink Interface: Development of A Hands-Free Continuous/Discrete Multi-Channel Computer Interface Small Business Innovation Research Program ( SBIR ) Phase II Final . Report Published by Brain Actuated Technologies, Incorporated- 139 East Davis Street, Yellow Springs, Ohio 45387., United States of America 2. Public Law 100-407" [S. 2561]; August 19, (1988a) Technology-Related Assistance For Individuals With Disabilities Act of 198.8,. Section 29 USC 2202, United States Congress 3. Public Law 100-407 [S. 2561]; August 19, (1988b) Technology-Related Assistance • For Individuals With Disabilities Act of 1988, Section 2, Findings and Purposes, USC 2202, United States Congress 4. Doherty, E, Bloor, G, Cockton,( 1999b) The "Cyberlink" Brain Body Interface as an Assistive Technology for Traumatically Brain Injured Persons: Longitudinal Results from a Group' of-Case Studies, CyberPsychology and Behavior, Liebert • Publications, Larchmont, New York,'USA, pages 249-259 • 5.- Doherty, E, Bloor, C, Cockton, - G., Engel, W., Rizzo, J., Berg, G,'(1999a) ' -Cyberlink - An Interface for Quadriplegic and Non --'Verbal People, Pages 237' 249, Conference Proceedings, GT99'3rd International Cognitive Technology,-: Aug 11-14,1999 •" in' - San Francisco, -Proceedings" Available at http://www.cogtech.org/CT99, Published by M.I.N.D. Lab, Michigan State University 6. Doherty, E, Bloor, C. and Cockton "YES/NO" A Mind Operated Application for Severely Motor Impaired Persons, Proc. ICCHP 2000, Karlsruhe, Germany, July 16-21

93 ROBOTICS FOR THE BRAIN INJURED: AN INTERFACE FOR THE BRAIN INJURED PERSON TO OPERATE A ROBOTIC ARM PAUL GNANAYUTHAM, CHRIS BLOOR AND GILBERT COCKTON University of Sunderland, School of Computing, Engineering and Technology, The Informatics Centre, St. Peter's Campus, St Peter's Way, Sunderland. SR6 ODD, UK E-mail:[email protected], [email protected], [email protected] This paper discusses a pioneering area of research that is being carried out by Sunderland University that allows brain injured persons to do simple tasks using robotic arms. Although robotics have been used for helping disabled persons in various areas of disability, very little research has been done with the brain injured persons and robotics. This paper discusses the implementation of a simple model, which consists of brain body interface, a computer, an interface program, an electronic circuit to interface the computer to the robotic arm and a robotic arm. We hope to improve the lives of brain injured people once the pilot studies are completed

1

Introduction

This an extract from the statement presented to the 56th Session of the UN Commission on Human Rights in Geneva, in early April 2000, by Bengt Lindqvist. "Throughout the centuries we have designed and constructed our societies, as if persons with disabilities did not exist, as if all human beings can see, hear, walk about, understand and react quickly and adequately to signals from the world around them. This illusion, this misconception about human nature, this inability to take the needs of all citizens into account in the development of society, is the main reason for the isolation and exclusion of persons with disabilities, which we can observe in different forms and to different degrees all over the world. It will take a long time to change this pattern of behaviour, which is deeply rooted in prejudice, fear, shame and lack of understanding of what it really means to live with a disability". "World estimates show that there are more than 500 million people who are disabled as a consequence of mental, physical or sensory impairment. This makes people with disabilities one of the world's largest minorities" [1]. The statement by the Commissioner on Human Rights shows that it is a right for all human beings to live without any prejudice and the second statement by Dr. Agarwal shows that the disable community is one of the world's largest minorities, which certainly need to be addressed. Many researchers and careers keep contributing to the area of disability to lessen the prejudice as we reach the twenty first century. Computer technology, Artificial Intelligence and the Human Computer Interaction also contribute to the goal set by the United Nations to make user interfaces that can be used by any user including users with special needs.

94 University of Sunderland has been carrying out extensive research in brain body interfaces for brain injured and has created human machine systems, which gives hands free access to the computers. This facilitates simple communications between the brain injured and the outside word, which was not possible until few years ago. In this paper, we take the brain body interface communications a step further where the brain injured persons will not only communicate but will also be able to do simple tasks such as lifting a small item and having a closer look. Remember this is the only the beginning of research in the area of robotics for the brain injured. 2

Robotics for the brain injured

In this section we look at the communications devices used by the brain damaged users and the new device for carrying out simple tasks using a Robotic arm. 2.1 Brain Body Interfaces Not all users with special needs can use a mouse, trackball, and keyboard or have the ability to speak to a speech recognition system. So we need a device that provides communication capabilities for those who cannot use any of the regular input devices. There are many brain body interfaces; e.g. • HeadMouse™ - (using wireless optical sensor that transforms head movement into cursor movement on the screen [10]. • Tonguepoint™ - a system mounted on mouth piece [9]. • Cyberlink™ - a brain body actuated control technology that combines eyemovement, facial muscle and brain wave bio-potentials detected at the users forehead [8]. All the devices above have their advantages and disadvantages. A user with cerebral palsy will not have good motor abilities to operate the 'Tonguepoint™'. A user with spinal vertebrate fusion may not be able to turn his or head and the HeadMouse™ will be of no use to this user. At present only the cyberlink™ seems to be applicable to the brain injured because it uses a combination of signals. 2.11 Cyberlink™ 'Cyberlink™' can be used as a control technology that combines eye movement, eye blink, facial muscle and brain wave bio-potentials detected at the user's forehead to generate a mouse input that can be used for communicating. Cyberlink™ uses the forehead as noninvasive site, for convenience and also because it has a rich variety

95 of bio-potentials. The signals for communications are obtained by attaching probes on the forehead of the patients. Basically it is 3 silver/silver chloride contact electrodes (i.e. non-invasive), which are placed on a headband that picks up EEG (brain wave), EMG (muscle movement wave) and EOG (Eye ball movement) signals when applied on the forehead. These are then fed into an amplifier box and then to the mouse port, so the computer just sees the device as a mouse, which, is used to control the cursor. The main signals used are due to muscle movement, only about 10% is due to thought processes (Brain wave). We used the cyberlink to communicate with the brain injured persons to get basic yes/no answers this time we want to go a step further and make the brain injured user perform simple tasks using a robotic arm. University of Sunderland carried out extensive research in the area of brain body interface devices for communication instead of the regular devices for the brain injured persons. For many years brain injured patients were written off as vegetative patients but now there are some groups of brain injured who are able to communicate using the brain body interface devices [5,6,7]. There is still research being done in this area. 2.2 Modelfor operating the robotic arm using the brain body interface The Model consists of following components: 1. A cyberlink™ brain body actuated control technology system that connects to the computer via the serial port 2. A computer with a parallel port and serial port free. An Interface program written in Visual Basic™ to operate the functions of the robotic arm 3. An Electronic circuit to read the parallel port of the computer and operate the motors that manipulate the robotic arm [2] 4. A robotic arm (Super Armatron™) that is operated using a series of motors [2]

c Y B E R L I N K

E L E C T R O N I C

COMPUTER

& INTERFACE PROGRAM

Fig.l

c I R C

u I T

R O B O T I C

A R M

96

The diagram in figure 1 shows the model for operating the robotic arm using the brain body interface. The computer needed one serial port for the cyberlink™ and a parallel port for the electronic circuit that interfaced the computer with the robotic arm. The cyberlink probes were attached to the forehead of the user and the other end of the cyberlink was connected to the serial port of the computer. The computer had a program written in Microsoft Visual Basic 6.0™, which had six paths for controlling the robotic arm. The paths ended up in one of these functions, arm go up, arm down, arm left, arm right, open claw and close claw. When one of these six functions were triggered, the program sent a binary code to the parallel port, which drove one of the motors to carry out what was requested by the user. The Electronic circuit used in the above setup is shown below.

Decoder

Switches

Motors

Joints

Motor Power Supply

Fig.2 In the above diagram we see the block diagram of the electronic circuit that was used. The output from the parallel port was decoded and used for switching transistors. The transistors switched the motors on and off in either direction [2]. The mechanical side of the circuit included aligning shafts and making sure there were no vibrations.

Discussion The meaning of the word "Robot" is slave worker (from the 1923 play Rossum's Universal Robots). The early uses of robots were mechanical devices using gears and levers. The advent of computers and the fast and furious new technology has given the robots the capability to perform sophisticated tasks others than mundane routine jobs. Robots are in action in the Military, Health sector, Manufacturing, Space exploration, Mining etc.

97

This paper tackled the area of giving this sophisticated and powerful robot as a tool for the brain injured who perhaps need it more than any other category of people. The research carried out at the University of Sunderland gave the brain injured people the opportunity to communicate using brain body interface devices [5,6,7], this new area of robotics for the disable mainly the brain injured is going to open a vast area of research which will end up in very useful applications for all the people regardless they have special needs or not. Robots are being used in places such as Japan for caring for the physically handicapped people. These robots do the daily routine chaos thus taking the burden away from the careers and also saving a lot of money. The operators control these mobile robots through the Internet and mobile phones [12]. The new trend in robotics is to control robots remote using the Internet or mobile phone. Robots have been in science fiction for many years and but now there is some exciting new research going on at the moment which is going to change many a brain injured person's life. Blinking or moving forehead muscles are quite tedious process for a brain damaged person. One new approach taps into electrical noise generated by the brain. A probe on the scalp is used to measure tiny amounts of current as the nerve cells fire. These biofeedback signals (EEG) can be used to communicate. The users can control these biofeedback signals and create regular patterns in order to use it as a cursor in a computer or operate other devices [3]. In another approach, probes are planted directly into person's brain to detect neurons in the area that once carried out a physical function for example controlled an arm. The area of research is very useful to locked-in patients. The electrodes planted contains proteins which encourage nerve cells to grow near the electrode [11] The future of this research area covered in this article will only be successful if the government, commercial organisation, research personnel and care-givers work together to create robotic hands and feet for this group of handicapped people. There is whole world of applications to be created for the brain injured people.

4

Acknowledgements

We thank Professor Doherty for all the work he is doing in the area of brain injured people and all his encouragement in this new area of research "Robotics for the brain injured".

References 1. 2. 3. 4. 5.

6.

7.

8. 9. 10. 11. 12.

Agarwal V. Disability India Journal, January (2001) Banas J. Computer Controllled Robot Arm, Radio Electronics, pp50-53, (1995) Bradbury D. The cyber century, Computer Weekly 26 April 2001 (2001) Christensen D. Mind over Matter, Science News, Vol. 156, Issue 9, pl42, (1999) Doherty E. Bloor C. Cockton, G. Engel W. Rizz J. and Berg C, (1999) Cyberlink - An Interface for Quadriplegic and Non - Verbal People, Pages 237-249, Conference Proceedings, CT99 3rd International Cognitive Technology, (1999) in San Francisco, Doherty E. Bloor C. and Cockton The "Cyberlink" Brain Body Interface as an Assistive Technology for Traumatically Brain Injured Persons: Longitudinal Results from a Group of Case Studies, CyberPsychology and Behavior, Liebert Publications, Larchmont, New York, USA, pp 249-259 (1999) Doherty E. Bloor C. Cockton, G. Engel W. and Benigno D., "Yes/No - A Mind Operated device for Severely Motor Impaired Persons" (2000), Computers Helping People with Special Needs, ICCHP 2000, Proceedings of the 7th International Conference on computers Helping People with Special Needs, (2000) Junker A., United States Patent, 5,692,517, (1997) Origin Instruments. HeadMouse: Head-Controlled Pointing for Computer Access, http//www.orin.com/access/headmouse/index.tm (1998) Salem C. and Zhai S., An Isometric tongue pointing device, Proc. Of CHI 97 (1997) pp 22-27 Siuru B. A Brain Computer Interface, Electronics Now, Mar99, Vol. 70, Issue 3,pp55,(1999) Yoshiyuki T. Takashi K. Tateki U. Masao M. and Hiroyuki K., development of the Mobile Robot System to aid the daily life for Physically Handicapped, Proceedings of the 7th International Conference on Computers Helping People with Special Needs, (2000)

99 A COMPUTER INTERFACE FOR THE "BLIND" USING DYNAMIC PATTERNS DAVID VEAL and STANISLAW PAUL MAJ School of Computing, Edith Cowan University, Perth, Western Australia, Australia E-mail: [email protected], [email protected] It is increasingly important that the blind can make effective use of computers (Internet, word-processing, etc). Methods such as text to speech or text to Braille may help, but involve extra cost. Text to speech may not be effective if the student also has hearing difficulties. Likewise should the student have problems with touch sensation text to Braille conversion may present problems. Many people classified as blind have some residual vision. If they can simultaneously discern a small number of different colored areas on a computer screen then this may suffice for representation of textual symbols. Each pattern can represent a character. A word, sentence or book can be regarded as a dynamic sequence of such patterns. A program, Dynamic Pattern System (DPS), was developed as a prototype. It runs on a standard PC with no extra equipment. Text can be entered and displayed as both patterns and textual characters, which can be saved and redisplayed later. Sets of patterns can be adapted to an individual's visual capacity and stored and modified for future use.

1

Introduction

Most people classified as blind possess some residual vision. A survey by the Royal Blind Society of New South Wales stated that: "The general community assumes that 'blind' people have no vision at all. This assumption is false. 92% of respondents to the survey have some useful residual vision, even though their functioning is affected by vision loss". They go on to note that: "The definition of blindness used by medical professionals, support services and government agencies (most especially the Department of Social Security) is a visual acuity of 6/60 or less or field restriction of 10%, i.e. the person sees at 6 metres what a person with 'normal' vision would see at 60 metres, or can only see up to 10% of the normal visual field" [11]. If any residual vision is sufficient to enable people with vision impairment to read highly magnified text then enlarged print or text magnification via a video camera attached to a computer may be used. If these methods are inappropriate then text to speech [6], or text to Braille conversion [16] may help. Documents can be scanned into the computer and converted to text and then into Braille or speech. Speech to text conversion and Braille printers may also prove useful. However, Text to speech may not be effective if the student also has hearing difficulties and a UK report states that: "Disabilities other than vision is higher among visually impaired people than among the general population" [13]. Of those taking part in the Royal

100 Blind Society of New South Wales survey 30% noted that they also experienced a hearing loss. Furthermore, 2% of persons from this same survey reported a major hearing loss [11]. The authors propose a system Dynamic Pattern System (DPS) whereby each alphanumeric character can be represented by a pattern. If users can simultaneously discern a small number of different colored areas, or elements, on a computer screen then this may suffice for representation of textual symbols. 2

Dynamic Pattern System (DPS)

Each pattern can represent a character. A word, sentence or a book can be regarded as a dynamic sequence of such patterns. A computer program, DPS, was developed as a prototype. The patterns are displayed serially and in sequence. DPS runs on a standard PC with no extra equipment. Text can be entered and displayed as both patterns and textual characters, which can be saved and redisplayed later. The character representing the pattern being displayed can also be shown. The text box, on screen controls, and the individual character display can all be hidden if required so that a user can only see the patterns being displayed and the top menu bar. This feature is to avoid cluttering the screen with unnecessary detail during experiments. Each pattern can have up to 5 elements. Using only 4 elements and 4 colors enables 4 4 or 256 distinct patterns to be represented. However, it is important to incorporate some redundancy so that the actual patterns used are separated by more than only one distinct difference to avoid confusion between patterns of similar appearance. The size, shape and separation of the elements of patterns, background color, and the speed of delivery of patterns can also be adjusted. The time period between words can also be varied as a multiple of the time period between letters to enable this distinction to further define word boundaries. Furthermore, the ASCII space character is represented by its own pattern, which also helps this process. The DPS program allows the resulting sets of patterns that are adapted to an individual's visual capacity, to be both stored and modified for future use. This stored pattern set includes information on the speed at which the patterns are to be displayed when the user is reading, pattern element size, background color, and element shape. A particular user can have their preferred set of patterns. The program allows the user to read an entire file or select blocks of tests, which can then be automatically converted into patterns. This can be repeated continuously for training purposes. The DPS program enables a set of patterns used to represent the alphanumeric character set to be matched to an individual's remaining vision capabilities, unlike conventional English language text where its symbols are pre-ordained, although there can be some variation of size and shape. At present the DPS program does not distinguish between upper and lower case letters. This is to reduce the number of patterns that users need to recognize. All that needs to be transmitted or received by the users' computers are standard text files as each pattern corresponds to an ASCII character. Any conversions to and from the pattern sets are made via the DPS

101

program itself, hence aiding both the transcription as well as the transmission processes for a range common computer interfaces for the blind. The UK Royal National Institute for the Blind who notes that PDF files can now be converted into text or HTTP files for use in text to speech or text to Braille converters [14] . Whilst Paciello states: "The key to reading a Web document or displayed server messages is that the output stream is ASCII. Since many blind users rely on character cell browsers (LYNX, W3, Cern Line Mode Browser) that read ASCII text in conjunction with their synthesizers and Braille displays" [10]. The authors tested the DPS program on themselves and, with practice, learnt to recognize some basic words and phrases. Safety features were also added to avoid pattern presentation speeds that might induce fits in photosensitive epileptics [18]. Advice was also obtained from neurologists working in the field of epilepsy, from the Epilepsy Association of WA [1], medical practitioners and from the internal publications of local Blind Associations [15]. 3

The Experiments

Firstly volunteers were informed of the possible call on their time and any possible risks involved. Each session took a maximum time of 20 minutes for each individual per week. The tests were undertaken over a maximum period of three weeks using the same machine with the same settings. Volunteers were initially asked to attempt to recognize large single characters in various colors and then to recognize words composed of these characters to test their ability to read enlarged text. Volunteers then chose their own patterns. Drill and practice exercises were then given to the volunteers to help them recognize these patterns and to associate them with the characters, and then the words, represented. Previous work by the authors developed a method to represent a mathematically based relationship between patterns, via the pre-testing of volunteer's vision system, was not used due to time constraints [19]. There were 12 volunteers, one volunteer was deaf-blind and the others were blind. Although all had less than 10% vision none was unable to read highly enlarged text, in some cases a single letter covering most of the screen. The authors regard this as an essential requirement to prove the feasibility of the DPS. Yet, over the test period of 3 weeks, useful data still was obtained regarding some of the problems that blind users could experience when using this system. Some other potential volunteers were excluded, as they could see neither the computer nor the screen. The shape of the colored areas used in the patterns chosen did not have any significant effect on the ease of pattern recognition. Some volunteers had parts of their vision field missing, and this required the position of the pattern elements to be adjusted to accommodate this condition. Other volunteers could not distinguish between colors that a fully sighted person would have perceived as clearly different. This reinforced the authors' trust in their adopted method of allowing users to choose their own pattern sets. Although a lot of flexibility was built into the DPS test bed program,

102 this was insufficient, as some volunteers would have preferred to use more than 5 pattern elements. There was unexpected "interference" experienced by 2 volunteers who had previously learnt Braille even though they had only previously used Braille via touch. There was a large difference in the volunteers' success rates in discerning patterns and pattern sequences and also in relating these to the characters and words that they represented. One volunteer could recognize 14 patterns making up more than 10 words included in sentences after only one 10 minute and one 20 minute practice session, although this was exceptional. It has been noted that these skills are different [5], whilst Liu has investigated the connection between memory and display scanning [9].

4

Future Work

As this research is heading into the region of more severe deaf-blind impairment, so communication with volunteers, and finding them, is becoming more problematic. It be necessary for investigators to learn deaf-blind language whereby two people can converse by imprinting letters with their right hand onto the other's open left hand [3]. The DPS program is written in the computer programming language Microsoft Visual Basic. Visual Basic for Applications (VBA) is included in many common Microsoft Office programs and could be used to run DPS as a front end for these and many other programs. Thus the DPS program would merely enable the conversion the text to patterns or when using such packages.

5

Social Inclusion

"Participation in the broader community depends on the individual capacity to receive and use visually-based information. Limited access to visually-based information affects all aspects of the lives of people who are blind or vision impaired" [2]. DPS is intended as a potential aid to help to further promote the social and economic inclusion of the deaf-blind into the workplace and education systems. The importance of including people with disabilities into the modern computerized workplace has been noted by Ladner: "As pointed out to the author by Jay Brill, these two trends, growing numbers of workers with disabilities and growing demand for workers with the ability to use computers compound each other"; and further that: "Meeting the needs of computer users with disabilities is tantamount to meeting an important need of society as a whole" [8]. This message has been further reinforced in a call to make Information Systems (IS) accessible to people with disabilities [7, 17], and there also could be important implications in the US Disabilities Act with respect to the provision of aids for vision-impaired employees using computer equipment [12]. Computer usage can help to foster

103 independence amongst the disabled. In respect to their Internet usage "It is not the time spent on the Internet that increases independence but rather the number and variety of Internet services used. This variety is a source of richness that provides an opportunity to increase knowledge" [4]. 6

Conclusions

The only difference between a user working with a DPS or on a conventional word processing program is that with the DPS program only sequences of patterns are seen on the screen instead of text. DPS remains to be further proved by being successfully used by people who cannot see even enlarged text and needs to be tested using a full character representational set. So far a maximum of 14 characters have been tested on any one volunteer. More testing of deaf-blind volunteers needs to be undertaken using deaf-blind volunteers who cannot see even highly enlarged text. Learning a new character representational system is a major undertaking and only those volunteers who cannot read highly enlarged text are likely to be willing to sacrifice the time and effort required. 7

Acknowledgements

The authors wish to express their gratitude both to the Royal W.A. Institute for the Blind and their volunteers who participated in these experiments and also to the volunteers from Edith Cowan University for their time and assistance. References 1. 2. 3. 4.

5.

6. 7.

AEA, Epilepsy First Aid, Australian Epilepsy Associations (AEA), Epilepsy Association Services, Canberra, Australia.,, (1998) B. Fewtrell, Equitable access for the visually impaired.,, Incite (1998), pp. 10. J. Foley, The Guinness Book of Signs and Symbols, Guinness Publishing Ltd., London, (1993), pp. 68 - 77. C. Grimaldi and T. Goette, The Internet and the independence of individuals with disabilities., Internet Research: Electronic Networking Applications and Policy, 9 (1999). D. L. Hintzman and T. Curran, Comparing Retrieval Dynamics in Recognition Memory and Lexical Decision, Journal of Experimental Psychology, 126 (1997), pp. 228 - 247. I. J. Jacobs, These Keys were made for Talking; Turning Text into Spoken Words,, Mac WEEK, (1993), pp. 14. J. Kador, Drawing the Disabled into IS, Datamation, 39 (1993), pp. 86 - 89.

104

8.

R. E. Ladner, Computer Accessibility for Federal Workers with Disabilities: It's the Law., Communications of the ACM, 32 (1989), pp. 952 - 956. 9. Y. Liu, Interactions between Memory Scanning and Visual Scanning in Display Monitoring, Ergonomics, 39 (1993), pp. 1038 -1053. 10. M. Paciello, Making the Web Accessible for the Blind and Visually Impaired,, 4th international conference on the World Wide Web Consortium (W3C)., Boston MA, (1995). 11. RBS, "When Even Glasses Don't Help" A study of the needs of people who are blind or vision impaired,,, Royal Blind Society (RBS) of NSW, Sydney, NSW, Australia, (1996), pp. 43 - 52. 12. E. Reese, Managers and Employees Benefit from Disabilities Act.,, MacWEEK, (1993), pp. 20 - 2 2 . .13. RNIB, Blind and partially sighted adults in Britain: The RNIB survey,,, Royal National Institute for the Blind (RNIB), London UK., (1991). 14. RNIB, Web Accessibility Initiative. Royal National Institute for the Blind (RNIB) Report on the "Access Track",, 6th international conference on the World Wide Web Consortium (W3C)., Santa Clara USA., (1997). 15. RWAIB, Guidelines and Safety Procedures,,, Royal WA Institute for the Blind (RWAIB) Accommodation Services, Perth, Western Australia, Australia, (1998), pp. 42 - 48. 16. B. Schulz, Development of Computerized Working-aids for Visually Impaired Persons, in A. M. Tjoa, H. Reiterer and R. Wagner, eds., Proceedings of Computers for Handicapped Persons., Oldenbourg, Vienna, Austria, (1989), pp. 189-200. 17. N. Snell, Making IS accessible, Datamation, 38 (1992), pp. 79 - 82! 18. The West Australian Newspapers, Japanese probe sickening cartoon,, The West Australian, Perth, Western Australia, Australia, (1997), pp. 24. 19. D. Veal and S. P. Maj, Dynamic Patterns as an Alternative to Conventional Text for the Partially Sighted., Special Interest Group in Computers and the Physically Handicapped (SIGCAPH) (1998), pp. 11 - 15.

105 DESIGNING FOR DYSLEXIA - THE DEVELOPMENT OF MATHEMATICS SOFTWARE FOR STUDENTS WITH SPECIFIC LEARNING DIFFICULTIES WALTER MIDDLETON*, EMMA LEJK AND CHRIS BLOOR* University of Sunderland, School of Computing, Information Systems and Engineering, P.O. Box 299, Sunderland, SR6 OYN, England E-mail: (firsiname.secondnamej @sunderland.ac.uk Approximately 3% of students at the University of Sunderland have a Specific Learning Difficulty (SpLD) or Dyslexia, and a large proportion of these students also appear to struggle with mathematical concepts. In order to help tackle this problem we have developed a multimedia teaching program. An initial review of screen design guidelines for this class of user indicated that there were few general guidelines and that those which did exist were often conflicting. One issue which constantly arose was that the requirements for font type, case, and screen colour schemes were very much based on the specific requirements of individual users. Expert teachers and counsellors of students with SpLD were consulted and their methods for mathematics teaching adapted and used as the basis of our program. The guidelines we used, for both screen design and the actual teaching of mathematics, are presented and the results of our evaluation discussed.

1

Introduction

Dyslexia is a complex neurological syndrome which affects some 10% of children, 4% severely [1]. Most research [2] has focused on language difficulties and supports the hypothesis that dyslexia persists throughout life. The characteristics of dyslexia can appear in different combinations and different levels of severity. The following features [1,3,4,5] are common: •poor reading, spelling and written language; •problems with visual processing and phonology; •poor short-term memory and organisational skills; •poor motor, automatising and co-ordination skills; •the late achievement of developmental milestones; •problems with numeracy. The latter point may explain the recent increase in research focusing on the mathematical difficulties experienced by those with dyslexia [6-9].

106 2

Mathematics and Dyslexia

Mathematics is a sequential subject which builds on existing skills and knowledge to attain the new. The subject involves organisation, patterns, abstract ideas and concepts. The necessary skills for mathematics [10] are suggested as: flexible thinking; logical thought in quantitative relations, number and letter symbols; the ability to think using mathematical symbols; and the ability to rapidly and broadly generalise mathematical objects. Mathematics teaching should aim to maximise the development of these skills commensurate with the stage of cognitive development that is needed for conceptual understanding [6]. As mathematical skills are developed, different cognitive skills are required but there does not appear to be a single cognitive defect which causes failure in learning mathematics. Research [7] suggests that there may be several cognitive areas with the potential to be defective. Specific areas of the dyslexia syndrome associated with causing problems for learners of mathematics have been categorised as follows: Memory problems. Wrongly changing symbols in mid calculation [8]. Spatial awareness. Students may ignore zeros in numbers [9]. Language and symbols. The misinterpretation of familiar words is common [8]. Direction and sequencing. Counting backward is a common difficulty [9]. Visual perception. Differentiating x,-s-,+ and - can cause difficulties [9]. Fractions. The concept of equivalence causes particular difficulties [6,10]. Avoidance. Difficult questions are avoided or abandoned [12,13]. 3

Teaching mathematics to students with dyslexic using CAL

Hogg [16] identified a number of possible causes for poor mathematics performance amongst dyslexics. These can be summarized as 'intellectual' difficulties such as making associations and generalizations and a restricted ability to interpret abstractions; plus 'physical' difficulties such as a short attention span and impoverished sensory perception.: Research [2,5,14,15] suggests that effective approaches to overcome the above are those which incorporate multisensory learning; over-learning and automaticity; structured learning; sequential and cumulative approaches; sympathetic teaching, and real-world applications. Clark [17] also suggests the CAL based solutions below as an effective way of combating Hogg's identified set of problems: Flexibility of presentation to enable generalisation and association extension; Simulation and demonstrations to associate the concrete with the abstract; Interactivity and interest to can promote intellectual curiosity; Reinforcement by using graphics, sound and animation to revitalise interest.

107 4

The CAL program

We have established above the types of techniques which, the research suggests, should be effective in a CAL programme directed towards dyslexic students. One further issue is the actual screen design. Here we encountered difficulties. The guidelines we were able to identify [18,19] gave advice which was either conflicting or suggested that the choice of fonts and colour was very much an individual matter. One web site targeted towards a dyslexic audience actual allows users to vote on their preferences[19]. Whilst this is not a well-controlled survey, and we cannot guarantee that only people with dyslexia will vote, it is interested to note that opinions were divided on the choice of font (serif or sans serif) and background colour. Given this situation we consulted with expert teachers and counsellors and decided on a fairly neutral colour scheme. The actual program presents work in the areas of arithmetic operations, basic fractions, fraction arithmetic and conversions all involving interactivity. Feedback is provided but the learner is given a full solution after three unsuccessful attempts at a question. Money, measurement and time are used to explain decimals and equivalent fractions and the program appeals to the senses of the learner in the following ways. Visual features, colours, fonts and custom designed buttons with icons rather than written labels are used as are real-world metaphors. Aural features, a voice-over, activated by a button is offered on every page. Tactile/kinesthetic features, dragging objects ensures user interaction. Overlearning, to facilitate automaticity via worked examples and a test section. 5

Program Evaluation

The software was tested by four dyslexic students, and three non-dyslexics who acted as controls, in consequence the analysis given is qualitative. Each student was given a paper copy of relevant exercises and allowed one hour to complete them. They were then given one hour to use the program and asked to complete a similar set of questions on-screen. Finally a questionnaire was completed.

5.1

Comparison of CAL and paper tests

Table 1 shows that the control students (5,6,7) all performed better on the paper tests than computer tests. They also performed better than the dyslexic students on the paper test (with one exception) but made careless errors on-screen due a lack of attention to instructions. Two of the poorer students (3,4) performed better onscreen than on paper. It is suggested that these students gained the most from the program. Students 1 and 2 performed better on paper than on-screen, these students were apprehensive when using the computer. One dyslexic student had experience of

108 degree level mathematics, and scored marks on a par with non-dyslexic students on paper showing that it is wrong to generalise that all dyslexics are poor mathematicians.. Table 1. Results of paper and on-screen tests Paper tests % 89 50 59 10 100 97 86

Name Student 1 (SpLD) Student 2 (SpLD) Student 3 (SpLD) Student 4 (SpLD) Student 5 Student 6 Student 7

5.2

On-screen tests% 52 42 84 21 91 81 81

The questionnaire responses

General attitudes towards mathematics were obtained by asking the students to grade the four statements shown in table 2 below. Table 2. Student attitudes towards mathematics Control students

Dyslexic students I like to do mathematics I am worried about mathematics I try to avoid doing mathematics I feel I am good at mathematics

Disagree

Agree

Neutral

Disagree

Agree

Neutral

3 1

I 3

0 0

0 3

3 0

0 0

0

4

0

3

0

0

3

0

1

0

3

0

Three dyslexic students expressed negative feelings about mathematics and all four sought to avoid it. Interestingly, even the student with experience of degree level mathematics could only express neutral feeling when questioned about his ability. The preferred mode of working (Table 3) appears to be slightly biased towards the computer although student 1 preferred working with paper on the grounds of familiarity and student 2 liked both methods of working. Computer preference was based on the grounds of clarity, visual appeal and variety of approaches. Table 3. Preferred mode of working

j j Paper

Dyslexic students 1

Control students 0

j

109 I Computer | Liked both

::::::::::::z:::::::::::::i 1

2 1

All four features incorporated were appreciated by most users, dyslexic or not, particularly the voice-overs and animation as shown in Table 4. However, it is surprising that two dyslexic students expressed no particular preference for colour. Table 4. Student opinions on CAL features

Dyslexic students Colours Voice-overs Navigation, buttons Animations

Control students

Liked

Disliked

Neutral

Liked

Disliked

Neutral

2 4 3 4

0 0 0 0

2 0 1 0

3 3 3 3

0 0 0 0

0 0 0 0

The students were asked whether they thought that the brief exposure to the program may have improved their mathematics. Two dyslexic students and one control answered 'yes', one control answered 'no.' The responses of the two dyslexic students are encouraging, particularly as these students (3 and 4) had the most severe problems. Finally, the users suggested a reduction in the amount of text, an increase in the use of graphics, the use of a larger font size and a revision of the fractions section of the program. 6

Discussion

A results analysis showed that users often got a correct answer only at the second or third attempt, the feedback provided appearing effective at re-directing the thoughts of the users. Some students claimed that they were more inclined to enter speculative answers into a computer than on paper since they did not feel embarrassed if they were wrong. The variability of the results does not appear to indicate that this degraded performance.

7

Conclusions

Students with SpLD at the University of Sunderland now have access to a CAL program designed with them in mind. Experts who teach and counsel students with SpLD have seen their input realised in terms of a usable program which has undergone successful initial trials and evaluation. Suggestions from the testers for improvement should lead to further developments in the near future.

110

References 1. Crisfield J., The Dyslexia Handbook (1996), British Dyslexia Association. 2. McLoughlin D., Fitzgibbon G. & Young V., Adult Dyslexia: assessment, counselling and training, (1994), Whurr, London. 3. Miles T.R., Dyslexia: the Pattern of Difficulties, (1983), Blackwell. 4. Vellutino F.R., Dyslexia, Scientific American, 256(3), (1987), pp. 20-27. 5. Thomson M., Developmental Dyslexia (3E, 1990), Whurr, London. 6. Bley N. & Thornton C, Teaching Mathematics to the Learning Disabled, (1981), Aspen Systems Corporation, Maryland. 7. Ginsburg H.P., Mathematical learning disabilities: a view from transdevelopmental psychology, J. Learn. Dis. 30(1), (1997), pp. 20-33. 8. Henderson A., Math, and Dyslexics, St David's College, Llandudno, (1989). 9. Clayton P., Using computers for numeracy and mathematics with dyslexic students, (1994), Dyslexia Computer Resource Centre, University of Hull. 10. Krutetski V.A., In Kilpatric J. & Wirszup I., The Psychology of Math. Abilities in School Children, (1976), University of Chicago Press 11. Baumunk K. & Dowling C.E., Validity of spaces for assessing knowledge about fractions, J. Math. Psychology, 41, (1997), pp. 99-105. 12. Chinn S.J., A pilot study to compare aspects of arithmetic skills, Dyslexia Review 7(1), (1995), pp. 4-7. 13. Miles T.R. & Miles E. (eds.), Dyslexia and Mathematics, London, (1992), Routledge. 14. Dienes Z.P., Building Up Mathematics, Hutchinson Educational, (1960). 15. Aubrey C, Eaves J., Hicks C. & Newton M.J., The Aston Portfolio, Wisbech, (1982), Learning Development Aids. 16. Hogg B., Microcomputers and Special Educational Needs: a Guide to Good Practice, (1984), National Council for Special Education. 17. Clark M.M., Educational technology and children with moderate learning difficulties, The Exceptional Child, 33(1), (1986), pp. 28-34. 18. iANSYST Ltd, http://www.dyslexic.com/rational.htm accessed 5th March 2001 19. iANSYST Ltd, http://www.dyslexia.com/qaweb.htm accessed 5th March 2001

111

USING HANDS-FREE TECHNOLOGY IN PROGRAMS FOR PROFOUNDLY DISABLED CHILDREN H. TODD EACHUS Advocate Schools, 11980 South Mount Vernon Avenue, Grand Terrace, California E-mail: [email protected]

92313

ANDREW M. JUNKER Brain Actuated Technologies, Inc., 1300 President Street, Yellow Springs, Ohio 45387 E-mail: [email protected] Access to computer resources, the internet, interactive software and computer-based assistive devices by profoundly disabled and vegetative children and adults is extremely limited. This problem is particularly important when educational programs are being designed for profoundly disabled children who are technology dependent. The availability of technology that permits hands-free operation of a computer mouse permits more open access to edcuational services by such children. The present study explores the use of a hands-free device called Cyberlink in an educational program for children diagnosed as vegetative. Data are presented that demonstrate steady increases of independent use of a personal computer for educational purposes by these students.

1

Introduction

The use of the Persistent Vegetative State (PVS) diagnosis has been the subject of discussion in the literature concerning accurate assessment and certain ethical considerations. Effective use ofan assessment tool has beem re[prted in working with individuals in PVS [8]. Investigators were able to distinguish between those who emerged and who did not emerge from PVS on the basis of scores using the Sensory Modality Assessment and Rehabilitation Technique (SMART). In another study, an assessment methodology for use in studying response to instructions in minimally conscious patients has been reported [7]. The accuracy of assessment in treating minimally responsive patients goes beyond the technical aspects of the assessment. There is a lack of information about the effect of clinical setting, family involvement and instructional activities in the implementation of interventions with such patients [6]. The diagnosis may be taken as an absolute thus providing no further consideration of treatment for the individual. In a discussion of the use of neuropsychological assessment findings in a case involving a minimally functioning patient [5] consideration was being given for withdrawal of feeding. Evidence of communication and responsiveness from the neuropsychological testing resulted in withdrawal of a court petition to terminate feeding. The patient remained

112

dependent six years post injury, but was talking and eating by mouth and showing insight into her condition. There may be a spectrum of conditions involved in PVS which should be periodically assessed. Without some periodic review, there is no likelihood that treatment services will be resumed. This may present certain ethical issues as well. Questions of ethical concerns with the PVS diagnosis such as PVS being ". . . the paradigmatic neurologic syndrome for decisions to discontinue treatment." Have been raised [1]. Elsewhere [4] certain other ethical issues in the use of the PVS diagnosis are discussed concerning that the diagnosis may be inappropriately used to support end-oflife decisions that constitute non-voluntary euthanasia. Certainly efforts to evoke responsiveness in PVS patients are rare, if they occur at all. Only one reference to work with vegetative individuals was found in the psychological literature [2]. That study reported operant conditioning of right arm movements in an 18 year old male. The subject displayed no voluntary motor activity. It is unclear whether he required ventilator support of any kind, but he did take liquid food by mouth. Arm movement was conditioned and extinquished using a sugar-milk solution as the reinforcer. An extensive search of the literature produced no other reports of work with vegetative individuals since 1949. The improvements in medical care in recent years has had the effect of preserving life for individuals who suffer severe trauma, prolonged anoxic events and extremely premature gestation. Many of these individuals are diagnosed as PVS by attending neurologists or pediatricians. Children who are diagnosed as being PVS are considered incapable of benefiting from clinical treatment other than those necessary to maintain life. Attempts to evoke responses to environmental stimuli through occupational, physical or speech therapy are not made. PVS children are kept clean, warm and nourished. They are provided medications to reduce the risk of infection, prevent seizures, reduce respiratory problems and otherwise maintain the highest level of health possible. Ordinarily, few additional treatment services are offered 2

Method

2.1 Subjects Recently, an educational program for thirty-five children residing in a sub-acute care facility was structured. These children were victims of catastrophic illness or injury and some presented with severe birth defects. Thirteen of these children had been diagnosed as PVS. None of the thirteen was receiving any OT, PT or Speech. Medical reports indicated no voluntary or responsive movement.

113

Subjectively, the teacher and aides working with these children reported that "something was going on." They felt that they were able to distinguish mood changes in their students, that some of them responded to instructions and that they enjoyed such activities as hearing stories read or listening to music. However, monthly reviews by the attending neurologist and other physicians did not reveal any change in the status of these children. The thirteen PVS students ranged in age from 3 to 20. Eight are female and 5 are male. Three of the children had been victims of near drowning. Three had been struck by a motor vehicle. Two others had experienced hypoxic encephalopathy. At least one was born with the congenital disorder of microcephalus and one was born hydrocephalic. These children are particularly susceptible to respiratory infections. Each is easily fatigued. The findings for one of these students will be presented here. The subject is a 17 year old female who was struck by a motor vehicle at the age of 11. She suffered a traumatic brain injury, cerebral palsy, spastic quadriplegia and scoliosis. She was diagnosed as PVS following emergency treatment at the time of the accident. She has a tracheostomy and gastrostomy. Her chart notes that she tracks visually, uses eye-blink to indicate yes/no and that she can use switches with physical prompting but displayed no voluntary motor control. 2.2 Equipment The field of Augmentative and Alternative Communication (AAC), has provided a means for many individuals to express themselves that has not been available in the past. Alternative computer input devices are now available in a variety of options to support special needs communication and access to AAC (i. e., frontalis muscle switch, head and eye-tracking devices, chin switches, 'sip and puff,' voice activation, etc.) Use of this technology is, however, dependent on the user's ability to control their muscles. Children and adults who have no motor control of head movement and who have undergone tracheostomy cannot use much fo this technology. These individuals often have disabilities related to cerebral palsy (CP), amyotrophic lateral sclerosis, multiple sclerosis, muscular dystrophy, or traumatic brain injury. The subject was provided with a Cyberlink device to control a computer mouse. The Cyberlink is a hands-free controller that operates in a non-coherent phase detection mode. The Cyberlink consists of a headband, interface box, serial connection to a PC, and decoding software. Signal electrodes are held to the forehead in approximately the Fpl and Fp2 locations with the headband. A reference electrode is located between the two signal electrodes. The electrodes are fabricated from silver-silver chloride plated carbon filled plastic. The three electrodes operate in a common-mode rejection configuration. Low-noise preamplifiers and analog filters are used to separate the

114

sensor signals onto three bands or channels of data. Since these signals are derived from the forehead, the three channels each contain a mix of ElectroOcular Graphic (EOG), EEG and EMG generated bio-potentials. Within the PC various Cyberlink software programs, in the form of training games, are available to provide the user opportunities to learn to control the forehead-derived signals. 2.3 Procedure The subject had the headband with sensors put on during class time for from three to five sessions per week for a total of 160 training days. The teacher calibrated the Cyberlink device at the start of each training day and then started one of the training programs to develop control of vertical, horizontal or diagonal movement of a cursor on the computer monitor. These training sequences appeared as video games similar to Pong or various mazes. The degree to which the subject was able to complete the training program without assistance was charted on a simple scale ranging from 0 for no response to 100 for responses without verbal or physical prompting. A seven point scale indicated the amount of physical and verbal prompting provided to complete a response by the subject. As the subject demonstrated greater independence in completing simple training activities, her instructor presented slightly more difficult tasks using additional verbal prompting until mastery or near-mastery was observed. Textual material, concept formation tasks and maze solving were introduced during the training period. 3

Results

The subject acquired skill in moving the cursor in the vertical axis quickly with only verbal prompts . By training day 28, the attending neurologist observed the trials and removed the PVS diagnosis. Short story materials were introduced on day 43. The subject was able to use the click function to turn 'pages' in the on-screen book. By training day 155, the subject had demonstrated reliable control of the vertical, horizontal and diagonal movement of the cursor. This enabled her to utilize the computer mouse to operate several instructional software programs. She continues to need some verbal prompting to complete tasks, but has demonstrated more frequent independence. No reinforcers other than task completion and verbal praise from staff have been provided to this subject. Her medical condition has prevented her from using the device on a daily basis so the rate of acquisition has been slow. However, staff have reported that at approximately training day 150, the subject lifted her arm to place her hand on a switch to turn on a tape player. This response was not part of the training program.

115

VERT PONG 4/10 ~HOR PONG 8/10

PVS DIAGNOSIS REMOVED

\

/

/

USED CYBERLINK TO FOLLOW SHORT STORY

TWO AXIS DIAGONAL PX

L TWO AXIS PLUS CLICK

4 MAZE TRIALS

VOL USE OF RT HAND

TRAINING DAYS

Figure 1. Skill Acquisition

116

,,\

IDustration 1. Subject tracing :a maze using tie Cyberimk

4

Dlscusstoi

Children who suffer catastrophic illness or injury are left with severe deficits that result many times in a.diagnosis of Persistent Vegetati¥e State. When that diagnosis is made, typically no further efforts are made to stimulate motor or language responses. The absence of motor or language responses makes it difficult to determine whether any academic skills are present or not. The availability of assistive technology to permit such children-to display responses in such forms as switch closures and mouse cursor movement provides a means to shape progressively more complex repertoires in an academic setting. In addition^ the alteration of the PVS diagnosis can result in greater gains for an individual child with the resumption of clinical services such as physical therapy, occupational therapy and speech and language services. The availability of assistive technology that provides access to computer resources for individuals who do not have motor control of their extremities or facial muscles can open a wide range of resources for them. Individuals whose lives have been

117

dramatically altered by catastrophic injury or illness can now have the opportunity to regain some control over their environment and to restore communication with their fellows that otherwise would be lost.

References 1. 2. 3.

4. 5.

6.

7.

8.

Cranford, R. E. The vegetative and minimally conscious states: ethical implications. Geriatrics, 53 (1998) pp.70-73. Fuller, P. R. Operant conditioning of a vegetative organism. Amer. J. Psychol. 62, (1949) pp. 587-590. Junker, A. M. and Berg, C. R., The Cyberlink control system: A brain-body actuated computer interface and hands-free mouse replacement. Procd. IEEE Trans Rehab Eng. 8:1 (2000) McLean, S. A. Legal and ethical aspects of the vegetative state. J Clin Pathol. 52 1999, pp. 490-493. McMillan, T. M. and Herbert, C. M. Neuropsychological assessment of a potential "euthanasia" case; a 5 year follow up. Brain Inj. 14:2 (2000) pp. 197203. Piguet, O., King, A. C , and Harrison, D. P. Assessment of minimally responsive patients: clinical difficulties of single-case design. Brain Inj. 13:10 (1999) pp. 829-837. Whyte, J., DiPasquale, M. C , and Vaccaro, M. Assessment of commandfollowing in minimally conscious brain injured patients. Arch Phys Med Rehab, 80:6 (1999) pp. 653-660. Wilson, S. L., Gill Thwaites, H. Early indication of emergence from vegetative state derived from assessments with the SMART-A preliminary report. Brain Inj. 14:4 (2000) pp. 319-331.

Parallel Computing/Techniques

121

D Y N A M I C LOAD B A L A N C I N G IN T H E PARALLEL C O N T I N U O U S GLOBAL OPTIMIZATION P R O B L E M B Y U S I N G INTERVAL A R I T H M E T I C * A. BENYOUB AND E. M. DAOUDI Labo. Research in Computer Science, Faculty of Sciences, University of Mohamed First, 60 000 Oujda, Morocco, E-mail:{ benyoub, mdaoudi} ©sciences. univ- oujda. ac.ma In this work, we theoretically study, on a distributed memory architecture, the parallelization of the continuous global optimization problem with inequality constraints, using interval arithmetic, in particular we are interested by the load balancing problem which can decrease the performance of the parallel algorithm. We propose a parallel algorithm based on a dynamic and cyclic redistribution of the working list among the processors.

1

Introduction

The continuous global optimization problem with inequality constraints is well known and can be formulated as follows:

{

Minimize f(x), x 6 S C Mn. Subject to constraints: Ci(x) < 0, i = l, •••,jn, m € IN.

Where the objective function / and the constraints Ci, i = 1 • • -m, are continuously difFerentiable real functions defined over the domain S — [ai,&i] x ••• x [an,bn], (OJ,&i)i<j< n € H 2 . Let / * = m i n / ( x ) be the minimum of / on 5 and P* = {x* | f{x*) = / * } the set of feasible global minimizers x* of / on S. Solving the same problem by using interval arithmetic consists in finding subboxes X* which bound the feasible points a:* and a real interval [/,/] which bounds the global minimum /*. The main idea behind interval Branch-And-Bound methods consists in the definition of [4]: (i) an inclusion function F to compute bounds for / , (ii) an inclusion functions (Cj)i<j< m for the inequality constraints (cj)i
122 Several works in the literature, based on the interval Branch-And-Bound methods are developed for the global optimization problem and its parallelization with different load balancing strategies [3,4,5,6,7,10,12,13]. This work deals with the parallelization of the Hansen's algorithm [7], in particular we propose some techniques in order to improve the load balancing. In Sec. 2, the principle of Hansen's sequential algorithm is presented. Sec. 3 is devoted to present the parallel algorithm with the load balancing strategy. Sec. 4 presents the dynamic and cyclic redistribution of the working list. 2

Hansen's sequential algorithm

This algorithm is based on the interval arithmetic and consists in making an exhaustive search in a box. If this box is not rejected, during the different elimination phases, then it will be reduced by applying the reduction step, during which several sub-boxes, of lower sizes, will be generated. The subboxes of too small sizes (lower than a given tolerance) will be stored in a final list. On the other hand the boxes with large sizes (upper than the given tolerance) will be subdivided into several new sub-boxes with lower sizes and will be stored in a working list in order to be treated later. After that, the bounds over the global minimum are then computed. At the end of the execution (the working list becomes empty), the global minimum is localized inside a real interval with small size and the final list is composed of certainly feasible or indeterminate small sub-boxes containing the global minimizers. 3

Principle of the parallel algorithm with the load balancing

It is based on the decentralized approach where the working list WL is distributed among the (p— 1) processors (slave processors). But, on the processor PQ (master processor), we only store the lower and the upper bound values fk and fk computed locally in each slave processor Pk, 1 < k
123

an estimation of the number of the remaining boxes in the local working lists as soon as a local working list becomes empty. This redistribution, avoids that the corresponding processor becomes idle. We present in the following the steps of the parallel algorithm: • Step 1: Po partitions and distributes the initial box X^ (p - 1) slave processors.

among the

• Step 2: Each slave processor Pf.: - computes a first local upper bound fk by using one of the methods described in [1,7,13]. - sends /* to the processor P0. • Step 3: Po computes / =

min

(fk) and then broadcasts / to proces-

i<Jfe
sors Pk• Step 4: Each Pk applies the sequential Hansen's algorithm on its local working list WLk while this one is not empty. If WLk becomes empty {WLk = 0) then: - a synchronization is established, in order to estimate the total number of the remaining boxes among the slave processors. - a uniform redistribution of the remaining boxes among all slave processors is performed. • Step 5: the program stops when all local lists become empty. Updating the global upper bound: For updating the global upper bound / , we distinguish two possibilities: (a) The value of fk is dynamically communicated while the local list WLk is still not empty. This possibility needs more communications, but on the other hand, it requires less computations since the dynamic exchange of the values of fk presents the advantage to update dynamically the global upper bound / which is used to eliminate the sub-boxes verifying

FL(X)>J.

(b) The value of fk is communicated during the synchronization phase (step 4). This possibility needs less communications but it jieeds more computations since the update of the global upper bound / is delayed until a synchronization is established (a local list becomes empty).

124

4

Dynamic and cyclic redistribution of the working list

In the literature, Berner [3], Ratschek and Rokne [13] show that the "bestfirst strategy", which consists in ordering the boxes X of the working list by non-decreasing values of FL(X), is best than other strategies like "depthfirst strategy" and "oldest-first strategy", since at each iteration, the search is continued in the box X where / has the smallest lower bound FL(X) (bounding principle). If a given box X is chosen according to the "best-first strategy", and if this was not rejected, then the chances are best for finding the global minimum inside the box X. Our technique for the redistribution of the boxes among the processors exploits the criterion of the "best-first strategy". Assume that the global working list WL contains r boxes ordered according to the "best-first strategy", as follows: WL = {X\ X2,---,Xr-x, Xr} where: FL(X1) < FL(X2) <•••< FL(Xr-1) < FL{Xr) A first technique to redistribute the boxes among the (p — 1) slave processors v v is to affect the first _ | r j boxes to one processor, the next _ | -J boxes p— 1 p—1 to another processor,- • •, etc. Unfortunately, this redistribution gathers the r best first [ 7 J consecutive promising boxes in the same processor and on p-1 the other hand the consecutive "less promising boxes" in another processor. This strategy can increase the number of synchronization phases, and hence the communication cost since it is probably that the local working lists composed of the "less promising boxes" become empty more quickly than the local working lists composed of the "best promising boxes". This leads that some processors become idle more quickly than others. In order to keep all processors working as long as possible, before a synchronization phase, we propose to adopt a cyclic redistribution of the sorted working list among the processors in the following manner:

p 1 ^{x 1 ,x(p- 1 )+ 1 ,x 2 (p- 1 )+ 1 ---} p2 <— {x 2 , x(p-x)+2,x2(p-1'+2, • • •}

Pi <— {x\ x(p-v+i, x2
This redistribution technique avoids to store, on the same processor the "best

125

promising" boxes in the same processor. Since the proposed algorithm with the cyclic and dynamic redistribution takes into account: - the improvement of the load balancing: as soon as a processor becomes idle, a uniform redistribution of the remaining boxes is performed. - the reduction of the number of synchronizations and hence the communication costs: • a synchronization is only established when a local working list becomes empty. • the "best an less" promising boxes are uniformly distributed on all slave processors. We think that we will obtain a good results with this strategy. The simulation of the parallel implementations will be realized using the MPI environment. 5

Conclusions

In this work, we have presented and discussed the parallelization of Hansen's algorithm. This algorithm, based on a dynamic and uniform redistribution of the working list, is developed for improving the load balancing. In order to minimize the number of the synchronization phases and consequently the communication costs, we have proposed a cyclic redistribution of the working list, sorted according to the "best-first strategy" proposed in the literature. The sequential algorithm is implemented [1] using the C + + PROFIL/BIAS libraries [8,9]. The simulation of the parallel implementations will be realized using the MPI environment. We think that we will obtain a good results since our algorithm takes into account: - the improvement of the load balancing: as soon as a processor becomes idle, a uniform redistribution of the remaining boxes is performed. - the reduction of the number of synchronizations and hence the communication costs: • a synchronization is only established when a local working list becomes empty. • the "best an less" promising boxes are uniformly distributed on all slave processors.

1. Benyoub A. and Daoudi E.M., An Algorithm for Finding an Approximate Feasible Point by using Interval Arithmetic. Research Report (LaRI Laboratory, Dept. of Maths&CS, Faculty for Science, Univ. Mohammed First, Oujda, Morocco, 2000). 2. Benyoub A. and Daoudi E.M., Parallelization of the Global Continuous Optimization Problem by using Interval Arithmetic. In HPCN2001 Amsterdam (LNCS Springer-Verlag, 2001). 3. Berner S., A Parallel Method for Verified Global Optimization, Scientific Computing and Validated Numerics: Proc. Int. Sym. on Sc. Computing; ed. G. Alefeld, A. Prommer, B. Lang (Akademie Verlag GmbH, Berlin, Germany, 1996). 4. Casado L.G. and Garcia G., Work Load Balance Approaches for Branchand-Bound Algorithms on Distributed Systems. Research Report (Department of Computer Architecture and Electronics, Almeria University, Almeria, Spain, 1998). 5. Denneulin Y., Mehaut J.F., Planquelle B. and Revol N., Parallelization of Continuous Verified Global Optimization. 19th Conference TC7 on System Modeling and Optimization (Cambridge, England, 1999). 6. Eriksson J. and Lindstrom P., A Parallel Interval Method Implementation for Global Optimization using Dynamic Load Balancing. Reliable Computing 1(1995), pp. 77-91. 7. Hansen E., Global Optimization Using Interval Analysis. (Marcel Decker, New York, USA, 1992). 8. Knuppel O., BIAS - Basic Interval Arithmetic Subroutines. Technical Report 93.3 (Harburg-Hamburg, Germany, 1993). 9. Knuppel O., PROFIL - Programmer's Runtime Optimized Fast Interval Library. Technical Report 93.4 ( Harburg-Hamburg, Germany, 1993). 10. Leclerc A.P., Efficient and Reliable Global Optimization, PhD Thesis (The Ohio State University, USA, 1992). 11. Leclerc A.P., Parallel Interval Global Optimization and its Implementation in C + + , Interval Computations 3 (1993), pp. 148-163. 12. Messine F., Methodes d'Optimisation Globale basees sur I'Analyse d'Intervalles pour la Resolution de Problemes avec Contraintes. Ph.D. Thesis ( Universite Paul Sabatier, Institut de Recherche en Informatique de Toulouse, France, Septembre, 1997). 13. Ratschek H. and Rokne J., New Computer Methods for Global Optimization. (Ellis Horwood Limited, Market Cooper Street, Chichester, West Sussex, P 0 1 9 1EB, England, 1988).

127

INVESTIGATION OF A LOW-COST HIGH-PERFORMANCE SHAREDMEMORY MULTIPROCESSOR SYSTEM FOR REAL-TIME APPLICATIONS CONSTANTINE N. MANIKOPOULOS, SOTIRIOS G. ZIAVRAS AND CHARALAMBOS CHRISTOU Department of Electrical and Computer Engineering, New Jersey Institute of Technology, University Heights, Newark, NJ, 07102-1982, USA E-mail: [email protected] and [email protected] The proposed architecture can effectively support up to 64 modern digital signal processors (DSPs) in contrast to a smaller number of DSPs supported by existing bus-interconnected systems. This significant enhancement is achieved by introducing two small programmable fast memories (Twins) between the processor and the shared bus interconnect. While one memory is transferring data from/to the shared memory, the other is supplying the core processor with data. The elimination of the traditional direct linkage of the shared-bus and processor data bus makes feasible the utilization of a wider shared bus i.e., the shared bus width becomes independent of the data bus width of the processors. Simulation results show mat the fast prefetching memories and the wider shared bus provide additional bus bandwidth to the system, which eliminates large memory latencies; such memory latencies constitute the major drawback for the performance of shared-memory multiprocessors with standard memory modules.

1

Introduction

Parallel architectures can be classified in two large classes: shared-memory multiprocessors and message-passing multicomputers [1] [2]. Multiprocessors have a single, global, shared address space visible to all processors. Communication takes place via the shared memory. Multicomputers do not have a shared memory and must communicate by message passing. Mainly because of programming complexities of message-passing multicomputers, the architectural trend for future general-purpose computers favors distributed shared-memory systems (DSMs) where a multicomputer has hardware and software support for shared memory [1] [3] [4] [5] [6]. Latency tolerance for memory access is a major limitation in shared-memory systems [7] [8] due to the bandwidth constraint. Four modern processors accessing the same memory module can easily use up all the available bandwidth [9]. Private caches have contributed to the reduction of the ill effects of large memory access times. They are placed between the (fast) processor and (slow) main (shared) memory and have the basic function to hold regions of recently referenced shared-memory, i.e. prefetching [7] [8] [10] [11]. Software controlled prefetching schemes rely on the programmer/compiler to insert prefetch instructions prior to the instructions that trigger a miss. Hardware controlled prefetching schemes detect accesses with regular patterns and issue prefetches at run time [4] [8]. The simplest and least costly way to construct a multiprocessor system is to connect the processors on a shared bus [4] [12]. The amount

128 of data, which a shared bus can deliver to a computer system, depends on its speed (clock rate), memory access time, and data-bus width. The proposed twin-prefetching system separates the traditional linkage of the shared bus and processor bus and enables the utilization of a wider shared bus [13]. 2

Twin-Prefetching DSP-Based Shared-Memory System

The proposed multiprocessor is a high-speed low-cost DSP-based twin-prefetching shared-memory MIMD parallel system (Figure 1). It consists of P nodes where P is power of two. The system is investigated for several values of P such as 1,2,4,8,16,32, and 64. For every value of P, shared-bus width receives values 32, 64, 128, 256, and 512. We execute two-dimensional convolution on eight consecutive high-resolution (1024x1024, 16-bit pixels) images with a 3x3 template matrix. At the heart of each node is a 32-bit DSP processor (ADSP-21060, a super harvard architecture processor (SHARC) [14]) optimized for image processing, graphics, speech, sound and other high-speed numeric processing applications. It has four independent buses for dual data, instructions, and I/O. With its separate program and data memory buses and on-chip instruction cache, the ADSP-21060 can fetch two operands and an instruction (from the cache), all in a single cycle. The ADSP-21060 contains 4 megabits of on-chip SRAM, organized as two blocks of 2 Mbits each, which can be configured for different combinations of code and data. Each memory block is dual-ported for single-cycle, independent internal memory accesses. While each memory block can store combinations of code and data, accesses are more efficient when one block stores data, using the data memory (DM) bus, and the other block stores instructions and data, using the program memory (PM) bus. Thus, a dedicated bus to each memory block, assures single-cycle instruction execution with two data transfers. Single-cycle instruction execution is also maintained when one of the data operands is transferred to or from offchip, via the ADSP-21060's external port. The twin-prefetching system operates thus, storing code and some data (filter coefficients, for example) in the internal memory and retrieving all image data from the DM data bus through the external port. 2.1 Data Memory For the proposed system, the data memory (prefetching cache) functions as both a passive (memory) and an active (processor) resource; passive because it supplies the core processor with data, and active because it initiates and completes data transfers from/to the global (shared) memory. The data memory is comprised of two controllers (twin TTCs) and two fast memories (twin-prefetching caches) placed between the DSP processor and the shared-bus interconnect. The two TTC/cache pairs are referred to as

129 Twin! and Twin2. In a typical operation, one Twin is accessible to the processor providing data operands while the other Twin is transferring data from/to the shared memory, i.e., as soon as a block of data is moved into the cache, the ADSP-21060 begins processing it, while the other cache is emptied and then filled with new data. The Twinl and Twin! controllers are more specifically, DMA-like devices capable of twodimensional addressing. They move rectangular regions of data between global memory and one of the node's caches. Loading (input image segments) and unloading (results) from/to the Twins occur simultaneously with data processing. The back and forth switching of Twinl and Twin2 allows maximum utilization of resources; thus optimum system performance. For P processing elements, a maximum of 2P TTCs compete for the shared-bus that is granted to the TTC by an arbitrator implementing rotating priority. Processor Pj is serviced before processor P i+1 and Twinl is serviced before Twin2. More precisely, if Twiny is the j * Twin of the i* processor, the rotation proceeds as follows: Twinn Twin2i Twinpl Twin12, Twin22, ...,Twinp2. 2.2 Host Processor The host can directly read or write asynchronously the internal memory of any ADSP21060 via the VME bus and through the communication channels of the DSP processor. The host processor is responsible for booting up all nodes and downloading all necessary code and some data to the internal memory of every processor. The data downloaded to the internal memories include the addresses of image segments in the global memory that every node is assigned to process. These addresses are the result of partitioning; a technique for decomposing a large data set into many small pieces for parallel execution by multiple processors. Addresses are calculated once for a specific application and they are available for all future requests. 2.3 Partitioning Data into Cache Prefetching Segments If P processing elements are available, the image is divided into P sections, each section divided into / segments i.e., / segments of data are allocated to every processor. Adjacent segments may overlap each other depending on the application. Interchangeably, TTC1 and TTC2 are unloading results and prefetching fresh segments of data for processing in their respective caches. The size of a segment was kept at 1632 Kbytes while the total size of prefetching cache needed for every Twin was less than 100 Kbytes.

130 3 Results Tables 1, 2 and 3 and Figures 2, 3 and 4 show that Convolution with a template matrix of 3x3 is effectively executed (E>0.50) by up to four processors if nl=32, by up to eight processors if n/=64, by up to 16 processors if n/=128, by up to 32 processors if nl=256 and by up to 64 processors if nl=5l2. Figures 3 and 4 and Tables 2 and 3 show near perfect speedup factors and system efficiencies (E>0.90) for «Z=32&P=2,4, for n/=64 & P=2,4, for n/=128 & P=2,4,8, for n/=256 & P=2,4,8,16, and for n/=512 & P=2,4,8,16,32. Table 4 and Figure 5 show remarkable speedups of specific P-node systems, which are due only to nl increases. The speedup of the system increases by increasing the shared-bus-width. For example, the last entry of the last column (corresponding to the 64-node system) in Table 4, indicates a shared-bus-speedup factor of 14.2. Table 4 and Figure 5 show a maximum shared-bus-speedup-factor of 1.3, 1.3, 1.4, 2.7, 5.3, 10.0 and 14.2 for 1-node, 2-node, 4-node, 8-node, 16-node, 32-node, and 64-node systems, respectively. 4 Conclusions The elimination of the traditional direct linkage of the shared-bus and processor data bus makes feasible the utilization of a wider shared-bus. Additional bandwidth provided by a wider shared-bus along with twin-prefetching mechanism makes possible the effective support (E>0.50) of 64 processors, and the near perfect effective support (E>0.90) of 32 processors. The proposed architecture has as a goal to meet the requirements for realtime image processing application execution through the least expensive and least complex hardware. The cost of the proposed system is kept low because of the following reasons: (1) single bus-based systems are the least costly, (2) DSPs cost considerably less that RISC or CISC microprocessors, (3) as this work concludes, the amount of fast (expensive) memory needed for every prefetching cache is small (less than lOOKbytes. In contrast to existing DSP-based multiprocessors, the proposed system sustains peak performance, regardless of image size, due to the twin-prefetching mechanism, i.e., the assigned to each processor image sections, regardless of being large or small are partitioned into smaller segments that are interchangeably loaded on Twinl and Twin2 prefetching caches. References 1. 2.

Kai Hwang, Advanced Computer Architecture with Parallel Programming, McGraw Hill, Englewood Cliffs, NJ, 1993. D.E. Culler and J.P. Singh, Parallel Computer Architecture, A Hardware/Software ApproachMorgan Kaufmann Publ., San Francisco, CA, 1999.

131 3. 4 5 6 7 8 9 10 11 12 13 14

Ted G. Lewis, "Where is Computing Headed," IEEE Computer, Vol. 27, No. 8, pp. 59-63 August 1994. Daniel Lenoski, James Laudon, Kourosh Gharachorloo, Wolf-Dietrich weber, Anoop Gupta, John Hennessy, Mark Horowitz, and Monica S. Lam, "The Stanford Dash Multiprocessor," IEEE Computer, pp. 63-79, March 1992. T.I. Golota and S.G. Ziavras, "A Universal, Dynamically Adaptable and Programmable Network Router for Parallel Computers," VLSI Design, Vol. 12, Nol, pp. 25-52, 2001. S.G. Ziavras, et al., "A New-Generation Parallel Computer and its Performance Evaluation," Future Generation Computer Systems, Vol. 17, No. 3, pp. 315-333, 2000. Todd Mowry and Anoop Gupta, "Tolerating Latency Through Software-Controlled Prefetching in Shared-Memory Multiprocessors," Journal of Parallel and Distributed Computing, Vol. 12, pp. 87-106, 1991. Fredrik Dahlgren, Michel Dubois, and Per Stenstrom, "Sequential Hardware Prefetching in Shared-Memory Multiprocessors," IEEE Transactions on Parallel and Distributed Processing, Vol. 6, No. 7, pp. 733-746, July 1995. Chung-Ho Chen and Arun K. Somani, "Effects of Cache Traffic on Shared Bus Multiprocessor Systems," International Conference on Parallel Processing, pp. 285-288, 1992. Tien-Fu Chen and Jean-Loup Baer, "Effective Hardware-Based Data Prefetching for HighPerformance Processors," IEEE Transactions on Computers, Vol. 44, No. 5, pp. 609-623, May 1995. David J. Lilja, "The Impact of Parallel Loop Scheduling Strategies on Prefetching in a Shared Memory Multiprocessor", IEEE Transactions on Parallel and Distributed Processing, Vol. 5, No. 6, pp. 573-584, June 1994. John K. Bennett, Sandhya Dwarkadas, Jay Greenwood, and Evan Speight, "Willow: A Scalable Shared Memory Multiprocessors," Proceedings of Supercomputing'92, pp. 336-345, 1992. D. Lopez, J. Llosa, M. Valero, and E. Ayguade, "Widening Resources: A Cost-Effective Technique for Aggressive ILP Architectures," MICRO, pp. 237-246,1998. Analog Devices, Norwood, MA, ADSP- 2106XSHARC™ User's Manual, 1997.

132

Figure 1: Twin prefetching multiprocessor system diagram

Table 1: Execution Time vs. Number of Processors template matrix=3x3, when nl=32, 64, 128,256,512 Execution Time vs. Number of Processors, template matrix = z1x3 nl = 32 nl = 256 nl = 64 nl= 128 nl = 512 1 IMI i P T (sec) T (sec) T(sec) T(sec) 2.5391 1.9290 2.U5IO , 1.9697 2.2 M" 1 1.27IU 0.9645 2 1.1071 I.025"7 0.9X49 0.0569 0 4826 0.4932 0.55M 0.5142 4 ll..ll|)f) "'""0.2412 0 26U4 0.2482 0.649* 8 ! 0.1229 0.6495 0.1277 a.lMx 0.1679 16 0.6495 0.12 is | II. U>.?4 O.USd-l 0.065 1 32 64 0.6492 0.3248 0.1624 0.0812 0.0157

—•—nl = 32 - • - n l = 64 —A— nl = 128 - X - n l = 256 - * - n l = 512

1

2

4

8

16

32

64

P

Figure 2: Execution time vs. number of Processors

Table 2: Speedup vs. Number of Processors for template matrix=3x3, when nl=32, 64, 128,256,512

p 1 2 4 8 16 32 64

Speedup vs. Number of Processors, template matrix = 3x3 S S S S S n/ = 512 nl = 256 nl = 32 nl = 64 n/=128 1.0000 i.0000 1.0000 1.001)0 1.0000 l.9W(V 1 .')999 1 .9W1 , 1 .9995 2.0000 3.9937 3.9971 3.X65* j ur7S6 3.0SXX7 "7,'>35l> 7.1)67S 7.X70 * O.XI.Sf, ""15.4241 15.6957"" " 12.2156 " 22.7975 2y.C312 "' 5.9W3 6.S150 "12.0293 12.2101 24.2^4 12.6293 3.9193 6.8156

S ideal 1.0000 2.0000 4.0000 8.1)000 15.7000 31.0000 (4.0000

134

Figure 3: Speedup vs. number of Processors

Table 3: Efficiency vs. Number of Processors for template matrix=3x3, when nl=32, 64, 128,256,512 Efficiency l>s. Number of Processors, template. matrix=3x3 E E E E E P n/=32 n/=64 n/=128 n/=256 n/=512 1 l.ddOd 1 iHJOO l.oddd 1.0000 1 .IMllIll 2 O.'UMfi 1) OOlJS 11. WW l.ddDI) " 4 IMtfrfi? n^in; ~ 0.WU7 ()')i)S4 "o.w: 8 D.S'fO . ()t)S-|S .i I.--: ( ) 1>1>I,M 16 II ' . : , n INiii D.7filS MWIo D'JSIM 32 0.1222 0.2130 0 'Odd 0.394 7 0.7124 64 0.0611 0.1065 (1 M l ) i 0.1973 0.3790

135

-nl = 32 -nl = 64 -nl = 128 -nl = 256 -nl = 512

8

16

32

64

Figure 4: Efficiency vs. Number of Processors

136

Figure 5: Speedup vs. Shared-bus-width (n/)

Table 4: Speedup Vs. Shared-bus-width («/)

SP1 1.0000 1.1470 1.2380 1.2891 1.3163

Speedup vs. Shared-bus-width, template matrix = 3 x 3 SP2 SP4 SP8 SP32 SP16 1.0000 1.0000 1.0000 1.0000 1.0000 1.1806 1.1472 1.9628 1.9982 1.9982 1.2390 1.2775 2.4942 3.8654 3.9963 1.3319 1.2896 5.0822 2.6168 7.5116 1.3584 1.3168 2.6828 5.2807 9.9693

SP64 1.0000 2.0028 3.9994 7.9988 14.2123

137

A L I N E A R A L G O R I T H M TO F I N D PATH A M O N G OBSTACLES U S I N G R E C O N F I G U R A B L E M E S H DAJIN WANG Department of Computer Science, Montclair State Upper Montclair, NJ 07043, USA E-mail: wang@pegasus. montclair. edu

University

The reconfigurable mesh (RMESH) is an array of mesh-connected processors equipped with a reconfigurable bus system, which can dynamically connect the processors in various patterns. A 2-D reconfigurable mesh can be used to solve motion planning problems in robotics research, in which the 2-D image of robot and obstacles are digitized and represented one pixel per processor. In this short paper, we present an algorithm to compute a collision-free path between two points in an environment containing obstacles. The time complexity of the algorithm is O(fc) for each pair of source/destination points, with 0(log 2 N) preprocessing time, where k is the number of obstacles in the working environment, and N is the size of the reconfigurable mesh.

1

Introduction

The general purpose of robot motion planning is to plan the movement of a robot in a known or unknown environment filled with obstacles. The planned movement is often subject to certain requirements. In a typical case, the robot will be commanded to navigate among a collection of static or time-dependent obstacles (i.e., moving obstacles). The shapes of those obstacles may range from very simple ones (such as a single segment) to arbitrarily complicated objects. The robot has to move from an initial position to a target position without colliding (but may be allowed to touch) with any of the obstacles. Moreover, there can be many different restrictions on the types of movement of the robot, and the robot's mechanical limitations have to be taken into account when planning its motion. For example, for a non circular robot whose workspace is 2-dimensional, its movement can be either translational or rotational. Thus one possible restriction is that only translational movements are allowed. To speed up path planning computation, parallel algorithms with different underlying architectures have been proposed. In the application of meshes for path planning, robot and obstacle images taken by the camera are represented in mesh processors, with each processor holding one pixel of the image. The image data can then be parallelly processed. This paper proposes an algorithm to compute a collision-free path using

138

the reconfigurable mesh multiprocessor computer [1,2] which is an array of mesh-connected processors with a reconfigurable bus system that can reconfigure the whole mesh into different substructures. In our problem, we assume a 2-dimensional working environment. The image of robot and obstacles are digitized, input and stored in the reconfigurable mesh, with one processor holding one pixel of the image. The obstacles we deal with are supposed to be disjoint convex or concave polygons (if two polygon images intersect, they are considered one polygon). With an 0(log 2 N) preprocessing time for the given obstacle image, the proposed algorithm uses O(k) time to compute a path for a pair of source/destination while avoiding all obstacles in the environment, where N is the total number of processors (pixels) and k is the number of disjoint obstacles. The rest of this paper is organized as follows. In Section 2, we describe in details the structure of RMESH, the reconfigurable mesh computer we use to carry out our algorithm. In Section 2, we introduce some operations on RMESH developed by other researchers. These operations will be made use of in our algorithm. Section 3 outlines the algorithm to compute a collision-free path using reconfigurable mesh and analyzes its complexity. Concluding remarks are given in Section 4. 2

Preliminaries and previous results

There are several different models of reconfigurable meshes. The particular reconfigurable mesh architecture we consider in this paper is called RMESH [1,2]. It employs a reconfigurable bus to interconnect all processors. Figure 1 shows the conceptual structure of a 4 x 4 RMESH, in which a square represents a processor, a circle represents a switch. The switches can be programmed to open/close in any specified manner. By doing so the interconnection bus may be reconfigured into smaller subbuses that connect a subset of processors. The extra-equipped bus system is the most important feature of RMESH. The major features of an RMESH are as follows. • A 2-D RMESH is a 2-dimensional mesh-connected, mxn array of processors. We define N = mxn to be the size of the RMESH. The id of each processor is a 2-tuple (i,j), where i is its row index, j its column index. We designate that the id of the lower left corner processor be (0,0). • Processors are connected to each other with bus switches. Each processor has up to four switches (see Figure 1), named E- (east), W- (west), S- (south), and N- (north) switches, respectively. These switches are software controlled and can be used to reconfigure the bus into subbuses.

139

3-G-

- & -

•

processor switch link

Figure 1. A 4 x 4 RMESH. By setting bus switches, t h e processors can be reconfigured into different substructures.

For example, row buses are formed if each and every processor disconnects its S- and N-switches, and connects its E- and W-switches. Column buses are formed by disconnecting all E- and W-switches, and connecting all S- and N-switches. An all-plane bus can be formed by connecting all switches. Once a subbus is formed, all data move is among processors connected to this subbus. • Only one processor can put data onto a given subbus at a time. • In unit time, data put on a subbus can be read in parallel by every processor connected to the subbus, i.e., data are transferred using only switches without stopping at intermediate processors. Mesh-connected parallel computers have been used to solve image processing problems [2,3], in which the images are digitized, input and stored in the mesh with one processor holding one image pixel. RMESH can be used to solve these problems much faster with its extra communication and reconfiguration power. The collision-free path computation algorithm we propose in this paper will make use of some operations introduced in [2]. These operations and their time complexities are stated below. The detailed description and analysis can be found in [2]. Performing

"OR" on a row/column of boolean values

140

A row/column of boolean values (1/0) can be ORed in 0(1) time on RMESH. Let processor Pitj at row i contain boolean value bj, 1 < j < n. After a fixed number of operations, the OR-result of all bj can be collected at Pifi (and can be broadcast to any other processor if wanted). Determining the maximum (or minimum) value of a row/column of data This useful operation can be done in O(l) time. Let a row of elements (xi,X2,...,xn) reside at the bottom row of a RMESH. Then after a fixed number of operations, the maximum (or minimum) value of (x±,X2, ••-,xn) can be found — the processor PQJ containing the maximum value Xj will be aware of this fact. The method uses n x n processors and the column-OR operation stated above. Determining the tangent lines from a point to a convex polygon Given a convex polygon G mapped on RMESH, and a point A outside of G. The two tangent lines from A to G can be determined in constant time on RMESH. Enumerating the extreme points of the convex hull of a polygon Given a polygon G mapped on RMESH, we want to identify the extreme points of the convex hull of G. The enumerated extreme points can completely represent a polygon on plane. An enumerated extreme point stores its own location and number, and the locations and numbers of its preceding and following extreme points. The enumeration can be done in 0(log 2 N) time on an RMESH, where N is the size of the RMESH. Finding the two tangent lines of two convex polygons Given two disjoint convex polygons G and H, mapped on RMESH. The two tangent lines of G and H can be determined in 0(1) time. The operation works out the convex hull of G and H. 3 3.1

Collision-free path computation Basic operations

Set handling We designate a certain number of processors to represent a set, with each processor representing an element of the set. The unique location id can be used as the element id. A membership flag indicates whether the element is in the set or not (1 or 0). The following operations will be employed by

141

our algorithm. They can all be implemented in O(l) time. The detailed description is omitted due to length limit. • Determining whether the set contains at least two elements • Adding a new element • Deleting an element Merge (convex hull) of two non-disjoint convex polygons In the proposed algorithm, we need to merge two convex polygons G, H into one. G and H are intersecting with a specific pattern, i.e., G intersects with exactly one known edge of H. This merge can be done in constant time. 3.2

Algorithm outline

Before algorithm starts, we assume the digitized obstacle image has been stored in the RMESH, one pixel per processor. The images are black and white. A processor has a 1/0 flag indicating whether it is a black/white pixel. In the following algorithm description, sd represents the line segment from point s to point d; SG represents a set of polygonal obstacles; CH(Gi,Gj) represents the convex hull of polygons Gi and Gj. The algorithm first draws a straight line sd from source s to destination d. If sd intersects with any obstacles, we merge these obstacles by computing their convex hulls. The tangent lines used to form convex hulls may intersect with more obstacles. The newly intersected obstacles are then merged again. The above process is repeated until the tangent lines are not intersecting with any obstacles, so that there is only one "big obstacle" Gf intersecting with sd. We then draw tangent lines from s to Gf and from d to Gf, respectively. The 4 tangent lines may intersect with more obstacles, which will be merged with Gf again. The merging process will be repeated until the 4 tangent lines do not intersect with any obstacles. By now there are two feasible collisionfree paths from s to d, and we can choose either one. We illustrate the idea of the algorithm in Figure 2. The formal description of the algorithm is omitted here due to length limit. A detailed time analysis cannot be conducted here because the algorithm is not presented formally. Roughly speaking, a while-loop is run for every original obstacle. Each round of while-loop executes a fixed number of steps, thus taking 0(1) time. Therefore the algorithm has 0(k) complexity, where k is the number of obstacles in the working environment.

142

Figure 2. Merges are repeatedly performed until the 4 tangent lines from s and d, respectively, do not intersect with any obstacles. As shown, there will be two feasible collision-free paths from s to d.

4

Conclusion

We have introduced an algorithm on reconfigurable mesh (RMESH) to compute a collision-free path between two points in an environment filled with obstacles, in which the 2-dimensional image of robot and obstacles are digitized and represented on RMESH one pixel per processor. The time complexity of the algorithm has been shown to be 0(k) for each pair of source/destination points, with C*(log N) preprocessing time, where k is the number of obstacles in the working environment, and N is the size of the reconfigurable mesh (i.e., the number of pixels in the image). We have used only 2-dimensional RMESH for polygonal obstacles considered in this paper. For obstacles of more complicated shapes, one plane of processors my not be enough to compute collision-free path in the "linear" O(k) time. If fast algorithms are still to be sought, we may need 3-dimensional RMESH so that the massive data can be parallelly transferred on different planes. References 1. R. Miller, V.K. Prasanna Kumar, D.I. Reisis, and Q.F. Stout, "Meshes with Reconfigurable Buses", Proc. MIT Conf. Advanced Research in VLSI, pp. 163-178, Apr. 1988. 2. R. Miller, V.K. Prasanna Kumar, D. Reisis, and Q.F. Stout, "Parallel Computations on Reconfigurable Meshes", IEEE Trans. Computers, vol. 42, no. 6, pp. 678-692, June 1993. 3. R. Miller and Q.F. Stout, "Geometric algorithms for digitized pictures on a mesh-connected computer", IEEE Trans. Pattern Analysis Mach. Intell, vol. PAMI-7 pp. 216-228, 1985.

143

DENSE WAVELENGTH DIVISION MULTIPLEXING FOR OPTICALLY INTERCONNECTED LINEAR ARRAY PROCESSORS HAKLIN KIMM Computer Science Department, East Stroudsburg University of Pennsylvania, East Stroudsburg PA 18301, USA E-mail: [email protected] An optically interconnected linear array processors based on DWDM (Dense Wavelength Division Multiplexing) is presented, which exploits the high communication bandwidth of optical waveguides. The optical bus system includes two important properties: unidirectional propagation and predictable propagation delays per unit length. Furthermore, DWDM allows the optical system to transmit optical waveguides in multiple channels simultaneously. The proposed optical DWDM system requires far less number of switches compared to a reconfigurable optical bus system while providing the same time complexity to solve a computational complexity problem: the maximal elements problem on the two dimensional plane.

1

Introduction

With the advance in the optical amplifier and wavelength division multiplexing (WDM), optics is being used successfully in networking and is extending further to the boundary to the network. Optics also enhances opportunities for its use in the construction of parallel computers. The high bandwidth of optics in an interconnection network allows powerful microprocessors to transmit and receive signals with the speed of light [4,9,13,14]. Optically interconnected bus system uses optical waveguides instead of using electrical buses to transmit messages among electronic processors. The advantages of using waveguides are shown: unidirectional propagation and predictable propagation delay per unit length, in addition to the high propagation with speed of light. These properties enable synchronized concurrent access of an optical bus in a pipelined fashion [5,9,11,13,14]. In the Optically interconnected Linear Array Processors model (OLAP), messages can be transmitted concurrently on an optical bus in a pipelined fashion. The pipelined optical bus system can be used dynamically by introducing Dense Wavelength Division Multiplexing (DWDM). With DWDM, optics can transmit multiple signals on multiple channels simultaneously in a pipelined fashion. The communication time for the OLAP model is measured based on the number of bus cycles used. A bus cycle for this model is defined as the end-to-end propagation delay on the bus, which is the time taken for an optical signal to propagate through the entire bus.

144 In this paper, we present the OLAP-DWDM model, which is applying optical folded-bus system with DWDM. The optical folded-bus system is discussed first and followed by OLAP-DWDM model. A parallel algorithm for finding maximal elements set on the two dimensional space [3, 7, 8, 10, 15] is being developed based on the OLAP-DWDM model. Finally we describe and analyze the OLAP-DWDM model in comparison to the previous optically interconnected array of processors model.

<0> delays

U>
Figure 1. An optical folded-bus system with delays

2

Optical folded-bus system

The optical folded-bus system is based on the folded-bus connection to which all the processors are connected with two directional couplers such as one for transmitting on the upper segment and the other for receiving from the lower segment of the bus, as shown in Figure 1. This optical bus system contains three identical waveguides: one for carrying messages (the message waveguide), and the others for carrying address information (the reference waveguide and the select waveguide). All the messages on the system are transmitted as fixed-length message frames. We note that the optical signals propagate unidirectionally from left to right on the upper segment of the bus and from right to left on the lower segment. There is one unit delay between any two processors on the receiving segment of the reference waveguide and the message waveguide, which is shown as a loop in Figure 1. Each loop is an extra segment of a fiber and the amount of delay added can be accurately chosen based on the length of the segment [11, 13, 14]. Thus, the propagation delays on the receiving segment of the select waveguide and the reference waveguide are no longer the same. Finally, a conditional delay A between any two processors pi and p i+1 , where 0 < i < n-2, is placed on the transmitting

145

segment of the select waveguides. The conditional delays can be implemented using 2 x 2 optical switches. A local processor sets two different states of each switch: straight and cross [12,13,14]. There have been several time-division switching methods being applied to route messages in an optical bus system. A fixed time slot is allocated to each processor to transmit or receive a message during that assigned time slot. A sequence of time slots formed on the transmitting segment of a bus is rearranged via a time-slot interchanger and sent to the receiving segment. Each time slot of the output sequence contains a message designated to the corresponding processor to receive that slot. In the second approach, a fixed transmitting slot is allocated to each processor. A sequence of time slots formed on the transmitting segment is directly forwarded to the receiving segment without interchanging the time slots. Instead of assigning a fixed receiving time slot to each processor, a SIMD environment is assumed where each processor knows which processor is sending a message to it and therefore knows the time slot that contains the message. In the third approach, a coincident pulse technique is applied to transmit and receive a message. Using this approach, the relative time delay of select pulse and a reference pulse is determined so that they will coincide, thus producing a double-height pulse, only at receiver processor ps. By properly adjusting the detecting threshold of the detector at processor pi, this double-height pulse can be detected, thereby addressing i [5,9].

A Multiplexor

V

Demultiplexer

Figure 2. OLAP model based on DWDM 3

OLAP-DWDM Model

The OLAP-DWDM model is based on the optical folded-bus system, upon which DWDM is applied to increase transmission bandwidth and number of channels to communicate. As shown in die above Figure 2, the OLAP-DWDM model is using DWDM transmission and optical multiplexing/demultiplexing. Multiplexor can generate different wavelengths and transfer them through proper channels without any conflict. But demultiplexer can merge different wavelengths from different

146 channels into one to transmit [1, 4, 16]. To provide more flexibility to develop parallel algorithms on the OLAP-DWDM model, the followings are assumed: 1.

2. 3.

4.

4

Optics can provides at least n number of multiple channels to transmit and receive signals, where n = number of processing elements. Current optics technology allows using up to 80 channels [4]. Optical signals can be multiplexed and demultiplexed without degrading signals. All the optical folded-buses of OLAP-DWDM can be controlled simultaneously. In addition, optical signals can be sent and received in a pipelined fashion. The coincident pulse technique is used to route messages or to broadcast messages on the bus. The switches on the sending segments are used to conditionally delay the select pulses and can be controlled by the local processors. Assume that the message processing time is roughly equal to the time of a bus cycle. If a basic operation takes a constant time, then a bus cycle also requires a constant time.

Parallel Algorithm on OLAP-DWDM model

To illustrate our parallel algorithm for finding a set of maximal elements set of a given two dimensional space, we assume that all the points p; = (Xj, yj) in the xyplane are distributed into the local memory of each processor PEj of the OLAPDWDM model, for i = 1,2,..., n, where all the points pi are ordered in Xj sequence. N processors of the OLAP-DWDM model are partitioned into nAog n different subgroups S(l), S(2), .... S(nAog n), each with the size of log n processors. For simplicity, we assume that log n is an integer value. Let p,y be the number in the fh processor of the j ' h partition (subgroup) S(j), where i = 1, 2, ..., log n and j = 1, 2, ..., nAog n. The local_max_set(i,j) stands for the numbers in the set of local maximal elements of the J* subgroup. The global_max_set(i,j) stands for the numbers in the set of global maximal elements in the/* subgroup. The local_max_R(c,j) is the number with the largest y-coordinate in the local maximal elements set in t h e / ' ' subgroup S(j), which happens to be located at the c'h processor of the fh partition. For the following algorithm, each processor PEj is equipped with a conditional delay switch Switchj, for i = 1, 2, ..., n. In our approach for solving the maximal elements set problem, the n'h conditional switch Switch,, is added to the original OLAP-DWDM model. P = (ph p2,..., pn) is given based on ascending order of x-coordinates; Input: Points pi = (Xj, yt) on the xy-plane for i = 1, 2, ..., n; Output: A set of maximal elements Q = (qh q2,..., qm); Method:

147 1. for all i =1,2,..., log n //repeat ( log n) times 1.1. P = (pi, P2,—, Pn) ore allocated to processors PEjj, for allj = 1,2,... n/log n, simultaneously; 2. for allj = /, 2, ..., n/log n, in parallel do 2.1. find the local_max_set(j) for the f subgroup; //find it in( log n) time 2.2. set local_max_set(i, j) to true for corresponding elements; 3. for k = 1 to log n, in serial do // repeat ( log n) times 3.1. for allj =2kl ,2kl+ 2k,..., m - 2k, in parallel do, where m = n/log n 3.1.1. find the rightmost (largest) y-coordinate from each local_max_set(j), say local_max_R(c, j); 3.2. for allj =2k~', 2k~' + 2k, ... , m- 2k, in parallel do 3.2.1. multicast the local_max_R(c, j) to the all the processors in subgroup S(j + 2k-'); 3.3. for all i = 1, 2, ..., log n, andj =2k~', 2k~' + 2k, ... , m- 2* in parallel do 3.3.1. if (local_max_set(i,j) = true and y-coord ofpti j+2k-i > y-coord of Pcj)

3.3.2. global_max_set(i,j) = true 3.3.3. else global_max(ij) = false; 5

Discussion

In this paper, we have presented a parallel Max_Elements_Set algorithm finding a set of maximal elements from the points on the two dimensional space using the OLAP-DWDM model. As shown on the above, a parallel algorithm for solving a maximal elements set problem can be implemented in 0(log n) time using n number of processors on the OLAP-DWDM model, where n = number of data elements. We note that a sequential algorithm for finding a set of maximal elements of the xyplane takes 0(n log n) time and a parallel algorithm for the corresponding problem on the reconfigurable optical folded-bus system takes O(log n) time using O(n) processors [6]. In addition, a parallel algorithm for the corresponding problem on the CREW PRAM model takes 0(log n) time using O(n) processors [2]. Therefore, the parallel Max_Elements_Set algorithm based on the OLAP-DWDM model is said to be efficient since its total work is equivalent to 0(n log n). In comparison to the previous reconfigurable optical bus system, OLAPDWDM needs far less number of optical switches and much less overhead time to reconfigure the optical bus system to implement procedures in parallel because the proposed model can replace the optical switches, which is providing reconfigurable features, with DWDM. Consequently, this dismisses the need for reconfiguring

optical bus system each time to exchange data between processors. In the proposed OLAP-DWDM model, messages are transmitted simultaneously in a pipelined fashion and an optical bus with DWDM can be used at any time to satisfy communication demands. References 1.

2.

3. 4. 5.

6. 7. 8.

9.

10. 11.

12.

13.

R. Alferness, H. Kogelnik and T. Wood, "The Evolution of Optical Systems: Optics Everywhere," Bell Labs Technical Journal, January-March 2000, pp. 188-202. M. Atallah and M.T. Goodrich, "Efficient Plane Sweeping in Parallel," Proceedings of the 2 ACM Symposium on Computational Geometry, 1986, pp. 216-225. T. Cormen, C.E. Leiserson and R.L. Rivest, Introductions to Algorithms, MIT Press, 1990. A. Glass, et al., "Advances in Fiber Optics," Bell Labs Technical Journal, January-March 2000, pp. 168 - 187. Z. Guo, R. Melhem, R. Hall, D. Chiarulli and S. Levitan, "Array Processors with Pipelined Optical Busses," Journal of Parallel and Distributed Computing, 12, 3, pp. 269-282 (1991). H. Kimm, "Two Dimensional Maximal Problem on a Reconfigurable Optical Pipelined Bus System," Proceedings of the ACM-SAC, pp. 623-627 (1998). H.T. Kung, F. Luccio, and F. Preparata, "On Finding the Maxima of a Set of Vectors," Journal of ACM, 22(4), 1975, pp.469-476. F. Dehne, "O(Vn) algorithm for the Maximal elements and ECDF searching problem on a mesh-connected parallel computer", Information Processing Letter, 22 (1986) 303-306. R. Melhem, D. Chiarulli and S. Levitan, "Space Multiplexing of Waveguides in Optically Interconnected Multiprocessor Systems," The Computer Journal, 32, 4, pp. 362-269 (1989). F.P. Preparata and M.I. Shamos, Computational Geometry, Springer-Verlag, New York, 1985. S. Pavel and S.G. Akl, "On the arrays with Optical Pipelined Buses," Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, pp. 1443-1454 (August 1996). Y. Pan and K. Li, "Linear array with a reconfigurable pipelined bus system: Concepts and Applications," Proceedings of the International Conference on Parallel and Distributed Processing Techniques and Applications, pp. 14311442 (August 1996). C. Qiao, R. Melhem, D. Chiarulli, and S. Levitan, "Optical Multicasting in linear arrays," International Journal of Optical Computing, 2, 1, pp. 31-48 (1991).

149 14. C. Qiao and R. Melhem, 'Time-division optical communications in multiprocessor arrays," IEEE Transactions on Computers, 42, 5, pp. 577-590 (May 1993). 15. I. Stojmenovic and M. Miyakawa, "An Optimal Parallel Algorithm for Solving the Maximal Elements Problem in the Plane," Journal of Parallel Computing,7, pp. 249-251 (1988). 16. J. Yoo and S. Banerjee, "Design, Analysis and Implementation of WavelengthRouted All-Optical Networks: Routing and Wavelength Assignment Approach," Journal of Parallel Computing, 7, pp. 249-251 (1988).

151

A TWO-LEVEL O P T I M A L D I S T R I B U T E D M O N I T O R I N G S C H E M E FOR M E S H DAJIN WANG Department of Computer Science, Montclair State Upper Montclair, NJ 07043, USA E-mail: wang@pegasus. montclair. edu

University

In this paper, we propose a two-level hierarchical distributed monitoring scheme for mesh, one of the most popular networks for parallel computers. The proposed scheme is optimal in terms of total communication cost attributed to monitoring activities.

1

Introduction

A hierarchical distributed monitoring system for a network consists of a hierarchy of monitoring units, which are grouped and distributed onto the network. The purpose of hierarchical structuring of monitoring units is to reduce overall cost incurred by monitoring. The optimization problem in this context is concerned with finding an optimal hierarchical partition of the monitoring units so that the total processing cost is minimal. In this short paper, we propose a two-level hierarchical distributed monitoring scheme for mesh interconnection network. The proposed scheme is optimal in terms of total communication cost attributed to monitoring activities. Base on the analytical results, partitioning algorithm can be constructed that minimizes the communication cost. Although this work is targeted at distributed monitoring, we believe that the results can also be applied to other hierarchical control problems in distributed computing. 2 2.1

Two-level distributed monitoring scheme for mesh Minimum cost for one-level monitoring

Let the squared mesh contain TV2 processors, with dimensions N x N. If the whole mesh is viewed as one hierarchy (one-level), then choosing the center node as the monitor (i.e., the cluster leader with the entire mesh as the cluster) would obviously minimize the total communication cost. Depending on whether N is an odd or even number, the cost can be calculated as follows. See Figure 1.

N is odd

152

00000 QQQ00 0QIQ0 00000 00000

0 0 0 0 0 0 0| J| 000000 ^ 000000 0 0 1 0 0 0 raggi 0 0 0 0 0 0 raggl 0 0 0 0 0 0 HHHl

(a) N odd: N = 5

- dist. (N-1J/2

-dist. N-(N-1)/2

(c) Cost calculation

(b) N even: N = 6

Figure 1. (a) A 5 x 5 mesh. At center is the monitor node (dark). Numbers in all other nodes represent their communication costs, (b) A 6 X 6 mesh. The dark node at the "pseudo center" is the monitor, (c) Illustration for cost calculation.

When TV is odd, there exists a true central node, and it will be taken as the monitoring node (the dark node in Figure 1(a) and (c)). The number in a node represents its communication cost for monitoring, i.e., the Hamming distance from itself to the monitor. The total communication cost, denoted as C0(N), can be calculated as follows. Refer to Figure 1(c). nodes of dist. 1

C0(N)=

nodes of dist. ( N - l ) / 2

nodes of dist. 2

1 - 1 - 4 + 2 - 2 - 4 + • • • + ((TV - l)/2) • ((TV - l)/2) • 4 all white nodes in Figure 1(c)

nodes of dist. N - l

nodes of dist. N - ( N - l ) / 2

odes of dist. N-2

+ (TV - 1) -1 • 4 + (TV - 2) • 2 • 4 + • • • + (TV - (TV - l)/2) - ((TV - l)/2) • 4 V

^

,

all grey nodes in Figure 1(c)

2

N-l 2

3

= 4]T i + 4^2(N - i)i = 4TV]T i = N -N2 i=l

i=l

N is even Refer to Figure 1(b). When TV is even, there is no true central node. Any one of the four nodes in the central "area" can be picked as the monitor, as shown in Figure 1(b). The total cost, denoted as Ce(N), is C0(TV — 1) plus the cost of grey nodes. Ce(TV)

(TV - l ) 3 - (TV - 1) Co(JV-l)

N-l

+ 2-(TV/2)+4 J2 i + l-N i=f+i v

v

all grey nodes in Figure 1(b)

'

153 _ JV3 ~ 2 To summarize the preceding discussion, the total communication cost C(N) for an N x N mesh using a central monitor is N3-N , N odd

C(N) =

(1) *£-,

AT even

It can be shown that using any non-central monitor would cost more, with the monitor at corner costing most. In the following Section 2.2, we will obtain the optimal partition of processors assuming a two-level hierarchy. 2.2

Optimal two-level partitioning

In a two-level partitioning, the whole mesh is divided into several submeshes. Each submesh has a local monitor. A local monitor will collect data in the submesh it monitors, and then turn the data in to a monitor at the higher level. As stated before, the purpose of hierarchical monitoring is to reduce the overall system cost incurred by monitoring. Assuming two-level hierarchy, we need to find out the best way to divide the mesh so that the total communication cost is minimum.

nan ••o n a n ana lanaoaa • • • : • • • n• •a •n na a« an aaa • • • : • • • DDD n a n DDD dDSQnD • • • n an • • • I Q B D a n a DDD • ••jDDn an da a DDDIDDD n • • • a«n • • • ! • • • DDD a n a

! • ! ! • , • • • n• •a •n Monitor of levle-two

-N/x levle-onenaa'DD D monitors

Figure 2.

See Figure 2. Let the submesh be of dimension a; x x, so that x divides

154

N. Then by Eq. (1) the cost for local monitoring will be: -=£, x odd C{x) = •

Therefore the total cost for all (—) level-one submeshes is given by: f ±(x3-x)-(f)2,xodd

Ci(N,x) = {

(2)

{ ±(x 3 ) • ( g ) 2 ,

x even

At level-two, note that all local monitors form a squared mesh by themselves (the darker nodes in Figure 2). So choosing the central or near-central node among them as the monitor (the darkest node in Figure 2) will give the minimum communication cost. However, the cost of "one step" (i.e., passage of data from a node to its immediate neighbor) is x instead of 1. Applying Eq. (1) again, the cost for level-two is given as follows:

(!((f) 3 -£)-*,f odd Cn(N,x)={

(3) U(S)3-*.

feven

Combining (2) and (3), we have the expression for total cost of the two-level hierarchical monitoring system: Ctotai(N,x)

= d(N,x)

+

Cn(N,x)

( !(*» - x) • ( £ ) 2 + § ( ( £ ) 3 - £ ) • * = »

2

*

3

- ^ 7 ^ » \ x odd, £ odd

§(x 3 - x) • ( g ) 2 + | ( f ) 3 • x = "2*3-»J*+»3, §(* 3 ) • ( f ) 2 + | ( ( f ) 3 - £ ) • * = W")

N2 3 N N * -2 /+ \

• ( f )2 + * ( * ) 3 • x = * ^ ,

x

odd, £ even

x even, f odd x even, £ even

There is an optimal x to make the minimum Ctotai(N,x). To obtain the optimal x, just take the derivative of Ctotai(N,x) with respect to x, denoted

155

Ctotai(N,x)'x,

and solve Ctotai(N,x)x

= 0 for x.

• (*v-^,+J^ = g ( N2x3-N2x+N3V

_

C tofai (iV,x)' x = < 2

V

J,_

^

\Z27JV+3^3+81/V 2 _

g _g |

i N

2

X

ii

g odd iV

°dd' x

eVen

x even, £ odd JV

^'

even

^ r = 0, respectively, for a:, and

1 , v/27JV+3v'3+81AP

3

N

g odd;

%" + 2^* - ^ 3 " ' 2

V 2^ A ~ Solving ^ j - + ^ — ^3- = 0 and ^ only taking the real root, we have

N

+

x odd

a; = <

V2N,

x even

The difference between &2N and [ V ^ + 3 3fr3+8i7v? x

v^27iV+3 ^ 3 + 8 1 ^ 2

is vanishingly small. So for any practical purpose, we can just use an integer close to v/2iV for the size of submesh to achieve the minimum total cost. T h e o r e m 1 In a two-level hierarchical monitoring system, if the level-1 submesh is of dimension x x x, so that 1. x is as close to \/2N as possible 2. x divides N then the system's total communication cost is minimum. Figure 3 illustrates the level-1, level-2, and total costs as function of submesh size x, for a two-level monitoring system, where the original mesh size is N = 72. \/2N = \/l44 « 5.24. According to Theorem 1, x = 6 will be chosen as the optimal submesh size. The total cost is 20736, which is minimum. The saving of communication cost gained by this two-level hierarchical scheme is very impressive. The ratio of min.-two-level-cost/one-level-cost is (assuming even N, even x) 3 N2x3+N 2x2

fN3

x=\/2N

)

3 3 /~2~ 2 ~2V N

156

N = 72

Level-1 Cost Level-2 Cost Total Cost

9 12 18 24 36 Level-1 Size (x)

Figure 3.

which is ever decreasing as N grows. Figure 4(a) shows a comparison between minimum-two-level-cost and one-level-cost, and Figure 4(b) shows the same comparison for a broader range of N. When N = 10, the min.-two-levelcost/one-level-cost ratio is about 40%; when N = 100, less than 9%; when N = 200, less than 6%. 70000 60000 50000 !-•— One-Lewi Cost

I

-m~ Min. Two-Level Cost

30000

10000 10

20

30

40

50

N

(a)

Figure 4.

References L. Shi, O. De Vel, J. Cao, and M. Cosnard, "Optimization in a Hierarchical Distributed Performance Monitoring System", Proc. First IEEE International Conference on Algorithms and Architectures for Parallel Processing, Brisbane, Australia, April, 1995, 537-543.

157

NON-LINEAR CLUSTERING SCHEDULING WITH TWO C L U S T E R S , O N E B E I N G L I N E A R , IS N P - H A R D

W I N G N I N G LI Department

of Computer Science and Computer Engineering, Arkansas, Fayetteville, AR 72701 E-mail:[email protected]

University

of

JOHN JINGFU JENQ Department

of Computer

Science,

Montclair State University, NJ 07043 [email protected]

Upper

Montclair,

Scheduling a Directed Acyclic Graph (DAG) on a multiprocessor computer generally consists of two parts, the assignment of tasks to processors, which is called clustering, and the ordering of the tasks for execution in each processor, including specifying the starting time of each task, which is called clusteringscheduling. Recently it has been shown that for unit-time task DAGs two-cluster clustering scheduling is polynomial time solvable whenever one of the two clusters is linear . In this paper we prove t h a t if the tasks in the DAG are not unit-time tasks, then the two-cluster clustering scheduling with one cluster being linear is NP-hard. As a generalization of this result, almost linear m-cluster (all m clusters except one are linear) clustering scheduling is NP-hard, even though when all m clusters are linear the problem is polynomial time solvable.

1

Introduction

In scheduling of DAGs on multiprocessor systems, the assignment of tasks to processors is called clustering. Clustering has been used as a pre-processing step in the scheduling of task graphs on parallel architectures [7, 10]. The objective of the clustering is to minimize the maximum completion time of the processors. The maximum completion time of the processors is also called parallel time. To evaluate a given clustering, one must know what is the minimum parallel time of the clustering. In determining the minimum parallel time, optimal scheduling of the clustering must be obtained. In a clustering, tasks have already been assigned to the processors, therefore an optimal clustering scheduling only needs to find the ordering of tasks for execution in each processor and specifying the starting time of each task so that parallel time is minimized and precedence and communication constraints are satisfied. A clustering is called non-linear if at least one cluster contains two independent tasks, i.e., there is no directed path between the two tasks, otherwise it is called linear. A clustering is call almost linear if all clusters in the clus-

158

tering except one are linear. For a linear clustering, the length of the optimal schedule and the actual task schedule can be computed in linear time by topologically travering the scheduled DAG [10]. The total ordering of the tasks assigned to each processor provided by the linear clustering makes the scheduling of linear clustering easy to compute. For a non-linear clustering, however, finding the ordering of tasks in each processor to achieve the optimal schedule is NP-hard even for unit time task DAGs [4]. Recently, it has been shown that for unit-time task DAGs the scheduling of two-cluster clustering is polynomial time solvable provided that one of the two clusters is linear [4]. In this paper we prove that if the task DAG is not unit-time then the two-cluster clustering schedule with one cluster being linear is NP-hard. Thus the polynomial time algorithm for the unit-time DAGs cannot be extended to handle non-unit-time task DAGs unless P — NP. As a generalization of this result, almost linear clustering scheduling is NP-hard. 2

Task Graph Clustering

This section is divided into the following subsections for a more logical presentation. 2.1

Task Graph

A DAG or data dependency graph can be used to represent the dependencies among elementary computational steps in a given program or algorithm. Some compilers automatically generate these graphs in intermediate optimization stages. The reader is referred to [2, 7] for detailed discussions and examples on how programs are represented as graphs and to [1] for examples on how compilers generate these graphs. In what follows, we shall describe the dag representation as an input instance to be scheduled. Definition 1 A Task Graph is a dag G=(V,E,d,cm), where a) V is the vertex set of G. Each vertex represents a task; b) E C V x V is the edge set of G. The edge set forms a partial ordering of V, representing data and control dependencies between tasks in V; c) d is a task weighting function with domain V and range Z+. where w is the running time for v;

d(y) = w

d) cm is an edge weighting function with domain E and range Z+. cm(< u,v >) — w where w is the communication cost of edge < u,v > when tasks u and v are processed by different processors.

159

Depending on the underlining architectures being modeled, the communication cost may subject to more than one interpretations, which then result in different task execution models and scheduling strategies. For example, in the shared memory multi-threaded architecture, a task must obtain data from its predecessor tasks one by one. The communication cost of getting the data from a task's predecessors is accumulative. The reader is referred to the literature for a more detailed discussion of the model [5, 9]. On the other hand, in the message-passing architecture, a task receives all input from its predecessor tasks before starting execution, executes to completion without preemption, and immediately sends the output to all successor tasks in parallel. Under the message-passing architecture, a task instead of fetching data from its predecessors, sends the data needed by its successors in parallel. In this case, the communication cost is dominant in the sense that, assuming all predecessor tasks finish at the same time, the largest communication cost of a task's predecessor running on a different PE determines the wait time of the task. Again the reader is referred to the literature for a more detailed discussion of the model [6, 8], The complexity result of the paper holds for both models. In fact the result still holds even if all edge communication costs are zeros. 2.2

Clustering

Clustering is a mapping of the tasks of a DAG onto m clusters. The problem is to find a mapping, map, map(vj) = i, 1 < j < |V|, 1 < i < m, of the nodes of G onto m clusters, so that the goal of minimizing the parallel time on m processors is achieved. Fig. 1(b) and (c) demonstrate two examples of clustering. The clustering that results in the minimum parallel time is called the optimal clustering. Two basic clustering strategies are used in heuristic algorithms: linear and nonlinear. It has been shown that a linear clustering strategy can benefit a clustering algorithm by perserving maximum parallelism and at the same time making the clustering scheduling problem computationally easy to solve. However, unless an unbounded number of processors is assumed, it is impossible to always obtain a linear clustering. For example, it is impossible to obtain a two-cluster linear clustering for the DAG of Fig. 1(a). When the number of available processors is small, we have to resort to nonliner clustering and face the challenge of nonlinear clustering scheduling. Since when all the clusters in a clustering are linear the optimal clustering

160

schedule can be obtained in linear time, it is interesting to find out if all the clusters except one are linear, whether the optimal clustering schedule can still be obtained in polynomial time. It turns out, as shown in the next section, that there is no polynomial time algorithm to solve the clustering scheduling problem, as long as one of clusters is nonlinear, unless P = NP. 3

Complexity result

In this section, we shall first establish that two-cluster clustering scheduling with one cluster being linear is NP-hard, even if the communication cost of each edge is zero. We shall use the following NP-hard problem [3] to prove that. Partition Input: A finite set A and a size s(a) 6 Z+ for each a £ A. Output: 'yes' iff there is a subset A'

C A such that ^2a€Ai s(a)

=

Theorem 1 The two-cluster clustering scheduling problem with one cluster being linear is NP-hard, even if the communication cost of each edge is zero. Proof: Given an instance of the partition problem, A = {ai, 02,..., a„}, the task graph consists of n + 3 nodes v\, v2, ••-, vn+3 and 2 edges < vn+i, n„+2 >

(a)

(b)

(c)

Figure 1. (a) Task Graph, (b) Nonlinear clustering, (c) Linear clustering

161 and < vn+2, v„+3 >. The processing time of the tasks are: d(t>.) = s(aj), 1 < ,, ^ •«, ^ v ^ n + ^ ; — ,., „ v „ n + i ; — ~K~nt<sj — '—2~—~- The communication costs of the two edges are zeros. The two clusters in the clustering of the task graph are Ci = {vn+i,vn+3} and C2 = {t>i,v 2 ,...,v n ,v n + 2 }. Clearly, C\ is a linear cluster and the construction can be done in linear time. Now we shall prove the claim that the clustering instance has a schedule of length J2i o r parallel time of Ylii<><"

task vn+3 starts at time "

^

i

.-,

~ + 1 and finishes at 5Zi) + 1-

x,^.,.^ - ^ i . „«,•» ^ n + 0 v-^xw^x ^.^^ ^„ >,x^~

'

j

—, since it must wait for

its predecessor vn+2 that runs on processor P2. On processor P2, task vn+2 starts at *-•

2

~

an<

^ finishes at • "

2"

— + 1, since its predecessor

„n+1 , . ^ u ™ * 2> — and communication cost is zero. The rest of the tasks, corresponding to the two partitions A' and A — A' are scheduled into ( 0 the two time slots [0, S " ' a " 0 ] and [ ^ + 1, E K , < „ s(ai) + 1] 2" ' " respectively. Hence, the corresponding clustering scheduling Instance has a schedule of length Yli
L

M
.

.

z — i < i < n " v - w ' "•> """-" -"""• "«-t-^ " —•••• "" ' •> "~ an<^ finish at ' 2 — + 1> due to the constraints imposed on by its predecessor vn+\ and successor v„+3. The rest tasks in C2 must be scheduled into the two time slots [0, ^2 2"'(ai)] and [ ^ " 2 " 5 ( a ' ) + i ; J2i ] . We have E . M - « W = E „ e A - A ' « ( « ) = ^ " T ' ' " ' ' Hence, the corresponding partition instance has a 'yes' answer. D Theorem 1 concerns about two-cluster clustering scheduling and shows that the problem is NP-hard whenever one cluster is not linear. The result and the proof idea of Theorem 1 can easily be generalized to m-cluster (m > 2) clustering scheduling. For a m-cluster clustering, even all clusters except

162 one are linear the clustering scheduling problem is NP-hard. This result is summerized in the next corollary Corollary 1 The m-cluster (m > 2) clustering scheduling problem with m—\ clusters being linear is NP-hard, even if the communication cost of each edge is zero. Since pseudo-polynomial time algorithms exist to solve the partition problem, it is natural to ask the question whether a pseudo-polynomial time algorithm exists to solve the two-cluster clustering scheduling problem with one cluster being linear. Unfortunately, our next theorem shows that the problem is NP-hard in the strong sense. Therefore, a pseudo-polynomial time algorithm does not exist to solve the problem unless P = NP. We shall use the following strongly NP-hard problem [3] to prove that. 3-Partition Input: A finite set A of 3m elements, a bound B G Z + , and a size s(a) £ Z+ for each a £ A such that 5 / 4 < s(a) < 5 / 2 and such that J2aeA s(a) = mB. Output: 'y e s ' iff A can be partitioned into m disjoint sets A\,A2,..., Am such that, for 1 < i < m, J2a£A s(a) ~ & (note that each A* must therefore contain exactly three elements from A). Theorem 2 The two-cluster clustering scheduling problem with one cluster being linear is NP-hard in the strong sense, even if the communication cost of each edge is zero. Proof: Given an instance of the 3-partition problem, A = {ai,a2, •••,a3m}, the task graph consists of 5m — 1 nodes ^1,^2, - . " s m - i and 2m — 2 edges < D3171+11 ^3m+i+j >, 1 < i < 2m — 2. The processing time of the tasks are: d(v{) = s(ai), 1 < i < 3m, d(v3m+2i) = 1, 1 < i < m - 1, d(v3m+1+2i) = B, 0 < i < m — 1. The communication cost of each edge is zero. The two clusters in the clustering of the task graph are C\ = {v3m+i+2«|0 < i < m — 1} and C2 = {^i, ^2, •••, V3m} U {^3m+2»|l < * < m — 1}. Clearly, Ci is a linear cluster and the construction can be done in linear time. One may easily verify the claim that the clustering instance has a schedule of length mB+m—1, or parallel time of mB+m— 1 iff the 3-partition instance has a 'yes' answer. Since the idea used in the verification is very similar to that used in Theorem 1, the details are left to the reader. D Similar to Theorem 1, Theorem 2 has the following generalization. Corollary 2 The m-cluster (m > 2) clustering scheduling problem with m— 1 clusters being linear is NP-hard in the strong sense, even if the communication cost of each edge is zero.

163

4

Conclusion

It is well known that for a linear clustering of a non-unit time task graph, the length of an optimal schedule can be computed in linear time by topologically traversing the scheduled DAG. In this paper, we prove that for an almost linear clustering, i.e., all clusters in the clustering except one are linear, the problem of determining the length of an optimal clustering schedule is NPhard in the strong sense. The complexity result also holds for clusterings that consist of only two clusters. Hence, the complexity result demonstrates that the techniques used for developing polynomial time algorithms that solves two-cluster clustering scheduling with one cluster being linear for unit time task graphs [4] cannot be extended to tasks graphs that are not unit time unless p = NP. References 1. IF1 An Intermediate Form for Applicative Languages, reference manual version 1.0 edition, Univertisy of California-Davis 1985. 2. J. T. Feo. An analysis of the computational and parallel complexity of the livermore loops. Parallel Computing, pages 163-185, July 1988. 3. M.R. Garey and D.S. Johnson. Computers and Intractability. Freeman, San Francisco, CA, 1979. 4. W. N. Li and J. F. Jenq. Scheduling unit-time task dag with communication delay in a nonlinear clustering, in preparation. 5. W. N. Li and J. F. Jenq. On the thread scheduling problem. J. Universal Computer Science, 5(10):994-1014, 2000. 6. C.H. Papadimitriou and M. Yannakakis. Towards an architectureindependent analysis of parallel algorithms. SIAM J. Comput., 19(2):322-328, April 1990. 7. V. Sarkar. Partitioning and Scheduling Parallel programs for execution on Multiprocessors. MIT Press, Cambridge, MA, 1989. 8. V.J. Rayward Smith. Uet scheduling with unit interprocessor communication delays. Discrete Applied Mathematics, 18:55-71, 1987. 9. M.A. Thornton and D.L. Andrews. Graph analysis and transformation techniques for runtime minimization in multi-threaded architectures. In Proceedings of the Hawaii International Conference on Systems Sciences, pages 566-575, 1997. 10. T. Yang and A. Gerasoulis. On the granularity and clustering of directed acyclic task graphs. IEEE Transactions on Parallel and Distributed Systems, 4(6):686-701, June 1993.

165 F O R M A L V E R I F I C A T I O N OF M I C R O I N S T R U C T I O N S E Q U E N C I N G LUBOMIRIVANOV Department of Computer Science, Iona College, 715 North Avenue New Rochelle, NY 10801, USA E-mail: [email protected] The complexity of the instruction set of modern microprocessors often leads to faults in the microinstruction sequencing and timing errors in the implementation of the processor control. These errors are difficult to detect with conventional simulation methods. As an alternative, formal verification uses a mathematical model of the system to verify its correct behavior by constructing a formal proof. Recently we introduced a new partial order formal verification method based on the notion of series-parallel posets. The associated verification algorithms have a low order space- and time complexity, and have been successfiilly applied to the verification of properties of real-world systems such as the PCI local bus protocol and the MESI cache coherence protocol. In this paper we use series-parallel posets to model and verify the behavior of the DLX microprocessor control.

1

Introduction

The complexity of designing a modern pipelined/superscalar processor leads to a significantly increased probability of serious design faults such as improper microinstruction sequencing and timing errors, while limiting the usefulness of the classical simulation and testing methods for uncovering these design faults. The recent examples of "bugs" in the microcode of the Pentium® processors illustrate the severity of the problem. A promising alternative is offered by the field of formal verification, which, based on a mathematical model of the system under consideration, attempts to prove or disprove facts about the system model, guaranteeing that all desired properties are satisfied, and unwanted properties and design faults are absent. An excellent overview of the field of formal verification can be found in [1]. Some powerful formal verification methods such as Symbolic Model Checking [1] and (it-Automata Verification [2] have gained significant popularity, and have led to the development of industrial-level verification tools (SMV, FormalCheck, etc.). Unfortunately, the power and expressiveness of these methods is offset by the high computational complexity of their verification algorithms. This imposes limits on the size of the circuits to which such general techniques can be applied. In the meantime, a number of new verification methods have emerged, which, while relatively less expressive, guarantee a significantly improved efficiency. Among these, several methods have been based on using partial orders to describe the dependence or independence of sets of events occurring in a hardware system [3,4, 5, 6, 7]. The main appeal of using partial orders in modeling and verifying system behavior is in avoiding the study of all possible interleavings of events occurring during a run

166 of the system. In addition, partial order models are usually very clear and intuitive, and the verification algorithms can be fully automated. In [8, 9, 10, 11] we introduced a new formal verification method for proving timing properties of complex systems. The method is based on the inductively defined notion of series-parallel posets. The verification algorithms are characterized by a low-order polynomial complexity. In [12] the technique was used to verify the behavior of a Handshaking Communication Protocol, and the popular PCI local bus interconnect protocol. In [13] the method was applied to the modeling and formal verification of the MESI cache coherence protocol for a system of n write-back cache memories in a Shared Memory MIMD multiprocessor system. In this paper we present another important application of our series-parallel poset methodology - the modeling and verification of the DLX microprocessor control. We begin with a description DLX, and the stages of its instruction cycle. We then present the formal model of the DLX control using series-parallel posets, and demonstrate the verification of a two properties. The main presentation is followed by a brief introduction to series-parallel posets, and an outline of our verification approach for iterated systems. Finally, we briefly discuss some strengths and weaknesses of our methodology in the context of other formal verification work. 2

The DLX microprocessor

The DLX processor was introduced by Hennessy and Patterson in [14]. It incorporates many features of popular commercial microprocessors such as Intel i860, SPARCstation-1, etc. Architecturally, it has thirty-two 32-bit general purpose registers (R0 hardwired to 0), thirty-two floating-point registers, which can be used for single-precision or (in pairs) for double precision floating point computation, and a set of special purpose registers for accessing status information. Memory is accessed through loads and stores which can transfer a byte, a halfword or a word. The address is 32 bits wide. The instructions set includes 4 types of instructions: • Load/Stores, (e.g. LW R3, 100(R1), or SB 45(R7), Rl) • ALU operations, (e.g. ADD R l , R2, R3, or SUBI Rl, R5, #4) • Branches and Jumps, (e.g. JR R5, or BEQZ R12, NEXT) • Floating Point Operations, (e.g. ADDD F0, Fl, F2) For further architectural details, refer to [14]. The internal organization of DLX (except the FPU) is given in figure 1 below. MAR and MDR are the memory address- and data registers, IAR is the interrupt address register, IR is the instruction register, and the A, B, C ports are used for accessing the 32 registers, R0 - R31.

167

A I\

—• ^

<

-B«-

Control Unit

kA w

ytt l^^

Register File

Temp PC IAR MAR

«-§«^ ^ ^ ^ ^ ^ ^ <-

7 ''

M U X

«*—

V

«—.

IR ir T from Data Bus to Address Bus to Data Bus Figure 1 DLX Internal Organization

from Data Bus

The instruction cycle for all (except FPU) DLX instructions consists of 5 stages: • Fetch: MAR <- PC; IR <- Mem[MAR] • Decode: A <- RegSl; B <- RegS2; PC <- PC +4 • Execute: - Mem reference: MAR<-A+(IR16)16##IR16..3i; MDR <- RegD - ALU instr.: ALUout <- A op (B or (rRl6)16##rR16..3i) - Branch/Jump: ALUout <- PC + (IR16)16##IRi6..3i • MemAccess: MDR <- MemfMAR] or Mem[MAR] <- MDR or If(cond)PC<-ALUout • Write Back: RegD <- ALUout or MDR Here, RegSl and RegS2 are the two source registers, and RegD is the destination register. The notation "(IR16)16##IR16 31" means that the contents of the lower half (bits 16-31) of the IR register are sign extended by replicating bit 16 sixteen times. 3

Modeling the DLX Control

Each stage of the instruction cycle involves the execution of a sequence of microinstructions. In most cases the sequencing must be strictly enforced, e.g. the ALU should not attempt to carry out an arithmetic operation before the two register file ports, A and B, have been loaded with data from the source registers.

168 However, in some cases, the order is insignificant, and the microinstructions can be executed in any sequence, independently of each other. For example, the 3 microinstructions of the decode phase are independent. Our series-parallel poset methodology is aimed precisely at modeling the notions of dependence and independence, and offers a convenient way to model the DLX control. Our model defines the execution of each microinstruction to be an event, and specifies the correct ordering or the independence of microinstruction execution with the help of operations concatenation, •, (for sequencing), shuffle, ®, (for independence), Kleene star, *, (for iteration/repetition), and union, +, (for choice). The formal model of the DLX control is presented below: B = (eo»e!*» (e 2 »e 3 + (e 4a ® e 4b ® e4c ) • (BDT+ BALU + BSet+ BJmp+ BBr)))*, where: BDT = (e 5 • (e 6 »e 7 * + e8*»(e9+ e10+ e n + e12+ e13))+ e14) • ei5+ e16 BALU = (e17+els) »(e19+ e20+ e21+ e22+ e23+ e24+ e25+ e26+ e27) »e15 BSet= (,e17+e18) •(e2i+ e^-H e30+ e31+ e32+ e33) »(e34+ e35) »ei5 Bjmp = e36 + e37 + e38 »(e36+ e 37 ) • e39 + e2 • e40 BBr = (e 4] + e42) • (e37 + 1) Each event in the DLX control model is a unique microinstruction as follows: Event

Microinstruction

Event

Microinstruction

eo

M A R < - PC

e2i

C <- A & Temp(AND)

ei

IR < - M e m [ M A R ]

e22

C<-AITemp

e2

IAR«-PC

e»

C
(OR)

e3

P C < - 0; C l e a r intrpt

e24

C < - A « T e m p (SLL)

e4a

PC <- PC + 4

ey

C <- A » T e m p ( S R L )

e4b

A«-RegSl

e»

C < - T e m p « 1 6 (LHI)

e4c

B <- RegS2

e27

C<-(Ao) T o m p ##(A>>Temp>r e m p..3i ( S R A )

es

MAR«-A+(IRi6)16##IRi6..3,

e28

A == Temp

e6

MDR<-B

e29

A != T e m p

e7

Mem[MAR] <- MDR

e3o

A < Temp

eg

MDR <- MemfMAR]

e3i

A <= Temp

e9

C<-(MDR 2 4) 2 4 ##MDR 2 4 ...3i

e32

A > Temp A >= Temp

24

eio

C < - 0 ##MDR 2 4...3i

e33

en

C < - 0 1 6 ##MDRi 6 ...3,

e34

C < - 1; t r u e

en

C <-(MDRi 6 ) 1 6 ##MDRi6...3i

e35

C < - 0; false

en

C<-MDR

e-it,

PC<-A

eu

C<-IAR

e37

PC<-PC+(IRi 6 ) 1 6 ##IRi6..3i

eis

R e g D
e38

C<-PC

ei6

IAR<-A

e39

R3K-C

ei?

Temp <- B

e4o

PC « -

16

eis

T e m p < - (IR 1 6 ) ##IRi6.3i

e4i

A==0

ei9

C < - A + Temp

(ADD)

e4 2

A!=0

e2o

C«-A-Temp

(SUB)

(IRI6)'6##IR,6JI

Table 1 Microinstructions of the DLX Control Unit

169 4

Verifying the DLX Control

To verify the proper sequencing of microinstructions, and check for event dependence/independence, we need to specify a set of properties, which tests all critical aspects of system operation. Once the properties are identified and converted to series-parallel poset expression format, they can be verified using the predicates defined in [11] (and briefly outlined in the "Overview" section). For lack of space, we shall limit ourselves to demonstrating the verification of only two properties from the complete set of properties mentioned above: Property Pi: "Memory addresses remain fixed for the entire duration of a memory access." For the address to remain fixed throughout an instruction fetch or a memory data access, the contents of the Memory Access Register, MAR, must not be updated until the memory read/write is complete. By examining Tbl. 1, we observe that MAR gets modified only by microinstructions MAR<— A+(lRi6)16##IRi6..3i (event e5) and MAR <— PC (event e0). Memory is accessed by the microinstructions IR <— MemtMAR] (event e j , Mem[MAR] <- MDR (event e7), and MDR <- Mem[MAR] (event e8). Therefore, property Pi can be specified as the following series-parallel poset expression: P , = ((ei*+e 7 *+e 8 *)»(e 0 + e5))* Once the property has been specified, the next step of the verification algorithm is to reduce the size behavior expression, B, by eliminating all events not in the property, Pi. The reduced behavior with respect to the events in set(Pi) is: B = (e0 • ei* • e5 • (e7* + e8*))* For the processor to operate correctly, property Pj must always be satisfied. Hence, we will use the Always Satisfied verification predicate1: AS(B, P,): Since both B and Pi are iterated, AS(B, P,) = AS(e 0 »e 1 *«e 5 .(e 7 *+e 8 *), (ei*+e7*+e8*).(eo+e5)) AS(e0»e1*»e5*(e7*+e8*), (e1*+e7*+e8*)»(e0+e5)): - (ei*+e7*+e8*)»(e0+e5) e SP(Z*) (i.e. the property has iteration) - AS(e0»e1*»C5»(e7*+e8*), (e]*+e7*+e8*)) = 2 AS(ei*»(e7*+e8*), (e,*+e7*+e8*)): To follow the outlined verification process, , use the algorithm in the Overview of Series-Parallel Posets section or consult [11]. 2 Reducing the behavior to include only events from the property

170 AS(ei*»(e7*+e8*), e,*) = 2 AS(ej*, e,*) = TRUE AS(e,*«(e7*+e8*), e7*) = 2 AS(e7*, e7*) = TRUE AS(e1*«(e7*+e8*)) e8*) =2 AS(e8*, e8*) = TRUE => AS(e0«ei*»e5»(e7*+e8*)( (e^+e^+eg*)) = TRUE - AS(e0«e1*»c5»(e7*+e8*), eo+e5) = 2 AS(e 0 »e5, eo+e5): AS(e0, e0) = TRUE AS(e5, e 5 )= 2 TRUE => AS(e0»e1*»e5«(e7*+e8*), e0+e5) = TRUE - predB({c0)) = {eo, e,, e5, e7, e8} n {eo} = {e0} - predB({e5}) = {eo, e^ e5, e7> e8} n {e5} = {e5} => AS(B, Pi) = TRUE Property P2 : "Before any ALU operation is attempted, the register file ports A and B are already loaded with data from the source registers." Data is written into ports A and B by microinstructions A <— RegSl (event e4b) and B <— RegS2 (event e*;). The microinstructions which use ports A and/or B in an ALU operation are labeled by events e5, ei 9 , e2o, e 2 i, e22, e23, e24, e25, e27, e28, e29, e^, e 3 i, e32, e33, e4i, and e42. Therefore, the property can be expressed as the following series-parallel poset expression: P2 = ((e4b®e4c) •(e5+ei9+e2o+e2i+e22+e23+e24+e25+e27+e28+e29+e3o+e3i+e32+e33+e4i+e42))* The reduced behavior with respect to the events in set(P2) is: B = ((e4b® e^) • (e5+e19+e20+e2r+e22+e23+e24+e25+e27+e28+e29+e30+e3i+e32+e33+e4i+e42))* It is now easy to see that AS(B, P2) is satisfied. The verification of the two properties presented in this paper was carried out by hand to illustrate the operation of the verification algorithms. The actual verification of the entire property set of the DLX control was carried out with the help of a software package, developed at Iona College on the basis of the theoretical constructs presented in [8, 9,10,11].

5

Overview of Series-Parallel Posets

A partially ordered set {poset) is a set with a reflexive, antisymmetric, and transitive relation defined on the set elements. A E*-labeled poset P=(P,< ,1) consists of a poset (P,<), and an assignment of a nonempty word (a label) l(v)el,* to each vertex v in P. Given posets P and Q with P n Q = 0 , we define two operations on labeled posets:

171 Concatenation (•):

P »Q := (PuQ,
Shuffle (®):

P ®Q := (PuQ,
where

v v

V
A Series-Parallel Poset (SPP) over an alphabet £ is defined inductively: • The empty poset, J, is a SPP • For each o e E, the singleton labeled a is a SPP • If P and Q are SPPs, so are P*Q and P®Q The set of all series parallel posets formed from 1 and the singletons and closed under concatenation (•) and shuffle (<8>) forms a bimonoid denoted SP(£*) [15]. For our purposes, the alphabet, E, will consist of all distinct events occurring during a run of the system under consideration. Let each event, e,-, occurring in a system be represented by a singleton poset, c,-. Then, the fact that event e,- precedes event ej is represented by «(•«,-. On the other hand, the independence of events e,- and e,- is represented by e,®e,-. This extends naturally to sets of events. What does it mean for two sets of events two be dependent or independent? Two sets of events, P and Q, are independent if no event in P triggers a chain of events leading to the occurrence of an event in Q and v.v. In other words P and Q are independent if the set of events, which are predecessors of P does not involve any event from Q and v.v. A set of events P always precedes a set of events Q if all events in P occur before any event in Q does, i.e. when each event in Q has all events in P as predecessors. A set of events P partially precedes a set of events Q if P sometimes occurs before Q. This so when each event in Q has at least one predecessor from P, or when P and Q are independent. Let us illustrate the definitions with an example. Consider the following seriesparallel poset: B=

((e,®e2)»e3)®e4 Figure 2 A Simple Series Parallel Poset Example

Consider now the sets of events Pi = [et, e4), and P2 = {e3}. Clearly, Pi and P2 are not independent since ei must occur before e$. Pi does not always precede P2 since a possible event sequence is eh e2, e3, and then e4. But Pi may sometimes precede P2 since another possible event sequence is,, for example, e^ e2, e4, e$.

172

Interpreting series-parallel posets as descriptions of the dependence or independence of sets of events allows us to model system behavior in terms of the sequences of events occurring during its operation. In [8] we presented a methodology for modeling the behavior and properties of non-iterated systems with series-parallel posets. A non-iterated system is one, in which the events are distinct and not repeated.3 . a 2 ,t>2

p t>3 FA t

I c4

c3

I3'

c2 FA 2

|b

i

fk> . b 0

Cl

FAi

I

S3

S2

Si

FA0 -

I

S0

Figure 3 A Non-Iterated System: 4-bit Binary Adder We presented an algorithm, which can be used to verify that a particular property is always satisfied or sometimes satisfied within a given behavior. In [9], the methodology was further expanded to deal with globally iterated/locally noniterated systems. These systems consist of non-iterated sub-systems operating in series or in parallel, such that the global output is fed back for another iteration.

T3^ -3^

Figure 4 A Simple Globally-Iterated/Locally Non-Iterated System The verification algorithms have a low-order polynomial time- and space complexity, which is further improved by the introduction of the behavior reduction methodology in [10]. An iterated system is one in which some or all events are repeated. It consists of a number of components, which function in series or independently so that each component is either an iterated- or a non-iterated system. A wide variety of systems can be considered iterated: • • • 3

Communication-, Interconnect, or Cache Protocols Asynchronous Sequential Circuits Feedback Control Systems, etc.

Not all non-iterated systems can be expressed with series-parallel posets. See the section on Contributions and Limitations.

173 Concrete examples of iterated systems to which we have applied our methodology are the Peripheral Component Interconnect (PCI) bus protocol, used in all Pentum® based PCs, and the Modified/Exclusive/Shared/ Invalid (MESI) cache coherence protocol used to synchronize the operation of cache controllers in shared-memory MIMD systems, and maintain data consistency between the level-1 and level-2 caches of the Pentium® microprocessor [12, 13]. Here is a simple example of an iterated system at the logic gate level:

0>^§}^fc> Figure 5 A Simple Iterated System (gate level)

The notion of a series-parallel poset is not sufficient to describe the behavior of a system with iteration. We, therefore, need to introduce a new structure - the star shuffle semiring S = (S, +, •, ®, *, 0,1) of series-parallel posets, defined as follows: • • • • • • •

S - the set of finite subsets of SP(Z*), closed under the semiring operations If Ke S and L&S,K+L = {PI PeKvPeL}eS If KeS and LeS, K»L= [P»Q\ PsK A Q e L J e S If KeS and LeS,K®L={P®Q\ PeK A QeL}sS If KeS, then*:* = / + K + K»K +...= Zi=0..„ K ' e S , where K ' = K»K • ...•K, i times. 0 is the empty set of posets / i s the empty poset

We define the behavior, B, of an iterated system to be an element of the star-shuffle semiring S, i.e. BeS. Thus, the behavior of an iterated system is a set of seriesparallel posets. For example, if we denote the event "gate i produces a valid output" by e;, then the behavior of the system in Figure 5 is given by the following expression B = ((ej • (e2 <8> e3) • e4)* • es)*. We can represent the verification properties as sets of series-parallel posets as well, i.e. PeS. Unlike behaviors, however, properties are usually be defined over a subset of the alphabet S, since we are most often interested in the mutual dependence or independence of a relatively small subset of system events. For example the property "Gates 2 and 3 produce valid outputs independent of each other, but gate 4 depends on both gates 2 and 3" is given by the expression P = fe ® ej) • e4.

174 The verification questions are specified as predicates over sets of series-parallel posets. These predicates are: • SS(B, P) is a binary predicate, interpreted as "The property P is sometimes satisfied within the behavior, B". The predicate takes a behavior and a property and verifies that P can sometimes be traced within the behavior, B. • AS(B, P) is a binary predicate, interpreted as "The property P is always satisfied within the behavior, B". The predicate takes a behavior and a property and verifies that P can always be traced within the behavior, B, of the system. There are four normal forms of behavior and property expressions: • Concatenation: B = B1»B2»...»B„ & P= Pj»P2:..»Pm • Shuffle: B = B,®B2®...®Bn & P = Pj®P2®...®Pm • Plus: B=Bi+B2+...+B„ & P=Pi+P2 + ...+ Pm • Star: B=B,* & P = P,* To simplify the reasoning about sets of events and the complexity of the verification algorithms we introduce the notion of a reduction of the system behavior. It is prompted by the fact that, while the system behavior may involve hundreds of thousands of events, in most cases the verification property involves only a few events. The reduction is carried out by a recursively defined projection function Pr(B, set(P)), which takes a behavior, B, and a set of events, and returns a reduced behavior, B', with respect to the events in the specified set. The effect of the projection function is to substitute 1 in place of all events not in set(P) without modifying the ordering of events in the behavior. Based on a number of theorems, corollaries, and lemmas, which examine the satisfaction of all forms of properties with respect to all forms of behaviors, we derive the formal definition of the two verification predicates SS(B, P) and AS(B, P) for iterated systems. In this outline, we present only AS(B, P): AS(B, P) iff • P =e A

B=e

•

P=P,*AB=B,®B2®.

•

P =

•

P =

•

P = Pj®P2® ...®PmAB = B] ® B 2 ® . . . ® B „ A F g S P ( X * ) A V i e [ m ] AS(B, Pi)) A Vie[m-1] Independent*^,, PM)) A Vie[m] (Pi=(Pa)* -> Veef»4 L(set(lisc(B, e))) c L(set(Pi))) P = Pi»P2» ... 'Pm A (B = B!»B2» . . . » B „ v B = B,*) A PgSP(2:*) A Vie[m] AS(B, Pt) A Vie [m] (Pi=(P,/)* -» V e e / ^ L(set(lisc(B, e))) c L^setifi))) A Vie [m-1] (\/eePi+1L(predB({e})) n L(set(Pt) = L(set(Pi))) B = B, + B2 +...+ Bn A Vie [n] AS(B,-, P) P = Pj + P2 +...+ Pn A 3 ie [n] AS(B, P,)

•

• •

. .®Bn A A S ( « , i>,)AVie [n]B,= Ba*

P,*AB=B1*AAS(B,,PI) P1®P2®...®P,„AB=B1*AAS(B1,P)

175

In the above definitions we made use of number of functions - the labeling functions /(s) and Z^{s1>...,s„}), the predecessor function, pred(P), and the functions set(P), "Non-Iterated", NI(B), and "Least Iterated Sub-Component", lisc(B, e). The exact definition of these functions is presented in [11] and omitted here for lack of space. We also used the auxiliary predicate Independents^, Q). The predicates serve as a basis of a verification algorithm. The analysis of its requirements shows that the worst-case time complexity is 0(n+m3), and the average case time complexity is 0(n+m2), where n is the number of events in the behavior (before the reduction), and m is the number of property events. The space complexity is O(m). 6

Contributions, Limitations, and Conclusions

In this paper, we presented the modeling and formal verification of the DLX processor control based on the recendy developed series-parallel poset methodology. The technique is less expressive than some other formal verification methods, but has a low complexity. Thus, we can model complex real-world systems and protocols. Current work is on the verification of the InMOS Transputer microcode, and modeling the behavior of dataflow computers. The issues of event sequencing and timing has been studied for a long time by many researchers - D. Dill, B. Moszkowski, Z. Manna, etc. In many respects, our approach is close to the study of language containment of behavior and property automata [2]. However, we approach the topic from a different point of view, avoiding the issue of exhaustive substring matching. Moreover, the use of the shuffle operator (<8>), significantly simplifies and speeds up the verification task by avoiding the study of all possible independent event interleavings. Closest to our work is that of V.Pratt [4]. However, the main stress in [4] is on modeling system behavior with the help of an extensive collection of operations. Our technique uses a far smaller collection of operations (•, <8>, *), but models not only system behaviors but properties as well. The emphasis is on verification, and the reduced collection of operations simplifies analysis, and improves the algorithms' efficiency. One important shortcoming of our technique is the inability to model "N"-type dependencies among the events occurring in a system. These are encountered quite often in real systems and significantly limit the general applicability of our algorithm. Consider the simple example below:

Figure 6 "N"-type event dependence in a simple system

176 If e,- represent the event "gate i produces a valid output", then the event dependency diagram has the "N"-shape described on the right. This type of dependency cannot be modeled only with operations •, ®, and *. Current work is aimed at extending our verification methodology to deal with "N"-type event dependencies as well. References 1. 2. 3. 4. 5.

6. 7. 8.

9.

10. 11. 12. 13. 14. 15.

K. McMillan, "Symbolic Model Checking", Kluwer Academic Publishing, 1993 R.Kurshan, "Computer Aided Verification of Coordinating Processes: The Automata-Theoretic Approach", Princeton Series in CS, Princeton, 1994 M.Nielsen, G.Plotkin, and G.Winskel, "Petri nets, event structures, and domains", TCS, 1981 V.Pratt, "Modeling Concurrency with Partial Orders", Int. Journal of Parallel Prog., 1986 P. Godefroid, "Partial Order Methods for the Verification of Concurrent Systems: an Approach to the State Explosion Problem", Doctoral Dissertation, University of Liege, 1995 R. Nalumasu, G. Gopalakrishnan, "A New Partial Order Reduction Algorithm for Concurrent System Verification", Proceedings of IF1P, 1996 D. Peled, "Combining Partial Order Reductions with On-the-Fly Model Checking", Journal of Formal Methods in Systems Design, 8 (1), 1996 L.Ivanov, R.Nunna, S.Bloom, "Modeling and Analysis of Non-Iterated Systems: An Approach Based on Series-Parallel Posets", Proceedings of ISCAS'99, 1999 L.Ivanov, R.Nunna, "Formal Verification with Series-Parallel Posets of Globally-Iterated Locally-Non-Iterated Systems", Proceedings of MWSCAS'99, 1999 L.Ivanov, R.Nunna, "Formal Verification: A New Partial Order Approach", Proc. of ASIC/SOC'99, 1999 L. Ivanov, R. Nunna, "Modeling and Verification of Iterated Systems and Protocols", Proc. of MWSCAS'01, 2001 L. Ivanov, R. Nunna, "Modeling and Verification of an Interconnect Bus Protocol", Proc. of MWSCAS'00, 2000 L. Ivanov, R. Nunna, "Modeling and Verification of Cache Coherence Protocols", Proc. of ISCAS'01, Sydney, 2001 J.Hennessy, D.Patterson, "Computer Architecture: A Quantitative Approach", Morgan Kaufmann Publ. Inc., 1990 Bloom, Z. Esik, "Free Shuffle Algebras in Language Varieties", Theoretical Computer Science 163 (1996) 55-98, Elsevier

177

D Y N A M I C BLOCK DATA D I S T R I B U T I O N FOR PARALLEL SPARSE G A U S S I A N ELIMINATION * E. M. DAOUDI+, P. MANNEBACK* AND M. ZBAKH+ Lab. Research in Computer Science, Faculty of Sciences, University of Mohamed First, 60 000 Oujda, Morocco, E-mail:{mdaoudi, zbakh} @ sciences.univ-oujda.ac.ma "Computer Science Lab., Polytechnic Faculty, 7000 Mons Belgium E-mail: Pierre.Manneback@fpms. ac. be

+

This article is devoted to describe a new dynamic block data distribution algorithm over a grid of processors for sparse Gaussian elimination in order to improve the load balance compared to the classical static block-cyclic distribution. In order to assure a numerical stability and to separate the ordering and the symbolic factorizations, Demmel and al. [2,3] presented a new method for sparse Gaussian Elimination with Static Pivoting called GESP where the data structure and the communication graph are known before the numerical factorization. In this work, we assume that the ordering and the symbolic factorizations are already performed and we are interesting by the numerical factorization of the final structure of the matrix to be computed. The experimental results show the advantages of our new approach.

1

Introduction

The efficiency of parallel algorithms on distributed memory machines depends on how to distribute the data over processors. The distribution should be well chosen to minimize the execution time (balance the load work of processors and/or minimize the communication cost). The problems dealing with dense matrices are classically solved by block-cyclic distribution [1]. This distribution is also used for the sparse case in many tools and packages for linear systems solution such SuperLU (Supernodal LU [4]). However for sparse Gaussian elimination, this distribution can lead to a bad load balance and can increase the execution cost. In this paper we propose a new approach of data distribution that balance the workload of processors and/or minimize the total execution time of the sparse Gaussian elimination. The main idea is to redistribute the data before each step of factorization. It is important that the performances do not be degraded by the communication overhead arising from the migration of data. The test matrices are of size nxn and the target •THIS WORK IS SUPPORTED BY THE EUROPEAN PROGRAM INCO-DC, DAPPI" PROJECT

178

distributed memory machines have &px q processor grid topology. The outline of this paper is as follows: in sec. 2, we present one way to partition the data by blocks for sparse matrices, in sec. 3, we present some motivating examples showing the inefficiency of the block-cyclic distribution for sparse Gaussian elimination, sec. 4 is devoted to describe our new distribution approach, the experiment results are given in sec.5. 2

Block data structure

The block partitioning method of the matrices is based on the notion of unsymmetric supernode approach [4]. Let L be the lower triangular matrix in the LU factorization. A supernode is a range of columns of L with the triangular block just below the diagonal being full and with the same row structure below this block. This supernode partition is used in both row and column dimensions. If there are N supernodes in an n x n matrix A, the matrix will be partitioned into iV2 blocks of nonuniform size [3]. The size of each block is matrix dependent. The largest block size is equals to the number of columns of the largest supernode. For large matrices, this can be a few thousand, especially towards the end of matrix L [3]. Such a large granularity would lead to very poor parallelism and load balance. Therefore, when this occurs, the large supernode will be broken into smaller chunks, so that the size of each chunk does not exceed a threshold, representing the maximum block size [3]. For the present study, we assume that the blocks have the same size r, where n = N xr. In the sparse Gaussian elimination with dynamic pivoting, the computational graph does not unfold until run-time, in other words, the symbolic and numerical algorithms become inseparable. Demmel and Li [3] presented a new method for sparse Gaussian Elimination with Static Pivoting called GESP where the data structure and the communication graph are known before the numerical factorization. The basic numerical factorization algorithm in GESP is given by algorithm 1 [3]. for k := 1 to N do 1. 2. 3. 4.

Compute the block diagonal factors L(k,k) and U(k, k); Compute the block column factors L(k + 1 : N,k); Compute the block row factors U(k, k + 1 : N); Update the sub-matrix A(k + 1 : N, k + 1 : N) : for j: := k + 1 to N do for i := k + 1 to N do if(L(i,k)^OkU(k,j)^0) A(i,j) = A(i,j)-L(i,k)U(k,j); Alg.l : Sparse right-looking LU factorization

179

In this work, we suppose that the ordering and the symbolic factorizations are already performed. We are interested by the numerical factorization of the final structure of the matrix to be computed A. 3

Block-cyclic distribution

By block-cyclic distribution we mean that block A(ij) (0 < i,j < N) is mapped onto the processor at coordinates (i mod p, j mod q) of the processor grid. This distribution on a grid of processors is not efficient in term of load balancing and communication costs. We present bellow some motivating examples. Example 1: load imbalancing In Figure 1(a) we illustrate the matrix to be distributed cyclicly (Figure 1(b)) on a grid of 2 x 2 processors (Figure

(a)

(b) 0 2

(c)

1 3

(d) Figure 1. (a): block data structure for a matrix, (b): block-cyclic distribution on a grid of 2 x 2 processor grid, (c): blocks to be updated in step 1 of elimination

As shown on Figure 1(c), the step 1 of elimination is executed by one processor (processor 0). Similarly for all other steps. So, each step of elimination is executed sequentially, this shows a bad load balance. Example 2: bad communication management: Figure 2(b) shows that two consecutive not nil (shaded) blocks in the same row/column are not mapped on two neighbor processors of the grid during the first step. This can be improved to decrease the communication cost by replacing, in row 1, processor 2 by processor 1, 4 by 2 and 6 by 3.

180

(a)

(b)

Figure 2. (a): data structure and block-cyclic distribution for one matrix on a 7 x 7 processor grid, (b): the blocks to be updated in step 1 of elimination

4

Description of the new distribution approach

The proposed algorithm consists in redistributing efficiently the data over the processor grid at each step of elimination. The idea is to accumulate the blocks to be updated at each step of elimination in a dense matrix and redistribute this matrix by the block-cyclic approach. The remaining of this section is devoted to describe in details the different steps of the algorithm. For each step k, 1 < k < N, first of all we determine the sub-matrix Mk formed by the blocks Aij that will be updated in step k. Then we determine how the sub-matrix Mk can be efficiently distributed on the grid. Determination of Mk: A block A^ of the matrix A is an element of Mk if the blocks Aik and Akj are both not nils (that is to say Aik ^ 0 and Akj ^ 0). The size of Mk is determined by the number of blocks not nils in row and column k of the initial matrix. Mk is determined by the algorithm 2, where the couple (is, js) indicates the new coordinates of Aij in sub-matrix Mk. 3s ~ 0; for j := k t o N do{

if(Atii^0){ is '•= 0;

for i := k to N do{ if (Ai,k ? 0) { %s .— Xs ~r I ,

I }

181 js

••= js +

l;

} } Alg.2 : Determination of the sub-matrix Mk Distribution of Mk: • if any element of Mk was not an element of any precedent sub-matrix Mk , k < k, then we distribute Mk cyclicly on the grid; • Otherwise, we analyze the latest affectation of each block in Mk and we determine one sub-matrix of the most size, called Mk, which is already distributed cyclicly. We keep the distribution of M§ and we complete the block-cyclic distribution for the rest of Mk. This choice minimize the redistribution cost. The outline of the new approach of distribution is given in algorithm 3. for k := 1 to N d o if Mk is not already distributed distribute cyclicly Mk; else determine Mk; we keep the distribution of Mk and we complete the cyclic distribution of Mk; Alg.3 : The outline of the new distribution algorithm To illustrate the steps of the new algorithm, we consider the matrix M of Figure 3(a) and a grid of 5 x 5 processors (Figure 3(b)). We determine the sub-matrix M1 to be updated in step 1 (Figure 4(a)) and we distribute it by block-cyclic distribution (Figure 4(b)). The Figure 5(a) illustrate the submatrix M2 to be updated in step 2. M$ is formed by the blocks mapped to the processor of the sub-grid formed by processors 6, 7, 8, 11, 12, 13, 16, 17 and 18. We keep the distribution of M$ and we complete the block-cyclic distribution of M 2 (Figure 5(b)). The blocks that were already affected are mentioned by the symbol !. We proceed at the same way in the remaining steps. 5

Numerical results

The implementations are done, in LaRIA laboratory of Amiens (France), under MPI environment [5] for communication and ScaLAPACK subroutines [1] for computation. The target machines are:

182

o 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 2 0 21 2 2 23 2 4

m

SxS processor grid

Figure 3. Matrix to be distributed by new approach over a 5 x 5 processor grid

m

m

9.

Figure 4. The sub-matrix

M1 (blocks mapped to processors)and

its

distribution

Figure 5. The sub-matrix

M2 (blocks mapped to processorsjand

its

distribution

• Cluster of 19 Celerons Intel Pentium 400 MHz and 18 Celerons Intel Pentium 466 MHz connected by a Fast Ethernet network of 100 Mb/s. • Cluster of eight alphas processors of 533 MHz connected by the Myrinet cards of 1 Go/s. Table 1 presents the computation times for the first five elimination steps of sparse Gaussian elimination on the cluster of 2 x 2 alphas. The test matrix is of size 1600 x 1600 and has the same structure as the matrix of example 1. It

183

is structured by blocks of size 80 x 80. T\ and T2 represent the computation time at each step (they are measured at the end of each step) for each processor for the block-cyclic distribution and for the new approach of distribution respectively. Steps

->

Proc I 0 1 2 3

2

1 Ti 6.26 -

T2 1.61 1.57 1.66 1.69

Ti .073

4

3

r2 .1 -

Tx 5.88 -

T2 1.61 1.26 1.33 1.08

Ti .07

5 T2 .1 -

Ti 4.05 -

T2 1.03 1 1.05 1.08

Table 1. Computation time in seconds on the cluster of alphas (the symbol — means that the corresponding processor is idle)

We remark that, for the block-cyclic distribution, the steps 1, 3 and 5 are sequentially executed, however, with the new distribution, the treatment of these steps is distributed between all processors. This shows a well load balance. For steps 2 and 4, there is only one block to be treated by one processor for the two approaches of distribution (Ti and T2 are roughly equals). In table 2, we present the execution times for sparse Gaussian elimination on the cluster of 4 x 4 Celerons. The test matrix is of size 320 x 320 and has the same structure as the matrix of example 2. It is structured by blocks of size 80 x 80. Ti and T2 represent the execution time at each step for the blockcyclic distribution and for the new approach of distributions respectively. This shows that the execution time is reduced by using the new distribution. Etapes -»•

Ti T2

1 .45 .37

2 .31 .24

3 .09 .02

4 .09 .02

Table 2. Execution time in seconds on the cluster of Celerons

6

Conclusion and perspectives

In this work, we are interested by the numerical factorization of sparse Gaussian elimination. We assume that the ordering and the symbolic factorizations

184

are already performed. We have proposed a new dynamic distribution based on block-cyclic approach. At each elimination step, only the dense blocks concerned by the corresponding elimination step are distributed. This new approach allows to balance the workload compared to the block-cyclic approach. The experimental results show that the workload of processors is well balanced and the execution time can be improved compared to the block-cyclic distribution. Our objective in the future work is to generalize the algorithm for non-uniform blocks and to extend the experimentation for other type of matrices. References 1. Blashford L. S., Choi J., Cleary A., D'Avezado E., Demmel J., Dhillon I. and Dongarra J., ScaLAPACK Users' Guide. (Second Edition SIAM, 1997). 2. Li X. and Demmel J., A Scalable Sparse Direct Solver Using Static Pivoting. In Proceedings of the 9th SIAM Conference on Parallel Processing and Scientific Computing, San Antonio, Texas, 1999. 3. Li X. and Demmel J., Making Sparse Gaussian Elimination Scalable by Static Pivoting. In Proceedings of SC98 Conference, Orlando, Florid, 1998. 4. Li X. , Demmel J. and Gilbert J. R., An Asynchronous Parallel Supernodal Algorithm for Sparse Gaussian Elimination, SIAM J. Matrix Anal. Appl., 20 (1999) pp. 915-952. 5. Pacheco P. S., Parallel Programing with MPI.(Morgan Kaufmann Publishers, Inc. San Fransisco, California 1997).

185 A L L PAIRS S H O R T E S T P A T H S C O M P U T A T I O N USING J A V A O N PCS CONNECTED ON LOCAL AREA NETWORK

JOHN JINGFU JENQ Computer Science Department, Montclair State University, Upper Montclair, NJ 07043 E-mail: [email protected]

WINGNJNG LI Department of Computer Science and Computer Engineering, Fayetteville, AR 72701 E-mail: [email protected]

University of Arkansas,

The computation of shortest path using Java programming language on PCs that are connected on local area network has been developed. Comparisons with different number of processors, and with different number of nodes of the problem graph have been performed experimentally. Speed up factor and efficiency analyses have been conducted experimentally as well.

1

Introduction

Shortest path computation is a very important task required for efficient routing in transportation and communication networks. The problem shows up in applications such as robot path planning, circuit design, missile defense, and highway transportation design. How to compute shortest paths has been studied intensively for both sequential and parallel computing models. An annotated bibliography and taxonomy of related problems using uniprocessor is given in [5]. Experimental evaluation of the efficiency of different shortest paths algorithms is reported in [3]. The shortest paths problem is commonly represented as a graph with vertices and edges. For a dense graph, a matrix representation is usually used. For a sparse graph, a linked list (and its variations) representation is more popular. For the all pairs shortest paths problem, one is required to compute the shortest path from node i to node j , for i * j , and 0 < i, j < N; where N is number of vertices in the graph under consideration and vertices are numbered from 0 to N-l. Many studies using parallel processing approaches to efficiently solve the shortest paths problem are reported in the literature. Examples include using VLSI systolic arrays [6], using distributed processing [4][11], using CREW PRAM model [2], using CRCW PRAM model [7][8], and using EREW PRAM model [8]. Sometimes it is desirable to find rectilinear shortest paths for robotic applications. Algorithms dealing with rectilinear shortest paths are reported in [2] [12]. Results concerning three-dimensional shortest paths that also find application in robotics are given in [1]. For results of parallel program development for commercial available

186

machines, the reader is referred to the paper of Jenq and Sahni where programs that run on NCUBE mini super computers with 64 nodes are developed and experiment is conducted [10]. The problem of finding a shortest path between two nodes that also has the smallest number of edges is investigated by Zwick [13]. In this paper, we develop programs, using Java programming language, that solve the all pairs shortest paths problem on PCs that are connected to a local area network and conduct experiment to study the effectiveness of the approach. The experimental results are very encouraging and demonstrate the potential of the approach. Java is a popular high level programming language and has gained its popularity in teaching client/server computation for its ease of usage and the shorten software development time. Our experience also confirms that and further suggests using Java as a tool to conduct research in parallel and distributed processing. The paper is organized as follows. Section 2 describes the fundamental sequential algorithm to be paralleled and the basic Java classes used in the program development. Section 3 presents the parallel program. Section 4 analyzes and discusses the experimental results, including comparisons using speed up factor and efficiency measure. Section 5 concludes the report. 2

Preliminary

The famous fast algorithm to solve the all pair shortest path problem for dense graphs is the Floyd's algorithm. Our development of the parallel programs is based on this algorithm. Let N be the number of nodes in the graph. The distance between node / and nodey is denoted as Dist(i.j). Initially, Dist(i,j) is 0 if i -j; is infinity if/ and j are not adjacent; and is the weight of edge (i,j) if/ and j are adjacent. At the end of the computation, Dist(i.j) gives the true distance, or the length of the shortest path, between node i and node j . To use Floyd's algorithm, we shall assume that there is no negative-weight cycles in the graph, though negative-weight edges may be present. The following is Floyd's algorithm consisting of three nested for loops. for (k=0; k < N; k++) for (i=0; i < N; /++) for (j=0;j < N;j++) if(Dist(i,j)) > (Dist(i,k) + Dist(kj)) Dist(i,j) - Dist(i.k) + Distfkj); On each iteration of k, a new matrix will be generated. Let Dist0 be the initial matrix, then it can be verified [9] that the result will not be changed if we replace the if statement by if(Dist(i,j) > (Distk~l (i,k) + Dist1"1 (k,j)) Dist(ij) = Dist*'1 (I,k) + Dist ' (k,j). This enables us to develop parallel algorithms that distribute the

187

computation of two inner most for loops in network computers and only exchange information (by communication) in the outeryor loop. In the current Java net package library there are streaming based socket and data gram socket. We choose data gram socket in our application due to its small communication overhead compared with streaming based sockets. In addition to that, we also take advantage of the multicast facility in the java.net library that further reduces the communication time a lot. 3

Parallel all pair shortest path algorithm on PC area network

We run the program on PC a lab. with Dell computers, using Window 98 operating system, that are connected to a local area network. The language used to develop the program is Java that was developed by SUN. With the java.net library, it is easier for us to construct the program by concentrate more on application problems. The whole application in fact contains two subprograms. One is the controller program and the other a worker. The controller controls the coordination among the workers. It gives control signal through broadcasting. It also receives signals from the workers. The controller issues start signal at the beginning of each iteration, for the computation of the matrix. It then waits for the workers to send in completion signals. When all the signals are received, it starts another iteration of computation by giving another start signal. The total number of broadcasting of the signals to start the computation is therefore 0(N). The signal is just one byte signal that was sent using data gram. Rather than use the streaming based sockets, we used both ordinary and multicast data gram sockets. It is relatively cheaper to use data gram sockets than the streaming sockets in our application. As for the worker program, each worker get a portion of the matrix based on its ID. Note the ID was assigned by the controller when program begins. In addition to the ID number received from the controller, the number of nodes of the graph also received from the controller. The information about the number of PCs, that will participate in the computation, also received from the controller at the beginning execution of the program through broadcasting. The pivot row, required by each processor in the computation, is sent at the beginning of each iteration by the appropriate PC. Broadcasting is used in this operation to reduce the overhead. As soon as the row of data has received, the computation starts on the data partition for each PC. At the end of the computation in each iteration, the PC sent a completion signal back to the controller. It then wait for the controller to start another iteration. Upon receiving the signal, each PC computes and check whether it is its term to broadcast the data to all others.

188 4

Experiment results and analysis

The experiment results are shown in Table 1. The numbers are in milliseconds. A run time of 0 simply means that the run time is less than 1 millisecond, as in the one PE case. It can be seen from the table that, as expected, the run time grows in 0(N

) for the one PE case when N increases. Table 1 Running time using different number PEs 1PE 0 330

64 Nodes 192 Nodes

2 PEs

4 PEs 6 PEs 8 PES 10 PEs 12 PEs

170

110

120

660

610

610

160 660

180 710

170 770

2530

2140

2080

2140

2200

13070 10820

9060

8700

7910

148740 82560 65690 49370

41630

38170

384 Nodes

3740

3620

768 Nodes

28830

21370

1536 Nodes

228270

3072 Nodes 1852580 1183920 587870 411170 324170 267600 232780

As usual, we measure the effectiveness of the parallel programs experimentally using speed up and efficiency. The speed up factor is defined as: running ~ time ~ of ~ sequential ~ a lg orithm . running ~ time ~ of ~ parallel ~ a lg orithm The efficiency E is defined as the speed up factor divided by the number of processors, p, involved in the computation. sp E= — P In the experiment, edge-weight matrices are generated randomly using the Random class of Java that generates pseudo random numbers. For simplicity, only integers are used for the edge-weight. We believe that changing integer type to float or double type will not invalidate the experimental results, though it would be interesting to see how those changes may affect the actual run time. The measurement of run time reported in Table 1 does not include the time used to generate the edge-weight matrix. When the number of nodes of the graph is small, the parallel program does not achieve any speed up at all. Only when a certain threshold of the number of nodes is exceeded, then the speed up becomes significant. The speed up versus the number of PEs used is plotted in Figure 4.1. Note that when the graph is large, the more PEs we have the faster the computational process. For example, considering the case of 3072 node graph, one PC takes about 30 minutes to finish, while 12 PCs only take 4 minutes. sp =

189

Speed up 10 8 6 4 2 0

H 64 Nodes B192 Nodes

2 PEs

4 PEs

6 PEs

8 PEs

10 PEs

12 PEs

• • • •

384 Nodes 768 Nodes 1536 Nodes 3072 Nodes

Figure 4.1. Speedup vs. number of PEs Figure 4.2 depicts the efficiency E plot. Note that higher efficiency is always achieved by a fewer number of PEs than by a larger number of PEs. Nonetheless, when the graph size increases, independent of the number of PEs involved, the efficiency always increases. 5

Conclusions and remarks

Parallel programs, written in Java and run on a set of inexpensive PCs connected to a local area network in a typical student lab, to solve the all pairs shortest paths problem have been developed. A speed up factor of 8 can be achieved when 12 PCs are used to run a graph of around 3000 nodes. Greater speed up would be expected when the number of nodes in the graph increases or when the number PCs increases. For graphs with small size, one PC alone outperforms the multiple network approach due to communication overhead among participant PCs. As the graph size increases, the computation time out weights the communication time and the proposed approach becomes effective. One remark about the great advantage of using Java to write network application is the shorten development time due to the rich classes in the java.net library that comes with the Java develop kit. In our experiment, we take advantage of the multicast facility of Java that results in further reduction of the control coordination overhead and therefore improves the overall performance.

Figure 4.2. Efficiency vs. number of PEs 6

References

1. Agarwal, P. K., Har-Peled, S., Sharir, M., and Varadarajan, K., Approximating shortest paths on a convex polytope in three dimensions, Journal ACM, vol. 44, no. 4, pp 567-584, (1997). 2. Atallah, M., and Chen, D., Parallel rectangular shortest paths with rectangular obstacles, Proceedings of the Second! Annual ACM Symposium on Parallel Algorithms and Architectures, pp 270-279, (1990) 3. Cherkassky, B. V., Goldberg, A., and Radzik, T., Shortest paths algorithms: theory and experimental evaluation, Proceedings of the Fifth Annual ACMSIAM Symposium on Discrete Algorithms, pp 516-525, (1994). 4. Chen, C , A distributed algorithm for shortest paths, IEEE Transactions on Computers, vol. c-31, pp 898-899, (1982). 5. Deo, N and Pang, C. Shortest path algorithms: Taxonomy and Annotation, Networks, pp 275-323., (1984). 6. Dey, S., and Srimani, P. K., Parallel VLSI computation of all shortest paths in a graph, Proceedings of the ACM Sixteenth annual Symposium on Computer Science, pp 373-379., (1988). 7. Frieze, A., and Rudolph, L., A parallel algorithm for all pairs shortest in a random graph, Proc. 22nd Annual Allerton Conf on Communication, Control and Computing, pp 663-670., (1984). 8. Han, Y., Pan, V., and Reif, J., Efficient parallel algorithms for computing all pair shortest paths in directed graphs, Fourth Annual ACM Symposium on Parallel Algorithms and Architectures, 1992, pp353-362, (1982). 9. Horowitz, E., and Sahni, S., Fundamentals of Computer Algorithms, (Computer Science Press, 1978)

191

10. Jenq, J., and Sahni, S., All pairs shortest paths on a hypercube multiprocessors, Proceedings of the 1987 International Conference on Parallel Processing, pp. 713-716., (1987). 11. Lakhani, G., An improved distribution algorithm for shortest path problem, IEEE Transactions on Computers, vol. c-33, pp. 855-857, (1984). 12. Lee, D. T., Chen, T. H., and Yang, C. D., Shortest rectangular paths among weighted obstacles, Proceedings of the Sixth Symposium on Computational Geometry, pp. 301-310, (1990). 13. Zwick, U., All pairs lightest shortest paths, Proceedings of the Thirty-first Annual ACM Symposium on Theory of Computing, pp. 61-69, (1999)

Computing Education

195 ENHANCING STUDENT LEARNING IN E-CLASSROOMS JEROME ERIC LUCZAJ AND CHIA Y. HAN Department ofECECS, University of Cincinnati, Cincinnati OH 45221-0030, USA E-mail: [email protected], [email protected] E-classrooms provide a unique opportunity to use technology to implement prompt feedback and thereby augment the current classroom experience. Coordinating various instructional streams with student assessment and feedback will provide the means for instructors to know when and if their intended message was communicated to their students, permitting instructors and students to react quickly when there is a gap between intent and understanding. Further, developing a flexible instructional infrastructure will create a bridge between course objectives and course assessment, classroom instruction and student feedback, and seat-time and study-time. By developing this framework within an eclassroom, information can be gathered to measure student, instructor and organizational achievement and to assist in improvement.

1

Introduction

Many different factors influence whether a student has successfully attained learning objectives. Student performance is rarely homogeneous, resulting in the familiar bell-shaped distribution. Causes for poor student performance are hard to pinpoint. A comprehensive assessment strategy needs to be defined and implemented. It must demonstrate that the content has been delivered to the student and that the student has received it. Further, it must give prompt feedback to students, instructors and programs. The resulting impact on student learning should be "A Significant Difference" [1]. With the recent advent of powerful computer technology for delivering mediarich content in classroom settings, many new possibilities are now available for augmenting classroom instruction and learning. Previous related works at several academic research centers, such as Georgia Tech, U. of Massachusetts, and Cornell University, have contributed significantly to the use of electronic notes with audio and video recording. Project Classroom 2000 [2] produced several solutions, such as Zen*, a client server system that allowed each electronic whiteboard to be a client tied into a threaded central server; DUMMBO (Dynamic Ubiquitous Mobile Meeting Board) system uses a SmartBoard electronic whiteboard and a Webinterface access method that was time-line based; Stupad is a customizable tool for personalized notes and playback of all captured streams. MANIC (Multimedia Asynchronous Networked Individualized Courseware), developed at U. of Mass. [3, 4], is an asynchronous system that uses HTML slides and GIF images synchronized with audio via RealAudio to a Web browser. The browser uses the media plug-in to

196 start the presentation. MANIC uses CGI scripting to create the slides from the GIF images and slides. CGI then sends the slides to the Web server when requested by the user, and then the server sends this request to the user's browser. Project Zeno distance learning tools [5] include a full spectrum of automatic editing and playback. The Lecture Browser uses a Real-video plug-in, to allow playback of MPEG videos. Although many new methods are available, the different formats of material being delivered during lectures may overwhelm students, hindering learning, causing student disengagement in the classroom. Research that focuses on getting more out of the classroom experience shows that user-interaction in selecting data keeps the students interested in material. [6] These studies indicate that students' needs and viewpoints have to be taken into consideration when course material is given in technology-based classrooms. Further, since a typical classroom experience involves multiple, simultaneous activities or streams of information, it is important that technology offer a system that will synchronize these streams, providing context and valuable insight during assessment, instructor feedback, and student review. Frequent, periodic assessment providing the basis for prompt feedback to students, instructors, and programs will enhance student learning. 2

Major Assessment Issues

Educational assessment has multiple, distinct uses in instructional improvement including: school and student accountability for academic achievement, feedback for teachers to revise teaching and administrators to allocate resources, and stimulation for students to receive deeper understanding [7]. In most cases, the main criterion for assessing student performance is the degree of subject matter understanding. Student performance is evaluated at irregular intervals, through graded homework, quizzes, tests, projects, and final exams. Typically, students who were confused during lectures would not find out how far behind they were until it was too late to catch up. It is during the contact time in a classroom where the instructor can exert the greatest impact on student learning. In fact, a study from the University of Tennessee found that teacher effectiveness was the dominating factor affecting student academic gain [8]. Thus, it is important to assess learning in the classroom and let instructors take timely measures and make any necessary remedial changes. In terms of teaching evaluation, especially at the collegiate level, the burden of teacher and course evaluation falls upon the student using either in-class or Webbased survey forms. Typically, these evaluations are done just once, if at all. Since the survey is normally completed toward the end of the academic term, it does not impact students in the current class, though it may be helpful to future students in the same course with the same instructor. Also, since students may not take the survey seriously, the validity of the student response is questionable.

197 Typically, there is no correlation between student learning assessment and instructor teaching evaluation. Without timely feedback connecting student learning to instructor evaluation, neither the instructor nor the students have the power to affect change or to correct problems. Frequent feedback to the students advances student learning. According to Brien and Eastmond in Cognitive Science and Instruction, "During instructional activities, the competencies taught must be reinforced each time they are adequately used by the learner." [9] Further, they describe the "ideal" situation as one where the learner knows the final goal as well as the sub-goals that support the final goal. Therefore, it is important that learning outcomes be explicit and feedback frequent.

3

Approach: CaSA System Design

Networked computers or ubiquitous computing/PDA-based terminals are becoming widely available, so they should be used in classrooms to enhance learning through active interaction between students, the instructor and the material. To make use of the new technology, a new generation of instructional software is needed. A new framework, CaSA (Classroom and Student Achievement assessment), is presented. CaSA is a flexible framework to augment the classroom experience by coordinating and synchronizing instructional streams, matching class plans to student class experience, and presenting instruction in a variety of media forms to promote self-directed learning. The emphasis is on facilitating timely feedback from students, offering alternatives to students with differing learning styles, and collecting assessment data.

Real-Tim

e

Stream

Com

p o n e n t a Ho n C o m

S f f a a m B e e a l i in g Com ponent

Com

S tu d e n t « o le

p o S l u d a n t F a e D b a ck C
p o n e n t

C om P° rte it t

Figure 1. CaSA Component Diagram

The CaSA framework will consist of and coordinate three major components: a Preparation component, a Real-Time Stream component and a Review component

198 (see Figure 1). These components are organized by whether their functionality supports e-classrooms prior to, during or after classroom instruction. CaSA will support assessment by providing: student topic marking, periodic and frequent electronic concept questions both during and after class sessions, and instructional stream review and evaluation. 3.1

Pre-Classroom Preparation

Each lecture supports learning objectives from the course syllabus. Prior to the class, the instructor defines the lecture outline, covered topics, and the lecture notes (ClassPlan) which are associated with the text and course syllabus. The lecture consists of a ClassPlan presentation. As it is delivered, the various streams of presentation will be parsed into lecture segments, each of which is associated with a media type and descriptive labels, such as 'introduction,' 'concept,' 'equation,' 'illustrative example,' etc. derived from the ClassPlan. CaSA will also permit streams to be added or removed interactively. As instructor presentation formats evolve, CaSA will accommodate these changes to electronically capture the class session. CaSA will capture the data required to support creation of concept maps from the ClassPlans, the covered topics representing core concepts. As instructors and students connect additional material to these core topics, the data available for the concept map grows. There are several ongoing efforts in the area of visually representing concept maps including work by Puntambekar, Stylianou and Jin at the University of Connecticut [10] as well as work by Aroyo, Dicheva and Velev [11]. CaSA will collect the data required for concept maps with the intent that CaSA will facilitate the use of concept map representational systems within an e-classroom. 3.2

In-Classroom Interaction

During the e-classroom experience, CaSA will coordinate instructional streams, assessments, and feedback and will electronically increase classroom interactivity. CaSA will electronically record the instructional content in its various forms, synchronizing the multiple instructional streams, as well as student feedback and assessment. It will provide context and valuable insight during assessment, instructor feedback, and student review. The instructor's ClassPlan will be made available to students on their computing device at the beginning of each class session. This will accomplish two things. First, it will let the students know what topics will be covered, setting their expectations. Second, it will provide information needed for students to mark topics that are covered. As the class presentation proceeds, students will be able to indicate when they believe a specific topic has been covered. These marks will individually index the

199 instructional streams. When they wish to review course material, the students can access the Web-delivered instructional streams based upon their individual marks. By allowing the students to use their individual marks to access course material, this system provides an incentive to mark topics as accurately as they can and serves to engage the students during the presentation. Each student will be able to create personal links from personal material to course material, facilitating self-directed learning. By providing tools that assist students in retrieving, evaluating, comprehending and memorizing information while performing learning tasks the system will provide clear benefits. [11] Student topic marking also serves another purpose. As the students mark topics, CaSA will determine how many students believe that the instructor has covered a particular topic. This information can be made available to the instructor on demand or via instant notifications based upon predefined thresholds. If the instructor feels that he/she has presented a topic, but student thresholds have not been met, the instructor will know immediately and can make appropriate changes. CaSA will use the installed network of computers to permit the instructor to ask frequent, periodic questions to assess student understanding of the main concepts from the ClassPlan. These questions will be integrated with an intelligent FAQ (iFAQ) data bank to promote self-directed learning. Answers will be collected and immediate feedback regarding the overall state of student understanding can be presented to the student and the instructor. This will permit both student and instructor to take corrective action in a timely matter. In addition, reaction to and general attitude toward the material and class can be surveyed through positive and constructive questions during and at the end of the class. 3.3

Post-Classroom Review and Study

In a post-classroom review and study the system should be entirely student-centric. An Adaptive Computer Assisted Instruction (ACAI) system such as Arthur, developed previously at UC [12] is an ideal platform for this phase of learning. Offering multiple presentation methods from the lecture and off-line instruction modules, Arthur allows a student to switch to the presentation method that will best suit the student's learning style. Arthur has been designed to achieve "A Significant Difference" in learning. CaSA will provide an interface to the existing Arthur system, incorporating it into the overall solution. 4

Conclusions

It is anticipated that the creation and implementation of CaSA will demonstrate the usefulness of technology in education, especially in higher education where significant technological resources are already invested. It will augment both

200

classroom and post-classroom learning, provide a bridge between seat-time and study-time, course objectives and course assessment and will provide a channel for student feedback so that timely and adaptive instruction can be realized. References 1.

Russell, T. L. (compiled by). The no significant difference phenomenon as reported in 355 research reports, summaries and papers: A comparative research annotated bibliography on technology for distance education. North Carolina State University: Office of Instructional Telecommunications. (1999). 2. Abowd, G.D, "Classroom 2000: An experiment with the instrumentation of a living educational environment, Pervasive Computing, Vol. 38, No. 4, 1999. 3. Padhye,, J. and Kurose, J, "An Empirical Study of Client Interactions with a Continuous-Media Courseware Server," Technical Report UM-CS-1997-056, University of Massachusetts, 1997. 4. Stern, M, Steinberg, J, Lee, H. I., Padhye, J., and Kurose, J.,"MANIC: Multimedia Asynchronous Networked Individualized Courseware," Proceedings of Educational Multimedia and Hypermedia, 1997. 5. Mukhopadhyay, S. and Smith, B., "Passive Capture and Structuring of Lectures," Project Zeno, Cornell University. 6. Hannafin, R. and Sullivan, H., "Learner Control in Full and Lean CAI Programs," Educational Technology Research and Development, Vol 43, No. 1, 1996, pp. 19-30. 7. Baker, E. L.; Mayer, R. E., "Computer-based assessment of problem solving," Computers in Human Behavior, 15, pp. 269-282 (1999). 8. Sanders, William L.; Wright, S. Paul; Horn, Sandra P., "Teacher and Classroom Context Effects on Student Achievement: Implications for Teacher Evaluation," Journal of Personnel Evaluation in Education Volume: 11, Issue: 1, April 1997, pp. 57-67. 9. Brien, Robert and Nick Eastmond, Cognitive Science and Instruction, , Educational Technology Publications, pp. 18-33 (1994). 10. Puntambeker, Sadhana; Agnes Stylianou and Qi Jin, "Visualization and External Representation in Educational Hypertext Systems " Artificial Intelligence in Education Volume 68, IOS Press, pp. 589-591 (2001). 11. Aroyo, Lora; Darina Dicheva and Ivan Velev, "A Concept-based Approach to Support Learning in a Web-based Course Environment" Artificial Intelligence in EducationVolume 68, IOS Press, pp. 1-13 (2001). 12. Gilbert, J. E.; Han, C. Y.,"Arthur: Adapting Instruction to Accommodate Learning Style," Proceedings of WebNet 99: World Conference on the WW and Internet, Honolulu, HI: Association for the Advancement of Computing in Education, pp. 433-439, (1999).

201

BUILDING UP A MINIMAL SUBSET OF JAVA FOR A FIRST PROGRAMMING COURSE ANGEL GUTIERREZ Department of Computer Science, Montclair State University, Upper Montclair, NJ 07043, USA E-mail: [email protected] ALFREDO SOMOLINOS Department of Mathematics and Computer Information Science, Mercy College, 555 Broadway, Dobbs Ferry, NY 10522, USA E-mail: [email protected]

There are problems associated with the usage of Java as a first programming language. The complexity of the language, for instance, makes it difficult to use the object-oriented approach from the beginning. We present in this paper the building blocks for a Reduced Instruction Set Java, which allows to write graphical programs and to teach a basic object-oriented programming course, without neglecting the fundamental constructs of the language.

1

Introduction

A common problem encountered in many Java books [1], [2], is that they try to cover too much. If these books are used to teach Java as a first programming language, the students are overwhelmed by the difficulty of the language. The language is too large, there are thousands of functions and therefore even simple programs require a lot of background. The usage of console applications as a starting point produces dull text output, and requires sophisticated techniques for input, usually hidden in ad hoc classes. Even if the books are more object-oriented, [3], [4], the great amount of classes used tends to jeopardize the acceptance of the language. In order to avoid these hurdles we start to develop a Reduced Instruction Set Java (RISJ). This set will be a minimum one that will allow writing graphical programs and teaching a basic object-oriented programming course. We use the strengths of the language, object orientation and graphics, from the very beginning. The approach to teach RISJ is a tutorial one, presenting the general ideas in the context of an application, without trying to be exhaustive or encyclopedic.

202

In the next sections we will expand the first components of RISJ, enumerating the programming ideas introduced in each section, and presenting sample programs that illustrate those ideas. We will also keep track of the new keywords and functions used, and when opportune the sample output will be shown. On the other hand, we will neither explain in detail the programming concepts nor show the complete syntax specifications. 2 2.1

The First Program: A Graphic "Hello World" Concepts

This basic program develop the following concepts: Writing, compiling and executing programs, and using libraries; Executing an applet either from the IDE or from a browser, using HTML; Classes of programs and inheritance; Graphic objects; Build-in functions; Calling a function and using parameters. 2.2

The Program II Program SayHiJava import java.awt*; // Import classes. All the programs will need this and next import classes. import java.applet*; // Import classes. We will not explicitly write them, although they should be there. public class SayHi extends Applet { public void paint(Graphics screen) { screen.fillRect(140,40,20,BO); // Left side of H screen.fiilRect06O,9O,6O,2O); //Centerbar screen.fillRect(220,40,20,130); // Right side screen.fillRect(260,90,20,80); // bottom of I screen.fiUOval(260,60,20,20); //1 dot } }

The output of this program can be seen in Figure 1.

Figure 1. Output of the SayHi program. 2.3

New keywords, constants and methods

Graphics, import, public, class, extends, void, fillRect(), fillOval().

203

3 3.1

Class inheritance Concepts

Inheritance is a mechanism for improving existing working classes without changing the source code of the original class. The inheritance mechanisms that we consider are: Adding variables, adding functions, and overriding functions. 3.2

Programs II Program Square.java. It needs to include the import classes public class Square extends Applet { public void paint(Graphics screen) { screen.fillRect(140,40,130,130);

( }

// Program SquareColor.java. It also needs the import classes. See first program public class SquareColor extends Square { public void paint(Graphics screen) { screen.setColor(Color.red); super.paint(screen); } } // Program SquareColorBG.java. It also needs the import classes public class SquareColorBG extends SquareColor { public void initO { setBackground(Color.blue); repaintO; }

A sample output can be seen in Figure 2.

Figure 2. Sample output of the class inheritance programs 3.3

New keywords, constants and methods

Init(), setColorQ, Color.red, Color.blue, setBackgroundQ, super, repaint().

204

4 4.1

Defining your own commands. Functions Concepts

We use global and local variables (storage locations). We see functions as black boxes. First we use functions without parameters and then functions with parameters, since passing values to a function avoids using global variables and adds flexibility. We distinguish between input parameters and formal parameters, these being just placeholders. We deal finally with the overloading of functions. 4.2

Programs II Program Circleplain. Java. It draws a circle. Do not forget to include the import classes. public class CirclePlain extends Applet { Color drawColor; // Global void circle(int px, int py, int pdiam, Graphics pscreen) { pscreen.setColor(drawCoIor); pscreen. setXORMode(Color.white); pscreen.fillOval(px,py,pdiam, pdiam);

) public void paint(Graphics screen) { drawColor = Color.blue; circle (50,50,150,screen);

i ) // end class // Program Ring.java public class Ring extends CirclePlain { public void paint (Graphics screen) { int x,y,diam; x = 50; y = 50; diam = 150; // variable defined in Circle Plain drawColor = Color.blue; circle (x,y,diam,screen ); // function defined in Circle Plain circle (x+10,y+10,diam-20,screen); } } // End Program Ring // Program Piggy Java. It draws a pig's face public class Piggy extends CirclePlain { public void paint(Graphics screen) { int x,y,diam; drawColor = Color.pink; // global in CirclePlain x = 140; y = 70; diam = 140; circle (x,y,diam,screen); // face x = 165; y = 105; diam = 20; circle (x,y,diam,screen); // eyes x = 235; y = 105; diam= 20; circle (x,y,diam .screen); x = 180; y = 130 ; diam = 60; circle (x,y,diam .screen); // spout x = 190; y = 145 ; diam =15; circle (x.y.diam .screen); // nostrils

205 x = 215; y = 145 ; diam = 15; circle (x,y,diam .screen);

A sample output can be seen in Figure 3.

o

11 111

Figure 3

4.3

New keywords, constants and methods

Color, int, getGraphics(), setXORmode(). 5 5.1

Using functions with parameters in loops Concepts

We create animation by changing a function parameter inside a loop. We deal with conditions, relational operators, and selection using if. Finally while and for loops are encountered. 5.2

Programs II Program MoveBall.java. A blue ball moves from left to right. Import classes need to be included public class MoveBall extends CirclePlain { int maxX, maxY; public void init() { maxX = 400; maxY = 300; setSize(maxX,maxY); } void pause(int count) { Graphics screen = getGraphics(); for (int i = 0; i < count; i++) circle (0,0,0,screen); // waste time

206 ) public void paint(Graphics screen) { int x.y.diam; y = 70; diam = 30; drawColor = Color.blue; // variable defined in Circle Plain x = 0 ; // Initialize Loop while ( x < maxX-diam) // Test { circle (x,y,diam,screen); pause(5000); circle (x,y,dian%screen); x = x+l;//Update

) } // End paint } // End Program

5.3

New keywords, constants, methods and operators

SetSize(), for, while, operator < , operator ++. 6

Discussion

•

We have introduce 22 keywords, constants and methods used in RISJ: import, public, class, extends, void, Graphics, fillRect(), fillOval(), Init(), setColor(), Color.red, super, Color.blue, setBackground(), repaint(), Color, int, getGraphics(), setXORmode(), SetSize(), for, while. Object oriented concepts were used from the beginning: The students learn how to use classes and objects before they are taught how to define their own. In fact, in the examples we used inheritance before we showed how to define our own methods.

•

References 1. Bell D. and Parr M., Java for Students, Second Edition (Prentice Hall, New Jersey, 2000). 2. Deitel H. and Deitel P., Java, How to program, Third Edition (Prentice Hall, New Jersey, 2001). 3. Horstmann C , Computing Concepts with Java 2 Essentials, Second Edition (John Wiley and Sons, New York, 2000). 4. Wu T., Introduction to Object-Oriented Programming with Java, Second Edition (McGraw-Hill, New York, 2000).

207

COMPLETING A MINIMAL SUBSET OF JAVA FOR A FIRST PROGRAMMING COURSE ANGEL GUTIERREZ Department of Computer Science, Montclair State University, Upper Montclair, NJ 07043, USA E-mail: [email protected] ALFREDO

SOMOLINOS

Department of Mathematics and Computer Information Science, Mercy College, 555 Broadway, Dobbs Ferry, NY 10522, USA E-mail: [email protected] There are problems associated with the usage of Java as a first programming language. The language is object-oriented, but the need to learn the details of the syntax, relegates the objectoriented concepts to the background. We complete in this paper a previously started Reduced Instruction Set Java, which allows to write graphical programs and to teach a basic object-oriented programming course, without neglecting the fundamental constructs of the language.

1

Introduction

A common problem encountered in many Java books [1], [2], is that they try to cover too much, or they deal with too many classes, [4], [5], as we previously mentioned [3]. We now use graphics-style interactions with events to complete a Reduced Instruction Set Java (RISJ), [3]. The approach is the same as before, but here we try to avoid text input through the console. 2 2.1

Declaring and constructing GUI objects: Buttons and TextFields Concepts

The Graphics User Interface i.e. predefined classes to create a program's graphic interface. Creating objects with new. Using constructors. Adding objects to the applet. Sending messages to the objects (using the object member functions).

208

2.2 Programs II Program TwoBttns.java. Constructing GUI objects import Java.applet.*; // Import classes. All the programs will need this and next import classes. import java.awt.*; // Import classes. We will not explicitly write them, although they should be there. public class TwoBttns extends Applet { Button oneButton,twoButton; public void init() { oneButton = new Button( "one!" ); // create buttons add( oneButton); twoButton = new Button( "two!" ); twoButton.setBackground(Color.cyan); add(twoButton ); } //end init public void paint(Graphics screen) { oneButton.setLabel("Red"); oneButton. setBackground(Color.red); }

) // Program LoopsTexts.java. It uses Text Fields and nested for and while loops. It needs to import classes public class LoopsTexts extends CirclePlain { public int maxX, maxY; TextField redText,blueText; public void init() { redText = new TextField( "Red" ,30); add( redText); blueText = new TextField< "Blue", 30 ); add(blueText); maxX = 400; maxY = 300; setSize(maxX ,max Y); } void pause(int count) { for (int i = 0; i < count; i++) showStatus("Paused"); // Waste some time

1 public void paint(Graphics screen) { int x = 0 , y = 70, diam = 20; for (int k = 0; k < 2; k++) { drawColor = Color.blue; blueText.setText("Blue going right!"); while ( x < maxX-diam) { circle (x,y,diam,screen ); pause(5000); circle (x,y,diam,screen ); x = x+l; } blueText. setText("Last x = " + x); drawColor = Color.red; redText.setTextf'Red going left!"); while ( x > 0 ) { circle (x,y,diam,screen); pause(5000); circle (x,y,diam,screen ); x = x-l;

209 redTextsetTextfFirst x = " + x); } // end for }//end paint } // End Program

The output can be seen on Figure 1.

Figure 1. Constructing GUI objects.

2.3

New keywords, constants and methods

Button, new, add(), setLabel(), TextField, SetText(), ShowStatus(). 3 3J

Interactive Programs. Events Concepts

We deal with events and event handlers, i.e., interrupt driven programming, interfaces and the methods whose implementation is required by the interface. In particular we show the ActionListener interface and the implementation of the method actionPerformed and how to obtain the source of the interrupt. 3.2

Programs li Program ColorOval.java. ActionEvents are generated by clicking on buttons, and changing the contents of text // fields. We need only to implement one function actionPerformed. The ball turns green when the button is clicked // We need to import the customary classes. public class ColorOval extends Applet implements ActionListener { int x,y,width,height; Color drawColor; Button bgreen; public void initO { bgreen = new Button("Green"); add(bgreen); bgreen.addActionListener(this); I

210 public void paint(Graphics screen) { x = 100; y = 100; width = 150; height = 150; drawColor = Color.red; screeasetColor(drawColor); screen.setXORMode(Color. white); screen.fiUOval(x,y,width,height); } public void actionPerformed(ActionEvent buttonEvent) { String bLabel = buttonEventgetAcuonCommandO; if (bLabel.equals("GreenM)) drawColo r= Color.green; repaintO; } // end actionPerformed } // End Program

The output can be seen on Figure 2.

fcftppial started

Figure 2. ActioeEvents are generate!, for instance, by clicking on buttons

33

New keywords, constants and methods

String, implements ActionListener, addActionListener(), this, actionPerformed(), ActionEvent, getActionCommand(), equals. 4 4.1

A Text Input Program Concepts

We use Text Fields for input and for generating interrupts. The ActionPerformed event is revisited. We also convert from strings to numbers, from decimal numbers - float - to integers, and we use the Math library functions.

211 4.2

The program II Program TextEventjava. The program asks the user to guess a number between 1 and 100. // It gives hints, "too high", "too low", to help in the search. It needs to include the usual import classes

public class TextEvent extends Applet implements ActionListener { TextField outputBox.promptBox, inputBox; int targetNumber; public void initO { outputBox = new TextField("Guess a number from 1 to 10 ",40); add(outputBox); promptBox = new TextField("Move below with the mouse. Type it Press Enter",40); add(promptBox); inputBox = new TextField("",20); add(inputBox); inputBox.addActionListeneitthis); targetNumber = (intX 1 + 100 * Math.random()); // randomO returns a number between 0 and 1 } public void actionPerformed(ActionEvent inputBoxEvent) { int number, String StringOfDigits; StringOfDigits = inputBoxEventgetActionCommand(); number = Integer.parselnt( StringOfDigits); if (number = targetNumber) outputBox.setText("Congratulauons! The number " + number + " is the winner"); else if (number < targetNumber) outputBox.setText("The number you entered " + number + " is too low "); else outputBox.setText("The number you entered " + number + " is too high ");

The output can be seen on Figure 3.

PKiiiR^^Si^^^^^^Biiii \

|

}

|GuessanumbBrfrom1to100

|Movebeloww[ththeminise.TypeH PressEnter

1

1

^Applet started.

1

!

!

]

hue number you entered 51 istoohigh [Move belowwlth the mouse. Type It PressEnter

j

L5!!

:

Applet started.

Figura 3. Asking the user to guess a number, and giving hints about the answer.

212 4.3

New keywords, constants, methods and operators

Math.random(), float, (int), Integer.parselnt(), operator + for strings. 5

Discussion

•

There are in total, [3], 41 keywords, constants and methods used in RISJ: import, public, class, extends, void, Graphics, fillRect(), fillOval(), Init(), setColor(), Color.red, super, Color.blue, setBackground(), repaint(), Color, int, getGraphics(), setXORmode(), SetSize(), for, while, Button, new, add(), setLabel(), TextField, SetText(), ShowStatus(), implements ActionListener, addActionListener(), this, actionPerformedO, (int), float, ActionEvent, String, getActionCommand(), equals, Math.random(), Integer.parselnt(). Using these few keywords, we think that one can illustrate most of the standard techniques introduced in a first programming course. Object oriented concepts were used from the beginning and user interaction is handled using the GUI: Texfields and buttons are all we need to input data from the user. They generate Action Events, which can be easily handled. Using the program source as input: When the student is working with the IDE, the simplest way of changing the behavior of the program is to change the values of the variables. This can be done from the Watch, or Inspector windows in the debugger, or by just modifying the source and recompiling. This is much faster than having to answer several questions of the type: "Please enter the value". Later, when they are more comfortable with the language, they can be taught how to change the values of variables using more traditional methods.

•

•

References 1. Bell D. and Parr M., Java for Students, Second Edition (Prentice Hall, New Jersey, 2000). 2. Deitel H. and Deitel P., Java, How to program, Third Edition (Prentice Hall, New Jersey, 2001). 3. Gutierrez A. and Somolinos A., Building up a Minimal Subset of Java for a First Programming Course, (to appear). 4. Horstmann C., Computing Concepts with Java 2 Essentials, Second Edition (John Wiley and Sons, New York, 2000). 5. Wu T., Introduction to Object-Oriented Programming with Java, Second Edition (McGraw-Hill, New York, 2000).

213 DEWDROP: EDUCATING STUDENTS FOR THE FUTURE OF WEB DEVELOPMENT JOHN BEIDLER Computing Sciences, University ofScranton, Scranton, PA 18510, USA E-mail: [email protected] There are many references supporting the Web's client-side (web browsers), a few references describing the Web's server-side, but there is little in the way of comprehensive support material on all aspects of website development. This paper describes the modifications being made to the Web Development course at the University of Scranton. The changes are based on the premise that the Web is an object-oriented client-server system for the dissemination and gathering of information. If the Web supports the dissemination and gathering of information, then a database is an appropriate repository for that information. The Web Development course is a junior-senior level course. It is being reorganized into three levels of presentation: (lj Introductory Part - Presents the fundamentals of client-side development, the Common Gateway Interface (CGI), and server-side programming. (2) Intermediate Part - Develops the material required to support server-side development, through the construction and delivery of virtual web pages using object-based reusable components and reducing defects by paying attention to process patterns. (3) Advanced Part - Present the fundamentals of web server to database interface for the delivery of virtual web pages. The approach is called DEWDROP, Database Enhanced Web Development with Reusable Objects and Patterns.

1

Introduction

The University of Scranton has offered a Web Development course since the 19961997 academic year. Initially, it was offered as a Special Topics course. From its inception the course covered the essentials of web development - client-side development, the Common Gateway Interface (CGI), and server-side software development. For the first three years I experimented with additional topics on various aspects of web programming. Because of my particular interest in software reuse, the construction of reusable resources has always been an integral part of the course. After the first three years we noticed that the course was having an impact on several other upper division courses because the course introduces students to approaches to programming and software development not normally developed in other courses including such topics as regular expressions for tokenizing strings, using hash tables to handle information, event driven programming, and practical software reuse experience. For example, the discussion of the Web as a set of Internet protocols and an introduction to security issues in the web development course introduces students to topics covered in depth in the Network

214 Communications Course. The Web Development course became extremely popular; almost all computing majors take this course, with the vast majority doing so in their junior year. During 1999-2000 the Web Development course made the transition from a Special Topics offering to a regular upper level course. As part of that process the department faculty discussed the positioning of this course relative to other courses. The course has a single sophomore level course as a prerequisite. The course was approved by the department, passed its review by the College of Arts and Sciences, and was approved by the faculty senate. The department specifically positioned the Web course before the Network Communications Course and the Database Course so that these three courses along with the senior capstone course could be the basis for a set of sequenced assignments leading to a comprehensive capstone assignment. 2

The PNA Project

In the 1999-2000 academic year, I received funding for a project to develop a prototype of a health sciences website in collaboration with a professor of Dietetics and Nutrition, Dr. Marianne Borja, at Marywood University. The project supported the development of a website called the Personal Nutrition Assistant Project (PNAP). The project helps diabetics and other individuals with a need to control their nutritional intake. The website used the USDA Nutrient Database for Standard Reference, nutritional information on over 6000 food items. Nonparticipants may access the system using the URLs, www.scranton.edu/pnap or www.marywood.edu/pnap. The website is currently utilized by eight local medical centers and is actively being developed. After putting the USDA database on our department's database machine and constructing a web interface using Perl 5's DBI module, I discovered that I had underestimated the ease with which a database could be used as the backend for a website. Following consultation with Dr. Yaodong Bi, the faculty member in our department who teaches the database courses, I realized that, with some effort, it was feasible to redesign the Web Development course in a way that gives the students a database driven website design experience. After further consultation with other members of our department we discussed methods of including database material in the Web Development course.

3

DEWDROP

The course will take students with little or no web development experience to where they are prepared to participate in the future of the Web utilizing database enhanced web development. At first it may appear that the amount of material I plan to cover

215 is too large for a typical three-credit course. Based on my experience with the web development course, I believe that the proposed collection of material can be covered by (1) keeping the course focused on its eventual goal, and by (2) making extensive application of software reuse. The course is achieved through a three-part presentation of the material described in the subsections below. A key element in the strategy of presenting this course is software reuse, which is not presented simply as a sound strategy for software development, but also as a means for delivering course material. Reuse is applied to both design patterns and software process patterns. 3.1

Introductory Part

The introductory material is a refinement of material developed over the last few years - an introduction to the client side, the CGI interface, and server-side programming. This material is covered in about three weeks. On the client side, time spent presenting HTML is kept to a minimum. Most students have previous HTML experience, however, all students have access to several on-line HTML tutorials. Web browsers are presented as containers for a pair of object models: the web browser's document object model and the object model for the Javascript interpreters that reside in web browsers. The course emphasizes how these two object models interact in different ways during the pre-load, onLoad, and post-load stages of a web page. The terms pre-load, onLoad, and post-load refer to the time frames surrounding the web page's onLoadQ event. I expect students to learn the basics of Javascript on their own. They have access to many on-line references, like the Javascript Tip of the Week website. The "Tip of the Week" site contains many useful Javascript examples, but the examples are not well packaged. This site is typical of many web sites in that they do not make good use of Javascript's object model. I use this opportunity to emphasize Javascript's object model and demonstrate how it may be employed to encapsulate resource in reusable js files. Normally, students complete two laboratory assignments and one regular assignment involving the encapsulation of Javascript resource. The construction of web forms leads to the CGI interface. The CGI interface is an excellent example of the need to follow standards and recognizing patterns. This topic is approached at three levels. The first two levels are discussed during the introductory part of the course; the third level is described in the intermediate part of the course. As part of the low level description of CGI, the web browser's encoding of the CGI interface is described and the resources required on the serverside to decode the information are introduced. Several artifacts to investigate the CGI interface are provided. One is an artifact, called formecho, that echoes back to the web browser a copy of the encoded string sent by the browser to the we b server. A modified version of Perl's cgi-lib.pl, a standard CGI's interface, is presented as

216

the second level CGI interface. The modified version of cgi-lib.pl includes an extension that supports off-line testing of server-side software using as input the encoded strings echoed back by the formecho artifact. This begins an important multi-step process pattern, described in detail the intermediate part of the course. The introductory part ends with an introduction to the resources essential for server side software development: 1. String processing features. 2. File and directory processing. 3. Access to environmental variables. 4. The ability to use other system resources (call programs). 5. Appropriate data structures, in particular, hash tables (associative memory). Although Perl is used because it provides access to these resources, other programming languages may be used as well. I have seen examples written in Tel, Java, Ada, COBOL, C, and C++. 3.2

Intermediate Part

The intermediate part plays an essential role in the successful delivering of material presented in this course. This part consumes the middle half of the course, about seven weeks. A key element in this part is the emphasis on software process, paying attention to how tasks are accomplished in order to avoid defects, or remove them as early as possible. Emphasis on the client side is on the web page document object model, Javascript's object model, and the interactions between them. This is developed by using a technique I developed to address some of the differences between the Javascript object models in Microsoft's Internet Explorer (IE) and Netscape's Navigator. Many of the conflicts between the IE and Netscape are addressed by employing a technique that is not well documented; namely, in both models, practically any item that can be addressed using typical dotted notation, ABC.xyz, can also be accessed as a hash, ABC["xyz"]. It is amazing how frequently Javascript that works on one browser and not on the other can be made to work on both browsers by replacing the problematic dotted notation with hash-like access. Here we present the third approach to the CGI interface using Perl 5's CGI module. This approach uses Perl 5's object model, presenting the interface as an object. In addition, CGI supports multipart forms, which include a file upload capability. The CGI interface is also used as the focal point for addressing cross platform development. One problematic software development scenario is one where software is developed on a Microsoft platform and the production website is on a UNIX platform. This scenario provides an opportunity to address process issues. Information on the CGI interface in the introductory part is extended from

217 off-line testing of CGI scripts, to testing a CGI script with a web server on the development machine, and finally moving the script to a production website on a UNIX platform. Students are required to maintain defect logs as they move through this three-stage process. They learn to recognize and correct defects in both their software and their development processes. This leads to the creation of support scripts to automate the process and further reduce defects. A key in the middle part of this course is selecting the right types of assignments that will prepare students for the advanced part of this course. It is relatively easy to develop interesting assignments that do this by combining the use of regular expressions, hash tables, and tab delimited files while giving the students more experience with both sides of web development. One good example is a concordance listing assignment, which uploads a file in specific programming language and constructs a framed web page that allows a person to browse a formatted and colorized version of the program that appears in one frame by clicking on the links in the concordance listing in the second frame. PHP is introduced at the end of the first half of the course. PHP is a scripting language that is placed in an HTML file. With PHP the developer can describe both the client-side actions and server-side actions in a single location, the web page. Server-side actions are described within process tags that are performed on the server-side using a simple method of executing server-side software called ServerSide Includes. As a result, the software developer has both the client-side processes and server-side processes described in one document, an HTML file. One of PHP's advantages is that it helps the software developer to distinguish between the roles of objects, their attributes, and representations of objects and attributes on both the client-side and the server-side. The result is the potential for reduced software development time and improved packaging of reusable software. 3.3

Advanced Part

This part covers the last three to four weeks of the course. Since the assumption is that students do not have previous database experience, the course appears to be limited as to what it can accomplish. I have discussed this issue with Dr. Yaodong Bi, the faculty member who teaches our database course, and we agree that the approach described here is feasible. The emphasis placed on tab-delimited files and hash tables in the intermediate part of the course leads naturally to database access. Several pre-defined databases are being considered, and a small set of SQL commands will be presented. The students' previous experience with hash tables will be used as a basis for explaining the SQL commands. A danger in this part of the course is to attempt to do too much. Remember, database experience is not a prerequisite for this course. However, the hash-tables-to-simple-databases analogy along with a small collection of SQL commands is sufficient to set the stage for using databases as the back end to a website.

218 Great care must be taken in developing this material, including the selection of the right tools and a good process. At least two choices are available, Perl 5's DBI module and PHP's database interface. Both support SQL commands. Since I have experience with Perl 5's DBI module, I want to parallel that experience with PHP and construct several laboratory assignments around both Perl and PHP and use Perl's DBI interface one year and PHP's the following year and perform a formal assessment to determine the relative merits of each approach.

4

Curricular Impact

By offering the Web Development course in the junior year, we expect that a very large majority of students will take the Network Communication course and the Database course before taking the Senior Projects course. We plan to develop several assignments that that will build on the web course and lead to possible senior projects. The result will be an opportunity to give students piece-meal assignments that could span up to three semesters. Another course that has been impacted by the web course has been the Programming Languages course. The extensive use made of hash tables and tokenizing using Perl's GREP-like regular expression capabilities has forced the instructor in the Programming Languages course to rethink several assignments. As a result, that course now covers a richer collection of languages, including the scripting language Tel. 5

Conclusions

Too often when I tell people that I teach a course in Web Programming for majors I encounter skepticism about offering such a course to majors. Usually it is given in the context of a statement like, "You teach a course in HTML!?" Needless to say, this is not a course in HTML. Web programming offers a unique opportunity to present multi-platform, multi-program language software development. However, the future of web technology lies not in HTML, it lies in the delivery of virtual web pages, pages constructed on demand to meet the needs of the client. That construction involves selecting information from a database and delivering the desired results in a useful format. This paper describes one approach to teaching web programming as a junior level course that does not have a database prerequisite, DEWDROP. To compress the course materials so that database web development may be taught the course makes extensive use of reusable software components packages using the object features in the various programming languages. Another alternative would be to teach the web course as a senior level course and having the database course as a prerequisite. We considered that possibility, but

219 it did not appear as attractive as the approach we are taking because it did not allow for the opportunity to have projects that could run as long as three semesters. In any case, the web course clearly demonstrates the power of two features that need more development, the use of a regular expression capability and the use of hash tables. Both give students a unique experience demonstrating the importance of having the right tools for the job. Finally, if you are teaching databases, the USDA Nutrient Database for Standard Reference, Release 13, http://www.nal.usda.gov/Jhic/foodcomp/Data/, is an example of a well constructed database with well designed tables. It is a real, non-contrived database that is ready for your use. Try it out. 6

Acknowledgements

I'd like to thank Yaodong Bi, Paul Jackowitz, Bob McCloskey, and Richard Plishka for their advice and suggestions as the Web Development course evolved, and as it continues to evolve. References 1.

2. 3. 4. 5. 6.

7. 8. 9. 10. 11. 12. 13.

Borja, Marianne, and John Beidler, "The Personal Nutrition Assistant Project", Proceeedings of the American Dietetics Association Conference, Denver, CO. October 17-20, 2000. Beidler, John, and Marianne Borja, "The PNA Project", Proceedings of the CCSCNE-01. Middlebury, Vt. April 2001. Goodman, Danny and Brendan Eich, The Javascript Bible (4th Edition). Hungry Minds, Inc. April 2001. Guelich, Scott, et al, CGI Programming. O'Reilly & Associates. July 1997. Hamilton, Jacqueline D., CGI Programming 101. CGI101 .com. February 2000. Heinle, Nick, and David Siegel, Designing With JavaScript: Creating Dynamic Web Pages (Web Review Studio Series). O'Reilly & Associates. September 1997. Kabir, Mohammed J., Apache Server Bible. Hungry Minds, Inc. July 1998. Kingsley-Hughes, Adrian and Kathie Kingsley-Hughes, Javascript 1.5 by Example. Que. January 11, 2001. Laurie, Ben, Peter Laurie, and Robert Denn, Apache : The Definitive Guide. O'Reilly & Associates. February 1999. Lea, Chris, et al. Beginning PHP4. Wrox Press Inc. October 2000. Medinets, David. Perl 5 by Example. Que. October 1996. Musciano, Chuck and Bill Kennedy, HTML & XHTML : The Definitive Guide. O'Reilly & Associates. August 2000. Ray, Erik T., Learning XML. O'Reilly & Associates. February 2001.

220

14. Schwartz, Randal L., et. al. Learning Perl (2nd Edition). O'Reilly & Associates. July 1997. 15. Thomson, Laura. PHP and MySQL Web Development. Sams. March 2001. 16. Wall, Larry, et.al. Programming Perl (3rd Edition). O'Reilly & Associates. July 2000.

221 B O O L E A N FUNCTION SIMPLIFICATION ON A PALM-BASED ENVIRONMENT

LEDION BITINCKA AND GEORGE ANTONIOU Department of Computer Science, Montclair State University, Upper Montclair, New Jersey 07043, USA E-mail: [email protected], [email protected], In this paper the problem of minimizing Boolean expressions is studied and an optimal implementation is provided. The algorithm follows the Karnaugh map looping approach. For the implementation C++ coding was used on the CodeWarrior for Palm Operating System environment. In order to make the overall implementation efficient, the object oriented approach was used. Two examples are presented to illustrate the efficiency of the proposed algorithm.

1

Introduction

It is well known that the Karnaugh-map (K-map) technique is an elegant teaching resource for academics and a systematic and powerful tool for a digital designer in minimizing low order Boolean functions. Why is the minimization of the Boolean expression needed? By simplifying the logic function we can reduce the original number of digital components (gates) required to implement digital circuits. Therefore, by reducing the number of gates, the chip size and the cost will be reduced and the computing speed will be increased. The K-map technique was proposed by M. Karnaugh [1]. Later Quine and McCluskey reported tabular algorithmic techniques for the optimal Boolean function minimization [2,3]. Almost all techniques have been embedded into many computer aided design packages and in all the logic design university textbooks [4]~[11]. K-map is a graphical representation of a truth table using Gray code order. It is suitable for elimination by grouping redundant terms in a Boolean expression. By optimizing the algorithm it is possible to simplify entirely a given Boolean expression. Unfortunately almost all the techniques along with the Espresso technique [12] do not always guaranty optimal solutions. In this paper a personal digital assistant (PDA) -based implementation is proposed for simplifying four-variable Boolean functions, using the K-map looping technique. The implementation is found to have excellent results.

222

The proposed PDA application is a useful tool for students and professors in the fields of computer science, electrical and computer engineering. It provides a fast and portable way to check and solve problems in digital logic, discrete mathematics and computer architecture courses. Also the proposed algorithm can be a valuable utility for the computer chip design industry due to the fact that it can be expanded to cover Boolean functions with more than four variables. 2

Algorithm

The proposed algorithm is based on the looping of redundant terms. Therefore in order to take a closer look on how to loop two, four or eight 1 's to get the smaller possible number of groups in a K-map table setting, consider the following simple example (lower case letter represent the complement value eg. a means complement of A): F = abed + aBcd + ABcd + ABcD + AbcD + AbCD

ab aB AB Ab

cd 1 1 1 0

cD 0 0 1 1

CD 0 0 0 1

Cd 0 0 0 0

Analyzing the above K-map table, the following looping observation can be made for each 1 present in the Table. • • • • • •

Cell abed - has one possibility to be paired, with aBcd aBcd - has two possibilities to be paired, with abed and ABcd ABcd - has two possibilities to be paired, with aBcd and ABcD ABcD - has two possibilities to be paired, with ABcd and AbcD AbcD - has two possibilities to be paired, with ABcD and AbCD AbCD - has one possibility to be paired, with AbcD

It is obvious that there are two cells that have one possibility to be paired, namely abed and AbCD, so they get the highest priority. These two boxes get paired the first, abed gets paired with aBcd and AbCD gets paired with AbcD. After these pairings the pairing possibilities of ABcd and ABcB are decremented by one, so leaving both of them

223

with one possibility to be paired. They get paired together finishing this way the optimization of the Boolean function resulting in three pairs. The observation reveals the presence of a consistency or rule that lies beneath this logic. Extending the described idea the following algorithm is derived for the optimal looping of l's in a K-map table. Step-1: Find and loop a possible octet. Step-2: Find and loop cells that have one possibility to pair. Step-3: Find and loop cells that have one possibility to quad. Step-3a: Repeat Step-2 for new cells with one possibility to get paired. Step-3b: Repeat Step-3 for new cells with one possibility to get quaded Step-4: Find and loop cells that have two possibilities to get quaded without sharing. Step-4a: If Step-4 fails because of sharing, choose one quad out of two, with less sharing. Step-4b: Repeat Step-4 until no quads found. Step-5: Find and loop cells that have two possibilities to get quaded with sharing. Step-6: Find and loop cell that have two possibilities to be paired without sharing Step-7: Find and loop cell that have two possibilities to be paired with sharing Step-8: Repeat Step-2. Step-9: If there are cells that have a value of one and are not quaded or paired, Then the cells a) Can't be paired or quaded, b) Have more than 2 possibilities to get paired or quaded. Step-9a: If a) is valid then do not consider this cell in Step-7. Step-9b: If b) is valid then change the possibilities of one of these cells to 2 and go to Step-3 to repeat the procedure. Step-9c: Repeat Step-7 until no cells that qualify for this step are found. It is noted that the pairing and quading possibilities of a cell are reduced by one when a pair or quad is looped and this cell can be paired or quaded with any of the cells of the pair or quad. 2.1 Design The program was developed using the CodeWarrior For Palm Os 7.0, which supports C, C++ and Java. The implementation of the program was done in C++ using its Object Oriented features. The program is divided into two major classes: a parent class (Kmap), and a child class (KmapElement). Each of these classes represent a logical division of the K-map table. The Kmap object controls everything related to the K-map table as a

224

whole, such as initializing and simplifying. In order to apply these functions the Kmap object creates 16 smaller objects representing each box. Each smaller object (KmapElement) learns its own properties and can only perform a function that includes the objects itself. Breaking the program down in this way is advantageous because the amount of complete, detailed, organized and correct information about the K- map is maximized. The parent object holds 16 children of the class KmapElement and administers the way the simplification methods are called. The child class KmapElement represents one element of the Kmap. This object has all the properties, such as: pairing and quading possibilities, the value of the box and the status of this box. These objects can learn about other objects of this class through their parent because they have a reference of the parent. 2.2 Program Flow As soon as an input is presented a Kmap object is created. As a result 16 children of class KmapElement are created and initialized. During the initialization phase each KmapElement gathers data about itself and its possibilities to be paired, quaded or octeted. The instance variables of this object are updated as soon as a pair, quad or octet is formed. After initialization, the Kmap object defines the way that the simplification is going to take place according to the presented algorithm. The actual pairing, quading, octeting and updating of the instance variables is done by the functions of the KmapElement class. The order in which these functions are executed is determined by the Kmap class because it knows the algorithm. The simplification happens only once, which means that there are no trials or secondary simplifications. All the functions in the KmapElement have options of choosing sharing and priority in any combination of the two. Each of the functions is executed only on objects that meet the requirements. In the case where the function completes its main task, it updates the instance variables of the neighboring cells that need to know the occurred looping. 3

Examples

Two salient examples, simple yet illustrative of the theoretical concepts presented in this work, follow below:

225 3.1 Example 1 Consider the following Boolean Expression: F = abed + aBcd + aBcD + aBCD + ABcD + ABCD + AbCD + AbCd

(1)

The following K-map table is generated.

ab aB AB Ab

cd 1 1 0 0

cD 0 1 1 0

CD 0 1 1 1

Cd 0 0 0 1

For each one on the table the following data can be collected: • • • • • • • •

abed aBcd aBcD aBCD ABcD ABCD AbCD AbCd

has one possibility to be paired has two possibilities to be paired has three possibilities to be paired, and one to quad has two possibilities to be paired and one to quad has two possibilities to be paired and one to quad has three possibilities to be paired and one to quad has two possibilities to be paired has one possibility to be paired

Following the presented algorithm yields, 1. 2.

3.

4.

There is no octets in the K-map table. abed and AbCd are paired with aBcD and ABCD respectively and the latter's pair possibilities are decremented by one. abed, AbCd, aBcD and ABCD are marked as done. The aBcD is looped as a quad. It is quaded with, ABcD, aBCD and ABCD. All these cells are marked as done and their quad possibility is decremented by one. No further cells of value of one that are not marked as done are left so all the cells have been included into pairs and quads. Therefore F = acd + AbC + BD

(2)

226 The above-simplified Boolean expression is the optimal solution for the given Boolean expression (1), having three terms. 3.2

Example 2

Consider the following Boolean expression: F = abcD + aBcD + aBCD + aBCd + ABcd + ABcD + ABCD + AbCD

(3)

Using a Palm PDA the following boxes are selected according to each term of (3).

Four VorioMes

cd cD CD Cd

abDEfnn aBDEfEfEf flB SfEf EfD (Simplify] flbDDEfn

.. In this case a quad is possible to be looped but according to the algorithm any quad of any type will not be looped before all the 1 's that have one possibility to be paired are looped. Therefore, • •

abcD has one possibility to be paired aBcD has three possibilities to be paired and one possibility to be quaded.

227

aBCD aBCd ABed ABcD ABCD AbCD

has three possibilities to be paired and one possibility to be quaded has one possibility to be paired. 1ms one possibility to be paired. has three possibilities to be paired and one possibility to be quaded. has three possibilities to be paired and one possibility to be quaded has one possibility to be paired.

According to the algorithm the following looping combinations can be obtained: ® abcD • aBCd • AbCD • ABcd

is paired with aBcD since abcD has one possibility to be paired is paired with aBCD since aBCd has one possibility to be paired is paired with ABCD since AbCD has one possibility to be paired is paired with ABcD since ABcd has one possibility to be paired F = acD + aBC + ACD +Abc

(4)

Using a Palm PDA aud pressing the "simplify" button the above derived result (4) is displayed in the following Palm screen.

228

The simplified Boolean expression (4) is the optimal solution for the given Boolean expression (3). 4 Conclusion In this paper an algorithm was presented to minimize a Boolean expression on a PDA. For the implementation C++ coding was used on the CodeWarrior for Palm environment. The .pre file, which is executable on a Palm PDA, is 54K and is available for download at: http://csam.monrclair.edu/~antoniou/bfs References 1. Karnaugh M., The map method for synthesis of combinatorial logic circuits, Trans. AIEE, Communications and Electronics, Vol. 72, pp. 593-598, (1953). 2. Quine W.V., The problem of simplifying truth tables, Am. Math. Monthly, Vol. 59, No. 8, pp. 521-531,(1952). 3. McCluskey E.J., Minimization of Boolean functions, Bell System Tech. Journal, Vol. 35, No. 5, pp. 1417-1444, (1956). 4. Gajski D. D., Principles of digital design, (Prentice-Hall, 1997). 5. Wakerly J.F., Digital design, Prentice-Hall, New York, 2000. 6. Hill F. J. and Peterson G.R., Computer aided logical design with emphasis on VLSI, )Wiley and Sons, New York, 1993). 7. Katz R.H., Contemporary logic design, (Benjamin/Cummings Publ, Redwood City, CA, 1994). 8. Mano M. and. Kime C. R, Logic computer design fundamentala, (Prentice Hall, New York, 2000). 9. Brown S and Z. Vranesic, Fundamentals of digital logic with VHDL, (McGrawHill, New York, 2000). 10. Hayes, J.P., Digital logic design, (Addison Wesley Publ., New York, 1993). 11. Chirlian P.M., Digital Circuits with microprocessor applications, (Matrix Publishers, Oregon, 1982) 12. Brayton, R.K., G.D. Hachetel, C.T. McMullen, and A.L. Sangiovanni- Vincentelli, Logic minimization algorithms for VLSI synthesis, (Kluwer Publ., Boston, 1984).

229

INTERNET-BASED BOOLEAN FUNCTION MINIMIZATION USING A MODIFIED QUINE-MCCLUSKEY METHOD SEBASTIAN P. TOMASZEWSKI, ILGAZ U. CELIK AND GEORGE E. ANTONIOU Image Processing and Systems Laboratory, Department of Computer Science, Montclair State University, Upper Montclair NJ 07043, USA E-mail: [email protected], [email protected] In this paper a four variable Boolean minimization algorithm is considered and implemented as an applet in JAVA. The application can be accessed on line since it is posted on the World Wide Web at the URL http://www.csam.montclair.edu/~antoniou/bs. After extensive testing, the performance of the algorithm is found to be excellent.

1

Introduction

The modified Quine-McCluskey (M Q-M) method is a very simple and systematic technique for minimizing Boolean functions. Why do we want to minimize a Boolean expression? By simplifying the logic function we can reduce the original number of digital components (gates) required to implement digital circuits. Therefore by reducing the number of gates, the chip size and the cost will be reduced and the speed will be increased. Logic minimization uses a variety of techniques to obtain the simplest gate-level implementation of a logic function. Initially Karnaugh proposed a technique for simplifying Boolean expressions using an elegant visual technique, which is actually a modified truth table intended to allow minimal SOP and POS expressions to be obtained [1]. The Karnaugh or K-Map based technique breaks down beyond six variables. Quine and McCluskey proposed an algorithmic-based technique for simplifying Boolean logic functions [2,3]. The Quine-McCluskey (Q-M) method is a computer-based technique for simplification and has mainly two advantages over the K-Map method. Firstly it is systematic for producing a minimal function that is less dependent on visual patterns. Secondly it is a viable scheme for handling a large number of variables. A number of methods have been developed that can generate optimal solutions directly at the expense of additional computation time. Another algorithm was reported by Petrick [4], This algorithm uses an algebraic approach to generate all possible covers of a function. A popular tool for simplifying Boolean expressions is the Espresso, but it is not guaranteed to find the best two-level expression [6]. In this paper an Internet based implementation is proposed for simplifying two to four-variable Boolean functions, using a Modified Quine-McCluskey (M Q-M) method. The M Q-M technique is implemented as an applet in Java, and can be accessed on line since it is posted on the World Wide Web. Due to the algorithmic

230

nature of the technique the proposed method and its implementation easily can be expanded to cover more than four variables. The main difference between the proposed algorithm and Q-M method starts when Q-M method groups the elements according to the number of one's in each element, but in the proposed algorithm grouping is not required. In the following steps the M Q-M follow Q-M up to the first step of the prime implicant table, which is identifying the essential prime implicants. For the next step Q-M uses several different techniques to eliminate the implicants efficiently. The M Q-M method simulates the elimination process of minterms and finally when the most efficient combination is reached it is taken out from the table. In the following section the algorithm is presented. 2

Algorithm

The M Q-M algorithm is presented using the following step-by-step approach. I. Input: 1.1 Enter the input of the Boolean expression either into the K-map, Truth Table, or as a Boolean expression. 1.2 Obtain the binary representation of each term from the inputted data. II. Calculations: 2.1 Compare each of the terms among themselves in order to find the terms that are logically adjacent. The following rules have to be followed when combining the terms: a. Combine the two terms only if they differ by only one bit. b. Once there are two terms that differ by one bit, create the new term with the same exact bits or characters, except replace the bit that is different in both of those terms to "-" symbol. c. Once done creating the new term mark both the old terms, indicating that both of the terms are combined. 2.2 Swap all of the combined terms (new terms) and terms that weren't combined at all. 2.3 Repeat steps 2.1 and 2.2 until it is impossible to combine the terms. III. Table: 3.1 Make sure that there is only one term alike. That is get rid of a term if it is a duplicate of another term in the content. 3.2 Create a prime implicant chart. 3.3 Identify the essential prime implicants and consider them as the first terms, which will make up the result. After each implicant is put into the result term area, the implicant chart should be updated.

3.4 If there are any more minterms left over, proceed as the following: a. Look into the prime implicant chart for the implicants, which have the exact same minterms and eliminate the one that is less efficient. b. Try selecting out one of the terms and see if the term will cancel out all of the implicants or not. c. If it cancels out all of the implicants, put the term back into the result term area. d. If it doesn't cancel out all of the implicants repeat step b, with the higher combination of the terms to be taken out. IV. Display: Display the values out of the result term area. In the following section a step-by-step example is given illustrating the proposed technique. 3

Example

Simplify the following Boolean function F = abed + abcD + aBcd + AbCd Applying the algorithm we have, I. Input: 1.1 Input the expression in either way as shown in Fig 1. 1.2 In order to obtain the binary representation of the terms, you will have to know that, lower case letters such as "a"," b", "c", and "
//First term //Second term //Thirdterm //Fourthterm

232

1/f-

Input Vr.ur Bmarji Fxpres;ion And Pre-.t Enter

abca

1*

\sbcc + abcD + aEkd + Ab£d

abcD

t?

abCD

r

abCd

f"

aBcd 15 aBcD

T

iBCD

T

aBCc

r

ABcd

T

ABeD

F"

A3CD

f'

ABCd

r

Abed

f"

AbcD

f'

AbCD

f~

AbCd

fi?

ecl

cD

ab

1

1

0

0

aB

1

0

0

0

AB

0

0

0

0

Ab

0

0

0

1

CD Cd

Clear Incut j

Fig. 1: Boolean function input II. Calculations: 2.1 and 2.2 According to the rules of combining the terms, the first term can be combined with the second and the third term. Therefore the result of combining is as follows: Old Terms XOOOO X0001 X0100 1010

New Terms 0000-00

Swapped Terms 1010 0000-00

The result of combining the terms creates only two new terms, but after swapping them we have three terms. The reason for this is because there was one term among the old ones that was not combined with any of the other terms. That is why X does not mark the term 1010.

233

2.3. Due to the reason that swapped terms cannot be combined any further, steps 2.1 and 2.2 are not repeated. III. Table: 3.1 Looking at the result of combining the terms and swapping, the following terms are present. 1010 0000-00 Since there is no any other repeating term, we can just skip this part of the method. 3.2 Prime Implicant Chart: 0000

0001

0100

1010

X

1010 000-

X

0-00

X

X X

Table 1: Prime implicant chart The prime implicant chart is created to indicate what given terms on the beginning were combined to create the resulting terms or the minterms. For example, the term 1010 wasn't combined with any of the other terms, which is why there is an X under 1010. On the other hand, the last two terms, 000- and 0-00_ were combined with the other two terms. The term 000- was combined with the term 0000 and 0001, which is the reason that there are X's under those terms in the corresponding column. 3.3 The next step is to find the essential prime implicants. These are prime implicants that cover minterms, which no other prime implicants cover. In our example the essential prime implicants are all prime implicants. 3.4 After the previous step there are no minterms that are not covered. IV. Display:

234

Initially the given Boolean expression had the form, 0000 + 0001+0100+1010 which is equivalent to (1). After the application of the procedure the following simplified expression is derived. 1010+ 000-+ 0-00 or F' = AbCd + abc + acd

(2)

The above expression (2) is the optimal solution for the given Boolean function (1).

««

i/

'i--i_L

•

r

Fr'd i strH . obci~T'-J

ciO

cd

cD

ab

1

I

aB

1

0

AE

0

0

Ab

0

1 1

CD Cd

aA'a '0C.-1

i/

.lrir.t

•w

•j

0 0

—1

c

0 1

=!£"::•

li-evii.luc .'-LICL'-"

A.- .:> « 0

Hs-!e 't i h i Peru i

Fig. 2: The Boolean function result

235

It is noted that the same result (2) is given in Fig 2., using our on-line implementation. 4

Conclusion

In this paper, a new modified Quine-McCluskey algorithm for minimizing Boolean expressions is proposed and implemented. The application was implemented in Java and can be accessed on the Internet. The results of this paper can easily be extended to cover more than four variables. This application can be an aid for students and professors in the digital logic design courses and a valuable tool for the digital logic designers. 5

Acknowledgements

The authors would like to thank Prof. Carl Bredlau of the department of computer science, Montclair State University for his valuable advice on JAVA programming. References 1. Karnaugh M, The map method for synthesis of combinatorial logic circuits, Trans. AIEE, Communications and Electronics, Vol. 72, pp. 593-598, (1953). 2. Quine, W.V., The problem of simplifying truth tables, Am. Math. Monthly, Vol. 59, No. 8, pp. 521-531, (1952) 3. McCluskey E.J., Minimization of Boolean Functions, Bell System Tech. Journal, Vol. 35, No. 5, pp. 1417-1444, (1956). 4. Petrick, S.K., On the minimization of Boolean functions, Proceedings of Western Joint Computer Conference, pp. 103-107, (1959). 5. Katz R.H., "Contemporary Logic Design", (Benjamin/Cummings Publishing Company, Redwood City, CA, 1994).

Learning Algorithms

239 A U T O A S S O C I A T I V E N E U R A L N E T W O R K S A N D TIME SERIES FILTERING JOSE R. DORRONSORO, VICENTE LOPEZ, CARLOS SANTA CRUZ, JUAN A. SIGUENZA * Depto. de Ingenieria Informdtica e Instituto de Ingenieria del Conocimiento Universidad Autonoma de Madrid, 28049 Madrid, Spain E-mail: [email protected] Autoassociatine neural networks have been used for d a t a compression and filtering. As with any other application, optimal network structure has to be decided. In this work we show how t o select optimal architecture and output parameters when linear autoassociative networks are used to filter white noise added to univariate time series. We also give a numerical illustration of the resulting procedures.

1

Introduction

Autoassociative neural networks (AAN), that is, networks where input patterns and targets coincide, are widely used for tasks such as data compression and dimensionality reduction [1]. They can also be used for filtering purposes, with the network outputs taken as filtered versions of the corresponding inputs. This is a natural approach for multidimensional input patterns, but it implies that unidimensional inputs have to be vectorialised somehow before they can be taken as AAN inputs. For univariate time series this is easily and naturally done by using time delays, and suggests AANs as a natural tool for series filtering. More precisely, suppose we have a noisy series xt = zt + nt derived from the addition of stochastic noise Nt to a clean series, which we will assume to be given by an integer-indexed stationary stochastic process Zt. Using for convenience an odd number 2M + 1 of delays, we can define a 2 M + 1 dimensional vector Xt = (xt-M,..., £t, • • •, xt+M)T, i-e., {Xt)j = Xt+j, T —M < j < M (A denotes the transpose of A). When Xt is feed into an AAN, its outputs yt can then be considered as a filtered version of Xt. However, although the original series xt was unidimensional, the 3^t provide now not a single filtered series but actually 2M + 1 of them, (y^M),..., (j/°),..., {Vt1), where we denote by j / ^ the value at position k, —M < k < M, of iVt-fc- Considering the number 2M + 1 of delays to be fixed, two problems are to be solved: the optimal choice of the network architecture and the selection of the "best" filtered series yk or, equivalently, the best filtering output. *WITH PARTIAL SUPPORT FROM SPAIN'S CICYT, GRANT TIC 98-247

240

In this work we will give theoretical answers to these questions in the case of linear AANs. Their architecture is given by an (2M + 1) x L x (2M + 1) network with 2M + 1 inputs and outputs and a single hidden layer with L units. The optimal parameters L and k will be characterized in the next section in terms of the square error between the processes Yk (see below) and that of the clean series Z. In turn, this error can be expressed in terms of the square error between Yk and X and the value a^, of the noise variance. The practical use of these facts requires thus an estimate of afj. We shall provide one in the third section and give also its error with respect the true noise variance. The results of these two sections are given without proofs, that will appear elsewhere. Finally, the resulting procedures will be illustrated on a numerical example.

2

Optimal network parameter selection

We will assume throughout this work that E[Z], and hence E[X], are 0. If we have N' samples xt of the process Xt = Zt + Nt, we can define N = N' — (2M + 1) + 1 delay coordinate vectors Xt as before. Once network training is finished, the network transfer function is given [3] by yt = J2i=i(XtUl)Ul, where Ul are the eigenvectors associated to the L l largest eigenvalues X x of the (2M + 1) x (2M + 1) matrix S ^ = X T X/iV, with X = (Xi,X2,..., Xx)T• £ M is approximately equal to the sample autocovariance matrix Y^ = (t'x)ij = (7kLji)> w^ith 7 * the k-th. sample autocovariance of X. Although the above eigenvalues and eigenvectors do depend on the concrete M value used, we will assume it fixed and drop the M index accordingly. Notice that the eigenanalysis of T^ has many applications to filtering problems [4]-[6]. Each series yk is thus given by

L

vkt = (yt-kh = YlW-k&K 1=1

L

=E

/

M

\

E *t-k+j*i si.

1=1 \j=-M

(!)

J

where ulk denotes the fc-th component of the l~th eigenvector Ul. We will work for simplicity with the underlying processes Z, X and TV instead of the sample values. A natural tool to characterize the optimal parameters L, k is the error estimate e^{Z, Yk) = E[\Z- Yk\2] between the clean process Z and each of the processes Yk derived from Xt through the time invariant filters given by the process version of (1), that is,

241

fe

L

I

M

\

k+M

( L

44-s ^ = E E *«-*««$ K ==fe-M E V(= E l '=1

Xt-..

(2)

\j=-M

Here u^ denotes the A;-th component of the l-th eigenvector of the true autocovariance matrices Tx or r|f. Notice that by our assumptions, Tx = r^f+0jv/2Af+i, w i t n ^ c being the KxK identity matrix. Thus, A^- = A^+CT^ and Tx and T% have also the same eigenvectors. The following result is then true. Proposition 1 In the above conditions,

eL(Z,Yk)

*l-^r(\lz-o%)(ulf.

=

(3)

;=i

As an easy consequence, we have the following. Corollary 1 The optimal L is the largest value such that for all I, 1 < I < L, Xlz > a% or, equivalently, Xlx > 2a%. Notice that from (3), this ensures the minimization of ej,{Z, Yk) for all k. For practical purposes, however, the error estimate CL{X, Yk) = E[\X — Yk\2] is more convenient than (3). It can be shown that Proposition 2 With Z, X and Yk as before,

eL{X,Yk)=o\-Yj\lx{ulk?

(4)

i=i

Moreover ei,{X,Yk)

and ei,{Z,Yk)

eL(Z,Yk)

are related as l

= eL(X,Yk)

+
(5)

i=i

Now, once L has been chosen, equation (5) tells us to select k = k(M, L) as k(M,L)

= arg mm{eL(Z,Yk)

: -M

< k < M} •

k

= arg min{e L (X, Y ) + cr

2

L

2£(4)

N

.

i=i

2

: -M
M}.

242

In practice, we can replace the theoretical Ul and Xlx by their sample based approximations. Moreover, if we have an estimate a2^ of the noise variance we can then estimate the optimal L and k values as follows: 1. We choose as L the largest V value such that for all I, 1 < I < L', Xlx > 2&%.

2. We select then k as k = arg min{eL(X,Yk) -M
+ a2N 2 £ f = 1 ( u ^ ) 2 - 1

M}, now with eL(X, Yk) = &x - £ f = 1

Xlx(ulk)2.

We will denote the resulting optimal estimate of E[\Z — Yk\2 as e(M). In the preceding we can use any noise estimate. However, the same ideas leading to the previous results can be used to get such an estimate. We show how next. 3

Noise variance estimates

Notice that a natural choice for ) = t f c - * " Y.?=-M fyiiu. We will argue next that \Hk{w)^ approximately acts as a 0-1 valued function. More concretely, a good approximation of the eigenvalues Ul can be obtained [2] by either one of the unit vectors

y/2M + l + DM{2cj*) (COB(-MW'), COB((-M

SOO =

+ l)a/*),..., 1 , . . . , cos(Mu;*))T,

V2 /

V 2M + l - £ > M ( 2 w * ) (sin(-Ma;*), s i n ( ( - M + l ) w * ) , . . . , 0 , . . . , sin(Ma;*)) T ,

where DM(ij) = s i n ( M + l/2)a>/sin(o;/2) denotes the well known Dirichlet kernel and u* is an appropriately chosen frequency (the coefBcients appearing in front of these vectors normalize their length to one). It thus follows that, if Ul ~ C(ul,M) for an appropriately chosen UJ1'M frequency, that is, if ulk ~ cos{k(jjl'M) up to a normalizing constant, the frequency response component Hlk of Hk can be approximated by

243

2cos(fca/' M )e- ifc '"

__,, .

^ '

lM

iju

j=-M

l , JMf)e ,„-to P « ( " - ^ ' M ) + ^ = cos(fca/' 2M + 1 +

(" + <J-M)}

DM(2UJ1>M)

and a similar formula holds when we have instead Ul ^ S(ul'M). Given the behavior of the Dirichlet kernel, \Hlk\2 acts as narrow band filter, letting only pass those frequencies near ±u>l,M. This behavior extends to that of the full eigenfilter frequency response H%(UJ), which in practice verifies that I'Hfc^)!2 — 1 near ±ul'M, 1 < I < L, while being close to zero away from them. In other words, \H^ I2 shows a near 0-1 response, and can thus be assumed to be essentially concentrated in a frequency range of measure ^v\H^{u})\2duj. This integral can be shown to be equal to 2ir^2i(ulk)2. Therefore, the measure of the region of [—n, w] outside the "support" of H% can be taken to be 27r(l — J2\ ( u l) 2 )- These ideas suggest the following noise variance estimates

a%(M,L,k)

=

* f |PXH|(1 - | ^ M | 2 ) ^ , 2 2*(1-Ei=i«) )-'-*

(6)

These a2sr{M, L, k) actually overshoot the true noise variance, for we have the following. Proposition 3 The estimate a%(M,L,k) can be written as

^

(MLk)-

'Z-'ZLMO2 1

_a,

4-Ef^K)2

- Ej=i«)2

i - Ei K)2

Notice that for fixed M, these a2^ estimates depend again on L and k while, in turn, we want to use them to obtain the optimal L, k. To avoid this circularity and because of the overshooting observed, in our next section's illustration we will select first an M dependent estimate a%(M) as a%(M) =

mina2N(M,L',k'), k',L'

which will be the value actually closer to
244

noise % 10 20 40 70

opt. M 4 3 2 3

opt. L 4 3 2 2

est. L 2 2 1 1

opt k 1 2 0 0

est k 1 2 0 0

% noise removed 19 44 58 70

Table 1. Estimated M values for an AR 1 process at different noise levels (col. 2), its associated optimal and estimated L values (cols. 3, 4) and optimal and estimated k output indices (cols. 5, 6). Column 7 shows the percentage of noise removed.

4

A filtering example

We shall illustrate the above procedures on a sample series derived from an autoregressive (AR) process of order 1. No comparisons will be made with other methods: despite their simplicity, the filters obtained have reasonably good noise reduction capabilities, but there are several ways in which they can be enhanced. The general form of an AR1 process is zt+i = 4>zt + at, where A — (at) is independent white noise of a certain variance a\ and its power spectrum is p(w) = 0^/(1 — 2(f>cosuj + (j>2). Since this spectrum has a minimum value of cr\/(l + )2 at ui = n, additive noise with a variance a% below this minimum cannot be removed by the above procedures. In fact for such a noise, we should expect Xlz > a2N for all I, and all the 2M + 1 eigenvalues of Y™ should be taken in the filter. Since a\ = £ } i = 1 A'z(uj.)2 and 1 = ^ L i + 1 ( u i ) 2 ' ^en ei,(Z,Yk) — a2N and no filtering effect would take place. We will work here with an AR1 process with aA = 1 and 4> — 0.9. The smallest removable noise variance is thus about 5.3 % of the clear signal variance a\ = (T\/(\ — <j)2) ~ 5.26. Figure 1 has been derived adding to the base AR1 series gaussian white noise with variance 20% of the clean signal's one. This gives a SNR value of 5, or about 6.95 Db. The figure shows for M values between 1 and 20 the evolution of the corresponding estimates for the optimal value e(M) of the error E[\Z - Yk\2} (dotted line) and its estimate e(M). It also shows (as the top line) the successive noise estimates a ^ ( M ) . Since any of them actually overshoots the true noise variance, a new value of a%(M) has been retained only if it was smaller than the previous one; hence the decreasing values observed in the figure. The straight line shows actual noise variance a2N of about 1.10. Table 1 shows for noise levels going from 10 to 70% of the signal's variance the optimal and estimated L and k values. Their M values have been obtained as those giving a first sharp minimum of

245

e(M) = min^fc e/,(Z, Yk). Taking for instance figure 1, it suggests an optimal filtering width of 7 = 2 x 3 + 1, corresponding to M = 3. The 20 % line in table 1 suggests an optimal L of 3, while its estimation by the above procedures is 2. In other words, the optimal filter (i.e., the one derived from the theoretical eigenvalues and eigenvectors and the true noise variance) should use the 3 first eigenvectors while our sample based procedures suggest 2 eigenvectors. On the other hand, the optimal and estimated outputs have a common value of 2, which corresponds to a one step delay. The noise reduction achieved by this filter is 2.5 Db, about 44% of the initial noise variance. 5

Conclusions

In this paper we have theoretically characterized the optimal architecture of a linear autoassociative filter for one dimensional time series, and shown how to chose the best filtering output component of such a network. Although simple, the resulting niters have good noise reduction capabilities, that will be further improved in future work along two distinct directions. The first one is concerned with non linear extensions of the autoassociative networks discussed here. The second will retain the present linear structure, but combining simpler, one hidden unit networks with a previous multiresolution decomposition of the signal to be filtered. References 1. Diamantaras, K.I., Kung, S.Y., Principal Component Neural Networks: Theory and Applications, John Wiley Publ., (1996). 2. Brockwell, P.J., Davis, R.A., Time Series: Theory and Methods, Springer Verlag, (1991). 3. Baldi, P., Hornik, K., Learning in Lineal Neural Networks: A Survey. IEEE Transactions in Neural Networks, 6, 837-858, (1995). 4. Haykin, S., Adaptive Filtering Theory, Prentice Hall, (1996). 5. Pisarenko, V.F., The retrieval of harmonics from a covariance function, Geophysics Journal Royal Astronomical Society, 33, 347-366, (1973). 6. Tufts, D.W., Kumaresan, R., Estimation of frequencies of multiple sinusoids, Proceedings of the IEEE, 70, 975-989, (1982)

246

0.8 -

0.6 -

Figure 1. Evolution for M between 1 and 20 of e(M) (continuous line) and e(M) (dotted line) for an A R l signal to which 20% noise has been added. The top curve shows noise variance estimates, and the middle straight line true noise variance.

247 NEURAL NETWORK ARCHITECTURES: N E W STRATEGIES FOR REAL TIME PROBLEMS

A.

UGENA, F. DE ARRIAGA,

M. EL ALAMI

Universidad Politecnica de Madrid, Escuela T.S. de Ingenieros de Telecomunicacion Ciudad Universitaria s/n, 28040 Madrid E-mail: farriaga @mat. upm, es

There are a few rules to design neural networks most of them coming from experience. In the case of functional-link neural networks it is even worst because they are not well known in spite of their advantages which make them suitable for real time problems. To void the intuition-inspired design some strategies related to very well known mathematical techniques are proposed. They have been used for the solution of a real time problem: speech recognition.

The obtained results, from which only a sample has been included, show drastic reductions in the iteration number to reduce the error under a certain bound and an increase of the learning rate.

1

Introduction

Artificial neural networks (ANN) which can aid the decision process by learning from experience, are a suitable procedure to solve the function approximation problem in many well known applications. But several others, such as image processing, speech recognition and control on real time, demand efficient and rapid function approximation even in cases where the analytic expression of the function is unknown, although some function values could be known. But even when we decide to use AAN, some further decisions have to be made concerning the neural network type and the strategy of use or the specific network model. As far as the neural network type is concerned, if we concentrate on supervised learning, the multi-layer perceptron has been widely used in the literature in problems which do not have critical restrictions on time. In many real-time applications the multi-layer perceptron solutions are not appropriate due to the low learning rate and the big number of input-output pairs needed for training. In order to get rid of those drawbacks we have introduced functional links among the nodes of the neural network according to Pao [1]. The functional-link technique allows the incorporation of a set of functions {fo> fi. ••» fn } to each node under the name of functional expansion. That way when the node k is activated producing the output Ok, we also get

248 { fo(Ok), fl(Ok),.., fnCOk) }

as additional node outputs. The set of functions, if they are linearly independent, has the mission of increasing the output space dimension, producing the faster obtainment of the pattern separation hyperplanes. The set of chosen functions can be applied to the node output, as we described on the previous paragraph, and/or to the node input. The difference matters in the case of the input (first layer) or output (last layer). As we will show, the advantages of the different functional expansions will be decisive to choose the appropriate network model in connections to real time applications. As far the functional link method is concerned, it has to be emphasized that according to Sobajic [3] it is always possible to solve supervised learning problems with AAN without hidden layers by means of this method. 2

Theoretical background

It can be shown [4] that for continuous and piecewise continuous functions, functional link neural networks are universal approximators, that is to say : any piecewise continuous function can be approximated with error less than a chosen bound, by means of a functional link neural network without hidden layers in any real interval. 3

Main Strategies for Using Functional-Link Neural Networks

Among the strategies we have set up for using functional-link neural networks in real-time problems we would like to mention those related to known mathematical techniques [2]: 3.1 Lagrange's Neural Network This model follows Lagrange's philosophy of the interpolating polynomial or the elementary polynomials. Let fj = Cj (X - Xi )

( X - Xj.j ) ( X - X i + 1 )

be the set of elementary polynomials, such that f i (x i ) = l,f i (x j ) = O i f i ^ j Lagrange's interpolating polynomial will be fn* = Zp(Xi)fi (x)

( X - Xn )

249 where p(i) are the known values of the unknown function, and x;, i= l,..,n are chosen points of the independent variable. The set of elementary polynomials plus the constant function f0 = 1, will be chosen as the functional expansion applied to the input layer nodes. There will be no hidden layer and only one single node in the output layer. In consequence, the full set of increased inputs will be: { x b x2, x3,...,xn, f)(x), f2(x),

, fn(x) }

The net output will be expressed by O = F (Sxi*Wj + 0 ) where F is the activation function, Wj are the weights or network coefficients, x; are the real inputs and 9 is the threshold. If the weights related to X; are chosen equal to zero and F is the identity function, we get Lagrange's polynomial and the weights, after the net training, will coincide with the polynomial coefficients. 3.2 Other strategies: Taylor's, Newton, Mc.Laurin's, Fourier's,.., Neural Network Following a similar approach we can device many other strategies. In the case of the Taylor's Neural Model and supposing that the specific point for the development is 0, we will use the following set of functions: fO = 1; fl = (x-0); f2 = (x-0)2/2;

; fn = (x-0)"/ n i

The net can be trained according to the explained procedure. With this method we get not only the function approximation but also the derivatives at a certain point, because the final network weights are the derivatives at the point defined by the first pattern. Therefore, if the results can be expressed as: P(x) = f(0) + f (0) (x-0) + f'(0) (x-0)2/2 + then

f(0) = wO; f (0) = wl;

+ f* (0) (x-0)7nl ; I01 (0) = wn

being wO, wl,..wn the weights associated to fO, fl,..fn. Similarly for the remainder models. 4. Phoneme recognition Data for these phoneme recognition experiments were obtained from 100 continuos voice utterances ( ordinary conversation ) of different speakers, digitised at the rate

250 of 16 Khz. Because of that, two sources of noise were introduced with the phonemes: the consonant joined to the vowel, and the influence of adjacent phonemes. The patterns were extracted from the spoken sentences, parameters were obtained with the Matlab format and a total of 350 patterns for each vowel was available for the experiments. First of all we have considered sine and cosine expansions (Fourier's model) with the following options: a) sin (TIX ), cos (nx ); 24 expansions b) sin (7tx ), cos ( roc), sin (2rcx ), cos (2rcx ); 48 expansions c) sin ( nx ), cos ( JCX ), sin (2roc), cos (2nx ), sin ( 3rcx ), cos ( 3icx );72expansions d) up to 120 expansions. The results are as follows: recognition rate: (a) 85.1, (b) 88.7; (c) 89.9; (d)91.2 error: (a) 10'2; (b) 105; (c) 10"6 ; (d) 10"8 As the second possibility we have used a finite set of the Taylor expansion. In our particular problem we have used the following expansions: a) (xi - x) and (xi - x) related to the first 12 coefficients; 24 expansions b) (xi - x) and (xi - x)2 related to the 25 coefficients; 50 expansions c) (xi - x) and (xi - x)2 related to the 25 coefficients, and (xi - x)3 related to the first 12 coefficients; 62 expansions d) using terms up to order fourth with a total of 74 expansions; in this case the network cannot recognise. The results are the following: rate of recognition: (a) 90.6; (b) 91.2 ; (c)92.2 error: (a) 10; (b) 1; (c) 10"' The third possibility we have contemplated has been the Mc.Laurin's development with the following options: a) x2 and x3 of the first 12 coefficients; 24 expansions b) x2, x3 and x4 of the first 12 coefficients; 36 expansions c) x2, x3, x4 and x5 of the first 12 coefficients; 48 expansions d) x2 and x3 of the first 25 coefficients, x3 and x4 of the first 12 coefficients; 74 expansions. The rate of recognition reaches 93.7, the highest value so far obtained, corresponding to option d). Fig. 1 and Table lshow the variation of error with training and the rate of recognition for Newton's model; Fig. 2 and 2.1 for Lagrange's model. Table 3 gives the comparison among different models and, finally Table 4 gives details of the rate of recognition for the multilayer perception. 5 Related work and comparison Waibel [8] uses feed-forward neural networks for the approximation of functions. Sadaoki Furui [7] deals with the problem of the speaker recognition which is

FUNCTIONAL-LINK AND SPEECH

0

200

400

600

800

1000 1200 Epocas

1400 1600

1800 2000

Figure. 1

a e i 0

u

a 94 2 1,5 0,5 2

e 2,5 87.5 3,5 4,5 2

i 0,5 2 96.5 1 0 Tablel

Error ds la Red

104

s 10' ^10°

*w__ n

lio* 110 J a

iir*

\ m BOO

Figure 2

1000 1200 1400 Epocas

S 0.925 E 3. 0.92

u 0,5 1,5 0 3 95

0

1,5 0.5 2 91.5 4,5

1600

1600 2000

f*^*-^**\

/ r—. ~***~^S

NX

K

Ampliaciories

\ \

/ .

252

Enhanc. Error Rate% epoch Operati.

a e i o u

a 90 1 1 5.5 2.5

Trig. 120 0.02 91.2 2000 13.15x10 s

e 0 95,5 2,5 1,5 0,5

Newton 75 5.5 93 2000 9.52xl0 8

i 0 4,5 93,5 0,5 1,5

Lagrange 84 4.82 92.52 2000 8.36x10 s

0

3,5 2 3 85 6,5

u 1 1,5 4 7 86,5

different from ours; he also uses text-independent recognition methods obtaining lower rate of recognition. D. Charlet and D. Jouvet [6] have also studied the speaker recognition problem; they have used a text-dependent speaker verification system and the best results they have obtained were with a genetic algorithm; their error levels are higher than those we have obtained. 6 Conclusions From the results so far obtained it can be stated that functional-link neural networks are most suitable for a sort of problems needing the reduction of the training period, the reduction of the error level or of the computing time for solution. Among those problems the phoneme recognition is one which appears adequate for that technique. The results obtained with polynomial expansions, such as Fourier, Taylor and Mc.Laurin developments show important improvements in relationship to those got with the multilayer perceptron, specially in the value of the rate of recognition and error levels. References 1. Pao, Y., Adaptive Pattern Recognition and Neural Networks. Addison-Wesley, 1989. 2. Amillo, J. , Arriaga, F., Andlisis Matemdtico con Aplicaciones a la Computacion McGraw-Hill, 1987. 3. Sobajic, D., Neural Nets for Control of Power Systems. Ph.D. Thesis. Computer Science Dept. Case Western Reserve University, Cleveland, OH., 1988. 4. Ugena ,A. Arquitectura de Redes Neuronales con Ligadura Funcional. Ph.D. Thesis. Departamento de Matematica Aplicada, Universidad Politecnica de Madrid, 1997.

253

6. Charlet, D. And Jouvet, D. Optimizing Feature set for Speaker Verification. Pattern Recognition Letters 18. Elsevier Science B.V. 1997. 7. Furui, S. Recent Advances in Speaker Recognition. Pattern Recognition Letters 18. Elsevier Science B.V. 1997. 8. Waibel, A. . Neural Networks Approaches for Speech Recognition. Ed. Prentice-Hall. 1991.

255 E V O L V I N G S C O R I N G F U N C T I O N S W H I C H SATISFY PREDETERMINED USER CONSTRAINTS

MICHAEL L. GARGANO AND YING HE School of Computer Science and Information Systems, Pace University, New York, NY 10038, USA E-mail: [email protected], [email protected] WILLIAM EDELSON Department of Computer Science, Long Island University,, Brooklyn, NY. 11201, USA E-mail: edelson@hornet. liunet. edu A scoring function assigns non-negative values (i.e., scores) which help evaluate various items, situations, or people. For example, a professor would like to assign point values to each problem on an exam that was recently administered to the students in his/her class. The professor demands a minimum point value (which may be different for each problem) while the remaining points can arbitrarily be apportioned and added to each problem. After grading each problem for each student on a scale from 0.00 to 1.00, the professor would like the remaining points apportioned so that a specified grade distribution is attained. We propose a GA (i.e., genetic algorithmic) solution to this problem (and other related problems, e.g., loan scoring and personnel hiring).

1

Introduction

A scoring function assigns non-negative values (i.e., scores) which help evaluate various items, situations, or people. For example, a professor would like to assign point values to each problem on an exam that was recently administered to the students in class. The professor demands a minimum point value (which may be different for each problem) while the remaining points can arbitrarily be apportioned and added to each problem. After grading each problem for each student on a scale from 0.00 to 1.00, the professor would like the remaining points apportioned so that a specified grade distribution is attained. We propose a GA (i.e., genetic algorithmic) [1-8] solution to this problem (and other related problems, e.g., loan scoring and personnel hiring). 2

The Genetic Algorithm Paradigm

The genetic algorithm paradigm is an adaptive method based on Darwinian natural selection. It applies the operations of selection (based on survival of the fittest), reproduction using crossover (i.e., mating), and mutation to the current generation of a population of potential solutions to generate a new, typically more fit population in

256 the next generation. This process is repeated over a number of generations until an optimal or near optimal solution is obtained. A genetic algorithm offers the following advantages: a) b) c) d) e) f) g) h) 3

it will usually obtain an optimal (or near optimal) solution(s) it can obtain a satisficing solution(s) it has polynomial computational complexity it easily handles constraints it easily incorporates heuristics it is easy to understand it is easy to implement it is robust Mathematical Model

After administering an exam to a class of students, a professor would like to assign points to each question on the test so that a predefined grade distribution is obtained. The professor would like a method that is academically sound and curves the exams fairly and objectively. If the exam consists of n questions q1; q2, ..., q;,..., qn_i, qn , we would like to assign to each question a nonnegative value or score s(qi) > 0 so that the sum of the scores is l(i.e., 100%). The professor would like to assign lower bounds for the scores of each question b; > 0 so that s(qs) > b; for 0 < i < n. This will guarantee that each question is assigned a minimum point value which is in the professor's control. In general, 1 > S bj > 0 so that the remaining B = 1 - S bj points must be distributed amongst the n questions assigning some proportion p{ (0 < i < n) of this excess to each question. Therefore, score s(qj) = bj + Pi • B = bj + p. • (1 - £ bj) (with 0 < i < n). The professor wants this excess to be distributed so that a predefined grade distribution D is obtained for the class (for example, a normal distribution Nor(|J., a2) estimated by a frequency histogram with mean (average) u, and variance a 2 .) To accomplish this, the professor first grades each question on every students exam and assigns what proportion Tjj of each question the student got correct. Then student Sj would get a grade of Gj = Z T;j • s(qi) and we would like Gj - D.

257 4

Encodings

Each member of the population is an apportion array ( p b p 2 , ...,Pi, ..., p„-i, pn) of length n where pi + p 2 + . . . pj + ... + p„_i + p„ = 1 = £ Pi and with pi > 0 (for 0 < i < n). It is quite easy to generate an initial population of random members for generation 0 using a normalization method. First, simply generate an (xi, x2, ..., Xj, ..., xn_!, xn) consisting of n independent identically uniformly distributed random variables x; ~ uniform [0,1]. Then by calculating the normalized array (XJ/EXJ, x 2 /Ex;, ..., Xj/E x,, ..., x n _i/£xj, xn / E x;) we create a random apportion array (of course we must observe the caveat that Ex, > 0 ) .

5

Mating (Crossover) and Mutating

Selection of parents for mating involves choosing one member of the population by a weighted roulette wheel method favoring more fit members and the other member randomly. The reproduction process is a simple crossover operation whereby the two selected parent members swap randomly chosen positions to create new offspring members. The crossover operation produces an encoding for offspring members having element values which satisfy the apportion constraints. Two parents PI = ( p b p 2 , ...,pj,..., pn_!, pn) and P2 =(JI,, i^, ..., 7ii,..., 7Cn_i, 7t„) can be mated to produce two offspring children CI and C2 where C I = (pj/s, T^/S, . . . , p / s , ..., TC-j/s, p n /s) With pi + 7C2 + ...+ Pi + ...+ 7Cn_i + p n = S

and C2 = (7ij/t, p 2 /t,.... 7t/t,..., p„-i/t, V 0 with 7ti + p 2 +...+ 7tj+...+ p„.i + 7tn = t Similarly, a random population member can be mutated. Mutation is carried out by randomly choosing a member of the population and then randomly changing the value(s) of its encoding (genotype) at randomly chosen positions subject to the apportion constraints. For example, M = (pi/s, rt2/s, ...,p/s,..., 7C„_i/s, pn/s) with p] + %i + •••+ Pi + •••+ ftn-i + Pn = s where positions 2,..., n-1 have been randomly mutated on the cloned member P = ( p b p 2 , ...,pi,..., pn_i, p„).

6

Fitness

After we have found the phenotype (si, s2, .. .,Si,..., s„.i, sn) for a population member P = (Pi. P2, ---,Pi,..., P„-i, P„) by applying the s(qi) = bs + ps • B,

258

we can find all the grades Gj for that scoring function and we can then find a frequency histogram H (i.e., a distribution). As a simplefitnessmeasure we can sum the absolute values of the differences in each of the pre-selected frequency intervals I to obtain: fitness of population member P = | #Dr - #1^ |. The smaller the fitness value the better the approximation to the predefined distribution. 7

Genetic Algorithm Methodology

We are implementing a genetic algorithm (GA) for the scoring problem using feasible encoding schemes for apportion arrays (described earlier). Our GAs create and evolve an encoded population of potential solutions (i.e., apportion arrays) so as to facilitate the creation of new feasible members by standard mating and mutation operations. (A feasible search space contains only members that satisfy the problem constraints for an apportion array. When feasibility is not guaranteed, numerous methods for maintaining a feasible search space have been addressed [7], but most are elaborate, complex, and inefficient. They include the use of problem-dependent genetic operators and specialized data structures, repairing or penalizing infeasible solutions, and the use of heuristics.) By making use of a problem-specific encoding and normalization, we insure a. feasible search space during the classical operations of crossover and mutation and, in addition, eliminate the need to screen during the generation of the initial population. We adapted many of the standard GA techniques found in [1, 8] to these specific problems. A brief description of mese techniques follows. The initial population of encoded potential solutions (genotype) is randomly generated. Each encoded population member is mapped to its equivalent scoring function (phenotype). Selection of parents for mating involves randomly choosing one very fit member of the population while the other member is chosen randomly. The reproductive process is a simple crossover operation whereby two randomly selected parents are cut into three sections at some randomly chosen positions and then have the middle parts of their encodings swapped and normalized to create two offspring (children). In our application the crossover operation produces an encoding for the offspring that have element values that always satisfy proportion constraints. Mutation is performed by randomly choosing a member of the population, cloning it, and then changing values in its encoding at randomly chosen positions and normalizing so as to satisfy the proportion constraints. A grim reaper mechanism replaces low performing members in the population with newly created more fit offspring and/or mutants. The GA is terminated when either no improvement in the best fitness value is observed for a number of generations, a certain number of generations have been examined, and/or a satisficing solution is attained (i.e., the predefined distribution is not precisely the same, but is satisfactorily close).

259

We now state a generic form of the genetic algorithm paradigm: 1) randomly initialize a population of encoded potential solutions (members) 2) map each new member (genotype) to its scoring function (phenotype) 3) calculate the fitness of any member which has not yet been evaluated (that is, how close the distribution is to the target distribution) 4) sort the all members of the population by fitness 5) select one parent for mating from by using the roulette wheel method and the other randomly 6) generate offspring using simple crossover 7) mutate randomly selected members of the population 8) replace the lower half of the current generation with new offspring and mutated members 9) if a termination criteria is met then return the best member(s) else go to 2

8

Related Problems

Two related problems are loan scoring by lending institutions and personnel selection by human resource functions. In the loan scoring problem, there is a record containing facts concerning the person who is requesting the loan and points are assigned based on an expert loan specialist's subjective judgement. A genetic algorithmic approach could lower the lender's risk, provide better investment returns, and be less biased by providing loans to a more diverse population. In the personnel selection problem, we can give an assessment instrument to measure what differentiates successful employees from non-successful employees. We can then assign the point values constrained by the fact we wish to give higher grades to the successful employees and lower grades to the non-successful ones. In this way we can create instruments which can better predict successful potential candidates from less successful candidates for a position in the future.

9

Conclusion

This research is a nice application of GAs to a real world problem. In the future we would like to get more data and perform more experiments on the related problems discussed in section 8.

260 10 Acknowledgement We wish to thank Pace University's School of Computer Science and Information Systems (SCSIS) and Long Island University's Computer Science Department for partially supporting this research. References 1. 2. 3.

4.

5.

6.

7. 8. 9.

Davis, L., Handbook of Genetic Algorithms, Van Nostrand Reinhold, (1991). Dewdney.A.K., The Armchair Universe - An Exploration of Computer Worlds, W. H. Freeman & Co., (1988). Edelson, W. and M. L. Gargano, Minimal Edge-Ordered Spanning Trees Solved By a Genetic Algorithm with Feasible Search Space, Congressus Numerantium 135, (1998) pp. 37-45. Gargano, M.L. and W. Edelson, A Genetic Algorithm Approach to Solving the Archaeology Sedation Problem, Congressus Numerantium 119,(1996) pp. 1 9 3 - 2 0 3 . Gargano, M.L. and W. Edelson, A Fibonacci Survival Indicator for Efficient Calculation of Fitness in Genetic Paradigms, Congressus Numerantium 136, (1997) pp. 7 - 1 8 . Gargano, M.L. and Rajpal, N., Using Genetic Algorithm Optimization to Evolve Popular Modern Abstract Art, Proceedings of the Long Island Conference on Artificial Intelligence and Computer Graphics, Old Westbury, N.Y., (1994), pp. 38-52. Michalewicz, Z., Heuristics for Evolutionary Computational Techniques, Journal of Heuristics, vol. 1, no. 2, (1996) pp. 596-597. Goldberg, D.E., Genetic Algorithms in Search, Optimization, and Machine Learning, Addison Wesley, (1989). Rosen, K.H., Discrete Mathematics and Its Applications, Fourth Edition, Random House (1998).

261 GENETIC ALGORITHMS FOR MININING MULTIPLE-LEVEL ASSOCIATION RULES NORHANA BT. ABDUL RAHMAN ARABY AND Y.P.SINGH Faculty of Information Technology, Multimedia University, Cyberjaya, Selangor, 63100, Malaysia E-mail: [email protected] This paper presents genetic algorithms formulation and generalization for mining multiple-level association rules from large transaction databases, with each transaction consisting of a set of items and a taxonomy (is-a hierarchy) on the items. The necessity for mining such rales are of great interest to many researchers [l]-[3]. Some criteria are investigated for pruning redundant rules. Example rales found in database transactions are presented.

1.

Introduction

Genetic algorithms have been used in concept learning in machine learning areas and mining association rules [4]. The proposed study investigates genetic algorithms for mining association rules and multiple-level rules for large database applications. The algorithm randomly generates an initial population of itemsets. Those itemsets' fitness will be rated by its frequency of occurrence as subset of given transactions. Itemsets that fit the user specified support and confidence threshold will survive. They will then replicate according to their fitness, mutate randomly, and crossover by exchanging parts of their subsets (substructures). The algorithm again evaluates the new itemsets for their fitness, and the process repeats. During each generation, the genetic algorithm improves the itemsets (individuals) in its current population. The genetic algorithm stops where there is little change in itemsets fitness or after some fixed number of iterations. The algorithm will finally return a set of frequent itemsets (hidden transactions) from different generations. The multiple-level association rule mining requires the following: 1. 2.

A set of transactions and taxonomy (is-a hierarchy) on items in the transactions. Efficient methods for multiple-level rule mining.

The genetic algorithm is proposed here for mining multiple-level association rules considering extended transactions, i.e. transactions having items combined with taxonomies. We consider the database consisting of a set of transactions and items' taxonomies as shown in Figure 1. Finding associations between items at any level of the taxonomy is known as mining multiple-level association rules.

262

Phase-1 : Find Extended transactions Given a set of transactions and the taxonomy, add all ancestors of each item in a transaction to the transaction as given below: Items in Transactions + taxonomies (items' ancestors) = Extended Transactions Phase-2 : Run genetic algorithm designed for finding frequent itemsets [4] and later find the multiple-level association rules.

Transactions w + taxonomies

2.

Frequent

Extended Pre process

transaction

Genetic Algorithm

itemsets

Algorithm

Multiplelevel association rules

Genetic Algorithms

The currently most important and widely known representatives of evolutionary computing techniques are: genetic algorithms (GAs), evolution strategies (ESs), and evolutionary programming (EP). These techniques are applied in problem solving by applying evolutionary mechanism. In the following we present a brief review of genetic algorithms, the evolutionary computing techniques and their use for machine learning problems. The basic evolutionary algorithm can be represented as given below: f~0; initialize P(t) ; (generalize initial population) evaluate P(t); While not terminate (P(t)) do Select: P(t):=jrom_P(t); Recombine: P'(t):=r(P(t); Mutate: P-(t):=m(PW; P"(t); Evaluate: t:= t+1; Od Return (best individual in P(t));

263

In this algorithm, P(t) denotes a population of individuals at generation t. Q(t) is a special set of individuals that has to be considered for selection and P" is offspring individuals. At the present time, genetic algorithms are considered to be among the most successful machine-learning techniques and are also used as general-purpose search techniques for solving complex problems. Based upon genetic and evolutionary principles, GAs work by repeatedly modifying a population of individuals through the application of selection, crossover, and mutation operators. The choice of an representation of individual (encoding) for a particular problem is a major factor determining a GAs success. GAs have been used for optimization as well as to classification and prediction problems with different kinds of encoding . A GAs fitness function measures the quality of a particular solution. The traditional GA begins with a population of n randomly generated individuals (binary string of fixed length I), where each individual encodes a solution to the task at hand. The GA proceeds for several number of generations until the fitness of the individuals generated is satisfied. During each generation, the GA improves the individuals in its current population by performing selection, followed by crossover and mutation. Selection is the population improvement or "survival of the fittest" operator. According to Darwin's evolution theory, the best individuals should survive and create new offspring for the next generation. Basically, selection process duplicates structures with higher fitness and deletes structures with lower fitness. There few methods which can be used for selection, such as proportional selection, tournament selection, roulette wheel selection, Boltzman selection, rank selection, steady state selection and some others. Crossover, when combined with selection, results in good components of good individuals yielding better individuals. The offspring are the results of cutting and splicing the parent individuals at various crossover points. Mutation creates new individuals that are similar to current individuals. With a small, prespecified probability (pm [0.005, 0.01] or pm = Ml where / is the length of the string representing individual), mutation randomly alters each component of each individual. The main issues in applying GAs to data mining tasks are selecting an appropriate representation and an adequate evaluation function.

264

3.

Simulation Result

Illustrative Example Given a sample taxonomy saying that Skimmed Milk is-a Milk is-a Drink and Bread is-a Food. We can infer a rule saying that "people who buy milk tends to buy bread". This rule may hold even though rules saying that "people who buy skimmed milk tends to buy bread" and "people who buy drink tends to buy bread" do not hold. Drink (1) MfflT(l) Skimmed Milk(l)

Food (2)

Mineral Water (2)

Pasteurized Milk (2)

Fruit (1)

Apple^T)

Bread (2)

Orlhge (2)

Figure 1: Example of taxonomy Let I = {Skimmed Milk, Pasteurized Milk, Mineral Water, Apple, Orange, Bread} - set of items Let T = {{Skimmed-Milk, Bread},{Mineral Water, Apple},{Pasteurized Milk, Bread}, {Pasteurized Milk, Bread}} = {Ti, T2, T3,T4} - sets of transactions Let • = {Milk, Drink, Fruit, Food} - items' ancestors Item, I (Leaf at the taxonomy tree) Skimmed Milk Pasteurized Milk Mineral Water Apple Orange Bread Table 1: Encoded items

Hierarchyinfo code 111 112 12 211 212 22

Normal individual bits 100000 010000 001000 000100 000010 000001

265 Ancestors, • Milk Drink Fruit Food

Hierarchy-info code 11 1 21 1

Table 2: Encoded ancestors The hierarchy-info code represents the position (level) of an item or ancestor in the hierarchy. For example, the item 'Pasteurized Milk' is encoded as '112' in which the first digit T represents 'drink' at level 1, the second digit ' 1 ' for 'milk' at level '2' and the third digit '2' represents the type 'Pasteurized Milk' at level 3. Hence, the more digit an item is encoded as, the deeper level it is in the hierarchy and vice versa. Transactions, T Ti T2 T3 T4

Extended Transactions {111,0,0,0,0,22} {0,0,12,211,0,0} {0,112,0,0,0,22} {0,112,0,0,0,22}

Normal transactions {1,0,0,0,0,1} {0,0,1,1,0,0} {0,1,0,0,0,1} {0,1,0,0,0,1}

Table 3: Transactions In an extended transaction, the bit position reflects the item involved in the transaction while the bit content reflects its hierarchy. A.

Finding Frequent Sets using Genetic Algorithms

GA Parameters Population size, pop size =10 Individual item size, ind_size = 6 Probability of crossover, pc = 0.6 Probability of mutation, pm = 0.01 An initial population is randomly generated where each individual consists of six '0' or ' 1' bits. These bits only represent the items at the lowest level (the leaves of a taxonomy tree) and don't include the ancestors.

266

All the individuals in the initial population is first evaluated to determine their fitness value before they are selected for further process of crossover and mutation. Each individual is compared with the normal transactions of itemsets in the database. The more frequent the individual occurs in the normal transactions, the higher its fitness value is. Roulette-wheel selection method is chosen to select the best individuals in the population, based on their fitness value. The fitter the individual is, the more chances they are to be selected. Crossover and mutation are two basic operators of GA and they may effect the performance of GA. Those individuals are randomly chosen and switched at a randomly chosen crossing point, between 1 to indsize. In this experiment, single point crossover is done where only one crossover point is selected. Other crossover methods for binary encoding are two point crossover, uniform crossover and arithmetic crossover. Mutation is then performed at a very low mutation probability. Bits are inverted randomly. The population is then evaluated again. These individuals are then passed to the next generation for selection, crossover, mutation and evaluation again. The process repeats for several generations, each time improving the fitness of the population. Finally, the GA process will generate a final population, consisting of the most frequent individuals or itemsets. B.

Construction of Multiple-Level Association Rules

From all the frequent sets which were generated by GA, we only choose the single frequent sets. They are then expanded and converted into hierarchy-info code. For example, the single frequent itemsets generated are {000001} and {010000}. Item

Normal individual

Ft

Bread

000001

Pasteurized Milk

010000

0,0,0,0,0,22 0,0,0,0,0,2* 0,112,0,0,0,0 0,11*,0,0,0,0 0,1**,0,0,0,0

Fitness/ Support 3 4 2 3 4

These single hierarchy-info encoded individuals, Fi are paired and evaluated. However, note that an item is not paired with its own ancestors in order to avoid uninteresting rules, such as A->ancestor(A) or ancestor(A) -»A. As for evaluation, each paired hierarchy-info encoded individuals, F2 is compared with the extended transactions. The number of occurrences of F2 in the extended transactions determines the fitness or support value.

267

F2 0,112,0,0,0,22 0,11*,0,0,0,22 0,1**,0,0,0,22 0,112,0,0,0,2* 0,11*,0,0,0,2* 0,1**,0,0,0,2*

Fitness/Support 2 3 3 2 3 4

The bit digits which are substituted with '*' are not taken into consideration when making comparison with the extended transactions. For example: 0,1**,0,0,0,2* is scanned through the extended transactions. If 1** and 2* (regardless of the bit position) is found in the extended transactions, then thefitness/supportvalue is incremented. From Fi and F2, we'll derive the multiple-level association rules. Let confidence threshold, y=0.8 Pasteurized Milk, Bread: PM-»B ifsupport(PMuB) = 2 > 0.8 (0,112,0,0,0,22) support(PM) 2 B->PM if aupporuXMvfi) = 2 < 0.8 support(B) 3 Milk, Bread: Milk-»Bread ifsupport(MuB) = 3 > 0.8 (0,11*,0,0,0,22) support(M) 3 Bread->Milk ifsupportfMuB) =JL> 0.8 support(B) 3 Drink, Bread: Drink-»Bread ifsupportfDuB) = 3 < 0.8 (0,1**,0,0,0,22) support(D) 4 Bread-»Drink if support(DuB) = 3 > 0.8 support(B) 3 Pasteurized Milk, Food: PM-»Food if supportfPMuF^ = 2 > 0.8 (0,112,0,0,0,2*) support(PM) 2 Food->PM ifsupportfPMuFI = 2 < 0.8 support(F) 4 Milk, Food: Milk-»Food if supportfMuF) = 3 > 0.8 (0,11*,0,0,0,2*) support(M) 3

268

Food-»Milk ifsupportfMuF) = 3 < 0.8 support(F) 4 Drink, Food: (0,1**,0,0,0,2*)

Drink—»Food if surjp_ort(DuF} = 4 > 0.8 support(D) 4 Food->Drink if supportfDuF) = 4 >0.8 support(F) 4

From the above computation, the multiple-level association rules derived are: Pasteurized Milk—>Bread, Milk—»Bread, Bread—>Milk, Bread—>Drink Pasteurized Milk—>Food, Milk—>Food, Drink—>Food, Food—>Drink The result shows that an item from any level can be associated with another item (from any level tob); regardless of whether its ancestors or descendants are also associated or not. For example, Milk—»Bread implies although Drink—»Bread doesn't imply. 4.

Conclusion

In this study, we have extended the scope of mining association rules from single level to multiple levels, using Genetic Algorithms. The major issue which was taken into account is the conversion of a given transactions and taxonomies into the extended transactions so that the bits encoded can reflect both the items and their hierarchy in the taxonomy. Mining multiple-level association rules may results in discovery of refined knowledge from a given transactions of data. References 1. Agrawal, A., T. Imielinski, and A. Swami, Mining Association Rules Between Sets of Items in Large Databases, Proc. 1993 ACM SIGMOD Int'l Conf. Management of Data, Washington, D.C., May (1993), pp. 207-216. 2. Agrawal Rakesh and Ramakrishnan Srikant, Mining Generalized Association Rules, Proc. 21st VLDB Conference, Zurich, Switzerland, (1995). 3. Han, J. and Y. Fu, Mining Multiple-Level Association Rules in Large Databases, technical report, (University of Missouri-Rolla, 1997). 4. Singh, Y.P. and Norhana Abdul Rahman Araby, Evolutionary Approach to Data Mining, Proc. IEEE ICIT, Goa, India, (2000).

269 A CLUSTERING ALGORITHM FOR SELECTING STARTING CENTERS FOR ITERATIVE CLUSTERING

ANGEL GUTIERREZ Department of Computer Science, Montclair State University, Upper Montclair, NJ 07043, USA E-mail: [email protected] ALFREDO

SOMOLINOS

Department of Mathematics and Computer Information Science, Mercy College, 555 Broadway, Dobbs Ferry, NY 10522, USA E-mail: [email protected] Iterative clustering algorithms are strongly dependent on the number and location of the starting centers. We present some examples of this dependence for two classes of algorithms: fuzzy clustering and competitive learning. In order to select an optimal location for the starting centers, we propose a non-iterative clustering algorithm which creates groups of points based on the average distance of each point to its closest neighbor, and merges the groups so obtained into clusters. The radius of attraction of each point is defined as the average of the distances from every point to its closest point plus a factor times the standard deviation. Adjusting this factor we can vary the number of groups generated. We merge those groups that are close in the sense of the Haussdorf distance. The algorithm allows declaring the minimum number of points that can constitute a group. The user can then drop those points that do no constitute a group, merge them with the closest group if they fall inside the radius of attraction of that group, or allow them to stand as an independent group.

1

Introduction

Clustering algorithms are used in a variety of fields, data mining, statistical data analysis, pattern recognition, for example, using radial basis functions neural networks and, in general, in preprocessing data for classification algorithms. Most of the algorithms used in clustering are iterative. Starting with a partition, in classes or, equivalently, with the centers of the classes, the algorithm moves the centers, or redefines the classes, for a fixed number of iterations or until a fitness function reaches a certain level. The efficiency of these iterating algorithms depends strongly on the selection of the initial groups. The main problem lies in guessing the correct number of groups. But the location of the starting centers can have a huge impact on the algorithm performance.

270

In our experience, not all the groups in a data set have the same number of points, and the average distance between the points in one group is not the same as the average distance in another group of the same set. Thus we have created two-dimensional data samples with these properties. Some of the methods we use to illustrate clustering problems work better with uniformly distributed groups of the same number of elements. So we have also used uniform samples. In Figure 1 we present the sample we will use in most of the examples. It contains 4 groups of 6, 12, 18 and 24 points, generated with random amplitudes of 0.2, 0.3, 0.4, and 0.5. Clearly, a person would discern four groups. But there are points that could be considered a subgroup, and this could have importance in the classification of signals for medical diagnosis. A distinct subgroup could be the telltale feature that would help diagnose a clinical condition. Or it could be simply an outlier. We want to be able to choose what to do with such subgroups. Drop them, merge them with the larger one, or give them independent meaning.

c:,\

Figure 1. Sample data. Four groups with different number of points and different point spread.

*

# •

\

.**** & * •

*

—

* • » •

*

•

t - *****

Figure 2. Effect of choosing three centers and six centers using fuzzy C-means

•

271

2 2.1

Importance of the initial center selection. Some examples Choosing the wrong number of centers

We first use fuzzy C-means [2]. Starting with three centers, one group is completely unaccounted for. On the other hand, starting with more centers provides us with two subgroups, clearly differentiated in the left group; not so clearly in the bottom one (Fig. 2). ^mmmmmi mmmmmr

a£fn

era

*9"i

£*

*^l|»* * *

***.*

. 4£? .»«*' Figure 3. Competitive learning. Four and five centers.

In Figure 3 we show the results of competitive learning, [4]. The initial centers are attracted by the points in the groups. We present the trace of the motion of centers. The starting centers location was selected to obtain a reasonable result. At the left, four centers start at the four corners and then move to the groups. At the right, we have an extra center at the bottom, which moves straight up. 2.2

Influence of the starting location

We use Frequency Sensitive Competitive Learning [1]. We start with six centers at the top of the screen. Three of them are captured by the top group and one ends up between two groups. We show the final position of the centers at the right (Fig. 4). Rival Penalized Competitive Learning [3] expels the centers that are not needed. It works very well when all groups have the same number of points. Two are sent out of the figure. Four are placed at the center of each cluster. But, if we choose to place all six centers at the top left corner, things can go very wrong, Five centers are sent out of the picture and only one center occupies the middle of the screen (Fig. 5).

272

}"^:: s •. .'.-JIJU.-.".•;• >: ":;:.:'.: iV.'": 'g. *!!•: :'..• ": : lg centers

• •*.*..

^

\>

°\

Figure S. Rival Penalized Competitive Learning. Six ini.iui coitcra.

•

c" «

« *g^ a *

Figure 6. Rival Penalized Competitive Learning. All centers start at top left corner.

273 It should be clear by now that choosing the number and locations of the starting centers is an important task and that it deserves the effort of preprocessing the data. 3

Description of the pre-selection deterministic algorithm

The algorithm works like the creation of clouds. Water particles cling to their closest neighbors to form droplets, and, then, the droplets bunch together to create the clouds. Thus, we first create grouplets by finding the closest neighbors of each point; then, we merge the grouplets into clusters. 3.1

Creating the grouplets •

•

3.2

Find the radius of attraction of the points: Create a matrix of the distances from each point to all the others. Sort the matrix, by rows. Find the average and standard deviation of the first non-zero column, the distances to the closest point. Define the radius of attraction as the average closest distance plus a factor times the standard deviation. Taking this factor small creates a lot of small grouplets. Making it big increases the radius of attraction and the grouplets are bigger. Start with any point. Find all other points inside its radius of attraction. Recursively find the points inside the radius of attraction of the points just added. Stop when there are no more points in the radius of attraction of all the points in the grouplet. Creating clusters by merging the grouplets

•

•

Find the radius of attraction of the grouplets. Using the Hausdorff distance find the average and standard deviation of the distances from each grouplet to its closest neighbor. Define the radius of attraction as the average of the closest distances plus a factor times the standard deviation. Taking this factor small we will merge few grouplets and the clusters will be just the grouplets. Taking the factor large we will have clusters made out of several grouplets Merging the grouplets: Find the number of points in the grouplets - if it a singleton or a doublet we may want to drop it. If the number of points is less than the minimum number of points, apply the chosen strategy, drop or merge. If the grouplet has more than the minimum number of points, find all the

274

grouplets inside its radius of attraction. They would form the cluster. Merge them together. 3.3

Example. Clustering the above sample

figure 7. Ten grouplets, two of them solitons, are merged into five clusters. By adjusting the radii of attraction we could create less grouplets and clusters.

4

Discussion

We have presented a non-iterative method for selecting starting centers for iterative clustering. The method is very flexible and avoids the problems related to choosing the wrong number of centers, or to placing them in the wrong starting location. References 1. Ahalt S.C., Krishnamurty A.K., Chen P. and Melton D.E., Competitive Learning Algorithms for Vector Quantization, Neural Networks 3 (1990) pp. 277-291 2. Jang J.R., Sun C. and Mizutani E., Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence (Prentice Hall, New Jersey, 1997) 3. Krzyzak A. and Xu L., Rival Penalized Competitive Learning for Clustering Analysis, RBF Net and Curve Detection. IEEE Transactions on Neural Networks 4 (1993) pp. 636-641. 4. Rummelhart D.E. and Zipser D., Feature Discovery by Competitive Learning, Cognitive Science 9 (1985) pp. 75-112.

275

DIMENSION REDUCTION IN DATAMINING H.M. HUBEY, I. SIGURA*, K. KANEKO*, P. ZHANG* Department of Computer Science, Montclair State University, Upper Montclair NJ 07043 E-mail: [email protected] * Email: [email protected] A complete, scalable, parallelizable, and unified method combining Boolean Algebra, fuzzy logic, modified Karnaugh maps, neural network type training and nonlinear transformation to create a mathematical system which can be thought of as a multiplicative (logical-AND) neural network that can be customized to recognize various types of data clustering. The method can thus be used for: (1) displaying high dimensional data, especially very large datasets; (2) recognizing patterns and clusters, with the level of approximation controllable by the user; (3) approximating the patterns in data to various degrees; (4) preliminary analysis for determining the number of outputs of the novel neural network shown in this manuscript; (5) creating an unsupervised learning network (of the multiplicative or AND kind) that can be used to specialize itself to clustering large amounts of highdimensional data, and finally; (6) reducing high dimensional data to basically three-dimensions for intuitive comprehension by wrapping the data on a torus [1], The method can easily be extended to include vector time series. The natural space for high dimensional data using the natural Hamming metric is a torus. The specifically constructed novel neural network can then be trained or fine-tuned using machine-learning procedures on the original data or the approximated/normalized data. Furthermore we can determine approximately the minimal dimensionality of the phenomena that the data represent.

1

Introduction

There are a set of related problems in the fields of datamining, knowledge discovery, and pattern recognition. We don't know how many neurons should be in the hidden layer or the output layer. Thus if we attempt to use ANNs for clustering as a preliminary method to finding patterns we must use heuristic methods to determine how many clusters the ANN should recognize (i.e. what is the rank/ dimension of the output vector). This is just another view of the problem in datamining of knowing how many patterns there are in the data and how we would go about discerning these patterns. There is a related problem in k-nearest-neighbors clustering in which we need an appropriate data structure to be able to efficiently find the neighbors of a given input vector. Indeed, before the k-neighbors method can be used to classify an input vector we need to be able to cluster the training input vectors and an ANN might have been used for this process. The problem of knowing how many patterns (categories or classes/clusters) there are is an overriding concern in datamining, and in unsupervised artificial neural network training. Typically the basis of all datamining is some kind of a clustering technique which may serve as a preprocessing, and data reduction technique which may be followed by other algorithms for rule extraction, so that the data can be interpreted for and comprehended by humans. Prediction and classification may be a goal of the process also. The major clustering methods can be categorized as fol-

276

lows [2]: (i) Partitioning Methods; (ii) Hierarchical Methods; (iii) Density-based Methods; (iv) Grid-based Methods; (v) Model-based Methods; 2

Boolean algebra, K-maps & Digital Logic

A Karnaugh map (K-map) is an 2D array of size at most 4 x 4 which represents a Boolean function. The arrangement of cell-addresses (nodes) is such that the numbering scheme follows the Gray code. An r-bit Gray code is an ordering of all r-bit numbers/strings so that consecutive numbers differ precisely one bit position. Thus a Gray code is a sequence of r-bit strings such that successive numbers are apart by distance one using the Hamming distance. The specifics of the K-map make it possible to perform Boolean algebraic simplifications and reductions graphically. For low dimensional spaces (i.e. n < 4), there is a natural distance (Hamming) metric defined on the K-map. It is reminiscent of the city block metric used in data mining procedures. The K-map is a 2 x 2 array where 1 4 ) are naturally related. 101, IT — — —

'A

/I

/

'

I

/

010 7 1

;

f •

'

' - <•

/

J

100

i

ft

/ /011

Figure 1: Hypercubes, Gray Code, and Data Clustering: The graphics shows (i) the Gray coding (ii) the'growing' of an n-dimensional hypercube from an (n-1) dimensional hypercube, and (iii) the 'shrinking' of the center. For n > 4 the K-map needs to be modified (KH-map) so that it is a metric space [1]. The ideal visualization tool for high dimensional data is the KH-map. Karnaugh map is used for small dimensions. "For more than four variables, the Karnaugh map method becomes increasingly cumbersome. With five variables, two 16x16 maps are needed, with one map considered to be on top of the other in

277

three dimensions to achieve adjacency. Six variables requires the use of four 16x16 tables in four dimensions! An alternative approach is a tabular technique, referred to as the Quine-McCluskey method." [3]. The 4-variable K-map corresponds to a 4-dimensional hypercube. More on how this can be used in clustering, please see below. In this space, each node is adjacent to 4 other nodes. This notion of 'neighborliness' can be visually realized on the 4-variable K-map. The K-map can be wrapped on a cylinder and then the cylinder is bent into the shape of a torus so that the corner cells become neighbors. [4]. We can also create maps similar to K-maps and use them in ways similar to grid-based methods, and also wrap them on a torus. In finding clusters (as in classification/categorization) for an input vector of rank n there are easily 2 possible outputs (clusters). This simplification results from reducing the inputs to binary vectors and allowing only the corners of the hypercube to represent clusters. This is nothing but the decoder problem. The decoder is a Boolean circuit (a classifier!) that identifies the 'coded' input vector (binary string). If we use fuzzy-gates instead of crisp Boolean gates, we will have our classifier with one caveat: the n-dimensional hypercube is the natural space n of this phenomenon, however, the phenomena-space size is 2 . We need to be able to cluster the data which might be spread out over this space. One can easily be constructed using fuzzy logic however, with some thought the procedure can have some desirable properties. 3

Hypercube, Datacube, and the Karnaugh map

The n-dimensional hypercube has TV = 2 nodes and n2 edges. Each node corresponds to an n-bit binary string, and two nodes are linked with an edge if and only if their binary strings differ in precisely one bit. Each node is incident to n = lg{N) [where lg(x)=log2(x)] other nodes, one for each bit position. An edge is called a dimension-k edge if it links two nodes that differ in the kth bit posik tion. The notation u is used to denote a neighbor of u across dimension k in the hypercube [5]. Given any string u = u,...u,

k w, the string u is the same as u

except that the kth bit is complemented. The string u may be treated as a vector. Using d(u, v) the Hamming distance V«W[d(M, U ) = 1] . The hypercube is node and edge symmetric; by just relabelling the nodes, we can map any node

278

onto any other node, and any edge onto any other edge. That is, for any pair of edges (w, v) and (u, v) in an N-node hypercube H , there is an automorphism O of H

such that CJ(W) = u and <J(V) = v . An automorphism of a graph is a

one-to-one mapping of nodes to nodes such that edges are mapped to edges. If u - u,ur>...u,

N

and u =

MJJ^-.-W/P/V,

men

^ or

an

y permutation n on

{1,2, ..., IgN} we can define an automorphism a with the desired property by [5] <5{xxx2...xXogN) = U J t ( 1 ) ® « 7 [ ( 1 ) © « i ) * ( ^ ( 2 ) © » n ( 2 ) e " 2 ) * -*

0)

*(xn(lgN)®un(lgN)®ulgN)

where p.*v denotes the concatenation of string |i with string v . 000

111

011

\

110|

J

e

7

1001

101

/

010

nm

\

001

Figure 2: Graph Automorphism. The automorphism on an input-vector hypercube is equivalent to a permutation of the components of the input vector. As a simple example (Fig 2) the automorphism o(x,XyK^) = x~*{xr, @ l)*x,

(2)

maps the edge (010,011) to edge (110,100), where the © indicates an XOR. Other examples can be seen in Leighton[5]. Any nD (n-dimensional) data can be thought of as a series of (n-l)D hypercubes. This process can be used iteratively to reduce high-dimensional spaces to visualizable 2D or 3D slices. Properties of high-dimensional hypercubes are not intuitively straightforward. The datacube that is used in datamining [2] is a lattice of cuboids, 2D or 3D slices of which can be displayed by common datamining programs. There is no need to look all over the hypervolume for high-dimensional data sets; we need to look only along the surface (or even near the corners (nodes)). Vectors in fi

{0, 1}

n

are called binary vectors, and the vectors in {-1, 1}

are bipolar vec-

279 tors. These vectors are the sets of corners or vertices of {0, 1} respectively. The points in {0, 1}

and {-1, 1}

are located at distances (0 to Jn) from the

origin but the vectors in {-1, 1} are all of length Jn. Therefore {-1,1} is a subset of the hypersphere of radius Jn in 3i . The domains of such vectors are the hypercubes [6]. In high-dimensional Euclidean spaces the volumes and areas of hypercubes and hyperspheres are counterintuitive. Hypercubes in n-dimensional spaces are highly anisotropic, something like spherical porcupines [5]. Thus, most of the data in a high-dimensional space will be found at the corners of the hypercube of the normalized datacube. Now, all we need to do is normalize the input data vectors to {0,1}. Then obviously, the hypercube is the natural data structure for datamining. Furthermore, for visualization and manipulation, we need another kind of a data structure, and this data structure is the KH-map[l]. 4

Fuzzy Logic, Neural Networks and Dimensional Analysis

The other important part of the method used to affect a dimension reduction is the use of a novel multiplicative neural network which can be interpreted in terms of a special kind of fuzzy logic in Hubey[7]. Therefore the final result is not merely an ad hoc multiplicative network but as a result of a rigorous mathematico-scientific reasoning. Furthermore, the results can be transformed back into the arithmetic domain (where interval and ratio scaled values may be used) and still be interpreted using fuzzy logic so as to create association rules. The axioms of fuzzy logic can be found in many books [7,8,9]. Also in Hubey[7] is the special logic that is useful for training of arithmetic (interval-scaled or ratio-scaled) multiplicative neural networks. Aside from the standard axioms that should be satisfied by the fuzzy norm and conorm, if we want to take some guesses as to what kinds of laws of logic are impeccably true and should be preserved, the three that are commonly put forward as candidates are: The Law of the Excluded Middle (LEM) The Law of Noncontradiction (LNC) The Law of Involution or Self-Inverse (LSI)

x + C(x) = 1 C(x • C{x)) = 1 C{C(x)) = x

(3a) (3b) (3c)

where C(x) is the negation or complement function. LNC is usually written as x-C(x)= 0 and called the Law of Contradiction. Adding another level of indirection to continuous valued logic by separating of the truth value assigned to the log-

280

ical variable from the value of the logical variable allows the satisfaction of some of the constraints above by functions other than t(x) = x. For example, normally inputs into the datamining problems are positive finite numbers in some interval x e [0, L] where L > 1 . If we define the truth valuation as t(x) = x and the complement as C(x) = l/t(x) = 1/x, then we can treat these real numbers akin to logical values. For example, all values of x > 1 would be interpreted as more true than false and their complements would always be (1 /x) < 1 . In the IEEE floating point standard a very large number is called a NaN {not a number) which is its way of saying something like "infinity" is returned when and overflow occurs. Therefore the 1/x can easily be used as a fuzzy complement by trapping the NaN, and thinking of two values {0, °°} as the two truth values of crisp logic [10]. In general the outputs (using the suppressed summation notation of Einstein) for this ANN are of the n \n(y.)

= wik\n(xk)

or

y- =

wik J j xk

(4)

k= 1 where the repeated index denotes summation over that index. This network is obviously a [nonlinear] polynomial network, and thus does not have to "approximate" polynomial functions as the standard neural networks. The clustering is naturally explicable in terms of logic so that association rules follow easily. Since we can interpret multiplication as akin to a logical-AND (conjunction) and addition as a logical-OR (disjunction), we can then convert Eq (4) to the logical-form of a neural network and train it using the actual data values instead of the normalized values. A special fuzzy logic developed Hubey[7] is especially well-suited to interpret such multiplicative neural networks. The resulting neural networks are not created ad hoc but rather follow directly and logically from the results developed in Hubey[l]. Therefore, using the QuineMcCluskey method, or a related method we can cluster the inputs, and thus classify them. Using this minimization (or clustering) we can create a specially tuned multiplicative neural network which can be interpreted using this special fuzzy logic. This is the essential basis of the method. w21 -w22 -w23 Terms, [or fuzzy-minterms] such as Xj x 2 x^ serve functions similar to dimensionless groups of fluid dynamics [1] and the exact relationships amongst the input variables should be sought in terms of these groups. Hence, the method also has achieved a dimension reduction akin to PCA. For more examples of the use of dimensional analysis books such as White[10] Olson [11] may

281

be consulted. An example of the use of dimensional analysis to solve a problem in speech can be found in Hubey[12]. At the same time, we have achieved the solution to one of the problems associated with neural networks, that is, we now know how many output neurons an ANN should have for some specific problem at hand. We can now modify digital circuits which are custom-made for the problem at hand to create a [multiplicative] neural network which can be interpreted using the specific fuzzy logic shown above. This multiplicative neural network is customized for the problem and also does nonlinear regression. In addition, (i) we know how many output neurons we should have (ii) can perform nonlinear separation and (iii) does not need a second stage (for classification). As a simple example, a simple singlelayer multiplicative network can solve the XOR problem of Minsky [1]. It is known from empirical evidence, and from the Buckingham Pi Theorem [11,12] that nonlinear dimensional reduction is affected when dimensionless groups of variables regressed against each other. It is based on Rayleigh's "method of dimensions" in (Theory of Sound, 1887). The Pi Theorem was first stated by Vaschy, and proved in increasing generality by Buckingham, Riabouchinsky, and Martinot-Lagarge, and Birkhoff [12]. Without dimensionless numbers experimental progress is fluid dynamics and heat transfer would have been almost nil; it would have been swamped by masses of accumulated data. Indeed, the Navier-Stokes equations for fluid dynamics have never been solved in generality and all progress is due to dimensional analysis. It may be said that dimensional analysis was the first datamining technique used in the sciences. However clusters of variables (which are multiplicative, and may even be ratios after training) can be used as dimensionless groups and thus determine the 'size or dimension of the problem'. It is analogous to embedding a high-dimensional problem into a smaller dimensional space. 7

Conclusion

The method of KH-mapping combined with the Quine-McCluskey-like algorithms, together with the special fuzzy logic in Hubey[7] is the analogue ofPCA methods of statistics and Dimensional Analysis of physics. It is an embedding of a high-dimensional problem into a smaller dimensional space, and it is also a datamining, clustering technique. With minor modifications the method can handle; i) Supervised or Goal-directed Clustering; ii) Multiple Stages (Product-ofSums and Sum-of-Products); iii) Nonspherical Clusters; iv) Spectral Analysis in the Time domain. Correlation matrices can be used. In addition the method is highly scalable and parallelizable. Furthermore, the KHmap is ideal for visualiztion of high dimensional data. It can easily handle extremely large data sets or sparse data sets with minor modifications and can be used along with statistical

282

sampling techniques. It works together with specialized fuzzy logics [7,14,15] to create interpretations of the complex phenomena in high dimensional spaces. References 1. Hubey, H.M. A Complete Unified Method for Taming the Curse of Dimensionality in Datamining and Allowing Logical-ANDs in ANNs, submitted to the Journal of Datamining and Knowledge Discovery, June 2001 2. Han, J and M. Kamber, (Data Mining, Morgan Kaufmann, New York, 2001) 3. Stallings, Wm. Computer Organization and Architecture, (Macmillan, New York, 1993) 4. Hubey, H.M. Mathematical and Computational Linguistics, (Mir Domu Tvoemu, Moscow, Russia, 1994) 5. Leighton, T., Intwdution to Parallel Algorithms and Architectures, (Morgan Kaufmann, San Mateo, California, 1992) 6. Hecht-Nielsen, R (1990) Neurocomputing, Addison-Wesley, Reading, MA 7. Hubey, H.M. The Diagonal Infinity: problems of multiple scales, (World Scientific, Singapore, 1999) 8. Klir, G. and B. Yuan, Fuzzy Sets and Fuzzy Logic, (Prentice-Hall, Englewood Cliffs, NJ, 1995) 9. Jang, J., Sun, C. and E. Mizutani, Neuro-Fuzzy and Soft Computing, (Prentice-Hall, Upper Saddle River, NJ, 2000) 10. Hubey, H.M. Mathematical Foundations of Linguistics, (Lincom Europa, Muenchen, Germany, 1999) 11. White, F. Fluid Mechanics, (McGraw-Hill, New York, 1979) 12. Olson, R. Essentials of Engineering Fluid Mechanics, (Intext Educational Publishers, NY. 1973) 13. Hubey, H.M. Vector Phase Space for Speech Analysis via Dimensional Analysis, Journal of the International Quantitative Linguistics Association, Vol 6, No 2, August 1999, 117-148. 14. Hubey, H.M. Fuzzy Logic and Calculus of Beauty, Moderation and Triage, The Proceedings of the 2000 International Conference on Mathematics and Engineering Techniques in Medicine and Biological Sciences (METMBS2000), June 26-29, Las Vegas. 15. Hubey, H.M. Fuzzy Operators, Proceedings of the 4th World Multiconference on Systemics, Cybernetics, and Informatics (SCI2000), July 23-26, 2000, Orlando, FL.

283 PROCESS C O N T R O L OF A LABORATORY COMBUSTOR USING NEURAL NETWORKS T. SLANVETPAN AND R. B. BARAT Department of Chemical Engineering, New Jersey Institute of Technology, Newark, AV'07102 E-mail: [email protected], [email protected] JOHN G. STEVENS Department of Mathematical Sciences, Montclair State University, Upper Montclair, NJ 07043 E-mail: [email protected] Active process control of nitric oxide (NO) emissions from a two-stage combustor burning ethylene (doped with ammonia) in air is demonstrated using two multi-layer-perceptron neural networks in series. Steady-state experimental data are used for network training. A Visual Basic interface controller program accepts incoming concentration and flow rate data signals, accesses the networks, and outputs feedback control signals to selected electronic valves. The first network identifies the amount of ammonia in the feed. Based on that value and the NO set point, the second network adjusts the first-stage fuel equivalence ratio
1

Introduction

Minimization of both transient and steady state emissions through active process control of combustion systems like waste incinerators, furnaces, turbines, automobile engines or power plants has become an important research area. Classical controllers such as the Proportional-Integral-Derivative (PID) are widely used. However, they can be challenged by process oscillations when large disturbances or set point changes are encountered. Another drawback is that tuning the PID is time consuming and requires a combination of operational experience and trial-and-error. The task becomes even more difficult for highly nonlinear processes especially in the presence of significant time delay. The use of controller models based on artificial neural networks has become an important research area. The inherent parallel structure and its ability to learn non-linear relationships from a known input-output data set have made the neural network a strong candidate for an alternative to PID. In this study, the emissions from a laboratory combustor are controlled via an active feedback control loop operating with trained neural networks.

284 2

Experimental Setup

2.1 The two-stage combustion facility A two-stage reactor served as the combustion facility. It has been well characterized elsewhere [3, 4, 5]. The first stage is a well-mixed zone that can be modeled as a perfectly stirred reactor (PSR). The hot effluent from the first zone passes into a linear flow zone that can be modeled as a plug flow reactor (PFR). Gas residence times are ~ 25 milliseconds. Thermocouples measure temperatures in various locations. Watercooled extractive probes withdraw gas samples for analyses. Figure 1 shows the overall system. 2.2

Controller Design and Setup

The application of feedback process control for the combustor involved several components and tasks. Two electronic control valves, one for primary air and the other for secondary air, served as the final control elements. Analog signals from continuous emission monitors (0 2 , NO, and C02) and PSR zone thermocouple were fed continuously into a Fluke data logger. All signals from the Fluke data logger were digitalized and simultaneously sent into the controlling computer COM port via an RS232, based on requested command from the controlling computer. A Visual Basic program with control software and hardware enabled the computer to process, display, and transmit signals from a thermocouple or gas analyzer simultaneously. The computer also provided feedback control by detecting deviations from assigned set points and generating correction signals (4-20 milliamps) to the electronic valves through a Keithley 12 bit 8-channel analog output board (DDA-08/16). All experimental data are transferred and recorded into an Excel spreadsheet. 2.3

Neural Network Architecture

Network architectures are varied depending on the complexity of each individual process and the objectives for using them. Multi-layer-perceptron (MLP) networks were constructed and used in each phase of the experiments. This architecture has been successfully used in several neural-network-based control applications [1, 2, 6, 7]. The back-propagation learning algorithm was applied because of its simplicity and ease of use. It is an iterative process that involves changing the weights, by means of a gradient descent method, to minimize the learning error. In the learning process, the training data were partitioned into two groups, one for training the network and the other for testing the network. The training data were

285

presented to the networks in random order to break up any serial correlations. Doing so led to significant improvements in convergence speed and performance of the trained network. Network weights were updated after the presentation of a set of known inputoutput pairs during the training. The NeuroSolutions software package from NeuroDimension Inc. was used to construct all neural networks in this study. The software combines a modular, iconbased, network design interface with a built-in custom wizard. These allowed us to build our own networks, generate and compile executable dynamic link library (DLL) files, and embed them into the existing Visual Basic controller interface. The DLL file is an executable module that performs the inputs-outputs mapping function (recall process). All measurable combustor parameters were displayed and preprocessed in the Visual Basic controlling interface before being fed into the neural networks. 3

Experimental Results

The main objective of this experiment was to develop a neural-network-based nonlinear process identification and model-predictive controller for minimizing the NO level from the two-staged combustor. Ethylene (C2H4) served as the fuel. Ammonia (NH3) was used as a model waste dopant containing fuel-bound nitrogen to produce NO. The purpose of the controller was to maintain the first stage fuel equivalence ratio ((b^ and the overall fuel equivalence ratio (i is kept at about 1.35, with 0 is fixed to achieve low CO emissions [5, 13], the new secondary air rate (A2) is computed from: £ = (F/(A + A2))/\F\ A)stoiMomeMc •

286

Figure 3 shows the results from the experiment. The initial equivalence ratios were set at (|>i = 1.14 and <> | 0 = 0.89. The initial dopant/fuel ratio in the feed was set at 0.027. The measured NO concentration from the PFR was 460 ppm. This number was set as a process set point. A step disturbance was then applied to the NH3 flowrate to raise the dopant ratio to 0.057. Figure 3 shows the open-loop NO level rose to 620 ppm. In closed loop, the controller increased <(>! to 1.3 by reducing the Ai in order to bring the NO back to the set point (460 ppm). Flow rate A2 was increased in order to maintain <|>0. The control action was set to execute every 240 seconds to facilitate the long time delay in the gas sampling and NO analysis process. This allowed the controller to receive the true value of the feedback signal from the NO analyzer. Although the controller response tended to be somewhat sluggish, the NO level was brought back to the set point after a reasonable period of time. Several strategies have been used to deal with the process time delays. Here, recurrent methods [3,4] such as Smith predictor and internal model controller (IMC) are under consideration. To extend the range of the neural network operation and improve the network accuracy, steady-state PSR+PFR simulation results with detailed chemical mechanisms will be used to expand the training set to cover a wider range of combustor operating conditions. 4

Conclusions

This paper has demonstrated the usefulness and effectiveness of applying a neuralnetwork-based controller to a combustion process. The second experimental part provided a detailed case study in which neural networks were applied to the nonlinear NO control process with significant time delay in the sampling process. The use of a controller based on the two neural networks connected in series was demonstrated. With the neural-network-based identifier and controller, the process was successfully brought back to the set point after a step disturbance in the feed stream.

287

Two-itaf td coakutor

=2. <

-HI

<XH

1 r^Qt ^

^

«.Qft Figure 1. Overall experimental system

, — 1 _ 5 ^ _ , I............

5*3

^S

-^ t'

PSB

PFR

m i "in -W

IX]

Aeaul (low line -EkranmicMiWDt™

Figure 2. NO control experimental setup

288 703 650 600 550 500 450 " 400 closed-bop

350 300 0

1000

1500 71me(s)

Figure 3. Open-loop and closed-loop results of NO experimental control

References 1. Allen, M.G., C.T. Butler, S.A. Johnson, E.Y. Lo, and F. Russo, "An Imaging Neural Network Combustion Control System for Utility Boiler Applications." Combustion and Flame. Vol. 94, (1993): 205-214. 2. Bhat, N., T.J. McAvoy, "Use of Neural Nets for Dynamic Modeling and Control of Chemical Process Systems." Computers Chemical Engineering. Vol. 14. No. 4/5 (1990): 573-583. 3. Cheng, Y., T.W. Karjala, and D.M. Himmelblau, "Identification of Nonlinear Dynamic Processes with Unknown and Variable Dead Time Using an Internal Recurrent Neural Network." Industrial & Engineering Chemistry Research. Vol. 34,(1995): 1735. 4. Chovan, Tobor, Thierry Catfolis, and Kurt Meert, "Neural Network Architecture for Process Control Based on the RTRL Algorithm." AIChE Journal. Vol. 42. No. 2, (1996): 493-502. 5. Mao, Fuhe, "Combustion of Methyl Chloride, Monomethyl Amine, and Their Mixtures in a Two Stage Turbulent Flow Reactor." Ph.D. Dissertation, New Jersey Institute of Technology. (1995) 6. Palancar, Maria C, Jose M. Aragon, and Jose S. Torrecilla, "pH-Control System Based on Artificial Neural Networks." Industrial & Engineering Chemistry Research. Vol. 37, (1998): 2729-2740.

289

7.

Syu, Mei-J and Bow-C. Chen. "Back-propagation Neural Network Adaptive Control of a Continuous Wastewater Treatment Process." Industrial & Engineering Chemistry Research. Vol. 37, (1998): 3625-3630. 8. Mao, Fuhe and Robert B. Barat "Minimization of NO During Staged Combustion of CH3NH2." Combustion and Flame. Vol. 105, (1996): 557-568.

Communications Systems/Networks

293 INVESTIGATION OF SELF-SIMILARITY OF INTERNET ROUND TRIP DELAY JUN LI AND CONSTANTINE MANIKOPOULOS Electrical and Computer Engineering Department, New Jersey Institute of Technology, University Heights, Newark, NJ 07102, USA E-mail: [email protected] and [email protected] JAY JORGENSON Department of Mathematics, CCNY, Convent Ave, at 138 ST., New York, NY 100031, USA E-mail: [email protected] Measurements of local-area and wide-area network traffic have shown that network traffic magnitude exhibits variability at a wide range of scales—self-similarity. Interestingly, recent research work on network packet delays demonstrates that the distribution of the round trip delay of the packets that traverse the Internet also obeys self-similarity. Understanding the reason of self-similarity in packet round trip delay is critical in order to improve quality of services (QOS) and the efficiency of network bandwidth utilization. In this paper, we show that the phenomenon of self-similarity in the packet round trip delay process is mainly caused by the self-similarity in the background traffic. Moreover, the Hurst parameter of the packet round-trip delay process is proportional to the Hurst parameter of the process of the background traffic magnitude.

1

Introduction

Recent measurements of local-area and wide-area network traffic have shown that it exhibits variability at a wide range of scales [1][2]. Such scale-invariant variability is in strong contrast to the traditional models of network traffic, which show variability at short time scales but are exponentially smooth at large scales. This effect is described statistically as long-range dependence (LRD), and the time series description showing this effect is said to be self-similarity. Since the traffic pattern is significantly changed, the traditional analyses on network resource utilization and performance parameters are also challenged. In [3][4], some experimental and simulation studies show that long range dependence has significant impacts on all metrics of network performance, such as packet loss rate and queue occupancy. The main reason why LRD affects these metrics is that LRD traffic seriously decays the queuing performance. As showed in the experiments in [3], the tail of a queue length distribution decays much more slowly with LRD packet traffic than with short-range dependence (SRD) packet traffic, which is widely used in traditional network performance analysis. Recent studies based on empirical measurements using the User Datagram Protocol (UDP) probe packet, transmitted over the Internet, showed that the round-trip delay of the packets also exhibits self-similarity [5][6][7]. In these experiments, each packet

294 generated by the source is routed to the destination via a sequence of Internet backbone switches or routers, and then is sent back by the destination to the source. The round-trip delay is thus the sum of the delays experienced at each hop, as the packet gets transferred. Since these packets are of fixed packet length, each such delay, in turn, consists of two components: a fixed component, that includes the transmission time delay at a hop and the propagation delay on the link to the next node, and a variable component, that includes the processing and queuing delays at each node. Through statistically analyzing these data traces, the researchers found that although packet interdeparture times are deterministic, arrivals at the source exhibit LRD in most cases. It is noteworthy that the degree of LRD of these data traces, as indicated by the Hurst parameter, varies significantly. Understanding the reason why the round-trip delay exhibits LRD, especially why the degree of LRD varies significantly from case to case, is very important, for the proper design of network algorithms, such as routing and flow control algorithms and for the dimensioning of buffer and link capacity. The empirical measurement data for the Internet packet delay can be better modeled by a self-similar process with long-range dependence, rather than a traditional Poisson process with short range dependence. However, the reason why this packet delay is selfsimilar, has been obscure, but will be revealed in this paper. In order to figure out why the Internet round trip packet delay is self-similar, we should consider the following two factors. The effect of the background traffic: When the probe packets transfer through the Internet, they do so in the company of other background Internet traffic. Now, let us assume that these probe packets as well as the background traffic are fed into a network router with fixed service rate. If the background traffic is self-similar, then the queue length of the router, when viewed as a time series, will also be self-similar. As a consequence, the queuing delay, at the router, of the probe packet should also exhibit self-similarity. Thus, the background traffic has significant effect at least on the router delay component of the Internet round trip packet delay. This is borne out by our simulation results, shown in Section 3. The correlation of the queues of the routers: As indicated by recent empirical measurements on the Internet round trip delay is self-similar. As the UDP probe packets traverse the Internet, sometimes from one continent to another, they will often go through a series of routers or switches, some of which will part of the backbone. As discussed earlier, the distribution in time of the queue length of a specific router is selfsimilar. Traffic through these routers subsequently converges into the inputs of other routers, thus influencing greatly the distributions in their queues. Thus, we may reasonably expect that the length of the queues of the routers, that the probe packet passes, should have some degree of correlation. This issue is also indicated by the work in [7] when the author worked to derive the formula for Internet packet delays.

295

In this paper, we focus on the effect of the first factor, i.e. the effect of the background traffic magnitude. We carry out a simulation study on the packet round trip delay process, by analyzing the behavior of the round trip delay of probe UDP packets, when the degree of LRD of the background traffic is changed. Our simulation results show that the degree of LRD of the round trip delay process of the probe UDP packets is strongly related to the degree of LRD of the background traffic. The rest of the paper is organized as follows: In section 2, we will present our approach to generate the experimental data sets. And in section 3, we will present the results of statistical analyses on the simulation data. Finally, in section 4 we conclude this paper.

Figure 1. Diagram of the simulation network

2 Simulation Experiments In our simulation experiment, we model two classes of traffic, that is, the UDP probe traffic and the background Internet traffic. In this section, we will present our approach to generate the experimental data sets. First, the simulated network configuration is introduced in subsection 2.1. In subsection 2.2, we present our approach to model the background Internet traffic. In subsection 2 3 , we introduce our technique to generate the background traffic with different degree of self-similarity. In subsection 2.4, we discuss the method to generate the UDP probe packets and calculate the round-trip delay of these packets. 2.1 Network Model Our simulated network, shown in Figure 1, consists of four subnets: Subnetl, 2, 3, and SubnetJServer: The clients are located in Subnetl (Ethernet), Subnet2 (FDDI), and Subnets (Token Ring), that are all connected with routers to the server, located in

296 Subnet_Server (Ethernet). The simulations were carried out using OPNET, a network simulation facility. During the simulations, the clients in Subnetl, 2 and 3 establish conversations with the server in Subnet_Server. The links connecting the four Subnets and routers are T l . 2.2 Background Internet Traffic Configurations In our simulation experiments, we model the four most popular TCP/IP services: Http, Telnet, Ftp and Smtp. The application-layer workload characteristics of these four Internet services used in our simulations derive from the reported literature, as in [1][8]. From their work, we find that these workload characteristics closely resemble network measurements. The work in [9] identified these four services as being responsible for 86% of Internet traffic in bytes. We note that due to the workload characteristics of the four Internet services used in our simulations, the network traffic should be self-similar, i.e. the traffic will exhibit variability that appears as "burst" phenomena at a wide range of time scales. 2.3 Achieving Various Degree of Self-Similarity In this paper, we want to observe the behavior of the packet round trip delay when the degree of self-similarity of background traffic varies. In order to achieve this goal, we need to generate the background traffic with various degree of self-similarity. In our simulation experiments, we use the method suggested by the work in [10], that is, changing the shape parameter of the Pareto distribution, which is the statistical model of file size transferred from the server to the client for Http traffic, from 1.05, 1.3, 1.55, 1.8 to 2.0. As indicated by the Hurst parameters shown in Table 2 and 3, this is a practical way to achieve data traces with various degree of self-similarity. 2.4 Probe UDP Packets In our simulation experiments, we send the UDP probe packet from the source, which is a client located in Subnetl, to the destination, which is a client located in Subnet_Server. After the destination receives a probe packet, it will send it back immediately. When the probes reach the source, it will calculate the round trip delay in a way similar to that adopted by [7].

297 2.4.1

Choice of Parameters

Packet Length: In order not to affect the network workload, we use packets with small lengths for probing. We want to use these packets as probes in the network and obtain their round trip delay. In our simulation experiment, we fix the length of the probe packets to 700 bytes. In this way, we can assure that the propagation and transmission delays are fixed, and the variability of round trip delay is due mainly to the variable queuing delay. Inter-packet Departure: We send the UDP probe packet from the source every 100 milliseconds. We want to send as many probes as possible to get an accurate characteristic of the round trip delay. However, sending the probes in a faster rate will affect the background network workload and thus our observations. 3 Simulation Results and Discussion 3.1 Simulation Results In our simulation experiments, we gather data on three network performance parameters. The first parameter we collected is the aggregated packet rate, which is measured on the link connecting Router 1 with Subnet_Server. The traffic going through this link consists of two components, the background traffic and the UDP probe traffic. The second parameter is the length of the queue in Router 1. And the third parameter we collected is the round trip delay of the UDP probe packets. In Figure 4, we show the data traces we collected in our first group of simulation experiments. The results of our first group of simulation experiments are listed in Table 2. In this group of simulations, 50 clients located in Subnet 1, 2 and 3 will communicate with the server in Subnet_Server. The first column of Table 1 shows the shape parameters of the Pareto distribution, which is the statistical model of file size transferred from the server to the client for Http traffic. In column 2, we list the estimated values of the Hurst parameter of the background traffic magnitude, and in column 3 we list the estimated values of the Hurst parameter of the round trip delay of the UDP probe packets. Comparing the results in column 2 and 3, it is interesting to find that when the Hurst parameter of the background traffic magnitude decreases, the Hurst parameter of the round trip delay also decreases; in other words, the degree of self-similarity of round trip delay decreases when the degree of self-similarity of background traffic magnitude decreases. This is depicted in Figure 2. In order to check whether the network topology will affect this result, we carried out a second group of simulation experiments. In this group of simulations, we reduce the number of the clients communicating with the server from 50 to 25. Correspondingly, the background traffic magnitude will be reduced by one half. As shown in Table 3 and Figure 3, the degree of self-similarity of background traffic magnitude and the round

298 trip delay of the UDP probe packets measured across both topologies changed in very similar manner. Shape Parameter* 1.05 1.30 1.55 1.80 2.00

Background Traffic Magnitude (BTM) R/S Analysis Variance-Time 0.9171 0.9233 0.8456 0.8584 0.8011 0.8076 0.7154 0.7293 0.6331 0.6214

Round Trip Delay (RTD) R/S Analysis Variance-Time 0.9419 0.9512 0.8803 0.8814 0.7412 0.7567 0.6834 0.6712 0.6167 0.6101

Table 2 Estimates of Hurst parameter for data traces collected from network topology with 50 clients Round Trip Delay (RTD) Shape Background Traffic Magnitude (BTM) Parameter* R/S Analysis Variance-Time R/S Analysis Variance-Time 0.8710 0.8862 1.05 0.8310 0.8312 0.7038 0.7629 0.7613 1.30 0.7153 1.55 0.6986 0.7012 0.6489 0.6645 1.80 0.5623 0.5616 0.6246 0.6348 2.00 0.5325 0.5530 0.5417 0.5219 * The shape parameter of Pareto distribution, which is the statistical model of file size transferred from the server to the client for Http traffic. Table 3 Estimates of Hurst parameter for data traces collected from network topology with 25 clients 3.2 Discussion As indicated by our simulation results shown in Table 2 and 3, the background traffic has a significant effect on the packet round trip delay. Below we identify the mechanisms of the correlation of the degree of self-similarity of the packet round trip delay to the degree of self-similarity of the background traffic magnitude. As noted in section 1, packet delay in the Internet consists of four components. Considering packets with fixed length, going through a fixed route (the routing of the Internet packets is stationary, as found in [11]), the variability of packet delay is due only to the queuing delay. Since the background traffic and the UDP probe traffic will go through the same route, we can consider the situation that the two classes of traffic are fed into the queue of a specific router. When a probe packet experiences minimum queuing delay, it means that when it enters the queue, there are no packets in it and it is served immediately. However, this is a special case and rarely happens. More commonly, when the probe packet enters the queue, there are a number of packets waiting there, and the probe

299 packet can be served only after all the packets head of it are served. Since the background traffic is self-similar, that is, exhibiting variability at a wide range of scales, we expect that the queue length in number of packets waiting for service also exhibits variability at a wide range of scales. Accordingly, the queuing delay should also exhibit self-similarity of some degree. From Figure 4, we can see this phenomenon visually. This figure shows the data traces we collected in our first group of simulation experiments. The first column is the data traces of background traffic magnitude in packets/second. The second column shows the data traces of the queue length of Router 1 of Figure 1 in number of packets waiting for service. The third column is the round trip delay of the UDP probe packet in seconds. Comparing the figures in the same row, we find a consistent pattern of correlation; there are more bursts in the figures of queue length and round trip delay when there are more burst in the figure of background traffic magnitude. From this figure, we can understand why the degree of self-similarity of the packet round trip delay will be increased when the degree of self-similarity of the background traffic magnitude is increased. 4

Conclusions

As indicated above, two factors may cause the self-similarity of the Internet round trip delay. In this paper, we focus on the effect of the first factor, the background traffic. Based on our simulation results, we conclude that the background traffic has a significant effect on the Internet packet delay, and the degree of self-similarity of the Internet packet delay is strongly and proportionately related to the degree of the selfsimilarity of background traffic magnitude. Acknowledgement We thank OPNET Technologies Inc., for partially supporting the OPNET simulation software that we used. References 1. 2.

Vern Paxon and Sally Floyd, Wide-area traffic: the failure of poisson modeling, IEEE/ACM Transaction on Networking, 3(3) pp226-244, June 1995. W. E. Leland, M. S. Taqqu, W. Willinger and D. V. Wilson, On the selfsimilarity nature of Ethernet traffic (extended version), IEEE/ACM Transactions on Networking, Vol. 2, pp 1-15, Feb. 1994.

3.

4.

5.

6. 7.

8.

9.

10.

11.

Ashok Erramilh, Onuttom Narayan and Walter Willinger, Experimental queuing analysis with long-range dependent packet traffic, IEEE/ACM Trans. on Networking, pp. 209-223, April 1996 Jean-Chrysostome Bolot, Charactering end-to-end packet delay and loss in the Internet, Journal of High-Speed Networks, vol. 2, no. 3 pp 305-323, Dec. 1993 M. S. Borella and G. B. Brewster, Measurement and analysis of long-range dependent behavior of Internet packet delay," Proceedings, IEEE Infocom '98, pp. 497504, Apr. 1998. O. Gudmundson, D. Sanghi, and K. Agrawala. Experimental assessment of end-to-end behavior on Internet. In Proc. InfoComm '93, March 1993 Li, Qiong, and D.L. Mills. On the long-range dependence of packet round-trip delays in Internet. Proc. IEEE International Conference on Communications (Atlanta GA, June 1998), 1185-1191. Barford, Paul; Crovella, Mark. Generating Representative Web Workloads for Network and Server Performance Evaluation, http://cspub.bu.edu/techreports/1997-006-surge.ps.Z. Kevin Thompson, Gregory J. Miller and Rick Wilder, Wild-Area Internet Traffic Patterns and Characteristics (Extended Version), IEEENetwork, Nov/Dec, 1997, pp 10-23. Kihong Park, Gitae Kim and Mark Crovella, "On the Relationship between File Sizes, Transport Protocols, and Self-Similar Traffic", Technical Report, Boston University, TR-96-016 Y. Zhang, V. Paxson, S. Shenker, and L. Breslau, The stationarity of Internet path properties: routing, loss, and throughput, in submission, Feb. 2000.

Hurst Parameter of BTM

Hurst Parameter of BTM

Figure 2. Estimates of Hurst parameter of our first Figure 3. Estimates of Hurst parameter of our second group of simulations group of simulation experiments

301

iiyiiiiliiiial

•-lylMuiil

1:

#liiiiili

^HiiiiMil

iminmir**** 1 °> jjJAMilintiMMfr

Figure 4. Data Traces collected in our first group of simulation experiments with packet rate of background traffic (left), queue length (middle) and round trip delay (right). The figures in row 1, 2, 3,4 and 5 correspond to the data generated with shape parameter of Pareto distribution equal to 1.05, 1.3, 1.55, 1.8 and 2.0.

303

MODIFIED HIGH-EFFICIENCY CARRIER ESTIMATOR FOR OFDM COMMUNICATIONS WITH ANTENNA DIVERSITY UFUK TUKELI AND PATRICK J. HONAN Department of Electrical Engineering and Computer Engineering, Stevens Institute of Technology, Hoboken, NJ 07030 E-mail: [email protected] Orthogonal frequency division multiplexing (OFDM) based wireless communication systems combined with coding and antenna diversity techniques operate at very low signal-to-noise ratio (SNR) levels. Receivers are generally coherent demodulators implemented around a fast Fourier transform (FFT), which is the efficient implementation of the Discrete Fourier Transform (DFT). Carrier synchronization is critical to the performance of an OFDM system. Tone or sub-carrier orthogonality is lost do tofrequencyoffset error. OFDM carrier frequency offset estimation and subsequent compensation can be performed from the received signal without periodic pilots or preambles. OFDM algebraic structure can be exploited in a blind fashion to estimate carrier offset. However, the performance degrades at low SNR. The algorithm here will allow highly accurate synchronization by exploiting maximum diversity gains to increase effective SNR without reference symbols, pilot carriers or excess cyclic prefix. Furthermore, diversity gains overcome lack of identifiability in the case of channel zeros on the DFT grid.

1

Introduction

Next generation wireless communication systems will handle broadband applications and OFDM coupled with antenna diversity has been proposed [1]. Carrier frequency synchronization is critical to the performance of an OFDM system. Tone or sub-carrier orthogonality is lost do to frequency offset error. This results in higher inter-channel-interference (ICI) levels thus lowering signal-tointerference-and-noise ratio (SINR). For OFDM, frequency offsets of as little as 1% begins to result in noticeable penalties in SINR [2]. These effects can be countered by correcting for frequency offset prior to the FFT. This requires the accurate estimation of the frequency offset. OFDM is the standard modulation scheme used in Europe for Digital Audio Broadcasting (DAB) and Digital Video Broadcasting (DVB) [9,10]. In addition, local area networks (LANs) such as IEEE 802.1 la are OFDM based. These systems are based on earlier developed synchronization methods using known preambles and/or periodic pilots. Pilot based carrier synchronization systems consume bandwidth and power and result in significant intra cell interference. These methods spend valuable bandwidth and power resources and require channel estimation. Since the performance of OFDM is severely degraded in the presence of carrier frequency offset, reliable channel estimation is difficult to perform before carrier frequency offset compensation. Blind synchronization that exploits the structure of

304

the signaling can be applied directly to the received signal without periodic pilots or preambles [3,4]. Next generation systems will benefit from extensive coding and diversity techniques able to operate at extremely low to negative SNRs. These techniques in particular multi-antenna diversity are quite effective in combating multi-path channel fading effects. Synchronization methods will not directly benefit from coding and channel estimation gains. These prospects make the task of synchronization that much more difficult. The algorithm presented here will allow highly accurate frequency offset estimation, even at low SNR, by exploiting maximum antenna diversity gain. This paper will present an algorithm for maximizing multi-antenna diversity gain for the purposes of improved blind frequency offset estimation. The paper will proceed by first formulating the algorithm around the high efficiency blind frequency offset estimator proposed in [3]. Then observations of the estimators improved performance, as modified by this algorithm, are discussed in terms of identifiability and increased effective SNR. Finally, numerical simulation results are presented and discussed. 2

Problem Formulation

Carrier offset estimation on a multi-path frequency selective fading channel at low SNR results in high variance. Antenna diversity is has been touted as one of the solutions to mitigate channel fading. The probability that all the signal components will fade is reduced when replicas of the same signal are received over independently fading channels [5]. Denote s(k)=[sj(k), s2(k),... sp(k)]T as the kth block of data to be transmitted. The transmitted signal is OFDM modulated by applying the inverse DFT to the data block s(k). Using matrix representation, the resulting N-point domain signal is given by: b={ql(k),S1(k)...gr(knT=VI*(fi,

(1)

where \Vp is a matrix of the NxN IDFT matrix W. In a practical OFDM system, some of the sub-carriers are not modulated in order to allow for transmit filtering. In other words, the number of sub-channels that carry the information is generally smaller than the size of the DFT block, i.e., P< N. because of the virtual carriers [2]. Without loss of generality, we assume carriers no. 1 to P are used for data transmission. For systems with antenna diversity, i.e. the receiver has m antennas the receiver input for the kth block consists of: yk=[y#;,y#;-y„^L

(2)

where y,{k) =WpH,-s(k), is the input to the ith antenna. H,= diagiHj (1), Hs (2),..., H,{Pj), where Ht (p) defines the channelfrequencyresponse at the pth sub-carrier. In

305

the presence of a carrier offset, e'*, the receiver inputs are modulated by E(<j>) = diag(l, e>*..... ei(N-1)lt') and becomes y,(k) =E(0)W,H,s(k) el^,)(N+Ng), where Ng is the length of the cyclic prefix. Since W / ^ E ^ W ^ I , the E(0) matrix destroys the orthogonality among the sub-channels and thus introduces ICI. To recover {s(k)}, the carrier offset, 0, needs to be estimated before performing the DFT. This paper presents an extension of the estimation method developed in [3] to take advantage of antenna diversity. This extension algorithm will compensate for deep fading of modulated carriers, and enable unique identification [4]. Frequency selective fading is to be expected for OFDM signals, which is used for broadband applications over multi-path frequency selective fading channels. The cost function developed in [3], minimizes the following cost function,

p{z)=X X < .z"' (z)y(*)y" (*)z(*)w™,

O)

where Z(z) = diag{\, z, z2,...,z"'1). The y(k) in (3) is equivalent to y/(k) as defined in (2), a single antenna case. An estimate of the covariance is performed as follows:

"y,W"T

•

y*y? =• \si(k)

••

ym(k)\ ym(k)H\

K=j i>*y? •

(4)

k=\

The estimate ROT is averaged over k=l,2..K sample blocks and used in the modified cost function as follows: P( Z ) = Xw;,Z-'(z)R,Z(z)w,

(5)

This form of the cost function is quite effective at taking advantage of multi-antenna diversity. The covariance calculation removes the phase dependency so that received signals are added constructively while preserving the algebraic structure due to the modulation matrix and carrier offset. Figure 1 depicts a multi-antenna receiver implementation of the proposed algorithm.

306

Figure 1. Multi-antenna receiver implementation. The algorithm is computationally efficient and a further improvement is achieved through an adaptive implementation [6]: Ryy(k)=aRyy(k-\)

+ (l-a)y(k)yH

3

(k),

dz w *

(6)

Observations

Severe multi-path complicates synchronization, especially in the presence of deep fades. The diversity combining technique proposed above is shown in numerical results section to be remarkably robust even in the presence of deep fading. Identifiability (uniqueness) conditions for frequency estimation are developed here. The effect of the novel combining algorithm, developed above, on the combined channel transfer is best demonstrated by first creating the new matrix representation rm =[H,,H 2 ,...H,J. The received signal vector and combined signals covariance function can then be expressed respectively as:

R„

=E()wprmRssrHm^EH()+m
(7)

where S^. is an m antenna generalization of s(£) and Vk additive noise, they are of matrix form defined similar to y k. The covariance expression (7) isolates the

307

multi-antenna signal and additive noise Vk components. The cost function introduced in (5) can now be modified as: P(z)

_ji = £ w^z-(z)E (^)w,r.R11r:w;E"(^)z(z)wM (8)

2 + t7v X<,Z"(z)I„Z(z)Wf

This new cost function results in effective SNR gain, directly by maximizing the diversity gain and indirectly by improving the estimator identifiability. Diversity gain is maximized by exploiting the time domain correlation of the signal. This is evident by considering that the OFDM signal will have correlation in time whose contribution is summed in (8) where noise is not correlated and will not benefit in the same way. The resulting gain over noise is equivalent to coherent combing of the received antenna signals. In the absence of additive noise, P(z) will have a zero at the offset frequency due to null-space orthogonality as developed in the previous section. Again, at the correct frequency offset estimate, that is z= e'*, the condition Z~ (z)E(0) = IN is satisfied, thus restoring the orthogonality of the received signal to the virtual carriers Wp+, and as a result a P(z) equal to zero. Limiting (8) to the no-noise case, the cost function is the sum of non-negative quadratic forms that are zero if and only if: w^,.z-'(z)E

(0)w^r„r^=o,

v; e [i, #-/>].

(9)

This expression assumes the input signals are independently distributed. Rss is a constant multiplied by the identity matrix due to persistence of excitation. The matrix T^r" is a non-negative diagonal matrix given by: T J t = diagiA,]; V/e[l,/>]. (10) We'll show that a lesser stringent condition than assuming full rank of r ^ T " will suffice. Lemma: For a unique zero of P(z) it is necessary and sufficient that only two of the diagonal elements of T T" namely Ax and AP be non-zero, as expressed below in terms of the combined channel responses:

A, =£#,(<))#;«», ••-I

AP =£//,(;>) W ) .

(i i)

i-i

Proof: The virtual carriers wp+, for V/e [1,N — P], consist of fej2""'*')""}""'', span the null-space of W p . The minimization P(z) seeks to find z such that w"+/ Z"'(z)E(0), a function of (2ic(P+i)/N + 0 - 0 ) ,

is in the left hand null-

308

space of the argument \V,r m r". Thus if TmF" is full rank and Q = (j), such that Z _1 (z)E(^)= I N , the arguments null-space is spanned by only the set of virtual carriers. If either A{ or A ? is equal to zero, which is an event of zero probability, P(z) is also minimized at = (p ± 2TC/N respectively, thus non-unique zero minima's. It should be understood that the channel zeros ^j2""1}_,, that fall on the FFT grid, other than at 1 or P, do not result in an ambiguity because (8) is not simultaneously minimized by contribution from (1:N-P> virtual carriers in (9), and the lemma follows. 1

1

1

. I

Figure 2(a). MSE vs SNR with channel zero at i=P.

10 fc .

.

.

.

.

.

.

.

.

.

.

.

.

Figure 2(b). MSE vs. SNR

The non-realizable (zero-probability) conditions proposed in the above are helpful to understand the effects do to added noise. At low SNR, channel zero's located even near sub-carriers result in estimator ambiguity (uniqueness) issues. As equation (8) shows, minimums near or below the noise level will result in ambiguity. Simulations discussed in the next section, resulting in Figure 3 (b), show this for an SNR of 15 dB which begins to cause an ambiguity for the m=2 case. The algorithm is able to mitigate this by effectively increasing SNR on all sub-carriers by exploiting maximum diversity gains.

309

rtTfftKln'aftBc) taqlflKtvttfaoc)

Figure 3(a). P(z) spectrum for multi-path and no-noise.

4

Figure 3(b). P(z) spectrum for multi-path and 15 dB SNR.

Numerical Results and Discussion

Computer simulations were run over 100 Monte-Carlo realizations of random FIR channels with five sample spaced taps with random uniformly distributed phase ~U(0,2 pi), normal distributed imaginary and real components ~N(0,1). For each Monte-Carlo run, K=5 symbols were used. The OFDM signal consisted of N=32 sub-carriers and P=20 data streams. Offset frequency used <j> =1.11 AD3 where A05= 27t/N is the channel spacing. The performance enhancements of the proposed algorithm are illustrated in Figure 2. The results shown in Figure 3 (a and b) respectively show the improved identifiability for multi-antenna cases m= 1,2,4 & 8, where the sample covariance obtained in (4) was used in the cost function (5). The performance enhancement in the presence of deep fading is most evident when comparing Figure 2(a) and 2(b), each includes multi-path but former has a channel zero at sub-carrier P. As expected, this has a severe impact on the performance of estimator with m=l, and to a lesser degree for the higher diversity configurations. For m=8 estimator shows little performance degradation and an approximately 9 dB relative improvement in SNR. Improved estimator identifiability is demonstrated by Figure 3 (a and b), which show the cost function, without noise, and corresponding to Figure 2(a) for an SNR of 15 dB. Evident in Figure 3(b) are the ambiguous minimums, which are beginning to become an issue for the m=2 estimator. As the SNR is further lowered, the m=4, and eventually m=8 estimator is impacted. But what's most impressive here, even at very low SNRs of minus 3-4 dB, for m=8 estimator meets the design target of

310

OFDM systems operating at much higher SNRs [8]. This combined with the proposed adaptive implementations, will allow high estimator resolution even at lower SNRs. The above simulation results demonstrate the utility of the proposed algorithm for the synchronization of OFDM wireless systems operating at extremely low SNR. 5

Conclusions

The modified high-efficiency carrier estimator for OFDM communications with antenna diversity proposed here will meet the requirements of the next generation wireless communication systems. The proposed by exploiting both the algebraic structure of OFDM and the gains inherent in multi-antenna diversity is able to attain efficiencies in terms of bandwidth, power, and computation beyond other proposed methods. References 1. Y. Li, N. Seshadri, and S. Ariyavisitakul, Channel Estimation for OFDM systems with transmitter diversity in mobile wireless channels. IEEE Journal on Selected Areas in Communications (1999) 17(3): 461^*71. 2. T. Pollet and M. Moeneclaey. Synchronizability of OFDM signals. In Proc. Globecom (1995) pp. 2054-2058. 3. H. Liu and U. Tureli, A high efficiency carrier estimator for OFDM communications. IEEE Communication Letters (1998) 2(4):104-106. 4. U. Tureli, D. Kivanc and H. Liu, Experimental and analytical studies on a highresolution OFDM carrier frequency offset estimator. IEEE Transactions on Vehicular Technology (2001) 50(2):629-643. 5. John. G Proakis, Digital Communications , 3"* edition, (McGraw-Hill, 1995) p. 777. 6. Simon Haykin, Adaptive Filter Theory , 3rd edition (Prentice-Hall, 1996) Chapter 8. 7. R.V. Nee and Ramjee Prasad, OFDM For Wireless Multimedia Communications , (Artech, 2000) Chapter 9. 8. T. Pollet, M. Van Blabel, and M. Moeneclaey, BER Sensitivity Of OFDM Systems To Carrier Frequency Offset And Wiener Phase Noise, IEEE Trans. Communications 43 (1995) pp.191-193. 9. European Telecommunications Standard, Radio Broadcast Systems; Digital Audio Broadcasting to mobile, portable, and fixed receivers, preETS 300 401, (1994) 10. U. Reimers, "DVB-T: The COFDM Based System For Terrestrial Television, Electron Commun., Eng. J. 9 (1995) pp. 28-30

311

A COMPARISON BETWEEN TWO ERROR DETECTION TECHNIQUES USING ARITHMETIC CODING BIN HE AND CONSTANTINE N. MANIKOPOULOS Department of Electrical and Computer Engineering New Jersey Institute of Technology Newark, NJ 07102 Email: [email protected] [email protected] This paper compares the error detection capability of two joint source and channel coding approaches using arithmetic coding, i.e., the approach with redundancy in form of forbidden coding space and the approach with redundancy in form of periodically inserted markers. On the one hand the comparison shows two approaches basically have the same detection capability, on the other hand the forbidden symbol approach is more efficient in error correction while the marker symbol approach is simple and suitable for packet switching networks.

1

Introduction

In recent years, the interest of joint source and channel coding has considerably increased. In many of wireless communication systems and the Internet services, Shannon's separation theorem does not hold [1]. A joint design is needed for these systems where source coding and channel coding function dependency to maximize system performance. Moreover, the joint source and channel coding with variable length codes is receiving increasing attention because of the efficient utilization of limited channel resources, e.g., bandwidth, for variable length codes. Of the variable length codes in joint source and channel coding, arithmetic coding [2] is widely used. Boyd et al. [3] proposed a detecting approach which introduces redundancy by adjusting the coding space such that some parts are never used by encoder. If decoding process enters the forbidden region, an error must have occurred. Kozintsev et al. [4] analyzed the performance of this approach in communication systems by introducing "continuous" error detection. The redundancy versus error detection time was studied. Pettijohn et al. [5] extended this work in sequential decoding to provide error correction capability. Another idea of using arithmetic codes for error detecting was proposed by Elmasry [6], where the redundancy needed for error detection is introduced to the source data before compression in the form of periodically inserted markers. The decoder examines the reconstructed data for the existence of the inserted markers. An error is indicated if a marker does not appear in its proper location. Figure 1 shows the diagram of this idea.

312 Input

(T\ J

Xt>

Data

Block Size Counter

•—

Marker Generator

SoaK

?

1 0

3 2 1

Q

Reconstructed Data

Source

Figure 1. Diagram of the marker symbol approach

In next section of this paper, a comparison between above two detection approaches using arithmetic coding is discussed. Two approaches are referred to as forbidden symbol approach and marker symbol approach, respectively. The comparison is carried out in terms of redundancy and error propagation distance (i.e., error detecting time [4]), which mainly determine the performance of system based on the approach. The comparison shows while two approaches basically have the same error detection capability, they should be applied to different kind of systems. 2

Comparison of Redundancy and Error Propagation Distance

For forbidden symbol approach, the redundancy and error propagation distance are analyzed by Kozintsev et al. [4], where the forbidden symbol is assigned probability e (0 < e < 1). Following geometric distribution with random variable F l is modeled to represent the number of symbols it takes to detect an error after it occurs, i.e., error propagation distance, Pyl{k) = (1 - e)*- 1 e

k = 1,2,.. .,co,

(1)

and the probability that error propagates more than n symbols decreases with n t h order of (1 — e),

P1[Yl>n]

=

(l-e)n.

(2)

The redundancy is Ri = — log 2 (l — e)

bits per symbol.

(3)

For the marker symbol approach, three kinds of marker strategies were introduced by Elmasry [6]. Because of its small amount of redundancy, the previous marker strategy is discussed in this paper where a block of size m

313

source symbols is turned into a block of (m +1) symbols by repeating the m t h symbol at the (m + l ) t h location. The amount of redundancy is R2 =

~.

(4)

m

The probability density function (PDF) of error propagation distance is plotted in Figure 2. m = 10, 20, and 30 are used to show the PDF with different amount of redundancy.

IS 20 25 30 35 Position of the detected error (characters)

40

45

SO

Figure 2. Error propagation distance P D F of marker symbol approach

We can see the PDF remains almost constant in each marker region. But between these marker regions, PDF drops quickly. The probability that error is checked within I markers can be statistically determined by simulation. Here is a simple explanation. Assuming one symbol contains c bits, we have 2° symbols in the alphabet. Since even a single error will cause the decoding process losing synchronization and the decoded data being garbled, when comparing the marker and the previous symbol, the probability that they are the same roughly equals to the probability that two independent numbers selected from 1 to 2C are same, which is 2?y.2c = ^~°- Letting random variable Y2 represent number of markers error will propagate, we have the probability that error propagates more than I markers, i.e., the misdetection probability, P2[Y2 >l]=

(2~ c )' = 2-

lc

(5)

Now we compare the redundancy and error propagation distance between two approaches. If the redundancy of marker symbol approach is counted in

314

number of bits per symbol, we have R2 — — bits per symbol. For the same amount of redundancy, i.e., Ri = R2, we have -log2(l-e) = - > m

(6)

(1 - e) m = 2~ c .

(7)

or, This means the probability that error propagates more than m symbols in forbidden symbol approach is equal to the probability that error propagates more than 1 marker in marker symbol approach. Similarly for I markers, we have (1 — e) / m = 2~lc and it shows two approach have basically the same error detection capability. Figure 3 compares two approaches with same redundancy. The forbidden space e = 0.12 and marked block size m = 30 are selected. We see two approaches have different error propagation distance PDF, but for error that propagates beyond a marked block size, the detection probabilities are same (the areas of two curves in a marked block size are same). 0.1

— —

ma/kef «ymbol approach (m-30) forttddgn symbol approach (e-0.12) |

0.09 0.08 0.07 0.06 u. 0 0.05 O. 0.04 0.03 0.02 0.01 0 0

10

20 30 40 Position of the detected error

50

60

Figure 3. Comparison of error propagation distances of two approaches

However, with the forbidden symbol approach, the error propagation distance distribution is non-uniform. This makes the approach useful for error correction. The reason is that we can estimate the error location based on the geometrically distributed error propagation distance. While in marker symbol approach, the error location PDF within a marked block is approximately uniformly distributed. We can only guess the error may be in a marked block, but do not know the error position within the block.

315

Although the marker symbol approach is less efficient in error correction, it is simple and does not change the entropy code in encoder and decoder. The approach can be applied to existing systems without modification of encoder and decoder. Moreover, though forbidden symbol approach provides continuous error detection, in current packet switching networks there is no need for high frequency of error checking. By introducing several markers in a packet, the marker symbol approach is capable of detecting errors with less computation complexity. 3

Conclusion

This paper compares the error detection capability between forbidden symbol approach and marker symbol approach using arithmetic coding. The comparison shows two approach have basically the same error detection capability. With its geometric distribution of error propagation distance, the forbidden symbol approach is useful for error correction. The marker symbol approach is simple and does not change the source encoding and decoding design. Though forbidden symbol approach provides continuous error detection, its high frequency of error checking is generally not needed in packet switching networks. The marker symbol approach with less computation complexity may be a better choice in this case. References 1. S. Vembu, S. Verdii, and Y. Steinberg. The source-channel separation theorem revisited. IEEE Trans. Inform. Theory, 41(l):44-54, Jan. 1995. 2. G. G. Langdon, Jr. An introduction to arithmetic coding. IBM J. Res. Develop., 28(2):135-149, Mar. 1984. 3. C. Boyd, J. G. Cleary, S. A. Irvine, I. Rinsma-Melchert, and I. H. Witten. Integrating error detection into arithmetic coding. IEEE Trans. Commun., 45(l):l-3, Jan. 1997. 4. I. Kozintsev, J. Chou, and K. Ramchandran. Image transmission using arithmetic coding based continuous error detection. In Proceedings of the Data Compression Conference, pages 339-348, Snowbird, UT, Mar.-Apr. 1998. 5. B. D. Pettijohn, K. Sayood, and M. W. Hoffman. Joint source/channel coding using arithmetic codes. In Proceedings of the Data Compression Conference, pages 73-82, Snowbird, UT, Mar. 2000. 6. G. F. Elmasry. Joint lossless-source and channel coding using automatic repeat request. IEEE Trans. Commun., 47(7):953-955, July 1999.

317 A N OPTIMAL INVALIDATION M E T H O D FOR MOBILE DATABASES WEN-CHI HOU, HONGYAN ZHANG Department of Computer Science, Southern Illinois University at Carbondale, IL 62901, USA E-mail: [email protected]. edu MENG SU1 Department of Computer Science, Venn State Erie, The Behrend College, Erie, PA 16509, USA E-mail: [email protected] HONG WANG CoManage Corporation, 8500 Brooktree Rd, Wexford, PA 15090, USA E-mail: michellew @ comanage. net Mobile computing is characterized by frequent disconnection, narrow communication bandwidth, limited communication capability, etc. Caching can play a vital role in mobile computing by reducing the amount of data transferred. In order to reuse caches after short disconnections, invalidation reports are broadcasted to clients to help update/invalidate their caches. Detailed reports may not be desirable because they can be very long and consume large bandwidth. On the other hand, false invalidations may set in if detailed timing information of updates is not provided in the report. In this research, we aim to reduce the false invalidation rates of the reports. From our analysis, it is found that false invalidation rates are closely related to clients' reconnection patterns (i.e., the distribution of the time spans between disconnections and reconnections). By using Newton's method, we show how a report with a minimal false invalidation rate can be constructed for any given disconnection pattern.

1

Introduction

Mobility and portability of wireless .communication create an entirely new class of applications and new massive markets combining personal computing and consumer electronics. Information retrieval is probably one of the most important mobile applications. In a mobile computing environment, a set of database servers disseminates information via wireless channels to mobile clients. Clients are often disconnected due to some battery power saving measures [16], unpredictable failures, etc., and they also often relocate and connect to different database servers at different times. Due to the narrow bandwidth of wireless channels, clients should minimize communication to reduce contention for bandwidth. Caching of frequently accessed data at mobile clients has been shown to be a very useful and effective mechanism in handling these problems. Many caching algorithms have been proposed for conventional client-server architectures. However, due to the unique features of the mobile environment, such as narrow 1

Correspondent author.

318 bandwidth, frequent disconnections, weak communication capability (of clients), etc., conventional algorithms are not directly applicable to mobile computing. Research and development in cache management for the mobile computing environment has been discussed, for example, in [2, 3, 4, 6, 7, 10, 11, 16, etc]. In order to reuse the caches after frequent short disconnections, invalidation reports are broadcasted to clients to help update/invalidate their caches [3, 11] in mobile databases. Detailed reports can be long, consuming large bandwidth, and thus may not be desirable. On the other hand, cached items could be falsely invalidated (called false invalidations) if detailed timing information of updates is not provided in the report. In this paper, we discuss how to construct a report with a minimal false invalidation rate. We have found that false invalidation rates have to do with clients' reconnection patterns (i.e., the distribution of the time spans between disconnection and reconnection). By applying Newton's method [1] to the clients' reconnection pattern, a design of the report with a minimal false invalidation rate can be obtained. The rest of the paper is organized as follows. In section 2, we describe and review caching management model in the mobile computing architecture and. In section 3, we take clients' reconnection patterns into account in the design of invalidation reports. By using Newton's method, a design with the minimal false invalidation rate is obtained. Section 4 is the simulation results and the conclusions. 2

Cache Management in Mobile Computing

2.1 Cache Management Problems in Mobile Computing Environment Caching can reduce client-server interaction, lessening the network traffic and messageprocessing overhead for both the servers and clients. Various cache coherence schemes [5, 14, 15, etc.] have been developed for the conventional client-server architecture. Since mobile client hosts often disconnect to conserve battery power and are frequently on the move, it is very difficult for a server to keep track of the status and locations of the clients and the validity of cached data items. As a result, the Callback approach is not easily implemented in the mobile environment. On the other hand, due to the limited power of batteries, mobile clients generally have weak or little transmission capability. Moreover, the narrow bandwidth of the wireless network could be clogged up if a massive number of clients attempt to query the server to validate their cached data. As a result, both the Callback and Detection approaches employed in the traditional client/server architecture are not readily applicable to the mobile environment and new methods for maintaining cache consistency have to be designed. Updates are generally made by the server and broadcasted to its clients immediately. Thus, as long as a client stays connected, its cache will be current. Discarding entire caches after short and, moreover, frequent

319 disconnections could be wasteful, as many of the cached items may still be valid. Thus, research in cache consistency has aimed to reuse the caches after short disconnections. An approach of broadcasting invalidation messages to clients to help update their cached items has attracted a lot of attention [8, 9, 10, etc]. It is generally assumed that there is a dedicated channel for broadcasting invalidation messages, which is different from the channel for data broadcast. Based on the timing of the invalidation messages being broadcasted by the servers, cache invalidation methods can be either asynchronous or synchronous. In the synchronous approach, the server gathers updates for a period of time and broadcasts these updates with the time when they were updated in the report. Note that some latency could be introduced between the actual updates and the notification of the updates to the mobile clients. Once invalid items are found in the cache, the client has to submit an uplink request for updated values. The broadcast of the invalidation report divides the time into intervals. A mobile client after reconnection has to wait until the next invalidation report has arrived before answering a query. That is, a mobile client keeps a list of items queried during an interval and answers them after receiving the next report. 2.2

Broadcasting Timestamp (BT) Strategy

Broadcasting timestamp (BT) strategy [3] was developed based on the synchronous invalidation approach. The report is composed of a set of (ID, timestamp) pairs, in which ID specifies an item that has been updated and the timestamp indicates when a change was made to that item. The longer the update activities are recorded in a report, the larger the invalidation report is, which can lead to a longer latency in dissemination of reports. 2.3

Bit-Sequence (BS) Approach

In the above approach, updated items are indicated by IDs and their respective timestamps in the report. When the number of items updated is large, the size of an invalidation report can become very large too. In order to save the bandwidth of wireless channels, the bit-sequence (BS) approach is proposed [11]. The bit-sequence mapping aims to reduce the naming space of the items. Since our approach is based on the BS approach, we will elaborate a little more here on this approach. In the BS approach, each data item is mapped to one bit of an N-bit sequence, where N is the total number of data items in the database. That is, the n* bit in the sequence corresponds to the n"1 data item in the database. A value "1" in a bit indicates the corresponding data item has been changed; and "0" indicates otherwise. This technique reduces the naming space for N items from Nlog(N) bits as needed in BT

320 approach to N bits here. It is noted that at least log(N) bits are needed to store an item's ID in the BT approach. The bit-sequence mapping is illustrated in Figure 2.1. Database

/

/ 1

y

/

3

4

N-l

k

it

i i

a

1

0

2

N /

1 i.

1

A

1

1

ii

0

Figure 2.1 Bit Sequence Mapping

In order to reduce false invalidations, a hierarchically structured and more detailed report is proposed [11]. Instead of using just one bit-sequence (and a timestamp), n bit-sequences (n > 1) (each is associated with a timestamp) are used in the report to show the update activities of n overlapping subintervals. Specifically, the i"1 sequence (0 < i < n-l), denoted Bj, has a length of N/21 bits, where N is the number of data items in the database; it records the latest N/2 ,+1 update activities. Each bit-sequence is associated with a timestamp T(B;) indicating since when there have been such N/21+1 updates. As shown in Figure 2.2, the first bit-sequence B 0 , has a length of N and has N/2 " 1 ' bits, showing that N/2 items have been updated since T(B 0 ). The second sequence Bi has N/2 bits, each corresponding to a data item that has been updated since T(B 0 ) (i.e., the " 1 " bits in B 0 ). Again, half of the bits in B] (i.e., N/4 bits) have the value " 1 " , indicating half of the N/2 items that have been updated since T(Bo) were actually updated after T(B]). In general, the j " 1 bit of the Bj represents the j * " 1 " bit in the Bn, and half of each bitsequence are l's. It can be observed that the total number of bit-sequences n is log(N). The modified scheme is called the dynamic BS. Instead of mapping an item to a bit in B 0 , each updated item is represented explicitly by its ID in the dynamic BS. Thus, B 0 is now made of the IDs of those items that have been updated since T(B 0 ). The rest of bit-sequences (B], ... Bn.]) are constructed in the same way as in the original BS. That is, sequence Bj (0 < i < n-l) has k/21"1 bits, with half of them being "l"s, where k is the number of items updated since T(Bo). If both an ID and a timestamp are implemented as a 32-bit integer, the total size of a report can be calculated approximately as 32k + 2k + 32 log(k), where 32k is the size of k IDs (i.e., Bo), 2k is the size of the rest of the bitsequences (i.e., Bi, ... Bn_i), and log(k) is the number of timestamps (or the number of bit sequences) [11].

321

Jata base

/

/

/

/

1 I k

. 1

3 |

\ N

1

0

\

9

I'l'

• • • •

Client 1 dlient2 •

1/ N/2

1

• •

1

0

0

0

"<J

tt

\

1

Clients' disconnectio n time

1

1

0

T(B,)

T(B„.2) Client 3

_*_

Cuiren time

2 bits:

1

0

T(B».,)

Figure 2.2 An Invalidation Report with Client Disconnection Time

3

An Optimal Construction of Hierarchical Bit-Sequences

Although Jing's hierarchically structured bit-sequences discussed above reduced the naming space of items and number of timestamps, there is no justification for why the bit-sequence hierarchy should be constructed based on half of the updates, i.e., N/2' or k/21 updates. In fact, this "half-update-partition" scheme could favor shorter reconnections than longer ones. Here, we use Figure 2.2 as an example. After reconnection at the current time, clients 1 and 2 will rely on Bo to invalidate their data items, and client 3 will use Bn.2. All cached items updated between T(B0) and T(Bi) in clients 1 and 2's caches will have to be invalidated after reconnection, even though some of them might have been already updated in the caches before disconnections, recalling that updates are immediately reflected on the clients' caches while connected. Notice that the time span between T(Bo) and T(Bi) is much longer than the time between T(B„.2) and T(Bn.!). Tlius, if there are a large number of clients like clients 1 and 2, who disconnected

322 during the earlier period of the window (which is quite likely), there could be a lot of items falsely invalidated. Clearly, this hierarchical structure with "half-update- partition" cannot achieve minimal false invalidations. We redesign Jing's hierarchically structured bit-sequences to minimize the false invalidation rate. Specifically, we will investigate the division of the n overlapping subintervals in the report such that the false invalidations can be minimized. As in any approach, a report can only cover the update activities of a limited period. The window size of an invalidation report W refers to the time period of updates covered in a report. The larger the window size, the longer the clients can stay disconnected without discarding the caches. However, the larger window size also gives rise to larger reports, which may cause longer latency between two consecutive reports, recalling that a reconnecting client has to receive a report before it can answer any query. Here, we assume that both W and L are fixed and predetermined by the system. 3.1 Reconnection Patterns It is observed from Figure 2.2 that false invalidation rates are closely related to the reconnection patterns of mobile clients (i.e., how long clients are likely to reconnect after disconnection). Therefore, to reduce the false invalidation rates, a reorganization of the bit-sequences that takes into account clients' reconnection patterns needs to be devised. Assume that the reconnection pattern can be represented by a certain probability Frequency

CT: Current Time DT: Disconnection time

1

1

1

2

1 3

I 4

1 5

Figure 3.1 The Reconnection Time Distribution

1 6

*• Reconnection time: CT - DT

323 distribution, such as the one shown in Figure 3.1, where the X-axis is the difference between the reconnection time and the last disconnection time of the mobile clients and the Y-axis represents the number of clients. Let us now analyze the relation between the false invalidations and the reconnection distributions. Assume that a mobile client disconnected at time x. After it reconnects and receives the first report, it looks for a sequence, say B i; in the report, with the largest timestamp that is less than or equal to its disconnection time x (i.e., T(Bj) < x ). If the client did not disconnect exactly at T(Bj), there might be a chance for false invalidation, because the client may have already updated some of the items in its cache between T(Bj) and x when it was still connected. The larger the difference between x and T(Bj), the more items might have been updated by the clients before disconnection, and those items would be falsely invalidated when reconnected. Now, let us derive the relationship between false invalidation and division of the window for a given reconnection pattern. Assume updates arrive at a rate C. Since the server receives update requests from users of all kinds, we may assume that updates are independent. Then, the expected number of items falsely invalidated for a client disconnected at x , denoted FI(x), is C*(x-T(Bt)). Letj{x) be the reconnection pattern. Then, the expected total number of falsely invalidated items, denoted by TFI, is „_1 7-(Bl+

1

)

m = c i ( i(x-T(B,y)* f(x)dx) 1=0

T(Bi)

= CX( \x*f(x)dx)-CJX Jr(B ; )*/W^) ;=o nsj) >=o us) 7W

„-i r
T(Ba)

i=0

= C \x*/(x)dc-CX(

J7X3)*f(x)dx)

T(Bil

where n is the number of bit-sequences in the report, T(B„) = CT (i.e., the current time). To minimize TFI is equivalent to maximum "y, <'=»

'{T(B ) * f(x)dx) ' a s r(fl,)

\x * f(x)dx T(B 0 )

constant. We will see how to find a partition of the window into n subintervals to maximum £ ( JT(B,) * f{x)dx) fr°m '=0

T{B,)

me

following theorem and its proof.

IS a

324

Theorem 1. Let £ be an arbitrary positive real number, fix) be a continuous real function. Then, there exists a vector X = [xx, • • •. xn_x ] r such that n-i

X

.

M

g(X) — S ( I x-* f(x)dx) i'=o

.

.

.

.

,

obtains its maximum, where n, Xo, xn are three constants,

Xi

x <•••< x < x-^, <•••< x , and the vector X can be approximated with error £ 0

I

,+1

n

Proof: Let x0 = a, Xn = b, then afaf(x)dx < g (X) < bfaf(x)dx. There exists a maximum value of continuous function g (X) for a-xQ<xx <--<xn_x <xn=b. When a = x0 = xx =••• = xn_ltxn =b, we have gXX)=aIf(.x)dx. a

The solution xx, • • •, JC„_J of dg(*) = 0 ,- _ i 2 ... n -1

must

^e m e

vames sucn

dxj

that g (X) attains maximum atX=[a, xh x^ ..., JC„_I , b]T. Consider the following equation ?4^- = & + if(x)dx + f(xi)(xi_l-xi) = 0,i = l,2,-,n-l. (D dXj

i

g (X) is smooth and it can attain its maximum. Therefore, the equation (1) has x =[xit,-,x„_ , — ,x l]T, solutions. Let X

F^)

=

MX),and

F(X) = [f,(X), F2(X),

•,/v 1 (X)] T

dX:

then (1) is equivalent to F(X) = 0, where 0 is a n-1 dimension vector [0, 0

(2) 0] . By using Newton's iterative method [1], T

we can find the approximation of the solution Xt as following: F(Xk) + F(XkXXM-Xk)

= 0,

(3)

XM-Xk=-F\Xk)-lF{Xk) = -DF(Xky1F(Xk),

*=0,1,2, •••.

We know that Xk -> X,, X, is the solution of (2) such that g ^ ) attains the maximum at X,

=(a,Xt,bf. We choose initial value

^o=[ Jc o,i. Jc o,2'---' Ji: o, n -i] T ' a<x0i
for i = l,2,---,n-l.

325 In (3), DF(X)

is the Jacobian matrix and Af,

f(x2) M, /(*,)

/(*z)

DF(X) --

0

/(* 3 ) M3

/U4)

0 0

/(*_,)

M„_,

whereM, =-2/(jt f ) + (*,_, -JC,-)/'(*,•) fori = l , 2 , — , n - l , Z)F(X)is asymmetric tridiagonal matrix and (3) is equivalent to the linear system equations DF(Xk)(X-Xk) = -F(Xk) (4) By using LU decomposition method [1], we can solve for the linear equations (4) to obtain Xk+i.

We know that the complexity of this method is O(n), n is the order of the

tridiagonal matrix. Equation (4) can be solved for sequence {jft}k = l,2,---. After Nsteps of iteration, the complexity is N • 0{ri) = 0(n). The arbitrary approximation of the solution in polynomial time can be obtained by using I*™ ~ ** | ! = "S <**+!..- - **, ) 2 ^ e. E > 0. II

112

,-_,

to find the number of steps N. Hence the result satisfies the precision requirement. Example 1. Assume that the reconnection pattern has a uniform distribution within the window, that is, fix) = c, where c is a constant. From the above theorem, we have g(X) = 1\™xif(x)dx=ctxi{xM

-x,)

Ft ( f ) = £f+1 cdx+ C(JC,._, -*,•) = c(*,+i - xi + XM ~ xi) fori = 1,2,- • •, n -1. If F ; ( X ) = 0 , we have xi+l — xt — jcg-_j — xt, then

+—. This indicates evenly n spaced intervals gives the optimal solution when the reconnection distribution is uniform within the window. x

= XQ

Example 2. Assume we are to divide a window of size 10 into 3 subintervals, and the reconnection distribution follows the formula

326 x,

0<x <5

y= 10-x, 5<x<10; Then, from the above theorem, we know that there exists a maximum value of g(X), where g(X) = xAXl

ydx + x,|"°ydx. a n d O < X l < x 2 < 1 0 .

If0<x,<5, 5<x2<10, g(X) = ^x2(x2

- 1 0 ) 2 - i ( * 2 -10) 2 x, +25*, - | V -

IfO<x, < x 2 < 5 , g(X) = iAT1(JC22-V) + ^ 2 ( 5 0 - X 2 2 ) . If 5 < JCJ < x 2 < 1 0 , * ( ^ ) = | ( * i -x2)(x2 -10) 2 - ^ U 2 -10) 2 ^,. With the help of MATLAB, we obtained that g(X) has the maximum value of 86.9385 when Xi = 3.1008 and x2 = 5.4005. Thus, we will divide the window at 3.1008 and 5.4005. In Section 4, we will use these two distributions to perform simulations and measure the performance. 3.2 Algorithms for Clients and Servers The algorithm has two parts: one runs on the server side and the other on the client side. The server maintains a sorted linked list, which we call an updatelist. The list contains data items updated during the last window period in chronological order. Each node in the linked list has the following data structure. typedef struct list { int index; int updatetime; struct list *next; int oneposition; } updatelist;

//the index number of the data item in the database // the time of the update // a pointer to the next node // the position among the " 1 " bits, i.e., ith 1-bit in the bit sequence

The server constructs and broadcasts the reports periodically using a dedicated channel, while the clients interpret the reports after reconnection. The construction of the

327 bit-sequences at the server side and the interpretation of the report at client sides are described as follows. • Server side algorithm 1. for (i = 0; i < n; i ++) // calculate timestamps T(Bj) for each Bj T(Bj) = CurrentTime - time[i]; // time[] stores the interval dividing values derived from Theorem 1. 2. for each node in the list // constructing B0; onecount = 1, initially { set the j l h bit of B 0 to " 1", where j is the index of the data item represented by the current node; oneposition (of the current node) = onecount; onecount = onecount +1; } 3. for (k=l; k S n-1; k++) // constructing Bk, 0< k < n-1 { allocate space for Bk (of length onecount bits) and intialize it to all 0's; onecount =1; while (updatetime of the current node > T(Bk)) do // for all nodes in the updatelist { set the j t h bit of Bk to "1", where j is the value of oneposition of the current node; oneposition (of current node) = onecount; onecount = onecount +1; } } • Client side algorithm: An input to the algorithm is the variable "Last" that indicates the last time when the client received a report. 1. if T(Bn.!) < Last, no data cache needs to be invalidated, Stop; //cache is up to date 2. if Last < T(B0), the entire cache is invalidated, Stop; // outside the window 3. Locate the bit sequence Bj such that T(Bj) < Last and Last < TCBj+i) for all j (0 < j < n ); 4. Invalidate all the data items represented by " 1 " bits in Bj. To determine the data items corresponding to the " 1 " bits in Bj in the step 4 above, the following algorithm can be ued. A: if j = 0, then use the positions of those " 1 " bits in Bn to identify the data items and stop; for each "0" bit in Bj, reset the ith " 1 " bit in Bj_i, where i is the position of a '0' bit in BJ; j = j - 1 and go back to step A. 3.3 A Dynamic Scheme Like the dynamic BS scheme [11], we can modify our method a little bit to further reduce the size of the report when then number of items updated is small. That is, instead of using an N-bit sequence for B 0 , we use explicitly the IDs of items that have been update since T(B 0 ). Other bit sequences (Bj, ..., B„_i) are constructed as before. If both IDs and timestamps are implemented as 32-bit integers, the overall size of the report is 32k +

328 y"! " I Bj I + 32n, where the first term is the size of k IDs (or B 0 ), the second term is the size of the rest of the bit-sequences (i.e., B], ... Bn.!), and the last term is the size of n timestamps. Note that the number of timestamps (or bit sequences) in our approach is in general different from that of Jing's. We will pick up this issue in the next section. 4

Preliminary Simulation Results

In this section, we report simulation results on length of the reports and false invalidations of our and Jing's approaches. We have chosen to experiment with the dynamic schemes of these two approaches because of their flexibility in accommodating a variable number of updates in the report (especially for Jing's approach). The results ought to apply to the original schemes without any difference. Due to space limitation, we present only some of the important results here. Interested readers are referred to [17] for more comprehensive simulation results. A report has two parts - bit-sequences and timestamps. Since item IDs are used in the first bit-sequence B 0 in both approaches, we shall exclude B 0 from the "bitsequences" in the following discussion, unless otherwise stated. We will also discuss the effect of timestamps on the overall size of a report later. The purpose of the simulations is mainly to compare the size of the reports and the effectiveness of the bit-sequences in reducing false invalidation rate, denoted FIR, which is defined as p T R _ number— of — items — falsely—invalidated number—of — items—invalidated To the best of our knowledge, there has been no study on the distributions of potential reconnection patterns. Therefore, we have chosen to use the two patterns, a uniform distribution and a non-uniform distribution with a peak in the middle of the window as described in the Examples 1 and 2 of Section 4.1 for our simulations. Hopefully, these distributions are good approximations to some of the potential reconnection patterns. It can be observed that cache size has no effect on FIR, which is due to the inaccuracy of the report. Therefore, we shall not mention cache size in the following discussion. We have also tested with various database update rates, 10%, 20%, 30%, 40%, and 50%, which are the percentages of items updated during the last window period, to see their effects on the false invalidation rate. The lengths of the bit-sequences in two approaches are usually different. As mentioned earlier, the expected size of Jing's bit-sequences (excluding B0) can be calculated beforehand as 2k, while in our approach it depends on the number of subintervals in the window. In order to compare the effectiveness of bit-sequences, we

329 have chosen the number of subintervals to be 3 in our approach so that the lengths of our bit-sequences can be as close to Jing's as possible. As discussed earlier in Section 3, for a uniform reconnection pattern, an evenly divided window gives the optimal performance. For the convenience of calculation, the window size has been set to be 10 time units in all simulations. In the following tables, we report the length of bit-sequences in bits in the "Length" column. The row "Ratio" shows the ratios of the length and FIR of our approach to Jing's.

Update Rate 10% Length FIR Optimal Jing's Ratio

20% Length FIR

30% FIR Length

40% Length FIR

50% Length FIR

1731 0.1883 2281 0.2008

3398 0.1878 4313 0.2004

5065 0.1884 6344 0.2001

6732 0.1881 8345 0.2006

8397 0.1880 10378 0.2005

0.7589 0.9379

0.7879 0.9371

0.7984 0.9412

0.8067 0.9377

0.8091 0.9378

Table 4.1. Uniform Distribution

As observed from Table 4.1, not only our bit-sequences are shorter than Jing's, but also achieve a slightly better (or lower) false invalidation rates (approximately 94% of Jing's). This implies our bit-sequences are more effective in lowering the false invalidation rate. It can be further observed that our bit-sequence size is around 80% of Jing's (i.e., 0.7589, 0.7879, 0.7984, 0.8067, 0.8091 for 10%, 20%, 30%, 40%, 50% update rates, respectively). This result is consistent with our analysis on the estimations of bit-sequence sizes. Recall that the length of Jing's bit-sequences is 2k (excluding B 0 ), while ours is k + (2/3)k = (5/3)k, where k is the size of Bi (and also the number of items updated), and (2/3)k is the expected size for B2. That is, ours is only 83% ((5/3)k / 2k ~ 0.83) of Jing's. The FIRs basically remain the same for different database update rates in each approach, that is, around 18.8% in our approach and 20.0% in Jing's approach. It indicates that FIR has to do with the ways of constructing bit-sequences, but has nothing to do with the rates of updates. Now let us consider the size of timestamps. The total size of timestamps in Jing's report is 321og(k), while it is 32n in ours, where log(k) and n are the numbers of bit-sequences in respective reports. In our report, there are 3 (i.e., n = 3) timestamps, while in Jing's report, it has log(k) timestamps (log( 1,000)= 10, log(2,000)=ll log(5,000)=13 for 1,000, 2,000, ..., 5,000 updates, respectively, during the last window). Clearly, we use much less timestamps and consume less space than Jing's report.

330 Update Rate 10% Length FIR

20% Length FIR

30% Length FIR

40% Length FIR

50% Length FIR

Optimal

1754 0.1755

3444 0.1759

5134 0.1755

6826 0.1754

8513 0.1757

Jing's

2281 0.2315

4313 0.2307

6344 0.2310

8345 0.2313

10378 0.2309

Ratio

0.7690

0.7581

0.7985

0.7626

0.8093

0.7598

0.8180

0.7585

0.8203

0.7610

Table 4.2 Non-uniform Distribution

In Table 4.2, we show the results for the non-uniform distribution described in Example 2 of Section 3.1. According to Theorem 1, we divided the window at 3.1008 and 5.4008. Again, our bit-sequences are shorter (about 80% of Jing's), and yet achieve much better (or lower) false invalidation rates, that is, 76% of Jing's. As in the uniform case, the FIRs remain the same in each approach for different item update rates. The FIRs are around 17.6% in our approach, compared to 23.1% in Jing's. It is worth mentioning that if there is enough bandwidth for longer reports, in our approach we can easily divide the window into more subintervals (i.e., more bitsequences) to achieve lower false invalidation rates. However, this may not be possible for Jing's approach because the number of bit-sequences (i.e., log(k)) is completely determined by the number of updates k during the window period (assumed fixed), and the number of updates and thus the update rate have nothing to do with FIR, as shown in the tables. That is, even though there is still bandwidth left for use, Jing's approach simply cannot use it to reduce the false invalidate rates. (Excess bandwidth may be used to cover longer periods though). In summary, our approach clearly outperforms Jing's approach in terms of the length of bit-sequences, number of timestamps, effectiveness of reducing FIR, and flexibility in using excess bandwidth to reduce false invalidation rates. References 1. 2. 3.

4.

Axelsson. "Iterative Solution Methods", Cambridge University Press, 1994. D. Barbara, "Mobile Computing and Databases-A Survey", IEEE Transactions on Knowledge and Data Owe Engineering, pp. 108-117, Vol. 11, No. 1, Jan/Feb, 1999 D. Barbara and T. Imielinski. "Sleepers and workaholics: Caching strategies for mobile environments". Proc. of the ACM SIGMOD Conference on Management of Data, pp. 1-12, May, 1994. J. Cai, K, Tan, and B. Ooi, "On Incremental Cache Coherency Schemes in Mobile Computing Environments," Proc. of IEEE Data Engineering, Pg. 114-123, April, 1997

331 5.

6.

7.

8. 9. 10.

11.

12. 13.

14. 15. 16.

17.

M. J. Carey, M. J. Franklin, M. Livny & E. J. Shekita, "Data Caching Tradeoffs in Client-Server DBMS Architectures", Proc. of ACM 1991 SIGMOD, pp. 357-366, May, 1991. H. Chung, H. Cho, "Data Caching with Incremental Update Propagation in Mobile Computing Environments", Proc. Australian Workshop on Mobile Computing and Databases and Applications, pp. 120-134, Feb. 1996. A. K. Elmargarmid, J. Jing, and T. Furukawa, " Wireless Client-Server Computing for Personal Information Services and Applications," ACM SIGMOD Record, pp. 4349, Dec. 1995. M. Franklin, M. Carey, and M. Livny. "Global Memory Management in ClientServer DBMS Architectures". Proc. ofVLDD, pp. 596-609, August 1992. C. G. Gray and D. R. Cheriton. "Leases: An Efficient Fault-Tolerant Mechanism for Distributed File Cache Consistency". Proc. ofSOSP, pp. 202-210, Feb. 1989. T. Imielinski, S. Vishwanath, and B. R. Badrinath. "Energy efficient indexing on air", Proceedings of the ACM SIGMOD Conference on Management of Data, Minneapolis, Minessota, 1994. J. Jing, A. K. Elmagarmid, A. Helal, and R. Alonso. "Bit-Sequences: An Adaptive Cache Invalidation Method in Mobile Client/Server Environments". ACM/Baltzer Journal of Mobile Network and Applications, 2(2), pp.115-127, 1997. M., Stonebaker, et al, "Third -Generation Data Base System Manifesto," SIGMOD Record 19, 3, pp. 241- 234, Sept. 1990. J. Strain, R. Acuff, T. Rindfleisch & L. Fagan, "A Pen-Driven, Mobile Surgical Database: Design and Implementations," http://www.smi.stanford.edu/projects/mobile/amia94-2.html, 1994. Y. Wang, "Cache Consistency and Concurrency Control in a Client/Server DBMS Architecture", Proc. of ACM SIGMOD 1991, pp. 367-376, May, 1991. K. Wilkinson & M. Neimat, "Maintaining Consistency of Client-Cached Data", Proc. of 16th VLDB, pp. 122-133, Aug.1990. K.L. Wu, P.S. Yu and M.S. Chen "Energy-efficient Caching for Wireless Mobile Computing", Proc. 12th International Conference on Data Engineering, pp. 34-50, Feb. 1996. H. Zhong, "A New Invalidation Method for Cache Management in Mobile Databases", Master Thesis, CS Department, SIU, May, 1999.

333 COMPARISON OF WAVELET COMPRESSION ALGORITHMS IN NETWORK INTRUSION DETECTION

ZHENG ZHANG, CONSTANTINE MANIKOPOULOS, Electrical and Computer Engineering Department, New Jersey Institute of Technology, University Heights, Newark, NJ 07102, USA, E-mail: [email protected], [email protected]

Department

JAY JORGENSON CCNY, Convent Ave. at 138 ST., New York, NY 10031, USA E-mail: jjorgenson@mindspring. com

of Mathematics,

JOSE UCLES Network Security Solutions, 15 Independence Blvd. 3rd FL., Warren, NJ 07059, USA E-mail: [email protected] In this paper we report on the experimental results of the effectiveness of using wavelet compression on the monitored data collected in HIDE, a hierarchical intrusion detection system. HIDE measures the network traffic parameters and abstracts them into probability density functions (PDFs), utilizing sixty-four bins. For the sake of resource optimization, we compressed these PDF representations using various wavelet families and then compared their effectiveness. The four families we studied are: hoar, sym2, coi/3, db4. The results showed that all four wavelet bases can reliably compress the PDFs from 64 bin values to 16 to 28 wavelet coefficients (compression range 3), without major performance deterioration. In comparing the four wavelet algorithms, we found that the sym2 wavelet family performed the best; its performance at compression range 5, employing only six wavelet coefficients, resulting in compression ratio of 10.67, is still very satisfactory.

1

Introduction

Network intrusion detection is increasingly needed to protect networks and computers from malicious network-based attacks. Intrusion detection techniques can be partitioned into two complementary trends: misuse detection, and anomaly detection. Misuse detection systems, such as [1][2], model the known attacks and scan the system for the occurrences of these patterns. Anomaly detection systems, such as [3] [4], flag intrusions by observing significant deviations from typical or expected behavior of the systems or users. In [4], we proposed the prototype of a Hierarchical Intrusion DEtection system (HIDE) that uses statistical preprocessing and neural network classifications to detect network-based attacks. HIDE is a hierarchical anomaly intrusion detection

334

system. It gathers data from network traffic, the system log and hardware reports; it statistically processes and analyzes the information, detects abnormal activity patterns based on the reference models, which correspond to the expected activities of typical users, for each parameter individually, or in combined groups using neural network classification; it generates the system alarms and event reports; finally, HIDE updates the system profiles based on the newly observed network patterns and the system outputs (see Fig. 1). Subsequent simulation experiment results have shown that the system could identify attacks accurately and efficiently [5] [6] [7]. Inputs

Network Data • Network Traffic Information (IP, UDP.TCP,...) • Higher Layer Log Information • Hardware reports and information from some other sources

HIDE

Outputs

Event Probe and Analysis

Intrusion Detection

Anomaly Alarm Network Reports Event Log

2££ System Update

Fig. 1 Inputs and Outputs of HIDE

In HIDE, the network traffic parameters are measured and built into probability density functions (PDF). Each PDF is represented by sixty-four bins, with each bin corresponding to the probability of a certain event. The PDF representation is capable of displaying the nimble differences between the activity patterns of legitimate users and those of intruders, thus leads to a better performance than traditional threshold-based IDS. However, building and handling PDFs also consume more system processing power and need more memory and storage resources. Data compression of the PDFs has the potential to enhance the efficiency and the performance of our system. Some preliminary results using haar-based wavelet compression in HIDE are reported in [7]. In this paper, we present our experiments on four different wavelet algorithms with various compression ranges, from range 1 to 5, applied to the PDFs of HIDE. The rest of this paper is organized as follows. Section 2 describes the usage of probability density functions (PDFs) for intrusion detection. Section 3 introduces the wavelet algorithms and the experiment approaches we made. In Section 4, we report the test bed and the attack schemes we simulated. Some experimental results are also in section 5. Section 6 draws some conclusions and outlines future work.

335

2

PDFs for Intrusion Detection

HIDE uses statistical models and neural network classifiers to detect anomalous network conditions. The statistical analysis bases its calculations on PDF algebra, in departure from the commonly used isolated sample values or perhaps their averages, thus resulting in higher classification accuracy. Our system generalizes further by combining the information of the PDFs of the monitored performance parameters, either all of them or subgroups of them, in one integrated and unified decision result. This combining is powerful in that it achieves much higher discrimination capability that enables the monitoring of individual service classes in the midst of general traffic consisting of all other classes. It is also capable of discriminating against intrusion attacks, known or novel. 3

Wavelet compression

For each wavelet family we computed a multi-level one dimensional wavelet analysis and computed approximation coefficients, which are obtained by convolving with a low pass filter. Further levels of approximation coefficients were obtained by repeated convolutions with the given low pass filters. The approximation PDF was then obtained by direct reconstruction via the approximation coefficients. The analysis we undertook is available using commands within the Wavelet Toolbox of MatLab [11]. • Original PDF - PDF after haar wavelet

Puik_ 10

20

30

40

A. M 50

60

Original PDF _ _ PDF after coif3 wavelet

, I

_ ^ ^ _ ^ 10

20

30

40

SO

60

10

20

30

40

50

60

Original PDF PDF after db4 wavelet

f

•

TA 10

A^ 20

30

^ w * ! - , 40

50

Fig. 2 A Sampled PDF with Different Wavelet Compressions

Among the many wavelet families that exist [8], [9], [10], we selected a wavelet from each of the following families: the classical Haar basis (called haar afterward), the Symlets basis (called sym2 afterward), Coiflets (called coifi afterward), and Daubechies (called db4 afterward) wavelets. The Haar wavelet is

336

the oldest and simplest wavelet, and it has the shortest support among all orthogonal wavelets. The Haar basis is a multiresolution of piecewise constant functions and, in practice, is known to not be well adapted to approximating smooth functions because it has only one vanishing moment. Symlets wavelets are characterized as being compactly supported, orthogonal wavelets with least asymmetry and highest number of vanishing moments for a given support width. Graphs of the scaling functions can be found on page 254 of [9] (see also [10]). Coiflet wavelets are characterized has being compactly supported, orthogonal wavelets with minimum support width while requiring the highest number of vanishing moments for both the scaling function and the wavelets. Coiflets first appeared in applications to numerical analysis (see page 254 of [9]). Daubechies wavelets have the minimum support width for a given number of vanishing moments. Daubechies wavelets are orthogonal and very asymmetric. Graphs of the scaling functions can be found on page 253 of [9] (see also [10]). Sampled PDFs of these four wavelet compression algorithms are shown in Fig. 5. In HIDE, we are using wavelet algorithms to compress the network parameters, which were measured as PDFs with 64 bins, into sets of wavelet coefficients. To compare the system performances under various wavelet compression ranges, the compressed PDFs are decompressed and then processed by the statistical modules and the neural network classifiers. The outputs with wavelet compressions are compared with those without compression. The process is illustrated in Fig. 3.

Probe & Event Preprocessing

Fig. 3 Wavelet Compression and Decompression

The numbers of wavelet coefficients obtained for each range for each of the utilized wavelets are tabulated in Table 1. Table 1 Wavelet Coefficients for Different Wavelet Compressions

Wavelet haar | sym2 coi/3 db4

Range1 64 64 64 64

Number of Coefficients Range 2 Range 3 Range 4 32 16 8 33 18 10 21 35 14 22 40 28

Range 5 4 6 10 19 |

337

In the table, PDFs at compression range 1 correspond to uncompressed PDFs. Note that the wavelets coifi and db4 require the largest number of coefficients at each range, while the wavelets haar and sym2 provide the greatest amount of compression. 4

Testbed

We constructed a virtual network using simulation tools to generate attack scenarios. The experimental testbed that we built using OPNET, a network simulation facility, is shown in Fig. 4. The testbed is a 10-BaseX LAN that consists of 11 workstations and 1 server.

Q 11 UDP flooding attack*

n

wt»m_i

n

v*ctn_3

•

n

n

11

wfcstn_5

wkitn_7

n

«riutn_S

Fig. 4 Simulation Testbed

We simulated the UDPfloodingattack with 2 Mbps background and 100 kbps attack traffic using the testbed. 5

Experimental Results

We collected 6000 records of network traffic. These data are divided into two separate sets, one set of 4000 data for training and the other of 2000 data for testing. In each scenario, the system was trained for 100 epochs. In subsection 5.1, we studied the PDF figures of different wavelet compression algorithms. The subsection 5.2 describes the mean squared root errors and the misclassiflcation rates of the outputs. The Receiver Operating Characteristic (ROC) curves of the wavelet results are shown in subsection 5.3. 5.1

PDF Figures

Some pictures of the decompressed PDFs with various compression ranges are plotted in Fig. 5.

338

i"iL^ 10

0.1

n 0.05

>

^

20

haar range 1 I . 40

30 |

IA" \i\ . 10

I A

.

n

20

A. 50

r\ |

.

n, r

r\. 40

30

A , 60

haar range 2 |

SO

60

haar range 3 I

0 10

£

L

0.05

e *

n

10

0.04 e

1°n

40

50

60

haar range 4 I

/ 20

30

SO

40

60

haar range 5 I

. /

* °-2

sym2 range 1 I

\

10

,

vu 10

t-i t-: 1H

30 I

I.

0.02

I*

20

A 20

30

40

A, 50

sym2 range 2 I. /\ 20

30

40

\i

AyC SO

30

40

50

A

A

10

20

1

30

40

/\

V 10

60

S \ 20

30

40

20

30

40

50

A / 60

S\J. 50

60

db4 range 3 I

20

30

40

sym2 range 4 |

10

A. 50

db4 range 2 I

: ^ •

20

\.

10

60

sym2 range 3 j

10

db4 range 1 I

I 0.1 \

A / 60

50

60

db4 range 4 I-

10

60

sym2 range 5 I-

g

20

30

40

0.04 0.02

•

50

60

db4 range 5 I

~~:

Fig. 5 Some HIDE PDFs at Various Wavelet Compression Ranges

From the figures, we can see that, as the compression ranges (and ratios) get higher, the PDF shapes look more and more different from the original uncompressed PDFs. This is, of course, expected since we are loosing more information by representing a PDF with fewer coefficients. For all wavelet bases, the reconstructed PDF curves of compression range 2 are all very close to the original PDF, which hint that there might little or no performance differences at range 2. In fact, compression range 3 appears promising as well. However, visually significant PDF shape details seem to be lost at ranges 4 and 5.

339

5.2

MSR Errors and Misclassification Rates

We evaluated the mean squared root errors and the misclassification rates of the system using different wavelet compression ranges, Fig. 6. The misclassification rate is defined as the percentage of the inputs that are misclassified by neural networks during one epoch, which includes both false positive and false negative misclassifications. In Fig. 6, the x-axis values represent the five compression ranges we tested.

compression range

compression range

Fig. 6 MSR Errors and Misclassification Rates

We can see that the curve of wavelet compression algorithm sym2 rises slowly, and, even for compression range 5, the algorithm still shows strong performance with misclassification rate about 2%. For the other three wavelet bases, the system performance is satisfactorily for ranges 1 to 3, but then deteriorates for ranges 4 and 5. Therefore, wavelet compression algorithm sym2 is a more appropriate choice for PDF compression. In practice, we found that sym2 wavelet compression with compression range of 3 is a safe choice for HIDE to maintain a satisfying performance while boosting system resource efficiency by four fold. From the PDF figures in the previous subsection, the decompressed PDFs of range 3 or higher are noticeably different from the original PDF, but the results in this subsection shows no big difference from range 1. One explanation is that wavelet compression at range 3 still keeps the necessary information to identify typical and abnormal traffic patterns. 5.3

ROC Curves

The Receiver Operating Characteristic (ROC) curves are illustrated in Fig. 7. The xaxis of the figure is the false alarm rate, which is the rate of the typical traffic events being classified intrusions; the y-axis of the figure is the detection rate, which is calculated as the ratio between the number of correctly detected intrusions and the total number of intrusions. For each curve, the point at the upper left corner

340

represents the optimal detection with high detection rate and low false alarm rate. Each Curve corresponds to the system characteristic under certain wavelet compression range. - ' O •••<&

- © - compression compression compression compression compression

0.2

range range range range range

C

5 4 3 2 1

<3>

*3t>*K3>-*<37* O Q

£ °-'

0.4 0.6 false alarm rate (haar) tffr

- e - compression compression O compression compression a compression

-o-^t•p. _*_ _&_

0.2

compression compression compression compression compression

range range range range range

5 4 3 2 1

0.4 0.6 false alarm rate (sym2)

0.8

Q3> Q ? 0 O

range range range range range

5 4 3 2 1

E 0.

-e-J^-0_ if... _$_

compression compression compression compression compression

range range range range range

5 4 3 2 1

0-2;)

0.2

0.4 0.6 false alarm rate (coif3)

O.E

0.4 0.6 false alarm rate (db4)

0.8

Fig. 7 ROC Curves

From the figure, we can observe that the ROC curves of sym2 wavelet compressions are very close to that without wavelet compression. For the others, the deviations become noticeable quickly as the ranges increase. This observation also proves our conclusion in the above subsection once again. 6

Conclusions

We applied wavelet compression in network intrusion detection monitoring data. Our results showed that wavelet compression improves the efficiency of the representation of PDFs for use in statistical network intrusion detection systems. All four wavelets we tested on HIDE, maintain stable performance at compression range 3, thus improving system efficiency by two to three-fold. The Sym2 wavelet algorithm performed best, by effectively compressing the HIDE PDFs from 64 bins

341 to only 6 wavelet coefficients, thus resulting in a compression ratio of 10.67, without major performance deterioration. Acknowledgements. This research acknowledges support by a Phase I and II SBIR contract with US Army and OPNET Technologies, Inc.™, for partially supporting the OPNET simulation software.

References 1.

G. Vigna, R. A. Kemmerer, NetSTAT: a network-based Intrusion Detection Approach, Proceedings of I4tn Annual Computer Security Applications Conference, 1998, pp. 25 -34. 2. W. Lee, S. J. Stolfo, K. Mok, A Data Mining Framework for Building Intrusion Detection Models, Proceedings of 1999 IEEE Symposium of Security and Privacy, pp. 120-132. 3. Joao B.D. Cabrera, B. Bavichandran, R.K. Mehra, Statistical Traffic Modeling for Network Intrusion Detection, Proceedings of 8™ International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication systems, Aug. 2000, pp. 466-473 4. Z. Zhang, J. Li, C. Manikopoulos, J. Jorgenson, J. Ucles, A Hierarchical Anomaly Network Intrusion Detection System Using Neural Network Classification, CD-ROM Proceedings of 2001 WSES International Conference on: Neural Networks and Applications (NNA '01), Feb. 2001 5. Z. Zhang, J. Li, C. Manikopoulos, J. Jorgenson, J. Ucles, Neural Networks in Statistical Intrusion Detection, accepted by the 5tn World Multiconference on Circuits, Systems, Communications & Computers (CSCC2001), July 2001 6. Z. Zhang, J. Li, C. Manikopoulos, J. Jorgenson, J. Ucles, HIDE: a hierarchical network intrusion detection system using statistical preprocessing and neural network classification, accepted by the 2n" Annual IEEE Systems, Mans, Cybernetics Information Assurance Workshop, June 2001 7. Z. Zhang, C. Manikopoulos, J. Jorgenson, J. Ucles, HIDE, A Network Intrusion Detection System Utilizing Wavelet Compression, submitted to the Eighth ACM Conference Computer and Communication Society, Nov. 2001 8. R. Todd Ogden. Essential Wavelets for Statistical Applications and Data Analysis, Birhauser: Boston (1997). 9. Stephane Mallat. A Wavelet Tour of Signal Processing, second edition.. Academic Press, New York (1999). 10. Ingrid Daubechies. Ten Lectures on Wavelets, SI AM Philadelphia, PA (1992). 11. David Donoho, Mark R. Duncan, et al, WAVELAB 802 for Matlab 5.x, http://www-stat.stanford.edu/~wavelab/.

Information Technology/Linguistics

345 THE EMERGING CHALLENGE OF RETAINING INFORMATION TECHNOLOGY HUMAN RESOURCES

RICK GIBSON, Ph.D. American University, 4400 Massachusetts Ave. Washington DC 20016 USA E-mail: [email protected] The responsibilities and functionalities of the Information Technology (IT) department are spreading rapidly and becoming more involved in all aspects of businesses. As a consequence, the process of attracting and retaining skilled IT workers has become increasingly important. Due to the fact that the demand for skilled IT workers exceeds supply, the shortage and turnover problems that occur when skilled IT workers cannot be retained place companies at a competitive disadvantage. This report explores retention methods that have been effective in retaining IT workers.

1

Introduction

For some time now the demand for Information Technology (IT) workers has outpaced the supply. Bort [1] estimated that for the next seven years, there will be 95,000 new IT job positions yearly, but there will be only 45,000 graduates with ITrelated degrees. As a result, most companies are currently faced with a shortage of qualified IT professionals. Additionally, companies must resolve operational and personnel issues resulting from the uniquely high turnover among IT employees. A survey of IT managers revealed that a 15% to 20% annual turnover is now considered average in IT shops [2]. Clearly, organizations with a high reliance on IT professionals need to examine the process of attracting and retaining skilled IT professionals. The situation in the public sector is even worse. According to Hasson [3], in the next few years 50% percent of the Federal IT workforce will reach retirement age. Annual salary surveys reveal further problems. Almost 17,000 IT professionals are satisfied with their jobs, to include 45% of staff and 54% of managers who are satisfied with their pay. The second concern is finding interesting work, getting away from the current company's management culture, flexible work schedules and job stability. Of the 17,000 IT professionals surveyed 45% say their company is good or excellent at attracting talent, only 38% say they know how to retain employees and 2% say their companies are poor or unsatisfactory at retaining employees. Thus, the purpose of this research is to investigate ways for organizations to retain a critical mass of IT employees by addressing several questions. What techniques are being used by the private sector for recruiting IT professionals? What are IT professionals looking for in an IT job? What changes have been

346

implemented to retain IT professionals? Can some of the private sector's techniques for recruiting IT professionals be implemented for the federal government? 2

Methods

In order to answer the questions posed, a descriptive study was chosen for this research. To keep the question focused on finding various solutions, a few investigative questions were developed: (1) What are IT professionals looking for in an IT job, (2) What techniques are being used by the private sector for recruiting IT professionals, (3) What changes has the federal government implemented to recruit IT professionals, and (4) Can some of the private sector's techniques for recruiting IT professionals be implemented for the Federal government. The gathering of the data began with creating a list of IT websites, searching the periodicals online, computer publications websites, newspaper articles and reviewing journal articles. 3

Overview

Organizations have always been concerned about losing their most skilled managerial employees. Technological advances have created a work environment where the skills needed to operate a business require the science of IT to accompany the art of management. These recent technological advances have created an environment in which management level employees are no longer the sole holders of the skills that are vital to the daily operations of a business. Moreover, in contrast to the more mature and experienced members of management, younger workers are dominating the IT workforce. In a TechRepublic Salary Survey [4] report on the average ages of IT professionals, the average age for the majority (32%) of IT professionals are in the 18-25 range, with 25% in the range of 26-35 years of age. Most IT executives and consultants (25%) fall in the 36-55 age range. Today's youthful IT worker has a different attitude regarding worker-employee relationships than previous generations of workers. They are more independent and confident in themselves, their training and skills. They are not afraid to seek new employment as often as every fifteen to eighteen months. A recent survey by Yakelovich Partner reports that 73 percent of all young IT employees said that they could easily find a new job if they had to [5]. A recent study [6] concluded that the top perceived or real factors that increase the turnover rate of IT personnel are various combinations of the following: Boredom or lack of challenge, limited opportunities for growth, low expectations or standards, inferior or ineffective co-

347

workers, lack of leadership, poor supervision, Inflexible work hours, noncompetitive compensation package, commute distance and time. 4

Discussion

In a survey of the retention practices at over 500 high technology companies, American Electronic Association lists the top 10 retention techniques by degree of effectiveness as following [7]. 1. Challenging work assignments 2. Favorable work environment 3. Flextime 4. Stock options 5. Additional vacation time 6. Support for career/family values 7. Everyday casual dress code 8. High-quality supervision and leadership 9. Visionary technical leadership 10. Cross-functional assignments; tuition and training reimbursement; 401(k) matching. For discussion purposes, these effective retention strategies can be categorized into the following four main components; compensation, career advancement, participation relationship with management and positive work environment. 4.1 Compensation Employees should be able to negotiate on issues such as salary, bonus and stock options. For IT professionals, the trend for salaries has been an increase in all positions from the previous year. Despite having salaries that are usually higher than the national average, some IT professionals are not satisfied with their salaries. When various IT workers were surveyed, only 41% reported that they are fairly compensated. This leaves 59% of IT professionals who feel that they are underpaid. The average of dissatisfaction for network administrators was 66.3%, help desk professionals was 61.7% and IT trainers was 60.3%. [8]. Stock options are among is the largest motivators as it relates to compensation [9]. Because stock offerings can amount to more than two thirds of a company's total compensation package, the potential value of the stock is very important. Each company must determine the appropriate amount of stock options to grant. Stock offerings can vary according to the industry and company size. Hence, compensation packages can vary. Executive search firms specializing in competitive compensation analysis assists companies in determining the most appropriate and effective compensation package for each position.

348

Some of the top incentives that businesses are using to reward employees for exceptional work include annual bonuses, competency-based pay, quality-based pay, retention bonuses, profit sharing and tuition reimbursement. However, the components of a compensation package are usually not the highest concern for many IT professionals. This is especially true in regards to today's young IT professional, for whom money is not the best retention tool. They need incentives to which they respond positively, e.g., a visible level of independence, a flexible schedule or closer peer-to-peer relationship with their managers. 4.2 Career Advancement Another effective retention technique involves career advancement opportunities offered to employees. Employees expect the ability to move laterally, e.g., between departments or assignments, within a company. Moreover, they hope to get the opportunity for vertical advancement, i.e., to get promoted to a higher position. Training programs led by the executives of companies such as mentoring and leadership seminars are perceived as positive incentives for retaining employees. Employees often look for value-added programs that will improve their career. Companies that provide these programs are perceived as companies committed to the professional advancement of their employees. Training can be provided through university, community college or self-study courses, seminars, computer-based training or various other interactive methods [10]. Although providing training is one way to keep employees on the job and increase their value to companies, the fear exists that newly trained employees might leave for new jobs after obtaining valuable skills or certifications [11]. Therefore, the training sessions must be associated with a sense of company trust and familiarity that will develop a strong company loyalty. Training reimbursement agreements are usually employed to protect a company's investment. 4.3 Participation Relationship With Management Communication between the management and employees are also important in the retention strategies. Employees are more loyal if they feel "connected" with the company. They need to know that their opinions matter and that management is interested in their input. This includes that employee are part of decision making process. The leadership team of a company, including the CIO and other senior executives, should be directly responsible for applying retention strategies and acting as cultural role models. They need to know how to use various retention methods and how they can be combined those methods to solve the particular problems. Determination should be made regarding the various types and levels of training, benefits and compensation packages for employees [12].

349

In addition, basic human sociological factors need to be considered when designing retention strategies. It should also be recognized that sometimes employees leave a company because of poor leadership within the company [13]. Also, personal or professional conflict between employees and supervisors can cause IT professionals to leave their jobs. Finally, Sohmer [14] believes that smart leadership is often the key to success. He states that management should vary but be consistent in retention offerings and look to use the right strategy to retain nonmanagement level IT professional. 4.4 Positive Work Environment A compatible corporate culture and environment is key to attracting and retaining employees. It is important for all employees to believe in the vision and goals of the company while also feeling comfortable and even passionate about the company for which they work. Examples of corporate culture include the allowance of causal dressing, flexible working hour or free entertainment provided for the staff. Employees often say that they have insufficient time to meet all work and family responsibilities. This time constraint causes a great deal of stress for employees. Some companies address this concern by combining various non-traditional methods, such as flextime, telecommuting and compressed workweeks into traditional work schedules. Empowering employees to productively manage their schedule is a powerful tool in an effective retention strategy. Location, as an environmental factor, is also a way to attract and retain employees. Companies should carefully choose their sites for various groups of employees. For example, a research and development operation in Silicon Valley might be useful, in order to tap into cutting-edge thinking. But a research and development project with a long lead-time would have a high turnover rate, because the skills of the development team would be in high demand. A better location would be in a rural community. Trying to get people to relocate to remote regions poses challenges. Another way private industry recruits IT professionals is by petitioning for H-1B visas. An H-1B nonimmigrant visa may be used to bring a temporarily foreign worker or professional into the United States for a specialty occupation or a professional position. To qualify for an H-1B visa you must hold a bachelors degree or its equivalent or higher. Private industry was given new hope by new legislation that significantly increased the number of new visas available to foreign workers sought by the U.S. high-tech industry. Along with the increase number of new visas, the filling fee for each H-1B visa application will increase to be used for education and training programs for U.S. citizens.

350 5

Conclusions

This paper examined several methods that have proven to be effective in IT employee retention. It appears that to be effective retention strategies must rely on a combination of methods. Companies adopt and use a variety of methods to retain employees. It is important to have a balance of programs, processes and cultural standards that are attractive to as many employees as possible in all positions. Fitzharris [15] reports on General Electric's appliance group use of a combination of three methods for retention: salary, career opportunities and recognition. Implementing these three methods reduced the turnover rate from 11 percent to 3.1 percent. Another approach comes from the Hay Group, a Washington, D.C., based human resource consultancy, which uses three types of rewards and incentives to retain our people: money, career advancement and a positive work environment. These three methods produced reduction in the turnover rate of at least 30 percent. Further, the following guidelines should be emphasized: A single approach cannot be employed for every situation. Compensation and benefit packages are necessary, but not sufficient elements in the retention of employees. • Competitiveness and fairness of compensation and benefit packages compared with the labor market could create the feeling of equity among employees. • The career path is the one of the factors that we cannot serves to maintain the challenge in daily work. • Overlooked factors such as leadership and management style, the corporate culture, and flexible hours are effective in ensuring loyalty to a company. A key factor to be considered as effective retention tool is the direct involvement of an IT management team. This team, composed of the IT manger and the individual in charge of retention efforts, must collaborate with the Human Resources department to develop effective retention strategies. In constructing these strategies, attention must be focus on both performance management of the employees and relationship strategies between management and non-management. • •

References 1. Bort, J. (2000). Mining for high-tech help. Coloradobiz; Englewood, 27, 48-56. 2. Diversity best practices. (2001). Retention. Wow!fact2001 [Online]. Available: http://www.ewowfacts.com/wowfacts/chap39.html 3. Hasson, J. (2000,April). Aging work force alarms CIO. [online]. Available: http://www.fcw.com. (January 20, 2001).

351 4. TechRepubublic, The SalarySurvey2000, The It world: secure, diverse, and comfortable [Online]. Available: http://www.TechRepublic.com 5. Sohmer, S. (2000). Retention getter. Sales and Marketing Management, New York, 152, 78-82. 6. Christian & Timbers (2000). Retention starts before recruitment [Online]. Available:http://www.ctnet.com/ctnet/university/client/retention/retention3.htm 1 7. Sparks, R. (2000,May). Ideas for recruiting and retaining quality employees. Creating Quality, 9 [Online].Available: http ://outreach .m i ssouri .edu/c.. ./cq_m ay 00_ideas%20for%20recruiting%20and %20retaining.htm 8. Thibodeau, P. (2001, January 15). Survey: Above all else, IT workers need challenge, Computerworld [Online]. Available: http://www.computerworld.eom/cwi/story/0,1199 ,NAV47_STO56335,00.htrnl 9. Fitzharris, A.M.(1999, June 24). Balancing employee's professional and personal lives. TechRepublic [Online]. Available: http://techrepublic.com/article.jhtml?src=search&id=r00619990624maf01.htm 10. Meyer, B. (2000,October 9). Providing training is key to retaining IT employees [Online]. Available: http://houston.bcentral.com/houston/stories/2000/10/09/focusl0.html 11. Tseng, W. (2000). [Interview with Mr. Eric Schemer]. Report of Information Governance. 12. Frankel, A. (1998, January 1). Retaining "playing for keep". CIO Magazine [Online]. Available: http://www.cio.com/archive/010198Joyalty.htm 13. Christian & Timbers (2000). Retention starts before recruitment [Online]. Available: http://www.ctnet.com/ctnet/university/client/retention/retention3.html 14. Sohmer, S. (2000). Retention getter. Sales and Marketing Management, New York, 152, 78-82. 15. Fitzharris, A.M.(1999, June 24). Balancing employee's professional and personal lives. TechRepublic [Online]. Available: http://techrepublic.com/article.jhtml?src=search&id=r00619990624mafDl.htm

353

HARD-SCIENCE LINGUISTICS AS A FORMALISM TO COMPUTERIZE MODELS OF COMMUNICATIVE BEHAVIOR BERNARD PAUL SYPNIEWSKI Rowan University - Camden, Broadway and Cooper Street, Camden, NJ 08102 USA E-mail: [email protected] Hard Science Linguistics (HSL) is a new linguistic theory, first worked out in Yngve [1], developed from the insights into human language gained during the history of machine translation and similar efforts. Unlike most linguistic theories, HSL concerns itself with the details of how people communicate rather than on how sentences are parsed. From the start, HSL developed with an eye toward making its results scientifically valid, not in the sense of other linguistic theories but in the sense of physics or chemistry. Both the historical basis in machine translation for HSL and the attention paid to a scientifically valid formalism make HSL an attractive candidate for the development of large-scale computer models of communicative behavior. In this paper, I will use some "mixed domain" terminology in order to more quickly explain HSL in the space available.

1

Introduction

The Cold War need to translate large volumes of scientific and military publications into English spurred the earliest attempts at developing machine translations systems. Initial attempts concentrated on the grammar of sentences and began with word for word translations. Problems with this and similar approaches appeared in short order Yngve [2]. Most approaches to machine translation have assumed that translation is exclusively a problem of language to be addressed grammatically Barr, Cohen and Feigenbaum [3]. While researchers acknowledged that context was a difficult problem that needed to be solved, context was mostly seen as grammatical context. Though some researchers and philosophers, such as Austin [4] recognized the behavioral elements in the problem of context, they often overlooked the consequences of these elements. Most linguistic theories tacitly assume that language can be studied by itself, without reference to the societal matrix in which it exists. Linguistics generally treats language understanding as equivalent to the understanding of grammar. Artificial intelligence has adopted this outlook. While there is much to be said for natural language processing systems that understand the construction of sentences, we should not confuse these systems with systems that try to understand how language is used by human beings in their everyday lives. Our systems must understand more than grammar. Most linguistic theories do not provide us with an understanding of anything other than grammar. Despite the success of Generative Transformational Grammar (GTG), linguistics does not have a sound scientific basis. Linguistic discourse is

354

philosophical discourse with roots in the ancient Aristotelian and Stoic grammatical and logical traditions. Most linguists do not produce scientifically testable results because general linguistics does not provide a scientifically sound formalism; indeed, while many linguists pay lip service to the need to make linguistics scientific, there is no general expectation that linguists will produce scientifically acceptable results. HSL consciously developed a formalism that could produce scientifically sound linguistic results. One of the more controversial themes of HSL is the de-emphasis of the importance of grammar, what HSL refers to as the "linguistics of language". HSL models "communicative behavior", i.e., language in a social context. Language, for HSL, is purposeful rather than merely grammatical. HSL provides a method for unifying traditional linguistic and extra-linguistic issues in a scientifically acceptable way. In its brief history, HSL has been used to model complex social phenomena such as business negotiations Brezar [5], ethnic stereotypes, Cislo [6], the analysis of textbooks, Coleman [7], Czajka [8], and criminal litigation, Sypniewski [9] as well as more traditional linguistic concerns such historical linguistic change, Mills [10], Malak [11] fillers, and Rieger [12]. 2

Some Methods and Tools Provided by Hard-Science Linguistics

HSL provides the researcher with a number of tools to describe the interaction between individuals and their environment. Briefly, some of those tools are: 1.

2.

3.

4.

An individual may interact with other individuals by playing a role part in a linkage. A linkage is a theoretical framework in which communicative behavior takes place. A role part is a description of the linguistically relevant behavior that an individual performs in a particular linkage. An individual may play a role part in several different linkages, which may or may not overlap in space or time. For example, an individual may be both a father and a little league coach at the same time. The role parts exist in different linkages that interact while a little league game is in progress. Every linkage has a setting. A setting is a description of the linguistically relevant environment in which a linkage exists. Settings may have props, linguistically relevant objects. For example, the amount of feedback in an auditorium's sound system may affect the communicative behavior of speakers on stage. Linkages have tasks, which, in turn, may have subtasks. Tasks and subtasks are descriptions of linguistically relevant behavior, somewhat analogous to functions in computer programming. Individuals have properties that may be affected by communicative behavior or that may have an effect on the communicative behavior of others. The loudness

355

of a speaker's voice or the speaker's command of the language of the audience may be reflected in properties of the speaker or listener's role part. 5. HSL uses its own notation (procedures, properties, and other elements) to construct plex structures. Plex structures describe the building blocks of an HSL model. The researcher models people or groups communicating among themselves along with their relevant setting(s) by creating a linkage, enumerating its participants and describing their role parts, describing the sequence of tasks and subtasks that must take place, describing the setting in which the linkage exists, and the relevant properties of the role parts and setting. HSL insists that all models be based on observable communicative behavior stated so that the results of the model accurately predict behavior in the real world in a reproducible way. 3

Implications for Computer Science

HSL allows the modeler to develop a model of arbitrary complexity. Furthermore, the modeler is not restricted to describing language. HSL is based on the scientific principle that the world has structure. An HSL model of communicative behavior is more complex than any model based on any other linguistic theory. The payback from this complexity is substantial. Communicative behavior becomes more manageable, the findings more justifiable, and the model more reflective of the real world. Since HSL sees the world in terms of properties, structures, and function-like tasks, a thoroughly developed model may be easily ported to an appropriate computer language. A structured model of communicative behavior resembles familiar paradigms in computer science. Linkages may be modeled by interacting classes, with each class representing a task or subtask. It may even be possible to use the Unified Modeling Language to move a model from HSL to the computer. The event-driven programming paradigm may be able to express some of the dynamism inherent in HSL. This is still controversial among HSL workers because of the type of model HSL creates. Professor Yngve believes that it will be difficult to adequately model the parallelism of complex HSL models on a serial computer. Because HSL is in its infancy, this remains an experimental question.

4

Discussion of the Current Attempts to Build SIMPLEX

In the mid-1980s, Victor Yngve, then at the University of Chicago, began to develop a simulator called SIMPLEX for his linguistic theories, later to become HSL. Because of the size and capabilities of contemporary machines, Professor Yngve decided to write SIMPLEX in FORTH. SIMPLEX remained incomplete, partially because the underlying linguistic theory needed further development.

356

Presently, Professor Yngve, I, and others have resurrected SIMPLEX and intend to develop it beyond its mid-1980s incarnation. We intend to continue using FORTH for three reasons. First, we will be able to use the code already written since the basic FORTH language has retained stable. Second, the American National Standards Institute (ANSI) standardized FORTH after SIMPLEX was written. ANSI FORTH is now file-based, rather than block-based, as were the original FORTHs. Cross-platform development will thus be simpler using ANSI FORTH. Third, and most important, HSL is now a fully developed theory. We now have a goal for SIMPLEX. SIMPLEX will be both a program and a programming language that will process plex structures. Because HSL models the real world and not just the grammar of sentences, HSL provides a methodology for representing parallel tasks. One of the reasons that FORTH proves useful is that FORTH originated as a computer language to handle multiple synchronous and asynchronous tasks. Accurately representing parallel tasks is one of the biggest challenges for SIMPLEX and one of its biggest potentials. At the time this paper was written, SIMPLEX is still in its infancy. However, the development of SIMPLEX is substantially advanced even though the computer code is, roughly, where it was in the mid-80s. We now have a complete formalism and methodology to model; this was not the case at the time that the original SIMPLEX was written. We are currently porting code from the old block structure to a file structure. In the process, we are testing various FORTHs on different platforms to identify compatibility problems. Because FORTH is a very efficient language, it is likely that SIMPLEX will run on machines that are significantly less powerful than today's desktop standard. We are testing different FORTHs on different platforms in order to determine the minimal configuration needed for SIMPLEX. There is significant interest in HSL in countries where state of the art computing equipment might not always be available. Some preliminary tests with the original version of SIMPLEX show that SIMPLEX may prove useful on Palm Pilots and similar devices. Our goal is to create a cross-platform computer program that will accept files of plex structures, analyze them, and simulate them on a desktop computer. A researcher will then be able to see the model in action and modify it whenever necessary. Once we finish porting SIMPLEX from block-oriented to file-oriented FORTH, we will begin developing sections of the simulator that will process specific HSL structures. SIMPLEX, when fully developed, will become a major tool for HSL researchers who wish to verify the plex structures and findings that they have developed. 5

Acknowledgements

I wish to thank Victor Yngve for his critical review of the manuscript.

357

References 1. 2.

3.

4. 5. 6. 7. 8. 9.

10.

11. 12.

13.

Yngve, Victor H., From Grammar to Science. (John Benjamins, Philadelphia, PA, 1996). Yngve, Victor H., Early MT Research at M.I.T. - The search of adequate theory. In Early Years in Machine Translation, Amsterdam Studies in the Theory and History of Linguistic Science, vol. 97 ed. by W. John Hutchins, (John Benjamins, Amsterdam/Philadelphia, 2000) pp. 38-72. Barr, Avron, Cohen, Paul R. and Feigenbaum, Edward A., The Handbook of Artificial Intelligence, vol. 4 (Addison-Wesley, Reading, MA 1989) pp. 223237. Austin, J. L., How to Do Things with Words, 2nd ed. (Harvard U. P., Cambridge, MA, 1975). Brezar, Mojca Schlamberger, A Business Negotiation Analysis in the Scope of Hard-Science Linguistics, In Yngve and Wajsik [13]. Cislo, Anna, The Victorian Stereotype of an Irishman in the Light of Human Linguistics, In Yngve and Wa_sik [13]. Coleman, Douglas W. Data and science in Introductory Linguistics Textbooks, Paper presented at the LACUS Forum XXVII, Houston 2000. Czajka, Piotr, Human Needs as Expressed in Educational Discourse on the Basis of Textbooks in Linguistics, In Yngve and Wa_sik [13]. Sypniewski, Bernard Paul, A Hard Science Linguistic Look at Some Aspects of Criminal Litigation in Contemporary New Jersey. Rowan University-Camden Campus, ms. Mills, Carl, Linguistic Change as Changes in Linkages: Fifteenth-Century English Pronouns, Paper presented at the LACUS Forum XXVII, Houston 2000. Malak, Janusz, Mayday or M'aider. A Call for Help in Understanding Linguistic Change, In Yngve and Wajsik [13]. Rieger, Caroline L., Exploring Hard Science Linguistics: Fillers in English and German Conversations, Paper presented at the LACUS Forum XXVII, Houston 2000. Yngve, Victor H. and Wa_sik, Zdzistaw (eds.), Exploring the Domain of HumanCentered Linguistics from a Hard-Science Perspective (Poznan, Poland: School of English, Adam Mickiewicz University, 2000).

359 B-NODES: A PROPOSED NEW METHOD FOR MODELING INFORMATION SYSTEMS TECHNOLOGY STANISLAW PAUL MAJ AND DAVID VEAL Department of Computer Science, Edith Cowan University, Western Australia, 6050. E-mail: [email protected], [email protected] There are many rapid developments in the technologies upon which information systems are based. In order to help describe and define these technologies there exist a wide variety of modeling techniques. However this wide range of techniques is in itself problematic and it is recognized that a new higher level of abstraction is needed. A new high-level modeling technique is proposed that can be used to control the technical complexity of information systems technologies. This new method, called B-Nodes, is a simple, diagrammatic, and easy to use method. The model employs abstraction and hence it is independent of underlying technologies. It is therefore applicable not to current and old technologies but is likely to be valid for future technological developments. It is a scalable modeling method that can be potentially used for both small systems (e.g. a PC) and a global information structure. It allows recursive decomposition that allows detail to be controlled. The use of fundamental units allows other more meaningful units to be derived. Significantly therefore the derived units may be used to more accurately specify hardware performance specifications. The model has been successfully used as the pedagogical framework for teaching computer and network technology. Results to date indicate it can be used to model the modules within a PC (microprocessor, hard disc drive etc), a PC, a Local Area Network and an e-commerce web site.

1

Introduction

Computer and network technologies underpin the IT industry. Furthermore many information systems, such as e-commerce web sites, are global in nature. In this type of application there is, in effect, a contiguous link between a client accessing a web page and all the technologies that link that client with data that may be stored on a hard disc drive on another part of the globe. The quality of service of a global IT system depends therefore on the performance of a wide range of heterogeneous devices at both the micro and macro level. At the micro level the performance of a PC depends upon the technical specification of its component modules (microprocessor, electronic memory, network interface card etc). At a higher level of abstraction the PC may be functioning as a server on a Local Area Network (LAN). In this case the performance of the LAN depends upon the operational characteristics of the PC (as a complete device) and the associated networking devices such as hubs and switches. At a macro level a collection of different servers (web-server, application-server, payment-server etc) may be located in a LAN and connected to the Internet. In order to control this complexity a wide range of

360

modeling techniques are used which is in keeping with the ACM/IEEE Computing Curricula 1991 in which abstraction is a recurring concept fundamental to computer science [1]. Semiconductor switching techniques and modelling provides an abstraction that is independent of the underlying details of quantum mechanics. Similarly digital techniques and modelling provide a higher-level abstraction that is independent of the underlying details of semiconductor switching. Such combinational or sequential digital circuits can be described without the complexity of their implementation in different switching technologies e.g. TTL, CMOS, BICMOS etc. Computer and network technology can therefore be described using a progressive range of models based on different levels of detail (e.g. semiconductors, transistors, digital circuits) each with their own different performance metric. However, there appears to be no simple modeling technique that can be used to describe and define the different heterogeneous technologies within a PC. The use of benchmarks to evaluate performance at this level is subject to considerable debate. Similarly, from an IT perspective, a range of different models is used. A business model is used to define the purpose of an e-business. The functional model defines the e-commerce web navigational structure and functions. Customer models are used to define the navigational patterns of a group of customers that may be used to quantify the number and type of customers and the associated request patters - all of which may be used to define an e-commerce workload. Again a wide range of performance metrics is used and includes: hits/second, unique visitors, revenue, page views/day etc. All these different models are designed to progressively hid, and hence control detail, and yet provide sufficient information to be useful for communication, design and documentation. But this wide range of different modeling techniques (from digital systems to customer models) and associated metrics is in itself problematic. Ultimately a global, e-commerce business is a contiguous system and should if possible be modeled as such. The use of a single modeling technique may help to control the technical complexity but also allow the use of a single performance metric, from which other metrics may be derived.

2

Modeling

The principles of modeling were reviewed in order to obtain the required characteristics of models. Models are used as a means of communication and controlling detail. Diagrammatic models should have the qualities of being complete, clear and consistent. Consistency is ensured by the use of formal rules and clarity by the use of only a few abstract symbols. Leveling, in which complex systems can be progressively decomposed, provides completeness. According to Cooling [2], there are two main types of diagram: high level and low level. Highlevel diagrams are task oriented and show the overall system structure with its major sub-units. Such diagrams describe the overall function of the design and interactions between both the sub-systems and the environment. The main emphasis is 'what

361 does the system do' and the resultant design is therefore task oriented. According to Cooling, 'Good high-level diagrams are simple and clear, bringing out the essential major features of a system'. By contrast, low-level diagrams are solution oriented and must be able to handle considerable detail. The main emphasis is 'how does the system work'. However, all models should have the following characteristics: diagrammatic, self-documenting, easy to use, control detail and allow hierarchical top down decomposition. By example, computer technology can be modeled using symbolic Boolean algebra (NOR, NAND gates). At an even higher level of abstraction computer technology can be modeled as a collection of programmable registers. Dasgupta suggested computer architecture has three hierarchical levels of abstraction [3]. A model for describing software architectures was introduced by Perry and Wolf that consists of three basic elements - processing, data and connecting [4]. On this basis various architectural styles exist that include: Dataflow, Call & Return, Independent Process, Virtual Machine, Repository and Domain Specific. Each model is valid. According to Amdahl, 'The architecture of a computer system can be defined as its functional appearance to its immediate users. '[5] However, computer design and manufacture has changed significantly. The PC is now a low cost, consumer item with a standard architecture and modular construction. Two studies by Maj in Australia [6] and Europe [7] found that in both cases the computer and network technology curriculum failed to provide the basic skills and knowledge expected by both students and potential employers. Furthermore, there is considerable unmet potential demand from students of other disciplines (e.g. multimedia) for instruction in computer technology [8] due to the perceived lack of relevance of the current computer technology curriculum. According to the 1991 ACM/IEEE-CS report, 'The outcome expected for students should drive the curriculum planning' [1]. Significantly the current modeling methods used for computer and network technology may no longer be appropriate. Clements comments, 'Consequently, academics must continually examine and update the curriculum, raising the level of abstraction' [9]. 3

Bandwidth Nodes

A new high-level modeling technique called Bandwidth Nodes (B-Nodes) has been proposed [10]. Each B-Node (microprocessor, hard disc drive etc) can now be treated as a quantifiable data source/sink with an associated transfer characteristic (Mbytes/s). This approach allows the performance of every node and data path to be assessed by a simple, common measurement - bandwidth. Where Bandwidth = Clock Speed x Data Path Width with the common units of Mbytes/s. This is a simple, diagrammatic, and easy to use method mat can be used to model different technologies. The heterogeneous nature of the nodes of a PC is clearly illustrated by the range of measurement units used varying from MHz to seek times in milliseconds. Evaluation of these different nodes is therefore difficult. However, it is

362

possible to compare the performance of different nodes using the common measurement of bandwidth in Mbytes/s. The Pentium processor has an external data path of 8bytes with maximum rated clock speeds in excess of 400Mhz giving a bandwidth of more than 3200Mbytes/s. Dual In Line Memory Modules (DIMMs) rated at 60ns (16MHz) with a data path width of 8 bytes have a bandwidth of 128Mbytes/s. The data transfer rate for a hard disc drive can be calculated from the sector capacity and rotational speed (data transfer rate = sector capacity x sectors per track x rps). Typical figures are in the range of 5 Mbytes/s. Modem performance is typically measured in Kbits/s which can be converted to Mbytes/s or Frames/s. CDROM performance is quoted in speeds e.g. x32 speed where single speed is 150kbytes/s. CDROM speeds can easily be converted to Mbytes/s or Frames/s. According to Mueller[ll], the maximum transfer rate of a bus in MBytes/s can be calculated from the clock speed and data width. Significantly, a common performance metric (Mbytes/s) is used thereby allowing the relative performance of the different heterogeneous technologies to be easily evaluated (Table 1). Table 1: Bandwidth (Mbytes/s) Device

Clock (MHz)

Speed

Processor DRAM Hard Disc CROM ISA Bus

400 16 60rps

8 8 90Kbytes

8

2

Data (Bytes)

Width

Bandwidth (Mbytes/s) B = CXD 3200 128 5.2 4.6 16

B-Nodes typically operate sub-optimally due to their operational limitations and also the interaction with other B-Nodes. The simple bandwidth equation can be modified to take this into account i.e. Bandwidth = Clock x Data Path Width x Efficiency (B = C x D x E) with the units MBytes/s [10]. The Pentium requires a memory cycle time of 2-clock cycles i.e. the 2-2 mode (Efficiency = 1/2) for external DRAM [12]. However, if the memory cannot conclude a read/write request within this clock period additional clock cycles may be needed i.e. wait states. Each wait stare reduces the efficiency factor accordingly. For efficient data access burst mode is possible during which transfers can be affected by an initial 2 clock cycles and subsequent transfers needing only 1 clock cycle. The restrictions are an upper limit of 2-1-1-1 for the READ operation. The efficiency is therefore 4/5 i.e. 4 transfers in 5 clock cycles (Table 2).

363

Table 2: Pentium Mode

C

D

(MHz)

(Bytes)

E

Bandwidth (MBytes/s) =CxDxE

2-2

100

8

V4

400

lWait

100

8

1/3

266

Burst

100

8

4/5

640

The ISA bus operates at 8MHz with a data width of 2 bytes. However at least 2 clock cycles are needed i.e. E = 1/2. Each wait state reduces the efficiency accordingly (Table 3). Table 3: ISA Bus Mode

C

D

(MHz)

(Bytes)

E

B (Mbytes/s) =CxDxE

2-2

8

2

'/2

8

lWait

8

2

1/3

5

The Peripheral Component Interconnect (PCI) bus is a 32 bit bus but operating at frequency of 33Mhz. The PCI bus uses a multiplexing scheme in which the lines are alternately used as address and data lines. This reduces the number of lines but results in an increased number of clock cycles needed for a single data transfer. Each wait state reduces the efficiency accordingly. However the PCI bus is capable of operating in unrestricted burst mode. In this mode after the initial 2 clock cycles data may be transferred on each clock pulse. In this case E tends to unity (Table 4).

364 Table 4: PCI Bus Mode

C

D

(MHz)

(Bytes)

E

B (MBytes/s) =CxDxE

Write

33

4

l

/2

66

lWait

33

4

1/3

44

Burst

33

4

1

133

Using B-Nodes it is possible to model a spectrum of PCs ranging from those based on the first generation processor (8088, 8 bit ISA, floppy disc drive etc) through to those based on the latest fifth generation processors (Pentium, PCI, AGP etc). The use of the fundamental units of Mbytes/s allows other, more user oriented units to be derived. Bandwidth Nodes (B-Nodes) have been used as the pedagogical framework for computer and network technology curriculum and evaluated. According to Maj [10], advantages to using this pedagogical model include: • Students can perceive the PC as a unified collection of devices •

Node performance, measured in bandwidth (Frames/s) is a user based, easily understood measurement

•

The units Mbytes/s and Frames/s use a decimal scaling system

•

Students are able to evaluate different nodes of a PC by means of a common unit of measurement

•

Students can easily determine the anticipated performance of a PC given its technical specification

•

Students are able to critically analyze technical literature using this integrating concept.

•

The model is suitable for students from a wide range of disciplines (Computer Science, Multimedia, IT, Business IT)

•

The model is valid for increasing levels of technical complexity.

•

Nodes are independent of architectural detail

365

The model employs abstraction and hence it is independent of underlying technologies. It is therefore applicable not to current and old technologies but is likely to be valid for future technological developments. It is a scalable modeling method that can be used for digital systems, PC modules and a small LAN [13]. 4

A B-Node Model of an E-Commerce Web-Site

A range of different models is used for e-business web sites. The business model is used to define the business directions and objectives for a given level of resources. The business model itemizes the trading processes that can then be used as the basis of a functional model to specify e-commerce web navigational structures and functions. Customer models such, as the Customer Behavior Model Graph (CBMG) is a server-based characterization of the navigational patterns of a group of customers that may be used to quantify the number and type of customers and the associated request patters - all of which may be used to define an e-commerce workload [14]. A wide range of performance metrics is used and includes: hits/second, unique visitors, revenue, page views/day etc. The workload in conjunction with the resource model of hardware and software ultimately must be able to clearly define the site performance, which is used to specify a Service Level Agreement (SLA). Assume an e-commerce web site consists of a collection of servers (web server, application server, payment server etc) on an Ethernet LAN. This configuration can be modeled using CBMG and Client Server Interaction Diagrams (CSID's) in order to obtain the probability of message traffic between the different servers. Given the size of the messages then an approximation can be made about the performance of the LAN. Furthermore if the servers are located on two different LAN's it is possible to calculate the message delays and again the expected performance of this architecture. However the functional and customer models use a range of different metrics, which in turn differ from those used to specify server architecture. It is therefore difficult to directly translate the performance specification measured in page views/day to the required specification of for example a hard disc drive in the server. However, if a web server is modeled as a B-Node then the performance metric is bandwidth with units of Mbytes/s. The sub-modules of a server (microprocessor, hard disc, electronic memory etc) and also be modeled as B-Nodes, again using the same performance metric. The use of fundamental units (Mbytes/s) allow other units to be derived and used e.g. transactions per second (tps). Assuming the messages in a client/server interaction are lOkbytes each, the performance of each B-Node can be evaluated using the units of transactions/s (Table 5)

366

Table 5: Bandwidth (Utilization) Device

Bandwidth (MBytes/s)

Bandwidth (Tps)

Load (Tps)

Utilization

Processor DRAM Hard Disc CROM ISA Bus Ethernet

1600 64 2.7 2.3 4 11.25

160k 6.4k 270 230 400 1.1k

250 250 250 250 250 250

<1% 4% 93% >100% 63% 23%

Capacity planning is the process of predicting future workloads and determining the most cost-effective way of postponing system overload and saturation. If the demand on this server is 250 Transactions/s it is a simple matter to determine both performance bottlenecks and also the expected performance of the equipment upgrades. From table 5 it is possible to determine that for this web server, the hard disc drive, CDROM and ISA bus are inadequate. The metric of transactions/s can easily be converted to the fundamental unit of Mbytes/s, which can then be used to determine the required performance specification of alternative bus structures, CDROM devices and hard discs. A PCI (32 bit) bus structure is capable of 44Mbytes/s. A 40-speed CDROM device has a bandwidth of approximately 6Mbytes/s. Similarly replacing the single hard disc drive by one with a higher performance specification (rpm and higher track capacity) results is a new server capable of meeting the required workload (Table 6). Table6: Upgraded server Device Processor DRAM Hard Disc CROM PCI Bus Ethernet

5

Bandwidth (MBytes/s) 1600 64 12.5 6 66 11.25

Bandwidth (Tps) 160k 6.4k 1.25k 0.6k 6.6k 1.1k

Load (Tps) 250 250 250 250 250 250

Utilization <1% 4% 20% 42% 4% 23%

Secure Electronic Transactions

Security is an essential aspect of e-commerce transactions. There are two main classes of cryptographic algorithms: Symmetric and Public Key (PK). It is possible

367 to estimate the overheads of employing different cryptographic algorithms using the simple bandwidth model. The most common PK algorithm is RSA which has been evaluated as a system load for different key sizes measured in milliseconds [15]. Cryptographic algorithms are CPU intensive operations that required considerable microprocessor time. The B = C x D x E equation is still applicable, however instead of C (clock frequency, MHz) the reciprocal is used I/time (seconds) and D is the key length. It is then possible to calculate the effective bandwidth of a microprocessor in Mbytes/s. For a Pentium II, 266MHz the Input/Output bandwidth is approximately 2128Mbytes/s. However, the computational overhead for a 256Byte key size the public key performance is 5470Bytes/s and the private key performance is 2128Bytes/s. These figures clearly demonstrate that the PK encryption cannot be used for transferring large data volumes. 6

Conclusions

Large IT systems are a complex collection of heterogeneous technologies described by a wide variety of different models and associated performance metrics. This wide range of models and metrics is problematic as it is difficult to compare the relative performance of the different technologies. The performance of any system is ultimately dependant of the speed of the slowest device. B-Nodes have been successfully used to model computer technology on a micro level (digital systems, microprocessor, electronic memory, hard disc drive etc.). The use of fundamental units allows other more meaningful units to be derived. Using B-Nodes it is simple to convert the performance of an e-commerce web site (transactions/s) to Mbytes/s and hence determine the load on a server architecture. B-Node modeling is a simple, diagrammatic, and easy to use method. The model employs abstraction and hence it is independent of underlying technologies. It is therefore applicable not to current and old technologies but is likely to be valid for future technological developments. It is a scalable modeling method that can be used at the micro level but also on a macro level for a global information structure. It allows recursive decomposition that allows detail to be controlled.

References 1.

Tucker, A.B., et al., A Summary of the ACM/IEEE-CS Joint Curriculum Task Force Report, Computing Curricula 1991. Communications of the ACM, 1991. 34(6). 2. Cooling, J.E., Software Design for Real-Time Systems. 1991, Padstow, Cornwall: Chapman and Hall. 3. Dasgupta, S., Computer Architecture - A Modern Synthesis. 1989, New York: John Wiley & Sons.

368 4.

5. 6.

7.

8.

9. 10.

11. 12.

13.

14.

15.

Perry, D.E. and A.L. Wolf, Foundations for the study of software engineering. ACM SIGSOFT, Software Engineering Notes, 1992. 17 (4): p. 40-52. Amdahl, G.M., Architecture of the IBM/360. IBM Journal of Research and Development, 1964. 8(2): p. 87-101. Maj, S.P., et al., Computer and Network Installation, Maintenance and Management - A Proposed New Curriculum for Undergraduates and Postgraduates. The Australian Computer Journal, 1998. 30(3): p. 111-119. Maj, S.P., D. Veal, and P. Charlesworth. Is Computer Technology Taught Upside Down? in 5th Annual SIGCSE/SIGCUE Conference on Innovation and Technology in Computer Science Education. 2000. Helskinki, Finland: ACM. Maj, S.P., G. Kohli, and D. Veal. Teaching Computer and Network Technology to Multi-Media students - a novel approach, in 3rd Baltic Region Seminar on Engineering Education. 1999. Goteborg, Sweden: UNESCO International Centre for Engineering Education (UICEE), Faculty of Engineering, University of Melbourne. Clements, A., Computer Architecture Education, in IEEE Micro. 2000. p. 10-22. Maj, S.P. and D. Veal, Computer Technology Curriculum - A New Paradigm for a New Century. Journal of Research and Practice in Information Technology, 2000. 32(August/September): p. 200-214. Mueller, S., Scott Mueller's Upgrading and Repairing PCs.. 1999, QUE. Indianapolis Indiana, p. 891-898. Mazidi, M.A. and J.G. Mazidi, The 80x86 IBM PC & Compatible Computers, Volumes I & II, Assembly Language Design and Interfacing. 1995, New Jersey: Prentice Hall. Maj, S.P., D. Veal, and A. Boyanich. A New Abstraction Model for Engineering Students, in 4th UICEE Annual Conference on Engineering Education. 2001. Bangkok, Thailand: UNESCO International Centre for Engineering Education (UICEE), Faculty of Engineering, University of Melbourne. Menasce, D.A., et al. A Methodology for Workload Characterization for ECommerce Servers, in 7999 ACM Conference in Electronic Commerce. 1999. Denver, CO: ACM. Freeman, W. and E. Miller. An Experimental Analysis of Cryptographic Overhead in Performance-Critical Systems, in Seventh International Symposium. Modelling, Analysis and Simulation of Computer and Telecommunications Systems. 1999. College Park, MD.

369 THE MONTCLAIR ELECTRONIC LANGUAGE LEARNER DATABASE EILEEN FITZPATRICK AND STEVE SEEGMILLER Department of Linguistics, Montclair State University, Upper Montclair, NJ 07043, USA E-mail: {fitzpatr/seegmillj® sapir.montclair.edu The work described here aims to enable more efficient research and application design in the field of second language performance. We are doing this by expanding a corpus of error-annotated written English that we have built as a feasibility study [2], The goal is to make the resulting corpus publicly available for applications in second language pedagogy, research in second language acquisition, and the design of online writing aids for second language learners.

1

Introduction

Research and development in the field of natural language engineering proceeds by building models of human language performance in an effort to duplicate that performance on a machine. Over the past 10 years, the paradigm for language modeling in natural language engineering has shifted. Models based on introspectively obtained rules have given way to models based on empirically observed patterns in archived data, or corpora. This shift followed the success of the empirical approach in speech recognition [4,5] and the increase in machine storage capacity that enables large amounts of data to be maintained and manipulated. Since most language engineering applications serve the general user, most corpora are designed to model the language of the native speaker (NS). However, more recently, corpora that model the performance of non-native speakers (NNSs) of a language have begun to appear [3]. These corpora are designed primarily to enable the study of differences between NS usage and NNS usage, with the aim of understanding more about second language acquisition, and to enable tool development (spell checkers, grammar checkers, and other writing aids) for NNSs. This paper describes a particular type of NNS corpus of formal written English being developed at Montclair State. For some applications, primary language data is sufficient for model building, but the value of the data is greatly increased when it is annotated with linguistic information like the part of speech of the words in a sentence or the syntactic structure of the sentence. After careful hand annotation of a representative subset of the language, subsequent annotation may be done automatically [POS & parsing refs]. However, since the language of NNSs often varies greatly from the standard language, automatic annotation designed for the standard language performs poorly on NNS text. In this paper we describe a project at Montclair State that is hand

370

annotating the errors in NNS text in such a way that the text can subsequently be submitted for conventional automatic annotation for part of speech and syntactic structure. Montclair is particularly well-suited to carry out this project. Its Center for Language Acquisition, Instruction, and Research (CLAIR) teaches a set of languages - though primarily English - to speakers of unusually diverse native language backgrounds. CLAIR also houses master teachers of English as a Second Language and linguists to annotate the text. 2

The Raw Corpus

The raw corpus currently consists of formal essays written by upper level students of English as a Second Language preparing for college work in the United States. A portion of the essays are timed essays written in class; the rest are untimed drafts written at home. The corpus is small at 25,000 words, but we have recently begun collecting the data systematically which will increase its size quickly. Essays are either submitted electronically or transcribed from hand-written submissions. A record is kept as to how each essay was submitted. Interested student authors sign a release form that entitles us to enter their written work into the corpus throughout the semester. These students also complete a background form on native language, other languages, schooling, and extent and type of schooling in the target language, currently only English. 3

The annotation

Other corpora that are annotated for error, including the Hong Kong corpus [7] and the PELCRA corpus at the University of Lodz, Poland, use a predetermined tagset to mark the errors. While this approach guarantees a high degree of tagging consistency among the annotators, it limits the errors recognized to those in the tagset. Our concern in using a tagset was that we would skew the construction of a model of L2 writing by using a list that is essentially already a model of L2 errors. The use of a tagset also introduces the possibility that annotators will misclassify. Finally, we are concerned that the 'one size fits all' approach of a tagset would force us to apply the same standards to different written genres, e.g., email or postings to listserves. In place of a tagset, we ask annotators to minimally reconstruct the error to yield an acceptable English sentence. Each error is followed by a slash and a minimal reconstruction of the error is written within curly brackets. Missing items and items to be deleted are represented by "0". Tags and reconstructions look like this: school systems {is/are} since children {0/are} usually inspired

371 becoming {a/0} good citizens Reconstruction is faster than classification, there is no chance of misclassifying, and even less common errors are captured. Additionally, syntactic parsers and part-ofspeech taggers often fail with ungrammatical input. A reconstructed text can be more easily parsed and tagged for part-of-speech information. Reconstruction, however, has its own difficulties. Without a tagset, annotators can vary greatly in what they consider an error. One recurring example of this involves the use of articles in English. For instance, the sentence The learning process may be slower for {the/0} students as well is correct with or without the article before students. However, the use of the indicates that a particular group of students had been identified earlier in the essay, whereas the absence of the indicates that students refers to students in general. An additional difficulty is that different annotators may reconstruct an error differently. For example, the student need help can be reconstructed as the {student/students} need help or the student {need/needs} help. We are performing several experiments to determine how much accuracy and efficiency we could achieve in tagging errors[2]. The first experiment was a baseline test to determine if it is possible to get any sort of tagging agreement without a predetermined tagset. In a set of 1549 words, we identified 152 errors. The annotators achieved an average precision rate of .85 and a recall rate of .81.' Encouraged by these results, we tested whether we could develop annotation guidelines that would improve tagging agreement. The authors independently tagged a set of essays and compared annotations. This comparison is shown as Test One in Table 1. We then discussed our annotations, agreed on guidelines, annotated a second set of essays and compared. The comparison after discussion and guidelines is shown as Test Two. Test One Two

Words 2476 2418

Errors 241 193

Recall .73 .76

Precision .84 .90

Table 1. Experiment with annotator guidelines after Test One. Given these results, we are now replicating this experiment with master teachers to develop careful guidelines and a model tagged data set for graduate student annotators to follow. We anticipate that we will not achieve a higher level of agreement between the annotators tagging independently and that we will continue to need two annotators to produce reliably tagged data.

372 4

Annotation Tools

Currently, annotators are using a simple Linux text processor of their choice to annotate the text. We anticipate that we will be able to annotate common errors like subject-verb disagreement automatically and present 'cleaner' text to the annotators who will be left to deal with the more idiosyncratic errors. We intend to automate soon for a few high frequency errors and test whether this improves inter-annotator accuracy and/or efficiency in tagging. Other annotation projects increase efficiency by using interactive annotation tools [1], which also help accuracy by reducing some of the tedium of the task. These tools are better suited to part-of-speech and syntactic tagging where either small windows of text or partial syntactic trees are shown to the annotator. Since our annotation sometimes requires a global judgment at the paragraph level (for instance, in the case of the referent of students in the example given in section 3), we have not used interactive tools. Annotators compare their tags word by word with a Linux shell script using sdiff that lines up the text as shown below and also counts the number of shared tags and the number of tagging discrepancies. This enables the annotators to concentrate on the discrepancies efficiently {this/it} {will:would) not be surprising 5

{this/it} I will not be surprising

Accessing the Data

Each essay is stored in a separate file keyed by a unique number. Each file contains the essay including the annotations. Where the two annotators disagreed about a tag, both annotations are saved. A Linux s e d script enables a researcher examining the essays to see either the original, unannotated essay or the text with annotations. Corpus subdirectories divide the text data by course level and particular class and further subdivide it by essay type (timed or untimed). Background information on the author of each essay is kept in a single data file linked to the essay by the key. A menu driven by a Perl script gives the researcher access to the background information for a particular essay. We are currently writing a script that will enable the researcher to accumulate background information for a particular kind of error.

373 6

Data Processing Tools

We are currently using only Linux tools to look for patterns in the data while we build the corpus. These include searching for particular kinds of error and calculating error frequency given corpus and essay size. An issue we have yet to address regarding the corpus user is the idiosyncracy of the tags. Currently a user cannot search for a particular error type, for example number disagreement, since the tags do not indicate the error type. For high frequency errors, we plan to convert the tags automatically to a named error type for easy search. However, for less frequently occurring errors, the user will still have to peruse a list of tagged errors. Tests of how the user searches the corpus will inform our design of a tool to display less frequently occurring error types. We are also building a tool that will give the corpus user statistics on error occurrence, including error type plotted against background information. This is particularly useful in second language acquisition research which seeks to discriminate second language errors attributable to the native language background from errors attributable to the learning process in general, or to some universal features of language. 7

Applications

We plan to make the MELD corpus publicly available. The design of the corpus allows it to be used for several applications in second language pedagogy and research, as well as in the building of editing tools for second language learners. Here we give examples of possible applications of the corpus. 7.1

Second Language Pedagogy

Frequency of error by level or native language background gives a teacher information as to what writing problems s/he should concentrate on. By comparing word usage and syntactic usage against a comparable NS corpus, the teacher or textbook writer can discover gaps in the NNS use of the language and develop materials accordingly. The corpus can also be used for testing purposes since it allows testing to be targeted to specific levels and language backgrounds. Several types of corpus-based exercises for students have been developed [6] though they are not widely available. A publicly available corpus will enable more exercises of this type. In addition, MELD's reconstruction of the text enables students to use portions of the corpus for proofreading exercise. Certain types of error can be 'turned off so that the student sees only the type of usage s/he needs to master. The student can then compare corrections with those of the annotator.

374

7.2

Second Language Acquisition Research

As mentioned above, research in second language acquisition is heavily oriented to investigating the origin of errors either in the NNS's transfer of first language attributes or in use of an interlanguage that the NNS creates as ever closer approximations to the second language. The corpus will enable the researcher to statistically analyze the distribution of errors by native language background, level of study, gender, age, mastery of other languages, and spoken and written exposure to the target language. The corpus can also be used by lexicographers to study how the NNS word usage diverges from that of native speakers. 7.3

Editing Tools

Spell checkers and grammar checkers are typically based on frequency of error distribution. They do not work well for the writing of NNSs because their errors show statistically different distributions. For example, error of complement type (/ need {of/0} somebody) are rare in NS writing, but very common in the MELD corpus. The corpus provides the statistical base required to develop these tools. 8

Conclusion

The corpus being collected and annotated should provide a wealth of empirical data to assist in second language pedagogy, research, and tool building. We plan to make the corpus and tools publicly available on the web. 9

Acknowledgements

We thank the master teachers Jacqueline Cassidy, Norma Pravec, and Lenore Rosenbluth, who contributed careful labor and thoughtful discussion in providing a tagged data set and tagging guidelines and the graduate student annotators Jennifer Higgins and Donna Samko. References 1. Bredenkamp, A, B. Crysmann, and J. Klein Annotation of error types for German news corpus, in Journees AT ALA sur Ies Corpus Annotes pour la Syntaxe Treebanks Workshop (1999) Paris. 18-19 juin pp. 77-84. 2. Fitzpatrick, E. and Seegmiller, S. Experimenting with Error Tagging. The Second North American Symposium on Corpus Linguistics and Language

375

3. 4. 5. 6. 7.

1

Teaching. Northern Arizona University, Flagstaff, AZ, (March 31-April 2 2000). Granger, S. (ed). Learner English on Computer. (1998) Addison-Wesley Longman. Jelinek, F. Self-organized language modeling for speech recognition. IBM T.J. Watson Research Center, Continuous Speech Recognition Group, Yorktown Heights, NY (1985). Jelinek, F. Markov source modeling of text generation. In The Impact of Processing Techniques on Communications, ed. by J.K. Skwirzinski (Nijhoff, Dordrecht, 1985). Milton, J. Exploiting LI and interlanguage corpora in the design of an electronic language learning and production environment. In Granger, S (ed). Milton, J. and N. Chowdhury. Tagging the interlanguage of Chinese learners of English. In Entering Text, ed. by L. Flowerdew and A.K.K. Tong. Language Centre, The Hong Kong University of Science and Technology (1994).

Precision is the measure of errors identified by both annotators divided by the errors identified by the 'non-expert'. (How many tags were correct out of all the errors s:he tagged?) Recall is the measure of errors identified by the non-expert divided by the errors identified by the expert. (Out of all the errors identified, how many did the non-expert get?) Precision and recall show the distance in tagging between the two annotators.

Computing Formalism/Algorithms

379

I M P R O V E M E N T OF SYNTHESIS OF C O N V E R S I O N RULES BY EXPANDING KNOWLEDGE REPRESENTATION H.MABUCHI Iwate Prefectural University, 152-52 Sugo, Takizawa, Iwate, 020-0173, Japan E-mail: [email protected] K.AKAMA. H.KOIKE AND T.ISHIKAWA Hokkaido University, Kita 11, Nishi 5, Kita-ku, Sapporo, 060-0811, Japan E-mail: [email protected], [email protected] This paper proposes a natural and efficient method for solving a problem by expanding the space using the characteristics of the space in which a problem can not be solved. We deal with the synthesis of conversion rules that simplify a logical circuit as a concrete problem, and by comparing the synthesis of conversion rules in the space before expansion of a knowledge representation with that in the space after expansion, we show that expansion of knowledge representation is efficient for synthesis of conversion rules. A declarative program is treated as a knowledge representation, and the synthesis of conversion rules is performed by equivalent transformation. The expansion of knowledge representation is performed by applying an equivalent transformation rule.

1

Introduction

Each of knowledge representations has been shown to be superior for certain types of problems. However, in some cases a problem can not be solved effectively and automatically in only a certain space of a certain knowledge representation. We therefore consider how to solve such problems effectively and automatically. As a solution, we propose in this paper a natural and efficient method for solving a problem by expanding the space using the characteristics of the space in which a problem can not be solved. We deal with the synthesis of conversion rules that simplify a logical circuit as a concrete problem [1], and by comparing the synthesis of conversion rules in the space before expansion of a knowledge representation with that in the space after expansion, we show that expansion of knowledge representation is efficient for synthesis of conversion rules. The synthesis of conversion rules requires a new conversion rule (i.e., synthesis rule), which is obtained by combining two or more existing successive conversion rules to achieve a new conversion from one state to another. A declarative program is treated as a knowledge representation, and a problem that can not be solved in the space before expansion of declarative

380

program is solved in an expanded space that uses the characteristics of the space before expansion. The space before expansion of declarative program is the same as the space of logic program [2]. In an expanded space, various data structures, including strings and multisets as well as terms, can be treated. The synthesis of conversion rules is performed by equivalent transformation [3,4]. Equivalent transformation is the conversion of a program into another equivalent program. The expansion of knowledge representation is performed by applying an equivalent transformation rule and enables a solution to be obtained efficiently and automatically and at a low cost. 2

Improvement of Synthesis by Expanding Knowledge Representation

The concept of improvement of synthesis by expanding knowledge representation is shown in Fig.l. Let a space before expansion be Ti, and let the space after expansion be IV

\ y i

y 1

/

Figure 1. Improvement of Synthesis by Expanding Knowledge Representation.

A is a program including conversion rules which are synthesized. Program B including a synthesis rule is obtained from A. However, if this synthesis rule is not useful, program D including a useful synthesis rule must be obtained in Ti. When automatically obtaining D from B is difficult (see chapter 3), we must consider other methods to obtain a useful synthesis rule. The problem, however, is the cost of change. Methods such as not changing the way to synthesize or not changing the knowledge representation could be considered to reduce this cost. We therefore propose a method for solving a problem in an expanded space of Ti using the characteristics of space Y\. Data structures are characterized by a mathematical structure called a specialization system [4], and a program is defined on a specialization system. A specialization system is a theoretical foundation of knowledge representation and determines the objects that are treated in each space. By prescribing a specialization system, Ti and 1^ can be made. 1^ must be established so that

381

program C including a useful synthesis rule from B can be obtained efficiently and automatically. To achieve this, we expand Ti to allow treatment of various data structures including multisets and constraints. The transformation from Ti into T2 is performed by an equivalent transformation rule that converts representation of Ti into that of I V As mentioned above, program A can be transformed efficiently and automatically into C through B by equivalent transformation in I V 3

Synthesis in the Space before Expansion of Declarative Program

We describe two conversion rules ("andAnd rule" and "noConnection rule"), which are synthesized, from among the many conversion rules that simplify logical circuits. The andAnd rule means that "when an output terminal of a and element is an input terminal of another and element, two and elements become one" and is represented in the space before expansion as follows. C\ : arc(andAnd,Circuitl,Circuit2)*member_rest([and,E,IN],Circuitl.RestCircuit), member_rest([and,D,R1],RestCircuit,RR), member_rest(D,IN,R2), member_rest([and,E,R3],Circuit2,RestCircuit), union(R1.R2.R3).

The first argument of predicate arc is the name of the conversion rule, the second argument is a logical circuit before conversion, and the third argument is a logical circuit after conversion. As for [and,E,IN], for example, the element is and, output is E, and input is IN. member_rest(X, Y,Z) means that "Z is a list that removes element X from list Y". The noConnection rule means that "when a terminal of an element is not connected to a terminal of another element, the element is removed from the logical circuit" and is represented in the space before expansion as follows. C2 '• arc (noConnection, Circuit, RR1)«— member_rest([ELEMENT,AA,P].Circuit,RR1), notExist(AA.RRl), free(AA).

notExist(AA,RRl) means that AA is not used as a terminal in logical circuit RR1, and free(AA) means that AA is not connected to another terminal. The rule obtained by the synthesis of Cj and C2 is as follows. C3 : newarc(andAndnoConnection,Circuitl,RRl)-(—

382 member_rest([and,E,IN],Circuitl.RestCircuit), member_rest([and,D,R1],RestCircuit,RR), member_rest(D,IN,R2), member_rest([and,E,R3],Circuit2,RestCircuit), union(R1,R2,R3), member_rest([ELEMENT,AA,P],Circuit2,RRl), notExist(AA.RRl), free(AA).

The predicate newarc has the form such that newarc (synthesis rule r, state 1, state 2), and this means that "state 1 is converted into state 2 by r". However, Cz is simply a combination of the body of C\ and that of C2 • By calling one clause Cz instead of calling two clauses Ci and C2, execution time is slightly decreased, but there are no other merits of the synthesis. Thus, C3 in program B is not useful. Therefore, we try to look for a useful synthesis rule (including in program D of Fig. 1) i n T i . Since there are five member_rest literals in the body, we consider reducing the number. Therefore, we consider a transformation in the intersection of plural member_rest literals in the body. For example, suppose that the following possible transformation is selected. Tx : {member_rest(Yl,Y2,Y3), ...(a) member_rest(Y4,Y5,Y2), ...(b) member_rest(Yl,Y5,Yans)} . . . ( c )

I {member_rest(Yl,Y2,Y3), ...(d) member_rest(Y4,Yans,Y3)} . . . ( e ) 7\ means to transform {(a),(b),(c)} into {(d),(e)}. By applying this transformation to Cz, one useful synthesis rule can be obtained. However, finding this transformation is difficult, and this transformation is not equivalent. 4

Example of Improvement of Synthesis by Expanding Knowledge Representation

We look for a useful synthesis rule in an expanded space by applying an equivalent transformation rule to Cz- This processing corresponds to the transformation from B to C in Fig.l. Since there exist five member_rest literals in the body of Cz, we apply the following equivalent transformation rule to these literals.

383

T2: member_rest(X,Y,Z)

4equal(Y,{X|Z}) equal means Y = {X | Z}. By applying T2 to C3, the following clause C 4 is obtained. Then, in an expanded space, logical circuits can be represented as a set of plural elements. C4 : newarc(andAndnoConnection,Circuitl,RR1)<— equal(Circuitl,{[and,E,{D|R2}],[and,D,Rl]|RR}), equal(Circuit2,{[and,E,R3],[and,D,Rl]|RR}),union(Rl,R2,R3), equal(Circuit2,{[ELEMENT,AA,P]|RR1}), notExist(AA.RRl),free(AA). Here, {[and,E,R3],[and,D,Rl] | RR} is unified with {[ELEMENT,AA,P] | RR1}. Then, the unifications could be as follows. (1) [and,E,R3]0 = [ELEMENT,AA,P]a, {[and.D.Rl] IRR}0 = RRICT (2) [and,D,Rl]0 = [ELEMENT,AA,P]CT,{[and,E,R3]IRR}0 = RRICT

(3) RR0 = {[ELEMENT,AA.P] |RR'}<7,{[and,D,Rl] , [and,E,R3] |RR'}0 == RRl
384

Here, o, mpl, mql, etc. are input or output terminals. The examples of (1) and (2) are substituted for Ce- Then, there exists a substitution such that {D / {mp2},Rl / {mql, mq2},E / o,R2 / {mpl}, R R / (),R3 / {mql, mq2 , mpl}}, and this substitution is also true concerning the body. As for C5 and CV, there exists no such substitution. Therefore, C& is a useful synthesis rule corresponding to the given examples. 5

Discussion

By expanding knowledge representation, three synthesis rules were automatically obtained. Therefore, one synthesis rule C3 in the space before expansion tacitly contains information on these three synthesis rules. However, it is difficult to obtain one useful synthesis rule from C3 in I V In contrast, in an expanded space, by giving examples, one useful synthesis rule (C§) corresponding to the given examples can be obtained efficiently and automatically. In the representation of a rule, since most of the logical circuits are represented in the body, in the space before expansion, the body is long and it is difficult to understand connections of circuits. Moreover, since the body is unfolded by equivalent transformation, the processing cost is high when the body is long. In contrast, in an expanded space, since most of the logical circuits are represented in the head as a set, the body is short and it is easy to understand the connections of circuits. References 1. Takeuchi A. and Fujita H.. Competitive Partial Evaluation. Workshop on Patial Evaluation and Mixed Computation. (1987) pp. 317-326. 2. Lloyd J.W.. Foundations of Logic Programming. Second Edition. (Springer-Verlag. 1987). 3. Tamaki H. and Sato T.. Unfold/fold Transformation of Logic Programs. Proc. of 2nd ILPC. (1984) pp. 127-138. 4. Akama K.. Shimizu T. and Miyamoto E.. Solving Problems by Equivalent Transformation of Declarative Programs. Journal of Japanese Society for Artificial Intelligence, vol.13. N0.6 (1998) pp. 944-952. 5. Dejong G. and Mooney R.. Explanation-Based Learning: An Alternative View. Machine Learning, vol.1. No.2 (1986) pp. 145-176. 6. Mitchell T.M.. Keller R. and Kedar-Cabelli S.. Explanation-Based Generalization: A unifying view. Machine Learning, vol.1. No.l (1986) pp. 47-80.

385 A BLOCKS-WORLD PLANNING SYSTEM BHANU PRASAD AND VORAPAT CHAVANANIK.UL School of Computer and Information Sciences, Georgia Southwestern State University, AmericusGA 31709 USA E-mail: [email protected] In this paper we present a planning system for blocks-world domain. This system is inspired by the way how human beings perform real world tasks. An important component of the system is a whole priority list, which guides the system in selecting suitable sub-goals in solving a given problem. The system is entirely different from the existing systems, which are primarily based on either backtracking or on invariant intermediate states or random selection of sub-goals. This system selects sub-goals in a systematic fashion, as guided by the whole priority list. It generates a plan in polynomial amount of time. The system has been implemented using Common Lisp. A graphical user interface is incorporated for the convenient specification of user inputs.

1

Introduction

For a given start and goal states, a plan is a sequence of operators (or states) that connects the start state to the goal state. The process of finding this sequence is the task of planning [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]. One of the important domains for demonstrating planning systems is blocks-world [4, 8,9, 12]. A

c A

B C

B

Start state: ON(A, TABLE), ON(C, A), ON(B, TABLE)

Goal state: ON(C, TABLE), ON(B, C), ON(A, B)

Figure 1. A sample start and goal state descriptions. In the literature, various versions of blocks-world planning have been widely investigated [4, 8, 9, 12]. The objects in the problem domain include a finite number of cubical blocks and a table large enough to hold all the blocks. Each block is on a single other object (either another block or the table). For each block b, either b (i.e., the top of b) is clear or a single (unique) block a sitting on b. There is a single action called move, which can move a single clear block, either from another block onto the table, or from the top of an object onto another clear block.

386 A problem in this domain is specified by giving the start and goal state descriptions. For example, figure 1 contains a start and goal state descriptions. In the literature, a number of approaches have been presented for blocks-world planning [4, 8, 9, 12]. These systems either require non-deterministic selection of sub-goals [12] or backtracking [9, 12] or need some invariant intermediate state [4] or require hierarchical arrangement among the pre-conditions of the operators [12] or require a case-base of previous plans [1, 2, 3, 5, 6, 10]. Non-deterministic selection is very expensive in terms of time and computational efforts. Though the intermediate state approach is simple, it often generates non-optimal plans. The problem with the hierarchical approach is that the user has to explicitly define the hierarchies. The case-based approaches are efficient but they are good for structured domains [10]. In this paper we present a system for solving blocks-world planning problem. 2

Proposed System

The proposed system is inspired by the way how human beings perform real world tasks such as building construction (buildings are constructed from bottom to top but not vice-versa). The system is based on the observation that, for a given start and goal states, the sub-goals need to be satisfied from the bottommost level to till the topmost. For example, in figure 1, the sub-goal ON(C, TABLE) is at the bottommost level and the sub-goal ON(A, B) is at the topmost level. •|

Input

1

Graphical User interface 1 J r™"f. . ,. " ijf Priority list eenerator p~~ w

Output

Figure 2. Schematic diagram of the system. Once the structure of a given goal is analyzed, then the analysis is used in determining the order in which the sub-goals are considered while generating a plan. The system consists of three modules namely the graphical user interface, the priority list generator and the planner, as shown in figure 2. The system has been implemented using Common Lisp. In the diagram, the arcs represent the direction of data flow. The graphical user interface module accepts user input and converts it into a suitable format and supply the result to the priority list generator. The priority list generator generates a list containing the ordering relations among the sub-goals and supply the output to the planner. The planner generates a plan and supplies the output to the user. Now we see each of these modules in detail.

387 2.1

Graphical user interface

A graphical user interface is developed for the convenient specification of user inputs. Visual Basic® front end is used as the interface. The interface consists of two forms, one for specifying the start state description and another one for the goal state. If a block X is on top of another block Y then this information is represented as ON(X, Y). Here, ON is the name of a predicate. The information regarding the internal representation of user input is passed on to the priority list generator. 2.2

Priority List Generator

The priority list generator generates information regarding the order among the subgoals. This module is based on a keyword parser [6]. Some concepts that are used in the rest of the paper are presented below. Tower: A vertical structure of blocks is called a tower. Example: In figure 3, the goal state consists of three towers. The first tower is made up of the blocks D, H, G and E, the second one is made up of F, B, and A and the third one consists of C. Priority list of a tower: The priority list of a tower is a sequence of elements of the form ON(X, Y), in which the first element is the bottommost sub-goal in that tower and the next element is the next bottom level sub-goal in the tower and so on. Example: In figure 3, the priority list of the first tower in the goal state is: (ON(D, TABLE), ON(H, D), ON(G, H), ON(E, G)) H

E

G

G

A

F

H

B

E

D

F

Start state

C

Goal state

Figure 3. A sample start and goal states. Whole Priority List (WPL) of a state: For a given state description, the collection of priority lists with each list represents a tower in the state is called the whole priority list of the state or simply WPL of the state. Example: The WPL of the goal state in figure 3 is: ((ON(D, TABLE), ON(H, D), ON(G, H), ON(E, G)) (ON(F, TABLE), ON(B, F), ON(A, B)) (ON(C, TABLE))) Bottommost goal of a tower: From WPL, find out the appropriate priority list of the tower and return its first element. Bottommost goal of a tower is updated whenever WPL of the state is updated. Example: The bottommost goal for the first tower in the goal state of figure 3 is: ON(D, TABLE).

388 Bottommost set: It is the union of the bottommost goals of each of the towers in the goal state. Bottommost set is updated whenever the bottommost goal of a tower in the goal state is changed. Example: the bottommost set for the example in figure 3 is: ((ON(D, TABLE), ON(F, TABLE), ON(C, TABLE)). Topmost block of a state: For a given state description, a block which does not support any other block is a topmost block in that state. Example: The topmost blocks of the goal state in figure 3 are: E, A, and C. Height of a block in a state: For a given state and a block, the number of supporting blocks of the block is called the height of the block in the state. Example: In figure 3, the heights of the blocks D, H, G, E in the goal state are respectively 0, 1,2, and 3. 2.2.1

Algorithm for finding WPL

1. Receive goal state description from the user. 2. Convert the user input into the internal form and call it as INPUT 3. WPL
389 2.3

Planner

The planner generates the plan based on the information supplied by the priority list generator. The planning algorithm is presented below. In this algorithm the start state is represented as S and the goal state as G. 1. Create an empty list called PLAN. Create a Boolean variable FLAG. 2. Until the bottommost set is empty do the following { 3. FLAG
Complexity Results

If the number of blocks in the goal state is n then in the worst case, the INPUT is parsed n + (n-1) + (n-2) +...2 + 1 = n(n-l)/2 times. Therefore, the worst case complexity of the WPL algorithm in section 2.2.1 is O(n'). Steps 3, 4, and 5 are the important ones for the algorithm in section 2.3. The worst-case complexity of step 3 is 0(n"). The worst case complexity of step 4 is 0(n3). This is because, in the worst case, there are n elements in the bottommost set. For each of these elements, the move operator can be instantiated and applied in nc2 ways. The worst case complexity of step 5 is again O(n). As a result, the total complexity of the algorithm in section 2.3 is 0(n4). The total complexity of the system is 0(Maximum(n2, n4)) = 0(n4), which is polynomial type. 4

Conclusion and Future Work

In this paper we present a system for solving blocks-world planning problem. In addition, we have proved that the complexity of this algorithm is 0(n4). Since blocks-world planning is NP-hard [9], in some special cases the plans may not be

390 optimal. Now we are investigating the special cases. We are also applying the above results to other NP-hard problems in computer science. References 1. Bhanu Prasad., Planning With Hierarchical Structures, Proceedings of the Australian and New Zealand International Conference on Intelligent Information Systems, Australia, 1995. 2. Bhanu Prasad., Planning With Case-Based Structures, Proceedings of the American Association for Artificial Intelligence (AAAI) Fall Symposium on Adaptation of Knowledge For Reuse, David Aha and Ashwin Ram, Co-Chairs. MIT, Cambridge, USA, 1995. This paper is also available at http://www.aic.nrl.navv.mil/~aha/aaai95-fss/papers.html 3. Bhanu Prasad and Deepak Khemani, A Hierarchical Memory-Based Planner, Proceedings of the 1995 IEEE International Conference on Systems, Man & Cybernetics, Canada, 1995. 4. Bhanu Prasad and Deepak Khemani, Search Reduction in Blocks-World Planning, Proceedings of the Third International Conference on Automation, Robotics, and Computer Vision, Singapore, 1998. 5. Bhanu Prasad and Deepak Khemani, A Memory-Based Hierarchical Planner, Case-Based Reasoning Research and Development, Manuela Veloso and Agnar Aamodt (eds.), Lecture notes in Artificial Intelligence, Springer-Verlag 1313. 6. Bhanu Prasad and Deepak Khemani, Cooperative Memory Structures and Commonsense Knowledge for Planning, Progress in Artificial Intelligence, E. Costa and A. Cardoso (Eds.), Springer-Verlag, Lecture Notes in Artificial Intelligence, 1323. 7. Do. B. and Kambhampati. S., Solving Planning Graph by Compiling it into a Constraint Satisfaction Problem, International Conference on Artificial Intelligence Planning Systems 2000. 8. Gupta. N and Nau. D.S., Complexity Results for Blocks-World Planning, In AAAI-91, 1991. 9. Gupta. N and Nau. D.S., On the Complexity of Blocks-World Planning, Artificial Intelligence, 56(2-3):223-254, 1992. 10. Hammond. K., Case-Based Planning: Viewing Planning as a Memory Task, Acedemic Press, NewYork, 1989. 11. Lotem. A, Nau. D and Hendler. J., Using Planning Graphs for Solving HTN Problems, In AAAI-99, 1999. 12. Nilsson. N.J., Principles of Artificial Intelligence, Morgan Kaufmann Publishers, 1993 13. Zimmerman. T and Kambhampati. S., Exploiting the symmetry in the Planning-graph via Explanation-guided Search, In AAAI-99, 1999.

391 MULTI-COMPUTATION MECHANISM FOR SET EXPRESSIONS

H. K O I K E Hokkaido

Division of System and Information Engineering, University, Kita 11, Nishi 5, Kita-ku, Sapporo 060-0811, E-mail: [email protected]

Japan

K. A K A M A Hokkaido

Center for Information and Multimedia Studies, University, Kita 11, Nishi 5, Kita-ku, Sapporo 060-0811, E-mail: [email protected]

Japan

H. M A B U C H I Iwate Prefectural

Faculty of Software and Information Science, University, 152-52 Sugo, Takizawa, Iwate 020-0173, E-mail: [email protected]

Japan

A set expression is useful for describing problems in which sets of elements satisfying given conditions are treated. In this paper, we propose a new method for representing and computing sets of terms that satisfy given conditions. We introduce an expression for representing sets, called a 'set-of reference', equivalent transformation rules for computation of the expression, and multi-computation space called 'world ! G In this paper, we also show a sample problem that can be solved by our method but cannot be solved by a logic paradigm.

1

Introduction

A set expression is useful for describing problems in many cases [5]. Logic programming languages such as Prolog [2] each have a built-in predicate 'setof' for finding a set of answers to a given condition as an atom or more [1]. However the 'setof predicate cannot treat infinite sets and does not have logical formulas. In this paper, a new method for treating sets is proposed. The method is based on the equivalent transformation (ET) paradigm [3] in which problems including sets are represented by a declarative description (a set of extended clauses) and computation is performed by equivalent transformations. In our method, problems including sets are solved with theoretical validation of correctness without using the usual theory of logic programming. A combination of the multi-computation mechanism called world mechanism and equivalent transformation rules provides flexible computation and avoids infinite loops. We show a sample problem that can be solved by our method,

392

but cannot be solved by logic programming. 2

Declarative Description

In the ET paradigm, a problem is defined by a declarative description. A declarative description is a set of definite clauses. An example of declarative description is as follows: P = {Ci,C2,Ca,C4}U{Qi}. Ql = (yes «— setof(X, [even(X)], S),mem(4, S).) Ci = (mem{A, [A\Z]) «- .) C 2 = [mem(A, [B\Z]) <- mem{A,Z).) C3 = (even{0) <- .) C4 = (even(X) <- X = Y + 2, even(Y).) The declarative description P consists of clauses from Ci to C4 and Qi- The clause Qi is a query and asks whether a set of all even numbers contains 4. The setof atom in Q\ is called a 'set-of reference' and represents a set of all even numbers. In a later section, the setof atom is described in detail. The clauses C\ and C2 define the mem predicate that determines whether the first argument is an element of the second argument. The clauses C3 and C4 define the even predicate that represents even numbers. In the ET paradigm, a declarative description does not have procedural semantics, but declarative semantics. Some problems rise when we attempt to solve P in Prolog systems. One of them is that C4 causes some errors, since the order of atoms in Prolog is significant. A more serious problem is that the setof atom in Qi, which is implemented as a built-in predicate, may cause infinite loops. Using our method, however, the correct answer to P can be obtained in finite time. 3

Declarative Meaning

A declarative description determines a set of ground atoms, which is called a 'meaning'. Definition 1 Let P be a declarative description. The declarative meaning M(P) of P is defined as follows: M(P)d±f 7>(0)Up>] 2 (0) U[r P ] 3 (0)U = --- = U~ = iP>]"(0), where 0 is an empty set and Tp is the "immediate consequence mapping", which is defined as follows. Definition 2 For any set x of ground atoms, g €Tp(x) iff there is a substitution 0 and a definite clause C in P such that CO is a ground clause, g is the head of CO, and all the body atoms of CO belong to x.

393 An ET rule w.r.t P is a rewriting rule that preserves .M(-P). 4

Set-of Reference

In order to represent a problem of treating a set of terms that satisfy a given condition, we propose a 'set-of reference'. Definition 3 For an arbitrary set X, gterm(X) is a mapping that gives the set of all lists that represent the set X. For example, gterm({a, b}) = [[a, b], [b, a]]. Definition 4 Let AT be an atom, x a term in AT, P a declarative description, X a term, QT a set of all ground terms, and S a set of all substitutions. An atom setofp(x,AT,X) represents the following relation. X G gterm({xO € QT \6 € S,AT6 6 M(P)}). Definition 5 Let ATL be a list of atoms, x a term whose elements are all contained in ATL, P a declarative description, X a term, QT a set of all ground terms, and S a set of all substitutions. An atom setofp(x,ATL,X) represents the following relation. X 6 gterm\{xe EQT\0£S, ATLO C M (P)}). For example, let Q = {p(l,2) <- . ,p(3,4) <- .}. setofQ(X, \p(X, Y)], [S\Z]) is true iff Z = [1]. The setof atom represents a 'set-of reference'. 5

ET Rule

Representation of ET rules ET rules rewrite a declarative description P into P' preserving the relation M.(P) = M{P'). An ET rule is represented by the following form: (head) => {a list of procedures} i, (a list of atoms) i; =>• {a list of procedures^ i (a list of atoms) 2 ; =^- {a list of procedures}n, (a list of atoms) n. An atom (head) is called 'head'. A head represents a matching pattern. For i = 1, 2, ..., n, a pair of {a list of procedures}j, which is a list of procedures such as unifications and arithmetic operations, and {a list of atoms}j is called a 'body' of the ET rule. A body can have a (possibly empty) list of procedures and a (possibly empty) list of atoms. An ET rule replaces an atom in a body of a clause matched by head with a list of atoms in a body of the rule if the

394 rule has one body. An ET rule can have one or more bodies. More generally, when an ET rule is applied to a clause the rule makes n copies from the original clause, executes the list of procedures in each body, rewrites atoms in each duplicate clause, and replaces the original clause with new clauses. If the execution of any of the procedures in the duplicated clause fails, then the clause is deleted. World Mechanism The world mechanism is used to compute set-of references. It provides multicomputation space called 'world' that is concurrently processed. Due to this mechanism, computation is divided into some smaller computations, and we can process them in any order when more than one applicable ET rule exists; thus, flexible computation is possible and we can avoid infinite loops. To process computation with the world mechanism, we introduce three rules; initializing rule, referring rule, and deleting rule. The initializing rule creates a new world and puts a new clause into the world. The new clause is made from a setof atom and is transformed by ET rules to obtain a set of answers. The referring rule is used by a world to refer to states of other worlds. The deleting rule deletes objects created by the initializing rule if they are not needed. Fig. 1 shows an example of the flow of computation with the world mechanism. A detailed explanation is given in a later section. Setof Atom A set-of reference setofp(T, C, S) is represented as: setof (T,C,S), where T is a term, C is a sequence of atoms representing conditions, and S is a list of terms. T represents an element of S and is included in C. S represents a set of ground instances of T satisfying C. The setof refers implicitly to a declarative description P by the world mechanism. 6

Problem-Solving Based on the ET Paradigm

In this section, an example of computation for set-of references is presented. The problem of determining whether 4 is an even number or not is considered. The declarative description P, described in section 2, defines the problem. Qi in P is a query that means the above question. Note that Q\ has a setof atom that represents a set of all even numbers.

395 Suppose a declarative description P is given. The following rules are made from P. (rl) mem(X,Y)

=> {Y = [X\Z]}; =*• {Y = [A\Z]},mem(X,Z). (r2) even(X) => {X = 0}; =>X = Y + 2,even(Y). The rules are ET rules since they preserve M(P). In computation, only <3i is transformed by ET rules. Fig. 1 shows the flow of computation. Worldl in Fig. 1 is implicitly created at the beginning of computation. The computation steps are as follows: 1. First, Qx is rewritten into Qi by the initializing rule, and the rule creates World2 and puts clause Ni into World2. The ref atom in Q2 refers to World2. 2. iVi is next rewritten into N2 by (r2). Since there exists a unit clause (ans(0) «— .) in WorZd2, Q2 is applied by the referring rule. The rule substitutes [OjS'2] for the third term S in the ref atom and deletes the unit clause in World2. Thus, Q3 and N3 are obtained. 3. Qz is transformed into Q4 by (rl), and N3 is transformed into N4 by (r2). 4. Since there exists a unit clause (ans(2) «— .) in World2, Q4 is applied by the referring rule. This rule substitutes [2153] for the third term 52 and deletes the unit clause in World2. Thus, Q 5 and A^5 are obtained. 5. A^5 is transformed into A^ by (r2). Then Q§ and N7 are obtained by the referring rule. 6. Q7 is obtained by applying (rl) to the atom in Q67. Qs is obtained by applying (rl) to the atom in Q-?. 8. Finally, since ref atom no longer has any effect on other atoms, the deleting rule is applied and Qg is obtained. Since there is a unit clause in Qs, computation can be terminated and an answer to the query is obtained.

396 7

Conclusions

We have proposed set-of references for describing problems with sets of terms satisfying a given condition and rules for computing set-of references and the world mechanism. Set-of reference is implemented on the world mechanism, enabling correct computation and avoiding infinite loops. Therefore, our method overcomes some problems that cannot be solved by logic programming. In our method, computation is conducted by using ET rules. We can describe computation corresponding to SLD-resolution and other methods by ET rules. Thus, we can describe more efficient algorithms by ET rules that cannot be realized by SLD-resolution. Furthermore, compared with other implementations for set expressions [5], a combination of the world mechanism and ET rules is simpler. References 1. D. Li: A PROLOG DATABASE SYSTEM, Research Studies Press Ltd., 1984. 2. J.W. Lloyd: Foundations of Logic Programming, Second edition, (Springer-Verlag, 1987). 3. K. Akama, T. Shimizu, and E. Miyamoto: Solving Problems by Equivalent Transformation of Declarative Programs, Jounal of Japanese Society for Artificial Intelligence, vol.13, NO.6, 1998, 944-952. 4. T. Yokomori: A Note on the Set Abstraction in Logic Programing Language, Proceedings of The International Conference on Fifth Generation Computer Systems, 1984, 333^340. 5. B. Jayaraman, K. Moon: Subset-logic programs and their implementation, The Journal of Logic Programming 42, 2000, 71-110.

397

World 1

"'{L

yes«-setof(X, [even(X)], S), mera(4, S). By t h e initializing rule The Initializing rule cre< itesworld2. '

w , VVUI

Id 2

Ans(X)«-even(X).

r

Q2<^

. * By (r2) C_ans(0)»-.; ,_-'"'' insTX)*1-X=M+2, even(M).

By t h e referring rule

r

"I

.,--""

yes«-ref(X, W2, S2), mem(4,[0|S2]).

J J

|

By(rl)

By (r2)

yes«-ref(X, W2, S2), mem(4,[S2]). C'ans(2)«^) , ' - ' ' ~ a n s T X ? - X « M + 2 , M=N+2, even(N).

1

.,'''

1

yes«-re«<X, W2, S3), mem(4,[2 |S3]). *"

* By t h e referring rule

6W

_-—"

_____

" ans(X? : : X=M+2, M=N+2,N=P+2,even(P).

By(rl)

r

*{ r Q8<*

1

ans(X)«-X-M+2,M=N+2,N-P+2,even(P). |

yes«~ref(X, W2, S4),mem(4,[4|S4]). |

}"•

By (r2)

Cans(4)«-D

yes<-ref(X, W2, S4), mem(4,[2,4|S4]).

L

}"•

: 1

ans(X)<-X=M+2, M=N+2,even(N).

r-

h 1

ans(X)*-X=M+2, even(M).

By t h e referring rule

-{

1N2

1

1

r

-{

k J

•

yes*-ref(X, W2, S), mem(4,S).

By(rl]

yes—ref(X, W2, S4). 1 By t h e d e l e t i n g rule

Q9J yes*—.

Fig. 1. Flow of Computation.

}"• 1 }"'

399

P R O V I N G T E R M I N A T I O N OF w R E W R I T I N G

SYSTEMS

Y. S H I G E T A Toshiba

Corporation,

580-1, Horikawa-cho, Saiwai-ku, Kawasaki, E-mail: [email protected]

212-8520,

Japan

K. A K A M A A N D H. K O I K E A N D T . ISHIKAWA Hokkaido

University, Kita 11, Nishi 5, Kita-ku, Sapporo, 060-0811, E-mail: {akama, koke, ishikawa}@cims.hokudai.ac.jp

Japan

A termination problem of a rewriting system proves the non-existence of an infinite reduction sequence obtained by the rewriting system. This paper formalizes a method for proving termination by abstraction, i.e., reducing an original concrete termination problem to a simpler abstract one and then solving it to prove the original problem's termination. Concrete and abstract rewriting systems in this paper are called w rewriting systems. They include very important systems such as term rewriting systems, string rewriting systems, semi-Thue systems, and Petri Nets.

1

Introduction

A termination problem of a rewriting system is to prove that the rewriting system has no infinite reduction sequence. Proving termination by "bruteforce search" [7] can take much (often infinite) time and space, since many (infinite) paths must be checked before all paths turn out to be finite. A better method is to try to map the original concrete termination problem to a simpler abstract one, with the aim of deriving useful information for the solution of the original problem. Such a technique is often applied to complicated problems in computer science and artificial intelligence [8,5,3]. The aim of this paper is to formalize such a method to prove termination of various rewriting systems that can be formalized as u rewriting systems. The class of u> rewriting systems is a very large class of rewriting systems and most important rewriting systems, including term rewriting systems, semi-Thue systems, and Petri Nets, are u rewriting systems. The class of u) rewriting systems is defined on axiomatically formulated base structures, called u structures, which are used to formalize the concepts of "terms," "substitutions," and "contexts" that are common to many rewriting systems. The base domains of abstract rewriting systems must often be defined as domains that differ from the domains of the original concrete rewriting systems. The class of u rewriting systems is large enough to include both concrete and abstract rewriting systems. Adoption of the class of u rewriting

400 systems is essential to establish the present theory of termination. 2 2.1

u Structures and u Rewriting Systems u> Structures

In order to formalize the common base structures of term rewriting systems and other rewriting systems, a structure, called an w structure, was introduced [1,2]. An w structure is used to formalize the concepts of "terms," "substitutions," and "contexts" that are common to many rewriting systems. Definition 1 Let Trm, Sub, and Con be arbitrary sets, e an element in Sub, and • an element in Con. Let Dom be a subset of Trm. Let fss be a mapping from Sub x Sub to Sub, fee a mapping from Con x Con to Con, fas a mapping from Trm x Sub to Trm, frc a mapping from Trm x Con to Trm, and fes a mapping from Con x Sub to Con. Then, the eleven-tuple {Trm, Dom, Sub, Con, e, • , fss, fee,

fTS,frc,fcs)

is called an u> structure when it satisfies the following requirements: Dl D2

Vt€Trm:fTS{t,e)

= t,

VteTrm:fTC{t,n)=t,

D3 Vf G Trm,W1,62

G Sub :

/TS(/TS(MI),02)

D4 Vt G r r m , V c i , c 2 G Con : fTc{fTc(t,c1),c2)

= frs{t,

fss(0i,92)),

= /Tc(t,/cc(cic2)),

D5 yt G Trm, Vc G Con, V0 G Sub : fTs{fTc{t,c),8) = fTc{fTs(t,9),fcs{c,6)). Application of the mappings fSs, fee, fas, frc, and fCs is usually denoted in a more readable manner: fss(61,62) is denoted by 6i82, fcc(ci,c2) by cic 2 , fTs(t,0) byt6, fTc{t,c) byte, and fcs(c,0) by c6. Hence, fTe{fTs(t,0),c) is denoted by (td)c. Left associativity is assumed for such notation. For instance, (• • • ((tci)c2) • • • c„), which is the result of successive application of ci, c2, • • •, cn to t G Trm, is denoted by tc\c2 • • • c„. Thus, the five requirements in Definition 1 can be restated as follows: Dl Vt G Trm :te = t, D2 V* €Trm:tn D3 Vi G Trm,V81,02

= t, G Sub : t6xe2 -

t(dxB2),

401

D4 W G Trm,Vci,e 2 G Con :tc\c2 = t(cic2), D5 V* G Trm,Mc G Con,V0 G 5u6 : ic0 = t6(c$). 2.2

Rewriting Systems on u> Structures

A rewriting system R on an w structure £l = (Trm, Dom,Sub,Con,
rule (I, r) G R, denoted by u —4 v, iff there are 9 G Sub and c G Con such that u = Wc and v = r0c. Definition 3 A term v G Dom is immediately reachable from u G Dom by a R

rewriting system R, denoted by u —> v, iff there is a rule (I, r) G R such that u —4 v. Definition 4 A term v G Dom is reachable from u G Dom by a rewriting system R, denoted by u —>•* v, iff Dom includes terms si, S2, • • •, sn (n > 1) such that R U — Si

R

>• S2

R ¥ •• •

¥ Sn

=

V.

The rewriting relation —>•* is the reflexive and transitive closure of—K For an arbitrary u rewriting system R, the set of all pairs (x, y) such that % —K y will be denoted by [K\t i.e., [ti\ = {(x,y)\x-$.y}. The class of all u rewriting systems includes very important systems [1,2] such as term rewriting systems, string rewriting systems, semi-Thue systems, and Petri Nets. 3 3.1

Homomorphism Definition of Homomorphism

The concept of a homomorphism from an u> structure to an ui structure is introduced. Definition 5 Let Q\ and £l2 be w structures: fii = (Trmi, D o m i , 5 « 6 i , C o n : , e i . D i , / S S I , / C C I , / T S I , / T C I , / C S I ) , Q 2 = (Trmi, Dom2,Sub2,Con2,e2,D2, fSS2, feci, frs2,/TC2>/CS2>-

402

Let hx be a mapping from Trmi to Trm2, hs a mapping from Subi to Sub2, and he a mapping from Coriy to Con2. A triple of mappings (hT,hs,hc) is a homomorphism from Cli to Q,2 iff 1. hT(fTCi(fTSi(t,6),c))

=

fTC2(fTS2{hT(t),hs(9)),hc(c))

for all t £ Trmi, 0 6E Sub\, and c £ Con\, 2. hT{Domi)

C Dom2.

Using the notational convention mentioned earlier, the first requirement for a homomorphism is denoted simply by hT(t$c) = hT{t)hs{e)hc(c). In this paper, a triple of mappings {hx,hs,hc) is assumed to be a homomorphism from an w structure Qi = (Trmi, DomltSubi,Coni,ei.Di, fssi, feci,hsi, foci, fesi) to an u> structure Q 2 = {Trm2, Dom2, Sub2,Con2,€2,^2, fss2,fcc2,fos2,foc2,fcs2)Since h? is a mapping from Trmi to Trm2, it can naturally be extended into the following mappings: hr : Trmi x Trmi —> Trm2 x Trm2, (x,y) L

1—>(h T (x),h T (y)),

. WTrmiXTrmi

_.

Wrrm2X.Trm2

S^{(hT(x),hT(y))\(x,y)eS}. Note that, for the sake of simplicity, all these extensions are referred to by the same name hr- In particular, a rewriting system on fii is transformed into a rewriting system on Q,2 by the mapping u

. oTrmixTrmi

.

nTrm2xTrm2

In other words, if R is a rewriting system on fii, then hx{R) is a rewriting system on Q2. 3.2

Relation between Concrete and Abstract Rewriting

Assume that a concrete rewriting system and an abstract rewriting system are in a homomorphic relation (hT,hs,hc), i.e., (hT,hs,hc) is a homomorphism from the concrete rewriting system to the abstract rewriting system. Then, it can be proven that, if x is rewritten into y by the concrete rewriting system R, then hx(x) is rewritten into hx(y) by the abstract rewriting system hr(R), Proposition 1 (See [3]) Let {hx,hs,hc) be a homomorphism from an w rewriting system fii to an u) rewriting system Q,2. Let R be a rewriting system on fli. If x—>y,

then hx{x)

^—± hx(y).

403

4 4-1

Termination of w Rewriting Systems Termination

Let R be an ui rewriting system on fi. A term t in Dom is non-terminating with respect to R iff there is an infinite sequence of terms ti,tz,-- • ,tn,- • • in Dom such that t = 11 and i,- —> t,-+i for all i = 1,2,3, • • •. A term t in Dom is terminating with respect to R iff i is not non-terminating with respect to

R. Let D be a subset of Dom. A set D is terminating with respect to R iff all terms in D are terminating with respect to R. An u rewriting system R is terminating iff Dom is terminating with respect to R. 4-2

Termination Theorem for u> Rewriting Systems

Termination with respect to an u rewriting system R is determined by the set [R]; i.e., t is non-terminating if and only if there is an infinite sequence of terms ti,^,- • • ,tn,- • • in Dom such that t = ti and (tt,i t -+i) £ [R] for all i — 1 , 2 , 3 , - •. Hence, by the homomorphism theorem, the following theorem is obtained. Theorem 1 [Termination Theorem] Let R be an ui rewriting system on Qj. Let (hT,hs,hc) be a homomorphism from an ui rewriting system fii to an u> rewriting system 0,2- Then, R is terminating ifhT(R) is terminating. Proof. Assume that t in Dom\ is non-terminating with respect to R. Then, there is an infinite sequence of terms ti,t2,- • • ,tn,- • • in Dom such that t — ti and ti — • t,-+i for all i = 1,2,3, • • •. By Proposition 1, there is an infinite sequence hxiti), /»T(^2), • • •, kritn), • • • in Dom such that hxit) = hr{ti) a n d hriU) — • ^r(*t+i) f° r a u J = 1>2,3, •••. Hence hx{t) is nonterminating with respect to hT(R). This proves that if t in Dom\ is nonterminating with respect to R, then hr{t) is non-terminating with respect to hx(R). By contraposition, it follows that t in Dom\ is terminating with respect to R if /&T(*) in Dom,2 is terminating with respect to hx(R). Hence, R is terminating if HT{R) is terminating. •

4-3

Example

The coffee bean puzzle [3,6] is formulated by an w rewriting system R = {bb -j-iu, bXw -*- Xb}. A homomorphism consisting of Ay that mapps a string into the number of b

404 and w in the string, one has hT(R) = { 2 -> 1, 2 + X - » 1 + X } . Since h,T{R) includes only decreasing rules, it follows t h a t h-T(R) is terminating. Therefore, by T h e o r e m 1, R is also terminating. 5

Concluding Remarks

This p a p e r proposes a theoretical foundation for proving termination of ui rewriting systems. T h e theory comprises the following elements; two u structures, two u> rewriting systems, two reachability relations on the two w rewriting systems, a homomorphism between two u structures, a homomorphic relation between the two w rewriting systems, and the termination theorem. References 1. K. Akama, Common Structure of Semi-Thue Systems, Petri Nets, and Other Rewriting Systems, Hokkaido University Information Engineering Technical Report, HIER-LI-9407 (1994), revised version in IEICE Trans, of Information and Systems, E80-D (12), pp.1141-1148 (1997). 2. K. Akama, An Axiomatization of a Class of Rewriting Systems, Hokkaido University Information Engineering Technical Report, HIER-LI-9409 (1994). 3. K. Akama, H. Mabuchi, Y. Shigeta, Homomorphism Theorem and Unreachability for Omega Rewriting Systems, in Xiao-Shan Gao and Dongming Wang (Eds.), Computer Mathematics, Proceedings of the 4th Asian Symposium on Computer Mathematics (ASCM2000), Lecture Notes Series on Computing Vol.8, pp.90-99, (2000). 4. B. Buchberger, History find Basic Features of the Critical-Pair / Completion Procedure, / . Symbolic Computation 3, pp.3-38 (1987). 5. P. Cousot and R. Cousot, Abstract Interpretation and Application to Logic Programs, J. Logic Programming, 13 (2&3), pp.103-179 (1992). 6. N. Dershowitz and J. Jouannaud, Rewrite Systems, Handbook of Theoretical Computer Science, Chapter 6, pp.243-320 (1990). 7. R.E. Korf, Planning as Search: A Quantitative Approach, Artificial Intelligence 33, pp.65-88 (1987). 8. E.D. Sacerdoti, Planning in a Hierarchy of Abstraction Spaces, Artificial Intelligence 5, pp.115-135 (1974).

405 SEMANTICS FOR DECLARATIVE DESCRIPTIONS WITH REFERENTIAL CONSTRAINTS

K. A K A M A A N D H. K O I K E A N D T . ISHIKAWA Hokkaido

University, Kita 11, Nishi 5, Kita-ku, Sapporo, 060-0811, E-mail: {akama, koke, ishikawa}@cims.hokudai.ac.jp

Japan

Higher-order relations, such as not and set-oj, are useful for knowledge representation, especially for description of queries to databases. However, it is very difficult to formalize the semantics for correct computation of higher-order relations. In this paper, we introduce a class of constraints, called referential constraints, the meaning of which is related to the meaning of other atoms, and define the semantics of referential constraints. This theory formalizes a general semantics for constraints (simple and referential constraints), based on which we obtain correct computation of many constraints such as not and set-oj constraints and first-order constraints.

1

Introduction

Constraints in the body of a definite clause are used to restrict instantiation of the definite clause [2,5] and are useful for representing already known relations that can not be defined by a finite set of definite clauses. Usual constraints, which will be called simple constraints in this paper, can not, however, represent "higher-order relations" such as not and set-of constraints, the meaning of which is related to the computation results of some queries. In this paper a concept of referential constraints is newly defined as an extension of usual constraints. A referential constraint has, as its arguments, more than one declarative description, which is a set of definite clauses, each of which may contain referential constraints in the body. Semantics of referential constraints will be defined together with referential declarative descriptions. This theory is essential to the correct computation of referential declarative descriptions, which include not and set-of constraints and first-order constraints [3,6]. 2 2.1

Declarative Descriptions Terms, Atoms, and Substitutions

Let K, F, V, and R be mutually disjoint sets. The four-tuple of K, F, V, and R is called an alphabet and denoted by S. Each element in the sets K, F, V, and R is called, respectively, a constant, a function, a variable, and a predicate (on E). All concepts in this paper will be defined on the alphabet

406

E. However, reference to the alphabet E is often omitted for simplicity. We assume that terms, atoms (atomic formulas), and substitutions (on E) are denned as usual [5]. The definition of ground terms, ground atoms, instances of terms, instances of atoms are assumed to be the same as the ones in [5]. An object is either a term or an atom. A ground object is either a ground term or a ground atom. A substitution {ii/
Declarative Descriptions

Declarative descriptions consisting of atoms and constraints are inductively defined as follows. Definition 1 [Declarative Description] 1. A constraint is a (m+l)-tuple

(<j),di,d2, • • • ,dm), where

• m > 0, • <j> is a mapping from G\ x G2 x • • • x Gm to {true, false}, with each d (i = 1,2,3, • • •, m) being identical to either Q, QT, or 2g. • di (i = 1,2,3, • • • ,m) is either an atom if Gi is Q, a term if Gi is QT, and a declarative description ifG{ is 1?. A constraint (cj),di,d2, • • • ,dm) is called a simple constraint iff di,d2,- • • ,dm are all objects (terms and atoms). A constraint (, di, d2, • • •, dm) is called a referential constraint iff at least one d( is a declarative description. 2. A definite clause is an expression of the form H <— B\,B2, • • • ,Bn, where H is an atom, and each Bj (i = 1,2, ••-,«) is either an atom or a constraint. A definite clause is called a simple definite clause iff it does not include referential constraints. A definite clause is called a referential definite clause iff it includes one or more referential constraints. 3. A declarative description is a set of definite clauses. A declarative description is called a simple declarative description iff it consists of

407

only simple definite clauses; otherwise it is called a referential declarative description. D This is an inductive definition. Firstly, simple constraints are defined by 1. Secondly, simple definite clauses are defined by 2. Thirdly, simple declarative descriptions are obtained by 3. Next, referential constraints that contain simple declarative descriptions are defined by 1. Then, new definite clauses containing these referential constraints are added, and new declarative descriptions containing these new definite clauses are defined. Repeating such definition, all declarative descriptions are determined. The set of all constraints is denoted by Con. The set of all definite clauses is denoted by Del. The set of all declarative descriptions is denoted by Dsc. Let C be a definite clause H <— B\, B2, • • •, B„. H and (B\, B2, • • •, Bn) are respectively called the head and the body of C. The head of C is denoted by head(C). The set of all atoms Bi in the body of C and the set of all constraints Bj in C are denoted, respectively, by atom(C) and con(C). Let con be a constraint (<j),di,d2, • • • ,dm). The set of all objects d,- in {d\,d2, • • • ,dm} is denoted by cobj(con). The set of all declarative descriptions d{ in {d\,d2, • • • ,dm} is denoted by dsc(con). The set of all declarative descriptions that appear at the toplevel of C, i.e., {d I d £ dsc(con),con G con(C)}, is denoted by dsc(C). A constraint {<j>,di,d2, • • • ,dm) is a ground constraint iff each d,- (i = 1,2, •••,m) is either a ground object or a declarative description. The set of all ground constraints is denoted by Gcon. A definite clause consisting of only ground atoms and ground constraints is called a ground definite clause or, more simply, a ground clause. The set of all ground clauses is denoted by Gels. 2.3

Examples of Declarative Descriptions

A simple declarative description, i.e., a declarative description that does not contain referential constraints, is shown. [ Example 1 ] A definition of even relation is given by the following declarative description. Peven = { even(0) <- . even(Y) <-even(X),(fadd2,X,Y). }. faddi is a mapping from QT X QT to {true, false} such that fadd2(t>s) = true

if t and s are numbers and t + 2 = s,

408

fadd2(t,s) = false otherwise. Next, a referential declarative description is shown. The declarative description Peven in the previous example is used in the referential constraint in order to define a predicate odd. [ Example 2 ] The odd predicate is defined by the following declarative description P0dd, which refers to a declarative description Peven defined in Example 1. Podd = { odd(Z) <(fnot,even(Z),Peven),nat(Z). nat(Z)i-(fnat,Z). }. fnot is a mapping from Q x 2s to {true, false} such that fnot{g,G)=true if g<£G, fnot(g, G) = false if g € G. fnat is a mapping from GT to {true, false} such that fnat{g) = true fnat{g) = false 3

if g is a natural number, if g is not a natural number.

Meaning of Declarative Descriptions

3.1

Specialization by Substitutions

Specialization operation by substitutions to constraints, definite clauses, and declarative descriptions is inductively defined as follows. Definition 2 [Specialization] 1. The result of application of a substitution 8 £ S to a constraint c — {<j>,rfi,d2, • • •, dm), denoted by c0, is defined by c6 = (,di,d2,---,dm)0= {
409 tions is defined by 1. Then, specialization of new definite clauses containing these referential constraints is added, and specialization of new declarative descriptions containing these new definite clauses is defined. Repeating such definition, specialization of all constraints, all definite clauses, and all declarative descriptions is determined. 3.2

Meaning of Definite Clauses and Declarative Descriptions

The meaning of definite clauses and declarative descriptions is inductively defined as follows. Definition 3 [Meaning] /. A mapping val : Gcon —• {true, false} is defined by val{{(f>,di,d2,--- ,dm)) = {9\,92,--- ,9m), where gt = di if di GG^GT,

and gi — M{di)

if d, G Dsc.

2. A set Tcon, called the set of all true constraints, is defined by Tcon — {con \ con 6 Gcon, val(con) = true}. 3. The meaning A4(C) of a definite clause C is defined by M(C)

d

= {(head{C0), atom(C0)) \ 0 G sub(C), CO e Gels, con(C6) C Tcon},

where sub(C) is the set of all substitutions on the set of all variables in {head(C)} U atom(C) U cobj(C). 4- A mapping Tp : 2^ —> 2^ for a declarative description P is defined as follows. For any set x of ground atoms, TP{x)d=

{head\ C £P, atom C x, (head, atom) G

M(C)}.

5. The meaning M(P) of a declarative description P in Dsc is defined, using the mapping Tp for P, by oo

M(P)d^def \J[TP]n(9). n=l

410

T h e meaning of a declarative description P is computed as follows. Firstly, val(con) is determined for all simple constraints con. Secondly, the set of true simple constraints is determined as a subset of Tcon. Thirdly, the meaning of simple definite clauses is determined. Fourth, Tp is defined for a simple declarative description P. Fifth, the meaning of declarative descriptions P consisting of only simple definite clauses is determined. Next, val(con) is determined for all referential constraints con t h a t include simple declarative descriptions P. Repeating such operations, the meaning of all declarative descriptions is determined. 4

Concluding Remarks

In order to develop a theoretical foundation for knowledge representation and correct computation with higher-order relations, referential constraints are defined as an extension of usual constraints. Semantics of referential constraints has been defined by assigning a set of ground atoms for each declarative description t h a t may contain simple and referential constraints. Then, correct computation for referential constraints is immediately determined by the equivalent transformation paradigm [1], i.e., declarative descriptions with referential constraints are transformed equivalently preserving their meaning. Based on the theory, we have an interpreter (named E T I [4]), which enables us to correctly compute not, set-of, and first-order constraints. References 1. K. Akama, Y. Shigeta and E. Miyamoto, A Framework of Problem Solving by Equivalent Transformation of Logic Program, J. Japan Soc. Artif. Intell., Vol.12, No.2, pp.90-99 (1997). 2. J. JafFar and J. L. Lassez, Constraint Logic Programming, Technical Report, Department of Computer Science, Monash University, June 1986. 3. H. Koike, K. Akama and H. Mabuchi, Multi-Computation Mechanism for Set Expressions, International Conference on Computing and Information Technologies (ICCIT 2001), (to appear 2001). 4. H. Koike, K. Akama and H. Mabuchi, Equivalent Transformation Language Interpreter ETI, 5th IEEE International Conference on Intelligent Engineering Systems 2001 (INES 2001), (to appear 2001). 5. J.W. Lloyd, Foundations of Logic Programming, Second edition, SpringerVerlag, 1987. 6. T. Yoshida, K. Akama and E. Miyamoto, Program Synthesis from First-order Expressions for Problem Solving in the String Domain, Trans. Information Processing Society, Vol.41, No.SIG 7 (TOM 3), pp.12-22 (2000).

411

SOLVING LOGICAL P R O B L E M S B Y EQUIVALENT T R A N S F O R M A T I O N K. AKAMA AND H. KOIKE Hokkaido

University, Kita 11, Nishi 5, Kita-ku, Sapporo, 060-0811, E-mail: {akama,koke}Qcims.hokudai.ac.jp

Japan

Y. SHIGETA Toshiba

Corporation,

580-1, Horikawa-cho, Saiwai-ku, Kawasaki, E-mail: [email protected]

212-8520,

Japan

H. MABUCHI Iwate Prefectural

University, 152-52 Sugo, Takizawa, Iwate, E-mail: [email protected]

020-0173,

Japan

In logic programming, computation is regarded as inference. In this paper, we propose a new method to solve logical problems by equivalent transformation and develop a theoretical foundation for the correctness of the method. Given a logic program P and a query q, A logical problem (P, q) is formalized as finding the set L(P, q) of all ground instances g of q such that P \= g. The set L(P, q) is represented by t h e declarative semantics of a logic program P' that is produced from P and q. The logical problem (P, q) is solved by transforming P' equivalently into a simpler form, preserving its declarative semantics and utilizing many transformation rules. Inferential (resolution-based) problem solving can be regarded as a special case of the proposed method.

1

Introduction

In logic programming, computation is regarded as inference [4]. Given a logic program P and a query q, Prolog finds substitutions 9 such that P |= q9 by using inference (SLD-resolution). It is widely believed that inference is the unique and best way to solve logical problems. Many extensions of logic programming based on logical inference have also been developed. In this paper, however, we propose a new method to solve logical problems by equivalent transformation, without sticking to inference, and develop a theoretical foundation for the correctness of the method. A logical problem is reformulated as finding the set L(P,q) of all ground instances g of q such that P (= g. The set L(P, q) is represented as a function of the declarative semantics of a logic program P' that is produced from P and q. P' is transformed equivalently into a simpler form, preserving its declarative semantics and utilizing many transformation rules. From the simplified P', the solution of (P, q) is obtained. This method provides a more general class of computation than the

412

logical inference in logic programming does in the sense that any computation by SLD-resolution can be obtained by equivalent transformation and there is more efficient computation by equivalent transformation that is not obtained by SLD-resolution.

2 2.1

Logical Problems Logic Programs

Let A be an alphabet for the predicate logic. Let A be the set of all atoms on A, Q the set of all ground atoms (atoms that do not include variables) on A, and S the set of all substitutions on A. An instance of an atom is an atom obtained by application of a substitution to the atom. A ground instance of an atom is a ground atom that is an instance of the atom. The set of all ground instances of an atom a is denoted by rep(a). A definite clause on A is a formula of the form H «— B\, • • •, Bn (n > 0), where H, Bi, • • •, Bn are elements in A. H and ( 5 i , • • •, Bn) are called the head and the body of the definite clause, respectively. The head of a clause C is denoted by head(C), and the set of all atoms in the body of a clause C is denoted by body(C). Atoms that occur in the body of a definite clause are called body atoms. A definite clause consisting of only ground atoms is called a ground clause. An instance of a definite clause is a definite clause obtained by application of a substitution to all atoms in the definite clause. A ground instance of a definite clause is a ground definite clause that is an instance of the definite clause. A logic program on A is a set of definite clauses on A. A logic program is often called simply a program in this paper. The set of all definite clauses on A and the set of all logic programs on A are denoted by Dclause(A) and Program(A), respectively.

2.2

Interpretation and Model

An interpretation / on A is a subset of Q. A ground clause C is true with respect to an interpretation / iff head(C) 6 / or body(C) <£. I. An interpretation / is a model of a definite clause C iff all ground instances of C are true with respect to / . An interpretation / is a model of a program P iff / is a model of all definite clauses in P.

413

2.3

Logical Consequence

A set Ei of definite clauses is a logical consequence of a set E\ of definite clauses [E\ |= i?2) iff any model of E\ is a model of Ei- A definite clause C is a logical consequence of a set E of definite clauses (E \= C) iff any model of E is a model of C.

2-4

Logical Formalization of Problems

In Prolog, a problem to be solved is specified by a pair of a program P and an atom (called a query) q. To find all substitutions 6 that satisfy P |= (q6 <—) is the aim of the problem °. Computation in Prolog is regarded as solving these problems by "reductio ad absurdum", and is formalized as SLD-resolution. Assume that substitutions 9\, $2, • • •, 0m are obtained by SLD-resolution. The soundness and completeness theorem of SLD-resolution guarantees that the set of these substitutions 0\, 62, • • •, #m is a correct answer to the query q with respect to P in the sense that, the set Ui 6 { l i 2 ,...,m}{ftp|3/9G5}, i.e., the set of all substitutions that are more specific than one of 9\, 82, • • •, 6m, is identical to the set {0\P\=(q9<-)}, i.e., the set of all substitutions 0 that satisfy P (= (q0 <—). In this paper, however, a solution to be found by logic programming is formulated not as a set of substitutions but as a set of ground atoms. More precisely, we introduce the following definition. A pair (P, q) of a logic program P on A and a query q € A is called a logical problem, which requires finding L{P, q) = {g\P\=(g <-), g £ rep(q)}, which is a subset of Q. L(P,q) is called a solution set of the logical problem (P,q). When a substitution 0 is obtained by SLD-resolution, let 6 be regarded as a representative of the set rep(q6), i.e., the set of all ground instances of q6, and consider that all elements of rep(q$) are obtained. Then, the soundness and completeness theorems of SLD-resolution guarantee to compute L(P,q) correctly.

°(a) In the sequel, only definite clauses will be used for logical formulas. Thus, P |= V(q6) in the conventional theory is denoted by P \= (q6 <—).

414

3 3.1

Transformation of Logical Problems Introduction of New Predicates

Let P be a logic program on A, and q an atom on A. Consider a new logic program P ' = P U {4>{q) <— q}, where (q) is the atom obtained by changing the predicate r of q into a new predicate r'. We give some definitions for such transformation of logic programs. Let r be a predicate of q, and r' a new predicate that is not included in the predicates in A. Let Ar be the set of all r atoms (atoms with a predicate r), and Qr the set of all ground r atoms. Let A' be the set of all r' atoms, and Q' the set of all ground r' atoms with a predicate r'. Let <j> : Ar —*• A' be a one-to-one mapping that maps an r atom into an r' atom by only replacing r with r'. Let A be an alphabet obtained by augmenting A with a predicate r'. The set of all atoms on A is A U A'- The set of all ground atoms on A is G UQ'. The set of al[substitutions on A is S. Then, P' = PU {<j>(q) <- q} is a logic program on A. 3.2

Basic Propositions

Three propositions are given for investigating in Section 3.3 the relation between a logic program P on A and a logic program P U {(q) <— q} on A. Proposition 1 If M C. Q and M' C Q', then the next two conditions are equivalent. (1) (Mf)rep(q)) (2) MUM'

CM'.

is a model of {(q) <- q}.

Proposition 2 Let M be any model of P U {4>{q) <— {M D rep(q)) C M'. Proposition 3 Let M (C Q) be any model of P, and M' (C Q') a set that satisfies (M l~l rep(q)) C M'. Then, M U M' is a model of P U {(q) <- q}. Proposition 4 The next two conditions are equivalent. (1) I is a model of (a <—). (2) rep(a) C / . 3.3

Transformation of Logical Problems

From Proposition 1, 2, 3, and 4, the next theorem is obtained.

415

T h e o r e m 1 The next two conditions are equivalent. (1) P\=(g<-),gerep(q). (2)PU{(q) <- q} \= (4>(g) < - ) . * € & • The solution set of a logical problem can be transformed by the following theorem. T h e o r e m 2 The solution set L(P,q) of a logical problem {P,q) is identical to

{g\PU{
Representation of Logical Problems using Declarative Semantics Minimal Model

Let P be a logic program. A subset MM(P) of Q is a minimal model of P iff MM{P) is a model of P and MM(P) C M for all model M of P. The following theorems regarding the minimal model of a logic program are well known [4]. T h e o r e m 3 Any logic program P has a minimal model. The minimal model of P is the intersection of all models of P. When a logical consequence is a unit clause, the relation of logical consequence can be represented by the inclusion relation of two sets. T h e o r e m 4 P \= (a <-) «=>• MM(P) D rep(a). 4-2

Declarative Semantics of Logic Programs

Declarative semantics of a logic program will be defined. Firstly, a mapping Tp : 2 e —> 2 e is defined for a logic program P on A. Definition 1 ( M a p p i n g Tp) A mapping Tp : 2 e —>• 2 e for a logic program P on A is defined by TP{I) d= {head(C6) \CeP, 0eS, CO e Gclause(A), body{CB) C J } for each ICQ, where Gclause(A) is the set of all ground clauses on A. Declarative semantics of a logic program P is defined by using the mapping TP. Definition 2 (Declarative semantics of a Program) Let P be a program on A. Declarative semantics of a program P, denoted by M(P), is

416

defined by oo

M(p)^{j[TPn®), n= l

where 0 denotes the empty set. It is already known [4] that declarative semantics of a program P is identical to the minimal model of P. Theorem 5 M(P) = MM(P). 4-3

Representation of the Solution Set of a Logical Problem by using Declarative Semantics

Theorem 6 The solution set L(P,q) of a logical problem (P,q) is equal to (?) <-<7})n£'). 5

Solving Logical Problems by Equivalent Transformation

5.1

Method of Solving Logical Problems by Equivalent

Transformation

From Theorem 6, the solution set L(P,q) of a logical problem (P,q) can be computed by finding M{PU {{q) <- g}) n Q'. Theorem 7 Let F be a set of unit clauses whose heads are atoms in A'. If M(PU{<j>(q)^q})=M(PUF), then L{P,q) = U W a ) < - ) € F r e P ( a ) Now, a method of computing the solution set L(P, q) of a logical problem {P, q) by equivalent transformation is obtained. [Solution by Equivalent Transformation] 1. Let r be a predicate of q. Let r' be a predicate that is not included in the set of all predicates in A. Let be a mapping that transforms an r atom into an r' atom by replacing r with r'. 2. Let G\ be a definite clause (<j)(q) <— q). 3. Transform P ' = P U {Gi} equivalently until we obtain a clause set P" = PDF, where F is a set of unit clauses consisting of r' atoms. Then, an equivalent transformation sequence PU{Gi}-> >PUF is obtained. Unless such a clause set F is obtained, this procedure fails.

417 4. Obtain t h e solution set L(P,q)

of t h e logical problem (P, q) by

L(P, which is t h e union of all sets rep(a) for all atoms a, each of which is included in a clause ((a) <—) in the set F. 6

Conclusion

A theoretical foundation of solving logical problems by equivalent transformation is developed. Many problems, including the kind of problems solved by Prolog, can be formalized and solved by using this method. In this method, computation is correct as long as all the rules are correct. Various rules are available for correct computation by equivalent transformation, while only definite-clause rules are used in logic programming. Hence, the proposed m e t h o d allows various computation p a t h s compared with Prolog, which is one of the key points for more efficient computation [1]. We have implemented an interpreter called E T I and a compiler called E T C for programming based on equivalent transformation [3]. Using these systems, experiments on integrated processing of syntactic and semantic analysis of n a t u r a l languages [2] and development of automatic generation of programs from specifications [5] have been carried out. References 1. K. Akama, Y. Shigeta and E. Miyamoto, A Framework of Problem Solving by Equivalent Transformation of Logic Program, J. Japan Soc. Artif. Intell., Vol.12, No.2, pp.90-99 (1997). 2. M. Hatayama, K. Akama and E. Miyamoto, Improvement of knowledge processing systems by addition of Equivalent Transformation Rules, J. Japan Soc. Artif. Intell., Vol.12, No.6, pp.861-869 (1997). 3. H. Koike, K. Akama and H. Mabuchi, Equivalent Transformation Language Interpreter ETI, 5th IEEE International Conference on Intelligent Engineering Systems 2001 (INES 2001), (to appear 2001) 4. J.W. Lloyd, Foundations of Logic Programming, Second edition, SpringerVerlag, 1987. 5. T. Yoshida, K. Akama and E. Miyamoto, Program Synthesis from First-order Expressions for Problem Solving in the String Domain, Trans. Information Processing Society, Vol.41, No.SIG 7 (TOM 3), pp.12-22 ! ! (2000)

419 DECIDING THE HALTING PROBLEM AND PRELIMINARY APPLICATIONS TO EVOLUTIONARY HARDWARE, AND HYBRID TECHNOLOGY A.A. ODUSANYA Biomedical Computing Research Group (BIOCORE) School of Mathematical and Information Sciences, Coventry University, Priory Street Coventry, CV1 5FB, UK E-mail: [email protected] The halting problem is a historical problem with computers, this paper puts forward an abstract computational machine that would decide the halting problem, and thus improves the quest for aspects of artificial intelligence, in this case, automated verification of computational devices as in evolutionary computation, and the inherent ability for computers to construct other computer hybrids.

1

Introduction

Turing in 1936 had indicated earlier on that the halting problem was not decidable, in this paper I present a conclusive proof that shows why the halting problem is decidable. 2

Why the halting problem is decidable

The rhetorical question as was put forward largely by the founders of computer science can be rephrased as what precisely can a computing machine do logically, and what can't it do? In other words given a task, is it computable (Turing [1, 2] Godel [3], Hofstadter [4], Hopcroft and Ullman [5], and Chaitin [6])? The halting problem is one instance of the class of undecidable problems (problems that can't be solved by any mechanical procedure). Theorem 1. A (computational) machine can decide the halting problem in finite time, once its steps are accomplished in finite time exactly equal to zero. Proof. The halting problem states that a Turing machine determines whether another Turing machine will halt, once started: and according to Turing [1] (see also Hopcroft and Ullman [5]), intuitively this however is the case when we have atomic steps, that complete in some finite time, no matter how small. Following the example of Turing himself who proposed the abstract one step machine, which later was actualized into serial and parallel computers, we can propose a machine (say a zero-duration) that has each step achieved in zero time. This machine is equivalent

420 to the formally defined Turing machine, except that each transition function §(•) in a Turing machine that runs in some finite time t >= 0, the zero-duration machine would have the same transition function 8(») run in exactly time t = 0. The feasibility of this machine is sound as further discussed. This machine would be able to simulate to completion any Turing machine in exactly zero-time. And thus the halting problem is solvable, since a zero-step machine necessarily halts no matter how many non-empty zero-duration steps are taken. Corollary 1. An instance of this sort of machine is a Language machine. A plausible example of a zero-step machine is a language. In principle a language can be classified as a computational device, given the necessary variations that exist and thus allow distinction and combination, as the symbols. More formally, a language is a computational device of equal power to a Turing machine, and it is since we can represent a Turing machine as a language (see Hopcroft and Ullman [5], for an example of a Turing machine defined as a language for simulation), and a language as a Turing machine (a Turing machine accepting a given a language is representative of that language), then it implies that all languages are equivalent according to the Church's thesis (Hopcroft and Ullman [5]). And thus undecidable languages are the same as decidable languages. And thus the halting problem is indeed decidable (solvable by a mechanical process).

3 3.1

A few consequences of the halting problem in practice The Application to Evolutionary Hardware

There exist examples of evolutionary hardware (Sipper and Ronald [7]) as a further corollary to evolutionary software algorithms. The fitness function or their equivalents in evolutionary hardware, according to the conventional wisdom behind the halting problem, would never be able to adequately verify an offspring. However a window of opportunity for computational hardware successfully verifying subsequent offspring computational hardware exists based on the preceding discussion for example. 3.2

A Universal Theory of Hybrid Technology

The question can a computer decisively construct a hybrid at random that solves a given problem in decidability. It is sufficient to have a machine arbitrarily juxtapose two given machines together in a way that is syntactically correct (a hybrid), and then determine if the resulting hybrid accomplishes a required task. The first part of the juxtaposition is trivial, the second part is a problem in verification: would the hybrid return proper outputs for proper inputs, and would the hybrid halt after every

421 run. (If the hybrid fails the verification test, then another random hybrid is generated.) Theorem 2. (The universal theory of hybrid technology) A machine that solves the halting problem can trivially solve any hybrid problem in any way possible. Proof. The problem can be represented as some languages, and thus is decidable. 4

Conclusion

The physical/chemical elements that would enable the construction of this type of machine has not been discovered to the best reckoning of this writer, but the considerations put forward in this paper ought to highlight the feasibility of improved computers. References 1.

2.

3. 4.

5.

6. 7.

Turing A. M., On computable numbers with an application to the entscheidungsproblem, Proceedings of the London Mathematical Society 2, (1936) pp. 230-265. Turing A. M., On computable numbers with an application to the entscheidungsproblem. A correction, Proceedings of the London Mathematical Society 2, (1937) pp. 544-546. G5del K. On Formally Undecidable Propositions, (Basic Books, New York, 1962). Hofstadter D. R., Godel, Escher, Bach: An Eternal Golden Braid. A Metaphorical Fugue on Minds and Machines in the Spirit of Lewis Carroll, (Penguin Books, Singapore, 1979). Hopcroft J. E. and Ullman J. D., Introduction to automata theory, languages and computation, (Addison-Wesley Publishing Company, Inc., Philippines, 1979). Chaitin G. J., A century of controversy over the foundation of mathematics, Complexity 5, (2000) pp. 12-21. Sipper M. and Ronald E. M. A., A new species of hardware, IEEE Spectrum 37, (2000) pp. 59-64.

423

A N E W A L G O R I T H M FOR T H E C O M P U T A T I O N OF I N V A R I A N T CURVES U S I N G A R C - L E N G T H PARAMETERIZATION K. D. EDOH Department of Computer Science, Montclair State Upper Montclair NJ 07003, USA E-mail: edohk@mail. montclair. edu

University,

J. LORENZ Department of Mathematics and Statistics, University of New Mexico Albuquerque NM 87131, USA E-mail: [email protected] In this paper we introduce a new algorithm for computing invariant curves of a family of dynamical systems using arc-length parameterization of the curves. The main feature is that no smoothness (differentiability) of the curves is required. The algorithm is fast and robust; the results compare favorably with those of existing methods.

1

Introduction

A manifold M is said to be invariant under the diffeomorphism / if for any xo € M we have fn{x0) € M for all integers n. Finding efficient algorithms to compute invariant manifolds in dynamical systems has been a major research topic in recent years. We present a new method to compute invariant curves using arc-length parameterization and performing a sequence of iterations on a set of equally distributed grid points on the curve. In each iteration the grid points are first mapped by / to new points and then a cubic spline interpolation is used to approximate the curve that passes through the new points. Second, a set of equally distributed points on the new curve is determined using the spline interpolation. In addition, we introduce a simple adaptive strategy in which points are added to/removed from the circle to increase the accuracy of our results. Some of the existing methods include the Hadamard graph transform algorithm for computing attracting invariant manifolds [1,2]. In this method computing the invariant curves of Poincare maps requires solving finitely many boundary value problems for each graph transform. This may result in a very large computation time. The direct iteration method used to compute invariant curves in [3] often breaks down when the curve has a fixed point on it. In that case the approximating curve can collapse to the fixed point.

424

Other methods include the polygonal approach by van Veldhuizen [4]. It uses a polar coordinates representation of the invariant circle and may suffer from the unbounded norm of higher order interpolation schemes. The new algorithm was tested on the Van-der-Pol equation and the delayed logistic map. Our results compare favorably with the existing ones. Our method has resulted in a simple, fast, and robust algorithm. With the arc-length parameterization of the curves we eliminate the difficulties and restrictions posed by using polar coordinates. The adaptive scheme introduced into the algorithm has increased the accuracy of our results. As parameters in the diffeomorphism change, the invariant curves typically deform and they may lose their smoothness. This breakdown of smoothness often corresponds to a transition from quasiperiodic to chaotic dynamics. Since this bifurcation (transition) is of great interest for applications, one wants to have an algorithm that can approximate invariant curves that are not smooth. The algorithm presented here does not make use of any tangent information of the invariant curves and is therefore well-suited for path following of non-smooth curves. A restriction of the algorithm is that it requires attractivity of the invariant curves. 2

The N e w M e t h o d

Consider an orientation preserving diffeomorphism f : R2 —> R2, and let r c f l 2 denote a simply closed, continuous curve which is invariant under / , i.e., /(T) = r . Suppose that an approximation to the unknown invariant curve T is denoted by the parameterized curve T*:(X(l),Y(l)),

0<1
(1)

where L is the length of T* and the parameter I is approximately arc-length. (This assumes that T* is piecewise C 1 ; in fact, we use periodic cubic spline interpolation to determine the parameterization functions X(l) and Y(l).) Given an equidistant mesh in the parameter interval, U:0 = lo
= L,

(2)

the discrete points (X(li),Y(li)),i = 0,...,N, form an approximation to T which is also approximately equidistant. We now describe an iteration step that can be used to improve the approximation. Denote pi = (X°ld(k), Yold{k)), i = 0,2,...,N, with p0 = pN- Here old (X (l), Yold(l)) denotes a parameterization of a known approximation to T: r o W : (Xold(l),Yold(l)),

0
425

%, %

Figure 1. The mesh points along the invariant curve.

We get the new approximation Vnew

. (XneW(l),Ynew(l)),

0
as follows:

1. Compute the points q± = f(pi) and determine the distances di = \qi - qi-i\,

i=

l,...,N.

2. Compute the values N

Lnew = Y,du i=l

AL = Lnew/N, k = iAL.

and i-0,...,N

3. Modify the (equidistant) mesh U in the parameter interval using the following rules: • If \qk — qk-i | > 10AL for any fc = 1 , . . . , N then add the mesh point (h + Zfc_i)/2, modify N, and goto step 1. • If \qu — qk-i\ < 1/10AL for any k = 1 , . . . , N then delete the mesh point Ik or lk-i, modify N, and goto step 1.

426

Note that the final mesh Z* is equidistant in the parameter interval. 4. Compute cubic interpolants (Xnew(l),Ynew(l)) for the points qk using the mesh m* = J^=i fy, i.e., the mesh m* reflects that the points qu axe not equidistant. 5. Compute new pointsp t = (Xnew(h),Ynew(li)), ating at the equidistant mesh h.

i = 0,...,N,by

evalu-

The points pi form a new, approximately equidistant, discrete approximation to the unknown curve F, and the whole process can be repeated etc. The following criteria are used to determine when to stop the iteration. Here ej,2 are specified tolerances. Criterion 1. Require \Lnew - Lold\ < ex . Criterion 2. Require maxi=o,...,Ndist(Tnew,qi) < ti . If both criteria are satisfied then stop the iteration. To (approximately) determine the distance dist(Tnew, qi), one can proceed as follows. a) Find the points pk andp*+i which are closest to q^. b) Determine a point along the line through pk and Pk+i that is closest to qi. Though there is no guarantee that this leads to the actual distance of q^ from rnew, we found this to be satisfactory in practice.

Figure 2. The approximation of the shortest distance from the curve.

Step a) is easy to compute. Step b) is determined as follows: Let Pk = (pi,Pk),

Pk+i = (pl+i,Pk+i)

and tt = (?{>«?)•

Denote s

= (ri+i -PkiPk+i ~Pk)

and 6 = (q\ -p\,q?

-p\)

427

a =

< a,b> —Ti2—

ds = b — aa = Then \ds\ = (dsf + dsl)i Pk+i3

(dsi,ds2)

is the distance of qt from the line through pk and

Results

The method was tested on two problems; cubic spline interpolation was used to interpolate the points qk = f(Pk)3.1

Delayed logistic map

This is a population model that has been commonly used as a test problem. The equation of this model is given by Pn+1 = aPn(l - P B _!)

(3)

where Pn is the scaled population size in the nth generation of a species and a is a parameter reflecting the growth rate of the species. Equation (3) can be rewritten in the form Fa(xn,yn)

= (i„+i,2/ n +i) = (yn,ayn(l

- xn))

(4)

The diffeomorphism i ^ has the fixed points (x*, y*) = (0,0) and ^ ^ ( 1 , 1 ) . The point ^ ^ ( 1 , 1 ) loses stability when a increases from a < 2 to a > 2, and an invariant curve is born through a Neimark-Sacker bifurcation at a = 2. The invariant curves were computed using continuation in a from a = 2.0 to a = 2.25 with N = 400 points. The results are comparable to those of [1,4]. 3.2

Van-der-Pol

equation

The second problem is the Van-der-Pol oscillator. It is used to model electric circuits with a triode valve and also to model some biological problems. We considered the forced oscillator with periodic forcing. The equation of the forced system is given by x + a(x2 -l)x

+ x = 0cos(ut),

a, /? C R .

(5)

Under the transformations [5] p(x) = x3/3 -x,

y = x + ap(x),

(6)

428

Figure 3. The invariant curves for the delayed logistic map.

we obtain the system x = y- ap(x)

,?.

y = —x + Pcos{<jjt).

*• '

This equation has the form i = f(z,t) (8) where z = (x, y) and f{z,t) is periodic with period T = 2-K/U. Using the new method, we follow the invariant circle of the corresponding Poincare map to the parameter value where it collapses into a fixed point. Let K = P/2a and a = (1 — ui2)/a. Figure 4 shows the invariant curves for ft values 0.38 - 0.3925 with a = 0.55 and a = 0.4. The results are in agreement with those of van Veldhuizen [1,6]. Acknowledgments Research on this project has been supported by DOE grant DE-FG0395ER25235.

429

Figure 4. The invariant circles for van der Pol Oscillator.

References 1. K. Edoh, A numerical algorithm for the computation of invariant circles, DIM ACS series in discrete mathematics and theoretical computer science, 34 117 (1997). 2. N. Fenichel, Persistence and smoothness of invariant manifolds for flows, Indiana Univ. Math. J., 21 193 (1971). 3. D.G. Aronson, M.A. Chory, G.R. Hall and R.P. McGehee, Bifurcation from an invariant circle for two parameter families of maps of the plane: a computer-assisted study, Comm. Math. Phys. 83 303 (1982). 4. M. Van Veldhuizen, Convergence results for invariant curve algorithms, Math. Comp. 5 1 , 677 (1987). 5. J. Guckenheimer and P. Holmes, Nonlinear Oscillations, Dynamical Systems and Bifurcation of Vector Fields, Springer- Verlag, New York (1983). 6. M. Van Veldhuizen, A new algorithm for the numerical approximation of an invariant curve, SIAM J Sci. Stat. Comp. 8 951 (1987).

AI/Fuzzy Sets Application and Theory

433 C O M P A R I S O N OF I N T E R V A L - V A L U E D FUZZY SETS, INTUITIONISTIC FUZZY SETS, A N D BIPOLAR-VALUED FUZZY SETS KEON-MYUNG LEE Dept. of Computer Science, Chungbuk National University, and Advanced Information Technology Research CenteriAITrc), Cheongju, 361-763, Korea E-mail: [email protected] KYUNG-MI LEE Dept. of Computer Science and Systems Engineering, Kyushu Institute of Technology, Iizuka, Japan KRZYSZTOFJ.CIOS Dept. of Computer Science and Engineering, University of Colorado at Denver, Denver, Colorado, USA There are several kinds of fuzzy set extensions in the fuzzy set theory. Among them, this paper is concerned with interval-valued fuzzy sets, intuitionistic fuzzy sets, and bipolarvalued fuzzy sets. In interval-valued fuzzy sets, membership degrees are represented by an interval value that reflects the uncertainty in assigning membership degrees. In intuitionistic sets, membership degrees are described with a pair of a membership degree and a nonmembership degree. In bipolar-valued fuzzy sets, membership degrees are specified by the satisfaction degrees to a constraint and its counter-constraint. This paper investigates the similarities and differences among these fuzzy set representations.

1

Introduction

Fuzzy sets are a kind of useful mathematical structure to represent a collection of objects whose boundary is vague. In fuzzy sets, membership degrees indicate the degree of belongingness of elements to the collection or the degree of satisfaction of elements to the property corresponding to the collection. There have been proposed several kinds of extensions for fuzzy sets.[l] Type 2 fuzzy sets represent membership degrees with fuzzy sets. L-fuzzy sets are a kind of fuzzy set extension to enlarge the range of membership degree [0,1] into a lattice structure. Interval-valued fuzzy sets represent the membership degree with interval values to reflect the uncertainty in assigning membership degrees. [6] Intuitionistic fuzzy sets have membership degrees that are a pair of membership degree and nonmembership degree. [2] Bipolar-valued fuzzy sets have membership degrees that represent the degree of satisfaction to the property corresponding to a fuzzy set and its counter-property. [4] In this study, we are concerned with these three fuzzy set extensions: interval-valued fuzzy sets, intuitionistic fuzzy sets, and bipolar-valued fuzzy sets. These fuzzy sets have some similarities and some differences in their representation and semantics.

434 This paper is organized as follows: Section 2, 3, and 4 briefly describe the interval-valued fuzzy sets, the intuitionistic fuzzy sets, and the bipolar-valued fuzzy sets, respectively. Section 5 compares the interval-valued fuzzy sets with intuitionistic fuzzy sets, and Section 6 compares intuitionistic fuzzy sets with bipolar-valued fuzzy sets. Section 7 gives some examples to use these fuzzy set representations. Finally Section 8 draws conclusions.

2

Interval-valued Fuzzy Sets

Interval-valued fuzzy sets are an extension of fuzzy sets, where membership degrees of elements can be intervals of real numbers in [0,1]. An interval-valued fuzzy set A is formally defined by membership functions of the form A = {(*, fiA(x)) \xeX] M A W : X - * P([0,1]), where M-tW ' s a closed interval in [0,1] for each x e X.[6] Suppose that A and B are interval-valued fuzzy sets whose membership degrees of elements x are represented like this: HA(x) = [n'A (x),nrA(x)] nB (*) = t / 4 (*)> Ms (*)] The basic set operations for interval-valued fuzzy sets are defined as follows: AuB = [(x,nAuB(*))Ixe X} nAuB(x) = [fi'AKjB(X),/i^B(x)] firAuB(x) =

[i'A„B(x) = max.{iiA(x),nB(x)) AnB

= {(x,/iAnB(JC))Ixe

X}

nAnB(x)

HAnB(x) = min{nA(x),n'B(x)} A=i(x,fiA-(xy)\xe

X] r

n'A(x) = l-v Ax)

n-{x)

=

[fi'AnB(x),fiAnB(*)]

^AnB(x) =

max{nA(x\nB(x)}

=

min[nA{x),fia(x)}

r

[n'-(x),fi A(x)]

^ W = 1"MMW

In interval-valued fuzzy sets, interval values are used as membership degrees in order to express some uncertainties in assigning membership degrees. The larger the interval is, the more uncertainty there are in assigning membership degrees.

3

Intuitionistic Fuzzy Sets

The intuitionistic fuzzy set theory is an extension of the fuzzy set theory by Atanassov[2]. Here we give some basic definitions for the intuitionistic fuzzy sets. Let a set X be the universe of discourse. An intuitionistic fuzzy set A in X is an object having the form A = {(x,fiA(x),vA(x))\xe X},

435 where the functions jiA(x) : X -> [0,1] and vA(x) : X -> [0,1] define the degree of membership and the degree of non-membership respectively of the element x e X to the set A, which is a subset of X, and for every x e X, 0
The amount rcA(x) - 1 - (/iAW + vA(x)) is called the hesitation part or intuitionistic index, which may cater to either membership degree or nonmembership degree. It means that the intuitionistic fuzzy sets are a representation to express the uncertainty in assigning membership degrees to elements. If A and B are two intuitionistic fuzzy sets on the set X, their basic set operations are defined as follows[2]: A u B = {(x, nAuB (x),vAuB (x)) I x e X} AnB

HAUBW = max{nA(x),liB(x)} = {(x,nAnB (x),vAnB (x)) \xeX) MAr,B(x) = min{fiA(x),fiB(x)}

vAuB(x) =

mm{vA(x),vB(x)}

vAnB(x) =

max{vA(x),vB(x)}

A={(x,nA(x),vA(x))\xeX} VA(x)=vA(x) 4

v-(x) = [iA(x)

Bipolar-valued Fuzzy Sets

Bipolar-valued fuzzy sets are an extension of fuzzy sets whose membership degree range is enlarged from the interval [0, 1] to [-1,1]. In a bipolar-valued fuzzy set, the membership degree 0 means that elements are irrelevant to the corresponding property, the membership degrees on (0,1] indicate that elements somewhat satisfy the property, and the membership degrees on [—1,0) indicate that elements somewhat satisfy the implicit counter-property. [4] In bipolar-valued fuzzy sets, two kinds of representation are used: canonical representation and reduced representation. In the canonical representation, membership degrees are expressed with a pair of a positive membership value and a negative membership value. That is, the membership degrees are divided into two parts: positive part in [0, 1] and negative part in [-1, 0]. In the reduced representation, membership degrees are presented with a value in [—1, 1]. The following gives the definitions for those representation methods. Let X be the universe of discourse. The canonical representation of a bipolar-valued fuzzy set A on the domain X has the following shape: A = {(x,(fiA(x\n^(x)))\xe P

X) N

H A(x): X->|ft 1] ft A (x): X-» [-1,0] The positive membership degree HA(x) denotes the satisfaction degree of an element x to the property corresponding to a bipolar-valued fuzzy set A, and the negative

436 membership degree fiA (x) denotes the satisfaction degree of x to some implicit counter-property of A. If ^ ' ( x ^ O a n d [iA(x) = 0, it is the situation that x is regarded as having only positive satisfaction for A. If \xpA (x) = 0 and /xA (x) * 0, it is the situation that x does not satisfy the property of A but somewhat satisfies the counter-property of A. In the canonical representation, it is possible for elements x to be fipA (x) * 0 and \iA (x) * 0 when the membership function of the property overlaps that of its counter-property over some portion of the domain. The reduced representation of a bipolar-valued fuzzy set A on the domain X has the following shape: A = {(x,fiRA(x))\xeX]

AI*:X-»[-1,1]

The membership degree/if (x) for the reduced representation can be derived from its canonical representation as follows:

MAW =

P ifH =0 y H-AA(x) '

tf{x) *A W f(/J.A(x),^lA(x))

otherwise

Here f(nA(x),/j.A(x)) is an aggregation function to merge a pair of positive and negative membership values into a value. Such aggregation functions f(lip(x),iJ.A(x)) can be defined in various ways. The choice of the aggregation function may depend on the application domains. [4] Suppose that there are two bipolar-valued fuzzy sets A and B expressed in the canonical representation as follows: A = Hx,(jifo),nZ{x)))\ XBX) B = {(x,(nPB(x),^(x)))\ xe X] The set operations for bipolar-valued fuzzy sets are defined as follows: A u B = {(x, fiAuB (x)) I x e X}

/iAuB(x) = (ppwB(x),

fi^B(x))

PAUB(*) = ma x{j"£W>VB (*)) f^Aua(x) = minf/i A (x),HB (x)} AnB

= {{x,nAnB(x))\xe

X]

HAnB(.x) = mm{nP(.x),(iPB(x)} A = {(x,fij(x)) I xe X} A*£(x) = l - j u ; ( x )

5

tJ.AnB(x)^(nPnB(x),^nB(x)) fiAnB(x) =

max{^(x),fiB(x)}

n-(x) = 0 i | ( x ) , / i f (x)) /i£(x) = - l - / i ? ( x )

Comparison of Interval-valued Fuzzy Sets with Intuitionistic Fuzzy Sets

Intuitionistic fuzzy sets can be regarded as another expression for interval-valued fuzzy sets. According to this interpretation, we can convert an intuitionistic fuzzy set into an interval-valued fuzzy set as follows:

437

Intuitionistic fuzzy sets A = {(x, fiA(x), vA(x)) I x s X}, Interval valued fuzzy sets A = {(x,[nA(x),nrA(x)])\xe X} where, fi'A(x) = )iA(x) firA(x) = 1 -v A (x) From the correspondence between boundary values of interval membership degrees in interval-valued fuzzy sets and the pairs of membership and nonmembership degrees in intuitionistic fuzzy sets, we can deduce that the basic set operations for interval-valued fuzzy sets and intuitionistic fuzzy sets have the same roles. To begin with, let us see the case of union operations. AUB

= {(X,[^'AUB(X),HAWB(X)])\XB

X)

HAwB(x) = mzK{n'A(x),n'B (x)} liAuB (x) = m a x { ^ (x),jUB (x)} The lower bound HAuB(x) = m&x.{n'A(x), fi'B(x)} of interval-valued fuzzy set union can be transformed by the correspondence relationship fi'A(x) = nA(x) like this: fiAKjB(x) = max{n'A(x),LiB(x)} = max{nA(x),HB(x)} = l*AuB(x) This is the same with the union fiAwB (x) of the intuitionistic fuzzy sets. The upper bound /J.AKJB(X) = max{/i^(x),/*g(x)}can be transformed by the relationship lxrA (x) = 1 - vA (x) as follows: ^B(x) = m&x{nA{x),ixB(x)} = max{l-v/,(x),l-vB(x)} =\-mm{vA(x),vB(x)} When we rewrite the above equation using the relationship /nA(x) =l-vA(x), we can see that the upper bound of the union operation of interval-valued fuzzy sets corresponds to the nonmembership degree vAuB(x) =min{v /1 (x),v B (x)}. It means that both union operations of interval-valued fuzzy sets and intuitionistic fuzzy sets are the same. In a similar way, we can prove that the intersection operations for both kinds of fuzzy sets are the same. The following shows the equivalence in negation operations. A={(x,[iiA-(x),nA-(x)])\xeX}

nL(x) = l-nrA(x)

nrA(x) =

l-^'A(x)

r

fJ.'A(x) and l^ A(x) can be rewritten as follows: li'A{x) =

\-vA{x)=\-{\-vA(x))=vA(x)

r

H A(x) = l-n'A(x) = l-nA(x) We can see that n'A(x) and Mj(x) correspond to HA(x) and v^(x) respectively. From those observations, we can see that interval-valued fuzzy sets and intuitionistic fuzzy set have the same expressive power and the same basic set operations. 6

Comparison of Intuitionistic Fuzzy Sets with Bipolar-valued Fuzzy Sets

When we compare a bipolar-valued fuzzy set A = {(x, (/iA(x),fiA(x))) I x e X} with an intuition-istic fuzzy set A = {(x, )J.A{x), vA(x)) I x e X] under the conditions

438 \ipA (x) = nA(x) and fiA (x) = -v A(x) , bipolar-valued fuzzy sets and intuitionistic fuzzy sets look similar each other. However, they are different each other in the following senses: In bipolar-valued fuzzy sets, the positive membership degree fiA (x) characterizes the extent that the element x satisfies the property A, and the negative membership degree fiA(x) characterizes the extent that the element x satisfies an implicit counter-property of A. On the other hand, in intuitionistic fuzzy sets, the membership degree fiA(x) denotes the degree that the element x satisfies the property A and the membership degree vA(x) indicates the degree that x satisfies the Tier-property of A. Since a counter-property is not usually equivalent to notproperty, both bipolar-valued fuzzy sets and intuitionistic fuzzy sets are the different extensions of fuzzy sets. Their difference can be manifested in the interpretation of an element x with membership degree (0, 0). In the perspective of bipolar-valued fuzzy set A, it is interpreted that the element x does not satisfy both the property A and its implicit counter-property. It means that it is indifferent (i.e., neutral) from the property and its implicit counter-property. In the perspective of intuitionistic fuzzy set A, it is interpreted that the element x does not satisfy the property and its nof-property. When we regard an intuitionistic fuzzy set as an interval-valued fuzzy set, the element with the membership degree (0, 0) in intuitionistic fuzzy set has the membership degree [0, 1] in interval-valued fuzzy set. It means that we have no knowledge about the element. On the other hand, their set operations union, intersection, and negation are also different each other. These things differentiate bipolar-valued fuzzy sets from intuitionistic fuzzy sets. The intuitionistic fuzzy set representation is useful when there are some uncertainties in assigning membership degrees. The bipolar-valued fuzzy set representation is useful when irrelevant elements and contrary elements are needed to be discriminated. 7

Examples

This section gives some examples to use the three fuzzy set representations for a fuzzy concept frog's prey. The next is an interval-valued fuzzy set for frog's prey: frog's prey = {(mosquito, [1,1]), (dragonfly, [0.4,0.7]), (turtle, [0,0]), (snake, [0,0])) The following shows an intuitionistic fuzzy set corresponding to the above intervalvalued fuzzy set: frog's prey = {(mosquito, 1, 0), (dragonfly,

0.4,0.3), (turtle, 0, 1), (snake, 0, 1)}

From those examples, we can see that interval-valued fuzzy sets and intuitionistic fuzzy sets have the same expressive power. The next shows a bipolar-valued fuzzy set for frog's prey: frog's prey = {(mosquito, (1,0)), (dragonfly, (0.4,0)), (turtle, (0,0)),(snake, (0,-1))}

439 For the element snake, the above interval-valued fuzzy set and the intuitionistic fuzzy set have 0 membership degree which just means that snake does not satisfy the property corresponding to frog's prey despite that snake is a predator offrog. On the other hand, the above bipolar-valued fuzzy set has -1 membership degree which indicates that snake satisfies some counter-property with respect to frog's prey. Meanwhile, interval-valued fuzzy sets and intuitionistic fuzzy sets can express uncertainties in assigning membership degrees to elements.

8

Conclusions

This paper compared three fuzzy set representations: interval-valued fuzzy sets, intuitionistic fuzzy sets, and bipolar-valued fuzzy sets. It showed that interval-valued fuzzy sets and intuitionistic fuzzy sets have the same expressive power and the same basic set operations. Interval-valued fuzzy sets and intuitionistic fuzzy sets can represent uncertainties in membership degree assignments, but they cannot represent the satisfaction degree to counter-property. On the other hand, bipolar-valued fuzzy sets can represent the satisfaction degree to counter-property, but they cannot express uncertainties in assigning membership degrees.

9

Acknowledgements

The works was supported by the Korea Science and Engineering Foundation through the Advanced Information Technology Center(AITrc). References 1. 2. 3.

4. 5. 6.

H.-J. Zimmermann, Fuzzy Set Theory and Its Application, Kluwer-Nijhoff Publishing, 1985. K. T. Atanassov, Intuitionistic Fuzzy Sets, Fuzzy Sets and Systems, Vol.20, pp.87-96, 1986. T. Ciftcibasi, D. Altunay, Two-Side (Intuitionistic) Fuzzy Reasoning, IEEE Trans, on System, Man, and Cybernetics -Part A, Vol.28, No.5, pp.662-677, 1998. K.-M. Lee, Bipolar-valued fuzzy sets and their operations, Fuzzy Sets and Systems (accepted). H. Bustince, Construction of intuitionistic fuzzy relations with predetermined properties, Fuzzy Sets and Systems, Vol.109, pp.379-403, 2000. G. J. Klir, T. A. Folger, Fuzzy Sets, Uncertainty, and Information, Prentice-Hall Editions, 1988.

441 I N T R O D U C I N G USER C E N T R E D DESIGN INTO A H Y B R I D INTELLIGENT INFORMATION SYSTEM METHODOLOGY KATE ASHTON and SIMON L. KENDAL School ofCET, St Peter's Campus, University of Sunderland, Sunderland SR6 ODD UK E-mail: [email protected] Hybrid intelligent information systems (HIIS) present a special case in the context of integrated systems in that both intelligent and conventional component systems are integrated. To date methodologies for development of these increasingly important systems concentrate on essential strategies for integration of component knowledge-based and conventional system technology. This is expected to be an absorbing issue in the early stages of HIIS methodology evolution, but integration of usability now forms an interesting challenge. The HIIS methodology known as HyM, currently undergoing development at Sunderland University, has been extended to include user centred design and the potential for responding to a varied user population by means of intelligent interface technology. Targets were set to preserve the ethos of the original methodology while providing seamless integration into it. The modular structure of HyM was exploited to achieve integration of volatile user knowledge without interference with stable data and knowledge essential for core system functions. Modularity and object-orientation promoted economical integration of a shared user modelling component. The extended HyM methodology was applied to enhancement of a HIIS. Evaluation indicates that smooth and seamless extension of HyM to incorporate user centred design has been achieved.

1. Introduction Hybrid intelligent information systems (HIIS) are integrated computer systems in which conventional information systems and knowledge-based systems co-operate to share knowledge and data. HIIS are potentially complex interactive systems, the variety of their components giving rise to the possibility of disparate user populations. User involvement is a major factor in the development of most successful computer systems [15]. This feature is absent from HIIS. They are, therefore, important and legitimate targets for the development of user centred design in their methodologies. It is expected to have increased significance in HIIS but so far this is largely unexplored. The HyM methodology for HIIS currently under development at Sunderland University was targeted to investigate the potential for incorporating user centred design. Its fundamental modularity was exploited to provide for this feature and for integration of a simple, consistent, shared user model representing user populations. 2.

Hybrid Intelligent Information Systems (HIIS): a Special Case

Hybrid intelligent information system (HIIS) are computer systems that consist essentially of an integration of a variety of potentially physically separate (although co-operating) conventional and knowledge based systems. This integration of reasoning with conventional processes may result in development problems caused by the fundamentally different structures required by differing components. HIIS

442

are likely to process very large volumes of data from various sources [6], to require more levels of security than conventional systems and to inherit problems or strengths from radically different component technologies and integration techniques. The nature and complexity of these systems suggests that both user and stakeholder populations may be large and disparate. User centred design in this context is a relevant consideration. 3.

Importance of User Centred Design

User centred design implies that system users are made a central issue throughout the design process. It remains a salient topic [17]. The advisability of involving users in system design is well documented [15,10]. User involvement varies from participation only at evaluation stages, to right through the entire life cycle [10]. Shackel believes human factors to be paramount, advocating that design must start with the end user [15]. However, user centred design does not require end users to be given priority over data and knowledge needed to deliver valid systems. 3.1

HIIS Users

Interactive HIIS and their methodologies reported in literature have been examined to gather information about user populations. Although increasing in number, HIIS are still not commonplace. Few references to potential system users occur as methodologies concentrate on software engineering techniques [2,4,8,11,12]. Evidence of user populations that consist of distinct groups is reported by Fedra [6]. The user population of HMISD, a hybrid diagnostic system developed according to HyM [3], consists of patients, nurses, audiologists, secretaries and doctors (consultant or general). Such disparate groups cannot be accommodated by one set of help or information messages (the canonical option). An alternative is to make use of intelligent interface technology. 4.

Intelligent Interface Technology

Intelligent interface technology (IIT) incorporates a wide range of methods where some intelligence is applied to user interface design and implementation by means of a user model [1]. Complexity of HIIS and disparity of their user populations makes them an interesting target for introducing IIT. However, increasing the complexity of already complex systems must detract from the advantages of tailoring system response to users. Only a judicious choice of user model has the potential for minimising this effect. 5.

HyM: An Object-Orientated HIIS Methodology

The HyM prototype methodology [9] already under development at Sunderland

443

University organises, refines, develops and builds upon earlier methodologies (waterfall process, prototyping and model-based approaches). Integration procedures are controlled and facilitated by making full use of the object-orientation paradigm throughout its life cycle in the processing of procedural information and declarative logic. A salient feature of the HyM life cycle is the combined analysis and design stage. These features, established before the start of the user centred design extension, provide the potential method by which its integration may be achieved. 5.1

User Centred Design in HyM

Figure 1: HyM Life Cycle with Expanded Analysis and Design Detail

In the HyM the integrated analysis and design stage provided the location for smooth extension to accommodate an interface design cycle (Figure 1). Core system engineering deals with the processing of stable data and knowledge in the system cycle, including the user only as a member of the design team. The interface cycle extension allows separation of volatile user issues. The two separate communicating cycles now address clearly distinguishable issues. The system cycle models the iterative development of component systems within the HIIS, processing data or knowledge essential for developing valid systems. The interface cycle contains all aspects of user-system communication and accommodates user interface design and

444

procedures deemed necessary for its satisfactory development. The interface cycle, therefore, models a close interaction between the HIIS and its components, users and designers and hence incorporates principles of human computer interaction. Two distinguishable user tasks are advocated, a restricted role as a member of the design team, responsible only for core system development and a separate and wider role in interface design. 5.1.1

Interface Cycle

The interface cycle (Figure 1) models an iterative process consisting of interface analysis, interface design and interface evaluation. The one-way connection between model evaluation and interface analysis indicates that, at each pass through the system cycle, information from increasingly valid HIIS components becomes available. At some stage, in depth user analysis is appropriate. The interface cycle is then an important parallel development with priority equal to the system cycle. User Modelling in HyM A user model is an essential part of intelligent interface technology. It is the system's model of a set of those characteristics of its users that affect their interaction, the model being clearly distinguishable from other system knowledge. In HyM the user model is an integral part of the interface cycle (Figure 1). It is modelled as an optional subset of each stage resulting in expansion as in Figure 3. Information accumulating at interface analysis enables a decision to be made on incorporating IIT, system designers having the option of rejecting the technology. If it is accepted, design and implementation follow

Figure 2: Integrating User Modelling into HyM In the case of HIIS, the user model's role is clear: to model users for the purpose of tailoring system responses. A striking feature about a chunk of data characterising HIIS users is its capacity to exploit advantages of frame systems and stereotypes.

445 The object-orientated nature of the HyM methodology provides an ideal medium for constructing a shared model by means of the same paradigm It allows sharing, consistency, modularity and integration by message passing to already established system classes. Stereotypical models of users' experience seem to be the representation most used by system builders [7]. They have been applied to user modelling from adaptive hypermedia systems [5,14] to computer integrated manufacturing systems [13]. 6.

Evaluation Issues

To evaluate ideas arising in the course of HyM extension they were applied to the HMISD system including construction of an object-orientated user model representing group and individual users. The object-orientated approach to stereotype construction enabled HIIS systems to communicate with the hierarchy of user classes created. One user model was accessible by all HIIS components. Empirical evaluation with users confirmed that disparate groups with different interests existed in the HMISD user population and were then accommodated. Evaluation of the extended methodology made use of the GQM paradigm [16] to reveal inconsistencies between priorities of the extended HyM methodology and life cycle and those of the earlier version 1 methods. This evaluation indicated that consistent integration of user centred design into HyM had been achieved. 7.

Conclusions

User centred design is so far largely absent from increasingly important HIIS methodologies. A HIIS methodology (HyM) has been extended to include this. An important feature of the extended methodology is the separation of volatile user data from stable core system data and knowledge by accommodating them in separate, communicating core system and interface cycles. Accommodating the enhancement in the crucial analysis and design stage makes usability central in the extended methodology. Human factors are not paramount in the enhanced HyM; they have status equal with that of system data and knowledge. Accommodation of disparate user populations of HIIS was provided for by optional incorporation of IIT. Objectorientation enabled user model and interface to be integrated smoothly. HyM is a versatile HIIS methodology promoting seamless extension and integration of new methods and components.

446

References 1. 2.

3.

4. 5.

6. 7. 8.

9.

10. 11.

12.

13.

14. 15.

Benyon D. R., Murray, D. M. (Editorial) Special Issue on Intelligent Interface Technology: Editor's Introduction. Interacting with Computers 12 (2000) Bravo-Aranda G., Hernandez-Rodriguez, F. et al Knowledge-Based System Development for Assisting Structural Design. Advances in Engineering Software 30 (1999). Chen X., Kendal, S. L. et al Development and Implementation of a Hybrid Medical information System. Medical Informatics Europe '96 J. Brender et al (Eds.) IOS Press (1996). Chen Z., Zhang H., Zhu et al An Integrated Intelligent System for Ceramic Kilns. Expert Systems with Applications 16 (1999). Di Lascio L. Fischetti E. et al., A fuzzy-based Approach to Stereotype Selection in Hypermedia. User Modeling and User Adapted Interaction 9(4) (1999). Fedra K A., Decision Support for Natural Resources Management: Models, GIS and Expert Systems. A.I. Applications 9(3) (1995). Hook K., Steps to Take Before Intelligent Interfaces Become Real Interacting with Computers 12 (2000). Karunaratna, D. D., Gray, W.A. et al., Establishing a Knowledge Base to Assist Integration of Heterogeneous Systems. Advances in Databases - 16th British National Conference on Databases BNCOD Proceedings (1998) Kendal S. L., Chen X. et al., HyM: a Hybrid Methodology for the Development of Integrated Hybrid Intelligent Information Systems, Proceedings of Fusion 2000. Third International Conference on Information Fusion Paris, (2000). Madsen K. H. The Diversity of Usability Practices. Communications of the ACM. 42(5), (1999). Matthews K. B., Sabbald A. R. et al., Implementation of a Spatial Decision Support System for rural Land Use Planning: Integrating Geographic Information System and Environment Models with Search and Optimisation Algorithms. Computers and Electronics in Agriculture 23 (1999). Molina M., Sierra J.L., et al., Reusable Knowledge-Based Components for Building Software Applications: A Knowledge Modelling Approach. International Journal of Software Engineering and Knowledge Engineering (3) (1999). Monfared R. P., Hodgson A. et al., Implementing a Model-Based Generic User Interface for Computer Integrated manufacturing Systems. Proceedings of the Institute of Mechanical Engineers Part B - Journal of Engineering Manufacture 212: (7) (1998). Pagesy R. et al., Improving Knowledge Navigation with Adaptive Hypermedia. Medical Informatics and the Internet in Medicine 25 (1) (2000). Shackel B (1997) Human-Computer Interaction: Whence and Whither? Journal of the American Society for Information Science 48 (11) (1997).

447

16. van Solingen R., Berghout E. The Goal/Question/Metric Method: a practical guide for quality improvement of software development. Publisher McGraw-Hill Companies, (1999) 17. Vredenburg K. Increasing Ease of Use. Communications of the ACM 42(5) (1999).

449 TOWARDS HYBRID KNOWLEDGE AND SOFTWARE ENGINEERING S. KENDAL, X. CHEN School ofCET, St. Peter's Campus, University of Sunderland, Sunderland UK, SR6 ODD E-mail: simon. kendal@sunderland. ac. uk Software Engineers face many requirements for the development of large-scale and complex systems. A new challenge is the study of Hybrid Intelligent Information Systems (HIISs) that integrate conventional software systems and knowledge-based systems. This paper describes a hybrid methodology HyM for the development of such large-scale hybrid systems, which combines conventional software system development models with knowledge-based system development approaches. The method provides a hybrid life-cycle process model to combine the waterfall process, incremental development, rapid prototyping and model-based approaches, which results in a hybrid knowledge and software engineering approach to systems development.

1

Introduction

Over past few years, the development of large-scale and complex hybrid systems has generated much interest in the artificial intelligent (AI) community, for several reasons [5]: Many current knowledge-based systems (KBSs) are very large and complex, consisting of both intelligent system components and database systems. This has demonstrated a clear need to develop and support the seamless integration of knowledge based systems with conventional information systems. Therefore, it is important for a systematic approach to be suitable for the development of different components. There is no a single method in software engineering and knowledge engineering that perfectly covers all phases and aspects of system engineering. Conversely, the use of several independently developed methods has a number of drawbacks such as inconsistency, redundancy, increase of change effort and possible loss of information. In an attempt to provide at least a partial solution to these problems, we propose a hybrid knowledge and software engineering methodology HyM for the development of large-scale and complex hybrid AI / conventional systems. This provides a hierarchical architecture with three levels and a hybrid life-cycle process model to combine the conventional waterfall process, incremental development, rapid prototyping and model-based approaches. Recently researchers have suggested several process models and approaches for the development of large-scale intelligent systems. Gillies [7] described a strategy to avoid ill-defined requirements and reduce time scales. However, in this model, analysis and design are still two independent phases. Complete requirement

450

specifications are required before the design phase can start. Thus this model still has some of the limitations inherent in the waterfall model. Other models, also provide a hybrid process for the development of complex software systems [2]. The incremental development life-cycle is an improved rapid prototyping model where each delivered increment provides needed operational capability. This shifts the management emphasis from developmental products to risk assessment. The incremental development model can reduce the frequency of loops and effort in the conventional rapid prototyping model. The spiral model emphasises the use of three process models together to develop different parts of the system. However these models almost always assume that analysis is a static process that can be separated from design and is independent of any implementation consideration. The development of Hybrid Intelligent Information Systems (HIISs) requires a gradual shift from analysis concerns to design concerns [8]. Model-based methodologies [9,11,12] have been suggested for the development of KBSs. These approaches provide many advantages however they mostly emphasise the problems in KBS development and give a few considerations to conventional software systems development. A life-cycle model is proposed that incorporates advantages of the evolutionary approaches and systematic model-based approaches. 2

The Hierarchical Architecture of Complex Software Systems

A hierarchical architecture concept for a complex software system is proposed to support the development of HIISs, as Figure 1 illustrates. There are three levels in this architecture, the repository level, component level, and hybrid intelligent information system level. Following is a brief explanation of these levels. 2.1

Repository level

The lowest level is the repository level. This level deals with coding technologies, transactions implementation and repositories of data and knowledge. Repositories act as basic building blocks of a system component. They contain descriptions of various types of data and knowledge that are produced, managed, exchanged and maintained in a software system. A key challenge for repositories is an ability to handle and manage many types of data and knowledge. This requires powerful means for representing and mapping different data and knowledge models at multiple levels of abstraction.

451 2.2

Component level

The component level consists of those relatively independent system components based on models in a complex software system. A component may integrate data,

Figure 1. A hierarchical architecture of complex software systems Modelling is one of the most important technologies to determine a model design and implementation in developing a component. There are two major modelling activities, conceptual modelling (model analysis) and formal modelling (model design) which are associated with model transformation, design and implementation. 2.3

Hybrid intelligent information system level

In the HIIS level, various system components are combined to configure a hybrid system. This level deals with techniques for building an ideal system architecture and producing good interoperability among the system components. Potentially a hybrid system could be an abstract entity made up entirely from physically independent, distributed, and parallel processed components working on multi platform environments. All co-operating towards some larger goal.

452

Working above the HIIS level requires techniques to support systems integration and co-operation. One approach is to use multi-agent systems [1] or to develop systems with sharable component libraries [6]. The hierarchical architecture, proposed here, provides two views for the development of complex systems. From the technical view, these levels are independent of each other, e.g., there seems no direct relationship between techniques of database design in the lowest level and modelling technologies in the component level. On the system view they are interrelated, e.g., every HIIS consists of components and their data, knowledge and procedures. Many current software techniques can also be mapped onto these levels. From the view of the software development process, the conventional top-down process starts in the HIIS level and the bottom-up process begins from the lowest level. Model-based processes are used to form the component level. 3

Proposed Hybrid Process Model

A new life-cycle process model is proposed that supports hybrid knowledge and software engineering. This combines four conventional process models: waterfall process, incremental development, rapid prototyping and model-based approaches, as shown in Figure 2.

Figure 2.

The HyM life-cycle

This process model consists of two iteration sub-processes: internal and external. The external process is a cross between the waterfall life-cycle and incremental prototyping. The internal process is a rapid prototyping process, which crosses phases of requirements analysis and system design, i.e. there is a gradual move from analysis phase to design phase. The internal process includes steps

453

related to system models: model analysis, model design, model evaluation and similar steps related to the development of the interface. This hybrid life-cycle model has many benefits for the development of hybrid information systems. It encourages strategic decisions within the feasibility study phase of the project to promote good project control. It promotes the smooth transition from analysis to design. The internal iteration process is a rapid prototyping model suitable for small knowledge module development and allows for thorough evaluation. When a system component is modelled into a data model or a procedure model, little iteration is required. For a knowledge component the iteration process is completed in a few cycles with the component being prototyped in software and having finally passed a strict quality control review to ensure that the reasoning is complete and at an appropriate depth. Using model-based concepts, the new process can model and partition system components based on the objectoriented paradigm. The new life-cycle process overcomes those problems in the waterfall life-cycle when developing a KBS and the problems associated with the use of rapid prototyping when developing a conventional software system. Finally, separation of stable functional requirements from volatile user considerations facilitates the development of re-usable repositories and components 4

Applications

The HyM methodology [10,3]. integrates four existing methods using two integration approaches: intra-process and inter-process. In the requirements analysis phase, a structured method is applied to function analysis, an information modelling method is applied to data analysis, and a knowledge acquisition method is applied to knowledge analysis. An intra-process approach is then used to integrate these techniques. Finally, an object-oriented method is applied to the design and implementation of hybrid information systems. Using this methodology, a hybrid medical information system for dizziness (HMISD), a complex medical domain, was developed [4]. Following evaluation the use of this system is being expanded to other regional hospitals. 5

Conclusions

Along with rapidly increasing requirements to develop large-scale and complex intelligent systems, new technologies, introduced daily, profoundly impact on developing applications and will require equally profound changes in software system architectures and development process models. In this paper, we propose a hybrid knowledge and software engineering approach, consisting of a hierarchical architecture and a hybrid life cycle process model, for the development of largescale and complex hybrid AI / conventional software systems.

454 References 1. Aylett, R.; Brazier, F.; Jennings, N. et al, Agent Systems and Applications, The Knowledge Engineering Review, Vol.13, No.3, (1998). 2. Boehm, B.W. A Spiral Model of Software Development and Enhancement, IEEE Computer, Vol.21, No.5. (1988). 3. Chen, X.; Kendal S.; Potts I and Smith P, Towards an Integrated Method for Hybrid Information System Development, IEE Proceedings on Software Engineering, Vol.144, No. 5-6, (1997). 4. Chen, X; Vaughan-Jones, R.; Hawthorne, M. et al, HMISD: an Hybrid Medical Information System for Dizziness, Proceedings of the First European Conference on Health Informatics., (1995). 5. Gaspari, M.; Moffa, E. and Stuff, A. An Open Framework for Cooperative Problem Solving, IEEE Expert, (1995). 6. Gennari, J.; Stein, A. and Musen, A. Reuse for Knowledge-Based Systems and CORBA Components, Proceedings of Knowledge Acquisition Workshop (KAW'96), (1996). 7. Gillies, A. The Integration of Expert Systems into Mainstream Software, Chapman & Hall Computing, (1991). 8. Harmon, P. and Hall, C. Intelligent Software Systems Development - An IS Manager's Guide, John Wiley & Son, Inc, (1993). 9. Lee, J. and Yen J., Enhancing the Software Life Cycle of Knowledge-Based Systems Using a Task-Based Specification Methodology, International Journal of Software Engineering and Knowledge Engineering, Vol.3, No.l., (1993). 10. Kendal S., Chen X. and Masters A., HyM: a Hybrid Methodology for the Development of Integrated Hybrid Intelligent Information Systems. Proceedings of Fusion 2000 - 3rd International Conference On Information Fttf/on. Paris, (2000). 11. Pour, G. Towards Component-Based Software Engineering, Proceedings of the 22nd IEEE Annual International Conference on Computer Software and Applications, (1998). 12. Schreiber, G.; Welinga, B. and Breuker, J. CommonKADS: A Comprehensive Methodology for KBS Development, IEEE Expert, December, (1994). 13. Song, X. and Osterweil, L. Experience with an Approach to Comparing Software Design Methodologies, IEEE Transactions on Software Engineering, Vol.20, No.5., (1994).

455 DYNAMICAL COMPUTING, COMMUNICATION, DEVELOPMENT AND HIERARCHICAL INFERENCE

H. M. HUBEY Department of Computer Science, Montclair State University, Upper Montclair, New Jersey, 07043, USA E-mail: [email protected] P. B. IVANOV International Science and Technology Center, 9 Luganskaya Street, P. O. Box 25, Moscow, 115516, Russia E-mail: [email protected] A model of computing is suggested, combining the approach of analytical mechanics with the principles of a general psychological theory of activity. Thus reformulated, the traditional picture of computation allows generalizations of interest for distributed and parallel computing, artificial intelligence, or consciousness studies. The notion of hierarchical computing is discussed, stressing the communicative aspect; the directions of increasing the complexity of both computational universe and the computing agents are indicated. The idea of computability is reconsidered in the light of the new approach. The basic principles of hierarchical logic are presented as a tool for constructing generic formal systems.

1

Introduction

Using a computer, one has to arrive to useful results starting from some raw material. The principal question is that of computability. First computers were relatively simple, and the famous Godel theorems reformulated for various formal systems [1-3] indicated the limits of primitive sequential computing. With the development of the Internet, the problem of computers talking to each other gained importance, and the rapid development of parallel computing and peer-to-peer technologies requires a different theoretical picture reflecting the present situation. The inherent insufficiency of the traditional logical systems in a complex environment has been demonstrated by Hubey [4]. In studies of human behavior, computer analogies are still popular, which may hinder the inverse process, understanding computation as a primitive analog of consciousness. A general theory of activity developed in Russian psychology since 1920s [5,6] could provide a solid framework for analysis of the communities of computers. The key principle of this theory, sociality of development, perfectly reflects the practices of the World Wide Web, and may serve as a source of ideas in designing efficient computer protocols approaching conscious communication. Hierarchical structures and systems are necessary for efficient computation in a developing world [7]. However, the general principles of hierarchical organization

456 are still poorly explicated in the literature, and the relation of hierarchy to development is far from being well understood. In this paper, we present a summary of hierarchical approach to computation. A general model of dynamical computing serves to translate the traditional static notions into a language more suitable for description of motion and development. Then we consider communicating computers and demonstrate how the opposition of the inner and outer world appears. We also present a formal scheme of hierarchy, replacing the traditional idea of inference with the directed construction.

2

Dynamics of computation

Traditionally, theories of computing were developed as formal models of an isolated computer operating in an essentially static world. Such an approach complies with the classical paradigm of mathematical study, but its application to real computation can only be limited, since the results of one computation serve to shape many other computation processes. That is why alternative pictures of computing may be useful. 2.1

The configuration space

Every computation occurs in some universe, so that successive operations would change the state of that universe. We admit that its distinct states can be somehow specified, and the collection of all the possible states forms what physicists usually call a configuration space X, which may be modeled with some mathematical structure (e.g. a finite set, a Euclidean space, a Hilbert space, a functional space, or a manifold). Points x of the configuration space X represent both the possible initial data and the possible outcomes of computation. 2.2

The agent

The agent is a device that can perform computation. According to A. N. Leontiev's theory [6,8], we distinguish the following levels of any agent's functioning. 2.2.1

Operations

An elementary operation changes the state of the universe, which is naturally represented as transition from one point of the configuration space to another. In every particular state (point x) there is a variety of admissible changes; by analogy to analytical mechanics [9,10], we will call it the tangent space to X in point x, Tx, the union of all the Tx is called the tangent space to X and denoted with T. Configuration space X together with all the tangent spaces Tx forms the phase space of the system, similar to a stratified manifold. Different agents are represented by different tangent spaces T.

457 The points of X that can be connected with a single operation are considered as adjacent. With thus introduced notion of relative adjacency, points adjacent for one agent may be not adjacent for another. For instance, different processors may emulate each other's functionality on the microprogrammatic level. 2.2.2

Actions

In this model, a computation process is represented by a trajectory in the phase space. Formally, there is a mapping d: X —»T, so that every point x of X corresponds to a single element dx from T^ . Given the initial and final states, xx and jcf, one can choose an admissible trajectory to arrive from xt to xs; the class of such trajectories is called an action. That is, contrary to operations that connect only adjacent points, actions link distant points via a sequence of operations. Different agents may have different classes of admissible trajectories, and die same action may either be unavailable to some agents, or be realized in different ways. The range of possible actions is intimately related to the nature of the agent, and it can usually be derived from a few fundamental principles. Thus, in classical mechanics, the principle of minimum action normally selects a single trajectory for fixed Xj and xt. The same holds for quantum mechanics, but the trajectory in a functional or operator space is considered instead of the usual 3-dimesional space. 2.2.3

Activities

In simple configuration spaces, only operations and actions are possible. In a more complex case, the points of the space X form a number of classes X t ; any trajectory connecting the points of the same classes Xj and X 2 belongs to the same action class, which is called activity. An activity is like higher-level operation, connecting adjacent classes; on the other hand, activity is non-local, since it demands action. For an example of activity, one can consider an infinite trajectory in some configuration space X: the points on the trajectory belong to the same class, and any action that can be represented by a finite segment of that trajectory connects that class to itself, hence belonging to the same activity. Yet another important example: if the initial and final states are structured, any action transforming a component of the initial state to some component of the final state will be a representative of the same activity. Such partial actions (iterations) may fail to converge; however, such an activity often leads to quite acceptable results (e.g. using asymptotic series expansions in special function approximation). 2.3

The computable world

Every agent encounters certain initial (boundary) conditions and operates following its built-in logic. However, due to the limited operational capacity, the agent cannot achieve any point of the world at all. Some points are unachievable because they are

458 not connected to the initial state by any admissible trajectory; some other points are only asymptotically approached; there may also be dynamic singularities that cannot lie on any admissible trajectory regardless of the initial conditions. That is, any single agent can only span a subspace of the full configuration space X; this subspace is called the world W of that particular agent, in given conditions. A world is an analog of a dynamic flow (a bunch of trajectories) in analytical mechanics. In the simplest case, the world reduces to a single trajectory. This individual world may have structure quite different from the structure of the configuration space in general, actualizing only a part of the possibilities available. For instance, in a Euclidean configuration space, the individual world may form a sphere, a torus, or a fractal. Also, the worlds spanned by the same agent with different initial conditions may be quite different. The agent can never break out of its individual world unless there are other agents, and hence a hierarchy of agents operating in a hierarchical world.

3

Hierarchical computing

The very distinction of the levels of operation, action and activity is already introducing hierarchy in the model, implying a hierarchical organization of both the configuration space and the agent. In this section, we consider communication as the source of hierarchical development, which gives way to numerous implications of importance in distributed computing and artificial intelligence. Let there be two agents Al and A2 operating in the same computational universe. Since a point in the configuration space X is a distinct state of the universe, and since, in this model, any operation changes the state of the whole universe, the two agents cannot act simultaneously, save in the trivial cases, and sequential operation is the only possibility: ... —»*i —>A\ —> x 2 —>^2 —»-*3 —>A\ —>*4—> —

This means that, from the viewpoint of each agent, the state of the universe between successive operations or actions may "spontaneously" change, which is impossible in the traditional approach to computability. That is, the activity of another agent results in discontinuities of individual trajectories, up to switching to an entirely different class of trajectories (activity). Similarly, assuming a universe developing according to some natural laws, we arrive to the necessity of accounting for the regularities of such development in the individual computation processes. However, in this work, we are mostly interested in agent-produces changes in the universe and do not consider naturally developing worlds. Agents Ai and A2 exist in their individual worlds Wj and W 2 , in general, spanning different parts of the whole configuration space X . This leads to a number of useful notions characterizing the possible relations between the worlds. Non-

459

intersecting worlds W! and W 2 imply that agents A | and A2 cannot operate together; if one of them works, the other must be stopped. An operation of agent Ai is A2compliant iff the resulting state of the universe belongs to W 2 . Such operations do not change the activity of agent A2 , rather influencing the timing of an action; alternatively, they could be called boosts. An operation of agent A i is A2-compatible iff it results in a point x that can lie on some trajectory of A 2 , maybe with different initial conditions. In other words, there are points of X adjacent to x in A i . Obviously, all A2-compliant operations of Aj are also A2-compatible. Existence of non-compliant operations means that the actual configuration space of an agent does not coincide with the whole X and hence is reducible to some subspace of X. However, as it will be shown below, there are no such domain limitations in hierarchical agents. Indeed, one can consider sequential operations performed by different agents as a higher-level operation performed by an agent A consisting of both A, and A2: ... —» JCJ —> ( A ! — > x 2 — > A 2 ) —>JC 3 —» ...

The intermediate state x2 (point of X) can be interpreted as internal for A, and the elementary operation of A transforming xx into x3 is a composition of the operations of A i and A 2 ; the point JC2 of X, beside being a specific state of the universe, represents a particular composition of operations. Points of X that serve as internal for some agent A (and hence mediate communication of lower-level agents) are called products. State s of the universe that is exclusively used to switch activity from one agent to another is called a symbol. Alternatively, one can consider a hierarchy of operations. The original tangent space T now contains only direct operations, while there also are indirect operations mediated by other agents. In the above example, x3 may be unachievable for A] with any direct operation, but it becomes achievable with a second-order operation involving another agent. Hierarchical agents imply hierarchical worlds composed of many individual worlds, plus the points x achievable via collective actions. In the above scheme, the points Xi and agents A, become interchangeable: ... -»Ai —>(*i —>A2 —>x2) —>A3 —>... Like points x may become internal for hierarchical agents, transformations of the world (operations and agents) can be considered as occurring in the interior of a higher-level point of the hierarchical configuration space. The difference between agents and the states of the computational universe hence becomes relative.bFrom the hierarchical viewpoint, one could consider any action as an operation of a higher-level agent arising from self-communication: ... —> JC] —» (A —> x2 —» A ) —> * 3 —> ...

460 The agent A thus becomes composed of two specialized components: one of them (the afferent component) transforming an outer state of the universe x\ into an inner state of the agent x2 , and the other (efferent) component producing an outer state x3 from the inner state. 4

Hierarchical inference

So far, we considered hierarchical computing in a static universe, so that only its state could be changed. Beside the already mentioned natural development, this picture can be complicated by new objects produced by the agents. Once the states of the universe become represented by some other states (symbols), the operations on symbols may develop in a very complicated area. After all, agents do not stop on symbolic computation, and pass to material production, which enormously extends the configuration space and opens new direction of hierarchical development. Formally, this process could be modeled in a peculiar logic, containing the following rules of inference. 1.

(Reflexivity) If there is an object O, there is a link —» of this object to itself:

o->o 2.

3. 4. 5.

6.

(Unfolding links) For any link —> there is an object O' mediating it, so that —» is equivalent to —»(?'—> ; the resulting links are different from the original and denoted with the same arrow merely for brevity. (Folding links) The reverse of (2): any mediated link can be folded in a higherlevel link. (Abstraction) For any linked objects Ot —> 02 , there is an object O representing the link. (Unfolding objects) Any object mediating a link —» 0 ' - > is a contracted form of a triad of input, inner state, output: —» (S' —> C" —>/?)—> ; this rule might be replaced with an equivalent: —»0'—» implies —> (S' —>R') —>, and then -> (S' -> C" -> R1) -> by rule (2). (Refoldability) —> (0\ —> 02) —> is equivalent to —> Oi) —» ( 0 2 —> , with a proper re-interpretation of links.

The entity obeying these laws is called a hierarchy. This set of rules is not minimal, and there may be many equivalent formal systems. Explicitly specifying the levels of hierarchy for both objects and links, one can construct rather complex structures, then fold them into simple schemes, and unfold in different way. As it is easy to see, objects do not differ much from links in a hierarchy as a whole, while they will certainly be different in every hierarchical structure. Obviously, no hierarchy can be complete, since any element can be unfolded in a complex structure, producing additional elements and additional types of links. Hierarchical logic is a method of construction, rather than construction itself.

461 However, a hierarchy possesses a kind of absolute integrity, since every element is related to each other, and the hierarchy can always be unfolded in a structure, in which these elements are connected with a direct link. One could put forward the hypothesis that any static formal system, as known in modern mathematics, can be obtained via hierarchical development, as one of the possible unfoldings.

5

Conclusions

We have outlined an alternative approach to computation based on the hierarchical ideas. This approach conveniently links the traditional notions of analytical mechanics to the studies of human behavior within a general psychological theory of activity. Such a synthesis may be productive enough, to give birth to various nonstandard theories of computation and inference, efficient methods of distributed and parallel computing, new forms of artificial intelligence. Even if not so, it presents one more possible conceptualization, which is not reducible to any known mathematical structure, rather being a tool for reconstructing any integrity at all.

References 1. Mendelson, E. Introduction to mathematical logic (Princeton, NJ: D. van Nostrand, 1964). 2. Cutland, N. Computability: An introduction to recursive function theory. (Cambridge: Cambridge Univ., 1980) 3. Mesarovic M. D. and Takahara Y. General Systems Theory: Mathematical Foundations (N.Y.: Academic, 1975) 4. Hubey H. M. The Diagonal Infinity (Singapore: World Scientific, 1998) 5. L.Vygotsky, Thought and language (Cambridge, MA: MIT Press, 1986) 6. Leontiev A. N. Activity, Consciousness and Personality. (Englewood Cliffs, NJ: Prentice Hall, 1978) 7. Efimov E. I. Intellectual Problem Solvers. (Moscow: Nauka, 1982) 8. Ivanov P. B. A hierarchical theory of aesthetic perception: Scales in the visual arts Leonardo Music Journal, 5 (1995) pp. 49-55 9. Arnold V. I. Mathematical Methods of Classical Mechanics (Moscow: Nauka, 1979) 10. Dobronravov V. V. Foundations of Analytical Mechanics (Moscow: Vysshaya Shkola, 1976)

Imaging Applications

465 CATADIOPTRIC SENSORS FOR PANORAMIC VIEWING R. ANDREW HICKS, RONALD K. PERLINE AND MEREDITH L. COLETTA Department of Mathematics and Computer Science Drexel University Philadelphia PA 19104, USA E-mail: [email protected] We describe a family of reflective surfaces for panoramic viewing which achieves approximately cylindrical projections. The requirement of satisfying the single viewpoint constraint restricts the surface type to conies; in contrast, relaxing this requirement allows us to obtain a novel class of sensors which give a highly accurate approximation of a cylindrical projection. Design parameters for these sensors enable control of the region of space to be imaged, therefore increasing effective resolution by excluding unwanted portions of the scene.

1

Introduction

A panoramic image is one that provides a "360 degree" field of view. There are numerous ways to capture such images, from wide-angle lenses to cameras with slits that rotate around a piece of film wrapped on a cylinder. An approach which has recently generated much interest within the computer vision community is the use of catadioptric sensors, which consist of combinations of cameras with conventional lenses (dioptrics) and curved mirrors (catoptrics). In this paper, our attention will be focused on this type of sensor. Their usefulness in panoramic imaging stems from the following idea: if one points a camera at a curved mirror (usually convex), the field of view of the camera can be increased. The image may then be digitized and numerically transformed as desired, including the generation of various different projections from the acquired image. For example, one might use the sensor to generate a perspective projection of a small portion of the image. A major asset of this type of panoramic sensor is that it operates in real-time, facilitating video applications. A consequence of the increased field of view of a catadioptric sensor is that image resolution tends to low. To make this precise, we define the resolution of the sensor to be the total number of pixels in the image divided by the total solid angle (steradians) that have been imaged. (Recall that solid angle simply refers to the area of a region on a unit sphere; hence a hemisphere corresponds to an angle of 27r. A true omnidirectional sensor can view a whole sphere - a solid angle of 4TT.) Suppose one chooses a fixed camera with a conventional glass lens; this choice effectively fixes accumulated pixels and resolution. If one augments the sensor by introducing a reflective surface

466

component which increases the imaged solid angle, the resolution obviously decreases. Decreased resolution is a major disadvantage for important applications such as stereo imaging and tracking. It is therefore worthwhile to design sensors which maximize resolution by more "efficiently" imaging the view sphere, and ignoring regions not of interest to the observer. The shape of the catoptric component of a catadioptric sensor is crucial in determining not only the field of view, but also the types of projections that may be mimicked by the sensor (possibly coupled with digitally transforming the image). An important example is the class of surfaces that yield a single effective viewpoint. We will say that a sensor has a single effective viewpoint if it measures the intensity of light passing through some fixed point in space, in every possible direction. This point, which is known as the effective viewpoint, acts as a sort of virtual center of projection. If a sensor does have a single effective viewpoint, then perspective images with respect to that point may be recovered. This may be achieved by backprojecting the image onto the plane of choice. For example, consider a parabolic mirror being viewed by an orthographic-type" camera. This sensor design, due to Nayar *, exploits the fact that the focus of a parabola can play the role of a single effective viewpoint if the parabola is viewed orthographically.

Horizon line Figure 1. Here we see a schematic depiction of an image from a parabolic sensor. The region in side of the dotted circle is of no use when creating a panoramic image, and so those pixels are "wasted".

Suppose we wish to create a panoramic image with such a parabolic sensor, where the region on the view sphere of interest lies between parallel "Two basic projection models for a camera are orthographic and perspective. In the orthographic case the camera is assumed to detect light rays that are parallel to one another in a fixed direction. In the perspective case, the rays detected are all those that pass through a fixed point called the pinhole or center of projection.

467

latitudinal lines - imagine the region of space which is swept out by the rotating beam of light of a lighthouse, the "beam sweep". A large fraction of the image obtained by the parabolic sensor is devoted to the camera (centered in the middle), and its immediate neighborhood - an area likely not of interest to the observer. Thus, valuable pixels are wasted; the resolution of the sensor is not optimized. See figure (1) for a diagram illustrating this phenomenon. To circumvent this difficulty, we have designed an "exotic" (non-conic) rotationally symmetric reflective surface which gives an approximate cylindrical projection of a specified "beam sweep". Recall that a cylindrical projection is a mapping from the world to the plane obtained by choosing a center of projection on the axis of symmetry of a cylinder and projecting points in the world onto the cylinder along the lines that contain the center of projection. Then the cylinder is "unrolled" by a software transformation to obtain a 2-dimensional image (see figure (2)). The design specification for our mirror translates mathematically into the requirement that the cross-sectional profile of the surface satisfy a certain differential equation (see section 3).

World point projected onto the cylinder

. World point

Cut the cylinder along a line

Figure 2. The cylindrical projection

Any rotationally symmetric mirror will have the property that horizontal circles on a cylinder (whose symmetry axis coincides with that of the mirror) will be imaged without distortion, up to scaling. Our mirror enjoys the additional property that vertical lines along a given cylinder are imaged without distortion, that is, they are subjected to a simple linear scaling. We have observed experimentally that this property corresponds to a certain robustness for the unwarping process - the sensor is designed to image objects at a certain distance, but continues to work reasonably well for objects at varying distances.

468

2

Previous Work

There are numerous systems for creating panoramic images: wide-angle and fisheye lenses, mechanical means, stitching images, etc. We will not survey all of these methods, since we are interested in catadioptric sensors. The reader interested in these systems may consult the surveys by Svoboda and Pajdla 2 and Yagi 3 . An early use of mirrors for panoramic imaging is a patent by Rees 4 , who proposes the use of a hyperbolic mirror for television viewing. Another patent is by Greguss 5 , which is a system for panoramic viewing based on an annular lens combined with mirrored surfaces. Nalwa 6 describes a panoramic sensor that uses flat mirrors and multiple cameras that has a single effective viewpoint. Chahl and Srinivasan 7 describe a mirror that has the property that there is a linear relationship between the angle of incidence of the light and its angle of elevation. The mirror is described using a differential equation. Hicks and Bajcsy 8 consider a mirror which provides perspective images without digital unwar ping. Nayar 1 has described a true omni-directional sensor, with the goal of reconstructing perspective views. Nayar and Peri 9 investigate two mirror systems with a single effective viewpoint. 3

The Polar Sensor

ax+b x

c

Figure 3. Derivation of the differential equation

We now derive the differential equation that describes our panoramic mirrors for cylindrical projection. In figure (3) we see a schematic of the

469

mirror geometry. The cross section of the mirror is the graph of a function f(x). We assume that the mirror is being viewed orthographically 6 and that the image is formed on the z-axis. We fix the vertical line at x = c and assume also that the mirror shape is such that the point (0, x) is the image of a point (c, T(x)) where we require T(x) = ax + b; this corresponds to a linear relationship between distance measured along the vertical line x = c, and the "film" located along y = 0. Thus the image obtained from such a sensor can be used to create a cylindrical projection by applying a simple polar coordinate transformation, hence we refer to this sensor as the polar sensor. 9 is the angle between the normal to the curve and the light ray, where we assume of course that the angle of incidence is equal to the angle of reflection. We derive our equation by computing tan(20) is two different ways, and setting the two expressions equal to each other. ,a=9 4

u* l/u /

3 2

°

/ 0.5

1.5 2

Figure 4. Cross sections of mirrors for use with orthographic projection, with scaling factors o = 1 (a straight line of slope 1), a = 3 and a = 9, all with 6 = 0 and c = 10.

On one hand, since tan(0) = / ' ( # ) , tan(20) = Y^FM^- ®n * n e ° t n e r hand, from the diagram tan(20) = ./X\ZT(X) • Here we are mostly interested in the case when T(x) — ax + b. Thus we have our basic equation:

2f'(x) l-/'(x)2

c—x

f{x)-ax-b"

Solutions to this equation tend to look like straight lines or concave up monotonic functions (see figure (4)). In figure (5) we see a panoramic view of a scene obtained from a prototype of our catadioptric sensor. In this case the "vertical field of view" is about 45 degrees. The image quality decreases towards the top of the image, where there is a discontinuity in the unwarping process. 6

This is a reasonable approximation for a camera with telecentric lens. A similar derivation is possible for the case in which the camera projection is assumed to be perspective.

470

Figure 5. On the left we see an image obtained directly from our sensor. On the right we see the same image unwarped.

4

Conclusions

Using differential equation techniques, we have introduced a new class of mirrors for panoramic imaging. By employing these mirrors, one can construct catadioptric sensors which efficiently image a given latitudinal range of the view sphere. In general, extensions of these techniques can be employed to construct catadioptric sensors with specific design constraints while optimizing their effective resolution. We axe currently investigating these research directions. References 1. S. Nayar. Catadioptric omnidirectional camera. In Proc. Computer Vision Pattern Recognition, pages 482-488, 1997. 2. T. Svoboda and T. Pajdla. Panoramic cameras for 3d computation. In Proc. of the Czech Pattern Recognition Workshop, pages 63-70, 2000. 3. Y. Yap. Omnidirectional sensing and its applications. IEICE Trans, on Information and Systems, E82-D(8), pages 568-579, 1990. 4. D. Rees. Panoramic television viewing system. United States Patent, (3,505,465), April, 1970. 5. P. Greguss. Panoramic Imaging Block for Three-dimensional space. United States Patent, (4,566,736), January, 1986. 6. V. Nalwa. A true omnidirectional viewer. Technical Report, Bell Laboratories, Bolmdel, NJ 07788, USA, 1996. 7. J.S. Chahl and M.V. Srinivasan. Reflective surfaces for panoramic imaging. Applied Optics, 36:8275-8285, 1997. 8. R. Hicks and R. Bajcsy. Catadioptic sensors that approximate wide-angle perspective projections. In Proc. Computer Vision Pattern Recognition, pages ' 545-551,- 2000. 9. S. Nayar and V. Peri. Folded catadioptric cameras. In Proc. Computer 'Vision Pattern Recognition, pages 217-223, 1999.

471 HIGH-PERFORMANCE COMPUTING FOR THE STUDY OF EARTH AND ENVIRONMENTAL SCIENCE MATERIALS USING SYNCHROTRON X-RAY COMPUTED MICROTOMOGRAPHY HUANFENG Department of Earth and Environmental Studies, Montclair State University, Upper Montclair, NJ 07043, USA E-mail: fengh@mail. montclair. edu KEITH W. JONES Environmental Sciences Department, Brookhaven National Laboratory, Upton, NY 11973, USA E-mail: [email protected] MICHAEL MCGUIGAN, GORDON J. SMITH, JOHN SPILETIC Information Technology Division, Brookhaven National Laboratory, Upton, NY 11973, USA E-Mails: [email protected], [email protected], [email protected]

Synchrotron x-ray computed microtomography (CMT) is a non-destructive method for examination of rock, soil, and other types of samples studied in the earth and environmental sciences. The high x-ray intensities of the synchrotron source make possible the acquisition of tomographic volumes at a high rate that requires the application of highperformance computing techniques for data reconstruction to produce the threedimensional volumes, for their visualization, and for data analysis. These problems are exacerbated by the need to share information between collaborators at widely separated locations over both local and wide-area networks. A summary of the CMT technique and examples of applications are given here together with a discussion of the applications of high-performance computing methods to improve the experimental techniques and analysis of the data.

1

Introduction

Materials studied in the earth and environmental sciences are generally very inhomogeneous and complex materials. Investigation of the three-dimensional properties of these materials is essential. However, there are relatively few ways that these properties can be measured using non-destructive methods. These methods include laser confocal microscopy, magnetic resonance imaging, and x-ray computed tomography. The use of x-ray computed tomography is particularly powerful since it gives information on x-ray attenuation coefficients and thus can distinguish between different minerals, pore space, and liquid-filled pore space in specimens that can range from a few millimeters to many centimeters in size. Our purpose here is to describe the application of computed tomography techniques to these problems based on the use of synchrotron radiation and to discuss the application of high performance computing to improve the technique.

472

2

Instrumentation and Method

A schematic diagram of the CMT apparatus at the Brookhaven National Synchrotron Light Source (NSLS) is shown in Figure 1 [7]. The x rays are detected with a thin scintillator of CsI(Na) or YAG:Ce. A mirror/lens system is used to focus light from the scintillator onto a charge-coupled device (CCD) camera. Blurring effects caused by scattering of the x rays in the scintillator are minimized by the small depth-of-field of the magnifying lens. The spatial resolution of the system is energy dependent and of the order of 0.005 mm. The CCD cameras used employed CCD chips with 1317 x 1035 and 3072 x 2048 pixels. In practice, the pixels are often binned to reduce the size of the tomographic volumes and thereby ensure practicable times for data acquisition and reconstruction. Monoenergetic beams are used at energies to about 20 KeV, and filtered white beam is used for measurements at higher energies to obtain higher x-ray intensities. Typical data collection times are of the order of 1-2 hours.

^ s ^ ^ # %%m$m &?$m&'

<—i^m^mS"

Figure 1. A schematic diagram of the apparatus.

Data acquisition produces a set of camera frames taken as the sample is rotated in, ^ series of steps determined Jby the number of pixels desired in the volume. The procedure covers an angular range of 180 degrees with respect to the incident x-ray beam, the number and size of files depend on the sample size and desired spatial resolution. , The data reconstruction proceeds in 3 phases. Phase 1 applies a whitefield normalization, any filters needed, and writes files containing data frbm all views for

473

a single slice (horizontal row of pixels), one file per slice. Phase 2 processes each slice independently. It applies the view-by-view air value normalization, optionally applies a filter to reduce the ring artifacts, computes the location in the images of the center of rotation, and converts the data to a sinogram. Phase 3 is the actual reconstruction. It generates a square array with dimensions of the horizontal row size. After this, the reconstruction is completed and the visualization process begins. This is a much more varied process and depends strongly on the particular sample being analyzed. 3

Experimental Results and Discussion

We have used tha CMT apparatus to investigate many different materials relevant to the earth sciences. In particular, sandstones are of wide interest, and typical data is presented here to show a specific example of the usefulness of synchrotron CMT. The data can be presented in several ways. A volume representation showing a 3-D view of sandstone from the Vosges region of France is shown in Figure 2. The grain structure of the material is clearly visible. A view of a single section through the sample is shown in Figure 3. This type of view helps to highlight the pore-grain relationships. The data can then be processed according to the measured attenuation coefficients to display the data in binary form representing either solid or pore space. Analysis of this data then gives the two-dimensional correlation function, porosity, permeability, and tortuosity. The measured microgeometry also can be used as a realistic basis for fluid flow calculations [2].

F i g u r e 2. Isosurface rendering of a three-dimensional tomographic volume of a sample of sandstone from the Vosges. The color scale indicates the values of the measured absorption coefficients. The pore space is shown as blue.

474

N

250-f

41

a soo625 750 125 250 375 500 625 75Q

-D jamas &cra I

0 . 0 0 0 0 0 . 0 0 2 5 0 . 0 0 5 0 0 . 0 0 7 5 0.0100

'slice F i g u r e 3* Single section through the sandstone volume shown in Figure 2. Absorption coefficients are indicated by the color scale.

4

High Performance Computing

The present system used at BNL and others, similar in design at the NSLS, APSS ESRF, md other synchrotrons, have demonstrated their worth in varied experiments on environmental and earth science topics. However, the system performance and usefulness can be vastly improved by use of high-performance computing technology. First, present day CCD cameras can produce 15-20 frames per second with a size of about 1000 x 1000 pixels. Assuming that about 1500 frames will be necessary to acquire data for a tomogram, approximately 6 gigabytes of data are produced per second (2 bytes per pixel). The goal is then to carry out all steps of the reconstruction process in the time needed to acquire the data, or about 1500 frames/15 frames/s or 100 s. Second, in order to be able to adjust experimental parameters in near real time, it is necessary to produce data displays on about the same time scale. There needs to be control of the display by the experimenter so that different views are feasible and thresholding can be set to display changes in pore structures or fluid motion.

475

There are other aspects of the problem that include high-speed networking, rapid data storage, and remote viewing that present challenges to computer science. Here we concentrate our attention on the first two points, data reconstruction and viewing. We have developed methods for doing these tasks on the time scales required by recourse to parallel computation techniques. X-ray computed microtomography is a highly computer intensive and memory intensive application. The large volumes and small grid spacing required for micrometer resolution push the limits of even the most powerful workstations. For this reason we are applying high performance parallel computation and data compression for remote access of tomographic data sets. The two main areas that need to be addressed are reconstruction and visualization. Reconstruction takes many projections obtained from the high resolution CCD camera and uses a FFT transform method developed by Robert Marr at Brookhaven to form a threedimensional gridded data set. Visualization of the data set either as a volume or an isosurface is used to closely inspect the sample and extract new features. This is typically done within a complete visualization package such as OpenDX or VTK. Reconstruction of large data sets can take several hours. Visualization on large grids can also take tens of minutes just to form a particular view of an isosurface. The use of parallelization to address these problems is one way to achieve necessary speedups for fast reconstruction and visualization. Basically one divides up the data set across multiple processors and reassembles the final result by synchronization and communication among processors. There are two main protocols for parallelization: Parallel Virtual Machine (PVM) and the Message Passing Interface (MPI). MPI is the more recent of the two protocols and is becoming a standard particularly on clusters of Linux computers. Parallel reconstruction using MPI is currently being used at the APS, and this technique can also be applied to data from the NSLS beam line. The results of applying parallel visualization using MPI to the NSLS beam line data are described here. 4.1

Parallel Visualization

Parallel visualization refers to the use of multiple computers for the graphical depiction of large data sets. It has its origins in parallel rendering where one breaks up the image to be rendered into smaller pieces and has each processor render its own part with all the pieces brought together for the final composite image [6, 8]. For example, Baily [5] has ported Pixar's RenderMan rendering software to the parallel Intel paragon machine. The free ray tracing program POV has also been ported to Linux clusters using MPI and PVM and achieved near linear speedups comparable with more expensive specialty supercomputers. Recently, parallel

476

visualization has gone a step beyond this by breaking up the data set itself. This data parallel model allows large data sets of high resolution, such as the x-ray tomography data sets discussed here, to be manipulated by using the cumulative memory and processing power of a Linux cluster. The data parallel model has been implemented in OpenDX 4.1.2 and VTK using MPI [1,3]. 4.1.1

OpenDX

OpenDX is an open source visualization package that can be applied to a wide variety of data sets [1]. Currently, we use the software to give a quick view of x-ray tomographic data sets, extract isosurfaces and slices, and convert from NetCDF data format to a data format that can be read into VTK. Recently, a port of the software using MPI has been achieved, and we are planning to apply this version to x-ray tomography. Figure 4 shows a screen save of the OpenDX software we have developed. OpenDX builds applications by dragging modules of specific functionality into the visual programming editor canvas and linking them together to form a network. The network shown uses an isosurface module, an export module to do the NetCDF conversion, and an image module for the final rendering. The data set shown is a high resolution x-ray tomographic data of the thigh bone of a rat used in osteoporosis studies.

Figure 4. OpenDX application for isosurface extraction and conversion from NetCDF data format to VTK format.

477

4.1.2

VTK

The Visualization Toolkit (VTK) is a multiplatform visualization package with C++, TK/TCL and Python bindings. It is known for its high quality implementation of the latest algorithms in computer graphics. Its most recent version includes a parallel MPI implementation of a subset of its modules [3, 4]. Here we discuss two programs built on this subset which we have applied to x-ray tomographic data sets. The first program, Paralso, breaks up the data set according to the command line arguments. It then computes isosurfaces for each piece on separate processors and then renders the final image with colors to indicate the work of the separate processors. An example for a tomographic data set is shown in Figure 5. The computation was done on a 4 Xeon processor 1400 Linux server from SGI. The second MPI program DataParallelism breaks up the data more directly by using an image reader module. It includes it's own sample data and explicit timing functions to measure the performance of the parallel isosurface computation. The results for the sample data are shown in Figure 6 and indicate nearly a factor of three speedup using all four processors.

Figure 5. Output from the VTK program Paralso. The four colors indicate the portion of the isosurface computed by each of the four processors on the Linux cluster.

478

100 _ m

B CM

CO

'

# processors Figure 6. Graph showing the speedup using the parallel MPI VTK software DataParallelism to compute isosurfaces.

4.2

Remote Visualization

The use of large computer facilities to perform x-ray tomographic reconstruction and visualization is of no use unless the final rendering can be delivered to the scientist's desktop. This can mean delivery over a high-speed intranet or more remote delivery connecting dispersed researchers over the Internet or wide area network. We are currently testing two software packages designed to perform remote visualization. Both packages use compression algorithms to speed up transmission to the desktop. The first remote visualization software, VizServer from SGI, transmits OpenGL visualization remotely from a multipipe SGI Onyx2 computer to SGI, Sun, or Linux clients. The software was able to achieve frame rates useful for interactive viewing (approx. 10frames/sec)for typical data sets. The Onyx2 we used had 2 GB of Ram, 6 processors and 2 pipes. As only one pipe is available for vizserving, this means at most one user at a time can access the data remotely from this system. A second remote visualization software has recently been released by TGS as part of Openlnventor 3.0. Openlnventor is a high level C++ and Java graphics toolkit built on top of OpenGL. This software works on a variety of platforms and doesn't require separate pipes for each user. It delivers Openlnventor applications directly to the desktop, and these can include x-ray tomographic data in Inventor and VRML format.

479 Another approach to remote visualization is the construction of a VRML server to deliver 3-D models directly over the Internet. VTK itself can be used to construct such a server in conjunction with a VRML browser. Currently we are using Cosmoplayer and parallelgraphics VRML browsers. The later can deliver on a variety of platforms including wireless PDA devices. 4.3

Stereoscopic Viewing

To understand the three-dimensional structure from high-resolution x-ray computed tomography it is useful to have a stereoscopic presentation of the data. Currently we are using a passive system with two Barco projectors and polarized filters [9]. The projectors are connected to an Onyx2 computer and the visualization is rear projected on a special screen that preserves polarization. The final visualization is suitable for group viewing in a visualization theatre with inexpensive polarized glasses. At the desktop, stereoscopic viewing can be done using a page-flipping method and active glasses. Both methods can be used with OpenDX or VTK visualization toolkits. The parallel graphics VRML browser can also be operated for remote stereoscopic visualization on the Internet. 5

Conclusions

In this paper we briefly described the use of synchroton CMT for investigation of earth and environmental sciences samples and then described improvements to the CMT system by application of high performance computing methods. We used OpenDX and MPI versions of VTK to perform the visualizations and achieved a factor of three increase in speed using a four processor Linux cluster. We also studied the application of visualization server software for desktop delivery and achieved reasonable frame rates for typical data sets. In the future, we will apply MPI versions of OpenDX and the use of VRML servers in order to make these high performance techniques widely available. 6

Acknowledgments

We wish to thank C. Law, B. Geveci, and S. Murtagh for assistance with parallel VTK and C. Holmes for assistance with SGI vizserver as well as Betsy Dowd for help with CMT data sets. Work supported under DOE Contract DE-AC0298CH10886.

480

References 1. Abram, G. and Treinish, L. An extended data-flow architecture for data analysis and visualization. Proceedings of Visualization, 263, IEEE Computer Society Press (1995). 2. .Adler, P. M, Thovert, J.-F., Jones, K. W., Peskin, A. M., Siddons, D. P., Andrews, B., and Dowd, B. Determination of topology and transport in porous media using synchrotron computed microtomography (abstract). Presented at 1996 Spring Meeting of The American Geophysical Union, Baltimore, Maryland (1996). 3. Ahrens, J., Law, C, Shroeder, W., Martin, K., and Papka, M. A parallel approach for efficiently visualizing extremely large, time-varying datasets. LANL Technical Report LAUR-00-1620. 4. Ahrens, J., Martin, K., Geveci, B., Law, C, and Papka, M. Large-scale visualization using parallel data streaming. (preprint) 5. Bailey, M. Parallel RenderMan. http://www.sdsc.edu, (1996). 6. Crockett, T. W. Parallel rendering. In SIGGRAPH '98 "Parallel Graphics and Visualization Technology" Course #42. (1998) 7. Dowd, B. A., Andrews, A. B., Marr, R. B., Siddons, D. P., Jones, K. W., and Peskin, A. M. Advances in x-ray computed microtomography at the NSLS. Presented at 47th Annual Denver X-Ray Conference, Colorado Springs, Colorado, August 3-7, 1998. In Advances in X-Ray Analysis. Vol. 42, Plenum Publishing Corp., New York, New York. (1999) 8. Ellsworth D., Polygon Rendering for Interactive Visualization on Multicomputers, PhD thesis, University of North Carolina at Chapel Hill. (1997) 9. Smith, G., Andrews, B. ITD's BNL Visualization Lab, http://www.itd.bnl.gov/ visualization/vis3.html, ITD, BNL, Upton, NY (1997).

481

AUTHOR INDEX

Abdul Rahman Araby, N., 261 Akama, A., 379, 391, 399, 405, 411 Antoniou, G., 221, 229 Ashton, K., 441 Barat, R., 283 Beidler, J., 213 Benigno, D., 87 Benyoub,A., 121 Bitincka, L., 221 Bloor, C , 87, 93, 105 Carmo, L.F., 9 Celik,L, 229 Chavananikul, V., 385 Chen, X., 449 Christou, C , 127 Cios, K., 433 Cockton, G., 87, 93 Coletta, M.L., 465 Daoudi,E.M., 121,177 Davis, B., 87 De Arriaga, F., 247 Delicato, F.C., 9 Doherty,E., 81,87 Dorronsoro, J.R., 239 Eachus, H.T., 111 Edelson, W., 255 Edoh, K.D., 423 El Alami, M., 247

Feng, H., 471 Fitzpatrick, E., 369 Gargano, M., 255 Gibson, R., 345 Gnanayutham, P., 93 Gutierrez, A., 201, 207, 269 Han,C.Y., 195 He, B., 311 He,Y., 255 Hicks, R.A., 465 Honan, P.J., 303 Hou,W.-C.,317 Hubey, H.M.,15, 71, 275, 455 Ishikawa, T., 379, 399, 405 Ivanov, L., 165 Ivanov, P., 15, 455 Jenq,J., 157, 185 Jones, K. W., 471 Jorgenson, J., 293, 333 Junker, A., I l l Kaneko, K., 275 Kawaguchi, A., 3 Keerthi, S.S., 31 Kendal, S., 441, 449 Kimm, H., 143 Koike, H., 379,391,399,405,411

482

Layton, H., 55 Lee, Keon-Myung, 433 Lee, Kyung-Mi, 433 Lejk, E., 105 Li, J., 293 LLW.N., 157, 185 Lopez, V., 239 Lorenz, J., 423 Luczaj, J. E., 195 Mabuchi,H., 379,391,411 Maj, S.P., 99, 359 Manneback, P., 177 Manikopoulos, C , 127,293,311,: McGuigan, M., 471 Middleton, W., 105 Moore, L., 55 Mowshowitz, A., 3 Murthy, K.R.K., 31 Murty,M.N.,31 Odusanya, A.A., 419 Perline, R., 465 Pirmez, L., 9 Pitman, E.B., 55 Prasad, B., 385 Prim, M., 23 Rizzo, J., 81, 87 Rosar, M., 63 Santa Cruz, C , 239 Seegmiller, S., 369

Shigeta,Y.,399,411 Sigiienza, J.A., 239 Sigura, I., 275 Singh, P., 261 Slanvetpan, T., 283 Smith, G.J., 471 Somolinos , A., 201, 207, 269 Spiletic, J., 471 Stevens, J.G., 283 Stephenson, G., 81 Su,M., 317 Sypniewski, B.P., 353 Tomaszewski, S., 229 Tureli, U., 303 Tuszynski, J., 41 Ucles, J., 333 Ugena, U., 247 Veal, D., 99, 359 Wang,D., 137, 151 Wang, H., 317 Zaritski, R., 55 Zbakh, M., 177 Zhang, H., 317 Zhang, P., 275 Zhang, Z., 333 Ziavras, S., 127

•l 1

ISBN 981-02-4759-1