Fuzzy Logic in Geology
Edited by
Robert V. Demicco and George J. Klir CENTER FOR INTELLIGENT SYSTEMS BINGHAMTON UNIVERSITY (SUNY) BINGHAMTON, NEW YORK, USA
Amsterdam Boston Heidelberg London New York Oxford Paris San Diego San Francisco Singapore Sydney Tokyo
This book is printed on acid-free paper. Copyright © 2004, Elsevier Science (USA) All rights reserved. No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopy, recording, or any information storage and retrieval system, without permission in writing from the publisher. Permissions may be sought directly from Elsevier’s Science & Technology Rights Department in Oxford, UK: phone: (+44) 1865 843830, fax: (+44) 1865 853333, e-mail:
[email protected]. You may also complete your request on-line via the Elsevier Science homepage (http://elsevier.com), by selecting “Customer Support” and then “Obtaining Permissions.” Academic Press An imprint of Elsevier Science 525 B Street, Suite 1900, San Diego, California 92101-4495, USA http://www.academicpress.com Academic Press 84 Theobald’s Road, London WC1X 8RR, UK http://www.academicpress.com Library of Congress Cataloging-in-Publication Data ISBN 0-12-415146-9
PRINTED IN THE UNITED STATES OF AMERICA
03 04 05 06 07 08 9 8 7 6 5 4 3 2 1
Contents
Contributors
Foreword by Lotfi A. Zadeh
Preface
Glossary of Symbols
Chapter 1    Introduction
Chapter 2    Fuzzy Logic: A Specialized Tutorial
Chapter 3    Fuzzy Logic and Earth Science: An Overview
Chapter 4    Fuzzy Logic in Geological Sciences: A Literature Review
Chapter 5    Applications of Fuzzy Logic to Stratigraphic Modeling
Chapter 6    Fuzzy Logic in Hydrology and Water Resources
Chapter 7    Formal Concept Analysis in Geology
Chapter 8    Fuzzy Logic and Earthquake Research
Chapter 9    Fuzzy Transform: Application to the Reef Growth Problem
Chapter 10   Ancient Sea Level Estimation
Acknowledgments
Index
Contributors
● Andras Bardossy (Chapter 6): Institute of Hydraulic Engineering, University of Stuttgart, Pfaffenwaldring 61, D-70550 Stuttgart, Germany [[email protected]].
● Radim Bělohlávek (Chapter 7): Department of Computer Science, Palacký University of Olomouc, Tomkova 40, CZ-77900 Olomouc, Czech Republic [[email protected]].
● Istvan Bogardi (Chapter 6): Department of Civil Engineering, University of Nebraska at Lincoln, Lincoln, Nebraska 68588, USA [[email protected]].
● Robert V. Demicco (Chapters 1, 3, 4, and 5): Department of Geological & Environmental Studies and Center for Intelligent Systems, Binghamton University (SUNY), Binghamton, New York 13902, USA [[email protected]].
● Lucien Duckstein (Chapter 6): Ecole Nationale du Génie Rural des Eaux et des Forêts, 19 avenue du Maine, 75732 CEDEX 15, France [[email protected]].
● Chongfu Huang (Chapter 8): Institute of Resources Science, Beijing Normal University, 19 Xinjiekouwai Street, Beijing 100875, China [[email protected]].
● George J. Klir (Chapters 1 and 2): Department of Systems Science & Industrial Engineering and Center for Intelligent Systems, Binghamton University (SUNY), Binghamton, New York 13902, USA [[email protected]].
● Vilem Novák (Chapter 10): Institute for Research and Applications of Fuzzy Modeling, University of Ostrava, 30. dubna 22, 70103 Ostrava, Czech Republic [[email protected]].
● Irina Perfilieva (Chapter 9): Institute for Research and Applications of Fuzzy Modeling, University of Ostrava, 30. dubna 22, 70103 Ostrava, Czech Republic [Perfi[email protected]].
● Rita Pongracz (Chapter 6): Department of Meteorology, Eotvos Lorand University, Pazmany P. setany 1/A, H-1117 Budapest, Hungary [[email protected]].
Foreword
In October 1999, at the invitation of my eminent friend, Professor George Klir, I visited the Binghamton campus of the State University of New York. In the course of my visit, I became aware of the fact that Professor Klir, a leading contributor to fuzzy logic and theories of uncertainty, was collaborating with Professor Robert Demicco, a leading contributor to geology and an expert on sedimentology, on an NSF-supported research project involving an exploration of possible applications of fuzzy logic to geology. What could be more obvious than suggesting to Professors Klir and Demicco to edit a book entitled “Fuzzy Logic in Geology.” No such book was in existence at the time. I was delighted when Professors Klir and Demicco accepted my suggestion. And, needless to say, I am gratified that the book has become a reality. But, what is really important is that Professors Klir and Demicco, the contributors and the publisher, Academic Press, have produced a book that is superlative in all respects. As the editors state in the preface, Fuzzy Logic in Geology is intended to serve three principal purposes: (1) to examine what has been done in this field; (2) to explore new directions; and (3) to expand the use of fuzzy logic in geology and related fields through exposition of new tools. To say that Fuzzy Logic in Geology achieves its aims with distinction is an understatement. The excellence of organization, the wealth of new material, the profusion of applications, and the high expository skill of contributors, including Professors Klir and Demicco, combine to make the book an invaluable reference and an important source of new ideas. There is no doubt that Fuzzy Logic in Geology will be viewed as a landmark in its field. In the preface, Professors Klir and Demicco note that applications of fuzzy logic in science are far less visible than in engineering and, especially, in the realm of consumer products. Is there an explanation? 
In science, there is a deep-seated tradition of striving for the ultimate in rigor and precision. Although fuzzy logic is a mathematically based theory, as is seen in Chapter 2, there is a misperception, reflecting the connotation of its label, that fuzzy logic is imprecise and not well-founded. In fact, fuzzy logic may be viewed as an attempt to deal precisely with imprecision, just as probability theory may be viewed as an attempt to deal precisely with uncertainty.
A related point is that in many of its applications, a concept which plays a key role is that of a linguistic variable, that is, a variable where values are words rather than numbers. Words are less precise than numbers. That is why the use of linguistic variables in fuzzy logic drew critical comments from some of the leading members of the scientific establishment. As an illustration, when I gave my first lecture on linguistic variables in 1972, Professor Rudolf Kalman, a brilliant scientist/engineer, had this to say: I would like to comment briefly on Professor Zadeh’s presentation. His proposals could be severely, ferociously, even brutally criticized from a technical point of view. This would be out of place here. But a blunt question remains: Is Professor Zadeh presenting important ideas or is he indulging in wishful thinking? No doubt Professor Zadeh’s enthusiasm for fuzziness has been reinforced by the prevailing climate in the US—one of unprecedented permissiveness. ‘Fuzzification’ is a kind of scientific permissiveness; it tends to result in socially appealing slogans unaccompanied by the discipline of hard scientific work and patient observation.
In a similar vein, a colleague of mine at UCB and a friend, Professor William Kahan, wrote: Fuzzy theory is wrong, wrong, and pernicious. I cannot think of any problem that could not be solved better by ordinary logic. . . . What Zadeh is saying is the same sort of things as, ‘Technology got us into this mess and now it can’t get us out’. Well, technology did not get us into this mess. Greed and weakness and ambivalence got us into this mess. What we need is more logical thinking, not less. The danger of fuzzy theory is that it will encourage the sort of imprecise thinking that has brought us so much trouble.
What Professors Kalman, Kahan, and other prominent members of the scientific establishment did not realize is that mathematically based use of words enhances the ability of scientific theories to deal with real-world problems. In particular, in both science and engineering, the use of words makes it possible to exploit the tolerance for imprecision to achieve tractability, robustness, simplicity and low cost of solution. The use of linguistic variables is the basis for the calculus of fuzzy if-then rules, a calculus which plays a key role in many of the applications of fuzzy logic, including its applications in geology. During the past few years, the use of words in fuzzy logic has evolved into a methodology labeled computing with words and perceptions (CWP), a methodology which casts a new light on fuzzy logic and may lead to a radical enlargement of the role of natural languages in science and engineering. Computing with words and perceptions is inspired by the remarkable human capability to perform a wide variety of physical and mental tasks, e.g., driving a car in city traffic or playing tennis, without any measurements and any computations. In performing such tasks, humans employ perceptions: perceptions of distance, speed, direction, intent, likelihood, and other attributes of physical and mental objects.
There is an enormous literature on perceptions, spanning psychology, philosophy, linguistics, and other fields. But what has not been in existence is a theory in which perceptions can be operated on as objects of computation. Fuzzy logic provides a basis for such a theory, a theory which is referred to as the computational theory of perceptions (CTP). In the computational theory of perceptions, perceptions are dealt with not as patterns of brain activity, but through their descriptions in a natural language. In this sense, a natural language may be viewed as a system for describing perceptions. Thus, if classical, bivalent logic is viewed as the logic of measurements, then fuzzy logic may be viewed as the logic of perceptions. Although the methodology of computing with words and perceptions is not treated explicitly in the book, the basic ideas which underlie it are in evidence throughout. Furthermore, Fuzzy Logic in Geology ventures beyond well-established techniques and presents authoritative expositions of methods which lie on the frontiers of fuzzy logic. In this respect, particularly worthy of note are the chapters on formal concept analysis (R. Bělohlávek), F-transformation (I. Perfilieva), and linguistic theory (V. Novák). In sum, Fuzzy Logic in Geology is a true role model. It is a high quality work which opens the door to application of new methods and new viewpoints to a variety of basic problems in geology, geophysics, and related fields. It is well-organized and reader-friendly. The editors, the contributors, and the publisher deserve our thanks and accolades.
Lotfi A. Zadeh
May 13, 2003
Berkeley, CA
Preface
This book has three purposes. Its first purpose is to demonstrate that fuzzy logic opens a radically new way to represent geological knowledge and to deal with geological problems, and that this new approach has been surprisingly successful in many areas of geology. This book’s second purpose is to help geologists understand the main facets of fuzzy logic and the role of these facets in geology. The final purpose of this book is to make researchers in fuzzy logic aware of the emerging opportunities for the application of their expertise in geology. This book is a chimera in that it is oriented not only at theoreticians, practitioners, and teachers of geology, but also at members of the fuzzy-set community. For geologists, the book contains a specialized tutorial on fuzzy logic (Chapter 2), a basic introduction to the application of fuzzy logic to model geological situations (Chapter 3), an overview of currently known applications of fuzzy logic in geology (Chapter 4), and six additional chapters with more extensive examples of applications of fuzzy logic to problems in a broad range of geological disciplines. For fuzzy logicians, the book is an overview of areas of geology in which fuzzy logic is already well established or is promising. Thus, our overall aim in preparing this book is to provide a useful link between the two communities and further stimulate interdisciplinary research. The book is a product of a close cooperation between the editors and the several contributing authors. The authors were commissioned to write chapters on specific topics. Great care has been taken to assure that the mathematical terminology and notation are uniform throughout the book. Moreover, care was also taken to assure that the structure of individual chapters and the style of referencing were consistent throughout. Furthermore, authors were requested to focus on clarity of presentation, adding summaries of technical content where appropriate. 
All these features make the book attractive and appropriate as a text for graduate courses and seminars. The book is written, by and large, in a narrative style, with the exception of a few sections in Chapters 7 and 9. These chapters are dependent on fairly complex mathematical preliminaries. It is far more efficient to introduce these preliminaries in a more formal style, typical of mathematical literature, using numbered definitions, lemmas, theorems, and examples. Although this formal presentation in Chapters 7 and 9 is essential for understanding operational details of the described methods, it is not necessary for a conceptual understanding of the methods and their geological applications. In fact, these chapters are structured conceptually. With this structure,
the reader may still get the gist of the chapter without studying the details of the formal presentation. The idea of preparing a book on fuzzy logic in geology was suggested to the editors by Lotfi Zadeh, the founder of fuzzy logic, during his visit to Binghamton University in October 1999. Our opinion then, and now, is that it was a good idea. While fuzzy logic is now well established as an important tool in engineering, its applications in science are far less developed. Nevertheless, the utility of fuzzy logic in various areas of science has been increasingly recognized since at least the mid 1990s. A good example is in chemistry, where the role of fuzzy logic is examined in the excellent book Fuzzy Logic in Chemistry, edited by Dennis H. Rouvray and published by Academic Press in 1997. It thus seemed natural to propose this book, which examines the role of fuzzy logic in geology, to Academic Press, with an eye toward obtaining a synergistic effect. We hope that this book will not only serve its purpose well, but that it will stimulate publication of other books exploring the role of fuzzy logic in other areas of natural sciences such as biology and physics as well as in the social sciences such as geography and economics.
Robert V. Demicco and George J. Klir
Binghamton, New York
December 2002
Glossary of Symbols
General Symbols

{x, y, ...}               Set of elements x, y, ...
{x | p(x)}                Set determined by property p
x1, x2, ..., xn           n-tuple
[xij]                     Matrix
[x1, x2, ..., xn]         Vector
[a, b]                    Closed interval of real numbers between a and b
[a, b), (b, a]            Interval of real numbers closed in a and open in b
(a, b)                    Open interval of real numbers
A, B, C, ...              Arbitrary sets (crisp or fuzzy)
x ∈ A                     Element x belongs to crisp set A
A(x) or μA(x)             Membership grade of x in fuzzy set A
αA                        α-cut of fuzzy set A
α+A                       Strong α-cut of fuzzy set A
A = B                     Set equality
A ≠ B                     Set inequality
A − B                     Set difference
A ⊆ B                     Set inclusion
A ⊂ B                     Proper set inclusion (A ⊆ B and A ≠ B)
SUB(A, B)                 Degree of subsethood of A in B
P(X)                      Set of all crisp subsets of X (power set)
F(X)                      Set of all standard fuzzy subsets of X (fuzzy power set)
|A|                       Cardinality of crisp or fuzzy set A (sigma count)
hA                        Height of fuzzy set A
Ā                         Complement of set A
A ∩ B                     Set intersection
A ∪ B                     Set union
A × B                     Cartesian product of sets A and B
A²                        Cartesian product A × A
f: X → Y                  Function from X to Y
f⁻¹                       Inverse of function f
R ◦ Q                     Standard composition of fuzzy relations R and Q
R ∗ Q                     Join of fuzzy relations R and Q
R⁻¹                       Inverse of a binary fuzzy relation
<                         Less than
≤                         Less than or equal to (also used for a partial ordering)
x | y                     x given y
x ⇒ y                     x implies y
x ⇔ y                     x if and only if y
Σ                         Summation
Π                         Product
max(a1, a2, ..., an)      Maximum of (a1, a2, ..., an)
min(a1, a2, ..., an)      Minimum of (a1, a2, ..., an)
ℕ                         Set of positive integers (natural numbers)
ℕn                        Set {1, 2, ..., n}
ℝ                         Set of all real numbers

Special Symbols

B(X, Y, I)                The set of all fuzzy concepts in a given context X, Y, I
c                         Fuzzy complement
d(A)                      Defuzzified value of fuzzy set A
E                         Similarity relation (fuzzy equivalence)
h                         Averaging operation
hp                        Generalized means
i                         Fuzzy intersection or t-norm
imin                      Drastic fuzzy intersection
iw                        Fuzzy intersection of Yager class
J                         Fuzzy implication operator
L                         Set of truth degrees
L                         Complete residuated lattice
L^X                       The set of all fuzzy sets in X with truth values in L
m                         Fuzzy modifier
NecE                      Necessity measure corresponding to PosE
pA                        Fuzzy propositional form and truth assignment
p                         Fuzzy probability qualifier
PosE                      Possibility measure associated with a proposition “ν is E”
S(Q, R)                   Solution set of fuzzy relation equation R ◦ Q = R
T                         Fuzzy truth qualifier
X, Y                      Variables
X, Z, I                   Fuzzy context
u                         Fuzzy union or t-conorm
umax                      Drastic fuzzy union
uw                        Fuzzy union of Yager class
W                         Set of possible worlds
X                         Universal set (universe of discourse)
Ø                         Empty set
⊗                         Operation on L corresponding to conjunction (t-norm)
→                         Operation on L corresponding to implication
∧                         Classical operation of conjunction or minimum operation
∨                         Classical operation of disjunction or maximum operation
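Several of the general symbols listed in this glossary can be made concrete for a finite fuzzy set. The following Python sketch is illustrative only and is not taken from the book: it assumes the usual definitions (an α-cut keeps elements with grade at least α, a strong α-cut keeps grades strictly above α, the height is the largest grade, and the sigma count is the sum of the grades) together with the standard complement, minimum, and maximum operations. The sets A and B, their element names, and all function names are hypothetical.

```python
# A finite fuzzy set is stored as a dict: element -> membership grade in [0, 1].
# The two sets below are made-up examples, not data from the book.
A = {"x1": 0.0, "x2": 0.25, "x3": 0.75, "x4": 1.0}
B = {"x1": 0.5, "x2": 0.5, "x3": 0.5, "x4": 1.0}


def alpha_cut(fuzzy_set, alpha):
    """Crisp set of elements whose grade is at least alpha (assumed definition)."""
    return {x for x, grade in fuzzy_set.items() if grade >= alpha}


def strong_alpha_cut(fuzzy_set, alpha):
    """Crisp set of elements whose grade is strictly greater than alpha."""
    return {x for x, grade in fuzzy_set.items() if grade > alpha}


def height(fuzzy_set):
    """Largest membership grade attained in the set."""
    return max(fuzzy_set.values())


def sigma_count(fuzzy_set):
    """Cardinality of a finite fuzzy set: the sum of its membership grades."""
    return sum(fuzzy_set.values())


def complement(fuzzy_set):
    """Standard fuzzy complement, 1 - A(x)."""
    return {x: 1.0 - grade for x, grade in fuzzy_set.items()}


def intersection(a, b):
    """Standard fuzzy intersection (pointwise minimum) on a shared universe."""
    return {x: min(a[x], b[x]) for x in a}


def union(a, b):
    """Standard fuzzy union (pointwise maximum) on a shared universe."""
    return {x: max(a[x], b[x]) for x in a}


if __name__ == "__main__":
    print(alpha_cut(A, 0.75))         # elements with grade >= 0.75
    print(strong_alpha_cut(A, 0.75))  # elements with grade > 0.75
    print(height(A), sigma_count(A))
    print(intersection(A, B))
    print(union(A, complement(A)))
```

The grades were chosen as exact binary fractions so that the arithmetic in the sketch is free of rounding noise; the definitions themselves are given precisely in the tutorial of Chapter 2.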
Chapter 1
Introduction
Robert V. Demicco and George J. Klir
Traditionally, science, engineering, and mathematics showed virtually no interest in studying uncertainty. It was considered undesirable and the ideal was to eliminate it. In fact, eliminating uncertainty from science was viewed as one manifestation of progress. This attitude towards uncertainty, prevalent prior to the 20th century, was seriously challenged by some developments in the first half of that century. Among them were the emergence of statistical mechanics, Heisenberg’s uncertainty principle in quantum mechanics, and Gödel’s theorems that established an inherent uncertainty in formal mathematical systems. In spite of these developments, the traditional attitude towards uncertainty changed too little and too slowly during the first half of the century. While uncertainty became recognized as useful, or even essential, in statistical mechanics and in some other areas (such as the actuarial profession or the design of large-scale telephone exchanges), it was for a long time tacitly assumed that probability theory was capable of capturing the full scope of uncertainty. The presumed equality between uncertainty and probability was challenged only in the second half of the 20th century. The challenge came from two important generalizations in mathematics. The first one was the generalization of classical measure theory [Halmos, 1950] to the theory of monotone measures, which was first suggested by Choquet [1953] in his theory of capacities. The second one was the generalization of classical set theory to fuzzy set theory, which was introduced by Zadeh [1965]. In the theory of monotone measures, the additivity requirement of classical measures is replaced with a weaker requirement of monotonicity with respect to set inclusion. In fuzzy set theory, the requirement of sharp boundaries of classical sets is abandoned. 
That is, the membership of an object in a fuzzy set is not a matter of either affirmation or denial, as it is in the case of any classical set, but it is in general a matter of degree. For historical reasons of little significance, monotone measures are often referred to in the literature as fuzzy measures [Wang & Klir, 1992]. This name is somewhat confusing since no fuzzy sets are involved in the definition of monotone measures. However, monotone measures can be fuzzified (i.e., defined on fuzzy sets), which results in a more general class of monotone measures: fuzzy monotone measures [Wang & Klir, 1992, Appendix E].
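The difference between additivity and the weaker requirement of monotonicity can be checked by brute force on a small universal set. The Python sketch below is only an illustration under stated assumptions: the three-element universe, the weights, and the choice of a maximum-based set function g (a possibility-type measure) are hypothetical and are not drawn from the text. It verifies that g is monotone with respect to set inclusion while failing additivity on disjoint sets.

```python
from itertools import combinations

# Hypothetical universe and weights; the largest weight is 1 so that g(X) = 1.
X = ("a", "b", "c")
weight = {"a": 1.0, "b": 0.5, "c": 0.25}


def subsets(universe):
    """All crisp subsets of the universe, as frozensets."""
    return [
        frozenset(combo)
        for size in range(len(universe) + 1)
        for combo in combinations(universe, size)
    ]


def g(subset):
    """Maximum-based set function: the largest weight in the subset, 0 for the empty set."""
    return max((weight[x] for x in subset), default=0.0)


def is_monotone(measure, universe):
    """True if S being a subset of T always implies measure(S) <= measure(T)."""
    all_sets = subsets(universe)
    return all(
        measure(s) <= measure(t)
        for s in all_sets
        for t in all_sets
        if s <= t
    )


def is_additive(measure, universe):
    """True if measure(S | T) == measure(S) + measure(T) for all disjoint S, T."""
    all_sets = subsets(universe)
    return all(
        measure(s | t) == measure(s) + measure(t)
        for s in all_sets
        for t in all_sets
        if not (s & t)
    )


if __name__ == "__main__":
    print("monotone:", is_monotone(g, X))
    print("additive:", is_additive(g, X))
    print(g(frozenset({"a"})), g(frozenset({"b"})), g(frozenset({"a", "b"})))
```

With these weights, g({a}) = 1 and g({b}) = 0.5, yet g({a, b}) = 1 rather than 1.5, so g cannot be a classical (additive) measure even though it never decreases when a set grows.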
As is well known, probability theory is based on classical measure theory which, in turn, is based on classical set theory [Halmos, 1950]. When classical measures are replaced with monotone measures of some type and classical sets are replaced with fuzzy sets of some type, a framework is obtained for formalizing some new types of uncertainty, distinct from probability. This indicates that the two generalizations have opened a vast territory for formalizing uncertainty. At this time, only a rather small part of this territory has been adequately explored [Klir & Wierman, 1999; Klir, 2002]. Liberating uncertainty from its narrow confines of probability theory opens new, more expressive ways of representing scientific knowledge. As is increasingly recognized, scientific knowledge is organized, by and large, in terms of systems of various types (or categories in the sense of mathematical theory of categories) [Klir & Rozehnal, 1996; Klir & Elias, 2003]. In general, systems are viewed as relations between states of some variables. They are constructed for various purposes (prediction, retrodiction, prescription, diagnosis, control, etc.). In each system, its relations are utilized, in a given purposeful way, for determining unknown states of some variables on the basis of known states of some other variables. Systems in which the unknown states are determined uniquely are called deterministic; all other systems are called nondeterministic. By definition, each nondeterministic system involves uncertainty of some type. This uncertainty pertains to the purpose for which the system was constructed. It is thus natural to distinguish between predictive uncertainty, retrodictive uncertainty, diagnostic uncertainty, etc. In each nondeterministic system, the relevant uncertainty must be properly incorporated into the description of the system in some formalized language. To understand the full scope of uncertainty is thus essential for dealing with nondeterministic systems. 
When constructing a system for some given purpose, our ultimate goal is to obtain a system that is as useful as possible for this purpose. This means, in turn, to construct a system with a proper blend of the three most fundamental characteristics of systems: credibility, complexity, and uncertainty. Ideally, we would like to obtain a system with high credibility, low complexity, and low uncertainty. Unfortunately, these three criteria conflict with one another. To achieve high usefulness of the system, we need to find the right trade-off among them. The relationship between credibility, complexity and uncertainty is quite intricate and is not fully understood yet. However, it is already well established that uncertainty has a pivotal role in any efforts to maximize the usefulness of constructed systems. Although usually undesirable in systems when considered alone, uncertainty becomes very valuable when considered in connection with credibility and complexity of systems. A slight increase in relevant uncertainty may often significantly reduce complexity and, at the same time, increase credibility of the system. Uncertainty is thus an important commodity in the knowledge business, a commodity that can be traded for gains in the other essential characteristics of systems by which we represent
knowledge. Because of this important role, uncertainty is no longer viewed in science and engineering as an unavoidable plague, but rather as an important resource that allows us to deal effectively with problems involving very complex systems. It is our contention that monotone measures and fuzzy sets (as well as the various uncertainty theories opened by these two profound generalizations in mathematics) are highly relevant to geology, and that their utility in geology should be seriously studied in the years ahead. The aim of this book is to demonstrate this point by focusing on the role of fuzzy set theory, and especially the associated fuzzy logic, in geology. The term “fuzzy logic” has in fact two distinct meanings. In a narrow sense, it is viewed as a generalization of classical multivalued logics. It is concerned with the development of syntactic aspects (based on the notion of proof) and semantic aspects (based on the notion of truth) of a relevant logic calculus. In order to be acceptable, the calculus must be sound (provability implies truth) and complete (truth implies provability). These issues have successfully been addressed for fuzzy logic in the narrow sense by Hájek [1998]. In a broad sense, fuzzy logic is viewed as a system of concepts, principles, and methods for dealing with modes of reasoning that are approximate rather than exact. The two meanings are connected since the very purpose of research on fuzzy logic in the narrow sense is to provide fuzzy logic in the broad sense with sound foundations. In this book, we are concerned only with fuzzy logic in the broad sense, which is surveyed in Chapter 2, and its role in geology, which is the subject of Chapters 3–10. From the standpoint of science, as it is still predominantly understood, the ideas of a fuzzy set and a fuzzy proposition are extremely radical. When accepted, one has to give up classical bivalent logic, generally presumed to be the principal pillar of science.
Instead, we obtain a logic in which propositions are not required to be either true or false, but may be true or false to different degrees. As a consequence, some laws of bivalent logic no longer hold, such as the law of excluded middle or the law of contradiction. At first sight, this seems to be at odds with the very purpose of science. However, this is not the case. There are at least the following four reasons why allowing membership degrees in sets and degrees of truth in propositions in fact enhances scientific methodology quite considerably: 1. Fuzzy sets and fuzzy propositions possess far greater capabilities than their classical counterparts to capture irreducible measurement uncertainties in their various manifestations. As a consequence, their use improves the bridge between mathematical models and the associated physical reality considerably. It is paradoxical that, in the face of the inevitable measurement errors, fuzzy data are always more accurate than their crisp (i.e., nonfuzzy) counterparts. Crisp data of each variable are based on a partition of the state set of the variable. The coarseness of this partition is determined by the resolution power of the measuring instrument employed. Measurements falling into the same block of the partition are not distinguished in crisp data, regardless of their position within the block. Thus, for
example, a measurement that is at the mid-point of the block is not distinguished from those at the borders with adjacent blocks. While the former is uncertainty free, provided that the block is sufficiently large relative to the resolution power of the measuring instrument employed, the latter involves considerable uncertainty due to the inevitability of measurement errors. This fundamental distinction is not captured at all in crisp data. On the contrary, fuzzy data can capture this and other measurement distinctions of this kind in terms of distinct membership degrees. Fuzzy data are thus more accurate than crisp data in this sense. Membership degrees that accompany fuzzy data express indirectly pertinent measurement uncertainties. When fuzzy data are processed, the membership degrees are processed as well. This implies that any results obtained by this processing are again more accurate (in the empirical sense) than their counterparts obtained by processing the less accurate crisp data. 2. An important feature of fuzzy logic in the broad sense is its capability to capture the vagueness of linguistic terms in statements that are expressed in natural languages. Vagueness of a symbol (a linguistic term) in a given language results from the existence of objects for which it is intrinsically impossible to decide whether the symbol does or does not apply to them according to linguistic habits of some speech community using the language. That is, vagueness is a kind of uncertainty that does not result from information deficiency, but rather from imprecise meanings of linguistic terms, which are particularly abundant in natural languages. Classical set theory and classical bivalent logic are not capable of expressing the imprecision in meanings of vague terms. Hence, propositions in natural language that contain vague terms were traditionally viewed as unscientific. However, this view is extremely restrictive. 
As has increasingly been recognized in many areas of science, including especially geology, natural language is often the only way in which meaningful knowledge can be expressed. 3. Fuzzy sets and fuzzy propositions are powerful tools for managing complexity and controlling computational cost. This is primarily due to granulation of systems variables, which is a fuzzy counterpart of the classical quantization of variables. In quantization, states of a given variable are grouped into subsets (quanta) that are pairwise disjoint. In granulation, they are grouped into suitable fuzzy subsets (granules). The aim of both quantization and granulation is to make precision compatible with a given task. The advantage of granulation is that, contrary to quantization, it allows us to express gradual transitions from each granule to its neighbors. In quantization, the transition from one quantum to another is always abrupt and, hence, rather superficial. Granulation is thus a better way than quantization to adjust precision of systems as needed. 4. The apparatus of fuzzy set theory and fuzzy logic enhances our capabilities of modeling human common-sense reasoning, decision-making, and other aspects of human cognition. These capabilities are essential for acquiring knowledge from human experts, for representing and manipulating knowledge in expert
systems in a human-like manner, and, generally, for designing and building human-friendly machines with high intelligence. Fuzzy sets and fuzzy propositions are also essential for studying human reasoning, decision making, and acting that are based on perceptions rather than measurements. It is the synergy of all these capabilities that has made fuzzy set theory and fuzzy logic highly successful in many engineering applications over the last two decades or so. The most visible of these applications have been in the area of control, ranging from simple control systems in consumer products (intelligent washing machines, vacuum cleaners, camcorders, etc.) to highly challenging control systems, such as the one for controlling a pilotless helicopter via wireless communication of commands expressed in natural language. Less visible but equally successful applications have been demonstrated in the areas of database and information retrieval systems, expert systems, decision making, pattern recognition and clustering, image processing and computer vision, manufacturing, robotics, transportation, risk and reliability analyses, and many other engineering areas. In fact, every field of engineering has already been positively affected, in one way or another, by fuzzy set theory and fuzzy logic [Ruspini et al., 1998]. In science, applications of fuzzy set theory and fuzzy logic have developed at a considerably slower pace than in engineering and only in some areas of science thus far. This is understandable if we realize how extremely radical the ideas of fuzzy sets and fuzzy propositions actually are. Nevertheless, successful applications have already been demonstrated in many areas of science. 
Examples are applications in quantum physics [Pykacz, 1993; Cattaneo, 1993], chemistry [Rouvray, 1997], biology [Von Sternberg & Klir, 1998], geography [Gale, 1972], ecology [Libelli & Cianchi, 1996], linguistics [Rieger, 2001], economics [Billot, 1992], psychology [Zétényi, 1988], and social sciences [Smithson, 1987]. In geology, the utility of fuzzy set theory and fuzzy logic was recognized, by and large, only in the late 1990s, but the number of publications dealing with applications of fuzzy logic in geology is already substantial and is growing fast (Chapter 4). This is a clear indicator that the use of fuzzy logic in geology has a great potential. Our motivation for publishing this book is to help to develop this potential. It is important to realize that fuzzy set theory and fuzzy logic are not only tools that help us to deal with some difficult problems in science, engineering, and other professional areas, but they also provide us with a conceptual framework for a radically new way of thinking. Sharp boundaries of classical sets and absolute truths or falsities of classical propositions are still possible under the new thinking, when justifiable, but they are viewed as limiting cases rather than the only possibilities. Thinking in absolute terms is replaced with thinking in relative terms. Everything becomes a matter of degree. This change in our thinking will undoubtedly open new, more refined ways of looking at old issues of epistemology, ethics, law, social policy, and other areas that affect our lives.
The emergence of fuzzy set theory and fuzzy logic and their impact on mathematics and logic as well as on science and science-dependent areas of human affairs possess all distinctive features that are characteristic of a paradigm shift, as introduced in the highly influential book by Thomas Kuhn [1962]. Since logic is fundamental to virtually all branches of mathematics as well as science, this paradigm shift has much broader implications than those generally recognized in the history of science and mathematics, each of which affects only a particular area of science or mathematics. It is thus appropriate to refer to it as a “grand paradigm shift.” Various special characteristics of this paradigm shift, which is still ongoing, are discussed by Klir [1995, 1997, 2000]. It is generally agreed that this paradigm shift was initiated by the publication of the seminal paper by Zadeh [1965]. However, many ideas pertaining to fuzzy logic had appeared in the literature prior to the publication of that paper. Unfortunately, these ideas were by and large ignored at that time [Klir, 2001]. The purpose of this book is threefold: (i) to examine how fuzzy logic has already been applied in some areas of geology; (ii) to stimulate the development of applications of fuzzy logic in other areas of geology; and (iii) to stimulate the use of additional tools of fuzzy logic in geology. Material covered in Chapters 2–10 was carefully selected to accomplish this purpose. The following is a brief preview of this material. Chapter 2 is an overview of fuzzy logic in the broad sense. It is written as a tutorial for those readers who are not familiar with fuzzy logic. This chapter covers not only those components of fuzzy logic that are employed in subsequent chapters, but also some additional ones which offer new application possibilities for geology. Moreover, this chapter introduces terminology and notation that are followed consistently throughout the whole book. 
The aim of Chapter 3 is twofold: (i) to discuss reasons for using fuzzy logic in geology; and (ii) to illustrate the use of fuzzy logic in geology by simple examples. For geologists, some of the notions of fuzzy logic introduced in Chapter 2 are further discussed in terms of simple geological interpretations. For researchers in fuzzy logic, the chapter is a sort of tutorial which introduces them to some issues that are of concern to geology. Chapter 4 is a comprehensive overview of currently known applications of fuzzy logic in geology. It is primarily an annotated bibliography that is grouped into the following nine categories: (1) surface hydrology; (2) subsurface hydrology; (3) groundwater risk assessment; (4) geotechnical engineering; (5) hydrocarbon exploration; (6) seismology; (7) soil science and landscape development; (8) deposition of sediments; and (9) miscellaneous applications. In addition, the role of fuzzy logic within the broader area of soft computing is briefly characterized. The aim of this chapter is to provide readers with a useful resource for further study of established applications of fuzzy logic in geology, sometimes in the broader context of soft computing. Each of the remaining six chapters of this book covers in greater depth applications of fuzzy logic in some specific area of geology. The utility of fuzzy logic to
stratigraphic modeling is demonstrated in Chapter 5 via several case studies. The chapter describes two-dimensional and three-dimensional stratigraphic simulations that use fuzzy logic to model sediment production, sediment erosion, sediment transport, and sediment deposition. It is shown that fuzzy logic offers a robust, easily adaptable, and computationally efficient alternative to the traditional numerical solution of complex, coupled differential equations commonly used to model sediment dispersal in stratigraphic models. Chapter 6 examines the utility of fuzzy logic in hydrology and water resources. These are areas of geology where applications of fuzzy logic are well established. After the various applications of fuzzy logic in these areas are surveyed, one major area of hydrology is chosen to describe the use of fuzzy logic in detail: the area of hydro-climatic modeling of hydrological extremes (i.e., droughts and intensive precipitation). Results over four regions (Arizona, Nebraska, Germany, and Hungary) and under three different climates (semiarid, dry, and wet continental) suggest that the use of fuzzy logic is successful in predicting statistical properties of monthly precipitation and drought index from the joint forcing of macrocirculation patterns and ENSO information. The purpose of Chapter 7 is to present formal concept analysis of fuzzy data and to explore its prospective applications in geology and paleontology. Formal concept analysis is concerned with analyzing data in terms of objects and their attributes. It is capable of answering questions such as: (i) What are the natural concepts that are hidden in the object-attribute data (e.g., important classes of organisms, minerals, or fossils)?; or (ii) What are the dependencies that are implicit in the object-attribute data? Fuzzified formal concept analysis, which is a relatively new methodological tool, is described in detail in the chapter and is illustrated by an example from paleontology. 
Chapter 8 is a comprehensive overview of the role of fuzzy logic in seismology and some closely related areas. Basic terminology of seismology is introduced to help readers who are not familiar with this area of geology. The focus in the chapter is on applications of fuzzy logic and other areas of fuzzy mathematics to earthquake prediction, assessment of earthquake intensity, assessment of earthquake damage, and study of the relationship between isoseismal area and earthquake magnitude. The last two chapters of the book explore some new ideas emerging from fuzzy logic that can be applied to a broad range of geological problems. These chapters require some mathematical sophistication, but they are self-contained in the sense that the reader is provided with the relevant preliminaries and specific examples of applications. Chapter 9 describes a new numerical technique—fuzzy transformation—that allows complex functions to be approximated to a high order. Moreover, useful manipulations (such as numerical integration) are, in a number of cases, easier for the transformed expressions than for the originals. This technique is then applied to a solution of an ordinary differential equation used to model long-term reef growth under a variable sea level regime. Chapter 10 provides an example of how fuzzy logic can mathematically formalize what heretofore were primarily only linguistic
descriptions and interpretations of geologic phenomena. In this case, a computer program using specialized fuzzy-set based “evaluating expressions” is taught to mimic the linguistic geologic “rules” for both the division of Paleozoic measured sections of limestone into a hierarchy of different cycles, and the interpretation of those cycles in terms of ancient sea level.
References
Billot, A. [1992], Economic Theory of Fuzzy Equilibria. Springer-Verlag, New York.
Cattaneo, G. [1993], “Fuzzy quantum logic II: The logics of unsharp quantum mechanics.” International Journal of Theoretical Physics, 32(10), 1709–1734.
Choquet, G. [1953–54], “Theory of capacities.” Annales de L’Institut Fourier, 5, 131–295.
Gale, S. [1972], “Inexactness, fuzzy sets and the foundations of behavioral geography.” Geographical Analysis, 4, 337–349.
Hájek, P. [1998], Metamathematics of Fuzzy Logic. Kluwer, Boston, MA.
Halmos, P. R. [1950], Measure Theory. Van Nostrand, Princeton, NJ.
Klir, G. J. [1995], “From classical sets to fuzzy sets: a grand paradigm shift.” In: Wang, P. P. (ed.), Advances in Fuzzy Theory and Technology, Vol. III, pp. 3–30. Duke University, Durham, NC.
Klir, G. J. [1997], “From classical mathematics to fuzzy mathematics: emergence of a new paradigm for theoretical science.” In: Rouvray, D. H. (ed.), Fuzzy Logic in Chemistry, pp. 31–63. Academic Press, San Diego, CA.
Klir, G. J. [2000], Fuzzy Sets: An Overview of Fundamentals, Applications, and Personal Views. Beijing Normal University Press, Beijing, China.
Klir, G. J. [2001], “Foundations of fuzzy set theory and fuzzy logic: A historical overview.” International Journal of General Systems, 30(2), 91–132.
Klir, G. J. [2002], “Uncertainty-based information.” In: Melo-Pinto and H.-N. Teodorescu (eds.), Systemic Organisation of Information in Fuzzy Systems, pp. 21–52. IOS Press, Amsterdam.
Klir, G. J., & Elias, D. [2003], Architecture of Systems Problem Solving (2nd edition). Kluwer/Plenum, New York.
Klir, G. J., & Rozehnal, I. [1996], “Epistemological categories of systems: an overview.” International Journal of General Systems, 24(1–2), 207–224.
Klir, G. J., & Wierman, M. J. [1999], Uncertainty-Based Information: Elements of Generalized Information Theory (2nd edition). Physica-Verlag/Springer-Verlag, Heidelberg and New York.
Kuhn, T. S. [1962], The Structure of Scientific Revolutions. University of Chicago Press, Chicago, IL.
Libelli, S. M., & Cianchi, P. [1996], “Fuzzy ecological models.” In: Pedrycz, W. (ed.), Fuzzy Modelling Paradigms and Practice, pp. 141–164. Kluwer, Boston, MA.
Pykacz, J. [1993], “Fuzzy quantum logic I.” International Journal of Theoretical Physics, 32(10), 1691–1707.
Rieger, B. B. [2001], “Computing granular word meanings: A fuzzy linguistic approach in computational semiotics.” In: Wang, P. P. (ed.), Computing with Words, pp. 147–208. John Wiley, New York.
Rouvray, D. H. (ed.) [1997], Fuzzy Logic in Chemistry. Academic Press, San Diego, CA.
Ruspini, E. H., Bonissone, P. P., & Pedrycz, W. (eds.) [1998], Handbook of Fuzzy Computation. Institute of Physics Publishing, Bristol (UK) and Philadelphia, PA.
Smithson, M. [1987], Fuzzy Set Analysis for Behavioral and Social Sciences. Springer-Verlag, New York.
Von Sternberg, R., & Klir, G. J. [1998], “Generative archetypes and taxa: A fuzzy set formalization.” Biology Forum, 91, 403–424.
Wang, Z., & Klir, G. J. [1992], Fuzzy Measure Theory. Plenum Press, New York.
Zadeh, L. A. [1965], “Fuzzy sets.” Information and Control, 8(3), 338–353.
Zétényi, T. (ed.) [1988], Fuzzy Sets in Psychology. North-Holland, Amsterdam and New York.
Chapter 2
Fuzzy Logic: A Specialized Tutorial
George J. Klir
2.1 Introduction
2.2 Basic Concepts of Fuzzy Sets
2.3 Operations on Fuzzy Sets
2.3.1 Modifiers
2.3.2 Complements
2.3.3 Intersections and unions
2.3.4 Averaging operations
2.3.5 Arithmetic operations
2.4 Fuzzy Relations
2.4.1 Projections, cylindric extensions, and cylindric closures
2.4.2 Inverses, compositions, and joins
2.4.3 Fuzzy relation equations
2.4.4 Fuzzy relations on a single set
2.5 Fuzzy Logic
2.5.1 Basic types of propositional forms
2.5.2 Approximate reasoning
2.6 Possibility Theory
2.7 Fuzzy Systems
2.8 Constructing Fuzzy Sets and Operations
2.9 Nonstandard Fuzzy Sets
2.10 Principal Sources for Further Study
References
2.1 Introduction
The term “fuzzy logic,” as currently used in the literature, has two distinct meanings. In the narrow sense, it is viewed as a generalization of the various many-valued logics that have been investigated in the area of mathematical logic since the beginning of the 20th century. An excellent historical overview of the emergence and development of many-valued logics was prepared by Rescher [1969]; the various issues
involved in generalizing many-valued logics into fuzzy logic are thoroughly covered in monographs by Hájek [1998] and Novák et al. [1999]. In the alternative, broad sense, fuzzy logic is viewed as a system of concepts, principles, and methods for dealing with modes of reasoning that are approximate rather than exact [Novák & Perfilieva, 2000]. In this book, we are interested in fuzzy logic only in this broad sense. In this sense, fuzzy logic is based upon fuzzy set theory. It utilizes the apparatus of fuzzy set theory for formulating various forms of sound approximate reasoning in natural language. It is thus essential to begin our tutorial with an overview of basic concepts of fuzzy set theory.
Fuzzy set theory, introduced by Zadeh [1965], is an outgrowth of classical set theory. Contrary to the classical concept of a set, or crisp set, the boundary of a fuzzy set is not precise. That is, the change from nonmembership to membership in a fuzzy set may be gradual rather than abrupt. This gradual change is expressed by a membership function, which completely and uniquely characterizes a particular fuzzy set. Every geologist is familiar with the terms clay, silt, sand, and gravel, which are used to describe the “size” of sedimentary particles (Figure 2.1a). As they are most commonly used, these terms stand for crisp sets, insofar as a grain can belong to only one size grade at a time. Thus, in the traditional “pigeon hole” view of grain sizes, a spherical grain with a diameter of 1.999 mm would be sand whereas a grain 2.001 mm in diameter would be gravel. An alternative representation of the crisp set “sand” would be to assign a value of 1 to grain diameters that are members of the set “sand” (the domain interval (0.0625–2] mm) and a 0 to grain diameters that are not sand. In contrast,
Figure 2.1 Comparison of crisp-set (a) versus fuzzy-set (b) representation of the geologic variable “grain size.”
one possible representation of the sedimentary size terms clay, silt, sand, and gravel with fuzzy sets is shown in Figure 2.1b. In a fuzzy-set representation, the degree of membership in a given set (e.g., “sand”) is not limited to 0 or 1 but can take any value in the closed interval [0, 1]. Our hypothetical 1.999 and 2.001 mm diameter grains are simultaneously members of both sets, sand and gravel, to a degree of about 0.5. The simple trapezoids represent the membership functions. Two distinct notations are most commonly employed in the literature to denote membership functions. In one of them, the membership function of a fuzzy set A is denoted by $\mu_A$ and usually has the form
\[ \mu_A : X \to [0, 1], \tag{2.1} \]
where X denotes the universal set under consideration and A is a label of the fuzzy set defined by this function. The universal set is always assumed to be a crisp set. For each x ∈ X, the value $\mu_A(x)$ expresses the degree (or grade) of membership of element x of X in fuzzy set A. In the second notation, the symbol A of a fuzzy set is also used to denote the membership function of A. However, no ambiguity results from this double use of the same symbol since each fuzzy set is completely and uniquely defined by one particular membership function. That is, A(x) in the second notation has the same meaning as $\mu_A(x)$ in the first notation; (2.1) is thus written in the second notation as
\[ A : X \to [0, 1]. \tag{2.2} \]
In this book, the second notation is adopted. It is simpler and, by and large, more popular in current literature on fuzzy set theory. Classical (crisp) sets may be viewed from the standpoint of fuzzy set theory as special fuzzy sets, in which A(x) is either 0 or 1 for each x ∈ X. Hence, we use the same notation for fuzzy sets and crisp sets. Fuzzy sets whose membership functions have the form (2.2), which are called standard fuzzy sets, do not capture the full variety of fuzzy sets. Since standard fuzzy sets are currently predominant in the literature, this tutorial is largely devoted to them. However, basic properties of several nonstandard types of fuzzy sets, whose importance in some applications has lately been recognized, are introduced in Section 2.9. Additional examples of membership functions are shown in Figure 2.2. These functions may be considered as candidates for representing the meaning of the linguistic expression “around 3” in the context of a given application. The width of each of these functions is, of course, strongly dependent on the application context. In general, a membership function that is supposed to capture the intended meaning of a linguistic expression in the context of a particular application must be somehow constructed. This issue is discussed in Section 2.8.
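The contrast between the crisp and the fuzzy representation of “sand” in Figure 2.1 can be sketched in a few lines of Python. This is only an illustration: the class limits 0.0625 mm and 2 mm come from the text, but the breakpoints of the trapezoids and all function names are assumptions, chosen so that a 2 mm grain belongs to both “sand” and “gravel” to a degree of 0.5. They are not read from Figure 2.1b.

```python
def crisp_sand(d_mm):
    """Characteristic function of the crisp set "sand": the interval (0.0625, 2] mm."""
    return 1.0 if 0.0625 < d_mm <= 2.0 else 0.0


def trapezoid(x, a, b, c, d):
    """Trapezoidal membership function: 0 outside (a, d), 1 on [b, c], linear ramps between."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


def fuzzy_sand(d_mm):
    # Hypothetical breakpoints in mm (an assumption, not taken from the figure).
    return trapezoid(d_mm, 0.03125, 0.125, 1.0, 3.0)


def fuzzy_gravel(d_mm):
    # Hypothetical breakpoints; the upper ones are pushed far out so that
    # coarse grains remain fully "gravel" in this sketch.
    return trapezoid(d_mm, 1.0, 3.0, 1e6, 2e6)


for d in (1.999, 2.001):
    print(d, crisp_sand(d), round(fuzzy_sand(d), 3), round(fuzzy_gravel(d), 3))
```

With these assumed breakpoints, the crisp function jumps from 1 to 0 between the two grains, whereas both fuzzy degrees change by only about 0.001.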
Figure 2.2 Possible shapes of membership functions whose purpose is to capture the meaning of the linguistic expression “around 3” in the context of a given application.
2.2 Basic Concepts of Fuzzy Sets
Given two fuzzy sets A, B defined on the same universal set X, A is said to be a subset of B if and only if A(x) ≤ B(x) for all x ∈ X. The usual notation, A ⊆ B, is used to signify the subsethood relation. The set of all fuzzy subsets of X is called the fuzzy power set of X and is denoted by $\mathcal{F}(X)$. Observe that this set is crisp, even though its members are fuzzy sets. Moreover, this set is always infinite, even if X is finite. It is also useful to define a degree of subsethood, SUB(A, B), of A in B. When the sets are defined on a finite universal set X, we have
\[ \mathrm{SUB}(A, B) = \frac{\sum_{x \in X} A(x) - \sum_{x \in X} \max[0, A(x) - B(x)]}{\sum_{x \in X} A(x)}. \tag{2.3} \]
The negative term in the numerator describes the sum of the degrees to which the subset inequality A(x) ≤ B(x) is violated, the positive term describes the largest possible violation of the inequality, the difference in the numerator describes the sum of the degrees to which the inequality is not violated, and the term in the denominator is a normalizing factor to obtain the range 0 ≤ SUB(A, B) ≤ 1. When sets A and B are defined on a bounded subset of real numbers (i.e., X is a closed interval of real numbers), the three terms in (2.3) are replaced with integrals over X. For any fuzzy set A defined on a finite universal set X, its scalar cardinality, |A|, is defined by the formula
\[ |A| = \sum_{x \in X} A(x). \]
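Equation (2.3) and the scalar cardinality are easy to evaluate on a finite universal set. The following is a minimal sketch, assuming that a fuzzy set is stored as a Python dictionary from elements of X to membership degrees and that both sets use the same keys. The function names, the sample sets, and the decision to raise an error when |A| = 0 are assumptions of the sketch.

```python
def scalar_cardinality(A):
    """Scalar cardinality |A|: the sum of all membership degrees of A."""
    return sum(A.values())


def subsethood(A, B):
    """Degree of subsethood SUB(A, B) from Equation (2.3) on a finite universal set.

    A and B are dicts over the same keys (the elements of X).
    """
    total = scalar_cardinality(A)
    if total == 0:
        # Equation (2.3) would divide by zero; how to treat this case is
        # not stated in the text, so the sketch simply refuses it.
        raise ValueError("SUB(A, B) is undefined when |A| = 0")
    violation = sum(max(0.0, A[x] - B[x]) for x in A)
    return (total - violation) / total


# Hypothetical sets on X = {"a", "b", "c"}.
A = {"a": 0.25, "b": 0.5, "c": 1.0}
B = {"a": 0.5, "b": 0.5, "c": 0.5}
C = {"a": 0.25, "b": 0.25, "c": 0.5}

print(scalar_cardinality(A))  # 1.75
print(subsethood(A, B))       # A(c) exceeds B(c), so the degree is below 1
print(subsethood(C, B))       # C(x) <= B(x) everywhere, so the degree is 1.0
```

Note that SUB is not symmetric: SUB(A, B) and SUB(B, A) generally differ.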
Scalar cardinality is sometimes referred to in the literature as a sigma count. Among the most important concepts of standard fuzzy sets are the concepts of an α-cut and a strong α-cut. Given a fuzzy set A defined on X and a particular number α in the unit interval [0, 1], the α-cut of A, denoted by ${}^{\alpha}A$, is a crisp set that consists of all elements of X whose membership degrees in A are greater than or equal to α. This can formally be written as
\[ {}^{\alpha}A = \{x \mid A(x) \ge \alpha\}. \]
The strong α-cut, ${}^{\alpha+}A$, has a similar meaning, but the condition “greater than or equal to” is replaced with the stronger condition “greater than.” Formally,
\[ {}^{\alpha+}A = \{x \mid A(x) > \alpha\}. \]
The set ${}^{0+}A$ is called the support of A and the set ${}^{1}A$ is called the core of A. When the core of A is not empty, A is called normal; otherwise, it is called subnormal. The largest value of A is called the height of A and is denoted by $h_A$. The set of distinct values A(x) for all x ∈ X is called the level set of A and is denoted by $\Lambda_A$.
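These definitions translate directly into code when the universal set is finite. The sketch below again assumes a dictionary representation of a fuzzy set; the function names and the sample set (a coarse sampling of a triangular “around 3” function) are hypothetical.

```python
def alpha_cut(A, alpha, strong=False):
    """Crisp set of elements with degree >= alpha (or > alpha for the strong cut)."""
    if strong:
        return {x for x, degree in A.items() if degree > alpha}
    return {x for x, degree in A.items() if degree >= alpha}


def support(A):
    """Strong 0-cut: all elements with a nonzero membership degree."""
    return alpha_cut(A, 0.0, strong=True)


def core(A):
    """1-cut: all elements that belong to A to degree 1."""
    return alpha_cut(A, 1.0)


def height(A):
    """Largest membership degree attained in A."""
    return max(A.values())


def level_set(A):
    """Set of distinct membership degrees that occur in A."""
    return set(A.values())


def is_normal(A):
    """A is normal when its core is not empty."""
    return len(core(A)) > 0


# Hypothetical fuzzy set on X = {1, 2, 3, 4, 5}.
A = {1: 0.0, 2: 0.5, 3: 1.0, 4: 0.5, 5: 0.0}

print(alpha_cut(A, 0.5))               # {2, 3, 4}
print(alpha_cut(A, 0.5, strong=True))  # {3}
print(support(A), core(A), height(A), level_set(A), is_normal(A))
```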
Figure 2.3 Illustration of some basic characteristics of fuzzy sets.
All the introduced concepts are illustrated in Figure 2.3. We can see that
\[ {}^{\alpha_1}A \subseteq {}^{\alpha_2}A \quad \text{and} \quad {}^{\alpha_1+}A \subseteq {}^{\alpha_2+}A \]
when $\alpha_1 \ge \alpha_2$. This implies that the set of all distinct α-cuts (as well as strong α-cuts) is always a nested family of crisp sets. When α is increased, the new α-cut (strong α-cut) is always a subset of the previous one. Clearly, ${}^{0}A = X$ and ${}^{1+}A = \emptyset$. It is well established [Klir & Yuan, 1995] that each fuzzy set is uniquely represented by the associated family of its α-cuts via the formula
\[ A(x) = \sup\{\alpha \cdot {}^{\alpha}A(x) \mid \alpha \in [0, 1]\}, \tag{2.4} \]
or by the associated family of its strong α-cuts via the formula
\[ A(x) = \sup\{\alpha \cdot {}^{\alpha+}A(x) \mid \alpha \in [0, 1]\}, \tag{2.5} \]
where sup denotes the supremum of the respective set and ${}^{\alpha}A$ (or ${}^{\alpha+}A$) denotes for each α ∈ [0, 1] the special membership function (characteristic function) of the α-cut (or strong α-cut, respectively). The significance of the α-cut (or strong α-cut) representation of fuzzy sets is that it connects fuzzy sets with crisp sets. While each crisp set is a collection of objects that are conceived as a whole, each fuzzy set is a collection of nested crisp sets that are also conceived as a whole. Fuzzy sets are thus wholes of a higher category. The α-cut representation of fuzzy sets allows us to extend the various properties of crisp sets, established in classical set theory, into their fuzzy counterparts. This is accomplished by requiring that the classical property be satisfied by all α-cuts of the fuzzy set concerned. Any property that is extended in this way from classical
set theory into the domain of fuzzy set theory is called a cutworthy property. For example, when convexity of fuzzy sets is defined by the requirement that all α-cuts of a fuzzy convex set be convex in the classical sense, this conception of fuzzy convexity is cutworthy. Other important examples are the concepts of a fuzzy partition, fuzzy equivalence, fuzzy compatibility, and various kinds of fuzzy orderings that are cutworthy (Section 2.4). It is important to realize that many (perhaps most) properties of fuzzy sets, perfectly meaningful and useful, are not cutworthy. These properties cannot be derived from classical set theory.
Another way of connecting classical set theory and fuzzy set theory is to fuzzify functions. Given a function
\[ f : X \to Y, \]
where X and Y are crisp sets, we say that the function is fuzzified when it is extended to act on fuzzy sets defined on X and Y. That is, the fuzzified function maps, in general, fuzzy sets defined on X to fuzzy sets defined on Y. Formally, the fuzzified function, F, has the form
\[ F : \mathcal{F}(X) \to \mathcal{F}(Y), \]
where $\mathcal{F}(X)$ and $\mathcal{F}(Y)$ denote the fuzzy power set (the set of all fuzzy subsets) of X and Y, respectively. To qualify as a fuzzified version of f, function F must conform to f within the extended domain $\mathcal{F}(X)$ and $\mathcal{F}(Y)$. This is guaranteed when a principle is employed that is called an extension principle. According to this principle, B = F(A) is determined for any given fuzzy set $A \in \mathcal{F}(X)$ via the formula
\[ B(y) = \max_{x \mid y = f(x)} A(x) \tag{2.6} \]
for all y ∈ Y. Clearly, when the maximum in (2.6) does not exist, it is replaced with the supremum. The inverse function $F^{-1} : \mathcal{F}(Y) \to \mathcal{F}(X)$ of F is defined, according to the extension principle, for any given $B \in \mathcal{F}(Y)$, by the formula
\[ [F^{-1}(B)](x) = B(y) \tag{2.7} \]
for all x ∈ X, where y = f(x). Clearly,
\[ F^{-1}[F(A)] \supseteq A \]
Figure 2.4 Illustration of the extension principle.
for all $A \in \mathcal{F}(X)$, where the equality is obtained when f is a one-to-one function. The use of the extension principle is illustrated in Figure 2.4, where it is shown how fuzzy set A is mapped to fuzzy set B via function F that is consistent with the given function f. That is, B = F(A). For example, since $b = f(a_1) = f(a_2) = f(a_3)$, we have
\[ B(b) = \max[A(a_1), A(a_2), A(a_3)] \]
by Equation (2.6). Conversely,
\[ [F^{-1}(B)](a_1) = [F^{-1}(B)](a_2) = [F^{-1}(B)](a_3) = B(b) \]
by (2.7). The introduced extension principle, by which functions are fuzzified, is basically described by Equations (2.6) and (2.7). These equations are direct generalizations of similar equations describing the extension principle of classical set theory. In the latter, symbols A and B denote characteristic functions of crisp sets.
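Both bridges between crisp and fuzzy sets described in this section, the α-cut representation (2.4) and the extension principle (2.6) and (2.7), can be checked numerically on a finite universal set. The sketch below assumes a dictionary representation of fuzzy sets. The function names, the sample function f(x) = x², and the convention that B(y) = 0 when no x maps to y are assumptions of the sketch.

```python
def alpha_cut(A, alpha):
    """Crisp alpha-cut of A: elements with membership degree >= alpha."""
    return {x for x, degree in A.items() if degree >= alpha}


def from_alpha_cuts(A):
    """Rebuild A via Equation (2.4): A(x) = sup of alpha * [x in alpha-cut of A].

    On a finite set only the alphas in the level set of A change the cut,
    so the supremum reduces to a maximum over those values.
    """
    cuts = {alpha: alpha_cut(A, alpha) for alpha in set(A.values())}
    return {
        x: max(alpha * (1.0 if x in cut else 0.0) for alpha, cut in cuts.items())
        for x in A
    }


def extend(f, A, Y):
    """Fuzzified function from Equation (2.6): B(y) = max of A(x) over x with f(x) = y.

    Assumes f maps every element of A into Y.
    """
    B = {y: 0.0 for y in Y}  # assumed convention: degree 0 if no x maps to y
    for x, degree in A.items():
        y = f(x)
        B[y] = max(B[y], degree)
    return B


def extend_inverse(f, B, X):
    """Inverse from Equation (2.7): [F^-1(B)](x) = B(f(x))."""
    return {x: B[f(x)] for x in X}


def square(x):
    return x * x


# Hypothetical example: X = {-2, ..., 2}, f(x) = x**2, Y = {0, 1, 4}.
A = {-2: 0.25, -1: 0.5, 0: 1.0, 1: 0.75, 2: 0.0}
B = extend(square, A, {0, 1, 4})
A_back = extend_inverse(square, B, A.keys())

print(from_alpha_cuts(A) == A)  # the alpha-cuts reproduce A
print(B)                        # B(1) = max(A(-1), A(1)), B(4) = max(A(-2), A(2))
print(A_back)                   # contains A, but differs because f is not one-to-one
```

Because f(x) = x² is not one-to-one here, the inclusion F⁻¹[F(A)] ⊇ A is strict: for instance, A(2) = 0 but the recovered degree at x = 2 is 0.25.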
2.3 Operations on Fuzzy Sets
Operations on fuzzy sets possess a considerably greater variety than those on classical sets. In fact, most operations on fuzzy sets do not have any counterparts in classical set theory. The following five types of operations on fuzzy sets are currently recognized:
(a) modifiers;
(b) complements;
(c) intersections;
(d) unions;
(e) averaging operations.
Modifiers and complements operate on one fuzzy set. Intersections and unions operate on two fuzzy sets, but their application can be extended to any number of fuzzy sets via their property of associativity. The averaging operations, which are not associative, operate, in general, on n fuzzy sets (n ≥ 2). In addition to these five types of operations, special fuzzy sets referred to as fuzzy intervals are also subject to arithmetic operations. As can be seen from this overall characterization of operations on fuzzy sets, this subject is very extensive. It is also a subject that has been investigated by many researchers, and that is now quite well developed. Due to the enormous scope of the subject, we are able to present in this section only a very brief characterization of each of the introduced types of operations, but we provide the reader with ample references for further study.
2.3.1 Modifiers
Modifiers are unary operations whose primary purpose is to modify fuzzy sets to account for linguistic hedges, such as very, fairly, extremely, moderately, etc., in representing expressions of natural language. Each modifier, m, is an increasing (and usually continuous) one-to-one function of the form
\[ m : [0, 1] \to [0, 1], \]
which assigns to each membership grade A(x) of a given fuzzy set A a modified grade m(A(x)). The modified grades for all x ∈ X define a new, modified fuzzy set. Conveniently denoting this modified set by MA, we have
\[ MA(x) = m(A(x)). \]
Observe that function m is totally independent of elements x to which values A(x) are assigned; it depends only on the values themselves. In describing its formal properties, we may thus ignore x and assume that the argument of m is an arbitrary number a in the unit interval [0, 1].
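The pointwise character of a modifier can be sketched as follows. The sketch assumes a power function m(a) = a**lam as one possible modifier and a triangular membership function peaking at 3 as the set being modified. These choices, the exponents 2 and 0.5, and all names are illustrative assumptions, not a recommendation for any particular hedge.

```python
def close_to_3(x):
    """Triangular membership function peaking at 3 (an assumed example set)."""
    if 1 <= x <= 3:
        return (x - 1) / 2
    if 3 < x <= 5:
        return (5 - x) / 2
    return 0.0


def power_modifier(lam):
    """Return the modifier m(a) = a**lam on [0, 1], for lam > 0."""
    def m(a):
        return a ** lam
    return m


def modify(A, m):
    """Modified set MA, defined pointwise by MA(x) = m(A(x))."""
    return lambda x: m(A(x))


very_close_to_3 = modify(close_to_3, power_modifier(2.0))    # lam > 1 lowers degrees
fairly_close_to_3 = modify(close_to_3, power_modifier(0.5))  # lam < 1 raises degrees

for x in (1.5, 2.0, 2.5, 3.0):
    print(x, close_to_3(x), very_close_to_3(x), round(fairly_close_to_3(x), 3))
```

The modifier only rescales degrees: elements with a larger degree in the original set keep a larger degree in the modified set, and degrees 0 and 1 are left unchanged by this particular family.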
In general, a modifier increases or decreases values of the membership functions to which it is applied, but preserves the order. That is, if a ≤ b then m(a) ≤ m(b) for all a, b ∈ [0, 1] or, recognizing the meaning of a and b, if A(x) ≤ A(y) for some x, y ∈ X, then MA(x) ≤ MA(y). Sometimes, it is also required that m(0) = 0 and m(1) = 1. Modifiers are basically of three types, depending on which values of the membership functions they increase or decrease:
(i) modifiers that increase all values;
(ii) modifiers that decrease all values;
(iii) modifiers that increase some values and decrease other values.
To illustrate these types of modifiers, let us consider the fuzzy set A in Figure 2.2. For each x ∈ R, A is clearly defined by the formula
\[ A(x) = \begin{cases} (x - 1)/2 & \text{when } x \in [1, 3] \\ (5 - x)/2 & \text{when } x \in [3, 5] \\ 0 & \text{otherwise.} \end{cases} \]
Assume that this fuzzy set represents, in a given application context, the linguistic concept “close to 3.” To modify A for representing the concept “very close to 3,” we need to reduce in some way the values of A. This can be done by choosing an appropriate modifier from the class of functions
\[ m_\lambda(a) = a^\lambda, \tag{2.8} \]
where a is the value of A to which $m_\lambda$ is applied and λ is a parameter whose value determines how strongly $m_\lambda$ modifies A. For each value of λ, which must be in this case greater than 1, we obtain a particular modifier. When applying the modifier to A, we obtain a new membership function, $m_\lambda[A(x)]$, a composite of functions A and $m_\lambda$, which for each x ∈ R is defined by the formula
\[ m_\lambda[A(x)] = \begin{cases} [(x - 1)/2]^\lambda & \text{when } x \in [1, 3] \\ [(5 - x)/2]^\lambda & \text{when } x \in [3, 5] \\ 0 & \text{otherwise.} \end{cases} \]
This modified membership function has a shape exemplified by the function labeled as C in Figure 2.2. Its width is determined by the value λ of the chosen modifier: the larger the value, the narrower the function. The proper value of λ must be determined in the context of each particular application. Assume now that we want to modify the same set A for representing the concept “fairly close to 3.” In this case, we need to increase the values of A. This can be done with modifiers of the form (2.8), provided that λ ∈ (0, 1). Applying these modifiers to
A results in a new membership function whose shape is exemplified by the function labeled as F in Figure 2.2. The smaller the value of λ, the wider is the modified membership function. It should be mentioned at this point that (2.8) is given here solely as an example of a possible class of modifiers of fuzzy sets. As is well known, these modifiers do not always properly capture the meaning of linguistic hedges in natural language. A more comprehensive treatment of linguistic hedges is presented in Chapter 10; see also Novák [1989].
2.3.2 Complements
Similarly to modifiers, complements of fuzzy sets may be defined via appropriate unary operations on [0, 1]. While modifiers preserve the order of membership degrees, complements reverse the order. In particular, each fuzzy complement, c, must satisfy at least the following two requirements:
(c1) c(0) = 1 and c(1) = 0;
(c2) for all a, b ∈ [0, 1], if a ≤ b, then c(a) ≥ c(b).
Requirement (c1) guarantees that all fuzzy complements collapse to the unique classical complement for crisp sets. Requirement (c2) guarantees that increases in the degree of membership in A do not result in increases in the degree of membership in the complement of A. This is essential since any increase in the degree of membership of an object in a fuzzy set cannot simultaneously increase the degree of nonmembership of the same object in the same fuzzy set. When used as a fuzzy complement, function c is always applied to membership degrees A(x) of some fuzzy set A. It depends only on the values A(x) and not on the objects x to which the values are assigned. For the purpose of characterizing fuzzy complements, we may thus ignore these objects and observe only how function c depends on numbers in [0, 1]. This is the reason why no reference is made to specific degrees A(x) in the requirements (c1) and (c2).
However, when function c defines a complement of a particular fuzzy set A, we must keep track of the relevant objects x to make the connection between A(x) and c[A(x)]. Although requirements (c1) and (c2) are sufficient to characterize the largest class of acceptable fuzzy complements, two additional requirements are imposed on fuzzy complements by most applications of fuzzy set theory: (c3) c is a continuous function; (c4) c(c(a)) = a for all a ∈ [0, 1]. Requirement (c3) guarantees that infinitesimal changes in the argument do not result in discontinuous changes in the function. Requirement (c4) guarantees that fuzzy sets
are not changed by double complementation. Fuzzy complements that satisfy (c4) are called involutive. A practical class of fuzzy complements that satisfy requirements (c1)–(c4) is defined for each a ∈ [0, 1] by the formula
$$c_\lambda(a) = (1 - a^\lambda)^{1/\lambda}, \tag{2.9}$$
where λ ∈ (0, ∞); it is called the Yager class of fuzzy complements. One particular fuzzy complement is obtained for each value of the parameter λ. The complement obtained for λ = 1, which is called the standard fuzzy complement, is the most common complement in applications of fuzzy set theory. Clearly, the standard fuzzy complement of a fuzzy set A, usually denoted by $\bar{A}$, is defined for each x ∈ X by the equation
$$\bar{A}(x) = 1 - A(x).$$
Other parameter-based formulas for describing classes of fuzzy complements have been proposed in the literature. In fact, some procedures have been developed by which new classes of fuzzy complements can be generated [Klir & Yuan, 1995]. However, this theoretical topic is beyond the scope of this tutorial. To determine the most fitting complement in the context of each particular application is a problem of knowledge acquisition, somewhat similar to the problem of constructing membership functions. Given a class of fuzzy complements, such as the Yager class, the construction problem reduces to the problem of determining the right value of the relevant parameter.
2.3.3 Intersections and unions
Intersections and unions of fuzzy sets, denoted by i and u respectively, are generalizations of the classical operations of intersections and unions of crisp sets. They may be defined via appropriate functions that map each pair of real numbers from [0, 1] (representing degrees A(x) and B(x) of given fuzzy sets A and B for some x ∈ X) into a single number in [0, 1] (representing membership degree (A ∩ B)(x) of the intersection of A and B or membership degree (A ∪ B)(x) of the union of A and B for the given x). Hence, (A ∩ B)(x) = i[A(x), B(x)] and (A ∪ B)(x) = u[A(x), B(x)] for all x ∈ X. To discuss properties of functions i and u, which do not depend on x, we may view i and u as functions from [0, 1] × [0, 1] to [0, 1].
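Returning to the Yager class of complements in (2.9), a short Python sketch may help make requirements (c1), (c2), and (c4) concrete. The names `yager_complement` and `satisfies_requirements`, the grid size, and the tolerance are illustrative assumptions, not part of the text; the checks are numerical spot checks on a finite grid rather than proofs.

```python
def yager_complement(a: float, lam: float) -> float:
    """Yager complement c_lam(a) = (1 - a**lam)**(1/lam), for lam > 0 (formula 2.9)."""
    if not 0.0 <= a <= 1.0:
        raise ValueError("membership degree must lie in [0, 1]")
    if lam <= 0.0:
        raise ValueError("lam must be positive")
    return (1.0 - a ** lam) ** (1.0 / lam)


def satisfies_requirements(lam: float, steps: int = 10, tol: float = 1e-9) -> bool:
    """Spot-check (c1), (c2) and (c4) on an evenly spaced grid (hypothetical helper)."""
    grid = [k / steps for k in range(steps + 1)]
    values = [yager_complement(a, lam) for a in grid]
    # (c1): boundary conditions c(0) = 1 and c(1) = 0
    boundary = values[0] == 1.0 and values[-1] == 0.0
    # (c2): the complement reverses the order of membership degrees
    order_reversing = all(x >= y for x, y in zip(values, values[1:]))
    # (c4): applying the complement twice returns the argument (up to rounding)
    involutive = all(abs(yager_complement(v, lam) - a) <= tol
                     for a, v in zip(grid, values))
    return boundary and order_reversing and involutive


# lam = 1 reproduces the standard complement 1 - a.
standard = [yager_complement(a, 1.0) for a in (0.0, 0.25, 1.0)]
```

Requirement (c3), continuity, is not something a finite grid can establish; for this class it follows from the continuity of the power function.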
Contrary to their classical counterparts, fuzzy intersections and unions are not unique. This is a natural consequence of the well-established fact that the linguistic expressions “x is a member of A and B” and “x is a member of A or B” have different meanings when applied by human beings to different vague concepts in different contexts. To be able to capture the different meanings, we need to characterize the classes of fuzzy intersections and fuzzy unions as broadly as possible. It has been established that operations known in the literature as triangular norms or t-norms and triangular conorms or t-conorms, which have been extensively studied in mathematics, possess exactly those properties that are requisite, on intuitive grounds, for fuzzy intersections and fuzzy unions, respectively. The class of t-norms/fuzzy intersections is characterized by four requirements; the class of t-conorms/fuzzy unions is also characterized by four requirements, three of which are identical with the requirements for t-norms. In the following list, the requirements for t-norms/fuzzy intersections i are paired with their counterparts for t-conorms/fuzzy unions u, and must be satisfied for all a, b, d ∈ [0, 1]:
(i1) i(a, 1) = a (boundary requirement for i);
(u1) u(a, 0) = a (boundary requirement for u);
(i2) b ≤ d implies i(a, b) ≤ i(a, d) (monotonicity);
(u2) b ≤ d implies u(a, b) ≤ u(a, d);
(i3) i(a, b) = i(b, a) (commutativity);
(u3) u(a, b) = u(b, a);
(i4) i(a, i(b, d)) = i(i(a, b), d) (associativity);
(u4) u(a, u(b, d)) = u(u(a, b), d).
It is easy to see that the first three requirements for i ensure that fuzzy intersections collapse to the classical set intersection when applied to crisp sets: i(0, 1) = 0 and i(1, 1) = 1 follow directly from the boundary requirement; i(1, 0) = 0 and i(0, 0) = 0 follow then from commutativity and monotonicity, respectively. Similarly, the first three requirements for u ensure that fuzzy unions collapse to the classical set union when applied to crisp sets. Commutativity requirements ensure that fuzzy intersections and unions are symmetric operations, indifferent to the order in which sets to be combined are considered; together with monotonicity requirements, they guarantee that fuzzy intersections and unions do not decrease when any of their arguments are increased, and do not increase when any arguments are decreased. Associativity requirements allow us to extend fuzzy intersections and unions to more than two sets, in perfect analogy with their classical counterparts. The following are examples of some common fuzzy intersections and fuzzy unions with their usual names (each defined for all a, b ∈ [0, 1]).
Standard fuzzy intersection: i(a, b) = min(a, b)
Algebraic product: i(a, b) = ab
Bounded difference: i(a, b) = max(0, a + b − 1)
Drastic intersection:
$$i_{\min}(a, b) = \begin{cases} a & \text{when } b = 1 \\ b & \text{when } a = 1 \\ 0 & \text{otherwise} \end{cases}$$
Standard fuzzy union: u(a, b) = max(a, b)
Algebraic sum: u(a, b) = a + b − ab
Bounded sum: u(a, b) = min(1, a + b)
Drastic union:
$$u_{\max}(a, b) = \begin{cases} a & \text{when } b = 0 \\ b & \text{when } a = 0 \\ 1 & \text{otherwise} \end{cases}$$
It is easy to verify that the inequalities
$$i_{\min}(a, b) \le i(a, b) \le \min(a, b)$$
$$\max(a, b) \le u(a, b) \le u_{\max}(a, b)$$
are satisfied for all a, b ∈ [0, 1] by any fuzzy intersection i and any fuzzy union u, respectively. These inequalities specify, in effect, the full ranges of fuzzy intersections and fuzzy unions. Examples of classes of fuzzy intersections, $i_w$, and fuzzy unions, $u_w$, that cover the full ranges of these operations are defined for all a, b ∈ [0, 1] by the formulas
$$\begin{aligned} i_w(a, b) &= 1 - \min\{1, [(1 - a)^w + (1 - b)^w]^{1/w}\}, \\ u_w(a, b) &= \min[1, (a^w + b^w)^{1/w}], \end{aligned} \tag{2.10}$$
where w is a parameter whose range is (0, ∞). One particular fuzzy intersection and one particular fuzzy union are obtained for each value of the parameter. These operations are often referred to in the literature as the Yager classes of intersections and unions. Although it is not obvious from the formulas, it is relatively easy to prove that the standard fuzzy operations are obtained in the limit for w → ∞. Since Yager intersections increase as the value of w increases, they become less restrictive or weaker with increasing w. The drastic intersection is the strongest and the standard intersection is the weakest. For Yager unions, this pattern is inverted; they become more restrictive or stronger with increasing w. The standard union is the strongest, the drastic union the weakest. It should be mentioned that various other classes of fuzzy intersections and unions have been examined in the literature. Moreover, special procedures are now available by which new classes of fuzzy intersections and unions can be generated [Klir & Yuan, 1995].
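The named operations and the Yager classes (2.10) translate directly into code. The following Python sketch is one possible rendering; all function names, the grid resolution, and the rounding tolerance are assumptions made for illustration. The helper `respects_bounds` spot-checks the two chains of inequalities stated above on a finite grid.

```python
def bounded_difference(a, b):
    return max(0.0, a + b - 1.0)

def algebraic_product(a, b):
    return a * b

def drastic_intersection(a, b):
    if b == 1.0:
        return a
    if a == 1.0:
        return b
    return 0.0

def drastic_union(a, b):
    if b == 0.0:
        return a
    if a == 0.0:
        return b
    return 1.0

def yager_intersection(a, b, w):
    """Yager intersection i_w from (2.10); w must be positive."""
    return 1.0 - min(1.0, ((1.0 - a) ** w + (1.0 - b) ** w) ** (1.0 / w))

def yager_union(a, b, w):
    """Yager union u_w from (2.10); w must be positive."""
    return min(1.0, (a ** w + b ** w) ** (1.0 / w))

def respects_bounds(i, u, steps=10, tol=1e-12):
    """Check i_min <= i <= min and max <= u <= u_max on a grid (hypothetical helper).

    The tolerance only absorbs floating-point rounding.
    """
    grid = [k / steps for k in range(steps + 1)]
    return all(
        drastic_intersection(a, b) - tol <= i(a, b) <= min(a, b) + tol
        and max(a, b) - tol <= u(a, b) <= drastic_union(a, b) + tol
        for a in grid for b in grid
    )
```

For w = 1 the Yager formulas reduce to the bounded difference and the bounded sum, and for large w they approach min and max, although the convergence is slow when the two arguments are equal.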
Among the great variety of fuzzy intersections and unions, the standard operations possess certain properties that give them special significance. First, we recognize that they are located at opposite ends of the respective ranges of these operations.
While the standard intersection is the weakest one among all fuzzy intersections, the standard union is the strongest one among all fuzzy unions. Second, the standard operations are the only cutworthy operations among all fuzzy intersections and unions. Third, they are also the only operations among fuzzy intersections and unions that are idempotent. This means that $i_s(a, a) = u_s(a, a) = a$ for all a ∈ [0, 1], where $i_s$ and $u_s$ denote the standard fuzzy intersection and union. Nonstandard fuzzy intersections are only subidempotent, while nonstandard fuzzy unions are superidempotent; this means that i(a, a) < a and u(a, a) > a
for all a ∈ (0, 1). In addition, when using the standard fuzzy operations, errors of the operands do not compound. This is a desirable property from the computational point of view, which other fuzzy operations do not possess. Whatever combination of fuzzy counterparts of the three classical set-theoretic operations (complement, intersection, union) we choose, some properties of the classical operations (properties of the underlying Boolean algebra) are inevitably violated. This is a consequence of imprecise boundaries of fuzzy sets. The standard fuzzy operations violate only the law of excluded middle and the law of contradiction. Some other combinations preserve these laws, but violate distributivity and idempotence [Klir & Yuan, 1995].
2.3.4 Averaging operations
Fuzzy intersections (t-norms) and fuzzy unions (t-conorms) are special types of operations for aggregating fuzzy sets: given two or more fuzzy sets, they produce a single fuzzy set, an aggregate of the given sets. While they do not cover all aggregating operations, they cover all aggregating operations that are associative. Because of the lack of associativity, the remaining aggregating operations must be defined as functions of n arguments for each n ≥ 2. These remaining aggregation operations are called averaging operations. As the name suggests, they average in various ways membership functions of two or more fuzzy sets defined on the same universal set. They do not have any counterparts in classical set theory. Indeed, an average of several characteristic functions of classical sets is not, in general, a characteristic function! However, classical sets can be averaged if they are viewed as special fuzzy sets. For each n ≥ 2, an averaging operation, h, aggregates n fuzzy sets defined on the same universal set X, say sets $A_1, A_2, \ldots, A_n$. Denoting conveniently the aggregate fuzzy set by $H(A_1, A_2, \ldots, A_n)$, we have
$$H(A_1, A_2, \ldots, A_n)(x) = h[A_1(x), A_2(x), \ldots, A_n(x)]$$
for all x ∈ X. Since properties of various averaging operations h do not depend on x, but only on the membership degrees $A_1(x), A_2(x), \ldots, A_n(x)$ ∈ [0, 1], we may view these operations as functions from $[0, 1]^n$ to [0, 1].
The following two requirements are requisite for any averaging operation h with n arguments (n ≥ 2):
(h1) for all a ∈ [0, 1], h(a, a, . . . , a) = a (idempotency);
(h2) for any pair of n-tuples of real numbers in [0, 1], $\langle a_1, a_2, \ldots, a_n \rangle$ and $\langle b_1, b_2, \ldots, b_n \rangle$, if $a_i \le b_i$ for all $i \in N_n$, then $h(a_1, a_2, \ldots, a_n) \le h(b_1, b_2, \ldots, b_n)$ (monotonicity).
Requirement (h1) expresses our intuition that an average of equal numbers must result in the same number. Requirement (h2) guarantees that the average does not decrease when any of the arguments increase. In addition to these essential and easily understood requirements, averaging operations on fuzzy sets are usually expected to satisfy two additional requirements:
(h3) h is a continuous function;
(h4) h is a symmetric function in all its arguments, which means that $h(a_1, a_2, \ldots, a_n) = h(a_{p(1)}, a_{p(2)}, \ldots, a_{p(n)})$ for any permutation p on $N_n$.
Requirement (h3) guarantees that small changes in any of the arguments do not result in discontinuous changes in the average. Requirement (h4) captures the usual assumption that the aggregated fuzzy sets are equally important. If this assumption is not warranted in some application contexts, the symmetry requirement must be dropped. It is significant that any averaging operation h that satisfies the two basic requirements (h1) and (h2) produces numbers that for each n-tuple $\langle a_1, a_2, \ldots, a_n \rangle \in [0, 1]^n$ lie within the interval defined by the inequalities
$$\min(a_1, a_2, \ldots, a_n) \le h(a_1, a_2, \ldots, a_n) \le \max(a_1, a_2, \ldots, a_n).$$
To see this, let
$$a_* = \min(a_1, a_2, \ldots, a_n) \quad \text{and} \quad a^* = \max(a_1, a_2, \ldots, a_n).$$
If h satisfies requirements (h1) and (h2), then
$$a_* = h(a_*, a_*, \ldots, a_*) \le h(a_1, a_2, \ldots, a_n) \le h(a^*, a^*, \ldots, a^*) = a^*.$$
Conversely, if h produces numbers within the interval bounded by the min and max operations, then it must also satisfy requirement (h1) of idempotency; indeed, a = min(a, a, . . . , a) ≤ h(a, a, . . . , a) ≤ max(a, a, . . . , a) = a for all a ∈ [0, 1]. That is, averaging operations cover the whole range between the standard fuzzy intersection and the standard fuzzy union. The standard operations
play a pivotal role in the three types of aggregating operations of fuzzy sets. Owing to their idempotency, they qualify not only as extensions of the classical set intersection and union, but also as extreme averaging operations. One class of averaging operations that covers the entire interval between min and max operations consists of generalized means, which are defined by the formula
$$h_p(a_1, a_2, \ldots, a_n) = \left( \frac{a_1^p + a_2^p + \cdots + a_n^p}{n} \right)^{1/p}, \tag{2.11}$$
where p is a parameter whose range is the set of all real numbers except 0. One particular averaging operation is obtained for each value of the parameter. For p = 0, function $h_p$ is not defined by the formula but by the limit
$$\lim_{p \to 0} h_p(a_1, a_2, \ldots, a_n) = (a_1 a_2 \cdots a_n)^{1/n},$$
which is the well-known geometric mean. Moreover,
$$\lim_{p \to -\infty} h_p(a_1, a_2, \ldots, a_n) = \min(a_1, a_2, \ldots, a_n),$$
$$\lim_{p \to \infty} h_p(a_1, a_2, \ldots, a_n) = \max(a_1, a_2, \ldots, a_n).$$
For p = 1, $h_p$ yields the arithmetic mean
$$h_1(a_1, a_2, \ldots, a_n) = \frac{a_1 + a_2 + \cdots + a_n}{n};$$
and for p = −1, it yields the harmonic mean
$$h_{-1}(a_1, a_2, \ldots, a_n) = \frac{n}{\dfrac{1}{a_1} + \dfrac{1}{a_2} + \cdots + \dfrac{1}{a_n}}.$$
It seems reasonable to consider the arithmetic mean as the standard averaging operation. Generalized means are symmetric averaging operations. When symmetry is not desirable, they may be replaced with weighted generalized means, ${}^{w}h_p$, defined by the formula
$${}^{w}h_p(a_1, a_2, \ldots, a_n, w_1, w_2, \ldots, w_n) = \left( \sum_{i=1}^{n} w_i a_i^p \right)^{1/p}, \tag{2.12}$$
Figure 2.5 An overview of the three basic classes of aggregation operations for fuzzy sets.
where $w_i$ ($i \in N_n$) are nonnegative real numbers, called weights, for which
$$\sum_{i=1}^{n} w_i = 1.$$
The role of the weights is to express the relative importance of the sets to be aggregated. It is worth mentioning that other classes of averaging operations have been proposed and studied in the literature. Also, some more sophisticated classes of functions have been proposed, which cover more than one of the three basic types of aggregation operations [Klir & Yuan, 1995]. The full scope of aggregation operations is summarized in Figure 2.5 in terms of the Yager intersections iw , Yager union uw , and the generalized means hp . The three types of aggregation operations of fuzzy sets are illustrated by the two fuzzy sets in Figure 2.6, which may represent, for example, silt and sand, as conceived in Figure 2.1. In each case, the bold lines represent the result of the standard operation and the shaded area indicates the range of all operations of that type.
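The generalized means (2.11) and their weighted variant (2.12) can be sketched in a few lines of Python. The function names and the sample values are illustrative assumptions. Two conventions are also assumptions of this sketch rather than statements of the text: the case p = 0 is computed as the geometric mean (the limit given above), and for p < 0 a zero argument is mapped to the limiting value 0 instead of raising a division error.

```python
import math

def generalized_mean(values, p):
    """Generalized mean h_p of membership degrees in [0, 1] (formula 2.11)."""
    n = len(values)
    if p == 0:
        # Limit case p -> 0: geometric mean.
        return math.prod(values) ** (1.0 / n)
    if p < 0 and min(values) == 0.0:
        # Assumed convention: limiting value as an argument tends to 0.
        return 0.0
    return (sum(a ** p for a in values) / n) ** (1.0 / p)

def weighted_generalized_mean(values, weights, p):
    """Weighted generalized mean (formula 2.12); p = 0 is not covered here."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be nonnegative and sum to 1")
    if p == 0:
        raise ValueError("p must be nonzero")
    # Terms with zero weight do not contribute and are skipped.
    terms = [(a, w) for a, w in zip(values, weights) if w > 0]
    if p < 0 and any(a == 0.0 for a, _ in terms):
        return 0.0  # same assumed convention as above
    return sum(w * a ** p for a, w in terms) ** (1.0 / p)

sample = [0.2, 0.5, 0.9]  # hypothetical membership degrees
means = {p: generalized_mean(sample, p) for p in (-200, -1, 0, 1, 200)}
```

In `means`, every value lies between min(sample) and max(sample), and the values for p = −200 and p = 200 sit close to the minimum and maximum, in line with the limits stated earlier.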
2.3.5 Arithmetic operations
Arithmetic operations are applicable only to special fuzzy sets that are called fuzzy intervals. These are standard and normal fuzzy sets defined on the set of real numbers, R, whose α-cuts for all α ∈ (0, 1] are closed intervals of real numbers and whose supports are bounded. Any fuzzy interval A for which A(x) = 1 for exactly one x ∈ R
Figure 2.6 An illustration of the three basic classes of aggregation operations.
is called a fuzzy number. Clearly, all fuzzy sets specified in Figure 2.2 are fuzzy numbers, and those in Figure 2.1b are fuzzy intervals. Every fuzzy interval A may conveniently be expressed for all x ∈ R in the canonical form
$$A(x) = \begin{cases} f_A(x) & \text{when } x \in [a, b) \\ 1 & \text{when } x \in [b, c] \\ g_A(x) & \text{when } x \in (c, d] \\ 0 & \text{otherwise,} \end{cases} \tag{2.13}$$
where a, b, c, d are specific real numbers such that a ≤ b ≤ c ≤ d, $f_A$ is a real-valued function that is increasing and right-continuous, and $g_A$ is a real-valued function that is decreasing and left-continuous. For any fuzzy interval A expressed in the canonical form (2.13), the α-cuts of A are expressed for all α ∈ (0, 1] by the formula
$${}^{\alpha}A = \begin{cases} [f_A^{-1}(\alpha), g_A^{-1}(\alpha)] & \text{when } \alpha \in (0, 1) \\ [b, c] & \text{when } \alpha = 1, \end{cases} \tag{2.14}$$
where $f_A^{-1}$ and $g_A^{-1}$ are the inverse functions of $f_A$ and $g_A$, respectively. Employing the α-cut representation, arithmetic operations on fuzzy intervals are defined in terms of the well-established arithmetic operations on closed intervals of real numbers [Moore, 1966; Neumaier, 1990]. Given any pair of fuzzy intervals, A and B, the four basic arithmetic operations on the α-cuts of A and B are defined for all α ∈ (0, 1] by the general formula
$${}^{\alpha}(A * B) = \{a * b \mid \langle a, b \rangle \in {}^{\alpha}A \times {}^{\alpha}B\}, \tag{2.15}$$
where ∗ denotes any of the four basic arithmetic operations; when the operation is division of A by B, it is required that $0 \notin {}^{\alpha}B$ for any α ∈ (0, 1]. Let
$${}^{\alpha}A = [\underline{a}(\alpha), \overline{a}(\alpha)], \qquad {}^{\alpha}B = [\underline{b}(\alpha), \overline{b}(\alpha)].$$
Then, the individual arithmetic operations on the α-cuts of A and B can be defined more specifically in terms of these endpoints by the following formulas [Kaufmann & Gupta, 1985]:
$${}^{\alpha}A + {}^{\alpha}B = [\underline{a}(\alpha) + \underline{b}(\alpha), \; \overline{a}(\alpha) + \overline{b}(\alpha)]$$
$${}^{\alpha}A - {}^{\alpha}B = [\underline{a}(\alpha) - \overline{b}(\alpha), \; \overline{a}(\alpha) - \underline{b}(\alpha)]$$
$${}^{\alpha}A \cdot {}^{\alpha}B = [\min\{\underline{a}(\alpha)\underline{b}(\alpha), \underline{a}(\alpha)\overline{b}(\alpha), \overline{a}(\alpha)\underline{b}(\alpha), \overline{a}(\alpha)\overline{b}(\alpha)\}, \; \max\{\underline{a}(\alpha)\underline{b}(\alpha), \underline{a}(\alpha)\overline{b}(\alpha), \overline{a}(\alpha)\underline{b}(\alpha), \overline{a}(\alpha)\overline{b}(\alpha)\}]$$
$${}^{\alpha}A / {}^{\alpha}B = [\underline{a}(\alpha), \overline{a}(\alpha)] \cdot [1/\overline{b}(\alpha), \; 1/\underline{b}(\alpha)],$$
provided that $0 \notin [\underline{b}(\alpha), \overline{b}(\alpha)]$ for all α ∈ (0, 1]. Fuzzy arithmetic described by these formulas is usually referred to as standard fuzzy arithmetic. It turns out that this arithmetic does not take into account constraints
among fuzzy numbers that exist in various applications. As a consequence, it leads to results that are, in general, deficient in information, even though they are principally correct. To avoid this information deficiency, we need to revise standard fuzzy arithmetic to take all existing constraints among fuzzy numbers in each application into account. This leads to constrained fuzzy arithmetic [Klir, 1997; Klir & Pan, 1998]. Fuzzy arithmetic is essential for evaluating algebraic expressions in which values of variables are fuzzy intervals or fuzzy numbers. It is also essential for dealing with fuzzy algebraic equations. These are equations in which coefficients and unknowns are fuzzy numbers and algebraic expressions are formed by operations of fuzzy arithmetic. Furthermore, fuzzy arithmetic is a basis for developing fuzzy calculus and for fuzzifying any area of mathematics that involves numbers. Although a lot of work has already been done along these lines, enormous research effort is still needed to fully develop the mathematical areas mentioned.
2.4 Fuzzy Relations
When fuzzy sets are defined on universal sets that are Cartesian products of two or more sets, we refer to them as fuzzy relations. A fuzzy relation R is thus defined by a membership function of the general form $R: X_1 \times X_2 \times \cdots \times X_n \to [0, 1]$. The membership degree $R(x_1, x_2, \ldots, x_n)$ of a particular n-tuple $\langle x_1, x_2, \ldots, x_n \rangle$, where $x_i \in X_i$ for all $i \in N_n = \{1, 2, \ldots, n\}$, indicates the strength of relation among elements of the n-tuple. The individual sets in the Cartesian product are called dimensions of the relation. With n sets in the Cartesian product, the relation is called n-dimensional. Relations that are 2-dimensional have special significance; they are usually called binary relations.
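Looking back at the standard fuzzy arithmetic of Section 2.3.5, the endpoint formulas for α-cuts can be sketched in Python as operations on closed intervals represented as (lower, upper) pairs. The function names are illustrative, and the triangular shape used to generate α-cuts is an assumption of this sketch (a fuzzy number with support [a, c] and peak b, with linear sides); the text itself does not prescribe a shape.

```python
def alpha_cut_triangular(a, b, c, alpha):
    """Alpha-cut of an assumed triangular fuzzy number (support [a, c], peak b)."""
    return (a + alpha * (b - a), c - alpha * (c - b))

def add(x, y):
    return (x[0] + y[0], x[1] + y[1])

def sub(x, y):
    # The lower endpoint uses the upper endpoint of y, and vice versa.
    return (x[0] - y[1], x[1] - y[0])

def mul(x, y):
    products = [x[0] * y[0], x[0] * y[1], x[1] * y[0], x[1] * y[1]]
    return (min(products), max(products))

def div(x, y):
    # Division requires that 0 is not contained in the divisor interval.
    if y[0] <= 0.0 <= y[1]:
        raise ZeroDivisionError("divisor interval contains 0")
    return mul(x, (1.0 / y[1], 1.0 / y[0]))

# Subtracting a fuzzy number from itself, one alpha-cut at a time.
cut = alpha_cut_triangular(1.0, 2.0, 3.0, 0.5)
difference = sub(cut, cut)
```

Here `difference` is an interval around 0 rather than the single value 0, because the formulas treat the two operands as unrelated. This is one simple instance of the information deficiency that constrained fuzzy arithmetic is meant to remove.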
When all dimensions of a fuzzy relation are finite sets, which is the usual case, any n-dimensional fuzzy relation may conveniently be represented by an n-dimensional array whose entries are real numbers in the unit interval [0, 1]. This representation is particularly important for dealing with fuzzy relations on the computer. For binary relations, clearly, the arrays become matrices. From the standpoint of fuzzy relations, ordinary fuzzy sets may be viewed as degenerate, 1-dimensional fuzzy relations. This implies that all concepts introduced for ordinary fuzzy sets are also applicable to fuzzy relations. The various types of aggregating operations introduced in Section 2.3 are applicable to fuzzy relations as well. However, fuzzy relations involve additional operations that emerge from their multidimensionality. These additional operations include projections, extensions, compositions, joins, and inverses of fuzzy relations. Projections and extensions are applicable to any fuzzy relations, whereas compositions, joins, and inverses are applicable only to binary relations.
2.4.1 Projections, cylindric extensions, and cylindric closures
To define projections and extensions of fuzzy relations, we assume that the relations are n-dimensional, where n ≥ 2. Let $x = \langle x_i \mid i \in N_n \rangle$ denote elements (n-tuples) of the Cartesian product
$$X = \mathop{\times}_{i \in N_n} X_i$$
and let $y = \langle y_j \mid j \in J \rangle$, where $J \subset N_n$ and |J| = r < n, denote the elements (r-tuples) of the Cartesian product
$$Y = \mathop{\times}_{j \in J} X_j.$$
Furthermore, let $y \prec x$ denote that $y_j = x_j$ for all j ∈ J. Then, given a fuzzy relation R on X, a projection, $R_Y$, of R on Y is defined for all y ∈ Y by the formula
$$R_Y(y) = \max_{x \succ y} R(x). \tag{2.16}$$
As can be seen from this formula, the operation of projection converts a given n-dimensional relation R into an r-dimensional relation $R_Y$ (r < n) that (i) is consistent with R in all dimensions included in Y; and (ii) in which all dimensions that are not included in Y are suppressed (not recognized). The maximum operator in (2.16) represents the standard fuzzy union of all singleton sets {x} for which $x \succ y$. Since the standard fuzzy union is the only union (t-conorm) that is cutworthy, the operation of projection defined by (2.16) is cutworthy as well. This means that, for each α ∈ (0, 1], ${}^{\alpha}R_Y$ is the projection of ${}^{\alpha}R$ in the sense of classical set theory. An operation that is inverse to a projection is called a cylindric extension. By this operation, a given r-dimensional projection $R_Y$ of an n-dimensional relation R (r < n) is converted to an n-dimensional relation $R_{EY}$, a cylindric extension of $R_Y$, by the formula
$$R_{EY}(x) = R_Y(y) \tag{2.17}$$
for each x such that $x \succ y$. This definition, which again is a cutworthy generalization of the classical concept of cylindric extension, guarantees that $R \subseteq R_{EY}$. Clearly, cylindric extension $R_{EY}$ of $R_Y$ is the largest fuzzy relation that is consistent with $R_Y$. Given a set $\{R_{Y_k} \mid k \in K\}$ of projections of R, a cylindric closure, $\mathrm{Cyl}\{R_{Y_k} \mid k \in K\}$, of these projections is the standard fuzzy intersection of their cylindric extensions. That is,
$$\mathrm{Cyl}\{R_{Y_k} \mid k \in K\}(x) = \min_{k \in K} R_{EY_k}(x) \tag{2.18}$$
for all x ∈ X. Again, by using the standard fuzzy intersection, this definition is a cutworthy generalization of the classical concept of cylindric closure. It produces the largest fuzzy relation that is consistent with all the projections involved. Consequently, it guarantees that this relation, which is reconstructed from the given projections of R, always contains R. That is,
$$R \subseteq \mathrm{Cyl}\{R_{Y_k} \mid k \in K\}$$
for any given set $\{R_{Y_k} \mid k \in K\}$. Projections, cylindric extensions, and cylindric closures are the main operations for dealing with n-dimensional relations. Some additional operations are important for dealing with binary relations. The rest of this section is devoted to these operations as well as to properties of some important types of binary fuzzy relations.
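The three operations of this subsection, formulas (2.16) to (2.18), can be sketched in Python for relations on finite sets. The representation is an assumption of the sketch: a relation is a dictionary mapping tuples to membership degrees, tuples absent from the dictionary have degree 0, and `keep` lists the positions of the dimensions retained in Y. All names and the sample data are illustrative.

```python
from itertools import product

def project(relation, keep):
    """Projection (2.16): maximum over all tuples that agree with y on `keep`."""
    projected = {}
    for x, degree in relation.items():
        y = tuple(x[j] for j in keep)
        projected[y] = max(projected.get(y, 0.0), degree)
    return projected

def cylindric_extension(projection, keep, domains):
    """Cylindric extension (2.17): every tuple inherits the degree of its y part."""
    return {x: projection.get(tuple(x[j] for j in keep), 0.0)
            for x in product(*domains)}

def cylindric_closure(projections, domains):
    """Cylindric closure (2.18): minimum of the cylindric extensions.

    `projections` is a list of (keep, projection) pairs.
    """
    extensions = [cylindric_extension(p, keep, domains) for keep, p in projections]
    return {x: min(e[x] for e in extensions) for x in product(*domains)}

# Hypothetical binary relation on {"a", "b"} x {0, 1}.
domains = [("a", "b"), (0, 1)]
R = {("a", 0): 0.9, ("a", 1): 0.4, ("b", 0): 0.2, ("b", 1): 1.0}
R1 = project(R, (0,))   # projection onto the first dimension
R2 = project(R, (1,))   # projection onto the second dimension
closure = cylindric_closure([((0,), R1), ((1,), R2)], domains)
```

In this example `closure` contains R but is strictly larger on some tuples, which illustrates that a relation generally cannot be fully reconstructed from its projections.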
2.4.2 Inverses, compositions, and joins
The inverse of a binary fuzzy relation R on X × Y, denoted by $R^{-1}$, is a relation on Y × X such that $R^{-1}(y, x) = R(x, y)$ for all pairs $\langle y, x \rangle \in Y \times X$. When R is represented by a matrix, $R^{-1}$ is represented by the transpose of this matrix. This means that rows are replaced with columns and vice versa. Clearly, $(R^{-1})^{-1} = R$ holds for any binary relation. Consider now two binary fuzzy relations P and S that are defined on sets X × Y and Y × Z, respectively. Any such relations, which are connected via the common set Y, can be composed to yield a relation on X × Z. The standard composition of these relations, which is denoted by P ◦ S, produces a relation R on X × Z defined by the formula
$$R(x, z) = (P \circ S)(x, z) = \max_{y \in Y} \min[P(x, y), S(y, z)] \tag{2.19}$$
for all pairs $\langle x, z \rangle \in X \times Z$. Other definitions of a composition of fuzzy relations, in which the min and max operations are replaced with other t-norms and t-conorms, respectively, are possible and useful in some applications. All compositions possess the following important properties:
$$(P \circ S) \circ Q = P \circ (S \circ Q)$$
$$(P \circ S)^{-1} = S^{-1} \circ P^{-1}.$$
However, the standard fuzzy composition is the only one that is cutworthy. A similar operation on two connected binary relations, which differs from the composition in that it yields a 3-dimensional relation instead of a binary one, is known as the relational join. For the same fuzzy relations P and S, the standard relational join, P ∗ S, is a 3-dimensional relation on X × Y × Z defined by the formula
$$R(x, y, z) = (P * S)(x, y, z) = \min[P(x, y), S(y, z)] \tag{2.20}$$
for all triples $\langle x, y, z \rangle \in X \times Y \times Z$. Again, the min operation in this definition may be replaced with another t-norm. However, the relational join defined by (2.20) is the only one that is cutworthy.
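As a small illustration of (2.19) and (2.20), the following Python sketch implements the inverse, the standard (max-min) composition, and the standard relational join for binary relations stored as dictionaries keyed by pairs. The function names and the sample relations P and S are hypothetical.

```python
def inverse(r):
    """Inverse relation: swap the two components of every pair."""
    return {(y, x): degree for (x, y), degree in r.items()}

def compose(p, s):
    """Standard max-min composition (2.19) of P on X x Y with S on Y x Z."""
    result = {}
    for (x, y), dp in p.items():
        for (y2, z), ds in s.items():
            if y == y2:
                result[(x, z)] = max(result.get((x, z), 0.0), min(dp, ds))
    return result

def join(p, s):
    """Standard relational join (2.20): a 3-dimensional relation on X x Y x Z."""
    return {(x, y, z): min(dp, ds)
            for (x, y), dp in p.items()
            for (y2, z), ds in s.items() if y == y2}

# Hypothetical relations on X = {x1, x2}, Y = {y1, y2}, Z = {z1, z2}.
P = {("x1", "y1"): 0.3, ("x1", "y2"): 0.8, ("x2", "y1"): 1.0, ("x2", "y2"): 0.5}
S = {("y1", "z1"): 0.6, ("y1", "z2"): 0.2, ("y2", "z1"): 0.7, ("y2", "z2"): 0.9}

R = compose(P, S)
# R == {("x1", "z1"): 0.7, ("x1", "z2"): 0.8, ("x2", "z1"): 0.6, ("x2", "z2"): 0.5}
```

The composition can also be read as the join followed by a projection that suppresses the shared dimension Y, which is how the two operations relate to Section 2.4.1.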
2.4.3 Fuzzy relation equations
Consider binary relations P, Q, R, defined on sets X × Y, Y × Z, and X × Z, respectively, for which P ◦ Q = R, where ◦ denotes the standard composition. This means that a set of equations of the form
$$\max_{y \in Y} \min[P(x, y), Q(y, z)] = R(x, z)$$
is satisfied for all x ∈ X and z ∈ Z. These equations are called fuzzy relation equations. The problem of solving fuzzy relation equations is any problem in which two of the relations are given and the third is to be determined via the equations. When P and Q
are given, the problem of determining R is trivial. It is solved by performing the composition P ◦ Q, usually in terms of the matrix representations of P and Q. When R and Q (or R and P) are given, the problem of determining P (or Q) is considerably more difficult, but it is very important for many applications. It is reasonable to view this problem as a decomposition of R with respect to Q or P. Several efficient methods have been developed for solving the decomposition problem of fuzzy relation equations. While these methods are rather tedious for human beings, they can easily be implemented on the computer and, moreover, they are highly suitable for parallel processing. Although details of these methods are beyond the scope of this tutorial, it seems useful to describe basic characteristics of the solutions obtained by them. Let S(Q, R) denote the solution set obtained by solving the problem of decomposing R with respect to Q. That is, members of the solution set are all versions of the relation P for which the fuzzy relation equations are satisfied, given relations Q and R. Any member $\hat{P}$ of the solution set S(Q, R) is called a maximal solution if, for all P ∈ S(Q, R), $\hat{P} \subseteq P$ implies $P = \hat{P}$. Similarly, any member $\check{P}$ of S(Q, R) is called a minimal solution if, for all P ∈ S(Q, R), $P \subseteq \check{P}$ implies $P = \check{P}$. It is well established that, whenever the equations are solvable, the solution set always contains a unique maximum solution, $\hat{P}$, and it may contain several minimal solutions, ${}^{1}\check{P}, {}^{2}\check{P}, \ldots, {}^{n}\check{P}$. Moreover, the solution set can be fully characterized by its maximum and minimal solutions. To describe this characterization, let
$${}^{i}\mathcal{P} = \{P \mid {}^{i}\check{P} \subseteq P \subseteq \hat{P}\}$$
denote the family of relations that are between the maximum solution $\hat{P}$ and the minimal solution ${}^{i}\check{P}$ for each $i \in N_n$. The solution set is then described by taking the union of these families ${}^{i}\mathcal{P}$ for all the minimal solutions ${}^{i}\check{P}$. That is,
$$S(Q, R) = \bigcup_{i=1}^{n} {}^{i}\mathcal{P}.$$
This convenient way of characterizing the solution set is illustrated visually in Figure 2.7. In all the proposed methods for solving fuzzy relation equations, it is computationally very simple to determine the maximum solution. This is of great advantage to any application of fuzzy relation equations in which the maximum solution is sufficient. When the solution to fuzzy relation equations is unique, which is a rather rare case, this unique solution is identical with the maximum solution. It may also happen that the given fuzzy relation equations are not solvable. If a solution is essential in some application, then it is important to be able to find a reasonable approximate solution. Important results regarding approximate solutions of fuzzy relation equations have already been obtained, but this problem is still a subject
Figure 2.7 Structure of the solution set S(Q, R).
of active research. Another area of current research is concerned with fuzzy relation equations that are based on compositions distinct from the standard composition. These more general fuzzy relation equations play a useful role in some applications, particularly in the area of approximate reasoning. It is well recognized that many problems emanating from diverse applications of fuzzy set theory can be formulated in terms of fuzzy relation equations. Methods for solving these equations have thus a broad utility. Constructing rules of inference in fuzzy knowledge-based systems, knowledge acquisition, the problem of identifying fuzzy systems from input–output observations, and the problem of decomposing fuzzy systems are just a few examples illustrating this utility. The principal source for fuzzy relation equations is the book by Di Nola et al. [1989]; other books with a thorough coverage of this subject include Pedrycz [1989] and Klir & Yuan [1995].
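The text notes that the maximum solution is computationally simple to determine but does not state how. The following Python sketch uses a construction that is standard in the literature on max-min relation equations and is not taken from this tutorial: each entry of the candidate maximum solution is the largest degree p for which min(p, Q(y, z)) ≤ R(x, z) holds for every z. The names `largest_admissible`, `maximum_solution`, and `solve`, and the sample data, are assumptions made for illustration.

```python
def largest_admissible(q, r):
    """Largest p in [0, 1] with min(p, q) <= r (assumed helper, not from the text)."""
    return 1.0 if q <= r else r

def compose(p, q, xs, ys, zs):
    """Standard max-min composition over explicitly listed finite sets."""
    return {(x, z): max(min(p[(x, y)], q[(y, z)]) for y in ys)
            for x in xs for z in zs}

def maximum_solution(q, r, xs, ys, zs):
    """Candidate greatest P with P o Q = R; valid only if the equations are solvable."""
    return {(x, y): min(largest_admissible(q[(y, z)], r[(x, z)]) for z in zs)
            for x in xs for y in ys}

def solve(q, r, xs, ys, zs):
    """Return the maximum solution, or None when P o Q = R has no solution."""
    candidate = maximum_solution(q, r, xs, ys, zs)
    return candidate if compose(candidate, q, xs, ys, zs) == r else None

# Hypothetical data: R was produced by composing some relation with Q.
xs, ys, zs = ["x1", "x2"], ["y1", "y2"], ["z1", "z2"]
Q = {("y1", "z1"): 0.6, ("y1", "z2"): 0.2, ("y2", "z1"): 0.7, ("y2", "z2"): 0.9}
R = {("x1", "z1"): 0.7, ("x1", "z2"): 0.8, ("x2", "z1"): 0.6, ("x2", "z2"): 0.5}
P_max = solve(Q, R, xs, ys, zs)
```

Solvability is tested by composing the candidate with Q and comparing with R. Finding the minimal solutions is harder and is not attempted in this sketch.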
2.4.4 Fuzzy relations on a single set
Binary relations in which elements of a set are related to themselves have special significance and utility. For example, they allow us to rigorously define equivalence, compatibility, and various kinds of orderings among elements of the set of concern.
Figure 2.8 Diagram and matrix representation of a fuzzy relation on N6 .
Although membership functions of relations of this kind have the form R : X × X → [0, 1], they are usually referred to as relations on X rather than relations on X × X. Their matrix representations are square matrices in which rows and columns are assigned to the same elements. Other useful representations of these relations are simple diagrams with the following properties: (i) each element of the set X is represented by a single node in the diagram; (ii) directed connections between nodes are included in the diagram only for pairs of elements of X that are contained in the support of the relation; (iii) each connection in the diagram is labeled by the membership degree of the corresponding pair in the relation. An example of the diagram representation of a relation R defined on $N_6$ is shown in Figure 2.8, where it is compared with the matrix representation. Three of the most important classes of classical binary relations on a set (equivalence, compatibility, and ordering relations) are characterized in terms of four distinctive properties: reflexivity, symmetry, antisymmetry, and transitivity. To reformulate reflexivity, symmetry, and antisymmetry for fuzzy relations R on X is rather trivial. We say that a fuzzy relation is:
● reflexive if and only if R(x,x) = 1 for all x ∈ X;
● symmetric if and only if R(x,y) = R(y,x) for all x,y ∈ X;
● antisymmetric if and only if R(x,y) > 0 and R(y,x) > 0 implies that x = y for all x,y ∈ X.
The property of fuzzy transitivity can be defined in numerous ways, all of which collapse to the classical definition of transitivity for crisp relations. According to the most common definition, R is transitive if
$$R(x, z) \ge \max_{y \in X} \min[R(x, y), R(y, z)] \tag{2.21}$$
for all x,z ∈ X. This definition, which is based on the standard fuzzy intersection and union, is the only one that is cutworthy. This particular definition of fuzzy transitivity is often referred to as max-min transitivity. Alternative definitions of transitivity, based upon other fuzzy intersections and unions, are possible and useful in some applications. However, they do not result in fuzzy relations that are cutworthy. Employing these definitions, fuzzy equivalence relations are reflexive, symmetric, and transitive (in the fuzzified sense of these properties), fuzzy compatibility relations are reflexive and symmetric, and fuzzy partial orderings are reflexive, antisymmetric, and transitive. Each of these types of fuzzy relations is cutworthy. That is, each α-cut of a fuzzy relation of a particular type is a crisp relation of the same type. Hence, fuzzy equivalence, compatibility, and partial ordering are properties that are preserved in each α-cut in the classical sense. Moreover, by increasing α, equivalence and compatibility classes in α-cuts become more refined, while α-cuts of fuzzy partial orderings increase the number of noncomparable pairs. Examples of simple fuzzy equivalence and compatibility relations are shown in Figure 2.9. Since both relations are reflexive and symmetric, the diagrams are simplified: the connections of each node to itself (required by reflexivity) are omitted, and the bidirectional connections of nodes (required by symmetry) are replaced with undirected connections. Equivalence classes and maximal compatibility classes in all α-cuts of these relations are also shown in the figure. The increasing refinements of these classes with increasing values of α are clearly visible. An example of a fuzzy partial ordering is shown in Figure 2.10. Also shown in the figure are all its α-cuts, which are crisp partial orderings. 
The α-cuts are represented by simplified diagrams, in which connections are made only to immediate successors and immediate predecessors. Diagrams of this sort, which are called Hasse diagrams, are common for crisp partial orderings.
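The cutworthy behavior described above can be checked numerically. The following is a minimal Python sketch, assuming a finite universal set and a fuzzy relation stored as a square matrix; the relation R and all function names are hypothetical illustrations and are not taken from Figure 2.9.

```python
from itertools import product


def is_max_min_transitive(R, tol=1e-12):
    """Check R(x,z) >= max_y min[R(x,y), R(y,z)] for all x, z (Equation 2.21)."""
    n = len(R)
    return all(
        R[x][z] + tol >= max(min(R[x][y], R[y][z]) for y in range(n))
        for x, z in product(range(n), repeat=2)
    )


def alpha_cut(R, alpha):
    """Crisp relation (0/1 matrix) of the pairs whose membership is at least alpha."""
    return [[1 if r >= alpha else 0 for r in row] for row in R]


def equivalence_classes(crisp):
    """Partition induced by a crisp equivalence relation given as a 0/1 matrix."""
    classes, seen = [], set()
    for x in range(len(crisp)):
        if x not in seen:
            block = [y for y in range(len(crisp)) if crisp[x][y] == 1]
            classes.append(block)
            seen.update(block)
    return classes


# Hypothetical fuzzy equivalence relation on four elements
# (reflexive, symmetric, and max-min transitive).
R = [
    [1.0, 0.8, 0.5, 0.5],
    [0.8, 1.0, 0.5, 0.5],
    [0.5, 0.5, 1.0, 0.9],
    [0.5, 0.5, 0.9, 1.0],
]

if __name__ == "__main__":
    print("max-min transitive:", is_max_min_transitive(R))
    for alpha in (0.5, 0.8, 0.9, 1.0):
        print(alpha, equivalence_classes(alpha_cut(R, alpha)))
```

For this hypothetical relation, the single class at α = 0.5 splits into two classes at α = 0.8 and into three at α = 0.9, which is the refinement with increasing α described in the text.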
Figure 2.9 Examples of fuzzy equivalence and fuzzy compatibility.
2.5 Fuzzy Logic
In order to use the apparatus of fuzzy set theory in the domain of fuzzy logic, it is necessary to establish a connection between degrees of membership in fuzzy sets and degrees of truth of fuzzy propositions. This is fairly straightforward, provided that
the degrees of membership and the degrees of truth to be connected refer to the same objects. Let X denote the universal set of these common objects. To establish a meaningful connection between the two kinds of degrees, let us consider first the simplest propositional form pA : X is A, where X is a variable whose range is X and A is a fuzzy set representing an inherently vague linguistic expression (such as low, high, small, large, shallow, very deep, subtidal, etc.) in a given context. A proposition is obtained when a particular object x from X is substituted for the variable X in the propositional form. Let pA (x) denote the degree of truth of the respective proposition. This means that the symbol pA , which denotes the propositional form, is also employed for denoting a function by which degrees of truth are assigned to propositions based on the propositional form. Let this function be called a truth assignment function. The double use of the symbol pA , employed here for the sake of simplicity, does not create any confusion since there is only one truth assignment function for each propositional form. This is analogous to the use of the same symbol for a fuzzy set and its membership function.
Figure 2.10 An example of fuzzy partial ordering.
Using this simplified notation, we can formulate the connection between fuzzy sets and fuzzy propositions as follows. Given a fuzzy set A, its membership degree A(x) for each x ∈ X may be interpreted as the degree of truth of the proposition obtained from the propositional form pA : X is a member of A for the same x ∈ X. That is,
$$p_A(x) = A(x) \tag{2.22}$$
for all x ∈ X. Conversely, given an arbitrary propositional form pA : X is A, the degree of truth pA (x) for each x ∈ X may be interpreted as the degree of compatibility, A(x), of x with the concept represented by A. That is, we obtain again Equation (2.22). In summary, a fuzzy set and a fuzzy propositional form are connected whenever both are defined in terms of the same set of objects and both represent the same meaning of a linguistic expression. Given a fuzzy set and a fuzzy propositional form that are connected in this sense, the degrees of membership in the fuzzy set and the degrees of truth of fuzzy propositions defined for the same objects are numerically equal, as expressed by (2.22). As a consequence, logic operations of negation, conjunction, and disjunction are defined in exactly the same way as the operations of complementation, intersection, and union, respectively. We should emphasize at this point that Equation (2.22) applies only to the simplest propositional form: X is A. For more complex forms (quantified, truth-qualified, conditional, etc.), the equation must be properly modified. To explain how to modify it, we need to introduce basic types of propositional forms from which fuzzy propositions can be obtained.
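The connection expressed by Equation (2.22) can be illustrated with a short Python sketch. It assumes the standard fuzzy operations (1 − a, min, max) for negation, conjunction, and disjunction; the membership function `shallow` and its breakpoints are hypothetical and serve only as an example.

```python
def shallow(depth_m):
    """Hypothetical membership function of the fuzzy set 'shallow' (depth in meters)."""
    if depth_m <= 5.0:
        return 1.0
    if depth_m >= 25.0:
        return 0.0
    return (25.0 - depth_m) / 20.0


def truth(A, x):
    """Degree of truth of the proposition 'x is A', equal to A(x) by Equation (2.22)."""
    return A(x)


def neg(a):
    """Negation, defined like the standard fuzzy complement."""
    return 1.0 - a


def conj(a, b):
    """Conjunction, defined like the standard fuzzy intersection."""
    return min(a, b)


def disj(a, b):
    """Disjunction, defined like the standard fuzzy union."""
    return max(a, b)


if __name__ == "__main__":
    p = truth(shallow, 10.0)          # 'a depth of 10 m is shallow'
    print(p, neg(p), conj(p, neg(p)), disj(p, neg(p)))
```

Note that the conjunction of a fuzzy proposition with its own negation need not have truth degree 0, in contrast to classical logic.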
2.5.1 Basic types of propositional forms
The principal aim of fuzzy logic is to formalize reasoning with propositions in natural language. The linguistic expressions involved may contain fuzzy linguistic terms of any of the following types:
● fuzzy predicates: tall, young, expensive, low, high, normal, etc.;
● fuzzy truth values: true, fairly true, very true, false, etc.;
● fuzzy probabilities: likely, very likely, highly unlikely, etc.;
● fuzzy quantifiers: most, few, almost all, usually, often, etc.
All of these linguistic terms are represented in each context by appropriate fuzzy sets. Fuzzy predicates are represented by fuzzy sets defined on universal sets of elements to which the predicates apply. Fuzzy truth values and fuzzy probabilities are represented by fuzzy sets defined on the unit interval [0, 1]. Fuzzy quantifiers are either absolute or relative; they are represented by appropriate fuzzy numbers defined either on the set of natural numbers or on the interval [0, 1]. Observe that simple linguistic terms of any of the mentioned types are sometimes modified by special linguistic terms such as very, fairly, extremely, more or less, and the like. These linguistic terms are called linguistic hedges. Unlike the other linguistic terms, they are not represented by fuzzy sets, but rather by special operations on fuzzy sets. These operations are called modifiers and are discussed in Section 2.3.1, on p. 19.
In a crude way, it is useful to distinguish the following four types of fuzzy propositional forms and fuzzy propositions based on them. Each of these forms may, in addition, be quantified by an appropriate fuzzy quantifier.
1. Unconditional and unqualified propositions are expressed by the canonical form
pA : X is A.
As already explained in this section, the truth values of propositions based on this form are given by Equation (2.22).
2. Unconditional and qualified propositions are characterized by either the canonical form
pT(A) : X is A is T
or the canonical form
pP(A) : Pro{X is A} is P,
where X and A have the same meaning as before, Pro{X is A} denotes the probability of a fuzzy event defined by the expression “X is A,” T is a fuzzy truth qualifier, and P is a fuzzy probability qualifier. Both T and P are represented by fuzzy sets defined on [0, 1]. For any given probability distribution function f on X, Pro{X is A} is determined by the formula
$$\mathrm{Pro}\{X \text{ is } A\} = \sum_{x \in X} f(x) A(x).$$
The first canonical form is called a truth-qualified form, the second one is called a probability-qualified form. To obtain the degree of truth of a fuzzy proposition based on the truth-qualified form, we need to compose membership function A (representing a fuzzy predicate) with the membership function of the truth qualifier T. That is,
$$p_{T(A)}(x) = T(A(x)) \tag{2.23}$$
for all x ∈ X. Similarly, for fuzzy propositions based on probability-qualified forms, we need to compose membership function A with the membership function of the probability qualifier P. That is,
$$p_{P(A)}(x) = P(A(x)) \tag{2.24}$$
for all x ∈ X. Propositional forms in which both kinds of qualification are involved,
pTP(A) (x) : X is A is P is T
are also meaningful. To obtain the degree of truth of a fuzzy proposition based on this form, we need to compose A with P first and, then, to compose the resulting function with T. That is,
$$p_{TP(A)}(x) = T(P(A(x))) \tag{2.25}$$
for all x ∈ X.
3. Conditional and unqualified fuzzy propositions have the canonical form
pB|A : If X is A, then Y is B,
where X, Y are variables whose ranges consist of objects in some universal sets X, Y, respectively, and A, B are relevant fuzzy predicates represented by appropriate fuzzy sets. These propositions may also be expressed in an alternative but equivalent form
pB|A : (X, Y) is R,
where R is a fuzzy relation on X × Y that is determined for each x ∈ X and each y ∈ Y by the formula
$$R(x, y) = J[A(x), B(y)].$$
The symbol J stands for a binary operation on [0, 1] that represents a suitable fuzzy implication in the given context. Clearly,
$$p_{B|A}(x, y) = R(x, y). \tag{2.26}$$
It should be mentioned at this point that it is essential in fuzzy logic (as in classical logic) to distinguish implications (as well as other logic connectives) on two levels, the syntactic level and the semantic level. On the syntactic level, implication (or another logic connective) is represented by a symbol (usually denoted by ⇒).
On the semantic level, it is represented by a suitable operation (usually denoted by →) on the set of truth values [0, 1]. This distinction is followed in Chapter 10. Operations that qualify as fuzzy implications form a class of binary operations on [0, 1], similarly to fuzzy intersections and unions [Klir & Yuan, 1995]. In some sense, this class can be characterized in terms of fuzzy intersections, unions, and complements. The most common fuzzy implication, referred to as the Łukasiewicz implication, is defined by the formula
$$J(a, b) = \min[1, 1 - a + b].$$
Conditional fuzzy propositions are essential components of fuzzy rules of inference. Hence, they play a fundamental role in approximate reasoning.
4. Conditional and qualified fuzzy propositions have either the canonical form
pT(B|A) : If X is A, then Y is B is T,
if they are truth qualified, or the canonical form
pP(B|A) : Pro{Y is B | X is A} is P,
if they are probability qualified. These forms are basically combinations of the previous forms.
Fuzzy propositions of any of the introduced types may also be quantified. In general, fuzzy quantifiers are fuzzy numbers. Fuzzy quantifiers of one type, which are called absolute quantifiers, are expressed by fuzzy numbers defined on the set of real numbers or on the set of integers. They characterize linguistic terms such as about 10, at least about 500, much more than a dozen, etc. Quantifiers of another type, which are called relative quantifiers, are expressed by fuzzy numbers defined on [0, 1]. They characterize linguistic terms such as almost all, about half, no more than about 20%, most, etc. Various procedures for determining degrees of truth of quantified fuzzy propositions are described in the literature, but they are not covered here due to the limited space. This fairly complex subject is perhaps most extensively covered in various papers by Zadeh [Yager et al., 1987; Klir & Yuan, 1996] and Yager [1983, 1985–86, 1991], but its basic ideas are also summarized in the text by Klir & Yuan [1995].
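The formulas attached to the propositional forms of this section can be made concrete on a small finite universal set. The sketch below is only illustrative: the membership grades, the probability distribution f, and the truth qualifier `very_true` (taken here as squaring) are assumptions, and the Łukasiewicz operation is just one possible choice of the implication J.

```python
def truth_qualified(A, T):
    """Degrees of truth of 'X is A is T': compose A with the qualifier T (Equation 2.23)."""
    return [T(a) for a in A]


def fuzzy_event_probability(f, A):
    """Pro{X is A} as the sum over x of f(x) * A(x)."""
    return sum(fx * ax for fx, ax in zip(f, A))


def lukasiewicz(a, b):
    """Lukasiewicz implication J(a, b) = min[1, 1 - a + b]."""
    return min(1.0, 1.0 - a + b)


def implication_relation(A, B, J=lukasiewicz):
    """Relation R(x, y) = J[A(x), B(y)] representing 'If X is A, then Y is B'."""
    return [[J(a, b) for b in B] for a in A]


def very_true(a):
    """Hypothetical truth qualifier 'very true' on [0, 1]."""
    return a * a


# Hypothetical membership grades on X = {x1, x2, x3} and Y = {y1, y2}.
A = [0.0, 0.5, 1.0]
B = [0.25, 1.0]
f = [0.25, 0.5, 0.25]   # assumed probability distribution on X

if __name__ == "__main__":
    print(truth_qualified(A, very_true))
    print(fuzzy_event_probability(f, A))
    print(implication_relation(A, B))
```

Replacing `lukasiewicz` by another operation that qualifies as a fuzzy implication changes only the argument J, which reflects the remark that the suitable implication depends on the context.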
2.5.2 Approximate reasoning
Reasoning based on fuzzy propositions of the four types, possibly quantified by various fuzzy quantifiers, is usually referred to as approximate reasoning. Although approximate reasoning is currently a subject of intensive research, its basic principles are already well established. In general, approximate reasoning draws upon the methodological apparatus of fuzzy set theory, such as operations on fuzzy sets, manipulations of fuzzy relations, and fuzzy arithmetic. The most fundamental components of approximate reasoning are conditional fuzzy propositions, which may also be truth qualified, probability qualified, quantified, or any combination of these. Special procedures are needed for each of these types of fuzzy propositions. This great variety of fuzzy propositions makes approximate reasoning methodologically rather intricate. This reflects the richness of natural language and the many intricacies of common-sense reasoning, which approximate reasoning based upon fuzzy set theory attempts to model. The essence of approximate reasoning is illustrated in this section by explaining how the most common inference rules of classical logic (modus ponens and modus tollens) can be generalized within the framework of fuzzy logic. For the sake of simplicity, the explanation is restricted to unqualified fuzzy propositions without quantifiers. Consider variables X and Y, the ranges of which consist of objects in some given sets X and Y, respectively. Assume that the variables are constrained by a fuzzy relation R on X × Y. Then, knowing that X is A, where A is a fuzzy set on X, we can infer that Y is B, where B is a fuzzy set on Y, by the formula
$$B(y) = \sup_{x \in X} \min[A(x), R(x, y)] \tag{2.27}$$
for all y ∈ Y. This formula, which is called a compositional rule of inference, is a basis for the generalized modus ponens as well as the generalized modus tollens; it can also be written, more concisely, as B = A ◦ R. Assume now that the relation R is not given explicitly, but it is embedded in the conditional propositional form pG|F : if X is F, then Y is G. In this case, the relation is determined via the formula
$$R(x, y) = J[F(x), G(y)], \tag{2.28}$$
which, in turn, is determined by the choice of a suitable operation of fuzzy implication J . Using this relation, obtained from the given fuzzy propositional form pG|F , and given another propositional form pA : X is A, regarding variable X , we may conclude that Y is B by the compositional rule of inference (2.27). This procedure is called a generalized modus ponens.
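The generalized modus ponens can be sketched in a few lines of Python for finite sets X and Y. The fuzzy sets F, G, and A below are hypothetical, and the Łukasiewicz operation is assumed for J purely for illustration; any other fuzzy implication could be passed in its place.

```python
def lukasiewicz(a, b):
    """Lukasiewicz implication J(a, b) = min[1, 1 - a + b]."""
    return min(1.0, 1.0 - a + b)


def relation_from_rule(F, G, J=lukasiewicz):
    """R(x, y) = J[F(x), G(y)] for the rule 'if X is F, then Y is G' (Equation 2.28)."""
    return [[J(fx, gy) for gy in G] for fx in F]


def compose(A, R):
    """Compositional rule of inference: B(y) = max_x min[A(x), R(x, y)] (Equation 2.27)."""
    return [
        max(min(a, R[x][y]) for x, a in enumerate(A))
        for y in range(len(R[0]))
    ]


# Hypothetical membership grades on X = {x1, x2, x3} and Y = {y1, y2, y3}.
F = [1.0, 0.5, 0.0]
G = [0.0, 0.5, 1.0]
R = relation_from_rule(F, G)

if __name__ == "__main__":
    print(compose([1.0, 0.0, 0.0], R))   # fact: X is exactly x1
    print(compose(F, R))                 # fact: X is F itself
```

With the crisp fact "X is x1" the conclusion coincides with G in this example, whereas the fuzzy fact "X is F" yields a less specific conclusion; how closely classical modus ponens is recovered depends on the chosen implication and composition.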
Viewing the conditional propositional form pG|F as a fuzzy rule and the simple propositional form pA as a fuzzy fact, the generalized modus ponens is expressed by the schema:
Fuzzy rule:        If X is F, then Y is G
Fuzzy fact:        X is A
Fuzzy conclusion:  Y is B
In a similar way, the generalized modus tollens is expressed by the schema:
Fuzzy rule:        If X is F, then Y is G
Fuzzy fact:        Y is B
Fuzzy conclusion:  X is A
In this case, the compositional rule of inference has the form
$$A(x) = \sup_{y \in Y} \min[B(y), R^{-1}(y, x)]. \tag{2.29}$$
To use the compositional rule of inference, we need to choose a fitting fuzzy implication in each application context and express it in terms of a fuzzy relation R by Equation (2.28). There are several ways in which this can be done. One way is to derive from the application context (by observations or experts’ judgments) pairs A, B of fuzzy sets that are supposed to be inferentially connected. Relation R, which represents a fuzzy implication, is thus determined by solving sets of fuzzy relation equations of either the form (2.27) or the form (2.29). This and other issues regarding fuzzy implications in approximate reasoning are discussed fairly thoroughly in the text by Klir & Yuan [1995].
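The generalized modus tollens of Equation (2.29) differs from the generalized modus ponens only in that the inverse relation is composed with a fact about Y. The sketch below assumes finite sets, hypothetical fuzzy sets F and G, and the Łukasiewicz implication as the choice of J; none of these come from a particular application.

```python
def modus_tollens(B, R):
    """A(x) = max_y min[B(y), R^{-1}(y, x)], where R^{-1}(y, x) = R(x, y) (Equation 2.29)."""
    return [max(min(b, row[y]) for y, b in enumerate(B)) for row in R]


# Hypothetical rule 'if X is F, then Y is G' on three-element sets X and Y.
F = [1.0, 0.5, 0.0]
G = [0.0, 0.5, 1.0]

# R(x, y) = J[F(x), G(y)] with the Lukasiewicz implication (an assumed choice of J).
R = [[min(1.0, 1.0 - fx + gy) for gy in G] for fx in F]

if __name__ == "__main__":
    # Fact: Y is exactly y1, the element that does not belong to G at all.
    print(modus_tollens([1.0, 0.0, 0.0], R))
```

In this example the inferred fuzzy set equals the standard complement of F, which mirrors the classical pattern "if not G, then not F".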
2.6 Possibility Theory
It was first recognized by Zadeh [1978] that possibility theory is a natural tool for representing and manipulating information expressed in terms of fuzzy propositions. In this interpretation of possibility theory, the classical (crisp) possibility and necessity measures based upon modal logic [Hughes & Cresswell, 1996] are extended to their fuzzy counterparts via the α-cut representation. Consider a set of alternatives, X. One of the alternatives is true, but we are not certain which one it is, due to limited evidence. Assume that we only know, according to all evidence available, that it is not possible that the true alternative could be outside a given set E, where ∅ ≠ E ⊂ X. This simple evidence can be expressed by a possibility measure, PosE, defined on X by the formulas
$$\mathrm{Pos}_E(\{x\}) = \begin{cases} 1 & \text{when } x \in E \\ 0 & \text{when } x \notin E \end{cases}$$
for all x ∈ X and
$$\mathrm{Pos}_E(A) = \sup_{x \in A} \mathrm{Pos}_E(\{x\})$$
for all A ∈ P(X). In this equation, as well as in other equations in this section, the supremum may be replaced with the maximum if the latter exists. Associated with the possibility measure PosE is a necessity measure, NecE, defined via the equation
$$\mathrm{Nec}_E(A) = 1 - \mathrm{Pos}_E(\overline{A})$$
for all A ∈ P(X), where $\overline{A}$ denotes the complement of A. Assume now that the set E, in terms of which the evidence is expressed, is a standard fuzzy set. Then, the previous formulas are still applicable to the α-cuts of E, provided that ∅ ≠ αE ⊆ X for all α ∈ [0, 1]. For each α ∈ [0, 1], we can define a possibility measure αPosE in the same way as before:
$${}^{\alpha}\mathrm{Pos}_E(\{x\}) = \begin{cases} 1 & \text{when } x \in {}^{\alpha}E \\ 0 & \text{when } x \notin {}^{\alpha}E \end{cases} \tag{2.30}$$
for all x ∈ X and
$${}^{\alpha}\mathrm{Pos}_E(A) = \sup_{x \in A} {}^{\alpha}\mathrm{Pos}_E(\{x\})$$
for all A ∈ P(X). Now, using the α-cut representation of E, we have
$$E(x) = \sup_{\alpha \in [0,1]} \alpha \cdot {}^{\alpha}E(x)$$
for all x ∈ X, where αE denotes here the characteristic function of the α-cut of E. Since Equation (2.30) can be rewritten as
$${}^{\alpha}\mathrm{Pos}_E(\{x\}) = {}^{\alpha}E(x)$$
for all α ∈ [0, 1] and x ∈ X, it is natural to define a possibility measure, PosE, in terms of the possibility measures αPosE (α ∈ [0, 1]) via the α-cut representation of E. Hence,
$$\mathrm{Pos}_E(\{x\}) = \sup_{\alpha \in [0,1]} \alpha \cdot {}^{\alpha}\mathrm{Pos}_E(\{x\})$$
for all x ∈ X, which can be rewritten as
$$\mathrm{Pos}_E(\{x\}) = E(x). \tag{2.31}$$
This definition of possibility measures, which is due to Zadeh [1978], is usually referred to in the literature as the standard fuzzy-set interpretation of possibility theory. Given PosE({x}) for all x ∈ X, PosE(A) is then calculated by the equation
$$\mathrm{Pos}_E(A) = \sup_{x \in A} \mathrm{Pos}_E(\{x\}) \tag{2.32}$$
for all A ∈ P(X). When A is a fuzzy set, Equation (2.32) must be replaced with the more general equation
$$\mathrm{Pos}_E(A) = \sup_{x \in X} \min[A(x), \mathrm{Pos}_E(\{x\})]. \tag{2.33}$$
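Equations (2.31) to (2.33) translate directly into code when X is finite. The following Python sketch uses a hypothetical normal fuzzy set E as evidence and evaluates the possibility of a crisp set and of a fuzzy set; the necessity function assumes the relation Nec(A) = 1 − Pos(complement of A) with the standard fuzzy complement.

```python
def possibility(E, A):
    """Pos_E(A) = max_x min[A(x), E(x)], using Pos_E({x}) = E(x) (Equations 2.31, 2.33)."""
    return max(min(A[x], E[x]) for x in E)


def necessity(E, A):
    """Nec_E(A) = 1 - Pos_E(complement of A), with the standard complement 1 - A(x)."""
    complement = {x: 1.0 - A[x] for x in E}
    return 1.0 - possibility(E, complement)


# Hypothetical evidence: a normal fuzzy set E on X = {a, b, c, d}.
E = {"a": 1.0, "b": 0.75, "c": 0.25, "d": 0.0}

# A crisp set is represented by its characteristic function.
A_crisp = {"a": 1.0, "b": 1.0, "c": 0.0, "d": 0.0}
A_fuzzy = {"a": 0.5, "b": 1.0, "c": 0.5, "d": 0.0}

if __name__ == "__main__":
    for name, A in (("crisp", A_crisp), ("fuzzy", A_fuzzy)):
        print(name, possibility(E, A), necessity(E, A))
```

For a crisp set the minimum in `possibility` simply restricts the maximum to the elements of A, so the same function covers Equation (2.32) as a special case.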
The associated necessity measure, NecE, is again defined by the equation
$$\mathrm{Nec}_E(A) = 1 - \mathrm{Pos}_E(\overline{A}) \tag{2.34}$$
for each set A, which may be crisp or fuzzy. Possibility measures and the associated necessity measures that represent evidence expressed in terms of standard fuzzy sets via Equations (2.31) to (2.33) are thus cutworthy. They form a coherent theory of evidence that is referred to as possibility theory [Dubois & Prade, 1988; De Cooman, 1997], provided that the requirement that ∅ ≠ αE ⊆ X is satisfied for all α ∈ [0, 1]. This means that E must be a normal fuzzy set. When evidence is expressed in terms of a subnormal fuzzy set, the coherence of the standard fuzzy-set interpretation of possibility theory is lost. Observe, for example, that
$$\mathrm{Pos}_E({}^{0+}E) = h_E \quad \text{and} \quad \mathrm{Nec}_E({}^{0+}E) = 1.$$
When hE < 1, then NecE(0+E) > PosE(0+E). This violates the fundamental inequality of possibility theory, NecE(A) ≤ PosE(A), which is required to hold for all A ∈ P(X). The fact that the standard fuzzy-set interpretation of possibility theory, defined by Equation (2.31), is not applicable to subnormal fuzzy sets has been recognized in the literature since the mid-1980s. In a recent paper [Klir, 1999], it is shown that the only way to make a fuzzy-set interpretation applicable to all standard fuzzy sets
without distorting given evidence is to replace Equation (2.31) with the more general equation
$$\mathrm{Pos}_E(\{x\}) = E(x) + 1 - h_E$$
for all x ∈ X, where hE denotes the height of E.
2.7 Fuzzy Systems
The term “fuzzy system” refers to any system whose variables (or at least some of them) range over states that are fuzzy sets. For each variable, the fuzzy sets are defined on some relevant universal set. In most typical systems, the universal sets are specific intervals of real numbers. In this special but important case, states of the variables are fuzzy intervals. Representing states of variables by appropriate fuzzy sets is a fuzzy quantization or, using a more common term, granulation. Each fuzzy set representing a state of a variable is called a granule. If each granule represents a linguistic term (such as very small, small, medium, etc.), the variable is called a linguistic variable. Fuzzy systems are thus usually systems of linguistic variables. Each linguistic variable is defined in terms of a base variable, whose values are assumed to be real numbers within a specific interval of real numbers. A base variable is a variable in the classical sense. Examples of geological variables are: distance from source, tidal range, depth, grain size, and percentage of coral cover. Linguistic terms involved in a linguistic variable are used for approximating the actual values of the associated base variable. Their meanings are captured, in the context of each particular application, by appropriate fuzzy intervals. That is, each linguistic variable consists of:
● a name, which should reflect the meaning of the base variable involved;
● a base variable with its range of values (a closed interval of real numbers);
● a set of linguistic terms that refer to values of the base variable;
● a set of semantic rules, which assign to each linguistic term its meaning in terms of an appropriate fuzzy interval defined on the range of the base variable.
An example of a linguistic variable is shown in Figure 2.11. Its name is “tidal range,” which captures the meaning of the associated base variable, a variable that expresses the tidal range (in meters) at the place under study. The range of the base variable is [−10, 10]. Five states (values) are distinguished by the linguistic terms subtidal, low-intertidal, medium-intertidal, high-intertidal, and supratidal. Each of these linguistic terms is assigned one of the trapezoidal-shape fuzzy intervals shown in Figure 2.11. These fuzzy intervals are supposed to approximate the meaning of the linguistic terms in a given application context.
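A linguistic variable of this kind can be represented as a mapping from linguistic terms to trapezoidal membership functions. The Python sketch below follows the four components listed above; the breakpoints of the trapezoids are hypothetical and are not read off Figure 2.11.

```python
def trapezoid(a, b, c, d):
    """Trapezoidal fuzzy interval with support [a, d] and core [b, c]."""
    def mu(x):
        if b <= x <= c:
            return 1.0
        if a < x < b:
            return (x - a) / (b - a)
        if c < x < d:
            return (d - x) / (d - c)
        return 0.0
    return mu


# Name, range of the base variable, and semantic rules (terms -> fuzzy intervals).
# All breakpoints are assumed values chosen only for illustration.
tidal_range = {
    "name": "tidal range",
    "base_range": (-10.0, 10.0),   # meters
    "terms": {
        "subtidal": trapezoid(-10.0, -10.0, -3.0, -2.0),
        "low-intertidal": trapezoid(-3.0, -2.0, -1.0, 0.0),
        "medium-intertidal": trapezoid(-1.0, 0.0, 1.0, 2.0),
        "high-intertidal": trapezoid(1.0, 2.0, 3.0, 4.0),
        "supratidal": trapezoid(3.0, 4.0, 10.0, 10.0),
    },
}


def fuzzify(variable, x):
    """Return the nonzero degrees to which the value x matches each linguistic term."""
    lo, hi = variable["base_range"]
    if not lo <= x <= hi:
        raise ValueError("value outside the range of the base variable")
    degrees = {term: mu(x) for term, mu in variable["terms"].items()}
    return {term: m for term, m in degrees.items() if m > 0.0}


if __name__ == "__main__":
    for x in (-5.0, -0.5, 3.25):
        print(x, fuzzify(tidal_range, x))
```

A value that falls between the cores of two adjacent terms belongs to both terms to some degree, which is how the fuzzy intervals approximate the values of the base variable.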
Figure 2.11 An example of a linguistic variable.
In principle, fuzzy systems can be knowledge-based, model-based, or hybrid. In knowledge-based fuzzy systems, relationships between variables are described by collections of if–then rules (conditional fuzzy propositions). These rules attempt to capture the knowledge of a human expert, expressed often in natural language. Model-based fuzzy systems are based on traditional systems modeling, but they employ appropriate areas of fuzzy mathematics (fuzzy analysis, fuzzy geometry, etc.). Hybrid fuzzy systems are combinations of knowledge-based and model-based fuzzy systems. At this time, knowledge-based fuzzy systems are more developed than model-based or hybrid fuzzy systems. In knowledge-based systems, the relation between input and output linguistic variables is expressed in terms of a set of fuzzy if–then rules (conditional propositional forms). From these rules and any fact describing actual states of input variables, the actual states of output variables are derived by an appropriate compositional rule of inference. Assuming that the input variables are X1, X2, . . . , and the output variables
are Y1, Y2, . . . , we have the following general scheme of inference to represent the input–output relation of the system:
Rule 1:      If X1 is A11 and X2 is A21 and · · ·, then Y1 is B11 and Y2 is B21 and · · ·
Rule 2:      If X1 is A12 and X2 is A22 and · · ·, then Y1 is B12 and Y2 is B22 and · · ·
. . .
Rule n:      If X1 is A1n and X2 is A2n and · · ·, then Y1 is B1n and Y2 is B2n and · · ·
Fact:        X1 is C1 and X2 is C2 and · · ·
Conclusion:  Y1 is D1 and Y2 is D2 and · · ·
This overall scheme can be broken down into several schemes, one for each output variable. For output variable Yk, for example, the ith rule (i ∈ Nn) becomes
Rule i:      If X1 is A1i and X2 is A2i and · · ·, then Yk is Bki
This rule can be rewritten as
Rule i:      If (X1, X2, . . .) is Qi, then Yk is Bki,
where Qi is the cylindric closure of A1i, A2i, . . . . Similarly, the fact in the overall scheme can be rewritten as
Fact:        (X1, X2, . . .) is P,
where P is the cylindric closure of C1, C2, . . . . Rule i can be further rewritten as
Rule i:      (X1, X2, . . ., Yk) is Ri,
where Ri is a binary relation that expresses the chosen fuzzy implication, as explained in Section 2.5. Hence, the inference scheme for variable Yk can be rewritten in the following form:
Rule 1:      (X1, X2, . . ., Yk) is R1
Rule 2:      (X1, X2, . . ., Yk) is R2
. . .
Rule n:      (X1, X2, . . ., Yk) is Rn
Fact:        (X1, X2, . . .) is P
Conclusion:  Yk is Dk
Since the rules are interpreted as disjunctive, Dk is determined by the formula
$$D_k = \bigcup_{i \in N_n} (P \circ R_i), \tag{2.35}$$
where ∪ and ◦ stand usually for the standard fuzzy union and the standard (max–min) composition. However, other fuzzy unions and compositions may be employed when desirable. The result of each fuzzy inference is clearly a fuzzy set. This set can be converted to a single real number, if this is needed, by a defuzzification method. The outcome of any defuzzification of a given fuzzy set should be the best representation, in the context of each application, of the elastic constraint imposed on possible values of the output variable by the fuzzy set. Among the various defuzzification methods described in the literature, each of which is based on some rationale, the most frequently used method is called a centroid method. To describe it, let us assume that we want to defuzzify a given fuzzy set A on X = R. The defuzzified value of A, d(A), obtained by the centroid method is defined by the formula
$$d(A) = \frac{\int_{\mathbb{R}} x A(x)\,dx}{\int_{\mathbb{R}} A(x)\,dx}. \tag{2.36}$$
It is clear that d(A) is in this case the horizontal coordinate of the center of gravity of the area under the graph of membership function A. Following this interpretation, the centroid method is sometimes called the center of area method. Assume now that A is defined on a finite universal set X = {x1, x2, . . . , xn}. Then the formula
$$d(A) = \frac{\sum_{i=1}^{n} x_i A(x_i)}{\sum_{i=1}^{n} A(x_i)} \tag{2.37}$$
is a discrete counterpart of (2.36). It is now increasingly recognized that the centroid defuzzification method and other methods proposed in the literature may be viewed as special members of parametrized families of defuzzification methods. For the discrete case, an interesting family is defined by the formula
$$d_{\delta}(A) = \frac{\sum_{i=1}^{n} x_i A^{\delta}(x_i)}{\sum_{i=1}^{n} A^{\delta}(x_i)}, \tag{2.38}$$
where δ ∈ (0, ∞) is a parameter by which different defuzzification methods are distinguished. A good overview of the various defuzzification methods with references to the original publications was prepared by Van Leekwijck & Kerre [1999]. Although fuzzy systems based on numerical variables have special significance, due to their extensive applicability, these are not the only fuzzy systems. Any classical (crisp) systems whose variables are not numerical (e.g., ordinal-scale or nominal-scale variables) can be fuzzified as well. One type of fuzzy system, based in general on nominal-scale variables (whose states are nonnumerical or even unordered), comprises finite-state fuzzy automata [Klir & Yuan, 1995]. Literature on fuzzy systems is extensive. An important early book on fuzzy systems was written by Negoita & Ralescu [1975]; more recent books with broad coverage of fuzzy systems are by Yager & Filev [1994] and Piegat [2001].
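The inference formula (2.35) and the discrete defuzzification formulas (2.37) and (2.38) can be combined into a small end-to-end sketch. The Python code below assumes one input and one output variable on finite grids, hypothetical rule sets, and rule relations built with the minimum, R_i(x, y) = min[A_i(x), B_i(y)]; that last choice is an assumption commonly paired with disjunctive rules, not something prescribed by the text.

```python
def rule_relation(A, B):
    """Relation for one rule, here R(x, y) = min[A(x), B(y)] (an assumed choice)."""
    return [[min(a, b) for b in B] for a in A]


def compose(P, R):
    """Standard max-min composition: (P o R)(y) = max_x min[P(x), R(x, y)]."""
    return [max(min(p, R[x][y]) for x, p in enumerate(P)) for y in range(len(R[0]))]


def infer(P, relations):
    """D = union over i of (P o R_i), with the standard union (Equation 2.35)."""
    return [max(values) for values in zip(*(compose(P, R) for R in relations))]


def defuzzify(xs, A, delta=1.0):
    """Member d_delta of the family (2.38); delta = 1 gives the centroid (2.37)."""
    weights = [a ** delta for a in A]
    return sum(x * w for x, w in zip(xs, weights)) / sum(weights)


# Hypothetical rule base on a 3-point input grid and a 5-point output grid.
ys = [0.0, 1.0, 2.0, 3.0, 4.0]
low, high = [1.0, 0.5, 0.0], [0.0, 0.5, 1.0]
small, large = [1.0, 0.5, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.5, 1.0]
relations = [rule_relation(low, small), rule_relation(high, large)]

if __name__ == "__main__":
    for fact in ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]):
        D = infer(fact, relations)
        print(fact, D, defuzzify(ys, D), defuzzify(ys, D, delta=2.0))
```

Increasing δ gives more weight to elements with high membership grades, so the defuzzified value moves toward the elements where the inferred fuzzy set is largest.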
2.8 Constructing Fuzzy Sets and Operations
Fuzzy set theory provides us with a broad spectrum of tools for representing propositions expressed in natural language and for reasoning based on this representation. Most linguistic terms in natural language are not only predominantly vague, but their meanings are almost invariably dependent on context as well. A prerequisite for using the tools of fuzzy set theory in each application is to determine the intended meanings of relevant linguistic terms in the context of that particular application. Some linguistic terms are represented by fuzzy sets (predicates, truth or probability qualifiers, quantifiers), others are represented by operations on fuzzy sets (logical connectives, linguistic hedges). To capture the intended meanings of linguistic terms involved in an application, we need to construct appropriate membership functions or operations on membership functions. The problem of constructing fuzzy sets and operations on fuzzy sets in the context of various applications is not a problem of fuzzy set theory per se. It is a problem of knowledge acquisition, which is a subject of a relatively new field referred to as knowledge engineering. The process of knowledge acquisition involves one or more experts in a specific domain of interest, and a knowledge engineer. The role of the knowledge engineer is to elicit the knowledge of interest from the experts, and express it in some operational form of a required type. In applications of fuzzy set theory, knowledge acquisition involves basically two stages. In the first stage, the knowledge engineer attempts to elicit relevant knowledge in terms of propositions expressed in natural language. In the second stage, he or she attempts to determine the meaning of each linguistic term employed in these propositions. It is during this second stage of knowledge acquisition that membership functions of fuzzy sets as well as appropriate operations on these fuzzy sets are constructed.
Many methods for constructing membership functions are described in the literature. It is useful to classify them into direct methods and indirect methods. In direct methods, the expert is expected either to define a membership function completely or to exemplify it for some selected individuals in the universal set. To request a complete definition from the expert, usually in terms of a justifiable mathematical formula, is feasible only for a concept that is perfectly represented by some objects of the universal set, called ideal prototypes of the concept, and the compatibility of other objects in the universal set with these ideal prototypes can be expressed mathematically by a meaningful similarity relation. If it is not feasible to define the membership function in question completely, the expert should at least be able to exemplify it for some representative objects of the universal set. The exemplification may be facilitated by asking the expert questions regarding the compatibility of individual objects x with the linguistic term that is to be represented by fuzzy set A. These questions, regardless of their form, result in a set of pairs x, A(x) that exemplify the membership function under construction. This set is then used for constructing the full membership function. One way to do that is to select an appropriate class of functions (triangular, trapezoidal, S-shaped, bell-shaped, etc.) and employ some relevant curve-fitting method to determine the function that fits best the given samples. Another way is to use an appropriate neural network to construct the membership function by learning from the given samples. This approach has been so successful that neural networks are now viewed as a standard tool for constructing membership functions. When a direct method is extended from one expert to multiple experts, the opinions of individual experts must be properly combined. 
Any averaging operation, including those introduced in Section 2.3.4, can be used for this purpose. The most common operation is the simple weighted average
$$A(x) = \sum_{i=1}^{n} c_i A_i(x),$$
where Ai(x) denotes the valuation of the proposition “x belongs to A” by expert i, n denotes the number of experts involved, and ci denote weights by which the relative significance of individual experts can be expressed; it is assumed that
$$\sum_{i=1}^{n} c_i = 1.$$
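The weighted average of expert valuations is straightforward to compute. The Python sketch below uses three hypothetical experts, assumed weights, and made-up valuations of the proposition "x belongs to A" for two objects; the third expert answers only true (1) or false (0).

```python
def combine_experts(valuations, weights, tol=1e-9):
    """A(x) = sum_i c_i * A_i(x), with weights c_i that are required to sum to 1."""
    if len(valuations) != len(weights):
        raise ValueError("one weight per expert is required")
    if abs(sum(weights) - 1.0) > tol:
        raise ValueError("weights must sum to 1")
    objects = valuations[0].keys()
    return {x: sum(c * A_i[x] for c, A_i in zip(weights, valuations)) for x in objects}


# Hypothetical valuations of 'this depth is shallow' for two depths.
expert_1 = {"1 m": 1.0, "3 m": 0.5}
expert_2 = {"1 m": 1.0, "3 m": 0.0}
expert_3 = {"1 m": 0.0, "3 m": 1.0}   # true/false answers only
weights = [0.5, 0.25, 0.25]           # assumed relative significance of the experts

if __name__ == "__main__":
    print(combine_experts([expert_1, expert_2, expert_3], weights))
```

Because the weights sum to 1 and every valuation lies in [0, 1], the combined grades also lie in [0, 1] and can be used directly as membership degrees.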
Experts are instructed either to value each proposition by a number in [0, 1] or to value it as true or false. Direct methods based on exemplification have one fundamental disadvantage. They require the expert (or experts) to give answers that are overly precise and, hence, unrealistic as expressions of their qualitative subjective judgments. As a consequence,
the answers are always somewhat arbitrary. Indirect methods attempt to reduce this arbitrariness by replacing the requested direct estimates of degrees of membership with simpler tasks. In indirect methods, experts are usually asked to compare elements of the universal set in pairs according to their relative standing with respect to their membership in the fuzzy set to be constructed. The pairwise comparisons are often easier to estimate than the direct values, but they have to be somehow connected to the direct values. Numerous methods have been developed for dealing with this problem. They have to take into account possible inconsistencies in the pairwise estimates. Most of these methods deal with pairwise comparisons obtained from one expert, but a few methods are described in the literature that aggregate pairwise estimates from multiple experts. The latter methods are particularly powerful since they allow the knowledge engineer to determine the degrees of competence of the participating experts, which are then utilized, together with the experts’ judgments, for calculating the degrees of membership in question. The coverage of these various methods is beyond the scope of this tutorial. The problem of constructing membership functions and operations on them has been addressed by many authors. A good review of the various methods, with references to the original publications, was prepared by Sancho-Royo & Verdegay [1999].
2.9 Nonstandard Fuzzy Sets
Since the introduction of standard fuzzy sets by Zadeh [1965], several other types of fuzzy sets have been introduced in the literature. Each of them leads to a particular formalized language, which may be viewed as a branch of the overall fuzzy set theory. The following are definitions of the most visible types of nonstandard fuzzy sets. In each of them, symbols X and A denote, respectively, the universal set of concern and the fuzzy set defined.
1. Interval-valued fuzzy sets:
$$A : X \to CI([0, 1])$$
CI([0, 1]) denotes here the set of all closed intervals contained in [0, 1]. That is, A(x) is a closed interval of real numbers in [0, 1] for each x ∈ X. An alternative formulation is $A = \langle \underline{A}, \overline{A} \rangle$, where $\underline{A}$ and $\overline{A}$ are standard fuzzy sets such that $\underline{A}(x) \le \overline{A}(x)$ for all x ∈ X. Fuzzy sets defined in this way are usually called gray fuzzy sets. For each x ∈ X, functions $\underline{A}$ and $\overline{A}$ clearly form an interval $[\underline{A}(x), \overline{A}(x)] \in CI([0, 1])$. Interval-valued fuzzy sets have been investigated since the early 1970s and are used in many applications [Gorzalczany, 1987].
2. Fuzzy sets of type 2: A : X → FI([0, 1])
FI([0, 1]) denotes the set of all fuzzy intervals defined on [0, 1]. Fuzzy sets of type 2, which are generalizations of interval-valued fuzzy sets, have been investigated since the mid-1970s. Their theory is now well developed and utilized in many applications. A recent appraisal of the theory was prepared by John [1998]; advanced developments, including those regarding computer software for computing with type-2 fuzzy sets, are reported in a paper by Karnik et al. [1999] and in a book by Mendel [2001].

3. Fuzzy sets of type k (k > 2): $A : X \to FI^{k-1}([0, 1])$
$FI^{k-1}([0, 1])$ denotes the set of all fuzzy sets of type k − 1. For each x ∈ X, A(x) is a fuzzy set of type k − 1. These sets were introduced as theoretically possible generalizations of type 2 fuzzy sets in the mid-1970s. Their theory is not fully developed as yet, and their practical utility remains to be seen.

4. Fuzzy sets of level 2: A : F(X) → [0, 1]
F(X) is a family of fuzzy sets defined on X. That is, a fuzzy set of level 2 is defined on a family of fuzzy sets, each of which is defined, in turn, on a given universal set X. This mathematical structure allows us to represent a higher level concept by lower level concepts, all expressed in imprecise linguistic terms of natural language. Thus far, fuzzy sets of level 2 have been rather neglected in the literature, even though they were already recognized in the early 1970s.

5. Fuzzy sets of level k (k > 2): $A : F^{k-1}(X) \to [0, 1]$
$F^{k-1}(X)$ denotes a family of fuzzy sets of level k − 1. Sets of this type are natural generalizations of fuzzy sets of level 2. They are sufficiently expressive to facilitate representation of high level concepts embedded in natural language. Notwithstanding their importance, no adequate theory has yet been developed for fuzzy sets of this type.

6. L-fuzzy sets: A : X → L
L denotes a recognized set of membership grades which is required to be at least partially ordered. Usually, L is assumed to be a complete lattice. This important type of fuzzy set was introduced very early in the history of fuzzy set theory by Goguen [1967].

7. Intuitionistic fuzzy sets: $A = \langle A_M, A_N \rangle$
Symbols $A_M$ and $A_N$ denote standard fuzzy sets on X such that $0 \le A_M(x) + A_N(x) \le 1$ for all x ∈ X. The values $A_M(x)$ and $A_N(x)$ are interpreted for each x ∈ X as, respectively, the degree of membership and the degree of nonmembership of x in A. Intuitionistic fuzzy sets have been investigated since the early 1980s. Although their theory is now fairly well developed, primarily due to work by Atanassov [2000], their utility remains to be established.

8. Rough fuzzy sets: $A_R = \langle \underline{A}_R, \overline{A}^R \rangle$
These are fuzzy sets whose α-cuts are approximated by rough sets [Pawlak, 1991]. That is, $A_R$ is a rough approximation of a fuzzy set A based on an equivalence relation R on X. Symbols $\underline{A}_R$ and $\overline{A}^R$ denote, respectively, the lower and upper approximations of A in which the set of equivalence classes X/R is employed instead of the universal set X; for each α ∈ [0, 1], the α-cuts of $\underline{A}_R$ and $\overline{A}^R$ are defined by the formulas

$${}^{\alpha}\underline{A}_R = \{[x]_R \mid [x]_R \subseteq {}^{\alpha}A,\ x \in X\}$$

$${}^{\alpha}\overline{A}^R = \{[x]_R \mid [x]_R \cap {}^{\alpha}A \ne \emptyset,\ x \in X\},$$

where $[x]_R$ denotes the equivalence class in X/R that contains x. This combination of fuzzy sets with rough sets must be distinguished from another combination, in which a fuzzy equivalence relation is employed in the definition of a rough set. It is appropriate to refer to the sets that are based on the latter combination as fuzzy rough sets. These combinations, which have been discussed in the literature since the early 1990s, seem to be of great utility in some application areas.

Observe that the introduced types of fuzzy sets are interrelated in numerous ways. For example, a fuzzy set of any type that employs the unit interval [0, 1] can be generalized by replacing [0, 1] with a complete lattice L; some of the types (e.g., standard, interval-valued, or type 2 fuzzy sets) can be viewed as special cases of L-fuzzy sets; and rough fuzzy sets can be viewed as special interval-valued sets. The overall fuzzy set theory is thus a broad formalized language based upon an appreciable inventory of interrelated types of fuzzy sets, each associated with its own variety of concepts, operations, methods of computation, interpretations, and applications.
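The two α-cut formulas given for rough fuzzy sets (item 8 above) can be illustrated with a minimal Python sketch. It assumes a finite universal set, a fuzzy set stored as a dictionary of membership degrees, and an equivalence relation R supplied directly as its partition X/R; the function names and the sample data are hypothetical and serve only as an illustration.

```python
def alpha_cut(fuzzy_set, alpha):
    """Crisp set of all elements whose membership degree is at least alpha."""
    return {x for x, grade in fuzzy_set.items() if grade >= alpha}


def rough_alpha_cuts(fuzzy_set, partition, alpha):
    """Lower and upper approximations of the alpha-cut of a fuzzy set.

    partition is the set of equivalence classes X/R (an assumption: the
    relation R is given by its classes rather than as a set of pairs).
    Both results are sets of equivalence classes, as in the formulas.
    """
    cut = alpha_cut(fuzzy_set, alpha)
    classes = [frozenset(block) for block in partition]
    # Lower approximation: classes entirely contained in the alpha-cut.
    lower = {block for block in classes if block <= cut}
    # Upper approximation: classes with a nonempty intersection with the cut.
    upper = {block for block in classes if block & cut}
    return lower, upper


# Hypothetical data: five elements grouped into three equivalence classes.
A = {"a": 1.0, "b": 0.8, "c": 0.5, "d": 0.2, "e": 0.0}
X_mod_R = [{"a", "b"}, {"c", "d"}, {"e"}]

lower, upper = rough_alpha_cuts(A, X_mod_R, alpha=0.5)
# The 0.5-cut is {a, b, c}: the lower approximation holds only the class
# {a, b}, while the upper approximation holds the classes {a, b} and {c, d}.
print(lower)
print(upper)
```

The lower approximation is always contained in the upper one, which is why a rough fuzzy set can be read as a special interval-valued set.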
2.10 Principal Sources for Further Study

For further study of fuzzy set theory and fuzzy logic, a graduate text by Klir and Yuan [1995] is recommended since it is a natural extension of this tutorial. It employs the same terminology and notation, and covers virtually all aspects of fuzzy set theory and related areas in a thorough and mathematically rigorous fashion. It also contains a bibliography of over 1700 entries and a bibliographical index. In addition, the following general textbooks are recommended: Lin & Lee [1996], an excellent coverage of the role of neural networks in fuzzy systems, with a focus on integrated fuzzy-neural intelligent systems; Nguyen & Walker [1997], a well-written, rigorous presentation with many examples and exercises; and Pedrycz & Gomide [1998], a comprehensive coverage with a good balance of theory and applications. Furthermore, the two volumes of collected papers by Lotfi Zadeh [Yager et al., 1987; Klir & Yuan, 1996]
are indispensable for proper comprehension of the four facets of fuzziness: the set-theoretic, relational, logical, and epistemological facets. In the category of reference books, two important handbooks, Ruspini et al. [1998] and Dubois & Prade [1999], are recommended as convenient sources of information on virtually any aspect of fuzzy set theory and related areas; the latter is the first volume of a multi-volume handbook on fuzzy sets.

In addition to references made in previous sections of this tutorial, several important books that seem to be relevant to geological modeling should be mentioned. Among the many books on knowledge-based systems, most of which are oriented to control, two excellent books with a broader coverage of fuzzy modeling are recommended, one written by Babuška [1998] and one edited by Hellendoorn and Driankov [1997]. Another important resource is a book of selected papers by Sugeno, edited by Nguyen and Prasad [1999]. Fuzzy modeling in which neural networks or genetic algorithms play important roles is well covered in the books by Nauck et al. [1997] and Rutkowska [2002] and in the one edited by Sanchez et al. [1997], respectively. Finally, the related areas of fuzzy classification, pattern recognition, and clustering, which are also of interest to geologists, are covered by several books. The books by Bezdek [1981], Kandel [1982], and Pal & Dutta Majumder [1986] are classics in these areas; the ones by Sato et al. [1997] and Pal & Mitra [1999] are more up to date.

Two valuable books were written in a popular genre by McNeill & Freiberger [1993] and Kosko [1993]. Although both books characterize the relatively short but dramatic history of fuzzy set theory and discuss the significance of the theory, they have different foci. While the former book focuses on the impact of fuzzy set theory on high technology, the latter is concerned more with philosophical and cultural aspects; these issues are further explored in a more recent book by Kosko [1999].
Another book in a popular genre that is worth reading was written by De Bono [1991]. He argues that fuzzy logic (called water logic in the book) is important in virtually all aspects of human affairs.

Fuzzy logic is a field that is currently developing extremely rapidly. The following journals are the principal sources (in English) of these developments:

1. Fuzzy Sets and Systems
2. IEEE Transactions on Fuzzy Systems
3. Journal of Intelligent and Fuzzy Systems
4. International Journal of Approximate Reasoning
5. Journal of Fuzzy Mathematics
6. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems
7. Japanese Journal of Fuzzy Theory and Systems (English translation by Allerton Press)
8. International Journal of Intelligent Systems
9. Journal of Intelligent Information Systems
10. Fuzzy Systems and A.I. Reports and Letters
11. Soft Computing
12. Fuzzy Economic Review
13. International Journal of Fuzzy Systems
14. Fuzzy Optimization and Decision Making
Finally, research and education in fuzzy logic are now supported by numerous professional organizations. Many of them cooperate in a federation-like manner via the International Fuzzy Systems Association (IFSA), which publishes the prime journal in the field, Fuzzy Sets and Systems, and has organized the biennial World IFSA Congress since 1985. The oldest professional organization supporting fuzzy logic is the North American Fuzzy Information Processing Society (NAFIPS); founded in 1981, NAFIPS publishes the International Journal of Approximate Reasoning and organizes annual meetings.
References

Atanassov, K. T. [2000], Intuitionistic Fuzzy Sets. Springer-Verlag, New York.
Babuška, R. [1998], Fuzzy Modeling for Control. Kluwer, Boston.
Bezdek, J. C. [1981], Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press, New York.
De Bono, E. [1991], I Am Right, You Are Wrong: From Rock Logic to Water Logic. Viking Penguin, New York.
De Cooman, G. [1997], “Possibility theory—I, II, III.” International Journal of General Systems, 25(4), 291–371.
Di Nola, A., Sessa, S., Pedrycz, W., & Sanchez, E. [1989], Fuzzy Relation Equations and Their Applications to Knowledge Engineering. Kluwer, Boston.
Dubois, D., & Prade, H. [1988], Possibility Theory. Plenum Press, New York.
Dubois, D., & Prade, H. [1999], Fundamentals of Fuzzy Sets (Vol. 1 of Handbook of Fuzzy Sets). Kluwer, Boston.
Goguen, J. A. [1967], “L-fuzzy sets.” Journal of Mathematical Analysis and Applications, 18(1), 145–174.
Gorzalczany, M. B. [1987], “A method for inference in approximate reasoning based on interval-valued fuzzy sets.” Fuzzy Sets and Systems, 21(1), 1–17.
Hájek, P. [1998], Metamathematics of Fuzzy Logic. Kluwer, Boston.
Hellendoorn, H., & Driankov, D. (eds.) [1997], Fuzzy Model Identification: Selected Approaches. Springer-Verlag, New York.
Hughes, G. E., & Cresswell, M. J. [1996], A New Introduction to Modal Logic. Routledge, London and New York.
John, R. [1998], “Type 2 fuzzy sets: An appraisal of theory and applications.” International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 6(6), 563–576.
Kandel, A. [1982], Fuzzy Techniques in Pattern Recognition. John Wiley, New York.
Karnik, N. N., Mendel, J. M., & Liang, Q. [1999], “Type-2 fuzzy logic systems.” IEEE Transactions on Fuzzy Systems, 7(6), 643–658.
Kaufmann, A., & Gupta, M. M. [1985], Introduction to Fuzzy Arithmetic: Theory and Applications. Van Nostrand, New York.
Klir, G. J. [1997], “Fuzzy arithmetic with requisite constraints.” Fuzzy Sets and Systems, 91(2), 165–175.
Klir, G. J. [1999], “On fuzzy-set interpretation of possibility theory.” Fuzzy Sets and Systems, 108(3), 263–273.
Klir, G. J., & Pan, Y. [1998], “Constrained fuzzy arithmetic: Basic questions and some answers.” Soft Computing, 2(2), 100–108.
Klir, G. J., & Yuan, B. [1995], Fuzzy Sets and Fuzzy Logic: Theory and Applications. Prentice-Hall, Upper Saddle River, NJ.
Klir, G. J., & Yuan, B. (eds.) [1996], Fuzzy Sets, Fuzzy Logic, and Fuzzy Systems: Selected Papers by Lotfi A. Zadeh. World Scientific, Singapore.
Kosko, B. [1993], Fuzzy Thinking: The New Science of Fuzzy Logic. Hyperion, New York.
Kosko, B. [1999], The Fuzzy Future. Harmony Books, New York.
Lin, C. T., & Lee, C. S. G. [1996], Neural Fuzzy Systems: A Neuro Fuzzy Synergism to Intelligent Systems. Prentice-Hall, Upper Saddle River, NJ.
McNeill, D., & Freiberger, P. [1993], Fuzzy Logic: The Discovery of a Revolutionary Computer Technology—and How It Is Changing Our World. Simon & Schuster, New York.
Mendel, J. M. [2001], Uncertain Rule-Based Fuzzy Logic Systems. Prentice-Hall, Upper Saddle River, NJ.
Moore, R. E. [1966], Interval Analysis. Prentice-Hall, Englewood Cliffs, NJ.
Nauck, D., Klawonn, F., & Kruse, R. [1997], Foundations of Neuro-Fuzzy Systems. John Wiley, New York.
Negoita, C. V., & Ralescu, D. A. [1975], Applications of Fuzzy Sets to Systems Analysis. Birkhäuser, Basel–Stuttgart, and Halsted Press, New York.
Neumaier, A. [1990], Interval Methods for Systems of Equations. Cambridge University Press, Cambridge (UK) and New York.
Nguyen, H. T., & Prasad, N. R. (eds.) [1999], Fuzzy Modeling and Control: Selected Works of M. Sugeno. CRC Press, Boca Raton, FL.
Nguyen, H. T., & Walker, E. A. [1997], A First Course in Fuzzy Logic. CRC Press, Boca Raton, FL.
Novák, V. [1989], Fuzzy Sets and Their Applications. Adam Hilger, Bristol and Philadelphia.
Novák, V., & Perfilieva, I. (eds.) [2000], Discovering the World with Fuzzy Logic. Physica-Verlag/Springer-Verlag, Heidelberg and New York.
Novák, V., Perfilieva, I., & Močkoř, J. [1999], Mathematical Principles of Fuzzy Logic. Kluwer, Boston.
Pal, S. K., & Dutta Majumder, D. K. [1986], Fuzzy Mathematical Approach to Pattern Recognition. Wiley Eastern Limited, New Delhi.
Pal, S. K., & Mitra, S. [1999], Neuro-Fuzzy Pattern Recognition: Methods in Soft Computing. John Wiley, New York.
Pawlak, Z. [1991], Rough Sets: Theoretical Aspects of Reasoning About Data. Kluwer, Boston.
Pedrycz, W. [1989], Fuzzy Control and Fuzzy Systems. Research Studies Press, Rayton, UK.
Pedrycz, W., & Gomide, F. [1998], An Introduction to Fuzzy Sets: Analysis and Design. MIT Press, Cambridge, MA.
Piegat, A. [2001], Fuzzy Modelling and Control. Physica-Verlag/Springer-Verlag, Heidelberg and New York.
Rescher, N. [1969], Many-Valued Logic. McGraw-Hill, New York.
Ruspini, E. H., Bonissone, P. P., & Pedrycz, W. (eds.) [1998], Handbook of Fuzzy Computation. Institute of Physics Publications, Bristol (UK) and Philadelphia.
Rutkowska, D. [2002], Neuro-Fuzzy Architectures and Hybrid Learning. Physica-Verlag/Springer-Verlag, Heidelberg and New York.
Sanchez, E., Shibata, T., & Zadeh, L. A. (eds.) [1997], Genetic Algorithms and Fuzzy Logic Systems: Soft Computing Perspectives. World Scientific, Singapore.
Sancho-Royo, A., & Verdegay, J. L. [1999], “Methods for the construction of membership functions.” International Journal of Intelligent Systems, 14(12), 1213–1230.
Sato, M., Sato, Y., & Jain, L. C. [1997], Fuzzy Clustering Models and Applications. Physica-Verlag/Springer-Verlag, Heidelberg and New York.
Van Leekwijck, W., & Kerre, E. E. [1999], “Defuzzification: criteria and classification.” Fuzzy Sets and Systems, 108(2), 159–178.
Yager, R. R. [1983], “Quantified Propositions in a Linguistic Logic.” International Journal of Man–Machine Studies, 19(2), 195–227.
Yager, R. R. [1985–86], “Reasoning with fuzzy quantified statements.” Kybernetes, 14,15 (Part I, Part II), 233–240, 111–120.
Yager, R. R. [1991], “Connectives and quantifiers in fuzzy sets.” Fuzzy Sets and Systems, 40(1), 39–75.
Yager, R. R., & Filev, D. P. [1994], Essentials of Fuzzy Modeling and Control. John Wiley, New York.
Yager, R. R., Ovchinnikov, S., Tong, R. M., & Nguyen, H. T. (eds.) [1987], Fuzzy Sets and Applications—Selected Papers by L. A. Zadeh. John Wiley, New York.
Zadeh, L. A. [1965], “Fuzzy Sets.” Information and Control, 8(3), 338–353.
Zadeh, L. A. [1978], “Fuzzy sets as a basis for a theory of possibility.” Fuzzy Sets and Systems, 1(1), 3–28.
Chapter 3
Fuzzy Logic and Earth Science: An Overview
Robert V. Demicco
3.1 Introduction
3.2 Crisp Sets and Geology
3.3 Fuzzy Sets in Geology
3.4 Fuzzy Logic Systems
3.4.1 Application of standard (“Mamdani”) inference rules to compaction curves
3.4.2 Application of standard (“Mamdani”) inference rules to coral reef growth
3.4.3 Application of self-adjusting inference rules to calculation of exposure index
3.4.4 Carbonate production as a function of depth and distance to platform edge
3.4.5 Permeability as a function of grain size and sorting using fuzzy clustering
3.4.6 Adding more antecedent variables: permeability revisited
3.5 Summary and Conclusions
References
3.1 Introduction

We are sure that most geologists are aware of the trend toward explicit use of the terms Earth, System, and Science together in the titles of an increasing number of geology textbooks. We need look no further than introductory texts in “geology,” including The Blue Planet: an Introduction to Earth Systems Science [Skinner & Porter, 1999]; Earth’s Dynamic Systems [Hamblin & Christiansen, 2001]; The Earth System [Kump et al., 1999]; Earth System History [Stanley, 1999]; and Earth Systems [Ernst, 2000], to name a few. Indeed, the first ten chapters of the influential textbook Understanding Earth by Press & Siever [2001] are organized into a unit entitled “Understanding the Earth System.” This trend (although many of our more traditional colleagues would say “fad”) acknowledges the fact that there is an independent field of inquiry known
as “Systems Science” that has rapidly evolved over the second half of the 20th and beginning of the 21st centuries. Although the roots of system science lie in antiquity, it did not become a recognized discipline until the latter half of the 20th century. It arose from three principal roots: (i) successful efforts in mathematics to introduce and develop more expressive formalized languages, such as fuzzy set theory, fuzzy measure theory, fractal geometry, cellular automata, etc.; (ii) the emergence of computer technology, which opened new methodological possibilities as well as laboratory tools for the prospective systems science; and (iii) a host of ideas, often captured by the general term systems thinking, which emerged in the 20th century. Systems thinking included ideas emanating from the renewed interest in holism in science, the emergence of interdisciplinary areas in science, and some developments in engineering (control theory, information theory, similarity theory, etc.). It is not our intent here to review the origin, scope, and methodology of system science or to comment on whether the use of Earth System Science in the books mentioned above is really appropriate (it is in some cases, not in others). The interested reader is referred to Klir [2001].

If geologists are going to use the term “system” in the same context as system scientists, then we need to understand this term in the sense that system scientists use it. System science seeks to categorize, understand, and exploit the interactions and linkages among components of some arbitrary division of either the artificial world (such as telephone networks) or the naturally occurring world. In the system science perspective, it is not so much the components of the arbitrary division that matter but the interactions between and among those components.
The most widely known concepts of system science that have become generally used in the more traditional sciences (including geology) are positive and negative feedback loops as process controls. Use of the term system in this specific way allows geologists to tap into the paradigms, methodologies, insights, etc. of system science.

Mathematics is clearly at the core of system science. It has been widely appreciated that the most important tool produced by mathematics prior to the 20th century was the calculus. Indeed, Newtonian mechanics, which is a major outcome of the calculus, still comprises the bedrock of many Earth sciences (physical oceanography, seismology, climatology, whole-Earth geophysics, etc.). For example, the “diffusion equation” has application in heat flow, and nearly all current groundwater flow modeling involves piecewise finite difference or finite element approximate solutions over a grid of solution points. In spite of its widespread and successful use in the Earth sciences, Newtonian mechanics is best suited to rather simple problems involving deterministic and predominantly linear systems that have only a limited number of variables. Along with this crown jewel of pre-twentieth century mathematics there arose a traditional view that uncertainty (imprecision, nonspecificity, vagueness, inconsistency, etc.) was unscientific and had to be avoided.

In the late nineteenth century, science turned to the study of physical processes at the molecular level. It rapidly became clear that, although the precise laws of
Newtonian mechanics were relevant to physical processes at this level, they were not applicable in practice due to the sheer number of entities involved and the number of calculations that would have to be made. Statistical mechanics arose from these practical difficulties. It developed to deal with systems wherein there were many individuals (whether the individuals were molecules or discrete telephones), many variables, and where the variables interacted with a very high degree of randomness. The role played in Newtonian mechanics by calculus (where there is no uncertainty) was replaced by probability theory in statistical mechanics. Along with the rise of statistical mechanics came the realization that uncertainty is not only welcome in science, but also essential to disciplines such as statistical mechanics and quantum theory. Newtonian mechanics and statistical mechanics are highly complementary. The differential equations at the heart of Newtonian mechanics excel in modeling systems involving relatively small numbers of variables that are related to each other in predictable ways. On the other hand, statistical mechanics has the exact opposite characteristics: an ability to model large numbers of variables with a high degree of randomness in their interactions. It is now generally agreed that these mathematical tools only cover problems at the opposite ends of the complexity and randomness scales. In a well-known paper, Warren Weaver [1948] referred to these as problems of organized simplicity and disorganized complexity. He argued that most problems in the sciences as well as in modern technology lie somewhere between these two extremes. They involve nondeterministic and highly nonlinear systems with large numbers of components and rich interactions among the components. Furthermore, the non-deterministic nature of these systems does not arise out of randomness that can yield meaningful statistical averages and be tackled by statistical methods. 
Weaver called them systems of organized complexity. We would maintain that most current research in geology and all research in the area of Earth systems science focus on natural systems of organized complexity. This research has, at its heart, the desire to construct rigorous mathematical models of the behavior of Earth systems. One of the mathematical tools initially developed by systems scientists to deal with problems of organized complexity is fuzzy logic. Fuzzy logic arose out of a fundamentally different way of dealing with uncertainty. Zadeh [1965] introduced a theory of mathematical objects he called fuzzy sets—sets wherein the boundaries are not precise. Fuzzy logic (based on the mathematical manipulation of fuzzy sets) provides another approach toward modeling complex systems, an approach based on common sense, intuition, and natural language, where precise mathematical formulations of chemical and physical components of a system are replaced by natural, linguistic rules based on expert human understanding of the natural system. This chapter has two purposes. First, we would like to point out why fuzzy logic concepts naturally lend themselves to applications in the Earth sciences. Second, we would like to show how the basic concepts of fuzzy logic could be applied to the Earth sciences by way of a few simple examples. The chapters that follow this one
have more complicated, explicit case histories of application of fuzzy logic in Earth sciences.
3.2 Crisp Sets and Geology

It is obvious that almost all of the variables at the core of the Earth sciences are continuous. Obvious examples include temperature, pressure, depth in the ocean, etc. Less obvious examples are solid solution composition of feldspars, amount of quartz in a plutonic igneous rock, etc. Moreover, it is also true that many of these continua vary over many orders of magnitude. The size of sedimentary particles, for example, ranges at least from $10^{-4}$ mm through $10^{4}$ mm. Likewise, permeability, a parameter from Darcy’s Law (footnote 1) which depends only on the material properties of a porous medium, varies from approximately $10^{-16}$ to $10^{-3}$ cm$^2$ for Earth materials. It is also true that most of these variables are, more often than not, broken up into arbitrary “pigeon holes” by geologists seeking to “classify.” Such “pigeon hole” classification schemes can be represented mathematically as conventional crisp sets.

In a crisp set, an individual is either included in a given set or not included in it. This distinction is often described by a characteristic function. The value of either 1 or 0 is assigned by this function to each individual of concern, thereby discriminating between individuals that either are members of the set (the assigned value is 1) or are not members of the set (the assigned value is 0). Figure 3.1a is an example: the crisp set concept of “water depth” applied to a typical, shallow-marine setting. The domain of this variable ranges from 2 m below mean sea level to 2 m above mean sea level. This continuum is generally partitioned into a number of crisp sets: subtidal, intertidal, and supratidal, with the intertidal being further subdivided into high-intertidal, mid-intertidal, and low-intertidal areas [Reading & Collinson, 1996, p. 213].
In the example shown in Figure 3.1a these crisp sets are the following closed or left-open intervals of real numbers (expressing measurements in meters):

Subtidal = [−2, −0.75]
Low-intertidal = (−0.75, −0.25]
Mid-intertidal = (−0.25, 0.25]
High-intertidal = (0.25, 0.75]
Supratidal = (0.75, 2].

Figure 3.1 Comparison of a crisp set description of the variable “tidal range” (a), with a fuzzy set description (b). In (a): “mean low water” = −1.25 m, “mean sea level” = 0 m, and “mean high water” = 0.75 m. The fuzzy set representation better captures natural variations (implied by the adjective “mean”) due to periodic tidal curve changes resulting from the ebb–neap–ebb cycle, and non-periodic, random variations such as storm flooding, etc.

1 Darcy’s Law can be expressed by the formula

$$v = -\frac{k \rho g}{\mu}\,\frac{dh}{dl}$$

where v is the specific discharge, ρ is the fluid density, g is the gravitational acceleration, μ is the viscosity of the fluid, h is the hydraulic head (a proxy for a fluid potential field made up of potential energy and kinetic energy terms), l is the length over which the potential change is measured, and k is the permeability.

Each of these sets (intervals) may also be expressed by a characteristic function. Denoting, for example, the characteristic function of the set (interval) representing mid-intertidal water depth by A, we have

$$A(x) = \begin{cases} 1 & \text{when } x \in (-0.25,\ 0.25] \\ 0 & \text{otherwise.} \end{cases} \tag{3.1}$$

However, on modern tidal flats, these boundaries are constantly changing due to periodic variations in over a dozen principal tidal harmonic components [see Table 11.1 in Knauss, 1978]. More importantly, flooding due to anomalous “wind tides” and “barometric tides” [Knauss, 1978] is commonly what drives erosion and deposition on beaches, tidal flats, etc.
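The crisp partition of water depth and the characteristic function of Eq. (3.1) can be sketched in a few lines of Python. The interval bounds are those listed above; the table layout and the function names are our own illustrative assumptions.

```python
# Crisp tidal zones from the text: (name, lower, upper, lower_bound_closed).
# Every interval is closed on the right; only "subtidal" is closed on the left.
ZONES = [
    ("subtidal", -2.0, -0.75, True),
    ("low-intertidal", -0.75, -0.25, False),
    ("mid-intertidal", -0.25, 0.25, False),
    ("high-intertidal", 0.25, 0.75, False),
    ("supratidal", 0.75, 2.0, False),
]


def characteristic(zone, x):
    """Characteristic function of a crisp zone: 1 if x belongs to it, else 0."""
    for name, lower, upper, lower_closed in ZONES:
        if name == zone:
            above_lower = x >= lower if lower_closed else x > lower
            return 1 if above_lower and x <= upper else 0
    raise KeyError(zone)


def classify(x):
    """Name of the single crisp zone containing elevation x (m), or None."""
    for name, *_ in ZONES:
        if characteristic(name, x) == 1:
            return name
    return None


# A point exactly on a boundary falls wholly on one side of it.
print(classify(-0.25), classify(-0.2499))
```

The abrupt change of class across a boundary, however small the change in elevation, is exactly the behavior that the constantly shifting tidal boundaries make unrealistic.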
3.3 Fuzzy Sets in Geology

Zadeh [1965] introduced a concept that has come to be called a standard fuzzy set in order to convey the inherent imprecision of arbitrary “pigeon hole” boundaries. The imprecision of these boundaries results from both the limited precision of measurement and, as in the case of tidal flats, the difficulty of pinning down an ever-changing location. In a standard fuzzy set the characteristic function is generalized by allowing us to assign not only 0 or 1 to each individual of concern, but also any value between 0 and 1. This generalized characteristic function is called a membership function (Figure 3.1b). The value assigned to an individual by the membership function of a fuzzy set is interpreted as the degree of membership of the individual in the standard fuzzy set. The membership function B(x) of the standard fuzzy set “mid-intertidal” represented in Figure 3.1b is

$$B(x) = \begin{cases} 0 & \text{when } x \le -0.5 \text{ m} \\ \dfrac{x + 0.5}{0.5} & \text{when } -0.5 \le x \le 0 \text{ m} \\ \dfrac{0.5 - x}{0.5} & \text{when } 0 \le x \le 0.5 \text{ m} \\ 0 & \text{when } 0.5 \text{ m} \le x \end{cases} \tag{3.2}$$

The fuzzy set description of tidal range given in Figure 3.1b better captures the essence of the gradations between locations on beaches, tidal flats, etc. Similarly, 1 to 2 meters below sea level is certainly shallow, but where does a carbonate platform or siliciclastic shelf become “deep” or “open” (see Nordlund [1996])? Using fuzzy sets, there can be a complete gradation between all these depth ranges. Each membership function is represented by a curve that indicates the assignment of a membership degree in a fuzzy set to each value of a variable within the domain of the variable involved (e.g. the variable “water depth”). The membership degree may also be interpreted as the degree of compatibility of each value of the variable with the concept represented by the fuzzy set (e.g. subtidal, low-intertidal, etc.).
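The membership function of Eq. (3.2) is a triangle with feet at −0.5 m and 0.5 m and a peak at 0 m, and it can be evaluated with a short Python sketch. The generic `triangular` helper and the neighbouring “low-intertidal” triangle are hypothetical additions, introduced only to show how one elevation can belong to two fuzzy sets at once; the shapes actually drawn in Figure 3.1b may differ.

```python
def triangular(x, left, peak, right):
    """Triangular membership function: 0 outside (left, right), 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def mid_intertidal(x):
    """B(x) of Eq. (3.2): fuzzy set 'mid-intertidal', x in meters."""
    return triangular(x, -0.5, 0.0, 0.5)


def low_intertidal_assumed(x):
    """Hypothetical neighbouring fuzzy set (an assumption, not from the text)."""
    return triangular(x, -1.0, -0.5, 0.0)


# An elevation of -0.25 m is the crisp low/mid boundary; here it belongs
# to both fuzzy sets to the same degree.
x = -0.25
print(mid_intertidal(x), low_intertidal_assumed(x))
```

Unlike the characteristic function of Eq. (3.1), the membership degree changes gradually with elevation, so a small change in x never causes an abrupt change of class.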
Curves of the membership functions can be simple triangles, trapezoids, bell-shaped curves, or have more complicated shapes. Contrary to the symbolic role of numbers 1 and 0 in characteristic functions of crisp sets, numbers assigned to individuals by membership functions of standard fuzzy sets have a clear numerical significance. This significance is preserved when crisp sets are viewed (from the standpoint of fuzzy set theory) as special fuzzy sets. Another example of the difference between crisp and fuzzy sets is provided by the concept of “grain size” (also mentioned in Section 2.1). The domain of this variable ranges over at least eight orders of magnitude from particles that are sub-micron size to particles that are meter size. Because of this spread in the domain of the variable, grain size is usually represented over a base 2 logarithmic domain. This continuum
is generally divided into six crisp sets (footnote 2): clay, silt, sand, gravel, cobbles and boulders (Figure 3.2a). The characteristic function A(x) of sand is, for example,

$$A(x) = \begin{cases} 1 & \text{when } \tfrac{1}{16} \text{ mm} \le x \le 2 \text{ mm} \\ 0 & \text{otherwise} \end{cases} \tag{3.3}$$

In this crisp set representation of grain size a grain with diameter of 1.9999 mm would be classified as sand, whereas a grain with diameter of 2.0001 mm would be classified as gravel. If fuzzy sets are used instead of crisp sets (Figure 3.2b), then the artificial classification boundaries are replaced by gradational boundaries and the two grains described would share membership in both sets, described by the linguistic terms “coarse sand” and “gravel.” With increasing diameter of grains, the membership in “gravel” will increase and the membership in “sand” will decrease in some way that depends on the application context. The basic idea is that the membership in a fuzzy set is not a matter of affirmation or denial, as it is in a classical set, but a matter of degree.

2 However, for some usages “sand” is further subdivided into up to 20 “pigeon holes.”

Figure 3.2 Comparison of a crisp set description of the variable “grain size” (a) with a fuzzy set description (b) of part of the range of the variable. (The “phi” scale (φ) is the negative base-2 logarithm of the particle size in millimeters.)
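The contrast between the two grains of 1.9999 mm and 2.0001 mm can be made concrete with a minimal Python sketch. The crisp function follows Eq. (3.3). The fuzzy “gravel” and “coarse sand” functions are purely illustrative assumptions: a linear ramp on the phi scale, one phi unit wide and centered on the 2 mm boundary, which is not the shape used in Figure 3.2b.

```python
import math


def phi(d_mm):
    """Grain size on the phi scale: negative base-2 log of diameter in mm."""
    return -math.log2(d_mm)


def crisp_sand(d_mm):
    """Characteristic function of crisp 'sand', Eq. (3.3)."""
    return 1 if 1 / 16 <= d_mm <= 2 else 0


def gravel_degree(d_mm, half_width=0.5):
    """Hypothetical membership in fuzzy 'gravel'.

    Assumption: a linear ramp in phi units centered on 2 mm (phi = -1),
    rising from 0 at phi = -1 + half_width to 1 at phi = -1 - half_width.
    """
    t = ((-1.0 + half_width) - phi(d_mm)) / (2.0 * half_width)
    return min(1.0, max(0.0, t))


def coarse_sand_degree(d_mm, half_width=0.5):
    """Hypothetical complementary membership, valid near 2 mm only."""
    return 1.0 - gravel_degree(d_mm, half_width)


for d in (1.0, 1.9999, 2.0, 2.0001, 4.0):
    print(d, crisp_sand(d),
          round(coarse_sand_degree(d), 3), round(gravel_degree(d), 3))
```

Under these assumptions the two grains receive nearly identical degrees of membership in both fuzzy sets, whereas the crisp function assigns them to different classes. Working on the phi scale keeps the ramp simple, which is one way around the awkward formulas that arise when a logarithmic domain is combined with an arithmetic range.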
The membership functions in this case have complicated formulas because the domain is represented on a logarithmic scale whereas the range is on an arithmetic scale. Other examples of fuzzy sets relevant to geologic systems might include the thickness of sediments eroded and deposited, the anorthite content in plagioclase feldspar, and the velocity of flow in an aquifer. In these contexts, terms such as “produce some,” “erode a little,” “about 30% anorthite,” or “very slow fluid flow” have meaning. We can apply the concept of fuzzy sets to combinations of variables. For example, Figure 3.3a is the classification of intrusive igneous rocks recommended by the International Union of Geological Sciences [Streckeisen, 1974] and features a classification space comprising two back-to-back triangles of the kind every student of geology will remember from petrography classes. In this scheme, the rock name is based on the ratios of three minerals: (1) alkali feldspar, rich in K and Na (symbolized by an A at the left apex of the diamond); (2) plagioclase feldspar, containing variable amounts of Na and Ca (symbolized by a P at the right apex of the diamond); and (3) either quartz (symbolized by a Q at the upper apex of the diamond) or minerals referred to as “feldspathoids” (symbolized by an F at the lower apex of the diamond). The numbers along the straight-line joins between the four apices indicate the relative percentages of either A–P–Q or A–P–F, where the percentages of these minerals have been normalized to 100%. For example, “granite” contains between 20% and 60% quartz and therefore contains between 80% and 40% feldspar. The feldspars in turn must lie between a 10%–90% A–P mixture and a 90%–10% A–P mixture. In a crisp set representation of the concept “granite” (Figure 3.3b) we would assign a 1 to those combinations of variables that fit the definition of granite, and a 0 elsewhere.
A fuzzy set representation of granite (Figure 3.3c) would take into account the obviously transitional nature of the boundaries where, depending on the context, values between 0 and 1 would be assigned. In other words, a particular igneous rock can be simultaneously a member of the fuzzy set quartz syenite and granite. Adopting such a scheme would go a long way toward settling the debates of where, exactly, the boundaries of the pigeon holes should be. Needless to say, fuzzy set concepts could be applied to the classification of sedimentary and metamorphic rocks as well. Other, nonstandard types of fuzzy sets have been introduced in the literature (see overview in Section 2.9). In this chapter, however, we consider only standard fuzzy sets in which degrees of membership are characterized by numbers between 0 and 1. Therefore the adjective “standard” is omitted. Fuzzy sets are a powerful tool for relating independent to dependent variables, as is demonstrated below. However, there are some instances in which the use of crisp sets is quite adequate. For example, the bed forms (and by extension the sedimentary structures) that form under steady, uniform flows in flumes or in nature are well represented by crisp sets. There is no significant transitional form between ripples and dunes where steady states obtain. Thus, it would be inappropriate to use a fuzzy set description of bed forms. However, we should recognize that crisp sets are special fuzzy sets.
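The crisp and fuzzy descriptions of “granite” discussed above can be sketched in a few lines. The crisp limits (20% to 60% quartz, feldspar ratio between 10% and 90%) come from the text. The 5% transition half-width, the trapezoidal shape, and the use of the minimum (the standard fuzzy intersection) to combine the two variables are assumptions made for illustration; Figure 3.3c may use a different shape.

```python
def trapezoid(x, a, b, c, d):
    """Membership rising from a to b, equal to 1 on [b, c], falling from c to d."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

def crisp_granite(quartz, feldspar_ratio):
    """1 inside the granite field, 0 elsewhere (both arguments in percent)."""
    return 1.0 if 20 <= quartz <= 60 and 10 <= feldspar_ratio <= 90 else 0.0

W = 5.0  # assumed half-width of each gradational boundary, in percent

def fuzzy_granite(quartz, feldspar_ratio):
    """Degree to which a composition is granite, using min as the intersection."""
    mu_quartz = trapezoid(quartz, 20 - W, 20 + W, 60 - W, 60 + W)
    mu_feldspar = trapezoid(feldspar_ratio, 10 - W, 10 + W, 90 - W, 90 + W)
    return min(mu_quartz, mu_feldspar)

for q in (18.0, 20.0, 22.0, 40.0):
    print(q, crisp_granite(q, 50.0), fuzzy_granite(q, 50.0))
```

A rock with 18% quartz is excluded outright by the crisp definition but retains a partial membership in the fuzzy one, which is the behavior the transitional boundaries are meant to capture.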
Figure 3.3 (a) The International Union of Geological Sciences recommended classification of plutonic igneous rocks [Streckeisen, 1974]. See text for details. (b) Perspective view of the upper triangular portion of (a) showing a crisp set representation of the term granite. A 1 is assigned to all values within the trapezoidal boundaries of granite. (c) Perspective view of the upper triangular portion of (a) showing a possible fuzzy set representation of the rock name granite.
3.4 Fuzzy Logic Systems

There is a wealth of observational and experimental data in the geological sciences. Moreover, a complicated mix of quantitative and qualitative data types usually characterizes any given geologic system. For example, in a metamorphic terrain we may be able to measure isotopes and elemental compositions of a few hundred samples with tremendous accuracy and repeatability. However, these samples are usually scattered over many hundreds or even thousands of kilometers in arbitrarily located, three-dimensional outcrops. Nordlund [1996] refers to qualitative information as “soft” information. Other examples of qualitative information would be “beach sands tend to be well sorted and are coarser than offshore sands,” or “carbonate sediment is produced in an offshore carbonate ‘factory’ and is transported and deposited in tidal flats,” or “basaltic magmas have a lower viscosity than more siliceous magmas.” Such statements carry information but are not easily quantified. Indeed, qualitative statements of this type are commonly very important sources of information obtained by field studies. Other examples of soft information would include descriptions of rock types, interpretations of depositional settings and their entombed fossils. These qualitative, “soft data” are usually admixed with what we might refer to as “hard data.” Hard data might include seismic (or outcrop-scale) geometric patterns of reflectors or bedding geometries, isotopic ratios along a closely spaced sampling line, etc. Fuzzy logic allows us to formalize and treat such “soft” information in a rigorous, mathematical way, and it also allows quantitative information to be treated in a more natural, continuous fashion. We would like to suggest that fuzzy logic might be a powerful and computationally efficient alternative to numerical modeling of geological systems.
Its distinct advantage is that models based on fuzzy logic are robust, easily adaptable, computationally efficient, and more attuned to common sense; in a sensitivity analysis they can be easily altered, allowing many different combinations of input parameters to be run quickly. The primary purpose of fuzzy logic is to formalize reasoning in natural language. This requires that propositions expressed in natural language be properly formalized. In fuzzy logic, the various components of natural-language propositions (predicates, logical connectives, truth qualifiers, quantifiers, linguistic hedges, etc.) are represented by appropriate fuzzy sets and operations on fuzzy sets [Zadeh, 1975–76]. Each of these fuzzy sets and operations is strongly context dependent and, consequently, must be determined in the context of each application [Klir & Yuan, 1995]. The most common fuzzy logic systems are sets of fuzzy inference rules, or “if–then” rules (see also Section 2.5). These are conditional and usually unqualified fuzzy propositions that describe the dependence of one or more output-variable fuzzy sets on one or more input-variable fuzzy sets. A simple fuzzy if–then rule assumes the canonical form

If x is A, then y is B
where A and B are linguistic values defined by fuzzy sets on the universal sets X and Y, respectively. The “if” part of the rule, “x is A,” is referred to as the antecedent or premise, whereas the “then” part of the rule, “y is B,” is referred to as the consequent or conclusion.
3.4.1 Application of standard (“Mamdani”) inference rules to compaction curves

Where fine-textured sediments are progressively buried in subsiding sedimentary basins, their porosity (the volume fraction of connected voids that allow fluid movement) is sharply reduced. Figure 3.4a [Goldhammer, 1997] summarizes the data available for fine-grained or muddy carbonate sediments from a variety of sources. Goldhammer [1997] fits two empirically derived exponential curves to these data:

$$\phi = 70\,e^{-z/263} \quad (z < 150\ \text{m}) \tag{3.4}$$

$$\phi = 40\,e^{-z/6500} \quad (z > 150\ \text{m}), \tag{3.5}$$
where φ = porosity and z = depth in meters. The dot-dash lines in Figure 3.4 show the solutions of these equations. This is a fairly typical example of “curve fitting” to geological data and, as written, these equations tell the reader that the change in porosity of muddy carbonate sediments with depth can be modeled by an exponential function. A fuzzy logic approach to this same data set could start with the straightforward statement: “Very near the surface, the porosity is high; around 100 meters or so the porosity decreases to intermediate values of around 40 percent; and then the porosity steadily decreases to low values at 4500 meters or so.” Both burial depth and porosity in this context are fuzzy sets. The plot in Figure 3.5a shows three possible membership functions for the fuzzy sets “surface,” “shallow,” and “deep” for the input variable depth of burial over the domain 0 to 4500 m. The plot in Figure 3.5b shows three possible membership functions for the fuzzy sets “low,” “medium,” and “high” for the dependent variable porosity over the domain 10% to 90%. With these fuzzy sets characterizing the input and output variables, we can formally break our statement above into three “if–then” rules:

1. If the burial depth is near-surface, then the porosity is high;
2. If the burial depth is shallow, then the porosity is medium;
3. If the burial depth is deep, then the porosity is low.

The standard (so-called “Mamdani”) interpretation [Mamdani & Assilian, 1975] of these if–then rules is shown in Figure 3.6 for a burial depth of 100 m. The left-hand column represents the input variable burial depth whereas the right-hand column
Figure 3.4 Application of fuzzy logic to compaction of lime mud with increasing burial depth. Data points shown by asterisks in (a) and (b) are from Goldhammer (1997). Dot-dash curves in (a) and (b) are empirical fits described by Equations (3.4) and (3.5). The solid lines in (a) and (b) are fuzzy logic system approximations to data. (a) Output of fuzzy logic system shown in Figures 3.5a and 3.5b. (b) Output of fuzzy logic system shown in Figures 3.5c and 3.5d.
Figure 3.5 (a) Antecedent membership functions for the variable burial depth: close to surface; shallow; and deep. (b) Consequent membership functions of the output variable porosity: low; medium; and high. The triangular membership functions in (a) and (b) produce the output curve in Figure 3.4a. The “bell-shaped” membership functions of the same variables (burial depth and porosity) in (c) and (d) produce the output curve in Figure 3.4b.
represents the output variable porosity. The upper row is the rule: “if depth is near-surface, then porosity is high.” The second row is the rule: “if depth is shallow, then porosity is medium.” Finally, the third row represents the rule: “if depth is deep, then porosity is low.” The input value (100 m in this example) is evaluated simultaneously against the antecedent of each rule, and a truth value (the degree of membership of the input value in each of the input sets “surface,” “shallow,” and “deep”) is calculated. In this case, the burial depth of 100 m “fires” only the first two rules: a depth of 100 m is not part of the fuzzy set “deep,” so the truth value of the proposition “100 m is a member of the fuzzy set ‘deep’” is 0. These truth values truncate the membership functions of the appropriate output variable. For each burial depth, the maximum of the three truncated membership functions of
Figure 3.6 Standard (“Mamdani”) interpretation of the “if–then” rules (“if the burial depth is near surface, then porosity is high”; “if the burial depth is shallow, then the porosity is medium”; and “if the burial depth is deep, then the porosity is low”) is shown for a burial depth of 100 m. The input variable is evaluated for each depth and a truth value (the degree of membership of the input variable in each of the potential input sets “surface,” “shallow,” and “deep”) is calculated. These truth values truncate the membership functions of the appropriate output variable. For each burial depth, the truncated membership functions of the output variable are aggregated by taking their maximum, and the centroid of the resulting curve is taken as the “defuzzified” output value.
the output variable is taken, and the centroid of the resulting curve is taken as the “defuzzified” output value. The solid curve in Figure 3.4a shows this fuzzy inference system evaluated over the depth range 0 to 4500 m. There are two distinct advantages in using fuzzy logic rather than empirical equations to characterize geologic systems. First, fuzzy sets describe systems in natural language. More importantly, the shapes of the membership functions can easily be changed by small increments, thereby allowing rapid “sensitivity analysis” of the effects of changing the boundaries of the fuzzy sets. An example of this is shown in Figures 3.5c and 3.5d. When the shapes and boundaries of the membership functions are slightly changed, the output function is also slightly changed. In this manner, by “trial and error,” the output values of a fuzzy inference system are changed in order to match ground truth more closely. Figure 3.5c shows differently shaped membership functions with slightly different boundaries from those of the
triangular-shaped membership functions of Figure 3.5a. The result of using the same rules as before, evaluated with these membership functions, is shown in Figure 3.4b by the solid curve (the dot-dash curve is the empirical fit suggested by Goldhammer [1997]). In robot control algorithms, where fuzzy logic was first developed, systems could self-adjust the shapes of the membership functions and set boundaries until the required task was flawlessly performed. This aspect of fuzzy systems, commonly facilitated via the learning capabilities of appropriate neural networks [Kosko, 1992; Klir & Yuan, 1995; Nauck & Klawonn, 1997] or by genetic algorithms [Sanchez et al., 1997], is one of their great advantages over numerical solution approaches. In sections below, we further discuss this aspect of fuzzy logic.
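The three-rule compaction system of this section can be sketched as follows: each rule is fired by the membership of the burial depth in its antecedent set, the consequent set is truncated at that truth value, the truncated sets are combined by their maximum, and the centroid of the result is returned. The triangular breakpoints below are assumptions chosen for illustration (they are not read from Figure 3.5), so the output will not reproduce the solid curve of Figure 3.4a exactly. The empirical fits of Equations (3.4) and (3.5) are included for comparison.

```python
import math

def tri(x, a, b, c):
    """Triangular membership: 0 at a and c, 1 at b (a == b or b == c gives a shoulder)."""
    if x < a or x > c:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x > b:
        return (c - x) / (c - b)
    return 1.0

# Assumed breakpoints: burial depth in m, porosity in percent.
DEPTH_SETS = {"surface": (0, 0, 150), "shallow": (0, 150, 4500), "deep": (150, 4500, 4500)}
POROSITY_SETS = {"high": (40, 70, 90), "medium": (10, 40, 70), "low": (10, 10, 40)}
RULES = [("surface", "high"), ("shallow", "medium"), ("deep", "low")]

def firing_strengths(depth_m):
    """Truth value of each antecedent for the given burial depth."""
    return {name: tri(depth_m, *abc) for name, abc in DEPTH_SETS.items()}

def mamdani_porosity(depth_m, n=801):
    """Truncate each consequent (min), aggregate (max), defuzzify (centroid)."""
    w = firing_strengths(depth_m)
    num = den = 0.0
    for i in range(n):
        y = 10.0 + 80.0 * i / (n - 1)  # porosity grid over 10% to 90%
        mu = max(min(w[d], tri(y, *POROSITY_SETS[p])) for d, p in RULES)
        num += y * mu
        den += mu
    return num / den

def empirical_porosity(z):
    """Equations (3.4) and (3.5)."""
    return 70.0 * math.exp(-z / 263.0) if z < 150.0 else 40.0 * math.exp(-z / 6500.0)

for z in (0, 100, 1000, 4500):
    print(z, round(mamdani_porosity(z), 1), round(empirical_porosity(z), 1))
```

Changing a single tuple in `DEPTH_SETS` or `POROSITY_SETS` and rerunning the loop is the kind of trial-and-error adjustment the text describes.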
3.4.2 Application of standard (“Mamdani”) inference rules to coral reef growth

Coral animals are capable of rapid fixation of CaCO₃ from seawater because of symbiotic photosynthetic algae within their tissues. Thus, carbonate production of these animals is, in some way, related to light penetration into the shallow ocean. Demicco & Klir [2001] contrasted “if–then” rule-based fuzzy logic models of coral reef growth with the deterministic models of Bosscher & Schlager [1992]. This example is again briefly described here as background for Chapter 9, where this problem is used to illustrate a novel method for solving differential equations. Figure 3.7 shows data on growth rates of the main Caribbean reef-building coral Montastrea annularis [Bosscher & Schlager, 1992, Figure 1, p. 503]. Bosscher & Schlager [1992], following Chalker [1981], fit the equation

$$G(z) = G(0)\,\tanh\!\left(\frac{I_o\, e^{-kz}}{I_k}\right) \tag{3.6}$$

to these data. Here z is water depth, G(z) is growth rate at a given depth z, G(0) is maximum growth rate (G at z = 0), Io is surface light intensity, Ik is saturation light intensity, and k is the extinction coefficient given in the Beer–Lambert law,

$$I_z = I_o\, e^{-kz}. \tag{3.7}$$

In Figure 3.7, the two dotted curves are fits of Equation (3.6) to the data using different values of the parameters G(0), Io, Ik, and k. Bosscher & Schlager [1992] extended these equations and developed a numerical model of the geologic history of coral reefs growing on the Atlantic shelf-slope break of Belize by a step-wise solution of the differential equation

$$\frac{dh(t)}{dt} = G_m \tanh\!\left(\frac{I_o \exp\{-k\,([h_o + h(t)] - [s_o + s(t)])\}}{I_k}\right). \tag{3.8}$$

Here dh(t)/dt is the change in the height of the coral surface with time, ho is the initial height of the surface at the start of a time step, h(t) is the growth increment
Figure 3.7 Measured growth rates of the main Caribbean reef-building coral Montastrea annularis [from Bosscher & Schlager, 1992, Figure 1]. The two dotted lines are solutions of Equation (3.6). The solid curve is the result of the fuzzy logic system described in the text.
in that time step, so is the initial sea level position for a time step, and s(t) is the variation in sea level for that time step. A simulation of coral reef growth based on this equation is shown in Figure 3.8a. The solution assumes an initial starting slope, initial values of Gm (maximum growth rate), Io (initial surface light intensity), k (extinction coefficient), and the variable sea level curve over the last 80,000 years as shown in Figure 3.8c. Demicco & Klir [2001] used a fuzzy logic system to model the growth rates of Montastrea annularis based on a natural language description that captures the essence of the data: “If the water is shallow, then the coral growth rate is fast. If the water is deep, then the coral growth rate is slow.” The input or antecedent parameter here is water depth whereas the output or consequent variable is coral growth rate. Both of these variables can be represented by fuzzy sets (Figure 3.9). Figure 3.9a shows two membership functions for the fuzzy sets “shallow” and “deep” for the input variable depth over the domain 0 to 50 m. Figure 3.9b shows two possible membership functions for the fuzzy sets “fast” and “slow” for the variable growth rate over the domain 0 to 10 mm/yr. The “Mamdani” interpretation [Mamdani & Assilian, 1975] of the if–then rules (“if the water is shallow, then the coral growth rate
Figure 3.8 Comparison of 2-dimensional models of the geologic history of coral reefs growing on the Atlantic shelf-slope break of Belize. (a) Stepwise solution of differential Equation (3.8) [Bosscher & Schlager, 1992]. The forward model solution for coral reef growth assumes an initial starting slope, initial values of Gm (maximum growth rate), Io (initial surface light intensity), and k (extinction coefficient). (b) Model of reef growth based on the same sea level curve, same starting slope, and same initial value of Gm , but with the fuzzy inference system described in the text replacing the differential equation for coral growth production. (c) Variable sea level curve of the past 80,000 years, input into both models.
is fast; if the water is deep, then the coral growth rate is slow”) is also employed in this example (Figure 3.10). The left-hand column represents the input variable water depth whereas the right-hand column represents the output variable growth rate. The upper row is the rule: “if depth is shallow, then growth rate is fast.” The second row is the rule: “if depth is deep, then growth rate is slow.” A value of the input
Figure 3.9 (a) Two membership functions for the fuzzy sets “shallow” and “deep” for the input variable depth over the domain 0 to 50 m. (b) Two membership functions for the fuzzy sets “fast” and “slow” for the variable growth rate over the domain 0 to 10 mm/yr. Membership functions were adjusted by hand to produce the visual “best fit” curve in Figure 3.7.
variable (10 m in Figure 3.10) is evaluated simultaneously against the antecedent of each rule, and a truth value (the degree of membership of the input value in each of the input sets “shallow” and “deep”) is calculated. These truth values truncate the membership functions of the appropriate output variable. For each water depth, the maximum of the two truncated membership functions of the output variable is taken, and the centroid of the resulting curve is taken as the “defuzzified” output value. The solid curve in Figure 3.7 shows this fuzzy inference system evaluated over the depth range 0 to 50 m. Demicco & Klir [2001] also developed a forward model of reef development based on the fuzzy inference system of Figures 3.9 and 3.10. The results of this model are compared with Bosscher & Schlager’s [1992] results in Figure 3.8b. In Chapter 9, Perfilieva describes a novel technique for solving Equation (3.8) on the basis of fuzzy transformations. She then applies this solution technique to another forward model of reef growth (Figure 9.10) that gives results similar to those in Figures 3.8a and 3.8b.
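For orientation, a step-wise solution of the kind applied to Equation (3.8) can be sketched with a forward-Euler loop. This is a minimal sketch, not the Bosscher & Schlager model: the bracketed terms of Equation (3.8) are read here as the water depth above the reef surface (sea level minus reef-top elevation), growth is assumed to stop while the reef top is at or above sea level, and all parameter values (`g_max`, `i0`, `ik`, `k`, the time step) are illustrative assumptions.

```python
import math

def growth_rate(depth_m, g_max=10.0, i0=2000.0, ik=250.0, k=0.1):
    """Equation (3.6): growth rate in mm/yr at water depth depth_m (m).
    Parameter values are assumed for illustration only."""
    return g_max * math.tanh(i0 * math.exp(-k * depth_m) / ik)

def simulate_reef(top_m, sea_level, years, dt=100.0):
    """Forward-Euler stepping of reef-top elevation (m).
    sea_level is a function of time (yr) returning sea-level elevation (m)."""
    t = 0.0
    history = [(t, top_m)]
    while t < years:
        level = sea_level(t)
        depth = level - top_m
        if depth > 0.0:                              # assumed: no growth when exposed
            top_m += growth_rate(depth) * 1e-3 * dt  # mm/yr converted to m per step
            top_m = min(top_m, level)                # assumed: reef cannot outgrow sea level
        t += dt
        history.append((t, top_m))
    return history

# Reef top starting 20 m below a constant sea level at 0 m.
history = simulate_reef(-20.0, lambda t: 0.0, 5000.0)
print(history[0], history[-1])
```

Replacing the constant `sea_level` function with a time-varying curve, or replacing `growth_rate` with the output of a fuzzy inference system, leaves the stepping loop unchanged.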
Figure 3.10 Standard (“Mamdani”) interpretation of the “if–then” rules (“if the water is shallow, then coral growth rate is fast; if the water is deep, then the coral growth rate is slow”) is shown for a water depth of 10 m. The input variable is evaluated for each water depth and a truth value (the degree of membership of the input variable in each of the potential input sets “shallow” and “deep”) is calculated. These truth values truncate the membership functions of the appropriate output variable. For each water depth, the truncated membership functions of the output variable are aggregated by taking their maximum, and the centroid of the resulting curve is taken as the “defuzzified” output value.
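The hand adjustment of membership functions mentioned for this example can be illustrated with a two-rule system whose “shallow” boundary is exposed as a parameter. The ramp-shaped sets, their breakpoints, the definition of “deep” as the complement of “shallow,” and the parameter name `shallow_limit` are all assumptions for illustration; they are not the functions of Figure 3.9.

```python
def falling(x, lo, hi):
    """Membership equal to 1 below lo, 0 above hi, linear in between."""
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)

def rising(x, lo, hi):
    """Complement of falling(): 0 below lo, 1 above hi."""
    return 1.0 - falling(x, lo, hi)

def coral_growth(depth_m, shallow_limit=25.0, n=501):
    """Two-rule Mamdani system: depth in m (0 to 50), growth rate in mm/yr (0 to 10)."""
    w_shallow = falling(depth_m, 0.0, shallow_limit)  # truth of "water is shallow"
    w_deep = rising(depth_m, 0.0, shallow_limit)      # truth of "water is deep"
    num = den = 0.0
    for i in range(n):
        g = 10.0 * i / (n - 1)                        # growth-rate grid
        fast = rising(g, 2.0, 10.0)
        slow = falling(g, 0.0, 8.0)
        mu = max(min(w_shallow, fast), min(w_deep, slow))
        num += g * mu
        den += mu
    return num / den

# Sensitivity of the predicted growth rate at 10 m to the "shallow" boundary.
for limit in (15.0, 25.0, 40.0):
    print(limit, round(coral_growth(10.0, shallow_limit=limit), 2))
```

Widening the “shallow” set raises the predicted growth rate at a fixed depth, which is the handle one would turn when fitting the curve to the data by eye.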
It is important to note that fuzzy logic systems are very versatile and, indeed, can be more versatile than deterministic equations. So far we have been using ordinary fuzzy sets, wherein for a given input value there is one output value. Although we will not use them in this chapter, ordinary fuzzy sets can be generalized into second-order fuzzy sets [Mendel, 2001], where the membership function assigns to each element of the universal set not one real number but a fuzzy number (a fuzzy set defined on the real numbers in the unit interval) or a closed interval of real numbers between identified lower and upper bounds. Clearly, this approach would be warranted by the spread in the initial data on coral growth rates versus depth in Figure 3.7.

3.4.3 Application of self-adjusting inference rules to calculation of exposure index

In control algorithms, where fuzzy logic was first developed, systems could self-adjust the shapes of the membership functions and set boundaries until the required
task was flawlessly performed. This aspect of fuzzy systems, commonly facilitated via the learning capabilities of appropriate neural networks [Kosko, 1992; Klir & Yuan, 1995; Nauck & Klawonn, 1997; Lin & Lee, 1996] or by genetic algorithms [Sanchez et al., 1997; Cordón et al., 2001], is one of their great advantages over numerical solution approaches. In the two examples given above of burial depth versus porosity and coral reef growth versus water depth, the membership functions were adjusted “by hand” to fit the output function to the data. In this simple procedure, we used the naturalistic boundaries suggested by the data, and trial and error adjustment of the shapes of the membership functions to obtain the desired fit. In this section, we present a fuzzy logic system that relates elevation on a carbonate flat to the absolute amount of time an area of the flat is exposed to the atmosphere (the “exposure index” of Ginsburg et al. [1977]). We use both a trial and error fit obtained “by hand” and the adaptive neuro-fuzzy system that is included in the Fuzzy Logic Toolbox of the commercial high-level language MATLAB© . Figure 3.11a shows “exposure index” versus elevation around an arbitrarily designated “mean tidal level of 0 m” for the carbonate tidal flats of northwestern Andros Island in the Bahamas (the data are from Ginsburg et al. [1977], their Figure 3, p. 8). The exposure index is the percentage of time an area at a certain elevation relative to the mean tide level stays dry. In this microtidal setting, exposure index is a complicated function of wind direction, strength, and duration as well as the astronomical tides. The fuzzy logic systems we have used to this point in this chapter have been the Mamdani fuzzy inference systems wherein the output variable is a linguistic variable whose states are standard fuzzy sets. 
In the final step of a Mamdani-type fuzzy logic system, a fuzzy set that has been generated by aggregating the appropriate truncated membership functions of the output variable has to be “defuzzified” by some averaging process (e.g., finding the “centroid”). In contrast to a Mamdani fuzzy inference system, an alternative approach to formalizing fuzzy inference systems, developed by Takagi & Sugeno [1985], employs a single “spike” as each output membership function. Thus, rather than integrating across the domain of the final output fuzzy set, a Takagi–Sugeno-type fuzzy inference system employs only the weighted average of a few data points. In Figure 3.12 we show the Takagi–Sugeno system used to generate the hand-adjusted curve in Figure 3.11a. Elevation on the tidal flat is divided into three triangular-shaped membership functions: low intertidal; high intertidal; and supratidal, shown from top to bottom in the left column of the figure. The exposure index in Figure 3.12 is represented by three Takagi–Sugeno “spikelike” (so-called “zero-order”) membership functions: little exposure (centered on 0.1 or 10%); medium exposure (centered on 0.5 or 50%); and high exposure (centered on 0.99 or 99%). The “if–then” rules for this system are, as in the case of coral reef growth, simple and intuitive. They are represented by the three rows and are (from top to bottom):

(1) If the elevation is low intertidal, then the exposure index is little;
Figure 3.11 Asterisks in figures (a) and (b) are data taken from Ginsburg et al. [1977, Figure 3, p. 8] and plot measured “exposure index” versus height around “mean tidal level” for the carbonate tidal flats of northwestern Andros Island in the Bahamas. The exposure index is the percentage of time an area at a certain height relative to the mean tide level stays dry. In this microtidal setting, exposure index is a complicated function of wind direction, strength, and duration as well as the astronomical tides. (a) Output of “hand-tuned” Takagi–Sugeno fuzzy logic system shown in Figure 3.12. (b) Output of MATLAB© adaptive neuro-fuzzy Takagi–Sugeno fuzzy logic system described in Figures 3.13 and 3.14.
Figure 3.12 Takagi–Sugeno (so-called zero-order) fuzzy inference system for relating height relative to sea level to exposure index, evaluated at an elevation of −15 cm. In this case, for each rule the output membership functions are “spikes” set at output values 0.1, 0.5, and 0.99. The truth values of the antecedent variable truncate the spikes at the appropriate values and their weighted average determines the output. The output of this system is the solid line in Figure 3.11a.
(2) If the elevation is high intertidal, then the exposure index is medium; and
(3) If the elevation is supratidal, then the exposure index is high.

For an elevation of +15 cm shown in this example, the appropriate output “spikes” are truncated and the simple weighted sum of all the output functions is computed. This fuzzy logic system was adjusted “by hand,” and does not do a particularly accurate job in fitting the data. The MATLAB© adaptive neuro-fuzzy system is a program that utilizes the learning capabilities of neural networks for tuning parameters of fuzzy inference systems on the basis of given data. However, as explained above, the fuzzy inference systems dealt with by this program are not of the classical Mamdani type but rather of the Takagi–Sugeno type. The program implements a training algorithm that employs the common backpropagation method based on the least-square error criterion [see Klir & Yuan, 1995, Appendix A]. Figure 3.13 shows the four antecedent membership functions used to “adjust” the output linear functions. These four membership functions correspond to the linguistic terms “subtidal,” “low intertidal,” “high intertidal,” and “supratidal”
Figure 3.13 Four antecedent membership functions for the input variable elevation defined relative to an arbitrary datum of 0 water depth taken as “mean sea level.” These membership functions were assigned by the adaptive neuro-fuzzy training routine of MATLAB© .
used to describe the relative position of the sediment surface with respect to “mean” tidal oscillations. The curve in Figure 3.11b shows the output of this fuzzy inference system. The training algorithm generates four linear output functions. These linear output membership functions (so-called “first-order” Takagi–Sugeno membership functions) differ from the so-called “zero-order” spikes used above in that the positions of the “spikes” are linearly related to the value of the input variable. For example, in Figure 3.14 the upper panel shows the evaluations of the four if–then rules for an elevation of −15 cm whereas the lower panel shows the evaluations of the four if–then rules for an elevation of +15 cm. Note how the positions of the output membership spikes change between the two diagrams. This is especially obvious for the upper row in each panel. The location of the “spike” in each rule is given by one of the four linear formulas below:

y = 2.08 × waterdepth + 75.46
y = 1.93 × waterdepth + 41.57
y = 0.16 × waterdepth + 94.12
y = 0.15 × waterdepth + 91.48

Here y represents the position of the output spike in the domain −128.9 to 359.7. As an example, consider the upper panel in Figure 3.14, where an elevation of −15 cm is input. The location of the spike in each of the rules is then given by:

2.08 × (−15) + 75.46 = 44.26
1.93 × (−15) + 41.57 = 12.62
0.16 × (−15) + 94.12 = 91.72
0.15 × (−15) + 91.48 = 89.15

Each spike is truncated by the appropriate membership grade for the input of −15 cm. This input value truncates the membership function subtidal to degree 0.4, the membership function low intertidal to degree 0.6, the membership function high intertidal to degree 0.01, and the membership function supratidal to degree 0.004. The final output is given by aggregating the spike location times the truth value of the appropriate input fuzzy set:

44.26 × 0.4 + 12.62 × 0.6 + 91.72 × 0.01 + 89.15 × 0.004 = 26.5498

There is a clear computational advantage in employing a Takagi–Sugeno fuzzy logic system. Moreover, it is well suited to optimization techniques and adaptive techniques. The main advantages of the Mamdani method are its widespread acceptance and its intuitive nature.

Figure 3.14 A “linear” (so-called first-order) Takagi–Sugeno machine-fit fuzzy inference system for the elevation versus exposure index on the tidal flats of Andros Island. The right column shows the four antecedent membership functions of the variable elevation over the domain −30 to +52 cm. The upper set of panels evaluates the system for an elevation of −15 cm whereas the lower set of panels evaluates the system for an elevation of +15 cm. Note that the output “spikes” change position as the input variables change. In effect the output spikes can “float” over the output domain (exposure index) in a fashion that is linearly related to the water depth. See text for further details.
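The worked example for an elevation of −15 cm can be checked with a few lines of code. The linear consequents, the quoted spike positions, and the membership grades are taken from the text; the function names are hypothetical. Two points are worth noting as observations rather than corrections: with the rounded coefficients as printed, the fourth formula evaluates to 89.23 rather than the quoted 89.15 (the printed coefficients may have been rounded), and because the four grades sum to 1.014 rather than 1, a normalized weighted average differs slightly from the plain weighted sum.

```python
# Linear consequents y = a * elevation + b, as printed in the text (elevation in cm).
CONSEQUENTS = [(2.08, 75.46), (1.93, 41.57), (0.16, 94.12), (0.15, 91.48)]

def spike_positions(elevation_cm):
    """Position of each first-order output spike for a given elevation."""
    return [a * elevation_cm + b for a, b in CONSEQUENTS]

def weighted_sum(spikes, grades):
    """Sum of spike position times truth value, as in the worked example."""
    return sum(s * g for s, g in zip(spikes, grades))

def weighted_average(spikes, grades):
    """Weighted sum normalized by the total of the truth values."""
    return weighted_sum(spikes, grades) / sum(grades)

# Values quoted in the text for an elevation of -15 cm.
quoted_spikes = [44.26, 12.62, 91.72, 89.15]
grades = [0.4, 0.6, 0.01, 0.004]  # subtidal, low intertidal, high intertidal, supratidal

print([round(y, 2) for y in spike_positions(-15.0)])
print(round(weighted_sum(quoted_spikes, grades), 4))      # 26.5498, as in the text
print(round(weighted_average(quoted_spikes, grades), 2))  # about 26.18 after normalization
```

No integration over an output domain is needed, which is the computational advantage noted above.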
3.4.4 Carbonate production as a function of depth and distance to platform edge

Where the fuzzy logic system is composed of two antecedent variables and one consequent variable and standard fuzzy sets are used, each pair of input values “maps” to a single value of the output variable, so that the output forms a surface in three dimensions. This is a particularly powerful way to envision a fuzzy logic system. Carbonate sediment production is commonly modeled as being linearly depth-dependent, as in the example of coral growth discussed above. However, the pioneering work of Broecker & Takahashi [1966] showed that sediment production on shallow carbonate platforms is dependent not only on shallow depths, but also on the residence time of the water, which, practically speaking, translates into distance away from the nearest shelf margin. Figure 3.15 shows the shallow bathymetry of the Great Bahama Bank northwest of Andros Island. The dotted contours on Figure 3.15 are carbonate production contours in kg/m² per year, taken from Broecker & Takahashi [1966, their Figure 10, p. 1585]. Broecker & Takahashi did not make any measurements that allowed them to compute carbonate production at the steep edge of the platform. However, where such measurements have been made they cluster in the range of 1 to 4 kg/m²/yr. For this exercise, we have chosen 1 kg/m²/yr as a target figure for carbonate sediment production at the margin of the Great Bahama Bank. Our models of sediment production use a 1 km² grid of bathymetric data (Figure 3.15), compute sediment production in each cell, and graph the results.
Figure 3.15 Bathymetry of the Great Bahama Bank northwest of Andros Island. The edge of the platform is taken as the −100 m sea level contour. The dotted lines are contours of carbonate sediment production in kg/m2 per year, taken from Broecker & Takahashi [1966, Figure 10, p. 1585].
The first model (Figure 3.16a) uses a simple linear interpolation of sediment production with normalized distance from the bank margin, shown in Figure 3.17a:

$$\text{production} = -0.5 \times \text{distance} + 1 \tag{3.9}$$

The second model (Figure 3.16b) uses a simple piecewise linear function of sediment production with depth, shown in Figure 3.17b:

$$\text{production} = \begin{cases} 0.01111 \times \text{depth} + 1.11111, & \text{depth} < -10\ \text{m} \\ -0.07 \times \text{depth} + 0.3, & \text{otherwise} \end{cases} \tag{3.10}$$
Figure 3.16 Models of carbonate sediment production on the Great Bahama Bank. (a) Sediment production modeled as a linear decrease with (normalized) distance from bank margin according to Equation (3.9) (also see Figure 3.17a). (b) Sediment production modeled using the piecewise linear relationship of Equation (3.10) relating sediment production to water depth (also see Figure 3.17b). (c) Sediment production modeled using the piecewise planar relationships of Equation (3.11) relating sediment production to both water depth and distance from bank margin (also see Figure 3.18a). (d) Sediment production modeled using the fuzzy logic system described in the text (also see Figures 3.18b and 3.19, and color insert).
This function has a maximum at a water depth of 10 m (depth = −10 m) and more or less resembles the data set of coral growth rates in Figure 3.7. Neither of these models does a particularly good job of predicting carbonate sediment production patterns on the Great Bahama Bank. The third model combines the two linear models:

$$\text{production} = \begin{cases} (0.01111 \times \text{depth} + 1.11111) - (0.5 \times \text{distance}), & \text{depth} < -10\ \text{m} \\ (-0.07 \times \text{depth} + 0.3) - (0.5 \times \text{distance}), & \text{otherwise} \end{cases} \tag{3.11}$$
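The three deterministic models can be written as short functions, which makes it easy to check that the depth function is continuous at −10 m. The sketch below is an illustration only: the function names are hypothetical, depth is taken in metres (negative downward), distance is the normalized distance from the bank margin, and the depth term of the combined model is assumed to be the same piecewise function as in Equation (3.10).

```python
def production_distance(distance):
    """Equation (3.9): linear decrease with normalized distance from the margin."""
    return -0.5 * distance + 1.0

def production_depth(depth):
    """Equation (3.10): piecewise linear in depth, with its maximum at -10 m."""
    if depth < -10.0:
        return 0.01111 * depth + 1.11111
    return -0.07 * depth + 0.3

def production_combined(depth, distance):
    """Equation (3.11): depth term of (3.10) minus the distance penalty.

    Assumes the shallow-water branch uses the same slope as Equation (3.10).
    """
    return production_depth(depth) - 0.5 * distance

# Evaluate the combined model for a few (depth, distance) pairs.
for depth, distance in [(-100.0, 0.0), (-10.0, 0.0), (-10.0, 1.0), (0.0, 0.5)]:
    print(depth, distance, round(production_combined(depth, distance), 3))
```

Changing the shape of such a model means re-deriving the slopes and intercepts of every branch so that the pieces still meet, which is the difficulty the fuzzy logic approach is meant to avoid.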
Figure 3.17 Linear sediment production functions. (a) Graph of Equation (3.9). (b) Graph of Equation (3.10).
Figure 3.18 Models of sediment production based on distance and water depth. (a) Graph of Equation (3.11): sediment production modeled as piecewise planes. (b) Graph of sediment production according to the fuzzy logic system described in the text and Figure 3.19.
Figure 3.18a is a graph of this function, and the results, applied to the Great Bahama Bank, are shown in Figure 3.16c. This function is not easily altered to fit observations and is not transparent to someone unfamiliar with the problem. Contrast this piecewise approach with a fuzzy logic system for the same problem. Figure 3.19 shows the two input variables (normalized distance from shelf edge and depth) and the output variable production. Distance is characterized by two Gaussian membership functions, near and far, whereas depth is characterized by two trapezoidal membership functions (deep and shallow) and one triangular membership function (maximum production depth, abbreviated max on the figure). The output variable production comprises four membership functions: hardly-any, little, some, and lots. There are six rules in this fuzzy logic system:
(1) If distance is near and depth is deep, produce hardly any;
Figure 3.19 Membership functions comprising part of the fuzzy logic system that generated the solution in Figure 3.16d. The input variables are (normalized) distance (modeled with Gaussian membership functions) and water depth (modeled with both trapezoidal and triangular membership functions). Rules relating these variables are given in the text and the solution is shown in Figure 3.18b.
(2) If distance is near and depth is max, produce lots;
(3) If distance is near and depth is shallow, produce some;
(4) If distance is far and depth is deep, produce hardly any;
(5) If distance is far and depth is max, produce little; and
(6) If distance is far and depth is shallow, produce hardly any.
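The six rules can be turned into a small Mamdani-style inference routine. The sketch below is a minimal illustration, not the system plotted in Figures 3.16d and 3.18b: every membership function parameter, the 0 to 1 production range, and all function names are hypothetical choices made for the example. It assumes the standard connectors (minimum for "and"), truncation of the output sets by the rule strength, aggregation by maximum, and centroid defuzzification.

```python
import numpy as np

def gauss(x, center, sigma):
    """Gaussian membership function."""
    return float(np.exp(-0.5 * ((x - center) / sigma) ** 2))

def trapezoid(x, a, b, c, d):
    """Trapezoid: rises on [a, b], equals 1 on [b, c], falls on [c, d]."""
    return np.clip(np.minimum((x - a) / (b - a), (d - x) / (d - c)), 0.0, 1.0)

def triangle(x, a, b, c):
    """Triangle with feet at a and c and peak at b."""
    return np.clip(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0, 1.0)

# Output universe: production in kg/m^2/y (hypothetical range 0 to 1).
Z = np.linspace(0.0, 1.0, 201)
OUTPUT_SETS = {  # hypothetical shapes, not read from Figure 3.19
    "hardly any": triangle(Z, -0.3, 0.0, 0.3),
    "little": triangle(Z, 0.1, 0.35, 0.6),
    "some": triangle(Z, 0.4, 0.65, 0.9),
    "lots": triangle(Z, 0.7, 1.0, 1.3),
}

# (distance set, depth set, production set), in the order given in the text.
RULES = [
    ("near", "deep", "hardly any"),
    ("near", "max", "lots"),
    ("near", "shallow", "some"),
    ("far", "deep", "hardly any"),
    ("far", "max", "little"),
    ("far", "shallow", "hardly any"),
]

def production(distance, depth):
    """Defuzzified production for a normalized distance and a depth in metres."""
    dist_mu = {"near": gauss(distance, 0.0, 0.3), "far": gauss(distance, 1.0, 0.3)}
    depth_mu = {
        "deep": float(trapezoid(depth, -101.0, -100.0, -40.0, -15.0)),
        "max": float(triangle(depth, -20.0, -10.0, -3.0)),
        "shallow": float(trapezoid(depth, -8.0, -3.0, 0.0, 1.0)),
    }
    aggregated = np.zeros_like(Z)
    for dist_set, depth_set, out_set in RULES:
        strength = min(dist_mu[dist_set], depth_mu[depth_set])  # fuzzy "and"
        clipped = np.minimum(strength, OUTPUT_SETS[out_set])    # truncate output set
        aggregated = np.maximum(aggregated, clipped)            # aggregate rules
    total = aggregated.sum()
    return float((Z * aggregated).sum() / total) if total > 0 else 0.0

print(production(0.0, -10.0), production(1.0, -10.0), production(0.0, -80.0))
```

Retuning such a model amounts to editing the numbers in the membership function definitions; the rules themselves stay readable and unchanged.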
Figure 3.18b is a graph of production versus depth and distance determined by this fuzzy logic system, for comparison with the piecewise planar approximation of Figure 3.18a. Figure 3.16d shows the results of this fuzzy logic model applied to the Great Bahama Bank. Both the piecewise planar model (Figure 3.16c) and the fuzzy logic model (Figure 3.16d) do a fairly good job of reproducing the carbonate sediment production pattern on the Great Bahama Bank northwest of Andros Island (Figure 3.15). We have adjusted the boundaries and shapes of the depth membership functions “by hand” to tune this model. It is important to note that tuning the model
by adjusting the membership functions is relatively easy in comparison with trying to recalculate the piecewise approximating equations. For example, Figure 3.18b is the 3-dimensional representation of the fuzzy logic system described in Figure 3.19b. The [x, y] axes are water depth and distance, and production is [z]. Notice that this rather simple fuzzy logic system generates a rather complicated non-planar relationship between depth, distance, and production. The shape of the output surface can be quickly changed by many operations, the two most commonly employed being: (1) varying the shapes and boundaries of the membership functions; and (2) switching the rule connectors between fuzzy intersection (“and”) and fuzzy union (“or”). We can, in effect, “warp” the output surface to any arbitrary shape we want by varying the shapes of the membership functions and the connectors in the rule system. This is the strength of this technique. So far, we have used the first approach and changed the shapes of the membership functions to “tune” our models. We did this both by hand and through the neuroadaptive program routines of MATLAB©.
Figures 3.20 and 3.21 show an example of the second approach, changing the inferential connector to “or.” Where “or” is used instead of “and,” fuzzy union is implied. We can also use this approach to change the number of rules necessary to characterize our fuzzy logic system. Figures 3.20a and 3.20b show membership functions for the same two antecedent variables as above, namely distance from platform margin (normalized to range from 0 for a point at the margin to 1 for the point farthest away) and water depth over the domain −100 m to 0. However, in this case, for illustrative purposes, we use only two membership functions to approximate the consequent variable sediment production, which is described as being either “low” or “lots” (Figure 3.20c).
In this case we model the system with only three rules:
(1) If the distance is near AND the depth is maximum, produce lots;
(2) If the distance is far OR the depth is deep, produce little;
(3) If the distance is far OR the depth is shallow, produce little.
Evaluation of these rules is shown in Figure 3.21. In the case where fuzzy union is implied (i.e., where the connector is “or”) the appropriate output membership functions are truncated at the highest degree of membership of either input function. Figure 3.22 shows the surface generated by this fuzzy inference system. In this chapter we use only standard representations of the connectors “and” and “or.” For other representations, see Section 2.3.3.

3.4.5 Permeability as a function of grain size and sorting using fuzzy clustering

In this example, we consider the calculation of permeability in unconsolidated sediments as a function of the average size of the grains and the sorting of the grains.
Figure 3.20 Effect of changing the inferential connector to standard “or” (instead of “and”) in a fuzzy logic system. (a) and (b) show membership functions for the two antecedent variables, distance from platform margin (normalized to be between 0 if a point is at the margin, and 1 if it is farthest away) and water depth over the domain −100 m to 0. (c) Two membership functions to approximate the consequent variable sediment production that is described as being either “low” or “lots.”
This is a highly nonlinear problem in real Earth systems because of a complicated interrelationship between the shapes of grains and the packing of grains. The permeability of a sediment or sedimentary rock is defined by Darcy’s law (equation given in footnote 1) and, in natural systems, varies over at least ten orders of magnitude; interested readers can find the details in any introductory hydrology textbook. It is very expensive and time-consuming (and in some cases impractical) to take samples of sediment and put them in a device that directly measures permeability through an application of Darcy’s law. Instead, hydrologists and oil company engineers have spent many years seeking empirical equations to determine permeability from more easily measurable parameters [see Freeze & Cherry, 1979]. The most obvious parameter might seem to be the porosity of the sediment, where the porosity is the
Figure 3.21 Standard (“Mamdani”) interpretation of the “if–then” rules (“if the distance is near AND the depth is maximum, produce lots”; “if the distance is far OR the depth is deep, produce little”; and “if the distance is far OR the depth is shallow, produce little”) is shown for relative distance of 0.02 and water depth of −30 m. The input variable is evaluated for each water depth and a truth value = degree of membership of the input variable in each of the potential input sets (“shallow” and “deep”) is calculated. In the case where fuzzy union is implied (i.e., where the connector is “or”) the appropriate output membership functions are truncated at the highest degree of membership of either input function. These truth values truncate the membership functions of the appropriate output variable. For each water depth, the truncated membership functions of the output variable are again summed, and the centroid of the appropriate curve is taken as the “defuzzified” output value.
volume fraction of connected voids that allow fluid movement. However, the porosity in a room full of cubic close-packed bowling balls is exactly the same as the porosity in a room full of cubic close-packed marbles (except for the finger holes in the bowling balls). The permeability of sediment is a measure of frictional resistance to flow, which is a function of the “tortuosity” as well as the diameter of the flow path. Although bowling balls and marbles have equal porosities, flow paths through the marbles are more tortuous and, in the marbles, more fluid is in contact with solid grains. For these reasons, the permeability of the marbles is lower than the permeability of the bowling balls even though their effective porosities are identical. Thus, most empirical equations designed to estimate permeability have the average grain size as one of the independent variables. The other independent variable that is commonly used in empirical equations to calculate permeability is the dispersion, or spread, of grain sizes (the sorting) of the sediment. The calculation of sorting is based on measuring the masses of sediment that reside on a series of standard mesh-size sieve screens,
Figure 3.22 Output graph of the fuzzy logic system described in Figure 3.21.
assuming that these masses would follow a log-normal distribution. Sediments are described as well sorted, medium sorted, and poorly sorted, on the basis of whether the range of sizes varies over a factor of 2 (well sorted), within one order of magnitude (medium), or over more than one order of magnitude (poorly sorted). It is intuitively obvious that the more poorly sorted a sediment is, the lower the permeability will be. In a frequently cited study, Krumbein & Monk [1942] prepared about 30 samples of various mean grain sizes and sorting, and measured the permeability of their artificial grain packs. They also derived the following semi-empirical equation relating permeability to average grain size and sorting:

$$k = 760\, \mathrm{GM}^{2}\, e^{-1.31\sigma} \tag{3.12}$$
where k = permeability, GM = mean grain size, and σ is a parameter that describes the range in grain sizes (or sorting) and is related to the standard deviation of the grain size where it is modeled as a normal distribution. The sands and the mixtures of sands that Krumbein & Monk [1942] used to measure permeability are all rather coarse grained. There have been a number of other studies where, instead of artificial mixtures, the permeability of naturally occurring sediments has been measured along with grain size, porosity, and sorting. Pryor [1973] measured these parameters in river and beach sands that were composed mostly of quartz, feldspar, and other aluminosilicate minerals. Enos & Sawatsky [1981] measured this same suite of parameters, plus the percentage of mud-sized material, in a suite of carbonate sediments. The data from
Figure 3.23 Three approximations to the data of Krumbein & Monk [1942] (shown as asterisks); Pryor [1973] (shown as triangles); and Enos & Sawatsky [1981] (shown as circles). (a) Semi-empirical fit using Equation (3.12). (b) Hand-fit Mamdani system further described in text and in Figure 3.24. (c) Machine-derived linear Takagi–Sugeno fit using fuzzy clusters from Figure 3.25 and antecedent membership functions shown in Figure 3.26 to characterize grain size and sorting. Note how both of the fuzzy logic systems (b) and (c) are better fits to the natural permeability measurements of Pryor [1973] and Enos & Sawatsky [1981] (see also color insert).
Krumbein & Monk [1942] (labeled with asterisks), Pryor [1973] (labeled with diamonds), and Enos & Sawatsky [1981] (labeled with circles) are shown on Figure 3.23a along with the solution to the semi-empirical relationship given in Equation (3.12). Two fuzzy logic models of the relationship between grain size, sorting, and permeability are given in Figures 3.23b and 3.23c. Figure 3.23b shows a “hand-tuned” Mamdani fuzzy logic system whereas Figure 3.23c again relies on the MATLAB© adaptive neuro-fuzzy inference engine. The input and output membership functions for the hand-tuned Mamdani fuzzy logic system are shown in Figure 3.24. The twelve rules that govern this system are: (1) If the grain size is coarse and the sediment is well sorted, then the permeability is very high;
Figure 3.24 Membership functions for the “hand-fit” Mamdani fuzzy logic system relating grain size, sorting and permeability. See text for fuzzy logic rules and Figure 3.23 for output solution. (a) Membership functions for grain size. (b) Membership functions for sorting. (c) Output membership functions for permeability.
(2) If the grain size is coarse and the sediment is medium, then the permeability is high;
(3) If the grain size is coarse and the sediment is poor, then the permeability is moderate;
(4) If the grain size is medium and the sediment is well sorted, then the permeability is moderate;
(5) If the grain size is medium and the sediment is medium, then the permeability is bad;
(6) If the grain size is medium and the sediment is poor, then the permeability is very poor;
(7) If the grain size is fine and the sediment is well sorted, then the permeability is low moderate;
(8) If the grain size is fine and the sediment is medium, then the permeability is very poor;
(9) If the grain size is fine and the sediment is poor, then the permeability is very poor;
(10) If the grain size is mud and the sediment is well sorted, then the permeability is very poor;
(11) If the grain size is mud and the sediment is medium, then the permeability is very poor; and
(12) If the grain size is mud and the sediment is poor, then the permeability is very poor.
As the number of variables rises and the number of membership functions for each variable rises, there is usually an increase in the number of rules. One of the challenges facing the use of fuzzy logic in geologic models is the elimination of redundant rules. One way to reduce the number of rules was demonstrated with the use of fuzzy union among antecedent variables. Another way to keep the number of membership functions down is to look for “clusters” in the data. Clustering encompasses a number of mathematical techniques for identifying natural groupings in a data set. There are two basic methods of fuzzy clustering: fuzzy c-means and fuzzy equivalence relation-based hierarchy. The details of these methods are outside the scope of this chapter and the interested reader is directed to Klir & Yuan [1995].
Figure 3.25 shows three views of grain size, sorting, and permeability relationships from the data set outlined above. The large asterisks are the centers of three fuzzy clusters identified by the fuzzy c-means clustering algorithm of MATLAB©. Fuzzy c-means clustering is an iterative technique that starts with a pre-determined number of clusters (three in this case) and partitions the data so that each point belongs to each cluster to some degree specified by a membership grade. Once fuzzy clusters have been identified in the data, they can serve as the starting points for the adaptive neuro-fuzzy inference engine of MATLAB©. Figure 3.23c shows the results.
The output Takagi–Sugeno functions are “zero-order” spikes centered at permeabilities of 10.50, 2.50, and 758.34.
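The iteration behind fuzzy c-means can be sketched with NumPy. This is a generic textbook version of the algorithm, not the MATLAB routine used for Figure 3.25; the function name, the fuzzifier value m = 2, the stopping tolerance, and the synthetic two-dimensional test data are all assumptions made for the illustration (the permeability data set itself is not reproduced here).

```python
import numpy as np

def fuzzy_c_means(data, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Return (centers, memberships) for data of shape (n_points, n_features).

    memberships[i, j] is the grade to which point i belongs to cluster j;
    each row sums to 1.
    """
    rng = np.random.default_rng(seed)
    # Start from a random fuzzy partition whose rows sum to 1.
    u = rng.random((data.shape[0], n_clusters))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Cluster centers are means of the data weighted by u**m.
        w = u ** m
        centers = (w.T @ data) / w.sum(axis=0)[:, None]
        # Distance of every point to every center (guarded against zero).
        dist = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        dist = np.fmax(dist, 1e-12)
        # Membership update: closer centers receive higher grades.
        inv = dist ** (-2.0 / (m - 1.0))
        new_u = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(new_u - u).max() < tol
        u = new_u
        if converged:
            break
    return centers, u

# Synthetic example: three loose groups of points in a plane.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(loc=c, scale=0.1, size=(30, 2))
                  for c in ((0.0, 0.0), (3.0, 0.0), (0.0, 3.0))])
centers, memberships = fuzzy_c_means(data, n_clusters=3)
print(centers)
```

Unlike a crisp partition, every data point keeps a membership grade in all three clusters, which is what allows the cluster centers to seed the antecedent membership functions of a neuro-fuzzy system.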
3.4.6 Adding more antecedent variables: permeability revisited

Although the number of antecedent variables is not restricted to two, two is the largest number that can conveniently be plotted. For example, we could add a third and a fourth antecedent variable to the permeability fuzzy logic system described above. The amount of mud in a sediment (whether aluminosilicate clays or aragonite needles) fundamentally affects permeability because these grains are not at all spherical, unlike “ideal” spherical silt- and sand-sized particles. Moreover, the degree to which the mud in a sediment has been aggregated into fecal pellets by organisms is yet another variable that could be taken into account when calculating permeability.
Figure 3.25 Three fuzzy c-means clusters (asterisks) for the data of Krumbein & Monk [1942], Pryor [1973], and Enos & Sawatsky [1981] (circles). (a) View in the [x y] (grain size–sorting) plane. (b) View in the [x z] (grain size–permeability) plane. (c) View in the [y z] (sorting–permeability) plane.
Empirical and semi-empirical relationships among these variables and how they affect permeability naturally would lend themselves to a fuzzy logic approach.
3.5 Summary and Conclusions

It is clear that fuzzy logic systems have the potential to produce very realistic geologic models when used as so-called “expert” systems. An expert system usually comprises the cooperation of a “knowledge engineer” (i.e., someone familiar with the techniques described in this chapter) and a geologist familiar with Earth systems problems. If the geologist can distill the key points of the models into the types of “if A and if B then C” propositions described above, then the knowledge engineer can translate them into mathematically rigorous fuzzy logic systems. It is important to note that
there is only a practical limit to the number of antecedent propositions in a fuzzy logic statement. A statement such as “if A and if B or if C then D” would map three input variables into a region of space. Moreover, there is currently an explosive growth in the theory and application of fuzzy logic and other related “soft” computing techniques, opening new ways of modeling based on knowledge expressed in natural language. This method offers a distinct alternative to statistical modeling in geology. It is more computationally efficient and more intuitive for geologists than complicated models that solve coupled sets of differential equations.
References

Bosscher, H., & Schlager, W. [1992], “Computer simulation of reef growth.” Sedimentology, 39(3), 503–512.
Broecker, W. A., & Takahashi, T. [1966], “Calcium carbonate precipitation on the Bahama Banks.” Journal of Geophysical Research, 71(6), 1575–1602.
Chalker, B. E. [1981], “Simulating light-saturation curves for photosynthesis and calcification by reef-building corals.” Marine Biology, 63(2), 135–141.
Cordón, O., Herrera, F., Hoffmann, F., & Magdalena, L. [2001], Genetic Fuzzy Systems: Evolutionary Tuning and Learning of Fuzzy Knowledge Bases. World Scientific, Singapore.
Demicco, R. V., & Klir, G. J. [2001], “Stratigraphic simulations using fuzzy logic to model sediment dispersal.” Journal of Petroleum Science and Engineering, 31(2–4), 135–155.
Enos, P., & Sawatsky, L. H. [1981], “Pore networks in Holocene carbonate sediments.” Journal of Sedimentary Petrology, 51(3), 961–985.
Ernst, W. G. (ed.) [2000], Earth Systems. Cambridge University Press, Cambridge, UK.
Freeze, R. A., & Cherry, J. A. [1979], Groundwater. Prentice-Hall, Englewood Cliffs, NJ.
Ginsburg, R. N., Hardie, L. A., Bricker, O. P., Garrett, P., & Wanless, H. R. [1977], “Exposure index: a quantitative approach to defining position within the tidal zone.” In: L. A. Hardie (ed.), Sedimentation on the Modern Carbonate Tidal Flats of Northwest Andros Island, Bahamas. The Johns Hopkins University Press, Baltimore, MD.
Goldhammer, R. K. [1997], “Compaction and decompaction algorithms for sedimentary carbonates.” Journal of Sedimentary Research, 67(3), 26–35.
Hamblin, W. K., & Christiansen, E. H. [2001], Earth’s Dynamic Systems (9th edition). Prentice-Hall, Upper Saddle River, NJ.
Klir, G. J. [2001], Facets of Systems Science (2nd edition). Kluwer/Plenum, New York.
Klir, G. J., & Yuan, B. [1995], Fuzzy Sets and Fuzzy Logic—Theory and Applications. Prentice-Hall, Upper Saddle River, NJ.
Knauss, J. A. [1978], Introduction to Physical Oceanography. Prentice-Hall, Englewood Cliffs, NJ.
Kosko, B. [1992], Neural Networks and Fuzzy Systems. Prentice-Hall, Englewood Cliffs, NJ.
Krumbein, W. C., & Monk, G. D. [1942], “Permeability as a function of the size parameters of unconsolidated sand.” American Institute of Mining and Metallurgical Engineers, Technical Publication No. 1492.
Kump, L. R., Kasting, J. F., & Crane, R. G. [1999], The Earth System. Prentice-Hall, Upper Saddle River, NJ.
Lin, C.-T., & Lee, C. S. G. [1996], Neural Fuzzy Systems: A Neuro Fuzzy Synergism to Intelligent Systems. Prentice-Hall, Upper Saddle River, NJ.
Mamdani, E. H., & Assilian, S. [1975], “An experiment in linguistic synthesis with fuzzy logic controller.” International Journal of Man–Machine Studies, 7(1), 1–13.
Mendel, J. M. [2001], Uncertain Rule-Based Fuzzy Logic Systems: Introduction and New Directions. Prentice-Hall, Upper Saddle River, NJ.
Nauck, D., & Klawonn, F. [1997], Foundations of Neuro-Fuzzy Systems. John Wiley, New York.
Nordlund, U. [1996], “Formalizing geological knowledge—with an example of modeling stratigraphy using fuzzy logic.” Journal of Sedimentary Research, 66(4), 689–712.
Press, F., & Siever, R. [2001], Understanding Earth (3rd edition). W. H. Freeman, New York.
Pryor, W. A. [1973], “Permeability–porosity patterns and variations in some Holocene sandbodies.” American Association of Petroleum Geologists Bulletin, 57(1), 162–189.
Reading, H. G., & Collinson, J. D. [1996], “Clastic coasts.” In: Reading, H. G. (ed.), Sedimentary Environments: Processes, Facies and Stratigraphy. Blackwell, Oxford.
Sanchez, E., Shibata, T., & Zadeh, L. A. (eds.) [1997], Genetic Algorithms and Fuzzy Logic Systems: Soft Computing Perspectives. World Scientific, Singapore.
Skinner, B. J., & Porter, S. C. [1999], The Blue Planet: an Introduction to Earth System Science. John Wiley, New York.
Stanley, S. M. [1999], Earth System History. W. H. Freeman, New York.
Streckeisen, A. [1974], “Classification and nomenclature of plutonic rocks.” Geologische Rundschau, 63(2), 773–786.
Takagi, T., & Sugeno, H. [1985], “Fuzzy identification of systems and its application for modeling and control.” IEEE Transactions on Systems, Man and Cybernetics, 15(1), 116–132.
Weaver, W. [1948], “Science and complexity.” American Scientist, 36(4), 536–544.
Zadeh, L. A. [1965], “Fuzzy sets.” Information and Control, 8(3), 338–353.
Zadeh, L. A. [1975–76], “The concept of a linguistic variable and its application to approximate reasoning.” Information Sciences, 8(3), 199–249, 301–357, 9(1), 43–80.
Chapter 4
Fuzzy Logic in Geological Sciences: A Literature Review
Robert V. Demicco
4.1 Introduction
4.2 A Sketch of Soft Computing
References to soft computing
4.3 Fuzzy Logic in Geology: A Literature Review
4.3.1 Surface hydrology
References to surface hydrology
4.3.2 Subsurface hydrology
References to subsurface hydrology
4.3.3 Groundwater risk assessment
References to groundwater risk assessment
4.3.4 Geotechnical engineering
References to geotechnical engineering
4.3.5 Hydrocarbon exploration
References to hydrocarbon exploration
4.3.6 Seismology
References to seismology
4.3.7 Soil science and landscape development
References to soil science and landscape development
4.3.8 Deposition of sediment
References to deposition of sediment
4.3.9 Miscellaneous applications
References to miscellaneous applications
4.4 Concluding Note: Quo Vadis
4.1 Introduction

This chapter focuses on recent literature in geologically oriented journals that deals with applications of fuzzy logic in various branches of geological sciences, sometimes in the broader context of soft computing.¹

¹ Soft computing is briefly introduced in Section 4.2.
Geology is the application of chemical, physical, and biological principles to the study of the lithosphere. The trend in the Earth sciences over the last 10 years has been to view the Earth as a system and treat the hydrosphere, atmosphere, biosphere, and lithosphere as interconnected subsystems. This approach is interdisciplinary and has been largely fueled by concern about the Earth’s present and past environments, with a growing realization that what happens in any one of the Earth’s “spheres” has impact on the others. Thus, most modern problems of interest to geologists involve systems with large numbers of components and rich interactions among the components that are usually nonlinear and non-random. Such problems of organized complexity (also see Section 3.1) typify geologic systems and are exemplified by the Earth systems that operate at the surface of the Earth. In response to the recognition that geology deals with the realm of organized complexity, there has been a recent explosive growth in the theory and application of fuzzy logic and other related soft computing techniques in the Earth sciences. Geology, of all of the natural sciences, most readily lends itself to analysis and modeling by soft computing techniques in general and fuzzy logic in particular. Fuzzy logic, in the broad sense (see Section 2.1), has at least a ten-year history (and a growing body) of refereed works where it has been successfully applied to many areas of geological research. There are a number of reasons for this. First, geologic knowledge is routinely expressed in natural language and, indeed, much of this book is a direct outgrowth of this fact. Second, geology is primarily a field science that began as an outgrowth of the mineral extraction industry. As such, the variables that geologists have routinely measured for hundreds of years are continua that commonly vary over many orders of magnitude. 
Currently, most of these naturally continuous variables are broken up into arbitrary “pigeon holes” by geologists seeking to “classify.” Fuzzy sets and fuzzy logic offer a much more natural way of describing geological variables. Third, because geological research is field based, it is commonly carried out over fairly broad regions of tens to hundreds of square kilometers in size. And, where subsurface data are added, the volume of a typical rock body being studied, even on a modest scale, for minerals, gas, groundwater, or information about conditions on the ancient Earth, ranges up to 1000 km³. Direct sampling of the entire rock or sediment body is clearly prohibitive, and so much of the three-dimensional distribution of rock properties is measured over a relatively tiny percentage of the study area and inferred over the entire volume. The inference is clearly a source of uncertainty and is commonly based on “expert knowledge” from a variety of sources to model the disposition of rock properties. Fourth, remote sensing has always been used in geological data gathering. The most common of these methods are exploration seismic surveys and geophysical measurements of boreholes. In the former, the differential impedance of rocks to artificial vibrational energy produces what amounts to an “ultrasound” of subsurface rock disposition, whereas in the latter, the gamma-ray output of rocks and the electrical resistivity of
rocks to an induced potential are measured as proxies for rock type, permeability, etc. There is also a growing reliance on satellite (and other) remote sensing imagery of surface features of the Earth in a number of geological fields. Such imagery is the basis of geographic information systems. All of these remote-sensing techniques require correspondences between the proxy measures and the desired rock, sediment, or soil properties to be established. These correspondences are commonly not one-to-one.
The remainder of this chapter is divided into three sections. Section 4.2 gives a short synopsis of how fuzzy logic relates to the broader area of soft computing. This information should help to introduce the areas of soft computing other than fuzzy logic that are mentioned in some of the papers in the following literature review and bibliography. Section 4.3 comprises a literature review and annotated bibliography wherein we have divided the current literature on fuzzy logic into a number of broader categories. Please note that the papers described in each section are listed at the end of that section rather than being amalgamated into a final list. The list of papers in each category is by no means exhaustive and, as such, the papers cited in each category do not comprise a complete bibliography of materials. However, they represent recent refereed papers in major Earth science journals that should be available in even modest university libraries. Finally, Section 4.4 is a brief postscript where we note some of the trends that seem to have developed over the last dozen years across the different areas where fuzzy logic has been applied to Earth sciences. We hope that this literature review, when coupled with the references cited in the individual chapters, will serve as an introduction into the literature.
4.2 A Sketch of Soft Computing

Soft computing comprises five areas: (1) fuzzy logic; (2) rough sets [Pawlak, 1991]; (3) neural networks [Beale & Jackson, 1990]; (4) evolutionary computation [Bäck, 1996]; and (5) probabilistic reasoning based on precise (classical) probabilities and, more importantly, imprecise probabilities [Walley, 1991]. There is much synergy among these areas, with fuzzy logic playing a lead role. Rough sets and their combinations with fuzzy sets are briefly introduced in Section 2.9. Foundations of the theory of rough sets are presented in Pawlak [1991]. Combinations of rough sets and fuzzy sets (fuzzy rough sets and rough fuzzy sets) are examined in two papers by Dubois & Prade [1990, 1992]. Neural networks are computational structures that were inspired by the architecture of the natural networks of neurons in the brain [Klir & Yuan, 1995]. A neural network comprises many interconnected computational nodes called neurons. Each neuron has several inputs and one output, which are connected to other neurons or to the environment. The output value of each neuron is at each time uniquely determined by its input values. A connection between any two neurons or a neuron with the
environment has a “strength” expressed by a number known as a weight. Each neuron thus receives inputs whose values are real numbers. On the basis of the sum of these inputs, a nonlinear activation function produces an output value of the neuron. The basic capability of a neural network is to learn patterns from examples. This is accomplished by adjusting the weights of the interconnections among the neurons according to a learning algorithm. There are various kinds of learning algorithms.

It has been recognized since the early 1990s that a connection between fuzzy systems and neural networks is beneficial for both areas [Kosko, 1992]. First, neural networks are capable of implementing fuzzy systems. The implementation is convenient and efficient, and it adds to fuzzy systems the capabilities of learning and adaptability. Second, neural networks have been found very useful for constructing and tuning fuzzy inference rules (membership functions, operations, etc.) of fuzzy systems in various application contexts, as well as for solving, via their learning capabilities, some inverse problems associated with fuzzy systems [Klir & Yuan, 1995]. On the other hand, neural networks are made more flexible and robust by fuzzification. Hybrid combinations of fuzzy systems and neural networks, which are usually called neuro-fuzzy systems, have been studied quite extensively during the last decade or so [Lin & Lee, 1996; Nauck et al., 1997; Rutkowska, 2002].

As the name suggests, evolutionary computation is another biologically inspired computational structure most commonly used for optimization. In this case, the features to be optimized are parameterized into a vector of real numbers. In a generation step, the vector first gives rise to a large number of new vectors, each of which has been altered by processes analogous to those that alter DNA strands during replication: spot mutations (changes in individual numbers), transcription errors, recombination, etc.
Next, the fitness value of each of the newly generated vectors is determined by criteria relative to the optimization task being performed. Only those offspring vectors that improve the performance survive to give rise to the next generation. The algorithm repeats generation steps until the fitness value satisfies some criterion.

A mutually beneficial connection, which was recognized recently, is the one between fuzzy systems and the area of evolutionary computation [Sanchez et al., 1997; Cordón et al., 2001]. On the one hand, the various components of evolutionary computation can be fuzzified, which enhances their efficiency, flexibility, and robustness. On the other hand, evolutionary computation is a powerful tool for learning and tuning inference rules employed in fuzzy systems. The term genetic fuzzy system was recently suggested for a hybrid system involving fuzzy systems and evolutionary computation [Cordón et al., 2001].

Reasoning with imprecise probabilities of various kinds (interval-valued, fuzzy, etc.) [Walley, 1991] and dealing with information represented by imprecise probabilities [Klir & Wierman, 1999] is a rapidly growing area of soft computing (see Klir [2003] for an overview).
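The generation-and-selection loop of evolutionary computation described in this section can be illustrated with a minimal sketch. It is a deliberately simplified scheme, assuming Gaussian spot mutations only (no recombination), a single surviving parent, and a fitness value that is to be minimized. The function names, parameter values, and the test function `sphere` are hypothetical choices for illustration and are not taken from any of the cited works.

```python
import random


def evolve(fitness, parent, n_offspring=20, sigma=0.1,
           target=1e-6, max_generations=500, seed=0):
    """Minimize `fitness` by repeated generation steps (illustrative only)."""
    rng = random.Random(seed)
    best, best_fit = list(parent), fitness(parent)
    for _ in range(max_generations):
        # Stop once the fitness value satisfies the chosen criterion.
        if best_fit <= target:
            break
        # Generation step: the current vector gives rise to many altered
        # copies; each number receives a small random "spot mutation".
        offspring = [[x + rng.gauss(0.0, sigma) for x in best]
                     for _ in range(n_offspring)]
        # Selection: only offspring that improve the fitness survive.
        for child in offspring:
            child_fit = fitness(child)
            if child_fit < best_fit:
                best, best_fit = child, child_fit
    return best, best_fit


def sphere(v):
    """Hypothetical test problem: sum of squares, with minimum at the origin."""
    return sum(x * x for x in v)


start = [1.0, -2.0, 0.5]
best, best_fit = evolve(sphere, start)
print(best_fit)
```

In a genetic fuzzy system the vector would instead hold, for example, the parameters of membership functions, and the fitness would measure how well the resulting fuzzy rules reproduce observed data.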
References to soft computing

Bäck, T. [1996], Evolutionary Algorithms in Theory and Practice. Oxford University Press, New York.
Beale, R., & Jackson, T. [1990], Neural Computing: An Introduction. Adam Hilger, New York.
Cordón, O., Herrera, F., Hoffmann, F., & Magdalena, L. [2001], Genetic Fuzzy Systems: Evolutionary Tuning and Learning of Fuzzy Knowledge Bases. World Scientific, Singapore.
Dubois, D., & Prade, H. [1990], “Rough fuzzy sets and fuzzy rough sets.” International Journal of General Systems, 17(2–3), 191–209.
Dubois, D., & Prade, H. [1992], “Putting rough sets and fuzzy sets together.” In: Slowinski, R. (ed.), Intelligent Decision Support, pp. 203–232. Kluwer, Boston.
Klir, G. J. [2003], “Uncertainty-based information.” In: Teodorescu, H., & Melo, P. (eds.), Systematic Organization of Information in Fuzzy Systems. IOS Press, Amsterdam.
Klir, G. J., & Wierman, M. J. [1999], Uncertainty-Based Information: Elements of Generalized Information Theory (2nd edition). Physica-Verlag/Springer-Verlag, Heidelberg and New York.
Klir, G. J., & Yuan, B. [1995], Fuzzy Sets and Fuzzy Logic: Theory and Applications. Prentice-Hall, Upper Saddle River, NJ. (See especially Appendix A.)
Kosko, B. [1992], Neural Networks and Fuzzy Systems. Prentice-Hall, Englewood Cliffs, NJ.
Lin, C.-T., & Lee, C. S. G. [1996], Neural Fuzzy Systems: A Neuro Fuzzy Synergism to Intelligent Systems. Prentice-Hall, Upper Saddle River, NJ.
Nauck, D., Klawonn, F., & Kruse, R. [1997], Foundations of Neuro-Fuzzy Systems. John Wiley, New York.
Pawlak, Z. [1991], Rough Sets. Kluwer, Boston.
Rutkowska, D. [2002], Neuro-Fuzzy Architectures and Hybrid Learning. Physica-Verlag/Springer-Verlag, Heidelberg and New York.
Sanchez, E., Shibata, T., & Zadeh, L. A. (eds.) [1997], Genetic Algorithms and Fuzzy Logic Systems: Soft Computing Perspectives. World Scientific, Singapore.
Walley, P. [1991], Statistical Reasoning with Imprecise Probabilities. Chapman and Hall, London.
4.3 Fuzzy Logic in Geology: A Literature Review

The categories chosen are briefly described below. It should come as no surprise that geotechnical engineering and areas closely allied to engineering, surface hydrology, subsurface hydrology, and hydrocarbon exploration, have seen early and extensive use of fuzzy logic. A literature review of surface and subsurface hydrology is also found in Section 6.2. A subcategory of subsurface hydrology contains papers describing management-oriented models for groundwater risk assessment based on fuzzy logic. These will be treated in a separate category for convenience. Exploration geophysicists in hydrocarbon exploration have adapted fuzzy logic into seismic processing and evaluation. They also use many of the soft computing techniques (most of which incorporate fuzzy logic) for the prediction of rock properties from borehole
and other data. Earthquake seismology is a discipline still very closely wedded to Newtonian mechanics. However, a growing number of geophysicists have recently adapted fuzzy logic into various aspects of earthquake research. In addition to the papers mentioned here, also consult Chapter 8. Soil science and geomorphology have seen extensive use of fuzzy logic. Models of modern and ancient deposition of sediments have seen modest use of fuzzy logic to simulate sediment production, sediment erosion, sediment transportation, and sediment deposition. Chapter 5 describes uses of fuzzy logic in forward models of basin filling, whereas Chapters 9 and 10 use novel techniques based on fuzzy logic to interpret ancient deposits. Finally, there is a scattered miscellaneous literature in the geological sciences that outlines potential uses of fuzzy logic in specific areas not included in the categories outlined here.

4.3.1 Surface hydrology

The discharge of a stream is the volume flow per unit time through a cross-section of the stream at a point along the stream’s course. Obviously, reliable prediction of discharge, especially low flows during droughts and high flows that lead to floods, is important for a variety of reasons. The discharge of a river represents a complicated, highly nonlinear response of a watershed area to precipitation. Complicating factors include: the topography of the watershed; the plant cover of the watershed; the stage in the annual growth cycle of the vegetation; whether the leaves are wet or dry; the intensity, duration, and location of precipitation; the physical properties of the soil; whether the soils are frozen or thawed, and, if thawed, their moisture content; etc. One of the principal areas of research on surface hydrology is to develop improved watershed response models that will allow the prediction of discharge from a watershed on the basis of precipitation falling in the catchment area of the watershed.
A number of deterministic models have been developed to try to predict discharge from precipitation records or forecasts, with real-time predictions as the goal. These models vary in sophistication but, as most were developed for a specific geographic area, they are difficult to apply globally and are highly parameterized. Franks et al. [1998] employed a well-known, deterministic model to study a small watershed in Brittany, France. However, they incorporated fuzzy sets to describe the relationship between slope measurements, synthetic aperture radar measurements, and the saturation state of the catchment area. The saturation state of the catchment area, in turn, was input into the standard deterministic model. Yu & Yang [2000] also used a pre-existing deterministic model to study rainfall/runoff for a small catchment in Taiwan. However, in addition to using conventional objective functions (such as root mean square error and mean absolute percent error) to calibrate the model, they also used a fuzzy multi-objective function to calibrate the model parameters. They argued that the fuzzy multi-objective function led to improved modeling results. Zhu & Mackay [2001] also used deterministic watershed response models on experimental, well-instrumented and well-studied watersheds in the western US. They used fuzzy
logic-based inferences from soil maps to feed realistic ranges of soil parameters into the model instead of the “lumped” (crisp) parameters suggested by boundaries on the maps. Xiong et al. [2001] took the results of five deterministic watershed response models applied to 11 catchment areas in Taiwan and combined them into a single response model, using a first-order Takagi–Sugeno-type model.

There are a number of recent watershed response models that are entirely based on fuzzy logic and other soft computing techniques. Gautam & Holz [2001] developed a rainfall-response model based on fuzzy rules extracted by an adaptive neuro-fuzzy inference system (ANFIS) analysis of long-term rainfall versus runoff data from a watershed in Tuscany, Italy. Chang & Chen [2001] employed a counterpropagation fuzzy-neural network on long-term data from a watershed in Taiwan. Özelkan & Duckstein [2001] began with conceptual rainfall-runoff modular models of a small subwatershed in the heavily instrumented Walnut Gulch experimental watershed in southeastern Arizona. They then modeled uncertainty among both linear and nonlinear conceptual models with fuzzy numbers and fuzzy arithmetic operations. Finally, for a portion of the River Ouse in northern England, See & Abrahart [2001] applied data fusion techniques to the output from three separate river level forecasting models: (1) a fuzzy neural network; (2) a rule-based fuzzy logic model; and (3) an autoregressive moving average model.

In addition to the volume of a stream’s discharge, the chemical quality of the water, especially during low flows, is important to planners, engineers, etc. Two studies that developed fuzzy logic-based approaches to water quality models are Rantitsch [2000] and Mujumdar & Sasikumar [2002]. Rantitsch [2000] used fuzzy c-means clustering of elements measured in stream samples from the Alps of Austria to establish four categories of background levels of various elements.
These categories were better able to screen out non-anomalous concentrations of metals.

Somewhat closely related to these watershed response models are studies that have attempted to model and predict precipitation input onto watersheds (see also Chapter 6). Pongracz et al. [1999] tried to develop a drought prediction model for the Great Plains of the US. They developed a set of fuzzy rules that related two inputs to a long-term record of droughts in eight regions of Nebraska: (1) the Southern Oscillation Index (SOI), a proxy for the El Niño Southern Oscillation (ENSO); and (2) the geopotential height field of the 500 hPa level over a large area of the western hemisphere. Stehlík & Bárdossy [2002] extended this approach to the general stochastic prediction of a time series for precipitation over Europe. They developed a fuzzy classification of point measurements of geopotential atmospheric pressure surfaces over a large-scale grid of Europe as an input into their model.
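To make the idea of a first-order Takagi–Sugeno combination of forecasts more concrete, the following sketch blends the outputs of two hypothetical rainfall-runoff models. Each rule has Gaussian membership functions in its antecedent and a linear function of the inputs as its consequent, and the final output is the firing-strength-weighted average of the rule outputs. All rule parameters, names, and numbers are illustrative assumptions and are not those of any cited study.

```python
import math


def gaussian(x, center, width):
    """Gaussian membership grade of x in a fuzzy set (illustrative shape)."""
    return math.exp(-0.5 * ((x - center) / width) ** 2)


def takagi_sugeno(inputs, rules):
    """First-order Takagi-Sugeno inference.

    Each rule is a tuple (centers, widths, coeffs, bias): the antecedent is
    a product of Gaussian memberships, the consequent is linear in the inputs.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for centers, widths, coeffs, bias in rules:
        # Firing strength of the rule: product of membership grades.
        strength = 1.0
        for x, c, s in zip(inputs, centers, widths):
            strength *= gaussian(x, c, s)
        # First-order consequent: a linear function of the inputs.
        output = bias + sum(a * x for a, x in zip(coeffs, inputs))
        weighted_sum += strength * output
        total_weight += strength
    return weighted_sum / total_weight


# Hypothetical forecasts (m^3/s) from two separate models.
forecasts = [10.0, 14.0]
rules = [
    # "If flows are low, trust model 1 more."
    ((5.0, 5.0), (5.0, 5.0), (0.7, 0.3), 0.0),
    # "If flows are high, trust model 2 more."
    ((20.0, 20.0), (5.0, 5.0), (0.3, 0.7), 0.0),
]
combined = takagi_sugeno(forecasts, rules)
print(combined)
```

Because the output is a weighted average of the two rule consequents, the combined forecast here always lies between the two linear blends of the individual forecasts. In practice the rule parameters would be fitted to observed discharge rather than chosen by hand.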
References to surface hydrology

Chang, F.-J., & Chen, Y.-C. [2001], “A counterpropagation fuzzy-neural network modeling approach to real time streamflow prediction.” Journal of Hydrology, 245(1–4), 153–164.
Franks, S. W., Gineste, P., Beven, K. J., & Merot, P. [1998], “On constraining the predictions of a distributed model: the incorporation of fuzzy estimates of saturated areas into the calibration process.” Water Resources Research, 34(4), 787–797.
Gautam, D. K., & Holz, K. P. [2001], “Rainfall-runoff modeling using adaptive neuro-fuzzy systems.” Journal of Hydroinformatics, 3(1), 3–10.
Mujumdar, P. P., & Sasikumar, K. [2002], “A fuzzy risk approach for seasonal water quality management of a river system.” Water Resources Research, 38(1), 1–9.
Özelkan, E. C., & Duckstein, L. [2001], “Fuzzy conceptual rainfall-runoff models.” Journal of Hydrology, 253(1–2), 41–68.
Pongracz, R., Bogardi, I., & Duckstein, L. [1999], “Application of fuzzy rule-based modeling technique to regional drought.” Journal of Hydrology, 224(3–4), 100–114.
Rantitsch, G. [2000], “Application of fuzzy clusters to quantify lithological background concentrations in stream-sediment geochemistry.” Journal of Geochemical Exploration, 71(1), 73–82.
See, L., & Abrahart, R. J. [2001], “Multi-model data fusion for hydrological forecasting.” Computers & Geosciences, 27(8), 987–994.
Stehlík, J., & Bárdossy, A. [2002], “Multivariate stochastic downscaling model for generating daily precipitation series based on atmospheric circulation.” Journal of Hydrology, 256(1–2), 120–141.
Xiong, L., Shamseldin, A. Y., & O’Connor, K. M. [2001], “A non-linear combination of the forecasts of rainfall-runoff models by the first-order Takagi–Sugeno fuzzy system.” Journal of Hydrology, 245(3–4), 196–217.
Yu, P.-S., & Yang, T.-C. [2000], “Fuzzy multi-objective function for rainfall-runoff model calibration.” Journal of Hydrology, 238(1–2), 1–14.
Zhu, A. X., & Mackay, D. S. [2001], “Effects of spatial detail of soil information on watershed modeling.” Journal of Hydrology, 248(1–4), 54–77.
4.3.2 Subsurface hydrology

Groundwater flow has been most commonly modeled by the piecewise solution of the diffusion equation, either by finite difference approximations or by finite elements. Solute transport in groundwater systems has likewise been modeled by finite difference or finite element approximations of the advective-dispersive equation. These types of models, which are built around the empirical Darcy’s law (see Section 3.2), are very commonly applied to problems of groundwater wellfield development and, where pollutants are dissolved in the groundwater, problems of developing remediation plans to deal with the contaminant plume. There is a growing recognition that such models (although quite common) may be inadequate [Konikow & Ewing, 1999]. This inadequacy is due to the inherent imprecision of knowledge of the three-dimensional distribution of hydraulic conductivity and, more fundamentally, the general fuzzy nature of the variables (such as hydraulic conductivity, storativity, specific yield, etc.) themselves.
In an important series of papers, Dou and his coworkers [Dou et al., 1995, 1997a,b] developed a series of groundwater flow models for steady-state flow, transient groundwater flow, and nonreactive solute transport via groundwater flow. These models use fuzzy numbers to capture the uncertainty in the variables in the differential equations of flow and fuzzy arithmetic techniques for solution of the equations. Schulz & Huwe [1997] developed a similar model for flow in the unsaturated zone (the uppermost soil zone, where hydraulic conductivities vary with the state of pore saturation). Recently, Dou et al. [1999] developed a fuzzy rule-based model for solute transport in the unsaturated zone. This change from using fuzzy numbers to represent imprecise variables in traditional finite difference solutions to models entirely based on fuzzy rules has mirrored the similar development of surface flow models (see also Bárdossy & Duckstein [1995]).

Another type of flow modeling seeks to understand groundwater flow from first principles instead of the empirically derived Darcy flow equation. Zeng et al. [2000] developed a model of flow in porous media where the degree of interconnectedness among pores in a porous medium was modeled with fuzzy sets.
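As a rough illustration of how fuzzy numbers and fuzzy arithmetic can carry imprecision through a flow calculation, the sketch below evaluates the magnitude of the Darcy flux, q = K i, with a triangular fuzzy hydraulic conductivity K and a triangular fuzzy hydraulic gradient i. It uses the standard alpha-cut approach, in which each fuzzy number is sliced into intervals and interval arithmetic is applied at each level. The parameter values are hypothetical, and this is a generic textbook technique rather than the formulation of any of the cited papers.

```python
def alpha_cut(tri, alpha):
    """Interval of a triangular fuzzy number (low, peak, high) at level alpha."""
    low, peak, high = tri
    return (low + alpha * (peak - low), high - alpha * (high - peak))


def interval_mul(x, y):
    """Product of two closed intervals (valid for any signs)."""
    products = [x[0] * y[0], x[0] * y[1], x[1] * y[0], x[1] * y[1]]
    return (min(products), max(products))


def fuzzy_darcy_flux(conductivity, gradient, levels=(0.0, 0.5, 1.0)):
    """Alpha-cuts of the flux magnitude q = K * i for fuzzy K and i."""
    return {
        alpha: interval_mul(alpha_cut(conductivity, alpha),
                            alpha_cut(gradient, alpha))
        for alpha in levels
    }


# Hypothetical inputs: K in m/day, gradient dimensionless.
K = (1.0, 2.0, 4.0)
grad = (0.01, 0.02, 0.03)
flux = fuzzy_darcy_flux(K, grad)
for alpha in sorted(flux):
    print(alpha, flux[alpha])
```

At alpha = 1 the interval collapses to the most plausible value, and at alpha = 0 it spans the full support of the result. A finite difference model with fuzzy parameters would have to repeat this kind of interval computation for every cell and time step, which is considerably more involved than this single multiplication.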
References to subsurface hydrology

Bárdossy, A., & Duckstein, L. [1995], Fuzzy Rule-Based Modeling with Applications to Geophysical, Biological and Engineering Systems. CRC Press, Boca Raton, FL.
Dou, C., Woldt, W., Bogardi, I., & Dahab, M. [1995], “Steady state groundwater flow simulation with imprecise parameters.” Water Resources Research, 31(11), 2709–2719.
Dou, C., Woldt, W., Bogardi, I., & Dahab, M. [1997a], “Numerical solute transport simulation using fuzzy sets approach.” Journal of Contaminant Hydrology, 27(1–2), 107–126.
Dou, C., Woldt, W., Dahab, M., & Bogardi, I. [1997b], “Transient ground-water flow simulation using a fuzzy set approach.” Ground Water, 35(2), 205–215.
Dou, C., Woldt, W., & Bogardi, I. [1999], “Fuzzy rule-based approach to describe solute transport in the unsaturated zone.” Journal of Hydrology, 220(1–2), 74–85.
Konikow, L. F., & Ewing, R. C. [1999], “Is a probabilistic performance assessment enough?” Ground Water, 37(4), 481.
Schulz, K., & Huwe, B. [1997], “Water flow modeling in the unsaturated zone with imprecise parameters using a fuzzy approach.” Journal of Hydrology, 201(1–4), 211–229.
Zeng, Z., Vasseur, C., & Fayala, F. [2000], “Modeling microgeometric structures of porous media with a predominant axis for predicting diffusive flow in capillaries.” Applied Mathematical Modelling, 24(12), 969–986.
4.3.3 Groundwater risk assessment

Assessing the risk to human populations from contaminant pollution of groundwater from various anthropogenic sources is an important element of municipal and
agricultural wellfield planning. Aquifer vulnerability to contamination depends on soil properties, precipitation, topography, etc., all of which can be modeled with fuzzy sets. Freissinet et al. [1999] focused on assessing non-point source pesticide pollution of aquifers. Zhou et al. [1999] and Cameron & Peloso [2001] modified the commonly employed parametric assessment model DRASTIC by incorporating fuzzy sets. Özdamar et al. [2000] developed assessment models for evaluating the potential of industrial groundwater contamination wherein the input parameters were fuzzy.
References to groundwater risk assessment

Cameron, E., & Peloso, G. F. [2001], “An application of fuzzy logic to the assessment of aquifer’s pollution potential.” Environmental Geology, 40(11–12), 1305–1315.
Freissinet, C., Vauclin, M., & Erlich, M. [1999], “Comparison of first-order analysis and fuzzy set approach for the evaluation of imprecision in a pesticide groundwater pollution screening model.” Journal of Contaminant Hydrology, 37(1–2), 21–43.
Özdamar, L., Demirhan, M., Özpinar, A., & Kilanc, B. [2000], “A fuzzy areal assessment approach for potentially contaminated sites.” Computers & Geosciences, 26(3), 309–318.
Zhou, H., Wang, G., & Yang, Q. [1999], “A multi-objective fuzzy pattern recognition model for assessing groundwater vulnerability based on the DRASTIC system.” Hydrological Sciences Journal, 44(4), 611–618.
4.3.4 Geotechnical engineering

There are a number of models for river discharge control and flood management with fuzzy control systems. Teegavarapu & Simonovic [1999] describe a simple one-dam system from Kentucky, USA, whereas Cheng [1999] describes an optimization model for simultaneous floodgate control of the Yangtze River involving the Three Gorges Dam, flood basin areas along the river, and eight flood control dams on tributaries. Huang et al. [1999] developed an optimized watershed management plan for the Lake Erhai basin in southern China, using a fuzzy multi-objective management program. This watershed is under intense environmental pressure from a variety of competing land uses (agricultural, scenic, light industry, etc.).

Al-Homoud & Al-Masri [1999] developed a fuzzy expert system to evaluate the failure potential of road-cut slopes and embankments in a landslide-prone area of Jordan. Grima & Verhoef [1999] modeled rock trencher performance with fuzzy logic, whereas Hammah & Curran [1998] used a fuzzy clustering algorithm to identify fracture sets encountered in exploration drilling. Ercanoglu & Gokceoglu [2002] developed a fuzzy rule-based model to assess landslide susceptibility in Turkey. Finally, Klose [2002] describes a fuzzy rule-based geophysical forecast system to interpret rock types from their seismic characteristics.
References to geotechnical engineering

Al-Homoud, A. S., & Al-Masri, G. A. [1999], “CSEES: an expert system for analysis and design of cut slopes and embankments.” Environmental Geology, 39(1), 75–89.
Cheng, C. [1999], “Fuzzy optimal model for the flood control system of the upper and middle reaches of the Yangtze River.” Hydrological Sciences Journal, 44(4), 573–582.
Ercanoglu, M., & Gokceoglu, C. [2002], “Assessment of landslide susceptibility for a landslide-prone area (north of Yenice, NW Turkey) by fuzzy approach.” Environmental Geology, 41(6), 720–730.
Grima, A. M., & Verhoef, P. N. W. [1999], “Forecasting rock trencher performance using fuzzy logic.” International Journal of Rock Mechanics and Mining Sciences & Geomechanics Abstracts, 36(4), 413–432.
Hammah, R. E., & Curran, J. H. [1998], “Fuzzy cluster algorithm for the automatic identification of joint sets.” International Journal of Rock Mechanics and Mining Sciences & Geomechanics Abstracts, 35(6), 889–905.
Huang, G. H., Liu, L., Chakma, A., Wu, S. M., Wang, X. H., & Yin, Y. Y. [1999], “A hybrid GIS-supported watershed modeling system: application to the Lake Erhai basin, China.” Hydrological Sciences Journal, 44(4), 597–610.
Klose, C. D. [2002], “Fuzzy rule-based expert system for short-range seismic prediction.” Computers & Geosciences, 28(3), 337–386.
Teegavarapu, R. S. V., & Simonovic, S. P. [1999], “Modeling uncertainty in reservoir loss functions using fuzzy sets.” Water Resources Research, 35(9), 2815–2823.
4.3.5 Hydrocarbon exploration

Not surprisingly, this area has seen the greatest development of all so-called “soft computing” technologies including neural networks, fuzzy logic, genetic algorithms, and data analysis [Sung, 1999]. Huang et al. [2001] employed a neural-fuzzy-genetic algorithm for predicting permeability in petroleum reservoirs, whereas Finol & Jing [2002] tackled the prediction of permeabilities in shaly formations using a fuzzy rule-based model with well log responses as the input parameters. In a somewhat related paper, Das Gupta [2001] applied fuzzy pattern recognition to well logs to determine locations of coal seams. Janakiraman & Konno [2002] describe a fuzzy neural network designed to interpret rock facies in cross-borehole seismic exploration.

There have been a number of recent journal issues solely dedicated to this area and they are an excellent means to gain access to the literature. Computers and Geosciences [Vol. 26, No. 8, October 2000] produced an issue entitled “Applications of virtual intelligence to petroleum engineering.” This issue was edited by Shahab Mohagheg and contained nine technical papers in addition to an introductory note. Most of the papers are dedicated to neural networks, many of which rely heavily on fuzzy logic. In addition to the special issue cited above, numerous single papers on applications of soft computing to the Earth sciences are to be found in Computers and Geosciences.
The Journal of Petroleum Geology [Vol. 24, No. 4, October 2001] published a thematic issue entitled “Field applications of intelligent computing techniques.” This issue was edited by Wong and Nikravesh and included five technical papers and an introduction by the editors. Most of the papers were applications of neural networks. However, Wakefield et al. [2001] applied fuzzy logic to the biostratigraphic interpretation of mudstones in a North Sea oilfield and Finol et al. [2001] applied fuzzy partitioning to the classification and interpretation of remotely sensed resistivity and spontaneous potential wireline well logs.

The Journal of Petroleum Science and Engineering produced two special issues dedicated to “Soft computing and Earth sciences” edited by Nikravesh, Aminzadeh, and Zadeh. Part one [Vol. 29, Nos. 3–4, May 2001] contained seven technical papers plus an introduction by the editors. Part two [Vol. 31, Nos. 2–4, November 2001] carried nine papers. These two special issues cover a somewhat wider variety of applications of soft computing to the Earth sciences but still mostly deal with applications of neural networks to data analysis. Nikravesh et al. [2003] is an extended version of these two theme issues.

References to hydrocarbon exploration

Das Gupta, S. P. [2001], “Application of a fuzzy pattern recognition method in borehole geophysics.” Computers & Geosciences, 27(1), 85–89.
Finol, J. J., & Jing, X.-D. D. [2002], “Permeability prediction in shaly formations: the fuzzy modeling approach.” Geophysics, 67(3), 817–829.
Finol, J. J., Guo, Y. K., & Jing, X. D. [2001], “Fuzzy partitioning systems for electrofacies classification; a case study for the Maracaibo Basin.” Journal of Petroleum Geology, 24(4), 441–458.
Huang, Y., Gedeon, T. D., & Wong, P. M. [2001], “An integrated neural-fuzzy-genetic-algorithm using hyper-surface membership functions to predict permeability in petroleum reservoirs.” Engineering Applications of Artificial Intelligence, 14(1), 15–21.
Janakiraman, K. K., & Konno, M. [2002], “Cross-borehole geological interpretation model based on a fuzzy neural network and geotomography.” Geophysics, 67(4), 1177–1183.
Mohagheg, S. (ed.) [2000], “Applications of virtual intelligence to petroleum engineering.” Computers and Geosciences, 26(8). (A special issue of the journal containing a number of papers dealing with soft computing.)
Nikravesh, M., Aminzadeh, F., & Zadeh, L. A. (eds.) [2001a], “Soft computing and Earth Sciences: Part 1.” Journal of Petroleum Science and Engineering, 29(3–4). (A special issue of the journal containing a number of papers dealing with soft computing.)
Nikravesh, M., Aminzadeh, F., & Zadeh, L. A. (eds.) [2001b], “Soft computing and Earth Sciences: Part 2.” Journal of Petroleum Science and Engineering, 31(2–4).
Nikravesh, M., Aminzadeh, F., & Zadeh, L. A. (eds.) [2003], Soft Computing and Intelligent Data Analysis in Oil Exploration. Elsevier, Amsterdam.
Sung, A. H. [1999], “Applications of soft computing in petroleum engineering.” SPIE Conference on Applications and Science of Neural Networks, Fuzzy Systems, and Evolutionary Computation II, 3812, 200–212.
Wakefield, M. I., Cook, R. J., Jackson, H., & Thompson, P. [2001], “Interpreting biostratigraphical data using fuzzy logic; the identification of regional mudstones within the Fleming Field, UK North Sea.” Journal of Petroleum Geology, 24(4), 417–440.
Wong, P. M., & Nikravesh, M. (eds.) [2001], “Field Applications of Intelligent Computing Techniques.” Journal of Petroleum Geology, 24(4). (A special issue of the journal containing a number of papers dealing with soft computing.)
4.3.6 Seismology

Deyi & Xihui [1985] presented the results of an international symposium on fuzzy mathematics in earthquake research. Since the publication of this volume, applications of fuzzy logic to seismology have focused on earthquake prediction, including assessment of the magnitude of an earthquake (absolute amount of energy a seismic event puts into the ground) and the pattern of surface disruption. Examples of the former include Wang et al. [1997] and Wang et al. [1999]. The surface effects of the passage of seismic waves can vary dramatically in an urban area depending on the material properties of the area, the type and depth to bedrock in the area, and the amplification of earthquake waves by the shape of the resonating deposits. Prediction of the ground motion at a site depends not only on these properties, but also on the properties of the incident waves, and on their orientation. Papers by Muller et al. [1998, 1999] and by Huang & Leung [1999] developed fuzzy-based neural-network approaches toward models that integrated the properties of the waves and those of the ground to predict ground motion.

Locating the source site of a seismic wave arriving at a distant site has been one of the staples of seismology since its beginning. A number of deterministic models have been developed to make these calculations. Recently, Lin & Sanford [2001] described an inversion technique wherein deviations between theoretical and observed arrival times are assessed with fuzzy logic.

References to seismology

Deyi, F., & Xihui, L. (eds.) [1985], Fuzzy mathematics in earthquake researches, Proceedings of International Symposium on Fuzzy Mathematics in Earthquake Researches. Seismological Press, Beijing.
Huang, C., & Leung, Y. [1999], “Estimating the relationship between isoseismal area and earthquake magnitude by a hybrid fuzzy-neural-network method.” Fuzzy Sets and Systems, 107(2), 131–146.
Lin, K., & Sanford, A. R. [2001], “Improving regional earthquake locations using a modified G matrix and fuzzy logic.” Bulletin of the Seismological Society of America, 91(1), 82–93.
Muller, S., Legrand, J.-F., Muller, J.-D., Cansi, Y., & Crusem, R. [1998], “Seismic events discrimination by neuro-fuzzy-based data merging.” Geophysical Research Letters, 25(18), 3449–3452.
Muller, S., Garda, P., Muller, J.-D., & Cansi, Y. [1999], “Seismic events discrimination by neuro-fuzzy merging of signal and catalogue features.” Physics and Chemistry of the Earth (A), 24(3), 201–206.
Wang, W., Wu, G., Huang, B., Zhuang, K., Zhou, P., Jiang, C., Li, D., & Zhou, Y. [1997], “The FAM (fuzzy associative memory) neural network model and its application in earthquake prediction.” Acta Seismologica Sinica, 10(3), 321–328.
Wang, X.-Q., Zheng, Z., Qian, J., Yu, H.-Y., & Huang, X.-L. [1999], “Research on the fuzzy relationship between the precursory anomalous elements and earthquake elements.” Acta Seismologica Sinica, 12(4), 676–683.
4.3.7 Soil science and landscape development

Maps that depict the distribution of different soil types are a standard product of geological and agricultural surveys throughout the world and have a variety of important uses. Such uses include agricultural planning, conservation, input into watershed models, input into groundwater models, and the legal definition of wetlands. Soils are three-dimensional mantles of weathered Earth materials with a complex biogeochemical evolutionary history that depends on parent material, precipitation, temperature, topography, groundwater levels, etc. Traditional soil classifications and maps produced from them ignored intergradations of soil types, both horizontally and vertically, and were based on widely spaced sampling pits. Although the actual microscopic structure of soils is difficult to assess, Moran & McBratney [1997] attempted to develop a two-dimensional fuzzy model of soil element disposition.

In recent years soil science has been revolutionized by the development of geographical information systems (GIS), improvements of remote sensing capabilities (to patches 10 meters or so across), and a fuzzy approach toward soil classifications and mapping [Davis & Keller, 1997; Galbraith et al., 1998; Galbraith & Bryant, 1998; Wilson & Burrough, 1999]. Examples of this approach applied to different areas include: Kollias et al. [1999] for an alluvial flood plain in western Greece; Triantafilis et al. [2001] for an area in New South Wales, eastern Australia; and Zhu et al. [2001] for areas in Wisconsin and Montana. In each of these studies, horizontal and vertical measurements of soil properties in test pits are employed to devise fuzzy soil classification systems, i.e., systems wherein a point can belong to more than one soil type. In this way, intergradations among soil types are handled naturally and small areas of slightly different soil types within larger areas can be identified.
These soil types are then mapped onto remotely sensed images of an area at a resolution of blocks that are approximately a few tens of meters on a side. Remote sensing commonly measures the intensity of various wavelengths of radiation off the Earth. A transfer function (commonly fuzzy) is then developed to relate wavelength and intensity of radiation to a soil type. Serandrei-Barbero et al. [1999] and Smith et al. [2000] used these techniques to interpret glacial features from the eastern Italian Alps and Greece, respectively.
Fuzzy logic has also been employed to understand geochemical aspects of soils and recent stream sediments. Lahdenperä et al. [2001] used fuzzy clustering to establish relationships among glacial tills, bedrock, and elements measured in soils of Finland. In addition, fuzzy logic has been incorporated into the revised universal soil loss equation [Tran et al., 2002].
References to soil science and landscape development
Davis, T. J., & Keller, C. P. [1997], “Modelling and visualizing multiple spatial uncertainties.” Computers & Geosciences, 23(4), 397–408.
Galbraith, J. M., & Bryant, R. B. [1998], “A functional analysis of soil taxonomy in relation to expert system techniques.” Soil Science, 163(9), 739–747.
Galbraith, J. M., Bryant, R. B., & Ahrens, R. J. [1998], “An expert system of soil taxonomy.” Soil Science, 163(9), 748–758.
Kollias, V. J., Kalivas, D. P., & Yassoglou, N. J. [1999], “Mapping the soil resources of a recent alluvial plain in Greece using fuzzy sets in a GIS environment.” European Journal of Soil Science, 50(2), 261–273.
Lahdenperä, A.-M., Tamminen, P., & Tarvainen, T. [2001], “Relationships between geochemistry of basal till and chemistry of surface soil at forested sites in Finland.” Applied Geochemistry, 16(1), 123–136.
Moran, C. J., & McBratney, A. B. [1997], “A two-dimensional fuzzy random model of soil pore structure.” Mathematical Geology, 29(6), 755–777.
Serandrei-Barbero, R., Rabagliati, R., Binaghi, E., & Rampini, A. [1999], “Glacial retreat in the 1980s in the Breonie, Aurine and Pusteresi groups (eastern Alps, Italy) in Landsat TM images.” Hydrological Sciences Journal, 44(4), 279–296.
Smith, G. R., Woodward, J. C., Heywood, D. I., & Gibbard, P. L. [2000], “Interpreting Pleistocene glacial features from SPOT HRV data using fuzzy techniques.” Computers & Geosciences, 26(4), 479–490.
Tran, L. T., Ridgley, M. A., Duckstein, L., & Sutherland, R. [2002], “Application of fuzzy-logic modeling to improve the performance of the revised universal soil loss equation.” Catena, 47(3), 203–226.
Triantafilis, J., Ward, W. T., Odeh, I. O. A., & McBratney, A. B. [2001], “Creation and interpolation of continuous soil layer classes in the Lower Namoi Valley.” Soil Science Society of America Journal, 65(2), 403–413.
Wilson, J. P., & Burrough, P. A. [1999], “Dynamic modeling, geostatistics, and fuzzy classification: new sneakers for a new geography?” Annals of the Association of American Geographers, 89(4), 736–746.
Zhu, A. X., Hudson, B., Burt, J., Lubich, K., & Simonson, D. [2001], “Soil mapping using GIS, expert knowledge, and fuzzy logic.” Soil Science Society of America Journal, 65(5), 1463–1472.
4.3.8 Deposition of sediment
Modern sedimentary systems comprise volumes of the uppermost tens of meters of lithosphere and overlying hydrosphere and atmosphere where sediments accumulate
as, for example, on the delta of the Mississippi River or on the floor of Death Valley, California. Earth scientists have limited knowledge of the physical, chemical, and biological processes that control sediment accumulation in modern sedimentary systems. This presents a difficulty because analogous ancient sedimentary systems are where the sediments and sedimentary rocks that hold much of the direct, long-term evidence of the history of the biosphere, atmosphere, hydrosphere, and lithosphere of this planet accumulated. In order to infer such important information as long- and short-term variations in geochemical cycles, ecosystems, and climate, we seek to recover from ancient sedimentary deposits just these records of the physical, chemical, and biological processes that operated on those ancient surfaces. Our poor understanding of modern depositional processes is due to: the large sizes of modern sedimentary systems ($10^2$–$10^5$ km$^2$); difficulties in instrumentation (especially during rare events such as hurricanes and floods); restrictions of observations on active processes to a few hundred years; and labor-intensive data gathering. Fuzzy logic was initially introduced by Nordlund [1996, 1999] to overcome the computational difficulties of sedimentary modeling introduced by these problems. FUZZIM [Nordlund, 1999] is a shareware program that replaces sedimentary physics with common-sense rules based on hard and soft information developed by sedimentologists over the past 100 years. Edington et al. [1998], Parcell et al. [1998], and Demicco & Klir [2001] have developed other fuzzy rule-based models of ancient deposition. Urbat et al. [2000] developed a fuzzy clustering classification of diagenesis of deep-sea sediments due to flux of hydrothermal fluids.
References to deposition of sediment
Demicco, R. V., & Klir, G. J. [2001], “Stratigraphic simulations using fuzzy logic to model sediment dispersal.” Journal of Petroleum Science and Engineering, 31(2–4), 135–155.
Edington, D. H., Poeter, E. P., & Cross, T. A. [1998], “FLUVSIM; a fuzzy-logic forward model of fluvial systems.” Abstracts with Programs, Geological Society of America Annual Meeting, 30, A105.
Nordlund, U. [1996], “Formalizing geological knowledge – with an example of modeling stratigraphy using fuzzy logic.” Journal of Sedimentary Research, 66(4), 689–712.
Nordlund, U. [1999], “FUZZIM: forward stratigraphic modeling made simple.” Computers & Geosciences, 25(4), 449–456.
Parcell, W. C., Mancini, E. A., Benson, D. J., Chen, H., & Yang, W. [1998], “Geological and computer modeling of 2-D and 3-D carbonate lithofacies trends in the Upper Jurassic (Oxfordian), Smackover Formation, Northeastern Gulf Coast.” Abstracts with Programs, Geological Society of America Annual Meeting, 30, A338.
Urbat, M., Dekkers, M. J., & Krumsiek, K. [2000], “Discharge of hydrothermal fluids through sediment at the Escanaba Trough, Gorda Ridge (ODP Leg 169): assessing the effects on the rock magnetic signal.” Earth and Planetary Science Letters, 176(3–4), 481–494.
4.3.9 Miscellaneous applications There have been a number of general papers advocating the use of fuzzy logic in the geological sciences. These include Fang [1997] and Fang & Chen [1990]. Bárdossy & Duckstein [1995] is a useful introductory text in which various applications of fuzzy rule-based modeling in biological and engineering applications as well as geosciences are developed. Also, Tamas D. Gedeon edited a special issue of the International Journal of Fuzzy Systems [Vol. 4, No. 1, 2002] with a number of papers on applications of soft computing in geology. In addition to these general references there have been the following specific applications of fuzzy logic to geological problems outside of the main areas listed above. Cagnoli [1998] suggested uses of fuzzy logic in the study of volcanoes. Van Wijk & Bouten [2000] applied a fuzzy rule-based model to simulation of latent heat fluxes of coniferous forests. Schulz et al. [1999] used fuzzy set theory to evaluate thermodynamic parameters in aqueous chemical equilibrium calculations. Kruiver et al. [1999] used fuzzy clustering of paleomagnetic measurements on deep-sea sediments as a proxy for estimating orbital forcing of climate over the last 276,000 years. Finally, Pokrovsky et al. [2002] used fuzzy logic to study the meteorological impacts of air pollution in Hong Kong.
References to miscellaneous applications
Bárdossy, A., & Duckstein, L. [1995], Fuzzy Rule-Based Modeling with Applications to Geophysical, Biological and Engineering Systems. CRC Press, Boca Raton, FL.
Cagnoli, B. [1998], “Fuzzy logic in volcanology.” Episodes, 21(2), 94–96.
Fang, J. H. [1997], “Fuzzy logic and geology.” Geotimes, 42, 23–26.
Fang, J. H., & Chen, H. C. [1990], “Uncertainties are better handled by fuzzy arithmetic.” American Association of Petroleum Geologists Bulletin, 74, 1228–1233.
Gedeon, T. D. (ed.) [2002], “Soft computing in geology.” International Journal of Fuzzy Systems, 4(1). (Special issue.)
Kruiver, P. P., Kok, Y. S., Dekkers, M. J., Langereis, C. G., & Laj, C. [1999], “A pseudo-Thellier relative palaeointensity record, and rock magnetic and geochemical parameters in relation to climate during the last 276 kyr in the Azores region.” Geophysical Journal International, 136, 757–770.
Pokrovsky, O. M., Kwok, R. H. F., & Ng, C. N. [2002], “Fuzzy logic approach for description of meteorological impacts on urban air pollution species: a Hong Kong case study.” Computers & Geosciences, 28(1), 119–127.
Schulz, K., Huwe, B., & Peiffer, S. [1999], “Parameter uncertainty in chemical equilibrium calculations using fuzzy set theory.” Journal of Hydrology, 217, 119–134.
Van Wijk, M. T., & Bouten, W. [2000], “Analyzing latent heat fluxes of coniferous forests with fuzzy logic.” Water Resources Research, 36, 1865–1872.
4.4 Concluding Note: Quo Vadis The papers cited in each category above by no means comprise a complete bibliography of aspects of use of fuzzy logic in the geological sciences. One final observation is in order. There seems to be a major trend in the use of fuzzy logic in the geological sciences. Initially, fuzzy sets were used to begin to capture the continuous nature of geologic data, and various techniques were developed to use fuzzy sets in previously developed deterministic models of geological phenomena. More and more, fuzzy rule-based models are beginning to supersede the older deterministic models. This trend will no doubt continue into the future.
Chapter 5
Applications of Fuzzy Logic to Stratigraphic Modeling
Robert V. Demicco
5.1 Introduction
5.2 Fuzzy Logic and Stratigraphic Models
5.3 Death Valley, California
5.4 Modeling Depositional Processes at a Delta Mouth
5.5 Multidistributary Deltaic Deposition with Variable Wave and Long-Shore Drift Regimes
5.6 Future Developments
5.7 Conclusions
References
5.1 Introduction
Two-dimensional and three-dimensional computer-generated forward models of sedimentary basin filling are increasingly important tools for research in applied and theoretical geological sciences [see Tetzlaff & Harbaugh, 1989; Franseen et al., 1991; Slingerland et al., 1994; Wendebourg & Harbaugh, 1996; Harff et al., 1999; Harbaugh et al., 1999; Syvitski & Hutton, 2001; Merriam & Davis, 2001]. These models produce synthetic stratigraphic cross-sections that are important for two reasons. First, they give us a predictive picture of the subsurface distribution of petrophysical properties (such as porosity, permeability, and seismic velocity) that are useful in petroleum exploration, secondary petroleum recovery, groundwater exploitation, groundwater remediation, and other geotechnical applications. Second, synthetic stratigraphic models increase our theoretical understanding of how sediment accumulation varies in time and space in response to external factors (such as eustasy and tectonics) and internal factors (such as compaction, isostatic adjustments, and crustal flexural adjustments made in response to tectonic loading and sedimentary accumulation) that are known, or suspected, to influence sedimentation patterns. Physically reasonable algorithms for eustasy, compaction, isostasy, and crustal flexure are common components of sedimentary models and can be modeled either deterministically or with fuzzy logic. For example, in Section 3.4.1 a fuzzy compaction algorithm is described. However, the main focus of this chapter is on using fuzzy logic to simulate sediment
erosion, sediment transportation, and sediment accumulation within a forward model [Nordlund, 1996]. Wendebourg & Harbaugh [1996] collectively refer to this as the “sedimentary process simulator.” Models of sediment production based on fuzzy inference models are presented for reef growth in Section 3.4.2 and for biochemical production of carbonate sediment on a shallow platform in 3.4.4. Most forward models of basin filling focus on shallow-water shelf accumulation. For siliciclastic settings, a significant source of sediment input, i.e. a river delta, is usually an important component of the model. When we consider carbonate depositional systems, we are confronted by the in situ formation of the sediments themselves both as reefs [Bosscher & Schlager, 1992], and as bank-interior sediments [Broecker & Takahashi, 1966; Morse et al., 1984]. In both siliciclastic and carbonate shallow marine systems, waves, wave-induced currents, tidal currents, and storm-induced waves and currents lead to ever-changing patterns of sediment erosion, transportation, and accumulation. Buoyant plumes become important where deltas are an important component of a model. Coastal oceanographic modelers have made great strides in dealing with the complexities introduced by the elements listed above. Acinas & Brebbia [1997] and Komar [1998] provide details and examples of various aspects of such models. Consider the following steps that would be necessary to model sedimentation in a coastal area. First, we would need to develop a numerical solution (usually finite difference or finite element) of the fundamental, dynamical, physical equations of circulation forced by the average or “fair weather” winds, waves, tides, and, if present, buoyant river plumes. This step becomes immediately complicated if we suspect that storm events are an important component of the coastal sedimentation regime, insofar as we would then need a separate “storm” circulation model. 
The second step would be to use the results of the circulation model to calculate bed shear stress along the bottom of the model domain. Finally, the bed shear stresses would be used as input to solve the temporal and spatial terms in bedload and suspended load sediment transport equations. These would, in turn, give us the desired sediment erosion, transportation, and deposition.
If we were to use the approach outlined in the preceding paragraph in a stratigraphic basin filling model, we would confront at least three major difficulties. First, the solutions of such models are site specific and depend on rigorous application of boundary conditions, initial conditions, and wave and tidal forcing functions over a discrete domain. It is important to note that literally hundreds of constants in the circulation models, shear stress calculations, and bedload transport equations would need to be specified. Second, we would need to solve the flow and transport equations at discrete time intervals over hundreds of thousands to millions of years as the basin slowly fills. Thus we would need to consider the duration of the individual time steps of the stratigraphic model. The process-response models outlined above have validity for durations of tens to (at most) hundreds of years. These durations are very short in comparison to those of basin-filling models, which typically operate over time scales of $10^5$ to
$10^6$ years. Any long-term changes that we would wish to incorporate into our model (such as a regional climate change) or natural, evolutionary changes in the model (such as changes in antecedent topography and changed boundary conditions) would entail essentially starting from scratch with a new circulation model. The third factor is simply the scale of the numerical computations, memory storage, etc., involved in such a modeling project. We would have to run such a model many scores of times to “tune” it so that we get realistic output.
For the reasons outlined in the preceding paragraph, sedimentary process simulators are the crudest parts of stratigraphic models. Siliciclastic sedimentary process simulators typically either employ the diffusion equation to represent sediment dispersal or use linear approximations of more complicated sediment dispersal. The 2-dimensional code of Bosence & Waltham [1990] and Bosence et al. [1994], the “Dr. Sediment” code of Dunn [1991], the 2-dimensional alluvial architecture code of Bridge & Leeder [1979], the 3-dimensional update of that code by Mackey & Bridge [1995], and the “CYCOPATH 2D” code of Demicco [1998] all use such an approach. Models such as STRATAFORM [Nittrouer & Kravitz, 1996], SEDFLUX [Syvitski & Hutton, 2001], and the SEDSIM models [Tetzlaff & Harbaugh, 1989; Wendebourg & Harbaugh, 1996; Merriam & Davis, 2001] also use linear approximations (albeit to more realistic “physically based” circulation and sedimentation transport equations). Although these models have been successful, they can be computationally quite complex.
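As an illustration of what representing sediment dispersal with the diffusion equation means in practice, the following minimal sketch advances a one-dimensional elevation profile h(x, t) according to ∂h/∂t = κ ∂²h/∂x² with an explicit finite-difference scheme and no-flux boundaries. It is not taken from any of the codes cited above; the function name, the diffusivity, the grid spacing, the time step, and the initial profile are all hypothetical values chosen for illustration.

```python
def diffuse_topography(h, kappa, dx, dt, steps):
    """Explicit (forward-time, centered-space) diffusion of elevations.

    h      initial elevations at equally spaced nodes
    kappa  diffusivity (length**2 / time), an assumed constant
    dx, dt grid spacing and time step
    steps  number of time steps to take
    """
    r = kappa * dt / dx ** 2
    if r > 0.5:
        raise ValueError("unstable: kappa*dt/dx**2 must not exceed 0.5")
    h = list(h)
    n = len(h)
    for _ in range(steps):
        new = h[:]
        for i in range(n):
            # Mirror the end nodes so that no sediment leaves the domain.
            left = h[i - 1] if i > 0 else h[i]
            right = h[i + 1] if i < n - 1 else h[i]
            new[i] = h[i] + r * (left - 2.0 * h[i] + right)
        h = new
    return h

# Hypothetical initial profile: a single pile of sediment on a flat floor.
initial = [0.0, 0.0, 0.0, 10.0, 0.0, 0.0, 0.0]
final = diffuse_topography(initial, kappa=1.0, dx=1.0, dt=0.25, steps=20)
print([round(v, 2) for v in final])
```

The pile spreads and flattens while the total volume of sediment is conserved. Even this simple scheme needs a stability check on the time step, which hints at the bookkeeping that grows quickly in more physically complete simulators.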
5.2 Fuzzy Logic and Stratigraphic Models In an effort to overcome the complexities described above, we have been developing fuzzy logic models of sediment production, erosion, transportation, and deposition based on qualitatively and quantitatively defined observational rules. Nordlund [1996] and Fang [1997] suggested that fuzzy logic could be used to overcome some of the difficulties inherent in modeling sediment dispersion. There is a wealth of observational data on flow and sediment transport in the coastal zone, in river systems, on carbonate platforms, and in closed basin settings. Nordlund [1996] refers to this as “soft” or qualitative information on sedimentary dynamics. However, we also have a fair amount of quantitative information on some sedimentary processes. For example, Section 3.4.4 describes a fuzzy inference model for the volumetric production of lime sediment per year on different areas of carbonate platforms, based on the geochemical measurements of Broecker & Takahashi [1966] and Morse et al. [1984]. Examples of qualitative information would be “beach sands tend to be well sorted and are coarser than offshore sands,” or “carbonate sediment is produced in an offshore carbonate ‘factory’ and is transported and deposited in tidal flats.” Such statements carry information, but are not easily quantified. Indeed, these types of qualitative statement are commonly the exact kind of information that is obtained
by studies of ancient sedimentary sequences. Moreover, with the development of “seismic stratigraphy” and “sequence stratigraphy,” applied and academic geologists have both moved into an arena where there is commonly a complex blend of “hard” and “soft” information. Hard data might include seismic (or outcrop-scale) geometric patterns of reflectors or bedding geometries, whereas soft information would include description of rock types, interpretations of depositional settings, and their positions within “system tracts” [see Vail et al., 1977; Wilgus et al., 1989; Schlager, 1992, 1999; Loucks & Sarg, 1993; Emery & Myers, 1996]. Fuzzy logic allows us to formalize and treat such information in a rigorous, mathematical way. It also allows quantitative information to be treated in a more natural, continuous fashion. The purpose of this chapter is to present a number of simulations of increasing complexity, where we have used fuzzy logic to model sediment dispersal in 3-dimensional stratigraphic models wherein sea level changes, subsidence, isostasy, and crustal flexure are modeled using conventional mathematical representations [Turcotte & Schubert, 1982; Angevine et al., 1990; Slingerland et al., 1994]. The results, summarized here along with those of the model FLUVSIM [Edington et al., 1998] and the modeling of the Smackover formation described by Parcell et al. [1998], suggest that fuzzy logic may be a powerful and computationally efficient alternative technique to numerical modeling for the basis of a sedimentary process simulator. It has the distinct advantage in that models based on fuzzy logic are robust, easily adaptable, computationally efficient, and can be easily altered internally to allow many different combinations of input parameters to be run in a sensitivity analysis in a quick and efficient way. The next three sections of this chapter describe three sets of models where fuzzy logic can be used to address some of the problems described above. 
The first set of examples models sedimentation in Death Valley, California. The second set of models compares a numerical solution and a fuzzy logic model applied to sedimentation in a distributary mouth bar of the Mississippi River. Finally, the last set of models illustrates deltaic sedimentation under variable wave and long shore current regimes.
5.3 Death Valley, California Death Valley is an arid closed basin located in the southwestern United States. The basin is a half graben approximately 15 km across and 65 km long. The center of the basin is a nearly flat complex of saline pans and playa mudflats nearly 100 m below sea level. Gravel alluvial fans radiate from streams along steep mountain fronts on the east side of the basin where the active border fault is inferred. These fans are steep and grade out to the floor of the basin over a few kilometers. The mountain front on the west side of the basin is gentler and alluvial fans issuing from streams on this side of the basin have a lower gradient than those on the east side, extending nearly halfway across the basin floor. In 1991, a 175 m deep core was extracted from
Figure 5.1 Data from the Death Valley core [Lowenstein et al., 1999]. (a) Sedimentary environments versus age. (b) Paleotemperatures measured from fluid inclusions in the core. Line denotes smoothed running average. (c) Paleoprecipitation inferred from the core and environs of Death Valley.
a salt pan in the central portion of the basin [Roberts & Spencer, 1995; Li et al., 1996; Lowenstein et al., 1999]. The core sampled basin floor sediments deposited during the last 191,000 years and these data are summarized in Figure 5.1. Figure 5.1a is a plot of thicknesses of the deposits of the four different environments found in the core plotted against their ages as interpolated from U-Series dates [Ku et al., 1998]. The preserved deposits include: (1) disrupted, thin-bedded muds deposited in desiccated playa mudflats; (2) chemical sediments interlayered with aluminosilicate muds that were deposited in brine-saturated saline pans; (3) chemical sediments (principally halite) deposited in perennial saline lakes; and (4) fossil-rich aluminosilicate muds with striking millimeter-thick laminae deposited in deep perennial lakes that were fresh. Figure 5.1b is a smoothed curve through paleotemperatures measured from brine inclusions preserved in primary halite deposits, principally from saline lakes and saline pan deposits [Lowenstein et al., 1999]. Figure 5.1c is an interpreted record of paleoprecipitation [Lowenstein, personal communication] based on a number of
proxies from the core and surrounding areas (dated lacustrine tufas, known lake shorelines, pollen records, etc.) and the $\delta^{18}$O composition of the sulfate minerals in the core [Yang et al., 1999].
Demicco & Klir [2001] described a fuzzy rule-based, three-dimensional model of the last 190 kiloyears (ky) of sedimentation in Death Valley using the core data shown in Figure 5.1. That model is briefly described here as a starting point. The model was a grid 15 km across and 65 km long represented by approximately 1900 active cells each 0.5 × 1.0 km in size. The modern topography was the starting point for elevation at each cell in the model. Subsidence of the model was −0.2 m/ky along the edges of the model and increased to −1 m/ky along the steep eastern margin of the basin halfway down the axis of the basin. The model employs fuzzy “if–then” rules to model both alluvial fan input along the sides of the basin and deposition on the basin floor. Two fuzzy logic systems controlled deposition on the floor of the basin: one system generated the sediment type and the other generated the sediment thickness. The fuzzy logic system that determines the type of sediment deposited on the basin floor is briefly described here. (Many other examples of a fuzzy inference approach to geologic problems are illustrated in Chapter 3.) The input variables in both of these fuzzy logic systems were: (1) the temperature signal (Figure 5.1b); and (2) the precipitation signal (Figure 5.1c) determined from the core. Figure 5.2a shows the input variable temperature, represented by two fuzzy sets, low temperatures and high temperatures, whereas Figure 5.2b shows the variable precipitation, also represented by two fuzzy sets, low precipitation and high precipitation. The output variable is the environment of deposition, represented by four fuzzy sets: playa, saline pan, saline lake, and freshwater lake.
The membership functions that describe the fuzzy sets in Figure 5.2 are simple trapezoids or triangles. The “rules” governing the basin floor sedimentary environment are straightforward, make sense in terms of Figure 5.1, and are easily incorporated into a fuzzy inference system. The rules are:
(1) If temperature is low and precipitation is low, then the basin floor environment is saline pan.
(2) If temperature is low and precipitation is high, then the basin floor environment is saline lake.
(3) If temperature is high and precipitation is low, then the basin floor environment is playa.
(4) If temperature is high and precipitation is high, then the basin floor environment is lake.
The standard (so-called “Mamdani”) interpretation [Mamdani & Assilian, 1975] of these rules is shown in Figure 5.3 for input values of temperature = 29 °C and precipitation = 165 mm/y. The left column represents the input variable temperature,
Figure 5.2 (a) Membership functions describing the fuzzy sets “low” and “high” for the input variable temperature over the domain range 23 to 34 °C. (b) Membership functions describing the fuzzy sets “low” and “high” for the input variable precipitation over the domain 100 to 300 mm/y. (c) Membership functions describing the fuzzy sets “playa,” “saline pan,” “saline lake,” and “lake” for the output variable environments over the domain range 1 to 4. Membership functions were adjusted by hand to produce the curve in Figure 5.4a.
the center column represents the input variable precipitation, and the right column represents the output variable environment. From top to bottom, the rows represent rules (1) to (4) as listed above. The input variables (29 °C and 165 mm/y in this example) are evaluated simultaneously for each rule, and a truth value (the degree of membership of the input variable in each of the potential input sets) is calculated. In this case, 29 °C “fires” all four rules, generating truth values of approximately [0.2, 0.2, 0.8, 0.8] for rules one through four respectively, whereas the truth values for an input value of 165 mm/y are approximately [0.75, 0.3, 0.75, 0.3]. For each rule, the lower of the two truth values truncates the membership function of the appropriate output variable. Here for example in rule 1, the truth value of 0.2 truncates the membership
Figure 5.3 Standard (“Mamdani”) interpretation of the “if–then” rules controlling the basin floor environments for the input variables 29 °C and 165 mm/y. The input variables are evaluated for each pair of temperature and precipitation values input, and a truth value (the degree of membership of the input variable in each of the potential input sets) is calculated. These truth values truncate the membership functions of the appropriate output variable. For each pair of input variables, the truncated membership functions of the output variable are summed, and the centroid of the appropriate curve is taken as the “defuzzified” output value, here 2.25.
function saline pan at 0.2. (The lower value is chosen here because the conjunction “and” implies a fuzzy intersection. If the connector had been the conjunction “or,” a fuzzy union of the sets would be implied and the higher of the values would be used. A range of conjunctions is available; see Sections 2.5 and 3.4.4.) For each combination of temperature and precipitation, the maximum of the truncated membership functions of the output variable is taken, and the centroid of the resulting curve is taken as the “defuzzified” output value (2.25 in this example). In Figure 5.4a, the curve shows this fuzzy inference system evaluated over the age range 0 to 190 ky for every appropriate input combination of temperature and precipitation (Figure 5.1). This curve is compared to the environments recorded in the core (rectilinear line).
In robot control algorithms, where fuzzy logic was first developed, systems could self-adjust the shapes of the membership functions and set boundaries until the required task was flawlessly performed. This aspect of fuzzy systems, commonly facilitated via the learning capabilities of appropriate neural networks [Kosko, 1992; Klir & Yuan, 1995; Lin & Lee, 1996; Nauck et al., 1997] or by genetic algorithms
Figure 5.4 (a) Direct comparison of the Mamdani fuzzy inference model of basin floor environments (curved line) with the geologic history of environments found in the Death Valley core (rectilinear line). (b) Direct comparison of the machine-adjusted Takagi–Sugeno model of basin floor environments (curved line) with the geologic history of environments found in the Death Valley core (rectilinear line). The Takagi–Sugeno neuro-fuzzy inference system is described in the text and in Figures 5.5 and 5.6.
[Sanchez et al., 1997; Cordón et al., 2001], is one of their great advantages over numerical solution approaches. Here we illustrate an application of this self-adjusting capability of fuzzy inference systems by employing the adaptive neuro-fuzzy system that is included in the Fuzzy Logic Toolbox of the commercial high-level language MATLAB© to generate a fuzzy logic system for the Death Valley core data. The model is based on the premise that the deposits that accumulated on the floor of Death Valley were directly related in some way to a combination of temperature and rainfall. This is not an unreasonable interpretation for closed basin deposits [Smoot & Lowenstein, 1991] and is a prerequisite for using sedimentary records of lakes and other continental environments for research into paleoclimates. The MATLAB© adaptive neuro-fuzzy system is a program that utilizes learning capabilities of neural networks for tuning parameters of fuzzy inference systems on the basis of given data. The program implements a training algorithm employing the common backpropagation method based on the least square error criterion [see Klir & Yuan, 1995, Appendix A]. The fuzzy logic system used in the previous example was the so-called “Mamdani” fuzzy inference system. In the next example, we use an alternative approach to formalizing fuzzy inference systems developed by Takagi & Sugeno [1985].
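The Mamdani inference described earlier in this section for the inputs 29 °C and 165 mm/y can be sketched in a few lines. The membership function shapes below are assumptions: they are simple ramps and triangles chosen only so that the rule truth values match the approximate values quoted in the text, and they are not the hand-adjusted functions of Figure 5.2. The output coding (1 = playa, 2 = saline pan, 3 = saline lake, 4 = lake) is likewise assumed from the order in which the sets are listed. For comparison, the last function computes the weighted average of single-valued outputs that a Takagi–Sugeno-type system uses in place of the centroid.

```python
def ramp_up(x, a, b):
    """Membership rising linearly from 0 at a to 1 at b."""
    return min(1.0, max(0.0, (x - a) / (b - a)))

def triangle(x, center, half_width=1.0):
    """Triangular membership peaking at center."""
    return max(0.0, 1.0 - abs(x - center) / half_width)

# Assumed coding of the output variable on the domain 1 to 4.
CENTERS = {"playa": 1.0, "saline pan": 2.0, "saline lake": 3.0, "lake": 4.0}

def rule_strengths(temp_c, precip_mm):
    """Truth value of each rule; 'and' is taken as the minimum."""
    t_high = ramp_up(temp_c, 27.0, 29.5)          # assumed shape
    t_low = 1.0 - t_high
    p_low = 1.0 - ramp_up(precip_mm, 140.0, 240.0)  # assumed shape
    p_high = ramp_up(precip_mm, 120.0, 270.0)       # assumed shape
    return {
        "saline pan": min(t_low, p_low),     # rule (1)
        "saline lake": min(t_low, p_high),   # rule (2)
        "playa": min(t_high, p_low),         # rule (3)
        "lake": min(t_high, p_high),         # rule (4)
    }

def mamdani_environment(temp_c, precip_mm, n=601):
    """Truncate each output set, take the maximum, return the centroid."""
    strengths = rule_strengths(temp_c, precip_mm)
    num = den = 0.0
    for i in range(n):
        x = 1.0 + 3.0 * i / (n - 1)
        mu = max(min(s, triangle(x, CENTERS[name]))
                 for name, s in strengths.items())
        num += x * mu
        den += mu
    return num / den

def sugeno_environment(temp_c, precip_mm):
    """Weighted average of single-valued ("spike") rule outputs."""
    strengths = rule_strengths(temp_c, precip_mm)
    total = sum(strengths.values())
    return sum(s * CENTERS[name] for name, s in strengths.items()) / total

print(rule_strengths(29.0, 165.0))
print(mamdani_environment(29.0, 165.0), sugeno_environment(29.0, 165.0))
```

Because the shapes are assumed, the defuzzified value printed here should not be expected to reproduce the 2.25 quoted in the text. The sketch is meant only to show the sequence of operations: evaluate memberships, combine them with the minimum, truncate and merge the output sets, and take the centroid.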
The so-called Takagi–Sugeno-type fuzzy logic system employs a single “spike” as the output membership functions. Thus, rather than integrating across the domain of the final output fuzzy set, a Takagi–Sugeno-type fuzzy inference system employs only the weighted average of a few data points. There is a clear computational advantage in employing a Takagi–Sugeno fuzzy logic system. Moreover, the adaptive neuro-fuzzy inference engine of MATLAB© only supports Takagi–Sugeno-type output membership functions. Detailed examples of Takagi–Sugeno fuzzy logic systems can be found in Section 3.4.3 and the reader is referred to this section for further details. Figure 5.5 shows antecedent membership functions for the input variables “temperature” and “rainfall” used as input to the training algorithm. The training algorithm
Figure 5.5 Antecedent membership functions for the variable temperature (a) and precipitation (b) used as input to “tune” an adaptive neuro-fuzzy inference model relating temperature and rainfall to sedimentary environments.
5.3 Death Valley, California
Figure 5.6 Solution surface of the adaptive neuro-fuzzy logic model generated from MATLAB©.
also requires two separate arrays of data for “training” and “verifying.” These data arrays comprise triplets of temperature, rainfall, and resultant environment. The training algorithm systematically adjusts the output functions and ultimately generates nine linear output functions. Figure 5.6 is the surface generated by this fuzzy logic system and Figure 5.4b is a direct comparison between our modeling results (curved line) and the original depositional environment data (rectilinear line), both plotted against age. Clearly, the “trained” Takagi–Sugeno fuzzy inference system does a superior job in modeling the time history of environments of deposition on the floor of Death Valley. Figure 5.7 shows two synthetic stratigraphic cross-sections of the original model [Demicco & Klir, 2001] rerun with all conditions being the same except for the fuzzy logic systems that control deposition in the center of the basin. In this revised model, the sedimentary environment in the basin center is controlled by the machine-developed fuzzy logic system described above. Figure 5.7a is a cross-section across the basin in a west–east orientation, whereas Figure 5.7b is a cross-section along the north–south long-axis of the basin. As in the original model, alluvial input from the sides arose from the canyon locations at the heads of the main modern alluvial fans around the basin margin. Deposition on the alluvial fan to playa mudflat
Figure 5.7 Synthetic stratigraphic cross-sections from a three-dimensional model of Death Valley sedimentation over the past 190 ky. The model uses the adaptive neuro-fuzzy inference model to control the environment of the basin floor. A Mamdani fuzzy inference system controls the thickness of the sediments deposited in the center of the basin, and two Mamdani fuzzy inference systems control the amount and caliber of the sediment deposited on alluvial fans along the edges of the basin. (a) Cross-section perpendicular to the long axis of the valley in the center of the model (approximate location of the Death Valley core). Orange and yellows on the margins indicate gravels and coarse sands of the alluvial fan systems which grade into light blue playa muds toward the low floor of the basin. Basin floor sediments include: (1) lake muds (which onlap the alluvial fans in the lower portions of the model) indicated by deep blue; (2) saline pan deposits indicated by reds; and (3) saline lake deposits denoted by yellows. (b) Cross-section parallel to the long axis of the valley through the point of maximum sediment accumulation. (See also color insert.)
drainage-ways was modeled by two Mamdani fuzzy logic systems. The input variables to both models were distance from canyon mouth and slope of the sediment surface in each cell. These input variables controlled the particle size of the deposit and the thickness of the alluvial deposits in each cell. In the synthetic cross-valley section (Figure 5.7a) short steep fans on the eastern side of the basin comprise coarser gravels (orange) and contrast with the long, lower gradient fans on the western side
of the basin that are generally composed of finer sediment (yellow and green). The alluvial input into the basin ultimately leads to the deposition of playa muds in the floor of the basin. The basin floor sediment is color-coded: deep freshwater lake and playa mud flats are blue, saline pan is red, and saline lake is shades of yellow and orange. Playa mud flats develop in the floor of the basin when the chemical or lacustrine sediments are minimal. In these revised models, the lakes lap up and over the alluvial fan deposits.
5.4 Modeling Depositional Processes at a Delta Mouth
A significant portion of the geologic record comprises siliciclastic sedimentary rocks (sandstones and shales) deposited in shallow marine settings. It is axiomatic that an appreciable thickness of such material requires a source of sediment, i.e., a river. Thus, deltaic deposits are common components of the geologic record even though wave and tidal current processes in the basin may redistribute the material into beaches or tidal flats [Reading & Collinson, 1996]. The Mississippi River Delta complex is one of the best-studied deltas on Earth and serves as an archetypical “river dominated” delta [Gould, 1970; Wright, 1978; Reading & Collinson, 1996]. The modern Mississippi River Delta has three main distributary channels that are suggestive of a “bird’s foot” in map view. Southwest Pass is the most studied distributary. Each day, the Mississippi River delivers over 1 million tons of sediment, mostly silt and clay but with appreciable amounts of fine sand, to the Gulf of Mexico. The following discussion of deposition at the mouth of Southwest Pass is based upon Gould [1970], Wright & Coleman [1971], Coleman & Wright [1975], and Wright [1978]. Southwest Pass Channel shallows from a depth of approximately 12 m to approximately 5 m where it reaches the mouth of the pass. Seaward of the pass mouth is a sand shoal known as the “distributary mouth bar” that is some 8 km wide and extends some 10 to 15 km seaward. Depths over the shoal are as shallow as 3 to 5 m (except where dredging maintains a deeper ship channel) and are as much as 100 m at the distal end of the bar. The size of the sedimentary particles that comprise the bar grades from fine sand on the bar crest to a mixture of sand, silt, and clay mid-bar, to clay at the sides and in front of the bar. The system builds out about 80 m per year, leaving a subsurface, pillow-shaped sand deposit encased in mud (a “bar finger” sand deposit).
The main processes of deposition are tied to the buoyant plume that emanates from the mouth of the distributary and include bedload deposition of the fine sand and suspension settle-out of the silt and clay. The clay deposition is affected by flocculation that coagulates clay particles as they pass from fresh water to high ionic strength seawater. The plume that issues from the distributary mouth and its associated sedimentary processes has two basic modes: (1) low to average flows; and (2) flood flows. During times of average to low discharge, a salt-water wedge intrudes up Southwest Pass 10 km or so and the buoyant, sediment-laden plume is out of contact with the bottom. During the spring freshet and other floods, the salt wedge is driven
out of the channel and the fresh-water plume detaches from the bottom just seaward of the mouth bar. In this mode the bedload sands are flushed from the pass out onto the distributary mouth bar and this process apparently accounts for the main sand deposits of the mouth bar. The plume itself has a complicated internal circulation and is visible in satellite photographs extending up to 20 km or so off the mouth of Southwest Pass. The scenario outlined above is supported by some limited oceanographic measurements but, as yet, no detailed numerical models of circulation and sedimentation for Southwest Pass exist. Instead, as is typical of many coastal sedimentary studies, there is the combination of hard and soft data outlined above that is summarized in Figures 18 and 20 in Gould [1970] and particularly in Figure 20 in Wright [1978]. It is just such hybrid data sets that fuzzy logic excels at modeling. The stratigraphic model SEDFLUX 1.0C contains subroutines to simulate bedload and suspended load sedimentation from a buoyant plume [Syvitski & Hutton, 2001]. However, this program assumes a fairly deep receiving basin and no appreciable wave or current action in the basin. Application of the plume routines in SEDFLUX to Southwest Pass, using values of discharge, flow velocity, and sediment load appropriate for the Mississippi River at South Pass, produces a very laterally restricted suspension settle-out plume (Figure 5.8a) with most settle-out just seaward of the distributary mouth and extending as much as 35 km offshore. SEDFLUX simulates bedload deposition by uniformly distributing the bedload in front of the plume over cells in a specified depth range; this bedload component is not shown here (all sediment shown is deposited by settle-out). The shape of the sediment plume is dictated by the analytical routines in SEDFLUX and is based on momentum dissipation in a turbulent jet as modeled by Albertson et al. [1950].
The lateral extent of the plume is dictated by the width of the river mouth, and the average velocity at the mouth of the river dictates the downstream length of the plume. The other adjustable parameters are the settling times of the sediments. The constants used in SEDFLUX are empirical and take into account the increased settling out of clay due to flocculation (Figure 5.9). One of the distinct advantages of SEDFLUX, however, is its strict adherence to conservation of mass insofar as the amount of sediment deposited in the offshore plume is the amount that enters the model from the river mouth. Even a cursory comparison of the sediment plume geometry given by the plume routines in SEDFLUX with Figures 18 and 20 in Gould [1970] and Figure 20 in Wright [1978] shows that it is very different from the observed distribution of sediment. A fuzzy logic approach to this problem would start with the combined soft and hard information described above. In this case the target would be to reproduce the thickness of sand, silt, and mud and, in particular, the proper proportions of sand, silt, and mud on the distributary mouth bar over some specified time interval (e.g., 1 day or 1 year). The targets of this simulation would be Figures 18 and 20 in Gould [1970] and Figure 20 in Wright [1978]. We have developed a Takagi–Sugeno model from these figures (Figures 5.8b, 5.8c). In this model, radial coordinates are employed and the inputs for this model are the radial distance from the river mouth (Figure 5.10a)
Figure 5.8 (a) Geometry and thickness of suspended sediment plume off Southwest Pass produced by the plume subroutine of SEDFLUX 1.0 C for values appropriate to the Mississippi River at South Pass. The plume is for suspended sediment only; the bedload sediment would be evenly distributed in an arbitrary array of cells off the river mouth designated by the operator, and is not shown here. Horizontal and vertical scales are in km, contour scale is in mm/day. (b) Geometry and thickness of sediment plume off Southwest Pass produced by the fuzzy inference model with the same input values; however, bedload is included in this model. Distances are in kilometers; contours are in mm/day. (c) Mixtures of grain sizes (fuzzy sets) produced by the model: 0.9 = clean, well-sorted fine sand; 0.7 = silty sand; 0.5 = sandy silt; 0.3 = silt; and 0.1 = clay. Compare with Figure 18 in Gould [1970].
and the absolute angle left and right from the centerline of the river (Figure 5.10b). Distance from the river mouth is divided into five fuzzy sets: (1) in channel; (2) very close; (3) close; (4) far; and (5) very far. Angular separation from the centerline is divided into five triangular fuzzy sets: (1) far negative; (2) negative; (3) centerline; (4) positive; and (5) far positive. There are five linear output functions: (1) none (0); (2) very little (0.2); (3) little (0.4); (4) some (0.6); and (5) lots (0.8). There are 17 rules to the system (Table 5.1).
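As a rough illustration of how such a system is evaluated, the sketch below implements a zero-order Takagi–Sugeno inference in Python using the five output constants listed above and the 17 rules of Table 5.1. The membership-function breakpoints (distances in km, angles in degrees), the use of the minimum for "and", and all function and variable names are assumptions made for illustration; they are not the calibrated shapes of Figure 5.10.

```python
INF = float("inf")

def trap(x, a, b, c, d):
    """Trapezoidal membership: rises a->b, equals 1 on [b, c], falls c->d."""
    if x < a or x > d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

# Hypothetical breakpoints; a triangle is a trapezoid with b == c.
DISTANCE = {  # radial distance from the river mouth, km
    "channel": (0, 0, 0.5, 1.5),
    "very close": (0.5, 2.5, 2.5, 5),
    "close": (2.5, 5, 5, 10),
    "far": (5, 10, 10, 15),
    "very far": (10, 15, INF, INF),
}
ANGLE = {  # angle from the centerline, degrees
    "far negative": (-INF, -INF, -60, -30),
    "negative": (-60, -30, -30, 0),
    "central": (-30, 0, 0, 30),
    "positive": (0, 30, 30, 60),
    "far positive": (30, 60, INF, INF),
}
# Output constants as given in the text.
THICKNESS = {"none": 0.0, "very little": 0.2, "little": 0.4,
             "some": 0.6, "lots": 0.8}

# Rule consequents from Table 5.1, ordered far negative -> far positive.
BY_DISTANCE = {
    "very close": ["some", "some", "lots", "some", "some"],
    "close": ["none", "little", "some", "little", "none"],
    "far": ["none", "very little", "little", "very little", "none"],
}
RULES = [("channel", None, "none")]
for dist, outs in BY_DISTANCE.items():
    RULES += [(dist, ang, out) for ang, out in zip(ANGLE, outs)]
RULES.append(("very far", None, "very little"))

def relative_thickness(r_km, theta_deg):
    """Weighted average of the rule outputs (zero-order Takagi-Sugeno)."""
    num = den = 0.0
    for dist, ang, out in RULES:
        w = trap(r_km, *DISTANCE[dist])
        if ang is not None:
            # "and" taken as the minimum (an assumption).
            w = min(w, trap(theta_deg, *ANGLE[ang]))
        num += w * THICKNESS[out]
        den += w
    return num / den if den > 0.0 else 0.0
```

Because each rule output is a single constant, the defuzzified value is just a weighted average over the rules that fire, which is the computational advantage of the Takagi–Sugeno form noted earlier in the chapter.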
Figure 5.9 Contour maps of the various size fractions that make up the sediment plume depicted in Figure 5.8a—note change of scales. Horizontal and vertical scales in km. In each diagram the contours are (from furthest out inward toward river mouth) 0.05 mm, 0.2 mm, 0.4 mm, 0.6 mm, and 0.8 mm. Thickness and extent of the plumes for the various grain sizes are controlled by user input sediment amounts and settling rates. These size fractions would have to be integrated in order to generate sediment distribution maps such as Figure 5.8c.
Figure 5.8b shows the thickness of sediment deposited from a fuzzy logic model developed for Southwest Pass Distributary Mouth Bar. In this model, mass conservation is obtained in the following way. The relative volume of the sediment deposited in a given time step is computed by multiplying the area of each cell by the relative thickness given in the model and summing over all cells. The volume of sediment entering the model during a time step, adjusted for the porosity of the deposits, is then divided by this relative volume to give a scaling factor. The relative thickness in each cell is multiplied by this factor, and mass balance is obtained. Figure 5.8c is the sediment type deposited. As the output variable we are calculating is a fuzzy set of mixtures of sediment types, we are assured of getting data compatible with the model data. In this case the sediment type is adjusted to match the input targets. It is important to note that further tuning of the model is accomplished
Figure 5.10 Membership functions used in the fuzzy model of distributary mouth bar deposition off the mouth of Southwest Pass shown in Figures 5.8b and 5.8c. (a) Radial distance from river mouth. (b) Angular position relative to centerline of the model.
by adjusting the membership functions. Our fuzzy logic simulations compare more favorably with Figure 20 in Wright [1978] than do those obtained from the plume model (Figure 5.8a).
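The mass-balance step used in this model can be sketched in a few lines of Python. The function name, the argument names, and the direction of the porosity correction (bulk deposit volume = grain volume / (1 − porosity)) are assumptions for illustration; the text states only that the incoming volume is adjusted by the porosity of the deposits.

```python
def scale_to_mass_balance(rel_thickness, cell_areas, sediment_volume_in, porosity):
    """Rescale relative thicknesses so the deposit matches the sediment supply.

    rel_thickness      -- relative thickness per cell from the fuzzy model
    cell_areas         -- area of each cell (same length as rel_thickness)
    sediment_volume_in -- grain volume entering the model in one time step
    porosity           -- fractional porosity of the deposit (0 <= porosity < 1)
    """
    # Assumed correction: pore space inflates the deposited (bulk) volume.
    bulk_volume = sediment_volume_in / (1.0 - porosity)
    # Relative volume: sum of cell area times relative thickness.
    relative_volume = sum(a * t for a, t in zip(cell_areas, rel_thickness))
    if relative_volume == 0.0:
        return [0.0 for _ in rel_thickness]
    factor = bulk_volume / relative_volume
    return [t * factor for t in rel_thickness]
```

The scaling changes only the absolute thicknesses; the relative pattern produced by the fuzzy rules is preserved.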
5.5 Multidistributary Deltaic Deposition with Variable Wave and Long-Shore Drift Regimes
Whereas the modern Mississippi River Delta has a few distributaries, older Holocene deltas developed on the Mississippi delta plain had up to 40 distributaries [Gould, 1970]. Clearly, modeling each distributary separately would require a significant amount of computation. Instead, we have built upon a hypothetical, simplified river flood plain and delta system model inspired by Nordlund [1996] and further described in Demicco & Klir [2001]. The original model (Figure 5.11) had a simple geometry
Table 5.1 Rules of the fuzzy logic system of sediment plume deposition off Southwest Pass.
If the radial distance is channel, then the thickness is none.
If the radial distance is very close and the angular position is far negative, then the thickness is some.
If the radial distance is very close and the angular position is negative, then the thickness is some.
If the radial distance is very close and the angular position is central, then the thickness is lots.
If the radial distance is very close and the angular position is positive, then the thickness is some.
If the radial distance is very close and the angular position is far positive, then the thickness is some.
If the radial distance is close and the angular position is far negative, then the thickness is none.
If the radial distance is close and the angular position is negative, then the thickness is little.
If the radial distance is close and the angular position is central, then the thickness is some.
If the radial distance is close and the angular position is positive, then the thickness is little.
If the radial distance is close and the angular position is far positive, then the thickness is none.
If the radial distance is far and the angular position is far negative, then the thickness is none.
If the radial distance is far and the angular position is negative, then the thickness is very little.
If the radial distance is far and the angular position is central, then the thickness is little.
If the radial distance is far and the angular position is positive, then the thickness is very little.
If the radial distance is far and the angular position is far positive, then the thickness is none.
If the radial distance is very far, then the thickness is very little.
imposed on a grid of 125 × 125 cells, each 1 km², and four fuzzy inference systems, two for the delta and two for the river, controlled sediment deposition in the model. In each case, one Mamdani-type fuzzy logic system controlled sediment grain size and one fuzzy logic system controlled thickness of sediment deposited in each cell. The membership functions of the fuzzy logic system that controlled grain size for the subaqueous deltaic deposition are shown in Figure 5.12. This system had two antecedent variables, water depth (Figure 5.12a) and distance from the mouth of the river (Figure 5.12b), and one consequent or dependent variable, grain size. Grain size (Figure 5.12c) was normalized over the interval 0 to 1 and characterized by five triangular membership functions: clay, sandy-clay, clayey-sand, sand, and clean sand. The new delta model has user-adjustable lateral dimensions and starting topography. Moreover, the new model employs Takagi–Sugeno type fuzzy logic systems and incorporates two new elements: basin energy and long-shore drift regime. Basin energy is intended to generally model the wave climate in the basin, with 0 representing low wave energy and 1 representing high wave energy. Long-shore drift regime also varies between 0 and 1. In a 0 long-shore drift setting, long-term wave approach is parallel to the shore, whereas a drift regime setting of 1 indicates substantial obliqueness in the general direction of wave approach and standing long-shore drift
[Komar, 1998]. In addition, the positions of the distributary mouths grow seaward in an expanding fractal pattern and mass conservation is accomplished by the same artifice as is used in the single distributary model. Figure 5.13 conceptually illustrates the delta sedimentation patterns under endmember conditions of basin energy and drift regime. Figure 5.13a illustrates a condition of low wave energy and parallel wave approach (i.e., no long-shore drift) and is essentially the model described by Demicco & Klir [2001]. Figure 5.13b illustrates the case where basin wave energy is still low but there is “eastward” directed longshore drift. In Figure 5.13c, basin wave energy is high but wave approach is parallel to the shore. In this case, sediment is “trapped” in the littoral zone and distributed along beaches that flank the delta mouth. Finally, Figure 5.13d illustrates the case where there is strong wave energy in the receiving basin and a markedly preferred direction of wave approach, resulting in strong “eastward” directed long-shore currents. In this new model, five antecedent (input) variables control the geometry of the offshore sediment plume: basin energy, drift regime, angular position (theta), radial distance, and depth (Figure 5.14). Relative basin energy and drift regime both have range and domain of 0 to 1 (Figures 5.14a, 5.14b). Angular position (Figure 5.14c) and radial distance (Figure 5.14d) are measured in polar coordinates from a distributary mouth position. Angular trend of the shoreline is represented by angular measurements in radians from the river mouth position and is divided into three fuzzy sets: low, intermediate, and high. Low angles are “easterly,” intermediate angles are arranged on either side of a perpendicular line seaward from the delta mouth, and high angles are “westerly.” Distance from the river mouth is measured radially from a distributary
Figure 5.11 Outputs from basic delta model described in Demicco & Klir [2001] with sinusoidal sea level oscillation of 10 ky and 10 m height. (a) through (d) are isometric views of hypothetical delta simulation at different time steps. Dark blues represent finest floodplain muds (0 on colorbar) and deepest marine muds whereas reds denote clean sands in the river and at the river mouth (0.9 on colorbar). Deltaic dispersion cone varies from sands (0.7) through muddy sands (0.5) to sandy muds (0.3) of the deeper shelf. (a) Sediment surface at start of simulation. (b) Sediment surface at end of 20,000 years after one sea level rise and fall. (c) Sediment surface at the end of 28,000 years during sea level fall. (d) Sediment surface at the end of 30,000 years at the end of sea level fall. Note how the river deposits of abandoned channels sink into the floodplain surface. Also note how the river “meanders” across the floodplain as it seeks the lowest path to the shoreline. Synthetic stratigraphic cross-sections generated by the model perpendicular to shore (e) and parallel to shore (f) produced at the end of the model run. In the parallel section, marine deltaic deposits (red sands through yellow silts to light blue marine mud) alternate with the dark blue fluvial mud and the aggrading channel deposits (vertical red bars flanked by yellow levee deposits). In particular, note the four avulsion events preserved in the topmost fluvial section in the shore-parallel. The fluvial to marine cycles, ten meters or so thick, are dictated by the external sea level driver and are thus known as “allocycles.” The other, smaller scale cycles are driven by avulsion and kilometer-scale shift of the mouth of the delta between time steps. These kinds of cycle, driven by internal variability in sedimentation, are called autocycles. (See also color insert.)
Figure 5.12 (a) Membership functions for the fuzzy sets “shallow” and “deep” for the input variable depth over the domain range 0 to −300 m for delta model in Demicco & Klir [2001]. (b) Membership functions for the fuzzy sets “at-source,” “near-source,” and “far-from-source” for the input variable location over the domain 0 to 250 km for the delta model in Demicco & Klir [2001]. (c) Membership functions and their designations for the output variable grain size over a normalized domain for the delta model in Demicco & Klir [2001].
mouth bar and has been normalized to a range of 0 to 1 by dividing radial distances by the maximum radial distance in the model for a given time step. Radial distance is divided into three fuzzy sets (Figure 5.14d): at source, near source, and far from source. Depth range (Figure 5.14e) in the model is also divided into three fuzzy sets: shoreline, shelf, and basin. Sediment thickness deposited at each cell is divided into
Figure 5.13 Matrix showing sediment plume produced under various regimes of relative basin energy and long-shore drift. Drift regime increases from right to left whereas basin energy regime increases from top to bottom. (a) Relative basin energy = 0, relative long-shore drift strength = 0. This is essentially the model described in Demicco & Klir [2001] and in Figures 5.11 and 5.12. (b) Relative basin energy = 0, relative long-shore drift strength = 1. (c) Relative basin energy = 1, relative long-shore drift strength = 0. (d) Relative basin energy = 1, relative long-shore drift strength = 1. (See also color insert.)
five Takagi–Sugeno linear membership functions: background (0.01), very little (0.1), little (0.33), some (0.66), and lots (1.0). Sediment type deposited at each cell is also divided into five Takagi–Sugeno linear functions. There are a total of 57 rules in this system. A typical rule reads: “If the basin energy is high and the drift regime is low and the angular location is low and the radial distance is at source and the depth is shoreline, then the thickness is lots.” In this case, if every combination of input variables were allowed, there would be 108 rules. However, it is clear from the context that some of the rules would be irrelevant and can therefore be dropped (or “pruned”) from the system. For example, it would never arise that a point that was at the source (at the distributary) would have
Figure 5.14 Input membership functions for deltaic fuzzy model with variable basin wave energy and long-shore drift regimes. (a) Relative basin energy. (b) Relative long-shore drift regime strength. (c) Angular relationship to model centerline. (d) Radial distance from river mouth. (e) Water depth.
a water depth of anything but shoreline, so rules with shelf or basin water depths “at the source” were ignored. Another way to reduce rules is to use the standard fuzzy complement of membership functions. In fuzzy logic, the adjective “not” implies the fuzzy complement of the fuzzy set it modifies. Consider another rule from the delta system: “if the basin energy is high and the drift regime is low and the angular location is not intermediate and the radial distance is near source and the depth is shoreline, then the thickness is some.” For any angular position, not intermediate = 1 minus intermediate. Thus the membership function “not intermediate” covers the low and high options of the angular position. Two model runs, each simulating 50,000 years of sedimentation at 200-year time steps, are presented here over a 200 × 200 km² grid. In both cases, sea level is held constant. In both models, subsidence is greatest in the center of the model and falls off toward the edges, and isostatic compensation of the sediments, following the formulas given by Turcotte & Schubert [1982], is incorporated. Tectonic “subsidence” remains constant through the simulation time. In the simulation shown in Figure 5.15, basin energy and long-shore drift regime are initially assigned input values of 0.1 and 0.1. During the course of the model run, however, a random fluctuation in the second decimal place of the input values for basin energy and long-shore drift regime is allowed at each time step. In the simulation shown in Figure 5.16, the basin energy input value and the long-shore drift input value are 0.85 and 0.85, respectively, and are held constant at each time step. (It is important to note that in these simulations maximum long-shore drift direction is to the “west.”) Finally, in both models, there are random (in time) avulsions of the river upstream of the model. After an avulsion, the river enters the upstream end of the model at the lowest point.
The interplay of subsidence with floodplain aggradation history and location of the prior channel belts will determine this low point. From the lowest point at the upstream end of the model the river finds the lowest set of adjacent cells to reach the shoreline. The colors in Figures 5.15 and 5.16 reflect the caliber of the sediments that are deposited on the surface (Figures 5.15a and 5.16a) in each cell at each time step. The dark blue represents the finest-grained flood plain mud with lighter blues representing offshore marine mud. The red through light blue hues indicate environments of deposition dominated by clean, coarse sand (red) through finer sands (represented by yellow) to coarse silt (greenish blues). These environments include: (1) the distributary mouth bars (here amalgamated into a continuous sheet); (2) the river and distributary system that feeds the deltaic system; and (3) the river’s levee–crevasse–splay system. Note that on sediment surfaces at the ends of the runs (Figures 5.15a and 5.16a) the channel belt widens toward the shore and the distributary portion of the delta is not a single cell. This is because, at every avulsion, as the delta re-establishes itself the distributary mouth bars are added to shoreward nodes in a random, fractal pattern. This fractal growth pattern is generated by a “diffusive” mechanism [Witten & Sander, 1983]. Also note the differential buildup of an offshore delta platform in the case of low long-shore drift versus high long-shore drift.
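The rule-count and pruning arguments made earlier in this section can be checked with a short Python sketch. It assumes two fuzzy sets each ("low" and "high") for basin energy and drift regime, which is inferred from the stated total of 108 combinations rather than given explicitly, and it applies only the one pruning criterion named in the text, so it does not reproduce the final 57-rule system. All identifiers are hypothetical.

```python
from itertools import product

ANTECEDENTS = {
    "basin_energy": ["low", "high"],   # assumed: two sets
    "drift_regime": ["low", "high"],   # assumed: two sets
    "angle": ["low", "intermediate", "high"],
    "radial_distance": ["at source", "near source", "far from source"],
    "depth": ["shoreline", "shelf", "basin"],
}

def all_antecedent_combinations():
    """Every combination of one fuzzy set per input variable."""
    names = list(ANTECEDENTS)
    return [dict(zip(names, combo)) for combo in product(*ANTECEDENTS.values())]

def is_plausible(rule):
    """A point at the distributary mouth can only be at shoreline depth."""
    return not (rule["radial_distance"] == "at source"
                and rule["depth"] != "shoreline")

def fuzzy_not(membership):
    """Standard fuzzy complement, e.g. 'not intermediate'."""
    return 1.0 - membership

full = all_antecedent_combinations()
pruned = [r for r in full if is_plausible(r)]
print(len(full), len(pruned))  # 108 84
```

Replacing the two angular cases "low" and "high" with the single label "not intermediate" merges pairs of rules that share a consequent, which is how the complement shortens the rule base further.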
Figure 5.15 (a) Isometric view of hypothetical delta simulation at end of run 1. Dark blues represent finest floodplain muds and deepest marine muds whereas reds denote clean sands at the river mouth. The deltaic dispersion cone varies from sands through muddy sands to sandy muds of the deeper adjacent basin. Horizontal and lateral scale in km, vertical scale in m. Sea level held constant throughout the simulation. Basin energy and long-shore drift regime are both low in this simulation with a bit of second-order “noise” (random variation in the second decimal place). The river widens into a broad distributary system and there are approximately 10 distributaries at this point in the simulation. Synthetic stratigraphic cross-sections: (b) view perpendicular to shore and (c) parallel to shore, produced at the end of the model run. In the parallel section, marine deltaic deposits (red sands through yellow silts to light blue marine mud) grade up to dark blue fluvial mud and the aggrading channel deposits (vertical orange bars flanked by green levee deposits). In particular, note the avulsion events preserved in the topmost fluvial section in the shore-parallel. The smaller, 10 m thick scale cycles are driven by avulsion and kilometer-scale shifts of the distributary mouths of the delta between time steps. These kinds of cycle, driven by internal variability in sedimentation, are called autocycles. (See also color insert.)
Figure 5.16 (a) Isometric view of hypothetical delta simulation at end of run 2. Dark blues represent finest floodplain muds and deepest marine muds whereas reds denote clean sands at the river mouth. The deltaic dispersion cone varies from sands through muddy sands to sandy muds of the deeper adjacent basin. Horizontal and lateral scale in km, vertical scale in m. Sea level held constant throughout the simulation. Basin energy and long-shore drift regime are both high in this simulation with a bit of second order “noise” (random variation in the second decimal place). The river widens into a broad distributary system and there are approximately 10 distributaries at this point in the simulation. Synthetic stratigraphic cross-sections: view perpendicular to shore (b) and parallel to shore (c), produced at the end of the model run. In the parallel section, note how the strong long-shore drift has skewed the deposits to the west. Overall, marine deltaic deposits (red sands through yellow silts to light blue marine mud) grade up to dark blue fluvial mud at the left side of the cross-section. Thickness here is, in part, generated by increased isostatic adjustments due to increased sediment loads on this side of the model. A starved shelf system obtains on the right (“east”) side of the model. The smaller, 10 m thick scale cycles are driven by avulsion and kilometer-scale shifts of the distributary mouths of the delta between time steps. These kinds of cycle, driven by internal variability in sedimentation, are called autocycles. (See also color insert.)
Figures 5.15b, 5.15c, 5.16b, and 5.16c show synthetic cross-sections through the deltaic deposit at the end of the model runs. The upper panel in each diagram is a shore-perpendicular view whereas the lower panel is a shore-parallel view out in the basin; colors have the same meanings as in the oblique surface views. Figure 5.15 shows the case with low long-shore drift and wave energy whereas Figure 5.16 shows the marked effects of strong preferential long-shore drift. There are a number of scales of cycles in both figures. The largest scale of cycles (tens of meters) is due to the seaward progradation of the river system resulting in a cap of fluvial deposits. The next smaller cycles are autocycles due to the avulsion of the river. Finally, there are meter-scale cycles due to the change in the position and number of distributaries as the delta builds out. This alters the position of the delta by as much as a few kilometers in each time step and produces the small-scale cycles. The strong long-shore drift and strong wave energy have clearly had a strong influence on the deposits depicted in Figure 5.16. Here, isostatic compensation has produced thick deposits on the western side of the basin and “starved basin” conditions on the east.
5.6 Future Developments
We hope that the examples of this chapter demonstrate that fuzzy logic systems are very versatile and, indeed, can be more versatile than classical mechanics equations. There are a number of areas that warrant further research and development of applications of fuzzy logic to computer-generated stratigraphic simulations. These are outlined below.
The applications discussed in this chapter restrict the use of fuzzy sets to model sediment dispersal within the various models illustrated. There is no reason why fuzzy logic systems could not be used in other parts of these simulations. For example, compaction routines developed in Chapter 3 could easily be adapted for use in these models. Indeed, compaction routines employed in many models are essentially instantaneous and are solved at designated time steps. It is clear from modern coastal areas that compaction has a definite time lag component that is not accounted for in deterministic equations. Realistic time delay factors could easily be inserted in a fuzzy logic compaction routine.
So far we have been using standard fuzzy sets wherein for a given input value there is one degree of membership in the unit interval [0, 1]. It is worthwhile to explore the use of various nonstandard fuzzy sets (as introduced in Section 2.9), in particular fuzzy sets of type 2 [Mendel, 2001]. Clearly, fuzzy sets of type 2 would be warranted by the spread of the initial data on coral growth rates (Figure 3.7) used in the sediment production function of Section 3.4.2.
Another area of fuzzy logic stratigraphic modeling that warrants further investigation is in rule “pruning.” In general, the number of rules equals the product of the number of membership functions of the antecedent variables. However, as we see in the delta example discussed in Section 5.5 above, not all the rules make sense in the context of the model. Although automated procedures have been developed for rule pruning, these have yet to be employed in stratigraphic simulation models.
Finally, we are just beginning to turn our attention to automating the entire modeling process. In such a system, we would input the desired geologic information, a cross-section or a seismic line, or a series of core holes, and ask the machine to come up with forward models that predict unknown deposits. As a tentative first step, Chapter 10 describes the extraction of a sea level signal using fuzzy linguistic rules from outcrop data. Clearly, one of the major inputs to a depositional model is the initial sea level signal, and an automated example of sea level extraction based on fuzzy expert rules is demonstrated in that chapter.
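The rule-count statement in the pruning discussion can be illustrated with a short sketch. The antecedent variables, their fuzzy-set labels, and the pruning test below are all hypothetical and chosen only for illustration; they are not the rule base of the delta model in Section 5.5.

```python
from itertools import product

# Hypothetical antecedent variables and fuzzy-set labels (illustrative only).
antecedents = {
    "distance_from_mouth": ["near", "intermediate", "far"],
    "water_depth": ["shallow", "deep"],
    "wave_energy": ["low", "high"],
}

# Full rule base: one rule per combination of antecedent labels,
# so the rule count is the product of the label counts (3 * 2 * 2).
full_rule_base = list(product(*antecedents.values()))
rule_count = len(full_rule_base)


def is_plausible(rule):
    """Hypothetical pruning test that drops combinations assumed not to occur."""
    distance, depth, _energy = rule
    # Assumption for illustration: 'far' from the mouth never coincides with 'shallow'.
    return not (distance == "far" and depth == "shallow")


# Pruned rule base: only the combinations that pass the plausibility test.
pruned_rule_base = [rule for rule in full_rule_base if is_plausible(rule)]
```

With these assumed labels the full rule base has 12 rules and the pruning test removes two of them. An automated pruning procedure would replace the hand-written test with a criterion derived from data or from the model itself.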
5.7 Conclusions
There are a number of distinct advantages in employing fuzzy inference systems to model sediment dispersal in stratigraphic models. First, fuzzy sets describe systems in “natural language” and provide the tools to rigorously quantify “soft” information. Second, fuzzy inference systems are more computationally efficient than finite-element or finite-difference models, and can even run faster than a simple linear interpolation scheme. Last, and most importantly, the shapes of the membership functions can easily be changed by small increments, thereby allowing rapid “sensitivity analysis” of the effects of changing the boundaries of the fuzzy sets. In robot control algorithms, where fuzzy logic was first developed, systems could self-adjust the shapes of the membership functions and set boundaries until the required task was flawlessly performed. This aspect of fuzzy systems, commonly facilitated via the learning capabilities of appropriate neural networks [Kosko, 1992; Klir & Yuan, 1995; Nauck et al., 1997] or by genetic algorithms [Sanchez et al., 1997; Cordón et al., 2001], is one of their great advantages over numerical solution approaches.
Fuzzy logic models hold the potential to accurately model subsurface distribution of sedimentary facies (not just water-depths of deposition) in terms of the natural variables of geology. As exploration moves further into use of 3-dimensional seismic data gathering, the utility of easy-to-use, flexible 3-dimensional forward models is obvious. Such models could be used to produce synthetic seismic sections. Moreover, the “learning ability” of fuzzy logic systems coupled with neural networks offers the long-term possibility of self-tuning sedimentary models that can match 3-dimensional seismic subsurface information in a “nonhuman” expert system. This method offers an alternative to the statistical modeling of subsurface geology. It is more computationally efficient and more intuitive for geologists than complicated models that solve coupled sets of differential equations.
References Acinas, J. R., & Brebbia, C. A. [1997], Computer Modelling of Seas and Coastal Regions. Computational Mechanics, Billerica, MA. Albertson, M. L., Dai, Y. B., Jensen, R. A., & Hunter, R. [1950], “Diffusion of submerged jets.” American Society of Civil Engineers Transactions, 115, 639–697. Angevine, C. L., Heller, P. L., & Paola, C. [1990], Quantitative Sedimentary Basin Modeling, Continuing Education Course Notes Series #32. American Association of Petroleum Geologists, Tulsa, OK. Bosence, D., & Waltham, D. [1990], “Computer modeling the internal architecture of carbonate platforms.” Geology, 18(1), 26–30. Bosence, D. W. J., Polmar, L., Waltham, D. A., & Lankester, H. G. [1994], “Computer modeling a Miocene carbonate platform, Mallorca, Spain.” American Association of Petroleum Geologists Bulletin, 78(2), 247–266. Bosscher, H., & Schlager, W. [1992], “Computer simulation of reef growth.” Sedimentology, 39(3), 503–512. Bridge, J. S., & Leeder, M. R. [1979], “A simulation model of alluvial stratigraphy.” Sedimentology, 26(5), 617–644. Broecker, W. A., & Takahashi, T. [1966], “Calcium carbonate precipitation on the Bahama Banks.” Journal of Geophysical Research, 71(6), 1575–1602. Coleman, J. M., & Wright, L. D. [1975], “Modern river deltas.” In: M. L. Broussard (ed.), Deltas: Models for Exploration, pp. 99–149. Houston Geological Society, Houston, TX. Cordón, O., Herrera, F., Hoffmann, F., & Magdalena, L. [2001], Genetic Fuzzy Systems: Evolutionary Tuning and Learning of Fuzzy Knowledge Bases. World Scientific, Singapore. Demicco, R. V. [1998], “CYCOPATH 2-D, a two-dimensional, forward-model of cyclic sedimentation on carbonate platforms.” Computers & Geosciences, 24(5), 405–423. Demicco, R. V., & Klir, G. [2001], “Stratigraphic simulation using fuzzy logic to model sediment dispersal.” Journal of Petroleum Science and Engineering, 31(2–4), 135–155. Dunn, P. A. 
[1991], Diagenesis and Cyclostratigraphy—an example from the Middle Triassic Latemar platform, Dolomite Mountains, northern Italy. PhD Dissertation, The Johns Hopkins University, Baltimore, MD. Edington, D. H., Poeter, E. P., & Cross, T. A. [1998], “FLUVSIM; a fuzzy-logic forward model of fluvial systems.” Abstracts with Programs—Geological Society of America Annual Meeting, 30, A105. Emery, D., & Myers, K. J. [1996], Sequence Stratigraphy. Blackwell, Oxford, UK. Fang, J. H. [1997], “Fuzzy logic and geology.” Geotimes, 42(10), 23–26. Franseen, E. K., Watney, W. L., Kendall, C. G. St C., & Ross, (eds.) [1991], Sedimentary modeling: computer simulations and methods for improved parameter definition. Kansas Geological Survey Bulletin 233, Lawrence, KS. Gould, H. R. [1970], “The Mississippi Delta complex.” In: Morgan, J. P. (ed.), Deltaic Sedimentation Modern and Ancient. Society of Economic Paleontologists and Mineralogists Special Publication 15, pp. 3–47. Tulsa, OK. Harbaugh, J. W., Watney, L. W., Rankey, E. C., Slingerland, R., Goldstein, R. H., & Franseen, E. K. (eds.) [1999], Numerical Experiments in Stratigraphy: Recent Advances in Stratigraphic and Sedimentologic Computer Simulations. SEPM (Society for Sedimentary Geology) Special Publication 62, Tulsa, OK.
Harff, J., Lemke, W., & Stattegger, K. [1999], Computerized Modeling of Sedimentary Systems. Springer-Verlag, New York. Klir, G. J., & Yuan, B. [1995], Fuzzy Sets and Fuzzy Logic: Theory and Applications. Prentice-Hall, Upper Saddle River, NJ. Komar, P. D. [1998], Beach Processes and Sedimentation (2nd edition). Prentice-Hall, Upper Saddle River, NJ. Kosko, B. [1992], Neural Networks and Fuzzy Systems. Prentice-Hall, Englewood Cliffs, NJ. Ku, T.-L., Luo, S., Lowenstein, T. K., Li, J., & Spencer, R. J. [1998], “U-series chronology of lacustrine deposits in Death Valley, California.” Quaternary Research, 50(3), 261–275. Li, J., Lowenstein, T. K., Brown, C. B., Ku, T.-L., & Luo, S. [1996], “A 100 ka record of water tables and paleoclimates from salt core, Death Valley, California.” Paleogeography, Paleoclimatology, and Paleoecology, 123(1–4), 179–203. Lin, C.-T., & Lee, G. [1996], Neural Fuzzy Systems: A Neuro Fuzzy Synergism to Intelligent Systems. Prentice-Hall, Upper Saddle River, NJ. Loucks, R. G., & Sarg, J. F. [1993], Carbonate Sequence Stratigraphy. American Association of Petroleum Geologists Memoir 57, Tulsa, OK. Lowenstein, T. K., Li, J., Brown, C., Roberts, S. M., Ku, T.-L., Luo, S., & Yang, W. [1999], “200 ky paleoclimate record from Death Valley salt core.” Geology, 27(1), 3–6. Mackey, S. D., & Bridge, J. S. [1995], “Three-dimensional model of alluvial stratigraphy: theory and application.” Journal of Sedimentary Research, B65(1), 7–31. Mamdani, E. H., & Assilian, S. [1975], “An experiment in linguistic synthesis with fuzzy logic controller.” International Journal of Man–Machine Studies, 7(1), 1–13. Mendel, J. M. [2001], Uncertain Rule-Based Fuzzy Logic Systems: Introduction and New Directions. Prentice-Hall, Upper Saddle River, NJ. Merriam, D. F., & Davis, J. C. (eds.) [2001], Geologic Modeling and Simulation: Sedimentary Systems. Kluwer Academic/Plenum, New York. Morse, J. W., Millero, F. J., Thurmond, V., Brown, E., & Ostlund, H. G. 
[1984], “The carbonate chemistry of Grand Bahama Bank waters: after 18 years another look.” Journal of Geophysical Research, 89C(3), 3604–3614. Nauck, D., Klawonn, F., & Kruse, R. [1997], Foundations of Neuro-Fuzzy Systems. John Wiley, New York. Nittrouer, C. A., & Kravitz, J. H. [1996], “STRATAFORM: A program to study the creation and interpretation of sedimentary strata on continental margins.” Oceanography, 9(2), 146–152. Nordlund, U. [1996], “Formalizing geological knowledge—with an example of modeling stratigraphy using fuzzy logic.” Journal of Sedimentary Research, 66(4), 689–712. Parcell, W. C., Mancini, E. A., Benson, D. J., Chen, H., & Yang, W. [1998], “Geological and computer modeling of 2-D and 3-D carbonate lithofacies trends in the Upper Jurassic (Oxfordian), Smackover Formation, Northeastern Gulf Coast.” Abstracts with Programs— Geological Society of America Annual Meeting, 30, A338. Reading, H. G., & Collinson, J. D. [1996], “Clastic coasts.” In: Reading, H. G. (ed.), Sedimentary Environments: Processes, Facies and Stratigraphy, pp. 154–231. Blackwell, Oxford, UK. Roberts, S. M., & Spencer, R. J. [1995], “Paleotemperatures preserved in fluid inclusions in halite.” Geochimica et Cosmochimica Acta, 59(19), 3929–3942. Sanchez, E., Shibata, T., & Zadeh, L. A. [1997], Genetic Algorithms and Fuzzy Logic Systems: Soft Computing Perspectives. World Scientific, Singapore.
Schlager, W. [1992], Sedimentary and sequence stratigraphy of reefs and carbonate platforms. American Association of Petroleum Geologists Continuing Education Course Note Series, 34, Tulsa, OK. Schlager, W. [1999], “Sequence stratigraphy of carbonate rocks.” Leading Edge, 18(8), 901–907. Slingerland, R., Harbaugh, J. W., & Furlong, K. [1994], Simulating Clastic Sedimentary Basins: Physical Fundamentals and Computer Programs for Creating Dynamic Systems. Prentice-Hall, Englewood Cliffs, NJ. Smoot, J. P., & Lowenstein, T. K. [1991], “Depositional environments of non-marine evaporites.” In: Melvin, J. L. (ed.), Evaporites, Petroleum and Mineral Resources: Developments in Sedimentology 50. Elsevier, Amsterdam. Syvitski, J. P. M., & Hutton, E. W. H. [2001], “2D SEDFLUX 1.0C, an advanced process-response numerical model for the fill of marine sedimentary basins.” Computers & Geosciences, 27(6), 713–753. Takagi, T., & Sugeno, M. [1985], “Fuzzy identification of systems and its application for modeling and control.” IEEE Transactions on Systems, Man and Cybernetics, 15(1), 116–132. Tetzlaff, D. L., & Harbaugh, J. W. [1989], Simulating Clastic Sedimentation. Van Nostrand Reinhold, New York. Turcotte, D. L., & Schubert, G. [1982], Geodynamics — Applications of Continuum Physics to Geological Problems. John Wiley, New York. Vail, P. R., Mitchum, R. M., Jr, Todd, R. G., Widmier, J. M., Thompson, S., III, Sangree, J. B., Bubb, J. N., & Hatleid, W. G. [1977], “Seismic stratigraphy and global changes in sea level.” In: Payton, C. E. (ed.), Seismic Stratigraphy—Application to Hydrocarbon Exploration. American Association of Petroleum Geologists Memoir 26, pp. 49–62. Tulsa, OK. Wendebourg, J., & Harbaugh, J. W. [1996], “Sedimentary process simulation: a new approach for describing petrophysical properties in three dimensions for subsurface flow simulations.” In: Forster, A., & Merriam, D. F. (eds.), Geological Modeling and Mapping, pp. 1–25. Plenum Press, New York. Wilgus, C. 
K., Hastings, B. S., Kendall, C. G. St C., Posamentier, H. W., Ross, C. A., & van Wagoner, J. C. [1989], Sea Level Changes: an Integrated Approach. Society of Economic Paleontologists and Mineralogists, Special Publication 42. Tulsa, OK. Witten, T. A., & Sander, L. M. [1983], “Diffusion-limited aggregation.” Physical Review B, 27(9), 5686–5697. Wright, L. D. [1978], “River deltas.” In: Davis, R. A., Jr (ed.), Coastal Sedimentary Environments, pp. 5–68. Springer-Verlag, New York. Wright, L. D., & Coleman, J. M. [1971], “Effluent expansion and interfacial mixing in the presence of a salt wedge, Mississippi River Delta.” Journal of Geophysical Research, 76(36), 8649–8661. Yang, W., Krouse, H. R., Spencer, R. J., Lowenstein, T. K., Hutcheon, I. E., Ku, T.-L., Li, J., Roberts, S. M., & Brown, C. B. [1999], “A 200,000-year record of change in oxygen isotope composition of sulfate in a saline sediment core, Death Valley, California.” Quaternary Research, 51(2), 148–157.
Chapter 6
Fuzzy Logic in Hydrology and Water Resources
Istvan Bogardi, Andras Bardossy, Lucien Duckstein, and Rita Pongracz
6.1 Introduction
6.2 Overview
6.3 Fuzzy Rule-Based Hydroclimatic Modeling
6.3.1 Selection of the input and output base variables
6.3.2 Definition of fuzzy sets
6.3.3 Definition of the training and validation data sets
6.3.4 Rule construction
6.3.5 Validation procedure
6.3.6 Assessment of the fuzzy rule system
6.3.7 Evaluating the fuzzy rule-based model
6.4 Application Examples for Nebraska, Arizona, Germany, and Hungary
6.4.1 Long-term statistical forecasting of drought index in Nebraska and Hungary
6.4.2 Long-term statistical forecasting of precipitation in Hungary, Arizona, and Germany
6.5 Discussion and Conclusions
References
Abstract
Since the early application of fuzzy logic to hydrology, a large amount of research has been pursued, and fuzzy logic has increasingly become a practical tool in hydrologic analysis and water resources decision making. In this chapter the main areas of application are highlighted. Then, one major area of hydrology, namely hydroclimatic modeling of hydrological extremes (i.e., droughts and intensive precipitation), is selected to describe in detail the methodology using fuzzy rules of inference (in other words, the fuzzy rule-based modeling technique). Results over four regions (Arizona, Nebraska, Germany, and Hungary) and under three different climates (semiarid, dry and wet continental) suggest that the fuzzy rule-based approach can be used successfully to predict the statistical properties of monthly precipitation and drought index from the joint forcing of macrocirculation patterns and ENSO information.
6.1 Introduction
Hydrology and water resources commonly involve a system of concepts, principles, and methods for dealing with modes of reasoning that are approximate rather than exact. In other words, hydrology is hampered by uncertainties caused by nature (e.g., climate), limited data, and imprecise modeling. For instance, aquifer parameters are obtained from a few locations that represent a small fraction of the total volume. The definition of system boundaries and initial conditions also introduces uncertainty. Future stresses on the system are also imprecisely known. The stochastic approach of uncertainty analysis considers aquifer properties as random variables with known distributions. Thus, the outputs from a stochastic model are also characterized by the statistical moments or the full probability density function. However, despite the theoretical development of the stochastic approach, its practical application is rather limited, especially if a point process model needs to be upscaled.
From the early application of fuzzy logic to hydrology [Bogardi et al., 1983], a large amount of research has been pursued and, at present, fuzzy logic has become a practical tool in hydrologic analysis and water resources decision making. In this chapter, the main areas of applications are highlighted and one major area is selected to describe details of the methodology and examples of application results.
6.2 Overview
In contrast with or in complement to a probabilistic approach, fuzzy logic also allows us to consider the treatment of imprecision (or vagueness) in hydrology. For example, consider the following statements: “runoff increases with higher antecedent moisture,” or “the water supply is less than the demand in midsummer,” or “transmissivity is greater near the foothills,” or “salt water intrusion may occur if pumpage gets close to capacity,” or “high intensity rain.” How can decisions be made under these conditions? Also, is a trade-off possible between imprecision and other criteria, such as cost or risk?
As an example, let us consider a deficit or shortage in water supply as h = D − Q, where D and Q denote a given demand and supply, respectively. If we simply define a failure F as the event h > 0, then we have an ordinary set inclusion. If, however, we follow Duckstein et al. [1988a] and define a failure incident I* as a set of ordered pairs defined on the set of real numbers R, I* = {(h, F(h)): h ∈ R}, where F(h) is the membership grade function of h in F with values in [0, 1], then I* is a fuzzy set. Such a definition makes it possible to accept an imprecise definition of a failure: incident I*(h) may still represent only a fair supply condition for a 10% surplus. Thus any value of deficit h belongs to the set F of supply failures with a non-negative membership function F(h).
Stream discharge is another example of imprecision (or vagueness) in hydrology. Discharge curves corresponding to time-invariant hydrological conditions are often
based on a few data points where the underlying discharge measurements may be inaccurate. Critical hydrologic decisions must therefore be made over such parts of the discharge curve where no data points or only imprecise data points are available [Chow et al., 1988]. Fuzzy regression may be used to express the uncertainty in discharge curves. Modeling of various relationships between variables describing water quantity and quality provides other examples. For instance, the dissolved chemical concentration of contaminants such as phosphorus stemming from non-point sources is commonly related to the peak flow or total volume of runoff events [Wetzel, 1975]. The number of such measurements is often not sufficient to perform standard statistical regression analysis. Similarly, sediment transport relationships may use river flow quantities to estimate suspended sediment concentration or bed-load. Since the river regime may change quite fast, often there is only enough time to obtain just a few observation points during important sediment transport events. In addition, the sediment measurements are themselves quite imprecise. In groundwater hydrology several examples can be mentioned. Aquifer parameters are estimated using expensive field tests, and time (or budget) constraints may lead to the availability of a relatively small number of data points. Further, dispersion coefficients (which are quite difficult to measure directly) may be estimated indirectly from other aquifer parameters [Fried, 1975]. Even though calculated pollution concentration may be quite sensitive to the dispersion coefficient, the inaccuracy involved is rarely accounted for. Fuzzy regression may also be helpful in this regard. Rate constants for dissolved oxygen models may be estimated from average water depth and flow velocity [Biswas, 1981]. However, measured rate constants may be available at only a few points; thus the use of statistical regression analysis is not justified. 
Flow in fractured rocks is strongly related to the geologic properties of the material. Parameters governing such flow can be estimated indirectly from geologic quantities which, however, are often difficult and time consuming to measure. As a result, the relationships may have to be modeled from just a few data points [Bogardi et al., 1982]. Environmental health risk analyses use dose–response relationships based on a few animal experiments over a dose region which considerably exceeds doses occurring in contaminated groundwater. Health risk estimated from such uncertain relationships may be the basis of environmental regulation. Fuzzy logic offers a possibility to express health risk under uncertainty and to select cost-effective risk reduction alternatives [Bardossy et al., 1991a].
Main domains of fuzzy logic applications in hydrology include:
1. Fuzzy regression, which is useful when it is known that a causal relation exists, but only very few data points are available [Bardossy et al., 1990, 1991b; Ozelkan & Duckstein, 2000].
2. Hydrologic forecasting, for instance to embed short-term flood forecasting into medium-term forecasting. Kalman filtering is used for the short-term component, while fuzzy logic operates on the medium term, leading to a complete real-time forecasting system [Kojiri, 1988].
3. Hydrological modeling, where traditional rainfall runoff models can be replaced by fuzzy rule systems with similar performance [Hundecha et al., 2001].
4. Fuzzy set geostatistics allows us to use imprecise and possibly indirect measurements and small datasets in spatial statistical analysis [Bardossy et al., 1988, 1990].
5. Incorporation of spatial variability into groundwater flow and transport modeling with fuzzy logic [Dou et al., 1995, 1997a; Woldt et al., 1997]. In this approach, the imprecision of hydraulic parameters is embedded directly into the governing differential equations as fuzzy numbers [Dou et al., 1995, 1997a]. Then the system of finite difference equations is solved using fuzzy set theory methods. This fuzzy modeling technique can handle imprecise parameters in a direct way without generating a large number of realizations (which is the common feature of the stochastic approach).
6. Regional water resources management aims at selecting among alternative management schemes under small data sets and imprecisely known or modeled objectives [Bogardi et al., 1982; Nachtnebel et al., 1986; Bardossy et al., 1989].
7. Multicriterion decision making (MCDM) under uncertainty is essential when water resources systems face multiple and conflicting criteria (objectives), e.g., economic efficiency and environmental preservation, and the criteria corresponding to alternative systems are imprecisely known [Duckstein et al., 1988b; Bardossy et al., 1992; Bogardi et al., 1996]. These criteria are defined as fuzzy numbers and MCDM is performed in a fuzzy logic framework.
8. Fuzzy risk analysis considers uncertainty in any or all elements of risk analysis: exposure or load, resistance or capacity, and consequence [Bogardi et al., 1989; Duckstein & Bogardi, 1991]. The uncertainties are defined as fuzzy numbers, so the risk is also obtained as a fuzzy number. In a risk management framework, management options are evaluated to identify the best option, say in a risk–cost trade-off formulation [Lee et al., 1994, 1995; Stansbury et al., 1999; Mujumdar & Sasikumar, 2002].
9. Reservoir operation planning may apply fuzzy logic to derive operation rules [Simonovic, 1992; Shrestha et al., 1996]. Operation rules are generated on the basis of economic development criteria such as hydropower; municipal; industrial and irrigation demands; flood control and navigation; and environmental criteria such as water quality for fish and wildlife preservation, recreational needs, and downstream flow regulation. Split sampling of historical data (mean daily time series of flow, lake level, demands, and releases) is used to train and then validate the fuzzy logic model. Such models appear to be easy to construct, apply, and extend to a complex system of reservoirs [Teegavarapu & Simonovic, 1999].
10. Climatic modeling of hydrological extremes has applied fuzzy logic to describe a stochastic linkage between large-scale climatic forcing and local hydrological variables [Pesti et al., 1996; Pongracz et al., 2001]. Large-scale climatic forcing may include atmospheric circulation patterns (weather types) and sea surface temperature (SST) indicating El Niño or La Niña in the tropical Pacific region. Local hydrological variables may include, among others, daily precipitation, temperature, evaporation, wind velocity, and drought indices. This last application is the subject of the remainder of this chapter.
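Returning to the water-supply deficit example at the start of this section, the contrast between the ordinary failure set (h > 0) and the fuzzy failure set can be sketched in a few lines. The piecewise-linear membership function and its breakpoints below are assumptions made for illustration; the chapter does not specify the shape of F(h), and the deficit is expressed here relative to demand for convenience.

```python
def crisp_failure(demand, supply):
    """Ordinary (crisp) failure set: 1 if the deficit h = D - Q is positive, else 0."""
    return 1.0 if demand - supply > 0 else 0.0


def fuzzy_failure(demand, supply, lower=-0.2, upper=0.2):
    """Membership grade F(h) of the relative deficit h = (D - Q) / D.

    The linear ramp and the breakpoints 'lower' and 'upper' are assumed values:
    a surplus of 20% or more is graded 0, a deficit of 20% or more is graded 1.
    """
    h = (demand - supply) / demand
    if h <= lower:
        return 0.0
    if h >= upper:
        return 1.0
    return (h - lower) / (upper - lower)


# A 10% surplus: not a failure in the crisp sense, partly a failure in the fuzzy sense.
crisp_grade = crisp_failure(demand=100.0, supply=110.0)
fuzzy_grade = fuzzy_failure(demand=100.0, supply=110.0)
```

Under these assumed breakpoints the 10% surplus receives a crisp grade of 0 but a small positive fuzzy grade, which is the sense in which such a supply condition may still be only "fair".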
6.3 Fuzzy Rule-Based Hydroclimatic Modeling
Fuzzy rule-based modeling shows much potential in cases when a causal relationship is well established but difficult to calculate under real-life conditions, when data are scarce and imprecise, or when a given input vector has several contradictory responses which may be true to varying degrees. These features are often present in hydrology and water resources. Fuzzy rule-based modeling may be considered as an extension of fuzzy logic. The primary difference is that fuzzy logic is traditionally used for system control with feedback, whereas fuzzy rule-based modeling is employed to simulate processes, usually without a feedback mechanism [Sugeno & Yasukawa, 1993; Wang & Mendel, 1992; DeCampos & Moral, 1993]. The advent of fuzzy rule-based modeling is a recent development that currently exists without an extensive base of scientific applications such as that enjoyed by fuzzy logic adherents in engineering disciplines. Consequently, the use of this approach to enhance the modeling of hydrological processes is relatively new [Bardossy & Duckstein, 1995]. Fuzzy rule-based modeling has been used in several areas of hydrology, including:
● classification of spatial hydrometeorological events [Bardossy et al., 1995];
● climatic modeling of flooding [Bogardi et al., 1995];
● modeling of groundwater flow and transport [Bardossy & Disse, 1993; Dou et al., 1997b, 1999; Woldt et al., 1997];
● modeling regional-scale nitrate leaching using available soil and cultivation data [Bardossy et al., 2003; Haberlandt et al., 2002];
● forecasting pollutants transport in surface waters [Di Natale et al., 2000];
● hydroclimatic modeling of hydrological extremes, i.e., droughts and intensive precipitation.
We use the hydroclimatic modeling of hydrological extremes to describe a typical fuzzy rule-based approach as applied in hydrology. An application of fuzzy rules of inference (or, in other words, the fuzzy rule-based (FRB) modeling technique) is illustrated by several examples. Basic definitions and the main characteristics of knowledge-based fuzzy systems are presented in Chapter 2. The definition of fuzzy rules is provided in Section 2.7. In the case of FRB modeling, observed data take the place of experts, and several conditional, unqualified fuzzy propositions are used (see Section 2.5). Because of several difficulties experienced in traditional statistical analysis, FRB modeling can be used for estimating different hydroclimatological variables. The main advantages of the FRB approach are that it has a relatively simple structure and requires neither independence of the data nor long data sets. In the following, an FRB technique, called a weighted counting algorithm [Bardossy & Duckstein, 1995], is adopted to estimate a drought index. The weighted counting algorithm is applied to a subset of the data (known as a training set). Results are then compared to a validation subset of the data. The procedure is described in a step-by-step manner so that the same steps can be used in other similar cases.
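The train-then-validate workflow just described can be sketched as follows. This is a deliberately simplified illustration and not necessarily the weighted counting algorithm of Bardossy & Duckstein [1995]: it assumes a single input variable, three hypothetical triangular premise sets, and a rule response taken as the membership-weighted mean of the observed outputs. All numbers are made up.

```python
def triangular(x, left, peak, right):
    """Membership grade of x in a triangular fuzzy set (left, peak, right)."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical premise fuzzy sets on one input variable (illustrative only).
PREMISE_SETS = {
    "low": (-4.0, -2.0, 0.0),
    "medium": (-2.0, 0.0, 2.0),
    "high": (0.0, 2.0, 4.0),
}


def train_rules(training):
    """For each premise set, take the membership-weighted mean of observed responses."""
    rules = {}
    for name, params in PREMISE_SETS.items():
        weights = [triangular(x, *params) for x, _ in training]
        total = sum(weights)
        if total > 0.0:  # skip rules never activated by the training data
            rules[name] = sum(w * y for w, (_, y) in zip(weights, training)) / total
    return rules


def predict(rules, x):
    """Combine rule responses, weighted by how strongly each rule fires for x."""
    fired = [(triangular(x, *PREMISE_SETS[name]), resp) for name, resp in rules.items()]
    total = sum(w for w, _ in fired)
    return sum(w * r for w, r in fired) / total if total > 0.0 else None


# Made-up (input, response) pairs, split into training and validation subsets.
data = [(-3.0, -2.5), (-1.0, -1.0), (0.0, 0.2), (1.0, 0.8), (3.0, 2.4),
        (-2.0, -1.8), (2.0, 1.9), (0.5, 0.4)]
training, validation = data[:5], data[5:]

rules = train_rules(training)
mean_abs_error = sum(abs(predict(rules, x) - y) for x, y in validation) / len(validation)
```

The rules are fitted on the training subset only, and the mean absolute error on the held-out subset gives one simple measure of how well they generalize.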
6.3.1 Selection of the input and output base variables
First, the output base variable (so-called response) that the FRB model aims to calculate is selected. Here we consider agricultural drought events represented by various types of drought indices (further described below) as the output variable. Next, input base variables (so-called premises) are selected in this case on the basis of both physical reasoning and statistical analysis. Drought events (e.g., over the Great Plains of North America) are strongly related to large-scale climatic forcings such as: (1) continental scale atmospheric circulation patterns; and (2) climate oscillations present both in the ocean and in the atmosphere (e.g., ENSO with El Niño and La Niña events [Glantz et al., 1991], and see below).
Large-scale atmospheric circulation patterns (CPs) can be represented by either the sea surface pressure field or daily geopotential height fields (e.g., 500 hPa, 700 hPa level) above a continental-size area containing the study region [NCAR, 1966]. To overcome the time-scale difference between monthly drought index and daily CP, the effects of CP on droughts are represented by the monthly empirical relative frequencies of daily CP types. The CP types can be identified by a combined multivariate technique [Wilks, 1995], namely, principal component analysis and cluster analysis using the k-means method [MacQueen, 1967]. Details of this methodology are presented in Matyasovszky et al. [1993]. Other studies in Europe are based on the semi-objective CP classification system of Hess and Brezowsky [1952, 1977], or on fuzzy rule-based classification [Bardossy et al., 1995, 2002].
The importance of ENSO effects on weather anomalies and crop production in the U.S. Midwest was shown by many researchers, and is summarized by Carlson et al. [1996]. 
The association of drought with ENSO has been demonstrated for the whole United States [Piechota & Dracup, 1996], but the correlations are not strong enough to predict drought from any ENSO index alone. The ENSO phenomena are
6.3 Fuzzy Rule-Based Hydroclimatic Modeling
159
represented by the time series of SOI (Southern Oscillation Index), which is one of the most commonly used indices in ENSO research [NOAA, 2001a]. SOI is defined as the monthly pressure difference between Tahiti and Darwin [Clarke & Li, 1995]. Positive and negative SOI values refer to La Niña and El Niño episodes, respectively. After consideration of physical causality, possible statistical relationships between the selected response and the premises are analyzed. In this case, the correlation coefficients between the monthly relative frequencies of CP types and lagged drought index (DI) and between the monthly relative frequencies of CP types and SOI are smaller than 0.2, and mostly not significant. On the other hand, the empirical frequency distributions of CP types during the five drought categories (as defined in Table 6.1) are different at the 0.01 significance level. Figure 6.1 shows the frequencies of CP types during the two most extreme DI intervals: very dry and very wet conditions. The frequencies of CP types during the three ENSO phases (as defined in Table 6.2) are also significantly different. The correlation coefficients between DI and lagged SOI reach 0.39 and are significant (Figure 6.2). Both directions of lag have been evaluated since simultaneous, lag, and pre-lag teleconnections of climate
Table 6.1 Categories defined on drought index (DI).

DI intervals      Drought categories
DI < −3           very dry
−3 ≤ DI < −1      dry
−1 ≤ DI ≤ +1      normal
+1 < DI ≤ +3      wet
DI > +3           very wet
Figure 6.1 Empirical relative frequency distributions of CP types during extreme drought conditions in climate division 8.
Table 6.2 Categories defined on SOI.

SOI intervals      ENSO phases
SOI ≤ −1           El Niño
−1 < SOI < +1      neutral
SOI ≥ +1           La Niña
Figure 6.2 Correlation coefficients between drought and the lagged SOI.
variables may be related to ENSO [Wright, 1985]. The conditional frequency distributions of DI during El Niño and La Niña periods (Figure 6.3) are also significantly different. These statistical analyses reinforce earlier findings (e.g., [Piechota & Dracup, 1996]) that, despite the strong teleconnection between ENSO and droughts, droughts have occurred under various phases of ENSO [Carlson et al., 1996]. Thus, for instance in the Great Plains, the partial signals of ENSO and CP on drought are weaker than in other regions. Also, CP and ENSO are evidently interdependent since they both represent parts of the complex climate system. Thus, the more traditional stochastic approach to regress SOI and the frequencies of CP types with a drought index does not work (see below). In summary, the SOI and the monthly frequency distribution of CP types constitute the input base variables, forcing functions, or premises. Next, the question arises as to how many prior monthly premises should be considered to predict the drought index. There is no strict rule for this case; here we use a selection based on the correlation analysis between SOI with different lag periods and the drought. For the CP types, none of the prior months has any significant correlation; thus, only the simultaneous frequency distributions of CP types represent the first type of premises (X1 , . . . , X6 ).
Figure 6.3 Empirical relative frequency distributions of drought conditions during El Niño and La Niña.
For SOI, as already shown on Figure 6.2, the lag correlations are significant up to the prior six months. The highest correlation between DI and SOI occurs for a lag of 6 months in the case of Nebraska, but then the correlation coefficients weaken. However, another local maximum correlation can be seen at the −4 month lag period (Figure 6.2). Furthermore, theoretically, as an annual cycle is considered, no lag periods beyond six months in either direction are taken into account. On the basis of these findings, we used four lagged periods (0, −2, −4, and −6 months) of high correlations as SOI-type premises (X7 , X8 , X9 , X10 ). Note the trade-off between the increasing number of premises and the length of the data set.
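The lag selection described above can be sketched in a few lines. This is only an illustration under stated assumptions: the helper names `pearson` and `lagged_correlation` are hypothetical, the series are synthetic rather than the Nebraska data, and a negative lag is taken to mean that SOI precedes the drought index.

```python
from math import sqrt


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def lagged_correlation(di, soi, lag):
    """Correlate di[t] with soi[t + lag]; lag = -2 pairs DI with SOI two months earlier."""
    pairs = [(di[t], soi[t + lag]) for t in range(len(di)) if 0 <= t + lag < len(soi)]
    return pearson([p[0] for p in pairs], [p[1] for p in pairs])


# Synthetic series (not real data): DI simply repeats SOI two months later.
soi = [0.5, -1.2, 0.3, 1.1, -0.4, -0.9, 0.8, 0.2, -1.5, 0.6, 1.3, -0.7]
di = [0.0, 0.0] + soi[:-2]
for lag in (0, -2, -4, -6):
    print(lag, round(lagged_correlation(di, soi, lag), 2))
```

In this toy case the correlation at lag −2 is 1 by construction; with real data one would keep the lags whose coefficients are both large and statistically significant.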
6.3.2 Definition of fuzzy sets Fuzzy sets (basic characteristics of which are described in Chapter 2) are defined for each variable involved in the model. Here, all fuzzy sets are fuzzy numbers (Chapter 2) with triangular membership functions. Each fuzzy number can be represented by a triple ⟨a, b, c⟩ of real numbers, where b defines its core and the open interval (a, c) defines its support. Definitions of fuzzy numbers are based on the ranges of premises and the response. A fuzzy partition is applied to each variable (SOI values, CP relative frequencies, DI values).
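For reference, the membership function of a triangular fuzzy number ⟨a, b, c⟩ with a < b < c can be written out as below. This is the standard form, given here on the assumption that the fuzzy numbers of this chapter follow it; fuzzy numbers at the edge of a range (a = b or b = c) keep only one of the two slopes.

```latex
\mu_{\langle a,b,c\rangle}(x) =
\begin{cases}
\dfrac{x-a}{b-a}, & a < x \le b,\\[6pt]
\dfrac{c-x}{c-b}, & b < x < c,\\[6pt]
0, & \text{otherwise.}
\end{cases}
```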
Fuzzy numbers defined on premises The entire range of possible premise values is divided into several overlapping classes each forming a fuzzy number. The more fuzzy numbers we define, the better estimation can be expected for the values of DI. However, as we are going to use subsets of
the data to define and validate our FRB system, if too many fuzzy numbers are defined on the premises the validation set might contain too many observations that have never occurred in the training set. As a compromise, all premises (relative frequencies of CP types, and lagged SOI time series) are divided into five regions, namely for monthly CP occurrence: very rare A1 , rare A2 , medium A3 , frequent A4 , and very frequent A5 (Figure 6.4). Then, for SOI: strong El Niño A1 , weak El Niño A2 , neutral A3 , weak La Niña A4 , and strong La Niña phases A5 (Figure 6.5). Various CP types occur with different frequencies, so for the sake of comparability the highest monthly frequency
Figure 6.4 Fuzzy numbers defined on the monthly relative frequency of a given CP type.
Figure 6.5 Fuzzy numbers defined on SOI.
Table 6.3 Monthly maximum relative frequencies (max) for daily CP types and their proportions.

          CP1    CP2    CP3    CP4    CP5    CP6
max       0.77   0.50   0.93   0.74   0.94   0.60
¾·max     0.58   0.38   0.70   0.56   0.70   0.45
½·max     0.38   0.25   0.47   0.37   0.47   0.30
¼·max     0.19   0.13   0.23   0.19   0.23   0.15
Table 6.4 Values of premise membership functions for the data array April 1946.

i     Xi1      A1i         A2i     A3i      A4i        A5i
               Very rare   Rare    Medium   Frequent   Very frequent
1     0.30     0           0.44    0.56     0          0
2     0        1.00        0       0        0          0
3     0.20     0.14        0.86    0        0          0
4     0.07     0.64        0.36    0        0          0
5     0.03     0.86        0.14    0        0          0
6     0.40     0           0       0.33     0.67       0

i     Xi1      Strong El Niño   Weak El Niño   Neutral   Weak La Niña   Strong La Niña
7     −1.04    0                0.69           0.31      0              0
8     0.31     0                0              0.79      0.21           0
9     0.60     0                0              0.60      0.40           0
10    0.25     0                0              0.83      0.17           0
that ever occurred in the data set is defined as the maximum of the given CP-type premise (the present case is included in Table 6.3). As an example, in April 1946 (representing a data array) the occurrences of CP types are: CP1: 9, CP2: 0, CP3: 6, CP4: 2, CP5:1, and CP6: 12 days. Thus, for that month X1 = 0.30 (relative frequency of CP1), X2 = 0, X3 = 0.20, X4 = 0.07, X5 = 0.03, X6 = 0.40, X7 = −1.04 (simultaneous SOI), X8 = 0.31 (SOI −2 months before), X9 = 0.60 (SOI −4 months before), X10 = 0.25 (SOI −6 months before). The corresponding membership functions are given in Table 6.4. For example, the relative frequency of CP1 X1,1 = 0.30 possesses membership values (different from 0) in both fuzzy sets “Rare monthly CP occurrence” and “Medium monthly CP occurrence,” 0.44 and 0.56, respectively; or the relative frequency of CP5 X5,1 = 0.03 has membership values (different from 0) in both fuzzy sets “Very rare monthly CP occurrence” and “Rare monthly CP occurrence,” 0.86 and 0.14, respectively.
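The April 1946 memberships for CP1 can be reproduced with a short sketch. It assumes, consistently with the values in Tables 6.3 and 6.4 but not stated explicitly in the text, that the five fuzzy numbers have their cores at 0, ¼, ½, ¾ and 1 times the maximum frequency and that each support reaches to the neighbouring cores. The function names are hypothetical.

```python
def triangular(x, a, b, c):
    """Membership of x in the triangular fuzzy number <a, b, c> (core b, support (a, c)).

    a == b or b == c gives a one-sided fuzzy number at the edge of the range.
    """
    if x == b:
        return 1.0
    if a < x < b:
        return (x - a) / (b - a)
    if b < x < c:
        return (c - x) / (c - b)
    return 0.0


LABELS = ["very rare", "rare", "medium", "frequent", "very frequent"]


def cp_memberships(x, x_max):
    """Memberships of a relative frequency x in five evenly spaced fuzzy numbers.

    Assumption: cores at 0, 1/4, 1/2, 3/4 and 1 times x_max (the rows of Table 6.3).
    """
    step = x_max / 4.0
    result = {}
    for index, label in enumerate(LABELS):
        core = index * step
        a = max(core - step, 0.0)
        c = min(core + step, x_max)
        result[label] = triangular(x, a, core, c)
    return result


# April 1946: CP1 occurred on 9 of 30 days; its maximum monthly frequency is 0.77.
x1 = 9 / 30
m = cp_memberships(x1, 0.77)
print({label: round(value, 2) for label, value in m.items()})
```

With these assumptions the sketch returns 0.44 for "rare" and 0.56 for "medium", matching the first row of Table 6.4.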
Figure 6.6 Fuzzy numbers defined on PMDI.
Fuzzy numbers defined on the response Drought Index (DI) as the response (Y ) is considered in the present example for the eight climate divisions of Nebraska established by NOAA [2001b] and the spatial average of the entire state. Different types of fuzzy numbers can be defined on the range of DI from extremely dry (large negative DI values) to extremely wet (large positive DI values) conditions. As the total number of fuzzy numbers increases (7, 8, 11, 12, 17, 18), the accuracy of the FRB model improves. So the last option was chosen with 18 fuzzy numbers: B1 , . . . , B18 (Figure 6.6). This fuzzy partition offers a proper representation of the wide range of DI, and the data set is able to provide several arrays in each interval. For the example of April 1946, values of the DI membership function are given in Table 6.5 for the eight climate divisions and the spatial average. Note that in other cases linear partitioning may not be applied to the response, e.g., if it follows a skewed frequency distribution, as precipitation does. In this case, a different partitioning is selected; namely, the 10th percentiles are assigned to the core of the fuzzy sets (Figure 6.7).
6.3.3 Definition of the training and validation data sets The entire data set $\{X_{ij}; Y_j\}_{i=1,\ldots,k;\ j=1,\ldots,n}$ contains k (= 6 + 4 = 10 in the present example) premises $X_i$ and n observations on the premises and the response Y. The entire time series is split into two parts: a training set τ (2/3 of the entire period) and a validation set ν (1/3 of the entire period). The training set is used to learn the fuzzy rules, so it must be long enough to provide valuable model outputs. The validation set is applied to validate the rules derived from the training set, namely, how correctly they estimate the observed response. Different partitions of the data set can be used to check the sensitivity of results to this operation; in the present example case, the results are not sensitive to the selection of partitions. Besides the continuous partitioning, it is possible to select every third data point for the validation procedure, while the other two are parts of the training set.
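Both partitioning options mentioned above can be sketched as follows. The function names are hypothetical and the list of integers merely stands in for a sequence of monthly data arrays.

```python
def split_contiguous(records, train_fraction=2 / 3):
    """First part of the series for training, the remainder for validation."""
    n_train = round(len(records) * train_fraction)
    return records[:n_train], records[n_train:]


def split_every_third(records):
    """Every third record goes to validation; the other two stay in training."""
    train = [r for idx, r in enumerate(records) if idx % 3 != 2]
    valid = [r for idx, r in enumerate(records) if idx % 3 == 2]
    return train, valid


months = list(range(12))  # stand-in for 12 monthly data arrays
print(split_contiguous(months))
print(split_every_third(months))
```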
Table 6.5 Values of drought response membership functions for the data array April 1946 in different regions of Nebraska.

Drought          Membership values
Div.   Y1        B1        B2      ...   B6      B7      B8      B9       B10     ...   B17     B18
                 Extreme   Dry 7         Dry 3   Dry 2   Dry 1   Normal   Wet 1         Wet 8   Extreme
                 dry                                                                            wet
1      −1.16     0         0       ...   0       0.16    0.84    0        0       ...   0       0
2      −1.71     0         0       ...   0       0.71    0.29    0        0       ...   0       0
3      −1.31     0         0       ...   0       0.31    0.69    0        0       ...   0       0
5      −2.47     0         0       ...   0.47    0.53    0       0        0       ...   0       0
6      −1.47     0         0       ...   0       0.47    0.53    0        0       ...   0       0
7      −2.08     0         0       ...   0.08    0.92    0       0        0       ...   0       0
8      −2.29     0         0       ...   0.29    0.71    0       0        0       ...   0       0
9      −1.76     0         0       ...   0       0.76    0.24    0        0       ...   0       0
NE     −1.84     0         0       ...   0       0.84    0.16    0        0       ...   0       0
Figure 6.7 Fuzzy numbers defined on the response variable based on the 10th percentiles of the precipitation time series in Hungary.
6.3.4 Rule construction Fuzzy rules are constructed using the training set τ: $\{X_{ij}; Y_j\}_{i=1,\ldots,k;\ j=1,\ldots,n_t}$ (where $n_t < n$ is the number of observations in the time series of the training set) by applying the following steps.
Determine the highest values of all membership functions for each data array First, values of membership functions are calculated for each observed premise and the response: $A_{l_i}(X_{ij})$ (for $l_i = 1, \ldots, 5$; $i = 1, \ldots, k$) and $B_l(Y_j)$. Then, the maximum values of membership functions are selected. Thus, each $X_{ij}$ data array within the data set ($j = 1, \ldots, n_t$) possesses a value $M_{ij}$:
\[ M_{ij} = \max_{l_i = 1,\ldots,5} A_{l_i}(X_{ij}), \]
and also each response $Y_j$ possesses a value $M_{0j}$:
\[ M_{0j} = \max_{l = 1,\ldots,18} B_l(Y_j). \]
Table 6.6 shows these selected maximum values for the data array, April 1946.
Combined effect of fuzzy numbers (using operator AND) Since we have more than one premise, the effects of premises should be combined. The two most commonly used operators for fuzzy numbers are AND and
Table 6.6 Maximum membership function values and weights for the data array April 1946.

i     Name        Maximum value Mi1   Name of the fuzzy number
1     CP1         0.56                Medium
2     CP2         1.00                Very rare
3     CP3         0.86                Rare
4     CP4         0.64                Very rare
5     CP5         0.86                Very rare
6     CP6         0.67                Frequent
7     SOI         0.69                Weak El Niño
8     SOI (−2)    0.79                Normal
9     SOI (−4)    0.60                Normal
10    SOI (−6)    0.83                Normal

DOF1 = 0.049

Response variable   Location        Maximum value M01   Name of the fuzzy number   Weight of rule 1, ω1 = DOF1 · M01
DI div. 1           W-Ne            0.84                dry 1                      0.041
DI div. 2           N-Ne            0.71                dry 2                      0.035
DI div. 3           NE-Ne           0.69                dry 1                      0.034
DI div. 5           Central-Ne      0.53                dry 2                      0.026
DI div. 6           E-Ne            0.53                dry 1                      0.026
DI div. 7           SW-Ne           0.92                dry 2                      0.045
DI div. 8           S-Central Ne    0.71                dry 2                      0.035
DI div. 9           SE-Ne           0.76                dry 2                      0.037
DI/NE               Nebraska        0.84                dry 2                      0.041
OR [Zimmermann, 1985]. In the present model we use only the operator AND to combine the effects of different premises. So a rule looks like this:
IF ($X_{1j}$ is $A_{l_1}$ AND $X_{2j}$ is $A_{l_2}$ AND . . . AND $X_{10j}$ is $A_{l_{10}}$) THEN $\hat{Y}_j$ is $B_l$.
The combined effect of all premises is represented here by the product of membership functions called degree of fulfillment (DOF), which indicates the degree of applicability of the rule within the FRB system. Thus, the DOF of the jth data array ($DOF_j$) is calculated as
\[ DOF_j = \prod_{i=1}^{k} M_{ij}. \]
For the data array of April 1946, we obtain $DOF_1 = 0.56 \cdot 1.00 \cdot 0.86 \cdot 0.64 \cdot 0.86 \cdot 0.67 \cdot 0.69 \cdot 0.79 \cdot 0.60 \cdot 0.83 = 0.049$.
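As a quick check of the product form, the following sketch multiplies the ten maximum membership values listed in Table 6.6. Note that the two-decimal table entries give roughly 0.048; the printed value of 0.049 was presumably obtained from unrounded memberships, which is an assumption on our part.

```python
from math import prod

# Largest membership value of each of the ten premises for April 1946 (Table 6.6).
maxima = [0.56, 1.00, 0.86, 0.64, 0.86, 0.67, 0.69, 0.79, 0.60, 0.83]

dof = prod(maxima)  # the AND operator is modeled by the product
print(round(dof, 3))
```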
In the very beginning, the fuzzy rule system is empty, containing no rules at all; the first rule is derived from the first observed values. In the present case, this first rule for the entire state of Nebraska, and for Northern, Central, Southwestern, South-central, and Southeastern Nebraska, is as follows:
Rule (1) IF (Medium CP1 occurrence) AND (Very rare CP2 occurrence) AND (Rare CP3 occurrence) AND (Very rare CP4 occurrence) AND (Very rare CP5 occurrence) AND (Frequent CP6 occurrence) AND (Weak El Niño in the actual month) AND (Neutral phase 2 months before) AND (Neutral phase 4 months before) AND (Neutral phase 6 months before) THEN (Dry2 drought condition)
whereas for Western, Northeastern, and Eastern Nebraska, it is:
Rule (2) IF (Medium CP1 occurrence) AND (Very rare CP2 occurrence) AND (Rare CP3 occurrence) AND (Very rare CP4 occurrence) AND (Very rare CP5 occurrence) AND (Frequent CP6 occurrence) AND (Weak El Niño in the actual month) AND (Neutral phase 2 months before) AND (Neutral phase 4 months before) AND (Neutral phase 6 months before) THEN (Dry1 drought condition)
The rule system will grow as more and more rules are added on the basis of the observed data array. If a rule derived from a given set of data arrays is not included in the rule system yet, then it should be added to the rule system.
Assign a weight to each rule Weights indicate the proportion of the training data sets explained by a given (mth) rule. They are calculated as the sum of the products of $DOF_j$ and the value of the membership function of the response variable ($M_{0j}$):
\[ \omega_m = \sum_{j=1}^{n_t} DOF_j \cdot M_{0j}. \]
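A minimal sketch of how a rule base and its weights might be accumulated from training arrays is given below. It assumes that each data array is available as a list of premise membership dictionaries plus one response membership dictionary, and that a data array contributes only to the weight of the rule it generates; the names `best_label` and `learn_rules` and the toy numbers are hypothetical.

```python
from collections import defaultdict
from math import prod


def best_label(memberships):
    """Return (label, value) of the fuzzy number with the highest membership."""
    label = max(memberships, key=memberships.get)
    return label, memberships[label]


def learn_rules(training_arrays):
    """Build {(premise labels, response label): weight} from training data.

    Each training array is (list of premise membership dicts, response membership dict).
    """
    weights = defaultdict(float)
    for premise_memberships, response_memberships in training_arrays:
        picks = [best_label(m) for m in premise_memberships]
        response, m0 = best_label(response_memberships)
        dof = prod(value for _, value in picks)
        rule = (tuple(label for label, _ in picks), response)
        weights[rule] += dof * m0  # a rule seen again accumulates weight
    return dict(weights)


# Two-premise toy example (hypothetical numbers, not from the chapter's data).
toy = [
    ([{"rare": 0.4, "medium": 0.6}, {"neutral": 0.8, "weak El Nino": 0.2}],
     {"dry 1": 0.7, "dry 2": 0.3}),
    ([{"rare": 0.1, "medium": 0.9}, {"neutral": 0.55, "weak El Nino": 0.45}],
     {"dry 1": 0.6, "normal": 0.4}),
]
rules = learn_rules(toy)
print(rules)
```

Both toy arrays lead to the same rule, so its weight is the sum of the two individual contributions.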
For the data point April 1946, the weights of rule (1) or (2), depending on the area considered, are shown in Table 6.6. If the first rule (1) or (2) appears in more data arrays, the individual weights are summed. After proceeding throughout the entire training set, all derived rules will possess a weight that will be used in the validation procedure when the estimated values of the response are calculated during the defuzzification step.
6.3.5 Validation procedure Fuzzy rules derived from the training set τ are validated using the validation data set ν: $\{X_{ij}; Y_j\}_{i=1,\ldots,k;\ j=n_t+1,\ldots,n}$ in the following steps.
Calculate all possible DOF for each data array All values of membership functions are calculated for each premise, so we have all $A_{l_i}(X_{ij})$ (for $l_i = 1, \ldots, 5$; $i = 1, \ldots, k$) values. Since the fuzzy sets are defined as overlapping intervals, each premise value generally falls into two different fuzzy sets of the given premise (Figures 6.4, 6.5). Thus, theoretically, there are $2^k$ possible rules, but most of them are either impossible or did not occur in the training set (the maximum number of rules is determined by the length of the training set, $n_t$, which is much less than $2^k$). Therefore, only a few existing rules will be taken into account in specifying the response output. As an example, for a data array from the validation set (July 1966), the possible membership values are calculated in the western Nebraska region (Table 6.7). The total number of potentially applicable fuzzy rules is $2^{10} = 1024$ in the present study.
Combine the fuzzy responses: defuzzification At this time, the application of each rule provides a fuzzy response. The defuzzification process combines the fuzzy responses as the weighted linear combination and arrives
Table 6.7 Membership function values for the data array July 1966 (western Nebraska).

i     Xi124    A1i         A2i     A3i      A4i        A5i
               Very rare   Rare    Medium   Frequent   Very frequent
1     0.13     0.33        0.67    0        0          0
2     0.07     0.48        0.52    0        0          0
3     0.23     0.03        0.97    0        0          0
4     0.32     0           0.26    0.74     0          0
5     0.22     0.03        0.97    0        0          0
6     0.03     0.79        0.21    0        0          0

i     Xi124    Strong El Niño   Weak El Niño   Neutral   Weak La Niña   Strong La Niña
7     −0.24    0                0.16           0.84      0              0
8     −0.65    0                0.43           0.57      0              0
9     −1.77    0.18             0.82           0         0              0
10    −1.33    0                0.89           0.11      0              0
at a crisp (a real number) estimated response. The center of gravity is commonly used to obtain the estimated value of the response variable ($\hat{Y}_j$):
\[ \hat{Y}_j = \frac{\sum_{m \in \tau} DOF_m \cdot \omega_m \cdot B_m(2)}{\sum_{m \in \tau} DOF_m \cdot \omega_m}, \]
where $B_m(2)$ is the core of the consequent fuzzy set $B_m$ (where the membership value equals 1) defined on DI. In our example, for data array July 1966 five rules are applicable out of the 1024 possible fuzzy rules (Table 6.8). So the estimate of DI in western Nebraska for July 1966 is:
\[ \hat{Y}_{124} = \frac{10^{-5} \cdot (0.096 \cdot 0.64 \cdot (-1) + 0.25 \cdot 0.14 \cdot 1 + 3.59 \cdot 0.23 \cdot 3 + 1.01 \cdot 0.15 \cdot 1 + 9.12 \cdot 0.30 \cdot (-3))}{10^{-5} \cdot (0.096 \cdot 0.64 + 0.25 \cdot 0.14 + 3.59 \cdot 0.23 + 1.01 \cdot 0.15 + 9.12 \cdot 0.30)} = \frac{-5.39}{3.73} = -1.45. \]
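The weighted combination can be sketched directly from the rows of Table 6.8. The common scale factors of DOF and weight cancel in the ratio, so the tabulated numbers are used as they stand. With the rounded table entries the result is about −1.47, close to the −1.45 printed above; the small difference presumably reflects rounding in the printed values, which is an assumption.

```python
# Rows of Table 6.8: (DOF_m in units of 1e-5, weight in units of 1e-2, core of B_m)
applied_rules = [
    (0.96, 6.4, -1),
    (2.51, 1.4, 1),
    (35.91, 2.3, 3),
    (10.12, 1.5, 1),
    (91.20, 3.0, -3),
]


def defuzzify(rules):
    """Mean of the consequent cores, each weighted by DOF_m * omega_m."""
    numerator = sum(dof * w * core for dof, w, core in rules)
    denominator = sum(dof * w for dof, w, _ in rules)
    return numerator / denominator


estimate = defuzzify(applied_rules)
print(round(estimate, 2))
```

The fifth rule carries by far the largest product of DOF and weight, which is why the estimate lands on the dry side despite three rules pointing to wet conditions.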
6.3.6 Assessment of the fuzzy rule system A more automated method can be used to obtain the entire fuzzy rule system, namely, simulated annealing using the Metropolis–Hastings algorithm [Chib &
Table 6.8 Characteristics of the applied rules for the data array July 1966 (western Nebraska). (VR = very rare, R = rare, M = medium, wE = weak El Niño, N = neutral.)

Applied (mth) rule                              DOFm [10^−5]   Weight ωm [10^−2]   Bm(2)
VR, R, VR, R, R, R, N, wE, wE, N → dry1         0.96           6.4                 −1
VR, R, VR, M, R, VR, wE, N, wE, N → wet1        2.51           1.4                 1
VR, R, R, R, R, VR, wE, wE, sE, wE → wet3       35.91          2.3                 3
R, VR, VR, N, R, R, wE, N, wE, wE → wet1        10.12          1.5                 1
R, R, R, R, R, R, wE, wE, wE, wE → dry3         91.20          3.0                 −3
Greenberg 1995]. In the first step, the performance P of the rule system is defined using the estimated and observed response values:
\[ P = \sum_j F(\hat{Y}_j, Y_j). \]
Typically, F can be chosen as an $l_p$ measure:
\[ F(\hat{Y}_j, Y_j) = \left| \hat{Y}_j - Y_j \right|^p. \]
Other performance functions such as a likelihood-type measure, a geometric distance, or a performance related to proportional errors can also be formulated. Once one has a measure of performance an automatic assessment of the rules can be established. This means that the goal is to find the rule system for which the performance P reaches its minimal value. Since the number of possible different rule systems is very large, there is no possibility to try out each possible rule combination to find the best. Therefore, discrete optimization methods have to be used to find “good” rule systems. Genetic algorithms or simulated annealing are possible candidates for this task. Here, simulated annealing is summarized as a tool for finding the rule system R with optimal performance P(R). The algorithm is as follows:
1. The possible fuzzy sets for the arguments $A_{l_i}$ and the responses $B_l$ are defined.
2. An initial rule system R is generated at random.
3. The performance of the rule system P(R) is calculated.
4. An initial so-called annealing temperature $t_a$ is selected.
5. A rule I of the rule system is picked at random.
6. An argument or a response of this rule is chosen at random.
7. If argument h ≤ k is chosen, an index $1 \le h^* \le l_h$ is chosen at random and a new rule system $R^*$, with $k_{i,h^*}$ replacing $k_{i,h}$, is considered.
8. If response h > k is chosen, an index $1 \le h^* \le l_h$ is chosen at random and a new rule system $R^*$, with $l_i = h^*$ replacing $l_i$, is considered.
9. In both cases the performance of the new rule system $P(R^*)$ is evaluated.
10. If $P(R^*) < P(R)$ then $R^*$ replaces R.
11. If $P(R^*) \ge P(R)$ then the quantity
\[ \pi = \exp\left( \frac{P(R) - P(R^*)}{t_a} \right) \]
is calculated. With the probability π, the rule system $R^*$ replaces R (negative changes).
12. Steps 5–11 are repeated NN times.
13. The annealing temperature $t_a$ is reduced.
14. Steps 12–13 are repeated until the proportion of positive changes becomes less than a threshold ε > 0.
The above algorithm yields a rule system with “optimal” performance. However, the rules obtained might reflect some specific features corresponding to a small number of cases in the data set. To avoid rules that are derived from too few cases, the performance function is modified. The insufficient generality of a rule can be recognized from the number of cases to which it is applied. As an alternative, the degree of fulfillment of the rules can also be considered. In order to ensure the transferability of the rules, the performance of the rule system is modified by taking the sum of the DOFs into account:
\[ P'(R) = P(R) \prod_i \left[ 1 + \left( \frac{v' - \sum_t v_i(x_1(t), \ldots, x_J(t))}{v'} \right)_+ \right] \qquad (6.1) \]
Here $(\cdot)_+$ is the positive part function
\[ x_+ = \begin{cases} x & \text{if } x \ge 0 \\ 0 & \text{if } x < 0 \end{cases} \]
and $v'$ is the desired lower limit for the applicability of the rules, in this case expressed by the sum of DOFs. If the sum of DOFs for a rule is less than $v'$, then a penalty is applied. If $P'$ is used in the optimization procedure, then rules that are based on a few cases and are seldom used are penalized. The degree of penalty depends on the grade to which the desired limit $v'$ exceeds the actual sum of DOFs for a selected rule.
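The annealing loop and the penalty factor can be sketched as follows. This is an illustration only: a rule system is stored as a list of label-index lists, all parameter names and default values are assumptions, the sketch additionally remembers the best system visited (which the listed steps do not require), and the toy performance function simply counts mismatches with a hypothetical target.

```python
import math
import random


def anneal(initial, n_labels, performance, t_start=1.0, cooling=0.9,
           sweeps=200, min_improve=0.01, seed=0):
    """Minimize performance(rule_system) by simulated annealing.

    n_labels[h] is the number of fuzzy sets available at position h of a rule.
    """
    rng = random.Random(seed)
    current = [rule[:] for rule in initial]
    p_current = performance(current)
    best, p_best = current, p_current
    temperature = t_start
    while True:
        improvements = 0
        for _ in range(sweeps):
            i = rng.randrange(len(current))        # pick a rule
            h = rng.randrange(len(current[i]))     # pick an argument or the response
            candidate = [rule[:] for rule in current]
            candidate[i][h] = rng.randrange(n_labels[h])
            p_new = performance(candidate)
            if p_new < p_current:
                improvements += 1
                accept = True
            else:                                  # accept a "negative change" with probability pi
                accept = rng.random() < math.exp((p_current - p_new) / temperature)
            if accept:
                current, p_current = candidate, p_new
                if p_current < p_best:
                    best, p_best = current, p_current
        temperature *= cooling
        if improvements / sweeps < min_improve:
            return best, p_best


def penalized(p, dof_sums, v_min):
    """Equation (6.1): inflate p for every rule whose DOF sum falls below v_min."""
    factor = 1.0
    for s in dof_sums:
        factor *= 1.0 + max((v_min - s) / v_min, 0.0)
    return p * factor


# Toy problem: recover a hypothetical target rule system of two rules.
target = [[0, 2, 1], [3, 1, 0]]


def mismatch(rule_system):
    return sum(a != b for rule, goal in zip(rule_system, target) for a, b in zip(rule, goal))


best, score = anneal([[0, 0, 0], [0, 0, 0]], n_labels=[4, 4, 2], performance=mismatch)
print(best, score)
print(penalized(2.0, [5.0, 1.0], v_min=4.0))
```

In a real application `performance` would run the full rule system over the training set and could wrap the result in `penalized` so that seldom-used rules are discouraged.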
6.3.7 Evaluating the fuzzy rule-based model The FRB model must be evaluated in terms of how well it reproduces the statistical properties and the actual time series of the consequences in the validation set. In order
to fulfill the evaluation procedure, it is possible to compare various statistical parameters of the observed and the calculated time series, e.g., mean, standard deviation, quartiles, deciles, etc. The strength of the linear relationship between the observed and the FRB-modeled time series can be represented by the correlation coefficient; in the optimal case it should be equal to 1. Graphical comparison includes the plot of the observed and FRB-modeled time series, or the scatterplot diagrams of the observed and calculated data. Statistical distributions of the two time series can be compared by using the relative frequency distribution histograms or the empirical probability functions. Furthermore, several types of error terms are available [Wilks, 1995] to describe the reproduction of observed data by the FRB model. Only some of them are listed here. The mean error (ME) is defined as follows:
\[ ME = \frac{1}{n} \cdot \sum_{j=1}^{n} (\hat{Y}_j - Y_j). \]
The mean absolute error (MAE) can be defined as follows:
\[ MAE = \frac{1}{n} \cdot \sum_{j=1}^{n} \left| \hat{Y}_j - Y_j \right|. \]
Finally, the root-mean squared error (RMSE) is the most often used error term; it can be defined as follows:
\[ RMSE = \sqrt{ \frac{1}{n} \cdot \sum_{j=1}^{n} (\hat{Y}_j - Y_j)^2 }. \]
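The three error terms translate directly into code. The sketch below uses hypothetical observed and estimated values purely to exercise the functions.

```python
from math import sqrt


def mean_error(estimated, observed):
    """ME: average signed difference, estimated minus observed."""
    return sum(e - o for e, o in zip(estimated, observed)) / len(observed)


def mean_absolute_error(estimated, observed):
    """MAE: average magnitude of the differences."""
    return sum(abs(e - o) for e, o in zip(estimated, observed)) / len(observed)


def root_mean_squared_error(estimated, observed):
    """RMSE: square root of the average squared difference."""
    return sqrt(sum((e - o) ** 2 for e, o in zip(estimated, observed)) / len(observed))


observed = [-1.0, 0.5, 2.0, -3.0]    # hypothetical drought index values
estimated = [-1.5, 0.5, 1.0, -2.0]   # hypothetical model output
print(mean_error(estimated, observed))
print(mean_absolute_error(estimated, observed))
print(root_mean_squared_error(estimated, observed))
```

ME can be close to zero even when individual errors are large, because positive and negative differences cancel; MAE and RMSE do not have this property.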
In the next section of this chapter several examples of these evaluation forms are shown in order to illustrate the applicability of FRB modeling and the goodness-of-fit between the observed and modeled time series.
6.4 Application Examples for Nebraska, Arizona, Germany, and Hungary
6.4.1 Long-term statistical forecasting of drought index in Nebraska and Hungary Drought indices serve as common tools to measure the intensity and spatial extent of droughts. One of the most commonly used climatic drought indices in the U.S. is the Palmer Drought Severity Index (PDSI) [Palmer, 1965]. The PDSI is based on the principle of a balance between moisture supply and demand when man-made
Figure 6.8 Drought conditions observed in Nebraska and Hungary.
changes are not considered. This index indicates the severity of a wet or dry spell: the greater the absolute value, the more severe the dry or the wet spell. The PDSI was modified by the National Weather Service Climate Analysis Center to obtain another index (modified PDSI or PMDI), which is more sensitive to the transition periods between dry and wet conditions [Heddinghause & Sabol, 1991]. For the example presented here, the modified Palmer index is considered. However, the methodology is applicable to any other drought indices such as the Standardized Precipitation Index [McKee et al., 1993] or the Bhalme–Mooley drought index [Bogardi et al., 1994]. A long-term historical data set of monthly PMDI values exists for climatic divisions around the U.S. [Guttman & Quayle, 1996]. The data set of the monthly PMDI starts in 1895 [NOAA, 2001b]. Drought events occur in the case of negative PMDI values whereas positive values imply wet conditions. The observed PMDI in Nebraska and PDSI in Hungary (Figure 6.8) indicate the high variability and persistence of droughts. Drought indices are evaluated during the summer season (May–August). Drought is a normal part of the climate in both Hungary and Nebraska, and is different from other natural hazards. Drought is a slow-onset, insidious hazard that is often well established before it is recognized as a threat, taking months or years to develop. Very severe drought occurred at both locations in summer 2000. Although both locations have mainly continental climate, Hungarian summers may be interrupted occasionally by oceanic and Mediterranean influences. The climate of the different regions within both locations varies considerably: the western part of Nebraska is in general colder and drier than the eastern part, whereas in Hungary the opposite is true. A historical data set of monthly PDSI values exists for 16 climatic stations in Hungary [Mika, 2000]. The question arises whether the monthly PDSI values are homogeneous.
Figure 6.9 shows the cumulative frequency distribution of PDSI for Szarvas (located in the eastern part of Hungary) for two periods: the training set of
Figure 6.9 Cumulative frequency distributions of monthly PDSI for Szarvas, located in eastern Hungary.
Figure 6.10 Distributions of PMDI for the training and validation sets in climate division 1, Nebraska.
1881–1960 and the validation set of 1961–1990. According to the two-sample Kolmogorov–Smirnov test [Wilks, 1995], the two frequency distributions differ at the 0.01 significance level, which indicates a drier climate during the latter period. The other stations in Hungary generally behave similarly, with some rare exceptions. For Nebraska, Figure 6.10 shows the cumulative frequency distribution of PMDI in climate division 1 for two periods: the training sets of 1946–62 and 1978–94, and the validation set of 1963–77. According to the same test, the two frequency distributions differ at the 0.1 significance level but not at the 0.05 level. The other divisions in Nebraska behave similarly.
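The two-sample Kolmogorov–Smirnov comparison can be sketched without external libraries. The statistic below is the largest vertical distance between the two empirical distribution functions; the critical value uses the common large-sample approximation, which is our assumption since the text does not say which form of the test was applied. The function names and sample values are hypothetical.

```python
from math import log, sqrt


def ks_statistic(sample_a, sample_b):
    """Largest vertical distance between the two empirical distribution functions."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:
        fa = sum(v <= x for v in a) / len(a)
        fb = sum(v <= x for v in b) / len(b)
        d = max(d, abs(fa - fb))
    return d


def ks_critical(n, m, alpha):
    """Large-sample critical value for sample sizes n and m at significance level alpha."""
    return sqrt(-0.5 * log(alpha / 2.0)) * sqrt((n + m) / (n * m))


training = [1, 2, 3, 4]      # hypothetical index values
validation = [3, 4, 5, 6]
d = ks_statistic(training, validation)
print(d, d > ks_critical(len(training), len(validation), 0.05))
```

The distributions are judged different at level alpha when the statistic exceeds the critical value; for samples as small as this toy pair the asymptotic approximation is crude and exact tables would be preferable.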
Figure 6.11 Cumulative frequency distribution of PMDI time series (1946–1994) in southcentral Nebraska (climate division 8).
Results for Nebraska First, examples of the results obtained for Nebraska are given. The distributions of the calculated PMDI reproduce the empirical distributions (Figure 6.11). Figure 6.12 compares the observed and estimated (calculated) time series for climate division 8 in the case of using the FRB modeling technique (upper panel) and the multivariate regression (MR) model (lower panel). The FRB model performs almost perfectly over the training set, and quite well over the entire period, whereas the MR technique is not able to reproduce the variability of the observed PMDI time series, not even in the training period. The calculated values of the FRB model are not exactly the same as the observed drought index during the validation period. This is evident and was expected, since droughts are triggered by a large number of atmospheric, hydrologic, agricultural, and other phenomena in addition to the two types of premises this model considers. Another reason is that during the “learning” process huge and persistent negative (years 1954–57) and positive (1992–94) peaks must be “assimilated.” The model did learn all the peaks, which is necessary in order to apply the FRB model to the entire range of PMDI.
Results for Hungary The FRB approach leads to similarly promising results in Hungary. Because Hungary is located in central Europe, CPs are represented by the daily time series of the Hess–Brezowsky [1952, 1977] CP types. Here, the original 29 CP types have been aggregated into six larger classes according to: (1) the three main circulations (zonal classes (4 types),
Figure 6.12 Observed and estimated PMDI time series for climate division 8, NE.
half-meridional classes (7 types), and meridional classes (18 types)); (2) cyclonic and anticyclonic CP types; and (3) the combination of (1) and (2). The frequency distributions of the calculated drought index, PDSI, reproduce the empirical distributions (Figure 6.13). This performance of the approach, both in Nebraska and in Hungary, is even more noteworthy if we consider the much larger difference between the empirical distributions in the training set (from which the rules are derived) and the validation set; that is, the model works under non-stationary climatic conditions. We conclude that the FRB technique is able to reproduce the variability of droughts influenced by CP and ENSO both in Nebraska and Hungary.
Figure 6.13 Cumulative frequency distribution of PDSI time series (1881–1990) in southwestern Hungary (for station Pécs).
6.4.2 Long-term statistical forecasting of precipitation in Hungary, Arizona, and Germany
Results for Hungary Although there is a considerable variability in average monthly precipitation in Hungary, the frequency distributions of monthly precipitation for selected clusters of months do not change significantly. Namely, for Keszthely, three clusters of months can be separated: January to March, May to August, and September to November (Figure 6.14). Note that April and December do not pertain to any clusters. Here the FRB models contain the CP premises without any lag because regional precipitation does not exhibit any correlation with the relative frequency of CP classes for prior months. This is evident physically since the mean residence time of a water vapor particle in the atmosphere is not more than 10–15 days. Thus the simultaneous frequency distributions of the six CP classes represent the first type of premises. For SOI, we selected two previous monthly values representing the highest correlation with monthly precipitation (for instance, in the case of station Keszthely, −1 and −5 months SOI). The following example rule indicates how it reflects typical weather conditions leading to “extremely high” precipitation for station Keszthely and season
Figure 6.14 Cumulative frequency distributions of monthly precipitation at Keszthely for the three clusters.
May–August in Hungary:
IF ((Very rare zonal–cyclonic CP class occurrence) AND (Very rare half-meridional–cyclonic CP class occurrence) AND (Very frequent meridional–cyclonic CP class occurrence) AND (Very rare zonal–anticyclonic CP class occurrence) AND (Rare half-meridional–anticyclonic CP class occurrence) AND (Rare meridional–anticyclonic CP class occurrence) in the actual month) AND (Strong El Niño 1 month before) AND (Neutral phase 5 months before) THEN (Extremely high monthly precipitation).
This rule can be physically expected because much precipitation results from meridional transport dominance together with strong cyclonic activity above the European continent.
Table 6.9 Standard deviation of observed and estimated time series using different fuzzy rule-based models for station Keszthely in western Hungary. (Model 1: cyclonic–anticyclonic CP + 2 SOI; Model 2: zonal–half-meridional–meridional CP + 2 SOI; Model 3: cyclonic–anticyclonic dominancy and zonality (6 CP classes) + 2 SOI; Model 4: only 6 CP classes; Model 5: only 2 SOI.)

Season                  Jan.–March   May–Aug.   Sept.–Nov.
Observed time series    23.6         41.5       36.8
Model 1                 11.3         10.1       14.7
Model 2                 15.8         23.6       16.7
Model 3                 22.0         35.1       30.0
Model 4                 18.7         28.5       25.5
Model 5                 4.1          7.3        4.4
The results of the FRB models using five fuzzy sets for each premise and 11 for the response are summarized by providing the standard deviations (Table 6.9) of observed and estimated monthly precipitation for all seasons for the climate station Keszthely located in the western part of the country. It is evident from Table 6.9 that the FRB model using 6 + 2 premises reproduces the empirical standard deviation quite well. The frequency distributions of the calculated precipitation reproduce the empirical distributions (Figure 6.15). The results are quite sensitive to the selection of the number of premises. With fewer premises the reproduction of standard deviation becomes worse (Table 6.9). For instance, if only either cyclonic–anticyclonic or zonal–half-meridional–meridional CP classes are used, the model does not reproduce the empirical frequency distributions (Figure 6.16). The necessity of considering both cyclonic–anticyclonic dominance and zonality can be explained easily. Zonal or meridional airflow determines the main circulation characteristics, whereas the major synoptic phenomena, cyclones and anticyclones, are mainly responsible for the regional precipitation.
Results for Arizona In Arizona monthly precipitation data from eight stations, distributed quite evenly over the state, are used with the FRB model. Daily observed 500 hPa level geopotential height data describe the atmospheric CPs defined at 35 grid points on a diamond grid over the southwestern U.S. For the classification of CP, an automated, nonhierarchical method, namely, PCA coupled with k-means clustering technique, is used. Twenty principal components are maintained to explain 97–98% of the variance. There are six, seven, seven, and eight types for winter (January–March), spring (April–June), summer (July–September) and fall (October–December), respectively.
Figure 6.15 Cumulative frequency distributions of monthly precipitation time series in western Hungary (Keszthely) using six CP classes and two SOI as premises.
Figure 6.16 Cumulative frequency distributions of estimated monthly precipitation time series compared to observed precipitation in western Hungary (Keszthely) using two (cyclonic– anticyclonic) or three (zonal–half-meridional–meridional) CP classes and two SOI as premises.
Concerning the other type of premises, lag correlations between monthly precipitation and SOI are higher than in Hungary; e.g., in Grand Canyon the statistically significant (80%) correlations, <0.19, correspond to lags of −2, −5, −9, and −10 months. Thus, SOI is represented with these four monthly values as premises at this station. The number of years chosen for validation is up to 8 years. Figure 6.17 shows the results of the FRB model for December at Grand Canyon, compared with the results of a multiple regression model. It is clear that the FRB model provides a better fit for both the calibration and validation period.

Results for Germany

Twelve CPs have been defined by using 500 hPa geopotential height anomalies over the North Atlantic/European region. The defining process of fuzzy rules is based on the position of high and low air pressure anomalies [Bardossy et al., 1995]. The fuzzy rules are obtained automatically by using an optimization of the performance of the classification, which is measured by rainfall frequencies and rainfall amounts conditioned on the CP. In this application wet and dry CPs were defined. Here, the training and validation periods were selected as 1980–89 and 1970–79, respectively. Figures 6.18 and 6.19 show the distributions of the mean normalized 500 hPa geopotential height anomalies for the wettest and the driest CPs, respectively. CP 1 (Figure 6.19) is the most frequent CP and dominates during the whole year, having the average annual frequency of about 40%. At the same time it is the driest CP with the lowest precipitation probability (25.6%), lowest mean wet-day amount (1.1 mm),
Figure 6.17 Comparison of the results of fuzzy rule-based and multiple regression models with observed monthly precipitation, Grand Canyon, December.
and lowest wetness index (0.63). The map shows that CP 1 is characterized by a pronounced high-pressure anomaly east of the British Isles, which causes a weak air movement and transport of dry air masses from northeastern Europe to central Europe. Figure 6.18 shows CP 3, a typical wet CP with the second highest precipitation probability (66.9%), the highest mean wet-day amount (2.9 mm), and the highest wetness index (1.65). Figure 6.18 shows that CP 3 is characterized by a typical negative air pressure anomaly north of the British Isles and a positive anomaly over the eastern Atlantic. This distribution of air pressure anomalies causes a typical west cyclonic transport of wet, oceanic air masses from the northern Atlantic to central Europe. All the maps of air pressure anomalies produced by the automated classification method show physically realistic results. Wetness indices (defined as the ratio of precipitation contribution to occurrence frequency) for every CP, in every season (and for the annual cycle), were computed for the nine stations in Germany (Figure 6.20). For instance, in the case of station Stuttgart (indicated by ‘∗’ on the maps), the wettest CP is CP 3 and the driest CP is CP 1. Figure 6.20 shows that both the above-described CPs have the same (wet or dry) character at all stations. Also, in the case of other CPs, it holds that wet CPs are wet and dry CPs are dry for almost all stations simultaneously. The spatially homogeneous character of wetness index patterns is reflected in the relatively low values of the coefficient of variation calculated from all stations for a selected CP (ranging from 0.08 for CP 9 to 0.29 for CP 2). The negative correlation (−0.47) between the average wetness index and the coefficient of variation indicates that
Figure 6.18 Mean normalized distributions of 500 hPa geopotential height of a wet CP, averaged over 1970–79. (CPs are precipitation-optimized in 1980–89 with 500 hPa data and nine stations in Germany.) (See also color insert.)
the spatial variability of the wetness index is greater for dry CPs. This is caused by the fact that the wet CPs have a west or north cyclonic character which causes precipitation events covering large areas. However, for dry CPs, local precipitation events (especially convective rainfalls in summer) are typical.
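The two quantities used in this comparison, the wetness index of a CP and its coefficient of variation across stations, can be sketched in a few lines of Python. Everything below is illustrative: the function names are hypothetical, the station values are invented rather than taken from Figure 6.20, and the use of the population standard deviation is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, pstdev


def wetness_index(precip_contribution, occurrence_frequency):
    """Share of total precipitation falling under a CP divided by the share
    of days on which that CP occurs (both given as fractions of 1)."""
    if occurrence_frequency <= 0:
        raise ValueError("occurrence frequency must be positive")
    return precip_contribution / occurrence_frequency


def coefficient_of_variation(values):
    """Standard deviation divided by the mean.

    Assumption: the population standard deviation is used here.
    """
    return pstdev(values) / mean(values)


# A CP occurring on 40% of days with a hypothetical 25.2% share of the
# precipitation has a wetness index of about 0.63 (a dry CP, index below 1).
dry_example = wetness_index(0.252, 0.40)

# Hypothetical wetness indices at nine stations (NOT the values of Figure 6.20).
wet_cp_stations = [1.60, 1.70, 1.65, 1.55, 1.75, 1.62, 1.68, 1.58, 1.72]
dry_cp_stations = [0.45, 0.80, 0.63, 0.52, 0.75, 0.60, 0.70, 0.50, 0.72]

if __name__ == "__main__":
    print(f"wetness index: {dry_example:.2f}")
    print(f"CV, wet CP: {coefficient_of_variation(wet_cp_stations):.2f}")
    print(f"CV, dry CP: {coefficient_of_variation(dry_cp_stations):.2f}")
```

With values of this kind the dry CP shows the larger coefficient of variation, which is the pattern the negative correlation reported above describes.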
6.5 Discussion and Conclusions

The fuzzy rule-based approach was used successfully over four regions (Arizona, Nebraska, Germany, and Hungary) and under three different climates (semiarid, dry, and wet continental) to predict the statistical properties of monthly precipitation and drought index from the joint forcing of CP and ENSO. Because of the weakness and nonlinearity of this relationship over these regions, traditional methods of forecasting have limited possibilities. The FRB technique is able to reproduce the observed time series even for the validation period if the relationship is relatively strong, as in Arizona. The reproduction of the time series is weaker in Nebraska and weaker still in Hungary, which is located very far from the ENSO region. The
Figure 6.19 Mean normalized distributions of 500 hPa geopotential height of a dry CP, averaged over 1970–79. (CPs are precipitation-optimized in 1980–89 with 500 hPa data and nine stations in Germany.) (See also color insert.)
possibility of using this technique for real-time forecasting is thus variable. On the other hand, in every case, the observed frequency distributions of both precipitation and drought index are correctly reproduced.

The fuzzy rule-based technique has the potential to generate time series of regional drought indices and/or precipitation under climate change scenarios. The main idea is to use, instead of the historical CP and ENSO data, results of general circulation models (GCMs) with the established fuzzy rule-based linkage. Several GCMs are able to reproduce features of present atmospheric general circulation patterns quite correctly (e.g., [Mearns et al., 1999]). Also, recently, GCM-produced ENSO indices have become available [Meehl and Washington, 1996; Timmermann et al., 1999].

The following conclusions can be drawn from the experience gained in the four remote regions under different climates:
1. In every case, the fuzzy rule-based approach reproduces the statistical properties of monthly precipitation and drought index.
2. The best results require consideration of the joint forcing of CP and ENSO information. Separate use of either the relative frequencies of CP types as premises or the lagged SOI shows that neither formulation can reproduce the empirical frequency distributions.
3. Statistical measures of dependence between CP, ENSO, and precipitation/drought index are relatively weak, precluding the use of other techniques such as multivariate regression.
4. In every case, the calculated time series reproduces the observed time series for the calibration period.
5. In Arizona the calculated precipitation time series, and in Nebraska the calculated drought index, reproduce fairly well the observed time series, even for the validation periods. In Hungary the observed time series for the validation periods are not reproduced.
6. All the maps of air pressure anomalies produced by the automated classification method show physically realistic results.
7. All wet and dry CPs provide a spatially homogeneous character of wetness index in the German stations.

Figure 6.20 Spatial variability of yearly wetness index for one wet (top) and one dry (bottom) CP for nine stations in Germany, 1970–79. (CPs are precipitation-optimized in 1980–89 with 500 hPa data and nine stations in Germany.)
References Bardossy, A., & Disse, M. [1993], “Fuzzy rule-based models for infiltration.” Water Resources Research, 29(2), 373–382. Bardossy, A., & Duckstein, L. [1995], Fuzzy Rule-Based Modeling with Applications to Geophysical, Biological and Engineering Sciences. CRC Press, Boca Raton, FL. Bardossy, A., Bogardi, I., & Kelly, W. E. [1988], “Imprecise (fuzzy) information in geostatistics.” Mathematical Geology, 1(4), 287–311. Bardossy, A., Bogardi, I., Duckstein, L., & Nachtnebel, P. [1989], “Fuzzy decision-making to resolve regional conflicts between industry and the environment.” In: Evans, C. W., Karwowski, W., & Wilhelm, P. M. (eds.), Fuzzy Methodologies for Industrial and Systems Engineering, Chapter 3. Elsevier, Amsterdam. Bardossy, A., Bogardi, I., & Duckstein, L. [1990], “Fuzzy regression in hydrology.” Water Resources Research, 25(7), 1497–1508. Bardossy, A., Bogardi, I., & Duckstein, L. [1991a], “Fuzzy set and probabilistic techniques for health-risk analysis.” Applied Mathematics and Computation, 45(3), 241–268. Bardossy, A., Hagaman, R., Duckstein, L., & Bogardi, I. [1991b], “Fuzzy least squares regression: theory and application.” In: Fedrizzi, M., & Kacprzyk, J. (eds.), Fuzzy Regression Models, pp. 66–86. Omnitech Press, Warsaw, Poland. Bardossy, A., Duckstein, L., & Bogardi, I. [1992], “Fuzzy composite programming with water resources engineering application.” In: Proceedings of Fourth World Congress of the International Fuzzy System Association, Brussels. Bardossy, A., Duckstein, L., & Bogardi, I. [1995], “Fuzzy rule-based classification of circulation patterns for precipitation events.” International Journal of Climatology, 15(10), 1087–1097. Bardossy, A., Stehlik, J., & Caspary, H.-J. [2002], “Automated objective classification of daily circulation patterns for precipitation and temperature downscaling based on optimized fuzzy rules.” Climate Research, 23, 11–22.
Bardossy, A., Haberlandt, U., & Krysanova, V. [2003], “Automatic fuzzy-rule assessment and its application to the modeling of nitrogen leaching for large regions.” Soft Computing, June. Biswas, A. K. (ed.) [1981], Models for Water Quality Management. McGraw-Hill, New York. Bogardi, I., Duckstein, L., & Szidarovszky, F. [1982], “Bayesian analysis of underground flooding.” Water Resources Research, 18(4), 1110–1116. Bogardi, I., Duckstein, L., & Bardossy, A. [1983], “Regional management of an aquifer under fuzzy environmental objectives.” Water Resources Research, 19(6), 1394–1402. Bogardi, I., Duckstein, L., & Bardossy, A. [1989], “Uncertainties in environmental risk analysis.” In: Haimes, Y. Y., & Stakhiv, E. Z. (eds.), Risk Analysis and Management of Natural and Man-made Hazards, pp. 342–356. ASCE, New York. Bogardi, I., Matyasovszky, I., Bardossy, A., & Duckstein, L. [1994], “A hydroclimatological model of areal drought.” Journal of Hydrology, 153(1–4), 245–264. Bogardi, I., Reitel, R., & Nachtnebel, P. [1995], “Fuzzy rule-based estimation of flood probabilities under climatic fluctuation.” In: Haimes, Y. Y., Moser, D. A., & Stakhiv, E. Z. (eds.), Risk-based Decision-making in Water Resources VII, Proceedings, pp. 61–79. ASCE, Reston, VA. Bogardi, I., Bardossy, A., & Duckstein, L. [1996], “Conflict analysis using multiple criterion decision making under uncertainty.” In: Ganoulis, J. (ed.), Transboundary Water Resources Management: Theory and Practices, pp. 79–98. Springer-Verlag, Heidelberg. Carlson, R. E., Todey, D. P., & Taylor, S. E. [1996], “Midwestern corn yield and weather in relation to extremes of the Southern Oscillation.” Journal of Production Agriculture, 9(3), 347–352. Chib, S., & Greenberg, E. [1995], “Understanding the Metropolis–Hastings algorithm.” American Statistician, 49(4), 327–335. Chow, V. T., Maidment, D. R., & Mays, L. W. [1988], Applied Hydrology. McGraw-Hill, New York. Clarke, A. J., & Li, B. 
[1995], “On the timing of warm and cold El Niño–Southern Oscillation events.” Journal of Climate, 8(10), 2571–2574. DeCampos, L. M., & Moral, S. [1993], “Learning rules for a fuzzy inference model.” Fuzzy Sets and Systems, 59(3), 247–257. Di Natale, M., Duckstein, L., & Pasanisi, A. [2000], “Forecasting pollutants transport in river by a fuzzy rule-based model.” In: Workshop on Fuzzy Logic and Applications, Mons Institute of Technology, Belgium. Dou, C., Woldt, W., Bogardi, I., & Dahab, M. [1995], “Steady state groundwater flow simulation with imprecise parameters.” Water Resources Research, 31(11), 2709– 2719. Dou, C., Woldt, W., Dahab, M., & Bogardi, I. [1997a], “Transient groundwater flow simulation using a fuzzy set approach.” Ground Water, 35(2), 205–215. Dou, C., Woldt, W., Bogardi, I., & Dahab, M. [1997b], “Numerical solute transport simulation using fuzzy sets approach.” Journal of Contaminant Hydrology, 27(1–2), 107–126. Dou, C., Woldt, W., & Bogardi, I. [1999], “Fuzzy rule-based approach to describe solute transport in the unsaturated zone.” Journal of Hydrology, 220(1–2), 74–85. Duckstein, L., & Bogardi, I. [1991], “Reliability with fuzzy elements in water quantity and quality problems.” In: Ganoulis, J. (ed.), Risk and Reliability in Water Resources and Environmental Engineering, pp. 78–99. Springer-Verlag, Berlin.
Duckstein, L., Bogardi, I., & Bardossy, A. [1988a], “A fuzzy reliability model of water supply during droughts.” In: AGU Fall National Meeting, Session H21D: Drought Concept, Drought Management and Water Supply System Reliability, San Francisco. Duckstein, L., Korhonen, P., & Tecle, A. [1988b], “Multiobjective forest management using a visual, interactive and fuzzy approach.” In: Proceedings of 1988 Symposium on Systems Analysis in Forest Resources, pp. 68–74. USDA Forest Service, Fort Collins, Colorado. Fried, J. L. [1975], Groundwater Pollution. Elsevier, Amsterdam. Glantz, M. H., Katz, R. W., & Nicholls, N. [1991], Teleconnections Linking Worldwide Climate Anomalies. Cambridge University Press, New York. Guttman, N. B., & Quayle, R. G. [1996], “A historical perspective of U.S. climate divisions.” Bulletin of the American Meteorological Society, 77(2), 293–303. Haberlandt, U., Krysanova, V., & Bardossy, A. [2002], “Assessment of nitrogen leaching from arable land in large river basins, Part II: regionalization using fuzzy rule based modeling.” Ecological Modelling, 150(3), 277–294. Heddinghause, T. R., & Sabol, P. [1991], “A review of the Palmer Drought Severity Index and where do we go from here.” In: Proceedings of the Seventh Conference on Applied Climatology, pp. 242–246. American Meteorological Society, Boston, MA. Hess, P., & Brezowsky, H. [1952], “Katalog der Grosswetterlagen Europas.” Berichte das Deutschen Wetterdienstes in der US Zone, 33. Bad Kissingen. Hess, P., & Brezowsky, H. [1977], “Katalog der Grosswetterlagen Europas.” Berichte das Deutschen Wetterdienstes, 113. Offenbach. Hundecha, Y., Bardossy, A., & Theisen, H. W. [2001], “Development of a fuzzy logic-based rainfall-runoff model.” Hydrological Sciences Journal, 46(3), 363–376. Kojiri, T. [1988], “Real-time reservoir operation with inflow prediction by using fuzzy inference theory.” In: Seminar on Conflict Analysis in Reservoir Management, Session F. 
Asian Institute of Technology, Bangkok, Thailand, December. Lee, Y. W., Dahab, M. F., & Bogardi, I. [1994], “Fuzzy decision making in ground water nitrate risk management.” Water Resources Bulletin, 30(1), 135–148. Lee, Y. W., Dahab, M. F., & Bogardi, I. [1995], “Nitrate risk assessment using a fuzzy-set approach.” Journal of Environmental Engineering, 121(3), 245–256. McKee, T. B., Doeskin, N. J., & Kleist, J. [1993], “The relationship of drought frequency and duration to time scales.” In: Proceedings of the Eighth Conference on Applied Climatology, AMS, Boston. MacQueen, J. B. [1967], “Some methods for classification and analysis of multivariate observations.” Proceedings of 5th Berkeley Symposium on Mathematical Statistical Probability, 1, 281–297. Matyasovszky, I., Bogardi, I., Bardossy, A., & Duckstein, L. [1993], “Estimation of local precipitation statistics reflecting climate change.” Water Resources Research, 29(12), 3955–3968. Mearns, L. O., Bogardi, I., Giorgi, F., Matyasovszky, I., & Palecki, M. [1999], “Comparison of climate change scenarios generated from regional climate model experiments and statistical downscaling.” Journal of Geophysical Research, 104(D6), 6603–6621. Meehl, G. A., & Washington, W. M. [1996], “El Niño-like climate change in a model with increased atmospheric CO2 concentrations.” Nature, 382(6586), 56–60. Mika, J. [2000], “Spatial and temporal variations of the Palmer Drought Severity Index.” In: Ijjas, I. (ed.), Proceedings of the International Conference on Water Resources Management in the 21st Century with Particular Reference to Europe, pp. 151–160. Budapest.
Mujumdar, P. P., & Sasikumar, K. [2002], “A fuzzy risk approach for seasonal water quality management of a river system.” Water Resources Research, 38(1), 55–63. Nachtnebel, H. P., Hanish, P., & Duckstein, L. [1986], “Multicriterion analysis of small hydropower plants under fuzzy objectives.” The Annals of Regional Science, XX, 86–100. NCAR Data Support Section and University of Washington Department of Atmospheric Sciences [1996], NCEP Grid Point Data Set—version III. NOAA, Climate Prediction Center [2001a], “SOI time series.” http://www.cpc.ncep.noaa.gov/ data/indices/index.htm NOAA, National Climatic Data Center [2001b], “Modified Palmer Drought Severity Index.” http://lwf.ncdc.noaa.gov/oa/climate/onlineprod/drought/ftppage.html Ozelkan, E. C., & Duckstein, L. [2000], “Multi-objective fuzzy regression: a general framework.” Computers and Operations Research, 27(7–8), 635–652. Palmer, W. C. [1965], Meteorological Drought. Research Paper 45. US Weather Bureau, Washington, DC. Pesti, G., Shrestha, B., Duckstein, L., & Bogardi, I. [1996], “A fuzzy rule-based approach to drought assessment.” Water Resources Research, 32(6), 1741–1747. Piechota, T. C., & Dracup, J. A. [1996], “Drought and regional hydrologic variation in the United States: Associations with the El Niño–Southern Oscillation.” Water Resources Research, 32(5), 1359–1373. Pongracz, R., Bartholy, J., & Bogardi, I. [2001], “Fuzzy rule-based prediction of monthly precipitation.” Physics and Chemistry of the Earth, Part B., 26(9), 663–667. Shrestha, B. P., Duckstein, L., & Stakhiv, E. Z. [1996], “Fuzzy rule-based modeling of reservoir operation.” Journal of Water Resources Planning and Management, 122(4), 262–269. Simonovic, S. P. [1992], “Reservoir systems-analysis—Closing gap between theory and practice.” Journal of Water Resources Planning and Management, 118(3), 262–280. Stansbury, J., Bogardi, I., & Stakhiv, E. Z. 
[1999], “Risk-cost optimization under uncertainty for dredged material disposal.” Journal of Water Resources Planning and Management, 125(6), 342–351. Sugeno, M., & Yasukawa, T. [1993], “A fuzzy logic based approach to qualitative modeling.” IEEE Transactions on Fuzzy Systems, 1(1), 7–31. Teegavarapu, R. S. V., & Simonovic, S. P. [1999], “Modeling uncertainty in reservoir loss functions using fuzzy sets.” Water Resources Research, 35(9), 2815–2823. Timmermann, A., Oberhuber, J., Bacher, A., Esch, M., Latif, M., & Roeckner, E. [1999], “Increased El Niño frequency in a climate model forced by future greenhouse warming.” Nature, 398(6729), 694–697. Wang, L. X., & Mendel, J. M. [1992], “Generating fuzzy rules by learning from examples.” IEEE Transactions on Systems, Man and Cybernetics, 22(6), 1414–1427. Wetzel, E. G. [1975], Limnology. W. G. Saunders, London. Wilks, D. S. [1995], Statistical Methods in the Atmospheric Sciences. Academic Press, San Diego, CA. Woldt, W. E., Dou, C., & Bogardi, I. [1997], “Innovations in modeling solute transport in the vadose zone using fuzzy rule-based methods.” In: Proceedings of ASAE Conference Emerging Technologies in Hydrology, pp. 2097–2098. ASAE. Wright, P. B. [1985], “The Southern Oscillation: an ocean–atmosphere feedback system?” Bulletin of the American Meteorological Society, 66(4), 398–412. Zimmermann, H. J. [1985], Fuzzy Set Theory—and its Applications. Kluwer-Nijhoff, Boston.
Chapter 7
Formal Concept Analysis in Geology
Radim Bělohlávek
7.1 Introduction
  7.1.1 Directly observable data: objects and their attributes
  7.1.2 Analysis: discovery of hidden attribute dependencies and natural concepts
7.2 Formal Concept Analysis: What and Why
  7.2.1 Origins
  7.2.2 Informal outline
  7.2.3 Discovering natural concepts hidden in the data
  7.2.4 Hierarchy of discovered concepts
  7.2.5 Attribute dependencies
  7.2.6 Fuzziness and similarity issues
7.3 Formal Concept Analysis of Fuzzy Data: a Guided Tour
  7.3.1 Fuzzy context and fuzzy concepts: input data and hidden concepts
  7.3.2 Fuzzy concept lattices: hierarchy of hidden concepts
  7.3.3 Attribute implications
7.4 Similarity and Logical Precision
  7.4.1 Similarity relations
  7.4.2 Similarity of objects and similarity of attributes
  7.4.3 Similarity of concepts
  7.4.4 Compatible similarities and factorization
  7.4.5 Similarity of concept lattices
  7.4.6 Logical precision
7.5 Formal Concept Analysis Demonstrated: Examples
Acknowledgments
References
7.1 Introduction

7.1.1 Directly observable data: objects and their attributes

When humans formulate their knowledge about some domain of interest, they usually recognize objects and their properties. Objects and attributes (properties) are, indeed, primary phenomena when one observes the physical world. When experts begin to explore an unknown area of interest, their first step is to identify relevant objects and their attributes. Then, experts identify which objects have which attributes. With this object–attribute knowledge at hand, experts can start further investigations, for example into the various kinds of relationships between attributes or into a natural classification scheme. Attributes can be useful in devising suitable criteria according to which the objects relevant to the domain may be naturally classified. Furthermore, it is often found that, in order to gain insight into the domain of interest, one needs to establish a reasonable conceptual system, i.e., a collection of concepts (specific to the domain) with basic relationships between the concepts.
7.1.2 Analysis: discovery of hidden attribute dependencies and natural concepts

What was outlined above is all the more true for the biological sciences, the geological sciences, etc. Let us illustrate the above general description by an example. Suppose we arrive at a new territory with completely unknown living organisms (or suppose we are in our world but do not know anything about the organisms living here) and suppose we want to know more about the organisms. In other words, we want to be able to do more than just recognize and distinguish the organisms we encounter. The objects of our domain are thus the living organisms (or a suitable collection of them). We may now select several attributes of these organisms that seem to some extent relevant and elementary (directly observable). So far, our knowledge is limited to the knowledge of which objects have which attributes.

Note that this knowledge is naturally depicted in the form of a (two-dimensional) table with rows corresponding to objects and columns corresponding to attributes. A data entry corresponding to a table cell, which is the intersection of the xth row (the row corresponding to object x) and the yth column (the column corresponding to attribute y), contains the value of attribute y on object x. This value may be a numerical value (if the attribute is quantifiable, e.g., the weight in kilograms), a logical value (in the case of qualitative attributes like “hard”), or some other value (e.g., where the object was found). Typical of the geological and biological sciences is the fact that some attributes are commonly fuzzy (e.g., “hard”) in the sense that an attribute applies to an object only to a certain degree.

To get a deeper insight into the object–attribute data, one naturally asks what dependencies and relationships among the attributes can be read from the table.
For instance, there might be dependencies, not visible on the surface, that tell us that a certain combination of certain attributes determines to some extent another attribute or attributes. As an example, consider a dependency “if x lives in the water then x has . . .” One wishes to have an automatic procedure for obtaining the dependencies from the data. Another natural question relates to the fact that it is almost impossible to communicate knowledge without a conceptual system that is appropriate for a given domain. That is, one looks for what natural concepts are hidden in the data and what is the hierarchical structure of these concepts. For instance, one expects that some natural concepts like “a flying predator” are in some way hidden in the present object–attribute data, and that some natural hierarchy of those concepts is in the data as well.
7.2 Formal Concept Analysis: What and Why

7.2.1 Origins

The kind of data analysis outlined in the previous section is the main objective of formal concept analysis. The roots of formal concept analysis go back to the paper by Wille [1982], in which he outlined his program of “restructuring lattice theory.” The main aim was to develop a lattice theory close to the original motivations of the theory of ordering. This is best illustrated by the following quotation from the paper:

“The approach to lattice theory outlined in this paper is based on an attempt to reinvigorate the general view of order. For this purpose we go back to the origin of lattice concept in nineteenth-century attempts to formalize logic, where the study of hierarchies of concepts played a central role [cf. Schröder, 1890–95] . . .. In set-theoretical language, this gives rise to lattices whose elements correspond to the concepts . . . and whose order comes from the hierarchy of concepts.”
The theory that resulted from this endeavor is called the theory of concept lattices. The part dealing with applications to the analysis of object–attribute data is known as formal concept analysis. The basic reference is Ganter & Wille [1999], where one can also find an extensive list of publications related to both theory and applications; see also Ganter [1994], Ganter et al. [1987], and Wille [1992]. Extension of concept lattices and formal concept analysis to the case of fuzzy data (which is the tool we are interested in) can be found in Bělohlávek [2002] and Pollandt [1997]; the first paper on this topic is Burusco & Fuentes-Gonzáles [1994].
7.2.2 Informal outline

The rest of this section is devoted to an informal outline of the conceptual framework of formal concept analysis. The basic notion which serves to represent the input object–attribute knowledge is that of a formal context. A formal context consists of a set X of objects, a set Y of attributes, and a relation I between objects and attributes. The set X represents objects that are relevant to our domain of interest, i.e., objects to which we restrict our attention. Likewise, Y contains relevant attributes. We restrict ourselves to the case where attributes are qualitative, i.e., they either apply, or do not apply, or apply only to a certain truth degree. An example of such an attribute is “to be found in North America.” Provided North America is sharply delineated, this is an example of a so-called crisp attribute; each object x either was found in North America (the attribute takes logical value 1 on this object) or was not (the attribute takes logical value 0 on this object). The attribute “hard” is an example of a typical fuzzy attribute; the fact that an object is hard may be assigned a truth degree, say, 0.7 if the object is more or less hard.

Therefore, the input data specify for each object x from X and each attribute y from Y to which extent the attribute y applies to the object x. This is naturally done by specifying the truth degree I(x, y) of the fact “y applies to x.” The degree I(x, y) is supplied by an expert’s observation of the domain. In practice, the sets X and Y are of course finite. The input data can thus be put into a table specifying the values I(x, y) for each x from X and y from Y. Let the elements of X be x1, . . . , xk, and the elements of Y be y1, . . . , yl. Then the input data, i.e., the triple ⟨X, Y, I⟩, can be represented by a table (Table 7.1).

Table 7.1 Input data in the tabular form. I(xi, yj) is the degree to which attribute yj applies to object xi.

        y1    …    yj           …    yl
x1
…
xi                 I(xi, yj)
…
xk

Having specified the input data ⟨X, Y, I⟩, one is interested in the analysis of these data. In this preliminary outline, we focus on the discovery of natural concepts hidden in the data, the discovery of attribute dependencies in the form of attribute implications, and the measurement of similarity of attributes and similarity of the discovered concepts.
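One possible in-memory representation of such input data is a mapping from (object, attribute) pairs to truth degrees. The Python sketch below is only an illustration under stated assumptions: the helper names make_context and is_crisp are hypothetical, the objects and degrees are invented, and treating unspecified pairs as degree 0 is a convenience chosen here, not something the text prescribes.

```python
def make_context(objects, attributes, degrees):
    """Build the relation I as a dict keyed by (x, y).

    Pairs missing from `degrees` get degree 0.0 (an assumption of this sketch).
    Every degree must lie in the unit interval [0, 1].
    """
    relation = {}
    for x in objects:
        for y in attributes:
            d = float(degrees.get((x, y), 0.0))
            if not 0.0 <= d <= 1.0:
                raise ValueError(f"degree for {(x, y)} is outside [0, 1]: {d}")
            relation[(x, y)] = d
    return relation


def is_crisp(relation, objects, attribute):
    """An attribute is crisp if it takes only the degrees 0 and 1."""
    return all(relation[(x, attribute)] in (0.0, 1.0) for x in objects)


# Hypothetical data: two objects, one crisp and one fuzzy attribute.
X = ["x1", "x2"]
Y = ["found in North America", "hard"]
I = make_context(
    X,
    Y,
    {
        ("x1", "found in North America"): 1.0,
        ("x1", "hard"): 0.7,  # "more or less hard"
        ("x2", "hard"): 0.2,
    },
)

if __name__ == "__main__":
    for x in X:
        print(x, [I[(x, y)] for y in Y])
```

Printing the rows of I object by object gives exactly the layout of Table 7.1: one row per object and one column per attribute.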
7.2.3 Discovering natural concepts hidden in the data

First of all, one needs to say what is to be understood by a (formal) concept. In formal concept analysis, a concept is understood according to a longstanding tradition of Port-Royal logic [Arnauld & Nicole, 1662]; see also Höfler [1906]. A concept is determined by its extent and its intent. The extent of a concept is the collection of all objects that are covered by the concept. The intent of a concept is the collection of all attributes covered by the concept. For instance, consider the concept DOG. The extent of DOG consists of all dogs while the intent of DOG consists of all attributes that apply to dogs (like “to be a mammal,” “to bark,” etc.). Therefore, a concept in formal concept analysis is understood to be a pair ⟨A, B⟩ consisting of a collection A of objects (extent) and a collection B of attributes (intent).

In order to qualify as a concept, the pair ⟨A, B⟩ has to satisfy some constraints. This is extremely important and makes formal concept analysis what it is. If there were no constraint, each pair ⟨A, B⟩ would be a concept and so “being a concept” would carry no information. The constraint used in formal concept analysis is very simple and can be described verbally as follows:

A pair ⟨A, B⟩ is called a (formal) concept if B is the collection of all attributes shared by all objects from A and A is the collection of all objects sharing all the attributes from B.

Therefore, given the input data ⟨X, Y, I⟩, there might be several pairs ⟨A, B⟩, where A is a subcollection of X and B is a subcollection of Y, that satisfy the definition of a formal concept. These formal concepts are hidden in the input data in that their presence is not obvious from just looking at the table. A procedure that takes ⟨X, Y, I⟩ as its input and generates the list of formal concepts that are hidden in ⟨X, Y, I⟩ may be considered as performing a discovery of natural concepts that “exist in the data.” Formal concepts that arise from such a procedure are, in a sense, meaningful clusters of objects and attributes (where “meaningful” is to be considered with respect to (w.r.t.) the conceptual interpretation).
We denote the collection of all formal concepts hidden in ⟨X, Y, I⟩ by ℬ(X, Y, I). Let us illustrate the notion of a concept by a simple example.

Example 7.1 Let X = {x1, x2, x3}, Y = {y1, y2, y3, y4} and consider a binary relation I given by Table 7.2. That is, x1 has attributes y1 and y2 but does not have attributes y3 and y4. Although this is a very simple example, we will interpret xi and yj as follows. Let x1, x2, x3 be some geological objects, say minerals; let y1 mean “was found in North America,” y2 mean “was found in South America,” y3 mean “was found in Asia,” y4 mean “was found in Europe.” However, note that “object” is just a technical term. In our example, xi refers to a whole group of minerals of the same sort (one particular mineral cannot be found in both North America and South America).

Table 7.2 Input data to Example 7.1.

      y1  y2  y3  y4
x1     1   1   0   0
x2     0   1   1   0
x3     0   0   1   1

Table 7.3 Concepts from Example 7.1.

      x1  x2  x3  y1  y2  y3  y4
c1     1   1   1   0   0   0   0
c2     0   1   1   0   0   1   0
c3     0   0   1   0   0   1   1
c4     1   1   0   0   1   0   0
c5     0   1   0   0   1   1   0
c6     1   0   0   1   1   0   0
c7     0   0   0   1   1   1   1

There are seven concepts hidden in these data; they are listed in Table 7.3. For instance, concept c4 is a pair ⟨A4, B4⟩ with the extent A4 = {x1, x2} and the intent B4 = {y2}. That is, c4 covers mineral x1 and mineral x2, and covers the attribute “found in South America.” We comment further on the concepts below.
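The discovery procedure described above can be sketched in a few lines of Python. This is only a brute-force illustration on the data of Table 7.2, not an algorithm taken from the chapter: the names `CONTEXT`, `common_attributes`, `common_objects`, and `formal_concepts` are hypothetical, and closing every subset of objects is practical only for very small tables.

```python
from itertools import combinations

# Table 7.2: each object mapped to the set of attributes it has.
CONTEXT = {
    "x1": {"y1", "y2"},
    "x2": {"y2", "y3"},
    "x3": {"y3", "y4"},
}
ATTRIBUTES = {"y1", "y2", "y3", "y4"}


def common_attributes(objects):
    """Attributes shared by every object in `objects` (all attributes if empty)."""
    shared = set(ATTRIBUTES)
    for obj in objects:
        shared &= CONTEXT[obj]
    return shared


def common_objects(attributes):
    """Objects that have every attribute in `attributes`."""
    return {obj for obj, attrs in CONTEXT.items() if attributes <= attrs}


def formal_concepts():
    """Collect all (extent, intent) pairs by closing every subset of objects."""
    found = set()
    objects = sorted(CONTEXT)
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            intent = common_attributes(subset)
            extent = common_objects(intent)
            found.add((frozenset(extent), frozenset(intent)))
    return found


concepts = formal_concepts()
# Print the concepts from the largest extent to the smallest.
for extent, intent in sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[0]))):
    print(sorted(extent), sorted(intent))
```

Each printed pair satisfies the verbal constraint: the intent is exactly the set of attributes shared by the extent, and the extent is exactly the set of objects sharing the intent.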
7.2.4 Hierarchy of discovered concepts

The next issue is the hierarchy of discovered concepts. Hierarchy of concepts w.r.t. their generality is a basic relation that accompanies concepts. For instance, the concept DOG is a subconcept of the concept MAMMAL, and MAMMAL is a superconcept of DOG. We denote the conceptual hierarchy by ≤ and write ⟨A1, B1⟩ ≤ ⟨A2, B2⟩ to denote that the concept ⟨A1, B1⟩ is a subconcept of the concept ⟨A2, B2⟩. Being more general means covering more objects (or, which is equivalent, fewer attributes). Therefore, it is only natural to define ≤ by ⟨A1, B1⟩ ≤ ⟨A2, B2⟩ if and only if A1 is a subcollection of A2 or, equivalently, if and only if B2 is a subcollection of B1. It is easily seen that the hierarchy defined in this way is a partial order, i.e., it is reflexive (c ≤ c), antisymmetric (c1 ≤ c2 and c2 ≤ c1 imply c1 = c2), and transitive (c1 ≤ c2 and c2 ≤ c3 imply c1 ≤ c3). Moreover, for each collection of formal concepts
Figure 7.1 Hierarchy of hidden formal concepts from Example 7.1.
from ℬ(X, Y, I), there exist both their direct superconcept (generalization) and their direct subconcept (specialization) in ℬ(X, Y, I) (see the next section). Therefore, ≤ obeys the laws naturally required for a complete conceptual system. The hierarchical structure of the collection ℬ(X, Y, I) w.r.t. the hierarchy order ≤ is easily depicted by a so-called Hasse diagram. The next example serves to illustrate this.

Example 7.2 Consider the input data from Example 7.1. The hierarchical structure of ℬ(X, Y, I) is depicted in Figure 7.1. Concept c1 is the most general concept; its extent contains all objects (x1, x2, x3). On the other hand, c7 is the empty concept; its extent does not contain any object. Between c1 and c7 there are five concepts. For instance, concepts c3 and c5 have the verbal descriptions “to be found in Asia and in Europe” and “to be found in South America and in Asia,” respectively. Concept c2 is the join of c3 and c5 and, therefore, c2 is the direct generalization of c3 and c5. Indeed, the intent of c2 is {y3}, which means that the verbal description of c2 is “to be found in Asia.”

7.2.5 Attribute dependencies

Attribute dependencies, as approached in formal concept analysis, are expressed by implications of the form attributes y1, . . . , z1 imply attributes y2, . . . , z2, written formally {y1, . . . , z1} ⇒ {y2, . . . , z2}. This implication means that each object that has all of the attributes y1, . . . , z1 has also all of the attributes y2, . . . , z2;
it is in this sense that the implication is valid in the given input data ⟨X, Y, I⟩. Attribute implications reveal other kinds of knowledge hidden in the input data. The analysis of attributes may reveal that some attributes are only simple combinations of others. We are surely interested in all implications that are valid in the input data. However, listing all implications can yield a huge number of them. Moreover, some of them are trivial (such as {y1} ⇒ {y1}), and some of them follow (in a natural but precisely defined sense) from other implications. Formal concept analysis has means to generate only basic implications (in the sense that all other implications follow from the basic ones). The next example illustrates the basics of attribute implications.

Example 7.3 Consider again the input data of Example 7.1. We can see that there are, for instance, the attribute implications {y4} ⇒ {y3} and {y1} ⇒ {y2} which are true in the input data. Implication {y4} ⇒ {y3} says that each mineral found in Europe was also found in Asia, and {y1} ⇒ {y2} says that each mineral found in North America was also found in South America. On the contrary, implication {y3} ⇒ {y4} is not true in the data since, according to Table 7.2, x2 has the attribute y3 but does not have the attribute y4.
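Checking a crisp attribute implication against the input table is a direct translation of its verbal meaning. The following is a minimal sketch on the data of Table 7.2; the function names `counterexamples` and `is_valid` are illustrative assumptions, not terminology from the chapter.

```python
# Table 7.2: each object mapped to the set of attributes it has.
CONTEXT = {
    "x1": {"y1", "y2"},
    "x2": {"y2", "y3"},
    "x3": {"y3", "y4"},
}


def counterexamples(premise, conclusion):
    """Objects that have all attributes of `premise` but lack part of `conclusion`."""
    return sorted(
        obj
        for obj, attrs in CONTEXT.items()
        if premise <= attrs and not conclusion <= attrs
    )


def is_valid(premise, conclusion):
    """An implication is valid in the data when no object is a counterexample."""
    return not counterexamples(premise, conclusion)


print(is_valid({"y4"}, {"y3"}))         # Europe implies Asia
print(is_valid({"y1"}, {"y2"}))         # North America implies South America
print(counterexamples({"y3"}, {"y4"}))  # objects violating Asia implies Europe
```

Returning the violating objects, rather than only a yes/no answer, makes it easy to see why an implication fails in a given table.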
7.2.6 Fuzziness and similarity issues

The example we used to demonstrate the basic notions of formal concept analysis was specific in that the attributes were crisp. However, most empirical attributes are fuzzy. What was described above also applies if attributes are fuzzy; we only need to interpret the verbal description correctly (see next section). In the case that the input data contains fuzzy attributes, an important phenomenon is that of similarity. Similarity can be basically considered on three levels: similarity of attributes (and similarity of objects); similarity of concepts; and similarity of the conceptual structures. Similarity is a graded notion. Objects x and y may be more similar than objects x and z are. Simplifying reality using similarity is the very nature of how humans cope with the complexity of the outer world. Basically, the simplification is done by identifying objects that are “very similar.” The process of identification of elements is known as factorization. It is the usual formal model of what people call abstraction. What “very similar” means depends on how coarse the factorization is and how much abstraction is required. Fuzzy logic has natural means to model factorization w.r.t. graded similarity. Intuitively, we consider two attributes similar if they apply to each object of the domain of discourse approximately to the same extent. This makes it possible to reduce the input data, i.e., to identify attributes that are “very much” similar. Distinguishing very similar attributes would lead to overly detailed and extensive analysis. From the input (fuzzy) data one can generate the list of all formal concepts hidden in the data and the hierarchical structure of the concepts. A natural question is that of
how the similarity of the formal concepts can be measured. Intuitively, we consider two concepts similar if they apply to all objects to approximately the same extent. If one does not need the level of discernibility represented by the generated structure of concepts, one may wish to simplify the conceptual system by identifying concepts that are sufficiently similar. Doing so, one obtains a simpler conceptual system that is (for the given purposes) sufficient. Finally, a natural problem is the similarity of two conceptual systems. A conceptual system is a system of basic abstract units (concepts) which allows efficient communication. Given two systems, an immediate question is to what extent the two are equivalent in that each concept of one of them may be described by concepts in the other one. Formal concept analysis of fuzzy data has means for naturally modeling all of the three levels of similarity described above. Basically, it answers the questions of (a) how to measure similarity, and (b) how to simplify (factorize) by similarity.
7.3 Formal Concept Analysis of Fuzzy Data: a Guided Tour

This section presents basic notions and results of formal concept analysis of fuzzy data. Theorems are presented without proofs, which can be found in Bělohlávek [2002b].
7.3.1 Fuzzy context and fuzzy concepts: input data and hidden concepts

Let X be a set of objects and Y be a set of attributes to which we restrict our attention. Let L be a (suitable) set of truth degrees. Furthermore, let I be a binary fuzzy relation with truth degrees in L; that is, I assigns to each x ∈ X and each y ∈ Y a truth degree I(x, y) ∈ L. The degree I(x, y) is interpreted as the truth degree to which object x has attribute y.

Definition 7.1 The above triple ⟨X, Y, I⟩ is called a fuzzy context.

Mostly, L is taken to be some subset of [0, 1]. As we will see, we need operations on L that correspond to logical connectives. That is, L should be equipped with a couple of operations corresponding to conjunction, implication, etc. We will provide a general structure for this purpose (L will be equipped with the structure of a so-called complete residuated lattice) and then show particular examples of this structure that are most commonly used in applications. Complete residuated lattices are the basic structures of truth values used in fuzzy logic in the narrow sense [Goguen, 1967, 1968–69; Höhle, 1995, 1996; Hájek, 1998].
The reader can find basic information about lattices and partially ordered sets in Davey & Priestley [1990]. A complete residuated lattice [Ward & Dilworth, 1939] is an algebra L = ⟨L, ∧, ∨, ⊗, →, 0, 1⟩ such that: (1) ⟨L, ∧, ∨, 0, 1⟩ is a complete lattice with the least element 0 and the greatest element 1; (2) ⟨L, ⊗, 1⟩ is a commutative monoid, i.e., ⊗ is commutative, associative, and x ⊗ 1 = x holds for each x ∈ L; and (3) ⊗, → are binary operations which form an adjoint pair, i.e., x ⊗ y ≤ z if and only if x ≤ y → z holds for all x, y, z ∈ L. The operation ⊗ corresponds to conjunction; → corresponds to implication. The most studied and applied set of truth values is the real interval [0, 1]. Each left-continuous t-norm ⊗ induces a complete residuated lattice ⟨[0, 1], min, max, ⊗, →, 0, 1⟩ where → is given by a → b = max{c | a ⊗ c ≤ b} (and conversely, each residuated lattice on [0, 1] is induced in this way by some left-continuous t-norm); for details and more information see Bělohlávek [2002b], Hájek [1998]. The most popular t-norms are: (1) the Łukasiewicz t-norm (a ⊗ b = max(a + b − 1, 0), a → b = min(1 − a + b, 1)); (2) the Gödel t-norm (a ⊗ b = min(a, b), a → b = 1 if a ≤ b and a → b = b otherwise); and (3) the product t-norm (a ⊗ b = a · b, a → b = 1 if a ≤ b and a → b = b/a otherwise). They are all continuous. On the other hand, any continuous t-norm can be composed in a simple way out of these three; see Bělohlávek [2002b] or Hájek [1998]. Another important set of truth values is the set {a0 = 0, a1, . . . , an = 1}, a0 < · · · < an, where the ordering determines the complete lattice structure. Two t-norms are often considered: (1) a_k ⊗ a_l = a_{max(k+l−n, 0)} and the corresponding → given by a_k → a_l = a_{min(n−k+l, n)} (Łukasiewicz); and (2) a_k ⊗ a_l = a_{min(k, l)} and the corresponding → given by a_k → a_l = 1 if k ≤ l and a_l otherwise (Gödel). A special case of the latter algebras is the Boolean algebra 2 of classical logic with the support set {0, 1}.
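The three t-norms on [0, 1] and their residua, as given above, translate directly into code. The sketch below is illustrative only: the function names and the `STRUCTURES` table are assumptions, and the adjointness property x ⊗ y ≤ z iff x ≤ y → z is merely spot-checked on a small grid of exactly representable values, which is not a proof.

```python
from itertools import product


def lukasiewicz_tnorm(a, b):
    return max(a + b - 1.0, 0.0)


def lukasiewicz_residuum(a, b):
    return min(1.0 - a + b, 1.0)


def goedel_tnorm(a, b):
    return min(a, b)


def goedel_residuum(a, b):
    return 1.0 if a <= b else b


def product_tnorm(a, b):
    return a * b


def product_residuum(a, b):
    return 1.0 if a <= b else b / a


STRUCTURES = {
    "Lukasiewicz": (lukasiewicz_tnorm, lukasiewicz_residuum),
    "Goedel": (goedel_tnorm, goedel_residuum),
    "product": (product_tnorm, product_residuum),
}

# Multiples of 1/4 are exactly representable as binary floats, so the
# comparisons below are not disturbed by rounding on this particular grid.
GRID = [0.0, 0.25, 0.5, 0.75, 1.0]


def adjointness_holds(tnorm, residuum, grid=GRID):
    """Check x (*) y <= z  iff  x <= (y -> z) for all grid triples."""
    return all(
        (tnorm(x, y) <= z) == (x <= residuum(y, z))
        for x, y, z in product(grid, repeat=3)
    )


for name, (tnorm, residuum) in STRUCTURES.items():
    print(name, adjointness_holds(tnorm, residuum))
```

Keeping each t-norm paired with its residuum in one table makes it easy to rerun any later computation under a different structure of truth values.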
It may be easily verified that the only t-norm on {0, 1} is the classical conjunction operation ∧, i.e., a ∧ b = 1 if and only if a = 1 and b = 1, which implies that the only residuum operation is the classical implication operation →, i.e., a → b = 0 if and only if a = 1 and b = 0. Note that each of the preceding residuated lattices is complete. In the following, L always denotes some complete residuated lattice. However, there will be no substantial loss if one assumes that L is [0, 1] or some finite subchain of [0, 1]. A fuzzy set (or L-set) A in a universe set X is a mapping A : X → L. The value A(x) ∈ L is interpreted as the truth value of the statement “the element x belongs to A.” The set of all fuzzy sets in X is denoted by L^X. For A1, A2 ∈ L^X we write A1 ⊆ A2 if and only if A1(x) ≤ A2(x) for all x ∈ X. Similarly, a binary fuzzy relation R between X and Y is a mapping R : X × Y → L. Particularly, {a1/x1, . . . , an/xn} denotes a fuzzy set A with A(x1) = a1, . . . , A(xn) = an, and A(x) = 0 for x ≠ xi (i = 1, . . . , n). Coming back to the notion of a fuzzy context, we want to formalize the notion of a concept. According to Port-Royal, a concept consists of a collection of objects (its
extent) and a collection of attributes (its intent). If the attributes in the context are fuzzy, both extent and intent are assumed to be fuzzy sets. Consider, for example, the concept hard mineral. Clearly, its extent is a fuzzy set rather than a crisp set. Following the verbal definition, we need to define two operators, ↑ and ↓. The intended meaning of ↑ and ↓ is the following. For a fuzzy set A of objects (i.e., A ∈ L^X), A↑ is the fuzzy set of all attributes (i.e., A↑ ∈ L^Y) shared by all objects from A; for a fuzzy set B of attributes (i.e., B ∈ L^Y), B↓ is the fuzzy set of all objects (i.e., B↓ ∈ L^X) sharing all attributes from B. The basic semantic rules of fuzzy logic give the following. For a fuzzy context ⟨X, Y, I⟩, A ∈ L^X and B ∈ L^Y, A↑ and B↓ are a fuzzy set in Y and a fuzzy set in X, respectively, defined by

\[ A^{\uparrow_I}(y) = \bigwedge_{x \in X} \bigl( A(x) \rightarrow I(x, y) \bigr) \tag{7.1} \]

\[ B^{\downarrow_I}(x) = \bigwedge_{y \in Y} \bigl( B(y) \rightarrow I(x, y) \bigr) \tag{7.2} \]

for each y ∈ Y and x ∈ X. Therefore, (7.1) and (7.2) define mappings ↑I : L^X → L^Y and ↓I : L^Y → L^X. If I is obvious, we write only ↑ and ↓ instead of ↑I and ↓I.
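For a finite context, the infima in (7.1) and (7.2) are plain minima, so both operators can be sketched in a few lines. The code below is a minimal illustration that uses the fuzzy context and the fuzzy sets A and B of Example 7.4; the names `up`, `down`, `OBJECTS`, `ATTRIBUTES`, and the dictionary encoding of I are assumptions made for this sketch.

```python
def lukasiewicz_residuum(a, b):
    return min(1.0 - a + b, 1.0)


def goedel_residuum(a, b):
    return 1.0 if a <= b else b


OBJECTS = ["x1", "x2", "x3"]
ATTRIBUTES = ["y1", "y2"]

# Fuzzy context of Example 7.4: I[(x, y)] is the degree to which x has y.
I = {
    ("x1", "y1"): 1.0, ("x1", "y2"): 0.3,
    ("x2", "y1"): 0.8, ("x2", "y2"): 0.9,
    ("x3", "y1"): 0.0, ("x3", "y2"): 0.1,
}


def up(A, residuum):
    """Equation (7.1): degree to which each attribute is shared by all objects of A."""
    return {
        y: min(residuum(A[x], I[x, y]) for x in OBJECTS)
        for y in ATTRIBUTES
    }


def down(B, residuum):
    """Equation (7.2): degree to which each object has all attributes of B."""
    return {
        x: min(residuum(B[y], I[x, y]) for y in ATTRIBUTES)
        for x in OBJECTS
    }


A = {"x1": 0.5, "x2": 1.0, "x3": 0.0}
B = {"y1": 0.7, "y2": 0.3}

print(up(A, lukasiewicz_residuum), down(B, lukasiewicz_residuum))
print(up(A, goedel_residuum), down(B, goedel_residuum))
```

Passing the residuum as a parameter reflects the fact that the operators depend on the chosen structure of truth values; printed values may differ from hand calculations by floating-point rounding in the last digits.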
Example 7.4 Let X = {x1, x2, x3}, Y = {y1, y2}, L = [0, 1], I(x1, y1) = 1, I(x1, y2) = 0.3, I(x2, y1) = 0.8, I(x2, y2) = 0.9, I(x3, y1) = 0, I(x3, y2) = 0.1. Consider A = {0.5/x1, 1/x2, 0/x3}, B = {0.7/y1, 0.3/y2}. We want to determine A↑I and B↓I. For instance, with the Łukasiewicz structure on [0, 1], we have

\[
\begin{aligned}
A^{\uparrow_I}(y_1) &= \bigwedge_{x \in X} \bigl( A(x) \rightarrow I(x, y_1) \bigr) \\
&= [A(x_1) \rightarrow I(x_1, y_1)] \wedge [A(x_2) \rightarrow I(x_2, y_1)] \wedge [A(x_3) \rightarrow I(x_3, y_1)] \\
&= [0.5 \rightarrow 1] \wedge [1 \rightarrow 0.8] \wedge [0 \rightarrow 0] = 1 \wedge 0.8 \wedge 1 = 0.8
\end{aligned}
\]

\[
\begin{aligned}
A^{\uparrow_I}(y_2) &= \bigwedge_{x \in X} \bigl( A(x) \rightarrow I(x, y_2) \bigr) \\
&= [A(x_1) \rightarrow I(x_1, y_2)] \wedge [A(x_2) \rightarrow I(x_2, y_2)] \wedge [A(x_3) \rightarrow I(x_3, y_2)] \\
&= [0.5 \rightarrow 0.3] \wedge [1 \rightarrow 0.9] \wedge [0 \rightarrow 0.1] = 0.8 \wedge 0.9 \wedge 1 = 0.8
\end{aligned}
\]

\[
\begin{aligned}
B^{\downarrow_I}(x_1) &= \bigwedge_{y \in Y} \bigl( B(y) \rightarrow I(x_1, y) \bigr) \\
&= [B(y_1) \rightarrow I(x_1, y_1)] \wedge [B(y_2) \rightarrow I(x_1, y_2)] \\
&= [0.7 \rightarrow 1] \wedge [0.3 \rightarrow 0.3] = 1 \wedge 1 = 1
\end{aligned}
\]

\[
\begin{aligned}
B^{\downarrow_I}(x_2) &= \bigwedge_{y \in Y} \bigl( B(y) \rightarrow I(x_2, y) \bigr) \\
&= [B(y_1) \rightarrow I(x_2, y_1)] \wedge [B(y_2) \rightarrow I(x_2, y_2)] \\
&= [0.7 \rightarrow 0.8] \wedge [0.3 \rightarrow 0.9] = 1 \wedge 1 = 1
\end{aligned}
\]

\[
\begin{aligned}
B^{\downarrow_I}(x_3) &= \bigwedge_{y \in Y} \bigl( B(y) \rightarrow I(x_3, y) \bigr) \\
&= [B(y_1) \rightarrow I(x_3, y_1)] \wedge [B(y_2) \rightarrow I(x_3, y_2)] \\
&= [0.7 \rightarrow 0] \wedge [0.3 \rightarrow 0.1] = 0.3 \wedge 0.8 = 0.3.
\end{aligned}
\]

Changing the structure on [0, 1] changes the operators ↑I and ↓I. For instance, with the Gödel structure on [0, 1], we get A↑I(y1) = 0.8, A↑I(y2) = 0.3, B↓I(x1) = 1, B↓I(x2) = 1, B↓I(x3) = 0.

With these definitions, the verbal definition of a concept may be formalized as follows.

Definition 7.2 A fuzzy concept in a fuzzy context ⟨X, Y, I⟩ is each pair ⟨A, B⟩ of a fuzzy set A ∈ L^X of objects and a fuzzy set B ∈ L^Y of attributes such that A↑ = B and B↓ = A.

Indeed, this definition just makes formal the verbal constraints that have to be fulfilled by the extent A and the intent B of a concept consisting of A and B. The set of all fuzzy concepts in a given fuzzy context ⟨X, Y, I⟩ will be denoted by ℬ(X, Y, I). That is, we have ℬ(X, Y, I) = {⟨A, B⟩ ∈ L^X × L^Y | A↑I = B, B↓I = A}. ℬ(X, Y, I) will be called a fuzzy concept lattice determined by a fuzzy context ⟨X, Y, I⟩ (the term concept “lattice” will be justified later). We write only “context,” “concept,” and “concept lattice” if L is obvious. On the other hand, if we want
to emphasize the structure L of truth values, we write “L-context,” “L-concept,” and “L-concept lattice.” It is important to note that if L = 2, i.e., the structure of truth values is the two-element set L = {0, 1}, then the notions of an L-context, L-concept, and L-concept lattice become the notions of a (crisp) context, (crisp) concept, and (crisp) concept lattice developed by Wille [1982].

Example 7.5 For L = 2 (i.e., L = {0, 1}, thus we have 0 and 1 as the only truth values), we have that A↑(y) = 1 (i.e., y belongs to A↑) if and only if for each x ∈ X such that A(x) = 1 we have I(x, y) = 1, or, in other words, each x ∈ A is in relation I with y, which is the meaning of A↑ in the crisp case. The situation for B↓ is symmetric.

Fuzzy concepts may be viewed as maximal rectangles in the object–attribute table corresponding to a fuzzy context. Although this is especially appealing in the crisp case, i.e., L = 2, we will demonstrate this alternative view of concepts in general. A binary fuzzy relation I between X and Y is called a rectangular relation if and only if there are A ∈ L^X, B ∈ L^Y such that I(x, y) = A(x) ⊗ B(y) (for all x ∈ X, y ∈ Y), written I = A ⊗ B. In this case, the pair ⟨A, B⟩ is called a rectangle. A rectangle ⟨A, B⟩ is said to be contained in a binary fuzzy relation I if A ⊗ B is contained in I, i.e., if A ⊗ B ⊆ I. There is a natural ordering ≤ on the set of all rectangles, defined by ⟨A1, B1⟩ ≤ ⟨A2, B2⟩ if and only if for all x ∈ X, y ∈ Y we have A1(x) ≤ A2(x) and B1(y) ≤ B2(y). The following theorem, if interpreted for L = 2, says that concepts are just maximal rectangles of I which are filled with 1s (if we consider the two-valued relation I as a matrix-table of 0s and 1s).

Theorem 7.1 For a fuzzy context ⟨X, Y, I⟩ and A ∈ L^X, B ∈ L^Y, we have that ⟨A, B⟩ is a fuzzy concept iff it is a maximal rectangle contained in I.
7.3.2 Fuzzy concept lattices: hierarchy of hidden concepts

We are now going to investigate the set of all fuzzy concepts with its hierarchical structure. A thorough treatment of this topic can be found in Bělohlávek [2003]. At this moment, we confine ourselves to a special case: we concentrate on the “crisp” hierarchy of fuzzy concepts. The subconcept–superconcept relation ≤ on the set ℬ(X, Y, I) of all fuzzy concepts in a fuzzy context ⟨X, Y, I⟩ (i.e., the conceptual hierarchy) is naturally defined by

\[ \langle A_1, B_1 \rangle \le \langle A_2, B_2 \rangle \quad \text{iff} \quad A_1 \subseteq A_2 \quad (\text{iff } B_1 \supseteq B_2). \tag{7.3} \]

That is, the fuzzy concept ⟨A1, B1⟩ is a subconcept of the fuzzy concept ⟨A2, B2⟩ if the extent A2 contains the extent A1. This means that the degree to which
any object x belongs to A2 is at least as high as the degree to which x belongs to A1. Equivalently, ⟨A1, B1⟩ is a subconcept of ⟨A2, B2⟩ if the intent B1 contains the intent B2. Saying that ⟨A1, B1⟩ is a subconcept of ⟨A2, B2⟩ may be equivalently expressed by saying that ⟨A1, B1⟩ is more special than ⟨A2, B2⟩, or that ⟨A2, B2⟩ is more general than ⟨A1, B1⟩, or that ⟨A2, B2⟩ is a superconcept of ⟨A1, B1⟩. The structure of ℬ(X, Y, I) w.r.t. ≤ and its characterization is the subject of the following theorem [Bělohlávek 2001, 2003].

Theorem 7.2 (main theorem of fuzzy concept lattices, crisp order) Let ⟨X, Y, I⟩ be an L-context. (1) ℬ(X, Y, I) is a complete lattice in which infima and suprema can be described as follows:

\[ \bigwedge_{j \in J} \langle A_j, B_j \rangle = \Bigl\langle \bigcap_{j \in J} A_j, \Bigl( \bigcap_{j \in J} A_j \Bigr)^{\uparrow} \Bigr\rangle = \Bigl\langle \bigcap_{j \in J} A_j, \Bigl( \bigcup_{j \in J} B_j \Bigr)^{\downarrow\uparrow} \Bigr\rangle \tag{7.4} \]

\[ \bigvee_{j \in J} \langle A_j, B_j \rangle = \Bigl\langle \Bigl( \bigcap_{j \in J} B_j \Bigr)^{\downarrow}, \bigcap_{j \in J} B_j \Bigr\rangle = \Bigl\langle \Bigl( \bigcup_{j \in J} A_j \Bigr)^{\uparrow\downarrow}, \bigcap_{j \in J} B_j \Bigr\rangle. \tag{7.5} \]
(2) Moreover, a complete lattice V = ⟨V, ≤⟩ is isomorphic to ℬ(X, Y, I) iff there are mappings γ : X × L → V, μ : Y × L → V, such that γ(X × L) is supremally dense in V, μ(Y × L) is infimally dense in V, and a ⊗ b ≤ I(x, y) is equivalent to γ(x, a) ≤ μ(y, b) for all x ∈ X, y ∈ Y, a, b ∈ L.

Note that the fact that ℬ(X, Y, I) is a complete lattice is a very natural one. Recall [Birkhoff, 1967] that a complete lattice is a partially ordered set V where, to each subset V′ of V, there exists the infimum of V′ as well as the supremum. Now the supremum of V′ is the least of all elements of V which are greater than or equal to each element of V′. That is, under the conceptual interpretation, if V′ is a set of concepts then the supremum of V′ is a concept which can be thought of as the direct generalization of all concepts from V′. Dually, the infimum of V′ is the concept which can be thought of as the direct common specialization of the concepts from V′.
7.3.3 Attribute implications

Attribute implications represent another useful set of information that can be extracted from the input data (fuzzy context). The basic idea is this. An attribute implication consists of a pair comprising a fuzzy set A of attributes and a fuzzy set B of attributes. Such an attribute implication will be briefly denoted by A ⇒ B (note that we use → to denote the residuum (implication operation) in structures of truth values and ⇒ to denote attribute implications, so there is no danger of confusion). Now, given the input data, i.e., a fuzzy context ⟨X, Y, I⟩, an attribute implication may
be true in ⟨X, Y, I⟩ to a certain degree. The degree A ⇒_{⟨X,Y,I⟩} B to which A ⇒ B is true in ⟨X, Y, I⟩ is verbally described as the degree to which “for each x ∈ X: if x has all the attributes from A then x has also all the attributes from B.” Formally, we have

\[ A \Rightarrow_{\langle X, Y, I \rangle} B = \bigwedge_{x \in X} \Bigl( \bigwedge_{y \in Y} \bigl( A(y) \rightarrow I(x, y) \bigr) \rightarrow \bigwedge_{y \in Y} \bigl( B(y) \rightarrow I(x, y) \bigr) \Bigr). \]
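The formula for the degree of an attribute implication can be evaluated directly on a finite fuzzy context. The sketch below assumes the Łukasiewicz residuum and reuses the fuzzy context of Example 7.4; the function name `implication_degree` and the convention that attributes missing from A or B have degree 0 are choices made for this illustration only.

```python
def lukasiewicz_residuum(a, b):
    return min(1.0 - a + b, 1.0)


OBJECTS = ["x1", "x2", "x3"]
ATTRIBUTES = ["y1", "y2"]

# Fuzzy context of Example 7.4.
I = {
    ("x1", "y1"): 1.0, ("x1", "y2"): 0.3,
    ("x2", "y1"): 0.8, ("x2", "y2"): 0.9,
    ("x3", "y1"): 0.0, ("x3", "y2"): 0.1,
}


def has_all(x, attrs, residuum):
    """Degree to which object x has all attributes of the fuzzy set `attrs`."""
    return min(residuum(attrs.get(y, 0.0), I[x, y]) for y in ATTRIBUTES)


def implication_degree(A, B, residuum=lukasiewicz_residuum):
    """Degree to which 'A implies B' is true in the context (minimum over objects)."""
    return min(
        residuum(has_all(x, A, residuum), has_all(x, B, residuum))
        for x in OBJECTS
    )


print(implication_degree({"y1": 1.0}, {"y2": 1.0}))  # {1/y1} => {1/y2}
print(implication_degree({"y2": 1.0}, {"y1": 1.0}))  # {1/y2} => {1/y1}
```

In this context the first implication is limited by x1, which fully has y1 but has y2 only to degree 0.3, so its degree is low, while the reverse implication holds to a high degree.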
In a similar way, one can consider the degree to which an attribute implication is true in ℬ(X, Y, I). We first elaborate on the details. Then we go to the notion of entailment of attribute implications and to the notion of a base of attribute implications. Recall that for fuzzy sets C, D ∈ L^U, the degree S(C, D) to which C is contained in D is defined by S(C, D) = ⋀_{x∈U} (C(x) → D(x)).

Definition 7.3 For a fuzzy context ⟨X, Y, I⟩, a fuzzy set M ∈ L^Y, and an attribute implication A ⇒ B, we put ⊨(M, A ⇒ B) = S(A, M) → S(B, M) and call ⊨(M, A ⇒ B) the degree to which A ⇒ B is true in M.

Definition 7.3 has a natural interpretation: ⊨(M, A ⇒ B) is the truth degree to which it is true that whenever A is contained in M then B is as well. We are going to define the notion of validity of a collection of attribute implications in a collection of fuzzy sets of attributes. ⊨ is an L-relation between the set L^Y of all L-sets of attributes and the set L^Y × L^Y of all L-attribute implications. As established by Bělohlávek [1999] (see also Ore [1944]), this relation induces an L-Galois connection ⟨^∧, ^∨⟩ between L^Y and L^Y × L^Y, i.e., for ℳ ∈ L^(L^Y) and ℐ ∈ L^(L^Y × L^Y) the L-sets ℳ^∧ ∈ L^(L^Y × L^Y) and ℐ^∨ ∈ L^(L^Y) are given by

\[ \mathcal{M}^{\wedge}(A \Rightarrow B) = \bigwedge_{M \in L^Y} \bigl( \mathcal{M}(M) \rightarrow {\models}(M, A \Rightarrow B) \bigr) \]

\[ \mathcal{I}^{\vee}(M) = \bigwedge_{(A \Rightarrow B) \in L^Y \times L^Y} \bigl( \mathcal{I}(A \Rightarrow B) \rightarrow {\models}(M, A \Rightarrow B) \bigr). \]
Therefore, ℳ^∧(A ⇒ B) is the truth degree to which A ⇒ B is true in each M from ℳ, and ℐ^∨(M) is the truth degree to which each implication from ℐ is true in M. Particularly, we will be interested in {{1/x}↑ | x ∈ X}^∧ and Int(X, Y, I)^∧: {{1/x}↑ | x ∈ X}^∧(A ⇒ B) is the truth degree to which A ⇒ B is true in each {1/x}↑ (i.e., the intent of the elementary fuzzy concept ⟨{1/x}↑↓, {1/x}↑⟩), and Int(X, Y, I)^∧(A ⇒ B) is the truth degree to which A ⇒ B is true in all intents of ℬ(X, Y, I). For the sake of brevity we denote, for ℳ ∈ L^(L^Y) and A ⇒ B, the
degree ℳ^∧(A ⇒ B) by A ⇒_ℳ B and, furthermore, if ℳ = {{1/x}↑ | x ∈ X} we write A ⇒_{⟨X,Y,I⟩} B instead of A ⇒_ℳ B. This makes good sense: since {1/x}↑(y) = I(x, y), A ⇒_{⟨X,Y,I⟩} B is the degree to which it is true that, for each x ∈ X, if x has all the attributes from A then x has all the attributes of B; i.e., we get the notion of validity of an attribute implication in a fuzzy context. For ℳ = {D | ⟨C, D⟩ ∈ ℬ(X, Y, I)} we denote A ⇒_ℳ B by A ⇒_{ℬ(X,Y,I)} B. That is, A ⇒_{ℬ(X,Y,I)} B is the degree to which it is true that if A is contained in an intent D of some fuzzy concept ⟨C, D⟩ from the fuzzy concept lattice ℬ(X, Y, I), then B is contained in D as well.

Theorem 7.3 For any fuzzy context ⟨X, Y, I⟩ and any attribute implication A ⇒ B we have

\[ A \Rightarrow_{\langle X, Y, I \rangle} B = A \Rightarrow_{\mathcal{B}(X, Y, I)} B, \]

i.e., for any implication A ⇒ B, the degree to which A ⇒ B is valid in the fuzzy context ⟨X, Y, I⟩ equals the degree to which A ⇒ B is true in ℬ(X, Y, I).

Since A ⇒_{⟨X,Y,I⟩} B and A ⇒_{ℬ(X,Y,I)} B coincide, we denote both of them simply by A ⇒_I B or even by A ⇒ B. The following assertions list some basic rules of the calculus of attribute implications.

Theorem 7.4 For any fuzzy context ⟨X, Y, I⟩ and any attribute implication A ⇒ B, the truth degrees A ⇒ B, S(A↓, B↓), and S(B, A↓↑) are equal.

Theorem 7.5 For each fuzzy context ⟨X, Y, I⟩ we have

\[ A \Rightarrow_I A = 1, \qquad (A \Rightarrow_I B) \otimes (B \Rightarrow_I C) \le (A \Rightarrow_I C) \tag{7.6} \]

\[ S(A_1, A_2) \otimes S(B_2, B_1) \otimes (A_1 \Rightarrow B_1) \le (A_2 \Rightarrow B_2). \tag{7.7} \]
Thus, (7.6) says that A always implies A, and that if A implies B and B implies C then A implies C; (7.7) says that if A1 implies B1 and if A1 is contained in A2 and B1 contains B2, then A2 implies B2. These rules indicate that some of the attribute implications “follow” from other implications. Therefore, it is not necessary to list all the attribute implications. Rather, it is desirable to have only a relatively small “base” of attribute implications from which all other implications follow. We now formalize these intuitive considerations. We say that an attribute implication A ⇒ B follows from a set {Aj ⇒ Bj | j ∈ J} of attribute implications (A, B, Aj, Bj ∈ L^Y) if, for each fuzzy set M ∈ L^Y, we have that A ⇒_ℳ B = 1 with ℳ = {M} (i.e., the truth degree to which A ⇒ B is true in {M} is 1)
whenever, for each j ∈ J, we have Aj ⇒_ℳ Bj = 1 (with ℳ = {M}). For a fuzzy context ⟨X, Y, I⟩ we say that a set ℐ = {Aj ⇒ Bj | j ∈ J} of attribute implications which are true in degree 1 in ⟨X, Y, I⟩ forms a base for ⟨X, Y, I⟩ if each attribute implication A ⇒ B which is true in degree 1 in ⟨X, Y, I⟩ follows from ℐ. A base ℐ for ⟨X, Y, I⟩ is called irredundant if no A ⇒ B ∈ ℐ follows from ℐ − {A ⇒ B}. Therefore, an irredundant base provides complete irredundant information about the attribute implications. Further information about attribute implications (including algorithms) can be found in Ganter & Wille [1999] and in Pollandt [1997].
7.4 Similarity and Logical Precision

7.4.1 Similarity relations

The similarity phenomenon plays a crucial role in the way humans regard the world. In fact, similarities are induced by the very nature of human perception. Gradual similarity of concepts is one of the fundamental preconditions for powerful human reasoning and communication. The similarity phenomenon is thus one of the most important ones accompanying conceptual structures. In fuzzy set theory, the similarity phenomenon is approached via so-called similarity relations (fuzzy equivalence relations); see Chapter 2. For a structure L of truth values, a similarity relation (fuzzy equivalence) [Zadeh, 1971] on a universe U is a binary fuzzy relation E : U × U → L satisfying the following properties for all x, y, z ∈ U:

\[ E(x, x) = 1 \tag{7.8} \]

\[ E(x, y) = E(y, x) \tag{7.9} \]

\[ E(x, y) \otimes E(y, z) \le E(x, z). \tag{7.10} \]
Properties (7.8), (7.9), and (7.10) are called reflexivity, symmetry, and transitivity, respectively. A similarity class of x ∈ U is the fuzzy set [x]_E ∈ L^U given by [x]_E(y) = E(x, y) for each y ∈ U, i.e., it is the collection of elements similar to x. A fuzzy set A ∈ L^U is said to be compatible w.r.t. E if for every x, y ∈ U we have A(x) ⊗ E(x, y) ≤ A(y). Verbally, this condition says that, with each element x, A contains all the elements similar to x. It is easily seen that in the crisp case, i.e., L = {0, 1}, similarity relations are equivalence relations. For the study of the similarity phenomenon, the crisp case is degenerate and uninteresting: two elements x and y can only be “fully similar” (E(x, y) = 1) or “fully dissimilar” (E(x, y) = 0).
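Properties (7.8) to (7.10) are easy to test mechanically on a finite universe. The following sketch is an illustration only: the relation values are hypothetical, the helper names `symmetric_relation` and `is_similarity` are assumptions, and a small tolerance `eps` is used because floating-point sums such as 0.5 + 0.6 − 1 are not exact.

```python
from itertools import product


def lukasiewicz_tnorm(a, b):
    return max(a + b - 1.0, 0.0)


def goedel_tnorm(a, b):
    return min(a, b)


def symmetric_relation(universe, degrees):
    """Build E on all ordered pairs from degrees given once per unordered pair."""
    E = {(u, u): 1.0 for u in universe}
    for (u, v), degree in degrees.items():
        E[u, v] = degree
        E[v, u] = degree
    return E


def is_similarity(E, universe, tnorm, eps=1e-9):
    """Check reflexivity (7.8), symmetry (7.9), and transitivity (7.10)."""
    reflexive = all(abs(E[u, u] - 1.0) <= eps for u in universe)
    symmetric = all(
        abs(E[u, v] - E[v, u]) <= eps for u, v in product(universe, repeat=2)
    )
    transitive = all(
        tnorm(E[u, v], E[v, w]) <= E[u, w] + eps
        for u, v, w in product(universe, repeat=3)
    )
    return reflexive and symmetric and transitive


U = ["x", "y", "z"]
# Hypothetical similarity degrees chosen for illustration.
E = symmetric_relation(U, {("x", "y"): 0.5, ("y", "z"): 0.6, ("x", "z"): 0.1})

print(is_similarity(E, U, lukasiewicz_tnorm))
print(is_similarity(E, U, goedel_tnorm))
```

The same relation can be transitive under one t-norm and fail under another, which is why the check takes the t-norm as an argument.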
Example 7.6 For the three basic structures on [0, 1], i.e., Łukasiewicz, Gödel, and product, transitivity translates to

\[ \max(0, E(x, y) + E(y, z) - 1) \le E(x, z), \]

\[ \min(E(x, y), E(y, z)) \le E(x, z), \]

\[ E(x, y) \cdot E(y, z) \le E(x, z), \]

respectively. Reflexivity and symmetry conditions are the same for each of the three structures. Note that transitivity expresses a condition which can be formulated in words as “if x and y are similar and if y and z are similar then x and z are similar.” For example, if E(x, y) = 0.8 (x and y are similar in degree 0.8) and E(y, z) = 0.8 (y and z are similar in degree 0.8) then x and z have to be similar at least in degree 0.8 ⊗ 0.8. Thus, in the case of the product structure, transitivity forces E(x, z) ≥ 0.8 ⊗ 0.8 = 0.64.

To model the equivalence (or closeness) of truth values we have at our disposal the so-called biresiduum (or biimplication) [Pavelka, 1979] operation ↔ defined by a ↔ b = (a → b) ∧ (b → a). The following lemma will be useful in our considerations.

Lemma 7.1 Let E be a similarity on U, and let S = {Ai ∈ L^U | i ∈ I} be a family of fuzzy sets. (1) E is the largest similarity relation compatible with all [x]_E. (2) The relation E_S defined by

\[ E_S(x, y) = \bigwedge_{i \in I} \bigl( A_i(x) \leftrightarrow A_i(y) \bigr) \tag{7.11} \]

is the largest similarity relation compatible with all Ai ∈ S. Moreover, Ai(x) = 1 implies [x]_{E_S} ⊆ Ai.

Notice that for the crisp case (i.e., L = {0, 1}), E_S is a crisp equivalence relation: two elements of the universe are equivalent if and only if there is no set of the family which separates them.

Remark 7.1 Lemma 7.1 has a very natural interpretation. Each fuzzy set Ai in U represents some property of elements of U; Ai(x) is the degree to which the element x has the property represented by Ai. The degree Ai(x) ↔ Ai(y) is the truth degree to which “x has the property Ai if and only if y has the property Ai.” Therefore, E_S(x, y) is the truth degree of “for each property A from S: x has A if and only if y has A.” If
S is the system of all relevant properties, E_S(x, y) is the truth degree to which x and y have the same (relevant) properties.

Example 7.7 For the three basic structures on the real unit interval [0, 1], i.e., Łukasiewicz, Gödel, and product, we have

\[ E_S(x, y) = \inf_{i \in I} \{ 1 - |A_i(x) - A_i(y)| \} \quad \text{(Łukasiewicz)} \]

\[ E_S(x, y) = \begin{cases} 1 & \text{if } A_i(x) = A_i(y) \text{ for all } i \in I, \\ \inf \{ \min(A_i(x), A_i(y)) \mid i \in I,\ A_i(x) \ne A_i(y) \} & \text{otherwise} \end{cases} \quad \text{(Gödel)} \]

\[ E_S(x, y) = \inf_{i \in I} \{ \min(A_i(x)/A_i(y), A_i(y)/A_i(x)) \} \quad \text{(product)} \]

where we put 0/0 = 1 and a/0 = ∞ for a ≠ 0. In the Gödel case the infimum ranges only over those i with Ai(x) ≠ Ai(y), since Ai(x) ↔ Ai(y) = 1 whenever Ai(x) = Ai(y).

Example 7.8 We illustrate E_S. Let U = {x, y, z}, S = {A, B} where A = {1/x, 0.5/y, 0.1/z}, B = {0.9/x, 0.4/y, 0.1/z}. For the Łukasiewicz structure on [0, 1] we get E_S(x, y) = [A(x) ↔ A(y)] ∧ [B(x) ↔ B(y)] = [1 ↔ 0.5] ∧ [0.9 ↔ 0.4] = 0.5 ∧ 0.5 = 0.5, and E_S(y, z) = 0.6, E_S(x, z) = 0.1. Note that, for the Gödel structure, we get E_S(x, y) = 0.4, E_S(y, z) = 0.1, E_S(x, z) = 0.1; thus we see that E_S depends on the structure on L.

In the following we consider the problem of similarities on three levels: similarity of objects (and attributes), similarity of concepts, and similarity of concept lattices. The proofs and further results can be found in Bělohlávek [2000a].

7.4.2 Similarity of objects and similarity of attributes

First, let us propose a way to measure similarity of objects and similarity of attributes of a given fuzzy context. This similarity is induced by the structure of fuzzy concepts determined by the fuzzy context. It turns out that these similarities may be determined directly from the fuzzy context; this is relevant from the computational point of view. Lemma 7.1 can be directly applied to our problem of measuring similarity of objects and attributes. We are given objects (elements of X) and their observed attributes (elements of Y). A natural question is that of the similarity relation on the objects and attributes. The given fuzzy context gives rise to a complete lattice of all fuzzy concepts hidden in the fuzzy context. As mentioned, the fuzzy concept lattice may be used for the conceptual classification of objects and attributes. It seems therefore
reasonable to use the induced conceptual structure $\mathcal{B}(X, Y, I)$ to define similarity relations on $X$ and on $Y$. Consider the problem of similarity of objects. Informally, two objects $x_1, x_2 \in X$ are similar if they cannot be separated by any concept, or more precisely, if for each concept $c$ it holds that $x_1$ belongs to the extent of $c$ if and only if $x_2$ belongs to the extent of $c$. This leads to the following definition of a relation $E^X_{\mathcal{B}(X,Y,I)} \in L^{X \times X}$:

\[ E^X_{\mathcal{B}(X,Y,I)}(x_1, x_2) = \bigwedge_{\langle A, B\rangle \in \mathcal{B}(X,Y,I)} \bigl(A(x_1) \leftrightarrow A(x_2)\bigr). \tag{7.12} \]

The relation $E^X_{\mathcal{B}(X,Y,I)}$ will be called induced (by $\mathcal{B}(X, Y, I)$) similarity on $X$. By Lemma 7.1 we immediately get the following statement.
Theorem 7.6 The relation $E^X_{\mathcal{B}(X,Y,I)}$ is the largest similarity relation on $X$ compatible with the extents of all concepts of $\mathcal{B}(X, Y, I)$.

From the computational point of view, the foregoing definition leads to the following algorithm for computing the similarity relation $E^X_{\mathcal{B}(X,Y,I)}$. Take a fuzzy context, generate all the fuzzy concepts of $\mathcal{B}(X, Y, I)$, and determine the similarity of each pair $\langle x_1, x_2\rangle \in X \times X$ by (7.12). The fuzzy concept lattice may, however, be quite extensive. This poses the question whether the computational cost can be reduced. An (exact) solution which significantly reduces the computational costs follows. Define a relation $E^X_{X,Y,I} \in L^{X \times X}$ by

\[ E^X_{X,Y,I}(x_1, x_2) = \bigwedge_{y \in Y} \bigl(I(x_1, y) \leftrightarrow I(x_2, y)\bigr). \tag{7.13} \]

$E^X_{X,Y,I}(x_1, x_2)$ may be obtained from the $L$-context $\langle X, Y, I\rangle$ computing $|Y|$ times the operation $\leftrightarrow$. Using Lemma 7.1 (put $X = X$, $I = Y$, $A_i(x) = I(x, y)$), we get the following theorem.
Theorem 7.7 The relation $E^X_{X,Y,I}$ is the largest similarity relation on $X$ compatible with all $I(\_, y) \in L^X$, $y \in Y$.

The following theorem solves the problem of finding an efficient procedure for computing the similarity relation $E^X_{\mathcal{B}(X,Y,I)}$.

Theorem 7.8 Let $\langle X, Y, I\rangle$ be a fuzzy context. Then, for the similarity relations defined by (7.13) and (7.12), we have

\[ E^X_{\mathcal{B}(X,Y,I)} = E^X_{X,Y,I}. \tag{7.14} \]
Hence, the computation of $E^X_{\mathcal{B}(X,Y,I)}$ may be reduced to the computation of $E^X_{X,Y,I}$, which is much simpler. In a completely analogous way we may get the results for the similarity relations on $Y$.
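The shortcut (7.13) can be sketched in a few lines of Python. The sketch below is only an illustration under stated assumptions: the data of Example 7.8 are read as a small fuzzy context (objects x, y, z; attributes A, B), and the names `object_similarity`, `lukasiewicz_bires`, and `goedel_bires` are hypothetical helpers, not part of any tool mentioned in this chapter. Exact fractions are used to avoid rounding effects.

```python
from fractions import Fraction as F

def lukasiewicz_bires(a, b):
    # Łukasiewicz biresiduum: a <-> b = 1 - |a - b|
    return 1 - abs(a - b)

def goedel_bires(a, b):
    # Gödel biresiduum: 1 if a == b, otherwise min(a, b)
    return F(1) if a == b else min(a, b)

def object_similarity(context, bires):
    # (7.13): E(x1, x2) is the meet, over all attributes y, of I(x1, y) <-> I(x2, y)
    return {
        (x1, x2): min(bires(row1[y], row2[y]) for y in row1)
        for x1, row1 in context.items()
        for x2, row2 in context.items()
    }

# Data of Example 7.8, read as a context (an assumption made for illustration).
context = {
    "x": {"A": F(1), "B": F(9, 10)},
    "y": {"A": F(1, 2), "B": F(2, 5)},
    "z": {"A": F(1, 10), "B": F(1, 10)},
}

E_luk = object_similarity(context, lukasiewicz_bires)
E_god = object_similarity(context, goedel_bires)
```

Only $|Y|$ biresiduum evaluations are needed per pair of objects, and no concept is ever generated, which is the point of Theorem 7.8.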
7.4.3 Similarity of concepts
The next level on which the similarity phenomenon will be considered is the level of concepts. Observe first the following fact, which follows from Lemma 7.1.

Lemma 7.2 For any universe $U$, the relation $E$ on $L^U$ given for any $A_1, A_2 \in L^U$ by

\[ E(A_1, A_2) = \bigwedge_{x \in U} \bigl(A_1(x) \leftrightarrow A_2(x)\bigr) \]

is the largest similarity relation on $L^U$ such that $A_1(x) \otimes E(A_1, A_2) \le A_2(x)$ holds for each $x \in U$, $A_1, A_2 \in L^U$.

$E(A_1, A_2)$ is thus the truth degree to which it is true that $x$ belongs to $A_1$ if and only if $x$ belongs to $A_2$. In the following, it will be clear what universe $U$ the relation $E$ concerns.

Example 7.9 Let again $U = \{x, y, z\}$, $S = \{A, B\}$, where $A = \{1/x, 0.5/y, 0.1/z\}$, $B = \{0.9/x, 0.4/y, 0.1/z\}$. For the Łukasiewicz structure on $[0, 1]$ we get $E(A, B) = [A(x) \leftrightarrow B(x)] \wedge [A(y) \leftrightarrow B(y)] \wedge [A(z) \leftrightarrow B(z)] = [1 \leftrightarrow 0.9] \wedge [0.5 \leftrightarrow 0.4] \wedge [0.1 \leftrightarrow 0.1] = 0.9 \wedge 0.9 \wedge 1 = 0.9$. For the Gödel structure we analogously have $E(A, B) = 0.4$, while for the product structure we have $E(A, B) = 0.8$.

Consider first the relations $E^{\mathrm{Ext}}$ and $E^{\mathrm{Int}}$ on $\mathcal{B}(X, Y, I)$; call them induced similarity by extents and induced similarity by intents, respectively:

\[ E^{\mathrm{Ext}}(\langle A_1, B_1\rangle, \langle A_2, B_2\rangle) = E(A_1, A_2) = \bigwedge_{x \in X} \bigl(A_1(x) \leftrightarrow A_2(x)\bigr), \]
\[ E^{\mathrm{Int}}(\langle A_1, B_1\rangle, \langle A_2, B_2\rangle) = E(B_1, B_2) = \bigwedge_{y \in Y} \bigl(B_1(y) \leftrightarrow B_2(y)\bigr). \]

Lemma 7.2 gives immediately the following statement.
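Theorem 7.9 $E^{\mathrm{Ext}}$ and $E^{\mathrm{Int}}$ are the largest similarity relations on $\mathcal{B}(X, Y, I)$ such that $A_1(x) \otimes E^{\mathrm{Ext}}(\langle A_1, B_1\rangle, \langle A_2, B_2\rangle) \le A_2(x)$ and $B_1(y) \otimes E^{\mathrm{Int}}(\langle A_1, B_1\rangle, \langle A_2, B_2\rangle) \le B_2(y)$ hold for every $x \in X$, $y \in Y$, $\langle A_1, B_1\rangle, \langle A_2, B_2\rangle \in \mathcal{B}(X, Y, I)$.

The following theorem answers the question of how the relations $E^{\mathrm{Ext}}$ and $E^{\mathrm{Int}}$ are related.

Theorem 7.10 For any fuzzy context $\langle X, Y, I\rangle$ we have $E^{\mathrm{Ext}} = E^{\mathrm{Int}}$.

We can therefore write $E$ instead of $E^{\mathrm{Ext}}$ and $E^{\mathrm{Int}}$ and call it the induced similarity on concepts.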
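The degrees quoted in Example 7.9 can be reproduced with a short Python sketch. It assumes the usual Łukasiewicz, Gödel, and product operations on $[0, 1]$ and computes the biresiduum as $(a \to b) \wedge (b \to a)$; the helper names (`similarity`, `STRUCTURES`, and so on) are hypothetical. The last line also checks the inequality $A(x) \otimes E(A, B) \le B(x)$ of Lemma 7.2 on this one example, which is a spot check and not a proof.

```python
from fractions import Fraction as F

# (tensor, residuum) pairs for three standard structures on [0, 1]
def luk_tensor(a, b): return max(F(0), a + b - 1)
def luk_residuum(a, b): return min(F(1), 1 - a + b)
def goedel_tensor(a, b): return min(a, b)
def goedel_residuum(a, b): return F(1) if a <= b else b
def product_tensor(a, b): return a * b
def product_residuum(a, b): return F(1) if a <= b else b / a

STRUCTURES = {
    "lukasiewicz": (luk_tensor, luk_residuum),
    "goedel": (goedel_tensor, goedel_residuum),
    "product": (product_tensor, product_residuum),
}

def similarity(A1, A2, residuum):
    # E(A1, A2) = meet over x of (A1(x) -> A2(x)) ∧ (A2(x) -> A1(x))
    return min(min(residuum(A1[x], A2[x]), residuum(A2[x], A1[x])) for x in A1)

# Fuzzy sets of Example 7.9
A = {"x": F(1), "y": F(1, 2), "z": F(1, 10)}
B = {"x": F(9, 10), "y": F(2, 5), "z": F(1, 10)}

degrees = {name: similarity(A, B, res) for name, (_, res) in STRUCTURES.items()}

# Spot check of Lemma 7.2: A(x) ⊗ E(A, B) <= B(x) for every x and every structure
lemma_holds = all(
    tensor(A[x], similarity(A, B, res)) <= B[x]
    for tensor, res in STRUCTURES.values()
    for x in A
)
```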
7.4.4 Compatible similarities and factorization
The primary importance of similarity relations in human reasoning is the reduction of the complexity of the world at a reasonable price. The complexity is reduced by considering the “collections of similar elements of concern” rather than the particular elements themselves [Zadeh, 1997]. This is known in general system theory as the abstraction process by factorization: moving from a given level of abstraction (distinguishability) one level up where the elements are collections of elements of the lower level. Instead of the original system one therefore considers the “system modulo similarity.” The price paid is the loss of precision.

Our concern in the following is the reduction of the complexity of the concept lattice by factorization modulo similarity. The concept lattice of a given context represents the overall conceptual structure, which can be considerably intricate. To gain insight one has to look for methods of reducing the complexity of the structure. In the two-valued (crisp) case, considerable attention has been paid to this problem [Ganter & Wille, 1999]. In the fuzzy case, one would expect to find methods for gradual reduction of the complexity. The idea is to factorize the concept lattice by an appropriate α-cut ${}^{\alpha}E$ of the similarity $E$ (note that ${}^{\alpha}E = \{\langle c_1, c_2\rangle \mid \alpha \le E(c_1, c_2)\}$), controlling thus the complexity by $\alpha \in L$. Clearly, the lower $\alpha \in L$, the coarser the factorization.

The process of factorization of a system consists of two steps. First, specification of the elements, and, secondly, specification of the structure of the factor system. Since both of the steps are non-standard in our case, we will describe them in more detail. In general, algebraic systems can be factorized by congruences, i.e., equivalences compatible with the structure of the system. We deal with conceptual structures that are complete lattices.
The α-cut ${}^{\alpha}E$ is clearly a tolerance relation (i.e., reflexive and symmetric), not transitive in general. In general, factorization of algebras by compatible tolerances is not possible. Surprisingly, Czédli [1982] showed a way to
factorize lattices by compatible tolerance relations. The construction has then been used for the factorization of ordinary concept lattices [Wille, 1985]. In the following, we describe the construction of the factor lattice of a fuzzy concept lattice by a compatible tolerance relation.

Let $\langle X, Y, I\rangle$ be a fuzzy context. A tolerance relation $T$ on $\mathcal{B}(X, Y, I)$ is said to be compatible if it is preserved under arbitrary suprema and infima, i.e., if $\langle c_j, c'_j\rangle \in T$, $j \in J$, implies both $\langle \bigvee_{j \in J} c_j, \bigvee_{j \in J} c'_j\rangle \in T$ and $\langle \bigwedge_{j \in J} c_j, \bigwedge_{j \in J} c'_j\rangle \in T$ for any $c_j, c'_j \in \mathcal{B}(X, Y, I)$, $j \in J$. For a compatible tolerance relation $T$ on $\mathcal{B}(X, Y, I)$ denote $c_T = \bigwedge_{\langle c, c'\rangle \in T} c'$ and $c^T = \bigvee_{\langle c, c'\rangle \in T} c'$. Call $[c]_T = [c_T, (c_T)^T] = \{c' \in \mathcal{B}(X, Y, I) \mid c_T \le c' \le (c_T)^T\}$ a block of $T$ and denote by $\mathcal{B}(X, Y, I)/T = \{[c]_T \mid c \in \mathcal{B}(X, Y, I)\}$ the set of all blocks. Introduce a relation $\le_T$ on $\mathcal{B}(X, Y, I)/T$ by $[c]_T \le_T [c']_T$ if and only if $\bigwedge [c]_T \le \bigwedge [c']_T$ (if and only if $\bigvee [c]_T \le \bigvee [c']_T$). The justification of the construction is given by the following statement which follows immediately from Wille [1985].

Theorem 7.11
(1) $\mathcal{B}(X, Y, I)/T$ is the set of all maximal tolerance blocks, i.e., $\mathcal{B}(X, Y, I)/T = \{B \subseteq \mathcal{B}(X, Y, I) \mid (B \times B \subseteq T) \mathbin{\&} ((\forall B' \supset B)\, B' \times B' \not\subseteq T)\}$.
(2) $\langle \mathcal{B}(X, Y, I)/T, \le_T\rangle$ is a complete lattice (factor lattice) where suprema and infima are described by

\[ \bigvee_{j \in J} [c_j]_T = \Bigl[\bigvee_{j \in J} c_j\Bigr]_T \quad\text{and}\quad \bigwedge_{j \in J} [c_j]_T = \Bigl[\Bigl(\bigwedge_{j \in J} c_j\Bigr)^T\Bigr]_T \tag{7.15} \]
for every $c_j \in \mathcal{B}(X, Y, I)$, $j \in J$. Substituting (7.4) and (7.5) into (7.15), we get a more concrete description of the lattice operations.

Coming back to the induced similarity $E$ on $\mathcal{B}(X, Y, I)$, the ultimate question is that of the compatibility of the α-cuts of $E$. We call a similarity relation $F$ on $\mathcal{B}(X, Y, I)$ compatible if ${}^{\alpha}F$ is a compatible tolerance relation on $\mathcal{B}(X, Y, I)$ for each $\alpha \in L$. Notice that for the two-valued (crisp) case the situation is completely uninteresting. Namely, as one may easily check, the only cases are ${}^{0}E = \mathcal{B}(X, Y, I) \times \mathcal{B}(X, Y, I)$ and ${}^{1}E = \mathrm{id}_{\mathcal{B}(X,Y,I)} = \{\langle c, c\rangle \mid c \in \mathcal{B}(X, Y, I)\}$. In the first case, $\mathcal{B}(X, Y, I)/{}^{0}E = \{\mathcal{B}(X, Y, I)\}$, i.e., the factor lattice collapses into a one-element lattice, while in the second case, $\mathcal{B}(X, Y, I)/{}^{1}E = \{\{\langle A, B\rangle\} \mid \langle A, B\rangle \in \mathcal{B}(X, Y, I)\}$, i.e., $\mathcal{B}(X, Y, I)$ and $\mathcal{B}(X, Y, I)/{}^{1}E$ are isomorphic.

Note that we need not confine ourselves to the induced similarity $E$. On the other hand, taking into account only similarity relations $F$ satisfying $A(x) \otimes F(\langle A, B\rangle, \langle A', B'\rangle) \le A'(x)$ (which is quite natural: it reads “an object belonging to the extent of some concept belongs also to the extent of any similar concept”)
for each $x \in X$, Theorem 7.9 tells us that $E$ provides the most extensive reduction: for any other $F$ and each $\alpha \in L$, ${}^{\alpha}E$ is coarser than ${}^{\alpha}F$.

Theorem 7.12 The induced similarity $E$ on $\mathcal{B}(X, Y, I)$ is compatible. If $\alpha \in L$ is $\otimes$-idempotent (i.e., $\alpha \otimes \alpha = \alpha$) then ${}^{\alpha}E$ is, moreover, transitive, i.e., a congruence relation on $\mathcal{B}(X, Y, I)$.

Remark 7.2 Theorem 7.12 and the construction described above yield a method for factorizing any fuzzy concept lattice $\mathcal{B}(X, Y, I)$ by any α-cut ${}^{\alpha}E$ of the induced similarity $E$. It is worth noticing that the similarity $E$ is defined “internally,” i.e., it is not supplied from the outside.

Remark 7.3 If $\mathbf{L}$ is the Gödel algebra on $[0, 1]$, i.e., $\otimes$ is min, then each α-cut of $E$ is indeed a congruence relation.
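The role of idempotency can be seen on a toy case. The following Python sketch does not work on a concept lattice; as a simplifying assumption it reuses the similarity $E_S$ of Example 7.8 on the three-element universe $\{x, y, z\}$ and only tests whether α-cuts are transitive (compatibility with suprema and infima is not examined). All function names are hypothetical.

```python
from fractions import Fraction as F
from itertools import product

UNIVERSE = ("x", "y", "z")

def make_similarity(off_diagonal):
    # Build a reflexive, symmetric fuzzy relation from its off-diagonal degrees.
    E = {(u, u): F(1) for u in UNIVERSE}
    for (u, v), degree in off_diagonal.items():
        E[(u, v)] = E[(v, u)] = degree
    return E

def alpha_cut(E, alpha):
    # Crisp relation of all pairs that are similar in degree at least alpha.
    return {pair for pair, degree in E.items() if degree >= alpha}

def is_transitive(relation):
    return all(
        (a, d) in relation
        for (a, b), (c, d) in product(relation, repeat=2)
        if b == c
    )

# Degrees of E_S from Example 7.8 (Łukasiewicz and Gödel structures)
E_luk = make_similarity({("x", "y"): F(1, 2), ("y", "z"): F(3, 5), ("x", "z"): F(1, 10)})
E_god = make_similarity({("x", "y"): F(2, 5), ("y", "z"): F(1, 10), ("x", "z"): F(1, 10)})

# 1/2 is not idempotent for the Łukasiewicz conjunction: the cut is only a tolerance.
luk_half_transitive = is_transitive(alpha_cut(E_luk, F(1, 2)))

# Every alpha is idempotent for min, so every cut should be an equivalence.
goedel_all_transitive = all(
    is_transitive(alpha_cut(E_god, alpha)) for alpha in set(E_god.values())
)
```

In the Łukasiewicz case the cut at $\tfrac12$ relates x with y and y with z but not x with z, which is exactly the tolerance-but-not-congruence situation that the block construction of Theorem 7.11 is designed to handle.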
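7.4.5 Similarity of concept lattices
Finally, we consider similarity of concept lattices. A natural way to define the similarity degree of two concept lattices over the sets $X$ and $Y$ is based on the following intuition. Concept lattices $\mathcal{B}(X, Y, I_1)$ and $\mathcal{B}(X, Y, I_2)$ are similar if and only if for each concept $c_1 \in \mathcal{B}(X, Y, I_1)$ there is a concept $c_2 \in \mathcal{B}(X, Y, I_2)$ such that $c_1$ and $c_2$ are similar and, conversely, for each concept $c_2 \in \mathcal{B}(X, Y, I_2)$ there is a concept $c_1 \in \mathcal{B}(X, Y, I_1)$ such that $c_1$ and $c_2$ are similar. In the following we write $\mathcal{B}_1$ and $\mathcal{B}_2$ instead of $\mathcal{B}(X, Y, I_1)$ and $\mathcal{B}(X, Y, I_2)$, respectively. According to how the similarity of concepts is measured, we distinguish two rules for the definition of the similarity degree of two concept lattices:

\[ E^{*}(\mathcal{B}(X, Y, I_1), \mathcal{B}(X, Y, I_2)) = \bigwedge_{\langle A_1, B_1\rangle \in \mathcal{B}_1} \bigvee_{\langle A_2, B_2\rangle \in \mathcal{B}_2} E^{*}(\langle A_1, B_1\rangle, \langle A_2, B_2\rangle) \;\wedge\; \bigwedge_{\langle A_2, B_2\rangle \in \mathcal{B}_2} \bigvee_{\langle A_1, B_1\rangle \in \mathcal{B}_1} E^{*}(\langle A_1, B_1\rangle, \langle A_2, B_2\rangle), \tag{7.16} \]

$* \in \{\mathrm{Ext}, \mathrm{Int}\}$, where we put $E^{\mathrm{Ext}}(\langle A_1, B_1\rangle, \langle A_2, B_2\rangle) = E(A_1, A_2)$ and $E^{\mathrm{Int}}(\langle A_1, B_1\rangle, \langle A_2, B_2\rangle) = E(B_1, B_2)$.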
We shall see that all of the above relations are, in fact, similarity relations. The meaning of the similarity relation $E$ defined on the set of all contexts is:

\[ E(\langle X, Y, I_1\rangle, \langle X, Y, I_2\rangle) = E(I_1, I_2) = \bigwedge_{\langle x, y\rangle \in X \times Y} \bigl(I_1(x, y) \leftrightarrow I_2(x, y)\bigr). \tag{7.17} \]
Lemma 7.2 immediately gives the result that $E$ is a similarity relation. The main result showing the relationships between the relations introduced above is contained in the following theorem.

Theorem 7.13 For every pair of fuzzy contexts $\langle X, Y, I_1\rangle$, $\langle X, Y, I_2\rangle$ we have $E(\langle X, Y, I_1\rangle, \langle X, Y, I_2\rangle) \le E^{\mathrm{Ext}}(\mathcal{B}_1, \mathcal{B}_2)$ and $E(\langle X, Y, I_1\rangle, \langle X, Y, I_2\rangle) \le E^{\mathrm{Int}}(\mathcal{B}_1, \mathcal{B}_2)$. Moreover, $E^{\mathrm{Ext}}$ and $E^{\mathrm{Int}}$ are similarity relations on $\{\mathcal{B}(X, Y, I) \mid I \in L^{X \times Y}\}$.
7.4.6 Logical precision
Consider a structure $\mathbf{L}$ of truth values. The set $L$ is the set of all possible truth values which we have at our disposal for logical modeling of our knowledge. It can be considered as representing “logical discernibility.” Consider for example the two-element Boolean algebra. Then the level of discernibility is low: we can discern only fully true statements from fully false statements. An $n$-element chain of truth values offers more: we can discern $n$ logical “levels.” Very loosely, using more truth values means more logical precision (in the above sense). From the point of view of logical modeling, it is natural to be able to change the set of truth values (in order to increase or decrease the logical discernibility) so that the structural properties of the model remain preserved. An important role is played by the structure of the set of truth values. Consider two structures $\mathbf{L}_1$ and $\mathbf{L}_2$ of truth values such that there is a surjective mapping $h : L_1 \to L_2$, i.e., $h(L_1) = L_2$. If $h$ preserves the structure of the sets of truth values, then the change from $\mathbf{L}_2$ to $\mathbf{L}_1$ can be considered as an increase of logical precision and, conversely, the change from $\mathbf{L}_1$ to $\mathbf{L}_2$ can be considered as a decrease of logical precision. The requirement of preserving the structure of truth values may be, from the algebraic point of view, seen as fulfilled if $h$ is a homomorphism [Grätzer,
1968]. In our case, $h$ is a homomorphism if and only if the following conditions are satisfied:

\[ h(a \vee b) = h(a) \vee h(b), \quad h(a \wedge b) = h(a) \wedge h(b), \quad h(a \otimes b) = h(a) \otimes h(b), \]
\[ h(a \to b) = h(a) \to h(b), \quad h(0) = 0, \quad h(1) = 1. \]

In the following we will suppose that all the homomorphisms under consideration will be $\bigwedge$-preserving, i.e., for each $K \subseteq L_1$ we have $h(\bigwedge_{k \in K} k) = \bigwedge_{k \in K} h(k)$. Given two structures $\mathbf{L}_1$ and $\mathbf{L}_2$ of truth values and a homomorphism $h : L_1 \to L_2$, we define for each $L_1$-fuzzy set $A$ in $X$ ($A \in L_1^X$) the corresponding $L_2$-fuzzy set $h(A) \in L_2^X$ by $(h(A))(x) = h(A(x))$ for all $x \in X$. The following two statements show how the systematic change of the set of truth values (i.e., increase or decrease of logical precision) influences the structure of the respective concepts [Bělohlávek, 2002a].

Lemma 7.3 Let $\mathbf{L}_1$, $\mathbf{L}_2$ be complete residuated lattices and $h : L_1 \to L_2$ be an onto homomorphism. Let $\langle X, Y, I\rangle$ be an $L_1$-context. Then for $C \in L_2^X$, $D \in L_2^Y$, the following holds: $\langle C, D\rangle \in \mathcal{B}(X, Y, h(I))$ iff there are $A \in L_1^X$, $B \in L_1^Y$ such that $\langle A, B\rangle \in \mathcal{B}(X, Y, I)$, $h(A) = C$, and $h(B) = D$.

A lattice homomorphism $h : V_1 \to V_2$ between two complete lattices $V_1$ and $V_2$ is called complete if for each $K \subseteq V_1$ we have $h(\bigwedge_{k \in K} k) = \bigwedge_{k \in K} h(k)$ and $h(\bigvee_{k \in K} k) = \bigvee_{k \in K} h(k)$.

Theorem 7.14 Under the conditions of the preceding lemma, the mapping $h^*$ as defined by $h^*(A, B) = \langle h(A), h(B)\rangle$ is a complete homomorphism of $\mathcal{B}(X, Y, I)$ onto $\mathcal{B}(X, Y, h(I))$.

Remark 7.4 The foregoing theorem is relevant from the application point of view. Suppose we have a concept with truth values from $L_1$. A further analysis on the level of $L_1$ may be (for various reasons, e.g. computational ones) “too precise.” We can then skip to
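a level of $L_2 = h(L_1)$ which is appropriate. Due to the theorem, the structure of the concepts changes systematically, i.e., the structure of concepts in $L_1$ is in a systematic way more precise than that one in $L_2$.

There is an important consequence of Theorem 7.14: the fuzzy concept lattice $\mathcal{B}(X, Y, h(I))$ can be thought of as if obtained from $\mathcal{B}(X, Y, I)$ by factorization, i.e., by the process of abstraction. Namely, the mapping $h^*$ of $\mathcal{B}(X, Y, I)$ to $\mathcal{B}(X, Y, h(I))$ induces an equivalence relation $\theta_{h^*}$ on $\mathcal{B}(X, Y, I)$ by

\[ \langle \langle A_1, B_1\rangle, \langle A_2, B_2\rangle \rangle \in \theta_{h^*} \quad\text{iff}\quad h^*(A_1, B_1) = h^*(A_2, B_2). \]

That is, we can consider a so-called factor set $\mathcal{B}(X, Y, I)/\theta_{h^*}$ of $\mathcal{B}(X, Y, I)$ modulo $\theta_{h^*}$. The elements of $\mathcal{B}(X, Y, I)/\theta_{h^*}$ are classes $[\langle A, B\rangle]_{\theta_{h^*}}$ of pairwise $\theta_{h^*}$-equivalent fuzzy concepts, i.e., $[\langle A, B\rangle]_{\theta_{h^*}} = \{\langle A', B'\rangle \in \mathcal{B}(X, Y, I) \mid \langle \langle A, B\rangle, \langle A', B'\rangle \rangle \in \theta_{h^*}\}$. Moreover, $\theta_{h^*}$ is a complete congruence on $\mathcal{B}(X, Y, I)$. This means that one can define operations on $\mathcal{B}(X, Y, I)/\theta_{h^*}$ in such a way that it becomes a complete lattice: we put

\[ [\langle A_1, B_1\rangle]_{\theta_{h^*}} \wedge [\langle A_2, B_2\rangle]_{\theta_{h^*}} = [\langle A_1, B_1\rangle \wedge \langle A_2, B_2\rangle]_{\theta_{h^*}}, \]
\[ [\langle A_1, B_1\rangle]_{\theta_{h^*}} \vee [\langle A_2, B_2\rangle]_{\theta_{h^*}} = [\langle A_1, B_1\rangle \vee \langle A_2, B_2\rangle]_{\theta_{h^*}}, \]
\[ \bigwedge_{i \in I} [\langle A_i, B_i\rangle]_{\theta_{h^*}} = \Bigl[\bigwedge_{i \in I} \langle A_i, B_i\rangle\Bigr]_{\theta_{h^*}}, \qquad \bigvee_{i \in I} [\langle A_i, B_i\rangle]_{\theta_{h^*}} = \Bigl[\bigvee_{i \in I} \langle A_i, B_i\rangle\Bigr]_{\theta_{h^*}}. \]

The following theorem follows from elementary facts of general algebra.

Theorem 7.15 $\mathcal{B}(X, Y, I)/\theta_{h^*}$ equipped with the above-defined operations is isomorphic to $\mathcal{B}(X, Y, h(I))$.

This means that $\mathcal{B}(X, Y, I)/\theta_{h^*}$ and $\mathcal{B}(X, Y, h(I))$ differ only in relabeling their elements. That is, $\mathcal{B}(X, Y, h(I))$ may be seen as if obtained from $\mathcal{B}(X, Y, I)$ by abstraction.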
7.5 Formal Concept Analysis Demonstrated: Examples
Example 7.10 Our main purpose is to illustrate the methods described above. The example is taken from paleontology and is a simplified version of a larger example examined in a
Figure 7.2 Fossils from Example 7.10.
forthcoming paper by Bělohlávek and Košťák. The input data are hypothetical fossil outlines depicted in Figure 7.2. Each of the fossils consists of a body and a spine. Intuitively, the fossils belong to some general category C (covering them all). Suppose we have no previous
knowledge about what natural categories (subcategories of C) exist. A first approximation of natural subcategories of C may be automatically extracted from the input data using formal concept analysis.

The first step is to write down an appropriate context from the input data. That is, we have to identify objects and (fuzzy) attributes. We naturally take the nine fossils for the objects; we assign numbers 1–9 to them, according to Figure 7.2. Now we have to identify suitable (fuzzy) attributes. Needless to say, identification of attributes of the fossils is an arbitrary process. What makes the fossils look different here are basically two features: first, the size of the spine and, second, the shape of the body. Concerning the first feature, we identify two attributes, “spine small” and “spine big.” These attributes are naturally fuzzy ones. Concerning the shape of the body, the shape goes from circle-shaped to very much oval-shaped. The key feature is thus the ratio length:width. We identify two attributes, “oval-shaped” (length:width big) and “circle-shaped” (length:width small). The attributes and their meaning are summarized in Table 7.4.

We now have to take an appropriate set of truth values and equip it with an appropriate structure. We take $L = \{0, \tfrac12, 1\}$ and will consider two structures defined on $L$, the Łukasiewicz one and the Gödel one. The context is given by Table 7.5. Therefore,

Table 7.4 Attributes of fossils and their meaning.

Abbreviation   Meaning
ss             has a small spine
sb             has a big spine
cs             is circle-shaped
os             is oval-shaped
Table 7.5 Fuzzy context given by fossils and their properties. Spine small ss
Spine big sb
Circle-shaped cs
Oval-shaped os
fossil 1 fossil 2 fossil 3 fossil 4 fossil 5
1 1 1 1 0
0 0 0 0 1
0 0 0
1 1 1 1
fossil 6 fossil 7
0
1
1 2 1 2
1 2 1 2
1 1
0
1 1
0 0
fossil 8 fossil 9
1
0
1 2
1
1 2 1 2
Figure 7.3 Fuzzy concept lattice of the context in Table 7.5 (Łukasiewicz structure).
the set $X$ of objects contains nine elements (denoted by 1,…,9); the set $Y$ of attributes contains four elements (denoted ss, sb, cs, os). The corresponding fuzzy concept lattices are depicted in Figure 7.3 (Łukasiewicz structure on $L$) and Figure 7.4 (Gödel structure on $L$). To gain more insight, the elements (i.e., concepts) of the lattice are identified in Table 7.6 (Łukasiewicz structure) and Table 7.7 (Gödel structure). Table 7.8 shows the similarity relation $E^X_{\mathcal{B}(X,Y,I)}$ on the fossils.

We now illustrate reduction of a fuzzy concept lattice by decrease of logical precision (see Section 7.4). Consider the Gödel structure $\mathbf{L}$ on $L$ and the mapping $h : L \to \{0, 1\}$ defined by

\[ h(0) = 0, \qquad h(\tfrac12) = 1, \qquad h(1) = 1. \]

We can verify that $h$ is a ($\bigwedge$-preserving) homomorphism of $\mathbf{L}$ onto $\mathbf{2}$. The context $\langle X, Y, h(I)\rangle$ is depicted in Table 7.9.
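The verification mentioned above is a finite check and can be sketched in Python. The sketch assumes the standard Gödel and Łukasiewicz operations restricted to $\{0, \tfrac12, 1\}$ and uses the fact that both restrict to the Boolean operations on $\{0, 1\}$, so the same formulas can be applied to the images; `is_homomorphism` and the two dictionaries are hypothetical names. For comparison it also runs the check for the Łukasiewicz structure, where this particular $h$ fails.

```python
from fractions import Fraction as F
from itertools import product

CARRIER = [F(0), F(1, 2), F(1)]
h = {F(0): 0, F(1, 2): 1, F(1): 1}

goedel = {
    "tensor": lambda a, b: min(a, b),
    "residuum": lambda a, b: 1 if a <= b else b,
}
lukasiewicz = {
    "tensor": lambda a, b: max(0, a + b - 1),
    "residuum": lambda a, b: min(1, 1 - a + b),
}

def is_homomorphism(h, ops, carrier):
    # Check all six conditions: join, meet, tensor, residuum, 0 and 1.
    for a, b in product(carrier, repeat=2):
        if h[max(a, b)] != max(h[a], h[b]):
            return False
        if h[min(a, b)] != min(h[a], h[b]):
            return False
        if h[ops["tensor"](a, b)] != ops["tensor"](h[a], h[b]):
            return False
        if h[ops["residuum"](a, b)] != ops["residuum"](h[a], h[b]):
            return False
    return h[carrier[0]] == 0 and h[carrier[-1]] == 1

goedel_ok = is_homomorphism(h, goedel, CARRIER)
lukasiewicz_ok = is_homomorphism(h, lukasiewicz, CARRIER)
```

On a finite chain, preservation of arbitrary infima follows from preservation of binary meets, so no separate check is made for it. The Łukasiewicz check fails, for instance, because $\tfrac12 \otimes \tfrac12 = 0$ there while $h(\tfrac12) \otimes h(\tfrac12) = 1$.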
Figure 7.4 Fuzzy concept lattice of the context in Table 7.5 (Gödel structure).
According to Theorems 7.14 and 7.15, the mapping $h^*$ sending $\langle A, B\rangle \in \mathcal{B}(X, Y, I)$ to $\langle h(A), h(B)\rangle$ is a complete homomorphism of $\mathcal{B}(X, Y, I)$ onto $\mathcal{B}(X, Y, h(I))$. This mapping induces a congruence $\theta_{h^*}$ on $\mathcal{B}(X, Y, I)$ so that two fuzzy concepts $\langle A_1, B_1\rangle$ and $\langle A_2, B_2\rangle$ from $\mathcal{B}(X, Y, I)$ belong to the same class of $\theta_{h^*}$ (are $\theta_{h^*}$-congruent) iff $h(A_1) = h(A_2)$ and $h(B_1) = h(B_2)$. The classes of $\theta_{h^*}$ are depicted in Figure 7.5. The elements of $\mathcal{B}(X, Y, h(I))$ corresponding to the classes of $\theta_{h^*}$ are listed in Table 7.10. The lattice $\mathcal{B}(X, Y, h(I))$ (which is isomorphic to $\mathcal{B}(X, Y, I)/\theta_{h^*}$) is depicted in Figure 7.6.

Consider now the Łukasiewicz structure on $\{0, \tfrac12, 1\}$. We illustrate the factorization by similarity. Consider the α-cut of the induced similarity $E$ on $\mathcal{B}(X, Y, I)$ for $\alpha = \tfrac12$, i.e., consider ${}^{1/2}E$. The tolerance blocks in $\mathcal{B}(X, Y, I)$ (which are, in fact, complete sublattices) are depicted in Figure 7.7. Note that each block is a maximal subset of $L$-concepts which are similar in the degree of at least $\tfrac12$. The corresponding factor lattice $\mathcal{B}(X, Y, I)/{}^{1/2}E$ is depicted in Figure 7.6.

A few remarks on this example follow. We can identify apparently natural concepts. Consider, e.g., the Łukasiewicz structure. Concept no. 14 is naturally described as “a fossil with a small spine and an oval shape.” Concept no. 26 is “a fossil with a circle shape which has a rather small spine.” Concept no. 1 is an example of an (empirically) empty concept (its extent is an empty set). Concept no. 17 and all the concepts between 1 and 17 (i.e., 2, 3, 4, 5, 6, 7, 8, 10, 11, 12), however, do not contain any object in degree 1. One may thus wish to consider them empty concepts as well.
Table 7.6 Fuzzy concepts of the context of Table 7.5 (Łukasiewicz structure).
No.
Extent 5
Intent
1
2
3
4
6
7
8
9
ss
sb
cs
os
1.
0
0
0
0
0
0
0
2.
0
0
0
1 2
0
0
0
0
0
1
1
1
1
0
0
1
1 2
1
3.
0
0
0
0
1 2
1 2
0
1
0
0
1 2
1
1
4.
0
0
0
0
0
0
1 2
1
1 2
0
1
1
1
5.
1 2
1 2
1 2
0
0
0
0
6.
0
0
0
1 2
1 2
0
0
7.
0
0
0
1 2 1 2 1 2
1 2
0
1
1 2
1
0
1 2
1 2 1 2 1 2
1
0
0
1 2 1 2
1 2
1
1 2
1 2 1 2
1
1
0
1 2
1
1
1 2 1 2
0
0
0
1
0
1 2 1 2 1 2
1
8.
0
0
0
0
1 2
9.
1 2 1 2 1 2
1 2 1 2 1 2
1 2 1 2 1 2
1
0
0
1 2
1 2
0
0
0
1 2
0
0 1 2
1 2 1 2 1 2
1 2 1 2
1
1 2
1 2 1 2 1 2
1 2
1 2 1 2 1 2
0
0
1
1
1 2 1 2 1 2
1
12.
0
0
0
1 2 1 2 1 2
13.
0
0
0
0
1
1
14.
1
1
1
1
0
0
0
0
0
1
0
0
1
15.
1 2 1 2 1 2
1 2 1 2 1 2
1
1 2
1 2
0
0
0
1 2
0
1
0
1 2
1 2
0
0
0
1
1
0
1
0
19.
0
0
0
1 2 1 2 1 2 1 2
1 2 1 2
18.
1 2 1 2 1 2 1 2
1 2 1 2 1 2
1
17.
1 2 1 2 1 2
0
1
1 2
20.
0
0
1 2 1 2
1
0
21.
1
22.
1
23.
10. 11.
16.
1
0
0
1 2
1 2
0
0
1
1
0
1 2 1 2 1 2 1 2
1
1
1
1 2 1 2
1
1
1 2 1 2
1 2 1 2
0
0
0
1
1
1
0
0
1 2 1 2 1 2 1 2
1 2 1 2 1 2 1 2
1
1 2
1 2
1
0
0
1
1
0
0
0
1 2 1 2
1 2 1 2
1
27.
1
28.
0
0
0
1 2 1 2 1 2 1 2
1 2 1 2 1 2 1 2
26.
1 2 1 2 1 2 1 2
1 2 1 2 1 2 1 2
1
1
29.
1
1
1
1
1 2
1 2
30.
1
1
1
1
0
0
31. 33.
1 2 1 2 1 2
1 2 1 2 1 2
1 2 1 2 1 2
34.
0
0
35.
1
1
24. 25.
32.
1
1 2 1 2
1 2 1 2
0
0
1
1 2 1 2
1
0
0
1 2
0
1 2 1 2
1
1
0
0
1
1 2 1 2
1
1
1 2 1 2
1 2 1 2
1 2 1 2 1 2 1 2
0
1
0
1
1
0
1 2
1
0
1 2 1 2 1 2
1 2 1 2 1 2
1 2 1 2
1 2
0
0
1 2
1
1
0
0
0 1 2
0
0 1 2
0
1
1
1
1
1 2
1 2
1
1
1
1 2
1
1
1
1
1 2
0
1 2
0
1 2 1 2
1 2 1 2 1 2
1
1
1
1
1
0
0
1
0
1
1
1
1
1 2
1 2
1 2
0
0
0
1 2
1 2
1
1
1
1 2
0
0
0
1 1
1 1
1 1
1 1
0 0
0 0
1 2
0 0
36.
1
1
1
1
1 2
37. 38.
1 2
1 2
1 2
1
1
1
1 1
1 1
1 2
0
0 0
0
0
Table 7.7 Fuzzy concepts of the context of Table 7.5 (Gödel structure). No.
1. 2. 3. 4. 5. 6.
Extent
Intent
1
2
3
4
5
6
7
8
9
ss
sb
cs
os
0 0 0 0 0 0
0 0 0 0 0 0
0 0 0 0 0 0
0
0 0 0
0 0 0
0 0
0 0
1 2
1 2
1 2
1 2
0 0
0 0
0 0
1 2
1 2
1 1 1 0 1 1
1 0 1 1 0 0
1 1 1 1
0 0
0 0 0 0 0 1
1 1 0 1 1 0
1 2
1 2
0 1 1 0 0
0 1 0
0
0 1 1 0 0
0 1 0
0
0 0 0 1 1
1 2
1 2
1 2
1 2
0 0 1 1
1 1 0 0
1 1 1 1 0
1
1 2
1 2
0
0
0 0
1 1 0
7. 8. 9. 10. 11. 12.
0 0 0 0 1 0
0 0 0 0 1 0
0 0 0 0 1 0
13. 14.
0 0
0 0
0 0
15. 16. 17. 18. 19.
0 0 1 1 0
0 0 1 1 0
0 0 1 1 0
20. 21. 22. 23. 24. 25.
0 0 1 1 0 1
0 0 1 1 0 1
0 0 1 1 0 1
1 2
0 0 1 1 2 1 2
1 2 1 2
0 1 1 1 1 1 2
1 1 1 1
0
0
1 2
1 2
0 0 0 0 0 1
0 1
0 1
0 1
0
1 1 0
0 1
0 1
1 2
1 2
1 2
1 2
0
0 1
0 1
0 0 1 0
0
0 0 1 0 1
1 1 0 1 1 1
1 1 0 1 1 1
0 1 1 0 1 1
0 1 1 0 1 1
0 1 1 0 1 1
0 0
1 2
1 2
1 2
0 0 0
0 1 2
0 0 0 0 0 0 0 0 0
1 2
1
1 2 1 2
1 1 1 0 0 1 2 1 2
1 0 1 2
0 1 0 1 1 2 1 2
0 0 1 0 1 2
1 0 0
0 0
1 2
0 0
0
1 2
This is possible if one goes from the original fuzzy concept lattice $\mathcal{B}(X, Y, I)$ to the lattice which results from $\mathcal{B}(X, Y, I)$ by factorization modulo the induced similarity ${}^{1/2}E$. Indeed, as we can see from Figure 7.3, the concepts between 1 and 17 form one block of ${}^{1/2}E$ and can thus be considered one concept in the factor lattice. The same applies to other similar concepts of $\mathcal{B}(X, Y, I)$. We thus see that the factorization modulo ${}^{1/2}E$ makes the concept lattice smaller in that similar concepts which need not be distinguished are put together.
Table 7.8 Similarity $E^X_{\mathcal{B}(X,Y,I)}$ on fossils from Figure 7.2 and Table 7.5.
fossil 1 fossil 2
fos. 1
fos. 2
fos. 3
fos. 4
fos. 5
fos. 6
fos. 7
fos. 8
fos. 9
1
1
1
0
0
0
0
0
1
1
1 2 1 2 1 2
0
0
0
0
0
0 0 1
0 0 1
0 0
0 0
1 2
1
1
1 2 1 2
0 0 0
1
1
fossil 3 fossil 4 fossil 5
1
1
fossil 6 fossil 7 fossil 8
0 1 2 1 2
1
fossil 9
1
Table 7.9 Context $\langle X, Y, h(I)\rangle$ corresponding to $\langle X, Y, I\rangle$ from Table 7.5.

           Spine small   Spine big   Circle-shaped   Oval-shaped
           ss            sb          cs              os
fossil 1   1             0           0               1
fossil 2   1             0           0               1
fossil 3   1             0           0               1
fossil 4   1             0           1               1
fossil 5   0             1           1               1
fossil 6   0             1           1               1
fossil 7   1             1           1               0
fossil 8   1             1           1               0
fossil 9   1             0           1               0
Note that both $\mathcal{B}(X, Y, h(I))$ obtained from $\mathcal{B}(X, Y, I)$ (Gödel structure) and $\mathcal{B}(X, Y, I)/{}^{1/2}E$ obtained from $\mathcal{B}(X, Y, I)$ (Łukasiewicz structure) are isomorphic. Note also that, taking the Gödel structure, ${}^{1/2}E$ is just the partition induced by the homomorphism $h^*$ induced above, i.e., $\mathcal{B}(X, Y, I)/{}^{1/2}E$ again is isomorphic to the lattice in Figure 7.6.

Example 7.11 The next example illustrates attribute implications. Table 7.11 shows the input data. There are 17 objects (denoted by 1–17) and 36 attributes (denoted by a–J). In this case, the structure of truth values is the two-element Boolean algebra, i.e., only
Figure 7.5 Classes of the congruence relation induced by h.
truth values 0 (false) and 1 (true) are employed. The objects are extinct cephalopods (identified by paleontologists) as listed in Table 7.12. The organisms possess attributes which are listed in Table 7.13. In view of our purpose we do not comment on the appropriateness of these attributes from the paleontological point of view. Bělohlávek & Košťák (in preparation) contains more information and details.

There is a relatively large number of attributes, obviously more than a human mind can grasp at once. Therefore, it is desirable to get information that gives us additional insight. A suitable form is that of attribute dependencies. Tables 7.14, 7.15, and 7.16 show an irredundant basis of all attribute implications that are valid in the input data (that is, none of the implications follows from the others). The implications are to be read as follows. Implications correspond to rows in the table (in order to fit the page, tables are split into two parts: one corresponding to attributes a–r and one corresponding to attributes s–J; therefore, each row is split into two parts). “A” denotes that the corresponding attribute belongs to the antecedent of the implication, “C” denotes that the attribute belongs to the consequent. Thus, the first row says that the implication {J}⇒{p} is true in the input data; the eighth row says that the implication {A,B}⇒{l} is true in the input data, etc.
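What it means for an implication to be true in the input data can be made concrete with a short Python sketch. An implication holds if every object that has all attributes of the antecedent also has all attributes of the consequent. The five columns below are transcribed from Table 7.11 under the assumption that its columns appear in the order a, b, …, z, A, …, J over objects 1–17; the function names `column` and `holds` are hypothetical.

```python
def column(bits):
    # One attribute column of Table 7.11: values for objects 1..17.
    return [int(b) for b in bits.split()]

# Columns l, p, A, B, J as read from Table 7.11 (column order is an assumption).
TABLE = {
    "l": column("0 1 1 0 1 1 1 1 1 0 1 1 1 1 1 1 1"),
    "p": column("0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1"),
    "A": column("1 1 1 0 1 1 1 1 1 1 1 1 0 0 1 1 1"),
    "B": column("0 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1"),
    "J": column("0 1 1 1 1 1 0 1 1 1 0 1 0 1 0 0 1"),
}

def holds(antecedent, consequent, table=TABLE):
    # True iff each object with all antecedent attributes has all consequent attributes.
    n_objects = len(next(iter(table.values())))
    return all(
        all(table[c][i] for c in consequent)
        for i in range(n_objects)
        if all(table[a][i] for a in antecedent)
    )

first_row = holds({"J"}, {"p"})         # implication {J} => {p}
eighth_row = holds({"A", "B"}, {"l"})   # implication {A, B} => {l}
converse = holds({"p"}, {"J"})          # the converse of the first row
```

The converse {p}⇒{J} fails on these columns because object 7 has attribute p but not J, which shows that implications are directional.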
Table 7.10 Concepts of $\mathcal{B}(X, Y, h(I))$.

No.   Extent                Intent
      1 2 3 4 5 6 7 8 9     ss sb cs os
1.    0 0 0 0 0 0 0 0 0     1  1  1  1
2.    0 0 0 1 0 0 0 0 0     1  0  1  1
3.    0 0 0 0 0 0 1 1 0     1  1  1  0
4.    0 0 0 0 1 1 0 0 0     0  1  1  1
5.    1 1 1 1 0 0 0 0 0     1  0  0  1
6.    0 0 0 1 0 0 1 1 1     1  0  1  0
7.    0 0 0 1 1 1 0 0 0     0  0  1  1
8.    0 0 0 0 1 1 1 1 0     0  1  1  0
9.    1 1 1 1 0 0 1 1 1     1  0  0  0
10.   1 1 1 1 1 1 0 0 0     0  0  0  1
11.   0 0 0 1 1 1 1 1 1     0  0  1  0
12.   1 1 1 1 1 1 1 1 1     0  0  0  0
Figure 7.6 Lattice $\mathcal{B}(X, Y, h(I))$. Factor lattice $\mathcal{B}(X, Y, I)/{}^{1/2}E$.
Computation of concept lattices and attribute implications is too extensive to be done by hand and, therefore, requires the use of a computer. In our examples we used a software tool that is being developed jointly in the Department of Computer Science, Palacký University, Olomouc (Czech Republic), and in the Department of Computer Science, Technical University of Ostrava (Czech Republic).
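For small crisp contexts the computation is easy to sketch. The Python fragment below is not the software tool mentioned above; it is a minimal illustration that uses the standard fact that the intents of a crisp context are exactly the intersections of object intents together with the full attribute set. The rows are those of Table 7.9, and the names `extent` and `concepts` are hypothetical.

```python
ATTRS = ("ss", "sb", "cs", "os")

# Rows of Table 7.9, one bit per attribute in the order ss, sb, cs, os.
ROWS = {
    1: "1001", 2: "1001", 3: "1001", 4: "1011", 5: "0111",
    6: "0111", 7: "1110", 8: "1110", 9: "1010",
}

context = {
    obj: frozenset(a for a, bit in zip(ATTRS, bits) if bit == "1")
    for obj, bits in ROWS.items()
}

def extent(intent):
    # All objects that have every attribute of the given intent.
    return frozenset(obj for obj, attrs in context.items() if intent <= attrs)

def concepts():
    # Close the family of intents under intersection with each object intent.
    intents = {frozenset(ATTRS)}
    for attrs in context.values():
        intents |= {attrs & intent for intent in intents}
    return {(extent(intent), intent) for intent in intents}

lattice = concepts()
```

On this context the sketch yields twelve concepts, in agreement with Table 7.10; for instance the pair with extent {4} and intent {ss, cs, os} corresponds to row 2 of that table. The approach is exponential in the worst case and is meant only for examples of this size.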
Figure 7.7 Blocks of the tolerance relation ${}^{1/2}E$ on the concept lattice of Figure 7.3.
Table 7.11 Fuzzy context given by fossils and their properties.

     a b c d e f g h i j k l m n o p q r s t u v w x y z A B C D E F G H I J
1    0 0 1 1 0 0 0 0 1 0 0 0 0 1 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0
2    1 1 1 1 1 1 1 0 0 1 1 1 0 0 0 1 0 1 0 0 1 0 0 0 0 0 1 1 0 0 1 1 0 0 0 1
3    1 1 1 0 1 1 1 0 0 0 1 1 0 0 0 1 0 1 0 0 1 1 0 0 0 0 1 1 0 0 1 1 0 1 1 1
4    0 1 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 1 0 0 1 1 0 0 0 0 0 1 0 0 0 0 0 0 1 1
5    1 1 1 0 1 0 0 0 0 0 1 1 0 0 0 1 0 0 0 1 0 0 1 0 0 0 1 1 0 0 0 0 0 1 1 1
6    1 0 0 0 0 1 0 0 0 0 0 1 0 0 0 1 0 0 0 1 0 1 0 0 0 0 1 1 0 0 0 1 0 0 0 1
7    1 1 1 0 1 0 0 0 0 0 1 1 0 0 0 1 1 0 0 0 1 0 0 0 0 0 1 1 0 0 1 0 0 0 0 0
8    0 1 1 0 1 0 0 0 0 0 1 1 0 0 0 1 0 1 0 0 1 1 0 0 0 0 1 1 0 0 0 1 0 1 1 1
9    0 1 0 0 1 0 0 0 0 0 1 1 0 0 0 1 0 1 0 0 1 0 0 0 0 0 1 1 0 0 1 0 0 1 1 1
10   0 1 0 0 0 0 1 0 0 0 0 0 1 0 0 1 0 0 0 1 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1
11   0 1 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0 0 1 0 1 0 0 0 0 0 1 1 0 0 1 0 0 1 1 0
12   0 1 0 1 0 0 1 0 0 0 0 1 0 0 0 1 0 0 0 1 0 1 0 0 0 0 1 1 0 0 0 0 0 0 0 1
13   0 1 1 0 0 1 1 0 0 0 0 1 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 1 0 0 0 1 0 0 0 0
14   0 1 0 0 1 0 0 0 0 0 0 1 0 0 0 1 0 0 0 1 1 1 0 0 0 0 0 1 0 0 0 1 0 0 0 1
15   0 1 1 0 0 0 0 0 0 0 1 1 0 0 0 1 0 0 0 1 0 1 0 0 0 0 1 1 0 0 1 0 0 1 1 0
16   0 0 1 1 0 1 0 0 0 0 0 1 0 0 0 1 0 0 1 1 0 1 0 0 0 0 1 1 0 0 0 1 0 1 1 0
17   0 1 0 0 1 0 0 0 0 0 0 1 0 0 1 1 0 0 0 1 1 0 0 0 1 1 1 1 0 0 1 0 0 1 1 1

Table 7.12 Objects of the context of Table 7.11.

1    Actinocamax verus antefragilis
2    Praectinocamax primus
3    Praectinocamax plenus
4    Praectinocamax plenus cf. strehlensis
5    Praectinocamax triangulus
6    Praectinocamax aff triangulus
7    Praectinocamax sozhenzis
8    Praectinocamax contractus
9    Praectinocamax planus
10   Praectinocamax coronatus
11   Praectinocamax matesovae
12   Praectinocamax medwedicicus
13   Praectinocamax sp.1
14   Praectinocamax sp.2
15   Goniocamax intermedius
16   Goniocamax surensis
17   Goniocamax volgensis
Table 7.13 Attributes of the context of Table 7.11.
a b c d e f g h i j k l m n o p q r s t u v w x y z A B C D E F G H I J
rostra big rostra medium rostra small cigar shape in dorsoventral view lanceolat in dorsoventral view little lanceolat in dorsoventral view subcylindric in dorsoventral view conic in dorsoventral view cigar shape in lateral view lanceolat in lateral view little lanceolat in lateral view subcylindric in lateral view conic in lateral view flat lateral flat dorsal flat ventral alveolar fracture highly conic alveolar fracture lowly conic pseudoalveol flat pseudoalveol deep cut of alveolar fracture oval cut of alveolar fracture oval-triangular cut of alveolar fracture triangular cut alveolar fracture circle-shaped conellae join dorsolateral line dorsolateral listel rostrum granulation rostrum granulation partly rostrum striation rostrum striation partly vessel engram vessel engram rarely mucro ventral line
Table 7.14 Minimal base of implications of the context of Table 7.11 (first part).

     a b c d e f g h i j k l m n o p q r
1    . . . . . . . . . . . . . . . C . .
2    . . . . . . . . . . . . . . . C . .
3    . . . . . . . . . . . . . . . . . .
4    . . . . . . . . . . . . . . . . . .
5    . . . . . . . . . . . . . . . C . .
6    . . . . . . . . . . . A . . . . . .
7    . . . . . . . . . . . C . . . . . .
8    . . . . . . . . . . . C . . . . . .
9    . . . . . . . . . . . C . . . . . .
10   . . . . . . . . . . . A . . . . . .
11   . . . . . . . . . . . C . . . . . .
12   . . . . . . . . . . . . . . . . . .
13   . . . . . . . . . . . . . . . . . .
14   . . . . . A . . . . . . . . . . . .
15   . . . . . C . . . . . . . . . . . .
16   . . . A . . . . . . . . . . . . . .
17   . . . A . . . . . . . C . . . A . .
18   . . . A . . . . . . . . . . . . . .
19   . . . A . . . . . . . . . . . . . .
20   . . C . . . . . . . . . . . . . A .
21   . . A . . . . . . . . C . . . A . .
22   . . C . . . . . . . . . . . . . . .
23   . . A . . . . . . . . . . . . . . .
24   . . . . . . . . . . . . . . . . . .
25   . . A . . . . . . . . . . . . . . .
26   . . A . . C . . . . . . . . . . . .
27   . . . . . . . . C . . . . . . . . .
28   . . . . . . . . . . . . . A . . . .
29   . . . C . . . . A . . . . . . . C .
30   . . . A . . . . . . . . . C . . A .
31   . . C A . . . . . . . . . . . . . .
32   . . A A . C . . . . . A . . . . . .
33   . . C . . . . . . . . . . . . . . .
34   . . C A . . . . . . . . . . . . . .
35   . . A C . . . . . . . . . . . . . .

Table 7.14 Continued

     s t u v w x y z A B C D E F G H I J
1    . . . . . . . . . . . . . . . . . A
2    . . . . . . . . . A . . . . . . . .
3    . . . . . . . . . C . . . . . . A .
4    . . . A . . . . . C . . . . . . . .
5    . A . . . . . . . . . . . . . . . .
6    . . . . . . . . . C . . . . . . . .
7    . . . . . . . . . . . . . A . . . .
8    . . . . . . . . A A . . . . . . . .
9    . . . . . . . . . . . . . . . A C .
10   . . . . . . . . C . . . . . . C A .
11   . A . . . . . . . A . . . . . . . .
12   . A . C . . . . . . . . . A . . . .
13   A . . . . . . . . . . . . . . C . .
14   . . . . . . . . . . . . . C . . . .
15   . A . . . . . . A . . . . A . . . .
16   . . . . . . . . C . . . . . . . . .
17   . . . . . . . . . . . . . . . . . .
18   . C . A . . . . . . . . . . . . . .
19   . A . C . . . . . . . . . . . . . .
20   . . . . . . . . C . . . . . . . . .
21   . . . . . . . . . . . . . . . . . .
22   . . . A . . . . . . . . . . . A . .
23   . . . A . . . . A . . . . . . C . .
24   . . . C . . . . . . . . . A . A . .
25   . A . . . . . . A . . . . . . C . .
26   . A . . . . . . . . . . . A . . . .
27   . . . . . A . . . . . . . . . . . .
28   . . . . . C . . . . . . . . . . . .
29   . . . . . . . . . . . . . . . . . .
30   . . . . . . . . . . . . . . . . . .
31   . . . . . . . . . . . . . A . . . .
32   . . . . . . . . . . . . . . . . . .
33   A A . . . . . . . . . . . . . . . .
34   . . . . . . . . . . . . . . . A . .
35   A . . . . . . . . . . . . . . . . .

Table 7.15 Minimal base of implications of the context of Table 7.11 (second part).
36 37 38 39 40 41 42
a
b
c
d
e
f
g
h
i
j
k
l
m
n
o
p
q
r
. . . . . . .
. A C . C . C
A . . . . . .
. . . . . . .
. . . . . . .
A . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . .
. . . . . . A
. . . . C . C
. . . . . . .
. . . . . . .
. . . . . . .
. C . . . . .
. . . . . . .
. . . . . . .
continued
232
7 Formal Concept Analysis in Geology
Table 7.15 Continued
43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70
36 37 38 39 40 41 42 43 44 45
a
b
c
d
e
f
g
h
i
j
k
l
m
n
o
p
q
r
. . . . . . . . . . . . . . . . . . . . . . . . . . . .
C . . . . . . . C . . . . . . A . . . . . . A . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . A C C A C A . C A A C A A A C . . C A C
. . . . . . . . . . . . . . . . . . . . . . . . . . . .
A A A A C A A A . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . A
. . . . . . C A . . . . . . . . . . . A . . . . . . . .
. . . . A C . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . C . A . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . A C A . . . . . . . . . . . . .
s
t
u
v
w
x
y
z
A
B
C
D
E
F
G
H
I
J
C . . . . . . . . .
A . . A . A . . . .
. . A A . . . . . .
. . . . . . . . . .
. . . . . . . . . .
. . . . . . . . . .
. . . . . . . . . .
. . . . . . . . . .
A . . . C . C . C A
. . . . . . . . . .
. . . . . . . . . .
. . . . . . . . . .
. . . . A A . . . .
. . . . . . . . . .
. . . . . . . . . .
. . . . . C . . . .
. . . . . . . . . .
. . . C . . . . A C
7.5 Formal Concept Analysis Demonstrated: Examples Table 7.15 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70
. . . . . . . . . . . . . . . . . . . A . . . . .
Table 7.16
71 72 73 74 75 76 77 78 79 80
233
Continued . C A . A . . . . . . . . . . . . A A . . . C A .
A C A . . . . A C C . . . C C . . . . . . . . A .
. . . . C . . . A . A A . . . . A . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . A . C .
. . . . . . . . . . . . . . . . . . . . A C . . .
C . . . . . . . . . . . . . . . . . A . . . . A .
. . . A . C . A . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . A A . . . C . . C . .
. . . . . . . . . . . . A A . . C . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . C . . . . . .
. . . . . . A . . . A C . . . . . . . . . . . . .
. . . . . . A . C C . . A C . A . C . . . . . . A
Minimal base of implications of the context of Table 7.11 (third part).
a
b
c
d
e
f
g
h
i
j
k
l
m
n
o
p
q
r
. . . . . . . . . .
. . . . . C A A A .
. . . . . . . . A C
. . . . . A A C . .
A . . . . . . . . .
. . . . . . . . . .
. . . . . . C . . .
. . . . . . . . . .
. . . . . . . . . .
. . . A A . . . . .
A C A . . . . . C A
. A . . . . . . . .
. . . . . . . . . .
. . . . . . . . . .
. . . . . . . . . .
. . . . . . . . . .
. . . . . . . . . .
. A C C . . . . . .
continued
234
7 Formal Concept Analysis in Geology
Table 7.16 Continued
81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113
71 72 73 74 75 76
a
b
c
d
e
f
g
h
i
j
k
l
m
n
o
p
q
r
. . . . . . . . . . . A A C A A A A A C A C . C C . A . . . . . .
. . . . . A . . . A . . . . . . C A C . . . . . . . . . . . . . .
C A C . C . A A A C A . . . . . A C . A . C A . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . C A . . . . . .
. . . . . . . A C C . . . . . . . C . A . . A . A . . . . . . . .
. . . . . A C . . . . . . A C C . . . . . . . . . . . . . . C . A
. . . . A C A . . . . . . . . . . . . . . . . . A . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . C . . A C .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C . .
. . . . . . . . . . . . . . . . . . . . . . . . . A C . . . . . .
. . A A . . . C . . . . . . . . . . . . . . . . . A . . . . . . .
. . . . . . . . . . . C . . . . . . . . . . . A . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . C
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . A . . . . . . . . A
. . . . . . . . . . A . . . . . . . . . . . . . . . . . . . . . .
s
t
u
v
w
x
y
z
A
B
C
D
E
F
G
H
I
J
. . . . . .
. . . . . .
. . A . . .
. . . . C .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
. . . . . .
A . . . . .
. . . . A .
C . A . . A
continued
7.5 Formal Concept Analysis Demonstrated: Examples Table 7.16 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A .
235
Continued . A . . . . A A . . . . . . . . A . . . . . . . . C A . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . A . . C . . . . . . . . .
. A . A A C . A . . . . . . . . . . . A . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . A C . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. A A . . . . . . . . . . A . C . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A . . C
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A C . . .
. . . . A A . C . . . . . . . . . . . . . . . A C . . . . . . . . . . . .
. . . . . . . . A . . . . A C . . . A . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A C . . . .
. . . . . A . . . . . . . . . . . . . . . . A . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. A . . . . . . . . . . A . . . C A C . . . . . . . . . . . . . . . . A .
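One common use of a base of attribute implications such as the one tabulated above is to close a set of attributes under it. The following Python sketch is only an illustration under stated assumptions: it assumes that an entry A marks an attribute in the premise of an implication and an entry C an attribute in its conclusion, and the function name `closure` and the choice of implications 1 to 4 as sample data are introduced here for the example.

```python
def closure(attributes, implications):
    """Return the smallest superset of `attributes` that respects every implication.

    `implications` is a list of (premise, conclusion) pairs of attribute sets.
    """
    closed = set(attributes)
    changed = True
    while changed:
        changed = False
        for premise, conclusion in implications:
            # An implication fires when its whole premise is already present.
            if premise <= closed and not conclusion <= closed:
                closed |= conclusion
                changed = True
    return closed


# Implications 1 to 4 of Table 7.14 under the assumed reading
# (A = premise, C = conclusion); attribute labels as in Table 7.13.
SAMPLE_BASE = [
    ({"J"}, {"p"}),
    ({"B"}, {"p"}),
    ({"I"}, {"B"}),
    ({"v"}, {"B"}),
]

if __name__ == "__main__":
    print(sorted(closure({"I"}, SAMPLE_BASE)))
```

Under this reading, starting from attribute I the sketch adds B (implication 3) and then p (implication 2); a set that matches no premise is returned unchanged.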
Acknowledgments
The work has been supported by the project Kontakt ME 468 of the MŠMT of the Czech Republic as the international supplement to the NSF project “Stratigraphic simulation using fuzzy logic to model sediment dispersal” and partly by GAČR 201/99/P060 and GAČR 201/02/P076. The author thanks Dr Martin Košťák for providing him with paleontological data.
Chapter 8
Fuzzy Logic and Earthquake Research
Chongfu Huang
8.1 Introduction
8.2 Basic Terminology in Earthquake Research
8.2.1 Earthquake and seismology
8.2.2 Earthquake engineering
8.3 Fuzzy Logic in Earthquake Prediction
8.3.1 Direct method of fuzzy pattern recognition in earthquake prediction
8.3.2 Fuzzy pattern for earthquake prediction based on seismicity indices
8.4 Fuzzy Logic in Earthquake Engineering
8.4.1 Fuzzy earthquake intensity
8.4.2 Estimating earthquake damage with fuzzy logic
8.5 Hybrid Fuzzy Neural Networks with Information Diffusion Method
8.5.1 Fuzzy relationships given by the information diffusion method
8.5.2 Pattern smoothing
8.5.3 Learning relationships by BP neural networks
8.5.4 An application
8.6 Conclusion and Discussion
Appendix 8.A: Modified Mercalli Intensity Scale Used in China
Acknowledgments
References
8.1 Introduction
An earthquake is a sudden, rapid shaking of the Earth caused by the breaking and shifting of rock beneath the Earth’s surface. Ground shaking from earthquakes can collapse buildings and bridges; disrupt gas, electric, and phone services; and sometimes trigger landslides, avalanches, flash floods, fires, and tsunamis. No story will ever be written that will tell the awfulness of a few hours following a terrible earthquake such as the San Francisco earthquake, 1906 (April 18 at 5:15 a.m., measuring 8.25 on the Richter scale), or the Richter magnitude 7.8
Tangshan earthquake on July 28, 1976, that destroyed almost all of the city and killed 242,000 people (investigated by the China Academy of Building Research [1986]). No pen of the most powerful descriptive power could ever place on paper the impression of any one of the hundreds of thousands who felt the mighty Earth tremble. No pen can record the sufferings of those who were crushed to death or buried in the ruins that encompassed them in the instant after a destructive earthquake. Although the world’s largest earthquakes do have a clear spatial pattern, we are not able to predict individual earthquakes. Some Chinese seismologists claimed that the Haicheng earthquake (February 4, 1975, with Richter magnitude 7.3) was successfully predicted. However, many scientists deny it (see Geller et al. [1997]). Storms approach, fires spread, floods migrate downstream after large rainfalls, but earthquakes can turn a perfectly normal day into disaster in seconds. There is a growing possibility that in the near future we will be able to correctly predict earthquakes that have obvious precursors. Unfortunately, most earthquakes do not have obvious precursors such as changes in land elevation, changes in groundwater levels, widespread reports of peculiar animal behavior, and foreshocks. A contemporary problem facing seismologists is to explore more powerful tools for monitoring seismic activity and analyzing collected geophysical data and other related information. Seismologists have developed several approaches based on fuzzy logic for analyzing earthquake data [Feng et al., 1996; Junji & Feng, 1995]. Furthermore, a new branch of seismology called fuzzy seismology has been proposed by Feng et al. [1992b]. 
Fuzzy seismology includes the following ten fuzzy methods applied in earthquake prediction: (1) fuzzy pattern recognition; (2) fuzzy clustering analysis; (3) fuzzy information retrieval; (4) fuzzy similarity choice; (5) fuzzy multi-factorial evaluation; (6) fuzzy reasoning; (7) fuzzy self-similarity analysis and fuzzy fractal dimension; (8) gray fuzzy prediction; (9) fuzzy neural network; and (10) fuzzy analyzing and processing software systems. Lin and Sanford [2001] incorporated a fuzzy logic algorithm into the location program SEISMOS to increase stability in locating regional earthquakes. This technique was converted into a computer subroutine as an initial hypocenter estimator and incorporated into SEISMOS in early 1994 at the New Mexico Institute of Mining and Technology [Lin, 1994]. The Fuzzy/SEISMOS combination has proved to be very effective in locating earthquakes. The current success rate in earthquake prediction is very low. Even if we might one day successfully predict any earthquake, it is impossible to move the buildings or the property within buildings away from the area of a predicted earthquake. We are able to mitigate earthquake disasters by considering earthquake loads when designing and constructing buildings. This discipline is a natural branch of civil engineering, called earthquake engineering. The use of fuzzy sets in civil engineering in the USA was alluded to in 1971 [Brown & Leonards, 1971], but little serious work occurred until the advent of Blockley’s 1975 and 1977 papers. These UK studies had a profound effect on work in the USA where concern had existed about the inclusion of semantic wisdom,
as opposed to countable information, into building safety and design reliability considerations [Brown, 1985]. An immediate application is the assessment of seismic damage [Yao, 1980]. This led to the discovery by English-speaking workers of the existing fuzzy set literature in Japan [Tokyo Institute of Technology, 1975–1985] and, in particular, to applications of fuzzy set theory to earthquake engineering in China [Feng et al., 1982; Liu, 1982; Liu & Dong, 1982; Tian, 1983; Wang, F., 1983; Wang, 1984; Huang & Liu, 1985; Liu et al., 1985; He & Guo, 1988; Huang & Xiu, 1988]. In the early 1980s, some scientific researchers working in earthquake-resistant design came in touch with fuzzy logic and it quickly became a powerful tool to incorporate the copious imprecise knowledge in this field quantitatively and in a more scientific fashion. For example, Wang [1982] suggested a fuzzy method for determining the value of certain parameters used in structure design, so as to avoid the problem of increasing or decreasing the design parameters by a factor of 2. Chiang and Dong [1987] suggested that when available data on structural parameters are crude and do not support a rigorous probabilistic model, the fuzzy set approach should be considered in view of its simplicity. To select the best design strategy, Wu and Wang [1988] developed a fuzzy approach to reasonably relate the condition of construction, the environment of the structure, the probability of foundation damage, the arrangement of horizontal and vertical planes of the structure, and so on. Subramaniam et al. [1996] presented one of the first experimental applications of fuzzy control to a building structure, showing the feasibility of the implementation of fuzzy logic to highly nonlinear problems. 
For efficient fuzzy information processing relevant in earthquake engineering, Huang [1997] suggested the principle of information diffusion to deal optimally with recognizing the underlying relationships from a small sample. Properties of information diffusion estimators on probability density functions confirm that the principle is true. The simplest diffusion function is the 1-dimensional linear information distribution [Liu & Huang, 1990]. Huang [2000] demonstrated that the work efficiency of the method of information distribution is about 28% higher than the histogram method. In other words, we can use a 28% smaller sample size and obtain the same accuracy by the method of information distribution. If we need a sample of 30 observations for the histogram method, a sample with 22 observations can give an estimate by the method of information distribution with the same accuracy. However, we never eliminate the imprecision. Huang [1998b] studied the possibility of using the method of information distribution to calculate a fuzzy probability distribution to represent the imprecision. In this chapter we review some contemporary methodologies of fuzzy logic for earthquake research. The focus is on applications of fuzzy logic developed by seismologists rather than civil engineers. The chapter is organized as follows. To help the readers who are unfamiliar with earthquake research, Section 2 introduces basic terminology of seismology and earthquake engineering. Section 3 briefly presents two fuzzy methods for earthquake prediction. Section 4 reviews the studies of fuzzy earthquake intensity and
the estimation of earthquake damage with fuzzy logic. In Section 5, we give one example of a hybrid fuzzy neural network that estimates the relationship between isoseismal area and earthquake magnitude. The chapter is then summarized with a conclusion in Section 6.
8.2 Basic Terminology in Earthquake Research
8.2.1 Earthquake and seismology
An earthquake is the vibration of the Earth produced by the rapid release of energy. This energy radiates in all directions from its source, the focus, in the form of waves. Just as the impact of a stone sets water waves in motion, an earthquake generates seismic waves that radiate throughout the Earth. Even though the energy dissipates rapidly with increasing distance from the focus, instruments located throughout the world can record the event. Whereas an earthquake is a natural phenomenon, seismology is the branch of Earth science concerned with the study of natural earthquakes, man-made earthquakes, and related phenomena such as underground nuclear bomb testing.
Early attempts to establish the size or strength of earthquakes relied heavily on subjective description. There was an obvious problem with this method because people’s accounts vary. Thus it was difficult to make an accurate determination of the quake’s absolute strength. In 1902, a fairly reliable intensity scale based on the amount of damage caused to various types of structure was developed by Giuseppe Mercalli [Tarbuck & Lutgens, 1991]. A modified form of this tool is presently used in China (see Appendix 8.A). By definition, the earthquake intensity (also called seismic intensity) is a measure of the effects of a quake at a particular location. It is important to note that earthquake intensity depends not only on the absolute strength of the earthquake, but also on other factors. These factors include the distance from the epicenter (the vertical projection of the focus on the surface), the nature of the surface materials, and building design.
The intensity at the epicenter is called epicentral intensity.
In 1935, Charles Richter of the California Institute of Technology introduced the concept of earthquake magnitude when attempting to rank earthquakes of southern California [Berlin, 1980]. The earthquake magnitude is basically a relative scale [Kasahara, 1981]. It defines a standard size of earthquake and rates other earthquakes in a relative manner by their maximum amplitude of ground motion under identical observational conditions. This is evident from Richter’s definition:
$$M = \log\frac{A(\Delta)}{A_0(\Delta)} = \log A(\Delta) - \log A_0(\Delta), \tag{8.1}$$
where Δ is the epicentral distance, and A0 and A denote the maximum trace amplitudes, on a specified seismograph, of the standard event and the one to be measured, respectively. The standard earthquake, i.e., M = 0 (= log 1) in Richter’s formula, is such as to give the maximum trace amplitude of 0.001 mm on a Wood–Anderson-type seismograph at Δ = 100 km.
The intensity scale differs from the Richter magnitude scale in that the effects of any one earthquake vary greatly from place to place, so there may be many intensity values measured from one earthquake. Each earthquake, on the other hand, should have just one magnitude, although different methods of estimating it will yield slightly different values.
Earthquake prediction [Aki, 1995] is usually defined as the specification of the time, location, and magnitude of a future earthquake within stated limits. Earthquake activity is called seismicity and is usually described by statistical indices called seismicity indices. In China, seismicity indices are calculated each month from the data for the last 100 events before the end of the month in question. The following three indices are commonly used to quantify earthquake activity in an area: (1) the b value; (2) the η value; (3) the c value.
1. The b value is a basic characteristic of the seismicity rate in an area, which can be calculated by using the earthquake data set collected in the studied area. The b value is a parameter in the Gutenberg–Richter law [Gutenberg & Richter, 1944],
$$\log_{10} N = a - bM, \tag{8.2}$$
that determines the correlation between the magnitude M of earthquakes and their relative numbers N.
2. Let Mi be the ith magnitude of the given data with size n, let M0 be the minimum magnitude within the activity, and let Xi = Mi − M0. Then, the η value is defined as
$$\eta = \frac{\dfrac{1}{n}\sum_{i=1}^{n} X_i^2}{\left(\dfrac{1}{n}\sum_{i=1}^{n} X_i\right)^{2}}. \tag{8.3}$$
It is a degree of deviation from the Gutenberg–Richter law.
3. The c value is another deviation degree that is defined by using the Kullback–Leibler directed divergence [Kullback, 1959]:
$$c = \sum_{i=1}^{n} p(X_i)\,\ln\frac{p(X_i)}{q(X_i)}, \qquad \begin{aligned} p(X_i) &= n_i/N,\\ q(X_i) &= B\exp(-BX_i)\,\Delta M,\\ B &= b\ln 10,\\ X_i &= M_i - M_0, \end{aligned} \tag{8.4}$$
where ΔM is the width of a magnitude interval, ni and N denote the number of earthquakes with magnitudes from Mi to Mi + ΔM and the total number of earthquakes, respectively, and M0 is the minimum magnitude within the activity.
8.2.2 Earthquake engineering
Earthquake engineering can be defined as the branch of civil engineering devoted to mitigating earthquake hazards [Committee on Earthquake Engineering, Research Commission on Engineering and Technical Systems, National Research Council, 1982]. In this broad sense, earthquake engineering covers the investigation and solution of the problems created by damaging earthquakes, and consequently the work involved in the practical application of these solutions, i.e., in planning, designing, constructing, and managing earthquake-resistant structures and facilities. However, most earthquake engineers and seismologists consider earthquake engineering as a scientific discipline that is more concerned with estimation of earthquake loading and earthquake damage assessment. Therefore, like seismologists, earthquake engineers also study the seismic (elastic) waves generated by an earthquake. Earthquake engineers pay much attention to predicting the intensity of shaking at a given site for an earthquake of given magnitude. This intensity is called site intensity and is measured by macro-seismic intensity (such as a modified Mercalli intensity) or ground acceleration. Earthquake engineers can then change the site intensity into earthquake loading for earthquake-resistant design. Earthquake engineers, in the course of their work, are faced with many uncertainties and must use sound engineering judgments to develop safe solutions to challenging problems.
In earthquake engineering there are two terms that are frequently used besides the term “intensity” (defined in Appendix 8.A). These two terms are “building damage index” and “peak ground acceleration.”
1. The building damage index is a measure of damage done to a building during an earthquake.
If a building is without any damage after an earthquake, the value of the damage index of this building is 0. If a building is totally collapsed after an earthquake, the value of the damage index is 1. The building damage index is thus in the unit interval [0, 1]. We usually use the set given in Equation (8.5) as the universe of discourse of the building damage index, where the step length is 0.1.
$$U = \{u_1, u_2, u_3, \ldots, u_{11}\} = \{0, 0.1, 0.2, \ldots, 1\} \tag{8.5}$$
2. The peak ground acceleration (PGA) is the largest acceleration recorded by a particular instrumentation station during an earthquake. PGA is what is experienced by a particle on the ground. PGA is measured in gals (galileo; 1 gal = 1 cm/s²). The PGA of the vertical component will usually be different from the PGA of the
horizontal component. Both values of PGA are dependent on the distance from the source, but for short distances the PGA of the vertical component may actually exceed the PGA of the horizontal component. In general usage, however, PGA means the PGA of the horizontal component. 8.3 Fuzzy Logic in Earthquake Prediction Although the leading seismological authorities of each era have generally concluded that earthquake prediction is not feasible [Geller et al., 1997], we believe that earthquake prediction is worthy of study. Feng et al. [1992a] used “fuzzy recognition” to analyze seismic precursors observed in China and Japan. Their result was that all earthquakes of M ≥ 6 that occurred in the given area during the given period had common precursors. The longest precursory time was about 5 months, and the shortest precursory time was about 2 months. Feng et al. [1992a] and Junji & Feng [1995] used “multi-story fuzzy multifactorial evaluation” (advanced by Wang, P. Z. [1983]) to assess and examine the potential strengths of 15 induced earthquakes due to water reservoirs in Canada, China, India, Greece, USA, and Zambia with magnitudes 2.0– 6.5. The result shows that the multi-story evaluation method can be used to assess the potential strength of an induced earthquake more effectively and more accurately. In this section, we review two methods of earthquake prediction with fuzzy pattern recognition. 8.3.1 Direct method of fuzzy pattern recognition in earthquake prediction General method According to the method of fuzzy set theory, the direct method of fuzzy pattern recognition may in principle include the following three steps: 1. Choosing characteristics. For each considered object u (e.g., seismic belt, radon content) we choose key characteristics relevant to earthquake prediction. 2. Constructing the membership function. If the object u may belong to several fuzzy sets Ai (i = 1, 2, . . . , n), then we must construct their membership functions. 3. 
Recognizing and judging the object. According to some principle of belongingness, the object u can be judged and the set to which it belongs can be determined. Two principles of belongingness are usually used:
(a) The principle of maximal belongingness. If we have a fuzzy set $A_i$ that satisfies
$$A_i(u) = \max\{A_1(u), A_2(u), \ldots, A_n(u)\}, \tag{8.6}$$
then it can be considered that the object u relatively belongs to $A_i$; otherwise, if we have an object $u_i$ which satisfies
$$A(u_i) = \max\{A(u_1), A(u_2), \ldots, A(u_n)\}, \tag{8.7}$$
then it can be considered that among all objects $u_1, u_2, \ldots, u_n$ the object $u_i$ belongs to fuzzy set A with maximum weight.
(b) The principle of threshold-value. After defining a threshold-value $\lambda \in [0, 1]$ and taking $\mu_{\max} = \max\{A_1(u), A_2(u), \ldots, A_n(u)\}$, if $\mu_{\max} < \lambda$, then we reject recognition of u; if $\mu_{\max} \ge \lambda$, then we recognize the object. If we have $A_1(u), A_2(u), \ldots, A_n(u) \ge \lambda$, then object u belongs to $A_1 \cup A_2 \cup \cdots \cup A_n$. Otherwise, if we have a series of objects $u_1, u_2, \ldots, u_m$ that satisfies $A(u_i) \ge \lambda$ (i = 1, 2, . . . , m), then all these objects can belong to the same fuzzy set A. The threshold-value λ can be determined empirically and is usually taken as $\lambda > \mu^* = 0.5$, where $\mu^*$ is called the “fuzzy boundary point.”
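The two principles of belongingness can be sketched in a few lines of Python. This is an illustration only: the function names, the dictionary holding the grades $A_1(u), \ldots, A_n(u)$, and the numerical values are assumptions made for the example, not part of the cited method.

```python
def max_belongingness(grades):
    """Principle (a): label of the fuzzy set with the largest grade, as in (8.6)."""
    return max(grades, key=grades.get)


def threshold_recognition(grades, lam):
    """Principle (b): reject u if the largest grade is below lam,
    otherwise return every fuzzy set whose grade reaches lam."""
    if max(grades.values()) < lam:
        return []  # recognition of u is rejected
    return [label for label, g in grades.items() if g >= lam]


# Hypothetical grades of one object u in three fuzzy sets
grades = {"A1": 0.2, "A2": 0.7, "A3": 0.6}
best = max_belongingness(grades)
accepted = threshold_recognition(grades, lam=0.55)
rejected = threshold_recognition(grades, lam=0.8)
```

With these made-up grades, principle (a) selects A2, a threshold of 0.55 accepts A2 and A3 (so u is assigned to their union), and a threshold of 0.8 rejects recognition altogether.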
Fuzzy analysis and recognition of earthquake precursors
The direct method of fuzzy pattern recognition has been used to analyze and recognize seismic precursors (seismic belt, seismic gap, seismicity quiescence, self-similarity of earthquake sequence, etc.) and nonseismic precursors (radon content in groundwater, water level, tilt, tide, volume strain, levelling, earth resistivity, geomagnetic field, etc.) observed in China and in Japan [Feng & Ichikawa, 1989; Feng et al., 1989]. The basic technique of the direct method of fuzzy pattern recognition is to construct the suitable membership function for each single precursor and the total membership function for several precursors. For the majority of the observational curves of single precursors y = f(t), the membership function of a single precursor is represented by the following analytical expression [Junji & Feng, 1995]:
$$\mu(y_i) = \left[1 + \frac{a}{|k(y_i)| \cdot |r(y_i)|}\right]^{-1}, \tag{8.8}$$
where $y_i$ denotes the precursor data in the ith time interval, $|k(y_i)|$ and $|r(y_i)|$ are the absolute value of slope and coefficient of correlation of curve y = f(t), respectively, a is an empirical constant, and $\mu(y_i)$ is the grade of membership of the precursor data $y_i$ being anomalistic. The total membership function of n precursors can be expressed by
$$\mu(y_{1i}, y_{2i}, \ldots, y_{ni}) = \begin{cases} \bigvee_{j=1}^{n} \mu_j(y_{ji}), & \text{if there are } m \text{ or more precursors whose memberships are larger than 0.5 in the same time interval, shorter than } k \text{ days,} \\ 0, & \text{otherwise,} \end{cases} \tag{8.9}$$
where $y_{ji}$ denotes the jth precursor data in the ith time interval; $\mu_j(y_{ji})$ is the grade of membership of the precursor data $y_{ji}$ being anomalistic; the sign $\bigvee$ denotes the disjunction, i.e., max; and the parameters n, m, k must be determined empirically for the different kinds of precursors in the different regions. For the Tokai area of Japan, Junji and Feng [1995] have found n = 9, m = 3, and k = 15 in the case of different kinds of precursors, and n = 30, m = 6, and k = 5 in the case of single volume strain precursors at different stations. For the Beijing–Tianjin–Tangshan area of China, they have n = 14, m = 5, and k = 15 in the case of single radon content precursors at different stations (all precursors are radon content but are recorded at different stations). The formula (8.9) with different values of n, m, k can be called the precursor pattern for medium–short-term $M \ge M_0$ earthquake prediction. The above values of n, m, k for Japan and China correspond to $M_0 = 6.0$, and in this case formula (8.9) can be written as
$$\mu(y_{1i}, y_{2i}, \ldots, y_{ni}) = \bigvee_{j=1}^{n} \mu_j(y_{ji}) = \mu_1(y_{1i}) \vee \mu_2(y_{2i}) \vee \cdots \vee \mu_n(y_{ni}). \tag{8.10}$$
The results given by Junji & Feng [1995] show that common precursors occurred before all earthquakes of M ≥ 6 that took place in the given area and the given period. The longest precursory time (i.e., the beginning time of a precursor anomaly) was about 5 months and the shortest precursory time was about 2 months. Therefore they consider the precursors recognized by this means as medium–short-term earthquake precursors. Zheng and Feng [1989] also used “fuzzy recognition” to analyze recorded groundwater levels in 33 wells near Tangshan before and after the earthquake. They found previously unknown anomalies in the correlations between water levels in different wells beginning three hours before the quake. The anomalies were distinct from typical fluctuations during the 5-day period analyzed, and increased in strength as the earthquake approached.
8.3.2 Fuzzy pattern for earthquake prediction based on seismicity indices
We can separately study anomalies in the seismicity indices b, η, c before large earthquakes in a given region by choosing their threshold values $b^*$, $\eta^*$, and $c^*$. For example, Junji and Feng [1995] studied the variations of b, η, c for the Songpan region (in Sichuan Province, China) and found $b^* = 0.73$, $\eta^* = 1.75$, $c^* = 0.87$. Their prediction check results, however, were not satisfactory. To improve prediction based on the seismicity indices b, η, c, Junji and Feng [1995] established two fuzzy patterns for one-year and half-year earthquake prediction, respectively.
First, they defined three membership functions on domains of indices b, η, c, respectively. In standard fuzzy terms, these functions would be written as $\mu_1(b)$, $\mu_2(\eta)$, $\mu_3(c)$, representing the grades of membership of the indices being anomalistic. However, in their papers, the functions are written as $\mu_b(t_i)$, $\mu_\eta(t_i)$, $\mu_c(t_j)$, and without any indication of the forms of the functions. Second, they discovered two numbers $\mu^*$, $\mu^{**}$ that can help us to judge whether an earthquake would occur. Different regions have different $\mu^*$ and $\mu^{**}$. They give two patterns for one-year and half-year earthquake prediction with $\mu^*$ and $\mu^{**}$, respectively, as follows:
Pattern A
$$\mu(b_{t_i}, \eta_{t_i}, c_{t_j}) = \mu_1(b_{t_i}) \wedge \mu_2(\eta_{t_i}) \wedge \mu_3(c_{t_j}) \ge \mu^{*} \;\Rightarrow\; \text{an earthquake more than } M_1 \text{ would occur within one year} \tag{8.11}$$
(for Songpan $M_1 = 5.4$) where j = i or i + 1 or i + 2. Pattern A indicates that if b and η are anomalistic (more than some degree) during time interval $t_i$, and c is anomalistic (also more than some degree) during this time interval or the next interval $t_{i+1}$ or the following interval $t_{i+2}$, then, within one year following the time interval when c is anomalistic ($t_i$ or $t_{i+1}$ or $t_{i+2}$), an earthquake more than $M_1$ would occur. For the Songpan region, $M_1 = 5.4$.
Pattern B
$$\mu(b_{t_i}, \eta_{t_i}, c_{t_j}) = \mu_1(b_{t_i}) \wedge \mu_2(\eta_{t_i}) \wedge \mu_3(c_{t_j}) \ge \mu^{**} \;\Rightarrow\; \text{an earthquake more than } M_2 \text{ would occur within half a year} \tag{8.12}$$
where also j = i or i + 1 or i + 2. Pattern B indicates that if b and η are anomalistic (more than some degree) during time interval $t_i$, and c is anomalistic (also more than some degree) during this time interval or the next interval $t_{i+1}$ or the following interval $t_{i+2}$, then, within half a year following the time interval when c is anomalistic ($t_i$ or $t_{i+1}$ or $t_{i+2}$), an earthquake more than $M_2$ would occur. Table 8.1 shows the result for the Songpan region where, for earthquakes of M ≥ 5.4, the correct rates of prediction are 91% and 68% for Pattern A and Pattern B, respectively. The correct rates are higher than those of other methods in which the seismicity indices b, η, c are used separately.
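The decision rule of Pattern A can be sketched as follows, reading the conjunction as min. Because the forms of the membership functions are not given, the grades below are invented for illustration, and the function name, argument layout, and threshold value are likewise assumptions.

```python
def pattern_fires(mu_b, mu_eta, mu_c, i, threshold):
    """Return the first interval j in {i, i+1, i+2} for which
    min(mu_b[i], mu_eta[i], mu_c[j]) >= threshold, or None if there is none."""
    for j in (i, i + 1, i + 2):
        if j < len(mu_c) and min(mu_b[i], mu_eta[i], mu_c[j]) >= threshold:
            return j
    return None


# Hypothetical grades of b, eta and c being anomalistic in four time intervals
mu_b = [0.3, 0.8, 0.4, 0.2]
mu_eta = [0.2, 0.7, 0.5, 0.3]
mu_c = [0.1, 0.4, 0.9, 0.3]

hit = pattern_fires(mu_b, mu_eta, mu_c, i=1, threshold=0.6)
miss = pattern_fires(mu_b, mu_eta, mu_c, i=0, threshold=0.6)
```

Here the pattern fires for i = 1 at the following interval (j = 2), so the one-year prediction window would start after that interval; for i = 0 nothing fires.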
Table 8.1 The check results of fuzzy comprehensive prediction of M ≥ 5.4 earthquakes in the Songpan region, using b, η, c.

Actual earthquake      Pattern A                    Pattern B
period and M           Prediction period   Check    Prediction period   Check
1970 II (5.5; 5.5)     1970 II–1971 I      B        1970 II             B
1972 II (5.6; 5.5)     1972 II–1973 I      B        1972 II             B
1973 II (6.5)          1973 II–1974 I      A        1973 II             A
1974 I (5.7)           1974 I–II           B        1974 I              A
1974 II (5.7)          1974 I–II           B        1974 I              D
1976 II (7.2)          1976 II–1977 I      A        1976 II             A
1981 I (6.9)           1981 I–II           A        1981 I              A
-                      -                   -        1984 II             C
1985 II (5.4)          1984 II–1985 I      A        1985 I              A
1987 I (6.2)*          -                   D        -                   D
-                      -                   -        1988 II             C
1989 I (5.4)           1988 II–1989 I      A        1989 I              A
1989 II (6.3)          1989 II–1990 I      A        1989 II             A

I, from January to June; II, from July to December; A, accurate prediction; B, basically accurate prediction; C, false prediction; D, no prediction. * This event was at the north boundary of Songpan region. Source: Junji & Feng [1995].
8.4 Fuzzy Logic in Earthquake Engineering
The main tasks of earthquake engineering are to estimate site intensity and to estimate earthquake damage. Because earthquake intensity is fuzzy, there are a number of different sources that define and evaluate it. Because of the complicated nature of building damage caused by earthquakes, often only five grade levels of damage to buildings are used: (1) intact; (2) slight damage; (3) moderate damage; (4) severe damage; and (5) collapse. These grades comprise fuzzy subsets in the universe of discourse of average building damage index [0, 1]. Liu and Wang [Liu et al., 1985; Wang et al., 1986] used set-valued statistics to outline a procedure that takes into account the number of buildings damaged at each level so as to devise an urban disaster mitigation strategy. In order to take into account all the factors that might affect urban development in a seismic area, they developed a fuzzy dynamic analysis procedure. Xiu and Huang [1989] employed the method of information distribution [Liu & Huang, 1990] and fuzzy inference [Huang & Shi, 2002] to estimate the earthquake damage to brick-column single-story factory buildings. The result indicates that the fuzzy model can attain minimum error in damage prediction.
8.4.1 Fuzzy earthquake intensity The macro-seismic intensity scale is a subjective measure of the effect of the ground shaking and is not a precise engineering measurement (Appendix 8.A). This is true despite a great deal of effort since the 1942 study of Gutenberg and Richter to take the average values of peak ground acceleration from the seismic instrumental records in damaged areas as a precise physical reference. Correlations have not been successful because of the great deviation of both values in different locations. From the point of view of fuzzy mathematics, Feng et al. [1982] first described seismic intensities of VI to XII as normal functions, using different grades of building damage as base variables. More than 700 field investigations of Chinese earthquakes provided data to shape the functions. Liu et al. [1983] formally proposed the name “fuzzy intensity” and defined it as fuzzy subsets in the universe of discourse U with the base variables coming from the average damage index [0, 1] of buildings, on the basis of numerous earthquake field investigations carried out in China in recent decades. A more detailed description of this work is given by Tian [1983], in which 1364 data points are used. Wang, F. [1983] studied the relation between epicentral intensity and magnitude. He expressed epicentral intensity as fuzzy subsets of magnitude by using normal configurations as membership function with shapes determined from earthquake records in China. He obtained better results than those obtained by statistical methods of correlation. Wang [1984] outlined a two-stage procedure for intensity evaluation. He first divided intensity evaluating factors into four categories, each of which was considered as a factor subset. These were the subset U1 of damage indexes of buildings, subset U2 of peak values of ground motion and some response spectra, subset U3 of earthquake characteristics, and subset U4 of human reactions and geologic condition. 
All the factors in subsets U1 and U2 were described by a normal configuration. At the first stage, the fuzzy intensity vector in each category was determined by inference rules, and then the second stage was carried out by treating such intensity vector as a factor vector combined with a weight factor. Liu [1982] was the first scientist to introduce the concept of linguistic variable in fuzzy set theory to study fuzzy earthquake intensity. He established a fuzzy model to infer PGA from earthquake intensity, both of which were defined as fuzzy subsets in the universes of discrete acceleration and building damage index, respectively. Liu used the model to check the 1980 Intensity Scale of China. The result is satisfactory. In Liu’s model, earthquake intensities V–X are defined by Table 8.2, where u1 , u2 , . . . , u12 are values of building damage index. For example, fuzzy intensity X1 (i.e., intensity level V) is a fuzzy subset in the universes of building damage index with membership function X1 = 1/u1 + 0.90/u2 + 0.83/u3 + 0.50/u4 .
Table 8.2 Fuzzy earthquake intensities V–X with the universe U.

Damage index         X1 (V)   X2 (VI)   X3 (VII)   X4 (VIII)   X5 (IX)   X6 (X)
u1 < 0.03            1        0.27      -          -           -         -
0.03 ≤ u2 < 0.10     0.90     0.67      0.42       -           -         -
0.10 ≤ u3 < 0.16     0.83     0.88      0.81       0.40        -         -
0.16 ≤ u4 < 0.23     0.50     0.81      0.96       0.50        -         -
0.23 ≤ u5 < 0.30     -        0.71      1          0.61        -         -
0.30 ≤ u6 < 0.36     -        0.54      1          0.69        -         -
0.36 ≤ u7 < 0.43     -        0.37      0.79       0.81        -         -
0.43 ≤ u8 < 0.50     -        -         0.52       0.92        0.27      -
0.50 ≤ u9 < 0.56     -        -         0.37       1           0.42      0.30
0.56 ≤ u10 < 0.63    -        -         -          0.81        0.60      0.50
0.63 ≤ u11 < 0.70    -        -         -          0.60        0.77      0.70
0.70 ≤ u12           -        -         -          -           0.98      1

General building damage grades, in order of increasing damage index: intact, slight damage, moderate damage, severe damage, collapse. u1, u2, . . . , u12 are the elements of the universe U. Source: Liu [1982].
Liu also defined 12 fuzzy PGAs by Table 8.3, where $v_1, v_2, \ldots, v_{16}$ are the traditional PGAs expressed in gal (galileo, or cm/s²). For example, fuzzy PGA $Y_2$ is a fuzzy subset in the universe of PGA with membership function
$$Y_2 = 0.75/v_1 + 0.90/v_2 + 0.75/v_3 + 0.45/v_4 + 0.25/v_5.$$
With the fuzzy sets in Tables 8.2 and 8.3, Liu [1982] collected 117 observations of intensity and acceleration and converted them into an information matrix M, shown in Equation (8.13). Let (I, v) be an observation with intensity I and acceleration v. If I fires fuzzy subset $X_i$ (i.e., I is just $X_i$) and v fires fuzzy subset $Y_j$ (i.e., $\mu_{Y_j}(v) > 0$), then (I, v) is regarded as an observation that belongs to fuzzy subset $X_i \times Y_j$. The actual number of observations belonging to $X_i \times Y_j$ is assigned to the element of M in the $X_i$ row and $Y_j$ column.
$$M = \begin{array}{c|cccccccccccc}
 & Y_1 & Y_2 & Y_3 & Y_4 & Y_5 & Y_6 & Y_7 & Y_8 & Y_9 & Y_{10} & Y_{11} & Y_{12} \\ \hline
X_1 & 4 & 2 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
X_2 & 4 & 10 & 10 & 2 & 1 & 1 & 2 & 2 & 0 & 0 & 0 & 0 \\
X_3 & 0 & 5 & 16 & 14 & 12 & 9 & 5 & 7 & 1 & 2 & 0 & 0 \\
X_4 & 0 & 0 & 1 & 0 & 1 & 1 & 0 & 2 & 1 & 0 & 0 & 0 \\
X_5 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
X_6 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{array} \tag{8.13}$$
Table 8.3 Fuzzy PGA Y with the universe V.

Acceleration intervals (gal): v1 < 25; 25 ≤ v2 < 50; 50 ≤ v3 < 75; 75 ≤ v4 < 100; 100 ≤ v5 < 130; 130 ≤ v6 < 160; 160 ≤ v7 < 190; 190 ≤ v8 < 230; 230 ≤ v9 < 270; 270 ≤ v10 < 315; 315 ≤ v11 < 360; 360 ≤ v12 < 420; 420 ≤ v13 < 500; 500 ≤ v14 < 600; 600 ≤ v15 < 700; 700 ≤ v16.

Nonzero membership grades of each fuzzy PGA, listed in order of increasing acceleration:
Y1:   1, 0.85, 0.50, 0.30
Y2:   0.75, 0.90, 0.75, 0.45, 0.25
Y3:   0.50, 0.85, 1, 0.85, 0.50, 0.30
Y4:   0.25, 0.45, 0.75, 0.90, 0.75, 0.45, 0.25
Y5:   0.25, 0.40, 0.70, 0.85, 0.70, 0.45, 0.25
Y6:   0.25, 0.45, 0.75, 0.90, 0.75, 0.45, 0.25
Y7:   0.20, 0.40, 0.65, 0.80, 0.65, 0.40, 0.20
Y8:   0.25, 0.45, 0.75, 0.90, 0.75, 0.45, 0.25
Y9:   0.25, 0.40, 0.70, 0.85, 0.70, 0.40, 0.25
Y10:  0.30, 0.50, 0.85, 1, 0.85, 0.50, 0.30
Y11:  0.25, 0.45, 0.75, 0.90, 0.75, 0.45
Y12:  0.30, 0.50, 0.85, 1

v1, v2, . . . , v16 are the elements of the universe V. Source: Liu [1982].
Then, drawing on his experience, Liu chose from M the following five elements:
$$E = \{(1, 2), (2, 3), (3, 5), (4, 8), (5, 12)\} \tag{8.14}$$
to make the following five fuzzy rules:
g1 = If X is X1 then Y is Y2,
g2 = If X is X2 then Y is Y3,
g3 = If X is X3 then Y is Y5,
g4 = If X is X4 then Y is Y8,
g5 = If X is X5 then Y is Y12.
Let
$$R_i = X_i \times Y_{j_i}, \quad (i, j_i) \in E, \tag{8.15}$$
where the membership function of fuzzy set $R_i$ is
$$R_i(u, v) = \min\{X_i(u), Y_{j_i}(v)\}, \quad u \in U,\ v \in V, \tag{8.16}$$
Table 8.4 Fuzzy relation between intensity I and acceleration v.

        v1     v2     v3     v4     v5     v6     v7     v8     v9     v10    v11    v12    v13    v14    v15    v16
V       0.75   0.90   0.83   0.83   0.70   0.81   0.70   0.45   0.50   0.50   0.50   0.45   0.25   0      0      0
VI      0.75   0.85   0.88   0.85   0.70   0.81   0.70   0.45   0.61   0.61   0.61   0.45   0.25   0      0      0
VII     0.75   0.81   0.81   0.81   0.70   0.85   0.70   0.45   0.75   0.79   0.75   0.45   0.30   0.37   0.37   0.37
VIII    0.50   0.61   0.61   0.61   0.70   0.79   0.70   0.45   0.75   0.90   0.75   0.45   0.30   0.50   0.60   0.60
IX      0      0      0.25   0.37   0.37   0.37   0.37   0.45   0.60   0.60   0.60   0.45   0.30   0.50   0.85   0.98
and
$$R = R_1 \cup R_2 \cup R_3 \cup R_4 \cup R_5. \tag{8.17}$$
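The construction in (8.14)–(8.17), together with the max–min composition and “max” defuzzification that Liu applies to the resulting relation, can be sketched in Python. The sketch is not Liu's implementation. The positions of the nonzero grades of X2–X5 and of Y3, Y5, Y8, Y12 on their universes are an assumption inferred from Tables 8.2–8.4, and the helper names `pad` and `compose` are hypothetical. Under that assumption the sketch reproduces the rows of Table 8.4 for intensities V and IX.

```python
def pad(values, start, size):
    """Place `values` at 1-based position `start` in a zero vector of length `size`."""
    out = [0.0] * size
    out[start - 1:start - 1 + len(values)] = values
    return out


# Fuzzy intensities on u1..u12 (assumed alignment, see the lead-in)
X = {
    1: pad([1, 0.90, 0.83, 0.50], 1, 12),
    2: pad([0.27, 0.67, 0.88, 0.81, 0.71, 0.54, 0.37], 1, 12),
    3: pad([0.42, 0.81, 0.96, 1, 1, 0.79, 0.52, 0.37], 2, 12),
    4: pad([0.40, 0.50, 0.61, 0.69, 0.81, 0.92, 1, 0.81, 0.60], 3, 12),
    5: pad([0.27, 0.42, 0.60, 0.77, 0.98], 8, 12),
}
# Fuzzy PGAs on v1..v16 used by the five rules (assumed alignment)
Y = {
    2: pad([0.75, 0.90, 0.75, 0.45, 0.25], 1, 16),
    3: pad([0.50, 0.85, 1, 0.85, 0.50, 0.30], 1, 16),
    5: pad([0.25, 0.40, 0.70, 0.85, 0.70, 0.45, 0.25], 3, 16),
    8: pad([0.25, 0.45, 0.75, 0.90, 0.75, 0.45, 0.25], 7, 16),
    12: pad([0.30, 0.50, 0.85, 1], 13, 16),
}
RULES = [(1, 2), (2, 3), (3, 5), (4, 8), (5, 12)]  # the set E of (8.14)

# R(u, v) = max over the rules of min(X_i(u), Y_j(v)): equations (8.15)-(8.17)
R = [[max(min(X[i][a], Y[j][b]) for i, j in RULES) for b in range(16)]
     for a in range(12)]


def compose(x, relation):
    """Max-min composition of a fuzzy input x on U with a relation on U x V."""
    return [max(min(x[a], relation[a][b]) for a in range(len(x)))
            for b in range(len(relation[0]))]


Q_V = compose(X[1], R)                      # fuzzy PGA for intensity V
Q_IX = compose(X[5], R)                     # fuzzy PGA for intensity IX
k = max(range(16), key=lambda b: Q_V[b])    # "max" defuzzification: index of v_k
```

For intensity V the largest grade is at v2, whose interval 25–50 has midpoint 37.5, the value listed for intensity V in Table 8.5.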
Hence, Liu [1982] obtained a fuzzy relation between building damage index u and acceleration v. Here we omit the fuzzy relation because it is very large. Then, with the max–min fuzzy composition rule, from a given intensity I he obtained a fuzzy PGA, Q, defined in the universe of discourse of acceleration, $V = \{v_1, v_2, \ldots, v_{16}\}$. For example, if I = V, then
$$I = \mathrm{V} = X_1 = 1/u_1 + 0.90/u_2 + 0.83/u_3 + 0.50/u_4,$$
$$\mu_Q(v_j) = \max_{1 \le i \le 12}\{\min[\mu_{X_1}(u_i), \mu_R(u_i, v_j)]\}, \quad j = 1, 2, \ldots, 16,$$
$$Q = 0.75/v_1 + 0.90/v_2 + 0.83/v_3 + 0.83/v_4 + 0.70/v_5 + 0.81/v_6 + 0.70/v_7 + 0.45/v_8 + 0.50/v_9 + 0.50/v_{10} + 0.50/v_{11} + 0.45/v_{12} + 0.25/v_{13}.$$
Therefore, he obtained a fuzzy relation between intensity (not building damage index) and acceleration, shown in Table 8.4. Finally, using the so-called “max” principle, he defuzzified Q into a crisp result. That is, let
$$Q = \mu_Q(v_1)/v_1 + \mu_Q(v_2)/v_2 + \cdots + \mu_Q(v_{16})/v_{16}.$$
If $\mu_Q(v_k) = \max\{\mu_Q(v_1), \mu_Q(v_2), \ldots, \mu_Q(v_{16})\}$, then Q’s crisp result is $v = v_k$. Hence, we can estimate acceleration from a given intensity. All results from the fuzzy model are shown in column 4 of Table 8.5. To judge whether the Q obtained by using the above fuzzy inference method is reliable, Liu introduced two indexes: (1) maximum possibility E(Q); and
Table 8.5 1980 Intensity Scale of China and results from Liu’s [1982] model.

            1980 Intensity Scale of China         Fuzzy inference
Intensity   Damage index   PGA                    PGA      Error   E(Q)   EC(Q)
V           -              31 (22–44)             37.5     20%     0.77   0.62
VI          0–0.1          63 (45–89)             62.5     0       0.78   0.62
VII         0.11–0.30      125 (90–177)           145      16%     0.78   0.63
VIII        0.31–0.50      250 (178–353)          292.5    17%     0.70   0.51
IX          0.51–0.70      500 (354–707)          700      40%     0.43   0.27
X           0.71–0.90      1000 (708–1414)        -        -       -      -
(2) minimum possibility EC(Q), which are defined by the formulas
$$E(Q) = \sum_{i=1}^{12} p_i \sup(Q \cap Y_i), \tag{8.18}$$
$$EC(Q) = 1 - E(\bar{Q}), \tag{8.19}$$
where $\bar{Q}$ is the complement of Q, and
$$p_i = \frac{1}{117} \sum_{k=1}^{6} m_{ki}, \tag{8.20}$$
where $m_{ki}$ is the value of the element in the kth row and ith column of the information matrix (8.13), and $p_i$ is the probability of the evidence $q_i$ that can fire $Y_i$. The probability distribution is
$$P = \{p_1, p_2, \ldots, p_{12}\} = \{0.068, 0.145, 0.231, 0.145, 0.120, 0.094, 0.060, 0.094, 0.017, 0.017, 0, 0.009\}.$$
For example,
$$p_1 = \frac{1}{117} \sum_{k=1}^{6} m_{k1} = \frac{4 + 4 + 0 + 0 + 0 + 0}{117} = 0.068.$$
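The probability distribution P follows directly from the column sums of the information matrix (8.13), as in (8.20). A short Python check is given below; the variable names are illustrative, and the rows are the matrix entries read as $X_1$ to $X_6$ against $Y_1$ to $Y_{12}$.

```python
# Information matrix (8.13): rows X1..X6, columns Y1..Y12
M = [
    [4, 2, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [4, 10, 10, 2, 1, 1, 2, 2, 0, 0, 0, 0],
    [0, 5, 16, 14, 12, 9, 5, 7, 1, 2, 0, 0],
    [0, 0, 1, 0, 1, 1, 0, 2, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]

total = sum(sum(row) for row in M)  # number of observations

# p_i = (column sum of Y_i) / total, rounded to three decimals as in the text
P = [round(sum(row[i] for row in M) / total, 3) for i in range(12)]
```

The total is 117, and the rounded column frequencies agree with the distribution quoted above.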
In Liu’s estimation, the larger E(Q) and EC(Q) are, the more reliable Q is. Liu’s E(Q) and EC(Q) are given in columns 6 and 7 of Table 8.5, respectively. For example, V’s row of Table 8.4 is a fuzzy PGA, denoted by Q1 ; that is: Q1 = 0.75/v1 + 0.90/v2 + · · · + 0.25/v13 .
Hence
$$E(Q_1) = \sum_{i=1}^{12} p_i \sup(Q_1 \cap Y_i) = 0.77,$$
$$EC(Q_1) = 1 - E(\bar{Q}_1) = 0.62.$$
It is interesting to note that, in Table 8.5, the error of fuzzy inference is lower when E(Q) and EC(Q) are higher. Therefore, this fuzzy model provides a way of checking the 1980 Intensity Scale of China. For example, from Table 8.5 we know that the PGAs assigned by the 1980 Intensity Scale of China for intensities VI and VII are correct. Of the 117 observations used in Liu’s model, most have intensities VI and VII. Hence, the results for these intensities are more reliable. Thus, in this model, the 1980 Intensity Scale of China has been proven in theory for the first time.
8.4.2 Estimating earthquake damage with fuzzy logic
There have been a number of efforts to estimate earthquake damage using mechanics methods with probability approaches. However, only a few of these have been directed towards determining structural damage states with respect to earthquakes. Current methods emphasize stochastic damage evaluation and present calculations of the reliability of structural systems with uncertain parameters under various seismic excitations. The reliability functions of structural systems are determined analytically by introducing approximate probability density functions of damage. Practice shows that the approximate probability density functions are applicable to the reliability analysis of uncertain structural systems only when sufficient data on the building and the earthquake load can be collected. In the current seismic code of China (GBJ11-89), a definite three-level seismic fortification requirement is proposed, that is, “Do not be damaged under a minor earthquake. Be repairable under a moderate earthquake. Do not collapse under a major earthquake.” The relations between the three-level intensity and the basic intensity are summarized in Table 8.6. Owing to the complicated nature of building damage caused by earthquakes, the current seismic code used in China measures earthquake damage to engineered structures with only five grades: (1) intact; (2) slight damage; (3) moderate damage; (4) severe damage; and (5) collapse. These grades comprise fuzzy subsets in the universe of discourse. In fact, earthquake damage grade is a fuzzy concept, because it is difficult to completely separate the different damage grades by some
Table 8.6 Relationship of three-level seismic levels with the basic intensity (the design reference period is 50 years).

Seismic levels                      Minor earthquake                          Moderate earthquake          Major earthquake
Exceedance probability              0.632                                     0.10                         0.02–0.03
Relationship with basic intensity   1.55 degrees lower than basic intensity   equals the basic intensity   about 1 degree higher than basic intensity
Design requirement                  do not be damaged                         be repairable                do not collapse
specific values of structural response. We denote $A_i$ as the ith fuzzy seismic damage grade; that is to say, the universe set of the seismic damage grade is
$$D = \{A_1, A_2, A_3, A_4, A_5\} = \{\text{intact, slight damage, moderate damage, severe damage, collapse}\}.$$
For a specific type of building, the main task of estimating earthquake damage is to identify the relationship between the damage grade and the earthquake load. For the brick-column, single-story factory building, Xiu and Huang [1989] suggested a typical model with fuzzy logic. In their model, the earthquake loads are the site intensities VI, VII, VIII, IX, and X, and the damage grades were defined as fuzzy subsets in the universe of building damage index as follows:
A1 = intact = 1/u1 + 0.7/u2 + 0.2/u3
A2 = slight damage = 0.2/u1 + 0.7/u2 + 1/u3 + 0.7/u4 + 0.2/u5
A3 = moderate damage = 0.2/u3 + 0.7/u4 + 1/u5 + 0.7/u6 + 0.2/u7
A4 = severe damage = 0.2/u5 + 0.7/u6 + 1/u7 + 0.7/u8 + 0.2/u9
A5 = collapse = 0.2/u7 + 0.7/u8 + 1/u9 + 0.7/u10 + 0.2/u11
where the values of building damage index, u1, u2, . . . , u11, are given in (8.5). The relationship in their model is a regression result obtained from the data recording the damage caused by the Tangshan earthquake, shown in column 3 of Table 8.7. To change the given sample into a fuzzy relationship between intensity and damage (with building parameters such as building height H, distance d between two brick columns, and so on), we first calculate the dynamic response x, shown in column 4 of Table 8.7. We then obtain 18 data with dynamic response x and real damage D. With these data and the method of information distribution, we construct a primary
Table 8.7 Earthquake damage to brick-column single-story factory buildings in area with intensity VIII during the Tangshan earthquake, 1976.

No.   Name of workshop                        Real damage D   Dynamic response x   Fuzzy conclusion
1     Washing workshop of TMF                 A3              1.2481               A2
2     #28 machining workshop of TMF           A1              0.926054             A1
3     Metalforming workshop of TMF            A1              0.536433             A1
4     Finished products storehouse of TMF     A1              0.94601              A1
5     Coil workshop of TGEW                   A1              0.57222              A1
6     Spare parts storehouse of TGEW          A1              1.1049               A2
7     Wooden molds storehouse of TGEW         A1              0.243632             A1
8     Spray paint workshop of TGEW            A2              1.27026              A2
9     Machining workshop of TDEW              A1              3.59285              A5
10    Metalforming workshop of TDTF           A1              0.74636              A1
11    Stamping workshop of TDTF               A1              1.41304              A3
12    Machining workshop of TDMF              A4              0.830951             A1
13    Machining workshop of TAEF              A5              1.59838              A5
14    Assembling workshop of TAEF             A4              1.59838              A5
15    Repairing workshop of TMPM              A4              1.98074              A5
16    Preserving workshop of TSCF             A3              1.45437              A3
17    Stamping workshop of TLVSF              A2              1.10519              A2
18    Oil filling workshop of TITW            A2              0.914276             A1

TAEF—Tanggu Auto Engine Factory; TDEW—Tianjin Diesel Engine Works; TDMF—Tianjin Dongfeng Metalforming Factory; TDTF—Tianjin Dongfanghong Tractor Factory; TGEW—Tianjin Generating Equipment Works; TITW—Tianjin Instrument Transformer Works; TLVSF—Tianjin Low-Voltage Switch Factory; TMF—Tianjin Machinery Factory; TMPM—Tianjin Medium Plate Mill; TSCF—Tianjin Second Cable Factory.
information distribution matrix, shown in Table 8.8. The element of the ith row and jth column is denoted by $q_{ij}$. For example, $q_{23} = 0.42$, $q_{24} = 0.29$, and so forth. For the sake of clarity, here we use only two places following the decimal point. In fact, the real value of every element stored in the computer has nine digits. Some of the zero entries in the matrix are only approximately equal to zero. Non-entry is really zero. Let
$$R_1 = (r^{1}_{ij})_{6 \times 11}, \qquad R_2 = (r^{2}_{ij})_{6 \times 11},$$
Table 8.8 Primary information distribution matrix QVIII of damage cases in VIII zone during the Tangshan earthquake, 1976.

Dynamic     Damage index
response    0.0    0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9    1.0
0.96        0.64   0.47   0.21   0.20   0.27   0.21   0.09   0.03   0.02   0.00   0.00
1.12        0.39   0.47   0.42   0.29   0.14   0.05   0.01   0.00   0.00   0.00   0.00
1.28        0.13   0.28   0.41   0.44   0.35   0.20   0.06   0.00   0.00   0.00   0.00
1.44        0.30   0.21   0.12   0.23   0.32   0.23   0.06   0.00   0.00   0.00   0.00
1.60        0.00   0.00   0.01   0.02   0.10   0.26   0.41   0.48   0.49   0.25   0.10
1.76        0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
Table 8.9 Fuzzy relationship matrix RVIII based on damage cases in VIII zone during the Tangshan earthquake, 1976.

Dynamic     Damage index
response    0.0    0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9    1.0
0.96        1.00   0.73   0.33   0.31   0.42   0.32   0.14   0.04   0.03   0.00   0.00
1.12        0.61   0.99   0.91   0.62   0.30   0.11   0.03   0.00   0.00   0.00   0.00
1.28        0.19   0.59   0.93   1.00   0.81   0.46   0.13   0.00   0.00   0.00   0.00
1.44        0.46   0.45   0.29   0.52   0.91   0.70   0.16   0.00   0.00   0.00   0.00
1.60        0.00   0.00   0.01   0.05   0.21   0.52   0.84   0.99   1.00   0.51   0.15
1.76        0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
where
$$r^{1}_{ij} = \frac{q_{ij}}{\max\{q_{1j}, q_{2j}, \ldots, q_{6j}\}}, \quad j = 1, 2, \ldots, 11,$$
$$r^{2}_{ij} = \frac{q_{ij}}{\max\{q_{i1}, q_{i2}, \ldots, q_{i11}\}}, \quad i = 1, 2, \ldots, 6,$$
then
$$R = R_1 \wedge R_2 \quad (\text{i.e., } r_{ij} = \min\{r^{1}_{ij}, r^{2}_{ij}\})$$
is the fuzzy relationship from x to D, shown in Table 8.9. The fuzzy relationship matrix is a fuzzy set defined in universe V × U, where $V = \{v_1, v_2, \ldots, v_6\} = \{0.96, 1.12, 1.28, 1.44, 1.60, 1.76\}$ is the universe of the dynamic response and U is the universe of damage index as given in (8.5).
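The normalization that turns the primary information distribution matrix into the fuzzy relationship matrix can be sketched as follows. The sketch works from the two-decimal entries of Table 8.8 rather than the nine-digit values mentioned in the text, so its output matches Table 8.9 only approximately. The column order of the data and the function name `normalise` are assumptions made for the example.

```python
# Table 8.8, rows: dynamic response 0.96 ... 1.76; columns: damage index 0.0 ... 1.0
Q = [
    [0.64, 0.47, 0.21, 0.20, 0.27, 0.21, 0.09, 0.03, 0.02, 0.00, 0.00],
    [0.39, 0.47, 0.42, 0.29, 0.14, 0.05, 0.01, 0.00, 0.00, 0.00, 0.00],
    [0.13, 0.28, 0.41, 0.44, 0.35, 0.20, 0.06, 0.00, 0.00, 0.00, 0.00],
    [0.30, 0.21, 0.12, 0.23, 0.32, 0.23, 0.06, 0.00, 0.00, 0.00, 0.00],
    [0.00, 0.00, 0.01, 0.02, 0.10, 0.26, 0.41, 0.48, 0.49, 0.25, 0.10],
    [0.00] * 11,
]


def normalise(q):
    """r_ij = min(q_ij / column maximum, q_ij / row maximum); an all-zero row stays zero."""
    col_max = [max(row[j] for row in q) for j in range(len(q[0]))]
    row_max = [max(row) for row in q]

    def ratio(a, b):
        return a / b if b > 0 else 0.0

    return [[min(ratio(q[i][j], col_max[j]), ratio(q[i][j], row_max[i]))
             for j in range(len(q[0]))]
            for i in range(len(q))]


R = normalise(Q)
```

For instance, the entry for response 1.12 and damage index 0.0 becomes 0.39/0.64 ≈ 0.61, in line with Table 8.9.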
By means of the max–min model for fuzzy inference, we can obtain a fuzzy conclusion D from a response x. We choose the closest damage grade A to be the final conclusion. Based on experience, Xiu & Huang [1989] employed the method of information distribution [Liu & Huang, 1990] to change a crisp value x into a fuzzy set $\mu_X(v)$ with universe V. The fuzzy set has the following membership function:
$$\mu_X(v) = \begin{cases} 1 - |x - v| / \Delta v, & \text{if } |x - v| \le \Delta v, \\ 0, & \text{otherwise,} \end{cases} \tag{8.21}$$
where $\Delta v$, the step length of v, is $v_2 - v_1 = 0.16$. For example, for a given input x = 1.2481, Xiu and Huang changed it into a fuzzy input
X = 0.72/v2 + 0.26/v3 = 0.72/1.12 + 0.26/1.28.
They then obtained a fuzzy conclusion
D = 0.61/u1 + 0.74/u2 + 0.74/u3 + 0.62/u4 + 0.3/u5 + 0.26/u6 + 0.13/u7 = 0.61/0 + 0.74/0.1 + 0.74/0.2 + 0.62/0.30 + 0.3/0.4 + 0.26/0.5 + 0.13/0.6.
Because A2 is the closest grade to D among all damage grades, we choose A2 to be the final conclusion. The last column of Table 8.7 gives the conclusions from the fuzzy inference. This result is better than any other result from a traditional method (such as the method of least squares).
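The max–min inference step can be sketched with the entries of Table 8.9 and the fuzzy input quoted above. This is an illustration under assumptions: the column order of the relationship matrix is inferred, and the variable names are hypothetical. With the two-decimal table entries the sketch gives 0.72 for u2 and u3, where the text quotes 0.74; the other grades agree.

```python
# Table 8.9, rows: dynamic response v1..v6; columns: damage index u1..u11
R = [
    [1.00, 0.73, 0.33, 0.31, 0.42, 0.32, 0.14, 0.04, 0.03, 0.00, 0.00],
    [0.61, 0.99, 0.91, 0.62, 0.30, 0.11, 0.03, 0.00, 0.00, 0.00, 0.00],
    [0.19, 0.59, 0.93, 1.00, 0.81, 0.46, 0.13, 0.00, 0.00, 0.00, 0.00],
    [0.46, 0.45, 0.29, 0.52, 0.91, 0.70, 0.16, 0.00, 0.00, 0.00, 0.00],
    [0.00, 0.00, 0.01, 0.05, 0.21, 0.52, 0.84, 0.99, 1.00, 0.51, 0.15],
    [0.00] * 11,
]

# Fuzzy input for x = 1.2481 as quoted in the text: 0.72/v2 + 0.26/v3
X = [0.0, 0.72, 0.26, 0.0, 0.0, 0.0]

# Max-min composition: D(u_j) = max_i min(X(v_i), R(v_i, u_j))
D = [max(min(X[i], R[i][j]) for i in range(6)) for j in range(11)]
```

The resulting fuzzy conclusion is concentrated on u1 to u7 and vanishes on u8 to u11, matching the support of the conclusion D given in the text.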
8.5 Hybrid Fuzzy Neural Networks with Information Diffusion Method
The studies of earthquake prediction and earthquake engineering often involve finding nonlinear functions based on data that are scanty, incomplete, and contradictory. In this section, we estimate the relationship between isoseismal area and earthquake magnitude as an example to introduce a new approach. In the mid-1970s, some researchers [e.g., Howell & Schultz, 1975; Gupta & Nuttli, 1976] were active in the search for the following (or similar) expression relating intensity I to magnitude M and hypocentral distance $\Delta$ (in kilometers):
$$I = aM - b \log_{10} \Delta + c, \tag{8.22}$$
where a, b, and c are empirical constants. Since there is a 60% probability that an observed intensity is more than one degree greater or smaller than its predicted value [Lomnitz & Rosenblueth, 1976], a more appropriate expression [Huang & Liu, 1985] relating isoseismal area S (in square kilometers), intensity I, and magnitude M was developed:
$$\log_{10} S(I) = a + bM, \tag{8.23}$$
where a and b are empirical constants. However, several studies [Liu et al., 1987] have demonstrated that the linear relationship does not fit the seismicity of every region. In the western United States of America, many destructive earthquakes are controlled by one huge fault, the San Andreas fault, and the relationship between isoseismal area and earthquake magnitude is approximately linear. Nevertheless, in regions where earthquakes are not controlled by one single fault, the linear relationship generally fails. In principle, if there were many observations recording historical earthquakes, researchers could use powerful statistical tools such as regression analysis [Bollinger et al., 1993; Cavallini & Rebez, 1996; Fukushima et al., 1995] to reveal the nonlinear relationship. However, destructive earthquakes are infrequent events with very small probability of occurrence. Thus, we do not have enough data to employ the traditional statistical tools to estimate the relationship between isoseismal area and earthquake magnitude. Furthermore, seismotectonic structures are very complex and the relationship between isoseismal area and earthquake magnitude is strongly nonlinear. Since we do not know which nonlinear function would best describe the relationship, it is profitable to employ neural networks to search the observations for the mapping from the input, which is earthquake magnitude, to the output, which is isoseismal area. However, in the real world, the observations are strongly scattered and contradictory patterns do occur. Neural information processing models largely assume that the learning patterns for training a neural network are compatible. With contradictory patterns, the adjustments of weights and thresholds become ambiguous, and a neural network approach to this problem does not converge. Hence, to estimate the relationship, we need to handle the fuzziness and granularity of the observations.
Hybrid fuzzy neural networks [Huang & Leung, 1999] based on the normal diffusion function [Huang, 1997] can be used to estimate fuzzy relationships with scanty, incomplete, and contradictory data.
8.5.1 Fuzzy relationships given by the information diffusion method To facilitate our discussion, we first give some basic concepts and definitions. Unless stated otherwise, we assume that we are given a sample of n real-valued observations, xi (i = 1, 2, . . . , n), which have two components, earthquake magnitude mi and isoseismal area Si , whose underlying relationship is to be estimated.
An observation is also called a pattern. A given sample is represented as follows: X = {x1 , x2 , . . . , xn } = {(m1 , S1 ), (m2 , S2 ), . . . , (mn , Sn )}. To reduce scattering in the sample, we generally consider logarithmic isoseismal area, g = log10 S, instead and the given sample can be rewritten as X = {x1 , x2 , . . . , xn } = {(m1 , g1 ), (m2 , g2 ), . . . , (mn , gn )}.
(8.24)
In general, fuzzy subsets used to construct fuzzy rules and fuzzy relationships are derived from “expert” information, and different choices of fuzzy subsets lead to different results. During development, fuzzy subsets can be adjusted (“tuned”) according to the results. However, this approach, in which first impressions are strongest, does not suit incomplete-data problems, where there is insufficient information to support any first impressions. The usual alternative is to look for evidence in the data to support fuzzy relationships: the tracing curve of sample-data clusters is regarded as an input–output fuzzy relationship function, and cluster analysis techniques can be developed to choose reasonable fuzzy subsets. When a sample is small or scattered, however, no tracing curve of clusters exists. For small data sets, the clustering approach can be replaced by an information diffusion method [Huang, 1997; Huang & Shi, 2002], which changes an observation into a fuzzy subset to fill the gap caused by incomplete data.
Let $M = \{m_1, m_2, \ldots, m_n\}$ be a given sample with the universe of discourse $U$. A mapping from $M \times U$ to $[0, 1]$,
$$\mu : M \times U \to [0, 1],$$
is called an information diffusion of $M$ on $U$ if it satisfies:
(1) $\forall m_i \in M$, if $u_0 = m_i$, then $\mu(m_i, u_0) = \sup_{u \in U} \mu(m_i, u)$;
(2) $\forall m_i \in M$, $\mu(m_i, u)$ is a convex function about $u$.
$\mu(m_i, u)$ is called an information diffusion function of $M$ on $U$. When $U$ is discrete, the function can also be written as $\mu(m_i, u_j)$.
Let $X$ be a given sample which can be used to estimate the real-valued relation $R$ by the operator $\gamma$. If the estimator is calculated by using the information diffusion function, the estimator is called the information diffusion estimator. The principle of information diffusion, which has been justified [Huang, 1997, 1998a, 2000; Huang & Shi, 2002], asserts that there must be some reasonable information diffusion functions to improve the non-diffusion estimator if and only if $X$ is incomplete.
The principle is obvious when we understand the fuzziness of incomplete data. And, by the methods of classical mathematics, we have justified that the principle is efficient for estimating probability density functions with small samples.
On the basis of similarities between information and molecules, and with the help of molecular diffusion theory, we obtain, by solving the partial differential equation, a normal diffusion function
$$\mu(m_i, u) = \exp\left[-\frac{(u - m_i)^2}{2h^2}\right] \tag{8.25}$$
where $h$ is called the normal diffusion coefficient, which can be simply calculated [Huang, 1997] by
$$h = \begin{cases}
1.6987\,(b - a)/(n - 1), & \text{for } 1 < n \le 5 \\
1.4456\,(b - a)/(n - 1), & \text{for } 6 \le n \le 7 \\
1.4230\,(b - a)/(n - 1), & \text{for } 8 \le n \le 9 \\
1.4208\,(b - a)/(n - 1), & \text{for } 10 \le n.
\end{cases} \tag{8.26}$$
Here $b$ and $a$ are the largest and smallest values in the sample (compare the calculation in (8.41)); they are not the empirical constants of the linear relationship. Using the normal diffusion function, we can change any input–output observation $(m_i, g_i) \in X$ in (8.24) into two fuzzy subsets
$$A_i = \int_U \mu(m_i, u)/u = \int_U \exp\left[-\frac{(u - m_i)^2}{2h_m^2}\right]\Big/ u \tag{8.27}$$
and
$$B_i = \int_V \mu(g_i, v)/v = \int_V \exp\left[-\frac{(v - g_i)^2}{2h_g^2}\right]\Big/ v. \tag{8.28}$$
Obviously, $(m_i, g_i)$ means
$$A_i \to B_i. \tag{8.29}$$
In order to preserve more information, we employ the correlation-product encoding [Kosko, 1992] to produce a fuzzy relationship based on $A_i \to B_i$ instead of the correlation-minimum encoding in the Mamdani–Togai model. Therefore,
$$R_i(u, v) = A_i(u)\,B_i(v); \quad u \in U,\ v \in V. \tag{8.30}$$
So, we can get n fuzzy relationships from n historical earthquake observations.
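The construction in (8.25)–(8.30) can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function names, the toy sample, and the grids `U` and `V` are hypothetical, and the code assumes that $a$ and $b$ in (8.26) are the smallest and largest sample values.

```python
import numpy as np

def diffusion_coefficient(sample):
    """Normal diffusion coefficient h of Equation (8.26).

    Assumption: a and b are the smallest and largest sample values.
    """
    x = np.asarray(sample, dtype=float)
    n = x.size
    if n < 2:
        raise ValueError("Equation (8.26) needs at least two observations")
    if n <= 5:
        c = 1.6987
    elif n <= 7:
        c = 1.4456
    elif n <= 9:
        c = 1.4230
    else:
        c = 1.4208
    return c * (x.max() - x.min()) / (n - 1)

def normal_diffusion(x_i, grid, h):
    """Membership values of Equation (8.25) on a discrete universe."""
    grid = np.asarray(grid, dtype=float)
    return np.exp(-((grid - x_i) ** 2) / (2.0 * h ** 2))

def product_relation(a_i, b_i):
    """Correlation-product encoding (8.30): R_i(u, v) = A_i(u) * B_i(v)."""
    return np.outer(a_i, b_i)

# Hypothetical toy sample (not data from the chapter).
magnitudes = [5.0, 5.5, 6.0, 6.5, 7.0, 7.5]
log_areas = [1.5, 2.0, 2.6, 3.0, 3.3, 3.8]
U = np.linspace(5.0, 7.5, 11)   # discrete universe for magnitude
V = np.linspace(1.5, 3.8, 24)   # discrete universe for log10 area

h_m = diffusion_coefficient(magnitudes)
h_g = diffusion_coefficient(log_areas)
A_1 = normal_diffusion(magnitudes[1], U, h_m)   # fuzzy subset (8.27)
B_1 = normal_diffusion(log_areas[1], V, h_g)    # fuzzy subset (8.28)
R_1 = product_relation(A_1, B_1)                # fuzzy relationship (8.30)
print(h_m, h_g, R_1.shape)
```

Each observation yields one matrix `R_i` whose rows follow the magnitude grid and whose columns follow the log-area grid; the product encoding keeps the full shape of both membership functions, whereas a minimum encoding would clip them.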
8.5.2 Pattern smoothing
Suppose we have $n$ observations $(m_1, g_1), (m_2, g_2), \ldots, (m_n, g_n)$. Using the information diffusion technique, we can get $n$ fuzzy if-then rules $A_1 \to B_1, A_2 \to B_2, \ldots, A_n \to B_n$.
Now, if a crisp input value $m_0$ (the antecedent) is known, then one needs a way to infer the consequent $g_0$ from $m_0$ and $R_i$. In practical calculation, $U$ is generally discrete, so $m_0$ does not, in general, coincide with any value in $U$. We can employ the information distribution formula in (8.31) to get a fuzzy subset as an input:
$$\underset{\sim}{m}_0(u_j) = \begin{cases}
1 - |m_0 - u_j|/\Delta, & \text{if } |m_0 - u_j| \le \Delta \\
0, & \text{if } |m_0 - u_j| > \Delta
\end{cases} \tag{8.31}$$
where $\Delta = u_2 - u_1$. A fuzzy consequent $\underset{\sim}{g}_0$ from $\underset{\sim}{m}_0$ and $R_i$ is then obtained as
$$\underset{\sim}{g}_0(v) = \sum_u \underset{\sim}{m}_0(u)\,R_i(u, v). \tag{8.32}$$
When we defuzzify $\underset{\sim}{g}_0$ into a crisp output value $g_0$, there is no operator that can avoid system error. In order to get direct crisp output values, we only need to change the magnitude component into fuzzy subsets. In other words, for the sake of defuzzifying easily, it is unnecessary to change the logarithmic isoseismal area component into fuzzy subsets when we construct the relationships $R_i$. That is, for an observation $(m_i, g_i)$, we employ Equation (8.27) to change input $m_i$ into a fuzzy subset $A_i$ with membership function $A_i(u)$, but employ Equation (8.33) instead of (8.28) to change output $g_i$ into a fuzzy subset $B_i$ with membership function $B_i(v)$; that is, a singleton:
$$B_i(v) = \begin{cases} 1, & \text{if } v = g_i \\ 0, & \text{if } v \ne g_i. \end{cases} \tag{8.33}$$
In this case, Equation (8.30) is changed into (8.34):
$$R_i(u, v) = \begin{cases} A_i(u), & \text{if } v = g_i \\ 0, & \text{if } v \ne g_i. \end{cases} \tag{8.34}$$
Therefore
$$\underset{\sim}{g}_0(v) = \begin{cases} \sum_u \underset{\sim}{m}_0(u)\,A_i(u), & \text{if } v = g_i \\ 0, & \text{if } v \ne g_i. \end{cases} \tag{8.35}$$
Let
$$W_i = \sum_u \underset{\sim}{m}_0(u)\,A_i(u). \tag{8.36}$$
(In fact, $W_i$ is the possibility (weight) that consequent $g_0$ may be $g_i$.) Then, to integrate all results coming from $R_1, R_2, \ldots, R_n$, the relevant output value $g_0$ becomes
$$g_0 = \sum_{i=1}^{n} W_i\,g_i \Big/ \sum_{i=1}^{n} W_i. \tag{8.37}$$
The procedure comprising (8.31)–(8.37) is called “information-diffusion approximate reasoning” (IDAR). Any observation $(m_i, g_i)$ of the sample can thus be changed into a new pattern $(m_i, \tilde{g}_i)$ via information-diffusion approximate reasoning. The new patterns must be smoother.
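To see why the procedure smooths the data, the step from (8.31) to (8.37) can be written out for an input that lies between two neighbouring grid points. The derivation below is a sketch under the assumption that $U$ is equally spaced with step $\Delta$; the interpolation parameter $t$ is introduced here for convenience and is not notation from the chapter.

```latex
% Assume u_j \le m_0 < u_{j+1}, with u_{j+1} - u_j = \Delta,
% and set t = (m_0 - u_j)/\Delta, so that 0 \le t < 1.
\begin{align*}
\underset{\sim}{m}_0(u_j)     &= 1 - \frac{m_0 - u_j}{\Delta} = 1 - t, \\
\underset{\sim}{m}_0(u_{j+1}) &= 1 - \frac{u_{j+1} - m_0}{\Delta} = t, \\
\underset{\sim}{m}_0(u_k)     &= 0 \quad \text{for } k \notin \{j,\, j+1\}, \\
W_i &= \sum_u \underset{\sim}{m}_0(u)\,A_i(u)
     = (1 - t)\,A_i(u_j) + t\,A_i(u_{j+1}), \\
g_0 &= \frac{\sum_{i=1}^{n} W_i\, g_i}{\sum_{i=1}^{n} W_i}.
\end{align*}
```

Under this assumption the two nonzero memberships sum to one, $W_i$ is the value of $A_i$ linearly interpolated at $m_0$, and $g_0$ is a weighted average of the observed $g_i$ in which observations with magnitudes close to $m_0$ dominate. Contradictory observations at the same or nearby magnitudes are therefore averaged rather than kept as conflicting targets.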
8.5.3 Learning relationships by BP neural networks
Using neural networks for automatic learning by examples has been a common approach employed in the construction of artificial systems. Backpropagation (BP) neural networks [Pao, 1989; Rumelhart & McClelland, 1973], a class of feedforward neural networks, are models commonly used for learning and reasoning. However, neural information processing models generally assume that the patterns used for training a neural network are compatible. Because the new patterns that result from information-diffusion approximate reasoning are smoother, they are compatible. Therefore, we can employ the BP neural networks to learn relationships from the new patterns. We integrate the information-diffusion approximate reasoning and a conventional BP neural network into a hybrid model (HM) depicted in Figure 8.1, called a hybrid fuzzy neural network. In the hybrid model, observations $(m_1, g_1), (m_2, g_2), \ldots, (m_n, g_n)$ are first changed, via the information-diffusion technique, into new patterns $(m_1, \tilde{g}_1), (m_2, \tilde{g}_2), \ldots, (m_n, \tilde{g}_n)$. Then a conventional BP neural network is employed to learn the relationship between logarithmic isoseismal area $g$ and earthquake magnitude $m$.
8.5.4 An application
In Yunnan province of China, there is a data set of strong earthquakes consisting of 25 records from 1913 to 1976 with magnitude and isoseismal area, the latter surveyed from the region with intensity $I \ge \mathrm{VII}$. The magnitude and the isoseismal area are shown in the $M$ column and the $S_{I \ge \mathrm{VII}}$ column of Table 8.10, respectively. The given sample with magnitude $m$ and logarithmic isoseismal area $g = \log_{10} S$ is shown in Equation (8.38):
$$\begin{aligned}
X &= \{x_1, x_2, \ldots, x_{25}\} = \{(m_1, g_1), (m_2, g_2), \ldots, (m_{25}, g_{25})\} \\
  &= \{(6.5, 3.455), (6.5, 3.545), (7, 3.677), (5.75, 2.892), (7, 3.414), \\
  &\quad\ (7, 3.219), (6.25, 3.530), (6.25, 3.129), (5.75, 2.279), (6, 1.944), \\
  &\quad\ (5.8, 1.672), (6, 3.554), (6.2, 2.652), (6.1, 2.865), (5.1, 1.279), \\
  &\quad\ (6.5, 3.231), (5.4, 2.417), (6.4, 2.606), (7.7, 3.913), (5.5, 2.000), \\
  &\quad\ (6.7, 2.326), (5.5, 1.255), (6.8, 2.301), (7.1, 2.923), (5.7, 1.996)\}.
\end{aligned} \tag{8.38}$$
Figure 8.1 Architecture of the hybrid model (HM) integrating information-diffusion approximate reasoning (IDAR) and a conventional BP neural network.
Linear regression methods have conventionally been used to estimate the relationship between $g$ and $m$. Regressing $g$ on $m$ with data in the given sample, we obtain the regression line shown in (8.39), whose $r^2 = 0.503$ is relatively small.
$$g = -2.60767 + 0.8521531\,m \tag{8.39}$$
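The regression in (8.39) can be checked directly from the sample in (8.38). The following is a minimal sketch using NumPy's `polyfit`; the variable names are illustrative, and because the $g$ values in (8.38) are rounded to three decimals, the fitted coefficients may differ from (8.39) in the later digits.

```python
import numpy as np

# Sample (8.38): magnitude m and logarithmic isoseismal area g = log10(S).
m = np.array([6.5, 6.5, 7.0, 5.75, 7.0, 7.0, 6.25, 6.25, 5.75, 6.0,
              5.8, 6.0, 6.2, 6.1, 5.1, 6.5, 5.4, 6.4, 7.7, 5.5,
              6.7, 5.5, 6.8, 7.1, 5.7])
g = np.array([3.455, 3.545, 3.677, 2.892, 3.414, 3.219, 3.530, 3.129,
              2.279, 1.944, 1.672, 3.554, 2.652, 2.865, 1.279, 3.231,
              2.417, 2.606, 3.913, 2.000, 2.326, 1.255, 2.301, 2.923,
              1.996])

# Ordinary least squares for g = intercept + slope * m.
slope, intercept = np.polyfit(m, g, 1)

g_fit = intercept + slope * m
ss_res = np.sum((g - g_fit) ** 2)        # residual sum of squares
ss_tot = np.sum((g - g.mean()) ** 2)     # total sum of squares
r_squared = 1.0 - ss_res / ss_tot
mean_sq_error = ss_res / m.size          # average squared error per record

print(f"g = {intercept:.4f} + {slope:.4f} m")
print(f"r^2 = {r_squared:.3f}, mean squared error = {mean_sq_error:.4f}")
```

A coefficient of determination near 0.5 means that the straight line explains only about half of the variance of $g$, which is the sense in which the linear fit is weak here.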
To establish a relationship by the neural network approach, we constructed a conventional BP neural network with one node in the input layer, 15 nodes in the hidden layer, and one node in the output layer. Setting the momentum rate η = 0.9 and learning rate α = 0.7, we directly used X in (8.38) to train the BP network. After 600,000 iterations, the normalized system error is 0.015594. The results obtained by the regression and neural network methods are compared in Figure 8.2. Apparently, the BP curve does not quite adequately capture the observed relationship. Neither does the regression method.
Figure 8.2 Relationship between earthquake magnitude and logarithmic isoseismal area, estimated by linear regression and BP network.
A careful examination of $X$ shows that it is, in fact, relatively small and that the data it contains are incomplete and sometimes contradictory. Therefore, any observation in $X$ can be regarded as a piece of fuzzy information that partially represents the relationship between the two variables. Furthermore, an observation can produce a simple fuzzy relationship via the diffusion method. The integration of all fuzzy relationships will, in turn, produce a more appropriate description. For the analysis of data in Table 8.10, the procedure for our proposed approach is as follows. First, let the discrete universe of discourse of earthquake magnitudes be
$$U = \{u_1, u_2, \ldots, u_{30}\} = \{5.010, 5.106, \ldots, 7.790\} \tag{8.40}$$
where the step length is approximately 0.096 (that is, $(7.790 - 5.010)/29 \approx 0.0959$). Then, from $X_m = \{m_i \mid i = 1, 2, \ldots, 25\}$, using Equation (8.26), we obtain the normal diffusion coefficient as
$$h_m = 1.4208\,(7.7 - 5.1)/(25 - 1) = 0.15392. \tag{8.41}$$
Table 8.10 Magnitudes and isoseismal areas.

Date          M       S (I ≥ VII), km²
1913.12.21    6.5     2848
1917.7.31     6.5     3506
1925.3.16     7       4758
1930.5.15     5.75    779
1941.5.16     7       2593
1941.12.26    7       1656
1951.12.21    6.25    3385
1952.6.19     6.25    1345
1952.12.28    5.75    190
1955.6.7      6       88
1961.6.12     5.8     47
1961.6.27     6       3582
1962.6.24     6.2     449
1965.7.3      6.1     733
1966.1.31     5.1     19
1966.2.5      6.5     1703
1966.9.19     5.4     261
1966.9.28     6.4     404
1970.1.5      7.7     8176
1970.2.7      5.5     100
1971.4.28     6.7     212
1973.3.22     5.5     18
1973.8.6      6.8     200
1974.5.11     7.1     837
1976.2.16     5.7     99

M: magnitude measured on the Richter scale; S: isoseismal area measured in square kilometers.
Applying the normal information diffusion formula (8.25), observation $m_i$ can be changed into a fuzzy subset as
$$A_i = \int_U m_i(u_j)/u_j = \int_U \exp\left[-\frac{(u_j - m_i)^2}{2h_m^2}\right]\Big/ u_j. \tag{8.42}$$
Therefore, an observation $x_i = (m_i, g_i)$ can be transformed into a single-column fuzzy relationship matrix:
$$R_i = \begin{array}{c|c}
 & g_i \\ \hline
u_1 & m_i(u_1) \\
u_2 & m_i(u_2) \\
\vdots & \vdots \\
u_{30} & m_i(u_{30})
\end{array} \tag{8.43}$$
When magnitude $m_0$ is given, we can change it into a fuzzy subset on $U$ through the information distribution formula in (8.31). Employing the weight formula in (8.36), we calculate the relevant logarithmic isoseismal area $g_0$. Let $m_0 = m_i \in X_m$; we can obtain a new sample as follows:
$$\begin{aligned}
\tilde{X} &= \{\tilde{x}_1, \tilde{x}_2, \ldots, \tilde{x}_{25}\} = \{(m_1, \tilde{g}_1), (m_2, \tilde{g}_2), \ldots, (m_{25}, \tilde{g}_{25})\} \\
  &= \{(6.5, 3.113), (6.5, 3.113), (7, 3.184), (5.75, 2.231), (7, 3.184), \\
  &\quad\ (7, 3.184), (6.25, 3.028), (6.25, 3.028), (5.75, 2.231), (6, 2.684), \\
  &\quad\ (5.8, 2.300), (6, 2.684), (6.2, 2.979), (6.1, 2.858), (5.1, 1.444), \\
  &\quad\ (6.5, 3.113), (5.4, 1.931), (6.4, 3.120), (7.7, 3.912), (5.5, 1.963), \\
  &\quad\ (6.7, 2.871), (5.5, 1.963), (6.8, 2.884), (7.1, 3.231), (5.7, 2.168)\}.
\end{aligned} \tag{8.44}$$
This is used to train a conventional BP neural network [Rumelhart et al., 1986; Pao, 1989] with one unit in the input layer, one hidden layer with 15 units, and one unit in the output layer. Let the momentum rate be $\eta = 0.9$ and the learning rate be $\alpha = 0.7$. We use $\tilde{X}$ in (8.44) to train the BP neural network. After 153,780 iterations, the normalized system error is 0.00001. Using the hybrid model consisting of the information-diffusion approximate reasoning method and a BP neural network, we can get a better estimator, depicted in Figure 8.3.
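The smoothing step that produces the new sample can be sketched as follows. This is an illustrative reading of (8.31), (8.36), (8.37) and (8.40)–(8.42), not the author's code: the function name `idar` and the array names are hypothetical, and the grid is built with 30 equally spaced points between 5.010 and 7.790. Small differences from the values in (8.44) are possible because of grid and rounding details.

```python
import numpy as np

# Sample (8.38): magnitude m and logarithmic isoseismal area g.
m = np.array([6.5, 6.5, 7.0, 5.75, 7.0, 7.0, 6.25, 6.25, 5.75, 6.0,
              5.8, 6.0, 6.2, 6.1, 5.1, 6.5, 5.4, 6.4, 7.7, 5.5,
              6.7, 5.5, 6.8, 7.1, 5.7])
g = np.array([3.455, 3.545, 3.677, 2.892, 3.414, 3.219, 3.530, 3.129,
              2.279, 1.944, 1.672, 3.554, 2.652, 2.865, 1.279, 3.231,
              2.417, 2.606, 3.913, 2.000, 2.326, 1.255, 2.301, 2.923,
              1.996])

U = np.linspace(5.010, 7.790, 30)      # discrete universe, as in (8.40)
delta = U[1] - U[0]                    # step length, about 0.096
h_m = 1.4208 * (m.max() - m.min()) / (m.size - 1)   # coefficient (8.41)

# A[i, j] = A_i(u_j): normal diffusion of observation m_i on U, as in (8.42).
A = np.exp(-((U[None, :] - m[:, None]) ** 2) / (2.0 * h_m ** 2))

def idar(m0):
    """Crisp output g0 for a crisp input m0 via (8.31), (8.36), (8.37)."""
    # Information distribution (8.31): triangular membership of m0 on U.
    m0_fuzzy = np.clip(1.0 - np.abs(m0 - U) / delta, 0.0, None)
    weights = A @ m0_fuzzy                                # W_i of (8.36)
    return float(np.sum(weights * g) / np.sum(weights))   # (8.37)

# Smoothed patterns (m_i, g~_i), to be compared with (8.44).
g_smooth = np.array([idar(mi) for mi in m])
for mi, gi in zip(m, g_smooth):
    print(f"({mi:g}, {gi:.3f})")
```

Records that share a magnitude, such as the three events with $m = 6.5$, receive the same smoothed value, so the contradictory targets in the raw sample disappear before the network is trained.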
Figure 8.3 Relationship between earthquake magnitude and logarithmic isoseismal area, estimated by the hybrid model (HM) consisting of the IDAR method and a BP network.
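For comparison, the average sums of squared errors of the three estimators, $g^{\mathrm{LR}}$ (the linear-regression estimator, LR), $g^{\mathrm{BP}}$ (the BP neural network estimator, BP), and $\tilde{g}$ (the hybrid-model estimator, HM), are computed as follows:
$$\begin{cases}
\varepsilon_{\mathrm{LR}} = \dfrac{1}{25}\sum_{i=1}^{25} \left(g_i - g_i^{\mathrm{LR}}\right)^2 = 0.273425 \\[2ex]
\varepsilon_{\mathrm{BP}} = \dfrac{1}{25}\sum_{i=1}^{25} \left(g_i - g_i^{\mathrm{BP}}\right)^2 = 0.2102895 \\[2ex]
\varepsilon_{\mathrm{HM}} = \dfrac{1}{25}\sum_{i=1}^{25} \left(g_i - \tilde{g}_i\right)^2 = 0.1993096
\end{cases} \tag{8.45}$$
The hybrid model gives the smallest error of the three: about 27% below the linear-regression model and about 5% below the conventional BP model. It is preferable because the HM curve represents the nonlinearity of the relationship between earthquake magnitude and logarithmic isoseismal area, and the curve is more stable than the BP curve.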
8.6 Conclusion and Discussion
This chapter surveys some contemporary efforts in which the methodologies of fuzzy logic have been applied to earthquake research. Taxonomies of the types and scope of applications of fuzzy logic in seismology, as well as earthquake engineering, are described. Methodological needs to enable efficacious assessment of the effects of potential earthquakes with the aid of a few historical earthquake records are discussed.
In the past 30 years, scientists and engineers in seismology and earthquake engineering have introduced many methods of fuzzy logic and developed several approaches based on fuzzy logic for earthquake research and the analysis of complex systems. Early investigations of fuzzy logic for earthquake research focused on fuzzy concepts in earthquake prediction, earthquake engineering, and earthquake-resistant design. The most important results based on fuzzy logic were produced from 1985 through 1995. More recent studies focus on developing hybrid approaches integrating fuzzy logic, neurocomputing, evolutionary computing, and probabilistic computing. Civil engineers in the USA first introduced fuzzy logic into earthquake research. Seismologists and earthquake engineers in China and Japan developed the traditional fuzzy methods and suggested some new fuzzy methods to promote earthquake research.
After reviewing the contemporary efforts relevant to the development of some methodologies of fuzzy logic for earthquake research, we can come to the following conclusions. Fuzzy seismology, as a new branch of seismology, has provided some tools to analyze seismic information for improving earthquake prediction. There have been commonly accepted approaches to quantify some fuzzy concepts in earthquake research such as earthquake intensity and the grade of earthquake damage. The data in earthquake research are often scanty, incomplete, and contradictory. We can employ the methods of information diffusion to analyze these data and obtain a fuzzy relationship among the factors. There are many applications of fuzzy logic in earthquake-resistant design.
After reviewing the effectiveness of the fuzzy approach to earthquake research, we believe that it is important to note the following points. We are not able to predict individual earthquakes. We do not regard fuzzy logic as a technique to resolve this problem. Rather, we regard it as a method that can enhance other techniques to promote earthquake prediction. Fuzzy logic has provided powerful tools to quantify some inherently fuzzy concepts in earthquake research. In this field, however, fuzzy methods are still dependent on expert knowledge. Most people are interested in transforming the expert knowledge into a quantified model. A few pay attention to finding new knowledge from data. We should pay more attention to finding knowledge from seismic data and earthquake-disaster data, which are often scanty, incomplete, and contradictory.
It is important to note that neither the classical models nor the fuzzy models govern the physical processes in nature. Scientists propose them as a compensation for their own limitations in understanding the processes concerned. With the development of soft computing and computational intelligence, one day we shall see that we have no choice but to use fuzzy logic for analyzing seismic data and earthquake-disaster data when they are scanty, incomplete, or contradictory.
Acknowledgments
The author is especially indebted to Professor George J. Klir of Binghamton University (SUNY), who recognized the value of this research and recommended the chapter for publication. The project is sponsored by the Scientific Research Foundation for Returned Overseas Chinese Scholars, State Education Ministry. This research was done at the Key Laboratory of Environmental Change and Natural Disaster, Ministry of Education of China.
APPENDIX 8.A: Modified Mercalli Intensity Scale Used in China
I. People do not feel any Earth movement.
II. A few people might notice movement if they are at rest and/or on the upper floors of tall buildings.
III. Many people indoors feel movement. Hanging objects swing back and forth. People outdoors might not realize that an earthquake is occurring.
IV. Most people indoors feel movement. Hanging objects swing. Dishes, windows, and doors rattle. The earthquake feels like a heavy truck hitting the walls. A few people outdoors may feel movement. Parked cars rock.
V. Almost everyone feels movement. Sleeping people are awakened. Doors swing open or close. Dishes are broken. Pictures on the wall move. Small objects move or are turned over. Trees might shake. Liquids might spill out of open containers.
VI. Everyone feels movement. People have trouble walking. Objects fall from shelves. Pictures fall off walls. Furniture moves. Plaster in walls might crack. Trees and bushes shake. Damage is slight in poorly built buildings. No structural damage.
VII. People have difficulty standing. Drivers feel their cars shaking. Some furniture breaks. Loose bricks fall from buildings. Damage is slight to moderate in well-built buildings; considerable in poorly built buildings.
VIII. Drivers have trouble steering. Houses that are not bolted down might shift on their foundations. Tall structures such as towers and chimneys might twist and fall. Well-built buildings suffer slight damage. Poorly built structures suffer severe damage. Tree branches break. Hillsides might crack if the ground is wet. Water levels in wells might change.
IX. Well-built buildings suffer considerable damage. Houses that are not bolted down move off their foundations. Some underground pipes are broken. The ground cracks. Reservoirs suffer serious damage.
X. Most buildings and their foundations are destroyed. Some bridges are destroyed. Dams are seriously damaged. Large landslides occur. Water is thrown on the banks of canals, rivers, lakes. The ground cracks in large areas. Railroad tracks are bent slightly.
XI. Most buildings collapse. Some bridges are destroyed. Large cracks appear in the ground. Underground pipelines are destroyed. Railroad tracks are badly bent.
XII. Almost everything is destroyed. Objects are thrown into the air. The ground moves in waves or ripples. Large amounts of rock may move.
References
Aki, K. [1995], “Earthquake prediction, societal implications.” Reviews of Geophysics, 33(Supplement), 243–247.
Berlin, G. L. [1980], Earthquakes and the Urban Environment (Volume I). Coleman Research Corporation, Boca Raton, FL.
Blockley, D. I. [1975], “Predicting the likelihood of structural accidents.” Proceedings of the Institution of Civil Engineers, 59, Part 2, 659–668.
Blockley, D. I. [1977], “Analysis of structural failures.” Proceedings of the Institution of Civil Engineers, 62, Part 1, 51–74.
Bollinger, G. A., Chapman, M. C., & Sibol, M. S. [1993], “A comparison of earthquake damage areas as a function of magnitude across the United States.” Bulletin of the Seismological Society of America, 83(4), 1064–1080.
Brown, C. B. [1985], “The use of fuzzy sets in seismic engineering in the U.S.A.” In: Feng, D. Y., & Liu, X. H. (eds.), Fuzzy Mathematics in Earthquake Researches, pp. 2–7. Seismological Press, Beijing.
Brown, C. B., & Leonards, R. S. [1971], “Subjective uncertainty analysis.” American Society of Civil Engineers, National Structural Engineering Meeting, Baltimore, MD. Preprint No. 1388.
Cavallini, F., & Rebez, A. [1996], “Representing earthquake intensity–magnitude relationship with a nonlinear function.” Bulletin of the Seismological Society of America, 86(1), 73–78.
Chiang, W. L., & Dong, W. M. [1987], “Dynamic response of structures with uncertain parameters: a comparative study of probabilistic and fuzzy set models.” Probabilistic Engineering Mechanics, 2(2), 82–91.
China Academy of Building Research [1986], The Mammoth Tangshan Earthquake of 1976 Building Damage Photo Album. China Academic Publishers, Beijing (in Chinese).
Committee on Earthquake Engineering, Research Commission on Engineering and Technical Systems, National Research Council [1982], Earthquake Engineering Research 1982. National Academy Press, Washington, DC.
Feng, D. Y., & Ichikawa, M. [1989], “Quantitative estimation of time-variable earthquake hazard by using fuzzy set theory.” Tectonophysics, 169(1–3), 175–196.
Feng, D. Y., Lou, S. B., Lin, M. Z., Gu, J. P., Zhong, T. J., & Chen, H. C. [1982], “Application of fuzzy mathematics in evaluating earthquake intensity.” Earthquake Engineering and Engineering Vibration, 2(3), 16–28 (in Chinese).
Feng, D. Y., Wu, G. Y., Ichikawa, M., & Ito, H. [1989], “An application of the direct method of fuzzy pattern recognition to researches of earthquake precursors in the Tokai area.” Papers in Meteorology and Geophysics, 40(1), 1–19.
Feng, D. Y., Jiang, C., Zheng, X. M., Lin, M. Z., & Ito, H. [1992a], “Applications of fuzzy neural networks in earthquake prediction.” Proceedings of International Conference on Information and Systems ’92, pp. 809–812. Dalian Maritime University Publishing House, Dalian, China.
Feng, D. Y., Lin, M. Z., Gu, J. P., Jiang, C., & Yu, X. J. [1992b], Fuzzy Seismology: Application of Fuzzy Mathematics in Seismology. Seismological Press, Beijing (in Chinese).
Feng, D. Y., Chen, R. H., Lin, M. Z., & Jiang, C. [1996], “Applications of fuzzy sets in earthquake trend study.” The Selected Papers of Earthquake Prediction in China, pp. 63–69. Seismological Press, Beijing (in Chinese).
Fukushima, Y., Gariel, J. C., & Tanaka, R. [1995], “Site-dependent attenuation relations of seismic motion parameters at depth using borehole data.” Bulletin of the Seismological Society of America, 85(6), 1790–1804.
Geller, R. J., Jackson, D. D., Kagan, Y. Y., & Mulargia, F. [1997], “Earthquakes cannot be predicted.” Science, 275(5306), 1616–1617.
Gupta, I. N., & Nuttli, O. W. [1976], “Spatial attenuation of intensities for central U.S. earthquakes.” Bulletin of the Seismological Society of America, 66(3), 743–751.
Gutenberg, B., & Richter, C. F. [1944], “Frequency of earthquakes in California.” Bulletin of the Seismological Society of America, 34(4), 185–188.
He, G. N., & Guo, Y. [1988], “Fuzzy multifactorial evaluation on liquefaction of light loam.” Earthquake Engineering and Engineering Vibration, 8(3), 48–56 (in Chinese).
Howell, B. F., & Schultz, T. R. [1975], “Attenuation of modified Mercalli intensity with distance from the epicenter.” Bulletin of the Seismological Society of America, 65(3), 651–665.
Huang, C. F. [1997], “Principle of information diffusion.” Fuzzy Sets and Systems, 91(1), 69–90.
Huang, C. F. [1998a], “Deriving samples from incomplete data.” Proceedings of FUZZ–IEEE’98 (Volume I), Anchorage, Alaska, pp. 645–650.
Huang, C. F. [1998b], “Concepts and methods of fuzzy risk analysis.” Risk Research and Management in Asian Perspective. Proceedings of the First China–Japan Conference on Risk Assessment and Management, pp. 12–23. International Academic Publishers, Beijing.
Huang, C. F. [2000], “Demonstration of benefit of information distribution for probability estimation.” Signal Processing, 80(6), 1037–1048.
Huang, C. F., & Leung, Y. [1999], “Estimating the relationship between isoseismal area and earthquake magnitude by hybrid fuzzy-neural-network method.” Fuzzy Sets and Systems, 107(2), 131–146.
Huang, C. F., & Liu, Z. R. [1985], “Isoseismal area estimation of Yunnan province by fuzzy mathematical method.” In: Feng, D. Y., & Liu, X. H. (eds.), Fuzzy Mathematics in Earthquake Researches, pp. 185–195. Seismological Press, Beijing.
Huang, C. F., & Shi, Y. [2002], Towards Efficient Fuzzy Information Processing: Using the Principle of Information Diffusion. Physica-Verlag, Heidelberg.
Huang, C. F., & Xiu, X. W. [1988], “Fuzzy similarity character method used to predict building damage.” Earthquake Engineering and Engineering Vibration, 8(3), 57–68 (in Chinese).
Junji, K., & Feng, D. Y. [1995], Advances in Mathematical Seismology. Seismological Press, Beijing (in Chinese).
Kasahara, K. [1981], Earthquake Mechanics. Cambridge University Press, Cambridge, UK.
Kosko, B. [1992], Neural Networks and Fuzzy Systems. Prentice-Hall, Englewood Cliffs, NJ.
Kullback, S. [1959], Information Theory and Statistics. John Wiley, New York.
Lin, K. W. [1994], “Regional earthquake hypocenter location using a fuzzy logic algorithm enhanced SEISMOS program.” Geophysics Open-File Report 74. New Mexico Institute of Mining and Technology, Socorro, NM.
Lin, K. W., & Sanford, A. R. [2001], “Improving regional earthquake locations using a modified G matrix and fuzzy logic.” Bulletin of the Seismological Society of America, 91(1), 82–93.
Liu, X. H., & Dong, J. C. [1982], “Fuzzy mathematical method in earthquake intensity evaluation and building damage prediction.” Earthquake Engineering and Engineering Vibration, 2(4), 26–37 (in Chinese).
Liu, X. H., Wang, M. M., & Wang, P. Z. [1983], “Fuzzy earthquake intensity.” Earthquake Engineering and Engineering Vibration, 3(3), 62–75 (in Chinese).
Liu, X. H., Chen, Y. P., Zhang, W. D., & Wang, P. Z. [1985], “Quantity estimation of damaged buildings by employing falling-shadow Bayesian principle.” Earthquake Engineering and Engineering Vibration, 5(1), 1–12 (in Chinese).
Liu, Z. R. [1982], “Computational investigation on fuzzy relation between earthquake intensity and peak acceleration on ground motion.” Earthquake Engineering and Engineering Vibration, 2(3), 29–42 (in Chinese).
Liu, Z. R., & Huang, C. F. [1990], “Information distribution method relevant in fuzzy information analysis.” Fuzzy Sets and Systems, 36(1), 67–76.
Liu, Z. R., Huang, C. F., Kong, Q. Z., & Yin, X. F. [1987], “A fuzzy quantitative study on the effect of active fault distribution on isoseismal area in Yunnan.” Journal of Seismology, 7(1), 9–16 (in Chinese).
Lomnitz, C., & Rosenblueth, E. [1976], Seismic Risk and Engineering Decisions. Elsevier, Amsterdam.
Pao, Y. H. [1989], Adaptive Pattern Recognition and Neural Networks. Addison-Wesley, Reading, MA.
Rumelhart, D. E., & McClelland, J. L. [1973], Parallel Distributed Processing: Explorations in the Microstructure of Cognition. MIT Press, Cambridge, MA.
Rumelhart, D. E., Hinton, G. E., & Williams, R. J. [1986], “Learning internal representations by error propagation.” In: Rumelhart, D. E., & McClelland, J. L. (eds.), Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Volume 1: Foundations, pp. 318–362. MIT Press, Cambridge, MA.
Subramaniam, R. S., Reinhorn, A. M., Riley, M. A., & Nagarajaiah, S. [1996], “Hybrid control of structures using fuzzy logic.” Microcomputers in Civil Engineering: Special Issue on Fuzzy Logic in Civil Engineering, 11(1), 1–17.
Tarbuck, E. J., & Lutgens, F. K. [1991], Earth Science, Sixth edition. Macmillan, New York.
Tian, Q. W. [1983], “Fuzzy relation between earthquake intensity and building damage indices due to earthquake.” Earthquake Engineering and Engineering Vibration, 3(3), 76–83 (in Chinese).
Tokyo Institute of Technology [1975–1985], Summary of Papers on General Fuzzy Problems. Tokyo.
Wang, F. [1983], “Fuzzy recognition of the relations between epicentral intensity and magnitude.” Earthquake Engineering and Engineering Vibration, 3(3), 84–96 (in Chinese).
Wang, G. Y. [1982], “Fuzzy synthetic evaluation of earthquake intensity and its application to structural design.” Earthquake Engineering and Engineering Vibration, 2(4), 17–25 (in Chinese).
Wang, G. Y. [1984], “Two-stage fuzzy comprehensive evaluation of earthquake intensity.” Earthquake Engineering and Engineering Vibration, 4(1), 12–19 (in Chinese).
Wang, P. Z. [1983], Fuzzy Set Theory and its Applications. Shanghai Scientific-Technological Press, Shanghai (in Chinese).
Wang, P. Z., Liu, X. H., & Sanchez, E. [1986], “Set-valued statistics and its application to earthquake engineering.” Fuzzy Sets and Systems, 18(3), 347–356.
Wu, A. H., & Wang, G. Y. [1988], “Fuzzy decision-making for optimal design intensity of aseismic structure.” Earthquake Engineering and Engineering Vibration, 8(1), 1–11 (in Chinese).
Xiu, X. W., & Huang, C. F. [1989], “Fuzzy identification between dynamic response of structure and structural earthquake damage.” Earthquake Engineering and Engineering Vibration, 9(2), 57–66 (in Chinese).
Yao, J. T. P. [1980], “Damage assessment of existing structures.” Journal of Engineering Mechanics Division (American Society of Civil Engineers), 106(4), 785–799.
Zheng, X. M., & Feng, D. Y. [1989], “The fuzzy recognition of precursory anomalies of groundwater level before Tangshan M = 7.8 earthquake.” Seismology and Geology, 11(3), 1–8 (in Chinese).
Chapter 9
Fuzzy Transform: Application to the Reef Growth Problem
Irina Perfilieva
9.1 Introduction
9.2 Preliminaries
9.3 Fuzzy Partition of the Universe
9.4 F-Transform
9.5 Inverse F-Transform
9.6 Approximate Solution to the Cauchy Problem
    9.6.1 The generalized Euler method
    9.6.2 The generalized Euler–Cauchy method
9.7 Reef Growth Model and Sea Level Extraction
9.8 Conclusions
Acknowledgment
References
9.1 Introduction
Fuzzy logic provides a basis for the approximate description of different dependencies. Fuzzy logic’s ability to produce smooth descriptions has especially attracted researchers from various areas, and many of their results are unified by the common name: the universal approximation. These results have also proved the approximation property of so-called fuzzy models. Generally speaking, each model differs from the others by the choice of logical operations (see Novák et al. [1999]). Unfortunately, the technique of producing fuzzy approximation models has not been followed by other methods. We know from the theory of numerical methods that approximation models can replace original complex functions in some computations, e.g., in solving differential equations, etc. This chapter is a new contribution to this area. We call this type of technique numerical methods based on fuzzy approximation models. In this chapter, we show how it is possible to obtain an approximate value of a definite integral as well as an approximate solution to an ordinary differential equation on the basis of a certain class of fuzzy approximation models. We then apply this technique to the solution of a differential equation modeling reef growth (see also Chapter 3). Furthermore, we use it to model ancient sea level variations (see also Chapter 10).
9.2 Preliminaries
We confine ourselves to models that can be represented by functions of one variable. Generalization to the case of two and more variables is straightforward. Suppose that we are given data and are concerned with construction of a representative model. By this we mean the construction of a formula which represents (precisely or approximately) the given data. For example, if our data comprise a collection of pairs $(x_i, y_i)$, $i = 1, \ldots, n$, then the model can be given by a formula $F(x)$ such that the function $y = F(x)$ interpolates or approximates the data. In the first case we have
$$y_i = F(x_i), \quad i = 1, \ldots, n$$
while in the second case, the equality may not hold, but the function $y = F(x)$ “nicely” approximates (in the sense of some criterion) the data. We choose interpolation when our data are precise, otherwise we look for an approximation. Let us stress that in both cases the model may not be unique. It depends on the primary choice of a class of formulas ($F(x)$ in our notation) which is chosen to represent the data as well as the approximation criterion.
With this in mind, we come to the following formulation of the problem investigated in this chapter. We are given data which are not precise (e.g., the data contain errors of measurement, or comprise numeric or linguistic expert estimations, etc.). Our goal is to construct an approximating model of the given data, represented by a formula from a certain class and such that a certain criterion is satisfied. Moreover, having an approximating model at our disposal, we would like to apply it in further investigations which may be connected with the original data. In this chapter we will demonstrate how this technique works in solving differential equations where some parameters are replaced by their approximating models.
The structure of the chapter is as follows. In Section 9.3, the concepts of a fuzzy partition and uniform fuzzy partition of the universe are introduced. In Sections 9.4 and 9.5, the technique of direct and inverse fuzzy transforms (F-transforms hereafter) is introduced and approximating properties of the inverse F-transform are established. Section 9.6 presents the technique of approximate solvability of ordinary differential equations. Section 9.7 is an example of an application.
9.3 Fuzzy Partition of the Universe

We take an interval $[a, b]$ as a universe. That is, all (real-valued) functions considered in this chapter have this interval as a common domain. Let us introduce fuzzy sets (given by their membership functions) which are subsets of the universe $[a, b]$ and which form a fuzzy partition of the universe.

Definition 9.1 Let $x_1, \dots, x_n$ be fixed nodes within $[a, b]$, such that $x_1 = a$, $x_n = b$, and $n \ge 2$. We say that fuzzy sets $A_1, \dots, A_n$, identified with their membership functions $A_1(x), \dots, A_n(x)$ defined on $[a, b]$, form a fuzzy partition of $[a, b]$ if they fulfill the following conditions:

1. $A_k : [a, b] \to [0, 1]$, $A_k(x_k) = 1$.
2. $A_k(x) = 0$ if $x \notin (x_{k-1}, x_{k+1})$, where $x_0 = a$, $x_{n+1} = b$.
3. $A_k(x)$ is continuous.
4. $A_k(x)$ monotonically increases on $[x_{k-1}, x_k]$ and monotonically decreases on $[x_k, x_{k+1}]$.
5. $\sum_{k=1}^{n} A_k(x) = 1$, for all $x$.
The membership functions $A_1(x), \dots, A_n(x)$ are called basic functions. Figure 9.1 shows a fuzzy partition of the interval [1, 4] by fuzzy sets with triangular-shaped membership functions. The following formulas give the formal representation
Figure 9.1 An example of a fuzzy partition of [1, 4] by triangular membership functions.
of such triangular membership functions:

$$A_1(x) = \begin{cases} 1 - \dfrac{x - x_1}{h_1}, & x \in [x_1, x_2] \\ 0, & \text{otherwise} \end{cases}$$

$$A_k(x) = \begin{cases} \dfrac{x - x_{k-1}}{h_{k-1}}, & x \in [x_{k-1}, x_k] \\ 1 - \dfrac{x - x_k}{h_k}, & x \in [x_k, x_{k+1}] \\ 0, & \text{otherwise} \end{cases}$$

$$A_n(x) = \begin{cases} \dfrac{x - x_{n-1}}{h_{n-1}}, & x \in [x_{n-1}, x_n] \\ 0, & \text{otherwise} \end{cases}$$

where $k = 2, \dots, n - 1$, and $h_k = x_{k+1} - x_k$.

Moreover, we say that a fuzzy partition is uniform if the nodes $x_1, \dots, x_n$ are equidistant, i.e., $x_k = a + h(k - 1)$, $k = 1, \dots, n$, where $h = (b - a)/(n - 1)$, $n \ge 2$, and two additional properties are met:

6. $A_k(x_k - x) = A_k(x_k + x)$, for all $x$, $k = 2, \dots, n - 1$, $n > 2$.
7. $A_{k+1}(x) = A_k(x - h)$, for all $x$, $k = 2, \dots, n - 2$, $n > 2$.

In the case of a uniform fuzzy partition, $h$ is the length of the support of $A_1$ or $A_n$, while $2h$ is the length of the support of the other basic functions $A_k$, $k = 2, \dots, n - 1$. Figure 9.2 shows a uniform partition of the interval [1, 4] by sinusoidal-shaped basic functions. Their formal expressions are given below.

$$A_1(x) = \begin{cases} 0.5\left(\cos\dfrac{\pi}{h}(x - x_1) + 1\right), & x \in [x_1, x_2] \\ 0, & \text{otherwise} \end{cases}$$

$$A_k(x) = \begin{cases} 0.5\left(\cos\dfrac{\pi}{h}(x - x_k) + 1\right), & x \in [x_{k-1}, x_{k+1}] \\ 0, & \text{otherwise} \end{cases}$$

where $k = 2, \dots, n - 1$, and

$$A_n(x) = \begin{cases} 0.5\left(\cos\dfrac{\pi}{h}(x - x_n) + 1\right), & x \in [x_{n-1}, x_n] \\ 0, & \text{otherwise.} \end{cases}$$

For the sake of simplicity, here we consider only uniform partitions, though some results remain true in the general case. We will point out when this occurs.
Figure 9.2 An example of a uniform fuzzy partition of [1, 4] by sinusoidal membership functions.
The following lemma shows that, in the case of a uniform partition, the definite integral of a basic function does not depend on its concrete shape. This property will be further used to simplify a direct F-transform.

Lemma 9.1 Let the uniform partition of $[a, b]$ be given by basic functions $A_1(x), \dots, A_n(x)$. Then

$$\int_{x_1}^{x_2} A_1(x)\,dx = \int_{x_{n-1}}^{x_n} A_n(x)\,dx = \frac{h}{2} \tag{9.1}$$

and for $k = 2, \dots, n - 1$

$$\int_{x_{k-1}}^{x_{k+1}} A_k(x)\,dx = h \tag{9.2}$$

where $h$ is the length of the support of $A_1$.

Proof Obviously,

$$\int_{x_1}^{x_3} A_2(x)\,dx = \cdots = \int_{x_{n-2}}^{x_n} A_{n-1}(x)\,dx.$$

Therefore, to prove (9.2) it is sufficient to estimate

$$\int_{-h}^{h} A(x)\,dx$$

where $A(x) = A_2(x + a + h)$ and $x \in [-h, h]$. Based on properties 5 and 7 of basic functions, we can deduce that $1 - A(x) = A(x + h)$, $x \in [-h, 0]$. Then

$$\int_{0}^{h} A(x)\,dx = \int_{-h}^{0} A(x + h)\,dx = h - \int_{-h}^{0} A(x)\,dx$$

which implies (9.2). Equation (9.1) follows immediately from the symmetry of basic functions (property 6).
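The statement of Lemma 9.1 can be checked numerically. The following is a minimal sketch, not part of the original chapter: the function names (`uniform_nodes`, `triangular`, `sinusoidal`, `integrate`), the choice of seven nodes on [1, 4], and the use of NumPy with a composite trapezoid rule are all assumptions made for illustration.

```python
import numpy as np

def uniform_nodes(a, b, n):
    """Equidistant nodes x_1..x_n on [a, b] and the spacing h."""
    return np.linspace(a, b, n), (b - a) / (n - 1)

def triangular(x, nodes, h):
    """Row k holds the triangular basic function A_{k+1} evaluated at x."""
    x = np.asarray(x, dtype=float)
    return np.clip(1.0 - np.abs(x[None, :] - nodes[:, None]) / h, 0.0, None)

def sinusoidal(x, nodes, h):
    """Row k holds the sinusoidal basic function A_{k+1} evaluated at x."""
    x = np.asarray(x, dtype=float)
    d = x[None, :] - nodes[:, None]
    return np.where(np.abs(d) <= h, 0.5 * (np.cos(np.pi * d / h) + 1.0), 0.0)

def integrate(y, x):
    """Composite trapezoid rule along the last axis."""
    return np.sum(0.5 * (y[..., 1:] + y[..., :-1]) * np.diff(x), axis=-1)

# Illustrative check on [1, 4] with n = 7 nodes (hypothetical choice).
x = np.linspace(1.0, 4.0, 3001)
nodes, h = uniform_nodes(1.0, 4.0, 7)
for basis in (triangular, sinusoidal):
    A = basis(x, nodes, h)
    print(basis.__name__,
          "partition of unity:", np.allclose(A.sum(axis=0), 1.0),
          "areas:", np.round(integrate(A, x), 4))
```

Under these assumptions both shapes should report areas close to h/2 for the two boundary functions and close to h for the interior ones, which is the shape independence the lemma describes.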
9.4 F-Transform

In this section we introduce the technique of two F-transforms: direct and inverse. The direct F-transform takes the original function (which should be at least integrable) and converts it into an n-dimensional vector. The inverse F-transform converts the n-dimensional vector into a special function which approximates the original one. The advantage of the direct F-transform is that it produces a simple and unique representation of the original function which enables us to use the former instead of the latter in complex computations. After finishing the computations, the result can be brought back into the space of ordinary functions by the inverse F-transform. To be sure that this can be done we need to prove a number of theorems. The following definition [see also Perfilieva & Chaldeeva, 2001] introduces the direct F-transform (or fuzzy transform) of a given function.

Definition 9.2 Let $f(x)$ be any continuous (real-valued) function on $[a, b]$ and $A_1(x), \dots, A_n(x)$ be basic functions which form a fuzzy partition of $[a, b]$. We say that the n-tuple of real numbers $[F_1, \dots, F_n]$ is the F-transform of $f$ w.r.t. $A_1, \dots, A_n$ if

$$F_k = \frac{\int_a^b f(x) A_k(x)\,dx}{\int_a^b A_k(x)\,dx}. \tag{9.3}$$
Suppose that the basic functions $A_1, \dots, A_n$ are fixed. Denote the F-transform of $f$ w.r.t. $A_1, \dots, A_n$ by $F_n[f]$. Then, according to Definition 9.2, we can write

$$F_n[f] = [F_1, \dots, F_n]. \tag{9.4}$$

The elements $F_1, \dots, F_n$ are called components of the F-transform.
If the partition of $[a, b]$ by $A_1, \dots, A_n$ is uniform, then the expression (9.3) for components of the F-transform may be simplified on the basis of Lemma 9.1:

$$F_1 = \frac{2}{h} \int_{x_1}^{x_2} f(x) A_1(x)\,dx$$

$$F_n = \frac{2}{h} \int_{x_{n-1}}^{x_n} f(x) A_n(x)\,dx$$

$$F_k = \frac{1}{h} \int_{x_{k-1}}^{x_{k+1}} f(x) A_k(x)\,dx, \quad k = 2, \dots, n - 1.$$

Remark 9.1 Even in the case where $f(x)$ is known only at some nodes $x_1, \dots, x_l \in [a, b]$, the F-transform components of $f$ w.r.t. $A_1, \dots, A_n$ can be computed as follows:

$$F_k = \frac{\sum_{j=1}^{l} f(x_j) A_k(x_j)}{\sum_{j=1}^{l} A_k(x_j)}$$

where $1 \le k \le n$ and $n < l$.

It is easy to see that if a fuzzy partition (and therefore, basic functions) is fixed, then the F-transform as a mapping from $C[a, b]$ (the set of all continuous functions on $[a, b]$) to $\mathbb{R}^n$ is linear, so that

$$F_n[\alpha f + \beta g] = \alpha F_n[f] + \beta F_n[g]$$

for $\alpha, \beta \in \mathbb{R}$ and functions $f, g \in C[a, b]$.

One may say that we lose information by using an F-transform instead of the original function. However, we can investigate this problem by asking the following question: how fully is the original function $f$ represented by its F-transform? First of all, we will try to estimate each component $F_k$, $k = 1, \dots, n$, using different assumptions about the smoothness of $f$.

Lemma 9.2 Let $f(x)$ be any continuous function on $[a, b]$ and $A_1(x), \dots, A_n(x)$ be basic functions which form a uniform fuzzy partition of $[a, b]$. Then for each $k = 2, \dots, n - 1$, there exist two constants $c_{k1} \in [x_{k-1}, x_k]$ and $c_{k2} \in [x_k, x_{k+1}]$ such that

$$F_k = \frac{1}{h} \int_{c_{k1}}^{c_{k2}} f(x)\,dx$$

and for $k = 1$ ($k = n$) there exists $c \in [x_1, x_2]$ ($c \in [x_{n-1}, x_n]$) such that

$$F_1 = \frac{2}{h} \int_{x_1}^{c} f(x)\,dx \quad \left(F_n = \frac{2}{h} \int_{c}^{x_n} f(x)\,dx\right).$$
Proof The proof can be easily obtained from the second mean-value theorem.

Therefore, by Lemma 9.2 we can say that $F_k$ is a mean value of $f$ within the interval $[c_{k1}, c_{k2}]$ and thus it accumulates the information about function $f$ within this interval. We can evaluate $F_k$ more precisely if the function $f$ is twice continuously differentiable.

Lemma 9.3 Let the conditions of Lemma 9.2 be fulfilled, but function $f$ be twice continuously differentiable in $(a, b)$. Then for each $k = 1, \dots, n$

$$F_k = f(x_k) + O(h^2). \tag{9.5}$$

Proof The proof will be given for one fixed value of $k$ which lies between 2 and $n - 1$. The other two cases $k = 1$ and $k = n$ are considered analogously. We apply the trapezoid formula with nodes $x_{k-1}, x_k, x_{k+1}$ to the numerical computation of the integral

$$\frac{1}{h} \int_{x_{k-1}}^{x_{k+1}} f(x) A_k(x)\,dx$$

and obtain

$$F_k = \frac{1}{h} \int_{x_{k-1}}^{x_{k+1}} f(x) A_k(x)\,dx = \frac{1}{h} \cdot \frac{h}{2} \bigl(f(x_{k-1}) A_k(x_{k-1}) + 2 f(x_k) A_k(x_k) + f(x_{k+1}) A_k(x_{k+1})\bigr) + O(h^2) = f(x_k) + O(h^2).$$
Using (9.5) and applying again the trapezoid formula with nodes $x_1, \dots, x_n$ to the numerical computation of the integral

$$\int_{x_1}^{x_k} f(x)\,dx$$

we easily come to the following corollary.

Corollary 9.1 (Computation of definite integral) Let the conditions of Lemma 9.3 be fulfilled and the F-transform of $f$ be given by (9.4). Then for each $k = 2, \dots, n - 1$

$$\int_{x_1}^{x_k} f(x)\,dx = h\left(\tfrac{1}{2} F_1 + F_2 + \cdots + F_{k-1} + \tfrac{1}{2} F_k\right) + O(h^2). \tag{9.6}$$

Moreover, for any continuous function $f(x)$ the integral $\int_a^b f(x)\,dx$ can be computed precisely:

$$\int_a^b f(x)\,dx = h\left(\tfrac{1}{2} F_1 + F_2 + \cdots + F_{n-1} + \tfrac{1}{2} F_n\right). \tag{9.7}$$
Returning to the problem of losing information by dealing with an F-transform instead of its original, we can say that an F-transform preserves mean values of function f over 2h long subintervals with an accuracy of up to h2 .
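The direct F-transform (9.3), its discrete variant from Remark 9.1, and the integral formula (9.7) can be sketched in a few lines. This is an illustrative sketch only: it assumes a uniform triangular partition, approximates the integrals in (9.3) by a trapezoid rule, and uses hypothetical helper names (`f_transform`, `discrete_f_transform`, `integral_from_components`) that do not appear in the chapter.

```python
import numpy as np

def triangular_basis(x, nodes, h):
    """Row k holds the triangular basic function centred at nodes[k]."""
    return np.clip(1.0 - np.abs(x[None, :] - nodes[:, None]) / h, 0.0, None)

def trapezoid(y, x):
    """Composite trapezoid rule along the last axis."""
    return np.sum(0.5 * (y[..., 1:] + y[..., :-1]) * np.diff(x), axis=-1)

def f_transform(f, a, b, n, samples_per_cell=200):
    """Components F_1..F_n of (9.3) on a uniform triangular partition."""
    nodes = np.linspace(a, b, n)
    h = (b - a) / (n - 1)
    x = np.linspace(a, b, (n - 1) * samples_per_cell + 1)
    A = triangular_basis(x, nodes, h)
    F = trapezoid(f(x) * A, x) / trapezoid(A, x)
    return nodes, h, F

def discrete_f_transform(xs, ys, nodes, h):
    """Remark 9.1: components computed from samples (xs, ys) only."""
    A = triangular_basis(np.asarray(xs, dtype=float), nodes, h)
    return (A @ np.asarray(ys, dtype=float)) / A.sum(axis=1)

def integral_from_components(F, h):
    """Right-hand side of (9.7)."""
    return h * (0.5 * F[0] + F[1:-1].sum() + 0.5 * F[-1])

# Illustrative use: f(x) = sin(x) on [0, pi] with n = 11 nodes (hypothetical choice).
nodes, h, F = f_transform(np.sin, 0.0, np.pi, 11)
print("integral via (9.7):", integral_from_components(F, h))
print("max |F_k - f(x_k)| at interior nodes:",
      np.max(np.abs(F[1:-1] - np.sin(nodes[1:-1]))))
```

In this sketch the value obtained from (9.7) differs from the true integral only by the quadrature error of the inner trapezoid rule, since (9.7) itself is exact. The comparison with $f(x_k)$ is restricted to interior nodes.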
9.5 Inverse F-Transform

A reasonable question is the following: can we reconstruct the original function from its F-transform? The answer is clear: in general not precisely, because we are losing information when changing to the direct F-transform. However, the function which can be reconstructed (by the inverse F-transform) approximates the original one in such a way that a universal convergence can be established. Moreover, the inverse F-transform fulfills the best approximation criterion which can be called the piecewise integral least-square criterion.

Definition 9.3 Let $F_n[f] = [F_1, \dots, F_n]$ be the F-transform of a function $f(x)$ w.r.t. $A_1, \dots, A_n$. The function

$$f_{F,n}(x) = \sum_{k=1}^{n} F_k A_k(x) \tag{9.8}$$

will be called the inversion formula or the inverse F-transform.

The lemma below shows that the sequence of functions $f_{F,n}$ uniformly converges to $f$.

Lemma 9.4 Let $f(x)$ be any continuous function on $[a, b]$ and let $\{(A_1^{(n)}, \dots, A_n^{(n)})_n\}$ be a sequence of uniform fuzzy partitions of $[a, b]$, one for each $n$. Let $\{f_{F,n}(x)\}$ be the sequence of inverse F-transforms, each with respect to the given n-tuple $A_1^{(n)}, \dots, A_n^{(n)}$. Then for any $\varepsilon > 0$ there exists $n_\varepsilon$ such that for each $n > n_\varepsilon$ and for all $x \in [a, b]$

$$|f(x) - f_{F,n}(x)| < \varepsilon. \tag{9.9}$$
Proof Note that the function $f$ is uniformly continuous on $[a, b]$; i.e., for each $\varepsilon > 0$ there exists $\delta = \delta(\varepsilon) > 0$ such that for all $x_1, x_2 \in [a, b]$, $|x_1 - x_2| < \delta$ implies $|f(x_1) - f(x_2)| < \varepsilon$. To prove our lemma we choose some $\varepsilon > 0$ and find the respective $\delta$. Let $n_\varepsilon \ge 2$ be such that $h = (b - a)/(n_\varepsilon - 1) \le \delta/2$. We will show that with $n \ge n_\varepsilon$ (9.9) holds true. Let $F_{1n}, \dots, F_{nn}$ be the components of the single F-transform of $f$ w.r.t. basic functions $A_1^{(n)}, \dots, A_n^{(n)}$, $n \ge n_\varepsilon$. Then for all $t \in [x_k, x_{k+1}]$, $k = 1, \dots, n - 1$, we can evaluate

$$|f(t) - F_{kn}| = \left| f(t) - \frac{1}{h} \int_{x_{k-1}}^{x_{k+1}} f(x) A_k^{(n)}(x)\,dx \right| \le \frac{1}{h} \int_{x_{k-1}}^{x_{k+1}} |f(t) - f(x)|\, A_k^{(n)}(x)\,dx < \varepsilon$$

and analogously $|f(t) - F_{k+1,n}| < \varepsilon$. Therefore,

$$\left| f(t) - \sum_{k=1}^{n} F_{kn} A_k^{(n)}(t) \right| \le \sum_{k=1}^{n} A_k^{(n)}(t)\, |f(t) - F_{kn}| < \varepsilon \bigl(A_k^{(n)}(t) + A_{k+1}^{(n)}(t)\bigr) = \varepsilon.$$
Because argument $t$ has been chosen arbitrarily, this proves the required inequality.

The following corollary reformulates Lemma 9.4.

Corollary 9.2 Let the assumptions of Lemma 9.4 be fulfilled. Then the sequence of inverse F-transforms $\{f_{F,n}\}$ uniformly converges to $f$.

To illustrate this fact we choose two functions with different behavior, sin(1/x) and sin x (see Figures 9.3, 9.4), and consider different values of $n$. As we see below, the greater the value of $n$, the closer the approximating curve approaches the original function.

Since approximation by the inverse F-transform converges uniformly to the original function, we are interested in the criterion that distinguishes it among other functions from a certain class. As discussed at the beginning, this criterion guarantees us that the inverse F-transform is the best approximation in the sense we explain below.

Let $A_1(x), \dots, A_n(x)$ be basic functions which form a fuzzy partition of $[a, b]$, and let $f(x)$ be an integrable function on $[a, b]$. By $FT$ we denote the class of
Figure 9.3 sin(1/x) and its inverse transformations based on triangular-shaped basic functions: n is the number of nodes used to approximate the function.
approximating functions represented by the formula

$$\sum_{i=1}^{n} c_i A_i(x) \tag{9.10}$$

where $c_1, \dots, c_n$ are arbitrary real coefficients. Let the following piecewise integral least-square criterion

$$\Phi(c_1, \dots, c_n) = \int_a^b \left( \sum_{i=1}^{n} (f(x) - c_i)^2 A_i(x) \right) dx \tag{9.11}$$
characterize the closeness between $f(x)$ and a function from $FT$. Then the components $F_1, \dots, F_n$ of the F-transform minimize (9.11) and therefore determine the best approximation of $f(x)$ in $FT$. We leave this fact unproved because the proof is a technical exercise.
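For readers who want the missing step, one possible sketch of the argument is the following. It is not taken from the original text; it assumes that $f$ is square-integrable on $[a, b]$ and writes $\Phi(c_1, \dots, c_n)$ for the criterion (9.11). Because each coefficient appears in only one summand, the criterion splits into $n$ independent one-variable quadratics:

$$\Phi(c_1, \dots, c_n) = \sum_{i=1}^{n} \int_a^b (f(x) - c_i)^2 A_i(x)\,dx.$$

Setting the partial derivative with respect to $c_k$ to zero gives

$$\frac{\partial \Phi}{\partial c_k} = -2 \int_a^b (f(x) - c_k) A_k(x)\,dx = 0 \quad\Longleftrightarrow\quad c_k = \frac{\int_a^b f(x) A_k(x)\,dx}{\int_a^b A_k(x)\,dx} = F_k,$$

and since

$$\frac{\partial^2 \Phi}{\partial c_k^2} = 2 \int_a^b A_k(x)\,dx > 0,$$

this stationary point is the unique minimum, in agreement with (9.3).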
Figure 9.4 sin(x) and its inverse transformation based on sinusoidal-shaped basic functions (n = 20).
It is worth noticing that, so far, we have not specified any concrete shape for the basic functions. Thus, a natural question arises concerning the influence of different shapes of basic functions on the quality of the approximation. We can say the following. The properties of an approximating function are determined by the respective properties of basic functions. For example, if basic functions are of triangular shape then the approximating function will be piecewise linear. If we choose sinusoidal-shaped basic functions, then the approximating function will be, of course, smoother. In any case, whatever we want to obtain from the approximating function is required from the basic functions as well.

The following lemma shows how the difference between any two approximations of a given function by inverse F-transformations can be estimated. As can be seen, it depends on the character of smoothness of the original function expressed by its modulus of continuity (see below).

Lemma 9.5 Let $f'_{F,n}(x)$ and $f''_{F,n}(x)$ be two inverse F-transformations of the same function $f(x)$ w.r.t. n-tuples of different basic functions, $n \ge 2$. Then

$$|f'_{F,n}(x) - f''_{F,n}(x)| \le 2\omega(f, 2h)$$

where $\omega$ is the modulus of continuity of $f(x)$:

$$\omega(f, 2h) = \max_{|\delta| \le 2h}\ \max_{x \in [a, b]} |f(x + \delta) - f(x)|.$$
We illustrate Lemma 9.5 by considering two different inverse F-transforms of functions sin(1/x) and sin x. One is based on triangular-shaped basic functions and
Figure 9.5 sin(1/x) and its inverse transforms based on triangular and sinusoidal-shaped basic functions.
Figure 9.6 sin x and its inverse transforms based on triangular and sinusoidal-shaped basic functions.
one is based on sinusoidal-shaped basic functions (see Figures 9.5, 9.6). Because sin(1/x) has a modulus of continuity greater than sin x, the approximation of the latter with the same value of n looks nicer (both approximations practically coincide with the original function).
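The convergence stated in Corollary 9.2 and the bound of Lemma 9.5 can be explored with a small experiment. The sketch below is illustrative only: the helper names (`basis`, `inverse_f_transform`, `max_error`), the test function sin x on [0, 2π], and the numerical quadrature are assumptions, not the computations behind Figures 9.3 to 9.6.

```python
import numpy as np

def basis(x, nodes, h, shape="triangular"):
    """Rows are the basic functions A_1..A_n of a uniform partition at points x."""
    d = x[None, :] - nodes[:, None]
    if shape == "triangular":
        return np.clip(1.0 - np.abs(d) / h, 0.0, None)
    # Any other value of `shape` selects the sinusoidal (raised-cosine) form.
    return np.where(np.abs(d) <= h, 0.5 * (np.cos(np.pi * d / h) + 1.0), 0.0)

def inverse_f_transform(f, a, b, n, shape="triangular", cells=100):
    """Return a sample grid x and the inverse F-transform (9.8) evaluated on it."""
    nodes = np.linspace(a, b, n)
    h = (b - a) / (n - 1)
    x = np.linspace(a, b, (n - 1) * cells + 1)
    A = basis(x, nodes, h, shape)
    dx = np.diff(x)
    trap = lambda y: np.sum(0.5 * (y[..., 1:] + y[..., :-1]) * dx, axis=-1)
    F = trap(f(x) * A) / trap(A)   # direct F-transform, formula (9.3)
    return x, F @ A                # inverse F-transform, formula (9.8)

def max_error(f, a, b, n, shape="triangular"):
    """Largest deviation |f - f_{F,n}| on the sample grid."""
    x, approx = inverse_f_transform(f, a, b, n, shape)
    return np.max(np.abs(f(x) - approx))

for shape in ("triangular", "sinusoidal"):
    errors = [max_error(np.sin, 0.0, 2.0 * np.pi, n, shape) for n in (10, 20, 40)]
    print(shape, errors)
```

With these assumptions the reported maximal error should shrink as n grows for both shapes, and the two reconstructions for a fixed n should stay within the bound $2\omega(f, 2h)$ of Lemma 9.5.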
9.6 Approximate Solution to the Cauchy Problem In this section we show how the approximation models based on the F-transform can be used in applications. In general, we mean by this that if the original function is replaced by an approximation model as described above, then a certain simplification of complex computations can be achieved. For demonstration, we will consider the
Cauchy problem

$$y'(x) = f(x, y), \quad y(x_1) = y_1 \tag{9.12}$$
and show how it can be approximately solved in the interval [x1 , xn ] if F-transform is applied to both sides of the differential equation. Let us stress that in this section we need a uniform fuzzy partition of [x1 , xn ].
9.6.1 The generalized Euler method

Suppose that we are given the Cauchy problem (9.12) where the functions $y(x)$ and $f(x, y(x))$ on $[x_1, x_n]$ are sufficiently smooth. Let us choose some uniform fuzzy partition of interval $[x_1, x_n]$ with parameter $h = (x_n - x_1)/(n - 1)$, $n \ge 2$, and apply the direct F-transform to both parts of the differential equation. In this way we transfer the original Cauchy problem to the space of fuzzy units, solve it in the new space, and then transfer it back by the inverse F-transform. We describe the sequence of steps which leads to the solution. The justification is proved in Theorem 9.1.

Before we apply the direct F-transform to both parts of the differential equation, we replace $y'(x)$ by its approximation $(y(x + h) - y(x))/h$ so that

$$y(x + h) = y(x) + h y'(x) + O(h^2). \tag{9.13}$$
Denote $y_1(x) = y(x + h)$ as a new function and apply the direct F-transform to both parts of (9.13). By the linearity of F-transform and Lemma 9.3 we obtain from (9.13) the expression for F-transform components of the respective functions

$$F_n[y'] = \frac{1}{h} \bigl(F_n[y_1] - F_n[y]\bigr) + O(h^2). \tag{9.14}$$

Here $F_n[y'] = [Y'_1, \dots, Y'_{n-1}]$, $F_n[y] = [Y_1, \dots, Y_{n-1}]$, and $F_n[y_1] = [Y1_1, \dots, Y1_{n-1}]$. Note that these vectors are one component shorter than in Definition 9.2 because the function $y_1(x)$ may not be defined on $[x_{n-1}, x_n]$, $x_n = b$. It is not difficult to prove that

$$Y1_1 = Y_2 + O(h^2)$$

$$Y1_k = Y_{k+1}, \quad k = 2, \dots, n - 1.$$

Indeed, for values $k = 2, \dots, n - 2$

$$Y1_k = \frac{1}{h} \int_{x_{k-1}}^{x_{k+1}} y(x + h) A_k(x)\,dx = \frac{1}{h} \int_{x_k}^{x_{k+2}} y(t) A_{k+1}(t)\,dt = Y_{k+1}.$$

For the values $k = 1, n - 1$ the proof is analogous. Therefore, Equation (9.14) gives us the way to compute components of the F-transform of $y'$ via components of the F-transform of $y$:

$$Y'_k = \frac{1}{h} (Y_{k+1} - Y_k) + O(h^2), \quad k = 1, \dots, n - 1. \tag{9.15}$$

Let us introduce the $(n - 1) \times n$ matrix

$$D = \frac{1}{h} \begin{pmatrix} -1 & 1 & 0 & \cdots & 0 & 0 \\ 0 & -1 & 1 & \cdots & 0 & 0 \\ \vdots & & & & & \vdots \\ 0 & 0 & 0 & \cdots & -1 & 1 \end{pmatrix} \tag{9.16}$$

so that equality (9.15) can be rewritten (up to $O(h^2)$) as matrix equality

$$F_n[y'] = D F_n[y] \tag{9.17}$$
where $F_n[y'] = [Y'_1, \dots, Y'_{n-1}]^T$ and $F_n[y] = [Y_1, \dots, Y_n]^T$.

Coming back to the Cauchy problem (9.12) and applying F-transform to both sides of the differential equation, we will obtain the following system of linear equations with respect to the unknown $F_n[y]$:

$$D F_n[y] = F_n[f] \tag{9.18}$$

where $F_n[f] = [F_1, \dots, F_{n-1}]^T$ is the F-transform of $f(x, y)$ as the function of $x$ w.r.t. the chosen basic functions $A_1, \dots, A_n$. The last component $F_n$ is not present in $F_n[f]$ due to the preservation of dimensionality.

Note that system (9.18) does not include the initial condition of (9.12). For this, let us complete matrix $D$ by adding the first row

$$D^c = \frac{1}{h} \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 & 0 \\ -1 & 1 & 0 & \cdots & 0 & 0 \\ \vdots & & & & & \vdots \\ 0 & 0 & 0 & \cdots & -1 & 1 \end{pmatrix}$$

so that $D^c$ is an $n \times n$ nonsingular matrix. Analogously, let us complete the vector $F_n[f]$ by the first component $y_1/h$ so that

$$F_n^c[f] = \left[\frac{y_1}{h}, F_1, \dots, F_{n-1}\right]^T.$$

Then, the transformed Cauchy problem can be fully represented by the following linear system of equations with respect to the unknown $F_n[y]$:

$$D^c F_n[y] = F_n^c[f]. \tag{9.19}$$
The solution of (9.19) is given by the formula

$$F_n[y] = (D^c)^{-1} F_n^c[f] \tag{9.20}$$

which, in fact, is the generalized Euler method. To make sure, we compute the inverse matrix

$$(D^c)^{-1} = h \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 & 0 \\ 1 & 1 & 0 & \cdots & 0 & 0 \\ \vdots & & & & & \vdots \\ 1 & 1 & 1 & \cdots & 1 & 1 \end{pmatrix}$$

and rewrite (9.20) componentwise:

$$\begin{aligned} Y_1 &= y_1 \\ Y_2 &= y_1 + h F_1 \\ Y_3 &= y_1 + h F_1 + h F_2 \\ &\ \ \vdots \\ Y_n &= y_1 + h F_1 + \cdots + h F_{n-1} \end{aligned}$$

or, in a more concise way,

$$Y_1 = y_1, \qquad Y_{k+1} = Y_k + h F_k, \quad k = 1, \dots, n - 1. \tag{9.21}$$

Formulas (9.21) can be applied to the computation of $Y_2, \dots, Y_n$ provided that the way of computing $F_1, \dots, F_{n-1}$ is known. However, it cannot be done directly using formulas (9.3) because the expression for function $f(x, y)$ includes also the unknown function $y$. Therefore, we have to get around this difficulty. The following approximation

$$\hat{F}_k = \frac{\int_a^b f(x, Y_k) A_k(x)\,dx}{\int_a^b A_k(x)\,dx} \tag{9.22}$$

for $F_k$, $k = 1, \dots, n - 1$, is suggested. The theorem given below provides the justification.

Theorem 9.1 Let the Cauchy problem (9.12) with twice differentiable parameters be transformed by applying F-transform w.r.t. basic functions $A_1, \dots, A_n$ to both sides of a given differential equation. Then the components of the F-transform of $y$ w.r.t. the same basic functions can be found approximately from the following system of equations

$$Y_1 = y_1 \tag{9.23}$$

$$Y_{k+1} = Y_k + h \hat{F}_k, \quad k = 1, \dots, n - 1, \tag{9.24}$$

where $\hat{F}_k$ is given by (9.22). The local approximation error is of the order $h^2$.

Proof It has been shown that the system of linear equations (9.19) represents the F-transform of the Cauchy problem (9.12) up to $O(h^2)$. Therefore, to prove the theorem it is sufficient to show that for each $k = 1, \dots, n - 1$, the order of the difference $F_k - \hat{F}_k$ is $h^2$. Let us denote $y(x_k) = y_k$. First, we estimate (using the trapezoid formula) the intermediate difference

$$f(x_k, y_k) - \hat{F}_k = \frac{1}{h} \int_{x_{k-1}}^{x_{k+1}} \bigl(f(x_k, y_k) - f(x, Y_k)\bigr) A_k(x)\,dx = \frac{1}{h} \cdot \frac{h}{2} \cdot 2\bigl(f(x_k, y_k) - f(x_k, Y_k)\bigr) + O(h^2) = \frac{\partial f}{\partial y}(x_k, \bar{y})(y_k - Y_k) + O(h^2)$$

where $\bar{y} \in [y_k, Y_k]$. By Lemma 9.3, we have $y_k - Y_k = O(h^2)$ which, when substituted into the expression above, leads to the estimation

$$f(x_k, y_k) - \hat{F}_k = O(h^2).$$

Again, by Lemma 9.3, we get

$$f(x_k, y_k) - F_k = O(h^2)$$

which together with the previous estimation proves that

$$F_k - \hat{F}_k = O(h^2).$$

This completes the proof.
Corollary 9.3 The generalized Euler method for (9.12) is given by the recursive scheme (9.23)– (9.24) with the local error O(h2 ). The approximate solution to (9.12) can be found
Figure 9.7 Precise solution (gray line) and approximate solution (black line) of the Cauchy problem obtained by the generalized Euler method (n = 10).
by taking the inverse F-transform

$$y_{Y,n}(x) = \sum_{k=1}^{n} Y_k A_k(x)$$

where $A_1, \dots, A_n$ are fixed basic functions.

Let us illustrate the generalized Euler method for the Cauchy problem (9.12) given by

$$y'(x) = x^2 - y, \quad y(x_1) = 1.$$
Figure 9.7 shows the precise solution (gray line) and the approximate one obtained by the generalized Euler method. The global error of the approximate solution with n = 10 nodes has the order $10^{-1}$, which corresponds to the theoretical estimation.
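The recursive scheme (9.23)–(9.24) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the text does not state the interval used for Figure 9.7, so the interval [0, 2] is an assumption, as are the triangular basic functions, the trapezoid quadrature for (9.22), and the function names `mean_wrt_basic_function` and `generalized_euler`.

```python
import numpy as np

def mean_wrt_basic_function(g, center, h, a, b, m=401):
    """Ratio of integrals as in (9.22) for the triangular A_k centred at `center`."""
    lo, hi = max(a, center - h), min(b, center + h)
    x = np.linspace(lo, hi, m)
    w = np.clip(1.0 - np.abs(x - center) / h, 0.0, None)
    gw = g(x) * w
    dx = np.diff(x)
    return np.sum(0.5 * (gw[1:] + gw[:-1]) * dx) / np.sum(0.5 * (w[1:] + w[:-1]) * dx)

def generalized_euler(f, a, b, y1, n):
    """Scheme (9.23)-(9.24): Y_1 = y_1, Y_{k+1} = Y_k + h * F_hat_k."""
    nodes = np.linspace(a, b, n)
    h = (b - a) / (n - 1)
    Y = np.empty(n)
    Y[0] = y1
    for k in range(n - 1):
        F_hat = mean_wrt_basic_function(lambda x: f(x, Y[k]), nodes[k], h, a, b)
        Y[k + 1] = Y[k] + h * F_hat
    return nodes, Y

# Example from the text, y' = x^2 - y, y(x_1) = 1, on the assumed interval [0, 2].
f = lambda x, y: x ** 2 - y
exact = lambda x: x ** 2 - 2.0 * x + 2.0 - np.exp(-x)   # closed form for x_1 = 0
nodes, Y = generalized_euler(f, 0.0, 2.0, 1.0, 10)
print("max nodal error, n = 10:", np.max(np.abs(Y - exact(nodes))))
```

The computed values $Y_k$ approximate F-transform components of the solution; comparing them with $y(x_k)$ at the nodes is a convenient proxy, since by Lemma 9.3 the two agree up to $O(h^2)$.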
9.6.2 The generalized Euler–Cauchy method The generalized Euler method for the Cauchy problem has the same disadvantage as its classical prototype, namely, it is not sufficiently precise. Therefore, we will construct the generalization of the more advanced method known as the Euler–Cauchy method. Recall that its classical prototype belongs to the family of the Runge–Kutta methods.
The following scheme provides formulas for the computation of components of the F-transform of the unknown function $y(x)$ w.r.t. some basic functions $A_1, \dots, A_n$:

$$Y_1 = y_1 \tag{9.25}$$

$$Y^*_{k+1} = Y_k + h \hat{F}_k \tag{9.26}$$

$$Y_{k+1} = Y_k + \frac{h}{2} \bigl(\hat{F}_k + \hat{F}^*_{k+1}\bigr), \quad k = 1, \dots, n - 1, \tag{9.27}$$

where

$$\hat{F}_k = \frac{\int_a^b f(x, Y_k) A_k(x)\,dx}{\int_a^b A_k(x)\,dx}, \qquad \hat{F}^*_{k+1} = \frac{\int_a^b f(x, Y^*_{k+1}) A_{k+1}(x)\,dx}{\int_a^b A_{k+1}(x)\,dx}.$$

This method computes the approximate coordinates $[Y_1, \dots, Y_n]$ of the direct F-transform of the function $y(x)$. The inverse F-transform

$$y_{Y,n}(x) = \sum_{k=1}^{n} Y_k A_k(x)$$
approximates the solution $y(x)$ of the Cauchy problem. It can be proved that the generalized Euler–Cauchy method (9.25)–(9.27) has a local error of order $h^3$.

Let us illustrate the generalized Euler–Cauchy method for the Cauchy problem considered above with the same number of nodes n = 10. Figure 9.8 shows the precise solution (gray line) and the approximate one obtained by the generalized Euler–Cauchy method. The global error of the approximate solution with n = 10 nodes is of the order $10^{-2}$, which again corresponds to the theoretical estimation.

Remark 9.2 The demonstrated generalized Runge–Kutta methods can be applied to the Cauchy problem where $y(x)$ and $f(x, y)$ are vector functions, i.e., to the system of ordinary differential equations with initial values.

Remark 9.3 The demonstrated generalized Runge–Kutta methods can be applied to the Cauchy problem even if the function $f(x, y)$ is only partially given at a finite number of nodes or by a description using fuzzy “IF–THEN” rules. Formalization and realization of this description in the form of the inverse F-transform can be preliminarily constructed (see Perfilieva [2001a,b, 2002]).
Figure 9.8 Precise solution (gray line) and approximate solution (black line) of the Cauchy problem obtained by the generalized Euler–Cauchy method (n = 10).
Remark 9.4 The generalized Runge–Kutta methods based on F-transformation can also be applied to the Cauchy problem when the initial value y1 is not known precisely. For example, y1 may be a fuzzy number.
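The predictor-corrector scheme (9.25)–(9.27) of the generalized Euler–Cauchy method can be sketched in the same spirit. As before, this is only an illustration under stated assumptions: triangular basic functions, trapezoid quadrature for the averaged right-hand sides, the interval [0, 2] for the example problem, and hypothetical function names.

```python
import numpy as np

def mean_wrt_basic_function(g, center, h, a, b, m=401):
    """Weighted mean of g with respect to the triangular basic function at `center`."""
    lo, hi = max(a, center - h), min(b, center + h)
    x = np.linspace(lo, hi, m)
    w = np.clip(1.0 - np.abs(x - center) / h, 0.0, None)
    gw = g(x) * w
    dx = np.diff(x)
    return np.sum(0.5 * (gw[1:] + gw[:-1]) * dx) / np.sum(0.5 * (w[1:] + w[:-1]) * dx)

def generalized_euler_cauchy(f, a, b, y1, n):
    """Scheme (9.25)-(9.27) on a uniform partition of [a, b] with n nodes."""
    nodes = np.linspace(a, b, n)
    h = (b - a) / (n - 1)
    Y = np.empty(n)
    Y[0] = y1                                                    # (9.25)
    for k in range(n - 1):
        F_k = mean_wrt_basic_function(lambda x: f(x, Y[k]), nodes[k], h, a, b)
        Y_star = Y[k] + h * F_k                                  # predictor (9.26)
        F_star = mean_wrt_basic_function(lambda x: f(x, Y_star), nodes[k + 1], h, a, b)
        Y[k + 1] = Y[k] + 0.5 * h * (F_k + F_star)               # corrector (9.27)
    return nodes, Y

# Same example problem as above, y' = x^2 - y, y(x_1) = 1, on the assumed interval [0, 2].
f = lambda x, y: x ** 2 - y
exact = lambda x: x ** 2 - 2.0 * x + 2.0 - np.exp(-x)
for n in (10, 40):
    nodes, Y = generalized_euler_cauchy(f, 0.0, 2.0, 1.0, n)
    print(n, np.max(np.abs(Y - exact(nodes))))
```

Running the loop for several values of n is a simple way to observe how the nodal error decreases as the partition is refined.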
9.7 Reef Growth Model and Sea Level Extraction

We are going to apply the generalized Euler method based on F-transforms to computer-based modeling of carbonate sedimentation. There are well-justified reasons for the use of fuzzy-based modeling: the available data are imprecise and to a great extent averaged in nature (this is especially true for data related to the past); geological processes are very slow and the changes can be described qualitatively rather than quantitatively; geological processes are locally non-homogeneous and are dependent on a specific place while, at the same time, homogeneous in that they obey universal laws. All these reasons argue in favor of using a fuzzy-based approach which is sufficiently robust and computationally efficient (see also Chapters 3 and 10). Though the fuzzy approach is mainly associated with linguistic characterization of vague (and thus, qualitatively expressed) events, we here demonstrate another technique related to fuzziness on one side and to classical analysis and numeric methods on the other.

We investigated the following two problems which use the reef growth model (9.28) of Bosscher & Schlager [1992] and Demicco & Klir [2001]. In the first problem, we use our technique to model Belize reef growth based on a differential equation (9.28) which characterizes the process. In the second problem, we use a measured
stratigraphic section with well-defined third-order and fourth-order cycles and apply the reef growth model to back-calculate a sea level record for the section.

In the Bosscher and Schlager [1992] model, the growth rate of corals depends largely upon the amount of light available for photosynthesis. As light decreases with water depth, so does reef growth. The following differential equation characterizes the process of reef growth under a changing sea level regime:

$$\frac{dh(t)}{dt} = G_m \tanh\left( \frac{I_0}{I_k} \exp\bigl(-k[h_0 + h(t) - (s_0 + s(t))]\bigr) \right) \tag{9.28}$$

where
● $h(t)$ is the growth increment,
● $h_0$ the initial height,
● $G_m$ the maximal growth rate,
● $I_0$ the surface light intensity,
● $I_k$ the saturating light intensity,
● $k$ the extinction coefficient,
● $s_0$ the initial sea level position, and
● $s(t)$ the sea level variation.
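To make the structure of (9.28) concrete, the sketch below evaluates its right-hand side and advances it with the generalized Euler scheme (9.23)–(9.24). Everything numerical here is hypothetical: the parameter values, the sinusoidal sea level curve, the 80,000-year span split into 161 nodes, and the initial condition h(0) = 0 are placeholders for illustration and are not the values of Bosscher and Schlager [1992]. The bracket in the exponent is evaluated exactly as printed in (9.28); the sign convention for h and s is an assumption that should be checked against the original model.

```python
import numpy as np

def reef_growth_rate(h, s, Gm, I0, Ik, k, h0, s0):
    """Right-hand side of (9.28), evaluated exactly as printed."""
    return Gm * np.tanh(I0 / Ik * np.exp(-k * (h0 + h - (s0 + s))))

def triangular(t, center, width):
    """Triangular basic function centred at `center` with half-width `width`."""
    return np.clip(1.0 - np.abs(t - center) / width, 0.0, None)

def simulate_reef(sea_level, t_end, n, params, samples=201):
    """Generalized Euler scheme (9.23)-(9.24) applied to (9.28) on [0, t_end]."""
    nodes = np.linspace(0.0, t_end, n)
    dt = t_end / (n - 1)
    H = np.zeros(n)                      # assumed initial condition h(0) = 0
    for j in range(n - 1):
        lo, hi = max(0.0, nodes[j] - dt), min(t_end, nodes[j] + dt)
        t = np.linspace(lo, hi, samples)
        w = triangular(t, nodes[j], dt)
        g = reef_growth_rate(H[j], sea_level(t), **params) * w
        num = np.sum(0.5 * (g[1:] + g[:-1]) * np.diff(t))
        den = np.sum(0.5 * (w[1:] + w[:-1]) * np.diff(t))
        H[j + 1] = H[j] + dt * num / den  # Y_{k+1} = Y_k + h * F_hat_k
    return nodes, H

# Hypothetical values for illustration only (not taken from the chapter).
params = dict(Gm=0.01, I0=2000.0, Ik=250.0, k=0.1, h0=10.0, s0=0.0)
sea_level = lambda t: 5.0 * np.sin(2.0 * np.pi * t / 20000.0)
nodes, H = simulate_reef(sea_level, t_end=80000.0, n=161, params=params)
print("final growth increment:", H[-1])
```

The point of the sketch is that the sea level curve enters only through its averages against the basic functions, which is where an F-transform reconstruction of s(t) from sparse data would be plugged in.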
Note that sea level is included in (9.28) as a parameter. We first apply our technique of solving ordinary differential equations to Equation (9.28) and obtain a model of the Belize barrier reef growth over the past 80,000 years. The parameters are taken from Bosscher and Schlager [1992] and relate to the carbonate sediment production pattern on the Atlantic shelf-slope break of Belize. The sea level curve s(t) of the past 80,000 years was reconstructed from the numeric data [Bosscher and Schlager, 1992] related to the pattern by the use of F-transforms (see Figure 9.9). Figure 9.10 is a graph of carbonate production versus depth and distance, determined by solving differential equation (9.28) using the technique of the generalized Euler method (9.23)–(9.24) described above. This graph is similar to that shown in Bosscher and Schlager [1992] (their Figure 8), which they obtained by numerical solution of (9.28) using the fourth-order Runge–Kutta method. It is also similar to the graph presented in Demicco and Klir [2001] (their Figure 2d; see also Section 3.4.2, Figure 3.8) obtained using a fuzzy linguistic model of the same problem. Our second exercise is the inverse problem to that considered above: using a stratigraphic measured section where we are given the growth increment h(t), find s(t), the sea level history. The problem formulation and the empirical data have been provided by Demicco (personal communication). There are some difficulties in obtaining a solution to the inverse problem considered above. They are caused by the way in which the growth increment h(t) is given. We also recognize that intertidal carbonate cycles are strictly not totally aggradational (although subtidal cycles may be), so that this exercise is a starting point to illustrate
Figure 9.9 The sea level curve reconstructed by F-transforms from numeric data.
Figure 9.10 Carbonate production on the Atlantic shelf-slope break of Belize. The horizontal axis is the distance from the sea shore and the vertical axis is the water depth.
potential applications. Below we describe the construction of a mathematical model of h(t). The input data have been obtained empirically from a vertical measured section of an Upper Cambrian limestone from western Maryland (see Chapter 10 for more details). The section comprises a sequence of thicknesses of eight types of rocks. Each type corresponds to a certain water depth and, therefore, relatively determines the ancient sea level. Moreover, if we apply the Bosscher and Schlager [1992] model when relative sea level rises, the growth rate significantly decreases. Since sea level rises up and then goes down repeatedly, the total sequence of thicknesses of types of rocks can be divided into cycles characterizing one period of sea level rise and, particularly, its fall. The division into cycles cannot be determined by our mathematical model and is performed by either an expert or by a program implementing expert actions (see Chapter 10 for details). Thus, a mathematical model of h(t) can be constructed from a sequence of thicknesses of various types of rocks divided into respective cycles plus information about the correspondence between a rock type and a water depth. Examples of the input data are given in Table 9.1. Given this, we can solve equation (9.28) with respect to unknown s(t) for each cycle and in this way obtain the data that characterize the decreasing portion of each cycle in the sea level history. The result is depicted in Figure 9.11. Let us stress that the data in Figure 9.11 characterizing sea level is artificially joined by a continuous curve. To obtain a mathematical model of the sea level data obtained above, we use F-transforms with sinusoidal-shaped fuzzy sets. The results are shown in Figures 9.12 and 9.13. Let us remark that the curve in Figure 9.12 is the thirdorder sea level history insofar as it is showing trends in the fourth-order cycle data. 
Moreover, although the values of sea level are negative, the relative magnitude of tens of meters is of the right order as judged from sedimentologic inference.
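The smoothing step mentioned above can be illustrated with a minimal sketch. It assumes the usual discrete F-transform (each component is a weighted average of the data, with weights given by one basic function) and sinusoidal basic functions on equally spaced nodes. The function names, the node spacing, and the sample values are all hypothetical; the samples are made up and are not the Maryland data.

```python
import math


def sinusoidal_basis(nodes):
    """Sinusoidal basic functions A_k centered on equally spaced nodes."""
    h = nodes[1] - nodes[0]

    def make(center):
        def a_k(x):
            # Each A_k is non-zero only within one node spacing of its center.
            if abs(x - center) >= h:
                return 0.0
            return 0.5 * (math.cos(math.pi * (x - center) / h) + 1.0)
        return a_k

    return [make(center) for center in nodes]


def f_transform(xs, ys, basis):
    """Discrete F-transform: one weighted average of the data per basic function."""
    components = []
    for a_k in basis:
        weights = [a_k(x) for x in xs]
        components.append(sum(w * y for w, y in zip(weights, ys)) / sum(weights))
    return components


def inverse_f_transform(components, basis, x):
    """Smoothed value at x: the components blended by the basic functions."""
    return sum(f_k * a_k(x) for f_k, a_k in zip(components, basis))


# Made-up, deliberately jagged samples standing in for sea level data.
xs = [0.5 * i for i in range(21)]  # positions 0.0 .. 10.0
ys = [-20.0 + 5.0 * math.sin(x) + (1.0 if i % 2 else -1.0)
      for i, x in enumerate(xs)]
nodes = [0.0, 2.5, 5.0, 7.5, 10.0]
basis = sinusoidal_basis(nodes)
components = f_transform(xs, ys, basis)
smoothed = [inverse_f_transform(components, basis, x) for x in xs]
```

Because neighboring sinusoidal basic functions sum to one between nodes, the inverse transform is a convex blend of the components, which is what gives the filtering effect. The sketch requires at least one data point inside the support of every basic function.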
9.8 Conclusions
This chapter is a contribution to a new area that can be called numerical methods on the basis of fuzzy approximating models. Fuzzy basic functions have been introduced, and two kinds of the so-called F-transform of an original function with respect to chosen basic functions have been presented. The convergence of the approximation models that are obtained by the inverse F-transform has been demonstrated. Finally, the ability of the new models to be used in numerical methods instead of the original function has been demonstrated on examples of solving ordinary differential equations. The following facts make this technique attractive:
● computational simplicity;
● good accuracy, comparable with analogous numerical methods;
● stability with respect to changes in initial data.
Table 9.1 Example of input data.

Rock type   Rock thickness   Cycle number   Cycle thickness
4           1.2              1              1.2
1           0.4              1              1.6
3           0.1              1              1.7
4           5.5              1              7.2
3           0.1              2              0.1
4           3.0              2              3.1
2           0.7              2              3.8
4           0.3              2              4.1
...         ...              ...            ...
4           3.1              44             3.1
3           0.6              44             3.7
4           1.4              44             5.1
5           0.6              44             5.7
3           0.9              44             6.6
5           0.8              44             7.4
3           1.0              44             8.4
5           3.1              44             11.5
6           1.1              44             12.6
3           1.2              45             1.2
2           0.8              45             2.0
5           2.4              45             4.4
4           2.0              45             6.4
3           2.7              45             9.1
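Judging from the listed values, the “Cycle thickness” column of Table 9.1 is the running sum of rock thickness within each cycle. A minimal sketch of that bookkeeping follows; the function name and the tuple layout are hypothetical, and the rows are the first eight lines of the table.

```python
from itertools import groupby

# (rock type, rock thickness, cycle number), first eight rows of Table 9.1.
rows = [
    (4, 1.2, 1), (1, 0.4, 1), (3, 0.1, 1), (4, 5.5, 1),
    (3, 0.1, 2), (4, 3.0, 2), (2, 0.7, 2), (4, 0.3, 2),
]


def cycle_thickness(rows):
    """Return the running thickness within each cycle, one value per row."""
    out = []
    # groupby relies on rows of the same cycle being adjacent, as in the table.
    for _, group in groupby(rows, key=lambda row: row[2]):
        total = 0.0
        for _, thickness, _ in group:
            total += thickness
            out.append(round(total, 1))  # the table lists one decimal place
    return out


print(cycle_thickness(rows))
```

The last value produced for each cycle is the total thickness of that cycle.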
Figure 9.11 The sea level data.
Figure 9.12 Sea level curve obtained using F-transform.
Figure 9.13 Sea level data and curve obtained using F-transform.
We have applied numeric methods on the basis of F-transform to the solution of ordinary differential equations which are encountered in geological practice. The results are convincing and confirm that, in the case of averaged initial data, the proposed methodology has advantages over other methods. In further investigation we would like to extend this methodology to the solution of partial differential equations.
Acknowledgment
This paper has been supported by grant IAA118730 of the GAAC ČR and project ME468 of the MŠMT of the Czech Republic as the international supplement to the NSF project “Stratigraphic Simulation Using Fuzzy Logic to Model Sediment Dispersal.” The author wishes to thank her PhD student Martina Daňková and diploma student Dagmar Plšková for carrying out all the computations in Mathematica.
References
Bosscher, H., & Schlager, W. [1992], “Computer simulation of reef growth.” Sedimentology, 39, 503–512.
Demicco, R. V., & Klir, G. J. [2001], “Stratigraphic simulations using fuzzy logic to model sediment dispersal.” Journal of Petroleum Science and Engineering, 31, 135–155.
Novák, V., Perfilieva, I., & Močkoř, J. [1999], Mathematical Principles of Fuzzy Logic. Kluwer, Boston and Dordrecht.
Perfilieva, I. [2001a], “Logical approximation: general approach to the construction of approximating formulas.” Proceedings of EUSFLAT’2001, Leicester, UK.
Perfilieva, I. [2001b], “Neural nets and normal forms from fuzzy logic point of view.” Neural Network World, 11, 627–638.
Perfilieva, I. [2001c], “Normal forms for fuzzy logic functions and their approximation ability.” Fuzzy Sets and Systems, 124, 371–384.
Perfilieva, I. [2002], “Logical approximation.” Soft Computing, 7(2), 73–78.
Perfilieva, I., & Chaldeeva, E. [2001], “Fuzzy transformation.” Proceedings of IFSA’2001 World Congress, Vancouver, Canada.
Chapter 10
Ancient Sea Level Estimation
Vilem Novák
10.1 Introduction
10.2 Special Fuzzy Logic Techniques
    10.2.1 Outline of the methods
    10.2.2 The theory of evaluating linguistic expressions
    10.2.3 Linguistic description and logical deduction
    10.2.4 Fuzzy transform
10.3 Automatic Determination of Rock Sequences
    10.3.1 Geological characterization
    10.3.2 General rules and the fuzzy algorithm
    10.3.3 Results of tests
10.4 Sea Level Estimation
10.5 Conclusion
Acknowledgment
References
10.1 Introduction
This chapter presents an application of fuzzy logic to the estimation of ancient sea level changes. The main idea is to utilize the geologist’s expert knowledge expressed in natural language. The initial situation is as follows. We are given data about rocks found in a vertical section (such as an oil well or a large outcrop). The total thickness of the section is about 250 meters. Up to eight rock types (mostly limestone) may be distinguished in the section, and the thickness of each rock type is given. The oldest rocks are at the bottom (at 0 meters) and the youngest rocks are at the top. Our principal goal is to estimate the behavior of ancient sea level on the basis of the rock type data. This goal can be achieved on the basis of the thickness of certain shallowing-upwards sequences of rocks deposited in the given vertical section. Let us note that the “accommodation potential” of a sedimentary deposit is a complicated sum of subsidence, sea level change, and sediment deposition. For a first cut, we assume that
subsidence is constant and negative. We also assume that sediment production can “fill up the holes” quickly. Thus, the above goal splits into two tasks. The first task is to determine sequences of rocks deposited during one cycle of sea level rise and fall. The second task is to use those sequences to estimate the sea level changes. This can be done because the sequence thickness roughly corresponds to the maximum ancient sea level position in the given time period. We have used fuzzy logic in the broader sense¹ to solve both tasks.
To solve the first task, a special algorithm has been developed and written in Borland PASCAL 7.0. The algorithm is based on a geologist’s description of the way he/she determines the hierarchy of rock sequences. The description is in natural language, and its specific feature is the use of vague predicates to characterize the thickness of rocks and their sequences.
The second task is solved using two basic techniques. The first technique is the F-transform, described in Chapter 9. We benefit especially from its ability to filter the data. Hence, this method is well suited for determining ancient sea level fluctuations, since the input data are far from being precise. However, the geologist may have still more specific information, which is difficult (or impossible) to include in the F-transform algorithm. Therefore, it would be useful to be able to use this information to directly affect the method of sea level estimation. This becomes possible when the information about the sequence thicknesses is provided using a linguistic description. This is a set of linguistically specified IF–THEN rules characterizing the sea level position in the respective time period. Linguistic description is an efficient tool, which embraces the geologist’s specific knowledge and thus allows a more realistic model of the actual sea level position to be obtained.
In this chapter, we describe a method by which such a linguistic description can be learned from the data, and also how sea level position can be deduced from it. All the algorithms use routines from the software package LFLC 1.5 developed at the University of Ostrava, Czech Republic. Moreover, a new routine for “smooth logical deduction” has been developed and successfully applied. The chapter is divided into four sections. Section 10.2 describes special fuzzy techniques used for solution of the above tasks. We start with an outline of the theory of evaluating linguistic expressions and their semantics, since these concepts play an essential role in the learning and deduction methods used later. Then the logical deduction and the F-transform methods are described. In Section 10.3, the algorithm for determining rock sequences is described. In Section 10.4, we present the results, an estimation of the sea level fluctuation based on three methods: (1) F-transform; (2) logical deduction on the basis of linguistic descriptions learned from the information about sequence thicknesses; and (3) logical deduction where the linguistic description is learned from the F-transformed data.

¹ The following classification of fuzzy logic has been generally agreed: fuzzy logic in the narrow sense (FLn), which is a special many-valued logic providing tools for modeling the vagueness phenomenon; and that in the broader sense (FLb), which is an extension of FLn by some aspects of natural language semantics to enable modeling of natural human reasoning. More about the general theory of fuzzy logic can be found in Novák & Perfilieva [2000].
10.2 Special Fuzzy Logic Techniques
This section describes special fuzzy techniques used for solving the two tasks defined above. The three techniques described below are the theory of evaluating linguistic expressions, fuzzy logic deduction based on linguistic descriptions, and fuzzy transform.
10.2.1 Outline of the methods
This subsection informally describes the methods explained in more detail below. It is intended for those who are more interested in the results of our methods for rock sequence determination and ancient sea level estimation presented below than in the methods themselves. Such readers can skip the rest and continue with Section 10.3. On the other hand, the reader interested in the details of the methods can skip this subsection.
The essential constituents of our methods are the so-called evaluating linguistic expressions and the model of their semantics. These are expressions such as “very deep,” “more or less thick,” “very roughly small,” “about 200,” “shallow,” etc. In general, we can say that these expressions characterize linguistically some value or number. They form a small but very important and often used part of natural language. Fuzzy logic has been able to offer a mathematical model that works well in describing the semantics of evaluating expressions. Such a model is described in the next section. Using it, we can solve various important problems.
The first problem is to imitate a geologist’s reasoning when determining rock sequences derived from the measured stratigraphic sections. The rules he/she uses contain a lot of vague evaluating expressions, and thus it is fairly difficult to develop an algorithm mimicking a human specialist. In Section 10.3, we describe a possible algorithm for determination of rock sequences based on the theory of semantics of evaluating expressions, which does this job with almost 90% success. Evaluating linguistic expressions are also used in the so-called fuzzy IF–THEN rules.
For example,

    $\mathcal{C}$ := IF sequence is very thin THEN sea level change is rather small    (10.1)
In this example, “very thin” and “rather small” are evaluating expressions, “sequence” is an independent variable, and “sea level change” is a dependent variable.
We usually disregard their real meaning and denote them simply by X and Y, respectively. A set of fuzzy IF–THEN rules forms a linguistic description of some situation. In this chapter, we use linguistic descriptions to characterize ancient sea level position estimated on the basis of the rock sequences determined using the algorithm described in Section 10.3. Therefore, we face the problem of how a linguistic description can be found.
A very simple but effective method can be used to learn a linguistic description from the data. We suppose that the data are organized into lines consisting of measured values for all variables (for simplicity we assume only an independent variable X and a dependent variable Y). Then we must specify the linguistic context for both variables X and Y. In our case, this means that we specify intervals in which all values of the variables can fall. The limits of each interval characterize the smallest and highest possible values. For the given value we find a typical evaluating expression. If we do it for both values in the given data line, we obtain one fuzzy IF–THEN rule. Repeating this procedure for the whole data set, we derive a linguistic description characterizing the same situation as is characterized by the data. In our case, the data consist of numbers of time cycles during which each rock sequence has been deposited (variable X) and thicknesses of the rock sequences (variable Y). Hence, the obtained linguistic description characterizes ancient sea level changes.
Of course, the linguistic description may contain redundant and also contradictory rules. These are partly reduced automatically but otherwise have to be elaborated directly by the geologist. Surprisingly, this is a potential advantage of the linguistic descriptions, since they make it possible to include additional special knowledge not contained in the data. The linguistic descriptions serve to derive the output given the input.
A procedure called a logical deduction in fuzzy logic in the broader sense is used in this chapter. Roughly speaking, the given input value is classified so that the most suitable rule is fired. From the logical point of view this means that the rule of modus ponens is applied: given the fact A and the implication A ⇒ B, we conclude the fact B. For example, if we know that “the sequence is very thin” and also the rule (10.1), then we conclude that “sea level change is rather small.” A powerful technique based on fuzzy logic is fuzzy transform (F-transform), described in detail in Chapter 9. This is a general technique which enables us to filter the data and find a good fit with their trend. In this chapter, we use F-transform in two ways. First, it is used for the estimation of ancient sea level on the basis of the determined rock sequences. Second, the nice smoothness properties of F-transform are utilized to improve the logical deduction. The resulting method, called smooth logical deduction, is better suited for approximating data on the basis of linguistic expressions than the pure deduction mentioned above. The linguistic descriptions learned from rock sequences (determined by an algorithm based on evaluating expressions), with possible expert modification, are applied to the estimation of ancient sea level changes.
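The learning procedure outlined in this subsection (one rule per data line, followed by removal of redundant rules and detection of contradictory ones) can be sketched as follows. This is only an illustration of the data flow. The labeling function `crude_label` is a hypothetical stand-in that splits the context into three equal bins; the chapter's own method selects hedged evaluating expressions instead. The data lines and contexts are made up.

```python
def crude_label(value, context):
    """Hypothetical stand-in for finding a typical evaluating expression.

    Splits the context [lowest, highest] into three equal bins. It does not
    produce hedged expressions such as "very small".
    """
    lowest, highest = context
    position = (value - lowest) / (highest - lowest)
    if position < 1 / 3:
        return "small"
    if position < 2 / 3:
        return "medium"
    return "big"


def learn_description(data, context_x, context_y):
    """Turn (x, y) data lines into IF-THEN rules, dropping exact duplicates."""
    rules = []
    for x, y in data:
        rule = (crude_label(x, context_x), crude_label(y, context_y))
        if rule not in rules:  # redundant rules are removed automatically
            rules.append(rule)
    return rules


def contradictions(rules):
    """Antecedents that occur with more than one consequent."""
    seen = {}
    for antecedent, consequent in rules:
        seen.setdefault(antecedent, set()).add(consequent)
    return {a: c for a, c in seen.items() if len(c) > 1}


# Made-up data lines: (cycle number X, sequence thickness Y in meters).
data = [(1, 7.2), (2, 4.1), (20, 9.0), (44, 12.6), (45, 9.1)]
rules = learn_description(data, context_x=(1, 45), context_y=(0.0, 15.0))
for antecedent, consequent in rules:
    print(f"IF X is {antecedent} THEN Y is {consequent}")
print(contradictions(rules))
```

Rules reported by `contradictions` correspond to the cases that, according to the text, are left for the geologist to resolve.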
10.2.2 The theory of evaluating linguistic expressions
Specific to applications of fuzzy logic, and also important for dealing with the first task specified above, namely determination of rock sequences, are the so-called evaluating linguistic expressions [see Novák, 2001; Novák et al., 1999]. These are expressions such as “very large,” “extremely deep,” “roughly one thousand,” “rather thick,” etc. Such expressions are used by people practically every time they want to characterize length, age, depth, thickness, and many other kinds of measurable objects. It is important to note that these terms are vague in nature and never express precise numbers. Let us remark that the use of imprecise numbers and values is very typical in geology. Even concrete numbers used by geologists are always imprecise (i.e., fuzzy) and should be treated as such.
Components of evaluating linguistic expressions
The basic components of all evaluating expressions are atomic evaluating expressions. They comprise any of the adjectives “small,” “medium,” or “big.” Let us stress that “small,” “medium,” and “big” should be taken as canonical and can be replaced by other adjectives such as “thin,” “thick,” “old,” “new,” etc. Fuzzy quantities, namely “approximately $x_0$,” are also atomic expressions. These are linguistic expressions characterizing some element $x_0$ of an ordered set. Examples of fuzzy quantities are one million, the value $x_0$, etc. As mentioned, quantities in common human understanding are almost always understood as imprecise; for example, “one million” never means the number 1,000,000, but “something close to it.” Atomic evaluating expressions usually form pairs of antonyms, i.e., the pairs

    ⟨nominal adjective⟩ − ⟨antonym⟩.    (10.2)
When completed by a middle term, such as “medium,” “average,” etc., they form the so-called basic linguistic trichotomy. Thus, in geology, the pair of antonyms can be “thin–thick,” “old–young,” “shallow–deep,” and the basic linguistic trichotomy can be “thin–medium thick–thick.” Note that natural language is quite rich in the basic pairs (10.2), but the middle term mostly has the form “medium” or “average” completed by the corresponding adjective (e.g., “deep”). Simple evaluating expressions are expressions of the form

    ⟨linguistic hedge⟩⟨atomic evaluating expression⟩.    (10.3)
Examples of simple evaluating expressions are very thin, more or less medium, roughly thick, about two thousand, approximately z, etc. Linguistic hedges are special adverbs modifying the meaning of the adjectives before which they stand. Examples of them are “very,” “extremely,” “roughly,” “approximately,” etc. In general, we speak about linguistic hedges with either a narrowing effect (very, significantly, etc.) or with a widening effect (more or less, roughly, etc.). A specific case arises when no linguistic hedge is present. We will consider this case as the presence of an empty linguistic hedge and handle it in the same way as the other simple expressions. Hence, the pure atomic expressions “small,” “medium,” and “big” are also simple evaluating expressions.
The concept of linguistic hedge and the first outline of a possible theory of its semantics were introduced by Zadeh [1975]. His theory has been further analyzed from the linguistic point of view by Lakoff [1973] and further elaborated, first in Novák [1989] and elsewhere. In this section, a novel theory is presented, which conforms with the linguistic analysis. In fuzzy logic, we will use several linguistic hedges with precisely defined semantics (see below) forming simple evaluating expressions with the following natural order according to their meaning:

    “extremely” ⟨atomic expression⟩
    “significantly” ⟨atomic expression⟩
    “very” ⟨atomic expression⟩
    (empty hedge) ⟨atomic expression⟩
    “more or less” ⟨atomic expression⟩            (10.4)
    “roughly” ⟨atomic expression⟩
    “quite roughly” ⟨atomic expression⟩
    “very roughly” ⟨atomic expression⟩

where ⟨atomic expression⟩ is one of “small,” “medium,” or “big,” with the caveat that hedges with narrowing effect cannot be used together with “medium” (e.g., the expression “very medium” has no meaning). We will denote evaluating linguistic expressions by script letters $\mathcal{A}, \mathcal{B}, \ldots$. Evaluating linguistic expressions can be joined by the connectives “and” and “or,” thus forming compound evaluating linguistic expressions. In general, they take the form

    $\mathcal{C}$ := $\mathcal{A}$ ⟨connective⟩ $\mathcal{B}$,    (10.5)
where $\mathcal{A}$, $\mathcal{B}$ are evaluating linguistic expressions and ⟨connective⟩ is either “and” or “or.”² Examples of compound evaluating expressions are “very deep or deep,” “thick or thin,” etc. The evaluating expressions are often assigned to nouns, characterizing sizes of the objects (or their parts) as their more specific properties. This leads to the concept of evaluating linguistic predication. In general, these are expressions of the form

    ⟨noun⟩ is $\mathcal{A}$,    (10.6)

where $\mathcal{A}$ is an evaluating linguistic expression. If $\mathcal{A}$ is simple, then (10.6) is called the simple evaluating predication. Examples of simple evaluating predications are “rock thickness is very big,” “the rock deposit is roughly small,” “the sea level increase is very high,” etc. To finish this section, we have to mention a very important class of linguistic expressions forming conditional clauses

    $\mathcal{C}$ := IF ⟨noun⟩ is $\mathcal{A}$ THEN ⟨noun⟩ is $\mathcal{B}$    (10.7)

² One of the possibilities is the connective “but,” which always assumes that the second expression has a negative form, e.g., “small but not very small.” The meaning of “but” is, in this case, the same as “and.” However, the theory of negation of evaluating expressions requires a deeper linguistic analysis and thus is not considered in this chapter. A discussion of the mathematical model of linguistic negation and other problems can be found in Novák [1992].
where A, B are evaluating expressions. Clauses (10.7) are in fuzzy logic called fuzzy IF–THEN rules and stand behind most of its successful applications.
Semantics of evaluating linguistic expressions
Fuzzy logic offers a sophisticated theory of the semantics of evaluating linguistic expressions and predications. To explain its fundamentals, we have to start with some general remarks. The general model of the semantics of linguistic expressions is based on the distinction between their intension and extension in the sense introduced by Carnap [1947]. First, we start with the concept of a possible world. This can be understood as a state of the world at the given time moment and place. Alternatively, we can understand it also as a particular context in which the given linguistic expression is used. Then we can distinguish the following.
Intension of a linguistic expression, sentence, or concept can be identified with the property denoted by it. An intension may lead to different truth values in various possible worlds, but it is invariant with respect to them. This means that a given concept has just one intension, which does not change when the possible world is changed. For example, the expression “deep” is the name of an intension, being a certain property of depth, which in a concrete context (i.e., possible world) provides information about the size of depth.
Extension is a class of elements determined by an intension, which fall into the meaning of a linguistic expression in a given possible world. Thus, it depends on the particular context of use and changes whenever the possible world (context, time, place) is changed. Our example, “deep,” may mean 1 cm when a beetle needs to cross a puddle, 3 m in a small lake, but 3 km or more in the ocean.
Expressions $\mathcal{A}$ of natural language are names of intensions. Let us remark that a lot of convincing arguments have been made that the meaning of expressions of natural language cannot be identified with their extensions (see, e.g., Gallin [1975]).
To formalize the theory of the meaning of evaluating linguistic expressions, we will start again with the concept of possible world. For the case of evaluating expressions, it is useful to understand possible worlds as closed intervals of real numbers. Thus, the set of possible worlds is the set of triples of numbers

\[ W = \{\langle v_L, v_S, v_R\rangle \mid v_L, v_S, v_R \in [0, \infty) \text{ and } v_L < v_S < v_R\}. \]

The numbers $v_L$, $v_S$, $v_R$ represent the left limit, center, and right limit, respectively, of possible values which may fall into the meaning of the expressions of concern (see below). Let $v \in \mathbb{R}$ be some value ($\mathbb{R}$ is the set of all real numbers). We say that $v$ belongs to the possible world $w = \langle v_L, v_S, v_R\rangle$ if $v \in [v_L, v_R]$. In this case we usually write simply $v \in w$. In the theory of evaluating expressions, we commonly replace the term possible world by the term linguistic context (or simply, context). Let $w_1, \ldots, w_n$ be an n-tuple of possible worlds. Then their Cartesian product is defined by

\[ w_1 \times \cdots \times w_n = [v_{1,L}, v_{1,R}] \times \cdots \times [v_{n,L}, v_{n,R}]. \]

Let $V$ be a set of elements which may fall into the meaning of the linguistic expression. In our case, we usually put $V = \mathbb{R}$. Then intension is formally a function from the set of all possible worlds into the set of all fuzzy sets in the universe $V$, i.e.,

\[ A : W \to \mathcal{F}(V). \tag{10.8} \]

The extension in the given possible world $w \in W$ is a fuzzy set $A(w) \underset{\sim}{\subset} V$, i.e., it is a functional value of the intension (a function) $A$ in the given possible world. Recall that the fuzzy set $A(w)$ is itself a function $A(w) : V \to [0, 1]$. For example, let us consider the expression “small.” This is a property which has some intension $Sm : W \to \mathcal{F}(\mathbb{R})$. Then in each possible world (context) $w \in W$, the meaning of “small” is a certain fuzzy set of real numbers. The general principles of how that particular fuzzy set can be determined are described below.
A difficulty in the notation arises at this point. Let $v \in V$ be some element. Since $A(w)$ is a function, its value at the point $v$ must be written as $A(w)(v)$, which is somewhat awkward. To overcome this inconvenience, we will usually (but not always!) write $A_w$ instead of $A(w)$ in the case that $w$ is a possible world.
Our goal now is to characterize both intension and all the extensions of the evaluating linguistic expressions. The main idea of the corresponding mathematical
model is the following. Take the expression “small.” Then, given a possible world $w = \langle v_L, v_S, v_R\rangle$, the value $v_L$ and all values “close” to it are small. We can imagine this situation as that of an “observer” who stands in the position $v_L$ and looks over the universe so that all small values fall inside the horizon of his/her view. Similarly, “big” means that the observer stands in $v_R$ and looks back. Finally, “medium” means that the observer stands in $v_S$ and looks at “both sides.” Mathematically, we model this idea using three linear horizon functions: the functions $L, R : W \times \mathbb{R} \to [0, 1]$ defined in each possible world $w \in W$ by

\[ L_w(x) = \left(\frac{v_S - x}{v_S - v_L}\right)^{*} \quad \text{(left horizon)} \tag{10.9} \]

\[ R_w(x) = \left(\frac{x - v_S}{v_R - v_S}\right)^{*} \quad \text{(right horizon)} \tag{10.10} \]

and a middle horizon function

\[ M_w(x) = \neg L_w(x) \wedge \neg R_w(x) = \left(\frac{x - v_L}{v_S - v_L}\right)^{*} \wedge \left(\frac{v_R - x}{v_R - v_S}\right)^{*} \tag{10.11} \]
where $\neg$ is the negation operation defined by $\neg a = 1 - a$, $a \in [0, 1]$. The star used in (10.9), (10.10), and (10.11) means that the values are cut to the interval $[0, 1]$; i.e., if $L_w(x) < 0$ then we put $L_w(x) = 0$, and if $L_w(x) > 1$ then we put $L_w(x) = 1$. Similarly for $R_w$ and $M_w$. Furthermore, we consider a class of abstract hedges, being functions

\[ \nu_{a,b,c}(y) = \begin{cases} 1, & c \le y, \\[4pt] 1 - \dfrac{(c - y)^2}{(c - b)(c - a)}, & b \le y < c, \\[8pt] \dfrac{(y - a)^2}{(b - a)(c - a)}, & a \le y < b, \\[8pt] 0, & y < a, \end{cases} \]

where $a, b, c$ are special parameters fulfilling $a, b \in (-\infty, 1)$, $c \in (0.5, 1]$, and $a < b < c$. The function $\nu$ can also be seen as a deformation of the horizon. The set of all abstract hedges $\nu_{a,b,c}$ will be denoted by $Hf$. To simplify the notation, we will often write only $\nu \in Hf$, understanding that $\nu$ is, in fact, determined by the parameters $a, b, c$.³

³ Functions $\nu \in Hf$ are quadratic. Their simpler, but less adequate, form is linear. On the other hand, a more complicated form for them is not necessary.
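The horizon functions (10.9)–(10.11) and an abstract hedge can be written down in a few lines. The sketch below is an illustration under two assumptions: the connective ∧ is taken as minimum, and the hedge has the piecewise quadratic form that is consistent with the constants $K_1 = (c-b)(c-a)$ and $K_2 = (b-a)(c-a)$ used in the explicit extension formulas of this section. All function names are hypothetical.

```python
def clip01(value):
    """The star operation: cut a value to the interval [0, 1]."""
    return max(0.0, min(1.0, value))


def left_horizon(world, x):
    """L_w(x) from (10.9); world is the triple (v_L, v_S, v_R)."""
    v_l, v_s, _ = world
    return clip01((v_s - x) / (v_s - v_l))


def right_horizon(world, x):
    """R_w(x) from (10.10)."""
    _, v_s, v_r = world
    return clip01((x - v_s) / (v_r - v_s))


def middle_horizon(world, x):
    """M_w(x) from (10.11): not-L AND not-R, with AND assumed to be minimum."""
    return min(1.0 - left_horizon(world, x), 1.0 - right_horizon(world, x))


def make_hedge(a, b, c):
    """Return the abstract hedge nu_{a,b,c} as a function of y."""
    def nu(y):
        if y >= c:
            return 1.0
        if y >= b:
            return 1.0 - (c - y) ** 2 / ((c - b) * (c - a))
        if y >= a:
            return (y - a) ** 2 / ((b - a) * (c - a))
        return 0.0
    return nu


world = (1.0, 40.0, 100.0)
nu = make_hedge(0.27, 0.5, 0.8)  # illustrative parameter values
print(left_horizon(world, 7.0), nu(left_horizon(world, 7.0)))
```

The two quadratic branches of `nu` meet at y = b with the common value (b − a)/(c − a), so the hedge is continuous there.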
We will now define three special classes of intensions (recall that by $\mathcal{F}(\mathbb{R})$ we denote the set of all fuzzy sets on $\mathbb{R}$):
(i) Intensions of type Small:
\[ Sm = \{Sm_\nu : W \to \mathcal{F}(\mathbb{R}) \mid Sm_{\nu,w}(x) = \nu(L_w(x)),\ \nu \in Hf\}. \]
(ii) Intensions of type Medium:
\[ Me = \{Me_\nu : W \to \mathcal{F}(\mathbb{R}) \mid Me_{\nu,w}(x) = \nu(M_w(x)),\ \nu \in Hf\}. \]
(iii) Intensions of type Big:
\[ Bi = \{Bi_\nu : W \to \mathcal{F}(\mathbb{R}) \mid Bi_{\nu,w}(x) = \nu(R_w(x)),\ \nu \in Hf\}. \]
We are now ready to define the meaning of simple evaluating linguistic expressions. Let $\mathcal{A}$ be such an expression. Then its meaning is identified with its intension, which is a function

\[ \mathrm{Int}(\mathcal{A}) \in Sm \cup Me \cup Bi. \tag{10.12} \]
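The distinction between one intension and its many extensions maps naturally onto nested functions: an intension takes a possible world and returns a membership function. The following sketch builds an intension of type Big for the word “deep.” It assumes the hedge form implied by the constants $K_1$ and $K_2$ of this section, uses the “empty” hedge values of Table 10.1, and the two contexts (a puddle measured in centimeters, an ocean measured in kilometers) are made up for illustration. All names are hypothetical.

```python
def clip01(value):
    """Cut a value to the interval [0, 1]."""
    return max(0.0, min(1.0, value))


def make_hedge(a, b, c):
    """Abstract hedge nu_{a,b,c} (assumed piecewise quadratic form)."""
    def nu(y):
        if y >= c:
            return 1.0
        if y >= b:
            return 1.0 - (c - y) ** 2 / ((c - b) * (c - a))
        if y >= a:
            return (y - a) ** 2 / ((b - a) * (c - a))
        return 0.0
    return nu


def big(nu):
    """Intension Bi_nu: maps a world (v_L, v_S, v_R) to a membership function."""
    def intension(world):
        _, v_s, v_r = world

        def extension(x):
            # Bi_{nu,w}(x) = nu(R_w(x)), with R_w the right horizon.
            return nu(clip01((x - v_s) / (v_r - v_s)))
        return extension
    return intension


deep = big(make_hedge(0.27, 0.5, 0.8))  # "empty" hedge values from Table 10.1

puddle_cm = (0.0, 0.5, 1.0)  # made-up context, depths in centimeters
ocean_km = (0.0, 2.0, 4.0)   # made-up context, depths in kilometers

# The same intension, two different extensions.
print(deep(puddle_cm)(0.95), deep(ocean_km)(0.95))
```

Calling `deep(world)(x)` mirrors the notation $A(w)(v)$ discussed above: the intension is fixed, and only the possible world changes the resulting fuzzy set.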
Furthermore, for each possible world $w \in W$, the extension of the evaluating expression $\mathcal{A}$ in $w$ is

\[ \mathrm{Ext}_w(\mathcal{A}) = \mathrm{Int}(\mathcal{A})(w). \]

This means that if $w = \langle v_L, v_S, v_R\rangle$ is a possible world, then the extension $\mathrm{Ext}_w(\mathcal{A})$ of the evaluating expression $\mathcal{A}$ in $w$ is a fuzzy set

\[ \mathrm{Ext}_w(\mathcal{A}) \underset{\sim}{\subset} [v_L, v_R]. \]

Let us remark that the meaning of fuzzy quantities is modeled similarly to that of expressions of type “medium.” For simplicity, we have omitted details in this chapter. The interested reader should consult Mareš [1994]. If we rewrite the above formulas in more detail, we obtain:

    Int(⟨linguistic hedge⟩ small) = $Sm_\nu \in Sm$
    Int(⟨linguistic hedge⟩ medium) = $Me_\nu \in Me$
    Int(⟨linguistic hedge⟩ big) = $Bi_\nu \in Bi$

where $Sm_\nu$, $Me_\nu$, and $Bi_\nu$ are the functions defined above. Using them, we can derive for each possible world the explicit formulas for the corresponding extensions. Let the parameters of the linguistic hedge $\nu$ be $a, b, c \in [0, 1]$. Put $K_1 = (c - b)(c - a)$, $K_2 = (b - a)(c - a)$. Then the extension in each possible world $w = \langle v_L, v_S, v_R\rangle$ is given by one of the following formulas:

\[ Sm_{\nu,w}(x) = \begin{cases} 1, & x \in [v_L, c_{Sm}], \\[4pt] 1 - \dfrac{(x - c_{Sm})^2}{K_1 (v_S - v_L)^2}, & x \in (c_{Sm}, b_{Sm}], \\[8pt] \dfrac{(a_{Sm} - x)^2}{K_2 (v_S - v_L)^2}, & x \in (b_{Sm}, a_{Sm}), \\[8pt] 0, & x \ge a_{Sm}, \end{cases} \qquad \begin{aligned} c_{Sm} &= c v_L + (1 - c) v_S, \\ b_{Sm} &= b v_L + (1 - b) v_S, \\ a_{Sm} &= a v_L + (1 - a) v_S, \end{aligned} \]

\[ Me_{\nu,w}(x) = \begin{cases} 1, & x \in [c^L_{Me}, c^R_{Me}], \\[4pt] 1 - \dfrac{(c^L_{Me} - x)^2}{K_1 (v_S - v_L)^2}, & x \in [b^L_{Me}, c^L_{Me}), \\[8pt] 1 - \dfrac{(x - c^R_{Me})^2}{K_1 (v_R - v_S)^2}, & x \in (c^R_{Me}, b^R_{Me}], \\[8pt] \dfrac{(x - a^L_{Me})^2}{K_2 (v_S - v_L)^2}, & x \in (a^L_{Me}, b^L_{Me}), \\[8pt] \dfrac{(a^R_{Me} - x)^2}{K_2 (v_R - v_S)^2}, & x \in (b^R_{Me}, a^R_{Me}), \\[8pt] 0, & x \le a^L_{Me} \text{ or } x \ge a^R_{Me}, \end{cases} \qquad \begin{aligned} c^L_{Me} &= c v_S + (1 - c) v_L, & c^R_{Me} &= c v_S + (1 - c) v_R, \\ b^L_{Me} &= b v_S + (1 - b) v_L, & b^R_{Me} &= b v_S + (1 - b) v_R, \\ a^L_{Me} &= a v_S + (1 - a) v_L, & a^R_{Me} &= a v_S + (1 - a) v_R, \end{aligned} \]

\[ Bi_{\nu,w}(x) = \begin{cases} 1, & x \in [c_{Bi}, v_R], \\[4pt] 1 - \dfrac{(c_{Bi} - x)^2}{K_1 (v_R - v_S)^2}, & x \in [b_{Bi}, c_{Bi}), \\[8pt] \dfrac{(x - a_{Bi})^2}{K_2 (v_R - v_S)^2}, & x \in (a_{Bi}, b_{Bi}), \\[8pt] 0, & x \le a_{Bi}, \end{cases} \qquad \begin{aligned} c_{Bi} &= c v_R + (1 - c) v_S, \\ b_{Bi} &= b v_R + (1 - b) v_S, \\ a_{Bi} &= a v_R + (1 - a) v_S. \end{aligned} \]

The meaning of the parameters $a_{Sm}$, $b_{Sm}$, $c_{Sm}$ for small, and likewise for medium and big, is clear from Figure 10.1.
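As a worked instance of the explicit formula for “small,” the sketch below computes the breakpoints $a_{Sm}$, $b_{Sm}$, $c_{Sm}$ and the membership degree, and compares the result with the composition $\nu(L_w(x))$. It uses the possible world ⟨1, 40, 100⟩ and the parameters listed for “very” in Table 10.1. The function names are hypothetical, and the quadratic form of $\nu$ is the one implied by $K_1$ and $K_2$.

```python
def small_breakpoints(world, a, b, c):
    """Return (a_Sm, b_Sm, c_Sm) for the world (v_L, v_S, v_R)."""
    v_l, v_s, _ = world
    return tuple(p * v_l + (1.0 - p) * v_s for p in (a, b, c))


def small_explicit(world, a, b, c, x):
    """Sm_{nu,w}(x) from the explicit piecewise formula."""
    v_l, v_s, _ = world
    a_sm, b_sm, c_sm = small_breakpoints(world, a, b, c)
    k1 = (c - b) * (c - a)
    k2 = (b - a) * (c - a)
    width_sq = (v_s - v_l) ** 2
    if x <= c_sm:  # kernel; the formula is stated for x in [v_L, c_Sm]
        return 1.0
    if x <= b_sm:
        return 1.0 - (x - c_sm) ** 2 / (k1 * width_sq)
    if x < a_sm:
        return (a_sm - x) ** 2 / (k2 * width_sq)
    return 0.0


def small_composed(world, a, b, c, x):
    """Sm_{nu,w}(x) computed as nu(L_w(x)), for comparison."""
    v_l, v_s, _ = world
    y = max(0.0, min(1.0, (v_s - x) / (v_s - v_l)))
    if y >= c:
        return 1.0
    if y >= b:
        return 1.0 - (c - y) ** 2 / ((c - b) * (c - a))
    if y >= a:
        return (y - a) ** 2 / ((b - a) * (c - a))
    return 0.0


world = (1.0, 40.0, 100.0)
very = (0.35, 0.58, 0.83)  # (a, b, c) for "very" in Table 10.1

print(small_breakpoints(world, *very))
for x in (7.0, 10.0, 20.0, 30.0):
    print(x, small_explicit(world, *very, x), small_composed(world, *very, x))
```

For this world and hedge the kernel of “very small” ends at $c_{Sm} = 7.63$ and the support ends at $a_{Sm} = 26.35$, so the value 7 belongs to “very small” with degree 1.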
Figure 10.1 Scheme of the construction of the extension of evaluating linguistic expressions in the given possible world $w$. The $\nu_{a,b,c}$ function is turned over to the left of the y-axis. Dashed lines represent the horizon functions $L_w$, $M_w$, $R_w$. Each of the parameters $a, b, c$ is projected onto them and results in the corresponding parameters $a_A, b_A, c_A$ for $A := Sm, Me, Bi$. Note that $c_A$ determines the edge of the kernel, $a_A$ the edge of the support, and $b_A$ the inflexion point.
Remark 10.1 If $\mathcal{A}$ is an evaluating linguistic predication (10.6), then we put its intension equal to the intension of the evaluating expression inside (10.6).
We will say that an abstract hedge $\nu_1 \in Hf$ is sharper than $\nu_2 \in Hf$,

\[ \nu_1 < \nu_2, \quad \text{if} \quad \langle a_2, b_2, c_2\rangle < \langle a_1, b_1, c_1\rangle. \tag{10.13} \]
Note that the ordering of hedges at the same time induces an ordering of simple evaluating expressions. Using it, the natural ordering introduced in (10.4) can be modeled in fuzzy logic. We now select the following adverbs: “extremely (Ex), significantly (Si), very (Ve), more or less (ML), roughly (Ro), quite roughly (QR), very roughly (VR).” Furthermore, we choose certain numbers $a_0, b_0, c_0 \in [0, 1]$ and assign the abstract hedge $\nu_{a_0,b_0,c_0} \in Hf$ to the “empty hedge.” Then we choose three abstract hedges $\nu_{Ex}, \nu_{Si}, \nu_{Ve} \in Hf$, for which

\[ \nu_{Ex} < \nu_{Si} < \nu_{Ve} < \nu_{a_0,b_0,c_0} \]

holds, and four abstract hedges $\nu_{ML}, \nu_{Ro}, \nu_{QR}, \nu_{VR} \in Hf$, for which

\[ \nu_{a_0,b_0,c_0} < \nu_{ML} < \nu_{Ro} < \nu_{QR} < \nu_{VR} \]

holds. Finally, we assign these hedges to the words selected above and understand the former as the meaning of the latter. This procedure allows us to construct the meaning of each evaluating expression (i.e., its intension) as well as its extension in each possible world. Note that we can also define other kinds of modifiers not belonging to the above group, for example “rather.” This should be assigned the abstract hedge $\nu_{a,b,c}$ with $a > a_0$, $b \le b_0$, and $c < c_0$. Thus the above theory encompasses a large class of linguistic hedges. The empirically estimated values of the parameters $a, b, c$ of the various linguistic hedges are given in Table 10.1. Shapes of the membership functions of extensions of the experimentally determined pure evaluating expressions are depicted in Figure 10.2.

Table 10.1 Empirically estimated possible values of parameters a, b, c of some linguistic hedges. They follow the natural ordering of simple evaluating expressions in (10.4).

Linguistic hedge    a       b       c
Extremely           0.5     0.75    0.95
Significantly       0.47    0.6     0.8
Very                0.35    0.58    0.83
empty               0.27    0.5     0.8
Rather              0.4     0.5     0.8
More or less        0.23    0.45    0.76
Roughly             0.2     0.4     0.7
Quite roughly       0.15    0.32    0.65
Very roughly        0.09    0.2     0.6
Finding a suitable expression

The theory described above enables us to master the meaning of evaluating linguistic expressions and apply it to a great variety of problems. In our case, we have used it to develop an algorithm for determination of rock sequences. To gain insight into the practical application of the theory, imagine a possible world, say $w = \langle 1, 40, 100 \rangle$, and a given value $x = 7$. How can we linguistically characterize the size of x with respect to w? One may state quite naturally that this value is small or, to be more precise, even very small. The question arises whether we can mimic this empirical procedure. Certain general requirements can be formulated enabling us to develop a satisfactory algorithm. This has been described in Dvořák & Novák [1993] and is called the “Suit” procedure.
Figure 10.2 Membership functions of extensions of selected evaluating expressions from Table 10.1. The order of curves corresponds to the natural ordering from (10.4), i.e., for “small” the extensions go from “extremely” to “very roughly,” for “big” in the opposite direction, and for “medium” from “rather” to “very roughly.”
Let us consider a possible world $w = \langle v_L, v_S, v_R \rangle$ and a value $u \in w$. Then Suit is a function
$$\mathrm{Suit} : \mathbb{R} \times W \to S \tag{10.14}$$
where S is the set of all simple evaluating linguistic expressions. In practice, Suit(u, w) is an evaluating expression A such that the value u is typical for its extension $\mathrm{Ext}_w(A)$. The general idea of finding the functional value of Suit(u, w), given the possible world w and a value u ∈ w, is to choose that evaluating expression A ∈ S whose extension $\mathrm{Ext}_w(A)$ in the possible world w is the sharpest (smallest) one in the sense of the ordering (10.13), provided that the membership degree $\mathrm{Ext}_w(A)(u)$ is non-zero and maximal. We do not give more details here and refer to the software LFLC (Linguistic Fuzzy Logic Controller), versions 1.5 and 2000, developed at the University of Ostrava, in which the Suit procedure is successfully implemented.[4]
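The selection rule just described can be illustrated with a minimal sketch. It is not the LFLC implementation. It assumes, purely for illustration, that the extension of “⟨hedge⟩ small” is a trapezoid whose kernel ends at $c\,v_L + (1-c)\,v_S$ (the same point as in (10.21)) and whose support ends at $a\,v_L + (1-a)\,v_S$, with a linear ramp in between; the true shapes are those of Figure 10.2. The names `HEDGES`, `small_membership`, and `suit_small` are hypothetical, and only a subset of Table 10.1 is used.

```python
# Hedge parameters (a, c) taken from Table 10.1, listed from sharpest to widest.
# Only a subset of the table is used in this sketch.
HEDGES = [
    ("extremely", 0.50, 0.95),
    ("very", 0.35, 0.83),
    ("", 0.27, 0.80),          # the "empty" hedge
    ("roughly", 0.20, 0.70),
]


def small_membership(u, world, a, c):
    """Assumed trapezoidal extension of '<hedge> small' in the world <vL, vS, vR>."""
    v_l, v_s, _ = world
    kernel_end = c * v_l + (1 - c) * v_s    # same point as LOM in (10.21)
    support_end = a * v_l + (1 - a) * v_s   # assumed by analogy, not from the text
    if u < v_l or u >= support_end:
        return 0.0
    if u <= kernel_end:
        return 1.0
    return (support_end - u) / (support_end - kernel_end)


def suit_small(u, world):
    """Return the sharpest '<hedge> small' with non-zero, maximal membership of u."""
    degrees = [(name, small_membership(u, world, a, c)) for name, a, c in HEDGES]
    best = max(d for _, d in degrees)
    if best == 0.0:
        return None  # u is not 'small' in this world at all
    # HEDGES is ordered sharpest first, so the first maximal hit is the sharpest.
    for name, d in degrees:
        if d == best:
            return (name + " small").strip()


print(suit_small(7, (1, 40, 100)))
```

Under these assumptions the call for x = 7 in the world ⟨1, 40, 100⟩ yields “very small”, in line with the intuitive judgment made in the text. A full version would also have to cover expressions of the types “medium” and “big”.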
[4] The demo version of LFLC 2000 can be found at www.ac030.osu.cz, where information on how to order it is also provided.

10.2.3 Linguistic description and logical deduction

Linguistic description is a powerful tool that enables us to characterize in a reasonable (but not too detailed) form a situation requiring decision, the behavior of a system, a control strategy, the character of a data set, etc. Linguistic description can be understood as a slightly formalized form of a wide class of descriptions provided by people freely using natural language. Moreover, on the basis of linguistic description and of some concrete observation (in the given possible world), people are able to derive conclusions and take appropriate actions. The power of fuzzy logic consists in its ability to model both the meaning of a linguistic description and the deduction based on it, provided that an observation is given.
Interpretation of linguistic description

The linguistic description is a set of fuzzy IF–THEN rules
$$R := \{R_1, R_2, \ldots, R_m\} \tag{10.15}$$
where each $R_i$ is a conditional clause of the general form (10.7). Since, in practice, we are usually not interested in the concrete noun in the linguistic predications (10.6), we usually replace it by some variable. Thus, the fuzzy IF–THEN rules $R_i$ in fuzzy logic take the form
$$R_i := \text{IF } X \text{ is } A_i \text{ THEN } Y \text{ is } B_i. \tag{10.16}$$
The part before THEN is called the antecedent and the part after it is called the succedent. Let us also remark that the rules may have more than one variable in the antecedent. For simplicity of explanation, we do not elaborate on this case here.

Let the fuzzy IF–THEN rule R have the form (10.7). Let the meaning of the predication “noun is A” be $\mathrm{Int}(\text{noun is } A) = \mathrm{Ev}_A$ from (10.12)[5] and similarly $\mathrm{Ev}_S$ for “noun is B” (the subscript A stands for “antecedent” and S for “succedent”). Then the intension (meaning) of the fuzzy IF–THEN rule (10.7) is
$$\mathrm{Int}(R) := \mathrm{Ev}_A \Rightarrow \mathrm{Ev}_S \tag{10.17}$$
where ⇒ is an implication connective interpreted by some implication operation →. There are good reasons to take → as a Łukasiewicz implication defined by
$$a \to b = \min(1, 1 - a + b), \qquad a, b \in [0, 1]. \tag{10.18}$$
[5] In accordance with Remark 10.1, we semantically do not distinguish evaluating linguistic predication from the linguistic expression inside it.

Given a pair of possible worlds $w, w' \in W$, the extension of (10.7) is
$$\mathrm{Ext}_{w,w'}(R) := \mathrm{Ev}_{A,w} \to \mathrm{Ev}_{S,w'} \tag{10.19}$$
where $\mathrm{Ev}_{A,w}$ is the extension of the linguistic predication “noun is A” in the possible world $w$ and, similarly, $\mathrm{Ev}_{S,w'}$ is the extension of the linguistic predication “noun is B” in the possible world $w'$. The extension (10.19) is a fuzzy relation $\mathrm{Ext}_{w,w'}(R) \underset{\sim}{\subset} w \times w'$ defined by
$$\mathrm{Ext}_{w,w'}(R)(v, v') = \mathrm{Ev}_{A,w}(v) \to \mathrm{Ev}_{S,w'}(v'), \qquad v \in w,\ v' \in w'. \tag{10.20}$$
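As a small illustration of (10.18) and (10.20), the following sketch tabulates the fuzzy relation of one rule on a coarse grid. The membership functions `small` and `big` are hypothetical piecewise-linear stand-ins chosen only for the example, not the extensions used in the chapter, and `rule_extension` is an assumed helper name.

```python
def lukasiewicz(a, b):
    """Lukasiewicz implication (10.18) on truth degrees in [0, 1]."""
    return min(1.0, 1.0 - a + b)


def rule_extension(ev_a, ev_s, grid_w, grid_w2):
    """Tabulate the fuzzy relation (10.20) on two finite grids."""
    return [[lukasiewicz(ev_a(v), ev_s(v2)) for v2 in grid_w2] for v in grid_w]


# Hypothetical piecewise-linear extensions on the unit interval (illustration only).
def small(v):
    return max(0.0, 1.0 - 2.0 * v)


def big(v):
    return max(0.0, 2.0 * v - 1.0)


grid = [0.0, 0.25, 0.5, 0.75, 1.0]
relation = rule_extension(small, big, grid, grid)   # rule: IF X is small THEN Y is big
for row in relation:
    print([round(x, 2) for x in row])
```

Each row corresponds to one input value v and each column to one output value v′. Rows for inputs that are not small at all consist of ones, which reflects the fact that the rule says nothing about such inputs.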
Defuzzification

The defuzzification operation plays an important role in fuzzy logic applications. The reason is that fuzzy logic works with fuzzy sets but, at the end, we usually need some specific number as an output. Therefore, a certain procedure which transforms a fuzzy set into a number is needed. In the literature [e.g., Klir & Yuan, 1995; Novák, 1989; Novák et al., 1999], many kinds of defuzzification operations have been described. To deal with evaluating expressions, we employ the Least of Maxima (LOM), First of Maxima (FOM), and Center of Gravity (COG) methods. In our case, the first two can be computed for the respective expressions of the type small and big as follows. Let $w = \langle v_L, v_S, v_R \rangle$ be a possible world. Then
$$\mathrm{LOM}(\mathrm{Sm}_\nu(w)) = c\,v_L + (1 - c)\,v_S \tag{10.21}$$
$$\mathrm{FOM}(\mathrm{Bi}_\nu(w)) = c\,v_R + (1 - c)\,v_S \tag{10.22}$$
where c ∈ (0.5, 1] is the parameter of the corresponding linguistic hedge ν in (10.21) and (10.22). The COG method is defined by
$$\mathrm{COG}(\mathrm{Ev}(w)) = \frac{\int_w v\,\mathrm{Ev}_w(v)\,dv}{\int_w \mathrm{Ev}_w(v)\,dv}. \tag{10.23}$$
The following special defuzzification operation DEE (Defuzzification of Evaluating Expressions) works very well for evaluating linguistic expressions: let Ev be the meaning (intension) of some evaluating expression and w ∈ W be a possible world. Then the defuzzification operation is defined by
$$\mathrm{DEE}(\mathrm{Ev}(w)) = \begin{cases} \mathrm{LOM}(\mathrm{Sm}(w)), & \text{if } \mathrm{Ev} \in \mathrm{Sm} \\ \mathrm{COG}(\mathrm{Ev}(w)), & \text{if } \mathrm{Ev} \in \mathrm{Me} \\ \mathrm{FOM}(\mathrm{Bi}(w)), & \text{if } \mathrm{Ev} \in \mathrm{Bi} \end{cases} \tag{10.24}$$
Thus, the result of the defuzzification operation depends on the type of evaluating expression to be defuzzified. A schematic picture demonstrating the behavior of the DEE defuzzification method is given in Figure 10.3.
Figure 10.3 Scheme of the Defuzzification of Evaluating Expressions (DEE) method.
Logical deduction

The way people make inferences on the basis of linguistic description can be explained with an example. Let us consider a linguistic description that consists of two rules:

R1 := IF X is small THEN Y is big
R2 := IF X is big THEN Y is small.

Furthermore, let the linguistic context (possible worlds) for the variables X, Y be $w = w' = \langle 0, 0.5, 1 \rangle$. Then small values are some values around 0.3 (and smaller) and big ones some values around 0.7 (and bigger). We know from the linguistic description that small input values correspond to big output ones and vice versa. Therefore, given the input, e.g., X = 0.3, we expect the result Y ≈ 0.7 due to the rule R1, since we evaluate the input value as being small, and thus, in this case the output should be big. Similarly, for X = 0.75 we expect the result Y ≈ 0.25 due to the rule R2. We expect the formal theory of logical deduction to give analogous results.

Let R be a linguistic description (10.15). In fuzzy logic, its meaning can be represented by a set of intensions (10.17)
$$\left.\begin{aligned} \mathrm{Int}(R_1) &= \mathrm{Ev}_{A,1} \Rightarrow \mathrm{Ev}_{S,1} \\ &\ \,\vdots \\ \mathrm{Int}(R_m) &= \mathrm{Ev}_{A,m} \Rightarrow \mathrm{Ev}_{S,m} \end{aligned}\right\} \tag{10.25}$$
The logical deduction may proceed, provided that an observation Ev equal to some of the antecedents $\mathrm{Ev}_{A,i}$ from (10.25) is given. Then the modus ponens rule can be applied
$$\frac{\mathrm{Ev}_{A,i},\quad \mathrm{Ev}_{A,i} \Rightarrow \mathrm{Ev}_{S,i}}{\mathrm{Ev}_{S,i}} \tag{10.26}$$
giving as the output the expression $\mathrm{Ev}_{S,i}$. Our problem, however, consists in the fact that we are given only an observation u ∈ w in a possible world w. Because we work in the space of evaluating expressions, we transform u into the most specific antecedent $\mathrm{Ev}_{A,i_0}$ in the sense of the ordering (10.13) such that the membership degree $\mathrm{Ev}_{A,i_0,w}(u)$ of u in its extension is nonzero and maximal. This can be justified by the empirical observation that, given the possible world, each value in it can be classified by some evaluating expression (in fact, they exist in natural language just for this purpose). Furthermore, since the expressions are more or less specific, the most specific one gives the most precise information. Consequently, this expression (if it exists in the given linguistic description) is used as an argument in the modus ponens (10.26). We say that the corresponding rule $R_{i_0}$ “fired.” Logical deduction determines a function $f_R$ given by
$$f_R(u) = \mathrm{DEE}\bigl(\mathrm{Ev}_{A,i_0,w}(u) \to \mathrm{Ev}_{S,i_0,w'}\bigr) \tag{10.27}$$
where $i_0$ is the number of the rule which fired on the basis of modus ponens and the method described above. The resulting function $f_R$ is, in general, only piecewise continuous. This is not a problem in decision-making or even in fuzzy control, but can be a drawback in data approximation. On the other hand, logical deduction provides an efficient and human-like behavior. Moreover, the linguistic description in the form considered above is natural and easily understood by people. Let us note that the fuzzy transform described below enables us to overcome the non-continuity of the logical deduction and, at the same time, to preserve its main characteristics.
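The two-rule example can be traced numerically with the following sketch of (10.27). Several points are assumptions made only for illustration: “small” and “big” are modeled as linear-ramp trapezoids built from the empty-hedge parameters a = 0.27 and c = 0.8 of Table 10.1, the rule with the largest antecedent membership is taken as the one that fires (sufficient here because the description contains only two unhedged antecedents), and DEE is applied to the implied fuzzy set by taking the first point of its maximal plateau for “big” and the last one for “small”.

```python
A0, C0 = 0.27, 0.80            # "empty" hedge parameters a, c from Table 10.1
WORLD = (0.0, 0.5, 1.0)        # context <vL, vS, vR> of the example
GRID = [i / 1000 for i in range(1001)]


def small(v, world=WORLD):
    """Assumed linear-ramp extension of 'small' (kernel end as in (10.21))."""
    v_l, v_s, _ = world
    kernel = C0 * v_l + (1 - C0) * v_s
    support = A0 * v_l + (1 - A0) * v_s
    if v <= kernel:
        return 1.0
    return max(0.0, (support - v) / (support - kernel))


def big(v, world=WORLD):
    """Mirror image of 'small' on the right half of the context."""
    _, v_s, v_r = world
    kernel = C0 * v_r + (1 - C0) * v_s
    support = A0 * v_r + (1 - A0) * v_s
    if v >= kernel:
        return 1.0
    return max(0.0, (v - support) / (kernel - support))


RULES = [(small, big, "Bi"), (big, small, "Sm")]   # R1 and R2


def deduce(u):
    """f_R(u) of (10.27) for the two-rule description, evaluated on GRID."""
    antecedent, succedent, kind = max(RULES, key=lambda rule: rule[0](u))
    alpha = antecedent(u)
    if alpha == 0.0:
        return None                      # no rule fires
    implied = [min(1.0, 1.0 - alpha + succedent(v)) for v in GRID]
    top = max(implied)
    maxima = [v for v, m in zip(GRID, implied) if m >= top - 1e-9]
    # DEE (10.24): first maximum for 'big', last maximum for 'small'.
    return maxima[0] if kind == "Bi" else maxima[-1]


print(deduce(0.3), deduce(0.75))
```

Under these assumptions the sketch returns values of about 0.7 for X = 0.3 and about 0.25 for X = 0.75, which agrees with the results expected in the example. For X = 0.5 neither antecedent has non-zero membership, so no rule fires.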
Learning linguistic description from the data

The Suit function (10.14) can be effectively used for learning linguistic description on the basis of given data [see Bělohlávek & Novák, 2002]. Consider a pair of possible worlds $w, w'$ and the data
$$\left.\begin{matrix} (u_1, v_1) \\ \vdots \\ (u_N, v_N) \end{matrix}\right\} \tag{10.28}$$
such that each pair $(u_j, v_j) \in w \times w'$. We suppose that these data can be characterized using some linguistic description (10.15) consisting of rules of the form
$$R_i := \text{IF } X \text{ is } A_i \text{ THEN } Y \text{ is } B_i \tag{10.29}$$
where $A_i, B_i$ are evaluating linguistic expressions such that for each pair $(u_j, v_j)$ of the data
$$A_i = \mathrm{Suit}(u_j, w) \quad \text{and} \quad B_i = \mathrm{Suit}(v_j, w') \tag{10.30}$$
hold true. Hence, each data item leads to some fuzzy IF–THEN rule (10.29) formed using (10.30). Of course, it may happen that two different data items lead to the same IF–THEN rule. Moreover, some learned rules may turn out to be superfluous when used in logical deduction. Hence, the following learning procedure leads to a linguistic description which has the ability to approximate the data (10.28).

1. Repeat the learning procedure (10.30) for all j = 1, . . . , N and generate the linguistic description {R1, . . . , RN}.
2. Reduce the learned linguistic description as follows:
   (a) Replace all the sets of identical rules by one rule only.
   (b) Let Ri and Rk be two learned rules such that their succedents are identical. Let the antecedent of Ri be wider (in the sense of the ordering (10.13)) than that of Rk. Then exclude the rule Rk.
   (c) Let Ri and Rk be two learned rules such that their antecedents are identical. Let the succedent of Ri be sharper (in the sense of (10.13)) than that of Rk. Then exclude the latter rule.

The result of the learning procedure is a linguistic description of the form (10.15). We employ this procedure to model ancient sea level position (see below). Let us remark, however, that the learned linguistic description may still need some expert modification. The need for it arises from the character of the data and the vagueness of the antecedents of the rules. If the values ui in (10.28) are fairly close, and the values vi vary, then the learning procedure may lead to rules that have the same antecedent but different succedents. Therefore, some of the rules have to be deleted (or modified) on the basis of special expert knowledge. This problem should be the subject of further investigation.
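The reduction step 2 of the procedure above can be sketched as follows. The representation is an assumption made for the example: each evaluating expression is a pair (hedge, base) using the abbreviations of the text, two expressions are comparable only when they share the same base, and the hedge “rather”, which lies outside the chain of hedges, is left out. The names `HEDGE_ORDER`, `sharper`, and `reduce_rules` are hypothetical, and the rules in `learned` are made up.

```python
# Hedges ordered from sharpest to widest, following the chain given in the text.
HEDGE_ORDER = ["Ex", "Si", "Ve", "", "ML", "Ro", "QR", "VR"]
RANK = {hedge: i for i, hedge in enumerate(HEDGE_ORDER)}


def sharper(e1, e2):
    """True if expression e1 = (hedge, base) is sharper than e2 with the same base."""
    return e1[1] == e2[1] and RANK[e1[0]] < RANK[e2[0]]


def reduce_rules(rules):
    """Apply reduction steps (a)-(c) to a list of (antecedent, succedent) rules."""
    # (a) keep one copy of each identical rule, preserving order
    rules = list(dict.fromkeys(rules))
    # (b) same succedent: drop the rule whose antecedent is sharper (keep the wider one)
    rules = [r for r in rules
             if not any(o[1] == r[1] and sharper(r[0], o[0]) for o in rules)]
    # (c) same antecedent: drop the rule whose succedent is wider (keep the sharper one)
    rules = [r for r in rules
             if not any(o[0] == r[0] and sharper(o[1], r[1]) for o in rules)]
    return rules


learned = [
    (("Ve", "Sm"), ("", "Bi")),
    (("Ve", "Sm"), ("", "Bi")),    # duplicate, removed by (a)
    (("", "Sm"), ("", "Bi")),      # wider antecedent, supersedes the first rule by (b)
    (("", "Me"), ("Ro", "Sm")),
    (("", "Me"), ("Ve", "Sm")),    # sharper succedent, supersedes the previous rule by (c)
]
for antecedent, succedent in reduce_rules(learned):
    print("IF X is", "".join(antecedent), "THEN Y is", "".join(succedent))
```

In this toy run five learned rules shrink to two. Rules with identical antecedents but succedents of different bases (for example Sm and Bi) are not comparable and would both survive, which is exactly the situation the text says must be resolved by an expert.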
10.2.4 Fuzzy transform

The fuzzy transform technique developed by I. Perfilieva is described in Chapter 9 (see also Perfilieva [2003]). We briefly review it with respect to our methods and notation.
The F-transform is applied to some continuous function f(x) defined on an interval of real numbers $[v_L, v_R] \subset \mathbb{R}$. From the point of view of the explanation in previous sections, this interval is our possible world (the context of use). As above, we will write x ∈ w in the meaning $x \in [v_L, v_R]$. The interval w is divided by a set of equidistant nodes $x_{0k} = v_L + h(k - 1)$, k = 1, . . . , n, where
$$h = \frac{v_R - v_L}{n - 1}, \qquad n \ge 2, \tag{10.31}$$
is the fixed distance between neighboring nodes.

Direct F-transform

In the direct F-transform we next define n basic functions, which cover w and take the role of fuzzy points (or granules), dividing w into n vague areas. For us, the basic functions will be fuzzy numbers $\mathrm{Fn}_{\nu,x_0}$, where ν is a linguistic hedge and $x_0$ is the central point around which the fuzzy number is defined. Thus, the fuzzy number is an extension of the linguistic expression “approximately $x_0$.” The considered membership function is the following:
$$
\mathrm{Fn}_{\nu,x_0}(x) =
\left\{
\begin{array}{lll}
1, & x \in [c^L_{x_0}, c^R_{x_0}], & c^L_{x_0} = x_0 - (1-c)h,\quad c^R_{x_0} = x_0 + (1-c)h \\[4pt]
1 - \dfrac{(c^L_{x_0} - x)^2}{K_1 h^2}, & x \in [b^L_{x_0}, c^L_{x_0}), & b^L_{x_0} = x_0 - (1-b)h \\[4pt]
1 - \dfrac{(x - c^R_{x_0})^2}{K_1 h^2}, & x \in (c^R_{x_0}, b^R_{x_0}], & b^R_{x_0} = x_0 + (1-b)h \\[4pt]
\dfrac{(x - a^L_{x_0})^2}{K_2 h^2}, & x \in (a^L_{x_0}, b^L_{x_0}), & a^L_{x_0} = x_0 - (1-a)h \\[4pt]
\dfrac{(a^R_{x_0} - x)^2}{K_2 h^2}, & x \in (b^R_{x_0}, a^R_{x_0}), & a^R_{x_0} = x_0 + (1-a)h \\[4pt]
0, & x \le a^L_{x_0} \text{ or } x \ge a^R_{x_0} &
\end{array}
\right.
\tag{10.32}
$$
Note that $\mathrm{Fn}_{\nu,x_0}(x_0) = 1$ and $\mathrm{Fn}_{\nu,x_0}(x_0 \pm h) = 0$. Thus, one fuzzy number is spread over three neighboring nodes $x_0 - h$, $x_0$, $x_0 + h$. Furthermore,
$$\sum_{k=1}^{n} \mathrm{Fn}_{\nu,x_{0k}}(x) = 1$$
holds for each x ∈ w. Consequently, each x ∈ w is covered by exactly two neighboring fuzzy numbers $\mathrm{Fn}_{\nu,x_{0k}}$, $\mathrm{Fn}_{\nu,x_{0,k+1}}$. This means that $x_{0k} \le x \le x_{0,k+1}$ and $\mathrm{Fn}_{\nu,x_{0k}}(x) + \mathrm{Fn}_{\nu,x_{0,k+1}}(x) = 1$. This property is used below for smooth logical deduction.

Using the fuzzy numbers as basic functions, we transform the values of the given function f(x) into an n-tuple of real numbers $[F_1, \ldots, F_n]$. In reality (and also in the application described below) the function f(x) is known only at some points $x_1, \ldots, x_N$. Thus, the input data form a set of couples
$$\left.\begin{matrix} (x_1, f(x_1)) \\ \vdots \\ (x_N, f(x_N)) \end{matrix}\right\} \tag{10.33}$$
Then the direct F-transform is defined by
$$F_k = \frac{\sum_{j=1}^{N} f(x_j)\,\mathrm{Fn}_{\nu,x_{0k}}(x_j)}{\sum_{j=1}^{N} \mathrm{Fn}_{\nu,x_{0k}}(x_j)}, \qquad k = 1, \ldots, n. \tag{10.34}$$
Inverse F-transform

After realizing the direct transform, we can transform $[F_1, \ldots, F_n]$ back to obtain a function
$$f_{F,n}(x) = \sum_{k=1}^{n} F_k \cdot \mathrm{Fn}_{\nu,x_{0k}}(x). \tag{10.35}$$
The result of the application of the F-transform to (10.33) is the filtered data
$$\left.\begin{matrix} (x_1, f_{F,n}(x_1)) \\ \vdots \\ (x_N, f_{F,n}(x_N)) \end{matrix}\right\} \tag{10.36}$$
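The direct and inverse transforms (10.31), (10.34), and (10.35) can be sketched as below. One substitution should be noted plainly: instead of the hedge-parameterized fuzzy number (10.32), whose constants $K_1$ and $K_2$ are defined elsewhere in the chapter, the sketch uses simple triangular basic functions, which also equal 1 at their node, vanish at the neighboring nodes, and sum to 1 on the whole interval. All function names and the sample data are hypothetical.

```python
def nodes(v_l, v_r, n):
    """Equidistant nodes x_0k and spacing h of (10.31)."""
    h = (v_r - v_l) / (n - 1)
    return [v_l + h * k for k in range(n)], h


def basic(x, x0, h):
    """Triangular basic function centred at x0 (stand-in for (10.32))."""
    return max(0.0, 1.0 - abs(x - x0) / h)


def direct(xs, ys, v_l, v_r, n):
    """Direct F-transform (10.34): one weighted mean per node."""
    centres, h = nodes(v_l, v_r, n)
    comps = []
    for x0 in centres:
        weights = [basic(x, x0, h) for x in xs]
        total = sum(weights)
        # A node with no data in its support has no defined component.
        comps.append(sum(w * y for w, y in zip(weights, ys)) / total if total else None)
    return comps


def inverse(x, comps, v_l, v_r):
    """Inverse F-transform (10.35) evaluated at a single point x."""
    centres, h = nodes(v_l, v_r, len(comps))
    return sum(f * basic(x, x0, h) for f, x0 in zip(comps, centres) if f is not None)


# Hypothetical noisy samples of a straight line on [0, 10].
xs = [i * 0.5 for i in range(21)]
ys = [2.0 * x + (0.3 if i % 2 else -0.3) for i, x in enumerate(xs)]
comps = direct(xs, ys, 0.0, 10.0, n=6)
smoothed = [inverse(x, comps, 0.0, 10.0) for x in xs]
print([round(c, 2) for c in comps])
```

The number of nodes n controls the amount of smoothing: few nodes give a strongly filtered curve, many nodes follow the data closely.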
It can be proved that if n increases then $f_{F,n}(x_j)$ converges to $f(x_j)$. Let us also remark that the F-transform has very nice filtering properties and is easy to compute. Therefore, we have used it to solve the sea level determination problem.

Smooth logical deduction

Recall that logical deduction gives a piecewise continuous function $f_R$ defined in (10.27). This drawback can be overcome by joining it with the F-transform. The idea is very simple. Let a linguistic description R (10.15) be given. Furthermore, consider the linguistic context (possible world) w and let the value $u_0 \in w$ be given. On the basis of R and $u_0$, the result of logical deduction at the point $u_0$ is $f_R(u_0)$, by (10.27). Then the function $f_R(x)$ may take the role of the above-considered function f(x) to be filtered using the F-transform. First we choose some smoothing number, which is the number of nodes n, and compute the distance h in (10.31). Since only one data item is to be filtered, namely $(u_0, f_R(u_0))$, only two numbers $F_k, F_{k+1}$ in (10.34) must be computed for some nodes $x_{0k}, x_{0,k+1}$ such that $x_{0k} \le u_0 \le x_{0,k+1}$. To do this we have to choose a step r > 0, compute a sequence of values $u_1 = x_{0,k-1}$, $u_2 = u_1 + r$, $u_3 = u_1 + 2r$, . . . , $u_p = x_{0,k+2}$ lying between the nodes $x_{0,k-1}$ and $x_{0,k+2}$, and generate a sequence of auxiliary data
$$\left.\begin{matrix} (u_1, f_R(u_1)) \\ \vdots \\ (u_p, f_R(u_p)) \end{matrix}\right\} \tag{10.37}$$
where $f_R(u_j)$, j = 1, . . . , p, is the result of the logical deduction (10.27) based on the linguistic description R and the input value $u_j$. Now, the smooth logical deduction consists of two steps. First we compute the numbers $F_k, F_{k+1}$ according to
$$F_k = \frac{\sum_{j=1}^{p} f_R(u_j) \cdot \mathrm{Fn}_{\nu,x_{0k}}(u_j)}{\sum_{j=1}^{p} \mathrm{Fn}_{\nu,x_{0k}}(u_j)} \tag{10.38}$$
(note that in (10.38) it suffices to sum only over values $x_{0,k-1} \le u_j \le x_{0,k+1}$ in the case of $F_k$ and $x_{0k} \le u_j \le x_{0,k+2}$ in the case of $F_{k+1}$, since the membership function of the corresponding fuzzy number is zero otherwise). Second, we compute the resulting smoothed output using
$$f_{R,n}(u_0) = F_k \cdot \mathrm{Fn}_{\nu,x_{0k}}(u_0) + F_{k+1} \cdot \mathrm{Fn}_{\nu,x_{0,k+1}}(u_0). \tag{10.39}$$
The scheme of the smooth logical deduction is depicted in Figure 10.4. As can be seen, the algorithm is more complicated than the simple logical deduction (10.27). However, its advantage lies in both its ability to mimic human reasoning based on linguistic description and its continuity.
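The two steps (10.38) and (10.39) can be sketched as follows, again with triangular basic functions standing in for the fuzzy numbers (10.32). The function `step` is a made-up discontinuous deduction result used only to show the smoothing effect; in the actual method $f_R$ would come from (10.27). Clipping the auxiliary points to the context at its borders is an assumption of this sketch.

```python
def basic(x, x0, h):
    """Triangular basic function, a stand-in for the fuzzy number (10.32)."""
    return max(0.0, 1.0 - abs(x - x0) / h)


def smooth_deduction(f_r, u0, v_l, v_r, n, r):
    """Smoothed output f_{R,n}(u0) following (10.37)-(10.39)."""
    h = (v_r - v_l) / (n - 1)                       # (10.31)
    k = min(int((u0 - v_l) / h), n - 2)             # 0-based: x_k <= u0 <= x_{k+1}
    x_k, x_k1 = v_l + k * h, v_l + (k + 1) * h
    # Auxiliary inputs between x_{k-1} and x_{k+2}, clipped to the context.
    lo, hi = max(v_l, x_k - h), min(v_r, x_k1 + h)
    us = [lo + j * r for j in range(int(round((hi - lo) / r)) + 1)]
    fs = [f_r(u) for u in us]                       # (10.37)

    def component(x0):                              # (10.38)
        weights = [basic(u, x0, h) for u in us]
        return sum(w * f for w, f in zip(weights, fs)) / sum(weights)

    return (component(x_k) * basic(u0, x_k, h)
            + component(x_k1) * basic(u0, x_k1, h))  # (10.39)


def step(u):
    """Hypothetical discontinuous deduction result (illustration only)."""
    return 0.2 if u < 0.5 else 0.8


for u0 in (0.40, 0.48, 0.52, 0.60):
    print(u0, round(smooth_deduction(step, u0, 0.0, 1.0, n=6, r=0.01), 3))
```

The jump of `step` at 0.5 is replaced by a gradual transition between the two neighboring nodes. A smaller n widens the transition, and a smaller step r only refines the sums in (10.38).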
Figure 10.4 Scheme of the smooth logical deduction.

10.3 Automatic Determination of Rock Sequences

10.3.1 Geological characterization

We tested this procedure on two detailed, measured stratigraphic sections from Lower Paleozoic carbonates. The first case is from the Lower Ordovician El Paso Group from west Texas described in Goldhammer et al. [1993] and given in Appendix C of that paper. The second is an approximately 250 meter thick composite section of the Upper Cambrian Conococheague Limestone from western Maryland. Details of the sedimentology of the Conococheague Limestone can be found in Demicco [1985]. Both sections comprise hundreds of individual measured rock units distributed among eight rock types: (1) stromatolites; (2) dolomitic mudstones; (3) thrombolitic bioherms (that contain sponges in the Ordovician section); (4) cross-stratified grainstones; (5) wavy bedded “ribbon rocks”; (6) mud cracked cryptomicrobial laminites; (7) mud cracked planar laminites; and (8) breccias (in the Cambrian Conococheague Limestone) or aeolian sandstones (in the El Paso Group). Rock types 1 and 2 represent deepest subtidal settings. Rock types 3 and 4 represent shallow subtidal shoals and shallow subtidal bioherms (these may be lateral equivalents of each other in terms of absolute depth). Rock types 5, 6, and 7 were deposited on lowest intertidal mixed sand and mud flats, high intertidal mud flats covered with cyanobacterial mats, and supratidal playa-like mudflats, respectively. Rock type 8 in both cases represents desiccated to completely disrupted sabkhas.

Lower Paleozoic carbonates are, in general, divisible into two orders of cycles: fourth-order cycles 1 to 10 meters thick, and third-order cycles tens up to one hundred meters thick [Goldhammer et al., 1993; Kerans & Tinker, 1997]. Systematic variations in the thickness and rock types that comprise the fourth-order cycles (i.e., the “stacking pattern”) define the third-order cycles, which are directly analogous to seismic sequences in that they record long-term (one to a few tens of millions of years) variations in rates of sea level rise and fall. Third-order cycles start with exposure breccias (or, in the case of the El Paso Group, aeolian flats) that are followed by fourth-order cycles deposited on tidal flats that systematically become thicker upward as accommodation increases during rapid third-order sea level rise.
When accommodation potential is highest, cycles are thickest (up to 10 meters) and are capped either by thick thrombolitic bioherms or by shoal deposits. (Note that this is a slightly different way of defining subtidal cycles than was given by Goldhammer et al. [1993].) Finally, as the rate of third-order sea level rise slows, cycles become thinner upward and once again contain more intertidal and supratidal flat deposits, culminating in exposure breccias or wind-swept sabkhas at the slowest point of third-order sea level rise (the next sequence boundary). Figure 10.5 shows 75 m of section illustrating changes in stacking patterns of the Conococheague Limestone from a zone of maximum accommodation through thinning intertidal cycles to a breccia marking a sequence boundary. Goldhammer et al. [1993] described similar stacking patterns in the El Paso Group of west Texas.
Figure 10.5 Illustration of changes in stacking patterns of the Conococheague Limestone from a zone of maximum accommodation through thinning intertidal cycles to a breccia marking a sequence boundary.
10.3.2 General rules and the fuzzy algorithm

Our point of departure is to develop a series of “IF–THEN rules” specified by a geologist that capture the elements of fourth-order cycles and how they are organized into third-order cycles.

1. The lower the number of a rock type, the deeper the water it was deposited in; however, there may be errors and exceptions.
2. The higher the number of a rock type, the shallower the water it was deposited in.
3. Rock types 6, 7, and 8 were deposited at or slightly above ancient sea level.

Furthermore, more detailed linguistic rules for interpreting “accommodation potential” in limestone are to be used:

A1. If rock types 6, 7, or 8 are “far above” rock types 1, 2, 3 then sequences usually start with rock type 3 and end with 4 or 5.
A2. If rock types 6, 7, or 8 are “not far above” rock types 1, 2, 3 then sequences usually start with rock type 3 and end in rock type 6 or 7, rarely 8.
A3. Sequences are usually 1 to 10 m thick.
A4. If most sequences end in rock type 4, or if the majority of the section is rock types 3 and 4, then this is a zone of maximum accommodation.
A5. If most sequences end in rock types 6 or 7 and sequences are becoming thinner upward then this is a zone of decreasing accommodation.
A6. If most sequences end in rock types 6 and some with 7, or if the majority of the section is rock type 5, then this is a zone of increasing accommodation.
A7. If the section is between a zone of decreasing accommodation and a zone of increasing accommodation, then this is a zone of minimal accommodation and should correspond to rock type 8.

On the basis of the above principles, a special algorithm for determination of rock sequences has been developed. Its main feature is the use of the evaluating linguistic expressions for branching inside the algorithm. The input data have the following form:

| Rock type | Rock thickness |
|---|---|
| 9 | 9.9 |
| 9 | 9.9 |
| ... | ... |
The algorithm for determination of rock sequences is based especially on rules A1–A3. Its global description is the following:

1. Find all potential ends of sequences. These should be rocks of type 6, 7, or 8 if they are followed by rock type 1, 2, or 3 (rule A2). If it happens that the given rock has a lower number followed again by 6, 7, or 8 and it is too thin then it is ignored.
2. Check whether the obtained sequences are sufficiently thick (rule A3). If the given sequence is too thin then it is joined with the following one, provided that the resulting sequence does not become too thick.
3. If the given sequence is too thick then it is further divided using rule A1. For this, check all rock types 4 and mark them as ends of a new sequence provided that the new sequence is not too thin; mark the new sequence only if it is sufficiently thick.

The linguistic expressions “too thin,” “too thick,” and “sufficiently thick” used in the algorithm are evaluating expressions which are modeled using the theory described above. More concretely,

● too thin means: the thickness is significantly small or smaller,
● too thick means: the thickness is very big or bigger,
● sufficiently thick means: the thickness is medium.
The given input value (rock or sequence thickness) is evaluated by means of the assignment of the proper evaluating expressions using the Suit procedure (10.14). However, to do this properly, first the linguistic context (possible world) $w = \langle v_L, v_S, v_R \rangle$ must be specified. This depends on the specified data, that is, on the character of the area where they have been obtained.
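Steps 1 and 2 of the algorithm can be sketched in simplified form as follows. This is a rough illustration under stated assumptions, not the authors' program: the fuzzy evaluations “too thin” and “too thick” are passed in as plain predicates (crisp thresholds in the example) instead of being obtained from Suit, the exception for thin low-numbered rocks in step 1 and the whole of step 3 are omitted, and the section data are invented.

```python
def find_sequences(units, too_thin, too_thick):
    """Split (rock_type, thickness) units into sequences; returns lists of units."""
    # Step 1: a unit of type 6-8 followed by type 1-3 closes a sequence.
    sequences, current = [], []
    for i, (rock, thickness) in enumerate(units):
        current.append((rock, thickness))
        nxt = units[i + 1][0] if i + 1 < len(units) else None
        if rock in (6, 7, 8) and nxt in (1, 2, 3):
            sequences.append(current)
            current = []
    if current:
        sequences.append(current)

    def total(seq):
        return sum(t for _, t in seq)

    # Step 2: join a too-thin sequence with its successor unless the result is too thick.
    merged, i = [], 0
    while i < len(sequences):
        seq = sequences[i]
        while (i + 1 < len(sequences) and too_thin(total(seq))
               and not too_thick(total(seq) + total(sequences[i + 1]))):
            seq = seq + sequences[i + 1]
            i += 1
        merged.append(seq)
        i += 1
    return merged


# Hypothetical section in stratigraphic order: (rock type, thickness in m).
section = [(3, 0.6), (5, 0.4), (6, 0.2), (2, 0.1), (7, 0.1),
           (3, 1.5), (4, 2.0), (6, 0.5), (1, 0.8)]
result = find_sequences(section, too_thin=lambda t: t < 0.5, too_thick=lambda t: t > 10.0)
print([[rock for rock, _ in seq] for seq in result])
```

In this made-up section the thin sequence consisting of rock types 2 and 7 is joined with the following one. Replacing the two lambdas by predicates built on the Suit evaluation would bring the sketch closer to the procedure described in the text.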
10.3.3 Results of tests

The two sets of data we tested are denoted as D1 [Demicco, 1985] and G1 [Goldhammer et al., 1993], respectively. The measuring units are in meters. The linguistic context for characterization of rocks and their sequences is the following:

| Data | Context of thickness | Possible world | Typical value |
|---|---|---|---|
| D1 | rock | w = ⟨0, 4.8, 12⟩ | too thin = 0.6 |
| D1 | sequence | w = ⟨0, 9.6, 24⟩ | normal = 9.8 |
| G1 | rock | w = ⟨0, 2, 5⟩ | too thin = 0.2 |
| G1 | sequence | w = ⟨0, 2, 5⟩ | normal = 2 |
The typical value is the value representing what is too thin or normal in the given context. This means that any other value around it (or smaller in the case of “too thin”) is evaluated accordingly. The results of sequence determination have been compared with the geologist’s expert judgment. Let us stress that the geologist also draws on other, quite specialized experience, whereas the algorithm is nondeterministic and allows ambiguous solutions.

Data G1

These data are from the Lower Ordovician El Paso Group observed in the Franklin Mountains of west Texas [see Goldhammer et al., 1993]. The total number of rock units in this data set is 272, which comprises a significant proportion of the measured section shown in Appendix C of Goldhammer et al. [1993]. The geologist determined 99 rock sequences and our algorithm determined 97 sequences. Of these, 87 coincide with the geologist’s solution, an 88% agreement. The first discrepancy between the geologist’s solution and our algorithm occurs with sequence No. 43 (numbered by the algorithm). This consists of the rocks 3, 5, 3, 5, 3, 5. The corresponding data are summarized in Table 10.2. One can see from it that our algorithm checked all the geologist’s sequences as well. However, since all the rocks of type 5 are too thin (thickness = 0.1 m) and, at the same time, the considered sequence G43 (geologist’s numbering) is rather thin (thickness = 0.7 m), and even the sequence G43 + G44 is still quite roughly thin (thickness = 1.3 m), both sequences were neglected by the algorithm and joined into one larger sequence No. 43. The other discrepancies are similar. Conversely, our algorithm has marked three sequences which the geologist has neglected. The corresponding data are in Table 10.3. In all cases, the rock thickness, though small, does not correspond to the description “too thin,” and thus the algorithm decided to mark one more sequence in comparison with the geologist’s point of view. There are two other similar cases in the G1 data set.
Table 10.2 Sequence No. 43 (determined by the algorithm) from the point of view of the geologist.

| Rock type | Rock thickness | Rock character | Sequence thickness | Sequence character | Geologist’s number |
|---|---|---|---|---|---|
| 3 | 0.6 | Sm | 0.6 | | G43 |
| 5 | 0.1 | SiSm | 0.7 | RaSm | G43 |
| 3 | 0.5 | Sm | 1.2 | | G44 |
| 5 | 0.1 | SiSm | 1.3 | QRSm | G44 |
| 3 | 0.5 | Sm | 1.8 | | G45 |
| 5 | 0.1 | SiSm | 1.9 | RaMe | G45 |
Table 10.3 Geologist’s sequence No. G66 divided into sequences Nos. 63 and 64 (numbering by the algorithm).

| Rock type | Rock thickness | Rock character | Sequence thickness | Sequence character | Algorithm number |
|---|---|---|---|---|---|
| 3 | 0.3 | VeSm | 0.3 | | 63 |
| 4 | 0.9 | MLSm | 1.2 | RoSm | 63 |
| 3 | 0.3 | VeSm | 0.3 | | 64 |
| 5 | 0.5 | Sm | 0.8 | RaSm | 64 |
Let us stress that we can hardly expect the above discrepancies to be removed using the criteria specified above. In other words, full agreement with the geologist can be obtained only by using other information not included in the algorithm as of now.
Data D1

These data have been obtained in western Maryland from the Upper Cambrian Conococheague Limestone. The total number of rock units in this data set is 226. The geologist determined 44 sequences while our algorithm determined 47 sequences. Of these, 36 coincide with the geologist’s solution, which means an 81% agreement. The character of the discrepancies is the same as in data set G1.
10.4 Sea Level Estimation

Ancient sea level is estimated on the basis of the stacking pattern of sequences determined in the previous step. Each determined sequence corresponds to a certain time period which, for simplicity, is assumed to be constant. The determined sequence thickness then corresponds in a uniform way to the sea level in the given time period. The input data for sea level estimation are derived from the data sets given in the previous section and contain the sequence number (which coincides with the time period) and the sequence thickness determined in the previous step. These data are input to further analysis. Its result is a file summarizing all information about ancient sea level; part of this file for the data G1 is shown in Table 10.4. The first column contains the sequence number and the second one its thickness (these have been determined using the algorithm described above). Three “approximation” columns contain the estimated sea level using the three methods described above. These methods yield different sea level estimations from the standard “Accommodation Plots” of the data [see Figure 14 in Goldhammer et al., 1993]. The third column is obtained using the fuzzy F-transform.
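Table 10.4 Part of the G1 data. The first two columns are input values; the last three are approximations (LD = logical deduction).

| Sequence number | Sequence thickness | F-transform | LD from seq. thickness | LD from F-transform |
|---|---|---|---|---|
| 1 | 8.4 | 5.1 | 4.2 | 5.0 |
| 2 | 3.6 | 5.1 | 4.2 | 5.0 |
| 3 | 3.8 | 5.1 | 4.2 | 5.0 |
| 4 | 1.5 | 5.4 | 4.2 | 5.0 |
| 5 | 3.0 | 5.5 | 4.2 | 5.0 |
| 6 | 11.3 | 5.5 | 4.2 | 5.0 |
| ... | ... | ... | ... | ... |
| 92 | 0.9 | 2.1 | 2.3 | 2.3 |
| 93 | 2.9 | 2.1 | 2.4 | 2.4 |
| 94 | 6.3 | 2.8 | 2.6 | 2.5 |
| 95 | 1.8 | 4.5 | 2.7 | 2.6 |
| 96 | 8.5 | 5.1 | 3.3 | 3.5 |
| 97 | 4.0 | 5.1 | 3.8 | 4.2 |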
Figure 10.6 Estimation of ancient sea level. Source: data G1, automatically determined rock sequences. Estimation obtained using F-transform.
Figure 10.7 Estimation of ancient sea level. Source: data G1, automatically determined rock sequences used for learning of linguistic description with fuzzy numbers in the antecedent. Estimation obtained using the learned linguistic description and smooth logical deduction.
Figure 10.8 Estimation of ancient sea level. Source: data G1, automatically determined rock sequences used for learning of linguistic description with evaluating expressions in the antecedent. Estimation obtained using the learned linguistic description and smooth logical deduction.
The fourth column is computed using the linguistic description of ancient sea level behavior learned from the determined thickness of sequences. The fifth column is computed in the same way, except that the linguistic description is learned from the F-transformed sequence thicknesses. The advantage of the latter two methods is that the linguistic description of third-order and fourth-order sequence significance is at our disposal. Hence, the results can be modified on the basis of a geologist’s additional knowledge or information.
Some of the obtained results are depicted in Figures 10.6–10.12. Figure 10.6 presents the approximation of the ancient sea level from the data G1 using the F-transform (third column of the above data). Figure 10.7 shows the estimation of sea level using the linguistic description learned from the original data (first and second columns). The antecedent has been learned using fuzzy numbers; the succedent contains general evaluating expressions. The linguistic context of the time slice (antecedent) is w1 = [1.0, 97.0] and that of the change of sea level (succedent) is w2 = [0.0, 5.0] (the latter coincides with the context for thickness of sequences). The learned linguistic description has the form given in Table 10.5. The expressions in the tables are abbreviated in accordance with Section 10.2. The expression “Ap5” means “approximately 5” since we work with fuzzy numbers only. Some contradictory rules have been manually deleted (the original learned description had 67 rules).
For comparison, the same approximation using a learned linguistic description that has only evaluating expressions in its antecedent is presented in Figure 10.8. This linguistic description has the form given in Table 10.6 (after manual deletion of some contradictory rules; the original description had 25 rules). Figure 10.9 shows the sea level estimated, again, using a learned linguistic description, but with the third column used for learning instead of the second one. Similarly, Figure 10.10 shows the same estimation using a linguistic description with evaluating expressions only. These descriptions have not been manually modified.
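The F-transform filtering of sequence thicknesses mentioned above can be illustrated with a minimal Python sketch. It assumes a uniform fuzzy partition of the context [1, 97] by triangular basic functions, the discrete direct transform taken as a weighted mean of the data under each basic function, and the inverse transform taken as the combination of the components with the same basic functions. The function names, the number of basic functions, and the sample thicknesses are hypothetical and are not the chapter's data or settings.

```python
def triangular_partition(a, b, n):
    """Nodes and triangular basic functions of a uniform fuzzy partition of [a, b]."""
    h = (b - a) / (n - 1)                      # distance between neighbouring nodes
    nodes = [a + k * h for k in range(n)]

    def make_basic(node):
        # Triangle with peak 1 at `node` and support of half-width h.
        return lambda x: max(0.0, 1.0 - abs(x - node) / h)

    return nodes, [make_basic(node) for node in nodes]


def direct_f_transform(xs, ys, basics):
    """Discrete direct F-transform: one weighted mean of the data per basic function.

    Assumes every basic function covers at least one data point
    (otherwise the denominator would be zero).
    """
    components = []
    for basic in basics:
        weights = [basic(x) for x in xs]
        components.append(sum(w * y for w, y in zip(weights, ys)) / sum(weights))
    return components


def inverse_f_transform(x, components, basics):
    """Inverse F-transform: smooth approximation of the original data at x."""
    return sum(c * basic(x) for c, basic in zip(components, basics))


# Hypothetical thickness values for time slices 1..97 (illustration only).
time_slices = [float(t) for t in range(1, 98)]
thickness = [2.0 + ((7 * t) % 5) * 0.5 for t in range(1, 98)]

nodes, basics = triangular_partition(1.0, 97.0, 13)
components = direct_f_transform(time_slices, thickness, basics)
smoothed = [inverse_f_transform(t, components, basics) for t in time_slices]
```

Fewer basic functions give a smoother curve, more basic functions follow the data more closely; the value 13 here is an arbitrary choice for the illustration.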
Figure 10.9 Estimation of ancient sea level. Source: data G1, automatically determined rock sequences. These have been first filtered using F-transform and the result has been used for learning of linguistic description with fuzzy numbers in the antecedent. Estimation obtained using the learned linguistic description and smooth logical deduction.
Figure 10.10 Estimation of ancient sea level using the same procedure as in Figure 10.9, but the linguistic description is formed only of evaluating expressions.
Figure 10.11 Estimation of ancient sea level for data D1 using the same procedure as in Figure 10.9.
Figure 10.12 The same estimation as in Figure 10.11, but the linguistic description has been modified manually on the basis of additional knowledge.
Table 10.5 Learned linguistic description for the G1 data.

Rule No.   Time slice [1.0, 97.0]   ⇒   Sea level [0.0, 5.0]
1          Ro1                      ⇒   MLBi
2          Ro2.9                    ⇒   VRSm
3          Ro5                      ⇒   QRBi
4          Ro7                      ⇒   Ap5
5          Ro9                      ⇒   VRBi
6          Ro9                      ⇒   Ap5
7          Ro11                     ⇒   RaMe
...        ...                      ⇒   ...
46         Ro87                     ⇒   MLSm
47         Ro89                     ⇒   Sm
48         Ro91                     ⇒   MLSm
49         Ro93                     ⇒   QRBi
50         Ro95                     ⇒   MLMe
51         Ro97                     ⇒   Bi
Table 10.6 Manually modified linguistic description containing evaluating expressions only.

Rule No.   Time slice [1.0, 97.0]   ⇒   Sea level [0.0, 5.0]
1          RaSm                     ⇒   MLBi
2          RoSm                     ⇒   RaMe
3          QRSm                     ⇒   RoBi
4          VRSm                     ⇒   VRSm
5          MLMe                     ⇒   MLMe
6          Bi                       ⇒   RoBi
7          VeBi                     ⇒   VRBi
8          SiBi                     ⇒   Sm
9          ExBi                     ⇒   QRBi
10         VR97                     ⇒   Bi
An example of the power of linguistic description is shown in Figures 10.11 and 10.12, which present the sea level estimation for the data D1. Figure 10.11 shows the estimation based on the original learned linguistic description; Figure 10.12 shows the same estimation after expert modification of the description. The comparison shows that specialist knowledge can be readily incorporated into linguistic descriptions.
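The manual review of a learned linguistic description could be supported by a simple automatic check. The Python sketch below stores the first rows of Table 10.5 as plain data and flags rules that share an antecedent but differ in the succedent. Treating such rules as candidates for contradiction is an assumption made for this illustration; the chapter does not state the criterion actually used, and the function name is hypothetical. The check only proposes candidates, and the decision to delete or keep a rule stays with the expert.

```python
from collections import defaultdict

# First seven rules of Table 10.5 as (rule number, antecedent, succedent).
RULES = [
    (1, "Ro1", "MLBi"),
    (2, "Ro2.9", "VRSm"),
    (3, "Ro5", "QRBi"),
    (4, "Ro7", "Ap5"),
    (5, "Ro9", "VRBi"),
    (6, "Ro9", "Ap5"),
    (7, "Ro11", "RaMe"),
]


def conflicting_rules(rules):
    """Group rules by antecedent and keep groups with more than one distinct succedent."""
    by_antecedent = defaultdict(list)
    for number, antecedent, succedent in rules:
        by_antecedent[antecedent].append((number, succedent))
    return {
        antecedent: group
        for antecedent, group in by_antecedent.items()
        if len({succedent for _, succedent in group}) > 1
    }


# Candidates for expert review (illustrative criterion only).
candidates = conflicting_rules(RULES)
```

For this excerpt the check flags rules 5 and 6, which both have the antecedent Ro9. Whether the two succedents are truly incompatible depends on the meaning of the expressions in the context [0.0, 5.0], which is why the result is only a list for review.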
10.5 Conclusion
This chapter describes some fuzzy techniques useful for application in geology, namely the theory of evaluating linguistic expressions, linguistic descriptions and logical deduction based on them, and the fuzzy transform (F-transform) introduced by I. Perfilieva (Chapter 9). These techniques have been applied to the determination of rock sequences. On the basis of rock sequences, ancient sea level position and fluctuations can be estimated in a variety of ways.
Our results demonstrate that fuzzy techniques provide a strong tool capable of solving interesting geologic problems. We see our results as a strong hint supporting further development. For example, sea level estimation should improve when other influences, such as subsidence and sediment deposition, are included. Much geological information has not been included, which left us unable to estimate the time periods during which the changes took place. It is also clear that extraction of sea level signals is the first step in any attempt to “inverse model” a sedimentary deposit. Thus, we have opened an extensive area for further research in fuzzy logic and its applications to geology.
Acknowledgment
This chapter has been supported by project ME468 of the MŠMT of the Czech Republic as the international supplement to the NSF project “Stratigraphic Simulation Using Fuzzy Logic to Model Sediment Dispersal.”
References
Bělohlávek, R., & Novák, V. [2002], “Learning rule base of the linguistic expert systems.” Soft Computing, 7(2), 79–88.
Carnap, R. [1947], Meaning and Necessity: a Study in Semantics and Modal Logic. University of Chicago Press, Chicago.
Demicco, R.V. [1985], “Platform and off-platform carbonates of the Upper Cambrian Conococheague Limestone of western Maryland.” Sedimentology, 32(1), 1–22.
Dvořák, A., & Novák, V. [2003], “Fuzzy logic deduction with crisp observations.” In press.
Gallin, D. [1975], Intensional and Higher-Order Modal Logic (with Applications to Montague Semantics). North-Holland, Amsterdam.
Goldhammer, R.K., Lehmann, P.J., & Dunn, P.A. [1993], “The origin of high-frequency platform carbonate cycles and third-order sequences (Lower Ordovician El Paso Group, west Texas): constraints from outcrop data and stratigraphic modeling.” Journal of Sedimentary Petrology, 63, 318–359.
Kerans, C., & Tinker, S.W. [1997], “Sequence stratigraphy and characterization of carbonate reservoirs.” SEPM Short Course Notes No. 40. Society of Economic Paleontologists and Mineralogists, Tulsa, OK.
Klir, G.J., & Yuan, B. [1995], Fuzzy Sets and Fuzzy Logic: Theory and Applications. Prentice Hall, New York.
Lakoff, G. [1973], “Hedges: A study in meaning criteria and logic of fuzzy concepts.” Journal of Philosophical Logic, 2, 458–508.
Mareš, M. [1994], Computation over Fuzzy Quantities. CRC Press, Boca Raton.
Novák, V. [1989], Fuzzy Sets and their Applications. Adam Hilger, Bristol.
Novák, V. [1992], The Alternative Mathematical Model of Linguistic Semantics and Pragmatics. Plenum Press, New York.
Novák, V. [2001], “Antonyms and linguistic quantifiers in fuzzy logic.” Fuzzy Sets and Systems, 124, 335–351.
Novák, V., & Perfilieva, I. (eds.) [2000], Discovering the World with Fuzzy Logic. Springer-Verlag, Heidelberg.
Novák, V., Perfilieva, I., & Močkoř, J. [1999], Mathematical Principles of Fuzzy Logic. Kluwer, Boston and Dordrecht.
Perfilieva, I. [2003], “Fuzzy transform.” In: Dubois, D. et al. (eds.) Rough and Fuzzy Reasoning: Rough versus Fuzzy and Rough and Fuzzy. Springer-Verlag, Heidelberg.
Zadeh, L.A. [1975], “The concept of a linguistic variable and its application to approximate reasoning I, II, III.” Information Sciences, 8, 199–257, 301–357; 9, 43–80.
Acknowledgments
The Editors would like to acknowledge the support of the United States National Science Foundation under Grant No. EAR 9909336 and the associated International Supplement for collaboration with three researchers in the Czech Republic: Radim Bělohlávek (Chapter 7), Vilem Novák (Chapter 10), and Irina Perfilieva (Chapter 9). This support was crucial for preparing this book. The Editors would also like to acknowledge the important role of the Center for Intelligent Systems at Binghamton University, a multidisciplinary research center, which provided them with a stimulating environment for pursuing research on applications of fuzzy logic in geology. One result of this ongoing research is this book. Some additional acknowledgments, of a more specific nature, are presented at the end of individual chapters.
Index
Absolute quantifiers 44 Abstraction process by factorization 212 Accommodation potential 325 Aggregating operations 25, 27 Aggregation operations, basic classes 28–29 Algebraic product 23 Algebraic sum 24 α-cuts 15–16, 30, 38 Ancient sea level estimation 301–337 fuzzy logic techniques 303–322 if–then rules 303–304 overview 301–303 presentation of results 328–334 Ancient sedimentary sequences 124 Andros Island 88–92 Antecedent membership functions 130 for burial depth 74–76 Antecedent variables 99 Approximate reasoning 36, 44–46 Approximation models 287 Arithmetic mean 27 Arithmetic operations on fuzzy intervals 28–31 Arizona, long-term statistical forecasting of precipitation 180–182 Artificial neural fuzzy inference system (ANFIS) analysis 109 Atmospheric circulation patterns (CPs) 158 Atomic evaluating expressions 305 Attribute implications 204–207, 224 Automation, modeling process 148 Averaging operations 25–28 requirements 26 Backpropagation (BP) neural networks 264 Basic functions 277 Basic linguistic trichotomy 305 Basin energy 138, 142, 145–146
Basin floor sedimentary environment 126 Basin wave energy 143 Beer–Lambert law 78 Binary fuzzy relations 33 Binary relations 31, 36 Biresiduum 208 Boolean algebra 25 Bounded difference 24 Bounded sum 24 Building damage index 244, 250, 256 Burial depth, antecedent membership functions for 74–76 Carbonate production versus depth and distance 295 Carbonate sediment production as function of depth and distance to platform edge 88–93 if–then rules 91–93 Cartesian product 31–32 Cauchy problem, approximate solution 287–289 Center of area method 52 Centroid method 52 Characteristic function definition 66 of crisp sets 18 China 112 seismic code 255 Circulation pattern (CP) 162, 177, 183–187 Classical measure theory 1 Classical set theory 1 and fuzzy set theory 17 Climatic modeling of hydrological extremes 157 Coastal oceanographic modelers 122 Combinations of variables 70 Commutativity requirements 23 Compaction curves, application of standard inference rules 74–78
Compatibility relations 38 Compatible similarities 212–214 Compatible tolerance relations 213 Complements 21–22 classes 22 requirements 21 Complexity 2 managing 4 Compositional rule of inference 45–46 Compositions 33–34, 52 properties 34 Computational cost 4 Computational theory of perceptions (CTP) xi Computer technology, emergence of 64 Computers and Geosciences 113 Computing with words and perceptions (CWP) x Concept lattices, similarity of 214–215 Conditional and qualified fuzzy propositions 44 Conditional and unqualified fuzzy propositions 43–44 Congruence relation 225 Congruences 212 Connected binary relations 34 Conococheague limestone 323–324, 328 Constrained fuzzy arithmetic 31 Coral reef growth, standard inference rules 78–82 Core 15 Credibility 2 Crisp sets 12 and geology 66–67 characteristic functions 18 grain size 69 granite 70 tidal range 67 versus fuzzy sets, representation of grain size 12 Cutworthy generalization 33 Cutworthy property 17 CYCOPATH 2D code 123 Cylindric closures 32–33 Cylindric extensions 32–33 Darcy’s Law 66, 94 Data sets, training 164 Death Valley, California 118, 124–133 Decomposition 35
DEE (Defuzzification of Evaluating Expressions) 316–317 Defuzzification 52–53, 77, 83, 128, 169–170, 316 Degree of fulfillment (DOF) 167, 169–170, 172 Degree of membership 4, 13, 31, 38–39, 70 Delta sedimentation patterns 140 Delta simulation 145–146 Depositional processes, modeling 133–137 Deterministic systems 2 Direct F-transform 200, 320–321 Direct methods 54–55 Disorganized complexity 65 “Dr. Sediment” code 123 Drastic intersection 24 Drastic union 24 Drought, long-term record 109 Drought index (DI) 159, 161, 164, 170 long-term statistical forecasting 173–177 Drought response membership functions, Nebraska 165 Earth sciences 104, 114 overview 63–102 Earthquake damage 249 assessment 244 brick-column single-story factory buildings 257 estimation 255 Earthquake engineering 244–245 fuzzy information processing in 241 fuzzy logic 249–259 Earthquake intensity 242, 249–255 Earthquake loading estimation 244 Earthquake magnitude 242 and logarithmic isoseismal area 266, 268 Earthquake precursors 246–247 Earthquake prediction 240, 243 based on seismicity indices, fuzzy pattern for 247–248 fuzzy logic 245–248 fuzzy pattern recognition 245–247 Earthquake research 239–274 basic terminology 242–245 Earthquake-resistant design 241 El Niño Southern Oscillation (ENSO) 109, 158–161, 177, 184–185, 187
Environmental health risk analyses 155 Epicenter 242 Epicentral intensity 242 Evaluating linguistic expressions finding a suitable expression 313–314 semantics 307–313 theory 305 Evaluating linguistic predication 307 Evolutionary computation 106 Expert systems 100, 112 Exposure index versus height around mean tidal level 84 Extension principle 17–18 Factorization 212–214 Factorization modulo 223 Factorization modulo similarity 212 Fitness value 106 FLUVSIM 124 Formal concept analysis 191–237 attribute dependencies 197–198 attribute implications 204–207, 224 directly observable data: objects and their attributes 192 discovering natural concepts hidden in data 194–196 discovery of hidden attribute dependencies and natural concepts 192–193 examples 217–235 extent and intent 195 fuzziness and similarity issues 198–199 fuzzy data 199–207 hierarchy of discovered concepts 196–197 hierarchy of hidden concepts 203–204 informal outline 193–194 input data and hidden concepts 199–203 origins 193 F-transform 275–300, 304 components 280–281 direct 280, 320–321 inverse 280, 283–287, 292, 321 technique 319–322 Fuzzified function 17 FUZZIM 118 Fuzzy algorithm 325–326 Fuzzy approximation models 275 overview 276 Fuzzy arithmetic 31, 109
Fuzzy clustering basic methods 99 c-means 99–100 sorting using 93–96 Fuzzy compaction algorithm 121 Fuzzy compatibility, examples 39 Fuzzy compatibility relations 38 Fuzzy complements involutive 22 membership functions 144 Fuzzy concept lattices 203–204, 220–221, 223 crisp order 204 Fuzzy concepts 199–203 Fuzzy contexts 199–203, 206, 215 Fuzzy data 4 Fuzzy earthquake intensity 250–255 Fuzzy equivalence, examples 39 Fuzzy equivalence relations 38 Fuzzy fact 46 Fuzzy implication 43–44, 46 Fuzzy inference rules 73 Fuzzy inference system 126, 129 output values 77 Fuzzy information processing in earthquake engineering 241 Fuzzy intervals 49 Fuzzy logic 38–46 broad sense 3–4, 12 emergence of 6 engineering applications 5 impact on mathematics and logic 6 narrow sense 3, 11 output graph 96 principal sources for further study 57–59 role of 5 scientific applications 5 specialized tutorial 11–61 systems 73–100 use of term 3, 11 Fuzzy Logic Toolbox 83, 129 Fuzzy measures 1 Fuzzy models 275 Fuzzy monotone measures 1 Fuzzy numbers 29, 44, 82, 109 combined effect 166 defined on monthly relative frequency of given CP type 162 defined on PMDI 164
Fuzzy numbers (continued) defined on premises 161–163 defined on response variable 166 defined on SOI 162 Fuzzy operations, construction 53–55 Fuzzy partial ordering 38, 40 Fuzzy partition 277 uniform 278 universe 277 Fuzzy pattern for earthquake prediction based on seismicity indices 247–248 recognition in earthquake prediction 245–247 Fuzzy power set 14 Fuzzy predicates 41 Fuzzy probability 41 Fuzzy probability qualifier 42 Fuzzy propositions 3–4, 38–39 and fuzzy sets 41 Fuzzy quantifiers 41–42, 44 Fuzzy regression 155 Fuzzy relation equations 34 applications 36 solving 35–36 Fuzzy relations 31–38 antisymmetric 38 between building damage index and acceleration 253 diagram and matrix representation 37 information diffusion method 260–262 reflexive 37 single set 36–38 symmetric 37 Fuzzy relationship matrix 258 Fuzzy risk analysis 156 Fuzzy rule 46 Fuzzy rule-based fuzzy logic model 109 Fuzzy rule-based geophysical forecast system 112 Fuzzy rule-based hydrology modeling 157 Fuzzy rule-based models evaluation 172–173 future trends 120 Fuzzy rule system assessment 170–172 Fuzzy seismology 240 Fuzzy set theory 1, 3–4, 12, 53 and classical set theory 17 emergence of 6 engineering applications 5
impact on mathematics and logic 6 principal sources for further study 57–59 role of 5 scientific applications 5 Fuzzy sets 3–4, 200 and fuzzy propositions 41 basic characteristics 16 basic concepts 14–18 construction 53–55 definition 161–164 geostatistics 156 in geology 68 level 2 56 level k(k > 2) 56 operations on 19 type 2 56 type k(k > 2) 56 versus crisp set, representation of grain size 12 see also specific types Fuzzy systems 49–53 knowledge-based 50 literature 53 model-based 50 Fuzzy transform see F-transform Fuzzy transformations 81 Fuzzy transitivity 38 Fuzzy truth qualifiers 42 Fuzzy truth values 41 Gaussian membership functions 91 General circulation models (GCM) 185 Generalized Euler method 288–292 Generalized Euler–Cauchy method 292–294 Generalized means 27 Generalized modus ponens 45 Generalized Runge–Kutta methods 293–294 Genetic algorithms 83 Genetic fuzzy system 106 Geographical information systems (GIS) 105, 116 Geological sciences, literature review 103–120 Geology fuzzy set in 68 literature review 107–119
miscellaneous applications of fuzzy logic 119 Geometric mean 27 Geomorphology 108 Geotechnical engineering 107 literature review 112–113 Germany, long-term statistical forecasting of precipitation 182–184 Grain size 68 crisp set 69 versus fuzzy set representation 12 permeability as function of 93–96 sorting and permeability 93–96 Granite crisp set 70 fuzzy set representation 70 Granule 49 Granulation 49 of systems variables 4 Great Bahama Bank 88–92 Groundwater flow 156 Groundwater flow modeling 110–111 Groundwater risk assessment 107 literature review 111–112 Gutenberg–Richter Law 243 Haicheng earthquake 240 Harmonic mean 27 Height 15 Homomorphisms 216 Hungary, long-term statistical forecasting of precipitation 178–180 Hybrid fuzzy neural networks 259–269 Hybrid fuzzy systems 50 Hybrid model (HM) 269 Hydrocarbon exploration 107 literature review 113–115 Hydroclimatic modeling 157–173 Hydrological extremes, climatic modeling of 157 Hydrological forecasting 155–156 Hydrological modeling 156 Hydrology 153–190 application 155–157 fuzzy rule-based modeling 157 overview 154–157 If–then rules 73–74, 100–101, 142, 168, 293, 302–304, 307, 315, 325 carbonate sediment production 91–93
Death Valley 126–127 Mamdani fuzzy logic system 97–99 sediment plume deposition off Southwest Pass 138 standard (“Mamdani”) interpretation 74–82, 95, 126, 128 Takagi–Sugeno system 83–85 Igneous rock, classification 70 Indirect methods 54–55 Inferential connector, effect of changing 94 Information-diffusion approximate reasoning (IDAR) 264–265, 268 Information diffusion estimator 261 Information diffusion method 259–269 fuzzy relationships 260–262 Information distribution formula 263, 267 Input base variables 158–161 Input membership functions 143 Input-variable fuzzy sets 73 Interactions 64 International Union of Geological Sciences 70, 72 Intersections 22–25 examples 23–24 Interval-valued fuzzy sets 55 Intuitionistic fuzzy sets 56 Inverse F-transform 280, 283–287, 292, 321 Inverse function 17 Inverse transformation 286–287 Inverses 33–34 Inversion formula 283 Involutive fuzzy complements 22 Joins 33–34 Journal of Petroleum Geology 114 Journal of Petroleum Science and Engineering 114 Knowledge acquisition 53 Knowledge-based fuzzy systems 50 Kullback–Leibler directed divergence 243
L-fuzzy sets 56–57 La Niña 158–161 Landscape development, literature review 116–117 Learned linguistic description 334 Learning algorithm 106
Learning relationships by BP neural networks 264 Least of Maxima (LOM) method 316 Level set 15 LFLC (Linguistic Fuzzy Logic Controller) 314 Linear regression methods 265 Linguistic description 304, 314–319 learning from data 318–319 power of 334 Linguistic expressions 41–42 components of evaluating 305 theory of evaluating 305 see also Evaluating linguistic expressions Linguistic hedges 305–306 Linguistic terms 49, 53–54 membership functions 85–86 Linguistic variable 49 Linkages 64 Logical deduction 304, 317–318 Logical precision 215–217 Long-shore drift regimes 138, 142–143, 145–147 Macro-seismic intensity scale 250 Mamdani fuzzy inference model of basin floor environments 129 Mamdani fuzzy inference systems 83 Mamdani fuzzy logic system 83, 97–99, 132, 138 see also If–then rules Mamdani inference rules 78–82 Mamdani interpretation of if–then rules 74–82, 95, 126, 128 Mamdani method 88 Mamdani–Togai model 262 Mathematical models and physical reality 3 Mathematics as core of system science 64 MATLAB 83–85, 93, 97, 99, 129–131 Max-min transitivity 38 Mean absolute error (MAE) 173 Mean error (ME) 173 Membership degree 4, 13, 31, 38–39, 70 Membership functions 12, 37, 39, 53, 68, 70, 126, 277 antecedent 74–76, 130 constructing 54–55, 245 distributary mouth bar deposition 137 examples 13
extensions of selected evaluating expressions 314 fuzzy complement of 144 fuzzy sets “shallow” and “deep” for input variable depth 141 input and output 97 linguistic terms 85–86 modified 20–21 notations 13 possible shapes 14 triangular-shaped 78 variable porosity 74–76 Metropolis–Hastings algorithm 170 Minimal solution 35 Mississippi River Delta 118, 124, 133, 137 Model-based fuzzy systems 50 Modeling process, automation 148 Modified membership function 20–21 Modified Mercalli intensity scale 270–271 Modifiers 19–21 types 20 Modus ponens 45 Modus tollens 45 Monotone measures 1–3 Montastrea annularis 78–79 Multicriterion decision making (MCDM) under uncertainty 156 Multidistributary deltaic deposition 137–147 National Weather Service Climate Analysis Center 174 n-dimensional relations 31–33 Nebraska, drought response membership functions 165 Necessity measure 47–48 Nested family 16 Neural networks 57, 83, 105–106, 109, 128–129 Neuro-fuzzy systems 106 Newtonian mechanics 65 Nondeterministic systems 2 Nonlinear activation function 106 Nonstandard fuzzy intersections 25 Nonstandard fuzzy sets 13, 55–57, 70 Nonstandard fuzzy unions 25 Normal diffusion coefficient 262 Normal information diffusion formula 267 Numerical methods based on fuzzy approximation models 275
Numerical modeling, fuzzy logic as alternative technique 73 Optimization 106 Ordinary fuzzy sets 82 Organized complexity 104 Organized simplicity 65 Output base variables 158–161 Output-variable fuzzy sets 73 Pattern smoothing 262–264 PDSI (Palmer Drought Severity Index) 174–175 Peak ground acceleration (PGA) 244–245, 250 membership function 251 Perennial lakes 125 Permeability 66 as function of grain size 93–96 calculation of 93–96 grain size, sorting and 93–96 Permeability and average grain size 96 Petroleum engineering, virtual intelligence in 113 Plutonic igneous rocks, classification 72 Porosity, membership functions 74–76 Possibility measures 47–48 Possibility theory 46–49 standard fuzzy-set interpretation 48 Precipitation input onto watersheds 109 long-term statistical forecasting 178–184 Primary information distribution matrix 258 Principle of information diffusion 261 Principle of maximal belongingness 245 Principle of threshold-value 246 Principles of belongingness 245 Probability and uncertainty 1 Probability-qualified form 42 Probability theory 1–2 Projections 32–33 Propositional forms, basic types 41–44 Rainfall versus runoff data 109 Reef growth model 294–297 Reflexivity 207–208 Regional water resources management 156 Relational join 34
Remote sensing 104, 116 Reservoir operation planning 156 Rock properties 104 Rock sequences automatic determination 322–329, 331–332 determination of 305 results of tests 326–328 Root-mean squared error (RMSE) 173 Rough fuzzy sets 56 Rough sets 105 Rule-based models see Fuzzy rule-based models Rule construction 166–169 Runge–Kutta methods 292 Saline lakes 125 Saline pans 125 San Francisco earthquake 239 Satellite remote sensing imagery 105 Scalar cardinality 15 Scientific knowledge, organizing 2 Sea level curve 296 data 298–299 extraction 294–297 see also Ancient sea level estimation Second-order fuzzy sets 82 SEDFLUX 134–135 Sediment accumulation 122 deposition 108, 117–118 erosion 121–122 plume 135–136, 142 production 122 transportation 122 Sedimentary basin filling, computer-generated forward models 121 Sedimentary particles 12 SEDSIM 123 Seismic code, China 255 Seismic damage grade 256 Seismic intensity 242 Seismic levels and basic intensity 256 Seismic stratigraphy 124 Seismicity 243 Seismicity indices 243 fuzzy pattern for earthquake prediction based on 247–248
Seismology 108 literature review 115–116 SEISMOS 240 Self-adjusting inference rules, calculation of exposure index 82–88 Semantic level 43 Sequence number 328 Sequence stratigraphy 124 Sigma count 15 Siliciclastic sedimentary process simulators 123 Siliciclastic sedimentary rocks 133 Similarity 207–215 class 207 of attributes 209–211 of concept lattices 214–215 of concepts 211 of objects 209–211 relations 207–215 Simple evaluating expressions 305 Simple evaluating predication 307 Simulated annealing 171 Single-column fuzzy relationship matrix 267 Sinusoidal membership functions 279 Sinusoidal shape 286 Site intensity 244, 249 Smackover formation 124 Smooth logical deduction 302, 304, 322–323 Soft computing 103–107, 114 applications 119 areas 105 SOI (Southern Ocean Oscillation) 109, 159, 161 Soil science 108 literature review 116–117 Solution set of fuzzy relation equations 35–36 Sorting grain size and permeability 93–96 using fuzzy clustering 93–96 Southern Ocean Oscillation (SOI) 109, 159, 161 Southwest Pass 135, 137 Southwest Pass Distributary Mouth Bar 136 Standard averaging operation 27 Standard composition 33–34 Standard fuzzy arithmetic 30–31
Standard fuzzy complement 22 Standard fuzzy intersection 23 Standard fuzzy sets 13, 68 interpretation of possibility theory 48 Standard inference rules application to compaction curves 74–78 coral reef growth 78–82 Standard (“Mamdani”) interpretation of if–then rules 74–82, 95, 126, 128 STRATAFORM 123 Stratigraphic basin filling models 122–123 Stratigraphic models 123–124 Stratigraphic simulations, future developments 147 Stream sediments 117 Strength of relation 31 Strong α-cut 15–16 Subconcept–superconcept relation 203 Subnormal 15 Subset 14–15 Subsurface hydrology 107 literature review 110–111 Support 15 Supremum 16–17 Surface hydrology 107 literature review 108–110 Symmetric averaging operations 27 Symmetry 207–208 Syntactic level 43 Synthetic stratigraphic cross-sections of Death Valley sedimentation 132 System science mathematics as core of 64 roots of 64 Systems thinking 64 Takagi–Sugeno first-order membership functions 86–87 Takagi–Sugeno fuzzy inference system 83, 85, 131 Takagi–Sugeno fuzzy logic system 88, 130, 138 Takagi–Sugeno linear functions 142 Takagi–Sugeno linear membership functions 142 Takagi–Sugeno system, if–then rules 83–85 Tangshan earthquake 240, 256, 258 Theory of monotone measures 1
Tidal range 49–50 Tolerance relations 213, 227 Training data sets 164 Transitivity 207–208 Transport modeling 156 Trapezoid formula 282 Triangular conorms (t-conorms) 23 Triangular membership functions 91, 277–278 Triangular norms (t-norms) 23 Triangular-shaped membership functions 78 Truth assignment function 39 Truth-qualified form 42 Truth values 76, 81, 95, 127, 203, 208, 215 Uncertainty 1–2 multicriterion decision making (MCDM) under 156 Uncertainty and probability 1 Unconditional and qualified propositions 42–43 Unconditional and unqualified propositions 42 Understanding Earth 63 Understanding the Earth System 63
Uniform fuzzy partition 278–279 Unions 22–25, 52 examples 23–24 Unique maximum solution of fuzzy relation equations 35 Universal set 15 Vagueness 4 Validation data sets 164, 169 Virtual intelligence in petroleum engineering 113 Water quality models 109 Water resources 153–190 overview 154–157 Watershed response models 109 Weighted average 54 Weighted generalized means 27 Weights 28 calculation of 169 Wetness index 183, 186 Wood–Anderson-type seismograph 243
Yager class of fuzzy complements 22 Yager class of fuzzy intersections 24 Yangtze River 112
Figure 3.16 Models of carbonate sediment production on the Great Bahama Bank.
Figure 3.23 Three approximations to the data of Krumbein & Monk (1942), Pryor (1973), and Enos & Sawatsky (1981).
Figure 5.7 Synthetic stratigraphic cross-sections from a three-dimensional model of Death Valley sedimentation over the past 190 ky.
Figure 5.11 Outputs from basic delta model: (a) through (d) are isometric views of hypothetical delta simulation at different time steps; (e) and (f) are synthetic stratigraphic cross-sections perpendicular (e) and parallel (f) to shore.
Figure 5.13 Matrix showing sediment plume produced under various regimes of relative basin energy and long-shore drift.
Figure 5.15 (a) Isometric view of hypothetical delta simulation at end of run 1. (b, c) Synthetic stratigraphic cross-sections, perpendicular (b) and parallel (c) to shore.
Figure 5.16 (a) Isometric view of hypothetical delta simulation at end of run 2. (b, c) Synthetic stratigraphic cross-sections, perpendicular (b) and parallel (c) to shore.
Figure 6.18 Mean normalized distributions of 500 hPa geopotential height of a wet CP, averaged over 1970–79.
Figure 6.19 Mean normalized distributions of 500 hPa geopotential height of a dry CP, averaged over 1970–79.