ADVANCES AND CHALLENGES IN MULTISENSOR DATA AND INFORMATION PROCESSING
NATO Security through Science Series

This Series presents the results of scientific meetings supported under the NATO Programme for Security through Science (STS). Meetings supported by the NATO STS Programme are in security-related priority areas of Defence Against Terrorism or Countering Other Threats to Security. The types of meeting supported are generally “Advanced Study Institutes” and “Advanced Research Workshops”. The NATO STS Series collects together the results of these meetings. The meetings are co-organized by scientists from NATO countries and scientists from NATO’s “Partner” or “Mediterranean Dialogue” countries. The observations and recommendations made at the meetings, as well as the contents of the volumes in the Series, reflect those of participants and contributors only; they should not necessarily be regarded as reflecting NATO views or policy. Advanced Study Institutes (ASI) are high-level tutorial courses to convey the latest developments in a subject to an advanced-level audience. Advanced Research Workshops (ARW) are expert meetings where an intense but informal exchange of views at the frontiers of a subject aims at identifying directions for future action. Following a transformation of the programme in 2004 the Series has been re-named and reorganised. Recent volumes on topics not related to security, which result from meetings supported under the programme earlier, may be found in the NATO Science Series. The Series is published by IOS Press, Amsterdam, and Springer Science and Business Media, Dordrecht, in conjunction with the NATO Public Diplomacy Division.

Sub-Series
A. Chemistry and Biology (Springer Science and Business Media)
B. Physics and Biophysics (Springer Science and Business Media)
C. Environmental Security (Springer Science and Business Media)
D. Information and Communication Security (IOS Press)
E. Human and Societal Dynamics (IOS Press)

http://www.nato.int/science
http://www.springer.com
http://www.iospress.nl
Sub-Series D: Information and Communication Security – Vol. 8
ISSN: 1574-5589
Advances and Challenges in Multisensor Data and Information Processing
Edited by
Eric Lefebvre Lockheed Martin Canada, Montreal, Quebec, Canada
Amsterdam • Berlin • Oxford • Tokyo • Washington, DC Published in cooperation with NATO Public Diplomacy Division
Proceedings of the NATO Advanced Study Institute on Multisensor Data and Information Processing for Rapid and Robust Situation and Threat Assessment Albena, Bulgaria 16–27 May 2005
© 2007 IOS Press. All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, without prior written permission from the publisher. ISBN 978-1-58603-727-7 Library of Congress Control Number: 2007922628 Publisher IOS Press Nieuwe Hemweg 6B 1013 BG Amsterdam Netherlands fax: +31 20 687 0019 e-mail:
[email protected] Distributor in the UK and Ireland Gazelle Books Services Ltd. White Cross Mills Hightown Lancaster LA1 4XS United Kingdom fax: +44 1524 63232 e-mail:
[email protected]
Distributor in the USA and Canada IOS Press, Inc. 4502 Rachael Manor Drive Fairfax, VA 22032 USA fax: +1 703 323 3668 e-mail:
[email protected]
LEGAL NOTICE The publisher is not responsible for the use which might be made of the following information. PRINTED IN THE NETHERLANDS
Preface

From the 16th to the 27th of May 2005, a NATO Advanced Study Institute entitled Multisensor Data and Information Processing for Rapid and Robust Situation and Threat Assessment was held in Albena, Bulgaria. This ASI brought together 72 people from 13 European and North American countries to discuss, through a series of 48 lectures, the use of information fusion in the context of defence against terrorism, which is a NATO priority research topic. Information fusion resulting from multi-source processing, often called multisensor data fusion when sensors are the main sources of information, is a relatively young (less than 20 years) technology domain. It provides techniques and methods for: 1) integrating data from multiple sources and using the complementarity of this data to derive maximum information about the phenomenon being observed; 2) analyzing and deriving the meaning of these observations; 3) selecting the best course of action; and 4) controlling the actions. Various sensors have been designed to detect some specific phenomena, but not others. Data fusion applications can synergistically combine information from many sensors, including data provided by satellites and contextual and encyclopedic knowledge, to provide an enhanced ability to detect and recognize anomalies in the environment, compared with conventional means. Data fusion is an integral part of multisensor processing, but it can also be applied to fuse non-sensor information (geopolitical, intelligence, etc.) to provide decision support for a timely and effective situation and threat assessment. One special field of application for data fusion is satellite imagery, which can provide extensive information over a wide area of the electromagnetic spectrum using several types of sensors (Visible, Infra-Red (IR), Thermal IR, Radar, Synthetic Aperture Radar (SAR), Polarimetric SAR (PolSAR), Hyperspectral...). Satellite imagery provides the coverage rate needed to identify and monitor human activities from agricultural practices (land use, crop type identification...) to defence-related surveillance (land/sea target detection and classification). By acquiring remotely sensed imagery over earth regions that land sensors cannot access, valuable information can be gathered for the defence against terrorism. Developed around these themes, the ASI's program was subdivided into ten half-day sessions devoted to the following research areas:
• Target recognition/classification and tracking
• Sensor systems
• Image processing
• Remote sensing and remote control
• Belief functions theory
• Situation assessment
The lectures presented at the ASI proved to be a great contribution to the research and development of multisensor data fusion based surveillance systems used for rapid and robust situation and threat assessment. The ASI gave all the participants the opportunity to interact and exchange valuable knowledge and work experience to overcome challenging issues in various research areas. This book summarizes the lectures that were given at this ASI. An Advanced Research Workshop (ARW) related to this ASI was held in Tallinn, Estonia from June 27th to July 1st 2005. This ARW addressed data fusion technologies for harbour protection. More information on this event can be found at http://www.canadiannatomeetings.com. I would like to thank all the lecturers who accepted the invitation to participate in the ASI. The time they spent preparing their lectures and their active participation were key factors in the ASI's success. I would also like to thank them for the summary papers they provided to make this book happen. I extend my thanks to all the attendees of the ASI for their interest and participation. A special acknowledgement goes to Kiril Alexiev, the co-director of this ASI, who initiated this project and was always very supportive. His tremendous help in the coordination of all events and logistics was much appreciated. My warm thanks go to Gayane Malkhasyan and Masha Ryskin, my administrative assistants and interpreters, who ensured that everything ran smoothly during the course of the ASI. I would also like to thank the officers of the Albena Congress Centre office, in particular Mrs. Galina Toteva, for her extra assistance. I would like to thank Pierre Valin and Erik Blasch, who did the technical reviews of this book. Their judicious comments were very helpful. Very special thanks go to Kimberly Nash, who reviewed the papers and formatted the book. Thank you for your patience and all the time you spent increasing the quality of the book. Finally, I wish to express my gratitude to NATO, which supported this ASI along with Lockheed Martin Canada, the Institute of Parallel Processing of the Bulgarian Academy of Sciences, Defence Research and Development Canada, the European Office of Aerospace Research and Development of the USAF, and the National Science Foundation, without whom it would have been impossible to organize this event.

Eric Lefebvre
Montreal, Canada
Contents Preface Eric Lefebvre
v
Sensor Data Fusion: Methods, Applications, Examples Wolfgang Koch
1
Simulation of Distributed Sensor Networks Kiril Alexiev
24
Joint Target Tracking and Classification via Sequential Monte Carlo Filtering Donka Angelova and Lyudmila Mihaylova
33
A Survey on Assignment Techniques Felix Opitz
41
Non-Linear Techniques in Target Tracking Thomas Kausch, Kaeye Dästner and Felix Opitz
48
Underwater Threat Source Localization: Processing Sensor Network TDOAs with a Terascale Optical Core Device
Jacob Barhen, Neena Imam, Michael Vose, Arkady Averbuch and Michael Wardlaw
56
On Quality of Information in Multi-Source Fusion Environments
Eric Lefebvre, Melita Hadzagic and Éloi Bossé
69
Polarimetric Features and Contextual Information Fusion for Automatic Target Detection and Recognition Yannick Allard, Mickael Germain and Olivier Bonneau
78
Enhancing Efficiency of Dynamic Threat Analysis for Combating and Competing Systems Edward Pogossian, Arsen Javadyan and Edgar Ivanyan
85
Evidence Theory for Robust Ship Identification in Airborne Maritime Surveillance Missions Pierre Valin
92
Improved Threat Evaluation Using Time of Earliest Weapon Release
Eric Ménard and Jean Couture
99
Detection of Structural Changes in a Multivariate Data Using Change-Point Models
David Asatryan, Boris Brodsky and Irina Safaryan
106
Unification of Fusion Theories (UFT) Florentin Smarandache
114
Belief Functions Theory for Multisensor Data Fusion Patrick Vannoorenberghe
125
Dempster-Shafer Evidence Theory Through the Years: Limitations, Practical Examples, Variants Under Conflict and a New Adaptive Combination Rule Mihai Cristian Florea, Anne-Laure Jousselme and Dominic Grenier
148
Decision Support and Information Fusion in the Context of Command and Control Éloi Bossé
157
Fusion in European SMART Project on Space and Airborne Mined Area Reduction Isabelle Bloch and Nada Milisavljević
164
The DSmT Approach for Information Fusion and Some Open Problems Jean Dezert and Florentin Smarandache
171
Multitarget Tracking Applications of Dezert-Smarandache Theory Albena Tchamova, Jean Dezert, Tzvetan Semerdjiev and Pavlina Konstantinova
179
Image Registration: A Tutorial Pramod K. Varshney, Bhagavath Kumar, Min Xu, Andrew Drozd and Irina Kasperovich
187
Automated Registration for Fusion of Multiple Image Frames to Assist Improved Surveillance and Threat Assessment
Malur K. Sundareshan and Mohamed I. Elbakary
211
Data Fusion and Image Processing: A Few Application Examples
Olivier Goretta and Francis Celeste
221
Secondary Application Wireless Technologies to Increase Information Potential for Defence Against Terrorism
Christo Kabakchiev, Vladimir Kyovtorov and Ivan Garvanov
236
Adaptive Image Fusion Using Wavelets: Algorithms and System Design Stavri G. Nikolov, Eduardo Fernández Canga, John J. Lewis, Artur Loza, David R. Bull and C. Nishan Canagarajah
243
Methods for Fused Image Analysis and Assessment Artur Loza, Timothy D. Dixon, Eduardo Fernández Canga, Stavri G. Nikolov, David R. Bull, C. Nishan Canagarajah, Jan M. Noyes and Tom Troscianko
252
Object Tracking by Particle Filtering Techniques in Video Sequences Lyudmila Mihaylova, Paul Brasnett, C. Nishan Canagarajah and David Bull
260
Wavelets, Segmentation, Pixel- and Region- Based Image Fusion John J. Lewis, Richard J. O’Callaghan, Stavri G. Nikolov, David R. Bull and C. Nishan Canagarajah
269
Data Fusion and Quality Assessment of Fusion Products: Methods and Examples Paolo Corna, Lorella Fatone and Francesco Zirilli
277
Information Management Methods in Sensor Networks
Lyudmila Mihaylova, Andy Nix, Donka Angelova, David R. Bull, C. Nishan Canagarajah and Alistair Munro
307
A Novel Method for Correction of Distortions and Improvement of Information Content in Satellite-Acquired Multispectral Images
Vyacheslav I. Voloshyn, Volodymyr M. Korchinsky and Mykola M. Kharytonov
315
Multisensor Data Fusion in the Processes of Weighing and Classification of the Moving Vehicles Janusz Gajda, Ryszard Sroka and Tadeusz Zeglen
324
Sensor Performance Estimation for Multi-Camera Ambient Security Systems: A Review Lauro Snidaro and Gian Luca Foresti
331
Principles and Methods of Situation Assessment Alan N. Steinberg
339
Higher Level Fusion for Catastrophic Events Galina L. Rogova
351
Ontology-Driven Knowledge Integration from Heterogeneous Sources for Operational Decision Making Support Alexander Smirnov, Michael Pashkin, Nikolai Chilov and Tatiana Levashova
359
Evaluation of Information Fusion Techniques Part 1 – System Level Assessment Erik Blasch and Susan Plano
366
Evaluation of Information Fusion Techniques Part 2 – Metrics Erik Blasch
375
Rapid and Reliable Content Based Image Retrieval Dimo T. Dimov
384
Subject Index
397
Author Index
401
Sensor Data Fusion: Methods, Applications, Examples
Wolfgang Koch
At each scan time t_l, l = 1, ..., k, the sensor delivers a set of n_l measurements Z_l = {z_l^1, ..., z_l^{n_l}}; the accumulated sensor data up to time t_k are Z^k = {Z_k, n_k, Z^{k-1}}. Everything the data imply about the target state x_l is contained in the conditional density p(x_l | Z^k). For l = k (filtering at time t_k) it obeys Bayes' rule,

p(x_k | Z^k) = p(Z_k, n_k | x_k) p(x_k | Z^{k-1}) / ∫ dx_k p(Z_k, n_k | x_k) p(x_k | Z^{k-1}),

where the sensor model enters through the likelihood ℓ(x_k; Z_k, n_k) ∝ p(Z_k, n_k | x_k). For a single measurement z_k of the state x_k, with measurement matrix H_k and measurement error covariance R_k, the likelihood is Gaussian, ℓ(x_k; z_k) ∝ N(z_k; H_k x_k, R_k). The prediction density for time t_k given the data up to t_{k-1} follows from the target dynamics model by marginalizing over x_{k-1}:

p(x_k | Z^{k-1}) = ∫ dx_{k-1} p(x_k | x_{k-1}) p(x_{k-1} | Z^{k-1}),

with a Gauss-Markov transition density p(x_k | x_{k-1}) = N(x_k; F_{k|k-1} x_{k-1}, D_{k|k-1}) defined by the evolution matrix F_{k|k-1} and the dynamics covariance D_{k|k-1}. Under these linear-Gaussian assumptions the filtering density p(x_k | Z^k) remains Gaussian at each time t_k, p(x_k | Z^k) = N(x_k; x_{k|k}, P_{k|k}) (Kalman filter).
The iteration starts at time t_0 from an initial density p(x_0 | Z^0) = N(x_0; x_{0|0}, P_{0|0}), where P_{0|0} expresses the initial ignorance. For past times t_l with l < k (retrodiction), the density p(x_l | Z^k) follows from p(x_l | Z^k) = ∫ dx_{l+1} p(x_l, x_{l+1} | Z^k), i.e.

p(x_l | Z^k) = ∫ dx_{l+1} [ p(x_{l+1} | x_l) p(x_l | Z^l) / ∫ dx_l p(x_{l+1} | x_l) p(x_l | Z^l) ] p(x_{l+1} | Z^k);

the retrodicted density at t_l is thus obtained from the already available density p(x_{l+1} | Z^k) at t_{l+1} and the filtering density p(x_l | Z^l). Depending on whether the time t_l of interest lies in the future (t_l > t_k), at the present (t_l = t_k), or in the past (t_l < t_k), the density p(x_l | Z^k) is computed by prediction, filtering, or retrodiction:

p(x_{k-1} | Z^{k-1}) --prediction--> p(x_k | Z^{k-1}),
p(x_k | Z^{k-1}) --filtering--> p(x_k | Z^k),
p(x_{l-1} | Z^k) <--retrodiction-- p(x_l | Z^k).

In the linear-Gaussian case all these steps reduce to manipulations of Gaussian densities by means of a product formula:

N(z; Hx, R) N(x; y, P) = N(z; Hy, S) × N(x; y + Wν, P − WSW'),
ν = z − Hy,   S = HPH' + R,   W = PH'S⁻¹,

where the second factor can equivalently be written in information form as N(x; Q(P⁻¹y + H'R⁻¹z), Q) with Q⁻¹ = P⁻¹ + H'R⁻¹H.
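As a concrete illustration of the prediction and filtering steps in this linear-Gaussian case, the following Python sketch (not part of the original lecture; the matrices and noise levels are illustrative assumptions) performs one Kalman prediction and one update using the gain form of the product formula above.

```python
import numpy as np

def kf_predict(x, P, F, D):
    """Prediction: p(x_k | Z^{k-1}) = N(x; F x_{k-1|k-1}, F P F' + D)."""
    x_pred = F @ x
    P_pred = F @ P @ F.T + D
    return x_pred, P_pred

def kf_update(x_pred, P_pred, z, H, R):
    """Filtering update via the Gaussian product formula (gain form)."""
    nu = z - H @ x_pred                      # innovation
    S = H @ P_pred @ H.T + R                 # innovation covariance
    W = P_pred @ H.T @ np.linalg.inv(S)      # Kalman gain
    x_filt = x_pred + W @ nu
    P_filt = P_pred - W @ S @ W.T
    return x_filt, P_filt

# Example: 1-D constant-velocity state (position, velocity), position-only measurement.
dt = 1.0
F = np.array([[1.0, dt], [0.0, 1.0]])
D = 0.1 * np.eye(2)                          # process noise (illustrative)
H = np.array([[1.0, 0.0]])
R = np.array([[4.0]])                        # measurement noise (illustrative)

x, P = np.zeros(2), 10.0 * np.eye(2)
x, P = kf_predict(x, P, F, D)
x, P = kf_update(x, P, np.array([1.2]), H, R)
```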
Track extraction: given the accumulated data Z^k = {Z_i}_{i=1}^k, the decision between the two hypotheses
• h_1: the data contain measurements of a real target,
• h_0: the data consist of false returns only,
is taken sequentially. The test is characterized by the decision probabilities P_1 = P(accept h_1 | h_1) and P_0 = P(accept h_1 | h_0) and uses the likelihood ratio LR(k) = p(Z^k | h_1) / p(Z^k | h_0), which can be computed recursively from the sensor parameters (detection probability P_D, false-alarm probability P_F). With two thresholds A and B derived from P_0 and P_1, the decision logic is:
• LR(k) < A: accept h_0 and drop the tentative track;
• LR(k) > B: accept h_1 and confirm the track;
• A < LR(k) < B: no decision yet; wait for the next data set Z_{k+1} and compute LR(k+1).
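A minimal sketch of this sequential likelihood-ratio decision logic follows; the thresholds A and B and the per-scan likelihood values are illustrative assumptions, not values from the lecture.

```python
def sprt_decision(lr, A, B):
    """Sequential test: accept h0 (drop), accept h1 (confirm), or keep testing."""
    if lr < A:
        return "drop track (h0)"
    if lr > B:
        return "confirm track (h1)"
    return "continue with next scan"

def update_lr(lr, lik_h1, lik_h0):
    """LR(k+1) = LR(k) * p(Z_{k+1}|h1) / p(Z_{k+1}|h0) for independent scans."""
    return lr * lik_h1 / lik_h0

lr = 1.0
for lik_h1, lik_h0 in [(0.8, 0.3), (0.7, 0.4), (0.9, 0.2)]:  # illustrative per-scan likelihoods
    lr = update_lr(lr, lik_h1, lik_h0)
    decision = sprt_decision(lr, A=0.1, B=10.0)
    if decision != "continue with next scan":
        break
print(decision, lr)
```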
Sensor fusion gain for detection: consider n sensors observing the same target, sensor i having revisit interval ΔT_i and single-scan detection probability P_D^i, i = 1, ..., n. Over a common reference interval ΔT_c, the cumulative detection probability of the sensor system is

P_D^c(n) = 1 − ∏_{i=1}^n (1 − P_D^i)^{ΔT_c/ΔT_i},

since during ΔT_c sensor i provides ΔT_c/ΔT_i independent detection opportunities. For equal single-sensor detection probabilities P_D^i = P_D this reduces to P_D^c = 1 − (1 − P_D)^{ΔT_c Σ_{i=1}^n 1/ΔT_i}, i.e. the combined system behaves like a single sensor of detection probability P_D with an effective revisit interval ΔT given by 1/ΔT = Σ_{i=1}^n 1/ΔT_i. Besides the gain in detection, fusing n sensors of comparable accuracy improves the achievable estimation accuracy roughly in proportion to 1/√n.
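The cumulative detection probability is easy to evaluate directly; the sensor parameters in the sketch below are illustrative.

```python
from math import prod

def cumulative_pd(pd_list, dt_list, dt_c):
    """P_D^c = 1 - prod_i (1 - P_D^i)^(dt_c / dt_i) for n sensors with
    single-scan detection probabilities pd_list and revisit intervals dt_list."""
    return 1.0 - prod((1.0 - pd) ** (dt_c / dt) for pd, dt in zip(pd_list, dt_list))

# Illustrative numbers: three sensors observed over a common interval of 10 s.
print(cumulative_pd([0.5, 0.6, 0.7], [10.0, 5.0, 20.0], dt_c=10.0))
```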
Each sensor i delivers measurements z_i with an individual error covariance R_i. A radar measurement in polar coordinates, z_k = (r_k, ϕ_k)', with R = diag[σ_r², σ_ϕ²], is mapped into Cartesian coordinates by t[z_k] = r_k (cos ϕ_k, sin ϕ_k)'. Linearization around the predicted state x_{k|k-1},

t(z_k) ≈ t(x_{k|k-1}) + ∂t[x_k]/∂x_k |_{x_k = x_{k|k-1}} (z_k − x_{k|k-1}),

uses the Jacobian

∂t[x_k]/∂x_k = [ cos ϕ_k  −r_k sin ϕ_k ; sin ϕ_k  r_k cos ϕ_k ] = [ cos ϕ_k  −sin ϕ_k ; sin ϕ_k  cos ϕ_k ] [ 1  0 ; 0  r_k ] = D_ϕ S_r,

i.e. a rotation D_ϕ (evaluated at the predicted bearing ϕ_{k|k-1}) and a scaling S_r (evaluated at the predicted range r_{k|k-1}). The converted measurement covariance is therefore D_ϕ S_r R S_r' D_ϕ' = D_ϕ diag[σ_r², (r_{k|k-1} σ_ϕ)²] D_ϕ'. For a static object, the fusion of k such measurements is a simple weighted average: initialized by x_{1|1} = z_1, P_{1|1} = R_1, the fused estimate is

x_{k|k} = P_{k|k} Σ_{i=1}^k R_i⁻¹ z_i,   P_{k|k} = ( Σ_{i=1}^k R_i⁻¹ )⁻¹;

for identical sensors, R_i = R for i = 1, ..., k, this reduces to P_{k|k} = R/k. (Figure: two sensors S_1 and S_2 observing objects O_1, O_2, O_3 at range r under bearings ϕ and π − ϕ, and the resulting fused error ellipses.)
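A sketch of the polar-to-Cartesian measurement conversion and of the static fusion rule above (NumPy-based; the measurement values are illustrative, and the measured range is used in place of the predicted range inside the converted covariance).

```python
import numpy as np

def converted_measurement(r, phi, sigma_r, sigma_phi):
    """Polar measurement (r, phi) mapped to Cartesian coordinates together with
    the linearized (rotated) error covariance D_phi diag[sr^2, (r*sphi)^2] D_phi'."""
    z_cart = np.array([r * np.cos(phi), r * np.sin(phi)])
    D = np.array([[np.cos(phi), -np.sin(phi)],
                  [np.sin(phi),  np.cos(phi)]])
    R_cart = D @ np.diag([sigma_r**2, (r * sigma_phi)**2]) @ D.T
    return z_cart, R_cart

def fuse(measurements):
    """Static fusion: P = (sum R_i^-1)^-1, x = P sum R_i^-1 z_i."""
    info = sum(np.linalg.inv(R) for _, R in measurements)
    P = np.linalg.inv(info)
    x = P @ sum(np.linalg.inv(R) @ z for z, R in measurements)
    return x, P

# Two illustrative observations of the same static object from different sensor sites.
m1 = converted_measurement(1000.0, 0.50, sigma_r=10.0, sigma_phi=0.01)
m2 = converted_measurement( 990.0, 0.52, sigma_r=10.0, sigma_phi=0.01)
x_fused, P_fused = fuse([m1, m2])
```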
Resolution phenomena: two closely spaced targets with joint state x_k = (x_k^1, x_k^2) may be unresolved by the sensor at time t_k. An unresolved measurement z_k^g then refers to the group centroid,

z_k^g = H^g x_k + u_k^g,   H^g x_k = ½ H(x_k^1 + x_k^2),   u_k^g ~ N(0, R^g),

where H maps an individual state x_k^i onto its measurement (r_k^i, ϕ_k^i), i = 1, 2, and R^g is the corresponding error covariance. Whether the targets are resolved depends on their separation in range and azimuth relative to the resolution cell parameters α_r and α_ϕ: for Δr/α_r ≫ 1 and Δϕ/α_ϕ ≫ 1 the targets are resolved, P_r(Δr, Δϕ) = 1 − P_u(Δr, Δϕ), with the probability of being unresolved modeled by

P_u(Δr, Δϕ) = exp(−log 2 (Δr/α_r)²) exp(−log 2 (Δϕ/α_ϕ)²).

As a function of the joint state x_k this can be written as a Gaussian in the target separation H(x_k^1 − x_k^2):

P_u(x_k) = |2πR_u|^{1/2} N(0; H^u x_k, R_u),   H^u x_k = H(x_k^1 − x_k^2),   R_u = 1/(2 log 2) diag[α_r², α_ϕ²].

The resolution model enters the likelihood ℓ(Z_k, n_k, E_k | x_k), where E_k denotes the data interpretation hypotheses. For an unresolved group measurement z_k^i ∈ Z_k (hypothesis E_k^{ii}),

ℓ(Z_k, n_k, E_k^{ii} | x_k) ∝ P_u(x_k) N(z_k^i; H_k^g x_k, R_k^g) ∝ N( (z_k^i, 0); (H^g, H^u) x_k, diag[R^g, R_u] );

for a missed group detection (hypothesis E_k^{00}), ℓ(Z_k, n_k, E_k^{00} | x_k) ∝ P_u(x_k) ∝ N(0; H^u x_k, R_u); and for two resolved measurements z_k^i, z_k^j ∈ Z_k (hypothesis E_k^{ij}),

ℓ(Z_k, n_k, E_k^{ij} | x_k) ∝ [1 − P_u(x_k)] N( (z_k^i, z_k^j); (H, H) x_k, diag[R, R] ),

with 1 − P_u(x_k) = 1 − |2πR_u|^{1/2} N(0; H^u x_k, R_u).
GMTI radars cannot detect ground targets whose range rate relative to the sensor is small (Doppler blindness, clutter notch). With sensor position p_k at time t_k and target state x_k = (r_k, ṙ_k) (position r_k, velocity ṙ_k), the unit line-of-sight vector is e_k^p = (r_k − p_k)/|r_k − p_k|, and the radial velocity of the target is

h_n(r_k, ṙ_k; p_k) = (r_k − p_k)' ṙ_k / |r_k − p_k|;

the clutter notch is located at h_c(x_k; p_k) = 0. The resulting state-dependent detection probability can be modeled as

P_D(x_k; p_k) = P_d ( 1 − e^{−log 2 (h_n(x_k; p_k)/MDV)²} ),

where MDV denotes the minimum detectable velocity. Since P_D depends on the state, a missed detection carries information that can be exploited when computing p(x_k | Z^k). Road-map information can be used in a similar way for ground moving targets: on a given road, the target state at time t_k is described in road coordinates by the arc length l_k along the road and its derivative, x_k^r = (l_k, l̇_k). Prediction is carried out in road coordinates, p(x_{k-1}^r | Z^{k-1}) → p(x_k^r | Z^{k-1}); the predicted road-coordinate density is then mapped onto the Cartesian sensor coordinates, p(x_k^r | Z^{k-1}) → p(x_k^s | Z^{k-1}), where the measurement update yields p(x_k^s | Z^k).
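A small sketch of the state-dependent GMTI detection probability above; the parameter values are illustrative assumptions.

```python
import numpy as np

def gmti_pd(target_pos, target_vel, sensor_pos, pd_max, mdv):
    """State-dependent detection probability of a GMTI radar:
    P_D = pd_max * (1 - exp(-log(2) * (v_radial / MDV)^2)),
    i.e. detection vanishes when the radial velocity falls into the clutter notch."""
    los = target_pos - sensor_pos
    v_radial = float(los @ target_vel) / np.linalg.norm(los)
    return pd_max * (1.0 - np.exp(-np.log(2.0) * (v_radial / mdv) ** 2))

# Illustrative: a target moving almost tangentially to the line of sight is nearly invisible.
print(gmti_pd(np.array([10e3, 5e3]), np.array([0.5, 20.0]),
              np.array([0.0, 0.0]), pd_max=0.9, mdv=2.0))
```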
(Figures: a ground-target scenario (x, y in km, segments (1)-(4), with target stops, terrain obscuration and clutter-notch regions) and the corresponding position error semi-axes [m] over time [s]. A standard Kalman filter yields a mean position error of 364 (109) m, while a Kalman filter using track-generated road information reduces it to 250 (46) m.)
Sensor management with an agile-beam radar: at time t_k the beam is pointed into the direction b_k = (u_{k|k-1}, v_{k|k-1}) predicted from the track, while the true target direction is d_k = (u_k, v_k). The signal-to-noise ratio of the target echo decreases with range and with the pointing error,

SN_k = SN_0 (r_k/r_0)⁻⁴ e^{−log 2 |d_k − b_k|²/b²},

where b is the beam width (the SNR is halved for |d_k − b_k| = b) and SN_0 is the reference value at range r_0. The corresponding detection probability is

P_D(d_k, r_k; b_k) = P_FA^{1/[1+SNR(d_k, r_k; b_k)]}.

If the first dwell yields no detection (¬D_k^1), this negative information is used to update the predicted directional density p(d_k | Z^{k-1}) at time t_k before pointing the next beam:

p(d_k | ¬D_k^1, Z^{k-1}) ∝ [1 − P_D(d_k, r_k; b_k)] p(d_k | Z^{k-1}),

and analogously p(d_k | ¬D_k^1, ¬D_k^2, Z^{k-1}) after a second missed dwell ¬D_k^2.
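A small sketch of the beam-pointing model above: SNR as a function of pointing error and range, and the resulting detection probability (reference values are illustrative assumptions).

```python
import numpy as np

def snr(pointing_error, target_range, sn0=100.0, r0=50e3, beamwidth=0.02):
    """SN_k = SN_0 (r_k/r_0)^-4 * exp(-log(2) * |d_k - b_k|^2 / b^2):
    halves at one beamwidth of pointing error, falls off as range^-4."""
    return sn0 * (target_range / r0) ** -4 * np.exp(-np.log(2.0) * (pointing_error / beamwidth) ** 2)

def detection_probability(pointing_error, target_range, p_fa=1e-4):
    """P_D = P_FA^(1 / (1 + SNR))."""
    return p_fa ** (1.0 / (1.0 + snr(pointing_error, target_range)))

print(detection_probability(0.0, 50e3), detection_probability(0.04, 60e3))
```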
Simulation of Distributed Sensor Networks1
K. ALEXIEV
Department of Mathematical Methods for Sensor Information Processing, Institute for Parallel Processing, Bulgarian Academy of Sciences
Abstract. Sensor networks have emerged as fundamentally new tools for monitoring spatially distributed phenomena. They incorporate the most progressive ideas from several areas of research: computer networks, wireless communications, grid technologies, multi-agent systems and network operating systems. Although most interest centers on sensor data processing and information fusion, simulation of the entire multisensor network remains essential for optimally solving many tasks related to joint data processing and data transmission between sensor nodes. The current state of design automation tools does not meet the needs of modern sensor network design. The main purpose of this paper is to outline the structure of a simulation tool for modeling dynamic, self-organizing, heterogeneous sensor networks. The focus is on modeling the different network components and simulating data flow in the sensor network.
Keywords. Modeling, Sensor networks
Introduction
Sensor networks (SN) are useful tools for monitoring spatially distributed phenomena. A sensor network may consist of homogeneous or heterogeneous sensors spread over a global surveillance volume, which act jointly to optimally solve the required tasks. Monitoring may employ video cameras, acoustic microphone arrays, thermal imaging systems, seismic or magnetic sensing devices, microwave or millimeter-wave radars, laser radars, etc., based on sensing principles as diverse as mechanical, chemical, thermal, electrical, chromatographic, magnetic, biological, fluidic, optical, acoustic, ultrasonic and mass sensing. With advances in low-power processors, the Internet, mobile communication and micro-mechanical systems, the development of networks of numerous tiny, intelligent, wirelessly networked sensor nodes has become more cost effective, allowing these networks to penetrate deeply into the consumer market. SN are now used by the armed forces, the coast guard, the police, fire departments, environmental protection agencies, etc. SN can be deployed on transport vehicles, in hospitals and in factories. The "Smart home" concept assumes that nearly all household appliances will be equipped with a variety of sensors, communicating with each other and with the outside world. SN could also help society through health care, by monitoring vital signs and other indicators of the aged and chronically ill; using real-time controls combined with databases could enhance treatment and reduce medical costs. The quick proliferation of SN requires appropriate tools for SN research and development. Digital computer simulation gives a picture of expected system performance and helps optimize sensor deployment and collaborative information processing. Analysis of test results in different environmental conditions and scenarios increases the robustness of the overall system at a reasonable cost.

1 The research reported in this paper is partially supported by the Bulgarian Ministry of Education and Science under grant No I-1202/2002 and grant No I-1205/2002.
1. Fundamentals of Sensor Networks
SN architecture can be presented as communicating sensor nodes. The appearance, development and fast growth of SN are based on the rapid progress of information and communication technologies, and the means for SN simulation are heirs of modeling tools from these areas.
1.1. Computer Networks
The Atanasoff-Berry Computer was the world's first electronic digital computer, built at Iowa State University during 1937-1942 [15]. On October 19, 1973, US Federal Judge Earl R. Larson signed his decision, which declared the Electronic Numerical Integrator and Computer (ENIAC) patent No 3120606 of Mauchly and Eckert invalid and named Atanasoff the inventor of the electronic digital computer. Atanasoff's invention had no patent rights, and the court deemed his device public knowledge. One important consequence of this decision is that it prevented patent rights from being applied to computer architecture, which in turn led to the quick growth of the computer industry. In 1972, Metcalfe developed the Ethernet system to interconnect several computers. In spite of its simplicity and robustness, Ethernet alone cannot be used in large networks because of its flat addressing scheme. Information flow in large networks can, however, be controlled at the third and fourth levels of the Open System Interconnection (OSI) model by devices called routers. The addressing scheme of the third level has a hierarchical structure and is called the Internet Protocol (IP). The Transmission Control Protocol (TCP) adds Layer 4 connection-oriented reliability services to connectionless IP communications. In a dynamically changing network, routers require special types of networking protocols, called routing protocols, which spread adequate and consistent information about the current state of the network topology between routing devices. The progress of conventional computer networks has created all the necessary hardware and software prerequisites for sensor networks built on a fixed infrastructure.
1.2. Wireless Local Area Networks (WLAN)
The mass production and distribution of wireless phones has led to progress in wireless technology and created technical solutions that can easily be implemented in wireless computer networks. These networks inherit the traditional problems of wireless and mobile communications, such as bandwidth optimization, power control, and transmission quality enhancement. In addition, their multi-hop nature and the possible lack of a fixed infrastructure introduce new research problems, such as
network configuration, device discovery, and topology maintenance, as well as ad hoc addressing and self-routing. The most valuable characteristic of these networks is their ad hoc availability, an important property for sensor networks. Standard 802.11 defines two types of WLAN: infrastructure mode and ad-hoc mode [16].
1.3. New Protocols
SN collaboration assumes a specific protocol stack, which concerns all layers of the OSI model [11,12,13,14]. The physical layer is responsible for frequency selection, carrier frequency generation, signal detection and modulation, and data encryption. In WLAN the sensor transmitter antenna is often placed near the ground and works in a diffraction zone; as a result, long-distance wireless communication can be expensive, both in terms of energy and implementation complexity. The choice of a good modulation scheme is also critical for reliable communication in a sensor network. An M-ary modulation scheme can reduce the transmit on-time by sending multiple bits per symbol, but it results in complex circuitry and increased radio power consumption, so a binary modulation scheme is more energy efficient.
The data link layer is responsible for the multiplexing of data streams in the LAN, data frame detection, Medium Access Control (MAC), and error control. The traditional CSMA protocol is not optimal because of its assumption of stochastically distributed traffic and independent point-to-point flows, whereas SN traffic is highly correlated and dominantly periodic. TDMA-based communication dedicates the full bandwidth to a single sensor node, while FDMA-based communication allocates minimum signal bandwidth per node. Several hybrid TDMA-FDMA schemes have been proposed to lower synchronization costs and optimize channel width: if the transmitter consumes more power, a TDMA scheme is favored, while the scheme leans toward FDMA when the receiver consumes more power. Demand-based MAC schemes may be unsuitable for SN due to their large messaging overhead and link set-up delay.
Network layer. IP is almost the only routed protocol used in contemporary computers and SN. The number of routing protocols used in computer networks is fewer than ten, but the situation with routing protocols in wireless networks is far more complicated because of their number and the different criteria for delivery optimization. Routing protocols can be classified into different groups according to their characteristics. A first classification scheme divides them into flat-based groups of nodes with equal functionality, hierarchical-based groups, where the nodes play different roles in the network, and location-based groups, where the sensor nodes' positions are exploited to route data in the network. Another classification scheme divides routing protocols into multipath-based, query-based, negotiation-based, QoS-based, or coherent-based routing techniques, depending on the protocol operation. A third scheme classifies routing protocols into three categories, proactive, reactive, and hybrid, depending on how the source finds a route to the destination: in proactive protocols all routes are computed before they are really needed, while in reactive protocols routes are computed on demand; hybrid protocols use a combination of these two ideas. The number of proposed sensor routing protocols continuously increases.
Transport layer SN protocols do not always require reliable packet delivery (guaranteed delivery may be too energy prohibitive).
One possible solution is to split TCP into two parts: one for the connection of sensor nodes with the stationary node, and
classical TCP for information transmission between stationary nodes and to the querying node. Sometimes a simple UDP is used for the first part of the communication. Application layer SN protocols such as the Sensor Management Protocol set up the rules for data aggregation, sensor node clustering and the addressing scheme. The Task Assignment and Data Advertisement Protocol controls interest dissemination from the querying node to the whole SN or to a relevant subset of the sensors. The Sensor Query and Data Dissemination Protocol determines the interfaces and rules that an application has to use to issue queries, respond to queries and collect incoming replies.
1.4. Multiple Agent Systems
A Multiple Agent System (MAS) can be defined as a loosely coupled network of problem solvers (agents) that interact to solve problems that are beyond the individual capabilities or knowledge of a single problem solver [10]. These problem solvers are functionally specific, modular, autonomous components, usually heterogeneous in nature. The agent is considered to consist of different modules, such as the planning module, communication module, coordination module, task-reduction module, scheduling module, execution monitoring module, exception-handling module, etc. The agents also use different protocols to exchange information and for cooperative problem solving. In a MAS there is no global control, and there is the potential for disparities and inconsistencies; it is very important to detect conflicts and resolve them correctly and in time.
1.5. Grid Technology
When networking was in its incipient stage, communications were limited to a narrow bandwidth of 56 Kbps. Now standard communications (optical and cable) permit information transfer at 10 Gbps and more and are no longer the bottleneck of distributed computing. Grid systems use the resources of remote nodes as the resources of one virtual scalable supercomputer [17]. The first grid system was developed in 1989 in the project CASA, where IP and the Message Passing Interface were used. The projects FAFNER, I-WAY, Globus, Legion, Condor, Nimrod, NEOS, NetSolve, Horus, etc. address different problems in communication, resource management and distributed data processing. The first standards for grid computing, the Open Grid Services Architecture and the Grid Remote Procedure Call, enabled the integration of services and resources across distributed, heterogeneous, dynamic virtual organizations.
1.6. Network Operating System (NOS)
The computer OS is a software package that enables computer hardware to be used, provides hardware integrity from a system point of view, runs user application software and ensures a suitable user interface. Usually an OS performs these functions for a single user at a time. A NOS distributes these functions over multiple networked devices and shares resources across the network. Most network operating systems are built around a client-server model: the server provides resources and services to one or more clients by means of a network. Microsoft Windows NT, 2000/2003, Linux, UNIX, and Mac OS are the most popular NOS today. The NOS and servers act as the central point of the network and the main repositories of resources. This creates a single point of failure
in the network. The International Organization for Standardization (ISO) created organization, information, communication, and functional models for network management. The organization model describes the components of network management, such as the manager, the agent and their relationships. The information model concerns the structure and storage of network management information. The communication model deals with the communication between the agent and the manager. The functional model addresses the network management applications that reside on the network management station.
2. Synchronization of Sensor Data and Principal SN Components
Sensor data must be synchronized in time and space: every sensor packet must be labeled with a time and a space stamp to determine when and where the data were measured. Geo-synchronization is also necessary for communication purposes; a source must know the geographical position of any destination to which it wishes to send data, and must label the packets for that destination with its position. Synchronization can be performed either locally or globally, with suitably chosen protocols or services. Examples of sensor data synchronization protocols are Grid's location service for ad hoc networks with dynamic nodes and the Network Time Protocol. The SN can be simulated using four principal components: models of the sensors used, an environment model, sensor signal and data processing, and communication modeling. These components are included in a common framework with suitable graphical user interfaces and a section for performance evaluation.
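As a sketch of the time and space stamping described in this section, a simulated sensor packet could carry fields such as those below; the field names and types are assumptions for illustration, not a protocol defined in the paper.

```python
from dataclasses import dataclass, field
import time

@dataclass
class SensorPacket:
    """Illustrative sensor packet: every measurement carries time and position stamps
    so that downstream fusion can align data in time and space."""
    source_id: str
    position: tuple              # (x, y, z) of the sensor node at measurement time
    timestamp: float = field(default_factory=time.time)
    payload: bytes = b""

pkt = SensorPacket(source_id="node-17", position=(42.65, 23.38, 0.55), payload=b"\x01\x02")
```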
3. Modeling
SN simulation is regarded as the only systematic tool for the detailed analysis of complex systems [2,4,6]. It solves problems arising from the dynamic allocation of sensors at random or a priori known moments and from the disappearance of sensors. But it is important not to overstate the significance of modeling; there is a very useful saying to return us to reality: "Computers can take you farther than you really are!"
3.1. Sensor Models
Sensors are sources of information about an ill-defined and chaotic reality and about the targets of interest. The mathematical representation of a sensor includes generating received measurements in a time-surveillance volume domain, considering the technical sensor characteristics. A sensor node may also include communication and processing software and the platform on which the sensor is embedded (Figure 1). The most important sensor characteristics are the field of view (FOV), the maximal and minimal detection ranges D_max / D_min, the probability of correct/incorrect detection, the measurement rate F_m, and the measurement noise characteristics N = N(p_N, m_N, σ_N), where p_N is the noise probability density function with corresponding parameters m_N and σ_N. When the sensors are embedded on moving platforms, every measurement is made at a different point of space. The parameters of a moving platform for trajectory-based modeling are the initial platform position data K_0 = K(x_0, y_0, z_0, v_x0, v_y0, v_z0, a_x0, a_y0, a_z0) and the platform trajectory, which can be described by the turn points' positions and the corresponding velocities and accelerations at those points, K_i = K(x_i, y_i, z_i, v_xi, v_yi, v_zi, a_xi, a_yi, a_zi). The probability model of sensor measurements can be expressed by the equation M = M(FOV, D_max, D_min, F_m, N, K, t); this equation implicitly includes all physical phenomena measured in the detection area of the sensor.
Figure 1. Generalized architecture of sensor node
Figure 2. Radar simulator
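A minimal sketch of how the probabilistic sensor model M(FOV, D_max, D_min, F_m, N, K, t) described above could be encoded in a simulator; the class layout, the Gaussian noise and the range/bearing measurement format are illustrative assumptions, not the author's implementation.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class SensorModel:
    fov: float          # field of view [rad]
    d_max: float        # maximum detection range [m]
    d_min: float        # minimum detection range [m]
    f_m: float          # measurement rate [Hz]
    p_d: float          # probability of correct detection
    noise_mean: float
    noise_sigma: float

    def measure(self, target_range, target_bearing, rng=np.random.default_rng()):
        """Return a noisy (range, bearing) measurement, or None when the target
        is outside the coverage region or the detection trial fails."""
        if not (self.d_min <= target_range <= self.d_max) or abs(target_bearing) > self.fov / 2:
            return None
        if rng.random() > self.p_d:
            return None
        noise = rng.normal(self.noise_mean, self.noise_sigma, size=2)
        return target_range + noise[0], target_bearing + noise[1]

radar = SensorModel(fov=np.pi / 2, d_max=50e3, d_min=100.0, f_m=1.0,
                    p_d=0.9, noise_mean=0.0, noise_sigma=5.0)
print(radar.measure(12e3, 0.1))
```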
3.2. Environment Modeling
We have to construct a versatile environment in which the SN can be studied. The environment employs a wide range of models to orchestrate and simulate realistic scenarios; it is the set of noise and useful signals (from the process of interest) received by the sensors. There are two possible sources of this input information. The first uses data measured by real sensors in the original environment. The limitations caused by data modeling are avoided, but there is a strong drawback: it is very difficult, dangerous, expensive and sometimes impossible to exercise the algorithms under study in a complex scenario. Such a scenario may be improbable, but not impossible, and can arise in real-life critical situations. A second drawback is that the information about the targets of interest is received by the same or other sensors with limited accuracy; as a result, inexact knowledge about the targets is used for the estimation of sensor characteristics and of the corresponding estimation algorithms. The true target state parameters are unknown in practice, or they are measured with limited accuracy, which is insufficient for estimation, so the researcher does not have exact reference data for the accurate evaluation of the explored algorithm [2]. The second source of input information is entirely generated (synthetic) data. This approach has considerable flexibility in the generation of complex target and clutter scenarios, and a priori known reference input is provided, but the generated data only approximate the real sensor data at an appropriate level of abstraction.
Clutter can have a natural or an artificial origin. It is easy to explain clutter in the case of imaging sensors: the received image consists of objects of interest and a so-called background, above which the objects appear, move and disappear; usually this background is natural. Environmental clutter also includes weather conditions: rain, snowfall, the influence of the sun, moon or stars, water vapor, day or night, etc. Sometimes an artificial signal is generated to deceive the enemy sensors, and the corresponding clutter has specific characteristics. Another useful feature of the simulation program is the possibility of generating input sensor information with randomized parameters for statistical estimation of algorithm parameters.
3.3. Sensor Processing Algorithms
Processing algorithms can be grouped into signal processing algorithms, data processing algorithms and information processing algorithms.
Sensor signal processing algorithms filter noisy raw data and locate, detect, or recognize objects of interest; they have to reject all unnecessary information. The choice of detection threshold and correlation detection algorithm determines the characteristics of the whole sensor system. For example, for a radar sensor system, a low detection threshold reduces the possibility of undetected targets, but the larger volume of data requires more effective track initiation and estimation algorithms. In the case of an imaging system, a low detection threshold essentially impacts subsequent feature extraction, while a high detection threshold increases the probability of losing feature information.
Sensor data processing algorithms estimate object parameters. In the case of a single measured object, data processing algorithms are based on statistical estimation procedures [1-3]; the only problem to be solved is outlier detection. A more complicated case is when there are several measurements with unknown origin. This problem is known as an assignment problem for point objects or a registration problem for imaging sensors; its complexity grows combinatorially (it is NP-hard) with the number of measurements. Sometimes data processing algorithms are also regarded as fusion algorithms because they process a sample of measurements. The most popular algorithms are α-β and α-β-γ filters, Kalman and extended Kalman filters, AR and ARMA estimators, different transformations, correlation algorithms, template matching algorithms, Sobel, Prewitt, Canny, Roberts, Laplacian, Hough and Radon edge detection, the nearest-neighbor association algorithm, different kinds of probabilistic association rules, hybrid estimation procedures such as interacting multiple model estimation, etc. [1-9].
Sensor information processing algorithms serve as a source of object ID information. They are based on prior knowledge about objects of interest saved in a database and on a variety of decision rules, such as Bayesian or Dempster-Shafer rules.
A SN provides an advantage if and only if it is able to dynamically use the received signals, data and information of all sensors as a whole, not just as a mere collection of individual sensors. This collaborative information processing is often called fusion [4-6]. Fusion can take very different forms and depends on the level of the source information and the type of sensors. Competitive sensors provide independent measurements of the same information regarding a physical phenomenon. Competing sensors can be identical or can use different methods to measure physical attributes. Sensors are generally put in a competitive configuration to provide greater reliability or fault tolerance; competitive sensors are most often used for mission-critical components to provide a more robust and reliable system.
Sensors are complementary when they do not depend on each other directly but can be combined to give a more complete image of the phenomena being studied. In general, fusing complementary data is easy: the data from independent sensors can be appended to each other, providing a more complete mapping of the physical attributes being studied.
3.4. Communication
Distributed information processing is impossible without the timely delivery of two types of information. First, the sensors have to exchange measured signals, data and information for cooperative information processing; this is the main source of channel load. Efficient use of bandwidth requires the exchange of filtered information only, but the use of raw information can increase the detectability of objects of interest and can reduce estimation errors. Second, the processing nodes have to be provided, by the routing protocols, with information giving a complete view of the SN topology. Event-driven updates allow efficient use of bandwidth and faster convergence. The sensors process the received routing information about the SN state and build a database; when the routing process completes successfully, all sensors in the network have a consistent database.
3.5. Examples
The General Purpose Simulation System (GPSS) originates from Geoffrey Gordon's simulator (1959) and is still available off the shelf. GPSS models well statistical and control-flow based applications, where events can be modeled in discrete time units.
Petri nets were introduced by Carl Adam Petri in his PhD thesis (1962) as a special class of generalized graphs or nets [18]. This was the first modeling and analysis tool well suited to the study of discrete parallel event systems: a Petri net is a mathematical description of the system structure that can then be investigated analytically. Prof. Atanasov from BAS and his group further developed this theory (generalized nets) and programmed a shell for complex system modeling [19].
The radar simulator of Figure 2 [9] has an embedded scenario generator and is used for radar signal and data processing studies.
The Network Simulator (NS-2) is a discrete event simulator targeted at networking research and supported by DARPA. NS provides substantial support for simulation of TCP, routing, and multicast protocols over wired and wireless networks.
NetLogger is a methodology that enables the real-time diagnosis of performance problems in distributed systems. It includes tools for generating time-stamped event logs that can be used to provide detailed end-to-end application and system level monitoring, and tools for visualizing the log data and the real-time state of the distributed system.
The Maryland Routing Simulator (MaRS) is a flexible platform developed specifically for the evaluation and comparison of network routing algorithms; it was previously used for the comparative evaluation of link-state and distance-vector routing protocols.
QualNet WiFi was released by Scalable Network Technologies as a WLAN simulation tool for the interaction between the MAC and physical layers of wireless
networks. QualNet WiFi comes with a WiFi-specific library of network protocols as well as the Animator, Designer, Analyzer, Tracer and Simulator modules.
GloMoSim builds a scalable simulation environment for WLAN and possesses parallel discrete-event simulation capability.

References
[1] Y. Bar-Shalom, Multitarget Multisensor Tracking: Advanced Applications, Artech House, 1990.
[2] A. Farina, F.A. Studer, Radar Data Processing, John Wiley & Sons, 1985.
[3] S. Blackman, Multiple Target Tracking with Radar Applications, Artech House, 1986.
[4] K. Alexiev, Modeling of Sensor Networks and Collaborative Information Processing in Time-Space Domain, International Conference Automatic and Informatics'03, 6-8 October 2003, Sofia, Bulgaria, pp. 33-36.
[5] R.R. Brooks, S.S. Iyengar, Multi-Sensor Fusion: Fundamentals and Applications with Software, Prentice Hall, New Jersey, 1998.
[6] T. Horney, M. Brännström, M. Tyskeng, J. Mårtensson, GeeWah Ng, M. Gossage, WeeSze Ong, HueyTing Ang, KheeYin How, Simulation framework for collaborative fusion research, Int. Conf. Fusion'04, Sweden, pp. 214-218.
[7] D. McMichael, G. Jarrad, S. Williams, M. Kennett, Modelling, simulation and estimation of situation histories, Int. Conf. Fusion'04, Sweden, pp. 928-935.
[8] K. Alexiev, E. Djerassi, L. Bojilov, Flight object modeling in radar surveillance volume, Sixth International Conference "Systems for Automation of Engineering and Research" (SAER'92), St. Konstantin, Varna, Bulgaria, 1992, pp. 316-320 (in Bulgarian).
[9] K. Alexiev, A MATLAB Tool for Development and Testing of Track Initiation and Multiple Target Tracking Algorithms, Information & Security: An International Journal, "Sensor Data Fusion", Vol. 9, 2002, pp. 166-174.
[10] M.N. Huhns, M.P. Singh, Agents and Multi-agent Systems: Themes, Approaches, and Challenges, in: Readings in Agents, M.N. Huhns and M.P. Singh (Eds.), Morgan Kaufmann Publishers, San Francisco, Calif., 1998, pp. 1-23.
[11] I.F. Akyildiz, W. Su, Y. Sankarasubramaniam, E. Cayirci, Wireless sensor networks: a survey, Computer Networks 38 (2002) 393-422.
[12] L.P. Clare, G.J. Pottie, J.R. Agre, Self-Organizing Distributed Sensor Networks, Proc. SPIE, Vol. 3713, April 1999, pp. 229-238.
[13] N. Bulusu, J. Heidemann, D. Estrin, GPS-less low cost outdoor localization for very small devices, IEEE Personal Communications Magazine, 7(5):28-34, October 2000.
[14] J.-H. Chang, L. Tassiulas, Energy conserving routing in wireless ad-hoc networks, INFOCOM, 2000.
[15] http://www.britannica.com/eb/article-216038?hook=723621#723621.hook
[16] IEEE 802.11 Standard.
[17] A. Chervenak, I. Foster, C. Kesselman, C. Salisbury, S. Tuecke, The Data Grid: Towards an Architecture for the Distributed Management and Analysis of Large Scientific Data Sets, J. Network and Computer Applications, 2001.
[18] C.A. Petri, Fundamentals of a Theory of Asynchronous Information Flow, Proc. of IFIP Congress 62, North Holland Publ. Comp., Amsterdam, 1963, pp. 386-390.
[19] K.T. Atanasov, Generalized Nets and Systems Theory, Academichno izdatelstwo, 1997.
Joint Target Tracking and Classification via Sequential Monte Carlo Filtering1
D. ANGELOVA (a) and L. MIHAYLOVA (b)
(a) Bulgarian Academy of Sciences, 25A Acad. G. Bonchev Str, 1113 Sofia, Bulgaria
(b) Department of Electrical and Electronic Engineering, University of Bristol, UK
Abstract. This paper addresses the problem of joint tracking and classification (JTC) of maneuvering targets via sequential Monte Carlo (SMC) techniques. A general framework for the problem is presented within the SMC methodology. An SMC algorithm, namely a Mixture Kalman Filter (MKF), is developed which accounts for speed and acceleration constraints. The MKF is applied to airborne targets: commercial and military aircraft. The target class is modeled as an independent random variable, which can take values over the discrete class space with equal probability. A multiple-model structure in the class space is implemented, which provides reliable classification. The performance of the proposed MKF is evaluated by simulation over typical target scenarios.
Keywords: joint tracking and classification, sequential Monte Carlo methods, mixture Kalman filtering
Introduction
Joint Target Tracking and Classification (JTC) deals with determining the identity of a target while tracking it. Classification or identification of the target involves determining the type of the target, e.g., bomber, commercial aircraft, fighter, or helicopter. Different methods have been developed to solve this problem: Kalman filtering, Interacting Multiple Model (IMM) techniques [1,2], Monte Carlo methods [3-5], and belief functions techniques (e.g., based on the transferable belief model [6]). Many of the IMM and Kalman filtering techniques suffer from the limitations of this framework, which is restricted to linear models and Gaussian processes. Sequential Monte Carlo (SMC) techniques are much more general and can incorporate the nonlinear constraints that are typical of the JTC problem; however, these methods are computationally expensive. SMC methods [3-5,7] are very suitable for classification purposes because they can easily cope with the highly nonlinear relationships between state and class (feature) measurements and with non-Gaussian noise processes. An example of a successful application of this approach to littoral tracking is proposed in [4]. A multiple-model particle filter and a Mixture Kalman Filter (MKF) for JTC are developed in [7]. The classification in [7] is carried out by processing kinematic measurements only, primarily in the air surveillance context. The features of the proposed algorithms include the following: for each target class a separate filter is

1 Partially supported by the UK MOD Data and Information Fusion DT Center and the Bulgarian Foundation for Scientific Investigations under grants I-1202/02 and I-1205/02.
designed; class filters operate in parallel, covering the class space; each class-dependent filter represents a switching multiple-model filtering procedure, covering the kinematic state space of the maneuvering target. This kind of multiple-model configuration provides precise and reliable tracking and correct class identification, but at the cost of a rather complex algorithm.
In this paper, another MKF with reduced computational complexity compared to the MKF developed in [7] is proposed for tracking and identification of air targets in two classes: commercial and military aircraft. The parallel work of class filters is simulated by utilising a random class variable. The independent sets of particles for each class are replaced by two class-dependent sets of particles, randomly generated at each time step. The class variable is modeled as an independent random variable, taking values over a finite discrete class space with equal probability. The proposed filtering algorithm has a relatively simple structure and exhibits the same performance as the MKF proposed in [7]. The elimination of unlikely filters after a classification decision can be easily realized.
Section 2 summarizes the Bayesian formulation of the JTC problem, and Section 3 presents the mixture Kalman filtering for JTC. Section 4 deals with the implementation of the MKF for JTC. Simulation results are given in Section 5. Section 6 contains concluding remarks.

1. Bayesian Formulation of JTC

Consider the following model of a discrete-time jump Markov system:

x_k = F(λ_k) x_{k-1} + G(λ_k) u_k(λ_k) + B(λ_k) w_k,   (1)
z_k = H(λ_k) x_k + D(λ_k) v_k,   k = 1, 2, ...,   (2)

where x_k ∈ R^{n_x} is the base (continuous) state vector, z_k ∈ R^{n_z} is the measurement vector, u_k ∈ R^{n_u} is a known control input and k is the discrete time. The input noise process w_k and the measurement noise v_k are assumed to be independent, identically distributed Gaussian processes with characteristics w_k ~ N(0, Q) and v_k ~ N(0, R), respectively. The modal (discrete) state λ_k, characterising the different system modes (regimes), can take values over a finite set S, i.e. λ_k ∈ S = {1, 2, ..., s}. We assume that the mode λ_k evolves according to a first-order Markov chain with transition probabilities π_ij = Pr{λ_k = j | λ_{k-1} = i}, (i, j ∈ S), and initial probability distribution P_0(i) ≥ 0 with Σ_{i=1}^s P_0(i) = 1. Next, we suppose that the target belongs to one of M classes, c ∈ C, where C = {1, 2, ..., M} represents the set of target classes. Generally, the number of discrete states s = s(c), the initial probability distribution P_0^c(i) and the transition probability matrix π(c) = [π_ij^c], i, j ∈ S(c), are different for each target class.
Denote by ω_k = {z_k, y_k} the set of kinematic (z_k) and class/feature (y_k) measurements obtained at time instant k. Then Ω^k = {Z^k, Y^k} specifies the cumulative set of kinematic and feature measurements available up to time k. The goal of the joint tracking and classification task is to estimate simultaneously the base state x_k, the modal state λ_k and the posterior classification probabilities P(c | Ω^k), c ∈ C, based on the observation set Ω^k. The problem can be stated in the Bayesian framework of estimating the posterior joint state-mode-class probability density function (pdf) p(x_k, λ_k, c | Ω^k), which can be computed recursively from the joint pdf at the previous time step in two stages, prediction and measurement update [3]. The predicted state-mode-class pdf at time k is given by

p(x_k, λ_k, c | Ω^{k-1}) = Σ_{λ_{k-1} ∈ S(c)} ∫_{x_{k-1}} p(x_k, λ_k | x_{k-1}, λ_{k-1}, c, Ω^{k-1}) p(x_{k-1}, λ_{k-1}, c | Ω^{k-1}) dx_{k-1},   (3)

where the state prediction pdf p(x_k, λ_k | x_{k-1}, λ_{k-1}, c, Ω^{k-1}) is obtained from the state transition equation (1). The form of the measurement likelihood function p(ω_k | x_k, λ_k, c) is usually known. When the measurement ω_k arrives, the update step can be completed:

p(x_k, λ_k, c | Ω^k) = p(ω_k | x_k, λ_k, c) p(x_k, λ_k, c | Ω^{k-1}) / p(ω_k | Ω^{k-1}).   (4)

The recursion (3)-(4) begins with the prior density P{x_0, λ_0, c}, assumed known, where x_0 ∈ R^{n_x}, c ∈ C, λ_0 ∈ S(c). The target classification probability is calculated by

P(c | Ω^k) = p(ω_k | c, Ω^{k-1}) P(c | Ω^{k-1}) / Σ_{i=1}^M p(ω_k | c_i, Ω^{k-1}) P(c_i | Ω^{k-1}),   (5)

with an initial prior target classification probability P_0(c), Σ_{c ∈ C} P_0(c) = 1. The state estimate for each class c,

x̂_k^c = Σ_{λ_k ∈ S(c)} ∫_{x_k} x_k p(x_k, λ_k, c | Ω^k) dx_k,   c ∈ C,   (6)

takes part in the calculation of the combined state estimate x̂_k = Σ_{c ∈ C} x̂_k^c P(c | Ω^k). It is obvious that the estimates needed for each class can be calculated independently from the other classes. Therefore, the JTC task can be accomplished by the simultaneous work of M filters. SMC methods provide a number of useful suboptimal algorithms to approximate the optimal JTC solution given by (3)-(6). In the general case of nonlinear state and measurement equations, particle filters represent the above complicated probability distributions by a set of N discrete, weighted (by W) samples {λ_k^(j), x_k^(j), W_k^(j)}_{j=1}^N for each class c, and utilize importance sampling and weighted resampling to complete the filtering task [3].
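Equation (5) is a simple recursive renormalization over the classes. A sketch follows; the class labels and likelihood values are illustrative assumptions.

```python
def update_class_probabilities(prior, likelihoods):
    """Recursive class update, Eq. (5): P(c | Omega^k) is proportional to
    p(omega_k | c, Omega^{k-1}) * P(c | Omega^{k-1}), renormalized over classes."""
    unnormalized = {c: likelihoods[c] * prior[c] for c in prior}
    total = sum(unnormalized.values())
    return {c: v / total for c, v in unnormalized.items()}

# Illustrative two-class example (c1 = commercial, c2 = military).
p = {"c1": 0.5, "c2": 0.5}
for lik in [{"c1": 0.8, "c2": 0.3}, {"c1": 0.6, "c2": 0.9}, {"c1": 0.9, "c2": 0.2}]:
    p = update_class_probabilities(p, lik)
print(p)
```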
2. Mixture Kalman Filtering for JTC
The dynamic system model (1)-(2) under consideration belongs to the class of Conditional Dynamic Linear Models (CDLM). The modal state variable λ_k is called the indicator variable. For a given trajectory of the indicator λ_k, k = 1, 2, ..., the system is both linear and Gaussian and can be estimated by a KF. The MKF exploits this conditional Gaussian property and utilizes a marginalization operation to improve the efficiency of the conventional SMC procedure. The samples are generated only in the indicator space and the target state distribution is approximated by a mixture of Gaussian distributions. Let Λ_k = {λ_0, λ_1, λ_2, ..., λ_k} be the set of indicator variables up to time instant k. By recursively generating a set of properly weighted random samples {Λ_k^(j), W_k^(j)}_{j=1}^N to represent the pdf p(Λ_k | Ω^k) (for class c), the MKF approximates the state pdf p(x_k | Ω^k) by a random mixture of Gaussian distributions [8,7],

p(x_k | Ω^k) ≈ Σ_{j=1}^N W_k^(j) N(μ_k^(j), Σ_k^(j)),   (7)

where μ_k^(j) = μ_k(Λ_k^(j)) and Σ_k^(j) = Σ_k(Λ_k^(j)) are obtained by a KF designed with the system model (1)-(2) corresponding to class c. We denote by KF_k^(j) = {μ_k^(j), Σ_k^(j)} the sufficient statistics that characterize the posterior mean and covariance matrix of the state x_k, conditional on the set of accumulated observations Ω^k and the indicator trajectory Λ_k^(j). Then, based on the set of samples {Λ_{k-1}^(j), KF_{k-1}^(j), W_{k-1}^(j)}_{j=1}^N at the previous time (k-1), the MKF produces a respective set of samples {Λ_k^(j), KF_k^(j), W_k^(j)}_{j=1}^N at the current time k. The correctness of the procedure is proven in [8]. Using the likelihood function p(ω_k | c, Ω^{k-1}) of class c ∈ C at time k, the class probabilities are calculated according to (5).
Consider the set of $N$ compound particles $\{c_k^{(j)}, \lambda_k^{(j)}, x_k^{(j)}\}_{j=1}^{N}$. The class variable $c_k$ is assumed to be independent of $c_{k-1}$. It can take each possible value in $C$ with equal probability, i.e., $P(c_k = c) = 1/M$, $c \in C$. The indicator variable $\lambda_k$ takes values from the set $S$ and evolves according to a Markov chain with a known transition probability matrix. The set of samples is initialized according to the known initial class, state and mode distributions. The initial weights are equal ($1/N$). At every time instant $k = 1, 2, \ldots$, for each particle $(j) = 1, \ldots, N$, first a class variable $c_k^{(j)}$ is drawn. Then, depending on the realization of the class variable, $\lambda_k^{(j)}$ and $x_k^{(j)}$ are generated in the following way [6]: the MKF scheme runs $s$ KF prediction steps, one for each $\lambda \in S$. The likelihoods of the predicted states are calculated based on the received measurement $z_k$. They form a trial sampling distribution, according to which the new $\lambda_k^{(j)}$ is selected. Then the KF update step is accomplished only for the selected $\lambda_k^{(j)}$. The weight $W_k^{(j)}$ is updated based on the factorized likelihood of the measurement $\omega_k$. The sum of the weights pertaining to class $c$ forms the likelihood of class $c$ and takes part in the calculation of the posterior class probabilities (5). The state estimate $\hat{x}_k^c$ is updated based on the normalized weights of the particles corresponding to $c$, according to (7). Finally, the combined output state estimate $\hat{x}_k$ is evaluated according to the combination procedure given after Eq. (6). The resampling scheme eliminates particles with small weights and replicates the particles with higher weights.
3. Model Parameters and Incorporation of the Constraints
Target model. In the two-dimensional target dynamics given by (1), the state vector $x = (x, \dot{x}, y, \dot{y})'$ contains the target positions and velocities in the horizontal Cartesian coordinate frame. The control input vector $u = (a_x, a_y)'$ includes the target accelerations along the $x$ and $y$ coordinates. The matrices $F$ and $B$ have the well-known form [7, 10] corresponding to uniform motion. The target is assumed to belong to one of two classes ($M = 2$), representing either a lower-speed commercial aircraft with limited maneuvering capability ($c_1$) or a highly maneuvering military aircraft ($c_2$). The flight envelope information comprises speed and acceleration constraints typical for each class. The speed $V = \sqrt{\dot{x}^2 + \dot{y}^2}$ of each class is limited respectively to the intervals $\{c_1 : V \in (100, 300)\}$ and $\{c_2 : V \in (150, 650)\}$ [m/s]. The control inputs are restricted to the following sets of accelerations: $\{c_1 : u \in \{0, \pm 2g\}\}$ and $\{c_2 : u \in \{0, \pm 5g\}\}$, where $g = 9.81$ m/s² is the acceleration due to gravity. The acceleration process $u_k$ is a Markov chain with five states (modes), $s(c_1) = s(c_2) = 5$. The following set of modes $(a_x, a_y)$ is selected in the implementation: $\{(0, 0), (A, 0), (0, A), (-A, 0), (0, -A)\}$, where $A = 2g$ stands for a class $c_1$ target and $A = 5g$ refers to class $c_2$.
Measurement model. The measurement vector $z = (D, \beta)'$ consists of the distance $D$ to the target and the bearing $\beta$, measured by the radar. For the purposes of the MKF design, a measurement conversion is performed from polar to Cartesian coordinates [7]. The following sensor parameters are selected in the simulations: $\sigma_D = 120$ [m], $\sigma_\beta = 0.2$ [deg]. The sampling interval is $T = 5$ s.

Speed constraints. Acceleration constraints are imposed on the filter operation by the use of a control input in the target model. The speed constraints are enforced through speed likelihood functions [7]. According to the problem formulation, feature measurements $y_k$, $k = 1, 2, \ldots$ are used for the purposes of classification. In our case we do not have feature measurements; the feature measurement likelihood functions $g_c(y_k)$, $c \in C$, are replaced by speed likelihood functions, constructed from the speed envelope information. The speed likelihood functions, together with speed estimates, form a virtual "feature measurement" set $\{Y^k\}$. At each time step $k$, the filtering algorithm gives a combined state estimate $\hat{x}_k$. Let us assume that the estimated speed from the previous time step, $\hat{V}_{k-1}$, serves as a kind of "feature measurement". The likelihood function is then factorized as $p(\omega_k \mid x_k, \lambda_k, c) = f(z_k \mid x_k, \lambda_k)\, g_c(y_k)$, where $y_k = \hat{V}_{k-1}$. The normalized speed likelihoods represent speed-based class probabilities estimated by the filters.
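The exact functional form of the speed likelihood functions is not reproduced in the text; the short Python sketch below therefore assumes a simple soft-edged (roughly trapezoidal) shape over each class's speed envelope, purely as an illustration of the factorized likelihood.

```python
import numpy as np

def speed_likelihood(v, v_min, v_max, soft=25.0):
    """Illustrative speed likelihood g_c(v): roughly flat inside the class speed
    envelope (v_min, v_max) [m/s] and decaying smoothly outside it.
    The soft-edge shape and the `soft` width are assumptions, not from the paper."""
    lo = 1.0 / (1.0 + np.exp(-(v - v_min) / soft))   # rises near v_min
    hi = 1.0 / (1.0 + np.exp((v - v_max) / soft))    # falls near v_max
    return lo * hi

def factorized_likelihood(kinematic_likelihood, v_est, speed_envelope):
    """p(omega_k | x_k, lambda_k, c) = f(z_k | x_k, lambda_k) * g_c(V_hat_{k-1})."""
    v_min, v_max = speed_envelope
    return kinematic_likelihood * speed_likelihood(v_est, v_min, v_max)

# Example with the class envelopes quoted in the text:
envelopes = {"c1": (100.0, 300.0), "c2": (150.0, 650.0)}
v_hat = 420.0     # estimated speed from the previous step [m/s]
f_z = 1e-3        # kinematic likelihood from the Kalman filter (illustrative value)
likelihoods = {c: factorized_likelihood(f_z, v_hat, env) for c, env in envelopes.items()}
# A 420 m/s speed estimate strongly favours class c2, whose envelope extends to 650 m/s.
```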
4. Simulation Results
The simulated path of a second-class target is shown in Figure 1 (a). It performs four turn maneuvers with accelerations 1g, 2g, 5g and 2g. The 5g turn provides insufficient true class information, since the maneuver is of short duration, and the next 2g turn can lead to a misclassification. The speed of 250 [m/s] provides better conditions for the probability that the target belongs to class 2, according to the speed constraints. The estimated speed probabilities assist in the proper class identification, as can be seen in Figure 1 (b). The Root-Mean Squared Errors (RMSEs) [10] on position (both coordinates combined) and speed (magnitude of the velocity vector) are presented in Figure 2. The RMSEs are shown for each separate class, together with the combined RMSE obtained after averaging with the class probabilities. The MKF is implemented with N = 200 particles. The results are based on 100 Monte Carlo runs. The experiments show that the filter provides reliable tracking of intensively maneuvering targets with accelerations up to 5g with acceptable errors.
Figure 1. (a) Test trajectory (x [km] vs. y [km]); (b) Class probabilities vs. t [scans]

Figure 2. (a) MKF position RMSE [m] vs. t [scans]; (b) MKF speed RMSE [m/s] vs. t [scans]
The computational complexity of the proposed MKF allows for an on-line implementation. The experiments were performed on a PC with an AMD Athlon 1.4 GHz processor. The MKF computational time in the Matlab environment is 0.3 seconds per scan.
5. Conclusions
We propose a mixture Kalman filter for joint maneuvering target tracking and classification that accounts for acceleration and speed constraints. The operation of two class-dependent multiple-model MKFs is realised via a suitably drawn random class variable, which yields a relatively simple algorithm structure. The filter performance is analyzed by simulation over typical target trajectories. The results show reliable tracking and correct class discrimination. Generalization to more target classes is straightforward.
References

[1] E. Blasch, C. Yang, Ten Methods to Fuse GMTI and HRRR Measurements for Joint Tracking and Identification, Proc. of the 7th Intl. Conf. on Information Fusion, pp. 1006-1013, 2004.
[2] B. Ristic, N. Gordon, A. Bessell, On Target Classification Using Kinematic Data, Information Fusion, Elsevier Science, 5, pp. 15-21, 2004.
[3] A. Doucet, N. de Freitas, N. Gordon (editors), Sequential Monte Carlo Methods in Practice, Springer-Verlag, New York, 2001.
[4] N. Gordon, S. Maskell, T. Kirubarajan, Efficient particle filters for joint tracking and classification, Proc. SPIE Signal and Data Processing of Small Targets, 2002.
[5] M. Malick, S. Maskell, T. Kirubarajan, N. Gordon, Littoral Tracking Using Particle Filter, Proc. of the Fifth Int. Conf. on Information Fusion, 2002.
[6] P. Smets, B. Ristic, Kalman Filter and Joint Tracking and Classification in the TMB Framework, Proc. of the 7th Intl. Conf. on Multisensor Information Fusion, Sweden, pp. 46-53, 2004.
[7] D. Angelova, L. Mihaylova, Joint Target Tracking and Classification with Particle Filtering and Mixture Kalman Filtering Using Kinematic Radar Information, Digital Signal Processing, 2005 (to appear).
[8] R. Chen, J. Liu, Mixture Kalman Filters, J. Royal Statistical Society B, 62, pp. 493-508, 2000.
[9] R. Chen, X. Wang, J. Liu, Adaptive Joint Detection and Decoding in Flat-Fading Channels via Mixture Kalman Filtering, IEEE Trans. on Information Theory, 46, pp. 493-508, 2000.
[10] Y. Bar-Shalom, X. R. Li, Multitarget-Multisensor Tracking: Principles and Techniques, YBS Publishing, Storrs, CT, 1995.
A Survey on Assignment Techniques

Felix OPITZ
EADS Deutschland GmbH, Wörthstr. 85, 89077 Ulm, Germany
Abstract: We address the assignment problem by considering the relationship between single sensor measurements and real targets. Here, the class of multidimensional assignment is considered for multi-scan as well as unresolved measurement problems in multi-target tracking. Keywords: Data Fusion, Multi-Target-Tracking, Assignment, Data Association, Convex Analysis, Bundle Trust Region Method, Linear Programming, Interior Point.
Introduction

Assignment methods form an essential functionality of multi-sensor data fusion, which is applied in various areas: guidance of traffic flows, coastal surveillance, and air traffic control. These methods examine the relations between sensor plots (positions) and targets. A plot may be caused by one or multiple targets, or by the environment. On the other hand, a target may also go undetected by a sensor for several reasons. Assignment techniques generate, evaluate, and select hypotheses about the associations between plots and their origins.
1. Hypothesis Generation

Hypothesis generation establishes the candidate association hypotheses. Two-dimensional assignment considers the simultaneous relation between targets and sensor plots within a full sensor scan, assuming that each plot is associated with at most one target and each target is associated with at most one plot. The mathematical description uses a 0-1 indicator such that $\chi_{ij} = 1$ denotes the association between target $i$ and plot $j$, where plot 0 expresses a false alarm. $c_{ij}$ determines the weight of the local association $\chi_{ij} = 1$ within the global assignment. The assignment problem, shown in Figure 1, becomes an integer optimisation problem of the form:

$$
\min_{\chi_{ij}} \sum_{i=0}^{m} \sum_{j=0}^{n} c_{ij}\, \chi_{ij}
\quad \text{with} \quad
\sum_{j=0}^{n} \chi_{ij} = 1,\; i = 1, \ldots, m
\quad \text{and} \quad
\sum_{i=0}^{m} \chi_{ij} = 1,\; j = 1, \ldots, n
\tag{1}
$$
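As a side note, for moderate problem sizes the two-dimensional problem (1) can be solved directly with off-the-shelf routines. The Python sketch below uses SciPy's linear_sum_assignment on a small illustrative cost matrix; the matrix values and the simplified handling of the false-alarm option are assumptions for the example, not taken from the text.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Illustrative cost matrix c_ij for 3 targets (rows) and 4 plots (columns).
# Entry [i, j] plays the role of a negative log-likelihood of associating target i
# with plot j; the last column stands in for the "plot 0" (false alarm) option with
# an assumed fixed cost (in this simplified sketch it can absorb only one target).
cost = np.array([
    [2.1, 7.5, 9.0, 5.0],
    [6.8, 1.9, 8.2, 5.0],
    [7.7, 8.1, 2.4, 5.0],
])

rows, cols = linear_sum_assignment(cost)      # Hungarian-type solver from SciPy
total_cost = cost[rows, cols].sum()
for i, j in zip(rows, cols):
    print(f"target {i} -> plot {j} (cost {cost[i, j]:.1f})")
print("global assignment cost:", total_cost)
```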
Figure 1. Two dimensional assignment
Higher-dimensional optimisation simultaneously establishes an optimal relation between the targets, the plots of a first scan, and those of a second one, as shown in Figure 2. Using an indicator function $\chi_{i j_1 j_2} \in \{0, 1\}$ this is expressed as [1]:

$$
\min_{\chi_{i j_1 j_2}} \sum_{i=0}^{m} \sum_{j_1=0}^{n_1} \sum_{j_2=0}^{n_2} c_{i j_1 j_2}\, \chi_{i j_1 j_2}
\quad \text{subject to} \quad
\sum_{j_1=0}^{n_1} \sum_{j_2=0}^{n_2} \chi_{i j_1 j_2} = 1,\; i = 1, \ldots, m;
$$
$$
\sum_{i=0}^{m} \sum_{j_2=0}^{n_2} \chi_{i j_1 j_2} = 1,\; j_1 = 1, \ldots, n_1;
\qquad
\sum_{i=0}^{m} \sum_{j_1=0}^{n_1} \chi_{i j_1 j_2} = 1,\; j_2 = 1, \ldots, n_2
\tag{2}
$$

Figure 2. Three-dimensional assignment
Another application of multidimensional assignment considers merged plots caused by the limited resolution capability of sensors [2]. Let $\chi_j^{i_1 i_2} \in \{0, 1\}$ be an indicator between a pair of targets and a plot. An unresolved plot $j$ belongs to an unordered pair of targets $i_1, i_2$, i.e. $\chi_j^{i_1 i_2} = \chi_j^{i_2 i_1} = 1$. A resolved plot is associated with at most one target $i_1$, i.e. $\chi_j^{i_1 i_1} = 1$:

$$
\min_{\chi_j^{i_1 i_2}} \sum_{i_1, i_2 = 0}^{m} \sum_{j=0}^{n} c_j^{i_1 i_2}\, \chi_j^{i_1 i_2}
\quad \text{s.t.} \quad
\sum_{i_1=0}^{m} \sum_{i_2=i_1}^{m} \chi_j^{i_1 i_2} = 1,\; j = 1, \ldots, n;
\qquad
\sum_{j=0}^{n} \sum_{i_2=0}^{m} \chi_j^{i_1 i_2} = 1,\; i_1 = 1, \ldots, m
\tag{3}
$$
2. Hypothesis Evaluation
The hypothesis evaluation is computed from the likelihoods defined in filter theory, see [1, 2]:

$$
L^{i}_{j_1 j_2} = \prod_{k=1}^{2} \left[ 1 - P_D \right]^{\delta_{j_k, 0}} \left[ P_D\, \rho^{-1}\, \mathcal{N}\!\left( z_k^{j_k};\, H x^{i}_{k|k-1},\, S^{i}_{k|k-1} \right) \right]^{1 - \delta_{j_k, 0}},
\qquad
c^{i}_{j_1 j_2} = -\ln\!\left( L^{i}_{j_1 j_2} \right)
\tag{4}
$$
3. Hypothesis Selection

3.1. Two-Dimensional Optimisation

Methods to solve the two-dimensional optimisation problem include the Hungarian method, the Munkres algorithm, and the Jonker-Volgenant-Castanon or so-called Auction algorithms [3, 4]; the forward-reverse Auction variant is outlined in Figure 3.

Figure 3. Forward-Reverse Auction Algorithm
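Since the listing in Figure 3 does not survive extraction intact, the following is a minimal Python sketch of a plain forward auction for the two-dimensional problem (1). It omits the reverse phase and the dummy (false-alarm) index of the figure, so it illustrates the bidding mechanism rather than reconstructing the original pseudocode; all variable names are illustrative.

```python
import numpy as np

def forward_auction(cost, eps=None):
    """Plain forward auction for a square assignment problem (minimising cost).
    Returns assigned[i] = plot held by target i, plus the final prices."""
    m, n = cost.shape
    assert m == n, "sketch assumes a square problem"
    benefit = -cost                                     # auction maximises benefit
    eps = eps if eps is not None else 1.0 / (n + 1)     # eps < 1/n gives optimality for integer data
    price = np.zeros(n)
    owner = -np.ones(n, dtype=int)                      # owner[j]: target currently holding plot j
    assigned = np.full(m, -1, dtype=int)
    unassigned = list(range(m))
    while unassigned:
        i = unassigned.pop()
        values = benefit[i] - price
        j_best = int(np.argmax(values))
        best = values[j_best]
        second = np.max(np.delete(values, j_best)) if n > 1 else best
        price[j_best] += best - second + eps            # raise the price by the bid increment
        if owner[j_best] >= 0:                          # evict the previous owner
            assigned[owner[j_best]] = -1
            unassigned.append(owner[j_best])
        owner[j_best] = i
        assigned[i] = j_best
    return assigned, price

cost = np.array([[2.0, 7.0, 9.0],
                 [6.0, 2.0, 8.0],
                 [7.0, 8.0, 3.0]])
assignment, prices = forward_auction(cost)
print(assignment)   # e.g. [0, 1, 2]
```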
3.2. Higher Dimensional Optimisation: Convex Optimisation
The higher dimensional optimisation problem is NP-hard. One method of dealing with the optimisation problem Eq. (2) is to reduce it to a 2-dimensional one, using Lagrange multipliers, and an additional convex optimisation.
3.2.1. Relaxation and Lagrange Multipliers

Relaxation uses the last set of constraints and Lagrange multipliers $\vec{u} = (u_{j_2})_{j_2=1,\ldots,n_2}$ to relax the 3-dimensional problem into a 2-dimensional one [1, 3, 5]. One obtains (with $u_0 := 0$):

$$
\varphi(\vec{u}) = \min_{z_{i j_1 j_2}} \sum_{i=0}^{m} \sum_{j_1=0}^{n_1} \sum_{j_2=0}^{n_2} \left( c_{i j_1 j_2} + u_{j_2} \right) z_{i j_1 j_2} \; - \; \sum_{j_2=0}^{n_2} u_{j_2}
\quad \text{subject to}
$$
$$
\sum_{j_1=0}^{n_1} \sum_{j_2=0}^{n_2} z_{i j_1 j_2} = 1,\; i = 1, \ldots, m;
\qquad
\sum_{i=0}^{m} \sum_{j_2=0}^{n_2} z_{i j_1 j_2} = 1,\; j_1 = 1, \ldots, n_1;
\qquad
z_{i j_1 j_2} \in \{0, 1\}
\tag{5}
$$

Given a fixed $\vec{u}$, $\varphi(\vec{u})$ may be calculated via the Auction algorithm. With the optimal solution $\chi_{i j_1 j_2}$ of the original problem in Eq. (2), the following identity holds:

$$
\varphi(\vec{u}) \le \sum_{i=0}^{m} \sum_{j_1=0}^{n_1} \sum_{j_2=0}^{n_2} c_{i j_1 j_2}\, \chi_{i j_1 j_2}
\qquad \Longrightarrow \qquad
\max_{\vec{u} \in \mathbb{R}^N} \varphi(\vec{u}) \le \sum_{i=0}^{m} \sum_{j_1=0}^{n_1} \sum_{j_2=0}^{n_2} c_{i j_1 j_2}\, \chi_{i j_1 j_2}
\tag{6}
$$
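A minimal Python sketch of this relaxation loop follows, under assumptions not taken from the text: the relaxed 2-D problems are solved with SciPy's linear_sum_assignment, the dummy (index-0) rows and columns are treated like ordinary ones, and the multipliers are updated by a plain subgradient step with a diminishing step size.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def relaxed_value(c, u):
    """Evaluate phi(u) for the relaxed 3-D assignment, Eq. (5).
    c has shape (m+1, n1+1, n2+1); u has shape (n2+1,) with u[0] = 0."""
    modified = c + u[None, None, :]
    best_j2 = modified.argmin(axis=2)               # cheapest j2 for every (i, j1)
    reduced = modified.min(axis=2)                  # 2-D cost matrix over (i, j1)
    rows, cols = linear_sum_assignment(reduced)     # relaxed 2-D assignment
    value = reduced[rows, cols].sum() - u.sum()
    usage = np.zeros_like(u)                        # subgradient: usage of each j2 minus one
    for i, j1 in zip(rows, cols):
        usage[best_j2[i, j1]] += 1.0
    g = usage - 1.0
    g[0] = 0.0                                      # u_0 stays fixed at zero
    return value, g

def dual_ascent(c, iters=50, step=1.0):
    """Maximise the dual phi(u) by simple subgradient ascent (illustrative rule)."""
    u = np.zeros(c.shape[2])
    best = -np.inf
    for t in range(1, iters + 1):
        value, g = relaxed_value(c, u)
        best = max(best, value)
        u = u + (step / t) * g
        u[0] = 0.0
    return best, u

# Tiny random example: 3 targets and 3 plots per scan (plus dummy index 0).
rng = np.random.default_rng(0)
c = rng.uniform(1.0, 10.0, size=(4, 4, 4))
lower_bound, u = dual_ascent(c)
print("dual bound on the 3-D assignment cost:", round(lower_bound, 3))
```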
3.2.2. Convex Analysis and Bundle-Trust-Region Method

The function $\psi = -\varphi$ is a convex (non-differentiable) function, so that Eq. (6) corresponds to a convex optimization problem. Instead of the vector-valued gradient $\nabla\psi(\vec{u})$ of a differentiable function, convex analysis deals with the set-valued subgradient resp. $\varepsilon$-subgradient [6]:

$$
\partial\psi(\vec{u}) := \left\{ \vec{g} \in \mathbb{R}^N : \psi(\vec{v}) \ge \psi(\vec{u}) + \vec{g}^{\,T}(\vec{v} - \vec{u}), \;\forall \vec{v} \right\},
\qquad
\partial_\varepsilon\psi(\vec{u}) := \left\{ \vec{g} \in \mathbb{R}^N : \psi(\vec{v}) \ge \psi(\vec{u}) + \vec{g}^{\,T}(\vec{v} - \vec{u}) - \varepsilon, \;\forall \vec{v} \right\}
\tag{7}
$$

An advanced and efficient method of solving the convex optimization problem defines an iteratively grown bundle $B_k = \{\vec{g}_i \mid \vec{g}_i \in \partial\psi(\vec{u}_i);\; i = 1, \ldots, l_k\}$ of subgradients and thus establishes an approximation of $\psi$ by a sequence of so-called cutting planes $\psi_k$ (Figure 4):

$$
\psi_k(\vec{u}) := \max_{\vec{g}_i \in B_k} \left\{ \psi(\vec{u}_i) + \vec{g}_i^{\,T}\left(\vec{u} - \vec{u}_i\right) \right\}
\tag{8}
$$

Further, defining the linearization errors as $\alpha_k^i = \psi(\vec{u}_k) - \psi(\vec{u}_i) - \vec{g}_i^{\,T}(\vec{u}_k - \vec{u}_i)$, one obtains an inner approximation of the $\varepsilon$-subgradient of $\psi$ [7]:

$$
S_{k,\varepsilon} := \left\{ \sum_{i=1}^{l_k} \beta_i \vec{g}_i \;\middle|\; \beta_i \ge 0, \; \sum_{i=1}^{l_k} \beta_i = 1, \; \sum_{i=1}^{l_k} \beta_i \alpha_k^i \le \varepsilon \right\} \subset \partial_\varepsilon\psi(\vec{u}_k)
\tag{9}
$$
Figure 4. Cutting plane approximation
The cutting plane approximation $\psi_k$ is used (Figure 4) to find the descent direction $\vec{d} = \vec{u} - \vec{u}_k$ in the different iteration steps:

$$
\psi_k(\vec{u}_k, \vec{d}\,) = \max_{i=1,\ldots,l_k} \left\{ \psi(\vec{u}_i) + \vec{g}_i^{\,T}(\vec{u} - \vec{u}_i) \right\}
= \max_{i=1,\ldots,l_k} \left\{ \psi(\vec{u}_i) + \langle \vec{g}_i, \vec{d}\, \rangle + \langle \vec{g}_i, \vec{u}_k - \vec{u}_i \rangle \right\}
\tag{10}
$$

Of course, one cannot trust this model far away from $\vec{u}_k$, so a penalty term is added as a further modification [7]. The determination of $\vec{d}_k$ for a known step size $t_k$ is done by:

$$
\vec{d}_k := \vec{d}(t_k) = \arg\min_{\vec{d}} \left\{ \psi_k(\vec{u}_k, \vec{d}\,) + \frac{1}{2 t_k}\, \vec{d}^{\,T}\vec{d} \right\}
\tag{11}
$$

i.e., the determination of the descent direction is a quadratic optimisation problem:

$$
(v, \vec{d}\,) := (v(t_k), \vec{d}(t_k)) = \arg\min_{(v, \vec{d}\,) \in \mathbb{R}^{n+1}} \left\{ v + \frac{1}{2 t_k}\, \vec{d}^{\,T}\vec{d} \;\middle|\; v \ge \vec{g}_i^{\,T}\vec{d} - \alpha_k^i \right\}
\quad \text{or} \quad
\arg\min_{\vec{\beta}} \left\{ \frac{t_k}{2} \left\| \sum_{i=1}^{l_k} \beta_i \vec{g}_i \right\|^2 + \sum_{i=1}^{l_k} \beta_i \alpha_i^k \;\middle|\; \beta_i \ge 0,\; i = 1, \ldots, l_k,\; \sum_{i=1}^{l_k} \beta_i = 1 \right\}
\tag{12}
$$

$$
\vec{d}(t_k) = -t_k \sum_{i=1}^{l_k} \beta_i \vec{g}_i,
\qquad
v(t_k) = -t_k \left\| \sum_{i=1}^{l_k} \beta_i \vec{g}_i \right\|^2 - \sum_{i=1}^{l_k} \beta_i \alpha_i^k,
\tag{13}
$$

using the duality [7, 8]. Eq. (13) may be interpreted as the projection of the null vector $\vec{0}$ onto $S_{k,\varepsilon}$ for $\varepsilon = \sum_{i=1}^{l_k} \beta_i \alpha_i^k$.
Therefore, it may be used to realize a stopping criterion (Figure 6).

Figure 5. Bundle trust region algorithm (main)
Figure 6. Determination of the trust region parameter
The final modification of the bundle trust region method is to distinguish between so-called serious and null steps: a new iteration step is called a serious step if it delivers an improved approximation of the minimal point. The remaining steps do not improve the minimum point; nevertheless, they are used to improve the cutting plane model (Figures 5 and 6).
3.3. Higher Dimensional Optimisation: Linear Programming

To be able to apply linear programming, one has to drop the integer constraint on the indicators. The optimisation problem in Eq. (2) or Eq. (3) is transformed via the index mapping [2, 9]:

$$
x_{(n_1+1)(m+1) j_1 + (m+1) j_2 + i + 1} \;\leftrightarrow\; \chi_{i j_1 j_2}
\qquad \text{resp.} \qquad
x_{(m+1)(m+1) j + (m+1) i_1 + i_2 + 1} \;\leftrightarrow\; \chi_j^{i_1 i_2}
\tag{14}
$$

One obtains the linear program in standard form:

$$
\min_{\vec{x}} \; \vec{c}^{\,T}\vec{x}
\quad \text{subject to} \quad
A\vec{x} = \vec{b}, \qquad \vec{0} \le \vec{x} \le \vec{1}
\tag{15}
$$

Theoretically, this linear program may be solved by the simplex method. Unfortunately, the complexity of the simplex method may increase exponentially with the dimension. However, Khachiyan proved that a linear program is solvable in polynomial time. A class of efficient algorithms are the interior point methods [10]. However, a solution $\chi_{i j_1 j_2}$ resp. $\chi_j^{i_1 i_2}$ for Eq. (2) resp. (3) found by these methods need not be integer. But, due to the constraints, it allows an interpretation in the sense of a (pseudo-)probability for an association [9].
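A minimal illustration of such an LP relaxation using SciPy's linprog on a toy two-dimensional problem follows; the cost values, the simple flattening convention, and the use of the HiGHS solver are assumptions for the example (the paper's mapping (14) targets the three-dimensional case).

```python
import numpy as np
from scipy.optimize import linprog

# Toy 2-D assignment: 3 targets x 3 plots, with chi_ij in {0,1} relaxed to 0 <= x <= 1.
m = n = 3
cost = np.array([[2.0, 7.0, 9.0],
                 [6.0, 2.0, 8.0],
                 [7.0, 8.0, 3.0]])
c = cost.flatten()                       # x index = i*n + j (simple flattening convention)

# Equality constraints: each row sums to 1 and each column sums to 1.
A_eq = np.zeros((m + n, m * n))
for i in range(m):
    A_eq[i, i * n:(i + 1) * n] = 1.0     # sum_j x_ij = 1
for j in range(n):
    A_eq[m + j, j::n] = 1.0              # sum_i x_ij = 1
b_eq = np.ones(m + n)

res = linprog(c, A_eq=A_eq, b_eq=b_eq, bounds=(0.0, 1.0), method="highs")
x = res.x.reshape(m, n)
print(np.round(x, 3))   # for the 2-D assignment polytope the LP optimum is integral
```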
References

[1] A. B. Poore, N. Rijavec, "A Lagrangian Relaxation Algorithm for Multidimensional Assignment Problems Arising from Multitarget Tracking," SIAM Journal on Optimization, Vol. 3, No. 3, pp. 544-563, August 1993.
[2] F. Opitz, Clustered multidimensional data association for limited sensor resolutions, Proc. 8th Int. Conf. on Information Fusion, Philadelphia, PA, USA, 2005.
[3] P. J. Shea, Lagrange-Relaxation-Based Methods for Data Association Problems in Tracking, PhD Thesis, Colorado State University, Fort Collins, 1995.
[4] D. Bertsekas, Network Optimization: Continuous and Discrete Models, Athena Scientific, Belmont, Massachusetts, 1998.
[5] F. Opitz, Data Association based on Lagrange Relaxation & Convex Analysis, NATO-RTO Symposium on "Target Tracking & Data Fusion for Military Observation Systems", Budapest, Hungary, 2003.
[6] J.-B. Hiriart-Urruty, C. Lemaréchal, Convex Analysis and Minimization Algorithms I and II, Grundlehren der mathematischen Wissenschaften 306, Springer-Verlag, Berlin Heidelberg, 1993.
[7] W. Alt, Numerische Verfahren der konvexen, nichtglatten Optimierung, Teubner, Stuttgart, 2004.
[8] H. Schramm, J. Zowe, A Version of the Bundle Idea for Minimizing a Nonsmooth Function, SIAM Journal on Optimization, 2, pp. 121-152, 1992.
[9] X. Li, Z.-Q. Luo, K. M. Wong, E. Bossé, An Interior Point Linear Programming Approach to Two-Scan Data Association, IEEE Trans. on Aerospace and Electronic Systems, Vol. 35, No. 2, April 1999.
[10] Y. Ye, M. J. Todd, S. Mizuno, "An O(√nL)-Iteration Homogeneous and Self-Dual Linear Programming Algorithm," Mathematics of Operations Research, Vol. 19, No. 1, February 1994.
Non-linear Techniques in Target Tracking

Thomas KAUSCH, Kaeye DÄSTNER and Felix OPITZ
EADS Deutschland GmbH, Wörthstr. 85, D-89077 Ulm, Germany
Abstract. New classes of tracking algorithms combining Variable-Structure Interacting Multiple Model (VS-IMM) techniques, augmentation or dual estimation, and Unscented Kalman filtering are presented in this paper. These filter methods ensure significant self-adjusting and inherent manoeuvre detection capabilities. The algorithms are distinguished by their highly accurate course and speed estimations, even for manoeuvring targets. The performance of these techniques is demonstrated for targets performing turns of varying cross accelerations. Keywords. target tracking, sensor data fusion, interacting multiple model, augmentation, variable structure, Kalman Filter, Unscented Transformation, Unscented Kalman Filter.
Introduction

The challenge of the tracking problem is to identify the path of a manoeuvring target based on noisy radar measurements. To be more precise, it is the aim of modern filter techniques to improve the sensor-measured position, to derive course and speed estimates and to provide a measure of the estimation uncertainty, which is used in a succeeding data association process. Finally, modern filter techniques also allow the simultaneous classification of the target manoeuvres. The basic components needed for every filter are the mathematical expression for the assumed target dynamics and the relationship between the target state and the measurement, i.e., the measurement equation:

$$
x_k = f(x_{k-1}) + n_{k-1}^{x},
\qquad
y_k = h(x_k) + n_k^{y}
\tag{1}
$$

Herein $x_k$ is the state vector, i.e., Cartesian position and velocity, and the process noise $n_{k-1}^{x}$ and the measurement noise $n_k^{y}$ are zero-mean white Gaussian processes with covariances $R_{k-1}^{x}$ and $R_k^{y}$.
1. Increase of maneuver spectrum
1.1. Interacting Multiple Model (IMM)

The challenge of tracking maneuvering targets is to find a suitable dynamic model with respect to the true but uncertain target behaviour, in order to improve radar system performance. The uncertainty in the choice of the propagation equation leads directly to the idea of the well-known Interacting Multiple Model (IMM) [1, 2, 3] as an efficient estimation technique suitable for uncertain target maneuver hypotheses. Instead of a single dynamic model, the IMM contains a whole filter bank of different maneuver models. By statistically mixing the implemented models, the covered target maneuver spectrum is extended. To avoid an exploding growth of competing models, with the resulting increased CPU load, one searches for methods that limit the necessary model set within the IMM approach.

1.2. Variable Structure

One such extension, called VS-IMM, is the real-time modification of the used model set, so that the number of models may be limited by concentrating on the most suitable ones [4].
1.3. Augmentation

Another possibility is to extend the reliability of a single maneuver model. The augmentation techniques [5, 6] are applicable when a maneuver model $m$ allows a parameterization $f^m(\cdot, \omega)$ with a parameter $\omega$. Prominent examples are coordinated turn models, with $\omega$ as turn rate, or ballistic models [7], with $\omega$ as ballistic coefficient. Augmentation is realized by extending the state vector with the parameterisation $\omega_k$, which leads to the new propagation equation for model $m$:

$$
\hat{x}_k = \hat{f}^m(\hat{x}_{k-1}) + \begin{pmatrix} n_{k-1}^{x} \\ n_{k-1}^{\omega} \end{pmatrix}
= \begin{pmatrix} f^m(x_{k-1}, \omega_{k-1}) \\ \omega_{k-1} \end{pmatrix} + \begin{pmatrix} n_{k-1}^{x} \\ n_{k-1}^{\omega} \end{pmatrix}
\tag{2}
$$

and

$$
y_k = \hat{h}(\hat{x}_k) + n_k^{y} = h(x_k) + n_k^{y}
\tag{3}
$$

instead of Eq. (1). Filtering is performed either with the augmented states or within a dual estimation approach, where two separate estimators are used for state and parameter estimation respectively [8, 9].
2. Handling of Non-Linearity

The most popular Kalman filter is a linear filter assuming linear relationships both in the propagation of the state and in the projection onto the measurement space. To handle non-linearities, the extended Kalman filter and the newer Unscented Kalman filter (UKF) are possible choices.

2.1. Extended Kalman Filter

The extended Kalman filter [2, 3] handles the problem of non-linearity by linearization of the corresponding functions with respect to the covariance transformations. This results in the following changes with respect to the linear Kalman filter:

$$
P_k^{m} = F_{k-1}^{m} P_{k-1}^{m} \left(F_{k-1}^{m}\right)^{T} + R_{k-1}^{mx},
\quad \text{with} \quad
F_{k-1}^{m} = \left. \frac{\partial f^m}{\partial x} \right|_{x_{k-1}^{m}}
\tag{4}
$$

$$
S_k^{m} = H_k^{m} P_k^{m} \left(H_k^{m}\right)^{T} + R_k^{y},
\quad \text{with} \quad
H_k^{m} = \left. \frac{\partial h}{\partial x} \right|_{x_k^{m}}
\tag{5}
$$

Finally, the Kalman gain matrix and the new state and covariance are calculated:

$$
K_k^{m} = P_k^{m} \left(H_k^{m}\right)^{T} \left(S_k^{m}\right)^{-1}
\tag{6}
$$

$$
x_k^{m} = x_k^{m} + K_k^{m} \left( y_k - y_k^{m} \right)
\tag{7}
$$

$$
P_k^{m} = P_k^{m} - K_k^{m} S_k^{m} \left(K_k^{m}\right)^{T}
\tag{8}
$$
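For concreteness, here is a compact Python sketch of one EKF cycle corresponding to (4)-(8). The function and variable names (f, h, jac_f, jac_h) and the tiny range-only example at the bottom are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def ekf_step(x, P, y, f, jac_f, h, jac_h, Rx, Ry):
    """One extended Kalman filter cycle: predict with f, update with measurement y."""
    # Prediction
    F = jac_f(x)
    x_pred = f(x)
    P_pred = F @ P @ F.T + Rx                 # Eq. (4)
    # Update
    H = jac_h(x_pred)
    y_pred = h(x_pred)
    S = H @ P_pred @ H.T + Ry                 # Eq. (5)
    K = P_pred @ H.T @ np.linalg.inv(S)       # Eq. (6)
    x_new = x_pred + K @ (y - y_pred)         # Eq. (7)
    P_new = P_pred - K @ S @ K.T              # Eq. (8)
    return x_new, P_new

# Tiny illustrative example: nearly-constant-velocity state, non-linear "range" measurement.
dt = 1.0
f = lambda x: np.array([x[0] + dt * x[1], x[1]])
jac_f = lambda x: np.array([[1.0, dt], [0.0, 1.0]])
h = lambda x: np.array([np.abs(x[0])])
jac_h = lambda x: np.array([[np.sign(x[0]), 0.0]])
x, P = np.array([10.0, 1.0]), np.eye(2)
x, P = ekf_step(x, P, y=np.array([11.2]), f=f, jac_f=jac_f, h=h, jac_h=jac_h,
                Rx=0.01 * np.eye(2), Ry=np.array([[0.25]]))
```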
2.2. Unscented Kalman Filter

To handle non-linear estimation problems, the UKF makes use of the Unscented Transformation. This means that a certain number of samples of the probability distribution are used to perform the non-linear transformation. Afterwards, the transformed samples are recombined to obtain the transformed mean and covariance of the probability distribution. A very clear description of this algorithm can be found in [10, 11, 12].
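A minimal sketch of the unscented transformation itself (sigma-point generation, propagation and recombination) is given below; the kappa scaling parameter and the polar-to-Cartesian example follow common conventions and are assumptions, since the paper defers the details to [10, 11, 12].

```python
import numpy as np

def unscented_transform(mean, cov, func, kappa=1.0):
    """Propagate a Gaussian (mean, cov) through a non-linear func using 2n+1 sigma points."""
    n = mean.size
    # Sigma points: the mean plus/minus scaled columns of a matrix square root of cov.
    sqrt_cov = np.linalg.cholesky((n + kappa) * cov)
    sigma = np.vstack([mean, mean + sqrt_cov.T, mean - sqrt_cov.T])   # (2n+1, n)
    weights = np.full(2 * n + 1, 1.0 / (2.0 * (n + kappa)))
    weights[0] = kappa / (n + kappa)
    # Propagate each sigma point through the non-linearity and recombine.
    transformed = np.array([func(s) for s in sigma])
    new_mean = weights @ transformed
    diff = transformed - new_mean
    new_cov = (weights[:, None] * diff).T @ diff
    return new_mean, new_cov

# Example: polar-to-Cartesian conversion of a range/bearing estimate.
mean = np.array([1000.0, np.pi / 4])            # range [m], bearing [rad]
cov = np.diag([120.0**2, np.deg2rad(0.2)**2])
to_xy = lambda s: np.array([s[0] * np.cos(s[1]), s[0] * np.sin(s[1])])
m_xy, P_xy = unscented_transform(mean, cov, to_xy)
```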
3. Combined Algorithms
3.1. Dual UKF-VS-IMM & Dual EKF-VS-IMM

The Dual UKF-VS-IMM (Dual EKF-VS-IMM) uses a bank of $M$ manoeuvre models realised by Unscented Kalman filters (extended Kalman filters). Every model is described by a parameterised propagation function $f^m(\cdot, \omega^m)$, $m = 1, \ldots, M$. It is assumed that the model parameters are bounded, to avoid a degeneration of the model that would decrease the filter performance by allowing non-realistic states:

$$
\omega^m \in \left[ \omega_{\text{lower}}^{m}, \omega_{\text{upper}}^{m} \right]
\tag{9}
$$

This approach starts with the interaction and mixing step of the standard IMM technique, using the mixing probabilities $\mu_{k-1}^{n|m}$, states $x_{k-1}^{n}$ and covariances $P_{k-1}^{n}$ of the previous step:

$$
x_{k-1}^{0m} = \sum_{n=1}^{M} \mu_{k-1}^{n|m}\, x_{k-1}^{n}
\tag{10}
$$

$$
P_{k-1}^{0m} = \sum_{n=1}^{M} \mu_{k-1}^{n|m} \left[ P_{k-1}^{n} + \left( x_{k-1}^{n} - x_{k-1}^{0m} \right) \left( x_{k-1}^{n} - x_{k-1}^{0m} \right)^{T} \right]
\tag{11}
$$
A filtering step is executed using the dual estimation methodology for every individual model. Given $\omega_{k-1}^{m}$, a UKF (EKF) is applied to estimate the current dynamic parameter $\tilde{\omega}_k^{m}$ based on the new measurement $y_k$ and the last state estimation $x_{k-1}^{0m}$. In this step the propagation of the dynamic parameter is the trivial one, while the projection onto the measurement space is given by

$$
\omega \mapsto h \circ f^m\!\left( x_{k-1}^{m}, \omega \right)
\tag{12}
$$

One must prevent the model set from model coalescence by considering Eq. (9). This is done by a recovery step whenever the estimated parameter drifts out of its region:

$$
\omega_k^{m} =
\begin{cases}
\omega_{\text{lower}}^{m} & \text{if } \tilde{\omega}_k^{m} < \omega_{\text{lower}}^{m} \\
\omega_{\text{upper}}^{m} & \text{if } \tilde{\omega}_k^{m} > \omega_{\text{upper}}^{m} \\
\tilde{\omega}_k^{m} & \text{else}
\end{cases}
\tag{13}
$$
Finally, the new state $x_k^{m}$ and covariance $P_k^{m}$ are estimated by a second UKF (EKF) assuming $\omega_k^{m}$. Simultaneously, the likelihood is calculated for each individual model:

$$
\Lambda_k^{m} = \mathcal{N}\!\left( y_k;\, y_k^{m},\, S_k^{m} \right)
\tag{14}
$$

After the above branching into the different models, the new mixing and model probabilities are determined via the transition probabilities $\tau^{n|m}$:

$$
\mu_{k-1}^{n|m} = \frac{\tau^{n|m}\, \mu_{k-1}^{n}}{\sum_{i=1}^{M} \tau^{i|m}\, \mu_{k-1}^{i}}
\tag{15}
$$

$$
\mu_k^{m} = \frac{\Lambda_k^{m} \sum_{i=1}^{M} \tau^{i|m}\, \mu_{k-1}^{i}}{\sum_{i=1}^{M} \Lambda_k^{i} \sum_{j=1}^{M} \tau^{j|i}\, \mu_{k-1}^{j}}
\tag{16}
$$

Finally, the model-specific states and covariances are combined into global ones:

$$
x_k = \sum_{m=1}^{M} x_k^{m}\, \mu_k^{m}
\tag{17}
$$

$$
P_k = \sum_{m=1}^{M} \mu_k^{m} \left[ P_k^{m} + \left( x_k^{m} - x_k \right) \left( x_k^{m} - x_k \right)^{T} \right]
\tag{18}
$$
3.2. UKF-VS-AIMM & EKF-VS-AIMM

The UKF-VS-AIMM (EKF-VS-AIMM) applies VS-IMM methods to the augmented state spaces instead of the dual estimation approach above. The drawback is that parameter and state are highly coupled. This may be a disadvantage with respect to modularisation aspects and the capability to combine different types of manoeuvre models.
4. Example and Simulation Results

The following example illustrates the techniques developed above. It is applied to a model set consisting of two Coordinated Turn (CT) models (left- and right-handed turns), parameterised by turn rates, and a Constant Velocity (CV) model. The following diagrams show a comparison between the different techniques. The xy-plots (Figure 1 and Figure 2) show a single run. The remaining plots (Figure 3 to Figure 5) are the result of Monte Carlo simulations. The solid line denotes the Dual UKF VS-IMM [13], the dashed line the UKF-VS-AIMM [14] and the dotted line the result of an EKF-VS-AIMM (VS-AIMM) [5]. The sensors are assumed to deliver range, azimuth and Doppler.
Figure 1. Measurements (left) and resulting track calculated with a Dual UKF VS-IMM (right)
Figure 2. Tracking result when using a UKF VS-AIMM (left) and an EKF VS-AIMM (right)
Figure 3. Probabilities of the CT left model (left) and CV model (right)
Figure 4. Probability of the CT right model (right)
Figure 5. Course (left) and speed (right) accuracy (rms)
5. Conclusion

New classes of UKF-controlled IMM algorithms were introduced. These classes of algorithms realise synergies by combining recently developed techniques in a common scheme. For an example based on coordinated turn models, these algorithms were compared with classical, EKF-based techniques. The new algorithms are shown to possess excellent estimation accuracy and manoeuvre detection capabilities.
References

[1] Y. Bar-Shalom and X.-R. Li, Estimation and Tracking: Principles, Techniques, and Software, Artech House, 1993.
[2] Y. Bar-Shalom and X.-R. Li, Multitarget-Multisensor Tracking: Principles and Techniques, YBS Publishing, 1995.
[3] S. Blackman and R. Popoli, Modern Tracking Systems, Artech House, 1999.
[4] X. R. Li, "Engineer's guide to variable-structure multiple-model estimation for tracking", in Y. Bar-Shalom and W. D. Blair (editors), Multitarget-Multisensor Tracking: Applications and Advances, Volume III, pp. 499-567, Artech House, Boston, 2000.
[5] E. Semerdjiev, L. Mihaylova and X. R. Li, "Variable- and Fixed-Structure Augmented IMM Algorithms Using Coordinated Turn Model", Third Int. Conf. on Information Fusion, Paris, France, 2000.
[6] R. F. Stengel, Optimal Control and Estimation, Dover Publications, 1994.
[7] A. Farina, D. Benevenuti, B. Ristic, "Estimation accuracy of a landing point of a ballistic target", Fifth International Conference on Information Fusion, Annapolis, Maryland, USA, 2002.
[8] C. K. Chui and G. Chen, Kalman Filtering, Springer-Verlag, 1987.
[9] E. A. Wan and A. T. Nelson, "Dual Extended Kalman Filter Methods", in S. Haykin (editor), Kalman Filtering and Neural Networks, John Wiley & Sons, 2001.
[10] B. Ristic, S. Arulampalam and N. Gordon, Beyond the Kalman Filter: Particle Filters for Tracking Applications, Artech House, 2004.
[11] S. Julier, J. K. Uhlmann, and H. F. Durrant-Whyte, "A New Method for the Nonlinear Transformation of Means and Covariances in Filters and Estimators", IEEE Transactions on Automatic Control, Vol. 45, No. 3, March 2000.
[12] S. Julier and J. K. Uhlmann, "Data Fusion in Nonlinear Systems", in D. L. Hall and J. Llinas (editors), Handbook of Multisensor Data Fusion, pp. 13-1 to 13-21, CRC Press, Boca Raton, 2001.
[13] F. Opitz and T. Kausch, "UKF controlled Variable-Structure IMM Algorithms using Coordinated Turn Models", Seventh Int. Conf. on Information Fusion, Stockholm, Sweden, 2004.
[14] F. Opitz, "A Variable Structure Augmented IMM Algorithm based on Unscented Transformations", International Radar Symposium, Warszawa, Poland, 2004.
Underwater Threat Source Localization: Processing Sensor Network TDOAs with a Terascale Optical Core Device

Jacob BARHEN a,1, Neena IMAM a, Michael VOSE a,b, Arkady AVERBUCH c, and Michael WARDLAW d
a Oak Ridge National Laboratory, United States of America
b University of Tennessee, United States of America
c Lenslet Inc., Israel
d Office of Naval Research, United States of America
Abstract. Revolutionary computing technologies are defined in terms of technological breakthroughs, which leapfrog over near–term projected advances in conventional hardware and software to produce paradigm shifts in computational science. For underwater threat source localization using information provided by a dynamical sensor network, one of the most promising computational advances builds upon the emergence of digital optical-core devices. In this article, we present initial results of sensor network calculations that focus on the concept of signal wavefront Time-Difference-of-Arrival (TDOA). The corresponding algorithms are implemented on the EnLight™ processing platform recently introduced by Lenslet Laboratories. This tera-scale digital optical core processor is optimized for array operations, which it performs in a fixed-point-arithmetic architecture. Our results (i) illustrate the ability to reach the required accuracy in the TDOA computation, and (ii) demonstrate that a considerable speed-up can be achieved when using the EnLight™ 64D prototype processor as compared to a dual Intel XeonTM processor. Keywords. Time-Difference-of-Arrival (TDOA), optical-core processor, sensor net, underwater source localization.
Introduction

In recent years, there has been a rapidly growing interest in near-real-time remote detection and localization of underwater threats using information provided by dynamically evolving sensor networks. This interest has been driven by the requirement to improve detection performance against stealthier targets using ever larger distributed sensor arrays under a variety of operational and environmental conditions. Figure 1 illustrates a typical mission, depicting a submerged threat (here a submarine), a patrol aircraft searching for it, and a field of Global Positioning System (GPS) capable sonobuoys. The buoys are passive omnidirectional sensors that provide
¹ Corresponding Author: J. Barhen, Center for Engineering Science Advanced Research, Computer Science and Mathematics Division, Oak Ridge National Laboratory, 1 Bethel Valley Road, Oak Ridge, TN 37831-6016; E-mail: [email protected]
sound pressure measurements of the target signal perturbed by the ambient conditions. Once the buoys are placed, the aircraft monitors their transmissions and processes the data to detect, classify and localize the threat. The sonobuoys continuously monitor and transmit the measured signal via radio link with the aircraft. The position of the buoys is sampled periodically and also transmitted via radio link. A field of self-localizing sonobuoys provides a unique means for underwater target detection in terms of its deployment flexibility, signal acquisition speed, focused ranging, and capability for net-centric information fusion. However, demanding calculations need to be performed to achieve source localization, and their complexity is known to increase dramatically with the size of the sensor array. This, in turn, results in substantial processing power requirements that cannot readily be met with off-the-shelf computing hardware. In fact, it is generally recognized that the development of acoustic sensors for underwater detection is much less challenging than identifying and implementing, in near real-time, and often under severe power availability constraints, the appropriate signal processing and detection algorithms.

Figure 1. Patrol aircraft monitoring GPS-capable sonobuoys

Here, we will consider the implementation of an algorithm for signal wavefront Time-Difference-of-Arrival (TDOA) at each array element of a distributed sensor network. TDOA techniques are the cornerstone of modern source localization paradigms. Our implementation is carried out on the recently introduced EnLight platform. This revolutionary digital optical core processor offers tera-scale computing capabilities in a limited (native 8-bit) precision, fixed-point arithmetic architecture. The specific objective of our effort was to (i) demonstrate the ability to reach a required accuracy in the TDOA computation, and (ii) estimate the speed-up achieved when using an EnLight device as compared to a leading-edge Intel Xeon™ processor.
1. Underwater Threat Localization with a Sensor Network

Over the past few decades, a great deal of effort has been devoted to the extraction of spatio-temporal information from an array of spatially distributed sensors [1, 2]. In the area of Anti-Submarine Warfare (ASW), much attention has focused on adaptive beamforming, primarily in the context of towed arrays [3, 4]. The basic emphasis of such research was to achieve robust detection and Direction-of-Arrival (DOA) estimation under requirements for auto-calibration of the arrays [5, 6]. Notwithstanding the
considerable progress reported over the years, today's leading paradigms still face substantial degradation in the presence of realistic ambient noise and clutter [7]. With the emergence of large-scale dynamic sensor networks (as depicted in Figure 1), where each individual sensor is subject to random motion, many previously postulated basic assumptions [8] (e.g., far-field geometry) are no longer valid. For instance, the sensors typically have arbitrary spacing between them, and the aperture of the distributed array may be considerable compared to the distance to the threat. Thus, different paradigms must be considered. One of the most robust is based on the concept of signal wavefront TDOA [9]. Several methodologies are available for localizing a threat source in the context of TDOAs. For illustrative purposes, we mention here briefly only three interesting approaches. Each of them requires, as a necessary first step, that accurate estimates of TDOAs for each combination of sensors in the network be obtained. The first methodology finds an estimate for the source location given the TDOAs and the sonobuoy positions using either maximum likelihood [10] or iterative least squares [11] optimization procedures. The second methodology attempts to directly obtain a closed-form solution for the source location. Recently reported results [12] indicate that excellent accuracy can be achieved under minimal operational constraints of sensor non-colinearity. The third approach is novel, and is introduced in a companion paper [13]. Its primary interest resides in the fact that it enables simultaneous estimation of the TDOAs and the threat source location. It builds upon the NOGA algorithms [14] that were developed for uncertainty analysis of nonlinear systems.
2. Time Differences of Arrival

Let $\tau_{nm}(\mathbf{p}^{(s)})$ denote the TDOA between sensor $n$ and sensor $m$ for a signal wavefront originating from a source with position coordinates $\mathbf{p}^{(s)} = (p_1^{(s)}, p_2^{(s)}, p_3^{(s)})^{\sim}$. Note that the superscript $\sim$ refers to transposition. The TDOA is defined as:

$$
\tau_{nm}(\mathbf{p}^{(s)}) = \frac{\lVert \mathbf{p}^{(n)} - \mathbf{p}^{(s)} \rVert}{c} - \frac{\lVert \mathbf{p}^{(m)} - \mathbf{p}^{(s)} \rVert}{c}
\tag{1}
$$
where $\mathbf{p}^{(n)}$ represents the position of the $n$-th sonobuoy, and $c$ represents the speed of sound in the medium. Because of the absence of a timing reference on the unknown threat source, the most commonly used technique for TDOA computation is cross-correlation. One usually has to estimate the TDOA for each sensor pair $(n, m)$ from signals $x_n(t)$ and $x_m(t)$ measured respectively at sonobuoys $n$ and $m$. Consider then a signal $s(t)$ radiating from a remote source through a channel that is subject to possibly strong interference and noise. The simplest signal propagation model for estimating the TDOA between signals $x_n(t)$ and $x_m(t)$ is

$$
x_m(t) = A_m\, s(t - \delta_m) + \eta_m(t)
\tag{2a}
$$
$$
x_n(t) = A_n\, s(t - \delta_n) + \eta_n(t)
\tag{2b}
$$

where $A_m$ and $A_n$ are amplitudes scaling the signal, $\eta_m(t)$ and $\eta_n(t)$ represent noise and interfering signals, and $\delta_m$ and $\delta_n$ are signal delay times. Let $m$ correspond to the sensor with the smaller delay. If we refer the delay times and scaling amplitudes to $m$, denote the amplitude ratio by $A_n$, and define $\tau_{nm} = \delta_n - \delta_m$, the signal propagation model becomes

$$
x_m(t) = s(t) + \eta_m(t)
\tag{3a}
$$
$$
x_n(t) = A_n\, s(t - \tau_{nm}) + \eta_n(t)
\tag{3b}
$$
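A few lines of Python make the geometric definition of Eq. (1) concrete; the sensor and source positions and the sound speed below are arbitrary illustrative values.

```python
import numpy as np

def tdoa(p_n, p_m, p_s, c=1500.0):
    """Time difference of arrival between sensors n and m for a source at p_s, Eq. (1).
    Positions in metres; c is the (assumed uniform) speed of sound in m/s."""
    return (np.linalg.norm(p_n - p_s) - np.linalg.norm(p_m - p_s)) / c

# Illustrative example: two sonobuoys and a submerged source.
p1 = np.array([0.0, 0.0, -10.0])
p2 = np.array([1200.0, 300.0, -10.0])
source = np.array([500.0, 900.0, -50.0])
print(tdoa(p1, p2, source))   # seconds; positive when the wavefront reaches sensor 2 first
```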
To apply cross-correlation techniques, one assumes that $\eta_m(t)$ and $\eta_n(t)$ are real, jointly stationary, zero-mean random processes that are mutually uncorrelated, as well as uncorrelated with $s(t)$. The cross-correlation between signals $x_n(t)$ and $x_m(t)$ measured at sonobuoys $n$ and $m$ is then defined as

$$
R_{x_m x_n}(\tau) = \int_{-\infty}^{\infty} x_n(t)\, x_m(t + \tau)\, \mathrm{d}t
\tag{4}
$$

The argument $\tau$ that maximizes $R_{x_m x_n}$ provides an estimate of the TDOA $\tau_{nm}$. Such a technique enables the synchronization of all sensors participating in the localization process. However, the correlation $R_{x_m x_n}$ can only be estimated from sequences of length $N$ corresponding to discrete samples of the signals. Thus, an estimate of it is given by

$$
\hat{R}_{x_m x_n}(\mu) = \frac{1}{N} \sum_{\nu=0}^{N - |\mu| - 1} x_n(\nu)\, x_m(\nu + \mu)
\tag{5}
$$
Alternatively, the cross-correlation can be computed from the cross-power spectral density $G_{x_m x_n}(f)$ of $x_n(t)$ and $x_m(t)$, i.e.,

$$
R_{x_m x_n}(\tau) = \int_{-\infty}^{\infty} G_{x_m x_n}(f)\, e^{\,j 2\pi f \tau}\, \mathrm{d}f
\tag{6}
$$
This is of interest because $G_{x_m x_n}$ can be obtained efficiently using Fourier transforms, which can be computed very fast by the optical core processor introduced in the sequel. In practice, one uses the concept of Generalized Cross Correlation (GCC) [15], where a frequency weighting filter is introduced in order to sharpen the correlation peak. The GCC is defined as

$$
H_{nm}(t, \tau) = \int_{-\infty}^{\infty} \psi(t, f)\, G_{nm}(t, f)\, e^{\,i 2\pi f \tau}\, \mathrm{d}f
\tag{7}
$$

where $G_{nm}(t, f)$ is the cross-power spectrum at instant $t$ and frequency $f$ corresponding to signals $x_n(t)$ and $x_m(t)$, and $\psi(t, f)$ is the frequency weighting filter. The GCC provides a coherence measure [16] that captures, for a hypothesized delay $\tau$, the similarity between signal segments extracted from sensors $n$ and $m$. For broad-band signals, the so-called phase transform technique was introduced in [15]. It translates into the following choice for the frequency weighting filter:

$$
\psi(t, f) = \frac{1}{\left| G_{nm}(t, f) \right|}
\tag{8}
$$
In practice, of course, the signals at each sensor are sampled, and both $H_{nm}$ and $G_{nm}$ have to be obtained from finite-length sequences. To further increase the accuracy of the TDOA estimation and to achieve sub-sample precision, interpolation of the normalized cross-correlation can be performed, as suggested in [16], before finding the maximum using a windowed sinc filter.
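A minimal NumPy sketch of the sampled GCC with the phase-transform weighting of Eq. (8) follows. The FFT length, the broadband synthetic test signal and the integer-sample peak search (no sub-sample interpolation) are simplifications chosen for illustration, not details of the paper's implementation.

```python
import numpy as np

def gcc_phat(x_n, x_m, fs):
    """Estimate the TDOA of x_n relative to x_m with GCC-PHAT, Eqs. (7)-(8).
    Returns the delay in seconds at integer-sample resolution."""
    n_fft = 2 * max(len(x_n), len(x_m))          # zero padding avoids circular wrap-around
    X_n = np.fft.rfft(x_n, n_fft)
    X_m = np.fft.rfft(x_m, n_fft)
    G = X_n * np.conj(X_m)                       # cross-power spectrum
    G /= np.abs(G) + 1e-12                       # phase transform weighting, Eq. (8)
    cc = np.fft.irfft(G, n_fft)
    cc = np.concatenate((cc[-n_fft // 2:], cc[:n_fft // 2]))   # centre zero lag
    lag = np.argmax(cc) - n_fft // 2
    return lag / fs

# Synthetic check: a broadband signal delayed by 8 samples and buried in noise.
rng = np.random.default_rng(1)
fs = 12.5                                         # 0.08 s sampling interval, as in the text
s = rng.standard_normal(2048)
delay = 8
x_m = s + 0.5 * rng.standard_normal(s.size)
x_n = np.roll(s, delay) + 0.5 * rng.standard_normal(s.size)
print(gcc_phat(x_n, x_m, fs) * fs)                # approximately 8 samples
```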
3. The EnLight Optical Core Processor

To address the computational challenges raised by underwater threat source localization, revolutionary computing technologies are needed. These are defined in terms of technological breakthroughs, both at the device and algorithmic levels, which leapfrog over near-term projected advances in conventional hardware and software to (potentially) result in paradigm shifts in computational science. For maritime sensing applications, one of the most promising advances builds upon the emergence of digital optical-core devices, inherently capable of high parallelism that can be translated into very high performance computing.

Figure 2. Architecture of the EnLight optical core processor
Recently, Lenslet Inc. introduced the novel EnLight™ processing platform [17]. The EnLight™256 is a small factor digital signal processing chip (5u5 cm2) with an optical core. The processor is optimized for array operations, which it can perform in fixed point arithmetic at the rate of 16 TeraOPS at 8bit precision. This is substantially faster than the fastest FPGA or DSP processors available today. The architecture of a computational node is shown in Figure 2. The optical core performs matrix-vector multiplications (MVM), where the nominal matrix size is 256u256. The system clock is 125MHz. At each clock cycle, 128K multiply-and-add operations per second (OPS) are carried out, which yields the peak performance of 16 TeraOPS. The rationale for large matrices is the good scaling and parallelism of such an optical processor – the larger the scale the faster the computation, with relatively small scaling penalty, comparing to electronics.
Figure 3. The EnLight™ 64D demonstrator board
Before starting production of the EnLight™256 processor, Lenslet built the EnLight™64D board, shown in Figure 3. This is a prototype demonstrator for the optical processing technology, with reduced size 64u64 optical core. In our proof-of-concept effort focused on TDOA estimation, we have used the 64D for all hardware tests. To project scale-up capabilities, we also tested our algorithms with the bit-exact simulator of the EnLight™256.
The EnLight™64D is specified as follows. Its clock operates at 60 MHz. The optical core has 64 input channels (configured as 256 vertical cavity surface emitting lasers, bundled in groups of 4 per channel). The size of the active matrix is 64u64; it is embedded in a larger multiple quantum well (MQW) spatial light modulator of size 264u288. There are 64 output channels (64 light detectors integrated with an array of analog-to-digital converters). The optical core performs the MVM function at the rate of 60 106 u 642 u 2 = 492 Giga operations per second. Each of the 64 data components in the input and output channels has an 8-bit accuracy, which results in a data stream of 60 106 u 64 u 8 bits/s = 30.7 Giga bits per second. We have developed algorithms that not only specifically build upon the massive parallelism of the EnLight processor, but also exploit the physics of this unique device. What is meant here is that a Discrete Fourier Transform (implemented as a simple matrix-vector multiplication) can be performed using the EnLight in a single processor clock cycle, provided the matrix fits in the core. This has enabled us to develop new
hybrid FFT/DFT high-radix implementation of transforms. Details are given in a separate paper [18].
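The observation that a DFT is just a matrix-vector product, and therefore maps directly onto an optical MVM core, can be illustrated in a few lines of NumPy. The 64-point size below matches the 64D core, while the crude quantization step only mimics the idea of limited 8-bit precision; it is not a model of Lenslet's actual number format.

```python
import numpy as np

N = 64                                            # matches the 64x64 optical core
n = np.arange(N)
W = np.exp(-2j * np.pi * np.outer(n, n) / N)      # DFT matrix: one MVM = one N-point DFT

x = np.cos(2 * np.pi * 5 * n / N) + 0.1 * np.random.default_rng(0).standard_normal(N)

X_mvm = W @ x                                     # DFT as a single matrix-vector multiplication
X_fft = np.fft.fft(x)
print(np.allclose(X_mvm, X_fft))                  # True: identical up to rounding

# Crude illustration of limited precision: quantize the matrix entries to ~8 bits.
W_q = np.round(W * 127) / 127
X_q = W_q @ x
print(np.max(np.abs(X_q - X_fft)) / np.max(np.abs(X_fft)))   # small relative error
```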
4. Results

In this study, we are interested in demonstrating the ability of the EnLight computing platform to accurately carry out the estimation of signal wavefront TDOAs. For the purpose of the numerical simulations and optoelectronic hardware implementation, a number of operational simplifications are made. In particular, we assume that:

• Only a single target is present during the detection process;
• The same speed of sound is experienced at each sensor location;
• Each sonobuoy position is known exactly (via GPS) as it drifts;
• Detection opportunities are defined by the incidence of signals at the sonobuoy; not all sonobuoys may detect the source.
Figure 4. Synthetic scenario for TDOA estimation (x-y and x-z projections)
For assessing the accuracy of EnLight computations, we consider here a very simple model. We assume that the target emits a periodic pulsed signal with unit nominal amplitude. Pulse duration is 1 SI, and inter-pulse period is 25 SIs. The size of one Sampling Interval (SI) is 0.08s. Noise and interference are taken as a Gaussian process with a varying power level (typically up to unity). Signal extinction is neglected. Each sensor stores sequences of measured signal samples. Sequence lengths can range between 1K and 80K samples. In Figure 5, we illustrate the first 250 samples of two such signal traces, for instance those recorded
at sensors 1 and 2 for a Signal-To-Noise Ratio (SNR) of −11 dB. Clearly, the pulsed signal signature from the threat source is indistinguishable because of the strong noise and interference. This contributes to the rationale for using correlation techniques in the source localization process.
Figure 5. Synthetic data sequences recorded at sensors #1 (orange) and #2 (green)
Figure 6. TDOA magnitude (in units of sampling intervals) versus sensor pairs (ordered lexicographically) for 7 active sensors. Exact (model) results are in blue; sensor-inferred results (computed using 64-bit floating-point FORTRAN) are in brown. SNR = −9 dB.
To assess the accuracy of computations performed with the EnLight processor, we have computed the TDOAs for all 21 sensor pairs using three different approaches.
First, exact results were obtained using the model specified by Eq. (1). The sensor and target positions were assumed to be exactly known, and the sonic velocity was taken to be identical at all sensor locations. Calculations were carried out using Intel Visual FORTRAN in 64-bit precision. In Figures 6 to 9, the corresponding magnitudes are coloured in blue. In the second approach, the TDOAs were estimated using noise-corrupted data samples collected at each sensor. The correlations were calculated in terms of Fourier transforms, and the computations were again carried out using 64-bit Intel Visual FORTRAN. The magnitudes of the corresponding TDOAs are coloured in brown in Figures 6 and 8. In the third approach, the sensor data processing was implemented on the EnLight™ 64D hardware prototype. The TDOAs are coloured in yellow in Figures 7 and 9. For benchmark purposes, two sets of data were used. Each set corresponds to a different SNR level. These levels were selected to show the break-point of correct TDOA estimation for signals buried in ever stronger noise, when calculations are performed in high precision (floating point). This allows us to illustrate (by comparison) the occurrence of potential additional discrepancies introduced by the fixed-point, limited-precision EnLight architecture.
Figure 7. TDOA magnitude (in units of sampling intervals) versus sensor pairs (ordered lexicographically) for 7 active sensors. Exact (model) results are in blue; sensor-inferred results (computed using EnLight™ 64D) are in yellow. SNR = −9 dB.
As observed from Figures 6 and 7, both the EnLight and the high-precision Visual FORTRAN computations from sensor data produce TDOA estimates that are identical to the exact model results for SNR = −9 dB. Similar quality results were obtained for all sets of equal or higher SNR, and for sequence lengths of at least 2K samples. Consider now a target signal embedded in noise at SNR = −11 dB. Figure 8 illustrates
the emergence of discrepancies due to noise in the correlations computed in high precision. The TDOA for sensor pair (2,7) is estimated incorrectly (the wrong correlation peak is selected as a result of noise).
Figure 8. TDOA magnitude (in units of sampling intervals) versus sensor pairs (ordered lexicographically) for 7 active sensors. Exact (model) results are in blue; sensor-inferred results (computed using Intel Visual FORTRAN) are in brown. SNR = −11 dB.
Figure 9. TDOA magnitude (in units of sampling intervals) versus sensor pairs (ordered lexicographically) for 7 active sensors. Exact (model) results are in blue; sensor-inferred results (computed using EnLight™ 64D) are in yellow. SNR = −11 dB.
Figure 9 shows that two discrepancies appear in the EnLight computations at −11 dB SNR. The TDOA discrepancy for sensor pair (2,7) corresponds to the one noted in
Figure 8 for the 64-bit precision calculations. Here another error (peak misclassification) is introduced for sensor pair (4,5). It is a direct consequence of the limited precision used in EnLight. In summary, the above results indicate that excellent accuracy can be achieved with the EnLight processor for properly scaled signals sampled over broad dynamic ranges. In terms of processing speed, we have carried out benchmark calculations for Fourier transforms of long signal sequences. In particular, we have compared the execution speed of the EnLight™ 64D hardware to processing using dual Intel Xeon processors running at 2 GHz with 1 GB RAM. The benchmark involved the computation of 32 sets of 80K complex-sample transforms. For each set, both the forward and the inverse Fourier transforms were calculated, the latter following multiplication of the former by the transform of a reference (to estimate the correlation). The measured times were 9,626 ms on the dual Xeon system, versus 1.42 ms on the EnLight. This corresponds to a speed-up of over 13,000 on a per-processor basis. We also carried out a capability projection using the EnLight™256 bit-exact simulator. The resulting time was 0.17 ms, yielding a speed-up of over 113,000 per processor. For the positive SNR used, perfect accuracy in determining the correlation peaks was obtained. More details on these computations can be found in [18].
5. Conclusions and Future Research
To achieve the real-time performance required for underwater threat source localization, many existing algorithms need to be revised and adapted to the emerging revolutionary computing technologies. These include field programmable gate arrays (FPGA), processor in memory (PIM) architectures, and optical (or optoelectronic) devices. The EnLight terascale optical core processor represents one such revolutionary advance. In that context, our future efforts will focus on demonstrating the ability to achieve the accuracy required (including, if necessary, higher than 8-bit) for other relevant maritime sensing applications; quantifying the speed-up achieved per processor as compared to a leading-edge conventional processor or DSP; determining the scaling properties per processor as a function of the number of sensors present in the detection, tracking, and discrimination network; and characterizing the SNR gain and detection improvement as a function of array size and geometry. Thirty-five years ago, fast computational units were only present in vector supercomputers. Twenty-five years ago, the first message-passing machines (NCUBE, Intel) were introduced. Today, the availability of fast, low-cost processors has revolutionized the way calculations are performed in various fields, from personal workstations to terascale machines. An innovative approach to high-performance, massively parallel computing remains a key factor for progress in science and national defense applications. In
contrast to conventional approaches, one must develop computational paradigms that exploit, from the outset, (1) the concept of massive parallelism, and (2) the physics of the implementation device. This has been the guiding principle for our algorithm implementation on the EnLight processor. Ten to twenty years from now, asynchronous, optical, nanoelectronic, biologically inspired, and quantum technologies have the potential to further revolutionize computational science and engineering by (a) offering unprecedented computational power for a wide class of demanding applications, and (b) enabling the implementation of novel information-processing paradigms.
Acknowledgments
Primary funding for this work was provided by the Office of Naval Research. Additional support was received from the Oak Ridge National Laboratory's LDRD program. Oak Ridge National Laboratory is managed for the United States Department of Energy by UT-Battelle, LLC under contract DE-AC05-00OR22725.
References
1. R. Klemm, Space–Time Adaptive Processing, The Institution of Electrical Engineers (UK) Press (1998).
2. R. Klemm, ed., Applications of Space–Time Adaptive Processing, The Institution of Electrical Engineers (UK) Press (2004).
3. W. Burdick, Underwater Acoustic System Analysis, Prentice Hall (1984).
4. P. Tichavsky and K. T. Wong, "Quasi-fluid-mechanics-based quasi-Bayesian Cramer-Rao bounds for deformed towed-array direction finding", IEEE Transactions on Signal Processing, 52(1), 36-47 (2004).
5. A. Van Buren, "Near-field transmitting and receiving properties of planar near-field calibration arrays", Journal of the Acoustical Society of America, 89(3), 1423-1427 (1991).
6. M. Viberg and A. L. Swindlehurst, "A Bayesian approach to auto-calibration for parametric array signal processing", IEEE Transactions on Signal Processing, 42(12), 3495-3507 (1994).
7. A. Nuttall and J. Wilson, "Adaptive beamforming at very low frequencies in spatially coherent, cluttered noise environments with low signal-to-noise ratio and finite-averaging times", Journal of the Acoustical Society of America, 108(5), 2256-2265 (2000).
8. P. Tichavsky and K. T. Wong, "Near-field / far-field azimuth and elevation angle estimation using a single vector hydrophone", IEEE Transactions on Signal Processing, 49(11), 2498-2510 (2001).
9. T. Ajdler, I. Kozintsev, R. Lienhart, and M. Vetterli, "Acoustic source localization in distributed sensor networks", Asilomar Conference on Signals, Systems and Computers, CD-ROM, IEEE Press (2004).
10. Y. Chan and K. Ho, "A simple and efficient estimator for hyperbolic location", IEEE Transactions on Signal Processing, 42(8), 1905-1915 (1994).
11. R. Schmidt, "Least squares range difference location", IEEE Transactions on Aerospace and Electronic Systems, 32(1), 234-241 (1996).
12. G. Mellen, M. Pachter, and J. Raquet, "Closed-form solution for determining emitter location using time difference of arrival measurements", IEEE Transactions on Aerospace and Electronic Systems, 39(3), 1056-1058 (2003).
13. J. Barhen, N. Imam, and M. Wardlaw, "Underwater Threat Source Localization: Uncertainty Reduction Algorithms for the EnLight Terascale Optical Core Processor", NATO ARW Data Fusion Technologies for Harbor Protection – Estonia 2005, IOS Publishers (in press, 2006).
14. J. Barhen, V. Protopopescu, and D. Reister, "Consistent uncertainty reduction in modeling nonlinear systems", SIAM Journal of Scientific Computing, 26, 653-665 (2004).
15. C. Knapp and G. Carter, "The generalized correlation method for estimation of time delay", IEEE Transactions on Acoustics, Speech and Signal Processing, 24(4), 320-327 (1976).
16. M. Omologo and P. Svaizer, "Use of crosspower-spectrum phase in acoustic event location", IEEE Transactions on Speech and Audio Processing, 5(3), 288-292 (1997).
17. A. Sariel, A. Halperin, and S. Levit, at URL: www.lenslet.com.
18. J. Barhen, N. Imam, A. Averbuch, M. Berlin, and M. Wardlaw, "Implementation of an Active Sonar Matched Filter Algorithm for Broadband Doppler-Sensitive Waveforms on the EnLight Terascale Optical Core Processor", IEEE Transactions on Signal Processing (submitted, February 2006).
Advances and Challenges in Multisensor Data and Information Processing E. Lefebvre (Ed.) IOS Press, 2007 © 2007 IOS Press. All rights reserved.
On Quality of Information in Multi-Source Fusion Environments
Eric LEFEBVRE a,1, Melita HADZAGIC b and Éloi BOSSÉ c
a Lockheed Martin Canada, Montréal, QC, Canada
b Dept. of Elec. and Comp. Engineering, McGill University, Montréal, QC, Canada
c DRDC-RDDC, Val-Belair, QC, Canada
Abstract. The effectiveness of a multi-source fusion process for decision making highly depends on the quality of information that is received and processed by the fusion system. This paper summarizes the existing quantitative analyses of different aspects of information quality in multi-source fusion environments. The summary includes definitions of four main aspects of information, namely, uncertainty, reliability, completeness and relevance. The quantitative assessment of quality of the information can facilitate evaluating how well the product of the fusion process represents the reality, hence contribute to improved decision making.
Keywords. information quality, multi-source fusion, uncertainty, completeness, relevance, reliability
Introduction
The effectiveness of a multi-source fusion process for decision making highly depends on the quality of information that is received and processed by the fusion system. A quantitative assessment of the quality of this information can facilitate evaluating how well the product of the fusion process represents the reality, hence contribute to improved decision making. This paper summarizes the existing quantitative analyses of different aspects of information quality in multi-source fusion environments. The summary includes definitions of four main aspects of information, namely, uncertainty, reliability, completeness and relevance, and descriptions of strategies and metrics for accounting for each aspect. In Section 1 we put these aspects of information quality in the context of an operational fusion environment. In Section 2, we define uncertainty, reliability, completeness and relevance, and present available quantitative methodologies for accounting for each information property within the process of information fusion. In Section 3, we provide the conclusions.
1. Fusion Environment
In the operational context, a fusion process includes several components which influence the quality of information produced by the fusion system. Figure 1 illustrates a simplified model of the operational fusion process. It shows the components that take part in real events, as well as in the fusion system representation. First, sensors detect events. This detection is subject to the sensor characteristics. The information obtained by the sensors is limited by the type of sensors, and it deviates from the real values depending on the sensor precision and accuracy.
1 Correspondence to: Eric Lefebvre, Lockheed Martin Canada, 6111 Royalmount Ave., Montréal, QC, H4P 1K6. Tel.: +1 514 340 8310 ext. 8715; Fax: +1 514 340 8354; E-mail: [email protected].
Figure 1. Simplified model of the operational fusion process.
Next, the sensors' information is collected. A component that performs this task, the Collector, may be a sensor management system or an external fusion node, such as a collaborative agency (e.g. the Coast Guard providing the information to the Navy would be considered as a Collector in this representation). The Collector assesses the reliability of the information provided by the sensors it manages. The Fusion Engine is responsible for fusing the information and constructing the representation of the real world for a decision maker. Within this simplified model of the operational fusion, it is possible to identify four main aspects (or properties) of information quality, namely, uncertainty, reliability, completeness, and relevance. Each aspect can be loosely coupled with a different component of the fusion system. The uncertainty, in our view, relates to the detection ability of the sensor. The reliability of sensors, hence of the information, relates to the sensor properties as well, but it is evaluated at a higher level, i.e. within the Collector component. The information completeness will depend on the fusion procedure, hence it is related to the Fusion Engine component. Finally, the relevance of information, in terms of added value and timeliness of information, will depend on the needs of the decision maker. Therefore, the overall quality of information produced by the fusion system is not an absolute value, but rather depends on the situation, the choice of the system components, and the system/user's needs, i.e., it is context dependent. The assessment of the aforementioned aspects of information quality makes it possible to evaluate how accurate the representation of reality obtained by the fusion system is. Thus, it will also lead to improved decision making. Another important issue for a decision maker is the situation awareness. However, in this paper, we present only the methodologies that consider the objective part of information, without addressing the subjective opinion that a human may have about the information, hence excluding the situation awareness from the representation of the quality of information.
2. Information Properties
The following information properties determine the quality of information in the information fusion process: uncertainty, reliability, completeness and relevance. To completely account for the quality of information in the fusion process, we need to assess each of these properties individually. In this Section, we present the definitions, the principal concepts and the strategies for accounting for uncertainty, reliability, completeness, and relevance of information in the process of information fusion.
2.1. Uncertainty
Various typologies of uncertainty exist in the literature and they have been discussed in [1]. The typology proposed by Klir and Wierman [4] does not mention knowledge, and thus stays at a lower level of processing, i.e. at the information level. Its concept is closely related to quantitative theories, and leads to corresponding measures of uncertainty, or uncertainty-based information. This typology distinguishes three main types of uncertainty, namely fuzziness, nonspecificity, and discord. In this paper, we adopt the circular typology of uncertainty proposed in [5]. This typology is based on that of Klir and Wierman; see Figure 2.
Figure 2. Circular typology of uncertainty.
In the framework of evidence theory, the belief function can model both nonspecificity and discord. The fuzzy set theory, representing and managing vague information, deals with fuzziness and nonspecificity as the main kinds of uncertainty. The most adequate framework for representing uncertainty when dealing with all three kinds of uncertainty is the combination of the evidence and fuzzy set theory, i.e. fuzzy evidence theory [5]. Here, we briefly provide the theoretical basics of fuzzy evidence theory. A more detailed description of the fuzzy evidence theory can also be found in [5] and references therein. Let X be a frame of discernment containing N distinct objects, P(X) the power set of X, and let x ∈ X be any element of X. Let Bel(A) = Σ_{B ⊆ A} m(B) be a belief function, and m : P(X) → [0, 1] a basic probability assignment (BPA), as defined in evidence theory. If the set A is defined as a fuzzy set Ã (with the membership μ_Ã(x), ∀x ∈ Ã), then the BPA, m, becomes the fuzzy BPA, m̃, in fuzzy evidence theory. The belief function in the fuzzy event is given by

Bel(Ã) = Σ_{B̃ ⊆ X} I(Ã ⊂ B̃) m̃(B̃)   (1)

where I(Ã ⊂ B̃) is the inclusion in the power set of X, P(X). The pignistic probabilities, when Ã reduces to a singleton (i.e. when Ã(x) = 1 for a single x ∈ X, and 0 elsewhere), are extended to
BetP_m(x) = Σ_{x ∈ B̃ ⊆ X} m̃(B̃) B̃(x) / |B̃|   (2)

The appropriate choice of framework to represent uncertainty may improve the quality of the information in a fusion process. Furthermore, the quality of information can be measured by the reduction of uncertainty [6]. Various theories in the field of generalized information theory provide methods for measuring uncertainty, and they are summarized in [7]. We refer to the results in [5], where a general measure of uncertainty (GM) is used to quantify, in aggregate fashion, the total uncertainty of a system that is based on fuzzy evidence theory. Moreover, the GM is used for artificially reducing the uncertainty of a fuzzy BPA.
Definition 1. Let m̃ be a fuzzy BPA defined on a finite frame of discernment X. The General Measure of Uncertainty of m̃ is defined by

GM(m̃) = − Σ_{x ∈ X} [ BetP_m(x) log₂ BetP_m(x) + BetP̄_m(x) log₂ BetP̄_m(x) ]   (3)

where

BetP_m(x) = Σ_{x ∈ Ã ⊆ X} m̃(Ã) Ã(x) / |Ã|   (4)

BetP̄_m(x) = Σ_{x ∈ Ã ⊆ X} m̃(Ã) (1 − Ã(x)) / |Ã|   (5)
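A small numerical illustration of Definition 1 may help; the sketch below evaluates Eqs. (3)-(5) for a hand-made fuzzy BPA. The frame of discernment, membership values and masses are invented for the example (they are not taken from [5]), and the fuzzy cardinality |A| is read as the sigma-count Σ_x A(x), which is an assumption.

```python
import numpy as np

# Frame of discernment with three elements; each fuzzy focal element is given by
# its membership vector over X.  Frame, memberships and masses are invented here.
X = ["a", "b", "c"]
fuzzy_bpa = [
    (np.array([1.0, 0.6, 0.0]), 0.7),   # fuzzy focal element A1, mass 0.7
    (np.array([0.0, 0.3, 1.0]), 0.3),   # fuzzy focal element A2, mass 0.3
]

def xlog2x(p):
    out = np.zeros_like(p)
    nz = p > 0
    out[nz] = p[nz] * np.log2(p[nz])
    return out

def pignistic(bpa, n):
    """Extended pignistic probabilities of Eqs. (4)-(5); |A| is read as the
    sigma-count sum_x A(x), and "x in A" as A(x) > 0 (both are assumptions)."""
    bet, bet_bar = np.zeros(n), np.zeros(n)
    for A, m in bpa:
        card = A.sum()
        for i in range(n):
            if A[i] > 0:
                bet[i] += m * A[i] / card
                bet_bar[i] += m * (1.0 - A[i]) / card
    return bet, bet_bar

def general_measure(bpa, n):
    """General Measure of Uncertainty of Eq. (3), with 0*log2(0) taken as 0."""
    bet, bet_bar = pignistic(bpa, n)
    return -np.sum(xlog2x(bet) + xlog2x(bet_bar))

print(general_measure(fuzzy_bpa, len(X)))
```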
Three basic operations for artificially reducing the uncertainty of a fuzzy BPA using the GM are proposed: (1) defuzzification, (2) specification, and (3) accordance. The defuzzification transforms a fuzzy BPA into a crisp one. When applied to a fuzzy set, it gives a crisp set, while when applied to a fuzzy probability distribution, it gives a classical probability distribution. The specification transforms a fuzzy BPA into a fuzzy probability distribution. When applied to a fuzzy set, specification gives a nonspecific fuzzy set, while when applied to a crisp set, it gives a singleton. The accordance transforms a fuzzy BPA into a fuzzy set. When applied to a fuzzy probability distribution, it gives a nonspecific fuzzy set, while when applied to a classical probability distribution, accordance gives a singleton. The GM belongs to the group of quantitative methods for representing and measuring uncertainty. Uncertainty can also be represented using qualitative methods, see [8] and references therein.
2.2. Reliability
In reality, the information sources which are used in the fusion process (e.g. sensors) may not be completely reliable or may not have the same reliability. Hence, to obtain a better representation of the real world one needs to account for the reliability. The concepts and strategies of incorporating reliability into the fusion process have been summarized in [9]. According to [9], the reliability of information is closely related to
the modeling of uncertainty of a source of information and the difficulty of finding an adequate belief model to describe the uncertainty. Additionally, the models may come from different uncertainty theoretical frameworks due to dealing with different sources of information.
2.2.1. Reliability as Higher Order of Uncertainty
Recommended metrics for accounting for the ranges of validity and other limitations of a chosen belief model for each information source are the reliability coefficients. In this context, the reliability coefficients represent uncertainty of the evaluation of uncertainty, also called the second (or higher) order of uncertainty. They are considered as measures of the adequacy of the model which is used and of the state of the observed environment. There are two approaches to account for the reliability as the higher order of uncertainty: (1) representing reliability as relative stability, i.e. by measuring the performance of each information source, and (2) by measuring the accuracy of the predicted beliefs, where the reliability coefficients represent the adequacy of each belief model with respect to the reality. Here, as in [9], we present only the second approach. Let s_i, i = 1, . . . , I be data produced by I sources, and Θ = {θ_1, . . . , θ_N} be a set of events under consideration. It is assumed that there is a model M, which utilizes the data and the prior information to provide us with the degree of belief x_i in the event A ∈ Θ: x_i(A), i = 1, . . . , I. These degrees of belief take values in a real interval and are modeled within a chosen theoretical framework used to represent uncertainty. The degree of belief based on fused information which takes into account the reliability is defined by the fusion operator F_R(x_1, . . . , x_I, R_1, . . . , R_I), where R_i ∈ [0, 1], i = 1, . . . , I represent the reliability coefficients. R_i is close to zero if the source i is unreliable and close to 1 if the source is reliable. The reliability coefficients do not only depend on a selected belief model, but also on the characteristics of the environment and a domain of the input. Hence, the reliability coefficients can be written as R_i = R(M_i, γ, Y), where M_i is the model chosen for the source i, while γ and Y are the vectors of parameters which characterize the external and internal environments, respectively.
2.2.2. Fusion Strategies
The fusion operator, F_R = F(x_1, . . . , x_I, R_1, . . . , R_I), depends on the global knowledge about the information sources, the environment and the chosen belief model, each possibly providing different information about the reliability. According to the level of knowledge we have about the information sources, the following situations can be distinguished [9]: 1. It is possible to assign a numerical degree of reliability to each source. In this case, each reliability value must be "relative" or "absolute", i.e. the reliability values may or may not be linked by an equation such as Σ_i R_i = 1. 2. The order of the reliabilities of the sources is known, but not their precise values. 3. A subset of sources is reliable, but we do not know which one. To employ the knowledge in the situations above, the following strategies may be used: a. Explicitly utilizing the reliability of sources. b. Identifying the quality of input data of the fusion process and eliminating the data of poor reliability. A minimal sketch of strategy (a) is given below.
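The following sketch illustrates strategy (a) in an evidence-theory setting: each source's BPA is weakened by its reliability coefficient R_i through the classical discounting operation, and the discounted BPAs are then combined. This is one common realization of reliability-aware fusion, not necessarily the operator F_R intended in [9]; the frame, masses and coefficients are invented.

```python
from itertools import product

THETA = frozenset({"target", "clutter"})

def discount(bpa, R):
    """Shafer discounting: weaken a BPA by reliability R in [0, 1]."""
    out = {A: R * m for A, m in bpa.items() if A != THETA}
    out[THETA] = 1.0 - R + R * bpa.get(THETA, 0.0)
    return out

def dempster(b1, b2):
    """Dempster's rule of combination for two BPAs over the same frame."""
    fused, conflict = {}, 0.0
    for (A, mA), (B, mB) in product(b1.items(), b2.items()):
        C = A & B
        if C:
            fused[C] = fused.get(C, 0.0) + mA * mB
        else:
            conflict += mA * mB
    return {A: m / (1.0 - conflict) for A, m in fused.items()}

# Two sources with invented masses and reliability coefficients.
s1 = {frozenset({"target"}): 0.8, THETA: 0.2}
s2 = {frozenset({"clutter"}): 0.6, THETA: 0.4}
fused = dempster(discount(s1, 0.9), discount(s2, 0.5))
print(fused)
```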
2.2.3. Reliability Coefficients
The major issue in building the fusion operator, F_R, is modeling the reliability coefficients, R_i. Their models may be constructed using the domain knowledge of external sources and the contextual information, learned from training data (e.g. in neural networks), or defined as a function of agreement between different sources, or between sources and fusion results.
2.3. Completeness
Several descriptions of completeness as an aspect of imperfect information can be found in the literature, see [11], and they all relate to the deficiency of information. The deficiency is a property that results from incompleteness of what concerns the user (or the fusion system). Even incomplete information can sometimes be sufficient from the user's point of view. In the structured thesaurus of imperfection of Smets [11], the word incomplete is mentioned in the context of imprecision and data without error, but missing (i.e. not present, although expected). A problem of completeness in the evidence theory (i.e. updating beliefs with incomplete information) was raised in [12]. The process of updating probabilities with observations that are incomplete, or set-valued, requires knowledge of the incompleteness mechanism, or so-called protocol, which turns a complete observation into an incomplete one. The results in [12] show that neglecting the incompleteness mechanism leads to naive conditioning that is generally prone to failure. Nevertheless, it is also observed that the protocols do not always exist in the practical applications of probability or evidence theory. Recently, it has been shown in [13] that commonly used strategies for updating beliefs fail, except under very special assumptions. It has also been confirmed that the incompleteness mechanism may be unknown, or difficult to model, and that the condition of coarsening at random (CAR), which guarantees that naive updating produces a correct result, does not hold frequently. In [14] a new method for updating beliefs with incomplete observations, which makes no assumptions about the incompleteness, is proposed. The ignorance about the probability of the missing measurement A is modeled by a vacuous lower prevision, a tool from the theory of imprecise probabilities. Without loss of generality, the vacuous lower prevision can be considered as equivalent to the set of all distributions (i.e. it makes all the incompleteness mechanisms possible a priori). Only coherence arguments are used to update the probabilities. The model for the incompleteness mechanism is applied to the special case of the classification problem using Bayesian networks. A definition of completeness, more confined to the case of information fusion for situation awareness and decision making, is given in [15]. The completeness is defined as the degree to which the information is not missing with respect to the relevant ground truth. In this context, having the information about all the relevant features of interest means that the information is complete. Here, the term relevant implies that the completeness depends on the situation, the command level, and the scale of operation. The completeness is assessed within a so-called information domain which includes the information obtained from the sensor sources, the fusion and the communication networks. The completeness of information obtained from the sensors is related to the sensor detection and the ability of the sensor suite to cover the area of operation (AO).
2.4. Relevance
A problem may arise when the amount of available information in a fusion system grows beyond its capacity. Too much information usually results in degradation of system performance, which is usually due to the computational complexity of reasoning. Poor performance leads to an unsuccessful fusion process. Hence, it is important to be able to determine which information is relevant to a particular fusion task and what can be ignored without compromising the resulting fused information. A significant amount of research about relevance of information has been reported in the research fields of information retrieval (IR) systems and query answering (QA) tasks, see [11] and references therein. The question of relevance has been raised in [18] as one of the major problems in upgrading a search engine to a QA system. The latter is considered a very complex problem and far from solution. Nevertheless, it is suggested that relevance should be treated as a matter of degree, i.e. as a fuzzy concept. In [15], the relevance is defined as the proportion of collected information that is related to the task at hand, meaning that, like completeness, it is also context dependent. Despite the amount of published work on relevance, it seems that there is no cointensive definition of relevance in the literature [18]. In the presence of uncertainty, the uncertainty representation will determine the approach to assess relevance. In the following paragraphs, we present two methods for assessing relevance within the quantitative uncertainty framework. They consider two important issues in information fusion systems: the temporal relevance (e.g. time of the measurement arrival) and the added value of sensor reports. The description of methods for representing conditional ignorance and informational relevance in symbolic entropy theory, and methods for extracting the best relevant information within the qualitative uncertainty framework, can be found in [19].
2.4.1. Relevance Measures
In dynamic uncertain domains two classes of irrelevant information, namely, mutually independent beliefs and conditionally independent beliefs, are considered as independent information and accounted for accordingly. The third class of irrelevant information includes information that becomes obsolete with time due to the uncertain dynamic nature of change, i.e. the relevance of such information degenerates in time. It has been shown in [16] that this degeneration occurs in probabilistic temporal reasoning. A weaker temporal relevance criterion, which represents a degree-of-relevance measure called temporal extraneousness and captures this relevance, is defined in [16].
Definition 2. If the maximum effect of information Θ at time t_j on belief q_l at time t_i is less than a small value δ, then t_i and t_j are temporally extraneous with respect to q_l. The extraneousness level δ is met when the inequality |P(q_li | Θ_j) − P(q_li | ¬Θ_j)| ≤ δ holds.
The strength of the degree of relevance as measured by the temporal extraneousness can change according to the value of δ. A δ value of zero results in the strong irrelevance notion of probabilistic independence. It has been shown that the efficiency of probabilistic temporal reasoning can be improved by ignoring irrelevant and weakly relevant information. Another notion of relevance has been provided in [17]. This notion of relevance corresponds to the uncertainty as defined in Section 2.1. If the amount of uncertainty
of information is large, then the information is considered irrelevant. The amount of uncertainty remaining about the measurement x after the measurement y is observed can be represented by the conditional entropy, H(x|y). Therefore, the relevance of y can be measured using the conditional entropy, which is defined as

H(x|y) = H(x, y) − H(y)   (6)

where 0 ≤ H(x|y) ≤ H(x), and H(x, y) is the joint entropy of observations x and y. To distinguish which sensor gives more accurate observations, another measure, the mutual information I(x, y), is used. The mutual information represents the amount of uncertainty that is resolved by observing y and is defined as

I(x, y) = H(x) − H(x|y)   (7)
The methodology described in [17] is applied to tracking of multiple targets by a network of radar sensors. It has been shown that it contributes to improving the decision accuracy in the current network node. This further helps in determining whether the sensor is functional or how much to weigh the decision of a neighboring node, which are important issues for data fusion. This is done using the lower and the upper bounds and the probabilistic fusion framework as described in [17].
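Both measures can be computed directly from a joint probability table of two observations; the sketch below evaluates Eqs. (6) and (7) for an invented discretized joint distribution.

```python
import numpy as np

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def relevance_measures(p_xy):
    """Conditional entropy H(x|y) (Eq. 6) and mutual information I(x, y) (Eq. 7)
    from a joint probability table p_xy with rows indexed by x, columns by y."""
    H_xy = entropy(p_xy.ravel())          # joint entropy H(x, y)
    H_y = entropy(p_xy.sum(axis=0))       # marginal entropy H(y)
    H_x = entropy(p_xy.sum(axis=1))       # marginal entropy H(x)
    H_x_given_y = H_xy - H_y              # Eq. (6)
    I_xy = H_x - H_x_given_y              # Eq. (7)
    return H_x_given_y, I_xy

# Invented joint distribution over 3 x-bins and 3 y-bins (entries sum to 1).
p = np.array([[0.20, 0.05, 0.00],
              [0.05, 0.25, 0.05],
              [0.00, 0.05, 0.35]])
print(relevance_measures(p))
```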
3. Conclusions
To completely account for the quality of information in the operational multi-source fusion environment for decision making, one needs to account for all aspects of information as the information moves through the informational value chain of the fusion process. It is possible to identify four main aspects of information quality, namely, uncertainty, reliability, completeness, and relevance. Each aspect can be loosely coupled with a different component of the fusion system. In this paper we provided a summary of existing descriptions of the quantitative measures and metrics for these four aspects of information. In designing such measures, one may assume a particular theoretical framework for representing the processes that produce information (e.g. sensors, fusion operators, networks), such as fuzzy evidence theory and the corresponding GM [5]. However, as reported in [15], this assumption may not be necessary, and one may not be concerned about how the information is transformed as it moves through the fusion system. In the latter case, the quality of the information processing is expressed through quality metrics in the form of parameters and alternative probability functions. The choice of measures for each individual information property, and the choice of strategy to incorporate that property in the fusion process, depend on the particular fusion application. Accounting for uncertainty, reliability, completeness, and relevance of information in the multi-source fusion environment will contribute to a better understanding of the information and hence improve the representation of the reality. A more accurate representation of reality will contribute to improved decision making. The measures of individual aspects of information quality within the fusion environment may also facilitate the development of measures of performance of a fusion system.
References
[1] A.-L. Jousselme, P. Maupin, and E. Bossé, Uncertainty in situation analysis perspective, Proceedings of the 6th International Conference on Information Fusion, 23(2):728–736, July 2003.
[2] D. L. Hall and S. A. H. McMullen, Mathematical techniques in multisensor data fusion, Artech House, 2004.
[3] J. Y. Halpern, Reasoning about uncertainty, MIT Press, 2003.
[4] G. J. Klir and M. J. Wierman, Uncertainty based information, Vol. 15 of Studies of Fuzziness and Soft Computing, Physica-Verlag, NY, 2nd edition, 1999.
[5] C. Liu, A.-L. Jousselme, D. Grenier and E. Bossé, A general measure of uncertainty framed into the fuzzy evidence theory, IEEE Transactions on Systems, Man and Cybernetics - Part A: Humans and Systems, 2005. In review.
[6] G. J. Klir and T. A. Folger, Fuzzy Sets, Uncertainty and Information, Prentice Hall, NJ, 1988.
[7] C. Liu, A general measure of uncertainty-based information, Ph.D. Thesis submitted to the Dept. of Electrical and Software Engineering, Laval University, 2004.
[8] S. Parsons, Qualitative methods for reasoning under uncertainty, MIT Press, 2001.
[9] G. L. Rogova and V. Nimier, Reliability in Information Fusion: Literature Survey, Proceedings of the 7th International Conference of Information Fusion, pp. 1158-1165, 2004.
[10] D. Dubois and H. Prade, Combination of Fuzzy Information in the Framework of Possibility Theory, in Data Fusion in Robotics and Machine Learning, M. A. Abidi and R. C. Gonzales, editors, Academic Press, 1992.
[11] A. Motro and P. Smets, editors, Uncertainty management in information systems: from needs to solutions, Kluwer Academic Publishers, 1997.
[12] G. Shafer, Conditional probability, International Statistical Review, 53:261–277, 1985.
[13] P. D. Grunwald and J. Y. Halpern, Updating probabilities, Journal of Artificial Intelligence, pp. 243-278, 2003.
[14] G. De Cooman and M. Zaffalon, Updating beliefs with incomplete observations, Journal of Artificial Intelligence Research, Vol. 159, 1-2, Nov. 2004.
[15] W. Perry, D. Signori, and J. Boon, Exploring Information Superiority: A Methodology for Measuring the Quality of Information and Its Impact on Shared Awareness, RAND Corporation Report, 2004.
[16] A. Y. Tawfik and E. M. Neufeld, Irrelevance in uncertain temporal reasoning, Proc. of the 3rd Intl. IEEE Workshop on Temporal Representation and Reasoning, pp. 196-202, 1996.
[17] S. Kadambe, Sensor/data fusion based on value of information, Proc. of the 6th International Conference on Information Fusion, pp. 25-32, 2003.
[18] L. Zadeh, From search engines to QA systems; the problems of world knowledge, relevance and deduction, WSEAS Conference, 2005.
[19] M. Chachoua and D. Pacholzyk, Qualitative reasoning under ignorance and information relevance extraction, Knowledge and Information Systems, Vol. 4, Issue 4, pp. 483-506, 2002.
Advances and Challenges in Multisensor Data and Information Processing E. Lefebvre (Ed.) IOS Press, 2007 © 2007 IOS Press. All rights reserved.
Polarimetric Features and Contextual Information Fusion for Automatic Target Detection and Recognition
Yannick ALLARD a, Mickael GERMAIN b, Olivier BONNEAU b
a Research and Development Department, Lockheed Martin Canada,
b Centre de Recherche en Mathematiques, Universite de Montreal,
[email protected],
[email protected]
Abstract. Several studies have already shown that remote sensing imagery would provide valuable information for area surveillance missions and activity monitoring and that its combination with contextual information could significantly improve the performance of target detection/target recognition (TD/TR) algorithms. In the context of surveillance missions, spaceborne synthetic aperture radars (SARs) are particularly useful due to their ability to operate day and night under any sky condition. Conventional SARs operate with a single polarization channel, while recent and future spaceborne SARs (Envisat ASAR, Radarsat-2) will offer the possibility to use multiple polarization channels. Standard target detection approaches on SAR images consist of the application of a constant false alarm rate (CFAR) detector and usually produce a large number of false alarms. This large number of false alarms prohibits their manual rejection. However, over the past ten years a number of algorithms have been proposed to extract information from a polarimetric SAR scattering matrix to enhance and/or characterize man-made objects. The evidential fusion of such information can lead to the automatic rejection of the false alarms generated by the CFAR detector. In addition, the aforementioned information can lead to a better characterization of the detected targets. In the case of more challenging backgrounds, such as ground-based target detection, the use of higher level information such as context can help in the removal of false alarms. This paper will discuss the use of polarimetric information for target detection using polarimetric SAR imagery as well as the benefit of contextual information fusion for ground-based target detection.
Key words: polarimetric SAR, target detection, contextual information, evidential fusion.
Introduction Remote sensing imagery, due to its large spatial coverage, enables the monitoring of large areas and provides valuable information in the context of area surveillance and activity monitoring. The next generation of sensors that will likely be used for this particular task consists primarily of high-resolution Polarimetric SAR (PolSAR), Hyperspectral Imagery (HSI) and high-resolution optical systems. These sensors will provide a large amount of data and will require the development of tools and methodologies to automatically analyze them and extract meaningful information. For the particular tasks of target detection and area monitoring, PolSAR sensors have the
advantage of being independent of solar illumination and are very sensitive to the presence of man-made objects. During the last decade, many algorithms were developed for point target detection and characterization on polarimetric data. However, these polarimetric features cannot provide all the information needed to achieve target detection with an acceptable level of false alarms in the case of very challenging backgrounds. One challenge for the particular task of ground-based object detection and discrimination is the adequate use of contextual information. This paper discusses the task of target detection and characterization using PolSAR imagery for application in wide area surveillance. In the next section, conventional target detection methodology on SAR imagery is described. Section 2 presents the more commonly used polarimetric features that can be used for point target detection using polarimetric SAR imagery. The use of contextual information as an aid for target detection and discrimination is discussed in Section 3 while Section 4 presents examples of applications of polarimetric features fusion for target detection and characterization for maritime and ground area surveillance. Finally, conclusions are drawn in Section 5.
1. Target Detection using SAR Imagery Conventional systems for target detection and recognition on SAR imagery usually consist of five stages (Figure 1).
Figure 1. Stages of a target detection and recognition system
The detection stage consists of determining the presence of a target signature at a particular position in the image. This task is usually achieved with the application of a two-parameter Constant False Alarm Rate (CFAR) detector. However, this CFAR detection step generates numerous false detections, making it impossible to perform manual rejection, especially in the case of a very challenging background. On the other hand, target discrimination may be seen as the binary classification into target versus non-target. This paper focuses on these two tasks of the target detection and recognition scheme. Target characterization using polarimetric information will be briefly discussed.
1.1. Constant False Alarm Rate (CFAR) Detection
The target detection stage selects areas with a high probability of containing targets. The detector must be computationally simple and should provide a high probability of detection while creating at the same time as few False Alarms (FAs) as possible; an FA being a detection that corresponds to a clutter region. One of the most widely used prescreeners in SAR target detection is the two-parameter CFAR detector, which is based on a normalized test of the pixel intensity versus its local neighborhood. Figure 2 represents the typical window of analysis of a CFAR detector. The moving window is composed of a test pixel surrounded by a guard ring to prevent any influence of the target on the boundary ring, which is used to compute the necessary statistics. The
popularity of such a detector is due to a good compromise between simplicity and performance.
Figure 2. Principle of a CFAR Detector
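The principle of Figure 2 can be sketched as follows: for every test pixel, the mean and standard deviation are estimated from the boundary ring (the guard ring and the test pixel are excluded), and a detection is declared when the normalized intensity exceeds a threshold. The window sizes, threshold and simulated clutter below are arbitrary choices for illustration, and the plain double loop is written for clarity rather than speed.

```python
import numpy as np

def cfar_two_parameter(img, guard=2, boundary=4, threshold=3.0):
    """Two-parameter CFAR: detect pixels whose intensity exceeds the local
    boundary-ring mean by `threshold` local standard deviations."""
    half = guard + boundary
    det = np.zeros_like(img, dtype=bool)
    rows, cols = img.shape
    for r in range(half, rows - half):
        for c in range(half, cols - half):
            window = img[r - half:r + half + 1, c - half:c + half + 1].copy()
            # Exclude the guard region (and the test pixel) from the statistics.
            window[boundary:-boundary, boundary:-boundary] = np.nan
            mu = np.nanmean(window)
            sigma = np.nanstd(window)
            if sigma > 0 and (img[r, c] - mu) / sigma > threshold:
                det[r, c] = True
    return det

# Toy example: Rayleigh-distributed clutter with one bright point target.
rng = np.random.default_rng(1)
scene = rng.rayleigh(scale=1.0, size=(64, 64))
scene[32, 32] += 15.0
print(np.argwhere(cfar_two_parameter(scene)))   # detections should include (32, 32)
```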
However, the discriminating power of the two-parameter CFAR is not enough to reduce the false alarms to an acceptable level. The incorporation of complementary features such as polarimetric features should help the target detection system provide a more reliable result. The next section describes some of the polarimetric features that can be used in the target discrimination stage to reduce the number of FAs to an acceptable level.
2. Polarimetric Features for Target Discrimination
The detection stage highlights potential targets that must pass through a discrimination stage intended to reject false alarms based on geometrical and electromagnetic properties. Polarimetric features introduce the means to tackle the task of target discrimination more effectively. With the current airborne and upcoming spaceborne polarimetric SARs (Radarsat-2), polarimetric decomposition algorithms can be used to remove false detections in the target discrimination stage. Usually, target discrimination is only applied in the Region Of Interest (ROI) and more computationally demanding algorithms, such as the polarimetric decompositions, can be applied for that task. Amongst the large number of available polarimetric decompositions, the following are the most interesting for point target detection, discrimination and characterization:
x The Odd/Even basis decomposition (4)
x Cameron's Coherent Target Decomposition (CTD) (2)
x Polarization anisotropy (7)
x Symmetric Scattering Characterization Method (SSCM) (6)
x Subaperture Coherence (5).
For a complete description of these algorithms the reader should refer to the appropriate publications. Examples of these polarimetric features computed over maritime and ground areas are provided in the corresponding presentation.
3. Contextual Information for Target Detection and Discrimination In the case of very challenging backgrounds, such as ground-based target detection and discrimination, the previously mentioned polarimetric features can still be used to
reduce the number of false detections. However, for such targets, object-centred approaches, which are largely used in the ATR community, have some drawbacks when the image suffers some degradation or when the target is small compared to the sensor's resolution. In these cases, not enough local evidence can be extracted to ensure a reliable detection and recognition. In the absence of local evidence, the scene structure and a priori knowledge should provide the information for efficient detection and recognition. The background will therefore be considered as an indicator of an object's presence and properties and not as a potential distractor. Context information may be captured through a wide variety of methods such as:
x Well known pixel/object-based labelling techniques introducing dependencies through neighbouring pixel/region relationships
x Temporal data revisit
x Fusion (pixel, features) provided by other sensors
x Geographical Information System (GIS) thematic maps
x Other valuable knowledge sources for interpretation: meteorological data, tides timetables…
In our example of application, we used a previously interpreted Ikonos image to model the context using topological relationships such as being on, near to or far from a certain land cover type. This contextual information can be used in the target detection task by modifying the false alarm probability of a CFAR detector (1) or in the target discrimination step, by performing context-based false alarm mitigation.
4. Examples of Application
This section presents practical examples of target detection and discrimination using polarimetric features computed over PolSAR imagery. Two cases are discussed: maritime surveillance and ground-based target detection.
4.1. Maritime Surveillance
When performing ship detection on a SAR image using a CFAR detector, many false alarms (typically from 3–10 to more than a hundred) are generated. These false alarms are mainly caused by the sea state, small fishing boats, icebergs, etc. In this case, using polarimetric information should be beneficial in removing a large number of false alarms due to a not-so-challenging background. The method we have chosen to demonstrate this is the evidential fusion of polarimetric information in the CFAR contacts to validate or discard the ROI.
4.1.1. Evidential Fusion of Polarimetric Features for False Alarm Mitigation
As mentioned earlier, because of the computational burden of the polarimetric decomposition and fusion algorithms, these are only applied in the ROIs that survive the CFAR detection step. The fusion of polarimetric features is achieved using Dempster-Shafer's evidence theory (3). This framework offers us the advantages of an easy modelization of imprecision and uncertainty in the reasoning process and takes into account compound hypotheses, which is particularly useful since our polarimetric features are unable to discriminate in a precise manner all the objects of interest.
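As described in the next paragraphs, mass functions (trapezoidal ones for continuous features) are assigned to each polarimetric feature before the evidential combination. The sketch below shows such an assignment for a single continuous feature; the feature name, breakpoints and mass ceiling are invented for illustration and are not the values used by the authors. The resulting BPAs would then be combined with Dempster's rule (3).

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 below a, ramp up to 1 on [b, c], 0 above d."""
    if x <= a or x >= d:
        return 0.0
    if b <= x <= c:
        return 1.0
    return (x - a) / (b - a) if x < b else (d - x) / (d - c)

def masses_from_coherence(coh):
    """Map a continuous polarimetric feature (here, a subaperture coherence value
    in [0, 1]) to masses on {target}, {clutter} and the ignorance Theta.
    Breakpoints and the 0.9 mass ceiling are invented for illustration."""
    m_target = 0.9 * trapezoid(coh, 0.5, 0.7, 1.0, 1.01)    # high coherence -> man-made
    m_clutter = 0.9 * trapezoid(coh, -0.01, 0.0, 0.3, 0.5)  # low coherence -> clutter
    m_theta = 1.0 - m_target - m_clutter                    # remaining mass: ignorance
    return {"target": m_target, "clutter": m_clutter, "Theta": m_theta}

for coh in (0.1, 0.45, 0.85):
    print(coh, masses_from_coherence(coh))
```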
To fuse polarimetric information, it is mandatory to define mass functions for each feature. We use trapezoidal mass functions for each "continuous" feature (e.g., subaperture coherence) or a hard confidence if the feature is a hard decision provided by an algorithm (e.g., binary classification of coherent and non-coherent point targets). The mass functions are assigned using our knowledge about each feature. In the case of continuous features, the overlapping parts of the trapezoidal mass functions do not allow us to choose their parameters very precisely (8).
4.2. Ground-based Target Detection
Consider now the case of ground-based area monitoring using remotely sensed imagery. In this case, false alarms are more likely to happen for numerous reasons, such as speckle noise, smaller target size with regard to a sensor's resolution, man-made returns from any metallic object, commercial vehicular traffic, etc. In addition, the possibility of a target's camouflage is present and adds to the already difficult task of target detection. Given all these potential distractors, the task of ground-based target discrimination is more complex than its maritime counterpart. However, the use of contextual information should reduce the number of false detections to an acceptable level.
4.2.1. Integration of Contextual Information
Contextual information can be used in the first two steps of an ATR scheme. It can be used either directly in the detection stage during pre-screening or in the discrimination step by performing context-based false alarm mitigation. During pre-screening, one will seek to use contextual information to vary the probability of false detections on a per-pixel basis according to the land-cover features present in the scene under analysis and the distance of a particular pixel with regard to these features. If we suppose that interpreted imagery or GIS information layers are available, it is possible to use a priori knowledge about the terrain type and edge positions to modify the PFA of a CFAR detector to reflect the military behavior of the target we expect to detect. As an example, one could state that a target would prefer being close to a forest boundary to allow protection on one flank and camouflage. Proximity to a means of transportation should also be favored for displacement reasons. One can use all these subjective assumptions of target behavior to modify the false alarm rate of the CFAR detector on a per-pixel basis. Knowledge of the land cover types present in a SAR scene can also be used to remove false alarms that occurred in regions where a target cannot be detected according to the sensor or target properties. This process is called context-based false alarm mitigation. In addition, the knowledge of land cover types enables a system to extract the target's context and infer some of the target's properties.
4.3. Target Characterization
The task of target characterization aims at extracting target features and recognizing targets from polarimetric SAR images. The target's length extraction and characterization is a complex problem to resolve, especially for ships, due to various problems such as uncontrolled environments, variable image acquisition geometry and
resolution, focus problems and the dependence of the radar scattering on a target's orientation.
4.3.1. Length and Orientation Estimation
Once a target is detected and segmented from the imagery, one of the tasks in target characterization is to estimate the length and the orientation of the target. One of the commonly used methods to extract this information is the Hough transform. The Hough transform computes the target's centreline, and the length and orientation are computed using the end-points of the line, assuming that the sample spacing in range and azimuth are known.
4.3.2. Characterization using Polarimetric Features
The task of target characterization aims at extracting target features and recognizing targets from polarimetric SAR images. When a target is detected, the polarimetric information can be used to characterize the target and/or detect and identify superstructures on the target. One way to use the polarimetric features in such a task is to analyze the distribution of the elemental scatterers in different portions of the target. To do so, the detected target is segmented into a certain number of parts, and each part is subject to an analysis to detect potential structures. We used the scatterer type derived from the Cameron decomposition in an attempt to characterize the detected target. However, despite highlighting the difference in shape distribution between the target and its surroundings, the limited available datasets render a detailed study and analysis of the target's shape composition impossible. The length of a target is therefore the major source of information for target identification.
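The centreline extraction step can be sketched with a small, self-contained Hough accumulator: every pixel of the segmented target mask votes for (theta, rho) line parameters, the accumulator maximum gives the orientation, and the extent of the target pixels along that direction, scaled by the known sample spacing, gives the length. This is an illustrative re-implementation, not the authors' processing chain; the mask and spacing in the toy example are invented.

```python
import numpy as np

def hough_length_orientation(mask, spacing=1.0, n_theta=180):
    """Estimate centreline orientation (degrees) and target length from a binary
    target mask with a basic Hough line accumulator.  Assumes equal sample
    spacing in range and azimuth."""
    ys, xs = np.nonzero(mask)
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    diag = int(np.ceil(np.hypot(*mask.shape)))
    acc = np.zeros((n_theta, 2 * diag + 1), dtype=int)
    for t, theta in enumerate(thetas):
        rho = np.round(xs * np.cos(theta) + ys * np.sin(theta)).astype(int) + diag
        np.add.at(acc, (t, rho), 1)      # every target pixel votes for (theta, rho)
    theta = thetas[np.unravel_index(np.argmax(acc), acc.shape)[0]]
    # Project the target pixels onto the dominant line direction to get the extent.
    direction = np.array([np.sin(theta), -np.cos(theta)])
    proj = (xs * direction[0] + ys * direction[1]) * spacing
    return np.degrees(theta), proj.max() - proj.min()

# Toy example: a thin diagonal "ship" in a 64 x 64 mask with 2 m sample spacing.
mask = np.zeros((64, 64), dtype=bool)
idx = np.arange(10, 50)
mask[idx, idx] = True
print(hough_length_orientation(mask, spacing=2.0))
```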
5. Conclusion The next generation of spaceborne and airborne imaging sensors will increase the role of remote sensing imagery for wide area surveillance and monitoring. However, due to this growing amount of available data, it will be necessary to develop automated tools to help the image analyst in his or her task. The image analysis application considered in this lecture was an application directed toward automatic detection and characterization of targets using SAR imagery. As shown, the detection and discrimination performances of an automatic system are better when using polarimetric SAR data than using only its single channel counterpart. The polarimetric nature of the data provides additional features of interest for ship and ground target detection/recognition. The evidential fusion of these polarimetric features within the CFAR contacts can eliminate many of the false alarms generated by the CFAR detector. Dual polarization should be investigated for detection and false alarm reduction, especially using Subaperture Coherence of the HV channel, and perhaps other features computed from the HH-HV channel. In the case of ground targets, the fusion of polarimetric features alone cannot, in a general case, reduce the number of FA. As we have seen, the use of multiple sensors improves scene description. The integration of contextual information is required to increase performance of ATD/R algorithms, especially in the case of challenging backgrounds, because of the lack of local information to provide reliable object detection and characterization.
Acknowledgment This work was supported by the Canadian Space Agency under the Earth Observation Application Development Program (EOADP) (contract 9F028-3-4910/A) and the Radar Applications and Space Technologies section of the Defence Research and Development Canada – Ottawa (DRDC-O). We would like to acknowledge Space Imaging for some data that were used in this project.
References
[1] Blacknell, D., Contextual information in SAR target detection, IEE Proceedings: Radar, Sonar and Navigation, Vol. 148, Issue 1, pp. 41-47, 2001.
[2] Cameron, W., Youssef, N., Leung, L.K., Simulated polarimetric signatures of primitive geometrical shapes, IEEE Transactions on Geoscience and Remote Sensing, 34(3), pp. 793-803, 1996.
[3] Dempster, A.P., A Generalization of Bayesian Inference, Journal of the Royal Statistical Society, 30, 1968.
[4] Novak, L., Halversen, S., Owirka, G., Hiett, M., Effects of Polarization and Resolution on the Performance of a SAR Automatic Target Recognition System, Lincoln Laboratory Journal, vol. 8, no. 1, pp. 49-68, 1995.
[5] Sourys, J.-C., Henry, C., Adragna, F., On the use of Complex SAR Image Spectral Analysis for Target Detection: Assessment of Polarimetry, IEEE Transactions on Geoscience and Remote Sensing, vol. 41, no. 12, pp. 2725-2734, 2003.
[6] Touzi, R., Charbonneau, F., Characterization of Target Symmetric Scattering Using Polarimetric SARs, IEEE Transactions on Geoscience and Remote Sensing, vol. 40, no. 11, pp. 2507-2516, 2002.
[7] Touzi, R., Charbonneau, F., Hawkins, R. K., Vachon, P. W., Ship Detection and Characterization using Polarimetric SAR, Canadian Journal of Remote Sensing (RADARSAT 2 Special Issue), June 2004.
[8] Tupin, F., Reconnaissance de forme et Analyses de Scène en Imagerie Radar à ouverture Synthétique, Thèse de Doctorat, École Nationale Supérieure des Technologies, Paris, 1997.
Advances and Challenges in Multisensor Data and Information Processing E. Lefebvre (Ed.) IOS Press, 2007 © 2007 IOS Press. All rights reserved.
Enhancing Efficiency of Dynamic Threat Analysis for Combating and Competing Systems
Edward POGOSSIAN a,b, Arsen JAVADYAN a,b, Edgar IVANYAN b
a Academy of Science of Armenia, Institute for Informatics and Automation Problems,
b State Engineering University of Armenia,
[email protected],
[email protected],
[email protected]
Abstract. We study the class of problems where Solutions Spaces are specified by combinatorial Game Trees. A version of Botvinnik’s Intermediate Goals At First (IGAF) algorithm is developed for strategy formation based on common knowledge planning and dynamic plan testing in the corresponding game tree. The algorithm (1) includes a range of knowledge types in the form of goals and rules, (2) demonstrates a strong tendency to increase strategy formation efficiency, and (3) increases the amount of knowledge available to the system.
Key words: game tree, expert knowledge, decision making, intrusion protection, measurement.
Introduction
Many security and competition problems belong to a class where Spaces of Solutions are specified by combinatorial Game Trees (SSGT). Specifically, network Intrusion Protection Optimal Strategy Provision (IP OSP) and Management in oligopoly competitions (MOSP) problems, and chess-like problems – Chess OSP – are examples. Many other security problems such as Computer Terrorism Countermeasures, Disaster Forecast and Prevention, Information Security, and Medical Countermeasures may also be reduced to an SSGT class. To solve SSGT problems we define a class of Decision Making Systems (DMS), Intermediate Goals At First (IGAF) algorithms, based on the following constructions and procedures: (1) a game tree model for the target competition with sub-models of the states, actions, and (contra)actions, (2) the rules to apply (contra)actions to the states and transform them to new ones, (3) descriptors of the goal states, (4) the optimal strategy search procedure with strategy planning units aimed at narrowing the search area in the game tree, (5) the plans quantification, (6) the game tree based dynamic testing strategy, and (7) the best action selection units. IGAF algorithms were successfully tested on the network Intrusion Protection (IP) problem and other SSGT problems. For example, for the IP problem, the IGAF1 version exceeded system administrators and known standard protection systems in about 60% of the experiments on fighting 12 different types of known network attacks [1].
A Linux version of the IGAF algorithm is currently being used for the IP system of the ArmCluster [2]. Pioneering research into strengthening the performance of the chess version of IGAF-like programs, which simulate a chess master's decision making processes through systematic acquisition of human knowledge, was performed in [3, 4, 5] and developed in [6]. In [9] an attempt was undertaken to study the viability of a decision making system with various types of chess knowledge, including an ontology of about 300 concepts. Significant advances in the ontology of the security domain, ontology-based representation of distributed knowledge of agents, a formal grammar of attacks and their application to network IP systems, as well as a comprehensive review of ontology studies in the field, are presented in [7, 8]. Compared with [10, 11, 12], where network-vulnerability analysis is based on finding critical paths in attack graphs, our game tree based model searches counteraction strategies comprising elementary and universal units – elementary procedures, or an alphabet, that the intruder or administrator uses to combine either attacks or defense procedures, respectively. Some of these procedures can coincide, particularly with the elementary attacks of [10, 11, 12]. However, the aim is to find procedures elementary enough to cover the diversity of intruder and defender behaviors, while meaningful enough for human understanding and operations. An alphabetic approach to the representation of attacks and defense operations causes game tree size explosion, which we attempt to overcome using successful computer chess experience. In this paper, we describe our experience in enhancing the effectiveness of the IGAF algorithms. First, we determine the class of SSGT problems and provide examples. Then we discuss whether the SSGT Expert Knowledge (EK) can be adequately simulated and whether models can be regularly used for problem solving. Finally, we describe our approach to measuring the performance of the IGAF algorithms and our experiments on enhancing their performance for the IP OSP problem.
1. Determining the Class of SSGT
SSGT problems are identified in a unified way by game tree constituents, which create the basis for a unified methodology for their resolution. The constituents include, in particular, the list of competing parties and their goals, their actions and (contra)actions, the states of the trees, and the rules for their transformation. For the above problems the game tree constituents are determined as follows:
• The Chess OSP problem: white and black players with checkmate as the goal; chess piece moves as the (contra)actions; and the composition of the chess pieces on the board specifying the game states, which are transformed by actions according to the chess rules.
• The MOSP problem, for the Value War [13] model's interpretation: a company competing against a few others with Return On Investment as the goal; price changes and product quality as the actions;
tree states determined by competition scenarios, i.e., the competition template is formed from the conceptual basis of management theory, with a set of parameters specifying a particular competition in the scenario and the actions of all competing parties; and transformation rules determined by general micro- and macro-economic laws, which, when applied to input states, create new output ones.
• The IP OSP problem: network protection systems, e.g., system administrators or special IP software, combat intruders or network disturbing forces (e.g., hackers or disturbances caused by technical casualties) to keep the network in a safe and stable state; network states are determined by the composition of the current resources vulnerable to network disturbances; actions and (contra)actions are the lists of means able to change resources and therefore transform states [1, 14].
2. Can SSGT EK be Simulated Adequately?
Expert concept approximation with acceptable adequacy might have the following impact on understanding cognition and computers [15]:
• explaining cognitive mechanisms for concept creation and processing;
• highlighting whether human conceptual activity associates imagery operations with an attributive base or whether it can avoid imagery support [16, 17], i.e., understanding "…the act of forming and examining a 'picture in the head'" [16];
• revealing whether computers are able to simulate in a "natural" way an "alive" fragment of human knowledge.
By "natural" computer simulation, we mean the following: if human image processing is an essential part of human mental operations, the answer may involve finding new simulation means to enrich computer operations; otherwise, new effective procedures may need to be developed within the current concept of computers. We expect the latter answer may be a step towards understanding the principal abilities of computers [18], which are universally perceived as the means to simulate systems. Our preliminary analysis of the collected chess knowledge [9] allows us to state that expert knowledge for SSGT problems can be simulated by computers. The analysis is based on Zermelo's reduction of chess knowledge to descriptions of classes of winning positions in the game tree.
3. Can SSGT EK Models be Regularly Used for Problem Solving? Using the above game tree model, we experiment with a variety of algorithms to counteract intrusions. Our IGAF1 is similar to Botvinnik’s chess tree cutting-down algorithm (CTCD). The CTCD is based on the natural hierarchies of goals in control problems and the
assertion that search algorithms become more efficient if they try to achieve subordinate goals before attempting the main ones. The trajectories of the confronting parties towards those subgoals are chained in order to construct around them zones of the most likely actions and counteractions. As a result of the comparative experiments with the minimax and IGAF1 algorithms in [1, 14], we state the following:
• the model, which uses the minimax algorithm, is comparable with experts (system administrators or specialized programs) against intrusions or other forms of base system perturbations;
• the IGAF1 cutting-down tree algorithm, besides being comparable with the minimax algorithm, can also be effectively applied to real IP problems.
Here we consider a more advanced version of the algorithm, Intermediate Goals At First (IGAF2), which is able to: (1) acquire a range of expert knowledge in the form of goals or rules, and (2) increase the efficiency of strategy formation by increasing the amount of expert knowledge available to the algorithm [21]. The following expert goals and rules have been embedded in the IGAF2 algorithm (a sketch of how such rules can steer the search is given after this list).
The goals:
1. the critical vs. normal states are determined by a range of values of the system states; for example, any state of the system whose criterion-function value is greater than or equal to some threshold may be declared a critical goal;
2. the suspicious vs. normal resources are determined by a range of states of the resource classifiers; combinations of classifier values identified as suspicious or normal induce signals for appropriate actions.
The rules:
1. Identify the suspicious resources by classifier and narrow the search to the corresponding game subtree.
2. Avoid critical states and focus on normal ones.
3. Normalize the state of the system. First, try the actions of the defender whose influence on the resources caused the current state change; if this does not work, try other actions.
4. In building a game subtree for suspicious resources, use defending actions able to influence such resources and normal actions until there are no critical states; if some defensive actions were used in previous steps, decrease their usage priority.
5. Balance the resource parameters by keeping them within the given ranges of permitted changes.
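To make the interplay of these goals and rules concrete, the following is a minimal illustrative sketch of a rule-guided, depth-limited game-tree search in the spirit of IGAF2. It is not the authors' implementation: the objects and functions `State`, `classify`, `defender_actions`, `attacker_actions` and their methods are hypothetical placeholders.

```python
# Illustrative sketch only: a depth-limited search that narrows the subtree to
# suspicious resources (rule 1), skips critical states (rule 2), and lowers the
# priority of already-used defensive actions (rule 4). All APIs are hypothetical.

def igaf_search(state, depth, used=frozenset()):
    """Return (value, best_defender_action); a higher value means a safer system state."""
    if depth == 0 or state.is_terminal():
        return state.criterion_value(), None               # goal 1: criterion function

    suspicious = [r for r in state.resources
                  if classify(r) == "suspicious"]           # goal 2 / rule 1
    actions = [a for a in defender_actions(state)
               if not suspicious or a.influences(suspicious)]   # rule 1/4: narrow the subtree
    actions.sort(key=lambda a: a in used)                   # rule 4: deprioritize reused actions

    best_val, best_act = float("-inf"), None
    for a in actions:
        after_defence = state.apply(a)
        if after_defence.is_critical():                     # rule 2: avoid critical states
            continue
        # minimax step: assume the worst attacker reply, then recurse
        worst = min((igaf_search(after_defence.apply(atk), depth - 1, used | {a})[0]
                     for atk in attacker_actions(after_defence)),
                    default=after_defence.criterion_value())
        if worst > best_val:
            best_val, best_act = worst, a
    return best_val, best_act
```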
4. Measuring the Performance
The On-the-Job Competition Scales method is aimed at evaluating DMS or their constituents in competitions. For a given competition, this method allows the DMS to be ordered by on-the-job performance or absolute scales, in accordance with comprehensive comparisons of all competitor performances according to the base criteria of success declared in the original competition definitions [19].
To compare IP algorithms, we use the Distance to Safety (DtS) and Productivity (P) criteria to estimate, respectively, the "distance" of the current states of protected systems from normal ones and the level of performance that the IP algorithms can preserve for them. A special tool was developed to estimate the quality of protection against unauthorized access [20]. The tool allows the component parts of experiments to vary, such as the estimated criteria, attack types, and IP system parameters. Each experiment includes an imitation of the work of the base system with/without suspicion of an attack (or any other perturbation of the system), during which the IP algorithm makes a decision about the best strategy and chooses the best action according to the strategy. Data from attack experiments contain the system state's safety estimate for each step, the actions taken by each side, and the system's performance. The attack experiments have to be representative of the variety of possible attacks. We assume a combinatorial, individual nature of the attacks, which are unified into classes of similar ones. By experimenting with a few class representatives, we hope to approximately cover their variety. In [21], the SYN-Flood, Smurf, Fraggle and Login-bomb attacks were studied. In this experiment we also added the ICMP Hack and Data Fragmentation attacks.
5. Experiments on Enhancing Performance
The experiments were aimed at proving that the IGAF2 cutting-down tree algorithm, besides being comparable with the minimax algorithm, increases efficiency by embedding expert knowledge. The investigated version of the algorithm used the following components:
• over 12 single-level and multilevel solver-classifiers of the local system states;
• 6 actions/procedures of the attacking side;
• 8 "normal" actions/procedures of the system;
• 22 known counteractions against attack actions/procedures (the database of counteractions).
IGAF2 was tested in experiments against four attacks with a depth of game tree search of up to 13 and the following controlled and measured criteria and parameters: distance to safety, productivity, working time, number of game tree nodes searched, queue of incoming packets, TCP connection queue, number of processed packets, RAM, HD, unauthorized access to files, and logins into the system. The results of the experiments show:
• sampling means for Distance to Safety and Productivity of the IGAF2 and minimax algorithms are comparable;
• the number of nodes searched by the IGAF2 algorithm with all expert rules and subgoals decreases compared with the IGAF1 algorithm or the minimax one;
• the number of nodes searched by the IGAF2 algorithm with all expert rules and subgoals is the smallest compared with the IGAF1 algorithm or the minimax one when the depth of search increases up to 13;
• the time spent by the IGAF2 algorithm with all expert rules and subgoals is the smallest compared with the IGAF1 algorithm or the minimax one when the depth of search increases up to 13.
6. Conclusion
A Botvinnik-style version of the IGAF algorithm, IGAF2, was developed, which is able to acquire a range of expert knowledge in the form of goals or rules and to increase the efficiency of strategy formation by increasing the amount of expert knowledge available to the algorithm. The viability of the IGAF2 algorithm was successfully tested on the network IP problem against representatives of six classes of attacks: SYN-Flood, Fraggle, Smurf, Login-bomb, ICMP Hack and Data Fragmentation. The recommended version of the algorithm, IGAF2 with all expert rules and subgoals, for a depth of search of 5 and 200 defending steps, outperforms the minimax algorithm in Productivity by 14%, while using 6 times less computing time and searching 27 times fewer nodes of the tree. Future plans include expanding the alphabet of attack and defense actions, including hidden ones, and developing our approach to increase the strength of the IGAF algorithms through systematic enrichment of their knowledge base with new IP goals and rules.
References
1. Pogossian E., Javadyan A. "A Game Model for Effective Counteraction Against Computer Attacks in Intrusion Detection Systems," NATO ASI 2003, Data Fusion for Situation Monitoring, Incident Detection, Alert and Response Management, Tsahkadzor, Armenia, August 19-30, 2003, pp. 30.
2. Astsatryan H. V., Shoukourian Yu. H., Sahakyan V. G. "The ArmCluster1 Project: Creation of High-Performance Computation Cluster and Databases in Armenia," Proceedings of the Conference Computer Science and Information Technologies, 2001, pp. 376-379.
3. Botvinnik M.M. About Solving Approximate Problems. S. Radio, Moscow, 1979 (in Russian).
4. Botvinnik M.M. Computers in Chess: Solving Inexact Search Problems. Springer Series in Symbolic Computation, Springer-Verlag, New York, 1984.
5. Botvinnik M.M., Stilman B., Yudin A.D., Reznitskiy A.I., Tsfasman M.A. "Thinking of Man and Computer," Proc. of the Second International Meeting on Artificial Intelligence, Repino, Leningrad, Russia, Oct. 1980, pp. 1-9.
6. Stilman B. Linguistic Geometry: From Search to Construction. Kluwer Academic Publishers, Feb. 2000, 416 pp.
7. Gorodetski V., Kotenko I. "Attacks against Computer Network: Formal Grammar Based Framework and Simulation Tool," Proc. of the 5th Intern. Conf. "Recent Advances in Intrusion Detection", Lecture Notes in Computer Science, v. 2516, Springer-Verlag, 2002, pp. 219-238.
8. Kotenko I., Alexeev A., Man'kov E. "Formal Framework for Modeling and Simulation of DDoS Attacks Based on Teamwork of Hackers-Agents," Proc. of the 2003 IEEE/WIC Intern. Conf. on Intelligent Agent Technology, Halifax, Canada, Oct. 13-16, 2003, IEEE Computer Society, 2003, pp. 507-510.
9. Pogossian E. Adaptation of Combinatorial Algorithms. Yerevan, 1983, 293 pp. (in Russian).
10. Phillips C., Swiler L. "A Graph-Based System for Network-Vulnerability Analysis," Proceedings of the 1998 New Security Paradigms Workshop.
11. Sheyner O., Jha S., Haines J., Lippmann R., Wing J. "Automated Generation and Analysis of Attack Graphs," Proceedings of the IEEE Symposium on Security and Privacy, Oakland, 2002.
12. Sheyner O., Wing J. "Tools for Generating and Analyzing Attack Graphs," to appear in Proceedings of Formal Methods for Components and Objects, Lecture Notes in Computer Science, 2005.
13. Chussil M., Reibstein D. Strategy Analysis with Value War. SciPress, 1994.
14. Pogossian E., Javadyan A. "A Game Model and Effective Counteraction Strategies Against Network Intrusion," 4th International Conference in Computer Science and Information Technologies, CSIT2003, Yerevan, 2003, 5 pp.
15. Pogossian E., Tumasyan K. "Toward Chess Concepts Adequate Simulation," Proceedings of the Annual Conference of the State Engineering University of Armenia, 2004, 5 pp. (in Russian).
16. Pylyshyn Z. Seeing and Visualizing: It's Not What You Think. An Essay on Vision and Visual Imagination, http://ruccs.rutgers.edu/faculty/pylyshyn.html
17. Kosslyn S. Image and Mind. Cambridge, MA: Harvard University Press, 1980.
18. Winograd T., Flores F. Understanding Computers and Cognition: A New Foundation for Design. Ablex, Norwood, NJ, 1986.
19. Pogossian E. "Management Strategy Search and Programming," Proceedings of CSIT99, Yerevan, 1999.
20. Pogossian E., Javadyan A., Ivanyan E. "Toward a Toolkit for Modeling Attacks and Evaluation Methods of Intrusion Protection," Annual Conference of the State Engineering University of Armenia, 2004, 5 pp. (in Russian).
21. Pogossian E., Javadyan A., Ivanyan E. "Effective Discovery of Intrusion Protection Strategies," AIS-ADM 2005, Lecture Notes in Artificial Intelligence, Vol. 3505, Springer-Verlag, Berlin Heidelberg, 2005, pp. 263-276.
Evidence Theory for Robust Ship Identification in Airborne Maritime Surveillance Missions
Pierre VALIN
Defence R&D Canada Valcartier (DRDC-V), 2459 Blvd Pie XI Nord, Val-Bélair, QC, G3J 1X5, Canada
Abstract. The CP-140 (Aurora) Canadian maritime surveillance aircraft is presently undergoing an Aurora Incremental Modernization Program that will allow multi-sensor data fusion to be performed. Dempster-Shafer (DS) evidence theory is chosen for the identity information fusion due to its natural handling of conflicting, uncertain and incomplete information. Two realistic scenarios were constructed in order to test DS under countermeasures, mis-associations, and incorrect classification. Results show that DS theory is robust under all but the worst cases, when using the existing suite of sensors. Keywords: attributes, platform database, SAR, imagery classifiers, Dempster-Shafer
Introduction Reasoning over attributes (or situations) plays a big role in military domains, where complementary sensor information can lead to quicker and more stable identification (ID) through identity information fusion. The focus of this lecture is an application of Dempster-Shafer (DS) evidence theory to airborne surveillance of ships through reasoning over attributes, using both passive and active sensors to properly identify targets in a hostile environment. Many other lecturers [1] at this NATO ASI have presented DS theory, so it will not be detailed further here.
1. The Aurora's Current Sensor Suite
The Aurora has dissimilar non-imaging sensors for fusion:
• a 2-D radar (the AN/APS-506);
• an Electronic Support Measures (ESM) system (the AN/ALR-502) providing passive ID information through detection of emitters, which are cross-correlated with a realistic a priori Platform Data Base (PDB);
• an Identification Friend or Foe (IFF) system (the AN/APX-502) providing allegiance (if in proper working condition);
• a datalink, Link-11, mainly for ID information;
and complementary imaging sensors for fusion:
• a Spotlight Synthetic Aperture Radar (SAR), a planned upgrade currently under way, for which a cued classifier was designed and implemented;
• a Forward-Looking Infra-Red (FLIR) passive sensor (the OR-5008/AA), for which several different classifiers were implemented and their results fused.
In order to simplify the presentation of ID fusion results, only results from the fusion of interpreted SAR imagery with ESM reports will be shown.
2. The Attributes of the PDB The attributes, over which one has to reason during the level-1 Object Refinement phase of fusion, often referred to as Multi-Sensor Data Fusion (MSDF), can originate from imaging or non-imaging sensors, and be kinematical, geometrical or relate more directly to an ID (if those come from intelligent sensors such as an IFF, or an ESM, or imagery classifiers). These attributes form the columns of the PDB and the rows correspond to all the possible platforms that can be encountered. Kinematical attributes can be estimated through tracking in the positional estimation function of DF, and through reports from IFF and datalink. Since the tracker can provide speed, acceleration and sometimes altitude, attributes such as maximum (max) acceleration, max platform speed, minimum (min) platform speed, cruising speed, and max altitude either serve as bounds to discriminate between possible air target IDs or suggest the plausible IDs. However, speed reports should be fused only if they involve a significant change from past historical behaviour in that track. The reason is two-fold: 1. First, no single sensor must attempt to repeatedly fuse identical ID declarations, otherwise the hypothesis that sensor reports are statistically independent is violated. 2. Second, the benefits of the fusion of multiple sensors are lost when one sensor dominates the reports. Geometrical attributes can be estimated by algorithms which post-process imaging information from sensors such as the FLIR, or Electro-Optics (EO) and SAR. Classifiers that perform such post-processing can be thought of as Image Support Modules (ISM) performing much the same functionality as the ESM does for the analysis of electromagnetic signals. These ISMs can provide the three geometrical dimensions of height, width and length (for FLIR and EO), and also Radar Cross Section (RCS) of the platform as seen from the front, side or top. In addition, the distribution of relevant features may be needed for classifiers, but may be considered part of the algorithms that generate plausible IDs. Identification attributes can be directly given by the ESM, as outputs of the FLIR and SAR ISM, from acoustic signal interpretation (for surface and sub-surface targets), and from Doppler radar (for airborne targets). The ESM requires an exhaustive list of all the emitters that are carried by the platform, since it will provide an emitter list with some confidence level about the accuracy of the list that reflects the confidence in its electromagnetic spectral fit. However an IFF response can lead to an identification of a friendly or commercial target but the lack of a response does not necessarily imply that the interrogated platform is hostile. One has to distribute the lack of a response between at least two declarations: the most probable foe declaration and a less probable friendly or neutral declaration corresponding to an IFF equipment that is not working or absent. On the other hand, the ISMs are usually designed to not only provide the
best single ID possible, but also to estimate confidence in higher levels of an appropriate taxonomy tree (STANAG 4420 or MIL-STD 2525B, which are mostly consistent, but vary in the detail provided).
3. Justification for Choosing DS Evidence Theory
The best choice of a method for combining sensor propositions (such as those from the ESM and the SAR ISM) depends on such factors as:
• must process incomplete information ==> notion of ignorance
• sensor performance is not always monitored ==> notion of uncertainty
• must handle conflicts between contact/track ==> notion of conflict
• must not require a priori information ==> no Bayesian reasoning
• real-time method ==> possibility of approximation (truncation) is required
• operator wants best ID ==> give preference to single ID (singleton)
• operator wants next best thing ==> doublet (2 best IDs), triplet, etc.
• must resist countermeasures ==> conflict again (emitter not in PDB)
• must resist false associations ==> ESM report associated to wrong track
• must be tested operationally ==> complex scenarios needed
• ordinary method must explode ==> large complex PDB needed
Thus, one requires a reasoning method where ignorance, uncertainty and conflict have mathematical meaning, which is robust, and which can be simplified to reduce computational complexity. It is well known that incomplete, uncertain, and sometimes conflicting information is ideally suited to DS evidential reasoning, where "mass" or Basic Probability Assignment (BPA) plays the role of the probability. Indeed, when the intersection of sets is null for certain combinations between the new contact and the existing track, conflict exists. Furthermore, when one is uncertain of the correctness of the declared proposition, and its associated probability, it is wise to assign a small mass to the ignorance, as well as the best estimate for the larger mass of the declared proposition. Finally, well-documented and tested approximation (truncation) methods exist that keep the algorithm real-time. This approximation at every fusion step is absolutely necessary since, for a PDB of size N, one may have to keep track of up to 2^N combinations (the power set) with associated masses becoming increasingly smaller (of order 2^-N). A realistic military PDB can have a few hundred (in this lecture, about 140) to several thousand platforms (our most recent PDB has about 2,200), so, without approximation, one would have to monitor typically 2^1000 or approximately 10^300 subsets, with masses expressed in floating point arithmetic of extremely high precision.
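To make the combination mechanics concrete, the following is a minimal sketch of Dempster's rule of combination over an explicit power set, with sets represented as Python frozensets. It is not the operational CP-140 implementation, and the hypothesis names and mass values are illustrative only.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two BPAs (dicts mapping frozensets of hypotheses to masses)
    with Dempster's rule; the mass on empty intersections (conflict) is renormalized away."""
    combined, conflict = {}, 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + ma * mb
        else:
            conflict += ma * mb                     # mass assigned to the empty set
    if conflict >= 1.0:
        raise ValueError("total conflict: sources are incompatible")
    return {s: m / (1.0 - conflict) for s, m in combined.items()}

# Illustrative only: an ESM declaration combined with a SAR ISM declaration over a tiny frame
frame = frozenset({"frigate", "destroyer", "cruiser"})
esm = {frozenset({"destroyer", "cruiser"}): 0.8, frame: 0.2}     # 0.2 = ignorance
sar = {frozenset({"cruiser"}): 0.67, frozenset({"destroyer"}): 0.10, frame: 0.23}
print(dempster_combine(esm, sar))
```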
4. Hierarchical SAR Imagery Classifier
A hierarchical SAR classifier was designed and implemented to provide three complex declarations:
1. first, it provides an estimate of the Ship Length (SL) interval and correlates it to all platforms of the PDB;
2. then, it analyses the superstructure profile to identify whether the ship is a line combatant or a merchant (using a neural net trained on Knowledge Base rules), thus providing a declaration of Ship Category (SC);
3. finally (in the case of a combatant), it provides a Ship Type (ST) declaration by Bayesian reasoning over length distributions for the five types: frigate, destroyer, cruiser, battleship, and aircraft carrier (in order of increasing mean length).
Length is a discriminator for line combatants, as shown in Figure 1 for the probability of length s for ship type t. However, length is not a discriminator for merchant ships.
Figure 1. Probability distributions for the 5 line combatant types as a function of length
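As an illustration of the third declaration step, the sketch below applies Bayes' rule to Gaussian length distributions for the five combatant types. The means and standard deviations are invented for illustration and are not the distributions shown in Figure 1.

```python
import math

# Hypothetical length statistics (metres) for the five line-combatant types;
# these numbers are illustrative only, not the values behind Figure 1.
length_stats = {"frigate": (120, 15), "destroyer": (140, 15),
                "cruiser": (175, 20), "battleship": (240, 25),
                "aircraft carrier": (300, 30)}

def ship_type_posterior(measured_length, priors=None):
    """P(type | length) via Bayes' rule with Gaussian likelihoods over ship length."""
    priors = priors or {t: 1.0 / len(length_stats) for t in length_stats}
    likelihood = {}
    for t, (mu, sigma) in length_stats.items():
        z = (measured_length - mu) / sigma
        likelihood[t] = math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))
    evidence = sum(likelihood[t] * priors[t] for t in length_stats)
    return {t: likelihood[t] * priors[t] / evidence for t in length_stats}

# A 127 m ship scores highest as a destroyer, echoing the Virginia example in Section 7.
print(ship_type_posterior(127.0))
```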
5. The Designed Scenarios
The complete set of fusion algorithms (registration, association, positional fusion and identity information fusion) that can lead to timely ID was tested for two scenarios, in which radar and ESM contacts were provided by DRDC-V's Concept Analysis and Simulation Environment for Automatic Tracking and Identification (CASE-ATTI) sensor module, and SAR imagery was simulated with DRDC Ottawa's SARSIM.
1. Maritime Air Area Operations (MAAO), which involves the ID of three enemy Russian ships (Udaloy destroyer, Kara cruiser, and Mirka frigate) in the presence of ESM countermeasures, and which fuses the SAR ISM results since the enemy line ships are of different types.
2. Direct Fleet Support (DFS), involving American and Canadian convoys, which are also imaged by the SAR, but for which mis-associations can occur, due to the geometry of the scenario, and for which certain SAR images are atypical of the type of ship being imaged, leading to false declarations from the ISM.
6. Typical Results in the MAAO Scenario The performance of the SAR ISM classifier is shown in Figure 2 below. The three declarations (SL, SC and ST) are clearly indicated (from top to bottom, or left to right). For example, for the Kara cruiser, the ST declaration would have the set of all cruisers
assigned a mass of 0.67, the set of all destroyers assigned a mass of 0.10 and the set of all aircraft carriers assigned a mass of 0.04, with ignorance having the residual mass.
Figure 2. SL, SC and ST Declarations for Russian ships in the MAAO Scenario
The evolution of fusing ESM data (represented by triangles) and SAR data results in the refinement of the ID of the Kara Azov upon fusing the key emitters #92 and #93, as shown in Figure 3 below. SAR data arrives late enough in the scenario that platform ID was already firmly established.
Figure 3. Evolution of the Leading Proposition when Fusing ESM and SAR Data for the Kara Azov
7. Typical Results in the DFS Scenario
The performance of the SAR ISM classifier is shown in Figure 4 below. The three declarations (SL, SC and ST) are clearly indicated (from top to bottom, or left to right). In this case, it should be noticed that the Virginia is an atypically small cruiser and is mis-identified as a destroyer by the Bayes length classifier. Indeed, by referring to
Figure 1, a length of 127 meters is near the peak of the destroyer distribution and is in the tail of the cruiser one.
Figure 4. SL, SC and ST Declarations for American ships in the DFS Scenario
The evolution of fusing ESM data (represented by triangles) and SAR data results in the refinement of the ID of the Ticonderoga, where the fusion of emitter #110 and the SL, SC, and ST declarations provides the final correct ID, as shown in Figure 5 below.
Figure 5. Evolution of the Leading Proposition When Fusing ESM and SAR Data for the Ticonderoga
It is clear that automatic fusion of ESM reports to tracks and the correct interpretation of SAR imagery data can lead to correct ID under most conditions. Errors in ID can occur occasionally due to algorithmic errors such as false associations, or wrongly interpreted imagery, or as the result of intentional deceit, e.g. countermeasures. It is our experience that when only one such type of error is present, the DS scheme is robust enough to recover. When two or more are present, DS only recovers if enough correct data, such as provided by the ESM, are fused. The same
conclusions hold if other data (IFF, tracking info such as speed, Link-11 ID data, FLIR, …) are included.
8. Aurora Incremental Modernization Program
The original sensor suite of the CP-140 was essentially that of an anti-submarine warfare platform such as the US S-3 Viking, but the airframe itself was that of a US P-3 Orion. The CP-140 is currently undergoing an Aurora Incremental Modernization Program (AIMP) to update its sensors, according to its new, more peaceful roles. Its new sensors will provide data to a modernized Data Management System (DMS), which could have a "fusion box" providing fused data to an operator for human-in-the-loop decision support.
• The new L3 Wescam MX-20 EO/IR replacing the ageing FLIR, for which several classifiers (k-NN, neural net, Dempster-Shafer, Bayes, ...) were designed, tested, and fused in several ways.
• The new Telephonics APS-143 SAR, for which a SAR classifier for category (line, merchant, etc.) and type (e.g., frigate, cruiser, destroyer, aircraft carrier, etc.) has been designed and tested, as shown in this study.
• The new Lockheed ESM ALQ-217, which enables extreme bearing accuracy.
This study has shown that a fusion capability would have helped even with the old sensor suite. Of course, with the improved sensor suite, more complex missions can be attempted and the fusion capability (new algorithms?) will have to be re-evaluated in the near future.
9. Conclusions
Through the use of an a priori database and the DS reasoning framework, all sensors and ISMs contribute declarations of (possibly multiple) propositions, which can be fused to achieve a correct platform ID. The DS scheme is robust in the sense that it can handle conflicts, ignorance, and ambiguities, which can result from inadequate performance of sensors or ISMs, or from mis-associations in difficult tracking conditions.
References
[1] See, for example, the lectures of P. Vannoorenberghe, J. Dezert and F. Smarandache.
Improved Threat Evaluation using Time of Earliest Weapon Release
Eric MÉNARD and Jean COUTURE
Lockheed Martin Canada, 6111 Ave. Royalmount, Montréal, Qc, H4P 1K6
Abstract. Lockheed Martin Canada has recently developed a Situation and Threat Assessment and Resource Management application using some recent technologies and concepts that have emerged from level 2 and 3 data fusion research. The current paper describes some exploration work on Improved Time of Earliest Weapon Release for threat evaluation and the utilization of target weaponry system information for threat evaluation refinement. Keywords. Threat Assessment, Situation Assessment, Time of Earliest Weapon Release
Introduction
In recent years, Lockheed Martin Canada (LM Canada) has been active in data fusion research, specifically in evaluating methods, algorithms, techniques and architectures. Most of this research effort has involved experimentation, i.e., implementing and testing the most promising techniques from the research. This practical experimentation allows potential solutions to be identified and evaluated for those problems and issues that do not become apparent in purely theoretical approaches. This experimentation has led to the development of the Situation and Threat Assessment (STA) and Resource Management (RM) application (hereafter STA/RM) [1]. This application integrates the capabilities of a modern high-level fusion system based on concepts from data fusion levels 2-3 as per the JDL model [2]. The STA/RM application is built on an extendable framework where new algorithms can be easily implemented and evaluated [3]. The current paper focuses on experimentation performed to evaluate the use of information from an external a priori database about a target's weaponry system during the threat evaluation process. A brief scenario of three targets attacking a valuable asset will be given, and the results from the basic threat evaluation and the improved Time of Earliest Weapon Release (TEWR) will be compared.
1. Simulation Environment The R&D Testbed is a framework used to conduct investigations on Data Fusion algorithms and perform data analysis. This paper concerns the following points of interest.
1.1. Multi-Source Data Fusion (MSDF)
The MSDF application executes the processing related to Data Fusion level 1:
• Association
• Gating
• Tracking
• ID Fusion
1.2. Situation and Threat Assessment and Resource Management (STA/RM)
The STA/RM application regroups (level 2-3) Data Fusion and is divided into the following two major components:
1.2.1. Situation and Threat Assessment
STA combines the following functional components:
• Situation assessment
• Threat evaluation
• Target/weapon analysis
• Engagement planning
1.2.2. Resource Management
The RM component regroups the following functionalities:
• Weapon assignment
• Execution of planning
The remainder of the paper describes the external a priori database used for this threat evaluation comparison, and the performance of two threat evaluation algorithms on a brief scenario.
2. External a priori Database Information
Threat assessment evaluation is based on information about a target determined by MSDF and target ID refinement by STA/RM. The new concept added to the STA/RM for the ID refinement process consists of interrogating an external database to retrieve information about the target weaponry system. This database was built in collaboration with and sponsored by DRDC Valcartier [4, 5] and contains a wealth of a priori information about:
• Aircraft and ships (more than 2,000) of all types and their characteristics
• Sensors (more than 1,500) and their characteristics
• Weapons (more than 500) of all types and their characteristics
• Ground infrastructures, maps, pictures, and documents.
3. Threat Assessment Implementation
The main tasks of a "real life" threat assessment system are the evaluation of the threat level of non-friendly tracks and the ranking of those threat levels to build a prioritized threat list. This list is in turn used to establish engagement planning and to reserve and assign resources to the most threatening entities. Threat level evaluation usually addresses three different aspects of a threat: its opportunity to do damage, its capability to do damage and its intent [6]. This paper demonstrates the utility of integrating the target capabilities' information for threat assessment evaluation. The following sections describe three threat evaluation algorithms.
3.1. Basic Threat List
The basic threat evaluation list is also called the threat's opportunity and was addressed during the first implementation phase of STA/RM. Opportunity is defined relative to a specific location or a valuable asset, like the ownship, and represents a time and space measure of how close the threat will approach its assumed target (i.e., the location of the asset). Its computation requires the following threat information:
• Speed
• Heading
• Closest Point of Approach (CPA)
• Time to reach CPA
Many different algorithms exist to perform the computation, each having a particular method for weighting the different pieces of information. In any case, using only the opportunity to deduce threat values can be misleading for the following reasons:
1. Threat opportunity does not take into account that:
a) slow entities (e.g., ships, submarines) are usually assigned low threat values in spite of the possibility that they may have very threatening long range weapons;
b) the projection of an entity's trajectory may not lead directly towards the asset to be protected, but the entity can still launch weapons directly at the asset.
2. Threats may not have the intent to attack.
It is therefore necessary to include a threat's capability in the threat evaluation process.
3.2. Time of Earliest Weapon Release
A threat's capability has been implemented in the latest STA/RM implementation phase. To include a threat's capability in the overall threat evaluation process, the two following information sources are required:
• The threat's identity provider
• The characteristics and capabilities of the identified entities
The first source is fulfilled by the MSDF application and uses Dempster-Shafer evidential theory to compute entity identities. The second source of information comes from the external a priori database described in Section 2.
102
E. Ménard and J. Couture / Improved Threat Evaluation
With these two sources of information in place, we investigated two different methods that include a threat's capability in threat evaluation [7]: the Constant Velocity Time Of Earliest Weapon Release (CVTEWR) and the Maneuver Time Of Earliest Weapon Release (MTEWR). These two methods are explained in Figure 1. The former uses the estimated time the identified threat would take to launch its most threatening weapon (e.g., the one having the longest range) if the threat maintains its current velocity. The second method is similar except that the threat is assumed to instantaneously change its trajectory in order to launch its most threatening weapon in a minimum of time. In the end, we implemented the MTEWR method, which constitutes a "worst case scenario" compared with the other method. This method requires the following a priori information:
• Precise threat identity
• "Most threatening" weapon identity
• Weapon velocity
• Weapon maximum range
• Weapon type (Missile, Close-In Weapon System (CIWS), Gun, Cannon, Torpedo, Mortar)
• Weapon utility (air / surface / subsurface)
Figure 1. Definition of CVTEWR and MTEWR computational methods
This method returns a threat value between 0 and 1 until the point of earliest weapon release is reached. Once a target reaches the maximum range of one of its weapons or is inside that range, the threat value remains equal to 1. The problem is that targets inside the maximum weapon range cannot be prioritized. This effect can be solved with the Improved TEWR.
3.3. Improved TEWR
This method divides a threat value into three parts:
• TEWR [0, 0.7]
• Time for the weapon to reach the asset to protect [0, 0.2]
• If the target itself is a missile, +0.1
The sum of the three parts forces the threat value into the [0, 1] interval. An additional feature was also implemented to detect whether an incoming missile is dangerous to the asset. The feature verifies whether the distance between the missile and the asset is larger than the maximum missile range. If this is the case, 0.5 is subtracted from the threat value. (A small numerical sketch of this scoring is given after Figure 2.)
Figure 2. Improved TEWR threat value evaluation
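The following is a minimal numerical sketch of the Improved TEWR scoring described above. It is illustrative only: the linear mapping of the TEWR and weapon-flight-time terms onto their [0, 0.7] and [0, 0.2] sub-ranges, and the time horizon used for it, are assumptions of this sketch and are not specified in the paper.

```python
def improved_tewr_threat(tewr_s, weapon_time_to_asset_s, target_is_missile,
                         distance_to_asset_m=None, max_weapon_range_m=None,
                         horizon_s=300.0):
    """Illustrative Improved TEWR score in [0, 1]; the mapping of the two time
    quantities onto [0, 0.7] and [0, 0.2] is assumed, not taken from the paper."""
    # Part 1: earlier weapon release (small TEWR) -> larger contribution, capped at 0.7
    part_tewr = 0.7 * max(0.0, 1.0 - tewr_s / horizon_s)
    # Part 2: shorter weapon flight time to the asset -> larger contribution, capped at 0.2
    part_flight = 0.2 * max(0.0, 1.0 - weapon_time_to_asset_s / horizon_s)
    # Part 3: the target itself is a missile
    part_missile = 0.1 if target_is_missile else 0.0
    threat = part_tewr + part_flight + part_missile
    # Additional feature: a missile that cannot reach the asset is far less dangerous
    if (target_is_missile and distance_to_asset_m is not None
            and max_weapon_range_m is not None
            and distance_to_asset_m > max_weapon_range_m):
        threat -= 0.5
    return max(0.0, min(1.0, threat))

print(improved_tewr_threat(tewr_s=40.0, weapon_time_to_asset_s=60.0,
                           target_is_missile=False))
```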
4. Results
4.1. Scenario
Figure 3. Scenario with fast and long range missile on the bomber
This scenario has two fighters heading straight for the valuable asset and a bomber passing to the side. The valuable asset has a weaponry system with two illuminators for guiding missiles. Each of these illuminators can engage only one target and will not be released until the engaged target is destroyed. By using only the position and the speed for the Basic Threat List Evaluation (Table 1), the two fighters get the highest priorities and are engaged first. Once fighter #1 is destroyed, the bomber has already moved too far away from the valuable asset to be engaged. From the external a priori database, the STA/RM receives weapons information for these targets, including information that the bomber is equipped with a fast, long range missile. The Improved TEWR Threat List Evaluation method sets the bomber as the second highest threat. The weaponry system engages and destroys fighter #1 and the bomber and, when an illuminator is released, fighter #3 is engaged and destroyed. The Improved TEWR method allows all threats to the valuable asset to be destroyed.
This method does not kill the targets earlier but kills them according to a threat ranking based on more integrated target information. Table 1. Comparison between Basic Threat List evaluation and Improved TEWR Threat List Evaluation
With basic threat list evaluation:
Track name   Threat value   Created (s)   Killed (s)   Time To Be Killed (s)   Distance To Be Killed (m)
Fighter #1   0.75           0.0           70.0         70.0                    14584.0
Bomber #2    0.6            0.9           0.0          Never be killed         Never be killed
Fighter #3   0.72           0.9           74.7         73.8                    15805.6

With improved TEWR:
Track name   Threat value   Created (s)   Killed (s)   Time To Be Killed (s)   Distance To Be Killed (m)
Fighter #1   0.9            0.0           68.0         68.0                    15242.0
Bomber #2    0.85           0.9           62.4         61.5                    13455.2
Fighter #3   0.8            0.9           98.7         97.8                    9176.4
5. Conclusion
In this paper, we demonstrated the advantage of using the Improved TEWR Threat List Evaluation compared with the Basic Threat Evaluation by taking the target's onboard weapon characteristics into account during threat evaluation. The Improved TEWR is an exploration of ideas to refine threat evaluation processing, and there remain refinements and additional issues to be investigated, such as:
• Improve the handling of threats with incomplete identity
• Refine the concept of the "Most Threatening" weapon by including threat sensors, jamming and softkill capabilities
• Introduce Measures of Performance (MOPs) based on the probability of killing targets and the probability of survival
A great deal of information can be further extracted from the external a priori database about sensors, jammers and flares onboard the targets. This information can be used to refine the threat evaluation by assigning a higher threat status to targets that could jam the valuable asset's weaponry system. A smarter choice of which weapon to use against a target should take into account that target's defences against the weapon, to increase the probability of kill.
References
[1] E. Shahbazian, J.R. Duquet, P. Valin, A Blackboard Architecture for Incremental Implementation of Data Fusion Applications, in FUSION 98, Las Vegas, 6-9 July 1998, Vol. I, pp. 455-461.
[2] A.N. Steinberg, C.L. Bowman, F.E. White, Revisions to the JDL Data Fusion Model, in Joint NATO/IRIS Conference, Quebec City, Quebec, 19-29 October 1998.
[3] P. Bergeron, J. Couture, J.R. Duquet, M. Macieszczak, and M. Mayrand, A New Knowledge-Based System for the Study of Situation and Threat Assessment in the Context of Naval Warfare, in FUSION 98, Las Vegas, 6-9 July 1998, Vol. II, pp. 926-933.
[4] J.-F. Truchon and J. Couture, MSDF/STARM Libraries Study – Data Representation for the Information Libraries, Doc. No. 6520014004, Lockheed Martin Canada, 2002.
[5] J. Couture, J.R. Duquet, and Y. Allard, MSDF/STARM Libraries Study – Final Report, Doc. No. 6520014004, Lockheed Martin Canada, 2002.
[6] J. Couture and E. Ménard, Issues with Developing Situation and Threat Assessment Capabilities, Data Fusion Technologies for Harbour Protection, NATO Advanced Research Workshop, Tallinn, Estonia, June 27-July 1, 2005 (in preparation).
[7] M.G. Oxenham, Enhancing Situation Awareness for Air Defence via Automated Threat Analysis, in Proceedings of the Sixth International Conference on Information Fusion (FUSION 2003), Cairns, Australia, 8-11 July 2003, International Society of Information Fusion, 2003, pp. 1086-1093.
Detection of Structural Changes in a Multivariate Data Using Change-Point Models
David ASATRYAN a,1, Boris BRODSKY b, Irina SAFARYAN c
a Institute for Informatics and Automation Problems, Armenia
b Central Economics and Mathematics Institute, Russian Federation
c Slavonic University of Armenia
Abstract. Many problems of image processing, remote sensing and remote control can be formulated in terms of detection of structural changes in observed multivariate temporal or spatial data. The proposed lecture considers modern methods for detecting structural changes in multivariate data and some important applications. Various methods for the effective solution of these problems are described. The most popular methods of change-point determination in the one-dimensional setting are given: parametric and nonparametric statistical methods, and a wavelet analysis method. A method for detecting structural changes in multivariate regression analysis is also considered. Keywords. Change-point determination, structural changes, nonparametric methods, segmentation, wavelet analysis, multivariate regression model
Introduction
Adequate and effective information processing is an important part of a decision-making system that operates by means of remote sensing and satellite imagery. Various problems of target detection, recognition and threat assessment are solved using information received from multi-sensor systems. Systems of this type usually use algorithms based on statistical methods, digital data, signal and image processing and other mathematical methods. As a result of such processing, an object or a phenomenon of interest is extracted, which must be detected in the presence of various disturbing factors, noises and distortions. If remote sensing occurs in a time and/or space domain, the corresponding methods are likewise performed in a time and/or space domain. It is supposed that the presence or absence of objects of interest or expected phenomena in the scene under analysis influences the character and structure of the received data. By structure we mean the regularities of the observed objects' behaviour as shown in their data distribution, dependence between variables, the presence of groups of observation results with certain properties, etc.
1 Corresponding Author: David Asatryan, Institute for Informatics and Automation Problems, 1, P. Sevaki Str., Yerevan, 375014, Armenia; E-mail: [email protected]
Therefore, many problems of image processing, remote sensing and remote control may be formulated in terms of detection of structural changes (or jumps) in the observed multivariate temporal or spatial data. The past four decades have seen considerable theoretical and empirical research on the detection of abrupt changes in multivariate data and its applications to various problems of regression analysis, monitoring of dynamical systems and other stochastic models (well known as change-point analysis). Development of these models has resulted in the creation of a huge number of approaches, methods and algorithms for detecting abrupt changes in the structure and (or) the features of the available information. A review of all these methods in a limited paper is impossible. The given lecture is therefore devoted to a brief but systematic presentation of some popular processing methods and algorithms for detecting abrupt changes in one-dimensional and multivariate data.
1. Simplest Change-Point Problem
One of the initial papers devoted to this problem is [1]. A vast amount of scientific literature on this topic currently exists; we can refer, for example, to [2]. The simplest change-point problem is formulated as follows. Let $X_1, X_2, \ldots, X_n$ be a sequence of independent random variables. Let us suppose that, for a fixed (but unknown) value of $1 \le k \le n$, the random variables $X_1, X_2, \ldots, X_k$ are i.i.d. with a distribution function $F_1(x)$ and, analogously, the random variables $X_{k+1}, X_{k+2}, \ldots, X_n$ are from a distribution function $F_2(x)$, with it being known that $F_1(x) \ne F_2(x)$ in at least one point $x$. The value of $k$ is then named a change-point of the sequence $X_1, X_2, \ldots, X_n$.
Let $x_1, x_2, \ldots, x_n$ be the observation results. Concerning the distribution functions $F_j(x)$, $j = 1, 2$, various assumptions can be made, depending on the considered sensor control model and the presence of a priori information regarding the type and parameters of the distribution. For example, in a continuous case a normal, exponential or other model can be accepted with some unknown parameters. We need to estimate the change-point $k$ in the various models of structure of a sequence $X_1, X_2, \ldots, X_n$, which can be a multivariate variable as well. To complete a Change-Point Problem (CPP) it is necessary to define a model $M = M(X; F_1, F_2 \mid k)$ that connects the distribution of the observed variables with the change-point $k$, and a criterion $\Phi = \Phi(k \mid M; x_1, x_2, \ldots, x_n)$ for the estimating procedure (it can be either maximized or minimized). Thus, we can estimate the unknown change-point via the following procedure:
$$\hat{k} = \arg\max_{1 \le k \le n} \Phi(k \mid M; x_1, x_2, \ldots, x_n).$$
Let us consider an example of a CPP based on a normal distribution model. Let $F_j(x)$, $j = 1, 2$, be normal distribution functions with the parameters $(\mu_j, \sigma_j^2)$, $j = 1, 2$. At first, we consider the case of known parameters $(\mu_j, \sigma_j^2)$, $j = 1, 2$.
The absence of a change-point means that all $X_i$, $i = 1, \ldots, n$, have a normal distribution $N(\mu_1, \sigma_1^2)$ (hypothesis $H_0$). This situation can be considered as $k > n$. The presence of a change-point means that $X_i \sim N(\mu_1, \sigma_1^2)$, $i = 1, \ldots, k$, and $X_i \sim N(\mu_2, \sigma_2^2)$, $i = k+1, \ldots, n$ (hypothesis $H_1$). We want to test the hypothesis $H_0$ against the alternative hypothesis $H_1$. The logarithm of the likelihood ratio for these two alternatives with independent observations is as follows:
$$\ln \Lambda_n = \frac{\mu_2 - \mu_1}{\sigma^2} \sum_{i=k}^{n} \left( x_i - \frac{\mu_1 + \mu_2}{2} \right) = \frac{\mu_2 - \mu_1}{\sigma^2} \sum_{i=k}^{n} \left( x_i - \mu_1 - \frac{\nu}{2} \right) = \frac{1}{\sigma^2}\, S_n^k(\mu_1, \nu),$$
where
$$S_n^k(\mu_0, \nu) = \nu \sum_{i=k}^{n} \left( x_i - \mu_0 - \frac{\nu}{2} \right),$$
and $\nu = \mu_2 - \mu_1$ is the jump magnitude (taking its sign into account). We can set $\Phi(k \mid M; x_1, x_2, \ldots, x_n) = S_n^k(\mu_0, \nu)$, i.e., as an estimate of the unknown change-point $k$ we can use the maximizer of the likelihood ratio:
$$\hat{k} = \arg\max_{1 \le k \le n} S_n^k(\mu_0, \nu).$$
When the jump magnitude $\nu$ is unknown (the case that usually occurs in applications) we can use the same technique based on the likelihood ratio, but now we must estimate the change-point and the jump magnitude at the same time [2]. It is obvious that this approach requires complete information on the probability distribution function of the observations; therefore it is not robust in general.
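A minimal numerical sketch of this estimator is given below, assuming the pre- and post-change means are known; the means, variance and sample sizes used are illustrative values only.

```python
import random

def change_point_estimate(x, mu1, mu2):
    """Maximize S_n^k(mu1, nu) over k for known pre- and post-change means."""
    nu = mu2 - mu1
    n = len(x)
    best_k, best_s = 1, float("-inf")
    for k in range(1, n + 1):
        s = nu * sum(xi - mu1 - nu / 2.0 for xi in x[k - 1:])   # S_n^k(mu1, nu)
        if s > best_s:
            best_k, best_s = k, s
    return best_k

random.seed(0)
# 60 observations from N(0, 1) followed by 40 observations from N(1.5, 1)
data = ([random.gauss(0.0, 1.0) for _ in range(60)]
        + [random.gauss(1.5, 1.0) for _ in range(40)])
print(change_point_estimate(data, mu1=0.0, mu2=1.5))
# expect an estimate near k = 61 (the first post-change index under this sum convention)
```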
2. Nonparametric Methods
In contrast to the previous situation, we can suppose that the distribution functions $F_j(x)$, $j = 1, 2$, are unknown, but we have some integrated information on their behaviour. This problem is considered in detail in [3]-[6], so only a simple case is provided to demonstrate the ideas of this approach. Let $Z(n, k)$ be a two-sample nonparametric statistic to test a hypothesis $H_0$ (that the samples $x_1, x_2, \ldots, x_k$ and $x_{k+1}, x_{k+2}, \ldots, x_n$ are from the same distribution) against the alternative hypothesis $H_1$. As statistics of this kind we can indicate the Wilcoxon statistics (or the Mann-Whitney statistics connected with them), some statistics based on ranks, and many others. We assume that if a change-point exists, then $F_1(x) \ne F_2(x)$ and $F_1(x) > F_2(x)$. This model $M$ is considered, in particular, in [4]. Let us consider, for example, the Mann-Whitney statistic, which has the following expression:
$$Z(n, k) = \frac{1}{k(n-k)} \sum_{i=1}^{k} \sum_{j=k+1}^{n} Z_{ij}(n, k),$$
where
$$Z_{ij}(n, k) = \begin{cases} 1, & x_i \le x_j, \\ 0, & x_i > x_j, \end{cases} \qquad i = 1, 2, \ldots, k; \; j = k+1, k+2, \ldots, n.$$
Thus, we can put $\Phi(k \mid M; x_1, x_2, \ldots, x_n) = Z(n, k)$ and
$$\hat{k} = \arg\max_{1 \le k \le n} Z(n, k).$$
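The following is a small illustrative implementation of this nonparametric estimator; the sample values are invented for illustration.

```python
def mann_whitney_change_point(x):
    """Estimate a single change-point by maximizing the normalized
    Mann-Whitney two-sample statistic Z(n, k) over k."""
    n = len(x)
    best_k, best_z = 1, float("-inf")
    for k in range(1, n):                          # both segments must be non-empty
        z = sum(1.0 for i in range(k) for j in range(k, n) if x[i] <= x[j])
        z /= k * (n - k)
        if z > best_z:
            best_k, best_z = k, z
    return best_k, best_z

# A shift in location detected without assuming any distributional form
sample = [0.2, -0.1, 0.4, 0.0, -0.3, 1.1, 1.4, 0.9, 1.2, 1.3]
print(mann_whitney_change_point(sample))           # expect a change-point at k = 5
```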
3. Multiple Change-Point Problem (CPP)
A situation with many change-points is more typical. There are a few methods for change-point determination in the multivariate case. We will consider two of these, each differing in mathematical approach and, consequently, in processing method.
3.1. Segmentation Method
This method uses a time series segmentation to break a series into homogeneous pieces. The quality of a segmentation is determined by the sum of the squared deviations of the data from the means of their respective segments; in what follows we will use the term segmentation cost for this quantity. Given a time series, the procedure computes the minimal cost segmentation with $K = 2, 3, \ldots$ change-points. The procedure gradually increases $K$ and, for every value of $K$, the best segmentation is computed. The procedure is terminated when differences in the means of the obtained segments are no longer statistically significant (as measured by Scheffé's contrast criterion). In this section we formulate time series segmentation as an optimization problem. We follow Hubert's presentation [7], but modify his notation and formulate it in the terms of a time series. Given a time series $x_1, x_2, \ldots, x_T$ and a number $K$, a segmentation is a sequence of times $\mathbf{t} = (t_0, t_1, \ldots, t_K)$ that satisfies $0 = t_0 < t_1 < \ldots < t_{K-1} < t_K = T$. The intervals of integers $[t_0+1, t_1]$, $[t_1+1, t_2]$, ..., $[t_{K-1}+1, t_K]$ are the segments and the times $t_0, t_1, \ldots, t_K$ are the change-points. $K$, the number of segments, is the order of the segmentation. The length of the $k$-th segment (for $k = 1, 2, \ldots, K$) is denoted by $T_k = t_k - t_{k-1}$. The following notation is used for a given segmentation $\mathbf{t} = (t_0, t_1, \ldots, t_K)$. For $k = 1, 2, \ldots, K$ define
$$\hat{\mu}_k = \frac{1}{T_k} \sum_{t=t_{k-1}+1}^{t_k} x_t, \qquad d_k = \sum_{t=t_{k-1}+1}^{t_k} (x_t - \hat{\mu}_k)^2, \qquad D_K(\mathbf{t}) = \sum_{k=1}^{K} d_k = \sum_{k=1}^{K} \sum_{t=t_{k-1}+1}^{t_k} (x_t - \hat{\mu}_k)^2,$$
where $D_K(\mathbf{t})$ defines the cost of the segmentation $\mathbf{t} = (t_0, t_1, \ldots, t_K)$.
Now we can define the best $K$-th order segmentation $\hat{\mathbf{t}}$ to be the one minimizing $D_K(\mathbf{t})$ and denote the minimal cost by $\hat{D}_K = D_K(\hat{\mathbf{t}})$. Note that we have $\hat{D}_K \ge \hat{D}_{K+1}$
for every $K$. One can show that the number of possible segmentations grows exponentially with $T$. Minimization of $D_K$ can be achieved by several alternatives; there are many algorithms to efficiently search the set of all possible segmentations (Hubert uses a branch-and-bound approach). One can consider a cost function as follows:
$$\Phi = \Phi(t_0, t_1, \ldots, t_K \mid M; x_1, x_2, \ldots, x_T) = D_K(\mathbf{t}),$$
and minimize it over the set of all discrete values of $t_0, t_1, \ldots, t_K$; a dynamic-programming sketch of this minimization is given below.
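The sketch below is a small illustrative dynamic-programming implementation of the minimal-cost segmentation for a fixed order K (it is not Hubert's branch-and-bound procedure); the test series is invented.

```python
def best_segmentation(x, K):
    """Minimal-cost segmentation of x into K segments by dynamic programming.
    Returns (minimal cost D_K, interior change-points t_1 < ... < t_{K-1})."""
    T = len(x)
    prefix, prefix2 = [0.0], [0.0]
    for v in x:                                   # prefix sums for O(1) segment costs
        prefix.append(prefix[-1] + v)
        prefix2.append(prefix2[-1] + v * v)

    def seg_cost(a, b):                           # squared deviation of x[a:b] from its mean
        s, s2, n = prefix[b] - prefix[a], prefix2[b] - prefix2[a], b - a
        return s2 - s * s / n

    INF = float("inf")
    cost = [[INF] * (T + 1) for _ in range(K + 1)]   # cost[k][t]: best cost of x[:t] in k segments
    back = [[0] * (T + 1) for _ in range(K + 1)]
    cost[0][0] = 0.0
    for k in range(1, K + 1):
        for t in range(k, T + 1):
            for s in range(k - 1, t):
                c = cost[k - 1][s] + seg_cost(s, t)
                if c < cost[k][t]:
                    cost[k][t], back[k][t] = c, s
    cps, t = [], T                                 # recover the change-points
    for k in range(K, 0, -1):
        t = back[k][t]
        cps.append(t)
    return cost[K][T], sorted(cps)[1:]             # drop t_0 = 0

series = [0.1, 0.0, 0.2, 2.1, 2.0, 1.9, 2.2, 5.0, 5.1, 4.9]
print(best_segmentation(series, K=3))              # expect change-points near [3, 7]
```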
3.2. Wavelet Analysis
Because of their good time-frequency localization, among other reasons, wavelets have proven useful in many applications in statistics and other fields (especially in signal and image processing techniques). In particular, they are well equipped to deal with abrupt jumps and other irregular features in nonparametric regression. Following the approach of Daubechies, we start with two related and specially chosen, mutually orthonormal functions or parent wavelets: the scaling function $\phi$ (sometimes referred to as the father wavelet) and the mother wavelet $\psi$. Other wavelets in the basis are then generated by translations of the scaling function $\phi$, and dilations and translations of the mother wavelet $\psi$, using the relationships:
$$\phi_{j_0,k}(t) = 2^{j_0/2}\, \phi(2^{j_0} t - k), \quad j_0, k \in \mathbb{Z}, \qquad \psi_{j,k}(t) = 2^{j/2}\, \psi(2^{j} t - k), \quad j, k \in \mathbb{Z}, \qquad (4\text{-}1)$$
for some fixed $j_0 \in \mathbb{Z}$, where $\mathbb{Z}$ is the set of integers. Typically the scaling function $\phi$ resembles a kernel function and the mother wavelet $\psi$ is a well-localized oscillation (hence the name wavelet). A unit increase in $j$ in (4-1) (i.e., dilation) has no effect on the scaling function ($\phi_{j_0 k}$ has a fixed width), but packs the oscillations of $\psi_{jk}$ into half the width (doubles its "frequency" or, in strict wavelet terminology, its scale or resolution). A unit increase in $k$ (i.e., translation) shifts the location of both $\phi_{j_0 k}$ and $\psi_{jk}$, the former by a fixed amount ($2^{-j_0}$) and the latter by an amount proportional to its width ($2^{-j}$). Given the above wavelet basis, a function $y(t)$ is then represented in the corresponding wavelet series as:
$$y(t) = \sum_{k \in \mathbb{Z}} c_{j_0 k}\, \phi_{j_0 k}(t) + \sum_{j = j_0}^{\infty} \sum_{k \in \mathbb{Z}} w_{jk}\, \psi_{jk}(t), \qquad c_{j_0 k} = \langle y, \phi_{j_0 k} \rangle \quad \text{and} \quad w_{jk} = \langle y, \psi_{jk} \rangle.$$
The parent wavelets need to be specially chosen if that is to be the case. For our purpose, the simplest suitable wavelet basis seems to be the Haar basis, whose parent couple is given as follows:
$$\phi(t) = \begin{cases} 1, & 0 \le t \le 1, \\ 0, & \text{otherwise}, \end{cases} \qquad \psi(t) = \begin{cases} 1, & 0 \le t < 1/2, \\ -1, & 1/2 \le t < 1, \\ 0, & \text{otherwise}. \end{cases}$$
The heuristic underlying the development of this method is that under the alternative, most of the empirical coefficients will still be near zero, but that a few coefficients, localized to the area of the change-point, will exhibit significant signals.
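A small illustrative sketch of this heuristic follows: one level of Haar detail coefficients is computed and the unusually large ones are flagged. The thresholding rule and the test signal are assumptions of this sketch, not the authors' procedure.

```python
import math

def haar_detail_coefficients(x):
    """One level of the Haar transform: a detail coefficient for each pair of samples."""
    return [(x[2 * i] - x[2 * i + 1]) / math.sqrt(2.0) for i in range(len(x) // 2)]

def flag_change_regions(x, threshold=2.0):
    """Flag pair-indices whose Haar detail coefficient is unusually large;
    under a change in mean, the large coefficients cluster around the change-point."""
    d = haar_detail_coefficients(x)
    scale = sorted(abs(v) for v in d)[len(d) // 2] or 1e-12   # crude noise scale (median abs.)
    return [i for i, v in enumerate(d) if abs(v) > threshold * scale]

signal = [0.0] * 15 + [1.5] * 17
print(flag_change_regions(signal))    # expect [7]: the pair straddling the jump at sample 15
```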
4. Detection of Structural Changes in a Regression Model
The CPP for regression models was first considered by Quandt [8]. The following model of observations $(X_1, Y_1), \ldots, (X_n, Y_n)$ was analyzed:
$$Y_j = \begin{cases} \beta_0 + \beta_1 X_j + \sigma Z_j, & j \le k, \\ (\beta_0 + \Delta_0) + (\beta_1 + \Delta_1) X_j + \sigma Z_j, & j > k, \end{cases} \qquad (5\text{-}1)$$
where $Z_j$ are i.i.d.r.v.'s with $\mathbf{E} Z_j = 0$, $\mathbf{E} Z_j^2 = 1$, and $(\Delta_0, \Delta_1) \ne (0, 0)$.
If $1 \le k \le n-1$ then the statistical characteristics of the dependent variable $Y_j$ change at the instant $k$, and if $k = n$ then model (5-1) is statistically homogeneous. We consider a method [7] for estimation of the change-point $k$ from the observations $(X_1, Y_1), \ldots, (X_n, Y_n)$. The general statement of the CPP for linear regression models can be formulated as follows. Suppose $y_i$, $i = 1, 2, \ldots, n$, are independent random variables. Under the null hypothesis $H_0$ the linear model is
$$y_i = \beta x_i^{*} + \varepsilon_i, \qquad 1 \le i \le n,$$
where $\beta = (\beta_1, \beta_2, \ldots, \beta_d)$ is an unknown vector of coefficients, $x_i = (x_{1i}, x_{2i}, \ldots, x_{di})$ are known predictors and $*$ is the transposition symbol. The errors are supposed to be i.i.d.r.v.'s with
$$\mathbf{E}\varepsilon_i = 0, \qquad 0 < \sigma^2 = \mathbf{D}\varepsilon_i < \infty.$$
Under the alternative hypothesis $H_1$ a change occurs at the instant $k^*$, i.e.,
$$y_i = \begin{cases} \beta x_i^{*} + \varepsilon_i, & 1 \le i \le k^*, \\ \gamma x_i^{*} + \varepsilon_i, & k^* < i \le n, \end{cases}$$
where $k^*$ and $\gamma \in \mathbb{R}^d$ are unknown parameters, and $\beta \ne \gamma$. Denote
ȳ_k = (1/n) Σ_{1≤i≤k} y_i,   x̄_k = (1/n) Σ_{1≤i≤k} x_i,   Q_n = Σ_{1≤i≤n} (x_i − x̄_n)(x_i − x̄_n)^*,

and X_n = (x_1, x_2, ..., x_n)^*, Y_n = (y_1, y_2, ..., y_n)^*.
The least-squares estimate of β is

β̂_n^* = (β̂_{1n}, β̂_{2n}, ..., β̂_{dn}) = (X_n^* X_n)^{−1} X_n^* Y_n.
Many authors propose to reject H_0 for large values of max_{1≤k≤n} |U_n(k)|, where

U_n(k) = (k / (1 − k/n))^{1/2} · [ȳ_k − ȳ_n − β̂_n (x̄_k − x̄_n)^*] / [1 + k (x̄_k − x̄_n) Q_n^{−1} (x̄_k − x̄_n)^* (1 − k/n)^{−1}]^{1/2}.

Thus, the change-point estimate has the form k̂ = arg max_{1≤k≤n} |U_n(k)|.
If k > 1, then for a given k-partition {i_1, i_2, ..., i_k} the least-squares estimates of the β_j can easily be obtained. The resulting minimal Residual Sum of Squares (RSS) is given by

RSS(i_1, i_2, ..., i_k) = Σ_{j=1}^{k+1} rss(i_{j−1} + 1, i_j),

where rss(i_{j−1} + 1, i_j) is the usual minimal RSS in the j-th segment. The problem of dating structural changes is to find the change-points î_1, ..., î_k that minimize the objective function

(î_1, ..., î_k) = arg min_{(i_1, ..., i_k)} RSS(i_1, ..., i_k)

over all partitions i_1, i_2, ..., i_k with i_j − i_{j−1} ≥ nh ≥ m.
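As a concrete illustration of this least-squares dating principle, the sketch below handles the simplest case of a single change point: it scans every admissible split and keeps the one minimizing the total RSS of the two segment-wise OLS fits. It is a minimal sketch, not the procedure of [7] or of Bai and Perron [10]; the simulated data, the minimum segment length and the use of numpy's least-squares solver are assumptions made for the example.

```python
import numpy as np

def segment_rss(X, y):
    """Minimal residual sum of squares of an OLS fit on one segment."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ beta
    return float(resid @ resid)

def date_single_break(X, y, min_seg=10):
    """Estimate one change point by minimizing rss(1, k) + rss(k+1, n)
    over all splits k leaving at least min_seg observations per segment."""
    n = len(y)
    best_k, best_rss = None, np.inf
    for k in range(min_seg, n - min_seg + 1):
        rss = segment_rss(X[:k], y[:k]) + segment_rss(X[k:], y[k:])
        if rss < best_rss:
            best_k, best_rss = k, rss
    return best_k, best_rss

# Illustrative two-regime data: the slope changes from 1.0 to 2.5 at observation 60.
rng = np.random.default_rng(1)
x = rng.uniform(0.0, 10.0, 100)
X = np.column_stack([np.ones(100), x])                     # intercept + predictor
y = np.where(np.arange(100) < 60, X @ [0.0, 1.0], X @ [0.0, 2.5]) + rng.normal(0.0, 1.0, 100)
print(date_single_break(X, y))                             # estimated break near k = 60
```

For several change points, the same objective is minimized by dynamic programming over admissible partitions rather than by an exhaustive scan, as discussed by Bai and Perron [10].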
References

[1] E. Page. A test for a change in a parameter occurring at an unknown point. Biometrika, 42: 523–527, 1955.
[2] M. Basseville, I. Nikiforov. Detection of Abrupt Changes: Theory and Applications. Prentice-Hall, N.Y., 1993.
[3] B. Brodsky, B. Darkhovsky. Nonparametric change-point detection. Proceedings of the 2nd IFAC Symposium on Stochastic Control, Vilnius, 1986.
[4] D. Asatryan, I. Safaryan. Nonparametric methods for detecting changes in the properties of random sequences. In: Detection of Changes in Random Processes (ed. L. Telksnys), N.Y., 1986, pp. 1–13.
[5] B. Brodsky, B. Darkhovsky. Nonparametric Methods in Change-Point Problems. Kluwer Academic Press, 1993.
[6] B. Brodsky, B. Darkhovsky. Non-Parametric Statistical Diagnosis: Problems and Methods. Kluwer Academic Publishers, The Netherlands, 2000.
[7] P. Hubert. The segmentation procedure as a tool for discrete modeling of hydrometeorological regimes. Stochastic Environmental Research and Risk Assessment, vol. 14, pp. 297–304, 2000.
[8] R.E. Quandt. The estimation of the parameters of a linear regression system obeying two separate regimes. Journal of the American Statistical Association, 53, 873–880, 1958.
[9] M. Csörgő, L. Horváth. Limit Theorems in Change-Point Analysis. Chichester: Wiley, 1997.
[10] J. Bai, P. Perron. Estimating and testing linear models with multiple structural changes. Econometrica, 66, 1, 47–78, 1998.
[11] B. Darkhovsky. Retrospective change-point detection in some regression models. Theory of Probability and Its Applications, 40, 4, 898–903, 1995.
Unification of Fusion Theories (UFT) Florentin SMARANDACHE The University of New Mexico 200 College Road Gallup, NM 87301, USA
[email protected]
Abstract. We propose Unification of Fusion Theories and Fusion Rules in solving problems/applications. For each particular application, check the reliability of sources, and select the most appropriate model, rule(s), fusion theories, and algorithm of implementation. The unification scenario presented herein, which is in an incipient form, should periodically be updated to incorporate new discoveries from fusion and engineering research. Keywords. Fusion theories, fusion rules, lattice, Boolean algebra, Lindenbaum algebra, frame of discernment, model, static/dynamic fusion, incomplete/ paraconsistent/imprecise information, specificity chains, specialization
Introduction Each theory works well for some applications and less well for others. This unification, a fusion overview attempt, might look like a cooking recipe, or more precisely, a logical chart or a computer program; however, we do not yet see another method to comprise/unify all things. We extend the power set and hyper-power set from previous theories to a Boolean algebra that we construct by closing the frame of discernment under union, intersection, and complement of sets. All basic belief assignments (bba) and rules are extended to this Boolean algebra. A similar generalization has been previously used by Guan-Bell (1993) for the Dempster-Shafer rule using propositions in sequential logic. Herein we reconsider Boolean algebra for all fusion rules and theories, but use sets instead of propositions because it is generally harder to work in sequential logic with summations and inclusions than in the set theory. We present the definition of a model, some classifications of frames of discernment and their elements, types of information, what specificity chains and specialization mean, the definition of static and dynamic fusions, and the algebraic properties of rules. We list the fusion rules and theories but are not able to present them due to space limitation. We also propose a partial Unification of Fusion Rules (UFR).
1. Fusion Space

For n ≥ 2 let Θ = {θ1, θ2, …, θn} be the frame of discernment of the fusion problem/application under consideration. Then (Θ, ∪, ∩, C), with Θ closed under these three operations (union, intersection, and complementation of sets respectively), forms a Boolean algebra. With respect to the partial ordering relation (inclusion ⊆), the minimum element is the empty set ∅, and the maximal element is the total ignorance I_t = θ1 ∪ θ2 ∪ … ∪ θn.
Similarly one can define (Θ, ∪, ∩, \) for sets, with Θ closed with respect to each of these operations: union, intersection, and difference of sets respectively. (Θ, ∪, ∩, C) and (Θ, ∪, ∩, \) generate the same super-power set S^Θ, closed under ∪, ∩, C and \, because for any A, B ∈ S^Θ one has C(A) = I_t \ A and reciprocally A \ B = A ∩ C(B). If one considers propositions, then (Θ, ∨, ∧, ¬) forms a Lindenbaum algebra in sequential logic, which is isomorphic to the above Boolean algebra (Θ, ∪, ∩, C). By choosing the frame of discernment Θ with exclusive elements, closed under ∪ only, one gets Dempster-Shafer's, Yager's, the Transferable Belief Model's and Dubois-Prade's power set. Then making Θ closed under both ∪ and ∩, one gets Dezert-Smarandache's hyper-power set. Extending Θ for closure under ∪, ∩, and C, one also includes the complement of a set (or the negation of a proposition if working in sequential logic). In the case of non-exclusive vague elements in the frame of discernment, the complement is considered involutive, i.e., C(C(A)) = A for any set A, to avoid an infinite loop in the closure-under-complement process. Therefore the super-power set (Θ, ∪, ∩, C) includes all previous fusion spaces. The power set 2^Θ, used in DST, Yager's, TBM and DP, which is the set of all subsets of Θ, is also a Boolean algebra, closed under ∪, ∩, and C, but does not contain intersections of elements from Θ since the elements are supposed exclusive. The Dedekind distributive lattice D^Θ, used in DSmT, is closed under ∪ and ∩, and if negations/complements arise they are directly introduced into the frame of discernment, say Θ', which is then closed under ∪ and ∩. Unlike the others, DSmT allows intersections, generalizing the previous theories. The Unifying Theory contains intersections and complements as well. Model means to know the empty intersections in the super-power set, whose conflicting masses should be transferred to non-empty sets.
Comments on frames and their extensions. F.1.1 Open World is a frame that misses some hypotheses (non-exhaustive) [Smets], e.g., Ω = {John, George}, but later we find another suspect, David. An open world becomes closed if one adds to the frame of discernment another hypothesis θ_c which includes all missing hypotheses. F.1.2 Closed World is a frame that includes all hypotheses (exhaustive). F.2.1 Homogeneous frame: all its elements are of the same nature. F.2.2 Heterogeneous frame: at least two of its elements are of a different nature, e.g., Ω = {White, Bird, Long}. It is split into homogeneous sub-frames; the complement is computed with respect to an element's sub-frame. Construct super-power sets for each sub-frame.
F.3.1 Finite frame. F.3.2 Infinite frame. E.1.1 Exclusive elements: their intersection is empty. E.1.2 Non-exclusive elements: their intersection is not empty, e.g., Ω = {A, B} of target cross-sections, A = {x | 1.5 < x < 2.5}, B = {x | 2 < x < 3}. E.2.1 Classical elements: their boundaries are well defined. E.2.2 Vague elements: their boundaries are not well defined, e.g., Ω = {Red, Orange}. Let's consider a frame of discernment Θ with exclusive or non-exclusive hypotheses, exhaustive or non-exhaustive, closed or open world (all possible cases). We need to remark that in cases where these n ≥ 2 elementary hypotheses θ1, θ2, …, θn are exhaustive and exclusive, one gets the Dempster-Shafer, Yager, Dubois-Prade and Dezert-Smarandache theories; for cases where the hypotheses are non-exclusive one gets the Dezert-Smarandache Theory, while for non-exhaustivity one gets the TBM. An exhaustive frame of discernment is called a closed world, and a non-exhaustive frame of discernment is called an open world (meaning that new hypotheses might exist in the frame of discernment that we are not aware of). Θ may be finite or infinite. Let m_j : S^Θ → [0, 1], 1 ≤ j ≤ s, be s ≥ 2 basic belief assignments (when the bbas work with crisp numbers), or m_j : S^Θ → P([0, 1]), where P([0, 1]) is the set of all subsets of the interval [0, 1] (when dealing with very imprecise information).
2. Types of Information

I.1 Complete information: normally the sum of crisp masses of a bba m(.) is 1, i.e., Σ_{X∈S^Θ} m(X) = 1.
I.2 Incomplete information: not enough knowledge; the sum of scalar mass components is < 1.
I.3 Paraconsistent information: conflicting/paradoxist information coming from opposite viewpoints; the sum of scalar mass components is > 1. Some prefer to normalize incomplete and paraconsistent information; others do not (wanting to learn the type of information after fusion).
I.4 Imprecise information (Dezert-Smarandache): mass components are subsets (not necessarily intervals) of [0, 1].
I.4.1 Admissibility condition for completeness: there exist crisp numbers x_A ∈ m(A), for all A ∈ S^Θ, such that Σ_{A∈S^Θ} x_A = 1; otherwise it is imprecise incomplete or imprecise paraconsistent information. Similarly, for a bba m(.) valued on subunitary subsets and dealing with incomplete and paraconsistent information respectively:
I.4.2 For incomplete imprecise information, one has Σ_{X∈S^Θ} sup{m(X)} < 1.
I.4.3 For paraconsistent imprecise information, one has Σ_{X∈S^Θ} inf{m(X)} > 1.
3. Specificity Chains and Specialization

Specificity chains are inclusion chains that use the min principle, i.e., a cautious way to transfer conflicting masses to less and less specific elements. The transfer of conflicting mass and normalization diminish the specificity. If A ∩ B = ∅, its mass is moved to a less specific element A (and also to B) in an optimistic view of them, but if we have a pessimistic view of A and B we can move the mass m(A ∩ B) to A ∪ B (entropy increases, imprecision increases). And even more, if we are very pessimistic about A and B, we move the conflicting mass to the total ignorance in a closed world, or to the empty set in an open world. Examples of specificity chains: in a closed world, A ∩ B ∩ C ⊂ A ∩ B ⊂ A ⊂ A ∪ B ⊂ A ∪ B ∪ C ⊂ Θ; in an open world, A ∩ (B ∪ C) → ∅. Specialization means the transfer of a set's mass to its subsets (the opposite of a specificity chain), e.g., A ⊃ A ∩ B ⊃ A ∩ B ∩ C.
4. Static and Dynamic Fusion According to Wu Li we have the following classification and definitions: x Static fusion means to combine all belief functions simultaneously. x Dynamic fusion means that the belief functions become available one after another sequentially, and the current belief function is updated by combining itself with a newly available belief function.
5. Summary of Library of Fusion Rules

The following 25 old and 19 new rules have been collected herein: Conjunctive, Disjunctive, Exclusive Disjunctive and Mixed Conjunctive-Disjunctive rules, the Conditional rule, Dempster's, Yager's, Smets' TBM rule, Dubois-Prade's, the Dezert-Smarandache classical and hybrid rules, Murphy's average rule, the Inagaki-Lefèvre-Colot-Vannoorenberghe Unified Combination rules (and, as particular cases, Inagaki's parameterized rule, the Weighting Average Operator (Vannoorenberghe), minC (M. Daniel), and the new Proportional Conflict Redistribution rules 1-5 and 3.4 (Smarandache-Dezert), among which PCR5 is the most exact way of redistributing the conflicting mass to non-empty sets following the path of the conjunctive rule), Zhang's Center Combination rule, Convolutive x-Averaging, the Consensus Operator (Jøsang), the Cautious Rule (Smets), the α-junctions rules (Smets), Yen's rule, the p-boxes method, Yao and Wong's Qualitative rule, Baldwin's rule, Besnard's rule, six new T-norm and T-conorm rules (Tchamova-Smarandache) adjusted from fuzzy sets, plus six new N-norm and N-conorm rules adjusted from neutrosophic logic (Smarandache), and the partial Unification of Fusion Rules (Smarandache, 2005). By introducing the degrees of union and of inclusion, besides that of intersection, with respect to the cardinality of sets (not from a fuzzy-set point of view), many of the above fusion rules can be improved.
Due to space limitations we are unable to present these rules or the fusion theories below. The reader can download the author's NASA presentation article for more information: http://xxx.lanl.gov/ftp/cs/papers/0410/0410033.pdf.
6. Summary of Fusion Theories
• Dempster-Shafer Theory of Evidence (1976)
• TBM (Transferable Belief Model) [P. Smets]
• Fuzzy Theory (Zadeh, 1965)
• Dezert-Smarandache Theory of Plausible, Uncertain, and Paradoxist Reasoning (2001)
• Neutrosophic Theory (Smarandache, 1995) – generalization of Fuzzy Theory
7. Algebraic Properties of Fusion Rules

Let R be a fusion rule, and m1(.), m2(.), …, ms(.) bba's.
P.1 R is commutative if R(m1, m2) = R(m2, m1).
P.2.1 R is associative if R(R(m1, m2), m3) = R(m1, R(m2, m3)).
P.2.2 R is quasi-associative if there exists an algorithm/method that transforms a non-associative rule into an associative one; e.g., rules based on the conjunctive rule followed by a transfer of the conflicting mass are quasi-associative, since one can store the conjunctive rule result for the combination with the next incoming evidence.
P.3.1 R is idempotent if R(m1, m1) = m1.
P.3.2 R is convergent towards idempotence (Smarandache, 2004) if lim_{k→∞} R(m1, …, m1) = m1 (m1 repeated k times).
P.4 R satisfies the Vacuous Belief Assignment (VBA), where the VBA or neutral element is m_VBA(total ignorance) = 1, if R(m1, VBA) = m1.
P.5 R satisfies the Markovian requirement (Smets) if R(m1, m2, …, ms) = R(R(m1, m2, …, m_{s−1}), ms).
P.6.1 R satisfies the majority opinion (Wu Li, 2004) if R(m1, m2, …, m2) ≈ m2.
P.6.2 R is convergent towards the majority opinion (Smarandache, 2004) if lim_{k→∞} R(m1, m2, …, m2) = m2 (m2 repeated k times).
P.7 R discounts the old sources if d[R(m1, m2, …, ms), m1] > d[R(m1, m2, …, m_{s−1}), m1] for ms ≠ m1, where the distance is d(m1, m2) = Σ_{X∈S^Θ} |m1(X) − m2(X)|.
P.8 Continuity of the rule R (Smarandache, 2004): ∀ε > 0, ∃δ = δ(ε) such that if |m1 − m2| ≤ ε then |R(m1, m3) − R(m2, m3)| ≤ δ. This property means smooth behavior of the rule.
P.9 Coherence of a rule: it has some justification for its construction and is able to provide fusion performances close to what human experts would expect as "rational" (this is relatively subjective) [J. Dezert].
8. Unification of Fusion Rules (UFR)

If a variable y is directly proportional to a variable p, then y = k·p, where k is a constant.
If a variable y is inversely proportional to a variable q, then y = k·(1/q); we can also say that y is directly proportional to the variable 1/q. In a general way, if y is directly proportional to the variables p1, p2, …, pm and inversely proportional to the variables q1, q2, …, qn, then

y = k·(p1·p2·…·pm)/(q1·q2·…·qn) = k·P/Q, where P = Π_{i=1}^{m} p_i and Q = Π_{j=1}^{n} q_j.
In a general definition, UFR is: m_UFR(∅) = 0, and ∀A ∈ S^Θ \ ∅ one has

m_UFR(A) = Σ_{X1, X2 ∈ S^Θ, X1*X2 = A} d(X1*X2)·T(X1, X2) + [P(A)/Q(A)] · Σ_{X ∈ S^Θ\A, X*A ∈ Tr} d(X*A)·T(A, X) / [P(A)/Q(A) + P(X)/Q(X)],
where * is an intersection or union of sets, d(X*Y) is the degree of intersection or union, T(X,Y) is a T-norm fusion combination rule (extension of conjunctive or disjunctive rules), Tr is the ensemble of sets (in majority cases they are empty sets) whose masses must be transferred, P(A) is the product of all parameters directly proportional with A, while Q(A) the product of all parameters inversely proportional with A.
9. Scenario of Unification of Fusion Theories
A. CHECK THE RELIABILITY OF SOURCES
B. FIND THE MODEL
C. CHOOSE THE FUSION RULES AND THEORIES
D. MORE CHECKS
Since everything depends on the application/problem to solve, this scenario looks like a logical chart designed by a programmer to write and implement a computer program, or like a cooking recipe. Here is the attempted scenario for a unification and reconciliation of the fusion theories and rules:
S.1 If all sources of information are reliable, then apply the conjunctive rule, which means consensus between them (or their common part).
S.2 If some sources are reliable and others are not, but we don't know which ones are unreliable, apply the disjunctive rule as a cautious method (and no transfer or normalization is needed).
S.3 If only one source of information is reliable, but we don't know which one, then use the exclusive disjunctive rule, based on the fact that X1 ∨ X2 ∨ … ∨ Xn means either X1 is reliable, or X2, and so on up to Xn, but not two or more at the same time.
S.4 If we have a mixture of the previous three cases, in any possible way, use the mixed conjunctive-disjunctive rule. For example, suppose we have four sources of information and we know that either the first two are telling the truth, or the third, or the fourth is telling the truth. The mixed formula becomes:
m_F(∅) = 0, and ∀A ∈ S^Θ \ ∅, one has

m_F(A) = Σ_{X1, X2, X3, X4 ∈ S^Θ, ((X1 ∩ X2) ∪ X3) ∪ X4 = A} m1(X1)·m2(X2)·m3(X3)·m4(X4).
S.5 If we know which sources are unreliable, we discount them. But if all sources are fully unreliable (100%), then the fusion result becomes the vacuous bba (i.e., m(Θ) = 1) and the problem is indeterminate. We need to get new sources that are reliable, or at least not fully unreliable.
S.6 If all sources are reliable, or the unreliable sources have been discounted (in the default case), then use the DSm classic rule (which is commutative, associative, Markovian) on the Boolean algebra (Θ, ∪, ∩, C), no matter what contradictions (or model) the problem has. I emphasize that the super-power set S^Θ generated by this Boolean algebra contains singletons, unions, intersections, and complements of sets.
S.7 If the sources are considered from a statistical point of view, use Murphy's average rule (and no transfer or normalization is needed).
S.8 In the case where the model is not known (the default case), it is prudent/cautious to use the free model (i.e., all intersections between the elements of the frame of discernment are non-empty) and the DSm classic rule on S^Θ; later, if the model is found out (i.e., the constraints of empty intersections become known), one can adjust the conflicting mass at any time/moment using the DSm hybrid rule.
S.9 Now suppose the model becomes known (i.e., we find out about the contradictions (= empty intersections) or consensus (= non-empty intersections) of the problem/application). Then:
S.9.1 If an intersection A ∩ B is not empty, we keep the mass m(A ∩ B) on A ∩ B, which means consensus (common part) between the two hypotheses A and B (i.e., both hypotheses A and B are right) [here one gets DSmT].
S.9.2 If the intersection A ∩ B = ∅ is empty, meaning contradiction, we do the following:
S.9.2.1 if one knows that between these two hypotheses A and B one is right and the other is false, but we don't know which one, then one transfers the mass m(A ∩ B) to m(A ∪ B), since A ∪ B means that at least one is right [here one gets Yager's rule if n = 2, or Dubois-Prade, or DSmT];
S.9.2.2 if one knows that between these two hypotheses A and B one is right and the other is false, and we know which one is right (say hypothesis A is right and B is false), then one transfers the whole mass m(A ∩ B) to hypothesis A (nothing is transferred to B);
S.9.2.3 if we don't know much about them, but one has an optimistic view of hypotheses A and B, then one transfers the conflicting mass m(A ∩ B) to A and B (the nearest specific sets in the specificity chains) (using Dempster's rule, PCR2-5);
S.9.2.4 if we don't know much about them, but one has a pessimistic view of hypotheses A and B, then one transfers the conflicting mass m(A ∩ B) to A ∪ B (the more pessimistic one is, the further one gets in the specificity chain: (A ∩ B) ⊂ A ⊂ (A ∪ B) ⊂ I_t). This is also the default case [using DP's rule, the DSm hybrid rule, Yager's rule];
S.9.2.5 if one has a very pessimistic view of hypotheses A and B, then one transfers the conflicting mass m(A ∩ B) to the total ignorance in a closed world [Yager's, DSmT], or to the empty set in an open world [TBM];
S.9.2.5.1 if one considers that no hypothesis between A and B is right, then one transfers the mass m(A ∩ B) to other non-empty sets (in the case where more hypotheses exist in the frame of discernment) different from A, B, A ∪ B — for the reason that if A and B are not right, there is a bigger chance that other hypotheses in the frame of discernment have a higher subjective probability to occur. We do this transfer in a closed world (DSm hybrid rule), but in an open world we can transfer the mass m(A ∩ B) to the empty set, leaving room for new possible hypotheses [here one gets TBM];
S.9.2.5.2 if one considers that none of the hypotheses A, B is right and no other hypothesis exists in the frame of discernment (i.e., n = 2 is the size of the frame of discernment), then one considers the open world and transfers the mass to the empty set [here DSmT and TBM converge on each other].
Of course, this procedure is extended to any intersection of two or more sets, A ∩ B ∩ C, etc., and even to mixed sets, A ∩ (B ∪ C), etc. If it is a dynamic fusion in real time, and associativity and/or a Markovian process are needed, use an algorithm that transforms a rule (which is based on the conjunctive rule and the transfer of the conflicting mass) into an associative and Markovian rule by storing the previous result of the conjunctive rule and, depending on the rule, other data. Such rules are called quasi-associative and quasi-Markovian. Some applications require decaying the old sources because their information is considered to be worn out. If some scalar bba is not normalized (i.e., the sum of its components is < 1 as in incomplete information, or > 1 as in paraconsistent information), we can easily divide each component by the sum of the components and normalize it. But it is also possible to fuse incomplete and paraconsistent masses and then normalize them after fusion, or to leave them unnormalized since they are incomplete or paraconsistent. PCR5 (Smarandache-Dezert) does the most mathematically exact (in the fusion literature) redistribution of the conflicting mass to the elements involved in the conflict, a redistribution that exactly follows the tracks of the conjunctive rule. Here is its formula: ∀X ∈ S^Θ \ ∅,

m_PCR5(X) = m12(X) + Σ_{Y ∈ S^Θ \ {X}, c(X ∩ Y) = ∅} [ m1(X)^2·m2(Y) / (m1(X) + m2(Y)) + m2(X)^2·m1(Y) / (m2(X) + m1(Y)) ],
where c(X) is the conjunctive normal form of X, m12(X) is the conjunctive rule result of X, and all denominators are non-null. If one is null, its fraction is discarded.
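The following sketch implements this two-source PCR5 redistribution for bbas given as dictionaries mapping frozensets to masses, with exclusive singleton hypotheses so that the intersection of two distinct focal elements is empty. It is a minimal illustration written for this chapter, not the author's software; the example masses are arbitrary.

```python
from itertools import product

def pcr5(m1, m2):
    """Two-source PCR5: products of masses go to the intersection of the focal
    elements when it is non-empty (conjunctive rule); each conflicting product
    m1(X)m2(Y), with X and Y disjoint, is redistributed back to X and Y
    proportionally to m1(X) and m2(Y)."""
    out = {}
    for (X, a), (Y, b) in product(m1.items(), m2.items()):
        inter = X & Y
        if inter:
            out[inter] = out.get(inter, 0.0) + a * b
        elif a + b > 0.0:
            out[X] = out.get(X, 0.0) + a * a * b / (a + b)
            out[Y] = out.get(Y, 0.0) + a * b * b / (a + b)
    return out

A, B = frozenset({"A"}), frozenset({"B"})
m1 = {A: 0.6, B: 0.4}
m2 = {A: 0.2, B: 0.8}
print(pcr5(m1, m2))   # approximately {A: 0.352, B: 0.648}; the masses still sum to 1
```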
10. Example
Let Θ = {A, B, C, D, E} be the frame of discernment. We present an example that passes through many possibilities. Suppose m1(A) = 0.2, m1(B) = 0, m1(C) = 0.3, m1(D) = 0.4, m1(E) = 0.1, and m2(A) = 0.5, m2(B) = 0.2, m2(C) = 0.1, m2(D) = 0, m2(E) = 0.2. Suppose both sources are reliable; then we use the conjunctive rule and we get: m12(A) = 0.10, m12(B) = 0, m12(C) = 0.03, m12(D) = 0, m12(E) = 0.02, and also: m12(A∩B) = 0.04, m12(A∩C) = 0.17, m12(A∩D) = 0.20, m12(A∩E) = 0.09, m12(B∩C) = 0.06, m12(B∩D) = 0.08, m12(B∩E) = 0.02, m12(C∩D) = 0.04, m12(C∩E) = 0.07, m12(D∩E) = 0.08.
For the redistribution of the intersection masses, let's suppose that:
a. A∩B ≠ ∅, i.e., consensus (common part) between A and B; hence the mass m12(A∩B) = 0.04 remains on the intersection: m_UFT(A∩B) = 0.04.
b. A∩C = ∅, i.e., contradiction between A and C, but we are optimistic about both of them; then we can transfer the mass 0.17 to A and C (using PCR5, but other rules can also be used, such as Dempster's, PCR3, etc.): the redistributed masses are mr(A) = 0.107 and mr(C) = 0.063, where mr(X) means the redistributed mass gained by the set X at the respective step.
c. A∩D = ∅, and suppose we know that one hypothesis is right and one is wrong, but we don't know which one; then the mass 0.20 is transferred to mr(A∪D).
d. A∩E = ∅, and suppose we know that A is right and E is wrong; then the whole mass 0.09 is transferred to A only, i.e., mr(A) = 0.09.
e. We don't know whether B∩C = ∅ or ≠ ∅; therefore the model is unknown, hence we keep the mass 0.06 on B∩C just in case we might find out more information about the model (this is considered the default model).
f. B∩D = ∅, but we don't know any relationship between B and D; hence, in a prudent way, we transfer the mass 0.08 to the uncertainty: mr(B∪D) = 0.08.
g. B∩E ≠ ∅, the intersection is not empty, but suppose neither B∩E nor B∪E interests us; then we can transfer the mass 0.02 to B and E (using PCR5), hence mr(B) = 0.013 and mr(E) = 0.007.
h. C∩D = ∅, and suppose we are pessimistic about both C and D; then the mass 0.04 is transferred to C∪D, i.e., mr(C∪D) = 0.04.
i. C∩E = ∅, and suppose we are very pessimistic about both C and E; then we cautiously transfer the mass of this intersection, 0.07, to the total ignorance: mr(A∪B∪C∪D∪E) = 0.07.
j. D∩E = ∅, and suppose we know that both D and E are wrong; then its mass 0.08 is redistributed among A, B, C equally: mr(A) = mr(B) = mr(C) = 0.027.
Then one sums the masses of the conjunctive rule m12 and the redistributed conflicting masses mr (according to the information we supposedly have on each intersection, model, and relationship between conflicting hypotheses) to get the mass of the Unification of Fusion Theories, m_UFT, and we get: m_UFT(A) = 0.324, m_UFT(B) = 0.040, m_UFT(C) = 0.119, m_UFT(D) = 0, m_UFT(E) = 0.027, m_UFT(A∩B) = 0.04, m_UFT(B∩C) = 0.06, m_UFT(A∪D) = 0.20, m_UFT(B∪D) = 0.08, m_UFT(C∪D) = 0.04, m_UFT(A∪B∪C∪D∪E) = 0.07, m_UFT(∅) = 0.
m_UFT, the Unification of Fusion Theories rule, is a combination of many rules and gives the optimal redistribution of the conflicting mass for each particular problem, following the given model and the relationships between hypotheses. This extra
information allows the choice of the combination rule to be used for each intersection. The algorithm is presented above.
m_lower, the lower-bound belief assignment, the most pessimistic/prudent belief, is obtained by transferring the whole conflicting mass to the total ignorance (Yager's rule) in a closed world, or to the empty set (Smets' TBM) in an open world, herein meaning that other hypotheses might belong to the frame of discernment. For the previous example we have: m_lower(A) = 0.10, m_lower(B) = 0, m_lower(C) = 0.03, m_lower(D) = 0, m_lower(E) = 0.02, and m_lower(A∪B∪C∪D∪E) = 0.85 in a closed world, or m_lower(∅) = 0.85 in an open world.
m_middle, or the default case, the middle belief assignment, half optimistic and half pessimistic, is obtained by transferring the partial conflicting masses m12(X∩Y) to the partial ignorance X∪Y (as in Dubois-Prade's rule or, more generally, as in Dezert-Smarandache theory). For the previous example we have: m_middle(A) = 0.10, m_middle(B) = 0, m_middle(C) = 0.03, m_middle(D) = 0, m_middle(E) = 0.02, m_middle(A∪B) = 0.04, m_middle(A∪C) = 0.17, m_middle(A∪D) = 0.20, m_middle(A∪E) = 0.09, m_middle(B∪C) = 0.06, m_middle(B∪D) = 0.08, m_middle(B∪E) = 0.02, m_middle(C∪D) = 0.04, m_middle(C∪E) = 0.07, m_middle(D∪E) = 0.08.
m_upper, the upper-bound belief assignment, the most optimistic (least prudent) belief, is obtained by transferring the masses of the intersections (empty or non-empty) to the elements of the frame of discernment using the PCR5 rule of combination, i.e., m12(X∩Y) is split between the elements X and Y. We use PCR5 because it is mathematically more exact (following backwards the tracks of the conjunctive rule) than Dempster's rule, minC, and PCR1-4. For the previous example we have: m_upper(A) = 0.400, m_upper(B) = 0.084, m_upper(C) = 0.178, m_upper(D) = 0.227, m_upper(E) = 0.111.
11. Conclusion
The Unification of Fusion Theories (UFT) and partial Unification of Fusion Rules (UFR) are presented in this short article. They combine existing and new fusion rules and theories in an attempt to provide an optimal fusion for practical applications. The partial or total conflicting masses are better redistributed if we have more information about the sources and the relationship between hypotheses in conflict. It is possible to do a prudent/ pessimistic (low belief) transfer, or average optimistic (middle belief) transfer, or most optimistic (less prudent) transfer.
References

[1] D. Dubois, H. Prade. On the combination of evidence in various mathematical frameworks. In: Reliability Data Collection and Analysis (J. Flamm and T. Luisi, eds.), Brussels, ECSC, EEC, EAFC, pp. 213–241, 1992.
[2] J. W. Guan, D. A. Bell. Generalizing the Dempster-Shafer Rule of Combination to Boolean Algebras. IEEE, pp. 229–236, 1993.
[3] T. Inagaki. Interdependence between safety-control policy and multiple-sensor schemes via Dempster-Shafer theory. IEEE Transactions on Reliability, Vol. 40, No. 2, pp. 182–188, 1991.
[4] E. Lefèvre, O. Colot, P. Vannoorenberghe. Belief functions combination and conflict management. Information Fusion Journal, Elsevier, Vol. 3, No. 2, pp. 149–162, 2002.
[5] C. K. Murphy. Combining belief functions when evidence conflicts. Decision Support Systems, Elsevier, Vol. 29, pp. 1–9, 2000.
[6] K. Sentz, S. Ferson. Combination of evidence in Dempster-Shafer Theory. SANDIA Tech. Report SAND2002-0835, 96 pages, April 2002, www.sandia.gov/epistemic/Reports/SAND2002-0835.pdf.
[7] G. Shafer. A Mathematical Theory of Evidence. Princeton Univ. Press, Princeton, NJ, 1976.
[8] F. Smarandache, J. Dezert (Editors). Applications and Advances of DSmT for Information Fusion. Am. Res. Press, Rehoboth, 2004, http://www.gallup.unm.edu/~smarandache/DSmT-book1.pdf.
[9] F. Smarandache, J. Dezert. Proportional Conflict Redistribution Rules. arXiv Archives, http://xxx.lanl.gov/PS_cache/cs/pdf/0408/0408064.pdf.
[10] P. Smets. Quantified Epistemic Possibility Theory seen as an Hyper Cautious Transferable Belief Model, http://iridia.ulb.ac.be/~psmets.
[11] R. R. Yager. Hedging in the combination of evidence. Journal of Information & Optimization Sciences, Analytic Publishing Co., Vol. 4, No. 1, pp. 73–81, 1983.
[12] F. Voorbraak. On the justification of Dempster's rule of combination. Artificial Intelligence, 48, pp. 171–197, 1991.
[13] R. R. Yager. On the relationships of methods of aggregation of evidence in expert systems. Cybernetics and Systems, Vol. 16, pp. 1–21, 1985.
[14] L. Zadeh. Review of "A Mathematical Theory of Evidence" by Glenn Shafer. AI Magazine, Vol. 5, No. 3, pp. 81–83, 1984.
[15] L. Zadeh. A simple view of the Dempster-Shafer theory of evidence and its implication for the rule of combination. AI Magazine, Vol. 7, No. 2, pp. 85–90, 1986.
Belief Functions Theory for Multisensor Data Fusion
Patrick VANNOORENBERGHE
Université Paul Sabatier, Toulouse 3, UFR PCA
Laboratoire de Télédétection à Haute Résolution
118, route de Narbonne, 31062 Toulouse cedex 4, France
[email protected]
Abstract. Sensors are mainly combined in order to benefit from their complementarity. Different kinds of advantages may be expected, such as the ability to face a larger set of situations, the improvement of discrimination capacity, or simply time saving. When analyzing a situation, the available sensors are most often used under conditions that include uncertainties at different levels. In this paper, belief functions theory, a mathematical toolbox which allows one to represent both imprecision and uncertainty, is used to represent, manage and reason with such uncertainties (imprecise measurements, ambiguous observations in space or in time, incomplete or poorly defined prior knowledge). Practical examples of how to use this theoretical framework in detection-recognition problems are provided. Belief functions have nice properties, such as the possibility to quantify that none of the original hypotheses is supported, that the values of some 'likelihoods' are unknown, and that we can accept an a priori belief that really represents total ignorance. Several applications where belief functions have been successfully applied for multisensor data fusion are finally presented.
Keywords. Belief functions, uncertainty management, information fusion, pattern recognition, multisensor data processing
Introduction
Information fusion has been the object of much research over the last few years [1,21,33,16,25,26,8,24,23]. Generally, it is based on confidence measure theories (possibility theory, evidence theory, probability theory and fuzzy set theory) to represent imprecision and uncertainty, and it provides techniques and methods for:
• integrating data from multiple sources and using the complementarity of the available data to derive maximum information about the phenomenon being observed,
• analyzing and deriving the meaning of these observations (achieving more reliable information),
• selecting the best course of action (improving decision making) and
• controlling the actions.
Data fusion is used in many application fields, such as multisensor fusion [3,35], image processing and analysis [33,16,25,26,23,7,32], classification [13,45,14] or target tracking [4]. It takes into account heterogeneous information (numerical or symbolic) which is often imperfect (imprecise, uncertain and incomplete) and modelled by means of sources which have to be combined or aggregated. Sensors are mainly combined in order to benefit from their complementarity. Different kinds of advantages may be expected, such as:
• the ability to face a larger set of situations, as one sensor may be efficient while another one is not, because of particular counter-measures, physical phenomena, conditions of observation or lack of suitable knowledge (learning, ...),
• saving of time thanks to task sharing and cooperation between specific functions,
• improvement of the discrimination capacity as a result of combining observations when only partial information is locally available (classification, localization, ...).
Consequently, when analyzing a situation, the available sensors are most often used under conditions that include uncertainties at different levels:
• measurements are imprecise, erroneous, incomplete or ill-suited to the problem,
• observations may be ambiguous, either in space or in time (position, velocity or feature measurements provided by two different sensors are not necessarily related to the same object, ...),
• prior knowledge (generated by learning, models, descriptions) may be incomplete, poorly defined, and especially more or less representative of reality, in particular in light of the varying context.
The problems presented in this paper are addressed within belief functions theory, which provides the best-suited toolbox for the processing and fusion of the data considered. Furthermore, this mathematical framework is the most federative in terms of synergy between the different confidence measures. Section 1 gives the basic notions of the theory, including reasoning with both imprecision and uncertainty, rules to combine uncertain data, the management of heterogeneous frames of discernment, and decision making. The specific problem of pattern recognition met in multisensor systems (for target discrimination) is then considered in Section 2, where three families of approaches are presented to infer belief functions from observed data. Finally, several applications where belief functions have been successfully applied are proposed in Section 3.
1. Background Materials on Belief Functions Belief functions theory (or Dempster-Shafer theory: DST) is initially based on Dempster’s work [11] concerning lower and upper probability distribution families. From these mathematical foundations, Shafer [37] has shown the ability of
the belief functions to model uncertain knowledge. The usefulness of belief functions, as an alternative to subjective probabilities, was later demonstrated axiomatically by Smets [40,41] with the Transferable Belief Model (TBM), giving a clear and coherent interpretation of the underlying concepts of the theory.
1.1. Belief functions as mathematical objects
Let Ω = {ω1, . . . , ωk, . . . , ωK} be a finite space, and let 2^Ω be its power set. A belief function defined on Ω can be mathematically defined by introducing a set function, called the basic belief assignment (bba), mΩ : 2^Ω → [0, 1], which satisfies:

Σ_{A⊆Ω} mΩ(A) = 1.    (1)
Each subset A ⊆ Ω such that mΩ(A) > 0 is called a focal element of mΩ. A bba mΩ such that mΩ(∅) = 0 is said to be normal. This requirement was originally imposed by Shafer [37], but may be relaxed if one accepts the open-world assumption stating that the set Ω might not be complete. Given this bba, a belief function belΩ and a plausibility function plΩ can be defined, respectively, as:

belΩ(A) = Σ_{∅≠B⊆A} mΩ(B), ∀ A ⊆ Ω.    (2)

plΩ(A) = Σ_{B : A∩B≠∅} mΩ(B), ∀ A ⊆ Ω.    (3)

Whereas belΩ(A) represents the amount of support given to subset A, the potential amount of support that could be given to A is measured by plΩ(A). A belief function (a Choquet [9] capacity monotone of infinite order) belΩ can also be mathematically defined as a function from 2^Ω to [0, 1] satisfying belΩ(∅) = 0 and, for all n ≥ 1 and all A1, . . . , An ⊆ Ω,

belΩ(A1 ∪ . . . ∪ An) ≥ Σ_{I⊆{1,...,n}, I≠∅} (−1)^{|I|+1} belΩ(∩_{i∈I} Ai).    (4)

As such these inequalities are hardly meaningful, but the special case with n = 2 and A1 ∩ A2 = ∅ is worth considering:

belΩ(A1 ∪ A2) ≥ belΩ(A1) + belΩ(A2), ∀ A1, A2 ⊆ Ω.    (5)
This last relation just illustrates that the belief given to the union of two disjoint subsets A1 and A2 of Ω is larger or equal to the sum of the beliefs given to each subset individually. When all the inequalities of relation (4) are replaced by equalities, the resulting function belΩ would then be a classical probability function.
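The sketch below evaluates belΩ and plΩ from a bba stored as a dictionary of frozensets, directly applying Eqs. (2) and (3). It is a minimal illustration, not part of the original text; the bba used is the one that also appears in Table 1 below.

```python
def bel(m, A):
    """bel(A): sum of the masses of the non-empty focal elements included in A (Eq. 2)."""
    return sum(v for B, v in m.items() if B and B <= A)

def pl(m, A):
    """pl(A): sum of the masses of the focal elements intersecting A (Eq. 3)."""
    return sum(v for B, v in m.items() if B & A)

# bba on Omega = {w1, w2, w3} (the one of Table 1)
m = {frozenset({"w1"}): 0.2, frozenset({"w2"}): 0.1,
     frozenset({"w1", "w3"}): 0.4, frozenset({"w1", "w2", "w3"}): 0.3}
print(round(bel(m, frozenset({"w1", "w2"})), 3), round(pl(m, frozenset({"w1"})), 3))   # 0.3 0.9
```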
Among the functions derived from mΩ introduced in Shafer's book [37], the commonality function qΩ is defined as:

qΩ(A) = Σ_{B⊇A} mΩ(B), ∀ A ⊆ Ω.    (6)
All these functions belΩ, plΩ, qΩ and mΩ are in one-to-one correspondence and represent different facets of the same piece of information. We can retrieve each function from the others using the fast Möbius transform [27]. The full notation for belΩ and its related functions is:

belΩ_{Y,t}{x}[EC_{Y,t}](ω0 ∈ A) = λ.    (7)
It denotes that the degree of belief held by the agent Y (shortcut for You) at time t that the actual world ω0 (the possible value of a variable x) belongs to the set A of worlds is equal to λ, where A is a subset of the frame of discernment Ω. The belief is based on the evidential corpus EC_{Y,t} held by Y at t, where EC_{Y,t} represents all that agent Y knows at t. Fortunately, in practice many indices can be omitted for simplicity's sake, as is done with the domain Ω in the sequel of this paper. Let us suppose a variable x taking values in the finite and unordered set Ω called the frame of discernment. Partial knowledge regarding the actual value taken by x can be represented by a bba m{x}. Complete ignorance corresponds to m{x}(Ω) = 1, called the vacuous bba, and perfect knowledge of the value of x can be represented by the allocation of the whole mass of belief to a unique singleton of Ω (m{x} is then called a certain bba). Another particular case is that where all focal sets of m are singletons: m is then equivalent to a probability function and is called a Bayesian bba.
1.2. Rules of combination for data fusion
Let m1 and m2 be two bba's defined on the same frame Ω. Suppose that the two bba's are induced by two distinct pieces of evidence. Then the joint impact of the two pieces of evidence can be expressed by the conjunctive rule of combination, which results in the bba:

m∩(A) = (m1 ∩ m2)(A) = Σ_{B∩C=A} m1(B) m2(C).    (8)
This rule is sometimes referred to as the (unnormalized) Dempster's rule of combination. If necessary, the normality assumption m∩(∅) = 0 may be recovered by dividing each mass by a normalization coefficient. The resulting operator, which is known as Dempster's rule and denoted by m⊕, is defined as:

m⊕(A) = (m1 ⊕ m2)(A) = (m1 ∩ m2)(A) / (1 − m∩(∅)), ∀ ∅ ≠ A ⊆ Ω,    (9)

where the quantity m∩(∅) is called the degree of conflict between m1 and m2 and can be computed using:
m∩(∅) = (m1 ∩ m2)(∅) = Σ_{B∩C=∅} m1(B) m2(C).    (10)
The use of Dempster's rule is possible only if m1 and m2 are not totally conflicting, i.e., if there exist two focal elements B and C of m1 and m2 satisfying B ∩ C ≠ ∅. This rule verifies some interesting properties (associativity, commutativity, non-idempotence) and its use has been justified theoretically by several authors [43,29,18] according to specific axioms.
1.2.1. Notes about conflict
The normalization in Dempster's rule redistributes conflicting belief masses to non-conflicting ones, and thereby tends to eliminate any conflicting characteristics in the resulting belief mass distribution. The non-normalized Dempster's rule avoids this particular problem by allocating all conflicting belief masses to the empty set. In [38], Smets explains this by arguing that the presence of highly conflicting beliefs indicates that some possible event must have been overlooked (the open world assumption) and therefore is missing from the frame of discernment. The idea is that conflicting belief masses should be allocated to this missing (empty) event. Smets has also proposed to interpret the amount of belief mass allocated to the empty set as a measure of conflict between separate beliefs. Another approach to eliminating conflicts from Dempster's rule is to replace ∩ by ∪ in Eq. (8), which produces the Disjunctive (or Dual Dempster's) Rule [19], defined as:

m∪(A) = (m1 ∪ m2)(A) = Σ_{B∪C=A} m1(B) m2(C), ∀ A ⊆ Ω.    (11)
The interpretation of the (conjunctive) Dempster's rule is that both beliefs to be combined are assumed to be correct, while at least one of them is assumed to be correct in the case of the Disjunctive Rule. Unfortunately, while the Disjunctive rule has some nice theoretical properties, its disadvantage is that the non-specificity of beliefs is increased by an application of the rule: more and more belief mass is assigned to larger subsets of Ω and to the whole Ω. The drastic case is the combination of any belief function with the vacuous one, where the result is always the vacuous belief function. Let q1 and q2 denote the commonality functions related to two bba's m1 and m2 induced by distinct items of evidence. The conjunctive combination of these two pieces of evidence (m∩ = m1 ∩ m2) can be computed from q1 and q2 as:

q∩(A) = q1(A) q2(A), ∀ A ⊆ Ω.    (12)
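As an illustration of rules (8)–(10), the sketch below combines two bbas with the unnormalized conjunctive rule and then applies Dempster's normalization. It is a minimal sketch using the same dictionary-of-frozensets representation as above; the two example bbas are arbitrary and not taken from the text.

```python
def conjunctive(m1, m2):
    """Unnormalized conjunctive rule (Eq. 8): each product m1(B)m2(C) is
    transferred to the intersection B ∩ C (possibly the empty set)."""
    out = {}
    for B, a in m1.items():
        for C, b in m2.items():
            out[B & C] = out.get(B & C, 0.0) + a * b
    return out

def dempster(m1, m2):
    """Dempster's rule (Eq. 9): conjunctive combination followed by
    normalization by 1 - m(∅), where m(∅) is the degree of conflict (Eq. 10)."""
    m = conjunctive(m1, m2)
    conflict = m.pop(frozenset(), 0.0)
    if conflict >= 1.0:
        raise ValueError("totally conflicting evidence: Dempster's rule undefined")
    return {A: v / (1.0 - conflict) for A, v in m.items()}

w1, w2 = "w1", "w2"
m1 = {frozenset({w1}): 0.7, frozenset({w1, w2}): 0.3}
m2 = {frozenset({w2}): 0.5, frozenset({w1, w2}): 0.5}
print(dempster(m1, m2))   # the degree of conflict here is 0.35; normalized masses sum to 1
```

Replacing B & C by B | C in the first function yields the disjunctive rule (11), and multiplying commonalities reproduces property (12).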
1.2.2. The Weighted Operator
The Weighted Operator, denoted WO, has been created to overcome the sensitivity problem of Dempster's rule, which produces unexpected results when evidence conflicts [31]. The idea of the weighted operator is to distribute the conflicting belief mass m(∅) over some subsets of Ω according to additional knowledge. More precisely, a part of the mass m(∅) is assigned to a subset A ⊆ Ω according to a
weighting factor denoted w. This weighting factor can be a function of the considered subset A and of the belief functions m = {mj, j = 1, · · · , J} which are involved in the combination and have caused the conflict. This idea is formalized in the following definition of the Weighted Operator.
Definition 1 (Weighted Operator) Let m = {mj, j = 1, · · · , J} be the set of belief functions defined on Ω to be combined. The combination of the belief functions m with the weighted operator is defined as:

m_WO(∅) = w(∅, m) · m∩(∅),    (13)

m_WO(A) = m∩(A) + w(A, m) · m∩(∅), ∀ A ≠ ∅.    (14)

In the definition of the weighted operator, the first term of equation (14), m∩(A), corresponds to the conjunctive rule of combination. The second one is the part of the conflicting mass assigned to each subset A and added to the conjunctive term. The notation has been chosen to highlight these two aspects. The weighting factors w(A, m) ∈ [0, 1] are coefficients which depend on each subset A ⊆ Ω and on the belief functions m to be combined. They must be constrained by Σ_{A⊆Ω} w(A, m) = 1 so as to respect the property that the sum of mass functions
must be equal to 1 (cf. Eq.(1)). In order to completely define this operator, we need additional information to choose the values of w(., m) which allow to have a particular behavior of the operator. This generic framework allows Dempster’s rule of combination and other proposed by Smets [38], Yager [44] and Dubois and Prade [20] to be rewritten. For each operator, we only have to define the weighting factors w(A, m) associated to each subset A ⊆ Ω. For example, the unnormalized Dempster’s rule is no more than the weighted operator with w(A, m) = 0 for all A ⊆ Ω\{∅} and w(∅, m) = 1. This is the open world assumed by Smets. Yager [44] assumes that the frame of discernment Ω is exhaustive but its idea consists in assigning the conflicting mass m(∅) to the whole set Ω. According to the weighted operator previously presented, it is easy to reformulate the Yager’s idea in setting w(Ω, m) = 1 and w(∅, m) = 0. According to the choice of weights w, we can define a family of weighted operators. Another operator of this family is the proportionalized combination which has been proposed by Daniel in [10]. 1.3. Discounting An α-discounted bba mα (.) can be obtained from the original bba m as follows: mα (A) = αm(A)
∀ A ⊆ Ω, A ≠ Ω,    (15)

mα(Ω) = 1 − α + α m(Ω),    (16)
with 0 ≤ α ≤ 1. The discounting operation is useful when the source of information from which m has been derived is not fully reliable, in which case coefficient α represents some form of meta-knowledge about the source reliability, which could not be encoded in m.
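A minimal sketch of this discounting operation, using the same dictionary representation as in the previous sketches; the frame and the value α = 0.5 are those of Table 1 below.

```python
def discount(m, alpha, frame):
    """Alpha-discounting (Eqs. 15-16): every mass is scaled by the reliability
    alpha, and the remaining 1 - alpha is transferred to the whole frame."""
    omega = frozenset(frame)
    out = {A: alpha * v for A, v in m.items() if A != omega}
    out[omega] = 1.0 - alpha + alpha * m.get(omega, 0.0)
    return out

m = {frozenset({"w1"}): 0.2, frozenset({"w2"}): 0.1,
     frozenset({"w1", "w3"}): 0.4, frozenset({"w1", "w2", "w3"}): 0.3}
print(discount(m, 0.5, {"w1", "w2", "w3"}))   # reproduces the discounted (alpha = 0.5) columns of Table 1
```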
A ⊆ Ω        m     bel    pl     BetPm    m.5    bel    pl     BetPm.5
{ω1}         0.2   0.2    0.9    0.5      0.10   0.10   0.95   0.416
{ω2}         0.1   0.1    0.4    0.2      0.05   0.05   0.70   0.266
{ω1, ω2}     0.0   0.3    1.0    -        0.00   0.15   1.00   -
{ω3}         0.0   0.0    0.7    0.3      0.00   0.00   0.85   0.316
{ω1, ω3}     0.4   0.6    0.9    -        0.20   0.30   0.95   -
{ω2, ω3}     0.0   0.1    0.8    -        0.00   0.05   0.90   -
Ω            0.3   1.0    1.0    -        0.65   1.00   1.00   -

Table 1. Example of bba, belief, plausibility and pignistic probability functions.
1.4. Pignistic transformation
In the TBM, we distinguish the credal level, where beliefs are entertained (formalized, revised and combined), and the pignistic level, used for decision making. Based on rationality arguments developed in the TBM, Smets proposes to transform m into a probability function BetPm on Ω (called the pignistic probability function), defined for all ωk ∈ Ω as:

BetPm(ωk) = Σ_{A∋ωk} m(A) / (|A| (1 − m(∅))),    (17)

where |A| denotes the cardinality of A ⊆ Ω and BetPm(A) = Σ_{ω∈A} BetPm(ω), ∀ A ⊆ Ω. In this transformation, the mass of belief m(A) is distributed equally among the elements of A [41]. An example of such a transformation is given in Table 1.
1.5. The Generalized Bayesian Theorem
Let us suppose two finite spaces: X, the observation space, and Θ, the unordered parameter space. The Generalized Bayesian Theorem (GBT), an extension of Bayes' theorem within the TBM, consists in defining a belief function on Θ given an observation x ⊆ X, the set of conditional bbas mX[θi] over X, one for each θi ∈ Θ, and a vacuous a priori on Θ. Given this set of bbas (which can be associated with their related belief or plausibility functions), then for x ⊆ X and ∀ A ⊆ Θ, we have:

plΘ[x](A) = 1 − Π_{θi∈A} (1 − plX[θi](x)).    (18)
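The sketch below applies the pignistic transformation (17) to the bba of Table 1 (a normal bba, so m(∅) = 0). It is a minimal illustration written for this chapter, not the TBM software.

```python
def betp(m):
    """Pignistic transformation (Eq. 17): each mass m(A) is shared equally
    among the elements of A, after division by 1 - m(∅)."""
    conflict = m.get(frozenset(), 0.0)
    out = {}
    for A, v in m.items():
        for w in A:
            out[w] = out.get(w, 0.0) + v / (len(A) * (1.0 - conflict))
    return out

m = {frozenset({"w1"}): 0.2, frozenset({"w2"}): 0.1,
     frozenset({"w1", "w3"}): 0.4, frozenset({"w1", "w2", "w3"}): 0.3}
print(betp(m))   # w1 ≈ 0.5, w2 ≈ 0.2, w3 ≈ 0.3 — the BetPm column of Table 1
```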
1.6. Uncertainty in DST Because a belief function can represent several kinds of knowledge, it constitutes a rich and flexible way to represent uncertainty. As remarked by Klir [30], a belief function can model two different kinds of uncertainty: nonspecificity and conflict. A measure of nonspecificity, which generalizes the Hartley measure to belief functions, was introduced by Dubois and Prade [17]. It is defined as:
N(m) = Σ_{A⊆Ω} m(A) log2 |A|.    (19)
Since focal elements of probability measures are singletons, nonspecificity is null for probability functions, and it is maximal (log2 |Ω|) for the vacuous belief function. Several measures of conflict, viewed as generalized Shannon entropy measures, have also been introduced [30]. One such measure is discord, defined as:

D(m) = − Σ_{A⊆Ω} m(A) log2 BetPm(A),    (20)

which is maximal (log2 |Ω|) for the uniform probability distribution on Ω. Finally, a measure Uλ of total uncertainty can be defined using a linear combination of N and D:

Uλ(m) = (1 − λ) N(m) + λ D(m),    (21)

where λ ∈ [0, 1] is a coefficient. The choice of λ is not theoretically justified (Klir recommends taking λ = 0.5). In the sequel, we shall see that it can be used as a regularization parameter and determined from learning data.
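A minimal sketch of the nonspecificity, discord and total uncertainty measures (19)–(21), assuming a normal bba represented as in the previous sketches; the helper betp() recomputes the pignistic probabilities of Eq. (17).

```python
from math import log2

def betp(m):
    """Pignistic probabilities (Eq. 17) of a normal bba (m(∅) = 0)."""
    out = {}
    for A, v in m.items():
        for w in A:
            out[w] = out.get(w, 0.0) + v / len(A)
    return out

def nonspecificity(m):
    """N(m) = sum of m(A) log2|A| (Eq. 19): zero for a Bayesian bba,
    log2|Omega| for the vacuous bba."""
    return sum(v * log2(len(A)) for A, v in m.items() if A)

def discord(m):
    """D(m) = -sum of m(A) log2 BetP(A) (Eq. 20)."""
    p = betp(m)
    return -sum(v * log2(sum(p[w] for w in A)) for A, v in m.items() if A and v > 0)

def total_uncertainty(m, lam=0.5):
    """U_lambda(m) = (1 - lambda) N(m) + lambda D(m) (Eq. 21)."""
    return (1.0 - lam) * nonspecificity(m) + lam * discord(m)

vacuous = {frozenset({"w1", "w2", "w3"}): 1.0}
print(nonspecificity(vacuous), discord(vacuous))   # log2(3) ≈ 1.585 and 0.0
```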
1.7. Refinement and Coarsening
Part of the flexibility of DST is due to the existence of justified mechanisms allowing one to change the level of detail, or granularity, of the frame of discernment. In this section, we briefly recall the concepts of refinement and coarsening of a frame of discernment ([37], p. 115), which play a key role in the theory. Let Ω and Θ be two finite sets. A mapping ρ from 2^Θ to 2^Ω is called a refining if and only if it verifies:

ρ({θ}) ≠ ∅, ∀ θ ∈ Θ,
ρ({θ}) ∩ ρ({θ'}) = ∅, ∀ θ ≠ θ',
∪_{θ∈Θ} ρ({θ}) = Ω.
In other terms, the sets ρ({θ}) , θ ∈ Θ constitute a partition of Ω. Θ is then called a coarsening of Ω, and Ω is called a refinement of Θ. Given a bba mΘ defined on Θ, we can define its vacuous extension (see [37] p. 146) mΩ on Ω by transferring each mass mΘ (A) to ρ(A), for all subset A of Θ: mΩ (ρ(A)) = mΘ (A)
∀ A ⊆ Θ.    (22)
Conversely, let mΩ be a bba on Ω. Transferring mΩ to Θ is not so easy because, for some B ⊂ Ω, there may exist no subset A of Θ such that ρ(A) = B. However, the restriction (or outer reduction) of mΩ may still be defined as:
mΘ(A) = Σ_{B⊆Ω : ρ(A)∩B≠∅} mΩ(B), ∀ A ⊆ Θ.    (23)
1.8. Decision analysis
Let us assume that we have a bba m on Ω summarizing one's beliefs concerning the value of the unknown variable y, and we have to choose an action among a finite set of actions A. A loss function λ : A × Ω → R is also assumed to be given, such that λ(a, ω) denotes the loss incurred if one chooses action a and y = ω. Which action should we choose? Based on the pignistic probability defined in equation (17), we can associate to each a ∈ A a risk, defined as the expected loss (relative to BetPm) if one chooses action a:

R(a) = Σ_{ω∈Ω} λ(a, ω) BetPm(ω).    (24)

We then choose the action with the lowest risk. Alternatively, the decision process could be based on non-probabilistic extensions of the concept of mathematical expectation [13]. For example, the concept of lower expectation leads to the definition of the lower expected loss as

R∗(a) = Σ_{A⊆Ω} m(A) min_{ω∈A} λ(a, ω),    (25)
which results in a different decision strategy. In pattern classification, Ω = {ω1 , . . . , ωK } is the set of classes, and the elements of A are, typically, the actions ak of assigning the unknown pattern to each class ωk . With 0-1 losses, defined as λ(ak , ωl ) = 1 − δk,l for k, l ∈ {1, . . . , K}, it can be shown [13] that the minimization of the pignistic risk R leads to choosing the class ω0 with maximum pignistic probability, whereas the minimization of R∗ leads to choosing the class ω∗ with maximum plausibility. If an additional rejection action a0 with constant loss λ0 is added, then the pattern is rejected if BetP (ω0 ) < 1 − λ0 using the first rule, and if pl(ω∗ ) < 1 − λ0 using the second rule [13].
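The sketch below computes the pignistic risk (24) for a small illustrative decision problem with 0-1 losses and an additional rejection action; the loss values and pignistic probabilities are assumptions chosen for the example, not taken from the text.

```python
def pignistic_risk(betp, losses):
    """R(a) = sum over omega of loss(a, omega) * BetP(omega) (Eq. 24)."""
    return {a: sum(loss * betp[w] for w, loss in per_class.items())
            for a, per_class in losses.items()}

betp_values = {"w1": 0.5, "w2": 0.2, "w3": 0.3}        # e.g. the BetPm column of Table 1
losses = {
    "assign_w1": {"w1": 0.0, "w2": 1.0, "w3": 1.0},    # 0-1 losses
    "assign_w2": {"w1": 1.0, "w2": 0.0, "w3": 1.0},
    "assign_w3": {"w1": 1.0, "w2": 1.0, "w3": 0.0},
    "reject":    {"w1": 0.25, "w2": 0.25, "w3": 0.25}, # constant rejection cost
}
risks = pignistic_risk(betp_values, losses)
print(min(risks, key=risks.get), risks)   # here rejection (0.25) beats assign_w1 (0.5)
```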
2. Case-based, Likelihood-Based Approaches and Belief Decision Trees for Pattern Recognition Problems
2.1. The problem
Let us suppose a population P of objects, each object described by two variables: x, a vector of d attributes (features), quantitative, qualitative or mixed, and ω, a qualitative class variable which takes values in the finite set Ω = {ω1, . . . , ωK}. The pattern recognition problem (discrimination, supervised learning, discriminant analysis) consists in assigning an input pattern x to a class, given a learning set L composed of n patterns xi with known classification. Each pattern in L is represented by a d-dimensional feature vector xi and its corresponding class label
ωi. In the last ten years, several solutions to this problem have been proposed, based on belief functions theory [37,41]. Such approaches have been called evidential classifiers by their authors. An evidential classifier is a mapping f : R^d → Ω allowing one to predict the class ω of any new object described by a feature vector x, given an output belief function m̂Ω. The advantages of these techniques (description of the uncertainty on the prediction, possibility of rejecting a pattern and of detecting an unknown class) have been demonstrated in numerous papers [2,14]. In particular, these classifiers are well adapted to applications where the available data come from multiple imperfect information sources (multisensor problems, environmental monitoring, medical diagnosis, classifier combination). The classifier output is a belief function, which allows one to have:
• a more faithful description of uncertainty (greater flexibility to handle various sources of uncertainty such as imprecise or bad-quality data),
• a distinct representation of:
  ∗ ignorance (pattern dissimilar from all training examples),
  ∗ conflicting information (pattern similar to examples of different classes),
• a greater robustness (decision procedures) and improved performance when combining several classifiers (e.g. sensor fusion),
• a reduced need for unjustified assumptions in situations of weak available information.
Furthermore, they offer the possibility to handle weak learning information, such as partial knowledge of the class of learning examples (e.g., o ∈ {ω1, ω3}, o ∉ ω2, ...) and heterogeneous, non-exhaustive learning sets:
• a learning set L1 with objects from {ω1, ω2} and attributes xj, j ∈ J,
• a learning set L2 with objects from {ω2, ω3} and attributes xj, j ∈ J' ≠ J.
The main approaches to pattern recognition (parametric, distance-based, tree-structured classifiers) can be transposed into the TBM framework. The case-based approach (2.3), developed by Denœux, is an adaptation of the k-nearest neighbor method, which allows computing a belief function based on the similarity of an object to training samples. It can be applied to build classifiers from training data, possibly with imprecise and/or uncertain class labels. The likelihood-based approach uses the Generalized Bayesian Theorem, which replaces the Bayesian Theorem used for diagnosis (2.2). Finally, induction methods, called belief decision trees due to their links with belief function theory and decision trees, have been proposed (2.4). Such techniques give the possibility to interpret each decision rule in terms of individual features.
2.2. Likelihood-Based Methods (LB)
Let us assume the class-conditional probability densities f(x|ωk) to be known. Having observed x, the likelihood function is a function from Ω to [0, +∞) defined as L(ωk|x) = f(x|ωk), for all k ∈ {1, . . . , K}. Shafer [37, p. 238] proposed to derive from L a belief function on Ω defined by its plausibility function as:
    pl(A) = max_{ωk ∈ A} L(ωk|x) / max_k L(ωk|x),   ∀A ⊆ Ω.        (26)
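As an illustration, Eq. (26) can be evaluated directly from the likelihood values. The short Python sketch below is our own illustration (the class labels and likelihood values are made up, not taken from the chapter); it computes pl(A) for every non-empty subset of a three-class frame.

```python
# Consonant likelihood-based (CLB) plausibility of Eq. (26): a minimal sketch.
from itertools import combinations

def clb_plausibility(likelihoods):
    """likelihoods: dict {class_label: L(class | x)}. Returns pl(A) for non-empty A."""
    l_max = max(likelihoods.values())
    def pl(A):
        return max(likelihoods[w] for w in A) / l_max
    return pl

# Hypothetical likelihood values for a 3-class frame
L = {"w1": 0.2, "w2": 0.5, "w3": 0.1}
pl = clb_plausibility(L)
frame = list(L)
for size in range(1, len(frame) + 1):
    for A in combinations(frame, size):
        print(set(A), round(pl(A), 3))
# The induced belief function is consonant: every subset containing the
# most likely class has plausibility 1, and focal elements are nested.
```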
In pattern recognition, an application of this method (and a variant thereof) can be found in Ref. [28]. Note that pl defined by (26) is consonant, i.e., its focal elements are nested. For that reason, this first model will be called the "consonant likelihood-based" (CLB) model. Starting from axiomatic requirements, Appriou [2] proposed another method based on the construction of K belief functions mk(·). The idea consists of taking each class into account separately and evaluating the degree of belief given to each of them. In this case, the focal elements of each bba mk are the singleton {ωk}, its complement ω̄k, and Ω. Appriou actually obtained two different models with similar performances [22]. According to Appriou, one of these models seems preferable on theoretical grounds, because it is consistent with the generalized Bayesian theorem introduced by Smets [39]. This model, hereafter referred to as the Separable Likelihood-Based (SLB) method, has the following expression:

    mk({ωk}) = 0                                        (27)
    mk(ω̄k) = αk (1 − R·L(ωk|x))                          (28)
    mk(Ω) = 1 − αk (1 − R·L(ωk|x)),                      (29)
where αk is a coefficient that can be used to model external information such as sensor reliability, and R is a normalizing constant that can take any value in the range (0, (maxk L(ωk|x))^{-1}]. Parameter R is somewhat arbitrary, but the principle of maximum uncertainty leads to choosing the largest allowed value, which results in the least specific bba. With these K belief functions and using Dempster's rule of combination, a unique belief function m is obtained as m = ⊕k mk.

2.3. Distance-Based Method (DB)

A totally different approach was introduced by Denœux [12]. In this method, a bba is constructed directly, using as a source of information the training patterns xi situated in the neighborhood of the pattern x to be classified. If the nearest neighbors (according to some distance measure) are considered, we thus obtain several bba's that are combined using Dempster's rule of combination. The initial method was later refined to allow parameter optimization [45], and a neural-network-like version was recently proposed [14]. This version, which will be considered here, uses a set of prototypes that are determined so as to minimize an error function. Each prototype can be viewed as a piece of evidence that influences the belief concerning the membership class of x. A belief function mi associated with each prototype i is then defined, for all k ∈ {1, ..., K}, as:

    mi({ωk}) = αi φi(di)                                 (30)
    mi(Ω) = 1 − αi φi(di)                                (31)
    mi(A) = 0,   ∀A ∈ 2^Ω \ {{ωk}, Ω},                   (32)

where di is the Euclidean distance to the i-th prototype, αi is a parameter attached to prototype i, and φi(·) is a decreasing function defined as φi(di) = exp[−γi (di)²], in which γi is a positive parameter associated with prototype i. The belief functions mi of the different prototypes are then aggregated using Dempster's rule of combination.
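A minimal sketch of this prototype-based construction is given below (our own illustration, not the authors' implementation): each prototype induces a bba via Eqs. (30)-(32), and the bba's are pooled with Dempster's rule. The prototype locations, classes and parameters are invented here; in the method of Ref. [14] they would be optimized on the training set.

```python
import numpy as np

def prototype_bba(x, proto, alpha, gamma, cls, classes):
    """Eqs. (30)-(32): bba induced by one prototype on the frame 'classes'."""
    d2 = np.sum((np.asarray(x) - np.asarray(proto)) ** 2)
    phi = np.exp(-gamma * d2)                       # phi_i(d_i) = exp[-gamma_i d_i^2]
    return {frozenset([cls]): alpha * phi, frozenset(classes): 1.0 - alpha * phi}

def dempster(m1, m2):
    """Conjunctive pooling of two bba's followed by Dempster's normalization."""
    out = {}
    for A, a in m1.items():
        for B, b in m2.items():
            out[A & B] = out.get(A & B, 0.0) + a * b
    k = out.pop(frozenset(), 0.0)                   # weight of conflict
    return {A: v / (1.0 - k) for A, v in out.items()}

classes = ("w1", "w2", "w3")
# Two hypothetical prototypes: (location, alpha, gamma, class)
protos = [((0.0, 0.0), 0.9, 1.0, "w1"), ((2.0, 2.0), 0.8, 0.5, "w2")]
x = (0.3, 0.2)
bbas = [prototype_bba(x, p, a, g, c, classes) for (p, a, g, c) in protos]
m = bbas[0]
for b in bbas[1:]:
    m = dempster(m, b)
print({tuple(sorted(A)): round(v, 3) for A, v in m.items()})
```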
2.4. Belief decision tree (BDT)

In this paper, we only consider the Belief Decision Tree approach introduced by Denœux and Skarstein-Bjanger [15] and extended to multiclass problems by Vannoorenberghe et al. [42]. Owing to its ability to represent different kinds of knowledge (from total ignorance to full knowledge), DST allows us to process training sets whose labeling has been specified with belief functions (see 2.4.1). An impurity measure, based on a total uncertainty criterion, is used to grow the tree and has the advantage of simultaneously defining the pruning strategy (2.4.2). Finally, we present in paragraph 2.4.3 a multi-class generalization of the method introduced in [15], which allows us to handle the most general case in which each example is labeled by a general belief function [42].

2.4.1. Principle

A decision tree is a specific graph in which each node is either a decision node or a leaf node. To each decision node is associated a test based on attribute values, and a node has two or more successors (depending on the number of possible outcomes of the test). The most commonly used decision tree classifiers are binary trees, which use a single feature at each node with two outcomes. In [15], the problem of handling uncertain labels is solved for two-class problems. In this context, the available learning set is given by L = {(xi, mi^Ω) | i = 1, ..., n}, where mi^Ω is defined on Ω = {ω1, ω2} and represents the knowledge on the label of the i-th example. The belief function m^Ω[t] at node t is then derived from the n(t) belief functions mi^Ω (by induction, using Dempster's rule of combination) and becomes:

    m^Ω[t]({ω1}) = Σ_{(j,k) : j+k ≤ n(t)} αjk · j/(j+k+1),        (33)
where the αjk are coefficients which depend only on the functions mi^Ω. Similar expressions for m^Ω[t]({ω2}) and m^Ω[t](Ω) can be obtained. In equation (33), n(t) is the total number of examples reaching node t. These equations are derived from a theoretical result on credal inference presented by Smets in [40]. Demonstrations concerning the extension to the more general case of belief functions have been proposed by Denœux and can be found in [15] and [42].

2.4.2. Induction

For each node t, an impurity measure is computed from the belief function m^Ω[t] using the total uncertainty measure:

    Uλ(t) = (1 − λ) N(m^Ω[t]) + λ H(m^Ω[t]).        (34)
This impurity measure is used at node t to choose a candidate split s which divides t into two nodes tL and tR. The goodness of a split s is defined as the decrease in impurity:

    ΔUλ(s, t) = Uλ(t) − (pL Uλ(tL) + pR Uλ(tR)),        (35)
where pL and pR are, respectively, the proportions of examples reaching tL and tR. The best split ŝ is chosen by testing all possible splits for each attribute. One of the advantages of this technique is that the tree growing can be controlled using the parameter λ: according to the value of λ, it is possible to give more importance to the non-specificity term, which penalizes small nodes. Optimizing this parameter by cross-validation allows us to build smaller trees, thus avoiding overtraining. Unfortunately, this induction method is only available for two-class problems, but it can be generalized as explained in the next section.

2.4.3. Dichotomous approach for K-class problems

A standard way of handling a K-class problem is to decompose it into several two-class subproblems. One way to do this is to train K binary classifiers, each classifier attempting to discriminate between one class ωk and all other classes. When the learning set is of the form L = {(xi, mi^Ω) | i = 1, ..., n}, where mi^Ω is a bba defined on Ω, this approach implies transforming each bba mi^Ω originally defined on Ω into a bba defined on a two-class coarsened frame. For each coarsening, a tree is grown, and the resulting K trees are combined using the averaging operator. More precisely, let us denote by Ωk the following coarsening of Ω:

    Ωk = {{ωk}, {ω̄k}},                                  (36)

where {ω̄k} denotes the complement of {ωk}. Each bba mi^Ω defined on Ω may be transformed into a bba mi^Ωk on Ωk using the following transformation:

    mi^Ωk({ωk}) = mi^Ω({ωk})                             (37)
    mi^Ωk({ω̄k}) = Σ_{A ⊆ ω̄k} mi^Ω(A)                     (38)
    mi^Ωk(Ωk) = 1 − mi^Ωk({ωk}) − mi^Ωk({ω̄k}).           (39)
Each of the K coarsenings thus leads to a training set Lk = {(xi, mi^Ωk) | i = 1, ..., n}, which is used to build a decision tree. At the testing step, we obtain, for each input vector x, K bba's mx^Ωk, each defined on a distinct coarsening Ωk. Each of these bba's can be trivially carried back to Ω using the transformation:

    m^Ω_{x,k}({ωk}) = mx^Ωk({ωk})                        (40)
    m^Ω_{x,k}(Ω \ {ωk}) = mx^Ωk({ω̄k})                    (41)
    m^Ω_{x,k}(Ω) = mx^Ωk(Ωk).                            (42)
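The coarsening of Eqs. (37)-(39) amounts to simple bookkeeping on focal elements. A hedged Python sketch follows (our own illustration; frame and mass values are made up). When the complement class is represented directly as the subset Ω \ {ωk}, the carrying back of Eqs. (40)-(42) is the identity on the mass values.

```python
def coarsen(m, wk, frame):
    """Eqs. (37)-(39): bba on Omega -> bba on the 2-class frame {wk, not-wk}."""
    wk_set, rest = frozenset([wk]), frozenset(frame) - {wk}
    mk = {wk_set: 0.0, rest: 0.0, frozenset(frame): 0.0}
    for A, v in m.items():
        if A == wk_set:
            mk[wk_set] += v                                   # Eq. (37)
        elif A <= rest:
            mk[rest] += v                                     # Eq. (38)
    mk[frozenset(frame)] = 1.0 - mk[wk_set] - mk[rest]        # Eq. (39)
    return mk
    # With this representation, Eqs. (40)-(42) simply reuse the same masses on Omega.

frame = ("w1", "w2", "w3")
m = {frozenset(["w1"]): 0.5, frozenset(["w2", "w3"]): 0.3, frozenset(frame): 0.2}
print(coarsen(m, "w1", frame))
```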
Because the information sources are not independent, Dempster's rule of combination cannot be used to combine the bba's m^Ω_{x,k}, k = 1, ..., K. An alternative is to use the weighted operator as previously explained, which leads to:

    mx^Ω = Σ_{k=1}^{K} wk · m^Ω_{x,k},                    (43)

where the wk are coefficients to be optimized. This dichotomous approach to Belief Decision Trees allows us to quantify the uncertainty of the prediction for vector x (the belief function mx^Ω itself), to process learning sets whose labeling has been specified with belief functions (mi^Ω for each learning example), and it is available for K-class pattern recognition problems.

2.5. Parameter optimization or 'Tuning the discounting'

In the application of the LB methods, the first difficulty concerns the estimation of the likelihood functions. Several density estimation techniques can be used, including parametric methods based, e.g., on a Gaussian model, and non-parametric kernel methods. In the simulations presented in the sequel, we chose to use a Gaussian mixture model together with the EM algorithm as the estimation technique [34]. As remarked by Bastière [5], there is no general technique for evaluating the discounting coefficients αk in the separable method. In this paper, we propose to use the same approach as used by Denœux [14] for the DB method, i.e., minimizing the following error criterion:

    E(α) = Σ_{i=1}^{n} Σ_{k=1}^{K} (BetP^i(ωk) − uik)²,        (44)
where uik is the class indicator of pattern xi (uik = 1 if ωi = ωk), and BetP^i(ωk) is the pignistic probability of ωk for vector xi. In the same manner, it is possible to define an error criterion E∗ based on the plausibility function, in which BetP is replaced by pl.

2.6. Simulations

For the following simulations, a learning set L was generated using 3 classes containing 50 bidimensional vectors each. Each vector x from class k was generated by first drawing a vector z from a Gaussian density f(z|ωk) ∼ N(μk, Σk), and then applying the non-linear transformation z → x = exp(0.3 z) to obtain non-Gaussian data. The means of the 3 Gaussian distributions were taken as μ1 = (−1, −1)′, μ2 = (1, 2)′, μ3 = (−1.5, 2)′, and the variance matrices were of the form Σk = Dk A Dk′ with

    A = diag(√3, √3/3),    Dk = [ cos θk  −sin θk ; sin θk  cos θk ],

and θ1 = π/3, θ2 = π/2, θ3 = −π/3.
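This data-generation recipe is easy to reproduce. The sketch below is our own illustration (NumPy, arbitrary random seed, and assuming the transpose in Σk = Dk A Dk′); it draws 50 vectors per class and applies the componentwise map x = exp(0.3 z).

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.diag([np.sqrt(3.0), np.sqrt(3.0) / 3.0])
mus = [np.array([-1.0, -1.0]), np.array([1.0, 2.0]), np.array([-1.5, 2.0])]
thetas = [np.pi / 3, np.pi / 2, -np.pi / 3]

def make_class(mu, theta, n=50):
    D = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    sigma = D @ A @ D.T                    # Sigma_k = D_k A D_k^T (assumed transpose)
    z = rng.multivariate_normal(mu, sigma, size=n)
    return np.exp(0.3 * z)                 # non-linear map to non-Gaussian data

X = np.vstack([make_class(mu, th) for mu, th in zip(mus, thetas)])
y = np.repeat([0, 1, 2], 50)
print(X.shape, y.shape)                    # (150, 2) (150,)
```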
Figure 1. Maximum pignistic probabilities for BDT, shown as grey values over the (x1, x2) feature plane (learning samples with ω1 = ×, ω2 = ◦, ω3 = +).
We first demonstrate the qualitative effects using the first learning set L1. Figure 1 shows the maximum pignistic probabilities as grey values for the BDT. For each vector x, this value is obtained as

    max_{ωk ∈ Ω} BetP_{m̂x}(ωk),                          (45)
where m̂x corresponds to the output belief function. The decision regions for the CLB and DB methods, with the two decision rules, are shown in Figures 2 and 3 (the decision regions for the SLB method are somewhat similar to those of the CLB method and are consequently not shown here for lack of space). In these figures, mixture component centers and prototypes are represented as asterisks (∗). For the LB methods, likelihood functions were estimated using a Gaussian mixture model with k = 2 modes per class, and the parameters were estimated by the EM algorithm [34]. For the DB method, we chose by analogy two prototypes per class, whose locations were initialized using the c-means algorithm. The value of the rejection cost λ0 was set at 0.4. The specific form of the belief functions for the CLB and SLB methods imposes that maxk pl({ωk}) = 1. For this reason, only the DB method allows patterns to be rejected under the maximum plausibility decision rule. As can be seen from these figures, both the inference method and the decision rule have a dramatic influence on the shape of the decision regions. To compare the performances of these models, a test set T was generated using the same distribution as L, with 15,000 samples. The experiment was repeated ten times with independent training sets. The number of components in the mixture model (for the LB methods) and the number of prototypes (for the DB method) were optimized using a cross-validation set. The left part of Fig. 4 shows the error rate vs. the reject rate for the 3 methods and the 2 decision rules. For the CLB and SLB methods associated with the maximum plausibility decision rule, there are no rejected patterns. For this data set, all the proposed models obtain comparable performances. However, the DB model yields lower error rates than the LB models without rejection. Moreover, if the classes have different prior probabilities, this gain is further increased.
Figure 2. Decision regions for the CLB Method (Shafer) with R (left) and R∗ (right) for rejection loss λ0 = 0.4, (ω1 = ×, ω2 = ◦, ω3 = +).
Figure 3. Decision regions for the DB Method (Denœux) with R (left) and R∗ (right) for rejection loss λ0 = 0.4, (ω1 = ×, ω2 = ◦, ω3 = +).
To demonstrate the robustness of these methods, the test set T was then contaminated with 1,500 outliers with uniform distribution and random class labels. The right part of Fig. 4 presents the error rates of the different methods as functions of the rejection rates. The most robust decision rule seems to be the DB method with the maximum pignistic probability rule. This observation is easily explained by the shapes of the decision regions.
3. Applications for Multisensor Data Fusion

We now provide practical examples of how to use the previous models in detection-recognition problems. These models have attractive properties, such as the possibility of quantifying that none of the original hypotheses is supported, that the values of some 'likelihoods' are unknown, or that an a priori belief representing total ignorance can be accepted. We survey several applications where belief functions have been successfully applied.
Figure 4. Test error rate vs. rejection rate for the three methods (Denœux, Appriou, Shafer) with the two decision rules, without (left) and with (right) outliers.
3.1. Sensors on partially overlapping frames

Problem 1. A first sensor S1 has been trained to recognize objects in the frame Ω1 = {ω1, ω2} and a second sensor S2 has been trained to recognize objects in the frame Ω2 = {ω2, ω3}. Let us suppose that a new object O is presented to the two sensors. Both sensors S1 and S2 express their beliefs m1^Ω1 and m2^Ω2, the first on the frame Ω1, the second on the frame Ω2. How can these two beliefs be combined on the common frame Ω = {ω1, ω2, ω3}?

Sensor S1 (respectively S2) never saw an ω3 (ω1) object, and we know nothing about how S1 (S2) would react if it looked at an ω3 (ω1) object. A solution has been proposed by Smets within the TBM, based on the following idea: if both m1^Ω1 and m2^Ω2 are conditioned on ω2 and combined by the conjunctive combination rule, the resulting belief function should be the same as the one obtained after 'combining' the original m1^Ω1 and m2^Ω2 on Ω and conditioning the result on ω2. The problem is of course how to combine m1^Ω1 and m2^Ω2, because the two belief mass functions are not defined on the same frame of discernment, so Dempster's rule of combination cannot be applied. The solution is as follows. Let Ω′ = Ω1 ∩ Ω2, and let m1^Ω, m2^Ω be the basic belief assignments obtained by extension of m1^Ω1 and m2^Ω2 on Ω. The result of the combination of m1 and m2 is given, for all A ⊆ Ω1 ∪ Ω2, by:

    m(A) = [ m1^Ω1(A1) m2^Ω2(A2) / ( m1^Ω[ω2](A0) m2^Ω[ω2](A0) ) ] · (m1^Ω[ω2] ∩ m2^Ω[ω2])(A1 ∩ A2),        (46)

where A0 = A ∩ Ω′, A1 = A ∩ Ω1 and A2 = A ∩ Ω2. Table 2 illustrates the computation of m. We have (m1^Ω[ω2] ∩ m2^Ω[ω2])(ω2) = (.1 + .3)·(.7 + .1) = .32. This mass is distributed over ω2, {ω1, ω2}, {ω2, ω3} and Ω according to the ratios (.1/.4)·(.7/.8), (.3/.4)·(.7/.8), (.1/.4)·(.1/.8) and (.3/.4)·(.1/.8). The mass (m1^Ω[ω2] ∩ m2^Ω[ω2])(∅) = .68 is given to {ω1, ω3}. In this example, sensor S1 supports that object O is an ω1, whereas the second sensor claims that O is an ω2. If O had really been an ω2, how come the first sensor did not say so?
A ⊆ Ω        m1^Ω1   m2^Ω2   m1^Ω   m2^Ω   m1^Ω[ω2]   m2^Ω[ω2]   m1∩2^Ω[ω2]   m      BetPm
∅            0.0     0.0     0.0    0.0    0.6        0.2        0.68         0.00   -
ω1           0.6     -       0.0    0.0    0.0        0.0        0.00         0.00   0.455
ω2           0.1     0.7     0.0    0.0    0.4        0.8        0.32         0.07   0.190
ω3           -       0.2     0.0    0.0    0.0        0.0        0.00         0.00   0.355
{ω1, ω2}     0.3     -       0.0    0.0    0.0        0.0        0.00         0.21   -
{ω1, ω3}     -       -       0.6    0.2    0.0        0.0        0.00         0.68   -
{ω2, ω3}     -       0.1     0.1    0.7    0.0        0.0        0.00         0.01   -
Ω            -       -       0.3    0.1    0.0        0.0        0.00         0.03   -

Table 2. Example of belief computation for two sensors on partially overlapping frames.
So the second sensor is probably facing an ω1 and just states ω2 because it does not know what an ω1 is. We thus feel that the most plausible solution is O = ω1, which is confirmed by BetPm, as it is the largest for ω1.

3.2. Data association problem

Multisensor systems are characterized by specific features that must be taken into account. While the different sensors observe the same scene, or at least partially (overlapping fields of view), they may have different resolutions, accuracies and points of view. The usual functions requested from multisensor systems are detection, localization and recognition of the objects that may be present in the observed area. In most surveillance applications, the sensors are spread inside or around the area to be observed. For this reason, the association problem is complex and may result in a highly combinatorial problem. To illustrate this complexity, we propose an example which could easily be translated into a battlefield surveillance problem.

Problem 2. Having five sensors that can locate targets in a given observed scene, how many targets are there, which sensor is associated with which target, and where are the targets?

We assume that the five sensors, denoted Si for i = 1, ..., 5, observe the same scene, which is composed of non-overlapping resolution cells taking values in a set Ω. Once a sensor detects a target, it reports the detection and the position of the resolution cell where the detection was made. We assume that the location precision of each sensor is much better than the size of the resolution cell, so that in this example there is no localization error. Besides, we know the confidence or reliability of each sensor, which we define as the 'probability that the sensor is in working condition', assuming that when the sensor is in working condition, what it states is true. Let us consider the very simple case where sensors S1 and S2 locate a target in resolution cell c1, while sensors S3, S4 and S5 locate a target in cell c2. The questions are therefore:
1. how many targets are we detecting? (a detection problem)
2. where are the targets located? (a localization problem)
3. which sensor has detected which target? (an association problem)
Sensor S          S1     S2     S3     S4     S5     {Si}, i = 1..5
mF^Ω[S](∅)        -      -      -      -      -      0.925
mF^Ω[S]({c1})     0.7    0.8    -      -      -      0.015
mF^Ω[S]({c2})     -      -      0.6    0.6    0.9    0.059
mF^Ω[S](Ω)        0.3    0.2    0.4    0.4    0.1    0.001

Table 3. Target association problem.
If the sensors were perfect, we would conclude that there are two targets, one in c1 and one in c2, that sensors S1 and S2 report on the target located in c1, and that sensors S3, S4 and S5 report on the target located in c2. We now consider how these conclusions could be reached once uncertainty is introduced. Each sensor produces a belief function with a mass 1 allocated to the resolution cell c1 (sensors S1 and S2) or c2 (sensors S3, S4 and S5). So we have m1^Ω({c1}) = m2^Ω({c1}) = m3^Ω({c2}) = m4^Ω({c2}) = m5^Ω({c2}) = 1. For example, sensor S1 claims that there is an object in c1 and sensor S4 claims there is one in c2. The fusion unit F, whose function is to integrate the data, collects each of these five basic belief assignments and discounts them with the confidence that F gives to each sensor. The discounted basic belief assignment expresses the belief held by F that there is an object in c1; this belief results from what sensor S1 states and from F's opinion about the reliability of sensor S1. The resulting discounted belief functions are given in Table 3. We will now consider in turn the hypotheses that these measurements resulted from one target or from two targets.

Case 1: One target. This means that all the declarations refer to the same event. Therefore, the fusion unit F combines the five belief functions:

    mF^Ω[{Si}_{i=1}^5] = mF^Ω[S1] ∩ mF^Ω[S2] ∩ ... ∩ mF^Ω[S5].        (47)

The result is given in Table 3, where the value 0.925 obtained for the empty set reflects a high degree of conflict between the measurements given by the five sensors.

Case 2: Two targets. Now let us assume that there are two targets, so that some sensor measurements may refer to one target and the others to the other target. Schubert's idea [6,36] is to cluster the sensors whose measurements are compatible, that is, refer to the same target. As the hypothesis is that there are two targets, the set of five measurements is to be partitioned into two clusters, denoted X1 and X2. Table 4 presents the masses given to the conflict when the five sensors are grouped into two clusters. The least conflicting solution is X1 = 12 and X2 = 345, which has the smallest internal conflict (mF^Ω[X1](∅) = mF^Ω[X2](∅) = 0). We accept the heuristic that we should try to keep the number of targets as small as possible: the presence of two targets is sufficient to explain the data. Of course there might be three or more targets, but as long as the data can be explained by the presence of two targets, that hypothesis is accepted. We can therefore conclude that there are two targets, without conflict.
Cluster X1   Cluster X2   mF^Ω[X1](∅)   mF^Ω[X2](∅)
1            2345         0.00          0.787
2            1345         0.00          0.689
3            1245         0.00          0.902
4            1235         0.00          0.902
5            1234         0.00          0.790
12           345          0.00          0.000
13           245          0.42          0.768
14           235          0.42          0.768
15           234          0.63          0.672
23           145          0.48          0.672
24           135          0.48          0.672
25           134          0.72          0.588
34           125          0.00          0.846
35           124          0.00          0.564
45           123          0.00          0.564

Table 4. Basic belief mass given to ∅ by F after combining the bba's within each cluster X1 and X2.
One target is located in c1 with belief .94, and it is observed by sensors S1 and S2. The other one is located in c2 with belief .984, and it is observed by sensors S3, S4 and S5.
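The numbers in Tables 3 and 4 follow from elementary bba manipulations. The Python sketch below is our own illustration (the reliabilities are those given in Table 3): each sensor's categorical bba is discounted, a cluster is combined conjunctively, and the conflict m(∅) is read off. For the clusters 12 and 345 it reproduces the zero conflict and the beliefs 0.94 and 0.984, and for all five sensors it reproduces the conflict 0.925.

```python
def discount(m, reliability, frame):
    """Discounted bba: masses scaled by the reliability, remainder put on the frame."""
    out = {A: reliability * v for A, v in m.items()}
    out[frame] = out.get(frame, 0.0) + (1.0 - reliability)
    return out

def conjunctive(m1, m2):
    out = {}
    for A, a in m1.items():
        for B, b in m2.items():
            out[A & B] = out.get(A & B, 0.0) + a * b
    return out

frame = frozenset({"c1", "c2"})
cells = {"S1": "c1", "S2": "c1", "S3": "c2", "S4": "c2", "S5": "c2"}
rel = {"S1": 0.7, "S2": 0.8, "S3": 0.6, "S4": 0.6, "S5": 0.9}
bba = {s: discount({frozenset([cells[s]]): 1.0}, rel[s], frame) for s in cells}

def fuse(cluster):
    m = {frame: 1.0}
    for s in cluster:
        m = conjunctive(m, bba[s])
    return m

for cluster in (["S1", "S2"], ["S3", "S4", "S5"], list(cells)):
    m = fuse(cluster)
    print(cluster, "conflict =", round(m.get(frozenset(), 0.0), 3))
# Expected conflicts: 0.0, 0.0 and 0.925; within the first cluster m({c1}) = 0.94,
# within the second cluster m({c2}) = 0.984.
```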
4. Concluding Remarks

This paper has focused on belief functions theory for multisensor data fusion. In this context, belief functions have shown their ability to model uncertain information while offering a suitable set of tools in a federative framework which includes other uncertainty theories. In the second part of this paper, they have been used for classification and pattern recognition tasks. From the evidential classifiers, we can draw several conclusions:
• The output belief functions take very different forms depending on the method (more or less specific, consonant or not); consequently, the uncertainty related to the prediction is not represented in the same manner.
• All the proposed models (except the LB methods with the maximum plausibility decision rule) obtain comparable performances in the case of "standard" data; however, the DB method associated with pignistic risk minimization seems to be more robust to outliers than the other methods.
Although these conclusions cannot be blindly generalized to all classification tasks, they seem sufficiently explicit to guide the choice of a model. For multisensor data fusion, several advantages may be expected from belief functions, such as the ability to face a wider set of situations and the improvement of discrimination capacity. Practical examples of how to use the previous models in detection-recognition problems have finally been introduced, such as the management of heterogeneous frames of discernment and the data association problem.
References [1] M.A. Abidi and R.C. Gonzalez. Data Fusion in Robotics and Machine Intelligence, chapter 4 : Multisensor strategies using Dempster-Shafer belief accumulation, pages 165–210. Academic Press, INC, 1992. [2] A. Appriou. Aggregation and Fusion of Imperfect Information, chapter Uncertain Data Aggregation in Classification and Tracking Processes, pages 231–260. PhysicaVerlag, 1998. [3] A. Appriou. Multisensor signal processing in the framework of the theory of evidence. In Application of Mathematical Signal Processing Techniques to Mission Systems, pages (5–1)(5–31). Research and Technology Organization (Lecture Series 216), November 1999. [4] Y. Bar-Shalom and X.R. Li. Multitarget-Multisensor Tracking: Principles and Techniques. Storrs, CT: YBS Publishing, 1995. [5] A. Basti`ere. Methods for multisensor classification of airbone targets integrating evidence theory. Aerospace Science and Technology, 2(6):401–411, 1998. [6] M. Bengtsson and J. Schubert. Dempster-Shafer clustering using Potts spin mean field theory. Soft Computing, 5(3):215–228, 2001. [7] I. Bloch. Some aspects of Dempster-Shafer evidence theory for classification of multi-modality medical images taking partial volume effect into account. Pattern recognition Letters, 17:905–919, 1996. [8] L. Cholvy. About Merged Information. In D. Dubois and H. Prade, editors, Handbook of Defeasible Reasoning and Uncertainty Management Systems, volume 3, pages 233–263. Kluwer Acad. Publ., Dordrecht, 1998. [9] G. Choquet. Th´eorie des capacit´es. Annales de l’Institut Fourier, 5:131–295, 1954. [10] M. Daniel. Distribution of Contradictive Belief Masses in Combination of Belief Functions. In B. Bouchon-Meunier, R. R. Yager, , and L. A. Zadeh, editors, Information, Uncertainty and Fusion, pages 431–446. Kluwer Academic Publishers, 2000. [11] A. Dempster. Upper and lower probabilities induced by multivalued mapping. Annals of Mathematical Statistics, AMS-38:325–339, 1967. [12] T. Denœux. A k-nearest neighbour classification rule based on Dempster-Shafer theory. IEEE Transactions on Systems Man and Cybernetics, 25(5):804–813, 1995. [13] T. Denœux. Analysis of evidence-theoretic decision rules for pattern classification. Pattern Recognition, 30(7):1095–1107, 1997. [14] T. Denœux. A neural network classifier based on Dempster-Shafer theory. IEEE Transactions on Systems, Man and Cybernetics, Part A : Systems and humans, 30(2):131–150, 2000. [15] T. Denœux and M. Skarstein Bjanger. Induction of decision trees from partially classified data using belief functions. In Proceedings of SMC’2000, pages 2923–2928, Nashville, USA, 2000. IEEE. [16] J. Desachy, L. Roux, and E. Zahzah. Numeric and symbolic data fusion: A soft computing approach to remote sensing images analysis. Pattern Recognition Letters, 17:1361–1378, 1996. [17] D. Dubois and H. Prade. A note on measures of specificity for fuzzy sets. International Journal of General Systems, 10:279–283, 1985. [18] D. Dubois and H. Prade. On the unicity of Dempster’s rule of combination. International Journal of Intelligent Systems, 1:133–142, 1986. [19] D. Dubois and H. Prade. A set-theoretic view of belief functions. International Journal of General Systems, 12:193–226, 1986. [20] D. Dubois and H. Prade. Representation and combination of uncertainty with belief functions and possibility measures. Computationnal Intelligence, 4:244–264, 1988.
[21] D. Dubois and H. Prade. Combination of Fuzzy Information in the framework of Possibility Theory. Data Fusion in Robotics and Machine Intelligence, pages 481–505, 1992. [22] S. Fabre, A. Appriou, and X. Briottet. Presentation and description of two classification methods using data fusion based on sensor management. Information Fusion, 2:49–71, 2001. [23] L. Fouque and A. Appriou. An evidential Markovian model for data fusion and unsupervised image classification. In Proc. of the Third International Conference on Information Fusion (FUSION 2000), pages TuB4 25–32, Paris, France, 2000. [24] J. Gebhardt and R. Kruse. Parallel Combination of Information Sources. In D. Dubois and H. Prade, editors, Handbook of Defeasible Reasoning and Uncertainty Management Systems, volume 3, pages 393–439. Kluwer Acad. Publ., Dordrecht, 1998. [25] S. Le H´egarat-Mascle, I. Bloch, and D. Vidal-Madjar. Application of DempsterShafer evidence theory to unsupervised classification in multisource remote sensing. IEEE Transactions on Geoscience and remote Sensing, 35(4):1018–1032, 1997. [26] S. Le H´egarat-Mascle, I. Bloch, and D. Vidal-Madjar. Introduction of neighborhood information in evidence theory and application to data fusion of radar and optical images with partial cloud cover. Pattern recognition, 31(11):1811–1823, November 1998. [27] R. Kennes. Computational aspects of the M¨ obius transform of a graph. IEEE Transactions on Systems,Man and Cybernetics, 22:201–223, 1992. [28] H. Kim and P.H. Swain. Evidential reasoning approach to multisource-data classification in remote sensing. IEEE Transactions on Systems, Man and Cybernetics, 25(8):1257–1265, 1995. [29] F. Klawonn and E. Schwecke. On the axiomatic justification of Dempster’s rule combination. International Journal of Intelligent Systems, 7:469–478, 1992. [30] G.J. Klir and M.J. Wierman. Uncertainty-Based Information. Physica-Verlag, Heidelberg, Germany, 1998. [31] E. Lefevre, O. Colot, and P. Vannoorenberghe. Belief functions combination and conflict management. Information Fusion, 3(2):149–162, June 2002. [32] E. Lefevre, P. Vannoorenberghe, and O. Colot. About the Use of Dempter-Shafer Theory for Color Image Segmentation. In First International Conference on Color in Graphics and Image Processing (CGIP’2000), pages 164–169, October 2000. [33] H. Li, S. Munjanath, and S. Mitra. Multisensor Image Fusion Using the Wavelet Transform. Graphical Models and Image Processing, 57(3):235–245, 1995. [34] G. J. McLaclan and T. Krishnan. The EM algorithm and extensions. John Wiley, New York, 1997. [35] C. Pohl and J.L. van Genderen. Multisensor Image Fusion in Remote Sensing: Concepts, Methods and Applications. International Journal of Remote Sensing, 19(5):823–854, 1998. [36] J. Schubert. Clustering belief functions based on attracting and conflicting metalevel evidence. In Proceedings of IPMU’2002, pages 571–578, Annecy, France, 2002. [37] G. Shafer. A Mathematical Theory of Evidence. Princeton University Press, 1976. [38] Ph. Smets. The combination of evidence in the transferable belief model. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(5):447–458, 1990. [39] Ph. Smets. Belief functions: The disjonctive rule of combination and the generalized bayesian theorem. International Journal of Approximate Reasoning, 9:1–35, 1993. [40] Ph. Smets. What is Dempster-Shafer’s model? In R.R. Yager, M. Fedrizzi, and J. Kacprzyk, editors, Advances in the Dempster-Shafer Theory of Evidence, pages 5–34. Wiley, 1994.
[41] Ph. Smets and R. Kennes. The Transferable Belief Model. Artificial Intelligence, 66(2):191–234, 1994. [42] P. Vannoorenberghe and T. Denœux. Handling uncertain labels in multiclass problems using belief decision trees. In Proceedings of IPMU’2002, pages 1919–1926, Annecy, France, 2002. [43] F. Voorbraak. On the justification of Dempster’s rule of combinations. Artificial Intelligence, 48:171–197, 1991. [44] R.R. Yager. On the Dempster-Shafer framework and new combination rules. Information Sciences, 41:93–138, 1987. [45] L.M. Zouhal and T. Denœux. An evidence-theoretic k-nn rule with parameter optimization. IEEE Transactions on Systems, Man and Cybernetics-Part C, 28(2):263– 271, May 1998.
Advances and Challenges in Multisensor Data and Information Processing E. Lefebvre (Ed.) IOS Press, 2007 © 2007 IOS Press. All rights reserved.
Dempster-Shafer Evidence Theory Through the Years: Limitations, Practical Examples, Variants Under Conflict and a New Adaptive Combination Rule Mihai Cristian FLOREA (a), Anne-Laure JOUSSELME (b) and Dominic GRENIER (a) (a) Université Laval, Québec, Canada (b) Defence Research and Development Canada - Valcartier
Abstract. Evidence theory has been primarily used in the past to model imperfect information, and it is a powerful tool for reasoning under uncertainty. It appeared as an alternative to probability theory and is now considered a generalization of it. In this paper we first introduce an object identification problem and then present two approaches to solve it: a probabilistic approach and the Dempster- Shafer approach. We also present the limitations of Dempster’s rule of combination when conflictual pieces of information are combined and we present alternatives rules proposed in the literature to overcome this problem. We propose a class of adaptive combination rules obtained by mixing the basic conjunctive and disjunctive combination rules. The symmetric adaptive combination rule is finally considered and we compare it with the other existing rules. Keywords. Evidence theory, reliability, adaptive combination rule, information fusion, data fusion, reasoning under uncertainty.
Introduction

Dempster-Shafer theory (DST) [1,2] is a mathematical tool developed for reasoning under uncertainty. One of its strengths is that it can cope with imprecise and uncertain information. The first combination rule in this framework, proposed by Dempster in 1968, has often been criticized based on its occasionally unexpected behavior. In particular, Zadeh proved, through a simple example [3], that it provides counterintuitive results when combining highly conflictual information. Some authors argue that this is a false problem since the reason for the counter-intuitive results comes from an improper use of the rule [4,5,6]. Indeed, Dempster's rule of combination should be used only under the restrictions initially imposed by Dempster of (1) independent sources providing independent evidences, (2) homogeneous sources defined on a unique frame of discernment and (3) a frame of discernment containing an exclusive and exhaustive list of hypotheses. In practice, the restrictions (1)-(3) are severe and not easily satisfied, which has led to evidence theory being extended to include new, more flexible theories to cope with an unknown and unpredictable reality (the Transferable Belief Model by Smets and Kennes [7] or the Dezert-Smarandache Theory (DSmT) [8]). A new direction consists of defining new combination rules in the DST as
alternatives to Dempster’s rule such as Yager [9], Dubois and Prade [10] or Inagaki [11]. Haenni [5] promotes the fact that Dempster’s rule does not need any alternative rule and that rather the initial belief functions should be modified to better represent the sources’ information [2,12]. In this paper, we consider the sources of information to be independent and homogeneous and the frame of discernment to contain an exclusive and exhaustive list of hypotheses. Consequently, we consider that the conflict should be due to the unreliability of some sources. Moreover, instead of modifying the initial BPA to take into account reliability, we adopt the approach of an automatic combination rule based on the mixing of two conjunctive and disjunctive operators. In Section 1 we present the object identification problem. Sections 2 and 3 are short reviews of probability theory and Dempster-Shafer theory, respectively. Some alternatives to Dempster’s rule are presented in Section 4. Section 5 presents a class of adaptive combination rules as a mixing of the conjunctive and disjunctive combination rules. We restrict ourselves to a particular case of a symmetric adaptive combination rule. Section 6 illustrates some properties of the proposed adaptive rule in examples as well as in a test scenario for target identification. Section 7 presents a conclusion.
1. Object Identification Problem In the object identification problem several sensors placed on a platform receive information from one or more objects that have to be identified. The sensors can be human (experts) or electronic (Radars, Forward Looking Infra-Red (FLIR), Electronic Support Measures (ESM), etc.), can provide opinions or statistical information and can be more or less reliable. The possible observed objects and their attributes for each object (physical dimensions, cruise speed, maximum altitude, emitters on board, etc.) are listed in a database.
2. Probability Theory

Probability theory first appeared as a mathematical tool able to model problems from game theory (such as card and dice games). Let Θ = {θ1, θ2, ..., θN} be the frame of discernment, containing N objects, hypotheses, etc. A probability distribution function (pdf) P is defined over the power set 2^Θ such that:

    P(A) ∈ [0, 1],   ∀A ⊆ Θ                              (1)
    P(∅) = 0 and P(Θ) = 1                                (2)
    P(A ∪ B) = P(A) + P(B) − P(A ∩ B),   ∀A, B ⊆ Θ       (3)

From the probability of all singletons θi ∈ Θ, one can compute the probability of any subset A ⊆ Θ, which can be interpreted as a restriction of the definition domain. The probability which is not associated with a set A is associated with its complement. When the probability of a set A is known, and the underlying distribution on its
singletons is unknown, a common way to distribute the probability of A to its singletons is the equiprobability distribution P(θi) = P(A)/|A|, ∀θi ∈ A.

Example 1. Let us consider a frame of discernment of 100 objects and a piece of information in the form of a pdf such that P(ship) = 0.8. From the database, we know that θ1, θ2, ..., θ90 are ships. Then P(not a ship) = 0.2, so that P(θ1 ∪ θ2 ∪ ... ∪ θ90) = 0.8 and P(θ91 ∪ θ92 ∪ ... ∪ θ100) = 0.2. By modeling the ignorance using equiprobability, it can be shown that the probability of any ship is smaller than the probability of any other singleton, since P(θ1) = ... = P(θ90) = 0.8/90 ≈ 0.0088 while P(θ91) = ... = P(θ100) = 0.2/10 = 0.02. This result is somehow counterintuitive and inhibits good decision making. To overcome this problem, new theories were developed in the last couple of decades.
3. Dempster-Shafer Theory (DST)
Evidence theory is a powerful tool that deals with imprecise and uncertain information, developed by Dempster [1] and later formalized by Shafer [2]. This theory is often described as an extension of probability theory, as it is based on the power set of the universe of discourse instead of on the universe itself. A Basic Probability Assignment (BPA) is a mapping m : 2^Θ → [0, 1] that must satisfy the following conditions: (1) m(∅) = 0 and (2) Σ_{A⊆Θ} m(A) = 1, where 0 ≤ m(A) ≤ 1, ∀A ⊆ Θ. m(A) is called the mass of A and represents the degree of belief that someone strictly assigns to A. A subset A with a non-null mass is called a focal element of m. Let F designate the set of all focal elements of m. A wide variety of combination rules exists; a review and classification is proposed in [13], where the rules are analyzed according to their algebraic properties (idempotence, commutativity, associativity) as well as on different examples.

Let m1 and m2 be two BPAs defined on the same frame of discernment Θ. The basic combination rules between m1 and m2 are the disjunctive rule and the conjunctive rule, defined respectively, ∀A ⊆ Θ, by:

    p(A) = Σ_{B∪C=A} m1(B) m2(C)    and    q(A) = Σ_{B∩C=A} m1(B) m2(C),

where the functions p and q are introduced here to simplify some upcoming expressions. K = q(∅) is called the weight of conflict (or simply the conflict) between m1 and m2 and is equal to the mass of the empty set after the conjunctive combination. If K is close to 0, the BPAs are not in conflict, while if K is close to 1, the BPAs are in conflict. The most common combination rule for two BPAs is the rule proposed by Dempster [1], denoted by ⊕ here and called an orthogonal sum. Dempster's rule of combination is indeed a normalized conjunctive rule, constrained by the mass of the empty set to be always equal to 0. For two BPAs, Dempster's rule of combination is defined as

    (m1 ⊕ m2)(A) = q(A)/(1 − K),  ∀A ⊆ Θ, A ≠ ∅,    and    (m1 ⊕ m2)(∅) = 0.
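The two basic rules and Dempster's normalization can be written in a few lines. The Python sketch below is our own illustration (the frozenset representation and the example masses are not from the paper); it computes p, q, the conflict K = q(∅) and, when K < 1, Dempster's combination.

```python
def combine(m1, m2, op):
    """Generic pooling of two BPAs; op is frozenset.__and__ (conjunctive q)
    or frozenset.__or__ (disjunctive p)."""
    out = {}
    for A, a in m1.items():
        for B, b in m2.items():
            C = op(A, B)
            out[C] = out.get(C, 0.0) + a * b
    return out

def conjunctive(m1, m2):
    return combine(m1, m2, frozenset.__and__)

def disjunctive(m1, m2):
    return combine(m1, m2, frozenset.__or__)

def dempster(m1, m2):
    q = conjunctive(m1, m2)
    K = q.pop(frozenset(), 0.0)            # weight of conflict, K = q(empty set)
    if K >= 1.0:
        raise ValueError("total conflict: Dempster's rule is undefined")
    return {A: v / (1.0 - K) for A, v in q.items()}

# Zadeh-style illustration: two almost totally conflicting BPAs
m1 = {frozenset({"t1"}): 0.99, frozenset({"t3"}): 0.01}
m2 = {frozenset({"t2"}): 0.99, frozenset({"t3"}): 0.01}
print(conjunctive(m1, m2).get(frozenset(), 0.0))   # K = 0.9999
print(dempster(m1, m2))                            # all mass ends up on {t3}
```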
Although Dempster's rule of combination is used for a large number of applications, in the presence of conflicting information (when K ≈ 1) this rule does not provide an adequate representation of the aggregation of the two BPAs (see Section 6). In identification problems, given a BPA m, one needs to find the singleton θ which is the most probable over the frame of discernment Θ. Several decision criteria have been proposed to identify the most credible singleton; a recent survey of the different methods of decision making may be found in Bloch [14]. In this paper we consider only the maximum of pignistic probability decision criterion, proposed by Smets [15].
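For reference, the pignistic transformation of Smets [15] spreads each mass uniformly over the singletons of its focal element, after renormalizing away any residual mass on the empty set; the decision then retains the singleton with the largest BetP. A minimal sketch (our own illustration, not the authors' code):

```python
def pignistic(m):
    """BetP(theta) = sum over focal sets A containing theta of m(A)/|A|,
    after discarding and renormalizing any mass on the empty set."""
    empty = m.get(frozenset(), 0.0)
    betp = {}
    for A, v in m.items():
        if not A:
            continue
        for theta in A:
            betp[theta] = betp.get(theta, 0.0) + v / (len(A) * (1.0 - empty))
    return betp

def decide(m):
    betp = pignistic(m)
    return max(betp, key=betp.get), betp

m = {frozenset({"t1"}): 0.3, frozenset({"t1", "t3"}): 0.5,
     frozenset({"t1", "t2", "t3"}): 0.2}
print(decide(m))   # ('t1', {'t1': 0.617, 't3': 0.317, 't2': 0.067}) approximately
```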
4. Alternatives to Dempster’s Rule of Combination in Conflictual Situations
Several combination rules have been proposed in the past few decades to cope with different problems and, particularly, to solve the conflict problem, when Dempster's rule of combination cannot be used. Inagaki [11] proposed to distribute the mass of the empty set after the conjunctive combination to any subset of Θ, using a set of weighting coefficients w(A) ≥ 0 such that the combined mass of any non-empty subset A ⊆ Θ is q(A) + w(A) q(∅), with Σ_{A⊆Θ, A≠∅} w(A) = 1. Inagaki also proposed a particular case of this rule, in which the ratio between the masses of any subsets A and B is the same before and after the distribution of the conflict. Note that, like Dempster's rule, this particular case of Inagaki's combination rules cannot be used whenever K = 1. Yager proposed in [9] to allocate the mass of the conflict q(∅) to the ignorance Θ. Dempster's rule and Yager's rule turn out to be particular cases of Inagaki's particular combination rule. Dubois and Prade [10] proposed to distribute the mass of the empty set not only to focal elements of F1 ∩ F2 but also to some focal elements from F1 ∪ F2: when the intersection of two focal elements is the empty set, the combined mass is allocated to their union and not to the empty set. This combination rule cannot be written as Inagaki's particular combination rule; however, it can be obtained from the general expression of Inagaki's combination rule given above.
5. A New Class of Adaptive Combination Rules (ACR)
Here we propose a new class of combination rules based on the mixing of the disjunctive rule and the conjunctive rule, following an idea equivalent to the one developed by Dubois and Prade in [10,16]. This class of rules, which includes Dempster's rule as a special case, leads to more intuitive results than other rules. After introducing the general class and highlighting an interesting property, we focus our attention on a special class of rules where the weighting coefficients of the conjunctive and disjunctive rules are symmetric.

5.1. Adaptive Combination Rule with Symmetric Weighting Coefficients

A class of Adaptive Combination Rules (ACR) between two BPAs m1 and m2 is defined by

    m_ACR(A) = α(K) p(A) + β(K) q(A),  ∀A ⊆ Θ, A ≠ ∅,    and    m_ACR(∅) = 0.
Here, α and β are functions of the conflict K = q(∅), from [0, 1] to [0, 1]. A desirable behavior for the ACR is that it should act more like the disjunctive rule whenever K is close to 1 (i.e., at least one source is unreliable), while it should act more like the conjunctive rule if K is close to 0 (i.e., both sources are reliable). Thus, we consider the three conditions:
• (C1) α is an increasing function with α(0) = 0 and α(1) = 1;
• (C2) β is a decreasing function with β(0) = 1 and β(1) = 0;
• (C3) α(K) = 1 − (1 − K) β(K) (arising from the necessity that Σ_A m_ACR(A) = 1).
While the behaviors at the extrema (K = 0 and K = 1) are easily interpretable, what should happen at the medium value K = 0.5 can be subject to discussion. We suppose in this case that an acceptable choice could be α(0.5) = β(0.5), giving an equal weight to the two basic rules p and q. Hence, an interesting property for the adaptive rule is to have symmetric weightings for p and q. It can be shown that the combination rule

    m_SACR(A) = [K / (1 − K + K²)] p(A) + [(1 − K) / (1 − K + K²)] q(A)

is the unique symmetric ACR (or SACR). A partial positive reinforcement of the belief can be observed in the case of the ACR for the focal elements common to both m1 and m2. This property is one of the strengths of this new class of adaptive combination rules. Moreover, this new class can be used when the conflict between the two BPAs is equal to 1, when Dempster's rule cannot be used.

5.2. Adaptive Combination Rule and Sequential Fusion Processes

The adaptive combination rule with symmetric weighting factors is not associative. However, based on the commutative and associative properties of the conjunctive and disjunctive combination rules p and q, a quasi-associative ACR can be derived from the general representation of the new class of adaptive combination rules: p(A) and q(A) are propagated in a sequential fusion process, as illustrated in Figure 1.
Figure 1. Fusion process: quasi-associative adaptive combination rule in a sequential fusion process.
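Using the conjunctive and disjunctive sketches from Section 3, the symmetric adaptive rule is a short weighted mix of p and q with coefficients K/(1 − K + K²) and (1 − K)/(1 − K + K²). The Python below is our own illustration (it reuses the hypothetical function names introduced earlier).

```python
def sacr(m1, m2):
    """Symmetric adaptive combination rule: a mix of the disjunctive rule p and
    the conjunctive rule q, driven by the conflict K = q(empty set)."""
    p = disjunctive(m1, m2)
    q = conjunctive(m1, m2)
    K = q.pop(frozenset(), 0.0)
    alpha = K / (1.0 - K + K * K)            # weight of the disjunctive part
    beta = (1.0 - K) / (1.0 - K + K * K)     # weight of the conjunctive part
    out = {}
    for A, v in p.items():
        out[A] = out.get(A, 0.0) + alpha * v
    for A, v in q.items():
        out[A] = out.get(A, 0.0) + beta * v
    out.pop(frozenset(), None)               # m_SACR(empty set) is forced to 0
    return out
```

By condition (C3), the resulting masses sum to 1 without any further normalization.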
6. Illustrations
6.1. Simple Examples

In order to show how Dempster's rule of combination provides counter-intuitive results when conflictual information is fused, Zadeh proposed the following example.

Example 2 (Zadeh [3]). Let m1 and m2 be two BPAs defined over a frame of discernment Θ = {θ1, θ2, θ3} such that m1(θ1) = m2(θ2) = 0.99 and m1(θ3) = m2(θ3) = 0.01.

This example is now well known and has been frequently analyzed. Since K ≈ 1, the results obtained using the SACR are almost identical to the results obtained using the disjunctive rule or Dubois and Prade's rule. The positive reinforcement of the belief of the singleton θ3 (compared to the conjunctive rule) is insignificant in this case. Singletons θ1 and θ2 cannot be distinguished, while singleton θ3 is not necessarily the most credible singleton. Significant differences between the SACR and Dempster's rule, Yager's rule and Dubois and Prade's rule of combination arise when K is neither too close to 0 nor too close to 1.

Example 3. Let m3 and m4 be two BPAs defined over Θ: m3(θ1) = m4(θ2) = 0.3 and m3(θ3) = m4(θ3) = 0.7.
Table 1. Example 3: Combining m3 and m4 using different combination rules in evidence theory.

Focal elements     Conjunctive rule q   Disjunctive rule p          Dempster's rule    Yager's   SACR
                   (Smets' rule)        (Dubois & Prade's rule)     (Inagaki's rule)   rule
∅                  0.51                 0                           0                  0         0
θ3                 0.49                 0.49                        1                  0.49      0.6532
{θ1, θ2}           0                    0.09                        0                  0         0.0612
{θ1, θ3}           0                    0.21                        0                  0         0.1428
{θ2, θ3}           0                    0.21                        0                  0         0.1428
Θ = {θ1, θ2, θ3}   0                    0                           0                  0.51      0
Decision           ∅                    θ3                          θ3                 θ3        θ3
Table 1 presents the combination of m3 and m4 using the different rules. We observe a positive reinforcement of the belief for the focal elements common to both m3 and m4 when using the SACR. Moreover, the other focal elements then have a smaller mass than those resulting from the disjunctive or Dubois and Prade combination rules.

Example 4. Let m5 and m6 be two BPAs defined over Θ: m5(θ1) = m5(θ3) = 0.3, m5({θ1, θ3}) = m5({θ1, θ2}) = 0.2, and m6(θ1) = m6({θ2, θ3}) = 0.4, m6(θ3) = 0.2.
Table 2. Example 4: Combining m5 and m6 using different combination rules in evidence theory.

Focal elements     Conjunctive rule q   Disjunctive   Dempster's rule    Yager's   Dubois &       SACR
                   (Smets' rule)        rule p        (Inagaki's rule)   rule      Prade's rule
∅                  0.34                 0             0                  0         0              0
θ1                 0.28                 0.12          0.4242             0.28      0.28           0.2909
θ2                 0.08                 0             0.1212             0.08      0.08           0.0681
θ3                 0.30                 0.06          0.4546             0.30      0.30           0.2816
{θ1, θ2}           0                    0.08          0                  0         0              0.0351
{θ1, θ3}           0                    0.30          0                  0         0.18           0.1315
{θ2, θ3}           0                    0.12          0                  0         0              0.0525
{θ1, θ2, θ3}       0                    0.32          0                  0.34      0.16           0.1403
Decision           ∅                    θ1            θ3                 θ3        θ3             θ1
Computing the pignistic probabilities before combination, Source 5 is in favour of θ1, while Source 6 is in favour of θ1 and θ3, equally. We thus expect that, after combination, θ1 will be the solution. Table 2 shows the results of the different combination rules for this example. Dempster's rule, Yager's rule or Dubois and Prade's rule associated with the considered decision criterion provide singleton θ3 as the solution, while the SACR alone gives θ1 as the solution because of a higher positive reinforcement.
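As a usage example, the SACR column of Table 2 can be reproduced with the sketches given earlier (the numerical check is our own):

```python
t1, t2, t3 = frozenset({"t1"}), frozenset({"t2"}), frozenset({"t3"})
m5 = {t1: 0.3, t3: 0.3, t1 | t3: 0.2, t1 | t2: 0.2}
m6 = {t1: 0.4, t2 | t3: 0.4, t3: 0.2}
result = sacr(m5, m6)
print({tuple(sorted(A)): round(v, 4) for A, v in result.items()})
# Conflict K = 0.34; the largest pignistic probability is obtained for t1.
```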
6.2. Test Scenario of Target Identification

In this section we study a simple test scenario of target identification. Several pieces of evidential information coming from an ESM are sequentially combined at a fusion centre. The 135 targets to be potentially identified are listed in a Platform Data Base (PDB), according to 22 features (ID, country, type, sub-type, emitters on board, etc.). The following simulation test was generated considering that the ESM provides wrong emitters in 20% of its reports. Suppose the observed target is the object θ48 from the database. All pieces of information are modeled in the DST using two focal elements: m(A) = 0.8 and m(Θ) = 0.2, where A is the set of objects corresponding to the received information. Figure 2 shows the results of the test scenario: the final pignistic probabilities obtained using the SACR and Dempster's rule, and the comparative evolution of the pignistic probability of the object θ48 during the fusion process for three combination rules. The pignistic probability decreases in the last fusion step, since the last piece of information received at the fusion centre was a counter-measure. Because the combination rules (except Dempster's rule) are not associative, the last piece of information to be fused will always be more important than those already fused. Using Dempster's rule, the final pignistic probability distribution shows that object θ48 is the only possible object, since BetP(θ48) ≈ 1; all other possibilities are excluded. We also remark that the pignistic probability of the singleton θ48 does not decrease significantly in the case of false reports (except at instant #9). This behavior is somehow counter-intuitive. The SACR reacts to the "false" information in a more natural
manner. The object θ48 is well identified by the three combination rules. Where Dempster's rule fails to provide other options in the identification of the observed target, the SACR gives other alternatives.
Figure 2. Test scenario with a probability of error for the ESM equal to 0.2
7. Conclusion
In this paper we defined a new class of adaptive combination rules for evidence theory. This new class of rules was developed to cope with the problem of combining conflictual information, a case where Dempster’s classical rule fails to provide intuitive results. The mixing of the classical conjunctive and disjunctive rules is done by using two weighting functions of the conflict factor K. Depending on the conflict, the proposed adaptive rules act more like a conjunctive rule or more like a disjunctive rule. We built the unique adaptive combination rule with symmetric weighting coefficients. An interesting property of positive reinforcement of the belief is shown. In some examples we saw that SACR is more adequate than other classical rules. Finally, we illustrated a test scenario of target identification and, due to non-associative properties, observed a more desirable behavior for SACR than for Dempster’s rule in the case of sequential fusion problems.
References
[1] A. Dempster, "Upper and Lower Probabilities Induced by Multivalued Mapping," Ann. Math. Statist., vol. 38, pp. 325-339, 1967.
[2] G. Shafer, A Mathematical Theory of Evidence. Princeton University Press, 1976.
[3] L. A. Zadeh, "Review of Shafer's Mathematical Theory of Evidence," AI Magazine, vol. 5, pp. 81-83, 1984.
[4] F. Voorbraak, "On the justification of Dempster's rule of combination," Artificial Intelligence, vol. 48, pp. 171-197, 1991.
[5] R. Haenni, "Are alternatives to Dempster's rule of combination real alternatives? Comments on "About the belief combination and the conflict management problem" by Lefevre et al.," Information Fusion, vol. 3, pp. 237-239, 2002.
[6] W. Liu and J. Hong, "Reinvestigating Dempster's idea on evidence combination," Knowledge and Inf. Syst., vol. 2, pp. 223-241, 2000.
[7] P. Smets and R. Kennes, "The Transferable Belief Model," Artificial Intelligence, vol. 66, pp. 191-234, 1994.
[8] J. Dezert, "Foundations for a new theory of plausible and paradoxical reasoning," Information & Security: An International Journal, vol. 9, pp. 90-95, 2002.
[9] R. R. Yager, "On the Dempster-Shafer framework and new combination rules," Information Science, vol. 41, pp. 93-138, 1987.
[10] D. Dubois and H. Prade, "Representation and combination of uncertainty with belief functions and possibility measures," Computational Intelligence, vol. 4, pp. 244-264, 1988.
[11] T. Inagaki, "Interdependence between safety-control policy and multiple-sensor schemes via Dempster-Shafer theory," IEEE Trans. on Reliability, vol. 40, no. 2, pp. 182-188, 1991.
[12] G. Rogova and V. Nimier, "Reliability in Information Fusion: Literature Survey," in Proceedings of the 7th Annual Conference on Information Fusion, ISIF, Ed., 2004, pp. 1158-1165.
[13] K. Sentz and S. Ferson, Combination of Evidence in Dempster-Shafer Theory. Sandia National Laboratory, Epistemic Uncertainty Project, Tech. Rep. SAND 2002-0835, 2002.
[14] I. Bloch, "Fusion of image information under imprecision and uncertainty: Numerical methods," in Data Fusion and Perception, G. D. Riccia, H.-J. Lenz, and R. Kruse, Eds., vol. 431. Springer-Verlag, NY, 2001, pp. 135-168.
[15] P. Smets, "Constructing the pignistic probability function in a context of uncertainty," Uncertainty in Artificial Intelligence, vol. 5, pp. 29-39, 1990, Elsevier Science Publishers.
[16] D. Dubois and H. Prade, "La fusion d'informations imprécises," Traitement du Signal, vol. 11, no. 6, pp. 447-458, 1994.
Advances and Challenges in Multisensor Data and Information Processing E. Lefebvre (Ed.) IOS Press, 2007 © 2007 IOS Press. All rights reserved.
Decision Support and Information Fusion in the Context of Command and Control Éloi BOSSÉ Defence R&D Canada Valcartier
Abstract. Command and control can be characterized as a dynamic human decision making process. A technological perspective of command and control has led system designers to propose solutions, such as decision support and information fusion, to overcome many domain problems. Solving the command and control problem requires balancing the human factor perspective with that of the system designer and coordinating the efforts in designing a cognitively fitted system to support the decision-makers. This paper discusses critical issues in the design of computer aids, such as a data/information fusion system, by which the decision-maker can better understand the situation in his area of operations, select a course of action, issue intent and orders, monitor the execution of operations, and evaluate the results. These aids will support the decision-makers in coping with uncertainty and disorder in warfare and in exploiting people or technology at critical times and places to ensure success in operations. Keywords. Command and Control, Data/Information Fusion, Situation Awareness, Decision Making, Cognitive Systems Engineering.
Introduction Command and Control (C2) is defined by the military community as the process by which a commanding officer can plan, direct, control and monitor any operation for which he is responsible in order to fulfill his mission. From a human factor perspective, the complexity of military operations highlights the critical role of human leadership in C2. To resolve adversity, C2 systems require qualities inherent to humans such as decision-making abilities, initiative, creativity and the notion of responsibility and accountability. Although these qualities are essential, characteristics inherent to the C2 environment combined with the advancement in threat technology significantly challenge the accomplishment of this process and therefore require the support of technology to complement human capabilities and limitations. A technological perspective of C2 has led system designers to propose solutions by providing operators with Decision Support Systems (DSS). These DSSs should aid operators in achieving the appropriate Situation Awareness (SA) state for their decision-making activities, and support the execution of the resulting actions. The lack of knowledge in cognitive engineering has in the past jeopardized the design of helpful computer based aids aimed at complementing and supporting human cognitive tasks. Moreover, this lack of knowledge has most of the time created new trust problems in the designed tools. Solving the C2 problem thus requires balancing the human factor perspective with that of the system designer and coordinating the efforts in designing a cognitively fitted
system to support decision-makers. The paper starts with a discussion on the C2 and decision-making process followed by decision support definitions and concepts. It then presents the problem of designing a cognitively fitted DSS using a Cognitive Science Engineering (CSE) approach.
1. Decision Making
The aim of C2 is to allow a commander to make decisions and take actions faster and better than any potential adversary. Accordingly, it is essential to understand how commanders make decisions. One stream of decision-making theory, based on decision theoretic paradigms, views decision making as an analytic process that corresponds closely to the military estimate of the situation. According to this analytic approach, the commander generates several options, identifies criteria for evaluating these options, rates the options against these criteria and then selects the best option as the basis for future plans and action. It aims at finding the optimal solution, but it is time-consuming and information intensive. A second stream of decision making theory emphasizes a more inductive than analytic approach. Called Naturalistic Decision Making (NDM), it emphasizes the acquisition of knowledge, the development of expertise and the ability of humans to generalize from past experience. It stresses pattern recognition, creativity, experience and initiative. In general, while the two models represent conceptually distinct approaches to decision making, they are not mutually exclusive in practice. The commander will adopt the approach that is best tailored to the situation and may use elements of the two at the same time. Indeed, a combination of the two is probably always at work within the C2 system.
1.1. Task/Human/Technology Triad Model
A triad approach has been proposed by Breton, Rousseau and Price [1] to represent the collaboration between the system designer and the human factors specialist. As illustrated in Figure 1, three elements compose the triad: the task, the technology and the human. In the C2 context, the Observe-Orient-Decide-Act (OODA) loop represents the task to be accomplished. The design process must start with the identification of environmental constraints and possibilities by Subject-Matter Experts (SMEs) within the context of a CSE approach. System designers are introduced via the technology element. Their main axis of interest is the link between the technology and the task. The general question related to this link is: "What systems must be designed to accomplish the task?" System designers also consider the human element. Their secondary axis of interest is thus the link between the technology and the human. The main question of this link is: "How must the system be designed to fit the human?" However, system designers have a hidden axis. The axis between the human and the task is usually not covered by their expertise. From their analyses, technological possibilities and limitations are identified. However, not all environmental constraints are covered by the technological possibilities. These uncovered constraints, named hereafter deficiencies, are then addressed as statements of requirements to the human factors community. These
requirements lead to better training programs, the reorganization of work and the need for leadership, team communication, etc.

Figure 1. Task/Human/Technology Triad Model. The task (the OODA loop), the technology and the human form the triad. For system designers, the principal axis is (1) Technology-Task, the secondary axis is (3) Technology-Human and the hidden axis is (2) Human-Task; for human factors specialists, the principal axis is (2) Human-Task, the secondary axis is (3) Technology-Human and the hidden axis is (1) Technology-Task.
2. Cognitive Engineering System Analyses CSE analyses are defined as approaches that aim to develop knowledge about the interaction between human information processing capacities and limitations, and technological information processing systems. The usefulness of a system is closely related to its compatibility with human information processing. Therefore, CSE analyses focus on the cognitive demands imposed by the world to specify how technology should be exploited to reveal problems intuitively to the decision maker’s brain. Cognitive Work Analysis (CWA) [2], one of the CSE approaches, seems to be the best choice to answer questions related to understanding the C2 task. 2.1. A Pragmatic approach to Cognitive Work Analysis (CWA) The Applied Cognitive Work Analysis (ACWA) methodology [3] emphasizes a stepwise process to reduce the gap to a sequence of small, logical engineering steps, each readily achievable. At each intermediate point the resulting decision-centered artifacts create the spans of a design bridge that link the demands of the domain as revealed by the cognitive analysis to the elements of the decision aid.
The ACWA approach is a structured, principled methodology to systematically transform the problem from an analysis of the demands of a domain to identifying visualizations and decision-aiding concepts that will provide effective support. The steps in this process include:
• Using a Functional Abstraction Network (FAN) model to capture the essential domain concepts and relationships that define the problem-space confronting the domain practitioners;
• Overlaying Cognitive Work Requirements (CWR) on the functional model as a way of identifying the cognitive demands / tasks / decisions that arise in the domain and require support;
• Identifying the Information / Relationship Requirements (IRR) for successful execution of these cognitive work requirements;
• Specifying the Representation Design Requirements (RDR) to define the shaping and processing for how the information / relationships should be represented to the practitioner;
• Developing Presentation Design Concepts (PDC) to explore techniques to implement these representation requirements into the syntax and dynamics of presentation forms to produce the information transfer to the practitioner.
Those steps are more extensively described in [3].
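The chain of decision-centered artifacts can be pictured concretely. The following minimal sketch is not part of the ACWA methodology itself and all domain content in it is invented for illustration; it simply represents each span of the design bridge as a small Python data structure, so that a presentation concept can be traced back to the functional node and cognitive work requirement it supports.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class PresentationDesignConcept:
    description: str                       # how the information is shown (PDC)

@dataclass
class RepresentationDesignRequirement:
    shaping: str                           # how information/relations are represented (RDR)
    concepts: List[PresentationDesignConcept] = field(default_factory=list)

@dataclass
class InformationRequirement:
    data_needed: str                       # information/relationship requirement (IRR)
    representations: List[RepresentationDesignRequirement] = field(default_factory=list)

@dataclass
class CognitiveWorkRequirement:
    decision: str                          # cognitive demand/decision overlaid on the FAN (CWR)
    information: List[InformationRequirement] = field(default_factory=list)

@dataclass
class FunctionalNode:
    purpose: str                           # node of the Functional Abstraction Network (FAN)
    work_requirements: List[CognitiveWorkRequirement] = field(default_factory=list)

# Invented illustration: one traceable path from domain purpose to display concept.
fan_node = FunctionalNode(
    purpose="Maintain the maritime picture of the area of operations",
    work_requirements=[CognitiveWorkRequirement(
        decision="Assess whether an unknown contact is a threat",
        information=[InformationRequirement(
            data_needed="Track kinematics, identity declarations and their reliability",
            representations=[RepresentationDesignRequirement(
                shaping="Show identity confidence relative to engagement timelines",
                concepts=[PresentationDesignConcept("Timeline bar with colour-coded confidence")])])])])
```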
3. Ontological Engineering for Computer-Based Decision Support Systems One of the main challenges for the design of computer-based support systems for decision-makers resides in the representation of domain specific types of objects and situations, and the relations between these elements and the environment. Formally and consistently representing these entities and relations is critical to the successful design of automated processes, which take such representations as input (e.g., for automated reasoning). A key element in the development of appropriate knowledge-based systems is to relate the representation of situation elements in the system to those used by operators. In this regard, CSE techniques such as ACWA are essential in deriving the way humans represent these elements within their mental models. The use of methods such as ACWA enables the elicitation of information and knowledge. It provides a design framework to build effective and trusted decision support. Methodologies in CSE focus on identifying information needs and aiding strategies that reflect the goals and tasks in the domain, along with the means available to achieve them. These analysis tasks are carried out by means of various knowledge elicitation techniques, such as interviews with SMEs. Importantly, methods such as CWA and ACWA not only model the cognitive tasks (e.g., decisions) undertaken by operators in the domain, but also the fundamental purposes, functions and constraints of the work domain (captured in FAN or Abstraction Hierarchy (AH) models) as well. The FAN and AH models consist of networked or hierarchically organized entities representing the purposes, functions, and physical components of the work domain. Ontological engineering is the process of the construction of ontologies, including as steps the analysis of the domain of interest, the modeling of relevant concepts, the building and encoding of the resulting ontology into an appropriate formalism and, finally, the validation and evaluation of the ontology.
Combining CSE and Ontological Engineering (OE) principles will potentially provide a very powerful methodology for building formal domain knowledge models (domain ontologies) for military applications and for their effective exploitation in decision support systems. In contrast to ontology and cognitive engineering alone, this novel approach will synergistically integrate ontological engineering and cognitive engineering into a unique and innovative methodology. The novelty comes from the mutual information process that enriches the results of each approach. The ontology provides a formal, theoretical structure to represent the concepts identified through the cognitive engineering analysis (specifically, the FAN/AH work domain models), while the cognitive engineering analysis provides constraint over the scope of entities incorporated into the ontology.
3.1. Ontologies
In the artificial intelligence community, where the concept of ontologies was first investigated for the engineering of knowledge bases, an ontology is defined as a formal and explicit specification of a conceptualization [5]. It specifies the semantics of the domain concepts using attributes, properties, and relationships between concepts, as well as constraints and axioms. Therefore, it provides a formal and shared understanding of a domain, facilitating exploitation by both human agents and application programs. Ontologies are central to the design of DSS for military applications; military C2 problems are knowledge-intensive and involve a large number of concepts (either physical or abstract elements) to be considered within the situation analysis and decision-making processes. Ontologies constitute formal domain models upon which reasoning processes can be based and knowledge-based systems can be built. A formal ontological framework is necessary to afford a formal structure for analysis of domain-specific types of objects and situations, and the relations between them, and to ensure a certain level of reusability of the designed domain-specific ontology in a different application domain. The strength of ontologies is that they constitute knowledge components that are reusable across different applications. Data fusion processes at different levels could thus exploit these knowledge bases according to their reasoning processes. Finally, ontologies can serve to support knowledge-level interoperability among heterogeneous knowledge sources. They provide a layer between an agent (human or artificial) and physical knowledge sources by formally defining the domain knowledge and explicitly specifying the content of the knowledge sources using the concepts of the ontology (meta-models). Using this approach, knowledge and information sources can be linked to the concepts of the ontology, and services can be provided to exploit the data and knowledge bases by a human or a machine agent.
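As a rough illustration of what such a formal domain model contains, the following minimal sketch encodes a few concepts, attributes and relations with subsumption. The maritime concepts and relation names are invented for the example; a real ontology would be expressed in a dedicated formalism such as a description logic rather than plain Python.

```python
# Minimal sketch of a domain ontology as typed concepts and relations.
concepts = {
    "Vessel":    {"attributes": ["position", "speed", "flag"]},
    "CargoShip": {"is_a": "Vessel", "attributes": ["cargo_type"]},
    "Port":      {"attributes": ["location", "capacity"]},
}

relations = [
    # (subject concept, relation, object concept)
    ("Vessel", "is_located_near", "Port"),
    ("CargoShip", "is_destined_for", "Port"),
]

def ancestors(concept: str) -> list:
    """Follow is_a links to collect the superclasses of a concept."""
    chain = []
    current = concepts.get(concept, {}).get("is_a")
    while current is not None:
        chain.append(current)
        current = concepts.get(current, {}).get("is_a")
    return chain

def relation_applies(subject: str, relation: str, obj: str) -> bool:
    """Check whether a relation is licensed for the subject or one of its superclasses."""
    for c in [subject] + ancestors(subject):
        if (c, relation, obj) in relations:
            return True
    return False

print(ancestors("CargoShip"))                                     # ['Vessel']
print(relation_applies("CargoShip", "is_located_near", "Port"))   # True (inherited from Vessel)
```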
4. Designing a Data/Information Fusion System The data-to-information relationship can be complex and requires a significant number of computations and/or transformations. The degree of transformation required can vary from simple algebra to complex, intelligent algorithms such as those used in data/information fusion. Data/information fusion is clearly a key enabler for situation
awareness and when built as a system becomes a key support to the decision-making process. A rich legacy in data fusion technology exists, ranging from the Joint Directors of Laboratories (JDL) data fusion process model to taxonomies of algorithms and engineering guidelines for architecture and algorithm selection. To date, numerous data fusion systems have been developed, especially for military applications. How can we use this legacy in our data fusion system design today? Unfortunately, the lack of common engineering standards, well-documented performance evaluations and architecture paradigms is a major impediment to objective evaluation, comparison or re-use of current data fusion systems. Developing a system that utilizes existing or developmental data fusion technology requires a standard method for specifying data fusion processing and control functions, interfaces, and associated data bases. In the initial JDL process model, the data fusion levels were never intended to be taken as a blueprint for system design or software development. However, in the revised version [6], the model has been extended to integrate a data fusion tree architecture model for system description, design and development. This data fusion tree would benefit from being guided and enriched with the formal methodology presented in the previous sections. The JDL guidelines recommend an architecture that represents data fusion processing in terms of nodes. When the data fusion process is partitioned into multiple processing nodes, the process is represented via a data fusion tree, illustrated in Figure 2 (from [6]). The guidelines recommend a four-phase process for developing data fusion processes within an information processing system, shown in Figure 3 (from [6]). To further enrich the guidelines, we suggest adding the CSE/OE (e.g. ACWA with an ontology layer) that will provide the data fusion designer with a formal methodology to capture overall system requirements and constraints. As a result, the risk of designing a data fusion system not cognitively fitted with the human will be mitigated.
5. Conclusion This paper presented discussions on the balance of human factors and technology in the design of decision support systems. CSE analysis methods and ontologies were introduced followed by an example of where these methodologies can be utilized in current engineering guidelines for designing data fusion systems.
References
[1] Breton, R., Rousseau, R. & Price, W. L., The Integration of Human Factors in the Design Process: a TRIAD Approach. Defence Research Establishment Valcartier, TM 2001-002, November 2001, 33 pages.
[2] Breton, R., Potter, S. S., and Bossé, É., Application of a Human-centric Approach to the Design of Decision Support Systems for Command and Control, Journal of Decision Support Systems, Elsevier, Submitted 2005.
[3] Paradis, S., Elm, W. C., Potter, S. S., Breton, R. and Bossé, É., A Pragmatic Cognitive System Engineering Approach to Model Dynamic Human Decision-Making Activities in Intelligent and Automated Systems, NATO RTO HFM Symposium on Human Integration in Intelligent and Automated Systems, Warsaw, October 2002, 8 pp.
[4] Bossé, É., Valin, P., Boury-Brisset, A.-C., Grenier, D., Exploitation of A Priori Knowledge for Information Fusion, Journal of Information Fusion, Elsevier, 2005.
[5] Gruber, T., A Translation Approach to Portable Ontology Specifications, Knowledge Acquisition, Vol. 5, No. 2, pp. 199-220, 1993.
[6] Steinberg, A. N., Bowman, C. L. and White, F. E., Revision to the JDL Data Fusion Model, Joint NATO/IRIS Conference, Quebec City, October 1998.
Figure 2. Integrated data fusion/resource management trees (from [6])
Figure 3. Data fusion system engineering method (modified from [6])
Fusion in European SMART Project on Space and Airborne Mined Area Reduction
Isabelle BLOCH a,1, Nada MILISAVLJEVIĆ b
a École Nationale Supérieure des Télécommunications, Paris, France
b Royal Military Academy, Signal and Image Centre, Brussels, Belgium
Abstract. This paper presents three fusion strategies applied within the European SMART project on space and airborne mined area reduction tools. Two strategies are based on belief function theory, and the third one is a fuzzy method. The main aim of the three methods consists of taking advantage of several available data sources with different properties, improving landcover classification and anomaly detection results, taking advantage of existing knowledge, and allowing user interaction. Keywords. Airborne mined area reduction, landcover classification, anomaly detection, belief function theory.
Introduction Three fusion strategies developed within the European SMART project [1,2] are presented in this paper. The underlying ideas of the methods are to take advantage of several available data sources with different properties, improve landcover classification and anomaly detection results, take advantage of existing knowledge, and allow user interaction. Available sources of information are Synthetic Aperture Radar (E-SAR) and multiband optical data (Daedalus), older KVR (satellite) data, knowledge about the sensors, registration, classification and detection results (from teams of the SMART consortium), ground-truth (legend and labeled regions in training and test areas) and expert knowledge. Two fusion strategies are based on belief function theory [3], with their main differences being the way discounting is performed and how classifiers are treated as information sources. The third strategy consists of fuzzy [4] weighted maximum fusion of only the best classifiers for each class. The strategies are illustrated on the test site of Glinska Poljana in Croatia, which is one of the three representatives of the Croatian terrain analyzed within the SMART project. The paper is organized as follows. In Section 2, three fusion methods are explained. In the following sections, knowledge inclusion and spatial regularization are discussed. Section 5 provides more detail about the levels of fusion involved, and Section 6 contains results. 1 Corresponding author: Isabelle Bloch, ENST-TSI, CNRS UMR 5141 LTCI, Rue Barrault 46, 75013 Paris, France; E-mail:
[email protected]
1. Fusion Input
The types of fusion inputs provided by different members of the SMART consortium are:
• E-SAR classification with confidence images per class
• E-SAR and Daedalus detection of hedges, trees, shadows, rivers, with confidence degrees for hedges and trees; shadows and rivers are discounted based on Daedalus bands
• supervised classification of Daedalus data, where the result is provided as a decision image
• region-based classification of Daedalus data with confidence images per class
• belief function classification of Daedalus data with confidence images per class
• E-SAR and Daedalus binary detection of roads
• E-SAR binary detection of rivers
• binary image of Daedalus and KVR change detection.
The main characteristics of the available information are:
• no classifier or anomaly detector is perfect
• their reliabilities vary considerably from one class to another
• complementarity between the sources
• lack of spatial information
• not all available knowledge is taken into account.
2. Three Fusion Methods
2.1. Belief Function Fusion Strategy Based on Global Discounting Factor (BF1)
Here, each classifier is considered as one information source. The focal elements are simply the classes (singletons of the frame of discernment D) and the full set (D). At first, the classifier outputs (confidence values) are directly used as starting mass functions for singletons. In cases where no confidence values are provided but only a decision image or a binary detection is, the mass takes only values 0 or 1. In the next step, global errors are included in order to discount the starting masses. We propose to use a discounting factor α equal to the sum of the diagonal elements of the confusion matrix, divided by the cardinality of the training areas. This discounting is applied to all starting masses. Then m(D) = 1 − α. Note that this means the explicit use of the confusion matrix, which should be computed on the training areas for each classifier or detector. Finally, the definition of the full set mass should take into account the fact that some classes are not detected (therefore it should be equal to 1 at points where 0 is obtained for all detected classes). As a result, at each step of the fusion, the focal elements are always singletons and D. The decision rule can be a maximum of belief, maximum of mass, or maximum of pignistic probability, all being equivalent in this case. This approach is very easy to implement, and demonstrates in a simple way that classifiers or detectors may not give any information on some classes and may be imperfect.
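A minimal numerical sketch of this strategy is given below, assuming per-class confidences that sum to one and using invented class names and values. The conjunctive combination is left unnormalized and the conflict is simply reported, which is one possible reading of the decision step rather than the exact bookkeeping used in SMART.

```python
CLASSES = ["water", "trees", "agric_in_use"]          # invented class names for the sketch

def bf1_masses(confidences, alpha):
    """Build masses on the singletons and on the full set D from one classifier.

    `confidences` are the per-class confidence values at one pixel; `alpha` is the
    global discounting factor (trace of the confusion matrix divided by the number of
    training pixels). Multiplying the singleton masses by alpha is one common reading
    of the discounting described above."""
    m = {c: alpha * v for c, v in zip(CLASSES, confidences)}
    m["D"] = 1.0 - sum(m.values())                     # remaining mass on ignorance
    return m

def conjunctive(m1, m2):
    """Unnormalized conjunctive combination when focal elements are singletons and D."""
    out = {c: 0.0 for c in CLASSES}
    out["D"] = m1["D"] * m2["D"]
    conflict = 0.0
    for a, va in m1.items():
        for b, vb in m2.items():
            if a == b and a != "D":
                out[a] += va * vb
            elif a == "D" and b != "D":
                out[b] += va * vb
            elif b == "D" and a != "D":
                out[a] += va * vb
            elif a != b:
                conflict += va * vb                    # mass sent to the empty set
    return out, conflict

m_a = bf1_masses([0.7, 0.2, 0.1], alpha=0.85)          # hypothetical classifier outputs
m_b = bf1_masses([0.6, 0.1, 0.3], alpha=0.70)
fused, conflict = conjunctive(m_a, m_b)
decision = max(CLASSES, key=lambda c: fused[c])        # maximum of mass over the classes
print(fused, conflict, decision)
```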
2.2. Belief Function Fusion Strategy Based on a More Specific Discounting Factor (BF2)
More sophisticated methods can be designed by considering each classifier as several sources. Namely, each classifier provides an output for each class, which can be considered as an information source for the fusion. In a simple model, the focal elements for the source defined by the output of a classifier or detector for a class Ci are Ci and D. Several instances of this approach have been designed, but the most interesting one is where the confusion matrix is used for a more specific discounting for each class. From the confusion matrix computed from the decisions made by one classifier on the training areas, we derive a kind of probability that the class is Ci given that the classifier says Cj as

    c(i, j) = conf(i, j) / Σ_i conf(i, j),

where the values conf(i, j) denote the coefficients of the confusion matrix. This formula corresponds to a normalization of the lines of the confusion matrices on training areas for all classifiers and detectors used as fusion input. It is possible to ignore low values and normalize the others to reduce the number of non-zero coefficients (and so the number of focal elements in the following). In our experiments, the threshold value is equal to 0.05. We use c(i, j) for discounting in the methods described in the previous subsection. At the moment, we still consider that a class j of a classifier is one source. Then, from v(Cj), i.e., the value provided for this class by a classifier, we define:
• m(Ci) = v(Cj) c(i, j), for all classes Ci which are confused with Cj (which provides Σ_i m(Ci) = v(Cj)), and
• m(D) = 1 − v(Cj).
Compared to BF1, instead of keeping a mass on Ci only (and D), this mass is spread over all classes possibly confused with Ci, thus better exploiting the richness of the information provided by a classifier.
2.3. Fuzzy Fusion
To compare the previous methods with a fuzzy approach, we test a simple method, where we choose for each class the best classifiers, and combine them with a maximum operator (possibly with some weights). Then, a decision is made according to a maximum rule. The choice is made based on the confusion matrix for each classifier or detector, by comparing the diagonal elements in all matrices for each class. This approach is interesting because it is very fast. It uses only a part of the information, which could also be a drawback if this part is not chosen appropriately. Some weights have to be tuned, which may need some user interaction in some cases. Although it may sound somewhat ad hoc, it is interesting to show what we can get by using the best parts of all classifiers. Note that it is possible to make this approach more automatic.
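The fuzzy strategy of Section 2.3 can be sketched as a weighted maximum over the selected classifiers. The classifier names, weights and confidence values below are invented; in practice the per-class selection would come from the confusion-matrix diagonals as described above.

```python
import numpy as np

# Per-class confidence images from several classifiers, stacked as
# {classifier: array of shape (n_classes, rows, cols)}. Values are invented.
n_classes, shape = 3, (2, 2)
rng = np.random.default_rng(1)
classifier_outputs = {"region_based": rng.random((n_classes, *shape)),
                      "belief_function": rng.random((n_classes, *shape)),
                      "esar_logistic": rng.random((n_classes, *shape))}

# Hypothetical choice of the best classifiers per class, with optional weights
# (e.g., a factor 2 to compensate lower values from one classifier).
best_per_class = {0: [("region_based", 1.0), ("belief_function", 1.0)],
                  1: [("region_based", 1.0), ("esar_logistic", 2.0)],
                  2: [("belief_function", 1.0)]}

fused = np.zeros((n_classes, *shape))
for cls, sources in best_per_class.items():
    # Weighted maximum over the selected classifiers only.
    stack = [w * classifier_outputs[name][cls] for name, w in sources]
    fused[cls] = np.maximum.reduce(stack)

decision = np.argmax(fused, axis=0)        # maximum rule over classes at each pixel
print(decision)
```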
3. Knowledge Inclusion To improve results, some additional knowledge can be included in the fusion results (knowledge of the classifiers, their behaviors, etc. has already been included in the previous steps). We use at this step only the pieces of knowledge that directly provide information on the landcover classification. Other pieces of knowledge such as mine reports, etc., are not directly related to classes of interest, but rather to the dangerous areas, and are therefore included in the danger map construction, which follows the fusion task. At this step, several pieces of knowledge prove to be very useful. They concern, on the one hand, some “sure” detection. Some detectors are available for roads and rivers, which provide areas or lines that surely belong to these classes. There is almost no confusion, though some parts may be missing. These detections can then be imposed on the classification results. This is achieved by replacing the label of each pixel in the decision image by the label of the detected class if this pixel is actually detected. If not, its label is not changed. As for roads, additional knowledge is used, namely on the width of the roads (based on observations done during the field missions). Since the detectors provide only lines, these are dilated by the appropriate size, taking into account both the actual road width and the resolution of the images. On the other hand, another type of knowledge is very useful: the detection of changes between images taken during the project and KVR images obtained several years earlier. The results of the change detection processing provide information primarily about a class named “abandoned agricultural land”, since it exhibits fields which were previously cultivated, and which are now abandoned. Again these results do not show all regions belonging to the class “abandoned agricultural land”, but the detected areas surely belong to that class. A similar process can then be applied as for the previous detectors. With the proposed methods, it was difficult to obtain good results on the class “agricultural land in use”, while preserving the results on the class “abandoned agricultural land”, which is crucial since it corresponds to fields no longer in use and which are therefore potentially dangerous. Therefore we use the best detection of the class “agricultural land in use” (extracted from region based classification on Daedalus) as an additional source of knowledge.
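The "sure detection" mechanism can be sketched as a simple label overwrite, with road lines dilated to a plausible width. The width and resolution figures below are placeholders, not the SMART field-mission values.

```python
import numpy as np
from scipy.ndimage import binary_dilation

def impose_sure_detections(decision, detection_mask, class_label):
    """Overwrite the decision image with a 'sure' class wherever the detector fired."""
    out = decision.copy()
    out[detection_mask] = class_label
    return out

def road_mask_from_lines(line_mask, road_width_m, pixel_size_m):
    """Dilate a one-pixel-wide road line to the expected road width."""
    half_width_px = max(1, int(round(road_width_m / (2.0 * pixel_size_m))))
    structure = np.ones((2 * half_width_px + 1, 2 * half_width_px + 1), dtype=bool)
    return binary_dilation(line_mask, structure=structure)

# Tiny synthetic example: a 6x6 decision image and a detected road line.
decision = np.full((6, 6), fill_value=2, dtype=int)          # 2 = some land-cover class
road_line = np.zeros((6, 6), dtype=bool)
road_line[3, :] = True                                       # detector output: a thin line
roads = road_mask_from_lines(road_line, road_width_m=6.0, pixel_size_m=2.0)
decision = impose_sure_detections(decision, roads, class_label=7)  # 7 = "asphalted roads"
print(decision)
```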
4. Spatial Regularization The last step is a regularization step. Indeed, it is very unlikely that isolated pixels of one class can appear in another class. Several local filters have been tested, such as a majority filter, a median filter, or morphological filters, applied on the decision image. A Markovian regularization approach on local neighborhoods was tested too. The results are somewhat better, but not significantly. A better approach is to use the segmentation into homogeneous regions provided by another team in the project. In each of these regions, a majority voting is performed. We count the number of pixels in this region that are assigned to each class and the class that has the largest cardinality is chosen for the whole region (all pixels of this region are relabeled and assigned to this class). This type of regularization, which is performed at a regional level rather than at a local one, provides good results.
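The region-level majority vote can be sketched in a few lines. The decision and region images below are toy arrays; a real run would use the segmentation into homogeneous regions provided by the SMART consortium.

```python
import numpy as np

def region_majority_vote(decision, regions):
    """Relabel each segmented region with its majority class.

    `decision` is the per-pixel class image; `regions` is a label image from an
    independent segmentation into homogeneous regions (same shape)."""
    out = decision.copy()
    for region_id in np.unique(regions):
        mask = regions == region_id
        classes, counts = np.unique(decision[mask], return_counts=True)
        out[mask] = classes[np.argmax(counts)]
    return out

# Toy example: two regions, each containing a few isolated 'wrong' pixels.
decision = np.array([[1, 1, 2, 3],
                     [1, 2, 2, 2],
                     [1, 1, 2, 2]])
regions = np.array([[0, 0, 1, 1],
                    [0, 0, 1, 1],
                    [0, 0, 1, 1]])
print(region_majority_vote(decision, regions))
```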
5. On the Levels of Fusion Three levels of fusion are often distinguished: low level (usually pixel level), intermediary level (features such as lines or regions) and higher level (usually called decision fusion). Here all three levels are addressed in the proposed schemes. All three levels appear in the input of fusion, since classifiers may be based on pixels, on regions, on detection of linear structures (as for river or roads), on semantics (like change detection), etc. But, they also appear in the fusion itself; the computation of the combination is performed at pixel level, but based on semantic information provided by the classifiers and detectors (in terms of classes and decisions). Therefore this step elegantly merges two levels of fusion. The final regularization step is performed at an intermediary level, in homogeneous regions which are not reduced to a simple pixel neighborhood, but which, on the other hand, do not cover one class each (several regions belong to the same class).
6. Results For method BF1, the basic fusion results are reasonably good, except for the class “abandoned agricultural land”, where they are worse than using individual classifiers. Results for the class “water” are especially good. After knowledge inclusion, the results are improved, in particular for the class “abandoned agricultural land”. This shows the importance of knowledge on changes in the fusion. An additional improvement is obtained by the regularization step. In the case of BF2, the results are somewhat better than those obtained with the method BF1. The class “abandoned agricultural land” is not well detected and a lot of confusion occurs with the classes “agricultural land in use” and “trees and shrubs”. The class “rangeland” is not detected at all, but that class is not important for the analyzed application. After knowledge inclusion, the results are improved and confusion between the classes “abandoned agricultural land” and “agricultural land in use” is greatly reduced. Finally, the regularization step improves in particular the class “water”, which reaches a very satisfactory level. For the fuzzy method, the following output of classifiers has been used for each class: • for the class “abandoned agricultural land”: E-SAR logistic regression, regionbased classification, belief function classification and change detection (results for this class of these classifiers, combined with a maximum operator, with a factor 2 for logistic regression to compensate the lower values provided by this classifier); • for the class “agricultural land in use”: region-based classification and belief function classification; • for the class “asphalted roads”: region-based classification and road detection; • for the class “rangeland”: region-based classification, minimum distance classification and belief function classification; • for the class “residential areas”: region-based classification and belief function classification; • for the class “trees and shrubs”: region-based classification and E-SAR trees and hedges detection;
• for the class "shadows": E-SAR logistic regression, E-SAR shadow detection, minimum distance classification and belief function classification; the maximum is discounted by a factor 0.5, taking into account that this class is not really significant for further processing (shadows "hide" meaningful classes);
• for the class "water": region-based classification, belief function classification and river detection.
The results of the basic fusion are already very good. This can be explained by the fact that not all information provided by the classifiers is used, but only their best part. Compared with previous methods, it is somewhat less automatic and more ad hoc, but allows for reaching good results very fast. After knowledge inclusion, the improvement is clear, although not as strong as with the previous approaches (since the results were already good). All classes are detected. Finally the regularization step provides some more improvements, but the class "shadow" disappears. This is not a serious problem since this class is not significant for further processing. Comparison of the three methods based on user and producer accuracy for the three most important classes with regard to the application is given in Table 1. Note that the best classifier is not always the same, which further justifies the benefit of fusion.
Table 1. Comparison of the three methods

Class                          Measure             Best classifiers   BF1    BF2    Fuzzy
Abandoned agricultural land    User accuracy       0.82               0.87   0.79   0.81
Abandoned agricultural land    Producer accuracy   0.84               0.81   0.78   0.88
Agricultural land in use       User accuracy       0.87               0.86   0.81   0.91
Agricultural land in use       Producer accuracy   0.84               0.90   0.91   0.85
Asphalted roads                User accuracy       0.88               0.96   0.96   0.97
Asphalted roads                Producer accuracy   0.77               0.88   0.88   0.88
Note that in addition to the decision images for each method, we also provide confidence and stability images. The confidence image represents, at each pixel, the maximum confidence over all classes at this point (i.e., the confidence degree of the decided class). The stability image is computed as the difference between the two highest confidence degrees (i.e., confidence in the decided class and confidence in the second highest possible class). If the stability is high, it means that there is no doubt about the decision (one class is well distinguished from all other ones). If it is low, it means that two classes are very close to each other in terms of confidence, so the decision should be considered carefully. The confidence image and the stability images can be multiplied to provide a global image evaluating the quality of the classification at each point.
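The confidence, stability and combined quality images can be computed directly from the stack of per-class confidences, as in the following sketch with invented values.

```python
import numpy as np

def confidence_and_stability(conf_stack):
    """Compute per-pixel confidence and stability from per-class confidence images.

    `conf_stack` has shape (n_classes, rows, cols). Confidence is the maximum over
    classes; stability is the gap between the two highest values; their product can
    serve as a global quality image."""
    sorted_conf = np.sort(conf_stack, axis=0)
    confidence = sorted_conf[-1]
    stability = sorted_conf[-1] - sorted_conf[-2]
    return confidence, stability, confidence * stability

# Hypothetical confidences for 3 classes on a 2x2 image.
stack = np.array([[[0.70, 0.40], [0.20, 0.90]],
                  [[0.60, 0.30], [0.10, 0.05]],
                  [[0.10, 0.35], [0.70, 0.05]]])
conf, stab, quality = confidence_and_stability(stack)
print(conf); print(stab); print(quality)
```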
Conclusion
After a detailed bibliographical analysis and general considerations regarding the SMART project, several numerical fusion approaches were specified, adapted to the available data and classifiers or detectors. These approaches are largely original. Results have been detailed with the three most promising approaches. We have shown how the results can be improved by introducing knowledge in the fusion process. For instance, knowledge about roads has two parts: one comes from the images and provides mainly the location, and the other comes from field missions and provides mainly road width. A spatial regularization further improves the results. The final results are at least as good as the ones provided for each class by the best classifier for that class. Thus, they are globally better than any input classifier or detector. This clearly shows the improvement brought by fusion. Implementation issues have been addressed too, and a specific implementation is proposed to reduce memory cost and computational burden. The user can intervene in the choice of classifiers and some of the parameters. The methods are to some extent specific to the actual SMART data. However, the programs are very general and can be used in any other application of belief-function-based data fusion without further work. The most crucial point is to define how to use the output of classifiers as input to fusion. This requires knowledge of the behavior of the classifiers and of which type of information they provide on each class (or disjunction of classes), in order to choose the appropriate focal elements and mass functions. Although some help can be found in confusion matrices, some supervision may be needed at this step. The relative weight to be given to each classifier output with respect to the others belongs to the same class of problems. This requires moderate additional work for each new application, since several trials with different parameters will likely have to be done, but the effort will be considerably smaller than the work presented here. The large quantity of work done for this fusion module will certainly be useful in many other applications, even in quite different domains, and therefore constitutes a large set of methods and tools for both research and applied work.
Acknowledgements The authors wish to thank the members of the SMART consortium who have provided their processing results as inputs for fusion.
References
[1] I. Bloch and N. Milisavljević, Report on possible fusion strategies in SMART. Technical report, 2003.
[2] Y. Yvinec, European project of remote detection: SMART in a nutshell. In Proc. of Robotics and Mechanical Assistance in Humanitarian Demining and Similar Risky Interventions, Brussels-Leuven, Belgium, 2004.
[3] P. Smets, The Transferable Belief Model for Uncertainty Presentation. Technical Report TR/IRIDIA/95-23, IRIDIA, Université libre de Bruxelles, Brussels, Belgium, 1995.
[4] L. A. Zadeh, Fuzzy Sets. Information and Control, 8: 338-353, 1965.
The DSmT Approach for Information Fusion
Jean DEZERT and Florentin SMARANDACHE

Let Θ = {θ1, ..., θn}, n ≥ 1, be a frame of discernment. Dezert-Smarandache Theory (DSmT) works on the hyper-power set DΘ, i.e., the set of all composite propositions built from ∅ and θ1, ..., θn with the ∪ and ∩ operators, so that |2Θ| ≤ |DΘ| ≤ 2^(2^n). A generalized basic belief assignment is a mapping m(.): DΘ → [0, 1] with m(∅) = 0 and Σ_{A∈DΘ} m(A) = 1; the generalized belief and plausibility of A are Bel(A) = Σ_{B⊆A, B∈DΘ} m(B) and Pl(A) = Σ_{B∩A≠∅, B∈DΘ} m(B).

Under the free DSm model Mf(Θ), in which no exclusivity constraint is imposed on the elements θi, two sources m1(.) and m2(.) are combined with the classical DSm rule

    mMf(Θ)(C) ≡ m(C) = Σ_{A,B∈DΘ, A∩B=C} m1(A) m2(B),   ∀C ∈ DΘ.

When integrity constraints force some elements of DΘ to be empty (a hybrid model M(Θ)), the hybrid DSm rule of combination for k ≥ 2 sources redistributes the corresponding masses:

    mM(Θ)(A) = φ(A) [ S1(A) + S2(A) + S3(A) ],

where φ(A) = 1 if A is non-empty under the model and φ(A) = 0 otherwise; S1(A) ≡ mMf(Θ)(A) is the classical DSm combination of the k sources; S2(A) transfers the masses of propositions that become empty to the total or relative ignorances (with It = θ1 ∪ ... ∪ θn denoting the total ignorance); and S3(A) transfers the remaining masses of empty intersections X1 ∩ X2 ∩ ... ∩ Xk to the union u(X1) ∪ ... ∪ u(Xk), where u(X) denotes the union of the θi appearing in X and c(X) the canonical form of X. The rule extends to imprecise generalized basic belief assignments mI(.), whose "masses" are subsets of [0, 1], by replacing products and sums with the corresponding interval operations.

For two sources m1, m2: G → [0, 1] defined on G (either 2Θ or DΘ), the total degree of conflict is k12 = Σ_{X1,X2∈G, X1∩X2=∅} m1(X1) m2(X2). The Proportional Conflict Redistribution rule no. 5 (PCR5) redistributes each partial conflicting mass to the elements involved in that partial conflict, proportionally to their masses:

    mPCR5(X) = m12(X) + Σ_{Y∈G\{X}, c(X∩Y)=∅} [ m1(X)² m2(Y) / (m1(X) + m2(Y)) + m2(X)² m1(Y) / (m2(X) + m1(Y)) ],   ∀X ∈ G \ {∅},

where m12(.) denotes the conjunctive consensus of the two sources and c(x) the canonical form of x.
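As a concrete illustration of the PCR5 redistribution recalled above, the sketch below combines two invented basic belief assignments on a two-element frame under Shafer's model (θ1 ∩ θ2 = ∅). It is only a numerical example of the rule, not code from the contribution.

```python
from itertools import product

# Focal elements on a two-element frame under Shafer's model: the two singletons
# and their union. Masses below are invented for the example.
m1 = {"t1": 0.6, "t2": 0.3, "t1ut2": 0.1}
m2 = {"t1": 0.2, "t2": 0.7, "t1ut2": 0.1}

def intersect(a, b):
    """Set intersection for this tiny frame; '' stands for the empty set."""
    if a == b:
        return a
    if a == "t1ut2":
        return b
    if b == "t1ut2":
        return a
    return ""                                   # t1 ∩ t2 = ∅ under Shafer's model

# Conjunctive consensus m12 and collection of the partial conflicts.
m12 = {k: 0.0 for k in m1}
conflicts = []                                  # (X from source 1, Y from source 2, m1(X)m2(Y))
for (a, va), (b, vb) in product(m1.items(), m2.items()):
    c = intersect(a, b)
    if c:
        m12[c] += va * vb
    else:
        conflicts.append((a, b, va * vb))

# PCR5: each partial conflict m1(X)m2(Y) with X∩Y=∅ goes back to X and Y
# proportionally to m1(X) and m2(Y).
m_pcr5 = dict(m12)
for x, y, _ in conflicts:
    m_pcr5[x] += m1[x] ** 2 * m2[y] / (m1[x] + m2[y])
    m_pcr5[y] += m2[y] ** 2 * m1[x] / (m2[y] + m1[x])

print(m12)        # conjunctive consensus (conflicting mass left aside)
print(m_pcr5)     # PCR5 result; the masses sum to 1
```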
Multitarget Tracking Applications of Dezert-Smarandache Theory
A. TCHAMOVA 1,2, J. DEZERT 3, T. SEMERDJIEV 1, P. KONSTANTINOVA 1
1 Bulgarian Academy of Sciences, Institute for Parallel Processing, Sofia, Bulgaria
3 Office National d'Études et Recherches Aérospatiales, Châtillon, France
Abstract. The objective of this study is to present two multitarget tracking applications based on Dezert-Smarandache Theory (DSmT) for plausible and paradoxical reasoning: (1) Target Tracking in Cluttered Environment with Generalized Data Association, incorporating the advanced concept of generalized data (kinematics and attribute) association to improve track maintenance performance in complicated situations (closely spaced and/or crossing targets), when kinematics data are insufficient for correct decision making; (2) Estimation of Target Behavior Tendencies, developed on the principles of DSmT applied to conventional passive radar amplitude measurements, which serve as evidence for the corresponding decision-making procedures. The aim is to present and to demonstrate the ability of DSmT to improve the decision-making process and to assure awareness about the tendencies of target behavior in case of discrepancies in measurements interpretation. Keywords. Dezert-Smarandache Theory, Multitarget Tracking, Attribute Data Fusion, Generalized Data Association, Decision Making under Uncertainty.
Introduction
An important function of each radar surveillance system in a cluttered environment is to keep and improve targets' track maintenance performance. This becomes a crucial and challenging problem, especially in complicated situations of closely spaced and/or crossing targets. The design of a modern multitarget tracking (MTT) algorithm [4,5,6] in a real-life stressful environment motivates the incorporation of advanced concepts for generalized data association. To resolve correlation ambiguities and to select the best observation-track pairings, in this first application of DSmT, a particular generalized data association approach is proposed and incorporated in an MTT algorithm. This approach allows the introduction of target attributes into the association logic, based on the general DSm rule of combination. Estimation of target behavior tendencies is an important subject related to angle-only tracking systems, which are based on passive sensors. These systems tend to be less precise than those based on active sensors, but one important advantage is their stealth. In a single sensor case, only the direction of the target as an axis is known, but the true target position and behavior (approaching or receding) remains unknown.
1 Corresponding Author: Albena Tchamova, Bulgarian Academy of Sciences, Institute for Parallel Processing, "Acad. G. Bonchev" str., bl. 25-A, 1113 Sofia, Bulgaria; e-mail: [email protected]
2 This work is partially supported by MONT grants I-1205/02, I-1202/02 and by Center of Excellence BIS21++.
A number of developed tracking
techniques operating on angle-only measurements use additional information. We utilize the measured emitter's amplitude values in consecutive scans. This information can be used to assess tendencies in a target's behavior and, consequently, to improve the overall angle-only tracking performance. The aim of this application is to present and to demonstrate the ability of DSmT to successfully finalize the decision-making process and to ensure awareness about a target's behavior tendencies in case of discrepancies of angle-only measurements. Results are presented and compared in detail with the respective ones drawn from the fuzzy logic approach in companion papers [3,7]. The DSmT [1,2,3] proposes a new general mathematical framework for solving fusion problems. It overcomes the practical limitations of Dempster-Shafer Theory (DST), coming essentially from the acceptance of the law of the excluded middle.3 DSmT is an extension of probability theory and DST.
1. Target Tracking in Cluttered Environment with Generalized Data Association based on the General DSm Rule of Combination
1.1. Basic Elements of Tracking Process
The tracking process consists of two basic elements: data association and track filtering. The goal of the first element is to correctly associate observations with existing tracks. To eliminate unlikely observation-track pairings, a validation gate is formed around the predicted track position. Measurements in the gate are candidates for association with the corresponding track. The used tracking filter is the first order extended Kalman filter [4,5]. We assume Gaussian distributed measurements. One defines a threshold constant for gate G such that correlation is allowed if the relationship d²_ij ≤ G is satisfied, where d²_ij is the norm of the residual vector.
1.2. Generalized Data Association
When attribute data are available, generalized probability can be used to improve the assignment. In view of the independence of kinematic and attribute measurement errors, the generalized probability for measurement j originating from track i is

    P_gen(i, j) = P_k(i, j) · P_a(i, j),

where P_k(i, j) and P_a(i, j) are kinematic and attribute probability terms. We choose a set of assignments that ensures a maximum of the total generalized probability sum, i.e., use the solution of the assignment problem min Σ_{i=1..n} Σ_{j=1..m} a_ij χ_ij.
3 It is a basic theorem of propositional logic, where it is written P ∨ ¬P.
Because our
probabilities vary between 0 and 1, the elements of the particular assignment matrix are defined as a_ij = 1 − P_k(i, j) · P_a(i, j) to satisfy the condition to be minimized.
1.3. The Fuzzification Interface
The fuzzification interface [3, pp.307] transforms crisp measurements into a fuzzy set. The input variable is the Radar Cross Section (RCS) of the observed targets, which is treated as a linguistic variable. The modeled RCS data [7] are analyzed with the subsequent declaration for a specified type (Fighter, Cargo) or False Alarm. Bearing this in mind, we define two frames of the problem: first, the size of the RCS, Θ1 = {Very Small (VS), Small (S), Big (B)}, and second, its corresponding target type, Θ2 = {False Alarm (FA), Fighter (F), Military Cargo (C)}. The RCS for real targets is modelled as a Swerling 3 type function; for false alarms, as a Swerling 2.
1.4. Tracks' Updating Procedures
1.4.1. Using the Classical DSm Rule of Combination
The classical DSm combination rule is used for track updating:

    m_upd^ij(C) = [m_his^i ⊕ m_mes^j](C) = Σ_{A,B∈DΘ1, A∩B=C} m_his^i(A) · m_mes^j(B),

where m_upd^ij represents the generalized basic belief assignment (gbba) of the updated track i with the new observation j; m_his^i and m_mes^j are respectively the gbba vectors of track i's history and of the new observation. DSmT takes into account and utilizes the paradoxical information hidden in the non-empty intersections VS∩S∩B, VS∩S, VS∩B, S∩B.
1.4.2. Using the Hybrid DSm Rule of Combination
RCS data are used to analyze and subsequently to determine the type of the observed target, so a target's type represents the second frame of the problem, Θ2 = {(FA)lse Alarm, (F)ighter, Military (C)argo}. We consider the following relationships:
• If the RCS is Very Small then the target is a False Alarm
• If the RCS is Small then the target is a Fighter
• If the RCS is Big then the target is a Cargo
We transform the updated tracks' gbba defined on DΘ1 into the respective gbba on DΘ2 according to these relationships, i.e., m_upd^ij(C ∈ DΘ2) ← m_upd^ij(C ∈ DΘ1). Here the following exclusivity constraints are introduced: FA∩F ≡ ∅, FA∩C ≡ ∅, F∩C ≡ ∅, FA∩F∩C ≡ ∅. We update the previous fusion result, obtained via the classical DSm rule, with this new information on the model M1(Θ2), and solve with the DSm hybrid rule [3], which transfers the masses of the empty sets to the non-empty sets of DΘ2.
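A small numerical sketch of this track-attribute update follows. The masses are invented, and the implementation only handles focal elements that are intersections of singletons (which suffices when the inputs are defined on the singletons of Θ1), so it is a simplification of the full hyper-power set.

```python
from itertools import product

# Focal elements are represented as frozensets of singleton labels; a set like
# {"S", "B"} stands for the DSm intersection S∩B, which is NOT empty under the
# free DSm model. Only intersections of singletons are handled (no unions).
def dsm_classic(m1, m2):
    out = {}
    for (a, va), (b, vb) in product(m1.items(), m2.items()):
        c = a | b                            # canonical form of the intersection a∩b
        out[c] = out.get(c, 0.0) + va * vb
    return out

# Invented example: a track's attribute history and a new RCS-based observation
# on the frame Theta1 = {VS, S, B}.
VS, S, B = frozenset({"VS"}), frozenset({"S"}), frozenset({"B"})
m_history     = {VS: 0.1, S: 0.6, B: 0.3}
m_measurement = {VS: 0.2, S: 0.5, B: 0.3}

m_updated = dsm_classic(m_history, m_measurement)
for focal, mass in sorted(m_updated.items(), key=lambda kv: -kv[1]):
    print("∩".join(sorted(focal)), round(mass, 3))
```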
1.5. The Generalized Data Association (GDA) Algorithm
We consider particular clusters and sets of n tracks and m received observations at a current scan. The validation test is used for filling the assignment matrix. We solve the assignment problem by the extension of the Munkres algorithm [8]. The JPDA approach [4] is used to produce the probability terms Pk, Pa. To define the probabilities for data association the following steps are implemented: (1) check gating; (2) clustering; (3) for each cluster: (3.1) generate hypotheses following a depth-first search procedure; (3.2) compute hypothesis probabilities for kinematic and attribute contributions; (3.3) fill the assignment matrix and solve the assignment problem.
1.5.1. Attribute Probability Term for Generalized Data Association
Calculating the attribute probability term follows the joint probabilistic approach:

    P''(H_l) = Π_{(i,j)∈H_l, i≠0, j≠0} d_e(i, j),    with    d_e(i, j) = Σ_{C∈DΘ1} [m^i(C) − m^j(C)]²,

where d_e(i, j) is the Euclidean distance between m^i(C), the predicted bba of C from the track history of target i, and m^j(C), the bba of C of the attribute measurement j. The corresponding normalized probabilities are obtained as

    P_a(H_l) = P''(H_l) / Σ_{l=1..N_H} P''(H_l),

where N_H is the number of hypotheses. To compute P_a'(i, j), a sum is taken over the probabilities of those hypotheses in which this assignment occurs. Because the Euclidean distance is inversely proportional to the probability of association, the probability P_a(i, j) = 1 − P_a'(i, j) is used to match the corresponding P_k(i, j).
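The generalized assignment step can be sketched as follows with invented probability terms. SciPy's linear_sum_assignment is used here in place of the extended Munkres algorithm of [8]; it plays the same role for this square example.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical kinematic and attribute probability terms for 3 tracks x 3 measurements.
Pk = np.array([[0.80, 0.10, 0.05],
               [0.15, 0.70, 0.20],
               [0.05, 0.25, 0.75]])
Pa = np.array([[0.70, 0.20, 0.10],
               [0.10, 0.60, 0.30],
               [0.20, 0.30, 0.65]])

# Generalized probability and the cost a_ij = 1 - Pk*Pa to be minimized.
Pgen = Pk * Pa
cost = 1.0 - Pgen

rows, cols = linear_sum_assignment(cost)
for i, j in zip(rows, cols):
    print(f"track {i} <- measurement {j}  (Pgen = {Pgen[i, j]:.3f})")
```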
1.6. Simulation Scenario
Scenario 1 [3, pp.317] consists of two air targets (Fighter, Cargo) in clutter and a stationary sensor at the origin with Tscan = 5 sec and measurement standard deviations of 0.3 [deg] and 60 m for azimuth and range. The targets are moving east to west with a constant velocity of 250 m/sec. The headings of the fighter and cargo are 225 [deg] and 315 [deg] from north, respectively. During the 11th–14th scans, the targets perform maneuvers with 2.5 g. Scenario 2 [3, pp.317] consists of four air targets (alternating Fighter, Cargo, Fighter, Cargo) moving with a constant velocity of 100 m/sec. The heading at the beginning is 155 [deg] from north. The targets make maneuvers with 0.85 g (right, left, right turns).
1.7. Comparative Analysis of the Results Obtained Using Kinematics Only, Dezert-Smarandache and Dempster-Shafer Theory
The incorporated advanced concept of Generalized Data Association (GDA) leads to improved track maintenance performance, especially in complicated situations (closely spaced and/or crossing targets). It influences the obtained track purity results [3, pp.318-321]. Track purity increases using DSmT as compared with DST. Analyzing all the results, the following can be underlined:
• DSmT allows paradoxical information to be processed and utilized in a flexible manner. This paradoxical information is peculiar to the problem of multiple target tracking in clutter, where the conflicts between the bodies of evidence often become high and critical. In this way it contributes to a better understanding of the overall tracking situation and to producing an adequate decision. Processing the paradoxes, the estimated entropy in the confirmed tracks' attribute histories decreases during the consecutive scans.
• Because of the Swerling type modelling, observations for False Alarms, Fighter and Cargo are mixed. This causes some conflict between the general basic belief assignments of the described bodies of evidence. When the conflict becomes unity, it leads to indefiniteness in Dempster's rule; consequently the fusion process cannot be realized and the whole MTT becomes corrupted.
• If a new measurement leads to an update of a track's attributes in which some particular hypothesis is supported by unity, then after that point Dempster's rule becomes indifferent to any other measurements in the following scans. This means the track's attribute history remains the same, regardless of the received observations, which leads to incoherent and inadequate decisions with respect to the right associations.
2. Estimation of Target Behavior Tendencies using DSmT 2.1. Approach for Behavior Tendency Estimation The block diagram of the target's behavior tracking system [3, pp.291] maintains two single-model-based Kalman-like filters using two models of target behavior Approaching and Receding. The tendency prediction is based on Zadeh's
compositional rule [9,10,11]. The updating procedure uses the DSm rule to estimate target behavior states.
2.2. Fuzzification Interface
A decisive variable in our task is the value of the amplitude transmitted from the emitter and received at consecutive time moments. We use the fuzzification interface [3, pp.292] that maps these values into two fuzzy sets, Θ = {Small (S), Big (B)}. Their membership functions rely on the inverse proportion dependency between the measured amplitude value and the corresponding distance to the target.
2.3. Behavior Models
We consider two target behavior models: Approaching - characterized as a stable process where the amplitude value gradually increases; and Receding - characterized as a stable process where the amplitude value gradually decreases. To conform to these models, the following rule bases have to be carried out:

Behavior Model 1: Approaching Target
Rule 1: IF A(k) is Small THEN A(k+1) is Small
Rule 2: IF A(k) is Small THEN A(k+1) is Big
Rule 3: IF A(k) is Big THEN A(k+1) is Big

Behavior Model 2: Receding Target
Rule 1: IF A(k) is Big THEN A(k+1) is Big
Rule 2: IF A(k) is Big THEN A(k+1) is Small
Rule 3: IF A(k) is Small THEN A(k+1) is Small

The models are derived as fuzzy graphs, in which the Larsen product operator is used for fuzzy conjunction, maximum for fuzzy union, and Zadeh's max-min rule of composition [9,10,11].

Relation 1: Approaching Target (k → k+1)
        S      S∩B    B      S∪B
S       1      0      1      0
S∩B     0      0      0      0
B       0.2    0      1      0
S∪B     0      0      0      0

Relation 2: Receding Target (k → k+1)
        S      S∩B    B      S∪B
S       1      0      0.2    0
S∩B     0      0      0      0
B       1      0      1      0
S∪B     0      0      0      0
2.4. Models' Conditioned Attribute State Prediction
At the initial time moment k, the target is characterized by the fuzzified amplitude state estimates according to the models, μ_A^App(k|k) and μ_A^Rec(k|k). Using them and applying the Zadeh max-min compositional rule to relations 1 and 2, we obtain the models' conditioned amplitude state predictions for time moment k+1, i.e.:

    μ_A^App(k+1|k) = max[ min( μ_A^App(k|k), μ^App(k → k+1) ) ],
    μ_A^Rec(k+1|k) = max[ min( μ_A^Rec(k|k), μ^Rec(k → k+1) ) ].
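The prediction step can be sketched directly from the relation table above. The current fuzzified amplitude state below is invented, and only the Approaching relation is shown.

```python
import numpy as np

def max_min_composition(state, relation):
    """Zadeh max-min composition: predicted membership over the columns of `relation`
    given the current membership `state` over its rows."""
    # For each output element j: max over i of min(state[i], relation[i, j]).
    return np.max(np.minimum(state[:, None], relation), axis=0)

# Elements of the amplitude hyper-power set, in the order used by the relation table.
elements = ["S", "S∩B", "B", "S∪B"]
relation_approaching = np.array([[1.0, 0.0, 1.0, 0.0],
                                 [0.0, 0.0, 0.0, 0.0],
                                 [0.2, 0.0, 1.0, 0.0],
                                 [0.0, 0.0, 0.0, 0.0]])

# Hypothetical fuzzified amplitude state at scan k (mostly "Small").
mu_k = np.array([0.8, 0.1, 0.3, 0.0])
mu_pred = max_min_composition(mu_k, relation_approaching)
print(dict(zip(elements, mu_pred)))   # prediction for scan k+1 under the Approaching model
```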
2.5. Attribute State Updating using DSmT
The updating procedure uses the DSm combination rule:

    m_upd^(App/Rec)(C) = [m_pred^(App/Rec) ⊕ m_mes](C) = Σ_{A,B∈DΘ, A∩B=C} m_pred^(App/Rec)(A) · m_mes(B).
DSmT takes into account and utilizes the paradoxical information hidden in the non-empty set S∩B. This information refers to a moving target residing in an overlapping region, where it is hard to properly predict the tendency in its behavior.
2.6. The Decision Criteria
The decision criterion for estimating the plausibility of the models is based on the evolution of the generalized pignistic entropies [3] associated with the updated amplitude states:

    H_pig(P_upd^M) = − Σ_{A∈DΘ} P_upd^M{A} ln P_upd^M{A}.

The correct model corresponds to the smallest entropy value among these entropies.
2.7. Simulation Study
A simulation scenario [3, pp.296] is developed for a single target trajectory in plane coordinates and for constant velocity movement. The target's start point and velocities are X0 = 5 km, Y0 = 10 km, Ẋ = 100 m/s, Ẏ = 100 m/s. The time sampling rate is T = 10 s. The measured amplitude value is a random Gaussian distributed process.
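The decision criterion itself reduces to comparing entropies, as in the sketch below. The pignistic probabilities are invented, and the generalized pignistic transformation that would produce them from the updated masses is not reproduced here.

```python
import numpy as np

def pignistic_entropy(pignistic_probs):
    """Shannon-type entropy of a pignistic probability vector (0·ln 0 treated as 0)."""
    p = np.asarray(pignistic_probs, dtype=float)
    nz = p > 0
    return float(-np.sum(p[nz] * np.log(p[nz])))

# Hypothetical pignistic probabilities of the updated amplitude states for the two
# behavior models at one scan.
p_approaching = [0.70, 0.20, 0.08, 0.02]
p_receding    = [0.40, 0.30, 0.20, 0.10]

entropies = {"Approaching": pignistic_entropy(p_approaching),
             "Receding": pignistic_entropy(p_receding)}
best_model = min(entropies, key=entropies.get)
print(entropies, "->", best_model)    # the smallest entropy designates the correct model
```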
2.8. Comparison between the Results of the DSmT and Fuzzy Logic (FL) Approaches
The DSmT and FL approaches deal with a frame of discernment based, in general, on imprecise/vague notions and concepts. DSmT allows dealing with rational, uncertain or paradoxical data, operating on the hyper-power set. In our particular application, DSmT provides an opportunity for flexible tracking in the overlapping region S∩B. The DSmT-based behavior estimates can be characterized as noise resistant, while FL uses an additional noise reduction procedure to produce 'smoothed' behavior estimates.
References
[1] Dezert J., Foundations for a new theory of plausible and paradoxical reasoning, in Information & Security: An International Journal, edited by Prof. Tzv. Semerdjiev, CLPP, BAS, Vol. 9, 2002.
[2] Dezert J. and F. Smarandache, "On the generation of hyper-powerset for the DSmT," Proceedings of the 6th International Conference on Information Fusion, Cairns, Australia, July 8-11, 2003.
[3] Smarandache F., J. Dezert (Editors), Advances and Applications of DSmT for Information Fusion, American Research Press, Rehoboth, 2004.
[4] Blackman S., Multitarget Tracking with Radar Applications, Artech House, 1986.
[5] Blackman S. and R. Popoli, Design and Analysis of Modern Tracking Systems, Norwood, MA, Artech House, 1999.
[6] Bar-Shalom Y. (Ed.), Multitarget-Multisensor Tracking: Advanced Applications, Artech House, 1990.
[7] Benchmark Problem for Radar Resource Allocation and Tracking Maneuvering Targets in the Presence of ECM, Technical Report NSWCDD/TR-96/10.
[8] Bourgeois F., J.-C. Lassalle, "An Extension of the Munkres Algorithm for the Assignment Problem to Rectangular Matrices," Communications of the ACM, Vol. 14, Dec. 1971, pp. 802-806.
[9] Zadeh L., "Fuzzy Sets as a Basis for a Theory of Possibility," Fuzzy Sets and Systems, 1978, 1, pp. 3-28.
[10] Zadeh L., "From computing with numbers to computing with words - from manipulation of measurements to manipulation of perceptions," IEEE Trans. on Circuits and Systems, Jan. 1999, 45, 1, pp. 105-119.
[11] Mendel J., "Fuzzy Logic Systems for Engineering: A Tutorial," Proc. of the IEEE, March 1995, pp. 345-377.
Image Registration: A Tutorial

Pramod K. VARSHNEY a,1, Bhagavath KUMAR a, Min XU a, Andrew DROZD b and Irina KASPEROVICH b
a EECS Department, Syracuse University, Syracuse, New York, USA
b ANDRO Computational Solutions, Rome, NY, USA
Introduction

Multiple imaging sensors are increasingly being used in a variety of imaging applications. Image registration is a necessary preprocessing task for all such systems. The images involved in such applications can be taken at different times, using different sensors and from different viewpoints. Such a huge variation in the type of imagery makes the problem of image registration non-trivial. Also, with the increase in the number of sensor types, the use of multi-modal images has become very popular. There is a current need to develop an in-depth understanding of image registration, and the goal of this chapter is to provide a tutorial exposition of the topic. For a more detailed discussion, the reader is referred to [1, 2]. Image registration essentially involves the process of aligning or overlaying two or more images of the same scene. Some of the application areas where image registration is required include remote sensing, medical image analysis, cartography, pattern recognition and computer vision. Remote sensing applications involve environmental monitoring (pollution), change detection (urban studies, forestry), oil and mineral exploration, weather forecasting, target location, planetary observation and integrating information into Geographic Information Systems (GIS). Medical applications involve combining different modalities of images for biomedical research, tumor detection and other medical analysis.
1. Definition of Image Registration

Let the two images to be registered be R and F (the reference and floating images), each being a 2D scalar function of the pixel location (x, y). Let T be the spatial transformation required to align them and G be the radiometric transformation required to equate the intensities of both images. Image registration can be defined as the mapping between R(x, y) and F(x, y). The mapping can be expressed as

R(x, y) = G(F(T(x, y)))    (1)
The process of image registration essentially involves the estimation of the transformation T. One can model the problem of registration as an optimization problem, where T is the argument of the optimum of some similarity metric S applied to R and the transformed floating image F_T. This can be expressed as in Eq. (2):

T = arg opt S(R, F_T)    (2)

1 Corresponding Author, Email: [email protected]
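For concreteness, the following Python sketch poses Eq. (2) as an explicit search: a placeholder similarity metric (mean square difference) is evaluated over an exhaustive set of integer translations; the metric, the transformation space and all names are illustrative assumptions rather than the chapter's prescribed choices.

import numpy as np

def mse(a, b):
    """Mean square difference between two equally sized images (a simple similarity metric S)."""
    return float(np.mean((a.astype(float) - b.astype(float)) ** 2))

def register_translation(R, F, search=10):
    """Eq. (2) for a translation-only transformation space: T = arg min_{tx, ty} MSE(R, F_T)."""
    best_t, best_s = (0, 0), np.inf
    for ty in range(-search, search + 1):
        for tx in range(-search, search + 1):
            FT = np.roll(np.roll(F, ty, axis=0), tx, axis=1)   # wrap-around shift, sketch only
            s = mse(R, FT)
            if s < best_s:
                best_t, best_s = (tx, ty), s
    return best_t

# Toy usage: F is a shifted copy of R; the search recovers the shift that maps F back onto R.
R = np.random.rand(64, 64)
F = np.roll(np.roll(R, 2, axis=0), -3, axis=1)
print(register_translation(R, F))   # expected (3, -2)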
2. Types of Image Registration

Image registration can be classified into four different classes based on the kind of images and the problem involved [1, 2]. The four classes are:
• Multi-modal registration
• Viewpoint registration
• Temporal registration
• Template matching
The first three typically involve registration of images of the same area, but with distortion as a result of different sensors, different orientation of the sensors and different times of acquisition. The last one usually involves the identification of the location of a particular smaller template in a large image, if one exists. It is often used in pattern matching and object recognition. Registration problems could also involve a combination of these four classes, e.g., registration of images taken at different times and from different viewpoints.

2.1. Multi-modal Registration

Registration of images of the same scene taken using different sensors is known as multi-modal registration. The main aim of multi-modal image analysis is to integrate information from different sources to obtain enhanced and detailed image information. Images from various kinds of sensors have special properties. For example, panchromatic images have high spatial resolution, multi-spectral images have high spectral resolution, and active sensors like SAR work even at night. In the case of medical images, a CT image of the brain (Figure 1) captures a better view of bones while a Magnetic Resonance Image (MRI) of the brain (Figure 2) provides information about the soft tissues [3]. The fusion of these two images would yield a third image which would provide an analyst a better understanding of the anatomical structure of the brain. Other areas of application of multi-modal image analysis include remote sensing and video surveillance.
Figure 1. CT image of brain, showing bones
Figure 2. MRI image of brain, showing soft tissues
2.2. Viewpoint Registration Registration of images of the same scene acquired from different viewpoints is known as viewpoint registration. Images acquired using an aircraft typically have variations in the view angle due to the motion of the aircraft. Many SAR sensors like Side-Looking Airborne Radar (SLAR) have a deliberate angle of acquisition to the vertical, which provides them better range information. In general it provides a larger view. Also such images help in 3D reconstruction and shape recognition, e.g., in concealed weapon detection (Figure 3) [4, 5]. It helps in depth recovery used in Digital Elevation Model (DEM) generation. Registration of such images involves the use of assumptions about viewing geometry and properties of the surfaces. The perspective distortions need to be accounted for using local transformations. Feature-based algorithms are often used for such registration.
Figure 3. Images taken from two view angles for Concealed Weapon Detection
2.3. Temporal Registration Registration of images of the same scene taken at different times is known as temporal registration. The analysis of multi temporal images is used in the detection and evaluation of changes that have taken place in between the times of acquisition. In the recent past, due to increased interest in surveillance and disaster management, the evaluation of changes in an area using multi-temporal images [6, 7] has become important. Some other remote sensing applications [8, 9] of this analysis include natural resource monitoring, urban growth monitoring and landscape planning. Such
analysis also finds great application in a medical context as it is useful in detecting and monitoring tumor growth. Registration of multi-temporal images is a problem of dissimilar images. The method should be able to tolerate and differentiate between distortions caused due to changes (to be evaluated) and mis-registration in the original images. 2.4. Template Registration Registration of a template image in a larger image is referred to as template registration. It essentially is a high level matching of pre-selected features with known properties. It finds applications in various areas such as in remote sensing where a satellite image is to be registered into a GIS layer or a map. This kind of registration is also referred to as scene to model registration, where an image of the scene and a model of the scene are registered. Similarly, one can also register the images to a DEM. Such a process is useful for interpreting scenes like airports, battlefields, and networks of highways etc. In medical imaging it is used to compare a patient’s image with digital anatomical atlases. It is also used for pattern matching in computer vision, automatic quality inspection, signature verification, and character recognition. For example, in Figure 4, a T-shaped template is to be located in an IC Circuit for automatic quality assurance of the circuit designed or inspected [1].
Figure 4. Image of an IC Circuit and ‘T’ shaped Template
3. Transformations

Registration techniques involve searching within a certain type of transformation space to find the optimal transformation for a particular problem [1]. Hence, the selection of the type of spatial transformation or mapping is the fundamental characteristic of any image registration technique. The most common transformations used in image registration are:
• Rigid transformation
• Affine transformation
• Projective transformation
• Polynomial transformation
• Radial basis function based transformation
The first two are most commonly used in remote sensing images in practice, but depending on various conditions, different transformations are useful. Figure 5 shows various types of transformation applied to a Baboon image.
Figure 5. Transformation examples: (a) original image, (b) rigid transform, (c) affine transform, (d) second order polynomial transform, (e) projective transform
3.1. Rigid Transformation

A rigid-body transformation is composed of a combination of translation, rotation, and scaling. A 2-D image typically has four parameters, translation in x, translation in y, scaling and rotation (t_x, t_y, s, θ), which map a point (x_1, y_1) of the first image to a point (x_2, y_2) of the second image as

\begin{pmatrix} x_2 \\ y_2 \end{pmatrix} = \begin{pmatrix} t_x \\ t_y \end{pmatrix} + s \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x_1 \\ y_1 \end{pmatrix}    (3)
For rigid body transformation, the angles and lengths in the original image are preserved after the transformation. 3.2. Affine Transformation Affine transformations are more general than rigid body transformations and, therefore, admit more complicated distortions while maintaining some nice mathematical properties. The affine transformation has six parameters, which can be written as,
\begin{pmatrix} x_2 \\ y_2 \end{pmatrix} = \begin{pmatrix} a_{13} \\ a_{23} \end{pmatrix} + \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} x_1 \\ y_1 \end{pmatrix}    (4)
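As a minimal sketch of Eqs. (3) and (4), the rigid parameters can be packed into the affine form and applied to pixel coordinates; the parameter values below are illustrative.

import numpy as np

def rigid_to_affine(tx, ty, s, theta):
    """Build the matrix and offset of Eq. (4) from the rigid parameters (tx, ty, s, theta) of Eq. (3)."""
    A = s * np.array([[np.cos(theta), -np.sin(theta)],
                      [np.sin(theta),  np.cos(theta)]])
    b = np.array([tx, ty], dtype=float)
    return A, b

def apply_affine(points, A, b):
    """Map an (N, 2) array of (x1, y1) points to (x2, y2) = b + A (x1, y1)^T."""
    return points @ A.T + b

pts = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
A, b = rigid_to_affine(tx=5.0, ty=-2.0, s=1.5, theta=np.deg2rad(30))
print(apply_affine(pts, A, b))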
Affine transformation does not have the properties associated with the orthogonal rotation matrix. Angles and lengths are no longer preserved, but parallel lines remain
parallel. A shear transformation is an example of one type of affine transformation. A shear transform can act either along the x-axis or along the y-axis. The shear transforms in the x-axis and y-axis are represented as

\mathrm{Shear}_x = \begin{pmatrix} 1 & a \\ 0 & 1 \end{pmatrix}, \qquad \mathrm{Shear}_y = \begin{pmatrix} 1 & 0 \\ b & 1 \end{pmatrix}    (5)
3.3. Projective Transformation

Projective and perspective transformations account for distortions due to the projection of objects at varying distances from the sensor onto the image plane. It is a transform from 3-D to 2-D. When the object plane is parallel to the image plane, the perspective transformation becomes a projective transformation. Let (x_p, y_p) denote plane coordinates and (x_i, y_i) denote image coordinates. The projective transformation is then written as

x_i = \frac{a_{11} x_p + a_{12} y_p + a_{13}}{a_{31} x_p + a_{32} y_p + a_{33}}, \qquad y_i = \frac{a_{21} x_p + a_{22} y_p + a_{23}}{a_{31} x_p + a_{32} y_p + a_{33}}    (6)

3.4. Polynomial Transformation

Polynomial transformations are among the general global transformations (of which affine is the simplest) and can account for many types of distortions so long as the distortions do not vary too much over the image. Distortion due to moderate terrain relief can often be corrected by a polynomial transformation. However, higher order polynomials are not usually used in practical applications because they can unnecessarily warp the sensed images [2]. The typical second order polynomial transformation is represented as
\begin{pmatrix} x_2 \\ y_2 \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} & a_{13} & a_{14} & a_{15} & a_{16} \\ a_{21} & a_{22} & a_{23} & a_{24} & a_{25} & a_{26} \end{pmatrix} \begin{pmatrix} x_1^2 \\ x_1 y_1 \\ y_1^2 \\ x_1 \\ y_1 \\ 1 \end{pmatrix}    (7)
3.5. Radial Basis Function-based Transformation Radial basis function-based transformations are the class of transforms that can handle even locally varying geometric distortions. This transformation has the form of a linear combination of translated radially symmetric function plus a low-degree polynomial [1]. The radial basis function reflects an important property of the function value at each point; it depends just on the distance of the point from the control points, not on its particular position [1]. The main radial basis functions used in image registration include multi-quadrics, reciprocal multi-quadrics, Gaussian, Wendland’s functions, and
thin-plate splines. This group of transformations provides good registration accuracy but has a large computational complexity.
4. Interpolation Techniques

Interpolation involves the process of estimating intensity values based on neighborhood information. Let us denote the two images that need to be registered as F, the floating image on which a geometric transformation will be applied, and R, the reference image that will be interpolated. When a transformation is applied to F, a new grid is obtained, and an intensity interpolation algorithm is necessary for the calculation of the intensity values in R at every transformed grid point of F. Some of the most commonly used techniques are:
• Nearest neighbor interpolation
• Bi-linear interpolation
• Bi-cubic interpolation
Figure 6. Resulting image using different interpolation techniques: (a) original image, (b) nearest neighbor, (c) bi-linear, (d) bi-cubic
4.1. Nearest Neighbor Interpolation Nearest neighbor interpolation is the simplest interpolation method. This method uses the digital value from the pixel in R, which is nearest to the new transformed grid point (Figure 7.a), and therefore does not alter the original values. However, it results in some pixel values being duplicated while others are lost. The transformed image may have a disjointed or blocky appearance. Figure 6 shows the resulting image from this interpolation.
Figure 7. Different interpolation techniques: (a) nearest neighbor, (b) bi-linear, (c) bi-cubic
4.2. Bi-linear Interpolation

Bi-linear interpolation takes a weighted average of four pixels in the original image nearest to the new pixel location (Figure 7.b). Given two points x_0 and x_1 and the values of the function at these two points equal to y_0 and y_1, the linear interpolation in 1-D is equivalent to the calculation of y, the value of the function at any point x in the interval [x_0, x_1]. From Figure 8, using basic coordinate geometry, we can obtain the value y as given in Eq. (8):

y = y_0 + \frac{y_1 - y_0}{x_1 - x_0} (x - x_0)    (8)
It follows in the 2-D domain as in Figure 8, with four known points Q11, Q12, Q21, Q22, we can first obtain the value at point R1 by linear interpolation of Q11 and Q21 and the value at point R2 by linear interpolation of Q12 and Q22. Then by the linear interpolation of R1 and R2, we can obtain the value at point P.
Figure 8. Linear Interpolation and Bi-linear Interpolation
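The following Python sketch implements this two-stage averaging for a single non-integer location; the coordinate convention (x indexing columns, y indexing rows) and the border clipping are assumptions made for the example.

import numpy as np

def bilinear(img, x, y):
    """Bi-linear interpolation of the image intensity at a non-integer location (x, y)."""
    h, w = img.shape
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x0, y0 = min(max(x0, 0), w - 1), min(max(y0, 0), h - 1)
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    dx, dy = x - x0, y - y0
    # Interpolate along x on the two rows (the points R1 and R2 of Figure 8), then along y.
    top    = (1 - dx) * img[y0, x0] + dx * img[y0, x1]
    bottom = (1 - dx) * img[y1, x0] + dx * img[y1, x1]
    return (1 - dy) * top + dy * bottom

img = np.array([[0.0, 1.0], [2.0, 3.0]])
print(bilinear(img, 0.5, 0.5))   # 1.5, the weighted average of the four neighbours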
In bi-linear interpolation, the averaging process alters the original pixel values and creates entirely new digital values in the output image (Figure 6). Thus this may be undesirable if further processing and analysis, such as classification based on spectral response, is to be done. 4.3. Bi-cubic Interpolation The bi-cubic interpolation method calculates a distance-weighted average of a block of sixteen pixels from the original image, which surround the new output pixel location (Figure 7.c). As with bi-linear interpolation, this method results in completely new pixel values. However, these two methods both produce images that have a much sharper appearance and avoid the blocky appearance of the nearest neighbor method (Figure 6).
5. Image Registration Methods 5.1. Feature-based Method Feature-based methods are based on the extraction of salient structures or features in the images [2]. The feature space is a particular aspect of the images that is used for
comparing the images. The common features used for registration are of three types, namely, feature points [10], contours such as lines and edges [11, 12], and regions such as trees, fields, and buildings. The feature points can be points of locally maximum curvature on contour lines, centers of windows having locally maximum variances, centers of gravity of closed-boundary regions and line intersections. These feature points are usually selected as control points (Figure 9). Once the control points are matched, the spatial transformation can be determined using a least squares method. Appendix 1 provides the whole process of control point based image registration. In addition to these features, some high level features [13] such as Fourier descriptors, moment invariants, shape matrices and B-spline descriptors are often employed to represent the object. These high level features are normally scale, rotation and translation invariant and therefore can be effectively used to match the objects in the two images. The choice of the type of invariant description depends on the feature characteristics and the assumed geometric deformation of the images.
Figure 9. Control point based registration
5.2. Fourier-based Method

The Fourier-based method [14, 15, 16] works with the image in the frequency domain and utilizes the translation and rotation properties of Fourier transforms. Let f1 and f2 be two images that differ only by a displacement (x0, y0), i.e., f2(x, y) = f1(x − x0, y − y0). Then, in the Fourier domain, the corresponding Fourier transforms F1 and F2 are related as shown in Eq. (9). The cross power spectrum of the two images f1 and f2 with Fourier transforms F1 and F2 is defined as in Eq. (10).

F_2(u, v) = e^{-j 2\pi (u x_0 + v y_0)} F_1(u, v)    (9)
\frac{F_1(u, v)\, F_2^{*}(u, v)}{\left| F_1(u, v)\, F_2^{*}(u, v) \right|} = e^{\,j 2\pi (u x_0 + v y_0)}    (10)
where F* is the complex conjugate of F. Eq. (10) shows that the phase of the cross power spectrum is equivalent to phase difference between the images. Further, the inverse Fourier transform of the cross power spectrum is an impulse at the displacement that is needed to optimally register the two images (See Eq. (11)).
F^{-1}\!\left[ e^{\,j 2\pi (u x_0 + v y_0)} \right] = \delta(x - x_0,\, y - y_0)    (11)
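A minimal Python sketch of Eqs. (10)-(11) is given below; it assumes the two images differ by a pure circular shift, and the conjugation order is chosen so that the peak of the inverse transform falls at the shift taking f1 into f2.

import numpy as np

def phase_correlation(f1, f2):
    """Estimate the (row, column) displacement between f1 and f2 from the cross power spectrum."""
    F1, F2 = np.fft.fft2(f1), np.fft.fft2(f2)
    cross = F2 * np.conj(F1)
    cross /= np.abs(cross) + 1e-12              # normalized cross power spectrum
    corr = np.real(np.fft.ifft2(cross))         # ideally an impulse at the displacement
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    h, w = f1.shape
    if dy > h // 2: dy -= h                     # map wrap-around indices to negative shifts
    if dx > w // 2: dx -= w
    return dy, dx

f1 = np.random.rand(128, 128)
f2 = np.roll(np.roll(f1, 7, axis=0), -5, axis=1)    # f2 is f1 circularly shifted by (7, -5)
print(phase_correlation(f1, f2))                    # expected (7, -5)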
According to the Fourier translation property and the Fourier rotation property, if the floating image has rotation θ0 besides the translation, the images are related as in Eq. (12) and the corresponding transforms are related as in Eq. (13). If we consider the magnitudes of the transforms F1 and F2 to be M1 and M2 , it is easy to see that magnitudes of both the spectra are the same, but one is a rotated replica of the other (See Eq. (14)). f 2 ( x, y ) = f1 ( x cos θ 0 + y sin θ 0 − x0 , − x sin θ 0 + y cos θ0 − y0 )
(12)
F2 ( u, v ) = e− j 2π (ux0 + vy0 ) F1 ( u cos θ0 + v sin θ 0 , −u sin θ 0 + v cos θ 0 )
(13)
M 2 ( u , v ) = M1 ( u cos θ 0 + v sin θ 0 , −u sin θ 0 + v cos θ 0 )
(14)
A rotation without translation can be determined in a similar manner using phase correlation, by representing the rotation as a translational displacement in polar coordinates; i.e., in the polar representation the angle θ0 can be easily found using phase correlation [3]. Once the rotation is estimated, we apply it to the floating image so that the new floating image only has a translation error compared with the reference image. Then we calculate the cross power spectrum of the new floating image and the reference image to obtain the translation. Appendix 2 provides the whole process of Fourier based image registration. In summary, the Fourier based method uses the cross power spectrum of the two images to determine the translation and the cross power spectrum of the two polar images to determine the rotation angle. It is computationally fast but cannot reach high accuracy. The Fourier based method works well if the images are corrupted by frequency dependent noise.

5.3. Intensity-based Method

Intensity based methods use the raw intensities of the images to perform registration. A similarity measure, denoted as S in Eq. (2), is defined using the intensity values. Some commonly used similarity measures are:
• Sequential similarity
• Correlation
• Mutual information
Both correlation and sequential similarity measure the degree of similarity between an image and a template. Unlike correlation, smaller values imply a better match in the case of sequential similarity. In comparison with correlation, the sequential similarity technique improves the efficiency of finding the optimal transformation by orders of magnitude. The application of these methods is restricted to a great extent to images of
the same modality. Unlike these two methods, the mutual information based method is more successful with multi modal images. 5.3.1. Sequential Similarity The sequential similarity measure is computationally efficient, but it increases the size of the search space. Three sequential similarity measures used are • Mean Square Difference (MSD) (Eq. (15)) • Absolute Difference (AD) (Eq. (16)) • Normalized Absolute Difference (NAD) (Eq. (17)).
MSD(i, j) = \frac{\sum_x \sum_y \left( u(x, y) - v(x - i, y - j) \right)^2}{\#\text{ of pixels in the overlap}}    (15)

AD(i, j) = \frac{\sum_x \sum_y \left| u(x, y) - v(x - i, y - j) \right|}{\#\text{ of pixels in the overlap}}    (16)

NAD(i, j) = \frac{\sum_x \sum_y \left| \left( u(x, y) - \hat{u} \right) - \left( v(x - i, y - j) - \hat{v} \right) \right|}{\#\text{ of pixels in the overlap}}    (17)
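The following Python sketch evaluates these three measures over the overlap of the two images for a given integer displacement (i, j); taking the means û and v̂ over the overlap region is an assumption of this example.

import numpy as np

def _overlap(u, v, i, j):
    """Crop u(x, y) and v(x - i, y - j) to the region where both are defined (integer i, j)."""
    h, w = u.shape
    y0, y1 = max(0, j), min(h, h + j)
    x0, x1 = max(0, i), min(w, w + i)
    return u[y0:y1, x0:x1].astype(float), v[y0 - j:y1 - j, x0 - i:x1 - i].astype(float)

def msd(u, v, i, j):
    a, b = _overlap(u, v, i, j)
    return np.mean((a - b) ** 2)                                   # Eq. (15)

def ad(u, v, i, j):
    a, b = _overlap(u, v, i, j)
    return np.mean(np.abs(a - b))                                  # Eq. (16)

def nad(u, v, i, j):
    a, b = _overlap(u, v, i, j)
    return np.mean(np.abs((a - a.mean()) - (b - b.mean())))        # Eq. (17)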
5.3.2. Correlation

Cross-correlation [17] provides the basic statistical approach to registration. It is often used for template matching or pattern recognition in which the location and orientation of a template or pattern is to be found in an image. It is a similarity measure or match metric, i.e., it gives a measure of the degree of similarity between an image and a template. These methods are generally useful for images that are misaligned by small rigid or affine transformations. They are most successful in cases where the intensities of the two images involved have a linear relationship. For a template T and image I, where T is small compared to I, the two-dimensional Normalized Cross-Correlation (NCC) function (Eq. (18)) measures the similarity for each translation.
NCC(u, v) = \frac{\sum_x \sum_y T(x, y)\, I(x - u, y - v)}{\sqrt{\sum_x \sum_y I^2(x - u, y - v)}}    (18)

If the template matches the image exactly, except for an intensity scale factor, at a translation of (i, j), the cross-correlation will have its peak at C(i, j).
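A brute-force Python sketch of Eq. (18) is shown below; the normalization uses only the image energy under the template, and the toy example scales the template by an arbitrary intensity factor to illustrate the peak property.

import numpy as np

def ncc_map(template, image):
    """Normalized cross-correlation of Eq. (18) for every placement (u, v) of the template."""
    th, tw = template.shape
    ih, iw = image.shape
    t = template.astype(float)
    out = np.zeros((ih - th + 1, iw - tw + 1))
    for v in range(out.shape[0]):
        for u in range(out.shape[1]):
            patch = image[v:v + th, u:u + tw].astype(float)
            out[v, u] = np.sum(t * patch) / (np.sqrt(np.sum(patch ** 2)) + 1e-12)
    return out

image = np.random.rand(64, 64)
template = 2.0 * image[20:28, 30:38]             # exact match up to an intensity scale factor
c = ncc_map(template, image)
print(np.unravel_index(np.argmax(c), c.shape))   # expected (20, 30)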
5.3.3. Mutual Information

Mutual information has its roots in information theory, where it was developed to set fundamental limits on the performance of communication systems. However, since then it has been successfully used in varied disciplines like mathematics, physics and economics. The use of mutual information as a similarity metric for image registration was proposed in 1995 [20, 21]. Since its introduction, MI has been used widely for the purpose of image registration. It has been demonstrated that MI is robust for multimodal images and hence well suited for dissimilar images. Also, it facilitates the automation of the process of image registration without compromising accuracy as compared with correlation-based methods and control point based registration. This method assumes that, out of the two images to be registered, one can give maximum information about the other when it is correctly aligned (least registration error). In other words, this method attempts to find the proper transformation required for the alignment of both images (or image and template) such that the information given by one about the other is maximum. The added advantage of this approach is that it assumes no specific relationship between the intensities of the images involved.

5.3.3.1. Basic Definition of Some Terms Involved

Before we can explain the definition of Mutual Information, let us go over some basic terms involved.

Entropy. Entropy is the amount of information an event gives when it takes place. It can also be interpreted as the uncertainty about the outcome of an event. Given events e_1, ..., e_m occurring with probabilities p_1, ..., p_m, the Shannon entropy is defined as in Eq. (19):

H = \sum_i p_i \log \frac{1}{p_i} = -\sum_i p_i \log p_i    (19)
The Shannon entropy can also be computed for an image, where we consider the probability distribution of the gray values of the image. The probability distribution of gray values of an image can be approximated with its histogram, which can be estimated, by counting the number of times each gray value occurs in the image and dividing those numbers by the total number of pixels (occurrence). Joint Entropy Joint entropy measures how much entropy is contained in a joint system of two random variables [22]. It summarizes the degree of dependence of the two random variables. Given a pair of discrete random variables (A, B) with joint distribution p (a, b), the Shannon entropy for the joint distribution is defined as in Eq. (20).
H(A, B) = -\sum_{a, b} p(a, b) \log p(a, b)    (20)
In the case of images, the joint distribution of the image pair is estimated by counting the occurrence of a particular pair of gray values in both the images and dividing it by the total number of all such pairs. This joint distribution can also be referred to as the joint histogram of the two images. Conditional Entropy Conditional entropy measures how much entropy a random variable has remaining if we have already learned completely the value of a second random variable [23]. It summarizes the randomness of one random variable given knowledge of the other. Given a pair of discrete random variables (A, B) the entropy of B conditioned on A is referred to as H (B | A) and is defined as in Eq. (21). H(B | A) = H( A, B) − H( A)
(21)
5.3.3.2. Definition of Mutual Information Mutual information has been defined in many ways in the literature [24]. One of the frequently used definitions is based on conditional entropy. Mutual information for two images, A and B, can be defined as in Eq. (22).
I ( A, B ) = H ( B) − H ( B | A)
(22)
This definition can be understood as the amount of reduction in the uncertainty of image B when A is known. Hence Mutual Information is the information A contains about B. The mutual information of A and B is the same as that of B and A (Eq. (23)). Hence it is also the information B contains about A. This information is hence called the Mutual Information of A and B. I ( A, B) = H ( B) − H ( B | A) = H ( A) − H ( A | B)
(23)
Mutual Information can also be defined using joint entropy as in Eq. (24). According to this definition the maximization of mutual information (criterion of registration) essentially means the minimization of joint entropy. I ( A, B ) = H ( B ) + H ( A) − H ( A, B )
(24)
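A compact Python sketch of Eqs. (19), (20) and (24) is given below, estimating the distributions from (joint) histograms; the number of bins and the natural logarithm are arbitrary choices of the example.

import numpy as np

def mutual_information(a, b, bins=64):
    """I(A, B) = H(A) + H(B) - H(A, B), Eq. (24), estimated from a joint histogram."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pab = joint / joint.sum()
    pa, pb = pab.sum(axis=1), pab.sum(axis=0)
    def entropy(p):
        p = p[p > 0]
        return -np.sum(p * np.log(p))           # Eqs. (19)/(20)
    return entropy(pa) + entropy(pb) - entropy(pab)

a = np.random.rand(128, 128)
print(mutual_information(a, a))                              # equals H(A): maximal for identical images
print(mutual_information(a, np.random.rand(128, 128)))       # close to zero for independent images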
6. Issues Related to Mutual Information Based Methods In recent years mutual information based image registration has become very popular and it is one of the highly researched areas of image registration. Its most popular
application is in medical imaging. Research on this problem and its practical application has given rise to a number of research issues. Some of the issues are: • Joint Histogram Estimation • Interpolation methods • Interpolation Artifacts • Speed of computation. 6.1. Joint Histogram Estimation The most important step involved in the entire mutual information based method is the estimation of the joint histogram. The process involved in the estimation of the joint histogram can be divided into two kinds, viz. two-step and single-step. The most commonly used method is the two-step procedure. In the first step, the intensities are estimated at the transformed grid points using one of the interpolation schemes explained in the following sections. The interpolated intensities are not integers in most cases and hence in the second step, these are rounded to the nearest integer and the joint histogram is obtained by increasing the corresponding entry (of the pair) by one. The joint histogram depicts the relationship between the intensities of the two images involved [25]. The robustness of the mutual information similarity metric for various modalities is due to the fact that it does not assume any such specific relationship between the intensities of the two images, unlike the other similarity measures such as Correlation and MSD, which assume a linear relationship as shown in Figure 10(a). The mutual information based method is suitable even for a pair of images where the intensities are related in some arbitrary manner as shown in Figure 10(b).
Figure 10. Joint histograms depicting the intensity relationship assumed in different metrics: (a) correlation and MSD measures, (b) mutual information
6.2. Interpolation Artifacts The quality of the joint histogram estimated has a direct impact on the Mutual Information (MI) function and hence registration accuracy. In recent studies [26], it has been observed that certain artifacts appear in the MI function with the use of certain
interpolation techniques as shown in Figure 11. It is observed that these artifacts hamper the global optimization process due to the presence of spurious local optima (the artifacts). It is also observed that in certain situations the true global optimum may be buried in the artifact patterns and hence directly limit the registration accuracy.

Figure 11. Mutual Information vs. shift in x-axis, showing the interpolation artifacts (left: linear interpolation, right: partial volume interpolation)
It has been pointed out in [26] that certain types of artifact patterns as shown in Figure 11 occur when the two images have equal sample spacing in one or more dimensions and interpolation schemes like partial volume interpolation and linear interpolation (explained in Section 6.3) are used. More precisely it is seen that the artifacts occur when the ratio of the two sample-spacing along a certain dimension is a simple rational number [27]. This happens because, in such cases, many of the grid lines may be aligned along these dimensions under certain geometric transformations and, therefore, fewer interpolations are required for the estimation of the joint histogram as compared to the case where no grid lines are aligned. In practice, artifacts influence the registration accuracy only when the true global optimum is located very close to any of the spurious local optima introduced by the artifacts. Otherwise it only makes the optimization (maximization) problem complicated and difficult. For example, when the resolution of the two images are 4 m/pixel and 1 m/pixel, then by shifting the first image (4 m/pixel) say along the x-axis, then along the y-axis, grid lines 1,2,3… of the first image can be made to coincide with the grid lines 1,5,9… of the second image. In this case the contribution of the coincident grid points to the joint histogram can be counted directly without resorting to any form of estimation. But when we shift the first image by slightly more, then no grid points will be coincident and one would need interpolation to estimate the joint histogram. This shift from much less estimation to substantially more estimation is one of the possible causes of the artifacts. 6.3. Interpolation Methods As mentioned in Section 6.1, the coordinates of the transformed pixel are not integers and hence the intensities need to be interpolated. The importance of interpolation techniques increases tremendously due to the resultant artifacts explained in the previous section. Interpolation techniques described in Section 4 can be used for the two-step procedure of estimating joint histograms. Currently the use of single-step estimation of the joint histogram is becoming popular. A graphical illustration of the interpolation schemes is shown in Figure 12. There are a number of interpolation techniques in the literature that perform the single-step estimation. Some of the frequently used methods are:
• Linear Interpolation (two-step)
• Partial Volume Interpolation (PVI)
• Generalized Partial Volume Estimation (GPVE)

Figure 12. Graphical illustration of interpolation: a transformed grid point u′ with offsets Δx, Δy from its four nearest neighbors v1 = (vx, vy), v2 = (vx, vy + 1), v3 = (vx + 1, vy), v4 = (vx + 1, vy + 1) and the associated weights w1-w4
6.3.1. Linear Interpolation

Linear interpolation is the commonly used method in a two-step interpolation procedure. First, a grid point u from the floating image is transformed into u′ in the reference image. The interpolated gray value R(u′) can be calculated by the weighted gray values of the four nearest neighbor points v1, v2, v3, v4 using the linear interpolation method mentioned in Section 4.2. That is,

R(u') = \sum_i w_i \cdot R(v_i), \quad \text{where} \quad \sum_i w_i = 1    (25)
Secondly, the histogram entry h(F(u), R(u′)) is increased by 1. Linear interpolation generates a new interpolated reference image and therefore may introduce new intensity values, which were originally not present in the reference image, leading to unpredictable changes in the marginal distribution of the reference image.

6.3.2. Partial Volume Interpolation (PVI)

Instead of generating a new interpolated reference image, PVI updates the joint histogram for each pixel pair (u, v_i) using the same weights as for linear interpolation. The PVI algorithm updates the joint histogram as defined in Eq. (26); PVI changes the histogram more smoothly than the linear interpolation method.2 However, interpolation artifacts may still occur when using PVI, as in the case of linear interpolation.

h(F(u), R(v_i)) \mathrel{+}= w_i, \qquad i = 1, 2, 3, 4    (26)

2 A += B implies A = A + B.
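The following Python sketch performs the PVI update of Eq. (26) for a single floating-image pixel; the bilinear weights and the assignment of w1-w4 to v1-v4 follow Eq. (25), and the 8-bit intensity range is an assumption of the example.

import numpy as np

def pvi_update(hist, F, R, u, u_prime):
    """Distribute one sample over four joint-histogram entries, Eq. (26)."""
    x, y = u_prime                                        # transformed location (column, row) in R
    vx, vy = int(np.floor(x)), int(np.floor(y))
    dx, dy = x - vx, y - vy
    weighted = [((vy,     vx),     (1 - dx) * (1 - dy)),  # v1, w1
                ((vy + 1, vx),     (1 - dx) * dy),        # v2, w2
                ((vy,     vx + 1), dx * (1 - dy)),        # v3, w3
                ((vy + 1, vx + 1), dx * dy)]              # v4, w4
    for (ry, rx), w in weighted:
        if 0 <= ry < R.shape[0] and 0 <= rx < R.shape[1]:
            hist[F[u], R[ry, rx]] += w                    # h(F(u), R(v_i)) += w_i

hist = np.zeros((256, 256))
F = np.random.randint(0, 256, (32, 32))
R = np.random.randint(0, 256, (32, 32))
pvi_update(hist, F, R, u=(10, 12), u_prime=(12.3, 9.7))
print(hist.sum())                                         # 1.0: one sample spread over four entries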
6.3.3. Generalized Partial Volume Estimation (GPVE) GPVE is a generalized version of PVI [3,8]. It can overcome the artifact problem as mentioned in Section 6.2. GPVE updates the joint histogram as described in Eq. (27), where, p and q are used to specify the pixels involved in the histogram updating procedure and f is the Kernel function.
h\!\left( F(u_x, u_y),\, R(v_x + p,\, v_y + q) \right) \mathrel{+}= f(p - \Delta x) \cdot f(q - \Delta y), \qquad \forall p, q \in \mathbb{Z}    (27)
The kernel function f is a real valued function with the following properties, 1. f (x) ≥ 0, where x is a real number. 2. ∑ f (n-Δ) = 1, where n∈ I, -∞ ≤ n ≤ ∞ ; 0 ≤ Δ ≤ 1
Figure 13. B-splines function: (a) First order (b) Second order (c) Third order.
A B-spline function is used as the kernel function f in GPVE. The shapes of the first, second, and third order B-splines are shown in Figure 13. When the 1st order B-spline function is employed in each direction, GPVE is equivalent to PVI. As the order of the B-spline function increases, more entries of the joint histogram are involved in updating each pixel in the floating image. The artifacts can hardly be seen in the Mutual Information function when either the 2nd or 3rd order GPVE is used. An example of the calculation of Mutual Information vs. shift in the x-axis using 1st and 3rd order GPVE is shown in Figure 14.
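As an illustration of such a kernel, the Python sketch below gives the standard closed form of the cubic (third-order) B-spline and checks the partition-of-unity property required of f; the closed-form expression is the usual cubic B-spline and is assumed here as one possible kernel for 3rd order GPVE.

def cubic_bspline(x):
    """Third-order (cubic) B-spline kernel: non-negative, and its integer shifts sum to one."""
    ax = abs(x)
    if ax < 1.0:
        return 2.0 / 3.0 - ax ** 2 + ax ** 3 / 2.0
    if ax < 2.0:
        return (2.0 - ax) ** 3 / 6.0
    return 0.0

# Property 2: the sum over integer offsets n of f(n - Delta) equals 1 for any Delta in [0, 1].
delta = 0.37
print(sum(cubic_bspline(n - delta) for n in range(-3, 4)))   # ~1.0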
Figure 14. Example of Mutual Information vs. shift in x-axis: (a) first order GPVE, (b) third order GPVE
6.4. Speed of Computation Mutual Information based registration is computationally intensive. When the entire search space is taken into account to find the optimal solution, it consumes a large amount of time and quite often the optimizer does not converge to a solution. Various
approaches have been devised to reduce this effort. One of the solutions is to perform a two level registration; a coarse registration using the Fourier method or the control point-based method so as to narrow down the search space followed by a finer registration method. In addition to this approach people usually adopt the multiresolution strategy to reduce the computation time. The multi-resolution strategy is explained in detail in Section 7.2.
7. Search Strategy

7.1. Optimization Techniques

The registration measure as a function of the transformation defines a multi-dimensional function, four-dimensional (4-D) in the case of a rigid body transformation. The argument of the transformation corresponding to the optimum of this function (see Eq. (2)) is assumed to be the transformation that correctly aligns the images. In practice the registration (similarity) function is not a smooth function, which makes this process a non-trivial one. Also, when dealing with the search strategy, the selection of a bounded search space is very important. It has been seen that the function may attain higher values (in the case of maximization) than that for the correct transformation at a large mis-registration. In that case the optimization is futile, as it would result in some local maximum which we are not interested in. But if the search space is limited, this anomaly is not encountered and the correct maximum/minimum can be located. There are a number of optimization techniques available in the literature and many of them have been applied to the problem of image registration. A detailed listing of these algorithms can be found in a survey paper on mutual information based registration [28] and the related references cited there. The most commonly used methods are the Powell and the Simplex methods. The Powell method optimizes each transformation parameter in turn, but is sensitive to local optima in the registration function. Unlike the Powell method, the Simplex method considers all parameters together, which makes it computationally expensive. Neither of these methods requires the computation of image derivatives. In contrast, methods such as gradient descent, Newton and Levenberg-Marquardt are local optimization methods that require the computation of the derivatives of the image. Levenberg-Marquardt is a combination of the gradient method and the Newton method. It is computationally efficient and has been applied to minimize the sum of squared differences and mutual information.
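As a minimal sketch of such a search, the Python fragment below uses SciPy's Nelder-Mead (simplex) optimizer over the rigid parameters (tx, ty, theta); the similarity here is a simple mean-square-difference placeholder, and the mutual information estimator of Section 5.3.3 could be substituted for it.

import numpy as np
from scipy.optimize import minimize
from scipy.ndimage import affine_transform

def similarity(a, b):
    return -np.mean((a - b) ** 2)        # placeholder metric; MI could be used instead

def cost(params, R, F):
    """Negative similarity of R and the transformed floating image; Nelder-Mead minimizes this."""
    tx, ty, theta = params
    c, s = np.cos(theta), np.sin(theta)
    matrix = np.array([[c, -s], [s, c]])                   # rotation about the array origin
    FT = affine_transform(F, matrix, offset=(ty, tx), order=1)
    return -similarity(R, FT)

R = np.random.rand(64, 64)
F = np.roll(R, 3, axis=1)                                  # toy floating image: a 3-pixel shift of R
res = minimize(cost, x0=np.zeros(3), args=(R, F), method='Nelder-Mead')
print(res.x)                                               # estimated (tx, ty, theta), roughly (3, 0, 0)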
result reduces the search space of the registration algorithm at the finer resolution and therefore considerably reduces the computational time. Accuracy increases as the registration goes from coarse to fine. However, this strategy will fail if the registration at a coarser level produces a false result. To overcome this, a backtracking or consistency check procedure should be incorporated into the algorithms [2]. Wavelet-based registration is a typical multi-resolution strategy. Appendix 3 provides a wavelet registration process combined with the correlation method.
Figure 15. Wavelet Decomposition of the Image
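The following Python sketch illustrates a two-level coarse-to-fine search over integer translations, with simple 2x2 block averaging standing in for the pyramid or wavelet decomposition; the search radii, the averaging scheme and the mean-square-difference criterion are assumptions of the example.

import numpy as np

def downsample(img):
    """Coarse level by 2x2 block averaging (a stand-in for a pyramid or wavelet LL level)."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    return img[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def best_shift(R, F, center=(0, 0), radius=4):
    """Exhaustive integer-shift search around 'center' minimizing the mean square difference."""
    best, best_s = center, np.inf
    for dy in range(center[0] - radius, center[0] + radius + 1):
        for dx in range(center[1] - radius, center[1] + radius + 1):
            s = np.mean((R - np.roll(np.roll(F, dy, axis=0), dx, axis=1)) ** 2)
            if s < best_s:
                best, best_s = (dy, dx), s
    return best

R = np.random.rand(128, 128)
F = np.roll(np.roll(R, -6, axis=0), 10, axis=1)                            # F is R circularly shifted
coarse = best_shift(downsample(R), downsample(F), radius=8)                # wide search at half resolution
fine = best_shift(R, F, center=(2 * coarse[0], 2 * coarse[1]), radius=2)   # narrow search at full resolution
print(coarse, fine)                                                        # roughly (3, -5) and (6, -10)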
8. Evaluation of Image Registration Methods

Performance assessment of different registration algorithms is highly desirable so that a user can select the appropriate algorithm for the application involved. This is a non-trivial problem since errors can be introduced into the process at various levels and it is difficult to distinguish between registration inaccuracies and differences due to actual changes in the scene. When registration of multi-modality images is involved, the performance evaluation task becomes even more difficult. It should be pointed out that registration accuracy is not the only metric when evaluating a registration algorithm. As mentioned in [29], other metrics for registration evaluation include precision, accuracy, robustness and stability, reliability, knowledge and resource requirements, algorithm complexity and computational time, assumption verification, and usability.
9. Intelligent Methods for Image Registration The growing need for automation calls for an automatic registration algorithm. But since there are a large number of choices available in the various components of the
registration process along with a large spectrum of images and requirements, one registration method cannot satisfy all the scenarios. Hence, an intelligent method, which is based on the inputs and the requirements, can decide which method (at various stages) to apply. Currently research is underway related to this concept at Syracuse University, Syracuse, NY, USA and a prototype of this system is under development at ANDRO Computational Solutions, Rome, NY, USA under a Small Business Innovation Research (SBIR) Phase II Program sponsored by AFRL/SNAR. The proposed architecture of the system is shown in Figure 16.
Figure 16. Proposed architecture of the intelligent image registration system
10. Challenges and Future Work Image registration is one of the most researched areas for multi-sensor image analysis. But the basic understanding of the process taking into account the various modalities of images and other related choices is still in its early stages. As described in Section 7, identifying a bounded search space is an important issue in the registration process and in an automatic registration algorithm this needs to be intelligently decided based on the particular case at hand. Coarse registration followed by fine registration is one possible method to solve the search space problem. Hence, there is a need to develop fast coarse registration techniques for multi-modality images. Also research needs to be done to develop algorithms that can find the global optimum correctly and more efficiently (e.g., use of heuristic testing algorithm) [27]. Recently some work [30, 31, 32] has been initiated to derive achievable performance bounds. This needs to be continued. Another related area, which is underexplored, is a common evaluation criteria/platform for different algorithms. There does not seem to be a consensus in the registration community on the metrics for image registration performance and this requires further investigation.
References
[1] Lisa G. Brown, A survey of image registration techniques, ACM Computing Surveys, 24(4), pp. 325-376, December 1992.
[2] B. Zitová and J. Flusser, Image registration methods: a survey, Image and Vision Computing, vol. 21, pp. 977-1000, 2003.
[3] H. Chen and P. K. Varshney, Mutual Information Based CT-MR Brain Image Registration Using Generalized Partial Volume Joint Histogram Estimation, IEEE Transactions on Medical Imaging, vol. 22, no. 9, pp. 1111-1119, 2003.
[4] H. Chen and P. K. Varshney, Automatic two-stage IR and MMW image registration algorithm for concealed weapon detection, IEE Proceedings on Vision, Image and Signal Processing, vol. 148, no. 4, pp. 209-216, Aug. 2001.
[5] P. K. Varshney, H. Chen, L. C. Ramac, Registration and fusion of infrared and millimeter wave images for concealed weapon detection, in Proc. of International Conference on Image Processing, Japan, vol. 3, pp. 532-536, Oct. 1999.
[6] H. M. Chen and P. K. Varshney, MI Based Registration of Multi-Sensor and Multi-Temporal Images, in Advanced Image Processing Techniques for Remotely Sensed Hyperspectral Data, P. K. Varshney and M. K. Arora (Eds.), Springer Verlag, 2004.
[7] H. Chen, P. K. Varshney, and M. K. Arora, Performance of Mutual Information Similarity Measure for Registration of Multitemporal Remote Sensing Images, IEEE Transactions on Geoscience and Remote Sensing, vol. 41, no. 11, pp. 2445-2454, Nov. 2003.
[8] H. M. Chen and P. K. Varshney, Mutual Information Based Image Registration, in Advanced Image Processing Techniques for Remotely Sensed Hyperspectral Data, P. K. Varshney and M. K. Arora (Eds.), Springer Verlag, 2004.
[9] H. Chen, P. K. Varshney, and M. K. Arora, Mutual information based image registration for remote sensing data, International Journal of Remote Sensing, vol. 24, no. 18, pp. 3701-3706, 2003.
[10] G. C. Stockman, S. Kopstein, and S. Bennet, Matching images to models for registration and object detection via clustering, IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 4, pp. 229-241, 1982.
[11] M. G. Tommaselli and C. L. Tozzi, A recursive approach to space resection using straight lines, Photogrammetric Engineering and Remote Sensing, vol. 62, no. 1, pp. 57-66, 1996.
[12] Jan F. Andrus, C. Warren Campbell, Robert R. Jayroe, Digital Image Registration Method Using Boundary Maps, IEEE Trans. Computers, vol. 24, no. 9, pp. 935-940, 1975.
[13] X. Huang, Y. Sun, D. Metaxas, F. Sauer, and C. Xu, Hybrid Image Registration based on Configural Matching of Scale-Invariant Salient Region Features, 2nd IEEE Workshop on Image and Video Registration, July 2004.
[14] Q. Chen, M. Defrise, F. Deconinck, Symmetric Phase-Only Matched Filtering of Fourier-Mellin Transforms for Image Registration and Recognition, IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. 16, no. 12, pp. 1156-1168, 1994.
[15] B. S. Reddy and B. N. Chatterji, An FFT-based technique for translation, rotation, and scale-invariant image registration, IEEE Trans. Image Processing, vol. 3, pp. 1266-1270, Aug. 1996.
[16] E. De Castro and C. Morandi, Registration of translated and rotated images using finite Fourier transforms, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 9, pp. 700-703, 1997.
[17] R. Berthilsson, Affine correlation, in Proc. Int. Conf. Pattern Recognition, Brisbane, Australia, pp. 1458-1467, 1998.
[18] S. Kaneko, Y. Satoh, S. Igarashi, Using selective correlation coefficient for robust image registration, Pattern Recognition, 36(5), pp. 1165-1173, May 2003.
[19] J. Kim and J. A. Fessler, Intensity-based image registration using robust correlation coefficients, IEEE Transactions on Medical Imaging, vol. 23, no. 11, pp. 1430-1444, Nov. 2004.
[20] P. Viola and W. M. Wells III, Alignment by maximization of mutual information, Int. Conf. on Computer Vision, pp. 16-23, 20-23 June 1995.
[21] Collignon, F. Maes, et al., Automated multi-modality image registration based on information theory, in Information Processing in Medical Imaging, Kluwer, pp. 263-274, 1995.
[22] http://en.wikipedia.org/wiki/Joint_entropy
[23] http://en.wikipedia.org/wiki/Conditional_entropy
[24] T. M. Cover and J. A. Thomas, Elements of Information Theory, John Wiley & Sons, New York, 1991.
[25] Xenios Papademetris, Image registration: a review, Yale MRRC fMRI Seminar Series, 16 October 2003.
[26] J. Pluim et al., Interpolation Artifacts in mutual information-based image registration, Computer Vision and Image Understanding, vol. 77, pp. 211-232, 2000.
[27] Hua-mei Chen, Mutual information based image registration with applications, Ph.D. dissertation, Syracuse University, Syracuse, NY, May 2002. [28] J.P.W. Pluim, J.B.A. Maintz, M.A. Viergever, Mutual information based registration of medical images: a survey, IEEE Trans on Medical Imaging, vol X No Y, 2003 [29] Maintz, J.B.A. and Viergever, M.A., A survey of medical image registration, Medical Image Analysis, vol. 2, no.1, pp.1-36, 1998. [30] Robinson, D., and P. Milanfar, Fundamental performance limits in image registration, IEEE Transactions on Image Processing, vol. 13, no. 9, pp. 1185-1199, September 2004. [31] S. Yetik and A. Nehorai, Performance bound on image registration, IEEE International Conference on Acoustics, Speech, and Signal Processing, March, 2005. [32] M. Xu and P. K. Varshney, Tighter performance bounds on image registration, submit to IEEE International Conference on Acoustics, Speech, and Signal Processing, 2006.
Appendix 1

Control point based image registration.

Step I: Selection of control points
− At least four pairs of feature points in the reference image and floating image are selected and matched manually (automated algorithms can also be used). In this example we select four pairs as shown in Figure 9.

Step II: Selection of transformation space
− A transformation space corresponding to the first-order polynomial function is assumed.

Step III: Formulation of relationship between features
− Using the selected control points and the transformation space, eight (twice the number of feature pairs) equations similar to the ones below are formed:

u_i = f(x_i, y_i) + n_i = a_{00} + a_{10} x_i + a_{01} y_i + n_i
v_i = g(x_i, y_i) + m_i = b_{00} + b_{10} x_i + b_{01} y_i + m_i,    i = 1, ..., 4

where u_i and v_i are the coordinates of the reference image; x_i and y_i are the coordinates of the floating image; n_i and m_i are usually modeled as noise; {a_{00}, a_{01}, a_{10}, b_{00}, b_{01}, b_{10}} are the parameters of the transformation space.

Step IV: Solving for the transformation parameters
− The least squares technique is used to estimate the six parameters {a_{00}, a_{01}, a_{10}, b_{00}, b_{01}, b_{10}}.
− Let us denote

f^{*}(x_i, y_i) = a^{*}_{00} + a^{*}_{10} x_i + a^{*}_{01} y_i
g^{*}(x_i, y_i) = b^{*}_{00} + b^{*}_{10} x_i + b^{*}_{01} y_i

The criterion is to minimize the noise energy, i.e., to minimize

\sum_{i=1}^{n} \left( u_i - f^{*}(x_i, y_i) \right)^2 \quad \text{and} \quad \sum_{i=1}^{n} \left( v_i - g^{*}(x_i, y_i) \right)^2

The estimates can be obtained by solving the following equations:

\frac{\partial}{\partial a^{*}_{j,k}} \sum_{i=1}^{n} \left( u_i - f^{*}(x_i, y_i) \right)^2 = 0, \qquad \frac{\partial}{\partial b^{*}_{j,k}} \sum_{i=1}^{n} \left( v_i - g^{*}(x_i, y_i) \right)^2 = 0, \qquad (j, k) = (0, 0), (0, 1), (1, 0)
Step V: Estimating new image using the transformation determined − Choose an interpolation method − Transform the image using the transformation determined.
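A Python sketch of the parameter estimation of Steps III-IV is given below; it fits the six first-order polynomial parameters by linear least squares from matched control points, with illustrative coordinates.

import numpy as np

def fit_first_order(ref_pts, flt_pts):
    """Least-squares estimates of u = a00 + a10*x + a01*y and v = b00 + b10*x + b01*y.
    ref_pts: (N, 2) array of (u, v); flt_pts: (N, 2) array of (x, y); N >= 3 (four pairs in Appendix 1)."""
    x, y = flt_pts[:, 0], flt_pts[:, 1]
    M = np.column_stack([np.ones_like(x), x, y])              # design matrix [1, x, y]
    a, *_ = np.linalg.lstsq(M, ref_pts[:, 0], rcond=None)     # (a00, a10, a01)
    b, *_ = np.linalg.lstsq(M, ref_pts[:, 1], rcond=None)     # (b00, b10, b01)
    return a, b

# Toy usage with four matched pairs related by a pure translation (coordinates are illustrative).
flt = np.array([[0.0, 0.0], [100.0, 0.0], [0.0, 100.0], [100.0, 100.0]])
ref = flt + np.array([12.0, -7.0])
a, b = fit_first_order(ref, flt)
print(a, b)      # approximately [12, 1, 0] and [-7, 0, 1]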
Appendix 2

Fourier based image registration.

Step I: Domain transformation
− Convert the images into the frequency domain.
− Further pass the magnitude of the Fourier spectrum through the high pass filter.
− Convert this magnitude spectrum into the log polar plane.

Step II: Find the rotation angle and scale
− Compute the cross power spectrum (Eq. (10)) between the two log polar images.
− Determine the inverse Fourier transform (Eq. (11)) of the cross power spectrum and identify the top peaks. The coordinates of the peaks are the estimated rotation angle and scale. The number of peaks taken into consideration depends on the accuracy desired.

Step III: Find the translation
− Apply the estimated rotation and scale to the floating image so that the new floating image only has translation errors.
− Compute the cross power spectrum (Eq. (10)) between the new floating image and the reference image.
− Determine the inverse Fourier transform (Eq. (11)) of the cross power spectrum and identify the top peaks. The coordinates of the peaks are the translations in the x-axis and y-axis.
Appendix 3

Wavelet based image registration.

Step I: Wavelet decomposition
− An image is first decomposed recursively into four sets of coefficients (LL, HL, LH, HH) by filtering the image with two filters, a low-pass filter L and a high-pass filter H, both working along the image rows and columns. Figure 15 shows the two-level wavelet decomposition by the Haar wavelet.

Step II: Initial registration
− Initial transformation parameters are estimated by optimizing the normalized cross-correlation function (Eq. (18)) of wavelet coefficients between the reference image and the floating image on a coarse level. The normally used wavelet coefficients are the LL coefficients, the HH coefficients, both the HL and LH coefficients, and the modulus maxima of the LH and HL coefficients, depending on the application.

Step III: Finer registration
− After the initial registration, the search space for the optimization of the correlation function of wavelet coefficients at the next finer resolution is narrowed down. New transformation parameters are updated.

Step IV: Stopping criteria
− Repeat Step III until the desired accuracy of the estimated transformation parameters is achieved.
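The Python sketch below computes one level of the Haar decomposition of Step I by averaging and differencing 2x2 blocks, and recurses on the LL band for the second level; the sub-band naming follows one common convention and the 1/4 normalization is a choice of the example.

import numpy as np

def haar_level(img):
    """One level of the Haar wavelet decomposition into LL, LH, HL, HH sub-bands."""
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    a = img[:h:2, :w:2]      # top-left pixel of each 2x2 block
    b = img[:h:2, 1:w:2]     # top-right
    c = img[1:h:2, :w:2]     # bottom-left
    d = img[1:h:2, 1:w:2]    # bottom-right
    LL = (a + b + c + d) / 4.0       # low-pass along rows and columns
    LH = (a + b - c - d) / 4.0       # detail along rows
    HL = (a - b + c - d) / 4.0       # detail along columns
    HH = (a - b - c + d) / 4.0       # diagonal detail
    return LL, LH, HL, HH

img = np.random.rand(128, 128)
LL, LH, HL, HH = haar_level(img)
LL2 = haar_level(LL)[0]              # recursing on LL gives the second level of Figure 15
print(LL.shape, LL2.shape)           # (64, 64) (32, 32)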
Automated Registration for Fusion of Multiple Image Frames to Assist Improved Surveillance and Threat Assessment

Malur K. SUNDARESHAN and Mohamed I. ELBAKARY
Department of Electrical and Computer Engineering, University of Arizona, Tucson, AZ 85721-0104
Tel: (520) 621-2953; Fax: (520) 626-3144; e-mail: [email protected] and [email protected]
Abstract. Automated registration of image frames is often required for the construction of High-Resolution (HR) data to perform surveillance and threat assessment. While some efficient approaches to image registration have been developed lately, the registration algorithms resulting from these approaches generally remain application dependent and may require operator-assisted tuning for different images to achieve the same efficiency levels. In this article, we describe an algorithm for automatic image registration that assists improved surveillance and threat assessment in scenarios where multiple diverse sensors are used for these applications. This algorithm offers scene-independent registration performance and is efficient for different scenes, ranging from complex, highly varying gray-scale images to simpler images with low gray-scale variability. While the use of feature-based methods has emerged as more versatile for automatic registration in surveillance applications (compared to other methods based on correlation, mutual information maximization, etc.), the algorithm described here employs the local frequency representation of the image frames to be registered in order to generate a set of control points to solve the matching problem and to determine the registration parameters. The algorithm exploits certain inherent strong points of the local frequency representation, such as robustness to illumination variation, the capability of detecting the structure of the scene in the image (ridges and edges) simultaneously, and good localization in the spatial domain. Experimental results reported here indicate that this registration technique is efficient and yields promising results for the alignment and fusion of complex images. Keywords: Image Registration, Resolution Enhancement, Feature Extraction, Object Recognition, Sensor Fusion, Surveillance, Threat Assessment.
Introduction High-Resolution (HR) imagery data are often needed in many practical applications to support important image exploitation tasks, such as detecting objects of interest within a scene, identifying and assessing severity of threats, autonomously recognizing specific targets that may be of interest for a specific mission, or precisely tracking the motion of a chosen target. Due to optical diffraction limits (and the consequent lowpass filtering effects) and also due to the fact that detector arrays in these sensors may not be configured to have required densities for imaging at Nyquist rates (and the consequent aliasing due to undersampling), the resolution in the data captured by EO
and other types of imaging sensors (such as IR, MMW, LADAR) will be quite low to permit image exploitation tasks to be executed reliably. A very promising approach for the construction of HR frames is to use advanced signal processing methods directed to summing multiple frames of data output by a sensor (such as video data or image frames acquired from employing micro-scanning techniques) or directed to fusing image frames captured by multiple diverse sensors looking at the same scene. A fundamental processing step that is required to ensure an efficient integration of multiple Low-Resolution (LR) frames into a single HR frame by frame summing (when the frames are captured by the same sensor) or by image fusion (when the frames are captured by multiple diverse sensors) is the registration of frames to be integrated. A schematic of an image processing system (shown in Figure 1) that reconstructs HR image frames for executing image exploitation operations provides a framework for describing the general methodology where image registration will be of interest in the studies reported in this article. The integration of EO data with data from other sensors (IR or other), as shown, is only for illustration of general concepts. Our focus in this article will be the development of image registration algorithms that facilitate the multiframe summing and image fusion operations shown in order to produce a HR frame for surveillance and threat assessment applications. It should be noted that registration of image frames is an important problem in many other fields as well. It is of particular interest in remote sensing, medical imaging and computer vision, and is a prerequisite for the alignment and fusion of multiple image frames. In this problem we are given two images of roughly the same scene, and are asked to determine the transformation that most nearly maps one image to the other. A good survey of the existing literature on this problem can be found in [1,2]. High Resolution (HR) EO Frame
Figure 1. Schematic of an Image Processing System for Data Fusion and High-Resolution (HR) Frame Reconstruction
1. Development of Image Registration Algorithms for Surveillance and Threat Assessment
One may broadly classify existing image registration methods into two classes, viz. feature-based matching methods [3-5] that attempt to match certain control points or dominant features in the two frames, and direct methods which implement a search strategy that attempts to optimize some meaningful criterion [6-8]. The former approach requires computing of contours, features, surfaces, or geometric distribution in the images, while the latter method uses the raw images (without any significant
preprocessing) to compute the chosen optimization criterion (as in the minimization of normalized least-square error or the maximization of mutual information). Both approaches have certain strong and weak points. While methods that attempt to match extracted features have been shown to give accurate solutions, obtaining the correct features to match and selecting a corresponding matching algorithm are particularly difficult problems, especially in the case of images acquired from diverse sensors operating in different modalities. Consequently, most of the image registration algorithms developed using this approach assume that features are well preserved within the different images. Direct methods, on the other hand, can get computationally expensive and typically need a good initial guess to ensure proper convergence. The optimization procedure may get trapped in a local extremum, especially in the cases when the registration parameters (translation, rotation, and/or scaling) that need to be estimated are of significant sizes. Despite these differences between the two approaches, feature-based matching methods have generally emerged as more versatile for applications in surveillance and threat assessment, since it is possible to give a greater emphasis on specific portions of the overall scene (by extracting features from these portions only), while deemphasizing or even disregarding other areas. On the other hand, direct methods that typically utilize information contained in the entire scene in computing the optimization measure have found a more satisfactory application in geo-remote sensing and medical imaging. While some efficient image registration procedures have been developed lately following the two approaches cited above, the resulting registration algorithms are still application dependent [1,2]. In general, an algorithm that offers superior performance for one class of images (or scenes) may not be equally efficient for images of a different type. Recently, use of local frequency computed from the images to be registered has been suggested for multi-modal image registration [9-12]. An introduction to local frequency representation and its use in signal analysis can be found in [12-14]. It is of interest to note that local frequency enjoys some inherent advantages that make it useful for image analysis: it is relatively invariant to illumination changes (and hence insensitive to the level of signal energy), it provides a faithful representation of the structure (both edges and ridges simultaneously), and has a good localization in spatial domain [11]. These benefits make use of local frequency a promising candidate for handling image registration problems. Unfortunately, however, computation of local frequency can become quite involved, especially for complex images with significant gray-scale variations. In recent work, the authors have established a computationally efficient procedure for obtaining the local frequency representation of input images [15]. Once the local frequency representations for the two images to be registered are obtained, a set of points that have high local frequency values can be extracted from each. These sets serve to provide a characterization of the dominant features (edges and ridges, for instance) from each image and hence provide an ideal selection of control points for establishing a match between the two images. The matching problem can be solved within an optimization framework in order to estimate the registration parameters. 
In this article we shall outline a systematic procedure that implements these steps and demonstrate the performance of the overall scheme by application to diverse complex images of specific interest in surveillance and threat assessment scenarios.
2. Local Frequency Representation of Images
Given an image, its local frequency representation can be obtained by computing the spatial derivative of the local phase extracted from the image. For a brief introduction to the local frequency representation, consider a one-dimensional signal $s$, whose corresponding analytic signal is defined as $s_A = s + i\,s_{Hi}$, where $s_{Hi}$ is the Hilbert transform of $s$ defined by

$$s_{Hi}(x) = \frac{1}{\pi}\int_{-\infty}^{\infty}\frac{s(\xi)}{x - \xi}\,d\xi \qquad (1)$$
It may be noted that the Hilbert transform can be computed quite easily by performing the operation [12]

$$s_{Hi} = s * \frac{1}{\pi x} \qquad (2)$$

i.e., by convolving $s$ with the function $1/(\pi x)$. Thus, the transformation of a given real signal $s$ to the corresponding analytic signal $s_A$ can be regarded as the result of convolving the real signal with a complex filter, such as a Gabor filter [16]. The argument of $s_A$ is referred to as the local phase of $s$, which is defined in the spatial domain. The spatial derivative of the local phase is called the instantaneous or local frequency [12]. To summarize the above discussion, given a real signal, its corresponding analytic signal is complex, with the real part being the original signal itself and the imaginary part being its Hilbert transform obtained by convolving the signal with a Gabor filter. Although the above discussion is given for a one-dimensional signal for the sake of simplicity, it can be generalized to higher dimensions readily [12]. It may be mentioned that Gabor filters are the most popularly used filters for local frequency representation of a given image, due to the fact that 2-D Gabor functions are optimal in terms of their space-frequency localization.

A computationally efficient procedure for obtaining the local frequency representation of a given image was recently developed by the authors [15, 21]. This procedure utilizes some important recent findings inspired by biological data (efficiency of the model human image code and orientation bandwidth of visual cortex cells), splitting the two-dimensional spatial frequency plane of a given image into 4 orientation bands with orientation bandwidths of 45 degrees (in order to cover a 180 degree range, or one half of the 2-D frequency plane) to construct a multi-channel filtering scheme. A brief outline of this procedure will now be given.

1. Create a set of Gabor filters $G_k(x, y, f_k, \theta_k, \sigma_k)$, $k = 1, 2, 3, 4$, to cover the frequency space for the image under investigation, where $f$ denotes the spatial frequency, $\theta$ is the orientation angle with respect to the x-axis, $\sigma$ is the Gaussian window width, and $(x, y)$ denotes the pixel position in the spatial domain.
2. Convolve the image $I$ whose local frequency representation is desired with each filter $G_k$, $k = 1, 2, 3, 4$, using
$$u^{(k)}(x, y) = I * G_k \qquad (3)$$
3. Compute the local phase $\Psi^{(k)}$ for the analytic signal $u^{(k)}$, $k = 1, 2, 3, 4$, by calculating

$$\Psi^{(k)}(x, y) = \tan^{-1}\frac{\mathrm{imag}(u^{(k)}(x, y))}{\mathrm{real}(u^{(k)}(x, y))} \qquad (4)$$
4. Create the local frequency representation using

$$\Gamma^{(k)}(x, y) = \sqrt{\big(\nabla_x \Psi^{(k)}(x, y)\big)^2 + \big(\nabla_y \Psi^{(k)}(x, y)\big)^2}\;\big|\cos(\theta_k - \theta'^{(k)})\big| \qquad (5)$$

where $\theta_k$ is the orientation of the kth Gabor filter and $\theta'^{(k)}(x, y) = \tan^{-1}\dfrac{\nabla_y \Psi^{(k)}(x, y)}{\nabla_x \Psi^{(k)}(x, y)}$ denotes the direction of the gradient vector with respect to the x-axis. It may be noted that $\Gamma^{(k)}$ provides a spatially localized estimate of the local frequency along the direction $\theta_k$ [17].

5. Fuse the local frequency estimates obtained from each of the four filters, $\Gamma^{(k)}$, $k = 1, \ldots, 4$, to get one representation using

$$\Gamma(x, y) = \max\{\Gamma^{(1)}(x, y),\; \Gamma^{(2)}(x, y),\; \Gamma^{(3)}(x, y),\; \Gamma^{(4)}(x, y)\} \qquad (6)$$
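As an illustration of steps 1-5, the following sketch computes a local frequency map with a four-orientation Gabor filter bank. It is a minimal interpretation of the procedure, not the authors' implementation: the filter frequency f0, the window width sigma and the kernel size are illustrative choices, and the phase gradient is taken without explicit unwrapping.

```python
import numpy as np
from scipy.signal import fftconvolve

def gabor_kernel(f0, theta, sigma, size=31):
    """Complex 2-D Gabor filter with centre frequency f0 and orientation theta."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    envelope = np.exp(-(x**2 + y**2) / (2.0 * sigma**2))
    carrier = np.exp(2j * np.pi * f0 * (x * np.cos(theta) + y * np.sin(theta)))
    return envelope * carrier

def local_frequency(image, f0=0.15, sigma=4.0):
    """Fuse the local frequency estimates of four 45-degree orientation bands (Eqs. 3-6)."""
    estimates = []
    for theta in np.deg2rad([0.0, 45.0, 90.0, 135.0]):
        u = fftconvolve(image, gabor_kernel(f0, theta, sigma), mode="same")  # Eq. (3)
        phase = np.arctan2(u.imag, u.real)                                   # Eq. (4)
        gy, gx = np.gradient(phase)        # spatial derivatives of the local phase
        grad_dir = np.arctan2(gy, gx)      # theta'(x, y), gradient direction
        estimates.append(np.hypot(gx, gy) * np.abs(np.cos(theta - grad_dir)))  # Eq. (5)
    return np.max(np.stack(estimates), axis=0)                               # Eq. (6)
```

Control points for the matching step of Section 3 could then be taken as the coordinates of the largest values of the returned map.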
An illustration of local frequency representation of a given image is shown in Figure 2, where the original image (“Cameraman” image) is shown in Figure 2(a) and its local frequency representation encoded as a gray-scale image is depicted in Figure 2(b). It is evident that the higher local frequency values in the local frequency representation translate directly into higher gray-scale values in the encoded image depicted in Figure 2(b).
Figure 2. (a) Cameraman image; (b) Gray-scale encoded local frequency representation
3. Matching the Local Frequency Representations
Having obtained the local frequency representations of a pair of images, one can select a set of points from each representation as control points. We select those points that have the largest values because they reflect the most apparent structure in the image
(edges and ridges), as shown in the encoded image in Figure 2(b). The number of selected points should be sufficient to capture all dominant features of the image and to establish the matching. Evidently, the number of points to be selected varies from one image to another, based on the size of the image and the level of activity within the scene. In our experiments, we have found that the number of points sufficient to capture the most dominant features and to establish a good match between frames typically ranges from 100 to 300. The size of the images considered in our experiments varies from 64 x 64 to 256 x 256. For more details on these experiments and some guidelines for selecting control points, one may see [18]. Once the control points are selected for a pair of images, the correspondence between them can be obtained by matching the selected sets of points. While there exist a number of point matching procedures, we employed the algorithm presented by Gold et al. [19]. This algorithm incorporates an optimization technique and an iterative correspondence assignment technique called “Softassign”, which is a general procedure for identifying the correspondence between two sets of points in space. The motivation for using this algorithm comes from its capability to detect outliers among the matched pairs while at the same time estimating the transformation parameters. Given two 2D point sets $\{X_i\}$ and $\{Y_j\}$ related by an affine transformation $X = AY$, with $A$ denoting the transformation matrix, the algorithm attempts to minimize the objective function

$$\min_{m,A} E(m, A) = \sum_{i=1}^{M}\sum_{j=1}^{N} m_{ij}\,\lVert X_i - A Y_j \rVert^2 \;-\; \alpha \sum_{i=1}^{M}\sum_{j=1}^{N} m_{ij} \qquad (7)$$
In Eq. (7), $m_{ij}$ denote the correspondence variables that define the match matrix of dimension $M \times N$. The second term with the multiplier $\alpha$ biases Eq. (7) towards matches. It acts as a threshold error distance, indicating how far apart two points must be before they may be treated as outliers. For image registration, $A$ is a 3x3 2D affine transformation matrix in the plane, defined by the six parameters a, b, c, e, f, and g in the form

$$A = \begin{bmatrix} a & b & c \\ e & f & g \\ 0 & 0 & 1 \end{bmatrix}$$
As is well-known, these six parameters specify the translation, the scaling and the rotation in the plane [20]. Eq. (7) describes an optimization problem whose solution yields the transformation matrix, A. Due to space limitations, more details on this algorithm are omitted. They may however be found in [19].
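The softassign algorithm of [19] alternates between soft correspondence updates and transformation estimates; the sketch below illustrates only the transformation-estimation step, recovering the six parameters of the matrix A above from control points that are assumed to be already matched. It is a hedged example using plain least squares rather than the iterative procedure of [19]; the function name is ours.

```python
import numpy as np

def estimate_affine(X, Y):
    """Least-squares fit of the 3x3 affine matrix A such that X ~ A Y (homogeneous 2D points).

    X, Y: (N, 2) arrays of matched control points; X from the reference frame,
    Y from the frame to be registered."""
    N = X.shape[0]
    Yh = np.hstack([Y, np.ones((N, 1))])            # homogeneous coordinates of Y
    # Solve Yh @ M ~ X; the columns of M hold (a, b, c) and (e, f, g).
    M, *_ = np.linalg.lstsq(Yh, X, rcond=None)
    return np.vstack([M.T, [0.0, 0.0, 1.0]])

# For a rotation-plus-scale distortion, as in the experiments of Section 4, the
# parameters can be read back from the estimated matrix A:
#   angle = np.degrees(np.arctan2(A[1, 0], A[0, 0]));  scale = np.hypot(A[0, 0], A[1, 0])
```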
4. Experimental Results
A number of experiments were conducted with different types of images in order to evaluate the performance of the present registration algorithm. Results from a few illustrative ones are briefly summarized in this section. In order to create ground truth data with which the registration parameters estimated by the present algorithm could be compared for a quantitative evaluation, the basic image frame in each case was
distorted by a known affine transformation in order to obtain a second frame that was then registered with the first (undistorted) frame. The first experiment uses the “Cameraman” image, of size 128 x 128, shown in Figure 2(a). This was rotated by a 4° angle in order to obtain the distorted frame shown in Figure 3(a). Estimating the registration parameter (the rotation angle, in this case) between the two frames by the present algorithm occurs in three steps. First, we obtain the local frequency representations for the two frames using the procedure outlined in Section 2 (the local frequency representation of the frame in Figure 2(a) is shown in Figure 2(b)). Then a set of control points is selected from each local frequency representation for execution of the matching step, by isolating points with local frequency values exceeding a chosen threshold. For the images under consideration, it was found that a set of 200 control points extracted from each local frequency representation is enough to cover all principal features in the image as well as to ensure that there is an adequate number of matching pairs to implement the matching algorithm. Determination of an appropriate number of control points is generally dependent on the activity within the image (and hence is image-dependent). However, in all experiments that were performed, matching sets containing about 300 control points were found to be adequate to give reasonably accurate estimates of the registration parameters. For illustration, Figure 3(b) shows the control points extracted from the local frequency representation shown in Figure 2(b). It must be emphasized that selecting the least number of control points necessary for the specific image being processed is useful for minimizing the computational effort in the matching step. However, extracting a matching set that has more than the required minimum number of points eliminates user intervention and makes the registration process fully automatic for all considered images. In the third step, we apply the matching algorithm to establish the correspondence between the two extracted sets and to estimate the transformation parameters at the same time. The estimated rotation angle in this experiment was 3.9°, which agrees quite well with the ground truth. Figure 3(c) shows the distorted frame of Figure 3(a) after it has been registered using the estimated parameters (i.e. de-rotated by 3.9°).
Figure 3. (a) Cameraman image distorted by a rotation angle of 4°. (b) The selected control points from the local frequency representation. (c) Image in (a) after de-rotation by the recovered angle of 3.9°.
Figure 4 shows the results of an experiment performed with a different type of image. The original image, of size 256 x 256 and shown in Figure 4(a), is distorted by scaling to 85% of the original and a rotation of 3°. The resulting distorted image is then
registered with the original frame. The registration parameters estimated by the present algorithm in this case are 3.1° for rotation and 0.85 for scaling. Figure 4(b) shows the local frequency representation of the frame in Figure 4(a), and the extracted control points for matching are shown in Figure 4(c). The registered image after de-rotation by 3.1° and prior to applying scaling is shown in Figure 4(d). To further challenge the present algorithm, another experiment, with remotely sensed earth data, was conducted. Figure 5(a) shows a reference data frame of size 256 x 256. A distorted image, shown in Figure 5(b), was obtained by rotating this frame by 6° (with no scaling); it was then registered with the original frame. The estimated registration parameters obtained were 6.01° for rotation and 1 for scaling. The local frequency representation of the original frame is shown in Figure 5(c) and the distorted image after de-rotation by 6.01° is shown in Figure 5(d).
Figure 4. (a) Original airplane image. (b) Local frequency representation. (c) The selected control points from the local frequency representation. (d) Distorted image registered with the image in (a) by applying a 3.1° de-rotation.
Figure 5. (a) An aerial image of part of a city. (b) Distorted image obtained by applying a 6° rotation to the reference image. (c) Local frequency representation of the image in (a). (d) The registered image obtained by applying a 6.01° de-rotation to the image in (b).
5. Conclusions
Utilization of the local frequency representation of the images to be fused offers a promising approach for solving image registration problems arising in the construction of high-resolution frames for multi-sensor surveillance and threat assessment. Computing the local frequency representations of the images by the algorithm described in this article yields a fully automated approach to image registration. Results presented in this paper demonstrate that the technique described here can be efficiently utilized to register diverse images with differing complexity levels and that the algorithm is quite robust to scene details. The present algorithm can hence find many applications in sensor fusion, object recognition and threat assessment, and detection and tracking of surveillance targets. With some modifications, the algorithm can also be extended to sub-pixel registration, which has important applications in super-resolution frame reconstruction using micro-scanned sensor measurements. Details on these developments can be found in [22].
6. References

[1] L. G. Brown, “A survey of image registration techniques”, ACM Computing Surveys, 24(4), pp. 325-376, 1992.
[2] A. J. B. Maintz and M. A. Viergever, “A survey of medical image registration”, Medical Image Analysis, 2(1), pp. 1-36, 1998.
[3] V. Govindu and C. Shekhar, “Alignment using distributions of geometric properties,” IEEE Trans. PAMI, 21(10), pp. 1031-1043, 1999.
[4] J. Ton and A. K. Jain, “Registering Landsat images by point matching,” IEEE Trans. Geoscience and Remote Sensing, 27(9), pp. 642-651, 1989.
[5] T. Kim and Y. J. Im, “Automatic satellite image registration by combination of matching and random sample consensus,” IEEE Trans. Geoscience and Remote Sensing, 41(5), pp. 1111-1117, 2003.
[6] P. Viola and W. Wells, “Alignment by maximization of mutual information,” Proc. Fifth Int. Conf. on Computer Vision, pp. 16-23, Boston, MA, 1995.
[7] K. Johnson, A. Rhodes, J. Le Moigne, and I. Zavorin, “Multi-resolution image registration of remotely sensed imagery using mutual information,” Proc. SPIE Aerosense Conf. on Wavelet Applications VII, 2001.
[8] P. Fua and Y. Leclerc, “Image registration without explicit point correspondences,” Proc. DARPA Image Understanding Workshop, pp. 981-992, 1994.
[9] J. Liu, B. C. Vemuri and F. Bova, “Multi-modal image registration using local frequency,” Fifth IEEE Workshop on Applications of Computer Vision, pp. 120-125, 2000.
[10] J. Liu, B. C. Vemuri, and J. L. Marroquin, “Local frequency representations for robust multimodal image registration,” IEEE Trans. Medical Imaging, 21(5), pp. 462-469, 2002.
[11] J. Liu, “Regularized quadrature for local frequency estimation: application to multi-modal volume image registration,” VMV’01, Stuttgart, Germany, pp. 507-514, 2001.
[12] G. H. Granlund and H. Knutsson, Signal Processing for Computer Vision, Dordrecht, The Netherlands: Kluwer Academic, 1995.
[13] B. Boashash, “Estimating and interpreting the instantaneous frequency of a signal – Part 1: Fundamentals,” Proc. of the IEEE, 80(4), pp. 520-536, 1992.
[14] B. Boashash, “Estimating and interpreting the instantaneous frequency of a signal – Part 2: Algorithms and applications,” Proc. of the IEEE, 80(4), pp. 540-568, 1992.
[15] M. Elbakary and M. K. Sundareshan, “A novel scheme for registration of images from multiple diverse sensors,” Proc. of the International Conference on Imaging Science, Systems and Technology (CISST’04), Las Vegas, Nevada, June 2004.
[16] D. Gabor, “Theory of communication,” Journal of the Institution of Electrical Engineers, 93, pp. 427-457, 1946.
[17] G. M. Haley and B. S. Manjunath, “Rotation-invariant texture classification using a complete space-frequency model,” IEEE Trans. Image Processing, 8(2), pp. 255-269, 1999.
[18] M. Elbakary and M. K. Sundareshan, “Extraction of control points from local frequency representation for image registration”, Technical Report IPDSL-12-2003, ECE Department, University of Arizona, Tucson, AZ, July 2003.
[19] S. Gold, A. Rangarajan, C. Lu, S. Pappu, and E. Mjolsness, “New algorithms for 2D and 3D point matching: pose estimation and correspondence,” Pattern Recognition, Vol. 31, No. 8, pp. 1019-1031, 1998.
[20] W. K. Pratt, Digital Image Processing, John Wiley & Sons Inc., New York, 1991.
[21] M. Elbakary and M. K. Sundareshan, “Accurate representation of local frequency using a computationally efficient Gabor filter fusion approach with application to image registration”, Pattern Recognition Letters, June 2005.
[22] M. Elbakary and M. K. Sundareshan, “Sub-pixel registration of images using local frequency representations”, Technical Report IPDSL-10-2004, ECE Department, University of Arizona, Tucson, AZ, December 2004.
Data Fusion and Image Processing: A Few Application Examples
Olivier GORETTA and Francis CELESTE
DGA/SPOTI/ASC/EORD, DGA/CEP/ASC/GIP
Abstract. Image data provided by different available and future observation satellites can improve our capabilities of detection and reconnaissance concerning an area of interest. To be effective, this information must be processed and used in a coherent manner. We propose to introduce several examples that take advantage of SAR, optical and infrared images for mapping and threat activity detection and assessment. Image fusion can be performed at three different processing levels (pixel, feature and decision). These different examples of fusion application are related to defence purposes. Keywords: Image registration, data fusion, image fusion, image exploitation, remote sensing, DTM, mapping.
Introduction

Imagery exploitation is now a key element in several defence applications. Image data provided by different available and future observation satellites can improve our capabilities of detection and reconnaissance concerning an area of interest. To attain the best level of preparedness and provide the warfighter with the requisite information, all information must be processed and used coherently. We propose to introduce several examples that take advantage of Synthetic Aperture Radar (SAR) and optical (visible and infrared) images for mapping and threat activity detection and assessment. Image fusion can be performed at three different processing levels (pixel, feature and decision). These different examples of fusion application are related to defence purposes. The definition of data fusion will be the one proposed by Wald in 1998: “Data fusion is a formal framework in which are expressed means and tools for the alliance of data originating from different sources and for exploitation of their synergy to obtain an information whose quality cannot be achieved otherwise.” The following points will be discussed:
• Co-registration or geo-localization
• SAR image enhancement based on multi-temporal fusion: SAR images are naturally corrupted by multiplicative noise (“speckle”), which can make the analysis of SAR images more difficult than that of optical images. Additive multi-temporal image fusion can make the interpretation easier by reducing the speckle without affecting the ground sample distance.
• Digital Terrain Model (DTM) extraction from SPOT stereoscopy, European Remote Sensing Satellite (ERS) interferometry and Radarsat radargrammetry fusion.
• 3D model reconstruction enhanced with hyperspectral imagery used for ground/building classification.
• Assessment of 3D model quality using a single SAR image.
• An example of multi-source image data synergy (optical: visible and infrared; SAR) for military threat assessment (Detection, Recognition and Identification).
• Change detection method.
1. Geo-localization and Registration Task
1.1. Image Geo-localization

Geo-localization consists of finding a mathematical model that allows the geographical position of each image pixel to be identified. This model may be more or less accurate depending on the knowledge of the imaging acquisition process [4]. A geo-localization model is defined by two functions:
$$\begin{cases} (\lambda, \varphi) = G(i, j, h) \\ (i, j) = H(\lambda, \varphi, h) \end{cases}$$
$(\lambda, \varphi)$, $h$ and $(i, j)$ are, respectively, the geographical position, the height on the Earth surface and the pixel position of a point present in the image. Two main families of models can be found:
• A complete physical model, where all information on the flight or satellite and on the sensors is used.
• A mathematical model, such as the polynomial or rational polynomial model.
Generally, some parameters of the geo-localization functions are provided with the image, but very often they are re-estimated. This can be done using meaningful points whose geographical positions are known.

1.2. Image Registration

Image registration [5] [1] is the process of overlaying two or more images of the same geographic area. The images can be taken:
• at different times,
• from different viewpoints,
• and/or by different sensors.
Due to the differences introduced by these imaging conditions, accurate registration is still a challenging task. Nevertheless, image registration is a crucial step in all image analysis tasks in which the final information is gained from the synergy of various data sources, as in image fusion, change detection, and DTM mapping. The
purpose of a registration algorithm is to geometrically align two or more images—the master and the slave(s). It is impossible to define a generic image registration method for all applications, due to the diversity of the images to be registered and the various types of image degradation. The geometric and radiometric distortions, but also the noise corruption, must be taken into account according to the nature of the considered data. Nevertheless, many image registration methods can be divided into four main steps (a minimal code sketch of such a chain is given after Figure 1):
• Feature extraction: distinctive structures such as contours, lines, points or surfaces are detected and extracted from the images. The extraction process can be performed manually, semi-automatically or completely automatically. Point features are classically called tie-points or Control Points.
• Feature matching: the aim is to establish a correspondence between the features from the master and those from the slave. This step can also be done manually or automatically.
• Geometric transform model estimation: conditional on the features and the correspondence map given in the previous step, the aim is to find the “best transformation” model for aligning the slave image with the master.
• Image geometric and radiometric resampling: the slave image is resampled in the geometry of the master image using an appropriate interpolation method.
Figure 1. Image registration method
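The sketch below strings the four steps of Figure 1 together for a pair of single-band images, assuming the OpenCV library is available; the ORB features, brute-force matching and RANSAC-based affine estimation are one possible choice for illustration, not the specific operational tools referred to in this paper.

```python
import cv2
import numpy as np

def register(master, slave):
    """Feature extraction, matching, geometric model estimation and resampling (Figure 1)."""
    orb = cv2.ORB_create(nfeatures=2000)
    kp_m, des_m = orb.detectAndCompute(master, None)          # master features
    kp_s, des_s = orb.detectAndCompute(slave, None)           # slave features
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = matcher.match(des_s, des_m)                     # feature matching
    src = np.float32([kp_s[m.queryIdx].pt for m in matches])
    dst = np.float32([kp_m[m.trainIdx].pt for m in matches])
    model, _ = cv2.estimateAffine2D(src, dst, method=cv2.RANSAC)   # geometric model estimation
    h, w = master.shape[:2]
    return cv2.warpAffine(slave, model, (w, h))                # resampling in the master geometry
```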
1.3. Example: Automatic SAR-SAR Registration

The example shown in Figure 2 deals with the registration of two SAR images with different aspect angles. In this context, classical area-based methods like cross-correlation are not suitable because of the significant geometric distortion between the two images. The idea is to use high-level features like lines. The matching and the geometric transformation model are estimated at the same time by a hypothesis test on the transformation.
Figure 2. Image with extracted line features (image courtesy of ONERA)
2. SAR Image Enhancement Based on Multi-Temporal Fusion

SAR images are naturally corrupted by a multiplicative noise-like phenomenon (known as “speckle”), which can make analysing them more difficult than optical images. The wave emitted by the sensor interacts with each discrete scatterer (individual surface element) within the resolution cell, and thus each scatterer contributes a backscattered wave with a phase and amplitude change, so the total returned signal of the incident wave is:

$$A e^{i\phi} = \sum_{k=1}^{N} A_k e^{i\phi_k}$$
The individual scattering amplitudes $A_k$ and phases $\phi_k$ are unobservable because the individual scatterers are on a much smaller scale than the resolution of the SAR sensor, and there are normally many such scatterers per resolution cell. This is the Goodman model [2]. This model can be used for low and medium resolution SAR sensors such as ERS or Radarsat. With this approach, speckle can be understood as an interference phenomenon in which the principal source of the noise-like quality of the observed data is the distribution of the phase terms $\phi_k$. In practice, one can consider that the phase terms $\phi_k$ are uniformly distributed in $[-\pi; \pi]$ and independent of the amplitude. If we presume large numbers of scatterers to be statistically identical, the observed phase $\phi$
is uniformly distributed over $[-\pi; \pi]$. Speckle reduction can be performed on a single image by spatial filtering of neighbouring pixels, but this reduces the resolution. If we assume that the scatterers in the different images are randomly distributed and numerous enough, then by fusing, i.e., taking the sum of the returned signals of the images, the resulting imaginary term should be approximately equal to 0, which means that additive multi-temporal fusion can reduce speckle without affecting the ground sample distance, thus making the interpretation process easier. This additive multi-temporal fusion can be done using different techniques or filters [7], which lead to different visual aspects for a given image, as can be observed in Figure 3.
Figure 3. Original (upper left) and multitemporal fused images with three different techniques [7]. (images courtesy of CNES and SILOGIC)
In practice, filter selection depends on the amount of available data (images) and on user needs. For example, some methods are more relevant for application where contours must be preserved.
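As a minimal sketch of the additive multi-temporal fusion described above, averaging K co-registered intensity images reduces the speckle standard deviation roughly by a factor of sqrt(K) while leaving the ground sample distance unchanged. This is the simplest of the filter families compared in [7]; the second variant shown, which normalises each date before averaging, is an assumption of ours rather than a method from that study.

```python
import numpy as np

def multitemporal_average(stack):
    """stack: (K, H, W) co-registered SAR intensity images; returns their temporal mean."""
    return stack.mean(axis=0)

def normalised_average(stack):
    """Normalise each acquisition by its spatial mean first, so that radiometric
    differences between dates do not dominate the fused image."""
    return (stack / stack.mean(axis=(1, 2), keepdims=True)).mean(axis=0)
```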
3. Digital Terrain Model (DTM) Extraction from ERS Interferometry, SPOT Stereoscopy and Radarsat Radargrammetry Fusion

There are three main methods to obtain elevation information from remote sensing data. These methods take advantage of data acquired from different viewpoints, usually involving an image pair. These data can be either optical (SPOT stereoscopy) or SAR (interferometry or radargrammetry).
3.1. Interferometry Processing

Interferometry [1] [2] uses two overlapping images acquired from two orbits separated by a small distance (the baseline). It differs from stereoscopy and radargrammetry in that the information used is not absolute but ambiguous. The height information is obtained by unwrapping the phase difference between the two SAR acquisitions. The absolute height information can then be calculated using ground control points.

The height can be estimated from the following approximate formulas, in which $B_x$ and $B_z$ are the horizontal and vertical components of the baseline, $\theta$ is the acquisition angle, $r_1$ and $r_2$ are the slant ranges from the two sensors, $H$ is the sensor altitude and $z$ is the terrain height:

$$\Delta\phi = \phi_1 - \phi_2 = \frac{4\pi}{\lambda}(r_1 - r_2) \approx \frac{4\pi}{\lambda}\,(B_x \sin\theta - B_z \cos\theta), \qquad \cos\theta = \frac{H - z}{r_1}$$
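A small sketch of how the approximate relations above can be inverted once the interferogram has been unwrapped and calibrated; the variable names, the flat-geometry assumption and the omission of flat-earth phase removal are simplifications of ours, not the operational processing chain.

```python
import numpy as np

def height_from_phase(dphi, lam, bx, bz, r1, H):
    """Solve dphi = (4*pi/lam) * (bx*sin(theta) - bz*cos(theta)) for theta,
    then use cos(theta) = (H - z) / r1 to recover the terrain height z."""
    c = lam * dphi / (4.0 * np.pi)
    baseline = np.hypot(bx, bz)
    alpha = np.arctan2(bz, bx)
    theta = alpha + np.arcsin(np.clip(c / baseline, -1.0, 1.0))
    return H - r1 * np.cos(theta)
```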
Interferometry processing involves:
• Image co-registration, as described before. First, a coarse registration is done with the orbital information of the sensors. It is then refined with an area-based correlation technique using Fast Fourier Transform (FFT) computation.
• Interferogram building: the complex multiplication of the two registered images gives the phase difference map (interferogram) and the coherence image. The resolution of the derived DTM is generally lower than the resolution of the original images.
• DTM extraction. This is done by unwrapping the interferogram. Each pixel phase is known modulo $2\pi$. The unwrapping strategy consists of converting the phase difference measurement to distance, from which the elevation is derived.
• DTM calibration. Due to the phase's ambiguous nature, the DTM does not have an absolute calibration. Height ground control points are therefore used to
calculate a correct DTM. These ground control points need to be chosen from a perfectly unwrapped area, usually manually. They can be taken from a map or from another DTM.
• DTM correction. This is often necessary because of residual misalignment of the two images or unwrapping failures in some areas.
The theoretical accuracy of the interferometry process depends on the so-called ambiguous height, linked to the wavelength $\lambda$, the interferometric baseline $B$ and the sensor acquisition angle $\theta$:

$$e_a = \frac{\lambda\, r \sin\theta}{2 B_x}$$
The quality of the extracted DTM is also described by the coherence image; the higher the coherence, the better the estimated height. Nevertheless, interferometry suffers from limitations such as:
• Unwrapping failures
• Very high sensitivity to time differences between the two images, which causes low coherence
• Sensitivity to atmospheric effects
• Foreshortening and shadowing effects due to the SAR sensor acquisition geometry

3.2. Stereoscopy Processing

Stereoscopy is a classical approach to derive a DTM from two optical images, and involves the following steps:
• Image registration
• Image resampling in epipolar geometry
• Correlation between the two images to obtain a disparity map
• DTM extraction and filtering
The height can be estimated from the disparity $d$ and the acquisition angles $\theta_1$ and $\theta_2$:

$$z \approx \frac{d}{\tan\theta_2 - \tan\theta_1} \approx \frac{d}{B/H}$$
The theoretical accuracy of the stereoscopy process depends on the geometrical configuration of the two images defined by the so-called B / H ratio.
$$\delta z = \frac{\delta d}{B/H}$$
with:
• $\delta z$: height error
• $\delta d$: disparity error
The confidence criterion for the estimated height is directly related to the correlation peak; the higher the correlation, the better the estimate. The limitations of stereoscopic methods are well known and are essentially:
• Sensitivity to time differences between image acquisitions
• Presence of shadows and clouds in the images
• Correlation difficulties in mountainous areas

3.3. Radargrammetry Processing

Radargrammetry [1] [2] is based on the same principles as stereoscopy, but takes into account the geometric peculiarities of SAR images. The general algorithm is the same as for stereoscopy, but with the added possibility of using speckle reducing filters (§2) before computation. The elevation confidence criterion is also derived from the correlation process. The height difference $\Delta z$ is related to the measured range parallax $\Delta r$ through the ground displacement $\Delta x$:

$$\begin{cases} \Delta x = (\cot\theta_1 - \cot\theta_2)\,\Delta z \\ \Delta x = \Delta r / \cos\theta_2 \end{cases}$$
The main limitations of radargrammetry are:
• Sensitivity to time differences between image acquisitions
• Foreshortening and shadowing effects due to the SAR geometry
• Speckle

3.4. Fusion Methodology

The goal of the fusion process is to obtain a DTM from several DTMs previously acquired using the three main methods described above. Each method provides a height estimate with a confidence parameter. Due to the limitations of each method, there is no estimate for some locations. The purpose is to find the height h(X) at each position X while minimizing the cost function:

$$J(h(X)) = \sum_k \sum_X \left(\frac{o_k(X) - o_k^{th}(X, h(X))}{\sigma_{o_k}(X)}\right)^2 + \gamma \sum_X \Phi(h(X))$$
with:
• $o_k(X)$: estimate given by method k at position X, which may be the disparity, the phase difference or height information
• $\sigma_{o_k}(X)$: confidence associated with $o_k(X)$
• $o_k^{th}(X, h(X))$: theoretical estimate for method k at position X
• $\gamma \sum_X \Phi(h(X))$: a regularization term used to filter or smooth the fused DTM and to fill in the missing parts; it introduces rigidity constraints on the result.
The three following combinations were tried:
• C1: SPOT stereoscopy and Radarsat radargrammetry
• C2: several radargrammetric pairs with Radarsat data
• C3: SPOT stereoscopy and interferometry
C1 enables any parts missing due to stereoscopy limitations (clouds...) to be filled in while maintaining accuracy. C2 and C3 not only derive a more complete DTM, but also provide more accurate height estimates.
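If the regularization term is dropped and each method directly supplies a height estimate with its confidence, minimizing J reduces to an inverse-variance weighted average at each position. The sketch below shows that simplified per-pixel case; the array names and the NaN convention for missing estimates are ours.

```python
import numpy as np

def fuse_dtms(heights, sigmas):
    """heights, sigmas: (K, H, W) stacks of per-method height estimates and confidences;
    NaN marks positions where a method provides no estimate."""
    w = np.where(np.isnan(heights), 0.0, 1.0 / np.square(sigmas))
    h = np.where(np.isnan(heights), 0.0, heights)
    wsum = w.sum(axis=0)
    fused = (w * h).sum(axis=0) / np.maximum(wsum, 1e-12)
    return np.where(wsum > 0, fused, np.nan)   # gaps remain for the regularizer to fill
```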
Figure 4. Height fusion process description
Figure 5. Stereoscopy (upper left), radargrammetry (upper right), interferometry (lower left) and stereoscopy + interferometry (lower right). Images courtesy of Thalès.
4. Fusion of a Stereo Image Pair and Hyperspectral Data for Enhanced 3D Urban Extraction

The 3D model extraction process can be improved by using hyperspectral data. As described above, 3D information can be obtained by exploiting two images (SAR or optical) acquired from two different viewpoints. For urban 3D mapping, it is necessary to differentiate between points such as building tops and other high points. To solve this problem, a classification step based on hyperspectral data can be used. Hyperspectral data, composed of around a hundred spectral bands, can reduce the confusion between building and non-building classes. Again, an accurate registration of the images and of the hyperspectral data set is necessary. Different classification methods can be used; however, hyperspectral data classification needs a pre-processing step to reduce the considerable amount of information. Statistical methods such as Principal Component Analysis (PCA), also called the Karhunen-Loève approach, can be used. This technique transforms a multivariate data set of M dimensions into a data set of new, un-correlated linear combinations of the original variables. It generates a new set of axes that are orthogonal. In this new description space, a substantial part of the information contained in the original data is carried by the first q (q << M) bands. The classification can be supervised.
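A brief sketch of the PCA (Karhunen-Loève) reduction step described above, assuming the hyperspectral cube is held as an (H, W, M) array; the number of retained bands q is an illustrative parameter, and a supervised classifier would then be trained on the reduced bands.

```python
import numpy as np

def pca_reduce(cube, q=10):
    """Project an (H, W, M) hyperspectral cube onto its first q principal components."""
    H, W, M = cube.shape
    X = cube.reshape(-1, M).astype(np.float64)
    X -= X.mean(axis=0)                       # centre each spectral band
    cov = np.cov(X, rowvar=False)             # M x M band covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
    axes = eigvecs[:, ::-1][:, :q]            # q leading, mutually orthogonal axes
    return (X @ axes).reshape(H, W, q)
```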
Figure 6. The left stereo pair image and the right one with one band of the hyperspectral data set
Figure 7. A view of the extracted 3D model draped with the left image pair. Images courtesy of ISTAR, Alcatel Space and Thalès
5. 3D Model Extraction and Assessment with Optical and SAR Images

In this application, the 3D model is extracted from a stereo image pair. One or several SAR images with different aspect angles are used to assess the extraction. The extracted buildings are projected onto the SAR images with the geo-localization model, and each building's shadow is also calculated. A visual assessment can then be done.
Figure 8. SAR image with one building with its projection. Image courtesy of ONERA and EADS
6. Relevant Image Synergy for Threat Assessment

In this application, two areas with a possible military threat are considered. Intelligence sources have revealed that only one of the two areas is really dangerous. The dangerous area (with true military targets) must be found with the help of satellite images in different spectral ranges. Thanks to multi-sensor synergy, real targets can be separated from decoys. The images below show that a decision cannot be made with visible images only. The infrared image shows objects with a high radiant temperature, but no definite conclusion can be drawn since inflatable decoys with warming devices exist. With SAR images, only real targets have a signature. The joint use of images from different spectral ranges enables decoys to be distinguished from real targets.
Images courtesy of French MOD
7. Change Detection Method

Change detection is of tremendous importance in remote sensing applications. It can be performed by a human interpreter, and some automated methods seem to be promising. Automated change detection is a challenging research domain in the field of defence applications. An example of automated change detection is automatic map updating, where the changes in the image concern objects like man-made structures, fields, forests and water areas. To simplify the process, cartographic objects that cease to exist can be processed separately from new objects. For configurations of partial overlap between map and images, it is difficult or even impossible to formalise the suggested approach within a probabilistic framework. Thus, the Dempster-Shafer theory is shown to be a more suitable formalism in view of the available information [6].
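For a single map region, the Dempster-Shafer combination mentioned above can be illustrated with two evidence sources, each leaving part of its mass on the whole frame (ignorance). The two-element frame {change, no_change} and the mass values below are purely illustrative and are not taken from the study in [6].

```python
def dempster_combine(m1, m2):
    """Dempster's rule over the frame {change, no_change}; 'either' denotes the full frame."""
    hyps = ["change", "no_change", "either"]
    combined = {h: 0.0 for h in hyps}
    conflict = 0.0
    for a in hyps:
        for b in hyps:
            if a == b or b == "either":
                inter = a
            elif a == "either":
                inter = b
            else:
                inter = None                      # contradictory focal elements
            if inter is None:
                conflict += m1[a] * m2[b]
            else:
                combined[inter] += m1[a] * m2[b]
    return {h: v / (1.0 - conflict) for h, v in combined.items()}

# Hypothetical example: a multispectral cue and a map-based cue for one region.
m_image = {"change": 0.6, "no_change": 0.1, "either": 0.3}
m_map = {"change": 0.2, "no_change": 0.5, "either": 0.3}
print(dempster_combine(m_image, m_map))
```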
To detect changes, both new information (images) and previous information (map) are used. Processing is done in two stages: the first stage concerns the detection of missing objects, and the second the detection of new objects. New objects are detected by a multi-spectral classification into four classes (water area, ground area, gas tank, building) of the image areas that correspond to ground-area classes on the map, which is precisely where new man-made constructions (gas tanks and buildings) or new water areas can appear. A region not classified as a ground-area class corresponds to a new object; otherwise, a “no-change” decision is obtained for this region of the map. Before classification, an important operation is the statistical learning of each class, carried out automatically from objects declared during the first stage as still present in the images. These regions are used as training areas to estimate the a priori probabilities and the probability densities of these classes. Missing objects are detected with an appropriate strategy of their own.
General synoptic of the change detection process: detection of missing objects, automatic learning from objects still present, then detection of new objects by multispectral classification.
The results of the extraction of new object areas by each approach are presented in the following figure. The two approaches attain an equivalent good-detection rate. The evidential method extracts one more gas tank and delimits the new object areas with slightly better precision. The difference is mainly due to the false alarms, which are more significant with the Bayesian approach, as shown in the upper left part of the image. In conclusion, although the Bayesian approach remains attractive, for an equivalent good-detection rate the evidential approach has a false alarm rate markedly lower than that of the Bayesian approach.
Bayesian approach (left) and evidential approach (right); good detections and false alarms are indicated.
8. Conclusion

Several fusion methods and applications have been presented in this paper. Image fusion can be used in different cases at three different levels (pixel, feature, decision). The examples emphasized the need for fusion to produce efficient and user-friendly geospatial intelligence data. This will provide the warfighter with the information necessary to attain the best level of preparedness.
References

[1] F. W. Leberl, Radargrammetric Image Processing, Artech House, 1990.
[2] Traitement des images RSO, sous la direction d’H. Maître, Hermès, 2001.
[3] C. Oliver and S. Quegan, Understanding Synthetic Aperture Radar Images, Artech House, 1998.
[4] T. Toutin, “Photogrammétrie satellitale pour les capteurs de haute résolution : état de l’art,” SFPT No. 175.
[5] B. Zitová and J. Flusser, “Image registration methods: a survey,” Elsevier, 2003.
[6] F. Janez, O. Goretta, and A. Michel, “Automated map updating by fusion of multispectral images in the Dempster-Shafer framework,” SPIE, 2000.
[7] The Speckle Filters Comparative Test Project: Multi-Temporal Filters, CNES (French Space Agency), May 2003.
Secondary Application Wireless Technologies to Increase Information Potential for Defence against Terrorism*
Christo KABAKCHIEV, Vladimir KYOVTOROV, Ivan GARVANOV
Institute of Information Technologies, Bulgarian Academy of Sciences, Acad. G. Bonchev Str., bl. 2, 1113 Sofia, Bulgaria
Phone: +3592/979-29-28; E-mail: [email protected]; [email protected]; [email protected]
Abstract: This paper concerns the detection, parameter estimation, and height estimation of Pseudo-random Noise (PN) signals with a passive radar receiver network applied to wireless communication systems with multipath interference. The investigation is carried out by Monte Carlo simulation. The achieved results can be applied to target detection in multistatic radars using existing communication networks. Keywords: Multi-sensor data fusion, Passive correlation receivers, OS CFAR processor, Multipath interference.
Introduction

Data fusion technology is continually searching for new information sensors, new data processing algorithms, and reliable data relationships to increase the completeness of the information needed for the defense against terrorism. Using networks of passive receivers to locate moving targets that reflect signals emanating from communication, navigation, TV or radio broadcasts is not well studied. However, these so-called Secondary Application Wireless Technology (SAWT) systems can be successfully used for the surveillance of large areas where the positioning of sensors is difficult and where the only reliable sensors are satellites. These systems could prove extremely important for the rapid and robust assessment of situations typically arising as a result of terrorist actions. Our aim is to use the existing Code Division Multiple Access (CDMA) wireless network for a secondary application: radar detection, and parameter and height estimation, of low flying targets. This could be achieved by adding passive radar receivers to the Base Stations (BS) of the CDMA network to form, alongside the CDMA wireless network, a passive coherent radar network. As a result, a low flying target, as it flies over the wireless communication network, would also cross the network of passive radar receivers. The receivers would use the global time synchronization of the Global Positioning System (GPS) or CDMA network and would be phase-synchronized with the CDMA network, but would be managed through their own or a central control system.

* This work is supported by IIT – 010059/2004, MPS Ltd. Grant “RDR” and Bulgarian NF “SR” Grant No. TH – 1305/2003.
This paper concerns the detection, and parameter and height estimation, of Pseudo-random Noise (PN) signals in multipath interference, using a passive radar receiver network linked to a wireless communication system [2]. In our research work, we use for each passive radar receiver the optimal detection structure for the PN sequences, consisting of a correlation receiver and an Ordered Statistics Constant False Alarm Rate (OS CFAR) processor. The signals from the passive receiver network are processed in the fusion node: first they are synchronized in time and space, and then they are assessed through a typical decision rule. By using three passive radars to simultaneously perform target detection, distance estimation, and data synchronization, target height can be estimated in the fusion node by applying the technique presented in Skolnik [3]. We assume that the target echo from the communication signal fluctuates according to a Swerling II case model at the input of the correlation receiver, that the multipath interference follows a Poisson distribution for probability of appearance, and that the amplitudes follow a Rayleigh distribution. Turin's model for multipath propagation is used [4,5]. The same approach can be used for other complex signals from different communication systems (GPS, CDMA 2000, WCDMA) [7].
1. Detection and Estimation in CDMA Networks in the Presence of Multipath Interference – Problem Formulation

The purpose of our research work, presented in this paper, is to synthesize the structure of a network passive radar algorithm for the simultaneous detection, and parameter and height estimation, of a low flying target in a CDMA communication network. Contemporary CDMA communication networks are used for data transfer via an air interface for mobile subscribers [6]. Generally, the networks consist of Base Stations (BS), with each BS covering a specific area (cell). The air interface consists of a pilot signal, a synchronization signal, and paging and traffic channels. The pilot signal is the same for the whole network and is used for phase initialization of demodulation (it supports system coherence) [6]. The synchronization channel is demodulated by all mobiles and contains important system information conveyed by the “synch channel message”, which is broadcast repeatedly. The paging channels are used to alert the mobile to incoming calls, to convey channel assignments and to transmit system overhead information. The traffic channels carry the digital voice or data to mobile users. All moving and fixed objects in the area covered by the communication networks reflect these signals. In our case, the target, flying over the wireless communication network, crosses the network of small passive radars. The receivers use the global time synchronization from the GPS or CDMA network, are phase-synchronized with the CDMA network, but are managed through their own or a central control system. We suppose the target passes through different cells and receives the pilot signal from different base stations. Of the various signals used in a CDMA network, we choose the pilot signal as the radar signal, as it is the most powerful signal, has a continuous code sequence and a long repetition period, and is continuous in time. The signal reflected from the target consists of many independent reflecting elements. Such a signal distribution is known as the Swerling II model with a priori known parameters. Specific interference in the
CDMA communication networks is the multipath interference, caused by the spread spectrum character of the signals. We use the well-known Turin model for multipath propagation in urban areas with a priori known parameters. It describes signals with a Rayleigh amplitude distribution and a Poisson probability of appearance. The signal time delay is assumed to be approximately 20 μs (or 3 km). These interferences, together with the background interference, cover all small cells (micro and pico). Unlike the background interference, which conceals the target signal, the multipath propagation forms many clutter targets and worsens detection and estimation. The task of passive radar network detection and estimation in a CDMA network can be transformed into a task of detecting a target with unknown coordinates and velocity in the presence of multipath interference with a priori known signal and interference parameters. We use the approach applied in surveillance radar for the detection of moving targets and the estimation of their parameters, as well as some of the well-known methods for background suppression (for example Moving Target Detection (MTD), Adaptive Moving Target Detection (AMTD) or Space-Time Adaptive Processing (STAP)) [7]. In this case, the detection of a target with unknown coordinates and velocity is transformed into multi-channel range target detection in a fixed velocity (azimuth) channel [7]. Using the CFAR approach allows the false alarm rate to be kept constant in all range cells. As a result, target detection in any communication cell can be formulated as CFAR target detection in a moving window sliding in range in all velocity channels (in any azimuth channel). We do not investigate moving targets in our paper. Therefore target detection is reduced to pilot signal CFAR detection in the moving window in range (in any azimuth channel), in the presence of multipath interference.
2. Signal and Environment Model

In this paper we study signal and environment models similar to the ones proposed in [4,5]:

$$w(t) = a_0\, s(t - t_d) + \sum_{k=1}^{\infty} a_k\, s(t - t_k) + n(t) \qquad (1)$$

where $a_0 s(t - t_d)$ is the reflected signal in every range element of the signal matrix (a cell), $a_0$ is an amplitude fluctuating independently according to a Rayleigh distribution law (Swerling II target model), $s(t)$ is the PN communication pilot signal, $t_d$ is the time delay of the direct signal; $\sum_{k=1}^{\infty} a_k s(t - t_k)$ is Turin's multipath model, where $a_k$ is an amplitude fluctuating according to a Rayleigh distribution law and $t_k$ is the delay time with a Poisson probability of appearance; and $n(t)$ is the Additive White Gaussian Noise (AWGN). We do not consider the uniform random distribution of phases in the multipath model. The emitted signal is continuous, but the received signal can be considered a pulse signal after the use of a correlator of fixed length. In our case, the reflected signal is a PN-code communication pilot signal. The PN-code spreading is followed by classic Quadrature Phase-Shift Keying (QPSK) modulation of
the radio frequency carrier. We use the communication channel model suggested by Turin [4,5]. This model is described as a pulse train with Rayleigh amplitude distribution, Poisson probability of appearance, and uniform random distribution of phases. In accordance with Turin's model, we choose probability of appearance Pa=0.2 at the input of the CFAR.
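For Monte Carlo studies of the detector, a discrete-time realisation of Eq. (1) can be generated as below: a Rayleigh-fluctuating direct echo (Swerling II), Turin-type multipath replicas with probability of appearance Pa = 0.2 per delay bin, and complex AWGN. The chip sequence, delays and power levels are illustrative assumptions of ours, not the settings used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def place_echo(s, delay, amp, n):
    """Return an n-sample vector containing amp * s delayed by 'delay' samples."""
    out = np.zeros(n, dtype=complex)
    end = min(delay + len(s), n)
    out[delay:end] = amp * s[:end - delay]
    return out

def received_signal(s, target_delay, n, p_appear=0.2, max_delay=150, noise_std=1.0):
    """Discrete-time realisation of Eq. (1): Swerling II target echo, Turin multipath, AWGN."""
    w = noise_std * (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
    w += place_echo(s, target_delay, rng.rayleigh(scale=1.0), n)   # a0 s(t - td)
    for d in range(max_delay):                                     # Poisson-like arrivals,
        if rng.random() < p_appear:                                # Pa = 0.2 per delay bin
            w += place_echo(s, d, rng.rayleigh(scale=0.5), n)      # ak s(t - tk)
    return w

# Example: a length-128 QPSK-modulated chip sequence standing in for the pilot.
chips = rng.choice([1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j], size=128) / np.sqrt(2)
w = received_signal(chips, target_delay=40, n=512)
```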
3. Passive Receiver Network for Target Detection in a CDMA Communication Network in the Presence of Multipath Interference

3.1. Correlation Receiver

We use the baseband acquisition diagram for the pilot signal of CDMA IS-95-A [6]. It consists of a correlator and a threshold detector with a fixed threshold. In our case, we use an OS CFAR processor for target detection in the presence of multipath interference (Figure 1). The quadrature components at the baseband filter input are presented in Figure 1, where CI(t) and CQ(t) are the PN sequences in the I and Q channels respectively, n(t) is the Additive White Gaussian Noise (AWGN), and nI(t) and nQ(t) are the corresponding noise in the two channels (statistically independent white Gaussian noise) [6].
Figure 1: Filter diagram
3.2. OS CFAR Processor

The signal after the correlation receiver is very dynamic. Unlike pulse radars, for which it is assumed that there is no signal in the reference window, our model works with continuous signals; therefore, the statistics in both the test and reference windows have similar structures, including white noise, signal, and multipath interference. Therefore we use the OS approach for noise level estimation in both windows. The minimum of the average decision threshold is used as a criterion of the effectiveness of these estimates [1]. The rank-order parameters giving the best OS estimation are chosen by using Monte Carlo simulation. The effective estimates correspond to those elements of the ordered statistics in both windows for which the minimum SNR occurs for a detection probability Pd = 0.5 and a fixed false alarm probability Pfa. In order
In order to optimize these estimates, we vary the rank-ordered parameters both jointly and independently (from 3L/4 down to L/8), as is done in [1]. The algorithm consists of the following stages. The elements of the reference window x = (x1, x2, ..., xR), R = ML, and of the test resolution cell z = (z1, z2, ..., zL) are rank-ordered according to increasing magnitude. The main idea of an OS CFAR procedure is to select one value x(k), k ∈ {1, 2, ..., R}, and one value z(k1), k1 ∈ {1, 2, ..., L}, from the order statistic sequences. These two values x(k) and z(k1) are used as the estimates V and q0, respectively, of the average noise level in the reference window and the average signal level in the test window. The rank-ordered parameters k and k1 of the OS CFAR procedure are chosen in such a way that the average decision threshold of the OS CFAR processor is minimal. The target is then detected according to the following rule:

Φ(q0) = 1 (hypothesis H1 accepted) if q0 ≥ Ta·V;   Φ(q0) = 0 (hypothesis H0 accepted) if q0 < Ta·V,    (2)
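A minimal numerical sketch of the OS CFAR rule (2) follows; it assumes Rayleigh-distributed envelope samples, and the window lengths, rank indices k and k1 and scale factor Ta are illustrative placeholders rather than the values optimised in [1].

```python
import numpy as np

def os_cfar_decision(reference, test_cell, k, k1, Ta):
    """Order-statistics CFAR decision following Eq. (2).

    reference : 1-D array of R samples from the reference window
    test_cell : 1-D array of L samples from the test resolution cell
    k, k1     : rank-order indices (1-based) used as noise/signal estimates
    Ta        : scale factor maintaining the desired false alarm probability
    """
    V = np.sort(reference)[k - 1]      # noise-level estimate from the reference window
    q0 = np.sort(test_cell)[k1 - 1]    # signal-level estimate from the test cell
    return q0 >= Ta * V, q0, V         # True -> hypothesis H1 (target present)

# Illustrative use with Rayleigh-distributed envelope samples (all parameters are assumptions)
rng = np.random.default_rng(0)
ref = rng.rayleigh(scale=1.0, size=32)     # reference window, noise and interference only
cell = rng.rayleigh(scale=3.0, size=16)    # test cell containing a stronger return
detected, q0, V = os_cfar_decision(ref, cell, k=24, k1=12, Ta=2.5)
print(detected, q0, V)
```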
where H1 is the hypothesis that the test resolution cells contain echoes from the target and H0 is the hypothesis that the test resolution cells contain only white noise, signal, and multipath interference. The constant Ta is a scale factor, determined so as to keep a given false alarm probability constant. Analytical equations for the probability density functions (PDF), the detection probability Pd0 and the false alarm probability Pd1 at the output of the correlator are not available. The Monte Carlo simulation approach is therefore used to estimate the probability performance of the OS CFAR processor.

3.3. Fusion Node - Detection of Low Flying Targets

We use hard-decision sensors: each sensor makes a single-hypothesis decision, and that decision alone is reported to the fusion process. With these single-look methods (the sensor decision is based upon a single measurement of the target signal), the measure of certainty can be based upon accumulated evidence from multiple, independent looks at the signals. We use a sequential L-of-M criterion. The binary integrator sums the L decisions received from the sensors, and a fusion node detection is declared if this sum reaches the second threshold M. The probability of target detection at the fusion node is computed from

P_D = Σ_{l=M}^{L} C(L, l) · P_{d1}^{l} · (1 − P_{d1})^{L−l},    (3)

where Pd1 is the probability of detection of each sensor; the probability of false alarm of the fusion node is calculated from the same expression by setting the signal s = 0. Analytical equations for the probability density functions (PDF), the detection probability Pd0 and the false alarm probability Pd1 at the output of the OS CFAR, are not available. The Monte Carlo simulation approach is therefore used to estimate the probability performance of the OS CFAR BI processor in multipath interference, as in [9,10].
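As a complement to (3), the following short sketch evaluates the fused detection probability for an L-of-M binary integration rule; the per-sensor probability and the values of L and M below are illustrative, not those of the simulated system.

```python
from math import comb

def fusion_detection_probability(p_sensor, L, M):
    """Probability that at least M of L independent sensors declare a detection (Eq. 3)."""
    return sum(comb(L, l) * p_sensor**l * (1 - p_sensor)**(L - l) for l in range(M, L + 1))

# Example: three sensors, detection declared when at least two agree (assumed values)
print(fusion_detection_probability(p_sensor=0.5, L=3, M=2))
# The false alarm probability of the fusion node follows by substituting the
# per-sensor false alarm probability for p_sensor.
```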
3.4. Fusion Node - Height Estimation of Low Flying Targets

It is extremely important to know the height of low flying targets that have no transponders or have entered air traffic controlled areas without permission. This problem can be solved by applying Skolnik's approach to a passive radar network, estimating the three coordinates of a target by measuring the distance from each of the passive radars to that target. Using a three-position passive radar system, for example, the target coordinates can be determined from the three measured target distances (r1, r2, r3) alone. Complete synchronization of the radars must be ensured and the distances between the radars (i.e., the radar coordinates) must be known. With a three-position passive radar system the target coordinates can be estimated as [2]:
x = (r1² − r2²) / (4a)    (4)

y = (r1² + r2² − 2r3² + 2(b² − a²)) / (4b)    (5)

z = √( r1² − (x + a)² − y² )    (6)
In this paper we also use these relations to obtain the error of the target height estimate, modeling all the parameters in a MATLAB computational environment. The numerical characteristics of the coordinates, the mean value and the standard deviation, are evaluated for the three-position radar system. The mean values of the target coordinates (x, y, z) can be obtained from expressions (4), (5) and (6). The analytical mean value of z, the height estimate for low flying targets, is:
M[z] = √( M²[r1] − (M[x] + a)² − M²[y] + D[r1] − D[x] − D[y] − D[z] )    (7)

The standard deviation of z is:

D[z] = M²[r1] − (M[x] + a)² − M²[y] − M²[z] + D[r1] − D[x] − D[y]    (8)
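The sketch below illustrates, under assumed geometry and noise parameters (baseline distances a and b, range noise level, target position), how relations (4)-(6) can be evaluated in a Monte Carlo loop to obtain the mean and standard deviation of the height estimate; it is not the authors' MATLAB model, only a schematic Python equivalent.

```python
import numpy as np

# Assumed geometry: receivers at (-a, 0, 0), (a, 0, 0) and (0, b, 0); all values illustrative.
a, b = 500.0, 400.0
target = np.array([1200.0, 900.0, 150.0])      # true target position (x, y, z), metres
sigma_r = 5.0                                   # assumed std of the range measurement error
rng = np.random.default_rng(1)

def height_from_ranges(r1, r2, r3):
    """Target coordinates from the three measured ranges, Eqs. (4)-(6)."""
    x = (r1**2 - r2**2) / (4 * a)
    y = (r1**2 + r2**2 - 2 * r3**2 + 2 * (b**2 - a**2)) / (4 * b)
    z2 = r1**2 - (x + a)**2 - y**2
    return x, y, np.sqrt(max(z2, 0.0))

receivers = np.array([[-a, 0, 0], [a, 0, 0], [0, b, 0]])
true_r = np.linalg.norm(target - receivers, axis=1)

z_estimates = []
for _ in range(10_000):                         # Monte Carlo runs
    r1, r2, r3 = true_r + rng.normal(0.0, sigma_r, size=3)
    z_estimates.append(height_from_ranges(r1, r2, r3)[2])

print("mean height:", np.mean(z_estimates), "std:", np.std(z_estimates))
```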
4. Conclusion

This paper has considered the benefits of using a passive radar receiver network built on a wireless communication system for the detection, and for the parameter and height estimation, of PN signals in multipath interference. By having three passive radars simultaneously perform target detection, distance estimation and data synchronization, target height estimation can be performed in the fusion node by applying the Skolnik approach,
which estimates the three coordinates of a target from the measured distance between each passive radar and the target. The results can then be applied to target detection in multistatic radar using existing communication networks.
References

[1] Rohling H.: Radar CFAR Thresholding in Clutter and Multiple Target Situation, IEEE Trans., vol. AES-19, 4, July, pp. 608-621, 1983.
[2] Cherniakov M., Kubik M.: "Secondary applications of wireless technology (SAWT)", 2000 European Conference on Wireless Technology, Paris, 2000.
[3] Skolnik M.: Radar Handbook, McGraw-Hill, 1990.
[4] Turin G.L. et al.: "A statistical model of urban multipath propagation", IEEE Trans. Vehicul. Technol., pp. 1-8, Feb. 1972.
[5] Suzuki H.: "A Statistical Model for Urban Radio Propagation", IEEE Transactions on Communications, vol. COM-25, No. 7, July 1977.
[6] Lee J., Miller L.: CDMA Systems Engineering Handbook, Artech House, 1998.
[7] Lazarov A., Minchev Ch.: "ISAR Technique with Complementary Phase Code Modulated Signals", PLANS 2004 Conference, Monterey, CA, April 26-29, 2004.
[8] Kabakchiev Chr., Garvanov I. and Kyovtorov V.: "Correlation Receiver with Active CFAR Detector for PN Signal Processing in Pulse Jamming with Unknown Parameters", International Conference on Radar '04, Toulouse, France, CD - 6P-SP-121, 2004.
[9] Kabakchiev Chr., Kyovtorov V. and Garvanov I.: "Detection with OS CFAR processor in CDMA networks in the presence of multipath interference", Cybernetics and Information Technologies, Volume 4, No. 2, pp. 101-120, 2004.
[10] Behar V., Kabakchiev Chr., Doukovska L.: "Adaptive CFAR PI Processor for Radar Target Detection in Pulse Jamming", VLSI, SP-26, pp. 383-396, 2000.
[11] Garvanov I. and Kabakchiev Chr.: "Sensitivity of API CFAR Detectors Towards Change of Input Parameters of Pulse Jamming", Proc. of the International Radar Symposium - IRS 2004, Warszawa, Poland, pp. 233-238, 2004.
[12] Garvanov I. and Kabakchiev Chr.: "Sensitivity of CFAR Processors Toward the Change of Input Distribution of Pulse Jamming", Proc. of the IEEE International Conference on Radar "Radar 2003", Adelaide, Australia, pp. 121-126, 2003.
[13] Himonas S.: CFAR Integration Processors in Randomly Arriving Impulse Interference, IEEE Trans., vol. AES-30, 3, July, pp. 809-816, 1994.
[14] Garvanov I., Behar V. and Kabakchiev Chr.: "CFAR Processors in Pulse Jamming", Numerical Methods and Applications 2002, NMA 2002, Lecture Notes in Computer Science, LNCS 2542, pp. 291-298, 2003.
[15] Akimov P., Evstratov F. and Zaharov S.: Radio Signal Detection, Moscow, Radio and Communication, 1989, pp. 195-203 (in Russian).
[16] Waltz E., Llinas J.: Multisensor Data Fusion, Artech House, Boston, 1990.
[17] Kabakchiev H., Garvanov I. and Kyovtorov V.: "Error estimation in target height finding using VHF radar and three-antenna system positioned one above the other", Distributed Computer and Communication Networks, Sofia, 2005, pp. 222-238.
Advances and Challenges in Multisensor Data and Information Processing E. Lefebvre (Ed.) IOS Press, 2007 © 2007 IOS Press. All rights reserved.
S.G. Nikolov et al.: Adaptive Image Fusion Using Wavelets: Algorithms and System Design

In the generic wavelet-based fusion scheme, the input images I1(x, y), ..., In(x, y) are decomposed by a wavelet transform W, the coefficients are combined by a fusion rule φ, and the fused image is obtained by the inverse transform:

I(x, y) = W^{-1}( φ( W(I1(x, y)), W(I2(x, y)), ..., W(In(x, y)) ) ).
Advances and Challenges in Multisensor Data and Information Processing E. Lefebvre (Ed.) IOS Press, 2007 © 2007 IOS Press. All rights reserved.
Methods for Fused Image Analysis and Assessment A. LOZAa,1,2, T. D. DIXONb, E. FERNÁNDEZ CANGAa, S. G. NIKOLOVa, D. R. BULLa, C. N. CANAGARAJAHa, J. M. NOYESb and T. TROSCIANKOb a Department of Electrical and Electronic Engineering, University of Bristol, UK b Department of Experimental Psychology, University of Bristol, UK
Abstract. Image fusion, the combining of images of different modalities, is finding increasingly widespread application in fields such as medical imaging, remote sensing and surveillance. Consequently, the ability to assess fused images accurately has become of great importance. In this correspondence, the differences between conventional image quality measures and composite image assessment are outlined, and image fusion assessment methods are covered with examples of both objective quality measures and psychophysical testing techniques.
Keywords. Image fusion, image quality assessment, psychophysical testing
Introduction

Digital image processing, compression and transmission affect image quality and may reduce an image's readability and value. To measure the extent of image degradation, image quality measures are necessary. Apart from quality monitoring, two applications of image quality assessment are important from an image processing point of view: benchmarking of algorithms, giving a point of reference against which the considered methods can be compared, and, within the image processing system itself, optimising the performance of an algorithm and its parameters. When the image is not intended as an input to a processing procedure (for example, a classifier), the ultimate recipient or interpreter of the image is a human observer. Consequently, subjective quality judgment seems the most appropriate. Subjective rating is usually performed by a group of interpreters, either trained experts or non-experts, and usually takes the form of absolute or comparative evaluation. Subjective methods of image assessment, however, may in many cases not be reproducible, and are expensive, time-consuming and can be influenced by the observers' professional and physical qualifications and by the technological conditions of the experiment [1]. Thus the need arises for computationally effective quantitative image and video assessment measures that would assess the distortion of a processed image without having to rely on human participants.
1 Corresponding Author: Artur Łoza, Merchant Venturers Building, University Gate 2.34, Woodland Road, Bristol, BS8 1UB, UK; e-mail: [email protected].
2 This work has been funded by the UK MOD Data and Information Fusion Defence Technology Centre.
An objective quality metric for image or video assessment is essentially a computational method of comparing an image with a reference image or, less often, of computing a statistical measure that does not require a reference. The earliest and still ubiquitously used quantitative measures are simple functions of the analysed images, so-called mathematical metrics, such as the Mean-Square Error (MSE), the Mean Absolute Error (MAE) and the Peak Signal-to-Noise Ratio (PSNR), as well as distortion contrast and the local MSE [2]. Some of the primary benefits of the mathematical metrics are that they are simple to calculate, have clear physical and logical meaning, and facilitate efficient mathematical optimisation. Their main disadvantage is that they often do not correlate well with actual subjective performance on quality estimation tasks [3] and might give very similar estimates across a range of picture distortion types without actually accounting for the nature of the distortion [4].

Metrics based on the Human Visual System (HVS) generate most of the research interest in the current literature, as they purport to explain and predict human image rating behaviour more accurately than pixel-based mathematical models. HVS functions may be modelled by Contrast Sensitivity Function filtering (which defines how sensitive a viewer is to a given spatial frequency and may also account for masking effects) and by single- or multi-channel decompositions, with transforms ranging from simple discrete cosine and wavelet decompositions to complex coders based on models of the low-level processing of the HVS [4, 5]. The errors of the transformed images with regard to the reference image are then calculated, normalised and pooled, usually using a method referred to as Minkowski error pooling [2, 5]. The HVS framework described above can be seen as based on error-sensitive methods that try to identify a level of error between the reference image and the distorted image. Instead of error sensitivity, it has been suggested that quality assessment should be based upon a measure of structural similarity [6]. This alternative philosophy is based on the premise that the HVS is highly tuned to extracting structural information from its field of view. It is from this foundation that a new quality metric, called the Structural SIMilarity (SSIM) Index, has been developed [4]. The mathematical description of the SSIM Index and its application to fused image assessment will be presented in Section 2.4.

Multi-sensor image or video fusion can be defined as the process in which several images, or some of their features, coming from different sensors are combined together to form a single fused image or video containing the required complementary information. The successful fusion of images acquired from different sensors, modalities or instruments is of great importance in many applications, such as remote sensing, computer vision, robotics, surveillance, medical imaging and microscopic imaging. The recent rise in research interest in fused data has brought with it significant problems in the objective assessment of such data. Much research effort is being placed upon creating new and more efficient fusion schemes, without also devising new and more appropriate methods for objective evaluation of the output image quality.
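To make the mathematical metrics concrete, a minimal sketch of MSE, MAE and PSNR for 8-bit greyscale images is given below; the peak value of 255 and the example arrays are assumptions made only for illustration.

```python
import numpy as np

def mse(reference, test):
    """Mean-square error between two equally sized greyscale images."""
    return np.mean((reference.astype(float) - test.astype(float)) ** 2)

def mae(reference, test):
    """Mean absolute error."""
    return np.mean(np.abs(reference.astype(float) - test.astype(float)))

def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in decibels (assumes 8-bit data by default)."""
    return 10.0 * np.log10(peak ** 2 / mse(reference, test))

# Illustrative use on a synthetic image and a noisy copy of it
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, size=(64, 64)).astype(np.uint8)
noisy = np.clip(ref + rng.normal(0, 5, size=ref.shape), 0, 255).astype(np.uint8)
print(mse(ref, noisy), mae(ref, noisy), psnr(ref, noisy))
```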
The issue of fused image quality assessment is complicated not only by the range of different fusion options available; the fusion itself often requires a different approach to the assessment problem, mostly because of the lack of a reference image and the strong dependency on datasets, fusion techniques and applications [7]. Therefore, it is essential to consider the modalities, algorithms and tasks undertaken when attempting to assess the quality of a
fused image. This correspondence concentrates on still image metrics, as research into fused video metrics is virtually nonexistent [8]. The paper briefly reviews the specific issues associated with fused image quality assessment in Section 1 and focuses on selected assessment methods in Section 2, where the main definitions and examples of objective quality measures and subjective testing methods for fused images are discussed.
1. Specifics of the Fused Image Assessment

As pointed out in early reviews of multisensor image fusion techniques [7], general statements on the quality of a fused image or technique are very difficult to make, owing to the lack of a reference fused image and the task/application dependency of most of the methods. The complexity of the image fusion assessment criteria, and the aspects distinguishing image fusion from other image processing procedures that should be considered when assessing its quality, are discussed in the following paragraphs.

1.1. Lack of a Reference Fused "Ideal" Image

A fused image inevitably contains complementary information from several, often incompatible, sources. Even though image fusion users may have a clear idea of what kind of image they want to obtain, the real-world equivalent of the composite image may not exist. For example, when fusing images from two or more cameras operating in different bandwidths, the equivalent ideal image is not available, because no single camera technology operating over such a bandwidth range exists. In some applications however, such as digital photography, it is possible either to synthesise the reference image from available data [9] or to use distorted versions of an available reference image in a simulated fusion process [10].

1.2. Task, Application and Modality Dependency

Image fusion can be used to achieve various ends, such as enhancement of spatial, spectral or temporal resolution in remote sensing; in fused imagery coming from a surveillance system, the detection of intrusions or abnormal behaviour is sought and the aesthetics of the image is less important. The results of image fusion are either presented to a human observer for easier and enhanced interpretation or are subjected to further computer analysis or processing, e.g., segmentation, classification, target detection or tracking, with the aim of improved accuracy and more robust performance. In all the aforementioned cases, the true quality of the fused image depends upon how well it performs in the specific task. Most conventional subjective image quality testing methods have been designed with assessing unimodal image quality in mind. The wide range of imaging modalities to be fused presents a significant problem relating to the compatibility of the images obtained from each sensor, and to how fully a metric can account for this issue. For example, images obtained by means of computed tomography and magnetic resonance scans might complement one another in terms of content, and be spatially and temporally similar. Consequently, the quality can be found using a measure based on the amount of statistical dependence of feature and visual information of
input and fused images [11]. However, when combining images whose maximum spatial and spectral resolutions differ greatly, a number of issues arise due to the incompatibility of the images, caused by possible differences in acquisition time or by incommensurate spatial and spectral bandwidths [12]. Such issues must be considered in the creation of metrics appropriate to the fusion method at hand. The situation is complicated further by the range of different fusion options and methods available, for which, in some cases, only specific assessment methods may be appropriate. As will be shown in Section 2.5, more specialised, task-related methods can be introduced for subjective quality or performance rating of image fusion.

1.3. Desirable Properties of Fused Images

When fusion serves a certain purpose, it is often possible to specify requirements that should be fulfilled by the fused image. For example, when images are fused to create a synthesis with enhanced spatial resolution, it is necessary for them to have the following properties [12]: 1) they should be as identical as possible to the originals; 2) they should have a spatial resolution close to that of the original high-spatial-resolution images; 3) the multispectral set of synthesised images should coincide with the multispectral set of images that would be observed by the sensor at the highest resolution. It is therefore suggested that such fused images should be evaluated in a content-dependent manner, both spatially and spectrally [13, 14].
2. Selected Fused Image Metrics

2.1. Mathematical Metrics

The mathematical metrics for fused image assessment may be applicable to situations where the content of a fused image is not to be assessed directly by a human. Under such circumstances, as with uni-modal metrics, a simple mathematical measure may be suitable for evaluating the differences between the fused and original/ideal images. This again raises the problem of sensor capabilities that cannot supply an ideal image. A number of mathematically based visual assessment procedures for fused images, such as colour-matching of features in fused images via histogram stretching and inversion of image channels, have been specified in [7]. The root MSE was considered a useful evaluation technique in [10], where it produced appropriate quality values for a digital camera image fusion assessment judged against an ideal image. A metric based upon the standard deviation of the difference between an ideal and a fused image, intended to assess spectral quality, has been proposed in [15]. In [16], a normalised squared error metric was used within subparts of the fusion process to assess image quality; the measure, however, was found not to be robust enough to correspond to visual information. Among the mathematical metrics for fused images, much attention has been given to the Mutual Information (MI) measure. MI is a natural measure of the dependence between random variables and was first used for image fusion assessment in [17]. It is defined via the Kullback-Leibler "distance" between two images or, in the case of image fusion, as the average of the distances between the input images (A and B) and the fused image F:
MI = (MAF + MBF) / 2,    MAF = Σ_{a,f} p(a, f) · log [ p(a, f) / (p(a) · p(f)) ],

where
MAF can be interpreted as the distance between the joint distribution of greyscale values of the images A and F, p(a, f), and the joint distribution of statistically independent images, p(a)p(f). The measure MBF is defined analogously to MAF.

2.2. Combined Spatial and Spectral Metrics

In remote sensing applications, where spatial and spectral image information is usually fused, reference images are usually not available. In [18] the correlation coefficients between the high-resolution satellite images were used as an indicator of spatial quality. However, this method is unable to make a direct comparison between the fused image and the high-resolution panchromatic image [13]. An alternative method is based on attempting to reconstruct the missing information that is not present in the high-spectral and fused images [19]. Another recent fused image metric that measures both spatial and spectral information is the Blur Parameter Estimation (BPE) of [15]. The BPE is based on the supposition that image resolution and spatial quality are positively correlated. The resolution of an image depends on the sensor equipment used to acquire the image, and can be limited by pixel size as much as by optical constraints. The spatial quality of a fused image can be characterised by the line point spread function. This measure was shown to work more accurately than the metric of [18] on a limited set of fused satellite images.

2.3. Edge Based Metric

The only aspect of the HVS that has been fully examined for image fusion purposes is edge extraction [20]. This is a comparatively simple model, similar to the single-channel models described in the Introduction, which assumes that one essential feature of the HVS can be used to evaluate the quality of an image. A metric that measures the amount of edge information "transferred" from the source images to the fused image has recently been proposed in [20]. It uses a Sobel edge operator to calculate the edge strength g(n,m) and orientation α(n,m) of each pixel in the input and output images. The relative strength and orientation "change", GAF(n,m) and AAF(n,m), of an input image A with respect to the fused image F are defined as:
GAF(n,m) = gF(n,m)/gA(n,m) if gA(n,m) > gF(n,m), and gA(n,m)/gF(n,m) otherwise;
AAF(n,m) = 1 − |αA(n,m) − αF(n,m)| / (π/2).

These measures are then used to estimate the edge strength and orientation preservation values, Qg and Qα:

Qg^AF(n,m) = 1 / (1 + exp[ κg (GAF(n,m) − σg) ]),    Qα^AF(n,m) = 1 / (1 + exp[ κα (AAF(n,m) − σα) ]),
where the constants κ and σ determine the exact shape of the sigmoid nonlinearities used to form the edge strength and orientation preservation values. The overall edge information preservation values are then defined as

QAF(n,m) = Qg^AF(n,m) · Qα^AF(n,m).

The measure QBF is defined analogously to QAF. A normalised weighted performance metric of a given process p that fuses A and B into F is given as

Q^(AB/F) = [ Σ_{n,m} ( QAF(n,m) wA(n,m) + QBF(n,m) wB(n,m) ) ] / [ Σ_{n,m} ( wA(n,m) + wB(n,m) ) ].

It can be observed that the edge preservation values QAF(n,m) and QBF(n,m) are weighted by coefficients wA(n,m) and wB(n,m), which reflect the perceptual importance of the corresponding edge elements within the input images. Note that in this method the visual information is associated with the edge information while region information is ignored.

2.4. Metric Based on Structural Similarity

The Image Fusion Quality Index (IFQI) [21] is based on the SSIM image quality index (see the Introduction) recently introduced in [6], which, for a local window of two images A and F, can be written as

SAF = [ 2 μA μF / (μA² + μF²) ] · [ 2 σA σF / (σA² + σF²) ] · [ σAF / (σA σF) ],
where μ and σ stand for the mean and standard deviation, respectively, and σAF is the covariance of A and F. The first and second components of SAF measure how close the luminance and the contrast of the images are, respectively; the third component is the correlation coefficient between the two images, measuring their spatial similarity. In order to apply this metric to image fusion evaluation, the authors of [21] introduce a saliency λ(w) that reflects the relative importance of image A compared to image B within the window w, and define the fusion quality index as

λ(w) = s(A | w) / ( s(A | w) + s(B | w) ),    Q(A, B, F) = (1/|W|) Σ_{w∈W} [ λ(w) SAF(w) + (1 − λ(w)) SBF(w) ],

where s(A | w) is the salience of image A within window w and W is the set of all windows.
Finally, to take into account an aspect of the HVS, namely the perceptual relevance of edge information, the same measure is computed with the "edge images" A′, B′ and F′, and the final value is calculated as the product of the two measures:

QE(A, B, F) = Q(A, B, F) · Q(A′, B′, F′).
As with the previous metrics, this metric does not require a ground-truth or reference image.
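A small sketch of a structural-similarity-style fusion index in the spirit of Section 2.4 is given below; the window size, the use of local variance as the saliency s, and the non-overlapping-window scan are illustrative assumptions rather than the exact choices of [21].

```python
import numpy as np

def local_ssim(a, f):
    """Three-component similarity (luminance, contrast, correlation) for one window."""
    mu_a, mu_f = a.mean(), f.mean()
    sd_a, sd_f = a.std(), f.std()
    cov_af = ((a - mu_a) * (f - mu_f)).mean()
    eps = 1e-12                                   # avoid division by zero in flat windows
    lum = 2 * mu_a * mu_f / (mu_a**2 + mu_f**2 + eps)
    con = 2 * sd_a * sd_f / (sd_a**2 + sd_f**2 + eps)
    cor = cov_af / (sd_a * sd_f + eps)
    return lum * con * cor

def fusion_quality(A, B, F, win=8):
    """Saliency-weighted fusion quality index over non-overlapping windows."""
    scores = []
    for i in range(0, A.shape[0] - win + 1, win):
        for j in range(0, A.shape[1] - win + 1, win):
            wa, wb, wf = (X[i:i+win, j:j+win].astype(float) for X in (A, B, F))
            sA, sB = wa.var(), wb.var()           # local saliency: window variance (assumed)
            lam = sA / (sA + sB + 1e-12)
            scores.append(lam * local_ssim(wa, wf) + (1 - lam) * local_ssim(wb, wf))
    return float(np.mean(scores))

# Illustrative use with random input images and their average as a crude "fused" image
rng = np.random.default_rng(0)
A = rng.integers(0, 256, (64, 64))
B = rng.integers(0, 256, (64, 64))
F = (A + B) // 2
print(fusion_quality(A, B, F))
```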
2.5. Psycho-visual Fusion Evaluation

Psycho-visual image fusion evaluation has, appropriately, been dominated by task-related perceptual evaluation of the images. Early advances in this field were initiated in [22], where colour image fusion schemes were applied to visible and thermal images of militarily relevant scenarios. The fusion methods used were shown to improve the accuracy of observers performing detection and localization tasks. Other factors of human observer performance, such as global scene recognition, target recognition and detection compared with single modalities, and the different colour mappings used in fusion, were tested in [23]. The psychophysical testing presented in [22, 23] has been extended to compare the JPEG2000 and JPEG compression schemes and combined with metric assessment (MI, QAB/F and IFQI) across a wider range of image fusion methods in [24, 25]. The fusion methods used were averaging, the contrast pyramid [26] and the dual-tree complex wavelet transform [27]. In the experiments of [24, 25] participants were asked to perform visual target detection tasks and to assess image fusion quality comparatively. The results show a correlation between two of the metrics (QAB/F and IFQI) and the psychophysical evaluation. They also indicate that the selection of the correct fusion method has more impact on task performance than the presence of compression.
3. Summary

This paper has reviewed a wide range of computational, objective and subjective image fusion testing methods. It is emphasised that image fusion quality assessment differs from conventional image quality testing due to the lack of reference fused images, the wide range of fusion methods and modalities used, and the task/application dependency of most of the methods. The most commonly used computational image fusion metrics try to estimate the amount of information transferred from the input images to the fused image, whereas incorporating specific tasks into psycho-visual testing allows task-dependent objective fusion assessment.
References [1] Eskicioglu, A.M., Quality measurement for monochrome compressed images in the past 25 years, in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) Conference. June 5-9, 2000: Istanbul, Turkey. p. 1907-1910. [2] Eckert, M.P. and A.P. Bradley, Perceptual quality metrics applied to still image compression. Signal Processing, 1998. 70(3): p. 177-200. [3] Miyahara, M., K. Kotani, and V.R. Algazi, Objective picture quality scale (PQS) for image coding. Communications, IEEE Transactions on, 1998. 46(9): p. 1215-1226. [4] Wang, Z., et al., Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, April 2004. 13(4): p. 600-612. [5] Pappas, T. and R. Safranek, eds. Perceptual Criteria for Image Quality Evaluation. Handbook of Image and Video Processing, ed. A. Bovik. 2000, Academic Press: San Diego. 669-684. [6] Wang, Z. and A.C. Bovik, A universal image quality index. Signal Processing Letters, IEEE, 2002. 9(3): p. 81-84. [7] Pohl, C. and J.L.v. Genderen, Review article Multisensor image fusion in remote sensing: concepts, methods and applications. International Journal of Remote Sensing, 1998. 19(5): p. 823-854. [8] CiteSeer.IST Scientific Literature Digital Library Online Search for "fused video metric", in http://citeseer.ist.psu.edu/cis?q=video+fusion+metric.
[9] Hill, P., N. Canagarajah, and D. Bull. Image Fusion using Complex Wavelets. in The 13th British Machine Vision Conference. 2002. [10] Zhang, Z. and R. Blum, A categorization of multiscale-decomposition-based image fusion schemes with a performance study for a digital camera application. Proceedings of the IEEE, August 1999. 87: p. 1315-1326. [11] Qu, G., D. Zhang, and P. Yan, Information measure for performance of image fusion. IEE Electronics Letters, 2002. 38(7): p. 313-315. [12] Wald, L., T. Ranchin, and M. Mangolini, Fusion of satellite images of different spatial resolution: Assessing the quality of resulting images. Photogrammetric and Remote Sensing, 1997. 63(6): p. 691699. [13] Li, J., Spatial Quality Evaluation Of Fusion Of Different Resolution Images. International Archives of Photogrammetry and Remote Sensing, 2000. XXXIII: p. 331-338. [14] Buntilov, V. and T. Bretschneider. Objective content-dependent quality measure for image fusion of optical data. in IEEE International Geoscience and Remote Sensing Symposium. 2004. [15] Li, H., B.S. Manjunath, and S.K. Mitra, Multisensor Image Fusion Using the Wavelet Transform. Graphical Models and Image Processing, May 1995. 57(3): p. 235-245. [16] Robinson, G.D., H.N. Gross, and J.R. Schott, Evaluation of Two Applications of Spectral Mixing Models to Image Fusion. Remote Sensing of Environment, 2000. 71(3): p. 272-281. [17] Qu, G., D. Zhang, and P. Yan, Medical image fusion by wavelet transform modulus maxima. Optics Express, August 2001. 9(4): p. 184-190. [18] Zhou, J., D.L. Civco, and J.A. Silander, A wavelet transform method to merge Landsat TM and SPOT panchromatic data. International Journal of Remote Sensing, 1998. 19(4): p. 743 - 757. [19] Ranchin, T., et al., Image fusion--the ARSIS concept and some successful implementation schemes. ISPRS Journal of Photogrammetry and Remote Sensing, 2003. 58(1-2): p. 4-18. [20] Petrovic, V.S. and C.S. Xydeas, Sensor noise effects on signal-level image fusion performance. Information Fusion, 2003. 4: p. 167-183. [21] Piella, G. and H. Heijmans, A New Quality Metric for Image Fusion, in Proceedings of the Intl. Conf. on Image Processing. 2003: Barcelona, Spain. [22] Toet, A., et al., Fusion of visible and thermal imagery improves situational awareness. Displays, 1997. 18(2): p. 85-95. [23] Toet, A. and E.M. Franken, Perceptual evaluation of different image fusion schemes. Displays, 2003. 24(1): p. 25-37. [24] Dixon, T., et al. Psychophysical and Metric Assessment of Fused Images. in 2nd Symposium on Applied Perception in Graphics and Visualization. 2005. Spain. [25] Fernandez-Canga, E., et al. Characterisation of Image Fusion Quality Metrics for Surveillance Applications over Bandlimited Channels. in The 8th Intl. Conf. on Information Fusion. 2005. Philadelphia, PA, USA. [26] Toet, A., L.v. Ruyven, and J. Velaton, Merging thermal and visual images by a contrast pyramid. Optical Engineering, 1989. 28(7): p. 789-792. [27] Kingsbury, N., Complex Wavelets for Shift Invariant Analysis and Filtering of Signals. Applied and Computational Harmonic Analysis, 2001. 10(3): p. 234-253.
Advances and Challenges in Multisensor Data and Information Processing E. Lefebvre (Ed.) IOS Press, 2007 © 2007 IOS Press. All rights reserved.
Object Tracking by Particle Filtering Techniques in Video Sequences1 2
Lyudmila MIHAYLOVA , Paul BRASNETT, Nishan CANAGARAJAH and David BULL Department of Electrical and Electronic Engineering, University of Bristol, UK
Abstract. Object tracking in video sequences is a challenging task and has various applications such as port security. We review particle filtering techniques for tracking single and multiple moving objects in video sequences by using different features such as colour, shape, motion, edge, and sound. Pros and cons of these algorithms are discussed along with difficulties that have to be overcome. Results of a particular particle filter with colour and texture cues are reported. Conclusions and open research issues are formulated. Keywords: particle filtering, sensor data fusion, tracking in video sequences
Introduction

Object tracking is required in many vision applications such as human-computer interfaces, video communication/compression, road traffic control, and security and surveillance systems. Often the goal is to obtain a record of the trajectory of single or multiple moving targets over time and space by processing information from distributed sensors. Object tracking in video sequences requires on-line processing of a large amount of data and is time-consuming. Additionally, most of the problems encountered in visual tracking are non-linear, non-Gaussian, multi-modal, or any combination of these. Different techniques are available in the literature for solving tracking tasks in vision; they can be generally divided into two groups: i) classical applications, where targets do not interact much with each other and behave independently, such as aircraft that do not cross paths, and ii) applications in which targets do not behave independently (ants, bees, robots, people) and their identity is not always well distinguishable. Tracking multiple identical targets poses its own challenges when the targets pass close to each other or merge. In this paper, we concentrate on the particle filtering technique, which has recently proven to be a powerful and reliable tool for tracking non-linear systems. The promise of particle filtering is that it allows the fusion of different sensor data, the incorporation of constraints, and accounting for different non-Gaussian uncertainties. Furthermore, it can cope with missing data, circumvent possible occlusions, and solve data association problems when multiple targets are being tracked with multiple sensors [1].
We acknowledge the financial support of the UK MOD Data and Information Fusion DT Center. 2 Corresponding author, E-mail:
[email protected]
The observations may come synchronously or asynchronously in time, from one sensor or many sensors, static or moving. How to position the cameras is an optimisation and decision making problem.

The rest of the paper is organised as follows. Section 1 formulates the problem of object tracking in video sequences within a Bayesian framework. The most commonly used motion models and cues are presented in Section 2. Section 3 gives a particle filter based on fusing multiple independent information cues, colour and texture. The algorithm relies on factorising the likelihood as a product of the likelihoods of the different cues. We show the advantage of fusing multiple cues compared to colour-based tracking only and texture-based tracking only. Section 4 generalises the results and outlines future work.

1. Monte Carlo Framework for Object Tracking in Video Sequences

The generic objective is to track the state of a specified object or region of interest in a sequence of images captured by a camera. Different techniques are available in the literature for solving tracking problems in vision. We focus mainly on Monte Carlo techniques (particle filters) because of their power and versatility [2-5]. Monte Carlo techniques are based on computation of the state posterior density function by samples, and are known under different names: Particle Filters (PFs) [3], bootstrap methods [2], or the condensation algorithm [6, 7], which was the first variant applied to video processing. The acronym CONDENSATION stems from CONDitional DENSity propagATION.

The aim of sequential Monte Carlo estimation is to evaluate the posterior Probability Density Function (PDF) p(x_k | Z_k) of the state vector x_k, of dimension n_x, given a set Z_k = {z_1, ..., z_k} of sensor measurements up to time k. The Monte Carlo approach relies on a sample-based representation of the state PDF. Multiple particles (samples) of the state are generated, each associated with a weight that characterises the quality of a specific particle. An estimate of the variable of interest is obtained by the weighted sum of particles. The two major steps are prediction and update. During the prediction stage, each particle is modified according to the state model of the region of interest in the video frame, including the addition of random noise to simulate the noise in the state. In the update stage, each particle's weight is re-evaluated with the new data. A resampling procedure then eliminates particles with small weights and replicates the particles with larger weights. Many of the proposed particle filters for tracking in video sequences rely on a single feature, e.g., colour. However, single-feature tracking does not always provide reliable performance when there is clutter in the background. Multiple-feature tracking [1, 19, 20] provides a better description of the object and improves robustness. In [1], a particle filter is developed that fuses three types of raw data: colour, motion, and sound. However, developing a visual tracking algorithm that is robust to a wide variety of conditions is still an open problem. Part of this problem is the choice of what to track. Colour trackers are distracted by other objects that have the same or similar colour as the target.
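The resampling step described above can be illustrated with the following minimal sketch, which replaces low-weight particles by copies of high-weight ones; the multinomial scheme shown is one common choice, not necessarily the one used by the authors.

```python
import numpy as np

def resample(particles, weights, rng=np.random.default_rng()):
    """Multinomial resampling: draw N particle indices with probability equal to the weights."""
    n = len(particles)
    idx = rng.choice(n, size=n, p=weights)        # high-weight particles are replicated
    return particles[idx], np.full(n, 1.0 / n)    # weights are reset to uniform

# Example with a 1-D state: particles with larger weights survive more often (toy values)
particles = np.array([0.0, 1.0, 2.0, 3.0])
weights = np.array([0.05, 0.05, 0.1, 0.8])
print(resample(particles, weights))
```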
2. Typical Motion and Observation Models

2.1. Motion Models

The techniques used to accomplish a given tracking task depend on the purpose, and in particular on whether one is tracking: i) objects possessing certain characteristics, e.g., cars, people, faces; ii) objects possessing certain characteristics with a specific attribute, e.g., moving cars, walking people, talking faces, the face of a given person; or iii) objects of a priori unknown nature but of specific interest, such as moving objects. In each case, part of the input video frame is searched against a reference model describing the appearance of the object. The reference can be based on image patches, which describe the appearance of the tracked region at the pixel level, on contours, and/or on global descriptors such as colour models. To characterise a target, first a feature space is chosen. The reference object (target) model is represented by its PDF in the feature space. For example, the reference model can be the colour PDF of the target [1]. In the subsequent frame, a target candidate is defined at some location and is characterised by its PDF. Both PDFs are estimated from the data and compared by a similarity function. The local maxima of the similarity function indicate the presence of objects in the second image frame with representations similar to the reference model defined in the first frame. Examples of similarity functions are the Bhattacharyya distance and the Kullback-Leibler distance.

For the tracking of a specified object or region of interest in image sequences, different object models have been proposed in the literature. Many of them make only weak assumptions about the precise object configuration and are not particularly restrictive about the types of objects. A reasonable approximation to the region of interest can be an ellipse [8] or a rectangular box, as in [1]. The object (motion) models used in the literature vary from general random walk models [10, 1] to constant acceleration models [9] or other specific models. To design algorithms that are applicable to fairly large groups of objects, including people, faces, vehicles, etc., a weak model for the state evolution is adopted in [1]: mutually independent Gaussian random walk models. These models are augmented with small random uniform components to capture (rare) events such as jumps in the image sequence. They also help in recovering tracks after periods of complete occlusion. Mixed-state motion models, as in [20], can be used to overcome partial and full occlusions.

2.2. Observation Models

The observation models for object tracking in video sequences are usually highly non-linear and can be either parametric (e.g., mixtures of Gaussians) or nonparametric (e.g., histograms). Some of the most often used observation models are based on colour, shape and/or motion cues. The localisation cues impact PF-based trackers in different ways. Usually, likelihood models of each cue are constructed [10, 1]. These cues are assumed mutually independent; it must be kept in mind that any correlation that may exist between them, e.g., between the colour, motion and sound of an object, is likely to be weak. Adaptation of the cues is essential in distinguishing different
objects, making tracking robust to appearance variations due to changing illumination and pose.

2.2.1. Shape Information

When a specific class of objects is considered, a complete model of its shape can be learned offline and contour cues can be applied to capture the visual appearance of the tracked entities. Colour/spline-based PFs are developed in [6, 7]. In [7], colour information has been used in particle filtering for initialisation and importance sampling. These models can be contaminated by edge clutter and are not adaptable to scenarios without a predefined class of objects to be tracked, or where the class of objects does not exhibit very distinctive silhouettes. When shape modelling is not appropriate, colour cues are a powerful alternative.

2.2.2. Colour Modelling

Colour is an efficient cue for object tracking and recognition that is easy to implement and requires only modest hardware. Most colour cameras provide an RGB (red, green, blue) signal. The HSI (hue, saturation, intensity) representation [11] can also be used [12]. Hue refers to the perceived 'colour' (technically, the dominant wavelength), e.g., 'purple' or 'orange'. Saturation measures its dilution by white light, giving rise to 'light purple', 'dark purple', etc., i.e., it corresponds to the 'vividness' or 'purity' of the colour. HSI decouples the intensity information from the colour, while hue and saturation correspond to human perception. Colour-based trackers have proven robust and versatile at a modest computational cost [13, 1, 14]. Colour localisation cues are obtained by associating a reference colour model with the object or region of interest. This reference model can be obtained by hand-labelling or from some automatic detection module. To assess whether a given candidate region contains the object of interest, a colour model of the same form as the reference model is computed within the region and compared to the reference model. The smaller the discrepancy between the candidate and reference models, the higher the likelihood that the object is located inside the candidate region. Histogram-based colour models are used in [1, 8, 13]. The likelihood is computed from the histogram distance between the empirical colour distribution in the hypothesised region and the reference colour model. For colour modelling in [1], independent normalised histograms are used in the three channels of the RGB colour space. The colour likelihood model is then defined to favour candidate colour histograms close to the reference histogram. An appropriate distance metric for deciding on the closeness of two histograms h1 and h2 is the Bhattacharyya similarity coefficient [15, 16]:
D(h1, h2) = ( 1 − Σ_{i=1}^{B} √( h_{i,1} · h_{i,2} ) )^{1/2},    (1)
where B is the number of bins. This metric is within the interval [0,1]. Based on this distance, the colour likelihood model can be defined by [1]
p(z | x) ∝ exp( − Σ_{c∈{R,G,B}} D²( h_x^c, h_ref^c ) / (2 σ_C²) ),    (2)
based on the histograms h_x^c and h_ref^c of the target and reference objects, respectively, and the standard deviation σ_C of the colour cue. Two PFs are developed in [1]: one based on colour and sound and one based on colour and motion. In the PF with colour and sound, the search is performed first in the one-dimensional x direction, followed by a search in the two-dimensional (x, y) space. This increases the efficiency of the PF, allowing the same accuracy to be achieved with a smaller number of particles. The same strategy is applied when colour and motion are fused. The colour cues are persistent and robust to changes in pose and illumination, but are more prone to ambiguity, especially if the scene contains other objects characterised by a colour distribution similar to that of the object of interest. The motion and sound cues are very discriminative and allow the object to be located with low ambiguity.

2.2.3. Motion Cues

Instantaneous motion activity captures other important aspects of the sequence content and has been studied from various perspectives [17]. In the case of a static camera, the absolute value of the luminance frame difference computed on successive pairs of images is used to calculate a likelihood model [1] similar to the one developed for the colour measurements. Motion cues are usually based on histogramming consecutive frame differences.

2.2.4. Texture Cues

Despite there being no unique definition of texture, it is generally agreed that texture describes the spatial arrangement of pixel grey levels in an image, which may be stochastic or periodic or both [18]. Texture is often considered to be made up of basic elements (textural primitives) repeated in a regular or random fashion across the image. Some of the most successful methodologies proposed to describe and analyse textures are spatial frequency techniques and stochastic random field approaches. Texture cues can be implemented, e.g., by using wavelet transforms [20].

2.2.5. Edge Cues

Edges are pixels where the intensity changes abruptly. An edge in an image is usually taken to mean the boundary between two regions with relatively distinct grey levels. The 'ideal' situation is when the two regions have distinct constant grey levels and the edge is characterised by an abrupt change. In most practical situations, however, edges are characterised by a smooth transition in grey level, with the two regions having slowly varying but distinct average grey levels. Edges may be: i) viewpoint dependent, i.e., they may change as the viewpoint changes and typically reflect the geometry of the scene, such as objects occluding one another; or ii) viewpoint independent, i.e., they reflect properties of the viewed objects, e.g., markings and surface shape.
An image function depends on two co-ordinates in the image plane, and so operators describing edges are expressed using partial derivatives. A change of the image function can be described by a gradient that points in the direction of the largest growth of the image function. An edge [11] is a property attached to an individual pixel and is calculated from the behaviour of the image function in a neighbourhood of that pixel. An edge is a vector variable with two components: magnitude and direction. The edge magnitude is the magnitude of the gradient, and the edge direction θ is rotated with respect to the gradient direction ψ by −90°.

2.2.6. Multiple Cues

The greatest weakness of the colour cue is its ambiguity due to the presence of objects or regions with colour features similar to those of the object of interest. By fusing colour, motion, texture, and other cues, this ambiguity can be considerably reduced if the object of interest is moving, as shown in [1, 19, 20]. When the object is moving, strong localisation cues are provided by motion measurements, whereas colour measurements can undergo substantial fluctuations due to changes in object pose and illumination. Conversely, when the object is stationary or nearly stationary, motion information disappears and colour information dominates, providing a reliable localisation cue.
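As an illustration of how one cue likelihood entering such a multi-cue product can be computed, the sketch below evaluates the colour likelihood (2) via the Bhattacharyya distance (1) for a candidate region; the number of histogram bins and the value of σ_C are assumed values chosen only for illustration.

```python
import numpy as np

def channel_histogram(channel, bins=16):
    """Normalised histogram of one colour channel (8-bit values assumed)."""
    hist, _ = np.histogram(channel, bins=bins, range=(0, 256))
    return hist / max(hist.sum(), 1)

def bhattacharyya_distance(h1, h2):
    """Eq. (1): distance between two normalised histograms."""
    return np.sqrt(max(1.0 - np.sum(np.sqrt(h1 * h2)), 0.0))

def colour_likelihood(candidate_rgb, reference_rgb, sigma_c=0.2):
    """Eq. (2): likelihood of a candidate region given per-channel reference histograms."""
    d2 = sum(
        bhattacharyya_distance(channel_histogram(candidate_rgb[..., c]),
                               channel_histogram(reference_rgb[..., c])) ** 2
        for c in range(3)
    )
    return np.exp(-d2 / (2 * sigma_c ** 2))

# Illustrative use with random image patches standing in for tracked regions
rng = np.random.default_rng(0)
reference = rng.integers(0, 256, (32, 32, 3))
candidate = rng.integers(0, 256, (32, 32, 3))
print(colour_likelihood(candidate, reference))
```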
3. Particle Filtering Using Multiple Cues A PF algorithm for object tracking in video sequences using multiple cues - colour and texture - was developed [19, 20] and is presented in Table 1. Results from a natural sequence are shown in Fig. 1 with colour and texture cues (initial, intermediate and last frame). Fig. 2 presents the root-mean-square error obtained with synthetic data. It is evident from Fig. 2 that the colour cue, when compared to the other cues, is the least accurate.
Figure 1. Results from a natural sequence: tracking of the small boat by colour and texture cues with a PF: a) initial frame, b) intermediate frame, c) last frame.
Figure 2. Results with synthetic data [19], with 100 Monte Carlo runs: root-mean-square error (RMSE) in the x and y directions combined, over 50 frames, for combined colour and texture cues, colour cues only, and texture cues only.
Due to space limitations, other results could not be included here. However, those results show the algorithm's performance under different scenarios. The PF is able to: 1) track a single moving object, and 2) retrieve the object after a tracking loss. This is achieved by a mixed-state motion model [19] composed of a constant velocity model and a re-initialisation model drawing uniform samples (needed to recover the object after it is lost).

Table 1. A particle filter with multiple cues

Initialisation
1. k = 0; for i = 1, ..., N, generate samples {x_0^(i)} from the initial distribution p(x_0).

Prediction step
2. For k = 1, 2, ... and i = 1, ..., N, sample x_{k+1}^(i) ~ p(x_{k+1} | x_k^(i)) according to the object model.

Measurement update: evaluate the importance weights
3. On receipt of a new measurement, compute the weights Ŵ_{k+1}^(i) ∝ W_k^(i) · L(z_{k+1} | x_{k+1}^(i)).
4. Normalise the weights: Ŵ_{k+1}^(i) ← Ŵ_{k+1}^(i) / Σ_{j=1}^{N} Ŵ_{k+1}^(j). The likelihood L(z_{k+1} | x_{k+1}^(i)) is calculated as a product of the likelihoods of the separate independent cues.

Output
5. A collection of samples from which the approximate posterior distribution is computed:
   p̂(x_{k+1} | Z_{k+1}) = Σ_{i=1}^{N} Ŵ_{k+1}^(i) δ(x_{k+1} − x_{k+1}^(i)),
   where Z_{k+1} is the set of measurements available up to the time instant k+1.
6. The posterior mean is computed using the collection of samples (particles):
   x̂_{k+1} = E[x_{k+1} | Z_{k+1}] ≈ Σ_{i=1}^{N} Ŵ_{k+1}^(i) x_{k+1}^(i).

Selection step (resampling)
7. Multiply/suppress samples x_{k+1}^(i) with high/low importance weights Ŵ_{k+1}^(i) in order to obtain N new random samples approximately distributed according to p(x_{k+1} | Z_{k+1}).
8. Set k → k+1 and return to step 2.
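A compact sketch following the steps of Table 1 is given below for a 2-D position state with a random walk motion model; the noise levels, number of particles and the synthetic Gaussian-shaped likelihood stand in for the colour and texture cue likelihoods and are assumptions made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 500                                           # number of particles (assumed)
particles = rng.normal(0.0, 5.0, size=(N, 2))     # step 1: samples from the initial distribution
weights = np.full(N, 1.0 / N)

def cue_likelihood(particles, measurement, sigma=2.0):
    """Synthetic stand-in for the product of independent cue likelihoods (steps 3-4)."""
    d2 = np.sum((particles - measurement) ** 2, axis=1)
    return np.exp(-d2 / (2 * sigma**2))

true_pos = np.array([0.0, 0.0])
for k in range(20):
    true_pos = true_pos + np.array([1.0, 0.5])              # object moves between frames
    z = true_pos + rng.normal(0.0, 1.0, size=2)             # noisy measurement

    particles += rng.normal(0.0, 1.5, size=(N, 2))          # step 2: prediction (random walk)
    weights = weights * cue_likelihood(particles, z)         # step 3: weight update
    weights /= weights.sum()                                 # step 4: normalisation

    estimate = weights @ particles                           # step 6: posterior mean
    idx = rng.choice(N, size=N, p=weights)                   # step 7: resampling
    particles, weights = particles[idx], np.full(N, 1.0 / N)

print("final estimate:", estimate, "true position:", true_pos)
```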
4. Conclusions and Open Issues for Future Research
Particle filtering is a technique that is very suitable for object tracking in video sequences. We have presented results for a single object using video sequences from a single fixed or moving camera. The tracking algorithm is based on colour and texture cues. There are several challenges in solving tracking problems in image/video applications. The first is the non-linear character of the object of interest and of the observation model. The algorithms must often run at high update rates. In many applications, the prior information available about the environment is limited. From the implementation point of view, this research domain is rich and challenging because of the need to overcome occlusions of the tracked entities over one or more frames and to deal with missing sensor data. How to handle clutter in the background is of considerable importance as well, especially with multiple targets. In the case of multiple sensors, the data have to be fused appropriately, and probabilistic data association techniques are then of primary importance. We aim to consider: i) detection of the object, i.e., the object has to be localised first in the image and continuously tracked afterwards; one of the biggest problems in motion-based tracking is losing the object due to rapid movements and then re-detecting the object of interest and following its movement; and ii) tracking rigid and non-rigid bodies in three dimensions with multiple, dynamically selected, static or moving cameras.
References [1] P. Pérez, J. Vermaak, A. Blake, Data Fusion for Tracking with Particles, Proc. IEEE, 92:3, 2004, 495513. [2] N. Gordon, D. Salmond and A. Smith, A Novel Approach to Non-linear / Non-Gaussian Bayesian State Estimation, IEE Proc. on Radar and Signal Processing, 40, 1993, 107-113. [3] A. Doucet, N. Freitas, N. Gordon, Eds., Sequential Monte Carlo Methods in Practice, New York: Springer-Verlag, 2001. [4] M. Arulampalam, S. Maskell, N. Gordon, T. Clapp, A Tutorial on Particle Filters for Online Non-linear/ Non-Gaussian Bayesian Tracking, IEEE Trans. Sign. Proc., 50: 2, 2002, 174-188. [5] J. Liu, Monte Carlo Strategies in Scientific Computing, Springer Verlag, 2001. [6] M. Isard and A. Blake, Contour Tracking by Stochastic Propagation of Conditional Density, European Conf. on Comp. Vis., Cambridge, UK, 1996, 343-356. [7] M. Isard, A. Blake, Condensation -- Conditional Density Propagation for Visual Tracking, Intl. Journal of Computer Vision, 28:1, 1998, 5-28. [8] C. Shen, A. van den Hengel, A. Dick, Probabilistic Multiple Cue Integration for Particle Filter Based Tracking, Proc. of the VIIth Digital Image Comp.: Techniques and Appl., 2003. [9] Y. Bar-Shalom, X.R. Li, Estimation and Tracking: Principles, Techniques and Software, Artech House, 1993. [10] H. Nait-Charif, S. McKenna, Tracking Poorly Modelled Motion Using Particle Filters with Iterated Likelihood Weighting, Proc. of Asian Conf. on Comp. Vis., 2003. [11] M. Sonka, V. Hlavac, R. Boyle, Image Processing, Analysis, and Machine Vision, IInd Edition., Brooks/ Cole Publ. Company, 1999. [12] S. McKenna, S. Jabri and S. Gong, Tracking Colour Objects Using Adaptive Mixture Models, Image and Vision Computing, 17:3-4, 1999, 225-231. [13] K. Nummiaro, E. Koller-Meierand, L. Van Gool, An Adaptive Color-Based Particle Filter, Image and Vision Comp., 21, 2003, 99-110. [14] D. Comaniciu, V. Ramesh, P. Meer, Real-Time Tracking of Non-Rigid Objects Using Mean Shift, Proc. of 1st Conf. Comp. Vision Pattern Recogn., 2000, 142-149. [15] F. Aherne, N. Thacker, P. Rockett, The Bhattacharyya Metric as an Absolute Similarity Measure for Frequentcy Coded Data, Kybernetika, 3: 4, 1997, 1-7.
[16] T. Kailath, The Divergence and Bhattacharyya Distance Measures in Signal Selection, IEEE Trans. on Communication Technology, COM-15:1, 1967, 52-60. [17] J. Konrad, in Handbook of Image and Video Processing, Academic Press, 2000, 207-225. [18] R. Porter, Texture Classification and Segmentation, PhD thesis, 1997, Univ. of Bristol. [19] P. Brasnett, L. Mihaylova, N. Canagarajah, D. Bull, Particle Filtering with Multiple Cues for Object Tracking in Video Sequences, Proc. of SPIE's Annual Symp. EI ST, 5685, 2005. [20] P. Brasnett, L. Mihaylova, N. Canagarajah, D. Bull, Sequential Monte Carlo Tracking by Fusing Multiple Cues in Video Sequences, IEEE Trans. on Image Proc., submitted, 2005.
Advances and Challenges in Multisensor Data and Information Processing E. Lefebvre (Ed.) IOS Press, 2007 © 2007 IOS Press. All rights reserved.
Wavelets, Segmentation, Pixel- and Region- Based Image Fusion J. J. LEWIS1,2, R. J. O’CALLAGHAN, S. G. NIKOLOV, D. R. BULL and C. N. CANAGARAJAH The Centre for Communications Research, University of Bristol, UK
Abstract. Wavelets and segmentation, the enabling technologies for pixel- and region-based fusion, are introduced. Pixel-based fusion using the Discrete Wavelet Transform (DWT) and the Dual-Tree Complex Wavelet Transform (DT-CWT) is discussed and compared with a region-based fusion method using the DT-CWT. Rather than performing fusion pixel by pixel, segmentation is used to produce a set of regions representing features in the image, and fusion is performed on a region-by-region basis. The DT-CWT is found to outperform the DWT. Region-based fusion gives results comparable to pixel-based DT-CWT fusion and has a number of advantages over the pixel-based methods, such as more intelligent fusion rules and the ability to manipulate regions with certain properties.

Keywords. Image Fusion, Discrete Wavelets, Complex Wavelets
Introduction

Image fusion usually makes use of either redundant or complementary information in two or more images to produce a single image with improved accuracy or reduced uncertainty, or to produce an image with more information than any of the input images [1]. Thus, data collected from different modalities, at different times or frame rates, or using sensors in different locations can be fused to produce a single image. Image fusion is defined in [2] as “the process by which several images or some of their features are combined together to form a single image.” Traditionally, fusion is performed at one of four levels of abstraction: signal-level, pixel-level, feature-level and object-level. The majority of image fusion methods implemented to date are pixel-level methods, where fusion is performed on a pixel-by-pixel basis depending on the information contained at that pixel or in an arbitrary window around that pixel. These methods range from the simple, such as averaging, to the more complicated, where the images are first transformed (e.g., using pyramids [3-5] or wavelets [2, 6-10]) and fusion is performed in these domains. More recent fusion work has included feature-level algorithms such as region-based fusion [11-14]. Wavelet transforms are a key technology successfully used in fusion, and segmentation is required for region-based fusion. These are discussed in some detail, and results from fusion using both pixel- and region-based algorithms are presented.
1 Corresponding Author: J. J. Lewis, The Centre for Communications Research, University of Bristol, Bristol, BS8 1UB, UK; Tel.: +44 117 331 5073; E-mail: [email protected].
2 This work is funded by the UK MOD Data and Information Fusion Defence Technology Center.
1. Wavelets

Wavelets perform multiresolution analysis by carrying out a series of sub-band filtering operations on a signal and decomposing it into sums of basis functions. They are similar to Fourier decompositions, but while Fourier basis functions are localized only in frequency (i.e., a change in the Fourier domain produces changes throughout the time domain), wavelets are local in both time and frequency [15]. At each scale, high- and low-pass filters split the signal into detail and approximation signals. Wavelets are derived from a basis function called the mother wavelet.

1.1. The Discrete Wavelet Transform (DWT)

The Discrete Wavelet Transform (DWT) is an orthogonal one-dimensional wavelet transform applied in two dimensions by filtering and down-sampling along columns and then rows. An example of DWT coefficients is given in Figure 1. At each scale there are four sub-bands: High-High (HH), High-Low (HL), Low-High (LH) and Low-Low (LL). The LH sub-band is sensitive to vertical frequencies, the HL sub-band to horizontal frequencies and the HH sub-band to diagonal (45°) frequencies. The DWT can be applied recursively to the LL sub-band to achieve multiresolution analysis. Provided that the filters used are orthogonal (e.g., the Daubechies mother wavelet), a related set of synthesis filters will exist to perfectly reconstruct the original image [2]. The DWT has been found to have a number of advantages over other multiresolution schemes such as pyramids:
(a) Compact representation: the wavelet transform is the size of the original image, which is a more compact representation than a pyramid [7];
(b) Directional selectivity: the wavelet transform provides directional information on the image, while pyramids do not contain any spatial orientation selectivity in the decomposition process [7];
(c) Fewer artifacts: pyramid-based fused images often contain blocking artifacts, which do not occur in wavelet-fused images [7];
(d) Better SNR: images fused using wavelets have better Signal-to-Noise Ratios (SNR) than images fused using pyramids [4];
(e) Perceived quality: wavelet-based fused images are better perceived than pyramid-based fused images when compared using human analysis [4, 7].

1.2. The Dual-Tree Complex Wavelet Transform

There are two main problems with the DWT [16]. First, the DWT is not shift invariant, due to the sub-sampling at each scale: small shifts in the input signal can cause large changes in the energy across the sub-bands at different levels. Shift invariance within wavelet transform image fusion is essential for the effective comparison of coefficient magnitudes by the fusion rule. The Shift Invariant DWT (SIDWT) [6] was developed to improve shift invariance by removing all sub-sampling, which results in a very over-complete representation. The Dual-Tree CWT (DT-CWT) [16] overcomes this problem by not decimating at the first level of filtering and producing two fully decimated trees from the odd and even samples produced at the first level. Second, the DWT has limited directional selectivity; the DT-CWT improves on this, as complex wavelets are able to distinguish between positive and negative orientations, and six distinct sub-bands are produced at ±15°, ±45° and ±75°. Qualitative and quantitative experiments in [2, 8] show that the DT-CWT outperforms methods such as the DWT and SIDWT. All DWT schemes suffer from ringing; the DT-CWT shows fewer ringing errors and better preserves subtle details, but at increased computational cost.
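The sub-band structure described above can be reproduced with standard tools. The following minimal sketch uses the PyWavelets library (an assumption; the library is not used or cited in the paper) to compute a separable DWT of a synthetic image; the wavelet choice and test image are illustrative only.

```python
# Minimal sketch: separable 2-D DWT with PyWavelets (illustrative, not the authors' code).
import numpy as np
import pywt

# Synthetic 128x128 test image: a bright square on a dark background.
image = np.zeros((128, 128), dtype=float)
image[32:96, 32:96] = 1.0

# One level of the DWT: cA is the LL approximation; (cH, cV, cD) are the
# horizontal, vertical and diagonal detail sub-bands (the LH/HL/HH bands of
# the text, up to labelling convention).
cA, (cH, cV, cD) = pywt.dwt2(image, 'db2')

# Recursive application to the LL band gives the multiresolution pyramid:
# wavedec2 returns [LL_n, (details at level n), ..., (details at level 1)].
coeffs = pywt.wavedec2(image, 'db2', level=2)

# Because 'db2' is orthogonal, a matching synthesis bank reconstructs the image.
reconstructed = pywt.waverec2(coeffs, 'db2')
print(np.allclose(reconstructed[:128, :128], image))
```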
Figure 1. Wavelet decompositions: (a) original image; (b) DWT coefficients; (c) DT-CWT coefficients.
2. Segmentation

Segmentation is a key step in many computer vision tasks (e.g., tracking, classification, object-based coding and region-based fusion). A plethora of papers exists, including some review chapters and papers such as [15, 17, 18]. Segmentation is defined by [17] as the process of partitioning the image into a set of non-intersecting regions such that each region is homogeneous and the union of no two adjacent regions is homogeneous.
Approaches to segmentation can generally be divided into several families of methods, including edge-based, region-growing and model-based approaches. A set of segmented images can then be fused region by region. The quality of the segmentation is important for producing good fused images; ideally the segmentation should have the following properties: the image is segmented into a set of closed, connected regions; each feature in the image is represented by a single region; and as few regions as possible are created, since more regions take longer to fuse.

2.1. The Combined Morphological-Spectral Unsupervised Image Segmentation (CoMSUIS) Algorithm

The Combined Morphological-Spectral Unsupervised Image Segmentation (CoMSUIS) algorithm [19] has been found to compare well with existing algorithms. It groups areas of similar intensity and/or texture into separate regions. Texture information can be modelled as the superposition of oscillating components at characteristic scales and orientations, and textural information is extracted from the sub-bands of the DT-CWT. These sub-bands are more efficient than other techniques such as two-dimensional Gabor filters, giving similar accuracy with a compact representation and a complete transform [20]. A perceptual gradient function is derived from the intensity and texture information; larger gradients indicate possible edge locations (e.g., Figure 2(b)). A region-based method, the watershed transform (described in [15]), is used to produce the initial segmentation. However, it tends to over-segment, so this initial segmentation is further processed with a spectral clustering algorithm to reduce the number of regions. Regions representing the same feature are grouped together by globally optimizing a cost function: the initial regions are used to construct a graph representation of the image, which is then processed by the spectral clustering algorithm.

2.2. Joint and Unimodal Segmentation

Traditionally, information from a single image produces a single segmentation map. This is called unimodal segmentation. However, fusion tasks usually deal with a set of two or more images, and a weak region in one image may correspond to a strong region in another. There is an advantage in using information from all images of a scene to
produce a single segmentation map for all images in the set. This process is called joint segmentation and is introduced in [12]. In general, jointly segmented images work better for fusion, as the segmentation map contains the minimum number of regions needed to represent all the features in the scene most efficiently. With separately segmented images, where different images show features differently, a problem occurs where regions partially overlap. If the overlapped region is dealt with incorrectly, artifacts will be introduced, and the extra regions created to deal with the overlap will increase the time taken to fuse the images. Joint segmentation can overcome some of the problems of noise and other inaccuracies in an image to produce a more reliable segmentation. However, if the information from the segmentation process is going to be used to register the images, or if the input images are very different, separate segmentations of the images are needed. The effects of segmenting the images in different ways are shown in Figure 2. In particular, the inefficient union of the two unimodal segmentation maps, which is necessary in order to fuse the images, is shown in Figure 2(c).

Figure 2. Segmentation methods. Segmentation of textures: (a) original texture; (b) gradient image; (c) initial segmentation; (d) final segmentation. Joint and unimodal segmentations: (a) unimodal segmentation of the visible image; (b) unimodal segmentation of the IR image; (c) union of both unimodal segmentations; (d) joint segmentation.
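As a rough, simplified stand-in for the segmentation stage, the following sketch runs a marker-based watershed on an intensity gradient with scikit-image. It omits the DT-CWT texture features and the spectral-clustering merge step of CoMSUIS; the library, threshold and synthetic input are assumptions made only to keep the sketch runnable.

```python
# Simplified segmentation sketch: watershed on an intensity gradient.
# Only a stand-in for CoMSUIS; texture sub-bands and the spectral clustering
# used to merge over-segmented regions are omitted.
import numpy as np
from scipy import ndimage as ndi
from skimage.filters import sobel
from skimage.segmentation import watershed

# Placeholder input: smoothed random field standing in for a real image.
image = ndi.gaussian_filter(np.random.rand(128, 128), sigma=4)
gradient = sobel(image)                    # stand-in for the perceptual gradient

# Markers: connected low-gradient areas act as region seeds.
markers, _ = ndi.label(gradient < 0.01)    # threshold is an illustrative choice

labels = watershed(gradient, markers)      # initial (typically over-segmented) map
print('number of regions:', labels.max())
```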
3. Fusion

3.1 Pixel Based Image Fusion

Consider N registered input images, I₁, I₂, …, I_N. Multiresolution fusion methods involve transforming these registered images from normal image space into another domain by applying the transform ω, fusing using some rules F, and then performing the inverse transform ω⁻¹ to reconstruct the fused image I [2]:

I = ω⁻¹(F(ω(I₁), ω(I₂), …, ω(I_N)))   (1)
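Eq. (1) can be illustrated with a small self-contained sketch. The example below uses the separable DWT from PyWavelets and the simple choose-maximum rule discussed in Section 3.1, in place of the DT-CWT preferred by the paper; the library, wavelet and toy inputs are assumptions made only to keep the example runnable.

```python
# Sketch of Eq. (1): transform, fuse coefficients, inverse transform.
# Uses the separable DWT with a choose-maximum rule as a stand-in for the DT-CWT.
import numpy as np
import pywt

def fuse_pixel_level(i1, i2, wavelet='db2', levels=4):
    c1 = pywt.wavedec2(i1, wavelet, level=levels)
    c2 = pywt.wavedec2(i2, wavelet, level=levels)
    fused = [0.5 * (c1[0] + c2[0])]            # average the coarse LL approximation
    for d1, d2 in zip(c1[1:], c2[1:]):          # detail sub-bands at each scale
        fused.append(tuple(np.where(np.abs(a) >= np.abs(b), a, b)
                           for a, b in zip(d1, d2)))
    return pywt.waverec2(fused, wavelet)

# Two toy "registered" inputs standing in for the visible and IR images.
vis = np.random.rand(256, 256)
ir = np.random.rand(256, 256)
fused_image = fuse_pixel_level(vis, ir)
```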
Figure 3. Fusion methods: (a) pixel-based fusion with the DT-CWT (input images → complex wavelet coefficients → fusion rule → fused wavelet coefficients → inverse DT-CWT → fused image); (b) region-based fusion with the DT-CWT (the same pipeline, with a joint/separate segmentation driving the region fusion rule).
Figure 3(a) shows pixel-based fusion with the DT-CWT. As wavelets tend to pick out the salient features of an image (such as corners and edges), wavelet coefficients with larger magnitudes contain more information about the features in an image. Thus, a choose-maximum scheme, picking the larger absolute wavelet coefficient at each pixel, gives good results. More complex fusion rules have been proposed, such as combining coefficients as a weighted average [5] based on a local activity measure in the sub-bands of the images. In an area-based selection rule with consistency verification [7], the decision at each pixel is made based on which image has the higher activity in a small arbitrary window centred on the pixel; if the activity of pixels from different images is similar, an average can be used, and finally a consistency check is made. These methods give some improvement in the quality of the fused image, especially for DWT fusion. They can be thought of as a step towards region-based fusion, but the arbitrary regions used here bear no relation to the features in the image.

3.2 Region Based Image Fusion

The majority of applications of a fusion scheme are interested in features within the image, not in the actual pixels. Therefore, it seems reasonable to incorporate feature information into the fusion process [11]. There are a number of perceived advantages of this, including:
• Intelligent fusion rules: fusion rules are based on combining the regions of an image. Thus, more useful tests for choosing between the regions, based on various properties of a region, can be implemented;
• Highlighting features: regions with certain properties can be either accentuated or attenuated in the fused image depending on a variety of the region's characteristics;
• Reduced sensitivity to noise: processing semantic regions rather than individual pixels or arbitrary regions can help overcome some of the problems with pixel-fusion methods, such as sensitivity to noise, blurring effects and mis-registration.
A number of region-based fusion schemes have been proposed, for example [11-14]. These initially transform pre-registered images using a multiresolution (MR) transform. Regions representing image features are then extracted from the transform coefficients. A grey-level clustering using a generalized pyramid linking method is used for segmentation in [11]. The regions are then fused based on a simple region property such as average activity. These methods do not take full advantage of the wealth of information that can be calculated for each region.
A region-based fusion algorithm, initially proposed in [12], is briefly described here and shown in Figure 4. Initially, the registered input images are transformed
with the DT-CWT, and the high-pass coefficients together with the original image are passed to the CoMSUIS algorithm to produce a set of corresponding segmentation maps. Joint or unimodal segmentation can be used, but if unimodal segmentation is used, the union of both segmentation maps is used in the fusion process. The segmentation map is then down-sampled, giving priority to smaller regions, so that a segmentation map is available at each level of the wavelet coefficients. A priority value that determines whether a given region in an input image should be included in the fused image is calculated for all regions in all input images. Thus, a priority map is generated for each image in the wavelet domain. Priority can be calculated from some property of a region, such as a statistical measure (e.g., activity, variance or entropy) or the size, shape or spatial position of a region. Examples of priority maps, using variance to calculate priority, are shown in Figure 4(a) and 4(b). Fusion decisions can now be made region by region based on the priority maps. Possible fusion rules include weighted averages of regions or a “choose maximum region” scheme. Intuitively, weighted averages based on the priority maps should produce good results; however, the averaging effect is detrimental to the quality of the fused image, and the choose-maximum scheme gives better results. Figure 4(c) shows the fusion decision: black regions are taken from the IR image, while grey regions are taken from the visible image. The wavelet coefficients are combined based on this mask and the fused image is reconstructed with the inverse DT-CWT. One of the main advantages of region-based fusion is that, as we are dealing with regions representing actual features in the images, the regions can be manipulated to improve the fused image for an end user. Based on some property of the region, or on manual or automatic classification, a region can easily be attenuated or highlighted to change its influence in the fused image.

Figure 4. Choosing the regions for the fused images: (a) priorities for the visible image; (b) priorities for the IR image; (c) mask.
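Under simplified assumptions, the per-region priority and the choose-maximum-region decision described above can be sketched as follows. Variance is used as the priority (as in Figure 4); the mask is computed here in the image domain rather than on DT-CWT coefficients, and the toy label map and inputs are illustrative only.

```python
# Sketch: region priorities from a joint segmentation map and a
# choose-maximum-region fusion mask (simplified).
import numpy as np
from scipy import ndimage as ndi

def region_priorities(image, labels):
    """Variance of each labelled region, used as its priority."""
    ids = np.arange(1, labels.max() + 1)
    return dict(zip(ids, ndi.variance(image, labels, index=ids)))

def fusion_mask(img_a, img_b, labels):
    """True where image A wins a region, False where image B wins."""
    pa, pb = region_priorities(img_a, labels), region_priorities(img_b, labels)
    mask = np.zeros_like(labels, dtype=bool)
    for region_id in pa:
        mask[labels == region_id] = pa[region_id] >= pb[region_id]
    return mask

# Toy inputs: two registered images and a shared (joint) segmentation map of 16 blocks.
visible = np.random.rand(64, 64)
infrared = np.random.rand(64, 64)
labels = np.arange(64)[:, None] // 16 * 4 + np.arange(64)[None, :] // 16 + 1
mask = fusion_mask(visible, infrared, labels)
```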
Figure 5. Image fusion: (a) DWT pixel fused image; (b) DT-CWT pixel fused image; (c) DT-CWT region fused image.
4. Results and Discussion

The IR and visible images shown in Figure 5 were fused with three methods: DWT pixel-based; DT-CWT pixel-based; and DT-CWT region-based. Four levels of decomposition and a choose-maximum fusion rule are used with all methods, with joint segmentation for region-based fusion. The results are given in Figure 6. The DWT has a number of artifacts, including ringing, particularly around edges with large contrast changes. These artifacts are much less obvious in the DT-CWT fusion. The region-based fused image has improved contrast over the pixel-based methods, as pixel-based fusion techniques tend to cause some averaging between the images and the visible image is very dark. However, some detail from the visible image is lost in the region-fused image, for example the café windows. This is because the segmentation (see Figure 2(d)) has not picked up some detail in the background and, as the background is taken from the IR image, this detail from the visible image is lost. Figure 6 shows an example of how regions can easily be manipulated to improve the fused result. In this situation, we define a problem where it is more important to spot a figure that is closer to the road. The distance between the centres of mass of the region representing the road and of the figure is calculated, and the coefficients of the figure are weighted inversely proportionally to the distance from the road. For this experiment the road was manually selected and the figure was detected by thresholding the IR image. These images are jointly segmented and fused using an entropy priority. The figure is seen to get brighter as he moves closer to the road. While this is a relatively trivial example, it is a worthwhile exercise showing some advantages of region-based fusion.
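A hedged sketch of the highlighting step just described: centres of mass of the "figure" and "road" regions are computed from the segmentation map, and the figure's coefficients are scaled by a weight that grows as the distance to the road shrinks. The region identifiers, the inverse-distance law and the gain are illustrative assumptions, not the authors' choices.

```python
# Sketch of region highlighting: weight a region's coefficients by its
# closeness to a reference region (the road/person example, simplified).
import numpy as np
from scipy import ndimage as ndi

def closeness_weight(labels, figure_id, road_id):
    cf = np.array(ndi.center_of_mass(labels == figure_id))
    cr = np.array(ndi.center_of_mass(labels == road_id))
    distance = np.linalg.norm(cf - cr)
    return 1.0 / (1.0 + distance)           # inverse-distance law (an assumption)

def highlight_region(coeffs, labels, figure_id, road_id, gain=50.0):
    weight = 1.0 + gain * closeness_weight(labels, figure_id, road_id)
    out = coeffs.copy()
    out[labels == figure_id] *= weight       # brighter as the figure nears the road
    return out
```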
Figure 6. Person in IR highlighted depending on closeness to the road.³
³ The original IR and visible images are kindly supplied by Alexander Toet of the TNO Human Factors Research Institute and are available online at www.imagefusion.org.
5. Conclusions

This paper has introduced the topics of wavelet transforms and segmentation: two key technologies for many fusion applications. The advantages of the DT-CWT over the DWT
and of wavelets over other transforms have been discussed, and the DT-CWT has been shown to outperform other methods for pixel-based fusion. The CoMSUIS segmentation algorithm was described and used in a region-based fusion algorithm. The region-based DT-CWT fusion has been shown to produce fused images of similar quality to pixel-based fusion. The main advantage of region-based fusion is the ability to use more intelligent, higher-level fusion rules; however, this comes at a cost in complexity.
References
[1] Abidi, M. and R. Gonzalez, Data Fusion in Robotics and Machine Intelligence. 1992, USA: Academic Press.
[2] Nikolov, S.G., et al., Wavelets for Image Fusion, in Wavelets in Signal and Image Analysis, A. Petrosian and F. Meyer, Editors. 2001: Dordrecht, The Netherlands. p. 213-244.
[3] Toet, A., J. van Ruyven, and J. Valeton, Merging thermal and visual images by contrast pyramids. Optical Engineering, 1989. 28(7): p. 789-792.
[4] Wilson, T., S. Rogers, and M. Kabrisky, Perceptual based hyperspectral image fusion using multispectral analysis. Optical Engineering, 1995. 34(11): p. 3154-3164.
[5] Burt, P. and R. Kolczynski, Enhanced image capture through fusion, in Proceedings of the 4th International Conference on Computer Vision. 1993. p. 173-182.
[6] Rockinger, O., Image Sequence Fusion Using a Shift-Invariant Wavelet Transform, in Proceedings of the IEEE International Conference on Image Processing. 1997. p. 288-291.
[7] Li, H., S. Manjunath, and S. Mitra, Multisensor Image Fusion Using the Wavelet Transform. Graphical Models and Image Processing, 1995. 57(3): p. 235-245.
[8] Hill, P.R., Wavelet Based Texture Analysis and Segmentation for Image Retrieval and Fusion. PhD thesis, Department of Electrical and Electronic Engineering, University of Bristol, Bristol, UK, 2002.
[9] Chipman, L., T. Orr, and L. Graham, Wavelets and Image Fusion, in Wavelet Applications in Signal and Image Processing III. 1995. p. 208-219.
[10] Rockinger, O., Pixel-Level Fusion of Image Sequences using Wavelet Frames, in Proceedings of the 16th Leeds Applied Shape Research Workshop. 1996.
[11] Piella, G., A general framework for multiresolution image fusion: from pixels to regions. Information Fusion, 2003. 4: p. 259-280.
[12] Lewis, J.J., et al., Region-Based Image Fusion Using Complex Wavelets, in The Seventh International Conference on Image Fusion. 2004. Stockholm.
[13] Matuszewski, B., L.-K. Shark, and M. Varley, Region-based wavelet fusion of ultrasonic, radiographic and shearographic non-destructive testing images, in Proceedings of the 15th World Conference on Non-Destructive Testing. October 2000: Rome.
[14] Zhang, Z. and R. Blum, Region-Based Image Fusion Scheme for Concealed Weapon Detection, in Proceedings of the 31st Annual Conference on Information Sciences and Systems. March 1997.
[15] Sonka, M., V. Hlavac, and R. Boyle, Image Processing, Analysis, and Machine Vision. 1999, Brooks/Cole Publishing Company: USA.
[16] Kingsbury, N., The dual-tree complex wavelet transform: a new technique for shift invariance and directional filters, in IEEE Digital Signal Processing Workshop. 1998.
[17] Pal, N.R. and S.K. Pal, A review on image segmentation techniques. Pattern Recognition, 1993. 26(9): p. 1277-1294.
[18] Cheng, H.D., et al., Color image segmentation: Advances and prospects. Pattern Recognition, 2001. 34(12): p. 2259-2281.
[19] O'Callaghan, R. and D.R. Bull, Combined Morphological-Spectral Unsupervised Image Segmentation. IEEE Transactions on Image Processing, 2005. 14(1): p. 49-62.
[20] Hill, P.R., C.N. Canagarajah, and D.R. Bull, Image segmentation using a texture gradient based watershed transform. IEEE Transactions on Image Processing, 2003. 12(12): p. 1618-1633.
Data Fusion and Quality Assessment of Fusion Products: Methods and Examples
Paolo CORNA a, Lorella FATONE b and Francesco ZIRILLI c,1
a Via Silvio Pellico 4, 20030 Seveso (MI), Italy, e-mail: [email protected]
b Dipartimento di Matematica Pura ed Applicata, Università di Modena e Reggio Emilia, Via Campi 213/b, 41100 Modena (MO), Italy, e-mail: [email protected]
c Dipartimento di Matematica “G. Castelnuovo”, Università di Roma “La Sapienza”, Piazzale Aldo Moro 2, 00185 Roma, Italy, e-mail: [email protected]
1 Corresponding author.
Abstract. In this paper we present ideas about image fusion methods based on the use of partial differential equations (PDE). These ideas have been translated into several mathematical models of fusion procedures. These models have been approached with appropriate numerical algorithms and tested on real data (e.g., ERS-SAR and SPOT data). Moreover we introduce quality assessment techniques for fusion products, i.e., quantitative procedures able to measure the quality of the fused images compared with the quality of the originals or other images. Some numerical results, obtained from these quality assessment techniques during tests on real data, are shown. Keywords. image fusion, calculus of variations, non linear optimization, quality assessment of fusion products
Introduction

Data fusion covers a very wide domain, making it difficult to define precisely. In the last decades several definitions of data fusion have been proposed in the literature. Hall and Llinas [1] give the following definition: “data fusion techniques combine data from multiple sensors, and related information from associated databases, to achieve improved accuracy and more specific inferences than could be achieved by the use of a single sensor alone”. The Open Geographic Information Systems (GIS) Consortium defines fusion as “the process of organizing, merging and linking disparate information elements (i.e., map features, images, video and so on) to produce a consistent and understandable representation of an actual or hypothetical set of objects and/or events in space and time”. According to Wald [1], data fusion is a “formal framework in which are expressed means and tools for the alliance of data of the same scene originating from different sources. It aims at obtaining information of greater quality; the exact definition of greater quality will depend upon the application”. More specifically, image fusion focuses on the combination of images rather than the more general process of combining data. The information obtained from fused images generally enhances the information obtained from the originals. Moreover we note that
a secondary purpose of data fusion may be saving memory space when storing or transmitting data relative to a scene.

One of the most significant and straightforward examples able to illustrate the advantages and benefits of data fusion is human vision. The two eyes extend the ability of a single eye; in fact they have a slightly different viewing angle that makes stereo vision and depth perception possible. Moreover, if one eye is disabled, vision is still possible, although in a degraded mode. This is what, in the fusion framework, is called exploitation of redundancy. With human vision, the image fusion process is carried out by the brain, while for digital data, it is carried out through numerical algorithms.

The definitions mentioned above are concerned with fusion methods and information quality. We note that in this context quality does not have a very specific meaning. It is a generic word denoting that the information available is more satisfactory for the “customer” after the fusion process is performed than before it is performed. The problem of giving a quantitative meaning to statements about the quality of information contained in images resulting from fusion processes is called quality assessment of fusion products.

Data fusion applications are numerous; we mention only two of them: remote sensing applications in earth observation and medical imaging. In the first application, see for example [2], [3], sensors travelling onboard satellites or airplanes provide repeated coverage of the earth’s surface on a regular basis and furnish a large amount of data that can be of great interest for earth resource assessment and environment monitoring. For a proper exploitation of these data, it is mandatory to develop effective data fusion techniques able to take advantage of the multisource and multitemporal characteristics of the available data. In the second application, the medical framework, see for example [4], non-invasive imaging technologies provide a unique window on the anatomy, physiology and functioning of living organisms. In this specific case one interesting goal of a fusion procedure is the fusion of anatomical and functional images to allow improved spatial localization of abnormalities.

In detail, multisensor image data are observations of a given scene acquired by different sensors; they are functions of the parameters that define unknown objects contained in the observed scene. The extraction of objects or object parameters from image data of a single sensor is an inverse problem. Therefore, the extraction of objects or object parameters from multisensor image data in a data fusion procedure corresponds to the joint solution of several inverse problems.

There are several fusion approaches that can take place at the signal, pixel, feature or symbolic level of representation (see Figure 1). Signal-level fusion refers to the combination of signals from different sensors before the production of images. Pixel-level fusion consists of merging information from different digital images on a pixel-by-pixel basis. Feature-based fusion merges the different data sources at the intermediate level; we speak of feature-level fusion when features extracted from different images are merged. Finally, symbolic-level fusion refers to the combination of information obtained from images at a higher level of abstraction. This last type of fusion is possible even when images come from very dissimilar sensors.
Despite this classification, in several application fields, e.g., in earth observation, a fusion approach can deal simultaneously with more than one of these levels. Many approaches to multisensor data fusion that implicitly or explicitly deal with uncertainty are based on a variety of tools such as artificial neural networks, Markov random fields, Bayes networks, wavelet transforms, Dempster-Shafer methods, and fuzzy logic, as well as combinations of several of these techniques. For an extensive
review of these techniques in data fusion we refer the interested reader to [5], [6] and the references quoted there. In recent years new techniques to process images based on Partial Differential Equations (PDE) have been proposed, see for example [7], [8], [9], [10] and the references therein. The use of PDE in image processing was originally introduced in the context of computer vision and robotics. These PDE based techniques are of practical interest due to the availability of numerical algorithms to solve PDE and the ability of today’s computers to quickly solve discretized PDE involving thousands or even millions of independent variables coming from the discretization procedure. Note that often each pixel in an image is associated with one independent variable of the discretized PDE.

While data fusion is becoming a mature research field both in engineering and applied mathematics, the problem of quantitative quality assessment of fusion products remains in the pioneering stage. In this paper we discuss fusion procedures that can be classified as feature-level fusion procedures and the problem of quality assessment of fusion products. In particular, the image fusion problem has been translated into several different mathematical problems involving PDE, and these models have been solved with appropriate numerical algorithms and tested on real data (e.g., ERS-SAR and SPOT data of the earth’s surface). Note that we always assume that the images to be fused refer to the same scene and are coregistered. Moreover, in this paper we present algorithms for the quality assessment of fusion products. That is, we give a quantitative basis (i.e., we define numerical performance indices) to explain in which sense we believe that “the quality of the information contained in the fused images is higher than the quality of the information provided by the original images considered one by one”. As well as comparing the “fused” images with the originals, we compare the “fused” images corresponding to different fusion procedures with each other.

The general idea that we propose for the problem of quality assessment of fusion products is the use of automatic recognition techniques together with a multiscale resolution analysis of the images. Automatic recognition techniques (e.g., the Hough transform) are used to detect simple features in an image (e.g., straight lines, circles, ellipses), and a multiscale algorithm is used to decompose an image into subimages containing only simple features. These ideas are tested on fusion products obtained from the fusion of ERS-SAR and SPOT data. Finally, we compare fused images coming from the fusion of ERS-SAR and SPOT data of a given scene with a high resolution optical image of the same scene (IRS-1C data). We call this last image “ground truth”; for our purposes, it establishes a conclusive criterion to perform the quality assessment of the fusion products obtained and to test the validity of the results obtained with the performance indices.

The paper is organized as follows: in Section 2 we present several image fusion techniques based on PDE. In Section 3 we show results obtained using these techniques on the fusion of SAR (ERS-SAR data) and optical images (SPOT data) relative to the same scene on the earth’s surface. In Section 4 we suggest quantitative methods to measure the quality improvement of the fused images compared with the quality of the originals and each other.
Finally we present some numerical results on the quality assessment of SAR/optical image fusion products and compare them with the results obtained using the “ground truth” data.
Figure 1. Fusion approaches: (a) signal-level fusion; (b) pixel-level fusion; (c) feature-level fusion; (d) symbolic-level fusion.
1. The Use of PDE in Image Fusion

The fusion procedures that we present are based on the idea of associating a “structure” to the images to be fused. Images referring to the same scene are supposed to have the same structure. Fusing two images consists of minimizing the difference between the structures of the images to be fused, subject to the constraints posed by the data. Note that the fusion of more than two images can be treated with simple generalizations of the methods described below. We limit our attention to images comprising a few subregions within which the image is smooth, separated by boundaries across which the image changes abruptly. Note that not all images satisfy these assumptions. For example, images of objects with complicated textures or fragmented structures, such as a canopy of leaves, may not satisfy these assumptions. However, we believe that the piecewise smooth or piecewise constant model of the image that is at the foundation of the image segmentation and fusion algorithms discussed in this paper is a good model for many applications of potential interest to end users. For example we have in mind the
classification of agricultural fields and urban areas starting from SAR and optical images (see Section 3). Given two images, the fusion algorithms we propose (see [11], [12], [13], [14]) perform the following functions: x segmentation and denoising; x fusion. First let us separately examine these two functions. 1.1. Segmentation and Denoising of Images: PDE Based Filters The goal is to decompose a given noisy image of the type described above into piecewise smooth regions bounded by contours where the image intensity is allowed to change abruptly. This is called image segmentation, or more precisely due to the presence of noise in the images, image segmentation and denoising. Let R be a rectangle. An image can be seen as a function g(x,y), (x,y) R. The variables (x,y) can be real variables or discrete variables taking values from a discrete set. The processing of images using PDE is based on the idea that the images with discrete values of the independent variables (x,y) (i.e., images made by pixels) can be considered approximations of images where the independent variables (x,y) are real variables. If we consider black and white images, we can assume g to be a real valued function, so that g(x,y) is a measure of the brightness of the image in the location (x,y). In several applications, where digital images are concerned, the dependent variable g takes values from a discrete set. The use of PDE in the processing of these images is based on the assumption that the digital image g can be seen as an approximation of an image where g takes real values. More general situations where g is a complex variable (e.g., SAR images) or a vector (e.g., colour images) can be treated with simple generalizations of the methods that follow. The process of measuring the brightness of an image is always affected by noise so that the measured image will always be a noisy ~ (x,y), (x,y) R the noisy image measured corresponding image. Let us indicate with g to the (ideal) image g. Note that different types of images measure different physical properties of the underlying scene and are affected by different types of noise due to the different characteristics of the instruments used to measure the images. Let us assume that the (ideal) image g(x,y), (x,y) R is a piecewise smooth or a piecewise constant function, i.e., there exists a finite number of subsets Ri, i=1,2,..., n, of
R
Ri R j
such
that
^Ri `in 1 n
if i z j and i 1 Ri
is
a
partition
of
R
(i.e.,
R , see Figure 2) and the function g(x,y) on
each set Ri, i=1,2,..., n, is a smooth function or a constant function and changes rapidly or even discontinuously across the boundaries delimiting the sets Ri, i=1,2,..., n. Let us denote with ī the union of the parts of the boundaries of Ri, i=1,2,..., n, that do not belong to the boundary of R, which we denote with wR, see Figure 2. We call ī the structure of the image g. Let |ī | denote the total length of ī and g R ( x, y ) denote the i
restriction of g(x,y) to Ri, i=1,2,..., n.
282
P. Corna et al. / Data Fusion and Quality Assessment of Fusion Products
Figure 2. The rectangle R; ī = dashed line n~
^~ `
~ * be the partition of R and the structure associated with g~ ~ (x,y), (x,y) respectively. Due to the presence of noise in the measurement process, g ~ R, is such that the relation between its structure * and ī and the relation between the ~ ~ ( x, y ) , i=1,2..., n~ , and the functions g ( x, y ), (x,y) R , i=1,2,..., functions g Let Ri
i 1
and
Ri
Ri
n, are not easily determined. Image segmentation and denoising is a numerical procedure that from the ~ (x,y), (x,y) R, recovers, as much as possible, g(x,y), (x,y) R, and knowledge of g in particular recovers the structure ī of g(x,y), (x,y) R. To successfully perform the image segmentation procedures suggested later, it is necessary that the partition of R,
^Ri `in 1 , that defines the structure of g(x,y), (x,y) R is a relatively “simple” one, i.e., it is a partition made of a few “easy” pieces so that ī is the union of a few elementary curves. The PDE based filters for image segmentation and denoising make use of PDE in two different ways: x solving a calculus of variation problem; x solving an initial value problem for an evolution equation. We refer to [9] for a sample of the type 1 approach and to [7] for a sample of the type 2 approach. The references [9], [7] are taken from the mathematical literature and deal mainly with the methodological aspects of using PDE in image processing. The engineering literature on this subject is vast and a very small sample of it can be found in [8], [10]. Type 1 filters transform the problem of image segmentation and denoising, i.e., the ~ ( x, y ) , (x,y) R with a piecewise problem of approximating the measured image g smooth (constant) function h(x,y), (x,y) R, in a problem of optimal approximation.
^ `
Let Rh ,i
nh
i 1
and *h be respectively the partition of R and the structure associated
with a function h. Note that h is a function smooth on each Rh ,i , i=1,2,..., n h , that is discontinuous (changes abruptly) across *h . We consider the following functional:
E (h, *h ) D
~ 2 dx dy + E
³ (h g ) R
³
R \ *h
|| h || 2 dx dy + F | *h | ,
(1)
where ||(·)|| is the Euclidean norm of the gradient of the function and D , E , F are positive constants used to control the scale of the segmentation and the smoothing effect. In particular the three addenda appearing in (1) have the following meaning: the
P. Corna et al. / Data Fusion and Quality Assessment of Fusion Products
283
~ , the second one is a first one is a measure of how much h approximates the data g measure of how much the function h differs from a constant function on each component Rh ,i , i=1,2,... , nh , of the partition of R associated with h and the third one is a measure of how complicated the partition of the rectangle R associated with h is. Note that since the decomposition of the rectangle R is unknown, the curve *h is an argument of the functional E. The optimal approximation problem to be considered is the following:
min E (h, *h ).
(2)
h , *h
Problem (2) is a calculus of a variation problem. Since *h is an unknown to be determined solving problem (2), i.e., since the function h is discontinuous across the boundary *h , this calculus of a variation problem is non standard and its solution, both from a mathematical and computational point of view, is a challenging task. If we
h is the required denoised ~ , approximates the and segmented image that, starting from the measured image g (ideal) image g , and * is the approximation of the structure ī associated with the obtained g (see Figure 3). We remind the reader that solving problem (2) through its denote the minimizer of problem (2) to ( h , * ) , then
first order optimality conditions corresponds to the solution of a problem involving elliptic PDE, i.e., PDE whose behaviour is similar to the behaviour of the Laplace equation. In scientific and engineering literature, several other choices of the functional E (h, *h ) have been considered that are omitted here for simplicity. Start
Read g~ , D , E , F
Compute the minimization step 'h , '*h
Compute E(h+ 'h, *h '*h )
no
min E(h, *h ) h, *
h
yes
Output (h , * )
End
Figure 3. Numerical solution of the calculus of variation problem
284
P. Corna et al. / Data Fusion and Quality Assessment of Fusion Products
Let us now consider the type 2 filters. Let u (t , x, y ) , (x,y) R, t t 0, be a real function; to fix the ideas let us consider the following problem:
wu wt
div(s a,b ( || u || ) u ) ,
wu (t , x, y ) wn
(x,y) R, t > 0,
(3)
0 , (x,y) wR , t > 0,
~ ( x, y ) , u(0,x,y) = g
(4)
(x,y) R
(5)
where div() is the divergence with respect to the (x,y) variables,
wu (t , x, y ) wn
means derivative of u in the direction n of the exterior unit normal vector to R in (x,y) wR , and the function sa,b( K ), K t0, is chosen as follows:
s a ,b (K )
a 1K 2 / b2
,
K
t 0,
(6)
where a and b are suitable real parameters such that a ! 0 , b z 0 (see Figure 4). Note that Eq. (3) is an evolution equation whose behaviour is similar to the behaviour of the heat equation. Start
Read g~,T,a,b
t=0
compute the time step 't
t
no
Output the solution u at time T
yes compute approximate solution of problem (3), (4), (5) at time t
End
t=t+'t
Figure 4. Numerical solution of the initial value problem for the evolution equation
P. Corna et al. / Data Fusion and Quality Assessment of Fusion Products
285
Given T > 0, the transformation defined implicitly by (3), (4), (5) that associates to
g~ ( x, y ) , (x,y) R, the function u(T,x,y), (x,y) R, is a transformation well suited to be the basis of an image segmentation and denoising procedure. In fact, the smoothing process induced by (3), (4), (5) with the choice (6) of the diffusion coefficient is “conditional”. That is, where || u || is large in correspondence with the edges of the
s a,b ( || u || ) is small and therefore the exact ~ localization of the “edges” of g is kept in the evolution in t, while where || u || is
image, then the diffusion coefficient
small, the diffusion coefficient is large, and therefore the image is smoothed in the evolution in t. Thus the choice of the parameters T , a and b in (6) corresponds to a sort of threshold choice. Note that the smoothing process induced by the time evolution of the solution of (3), (4), (5) described previously corresponds to the denoising of the image. In order to perform the segmentation we define:
0, I W1 , ° ® Pm (I ) W 1 d I W 2 , ° 1 I tW2, ¯
SW1 ,W 2 (I )
(7)
where Pm is a suitable polynomial in one variable of degree m ! 1 , and the
0 W 1 W 2 are chosen to guarantee the differentiability and monotonicity of (7) (see Figure 5). We call SW 1 ,W 2 the “structure” threshold parameters
function. Note that
W 1 ,W 2
with
SW1 ,W 2 is an approximation of the Heaviside function.
Figure 5. An example of the “structure” function
SW1 ,W 2 (I )
u (T , x, y ) , (x,y) R, and therefore find the structure associated with the image u (T , x, y ) , (x,y) R, we can consider SW1 ,W 2 ( u (T , x, y ) ), (x,y) R. In particular the approximation of the edges To isolate the edges of the image
286
P. Corna et al. / Data Fusion and Quality Assessment of Fusion Products
~ ( x, y ) , (x,y) R, found by our filtering procedure are the associated with the image g values of (x,y)R such that:
SW1 ,W 2
u(T , x, y)
1 , (x,y) R.
(8)
The edges defined in (8) are the approximated structure associated with the ~ ( x, y ) , (x,y) R obtained. The restriction of u(T,x,y), (x,y) R to measured image g the regions delimited by the edges found in (8) is the piecewise smooth approximation ~ ( x, y ) , (x,y) R obtained. The behaviour of the denoising of the measured image g and segmentation procedure obtained using (3), (4), (5), (8) depends on the values of the parameters T, a, b, W 1 , W 2 . A careful calibration of these parameters, depending on the image characteristics to be processed, is necessary. ~ ( x, y ) , (x,y) R, we are able to associate a In conclusion, given an image g
S ( g~ ) with it, where S is defined either through the minimization ~ ) S ( g~ ) , or through the initial boundary problem (2), in which case we write S ( g 1 ~ ) S ( g~ ) . value problem (3), (4), (5) and (8), in which case we write S ( g 2 “structure”
1.2. Fusion of Denoised Images
~ ( x, y ), g~ ( x, y ) , (x,y) R be two images possibly measured by two different Let g 1 2 sensors, referring to the same scene and coregistered. The ideas proposed here to fuse g~1 , g~2 are a generalization of the material presented in [15] and developed in [11], [12], [13], [14]. ~ , g~ are two images, possibly from different sensors, it is possible, for Since g 1 2 S1 to find the structure of the first image, g~1 , and S S 2 to ~ , or vice versa. Moreover let h , h be the find the structure of the second, g 2 1 2 example, to choose S
piecewise smooth (constant) approximations corresponding to the original images g~1 , g~2 , respectively determined by one of the previously described procedures. The first fusion procedure proposed is the following: given a suitable norm
||| ||| ,
the “fused” images U 1 , U 2 can be obtained as the minimizer of the following problem:
min w1 , w2
^
||| S ( w1 ) S ( w2 ) ||| 2 O1 ||| w1 h1 ||| 2 O 2 ||| w2 h2 ||| 2
` , (9)
where w1 , w2 , are (piecewise smooth) functions defined on R and O1 , O 2 are suitable positive penalization parameters. Problem (9) is a fusion procedure; in fact it tries to change the functions w1 , w2 in such a way that the distance between their structures S ( w1 ), S ( w2 ) is small, while
P. Corna et al. / Data Fusion and Quality Assessment of Fusion Products
the penalization terms proportional to
O1
and
O2
287
force w1 and w2 to remain close to
the denoised data h1 and h2 , respectively. A fusion procedure similar to (9) has been used in [11] on ERS-SAR and SPOT data with satisfactory results. In the second fusion procedure proposed, we assume that the original images ~ g1 , g~2 are the results of two physical experiments denoted generically with:
Fi (mi )
g~i , in R , i=1,2,
(10)
where F1 , F2 are linear or non linear operators representing the physical experiments and m1 ( x, y ) , m 2 ( x, y ) , (x,y) R are the physical quantities measured in the experiments, i.e., they are the unknowns of the problem. In the next section we give an example of this situation using ERS-SAR and SPOT data. We note that in general the use of appropriate mathematical models to represent physical experiments (e.g., the SAR and the optical measuring process) provide more satisfactory models of the physical situation than that contained in problem (9), where modelling of the physical experiments is absent. For i 1,2 the (single sensor) inverse problem consists of determining the
mi when the data g~i and the operator Fi are known. Solving this problem means to individually invert the operators Fi , i 1,2 . In this case the facts that the ~ , g~ refer to the same scene, are coregistered, and (hopefully) have the two images g 1 2 unknown
same “structure”, are not taken into account. The fusion procedure that follows exploits these facts. The second fusion procedure proposed is the following: given H 1 ! 0 and a suitable norm ||| ||| , the “fused” physical unknown quantities M 1 , M 2 can be obtained as the minimizer of the following problem:
min K1 ,K 2
^ ||| S (K1) S (K2 ) |||2` ,
(11)
subject to
N 1 ||| F1 (K1 ) h1 ||| N 2 ||| F2 (K 2 ) h2 |||d H 1 ,
(12)
where K1 , K 2 are functions defined on R and N 1 , N 2 are suitable positive penalization parameters. From the solution M 1 , M 2 of problem (11), (12), the “fused” images V1 , V2 can be computed as follows:
Vi = Fi ( M i ) in R , i 1,2 .
(13)
288
P. Corna et al. / Data Fusion and Quality Assessment of Fusion Products
In [12] the fusion procedure (11), (12), (13) has been used with satisfactory results to treat multifrequency electromagnetic scattering data taken from a laboratory experiment. Let us note that the critical parameters of the fusion procedures are O1 , O 2 for the fusion problem (9) and N 1 , N 2 , H 1 for the fusion problem (11), (12). Varying these parameters makes it possible to choose which of the two images has a greater “weight” (i.e., “relevance”) in the fusion procedure. If, for example, O1 and O 2 are of the same order of magnitude, the two images have the same weight, i.e., the same impact on the fusion products. However, when one of the two parameters, e.g., O1 , is greater than the other e.g.,
O2 , then “the image corresponding to O1
(i.e., h1 ) predominates over the
other (i.e., h2 )” in the sense that the salient features of the first image ( h1 ) are more in evidence and delineated in the fusion products than the features of the second ( h2 ). Given
H 1 , similar considerations hold for the choice of the parameters N 1 , N 2 .
2. A Unified Approach to Denoising, Segmentation and Fusion In [14] we proposed a new fusion procedure able to perform the filtering step, based on the solution of problem (3), (4), (5), together with the data fusion step based on problem (11), (12), when S S 2 . This improvement on the denoising, segmentation and fusion procedures discussed previously is based on the simple use of ideas taken from calculus of variations. In particular, in the situation described previously, given H 2 ! 0 and a suitable norm ||| ||| , the “fused” physical unknowns quantities N 1 , N 2 can be obtained as the minimizer of the following problem:
min P1 , P 2
^ ||| S
W 1 ,W 2
`
(|| P1 ||) SW1 ,W 2 (|| P 2 ||) ||| 2 J ( P1 , P 2 ) ,
(14)
subject to
J 1 ||| F1 ( P1 ) g~1 ||| J 2 ||| F2 ( P 2 ) g~2 |||d H 2 , where P1 , P 2 are functions defined on R, parameters and
J ( P1 , P 2 )
J1, J 2
(15)
are suitable positive penalization
§ || P1 ( x, y ) || 2 · ¸¸ dx dy + l1 ³ ln¨¨1 R b12 ¹ ©
P. Corna et al. / Data Fusion and Quality Assessment of Fusion Products
§ || P 2 ( x, y ) || 2 · ¸¸ dx dy, l 2 ³ ln¨¨1 R b22 © ¹
289
(16)
where l1 ,l 2 are suitable positive parameters, b1 ,b2 are non zero parameters having the same meaning of b in (6) and ln(·) denotes the logarithm of (·). Minimizing the objective function of problem (14) means to make small ||| SW1 ,W 2 (|| P1 ||) SW1 ,W 2 (|| P 2 ||) ||| , which corresponds to image fusing, and to make small J ( P 1 , P 2 ) , which corresponds to the image denoising following a procedure similar to the one defined in (3), (4), (5). This last fact can be explained by noting that the trajectory ( Pˆ 1 (t , x, y ) , Pˆ 2 (t , x, y ) ), (x,y) R defined for t ! 0 by
2l1 b12 , b b1 , g~1 as the initial condition to generate Pˆ 1 and a b2 , g~2 as the initial condition to generate Pˆ 2 , is ~ , g~ ) associated to the minimization the steepest descent trajectory starting from ( g 1 2 ~, of the functional J ( P 1 , P 2 ) . Finally the constraint (15) guarantees that the data g 1 ~ g 2 are considered. For more details about the choice of the term J defined in (16) added to the objective function in (14), we refer to [14]. From the solution N 1 , N 2 of problem (14), (15), the “fused” images W1 , W2 can be computed as follows:
a 2 2l 2 b2 , b
(3), (4), (5), when we choose
Wi = Fi ( N i ) ,
in R ,
i 1,2
(17)
In [14] the fusion procedure (14), (15), (17) has been applied to ERS-SAR, SPOT data to detect urban areas with satisfactory results. 3. An Example: the Fusion of SAR and Optical Images 2 An interesting application of the fusion procedures presented in the previous section consists of combining images relative to the same scene on the earth’s surface provided by SAR and optical sensors travelling onboard satellites. There are several motivations behind this choice. One important motivation is the lack of up-to-date information in many applications in the fields of earth resource surveying, water management, urban growth, agriculture and forestry. Monitoring by conventional methods is insufficient to keep pace with the rapid changes that are occurring, especially in deforestation and urban growth. Remotely sensed data is therefore an alternative method for supplying relevant information to monitor earth resources. More specifically, several synergies between the (synthetic aperture) radar and optical sensors can be identified and some problems may occur if radar or optical sensors are used separately. For example, radar sensors are active sensors, i.e., they have their “own illumination source” onboard, 2
The authors thank MIT Lincoln Laboratory (Lexington, Massachusetts), ESA (ESRIN, Italy) and SPOT Image (France) for making available some of the data used in this section.
290
P. Corna et al. / Data Fusion and Quality Assessment of Fusion Products
which operates with wavelengths up to the cm-scale (microwave range) in all weather conditions, both day and night. However, to interpret radar images correctly, some experience is necessary. Optical sensors, on the other hand, operate with wavelengths in the visible and near-infrared range, allowing human eyes to interpret the content of optical images. Unfortunately optical sensors are strongly influenced by weather conditions (particularly clouds) and daylight conditions, so that they cannot provide ground information as regularly as radar sensors. We are particularly interested in combining images from microwave and optical sensors to detect, for example, urban zones and monitor urban growth. The “urban areas” we are interested in are small/medium size urban zones made of heterogeneous elements: houses, small buildings, etc., but also soil and grass. In the radar images, human settlements are “easily” detectable, but the optical images add details and texture to the information contained in the radar image. We expect that in the images resulting from the fusion procedures described in Section 2, urban zones will be more easily recognizable than in the original images, due to three main facts: x the boundaries and the contour lines of the objects lying in the “fused” images should be more delineated and in evidence than those present in the original images; x the areas inside the edges of the objects should be more homogeneous in the “fused” images than in the original ones; x the characteristic features of SAR and optical images should be easily recognizable in each “fused” image. For example, since urban areas produce very brilliant textures in SAR imagery, we expect that this feature will be reinforced in the “fused” images. Later we will try to give a quantitative basis to these expectations, introducing appropriate tools to analyze the “fused” images obtained. It must be remembered that to efficiently perform the fusion procedures described in Section 2, the SAR and optical images to be fused must not only be relative to the same scene but also co-registered (i.e., they must be referenced to a single georeference plane). ~ (which measures the radar Referring to (10), if we consider a SAR image g 1
~ (which measures the luminance reflectivity m1 of a scene) and an optical image g 2
m 2 of a scene), then the operators F1 , F2 represent respectively the SAR and the optical measurement process that from the radar reflectivity m1 and from the optical ~ , g~ . Two simple mathematical luminance m 2 of the scene, generate the two images g 1 2 models that simulate the SAR and the optical measurement processes as linear integral operators are discussed in [14]. We note that the basic assumption of these fusion procedures requires that the changes of reflectivity and luminance in the scene occur in the same physical locations. This assumption corresponds to the fact that the two images must have the same structure and be amply fulfilled in many situations of practical interest, including the situation considered here, i.e., the detection of urban areas. We consider three pairs of digital SAR and optical images. The first pair (see Figure 6(a), 6(b)) is obtained from the scientific literature, some pictures contained in [16]. The second and the third pair are made of one ERS-SAR image (range looks = azimuth looks = l1) and one of the four SPOT-4 channel images. The second pair
represents a peri-urban area in the south of Paris (see Figure 7(a), 7(b)), while the third one shows an area in the north of Paris that contains a part of the Roissy Charles de Gaulle airport (see Figure 8(a), 8(b)). The pixel size (i.e., the spatial resolution) of the first pair of images is not specified in [16], while the pixel size of the last two pairs of images is 20 m × 20 m. Note that the resolution of the last two pairs of images is due to a pre-processing step performed by SPOT Image, which consists in resampling the images so that they have equal pixels and in co-registering the SAR and the optical images taking into account the Digital Elevation Model (DEM) of the imaged area. The sizes of the images considered are reported in Table 1.

Table 1. Dimensions of the images

           Pixels
Figure 6   190 × 190
Figure 7   170 × 180
Figure 8   180 × 180
In all the images the white colour represents high values of the pixel variable (gray level = 255), the black colour represents low values of the pixel variable (gray level = 0), and the unit on the x and y axes is the edge of the pixel. The images of Figure 6(a), 6(b) were fused taking $S = S_2$ and solving problem (9); the corresponding fused images are shown in Figure 6(c), 6(d). The images of Figure 7(a), 7(b) were fused taking $S = S_2$ and solving problem (11), (12); the corresponding fused radar reflectivity and fused luminance are shown in Figure 7(c), 7(d). Finally, the images shown in Figure 8(a), 8(b) were fused with two different procedures. The fused radar reflectivity and luminance shown in Figure 8(c), 8(d) were obtained from the images of Figure 8(a), 8(b) with $S = S_2$ by solving problem (11), (12), while the fused radar reflectivity and luminance shown in Figure 8(e), 8(f) were obtained from the images of Figure 8(a), 8(b) by solving problem (14), (15). Note that when the fusion procedure (14), (15) is used, no choice between $S = S_1$ and $S = S_2$ is required. In the remaining fusion experiments described here $S = S_2$ has always been chosen; a satisfactory implementation of the numerical procedure behind the choice $S = S_1$ is not available to us at this moment. The values of the parameters used in the fusion procedures are shown in Table 2. In the numerical experiments we always choose $n = 5$,

$$ P_5(I) = 10\,\frac{(I-\tau_1)^3}{(\tau_2-\tau_1)^3} - 15\,\frac{(I-\tau_1)^4}{(\tau_2-\tau_1)^4} + 6\,\frac{(I-\tau_1)^5}{(\tau_2-\tau_1)^5}. $$

To avoid the difficulties arising from the highly non-linear character of the function $S_{\tau_1,\tau_2}$ used when we choose $S = S_2$, an iterative process is used to reach the desired values of the parameters $\tau_1$, $\tau_2$ (for more details see [12], [14]). Furthermore, in the discretized problems arising from the fusion procedures, the norms used are always Euclidean norms. The finite-dimensional optimization problems arising from the discretization of the various fusion problems are naturally induced by the pixel structure of the images and have been solved using the optimization software package LANCELOT (see [17]).
This package is particularly suited to problems in high dimension, i.e., with many independent variables, and solves the optimization problems considered using a quasi-Newton method. We underline the fact that the operator $S = S_2$ is highly nonlinear and that the discretization of the unknowns on the grid given by the pixel representation of the images leads to challenging optimization problems in high dimension. For example, the images of Figure 8(a), 8(b) involve 2 × 180 × 180 = 64,800 pixels, so the optimization problem corresponding to the fusion procedures considered has 64,800 independent variables (i.e., it is posed in 64,800 dimensions).

Let us note that the behaviour of the fusion procedures proposed depends on the values of the parameters chosen, and obviously there are several parameter choices other than those reported in Table 2 that lead to good results. Above all, the choice of the penalization parameters in (9), (12), (15) determines the quality of the fusion products obtained. In general, an ad hoc calibration of the parameters is necessary, depending on the characteristics of the images to be processed.

Altogether, the results obtained from the numerical experiments are satisfactory. In fact, in each fused image the characteristics of both the SAR and the optical images are easily recognizable. We can observe, for example, that the fused SAR and optical images of Figure 6 each gain detail from the other. In Figure 8, instead, it is mainly the fused optical images that are enriched by the brilliant details present only in the corresponding SAR images. Furthermore, we note that the homogeneous zones of both the SAR and the optical images of Figure 7 emerge in the fused images, and that the boundaries of these zones are more delineated and marked in the fused images than in the original ones. Note that in Figures 7 and 8 the fused images represent quantities different from those represented in the original images; this is due to the presence of the operators $F_1$, $F_2$ in the fusion procedures employed. Concluding, we claim that "the quality of the information obtained from the fused images is higher than the quality of the information provided by the original images taken one by one". In the next section we consider this statement from a quantitative point of view.

An extensive review of numerical results relative to fusion problem (9) is presented in [11], [12], [13] and at the web site http://web.unicam.it/matinf/fatone/esrin.asp, while the results contained in [14] and at the web site http://web.unicam.it/matinf/fatone/w1 concern fusion problem (14), (15).
Figure 6. (a) original SAR image; (b) original optical image; (c) fused SAR image; (d) fused optical image³

Figure 7. (a) original ERS-SAR image; (b) original SPOT image; (c) fused radar reflectivity image; (d) fused luminance image

³ Figures 6(a), 6(b) are reprinted with permission of MIT Lincoln Laboratory, Lexington, Massachusetts.
Figure 8. (a) original ERS-SAR image; (b) original SPOT image; (c) fused radar reflectivity image; (d) fused luminance image; (e) fused radar reflectivity image; (f) fused luminance image
Table 2. Values of the parameters employed in the fusion procedures used to produce the fused images shown in Figures 6, 7, 8

Figure 6(c), 6(d):
  Segmentation and denoising: $S = S_2$ with SAR: $\theta_1 = 1$, $a_1 = 1$, $b_1 = 5$; optical image: $\theta_2 = 1$, $a_2 = 1$, $b_2 = 5$.
  Fusion: fusion problem (9) with $\lambda_1 = 10^{8}$, $\lambda_2 = 10^{8}$.

Figure 7(c), 7(d) and Figure 8(c), 8(d):
  Segmentation and denoising: $S = S_2$ with SAR: $\theta_1 = 10$, $a_1 = 1$, $b_1 = 0.5$; SPOT: $\theta_2 = 2$, $a_2 = 1$, $b_2 = 6$.
  Fusion: fusion problem (11), (12) with $\kappa_1 = 1$, $\kappa_2 = 1$, $\varepsilon_1 = 0.1$.

Figure 8(e), 8(f):
  Segmentation, denoising and fusion: problem (14), (15) with SAR: $b_1 = 0.5$; SPOT: $b_2 = 6$; and $\gamma_1 = 1$, $\gamma_2 = 1$, $\varepsilon_2 = 0.1$, $l_1 = 10^{6}$, $l_2 = 10^{7}$.
4. Quality Assessment of Fusion Products: Methods and Examples

We present two algorithms that, in some particular circumstances, give a quantitative basis to the qualitative statements made above about the "quality improvements" obtained with the fusion procedures. These two algorithms have a common basis in the analysis of certain mean values associated to the images considered or to some of their subimages. The subimages considered can be chosen in an adaptive way via a multiresolution analysis of the original images. We limit our attention to two different contexts: the identification of urban areas and the recognition of "simple features", i.e., straight lines, lying in a given image. We sketch the basic ideas of the two algorithms and show how to apply them to compare the SAR/optical fusion products obtained in the previous section with the original images, with one another, and with the "ground truth".

4.1. Automatic Recognition of Urban Areas

First of all, let us consider the problem of urban area detection. We use a simple detection algorithm that recognizes urban areas as the most brilliant zones of the images considered, and we show, in the test cases considered, that the performance of this algorithm improves when fused images are used instead of the original images.
More precisely, let $M$ and $\sigma$ be the mean value and the standard deviation associated to the grey levels of a given image $X$. Given a positive integer $Q$, the algorithm divides the image $X$ to be processed into $J$ non-intersecting subimages of $Q \times Q$ pixels. To fit the real dimensions of the image $X$, the number of pixels of the subimages lying on the boundary of the image $X$ may be adjusted. Let $m_l$, $l = 1, 2, \ldots, J$, be the mean value associated to the grey levels of the $l$-th subimage. We expect subimages of $X$ containing urban areas to have high values of the following parameter:

$$ \alpha_l = \frac{m_l - M}{\sigma}, \qquad l = 1, 2, \ldots, J. \tag{18}$$

Given a threshold $\Lambda > 0$, we say that the $l$-th subimage contains an urban area when:

$$ \alpha_l > \Lambda. \tag{19}$$
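As an illustration of the test (18)-(19) and of the block averages used in the performance indices defined next, here is a minimal Python sketch; the function names and the example values of Q and Λ are ours and are not part of the original procedure.

```python
import numpy as np

def urban_detection_stats(image, Q=10, Lambda=0.25):
    """Block-based urban-area statistic of eqs. (18)-(19).

    image  : 2D array of grey levels
    Q      : subimage size in pixels (blocks on the image boundary are simply smaller)
    Lambda : detection threshold on alpha_l
    Returns the per-block statistics alpha_l, the detection flags, and the averages
    U (over blocks flagged as urban) and V (over the remaining blocks).
    """
    X = np.asarray(image, dtype=float)
    M, sigma = X.mean(), X.std()
    alphas, flags = [], []
    for i in range(0, X.shape[0], Q):
        for j in range(0, X.shape[1], Q):
            block = X[i:i + Q, j:j + Q]
            alpha_l = (block.mean() - M) / sigma   # eq. (18)
            alphas.append(alpha_l)
            flags.append(alpha_l > Lambda)         # eq. (19)
    alphas, flags = np.array(alphas), np.array(flags)
    U = alphas[flags].mean() if flags.any() else np.nan
    V = alphas[~flags].mean() if (~flags).any() else np.nan
    return alphas, flags, U, V

def fusion_indices(U_orig, V_orig, U_fused, V_fused):
    """Relative changes of U and V after fusion, as in eq. (20)."""
    return (U_fused - U_orig) / U_orig, (V_fused - V_orig) / V_orig
```

In this sketch, positive values of both indices returned by fusion_indices correspond to the situation, discussed below, in which the fused image separates urban areas from the background better than the original one.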
Let us note that this detection algorithm is very simple; more complex procedures can be used to detect urban areas. We expect the values of $\alpha_l$ relative to subimages containing urban areas to be higher in the fused images than the corresponding values of $\alpha_l$ in the original images. Let us denote with $U(X)$ the average of the parameters $\alpha_l$ corresponding to the subimages of $X$ that, according to the test (19), contain urban areas, and with $V(X)$ the average of the parameters $\alpha_l$ corresponding to the subimages of $X$ that, according to the test (19), do not contain urban areas. If we denote with $\hat X$ the fused image (or the fused physical unknown quantity, depending on which fusion procedure is used) corresponding to $X$, we can "measure" the change in $\alpha_l$ produced by the fusion procedure, in the urban and non-urban areas respectively, by examining the following performance indices (see [14]):

$$ i = \frac{U(\hat X) - U(X)}{U(X)}, \qquad j = \frac{V(\hat X) - V(X)}{V(X)}. \tag{20}$$

Note that $U(X)$, $U(\hat X)$ are positive numbers, while $V(X)$, $V(\hat X)$ are negative numbers, so that $i$ and $j$ greater than zero is a quantitative statement translating the observation that in the fused image $\hat X$ the urban areas are more visible and better separated from the context than in the original image $X$. Now we can define similar performance indices measuring the change in the performance of the urban-area detection procedure provided by the different SAR/optical image fusion procedures considered previously. In particular, let us
consider the following performance indices, referring for $i = 1$ to the SAR images and for $i = 2$ to the optical images:

fusion problem (9):

$$ \alpha_i = \frac{U(U_i) - U(\tilde g_i)}{U(\tilde g_i)}, \qquad \beta_i = \frac{V(U_i) - V(\tilde g_i)}{V(\tilde g_i)}, \qquad i = 1, 2, \tag{21}$$

fusion problem (11), (12):

$$ \gamma_i = \frac{U(M_i) - U(\tilde g_i)}{U(\tilde g_i)}, \qquad \delta_i = \frac{V(M_i) - V(\tilde g_i)}{V(\tilde g_i)}, \qquad i = 1, 2, \tag{22}$$

fusion problem (14), (15):

$$ \sigma_i = \frac{U(N_i) - U(\tilde g_i)}{U(\tilde g_i)}, \qquad \upsilon_i = \frac{V(N_i) - V(\tilde g_i)}{V(\tilde g_i)}, \qquad i = 1, 2. \tag{23}$$
In the numerical experiments we choose $Q = 25$, $\Lambda = 0.5$ for the first pair of images (see Figure 6) and $Q = 10$, $\Lambda = 0.25$ for the other images (see Figures 7, 8). The numerical values of the indices (21), (22), (23) associated to the fusion procedures described previously are shown in Table 3. The performance indices of Table 3 are all positive. This gives a quantitative basis to the claims of "quality improvement" obtained through the fusion procedures made before. Moreover, fusion problem (9) (Figure 6(c), 6(d)) gives a greater improvement for the optical images ($\alpha_2$, $\beta_2$) than the corresponding indices relative to the other fusion procedures. Similarly, fusion problem (11), (12) (Figure 7(c), 7(d)) gives the greatest improvement for the SAR images. However, the results shown in Table 3 suggest that the three fusion procedures give similar results; this is why a much wider set of numerical examples should be considered when studying the relative performance of the three fusion procedures.

4.2. Automatic Recognition of Simple "Features" in a Given Image

The algorithm we use for the automatic recognition of simple "features" is based on the Hough transform of the images considered (see [18]). We limit ourselves to the simplest "feature", i.e., the straight lines lying in a given image.

Table 3. Numerical values of the performance indices

Figure 6(c), 6(d) (fusion problem (9)):         $\alpha_1 = 0.137$, $\beta_1 = 0.109$, $\alpha_2 = 0.483$, $\beta_2 = 0.523$
Figure 7(c), 7(d) (fusion problem (11), (12)):  $\gamma_1 = 0.375$, $\delta_1 = 0.539$, $\gamma_2 = 0.038$, $\delta_2 = 0.051$
Figure 8(c), 8(d) (fusion problem (11), (12)):  $\gamma_1 = 0.236$, $\delta_1 = 0.389$, $\gamma_2 = 0.079$, $\delta_2 = 0.150$
Figure 8(e), 8(f) (fusion problem (14), (15)):  $\sigma_1 = 0.233$, $\upsilon_1 = 0.386$, $\sigma_2 = 0.093$, $\upsilon_2 = 0.180$
Let us extract from a given image a (rectangular) subimage $w$ whose "main content" is a straight line separating two homogeneous regions. This situation is easily recognized by a human observer, but we want to recognize it automatically. We proceed as follows: let $\tau_3 > 0$ be a parameter and let

$$ T_{\tau_3}(\zeta) = \begin{cases} 0, & 0 \le \zeta < \tau_3, \\ 1, & \zeta > \tau_3. \end{cases} \tag{24}$$

Considering the image $\bar w = T_{\tau_3}(\|\nabla w\|)$, we can reduce the analysis of the image $w$ to the analysis of the image $\bar w$, i.e., essentially a black image with a white "straight line" lying inside it. A straight line $y = mx + q$ in the Cartesian plane $(x, y)$ containing the image $\bar w$ is completely characterized by the coefficients $m$ and $q$. For simplicity we can assume that the $x$, $y$ coordinate axes coincide with the edges of the image $\bar w$ and that $y = mx + q$ is the equation of the straight line contained in $\bar w$. We can reconstruct the parameters corresponding to the straight line contained in $\bar w$ as follows. Let us consider the transformation $\rho = x\cos\theta + y\sin\theta$, $0 \le \theta < \pi$. Through this transformation, to each point $(x, y)$ of the $(x, y)$-plane there corresponds a sinusoidal curve in the $(\theta, \rho)$-plane, i.e., $\rho = x\cos\theta + y\sin\theta$, $0 \le \theta < \pi$. This transformation is called the Hough transform (see [18]). Note that $(\theta, \rho)$ are Cartesian coordinates. Let $y = mx + q$; we have:

$$ \rho = x\cos\theta + y\sin\theta = x\cos\theta + (mx + q)\sin\theta = x(\cos\theta + m\sin\theta) + q\sin\theta, \qquad 0 \le \theta < \pi. \tag{25}$$

That is, (25) is the family, parametrized by the real variable $x$, of the sinusoidal curves in the $(\theta, \rho)$-plane corresponding, through the Hough transform, to the straight line $y = mx + q$ in the $(x, y)$-plane. It is easy to see that the point $(\bar\theta, \bar\rho)$ such that $\cos\bar\theta/\sin\bar\theta = -m$ and $\bar\rho = q\sin\bar\theta$ belongs to all the curves of the family (25).
To exploit this observation in a more practical situation, let us discretize, using rectangular pixels, both the $(x, y)$-plane and the $(\theta, \rho)$-plane. In particular, let $(x_i, y_j)$, $i = 1, 2, \ldots, I$, $j = 1, 2, \ldots, J$, be the Cartesian coordinates of the centres of the pixels contained in the image $\bar w$. Remember that in the image $\bar w$ the pixel value is equal to one when the line $y = mx + q$ goes through the pixel considered. Let

$$ A = \left\{\, \rho = x_i\cos\theta + y_j\sin\theta \;:\; \bar w = T_{\tau_3}(\|\nabla w\|) = 1 \text{ in the pixel containing } (x_i, y_j),\ i = 1, 2, \ldots, I,\ j = 1, 2, \ldots, J \,\right\} \tag{26}$$

be the family of sinusoidal curves in the $(\theta, \rho)$-plane corresponding to the (centres of the) pixels associated to large values of $\|\nabla w\|$ in the image $w$. The set $A$ contains the curves associated to the straight line $y = mx + q$ and possibly some other curves coming from minor (undesired) features present in $\bar w$. Considering also the $(\theta, \rho)$-plane subdivided into pixels, let $(\theta_l, \rho_n)$, $l = 1, 2, \ldots, L$, $n = -N, -(N-1), \ldots, (N-1), N$, be the centres of the pixels contained in a suitable rectangle of the $(\theta, \rho)$-plane. We associate to each pixel of this rectangle the (finite) number of curves belonging to $A$ that go through it, and we denote the function defined in this way by $H$. Denoting with $H_{l,n}$ the value of $H$ in the pixel containing $(\theta_l, \rho_n)$, $l = 1, 2, \ldots, L$, $n = -N, -(N-1), \ldots, (N-1), N$, the function $H = H_{l,n}$ attains its maximum in a pixel whose centre can be considered as an approximation of $(\bar\theta, \bar\rho)$. Sometimes, with an abuse of notation, we call the function $H = H(\theta, \rho)$ (defined to be equal to $H_{l,n}$ if $(\theta, \rho)$ belongs to the pixel of centre $(\theta_l, \rho_n)$) the Hough transform of the image $\bar w$. We assume that the presence of curves belonging to $A$ not coming from the straight line $y = mx + q$ does not change the properties of $H = H(\theta, \rho)$ induced by the fact that the line $y = mx + q$ is the main content of the image $\bar w$. Under this assumption, from the knowledge of the maximizer of $H$, i.e., $(\bar\theta, \bar\rho)$, it is possible to reconstruct the straight line $y = mx + q$, that is, to reconstruct $m$ and $q$.

Note that a simple modification of the Hough transform method introduced above can be used to recognize vertical lines, or other simple curves such as circles, ellipses and so on. In more detail, the strategy presented above for the detection of straight lines can be extended to consider, at least in principle, the recognition of "arbitrary" curves contained in a given image; indeed, this is possible if the boundaries of the objects contained in a given image are "simple" rectifiable curves. The idea is to use a multiscale algorithm to divide the image under examination into N subimages, each one containing only a curve, separating two homogeneous regions, that is satisfactorily
approximated with a straight line separating two homogeneous regions. The analysis of the image containing an arbitrary rectifiable curve is obtained as a "combination" of the analyses of the N Hough transforms via the reconstruction of the (approximate) straight line present in each subimage. That is, the analysis of the N Hough transforms of the subimages provides the required information about the original image. The choice of the subimages that must be analyzed with the Hough transform is made with a multiscale algorithm. The multiscale algorithm used in the numerical experiments shown here is adaptive, i.e., the subdivision of the image into subimages is performed several times and is determined by the analysis of the content of the subimages of the previous step in the subdivision process. In this way it is possible to recognize non-trivial features present in a given scene, such as boundaries of agricultural fields, edges of human settlements and so on. We say that "a fused image containing a straight line is of higher quality than the original image" if the maximizer of the Hough transform of the fused image is sharper and more isolated, i.e., more easily recognizable, than the corresponding maximizer of the Hough transform of the original image. To give a quantitative basis to this statement we suggest the following criterion. We choose L = 180; moreover, N and the pixel size in the $(\theta, \rho)$-plane are determined as a function of the image w. We consider the following distribution associated to the function H:
$$ F(w) = \frac{\#\{(\theta_l, \rho_n) : H_{l,n} \le w,\ l = 1, 2, \ldots, L,\ n = -N, \ldots, N\}}{\#\{(\theta_l, \rho_n),\ l = 1, 2, \ldots, L,\ n = -N, \ldots, N\}}, \qquad -\infty < w < +\infty, \tag{27}$$
where $\#\{\cdot\}$ denotes the cardinality of the set, i.e., the number of elements contained in the set. Let us note that we regard $F$ as an approximation of a real function $\tilde F$, and we denote with $\tilde f$ the density function associated to $\tilde F$, i.e., $\tilde f(w) = d\tilde F/dw$. Let $f$ be a finite-difference approximation of the real function $\tilde f$; the function $f$ contains information about the scene under examination. In particular, the presence of "corrugations" in the graphical representation of the approximate density $f$ denotes the presence of "noise" in the image. Here by "noise" we mean information different from that associated to the presence of the straight line. That is, when the straight line lying in the image is well delineated, the maximizer of $H$ associated to the straight line is sharp and isolated. We now translate these qualitative statements into quantitative facts. Let $w_f$ be the maximum of $H$; if we denote with $M_f$ and $\sigma_f$ respectively the mean value and the standard deviation associated to the density $f$, a measure $\Theta$ of the quality of the fusion product can be given by the sum of $w_f - M_f$ and $\sigma_f$, that is:

$$ \Theta = (w_f - M_f) + \sigma_f. \tag{28}$$
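The following minimal Python sketch illustrates the accumulator H and the index Θ of (28) as reconstructed above; the edge map is assumed to be a binary image, the discretisation choices are only examples, and the function names are ours.

```python
import numpy as np

def hough_accumulator(edge_map, L=180):
    """Accumulator H(theta, rho) built from the edge pixels of a binary image, as in (26)."""
    ys, xs = np.nonzero(edge_map)
    thetas = np.deg2rad(np.arange(L))                  # 0 <= theta < pi, L angular bins
    rho_max = int(np.ceil(np.hypot(*edge_map.shape)))  # largest possible |rho|
    H = np.zeros((L, 2 * rho_max + 1), dtype=int)
    for x, y in zip(xs, ys):
        rhos = np.round(x * np.cos(thetas) + y * np.sin(thetas)).astype(int)
        H[np.arange(L), rhos + rho_max] += 1           # one vote per (theta, rho) cell
    return H

def quality_index(H):
    """Index Theta = (w_f - M_f) + sigma_f of eq. (28), computed from the
    empirical distribution of the accumulator values H_{l,n}."""
    values = H.ravel().astype(float)
    w_f = values.max()                    # maximum of H
    M_f, sigma_f = values.mean(), values.std()
    return (w_f - M_f) + sigma_f
```

A smaller value of quality_index for the fused image than for the original one is read, as discussed below, as a quantitative indication that the straight line is better "in evidence".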
In fact, a small value of each of the two terms in (28), and hence a small value of $\Theta$, means that the function $f$ is very concentrated around the value where the maximum $w_f$ is attained, i.e., the "peak" of the density $f$ is well separated from the "body" of the density $f$, and hence the image is of high quality since its content (i.e., the straight line) is "in evidence". Indeed, if as a consequence of a fusion procedure the value of $\Theta$ decreases, this is a quantitative measure of the fact that the "fused image is of higher quality than the original image". Let us observe that other ad hoc quantitative indices can be introduced to perform a quantitative analysis of the quality of fusion products. Note that when an (optical) image of very high resolution of the scene under examination is available, it can be used as "ground truth" to validate the results obtained with the techniques described previously. More precisely, when a "ground truth" image is available, it is possible to measure the quality of the automatic recognition with the performance indices introduced above and to compare them with the values of the same indices associated to the ground-truth image.

Let us apply the previous techniques to the recognition of some simple features in the pre- and post-fusion SAR/optical images presented in Section 3. Let us begin by analyzing the automatic recognition of a straight line separating two homogeneous regions in a given image. To show that the proposed fusion procedures lead to significant quality improvements, we extract from the original optical image of Figure 8(b) a subimage of 40 × 40 pixels containing the almost vertical straight line well visible in the bottom right corner of Figure 8(b). This image, with the corresponding segmented image obtained taking $\tau_3 = 20$ in (24), is shown in Figure 9. The same subimage has been extracted from the two fused luminances shown in Figure 8(d) and Figure 8(f), and through relations (13) and (17) with $i = 2$ we have reconstructed from these subimages the corresponding optical fused images. These optical fused images, with the corresponding segmented images obtained taking $\tau_3 = 20$ in (24), are shown in Figure 10. As can be seen by inspection, the contours of the lines lying in the fused images (see Figure 10(b), 10(d)) are more delineated and in evidence with respect to the background of the image than those represented in Figure 9(b). From the quantitative point of view these facts are confirmed by the density functions associated to the Hough transforms of Figures 9(b), 10(b), 10(d), shown in Figure 11, and by the values of the parameter $\Theta$ defined in (28) corresponding to Figures 11(a), 11(b), 11(c), which are $\Theta_1 = 39.47$, $\Theta_2 = 18.96$, $\Theta_3 = 20.95$, respectively. Let us note that the supports of the densities shown in Figure 11 are connected sets and that $f(w) = 0$ when $w \le 0$.

Let us consider now the multiscale Hough transform algorithm. Let us consider, for example, the optical data representing the peri-urban land in the south of Paris shown in Figure 7(b). In particular, we take into account the original optical image (see Figure 7(b)) and the fused luminance (see Figure 7(d)). We find the edges contained in these images with the segmentation operator (24) taking $\tau_3 = 13$. Just to fix ideas, let us extract from each segmented image a subimage of 60 × 60 pixels containing a portion of the upper left corner of the segmented image (see Figure 12(a), 12(b)), and let us try
to determine "automatically" the boundaries of the agricultural fields present in this zone. The reconstructed boundaries obtained through a simple implementation of the multiscale Hough transform procedure proposed here are shown in Figure 12(c), 12(d), respectively. By inspection it is easy to see that the multiscale Hough transform procedure seems to work reasonably well. Moreover, the edges of the fused image shown partially in Figure 12(b) are reconstructed more accurately than the boundaries, shown in Figure 12(a), of the original image. To give a quantitative basis to this statement it would be necessary to consider a global performance index like (28). In conclusion, in the numerical experiments we have observed that the proposed multiscale reconstruction procedure based on the Hough transform gives promising results. Moreover, the fusion procedure may sometimes facilitate the reconstruction of the boundaries of the objects lying in a given image.

To confirm from another point of view the improvement obtained with the proposed fusion procedures in the understanding of a scene, we have compared the content of the measured images and of the fused physical unknowns shown in Figure 8 with a high-resolution optical image of the same scene. This image is an IRS-1C image with a pixel size of 5 m × 5 m. We refer to it as "ground truth". We have manually co-registered the IRS-1C image with the images shown in Figure 8 and extracted from it the "same" subimage shown in Figure 9 and Figure 10. The density $f(w)$ associated to the ground-truth image is shown in Figure 13. Note that $f(w) = 0$ when $w \le 0$. The parameter $\Theta$ defined in (28) associated to the image shown in Figure 13 is $\Theta_4 = 15.59$. Let us note that $\Theta_4$ is smaller than the values of $\Theta_1$, $\Theta_2$, $\Theta_3$; this emphasizes the fact that the "ground truth" image contains "information" of higher quality than the other images involved and produced in the various fusion procedures.

Let us note that here we have considered the effects and the benefits on optical images provided by the fusion procedure with ERS-SAR data. Similarly, it is possible to consider the improvements that the fusion with optical data can bring to the ERS-SAR data, and substantially similar results are obtained. Only a very limited number of tests have been performed, but we believe that the encouraging results obtained in these tests justify the study and the use in practical situations of the fusion procedures presented here.
Figure 9. (a) subimage extracted from the image shown in Figure 8(b); (b) segmented image corresponding to the subimage 9(a)
Figure 10. (a) optical image corresponding to the right bottom corner of Figure 8(d); (b) segmented image corresponding to the subimage 10(a); (c) optical image corresponding to the right bottom corner of Figure 8(f); (d) segmented image corresponding to the subimage 10(c)
Figure 11. (a) density $f(w)$ associated to the Hough transform of the image shown in Figure 9(b); (b) density $f(w)$ associated to the Hough transform of the image shown in Figure 10(b); (c) density $f(w)$ associated to the Hough transform of the image shown in Figure 10(d)
Figure 12. (a) edges of the objects contained in the upper left corner of the segmented optical image shown in Figure 7(b); (b) edges of the objects contained in the upper left corner of the segmented luminance image shown in Figure 7(d); (c) edges of image 12(a) reconstructed with the multiscale Hough transform algorithm; (d) edges of image 12(b) reconstructed with the multiscale Hough transform algorithm
Figure 13. Density $f(w)$ associated to the Hough transform of the "ground truth" image of the bottom right corner of Figure 8(b)
References

[1] Wald, L. (1999). "Some terms of reference in data fusion", IEEE Transactions on Geoscience and Remote Sensing, 37(3), 1190-1193.
[2] Pohl, C. and Genderen, J.L. van (1998). "Multisensor image fusion in remote sensing: concepts, methods and applications", International Journal of Remote Sensing, 19, 823-854.
[3] Simone, G., Farina, A., Morabito, F.C., Serpico, S.B. and Bruzzone, L. (2002). "Image fusion techniques for remote sensing applications", Information Fusion Journal, 3(1), 3-15.
[4] Townsend, D.W. and Cherry, S.R. (2001). "Combining anatomy and function: the path to true image fusion", European Radiology, 11, 1968-1974.
[5] Waltz, E. and Llinas, J. (1990). Multisensor Data Fusion, Artech House, Boston, Mass.
[6] (1997). Special issue on data fusion, Proceedings of the IEEE, 85, 1-208.
[7] Alvarez, L., Guichard, F., Lions, P.L. and Morel, J.M. (1993). "Axioms and fundamental equations of image processing", Archives for Rational Mechanics and Analysis, 123, 199-257.
[8] Perona, P. and Malik, J. (1990). "Scale space and edge detection using anisotropic diffusion", IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(7), 629-639.
[9] Mumford, D. and Shah, J. (1989). "Optimal approximations by piecewise smooth functions and associated variational problems", Communications on Pure and Applied Mathematics, 42, 577-684.
[10] Schneider, M.K., Fieguth, P.W., Karl, W.C. and Willsky, A.S. (2000). "Multiscale methods for segmentation and reconstruction of signals and images", IEEE Transactions on Image Processing, 9(3), 456-468.
[11] Fatone, L., Maponi, P. and Zirilli, F. (2001). "Fusion of SAR/Optical images to detect urban areas", in Proceedings of the IEEE/ISPRS Joint Workshop on Remote Sensing and Data Fusion over Urban Areas, Roma, Italy, IEEE Publisher, Piscataway, N.J., 217-221.
[12] Fatone, L., Maponi, P. and Zirilli, F. (2001). "An image fusion approach to the numerical inversion of multifrequency electromagnetic scattering data", Inverse Problems, 17, 1689-1702.
[13] Fatone, L., Maponi, P. and Zirilli, F. (2002). "Data fusion and nonlinear optimization", SIAM News, 35(1), 4;10.
[14] Fatone, L., Maponi, P. and Zirilli, F. (2003). "Data fusion and filtering via calculus of variations: an application to SAR/optical data and urban areas detection", submitted to ISPRS Journal of Photogrammetry and Remote Sensing.
[15] Haber, E. and Oldenburg, D. (1997). "Joint inversion: a structural approach", Inverse Problems, 13, 63-77.
[16] Baxter, R. and Seibert, M. (1998). "Synthetic aperture radar image coding", Lincoln Laboratory Journal, 11(2), 121-155.
[17] Conn, A.R., Gould, N.I.M. and Toint, Ph.L. (1992). LANCELOT: A Fortran Package for Large Scale Nonlinear Optimization (Release A), Springer-Verlag, Berlin.
[18] Asano, T. and Katoh, N. (1996). "Variants for the Hough transform for line detection", Computational Geometry, 6, 231-252.
Advances and Challenges in Multisensor Data and Information Processing E. Lefebvre (Ed.) IOS Press, 2007 © 2007 IOS Press. All rights reserved.
Information Management Methods in Sensor Networks¹

L. MIHAYLOVA a,2, A. NIX a, D. ANGELOVA b, D. BULL a, N. CANAGARAJAH a, A. MUNRO a
a Department of Electrical and Electronic Engineering, University of Bristol, UK
b Bulgarian Academy of Sciences, 25A Acad. G. Bonchev Str, 1113 Sofia, Bulgaria
Abstract. This paper considers issues related to information management in centralised and decentralised sensor networks. The sensor management is treated in the light of different applications: detection, classification and tracking of moving objects. Information management is the process of optimising the information flow and is highly dependent on measures of information. Often used information measures are considered. Further, problems of localisation of mobile nodes in wireless ad hoc networks and mobility tracking in wireless networks are addressed. Results for mobility tracking based on Monte Carlo methods are presented. Keywords: sensor networks, wireless networks, data fusion, mobility localisation
1. Introduction

A sensor network comprises a large number of sensor nodes that collaboratively collect information, process it using their onboard capabilities, and transmit only the requested data. Sensor networks have unique features – they are randomly deployed, and unique requirements such as situation awareness, self-organisation and fault tolerance have to be fulfilled. How to appropriately select the right sensor combination to reach a trade-off between sensor usage and tracking performance is a key task in sensor management. Multi-sensor management is a broad concept [1] referring to a set of distinct issues of planning and control of sensor resource usage to enhance multi-sensor fusion performance. The main elements are: sensor deployment, sensor behaviour assignment, and sensor coordination. Sensor deployment concerns making decisions about when, where and how many sensing resources need to be deployed in reaction to the state of the world and its changes. Sensor placement consists of positioning multiple sensors in a way that is optimal or nearly optimal (e.g., location configuration) in the sense of extracting the maximum amount of information. Typically, it is desired to locate sensors within a particular region so as to optimise certain criteria expressed, for instance, by detection probabilities, signal arrival time, emitted energy, etc.

1 We acknowledge the financial support of the UK MOD Data and Information Fusion DT Center and partial support of the Bulgarian Foundation for Scientific Investigations under grants I-1202/02 and I-1205/02.
2 Corresponding author, E-mail: [email protected]
This problem can be formulated as a constrained optimisation over a set of parameters. In most cases a map of the environment is known, and these available physical or mathematical terrain models are used as the basis for evaluating sensor placement decisions. Sensor behaviour assessment [1] means the efficient determination and planning of sensor functions and usage according to changing situation awareness or mission requirements. Figure 1 a) shows a top-down structure of the sensor management process with its different levels. Figure 1 b) illustrates the relation between the different layers: the physical layer (sensors randomly deployed) and the network layer, which comprises different topologies and routing protocols. The top layer is application dependent and might comprise various applications such as sensor data fusion, compression of data, classification and tracking. The deployed sensor network has an impact on the application level through the fused data/information provided; on the other hand, the problem of interest dictates what kind of data is to be gathered by the other two layers.
Figure 1. a) Top-down sensor management; b) general system model comprising the physical layer, the network layer and the application level
1.1. Centralised versus Decentralised Architecture

Centralised fusion involves processing all sensor measurements at a single location, where sensor measurement errors are assumed independent across sensors and time. Distributed or decentralised sensor fusion involves a collection of processing nodes in
which each node processes its sensor measurements and communicates the results to neighbouring nodes. In addition, each node performs a specific fusion task using state information from its neighbours. Distributed architectures have improved reliability and flexibility compared to centralised architectures. A primary concern is the limited energy reserve at each node: combining information from spatially distributed sensor nodes requires processing and communicating sensor data, and communications are energy consuming. Additionally, the network should be able to scale to large numbers of nodes and track many events. Decentralised sensor systems have the following features [5]: i) there is no single central data processing centre; ii) there is no common communication facility; not all nodes may be allowed to broadcast, or multicasting may not always be possible; iii) nodes do not have any global knowledge of the network topology; nodes know only about connections in their own neighbourhood. Decentralised systems are characterised by:

• scalability, i.e., there are no limitations, such as in centralised systems, with respect to computational bottlenecks or lack of communication bandwidth;
• robustness: the system is more robust to dynamic changes in its nodes than a centralised system and correspondingly has increased reliability; failures of some nodes or the addition of new ones do not change the structure of the network significantly;
• modularity, since all decision processes take place locally in separate nodes;
• communication issues: the processing nodes of a decentralised system communicate with each other.

The remaining part of the paper is organised as follows. Section 2 formulates the problems of interest and Section 3 discusses typical information measures. Section 4 considers mobility tracking in cellular networks within the sequential Monte Carlo framework. Section 5 addresses the localisation of mobile nodes in wireless sensor networks. Finally, conclusions are given in Section 6.
2. Problems of Interest

The information management process incorporates different layers and deals with various kinds of data and knowledge. The network can assemble information from spatially diverse sources providing different data, e.g., kinematics, attributes or features, video frames, etc. The data and information differ both in their quality and in their quantity. From the viewpoint of decision theory, sensor management is a decision-making task: determine the most appropriate sensor action to perform in order to achieve maximum utility. Such decisions have been treated in a Bayesian framework [2], in particular as a Markov decision process. Nevertheless, practical implementations suffer from a combinatorial increase in computational complexity even for low-dimensional cases. Reinforcement learning techniques [3] offer a feasible way of achieving optimal sensor management performance and of learning from experience. There are two methods: off-line learning and on-line learning. Off-line learning is based on comprehensive training data sets with scenarios covering a wide range of situations. On-line learning represents the idea of directly managing a sensor's assets through its interactions with the environment and is much more difficult than the off-line case. Sensor management is formulated in [4] in the framework of Partially Observable Markov Decision Processes (POMDPs). Particle filtering allows
integration with decision processes and the use of realistic models. Particle filtering has also been combined with Q-function approximation methods.

The process of optimising the information flow relies on information measures. Measures like the Shannon entropy, the Hartley entropy, the Kullback-Leibler distance, the Rényi information divergence, and the Bhattacharyya distance are widely used in communication networks, decision making and image processing to characterise uncertainty or, respectively, accuracy. We treat them from the point of view of sensor management.

3. Typical Information Measures

The first measure of information stems from communication theory, which deals with broadcasting a message from a sender to a receiver. The Shannon entropy [6] weights the information per event $X_i$ by the probability $p(\cdot)$ that this event occurs:

$$ H = -K \sum_i p(X_i)\log\bigl(p(X_i)\bigr), \tag{1}$$

with $K$ being a positive constant. The entropy characterises the average amount of information that is gained from a certain set of events. The entropy is maximal when all events are equally likely, and we are uncertain which event is going to happen. When one of the events has a much higher chance of happening than the others, the uncertainty/entropy decreases. Interpreting the entropy as a measure of uncertainty, information $I$ can be quantified as the difference between two uncertainties about the random event, e.g., in sensor management before and after the measurement arrival:

$$ I = H_{\text{before observation}} - H_{\text{after observation}}. \tag{2}$$

A general information measure is the Rényi information divergence $D_\alpha$ [7], in which the information gain between two probability density functions $p_1$ and $p_2$ is calculated in the following way:

$$ D_\alpha = \frac{1}{\alpha - 1}\log \int p_1^{\alpha}\, p_2^{1-\alpha}(X)\,dX, \qquad \alpha > 0,\ \alpha \ne 1. \tag{3}$$

In the limiting case $\alpha \to 1$ we have $\lim_{\alpha\to 1} D_\alpha = \int p_1 \log\frac{p_1}{p_2}\,dX$, and the Rényi divergence becomes the commonly utilised Kullback-Leibler divergence. It is shown in [7] that, using an index of the form (2) with the Rényi divergence, the next measurement can be chosen so as to make the divergence between the current density and the density after the new measurement as large as possible. This indicates that the sensing action has maximally increased the information content of the measurement-updated density with respect to the density before the measurement update. Many information management methods rely on the change of the Shannon entropy, the Kullback-Leibler distance or another metric function characterising the information gained after the
measurement arrival. Since the Kullback-Leibler distance is not symmetric, it is not a metric and does not obey the triangle inequality. However, it is accepted as a measure of the difference between two probability distributions (or densities). In surveillance scenarios, information is gained when a target is localised or when the accuracy of the state estimate of the target being tracked is increased. Strategies for sensor selection can be developed for single and multiple target tracking. The next section considers a solution to the problem of mobility tracking in cellular networks.
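As a simple illustration of these measures, the sketch below evaluates the Shannon entropy, the Kullback-Leibler distance and the Rényi divergence for discrete distributions (with K = 1 and natural logarithms); the function names are ours.

```python
import numpy as np

def shannon_entropy(p, K=1.0):
    """H = -K * sum_i p_i log p_i, eq. (1); zero-probability terms contribute nothing."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return -K * np.sum(p[nz] * np.log(p[nz]))

def kullback_leibler(p1, p2):
    """KL(p1 || p2) = sum_i p1_i log(p1_i / p2_i)."""
    p1, p2 = np.asarray(p1, float), np.asarray(p2, float)
    nz = p1 > 0
    return np.sum(p1[nz] * np.log(p1[nz] / p2[nz]))

def renyi_divergence(p1, p2, alpha):
    """D_alpha = 1/(alpha - 1) * log sum_i p1_i^alpha * p2_i^(1-alpha), eq. (3)."""
    p1, p2 = np.asarray(p1, float), np.asarray(p2, float)
    return np.log(np.sum(p1 ** alpha * p2 ** (1.0 - alpha))) / (alpha - 1.0)

p1 = np.array([0.7, 0.2, 0.1])
p2 = np.array([0.4, 0.4, 0.2])
print(shannon_entropy(p1))                    # uncertainty of p1
print(kullback_leibler(p1, p2))               # KL distance
print(renyi_divergence(p1, p2, alpha=0.999))  # close to the KL value as alpha -> 1
```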
4. Mobility Tracking in Sensor Networks

Mobility tracking is one of the most important features of wireless cellular networks. Data from two station types are usually used: base stations, whose positions are known, and mobile stations (or mobile users), whose location and dynamic motion are estimated. Mobility tracking techniques can be divided into two groups [8]: methods in which the position, speed and acceleration are estimated, versus conventional geo-location techniques that estimate only the position coordinates. Two types of signal measurements are usually used: pilot signal strengths from different base stations measured at the mobile unit, and the corresponding propagation times. Received Signal Strength Indicators (RSSIs), i.e., the pilot signal strengths from neighbouring base stations, are quite often used as measurement signals in practice. However, the RSSIs contain shadowing, fading and path-loss components that can severely corrupt the signal, and conventional filtering algorithms cannot cope with these types of noise. The Bayesian approach is a suitable framework for overcoming these different uncertainties.

4.1. Mobility Tracking within the Bayesian Framework

The state $x_k \in \mathbb{R}^{n_x}$ of the mobile unit can be evaluated at each time instant from the conditional probability density function (pdf) $p(x_k \mid z_{1:k})$, given the set of measurements $z_{1:k} = \{z_1, \ldots, z_k\}$ up to time instant $k$. The predicted state pdf can be computed according to the Chapman-Kolmogorov equation

$$ p(x_k \mid z_{1:k-1}) = \int_{\mathbb{R}^{n_x}} p(x_k \mid x_{k-1})\, p(x_{k-1} \mid z_{1:k-1})\, dx_{k-1}. \tag{4}$$

After the arrival of the measurement $z_k$ at time $k$, the posterior state pdf can be updated via the Bayes rule

$$ p(x_k \mid z_{1:k}) = \frac{p(z_k \mid x_k)\, p(x_k \mid z_{1:k-1})}{p(z_k \mid z_{1:k-1})}, \tag{5}$$

where $p(z_k \mid z_{1:k-1})$ is a normalising constant. The analytical solution of the above equations is in general very difficult or intractable. We utilise Monte Carlo techniques [9], which have proven very suitable and powerful for dealing with non-linear system dynamics.
The Monte Carlo approach relies on a sample-based approximation of these probability density functions. Multiple particles (samples) of the variables of interest are generated, each one associated with a weight that characterises the quality of that specific particle. An estimate of the variable of interest is obtained by the weighted sum of the particles. Two major stages can be distinguished: prediction and update. During prediction, each particle is modified according to the state model, including the addition of random noise in order to simulate the effect of the noise on the variable of interest. In the update stage, each particle's weight is re-evaluated based on the new sensor data. A resampling procedure eliminates particles with small weights and replicates the particles with higher weights. Two sequential Monte Carlo algorithms are developed for mobility tracking: a Particle Filter (PF) and a Rao-Blackwellised PF (RBPF), both working with Received Signal Strength Indicators (RSSIs) [10]. The PF provides a reliable solution to the considered problem; however, faster and more accurate tracking is obtained with the RBPF. The RBPF analytically marginalises out some of the variables (the linear, Gaussian ones) from the joint posterior pdf. The linear part of the system model is then estimated by a Kalman filter, an optimal estimator, whilst the non-linear part is estimated by a PF. Fig. 2 presents the actual trajectory, the centres of the base stations, and the trajectories estimated by the PF and the RBPF. The speed and acceleration of the moving unit are shown in Fig. 3. Abrupt manoeuvres are followed by longer rectilinear motions.
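The prediction/update/resampling cycle described above can be sketched generically as follows; this is only an illustration of a bootstrap particle filter, not the RSSI-based filters of [10], and the model functions propagate and likelihood are assumed to be supplied by the user.

```python
import numpy as np

def particle_filter_step(particles, weights, z, propagate, likelihood, rng):
    """One cycle of a bootstrap particle filter.

    particles : (N, n_x) array of state samples
    weights   : (N,) normalised importance weights
    z         : new measurement
    propagate : function (x_prev, rng) -> x_next, a draw from the state model (with noise)
    likelihood: function (z, x) -> p(z | x)
    """
    # Prediction: move every particle through the state model, adding noise.
    particles = np.array([propagate(x, rng) for x in particles])

    # Update: re-weight each particle with the measurement likelihood.
    weights = weights * np.array([likelihood(z, x) for x in particles])
    weights = weights / weights.sum()

    # Resampling: discard low-weight particles, replicate high-weight ones.
    N = len(particles)
    idx = rng.choice(N, size=N, p=weights)
    particles, weights = particles[idx], np.full(N, 1.0 / N)

    # Point estimate: mean of the (equally weighted, after resampling) particles.
    estimate = particles.mean(axis=0)
    return particles, weights, estimate
```

A Rao-Blackwellised variant would apply this cycle only to the non-linear part of the state and update the linear, Gaussian part with a Kalman filter, as described above.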
Figure 2. Actual and estimated trajectories by the PF and the RBPF. Centres of the base stations are indicated as well.

Figure 3. a) actual speed of the moving object (speed [m/s] versus time [s]); b) actual acceleration of the object (acceleration [m/s²] versus time [s])
5. Localisation of Mobile Nodes

Location awareness in mobile sensor networks is another important issue for various applications, such as environmental monitoring with sensor nodes, target tracking, and robotics applications in military scenarios. A strategy for node localisation in a general framework is suggested in [11]. Three kinds of scenarios are addressed:

1. Nodes are static, while seed nodes (which know their own location, e.g., because they have GPS receivers) are moving. An example is a military application where nodes are dropped from a plane onto land, and transmitters attached to soldiers or animals in the area are used as moving seeds. Each node's location estimate should become more accurate as time passes and it receives information from more nodes (a minimal numerical illustration is given after this list).

2. Nodes are moving and seeds are static. One example would be nodes floating in the currents of a river and seeds at fixed locations on the river banks. For these scenarios, each node's location estimate will fluctuate around its current actual location: as time passes the previous location information becomes inaccurate since the node has moved, but as new seed information is received, the location estimate is revised.

3. Both nodes and seeds are moving. This is the most general situation. It applies to situations where nodes and seeds are deployed in an ad hoc way and move, either because of the environment they are in (wind, currents, etc.) or because they have actuators for motion.
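As a minimal numerical illustration of the first scenario, the sketch below refines the position of a static node from noisy range measurements to seed nodes with known positions, using a Gauss-Newton least-squares fit; this is only an assumed range-based setting, not the localisation scheme of [11], and all names and values are ours.

```python
import numpy as np

def estimate_position(seed_positions, ranges, iterations=20):
    """Least-squares estimate of a static node position from range
    measurements to seeds with known positions (Gauss-Newton iteration)."""
    seed_positions = np.asarray(seed_positions, float)
    ranges = np.asarray(ranges, float)
    x = seed_positions.mean(axis=0)          # initial guess: centroid of the seeds
    for _ in range(iterations):
        diffs = x - seed_positions           # (num_seeds, 2)
        dists = np.linalg.norm(diffs, axis=1)
        residuals = dists - ranges
        J = diffs / dists[:, None]           # Jacobian of the range model
        step, *_ = np.linalg.lstsq(J, residuals, rcond=None)
        x = x - step
    return x

seeds = np.array([[0.0, 0.0], [40.0, 5.0], [10.0, 30.0], [35.0, 35.0]])
true_node = np.array([20.0, 15.0])
measured = np.linalg.norm(seeds - true_node, axis=1) + np.random.normal(0, 0.5, 4)
print(estimate_position(seeds, measured))    # should be close to (20, 15)
```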
6. Conclusions and Open Issues for Future Research

This paper addresses some of the main challenges in the information management of centralised and decentralised sensor networks. The sensor network should provide relevant information in the light of the application considered and be able to cope with missing data, delays, limited bandwidth and energy constraints. The approaches taken to achieve localisation in sensor networks differ in their assumptions about the network deployment and the hardware capabilities, depending on the application. Typical measures of information are considered. We formulate the mobility tracking problem in wireless sensor networks and suggest a solution within a Bayesian framework. Finally, typical problems of localisation of mobile nodes are discussed.
References

[1] N. Xiong, P. Svensson, Multi-sensor Management for Information Fusion: Issues and Approaches, Information Fusion, 3(2): 163-186, 2002.
[2] L. Kaelbling, M. Littman, A. Cassandra, Planning and Acting in Partially Observable Stochastic Domains, Artificial Intelligence, 101(1-2): 99-134, 1998.
[3] R. Sutton, A. Barto, Reinforcement Learning: An Introduction, MIT Press, Cambridge, MA, 1998.
[4] Y. He, E. Chong, Sensor Scheduling for Target Tracking in Sensor Networks, Proc. 43rd IEEE Conf. on Decision and Control, 2004.
[5] H. Durrant-Whyte, B. Grocholsky, Management and Control in Decentralised Networks, Proc. of the 6th Conf. on Information Fusion, pp. 560-566, 2003.
[6] C. Shannon, A Mathematical Theory of Communication, I, II, The Bell System Technical Journal, 27: 623-656, 1948.
[7] C. Kreucher, K. Kastella, A. Hero, Sensor Management Using Relevance Feedback Learning, IEEE Trans. Signal Processing, 2003, submitted.
[8] Z. Zaidi, B. Mark, Real-time Mobility Tracking Algorithms for Cellular Networks Based on Kalman Filtering, IEEE Trans. on Mobile Computing, 2005.
[9] B. Ristic, S. Arulampalam, N. Gordon, Beyond the Kalman Filter: Particle Filters for Tracking Applications, Artech House, 2004.
[10] L. Mihaylova, D. Bull, D. Angelova, N. Canagarajah, Mobility Tracking in Cellular Networks with Sequential Monte Carlo Filters, Proc. of the Eighth Intl. Conf. on Information Fusion, USA, 2005.
[11] L. Hu, D. Evans, Localization in Mobile Sensor Networks, Proc. of the Tenth Annual Intl. Conf. on Mobile Computing and Networking, USA, 2004.
Advances and Challenges in Multisensor Data and Information Processing E. Lefebvre (Ed.) IOS Press, 2007 © 2007 IOS Press. All rights reserved.
A Novel Method for Correction of Distortions and Improvement of Information Content in Satellite-Acquired Multispectral Images

V.I. VOLOSHYN, V.M. KORCHINSKY, M.M. KHARYTONOV
State Company "Dniprocosmos", Dnipropetrovsk, 49008, post office box 798, Ukraine, E-mail: [email protected]
M.K. SUNDARESHAN
University of Arizona, Tucson, AZ 85721-0104, USA, E-mail: [email protected]

Abstract. A novel method for the correction of geometric and radiometric distortions in images acquired from satellites is described. Application of this method helps to increase the information content of digital multispectral images.

Keywords: multispectral images, data fusion, remote sensing.
Introduction

Remote sensing of the Earth using multispectral observation platforms has now become an important part of a number of diverse human activities, such as cartography, environmental monitoring, assessment of security schemes, resource management, etc. [1]. This underscores the importance of increasing the information content and improving the reliability of the information contained in digital images acquired from satellite-borne sensors. Images observed from space can be captured at different wavelengths of the electromagnetic spectrum. Although under ideal conditions the images formed in different spectral ranges are mutually consistent (while possibly containing unique information in different frames), under real measurement conditions they suffer from geometric and radiometric distortions that limit the information content, due mainly to the instability of positional parameters and non-ideal settings of the optic-mechanical scanners. Hence it becomes necessary to find ways of correcting these distortions before the acquired multispectral images can be used for analysis and efficient decision-making. Available methods for correcting the distortions involve projection techniques that do not take into account the physical and geometrical laws underlying the formation of the images, and consequently require a large amount of reference image data as well as considerable computational resources [2]. Furthermore, there are some natural limitations to the possible compensation due to the relative displacement between the spectral channels [3]. It is in this context that a novel method for the correction of geometric and radiometric distortions is presented in this article. The method proposed here makes essential use of the physical laws governing the image
formation mechanisms in different spectral regions and aims to significantly enhance the information content present in the multispectral imagery data. For a brief description of the methods employed in the present study: we perform image reconstruction employing one reference image selected within the spectral range under consideration. The reconstruction is performed using an information-geometrical model describing the physical process of image formation in the short-wave range of the electromagnetic spectrum (visible and near infrared). The image databases that we have used to validate our studies were obtained in the electro-optical range of the electromagnetic spectrum with satellites such as Sich-1M, Meteor-3M, IRS (LISS), SPOT, Landsat, Terra (ASTER), etc. However, the current methods have been studied most extensively with the Ukrainian users' satellite image databases, namely Sich-1M and Meteor-3M. The reconstruction of image pixel intensities is performed by arranging in ideal Projection Planes (PP) the locations needed for representation in the different spectral channels, in order to analyze and validate the available data. The model for the reconstruction of the multispectral images was developed from physical principles, for a scanning system with coordinate-sensitive photoreceivers (CSPR) [4].
1. Physical Model for the Formation of Multispectral Images

A physics-based model for the formation of multispectral raster images with CSPR scanners in the electro-optical and infrared ranges can be developed by taking into account the tuning of the optic-mechanical scanner scheme. Using basic optical principles, the correspondence between points on the Earth's surface and the scanner Sensitivity Elements (SE) can be established. Each SE records the electromagnetic radiation from a site on the Earth's surface, namely the projection of the corresponding SE cell, where the centre of projection is situated at the optical centre of the objective lens. Figure 1 depicts a commonly accepted model describing the geometric relations for one pixel in an individual spectral channel.
Figure 1. Geometric relations underlying development of multi-spectral raster image formation model
It is well known that the relation between the spectral amplitude u(R) at a selected point with radius-vector R in the geodesic coordinate system and the source field U_0(x) is described by the Kirchhoff equation. In the special case when the source lies on the plane Z = 0, this equation can be written as

u(R) = \frac{1}{j\lambda} \int_{(\Omega)} U_0(x)\, \frac{\partial}{\partial Z}\!\left( \frac{e^{jkR}}{R} \right) dx ,    (1)

where U_0(x) denotes the spectral amplitude of the source radiation field, \lambda denotes the wavelength of the radiation, k = 2\pi/\lambda is the wave number, and R = |\mathbf{R}| denotes the modulus of the radius vector. For the particular case when kR \gg 1, which corresponds to photogrammetry in electro-optic and infrared images, the above equation can be rewritten as

u(R) = \frac{Z}{j\lambda} \int_{(\Omega)} U_0(x)\, \frac{\exp(jkR)}{R^2}\, dx ,    (2)

where Z denotes the vertical component of the vector R. In the Fresnel wave zone (corresponding to distances from the source to the radiation receiving point that are large compared with the wavelength), this equation further reduces to

u(R) = \frac{Z}{j\lambda q^2} \int_{(\Omega)} U_0(x)\, \exp\!\left[ jk\, \frac{r_A^2 + x^2 - 2 r_A x}{2q} \right] dx ,    (3)
where r_A is the horizontal projection of the vector R and q denotes the distance from point x to point R. Two coordinate systems should now be introduced to describe the intensity of radiation received by the scanner. The geodesic system OXYZ has its origin at the pixel projection center on the earth surface (with the OZ axis oriented along the image plane), whereas the image coordinate system oxyz has its origin at the perpendicular projection of the objective-lens optic center onto the image plane (with the coordinate plane oxy coinciding with the image plane). These coordinate systems are presented in Figure 1. Hence, using Equation (3), the radiation intensity I(r) at a point in the image plane can be related to the brightness I_0(x) of the corresponding point on the object surface by

I(r) = \frac{1}{\lambda^4 d^2 H^2} \int_{(\Omega)} I_0(x)\, F^2\!\left( \frac{M r}{d} - \frac{x}{H} - n_0 + m_{3,3}\, n_0 \right) dx ,    (4)
where the integration is performed over the projection plane for each separate SE cell. A few of the terms appearing in the above equation need some description. I_0(x), as described earlier, denotes the brightness of the element on the object surface aligned with the radius-vector x; n_0 is the normalized vector of the image plane directed toward the object plane; and F(y) is the function of the image-forming system,

F(y) = \frac{2\pi A\, J_1(A y)}{y} ,    (5)

where J_1(x) is the first-order Bessel function, r is the radius vector of the point on the image plane in the coordinate system oxyz, and M denotes the transfer matrix from the OXYZ coordinate system to the oxyz coordinate system, with components
M = \begin{pmatrix}
\cos\alpha\cos\beta\cos\gamma - \sin\alpha\sin\gamma & -\cos\alpha\cos\beta\sin\gamma - \sin\alpha\cos\gamma & \cos\alpha\sin\beta \\
\sin\alpha\cos\beta\cos\gamma + \cos\alpha\sin\gamma & -\sin\alpha\cos\beta\sin\gamma + \cos\alpha\cos\gamma & \sin\alpha\sin\beta \\
-\sin\beta\cos\gamma & \sin\beta\sin\gamma & \cos\beta
\end{pmatrix} .
It must be noted that \alpha, \beta, \gamma denote Euler angles. For the ideal tuning of the optic-mechanic scanner system, Equation (4) takes the form

I(r) = \frac{1}{\lambda^4 d^2 H^2} \int_{(\Omega)} I_0(x)\, F^2\!\left( \frac{r}{d} - \frac{x}{H} \right) dx .    (6)

A careful analysis of Equations (4) and (6) shows that the images in different spectral channels are displaced both with respect to each other and with respect to the image that would be obtained under ideal tuning conditions. The size of the displacement is given by

\Delta r = \frac{M r}{d} - \frac{r}{d} - n_0 + m_{3,3}\, n_0 .    (7)
It is to be noted that these images are obtained from the corresponding image plane angles and their determination does not depend on the spectral range. The scanner SE cell integrates the radiation intensity over the corresponding area of the object. Therefore, treating the formed image as intensities on a 2D grid, the brightness of the element located in the n-th row and m-th column is obtained from

I_{n,m} = \frac{1}{\lambda^4 d^2 H^2} \int_{(\Omega)} I^{(0)}_{n,m}(x)\, R(x)\, dx ,    (8)

where I^{(0)}_{n,m}(x) denotes the central projection of the brightness of the corresponding (n, m) cell on the earth surface. The function R(x) in the integrand of Equation (8) is given by

R(x) = \int_{(S)} F^2\!\left( \frac{M r}{d} - \frac{x}{H} - n_0 + m_{3,3}\, n_0 \right) dr ,    (9)

where the integration is performed over the square of each cell. This gives the mathematical model for the formation of digital images from CSPR-equipped scanners in the multispectral regime. It also explains the origin of the distortions in these images. It may be noted that the various projection distortions caused by the optic-mechanic scanner tuning lead to displacements at the subpixel level.
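The following short sketch (added here for illustration; it is not part of the original paper) shows how a transfer matrix M and the inter-channel displacement of Equation (7) can be evaluated numerically for given Euler angles. The Z-Y-Z angle convention, the sign conventions, and all numeric values are assumptions made only for the example.

    import numpy as np

    def transfer_matrix(alpha, beta, gamma):
        # Euler-angle transfer matrix M from OXYZ to oxyz (Z-Y-Z convention assumed).
        ca, sa = np.cos(alpha), np.sin(alpha)
        cb, sb = np.cos(beta), np.sin(beta)
        cg, sg = np.cos(gamma), np.sin(gamma)
        return np.array([
            [ca * cb * cg - sa * sg, -ca * cb * sg - sa * cg, ca * sb],
            [sa * cb * cg + ca * sg, -sa * cb * sg + ca * cg, sa * sb],
            [-sb * cg,               sb * sg,                 cb],
        ])

    def channel_displacement(M, r, d, n0):
        # Displacement of Equation (7) between the distorted and ideally tuned
        # projections of an image-plane point r, for distance d and plane normal n0.
        return M @ r / d - r / d - n0 + M[2, 2] * n0

    # Hypothetical example: a tuning error of 0.1 degree about each axis.
    M = transfer_matrix(*np.radians([0.1, 0.1, 0.1]))
    r = np.array([5.0, 3.0, 0.0])       # image-plane point (arbitrary units)
    n0 = np.array([0.0, 0.0, 1.0])      # image-plane normal
    print(channel_displacement(M, r, d=150.0, n0=n0))

With realistic tuning errors the resulting displacement is small compared with the pixel size, which is consistent with the subpixel-level distortions noted above.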
2. A Novel Method for Distortion Correction and High Resolution Reconstruction

Due to construction peculiarities of the satellite and the sensor, the intensity distribution captured on the image plane suffers distortions due to mixing of the images collected along the different spectral channels. Consequently, it is necessary to correct for these distortions and reconstruct a high-resolution frame for use in image exploitation operations. A method for the correction of the distortions is to select the intensity distribution of one channel as the basic image (BI) and to reconstruct the intensity distributions of the other channels by image processing operations. In this case, the multispectral image can be formed under ideal conditions by taking as BI the image with maximum signal entropy and relating the intensity distributions f_i(r) and f_j(r) on spectral channels i and j through their normalized cross-correlation

F_{i,j}(x) = \frac{1}{K_{i,j}} \int_{-\infty}^{\infty} f_i(r)\, f_j\!\big( A_{i,j}(x)\, r \big)\, dr ,    (10)

where K_{i,j} = \sqrt{ \int_{-\infty}^{\infty} f_i^2(r)\, dr \int_{-\infty}^{\infty} f_j^2(r)\, dr }, and A_{i,j}(x) denotes a geometric transformation matrix with argument x = (a_x, a_y, b_x, b_y, \varphi). The elements of the vector x are the distortion parameters: the displacements (a_x, a_y) along the spatial coordinates (i.e., along the Ox and Oy axes), the rotation-center coordinates (b_x, b_y), and the rotation angle \varphi. It may be noted that the correlation attains its maximum value when the two intensity distributions are most similar. Hence the distortion parameters are computed by solving the optimization problem of maximizing F_{i,j}(x). Thus, denoting the selected basic image BI as channel 0, the optimization problem is solved for each of the other channels by maximizing F_{0,i}(x), where i denotes the channel number of the corrected spectral channel.
Once the optimization problem is solved and the distortion parameters a_x^i, a_y^i, b_x^i, b_y^i, \varphi_i have been computed, the correction procedure is applied to the intensity distribution f_i(r), where r is the image-plane (IP) radius-vector with components

x' = (x - b_x^i)\cos\varphi_i - (y - b_y^i)\sin\varphi_i + a_x^i ,
y' = (x - b_x^i)\sin\varphi_i + (y - b_y^i)\cos\varphi_i + a_y^i .    (11)
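As a concrete illustration of Equations (10) and (11), the sketch below (not the authors' implementation) estimates the five distortion parameters of one spectral channel relative to the basic image by maximizing a discrete normalized cross-correlation. The Nelder-Mead optimizer and bilinear resampling are stand-ins chosen only for the example.

    import numpy as np
    from scipy.ndimage import map_coordinates
    from scipy.optimize import minimize

    def warp(img, p):
        # Resample an image under the similarity transform of Equation (11);
        # p = (a_x, a_y, b_x, b_y, phi). Bilinear interpolation.
        ax, ay, bx, by, phi = p
        yy, xx = np.mgrid[0:img.shape[0], 0:img.shape[1]].astype(float)
        c, s = np.cos(phi), np.sin(phi)
        xs = (xx - bx) * c - (yy - by) * s + ax
        ys = (xx - bx) * s + (yy - by) * c + ay
        return map_coordinates(img, [ys, xs], order=1, mode='nearest')

    def ncc(a, b):
        # Discrete counterpart of the normalized cross-correlation of Equation (10).
        return np.sum(a * b) / np.sqrt(np.sum(a ** 2) * np.sum(b ** 2))

    def estimate_distortion(basic, channel, p0=(0.0, 0.0, 0.0, 0.0, 0.0)):
        # Maximize the correlation between the basic image and the warped channel.
        cost = lambda p: -ncc(basic, warp(channel, p))
        return minimize(cost, p0, method='Nelder-Mead').x

The returned parameter vector can then be used directly in the correction step described above.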
The transformation of the corrected continuous intensity distribution into raster form is performed in two steps:
1. First step: 2D discretization of f_i(r) on the uneven grid of the initial raster image;
2. Second step: non-uniform quantization to M levels, applying Lloyd's algorithm to minimize the quantization error.
It is worth noting that the quantitative index of the effectiveness of the correction of image Y based on the basic image X is analogous to the conditional entropy in information theory, which is determined by

E(Y|X) = -\sum_{i=0}^{M-1} \left[ p(x_i) \sum_{j=0}^{M-1} p(y_j | x_i) \log_2 p(y_j | x_i) \right] ,    (12)

p(x_i) = \frac{N_i(X)}{\sum_{k=0}^{M-1} N_k(X)} ; \qquad p(y_j | x_i) = \frac{N_{j|i}(Y)}{\sum_{k=0}^{M-1} N_{k|i}(Y)} ,

where N_{j|i}(Y) is the number of pixels of image Y with intensity level j located at the places where image X has intensity level i. E(Y|X) is an estimate of the additional information of image Y when image X is used.
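A minimal numerical sketch of the index of Equation (12) follows (it is added here for illustration and is not part of the original paper); it assumes that both images have already been quantized to integer intensity levels 0..M-1.

    import numpy as np

    def additional_information(x, y, levels):
        # E(Y|X) of Equation (12) for two images quantized to 0..levels-1.
        joint, _, _ = np.histogram2d(x.ravel(), y.ravel(), bins=levels,
                                     range=[[0, levels], [0, levels]])
        joint /= joint.sum()                 # joint probabilities p(x_i, y_j)
        px = joint.sum(axis=1)               # marginal p(x_i)
        e = 0.0
        for i in range(levels):
            if px[i] == 0.0:
                continue
            p_cond = joint[i] / px[i]        # conditional p(y_j | x_i)
            nz = p_cond > 0
            e -= px[i] * np.sum(p_cond[nz] * np.log2(p_cond[nz]))
        return e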
The mathematical background is image creation based on localized wavelet operations; the general approach is to increase the spatial resolution of the multispectral digital images:
3. Wavelet decomposition: selection of the approximating and detail elements along the vertical, horizontal and diagonal directions for the spatial components of the panchromatic and multispectral images.
4. Synthetic multispectral image forming, based on the approximating component of the primary multispectral image and the detail components of the primary panchromatic image.
The initial image (intensity function f(r)) is reconstructed after J stages of decomposition as (15)

f(r) = f_{A_J}(r) + \sum_{j=1}^{J} \left[ f_{H_j}(r) + f_{V_j}(r) + f_{D_j}(r) \right] .
The wavelet decomposition scheme is shown in Figure 2.
Figure 2. Wavelet Decomposition Scheme
The approximating component is (16)

f_{A_j}(r) = 2^{-j} \sum_{n=1}^{N} \sum_{m=1}^{M} a_{2^j}(n,m)\, \varphi_{2^j}(x-n)\, \varphi_{2^j}(y-m) ,

and the detail components are (17)

f_{H_j}(r) = 2^{1-0.5j} \sum_{n=1}^{N} \sum_{m=1}^{M} h_{2^j}(n,m)\, \psi_{2^j}(x-n)\, \varphi_{2^j}(y-m) ,

f_{V_j}(r) = 2^{1-0.5j} \sum_{n=1}^{N} \sum_{m=1}^{M} v_{2^j}(n,m)\, \varphi_{2^j}(x-n)\, \psi_{2^j}(y-m) ,

f_{D_j}(r) = 2^{-j} \sum_{n=1}^{N} \sum_{m=1}^{M} d_{2^j}(n,m)\, \psi_{2^j}(x-n)\, \psi_{2^j}(y-m) .
The decomposition level and coefficients are obtained by maximum likelihood estimation methods. A block diagram of the overall method is shown in Figure 3.
Figure 3. Block scheme of the algorithm for forming the composition component
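The sketch below (an added illustration, not the authors' implementation) shows the synthetic-image forming step using the PyWavelets package: the approximation of the multispectral band is kept and the detail coefficients are taken from the panchromatic image, after which the result is reconstructed as in the equation above. The wavelet ('db2') and the decomposition depth are arbitrary assumptions, and the two input images are assumed to have the same size.

    import pywt

    def wavelet_fusion(ms_band, pan, wavelet='db2', levels=2):
        # Keep the approximation of the multispectral band and replace its detail
        # coefficients with those of the panchromatic image, then reconstruct.
        ms_coeffs = pywt.wavedec2(ms_band, wavelet, level=levels)
        pan_coeffs = pywt.wavedec2(pan, wavelet, level=levels)
        fused = [ms_coeffs[0]] + list(pan_coeffs[1:])
        return pywt.waverec2(fused, wavelet)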
3. Discussion

For imagery acquired by satellite-borne sensors, one should also be concerned with some additional sources of degradation, chief among them being platform vibrations and certain physical limitations of the sensors themselves (such as diffraction limits), which in turn lead to distorted scene data appearing in the captured image frames. Specific algorithms and procedures for the restoration and super-resolution of digital imagery data were developed during the past two decades in the USA [5, 6]. This work has resulted in a number of powerful algorithms derived by following distinct analytical approaches, such as statistical optimisation (Bayesian estimation methods), set-theoretic reconstruction (projection-onto-convex-sets methods), and image registration and fusion techniques. Of particular significance to our joint studies involving remotely sensed satellite imagery data is the impressive capability of these algorithms in correcting for degradations caused by vibrations and diffraction-limit constraints. Image fusion involves the synergistic use of information contained in different image frames in order to obtain a better understanding of the scene under surveillance [7]. In the case of remotely sensed data, since images acquired from different multispectral sensors or by the same sensor in multiple acquisition contexts (different passes of the satellite, for instance) are available, a careful fusion of the images obtained in the different spectral bands can provide a more consistent and reliable interpretation of the scene. An important pre-processing step in efficiently combining different image frames is image registration, which involves correcting for the pixel and sub-pixel shifts and bringing the different images to be combined into a single reference frame. Thus, the main goal of the joint collaboration that we will attempt to establish is to develop a sophisticated image processing architecture that integrates the various component algorithms and to obtain efficient realizations (both software and hardware) for real-time implementations, realizing information enhancement in satellite-acquired multispectral imagery data.
4. Conclusions

A method for the radiometric correction of multispectral images that ensures enhancement of their information content and spatial resolution was presented in this paper. The method offers correction in the case of point displacements between different spectral channels caused by diffraction effects and by the spatial coordinate-susceptibility of the photoreceivers, which produce positioning instability during sensing. As applications of this method, case studies on the selection of potentially dangerous landslide areas and on the classification of land cover elements have also been developed.
References
1. R. A. Schowengerdt, Remote Sensing – Models and Methods for Image Processing, Academic Press, 1997.
2. Buzovsky O.V., Boldak A.A., Mohamed Rumy M.H., Computational Images Processing, Kiev: Korniychuk, 2001, 180 p. (in Russian).
3. Voloshyn V.I., Korchinsky V.M., "Multispectral projective images geometric forms reconstruction," Applied Geometry and Engineering Graphics, Tavriysky State Agrotechnical Academy, No. 4, Vol. 19, Melitopol: TSAA, 2003, pp. 40-44 (in Ukrainian).
4. Gonyn G.B., Earth Remote Sensing, Leningrad: Nedra, 1989, 380 p. (in Russian).
5. Sundareshan M. K. and Bhattacharjee S., "Enhanced iterative processing algorithms for restoration and super-resolution of tactical sensor imagery," Optical Engineering, Vol. 43, pp. 199-208, January 2004.
6. M. K. Sundareshan, S. Bhattacharjee, R. Inampudi, and H. Y. Pang, "Image preprocessing for improving computational efficiency in implementation of restoration and super-resolution algorithms," Applied Optics (IP), Vol. 41, pp. 7464-7474, 2002.
7. C. Pohl and J. L. Van Genderen, "Multi-sensor image fusion in remote sensing: concepts, methods, and applications," International Journal of Remote Sensing, Vol. 19, pp. 65-78, 1998.
Multisensor Data Fusion in the Processes of Weighing and Classification of the Moving Vehicles Janusz GAJDA1, Ryszard SROKA1, Tadeusz ZEGLEN1 AGH – University of Science and Technology, Department of Measurement and Instrumentation, Al. Mickiewicza 30, 30-059 Cracow, Poland
Abstract. The application of data fusion to the problems of traffic parameter measurement, especially the weighing and classification of moving vehicles, is presented in this paper. The benefits and relevant advantages of applying data fusion are demonstrated through examples. A characteristic feature of these measurements is that they must be carried out in real time on moving vehicles. Keywords. Data Fusion, Vehicle Classification, Weigh-In-Motion
Introduction Cognitive processing includes information processing. Information can be acquired from various sources through observation measurements. The quality of the cognitive process is dependent on the amount and quality of the information collected from the object measurements, resources of the a priori knowledge about this object, and the quality of the processing. Independent of the object measurement and the purpose of the cognitive process is the basic principle saying that the richer and more complete the information gained from the object, the more reliable are the effects of the cognitive process. Enrichment of measurement gained from the object can be achieved not only by increasing the measuring accuracy but also through measuring a greater number of appropriately selected object variables. At the stage of information processing, it is possible to join the measurement (experiment) knowledge with the a priori knowledge, which may considerably increase the effectiveness of the cognitive process. Combining knowledge from various sources is called data fusion. Depending on the type of information and the structure of the system in which the fusion takes place, it can be performed at the data, feature, or decision level. Depending on its purpose, the fusion can be cooperative, competitive or complementary [1][2][3]. These problems are presented in the case of measuring the parameters of moving vehicles.
1. Vehicle Classification Classifying an automotive vehicle means determining to which of the selected classes the vehicle belongs. Classification methods are dependent on the vehicle parameters that can be determined in a given measuring system and on the classification purpose.
The simplest classification method, often used in practice, is based on measuring the vehicle length (i.e., three classes). The method can be applied in a very simple measuring system, e.g., in a single-sensor system with an inductive loop utilizing only the signal of vehicle occurrence. When the necessity of defining more (four or five) classes arises, it is possible to use a system with an inductive sensor and process the obtained magnetic profile of the vehicle. The profiles generated by different vehicles differ in shape, amplitude, frequency spectrum, statistical parameters, etc. One method of the magnetic profile preprocessing consists of transforming the profile into the vehicle length domain [4]. This operation results in the profile containing combined information on the shape and length of the primary profile, which enables more selective classification to be made (Figure 1). To carry out such transformation, information on the vehicle speed is also necessary. This transformation is therefore an example of data fusion. Also, amplitude standardization can be included in this way, gaining additional information on the vehicle suspension height.
Figure 1. Bus magnetic profiles in the vehicle length domain
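A small sketch of the length-domain transformation described above follows (added here for illustration; it is not the authors' code): given the measured vehicle speed, the time-sampled magnetic profile is resampled onto a uniform vehicle-length axis and its amplitude is normalized. The sampling rate and grid size are arbitrary assumptions.

    import numpy as np

    def profile_to_length_domain(profile, fs, speed, n_points=200):
        # Resample a magnetic profile sampled in time (sampling rate fs, in Hz)
        # onto a uniform vehicle-length axis using the measured speed (m/s),
        # and normalize the amplitude.
        t = np.arange(len(profile)) / fs
        length = speed * t                            # position along the vehicle [m]
        grid = np.linspace(0.0, length[-1], n_points)
        resampled = np.interp(grid, length, profile)
        return grid, resampled / np.max(np.abs(resampled))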
Nonparametric classification methods consist of directly comparing the entire profile generated by the vehicle being classified (after transformation) with the reference profiles representing each of the defined classes. Depending on the vehicle class, the effectiveness of such classification for our example can range from 67% to 100%. Parametric methods consist of comparing the profile parameters (mean and standard deviation) of the vehicle being classified to a reference vehicle. The effectiveness of the classification based on individual profile parameters is unsatisfactory [5] (depending on the selected parameter, the effectiveness gained is 60%–70% for our example of one of the classes and considerably worse for the others). Combined utilization of various parameters can be more effective: the classification effectiveness in all classes under consideration is then increased and equalized [5], [6]. Such action is called decision fusion. It can be implemented based on voting or weighted voting methods, or hierarchical methods. Depending on the class, the classification effectiveness of the voting methods in our example is in the interval of 50% to 97%. The classification effectiveness of the hierarchical methods in our example ranges from 77% to 96%. Describing reference data with fuzzy measures is another approach to the classification problem. A model of any class consists of a set of membership functions (similarity measures) defined for selected parameters. The membership functions are determined using statistical analysis (mean and standard deviation) of a selected parameter. Both simple logical functions operating on fuzzy sets (OR and AND, fuzzy
set normalized power), and more complex functions enabling weighting coefficients to be taken into account are selected as functions realizing data fusion [7], [8]. In this case, the classification effectiveness depends on the set of selected parameters, the adopted shape of the fuzzy measures (triangular, Gaussian), and the functions realizing data fusion. The method's classification effectiveness in our study reaches 92–94% for five selected magnetic profile parameters and four defined vehicle classes (a small illustrative sketch of this scheme is given after Figure 2). It is also possible to join parametric and nonparametric (NP) methods. In [9] an algorithm is presented which utilizes a neural network (NP approach) to fuse the features obtained from the profile and the samples of this profile. For our study, 89% classification effectiveness was reached for five defined vehicle classes. The vehicle classification can also be effected by measuring the number of vehicle axles. This is particularly important in vehicle weighing systems. To decide on exceeding the allowable load values, it is necessary to combine the obtained result of weighing and the result of classification. However, such classification may not be selective enough. It is then necessary to measure the inter-axle distances. Taking this parameter into account considerably improves the classification selectiveness, although it requires vehicle speed to be measured. In such a system vehicle length (treated as a parameter auxiliary to the classification process) is also measured and a trailer is detected. So the measuring system must cooperate with different types of sensors and realize the fusion process (complementary fusion - increasing the completeness of object description) of the data acquired from these sensors and the possessed a priori knowledge. This type of classification allows 13–14 vehicle classes to be defined [10]. Complementary fusion can also be realized with a single detector such as an inductive loop only. Depending on the shape and size of the loop, the range of the electromagnetic field generated by the loop will be different. The resulting measurement signal (magnetic profile) will contain different information on the vehicle that moved above this loop (Figure 2).
Figure 2. Passenger vehicle profile acquired from sensors of a) 150 cm, b) 10cm width
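A minimal sketch of the fuzzy-measure classification described above is given here (an illustration only, not the authors' code): each class is modelled by Gaussian membership functions built from the mean and standard deviation of each selected profile parameter, and the memberships are fused with a simple fuzzy AND (minimum). The parameter names, class models, and numbers are hypothetical.

    import numpy as np

    # Hypothetical class models: {class: {parameter: (mean, std)}}.
    CLASS_MODELS = {
        'car':   {'length': (4.2, 0.5), 'mean_amp': (0.45, 0.10)},
        'van':   {'length': (5.5, 0.6), 'mean_amp': (0.60, 0.12)},
        'truck': {'length': (9.0, 1.5), 'mean_amp': (0.80, 0.15)},
    }

    def membership(value, mean, std):
        # Gaussian similarity measure of a parameter value to a class model.
        return float(np.exp(-0.5 * ((value - mean) / std) ** 2))

    def classify(features):
        # Fuse the per-parameter memberships with a fuzzy AND (minimum) and
        # return the class with the highest fused membership.
        scores = {cls: min(membership(features[p], m, s)
                           for p, (m, s) in model.items())
                  for cls, model in CLASS_MODELS.items()}
        return max(scores, key=scores.get), scores

    print(classify({'length': 5.3, 'mean_amp': 0.58}))

Replacing the minimum by a product or a weighted combination corresponds to the more complex fusion functions mentioned in the text.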
The inductive loop enables vehicle detection and axle counting to be carried out. The mechanism of complementary fusion realized at the sensor level can be explained by the fact that the measuring sensors react simultaneously to many different physical quantities. This makes it possible to collect a greater amount of information on the object with a single sensor, provided that the user can extract from the measurement signal the information relevant to each measured quantity or can utilize the combined information.
2. Weigh in Motion (WIM) The term weigh-in-Motion (WIM) means a process of measuring the dynamic wheel forces of a moving vehicle on the road and estimating the corresponding static loads or gross weight. The lack of significant limitations posed on the vehicle speed is a characteristic feature of such weighing systems. In general, the WIM systems complement the static vehicle weighing stations, playing the role of preselection systems. Classic WIM preselection systems are based on an inductive sensor and two load sensors. Such a system configuration allows static forces of individual axles to be estimated and a vehicle based on the number of its axles to be classified. Also, the pavement temperature is measured as this is necessary for correction of weighing results, which depend on the thermal and mechanical properties of the pavement and sensors. In such a system, the data fusion is realized to complete the description of the object under consideration and to ensure the highest possible measuring accuracy. The High Speed WIM preselection systems provided with two load sensors can determine the gross weight of a moving vehicle with an error rate of not less than 10–15%. The main reason is the occurrence of the dynamic component in the signal of vehicle load on the road surface (Figure 3). The amplitude of this component depends on the pavement quality and vehicle speed and may achieve even up to 40% of the static load value [11].
Figure 3. Relative changes of the vehicle instantaneous axle load at the speed of 100 km/h
Weighing vehicles with an accuracy of 1% or 2% is now possible only with static or low-speed (up to 6 km/h) scales. Improving the measuring accuracy of the gross weight and static axle loads of vehicles moving at road speed, up to the accuracy of low-speed scales, is possible by building multi-sensor weighing systems (MS-WIM), developing static load estimation algorithms, and applying suitable methods for calibrating such systems. Based on the analysis of pavement models [11] and selected models of vehicle suspension, it was found that the following relationship is a good approximation of the signal of the force the vehicle wheels exert on the pavement during vehicle motion:

F(t) = F_0 + \sum_{k=1}^{M} F_k \sin(2\pi f_k t + \varphi_k) ,    (1)
where F_0 is the static load exerted on the road by the stationary vehicle, \sum_{k=1}^{M}[\cdot] represents the dynamic components occurring during vehicle motion, and F_k, f_k, \varphi_k are the parameters of the dynamic load components: amplitude, frequency and phase angle, respectively. Depending on the required modelling accuracy and the suspension construction of the vehicle under consideration, different numbers M of dynamic components (usually M = 1 or M = 2) [12] are defined in the model. Frequencies f_1 and f_2 in this model describe the vertical oscillation of the mass of the suspended vehicle and wheel hopping (oscillations of the unsuspended mass), respectively. The amplitudes of the individual dynamic components depend significantly on vehicle speed. To solve the problem of estimating the axle static load F_0, the following estimators are used [10]: the mean value, usually calculated from the results of instantaneous load on successive sensors (Mean); the maximum likelihood estimate (ML); the nonlinear least-squares estimate (NLS); the nonlinear Kalman filter (NKF); the modified nonlinear least-squares estimate (MNLS); and artificial neural networks (usually of the back-propagation type) [13, 14]. To assess and compare the estimators, the following characteristic was applied:

Pr(\delta) = 1 - P(\delta) ,    (2)

where \delta = |\hat{F}_0 - F_0| / F_0 is the absolute value of the relative estimation error of the
static component F_0, \hat{F}_0 is an estimate of the static component, and P(\delta) is the cumulative distribution function of the error \delta. This characteristic specifies the occurrence probability of an error greater than \delta. Measuring systems with 16 pressure sensors distributed uniformly every 1.7 m, and non-uniformly with the distance between successive sensors decreasing linearly, were considered. The distance of 1.7 m between the first sensors was decreased for each sensor by 0.1 m. In Figure 4, typical characteristics allowing the described estimators to be compared are presented. The architecture of the MS-WIM systems is usually organized in such a way that successive pairs of load sensors are operated by systems that are similar in their operation to the preselection systems. Their task is to preprocess signals and make corrective actions. After processing is over (in an asynchronous way), each of these systems sends its measurement results to a central processor. The data received by a host system must be properly associated because more than one vehicle may be present on the measurement site. Next, the data must be aligned with regard to the vehicle occurrence time at the successive sensors (this is required by the load estimation algorithms). Both stages are an important element of the data fusion process realized at the central level. After these initial operations are done and the non-exceedance of imposed limitations (e.g., the variability of vehicle speed during travelling through the measuring site, data completeness) is checked, measurement data processing can be carried out according to the relevant estimation algorithm (whose on-line selection will depend on vehicle speed, vehicle classification, etc.). In such a multi-sensor data fusion process, initial knowledge plays an important role and is taken into account by applying the appropriate number of load sensors and their optimum arrangement. This knowledge is acquired from the experience of other
constructors, model studies, and the resulting parameters of acquisition and signal preprocessing, limitations of various types, etc.
Figure 4. Characteristics (2) for M=2 and three comparable static load F0 estimation algorithms for: a) uniform; b) non-uniform distribution of sensors
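The sketch below (added as an illustration under simplifying assumptions, not the authors' implementation) shows a nonlinear least-squares (NLS) estimate of the static load F0 from the instantaneous loads registered by the successive sensors of an MS-WIM site: the samples are fitted to model (1) with a single dynamic component (M = 1), using the sensor positions and the measured vehicle speed to obtain the passage times. Units and the initial guess are assumptions.

    import numpy as np
    from scipy.optimize import curve_fit

    def axle_load_model(t, f0, f1, freq1, phi1):
        # Model (1) with a single dynamic component (M = 1).
        return f0 + f1 * np.sin(2.0 * np.pi * freq1 * t + phi1)

    def estimate_static_load(sensor_positions, loads, speed):
        # sensor_positions: sensor locations along the site [m];
        # loads: instantaneous axle loads registered at those sensors;
        # speed: vehicle speed [m/s].
        t = np.asarray(sensor_positions) / speed       # axle passage times
        p0 = [np.mean(loads), 0.1 * np.mean(loads), 3.0, 0.0]
        params, _ = curve_fit(axle_load_model, t, loads, p0=p0)
        return params[0]                                # estimate of F0

    # Example with synthetic data: 16 sensors spaced 1.7 m, vehicle at 20 m/s.
    positions = 1.7 * np.arange(16)
    loads = axle_load_model(positions / 20.0, 100.0, 15.0, 2.5, 0.3)
    print(estimate_static_load(positions, loads, speed=20.0))   # close to 100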
The number of load sensors applied is usually limited by economic reasons, although in [13] it has been shown that this number can be limited because of the properties of the applied estimation method and the lack of its accuracy improvement after exceeding a certain number of sensors. Developing and implementing adequate system calibration algorithms and procedures is a very important problem. Nowadays, many calibration methods exist [15, 16]. However, because of the nonstationarity of WIM systems, a new method turns out to be promising, one consisting of data fusion where the current measurement results in the WIM system are combined with a priori knowledge about the axle loads of a selected class of vehicles moving along a given road and adopted as the reference vehicles. It is therefore necessary to apply a very effective classification method. A characteristic feature of the proposed autocalibration method is determining the current estimate of the system constant after the passing of any vehicle recognized as the reference one. Thanks to this, the system acquires the possibility to react automatically to the changes of many factors affecting its operation, such as temperature variation, humidity, sensor sensitivity, etc. Data fusion in the MS-WIM systems therefore includes many more elements than the classic preselection systems.
3. Summary Selected problems of applying data fusion to measuring vehicle-in-motion parameters were presented in this paper. The approach was twofold: increasing the description completeness of the object undergoing measurement and achieving the highest possible measurement accuracy of object parameters, or classification effectiveness. These goals were realized through the adequate selection of the number and type of sensors, their construction parameters, systems of their cooperation, and also through initial operations on signals (e.g., by transforming into the vehicle length domain, parameterisation, association, alignment, etc.). No less important was the selection of vehicle classification method (parametric, nonparametric) and vehicle axle load estimation and calibration methods. With regard to the measurements presented, data fusion was applied at the level of data, features, and decisions. Examples of complementary and cooperative data fusion
were presented. It was shown that in each case the quality or quantity of the results was better than if no data fusion had been applied.
References
[1] Hall D.: Mathematical Techniques in Multisensor Data Fusion, Artech House, London 1992.
[2] Klein L.A.: Sensor and Data Fusion Concepts and Applications, SPIE, Washington 1999.
[3] Klein L.A.: Sensor Technologies and Data Requirements for ITS, Artech House, London 2001.
[4] Gajda J., Stencel M.: "Determination of road vehicle types using an inductive loop detector," Proceedings of the XIV IMEKO Congress, Tampere 1997, pp. 231-236.
[5] Gajda J., Sroka R.: "Vehicle classification by parametric identification of the measured signals," Proceedings of the XVI IMEKO World Congress, Vienna 2000, vol. IX, pp. 199-204.
[6] Gajda J., Sroka R., Stencel M., Wajda A., Zeglen T.: "A vehicle classification based on inductive loop detectors," Proceedings of the 18th IEEE IMTC, Budapest 2001, pp. 460-464.
[7] Sroka R.: "Data Fusion Methods Based on Fuzzy Measures in Vehicle Classification Process," Proceedings of the 21st IEEE IMTC, Como 2004, vol. 3, pp. 2234-2239.
[8] Karlsson B.: "Fuzzy Measures for Sensor Data Fusion in Industrial Recycling," Measurement Science and Technology, Vol. 9, 1998, pp. 907-912.
[9] Ki Y.K., Baik D.K.: "Vehicle Classification Algorithm for Loop Detectors using Neural Network," IEEE Transactions on Vehicular Technology, in press.
[10] Weigh-in-Motion of Road Vehicles, Final Report of the COST 323 action, Ver. 3.0, 1999.
[11] Cebon D.: Handbook of Vehicle-Road Interaction, Swets & Zeitlinger B.V., Lisse, the Netherlands 1999.
[12] Cebon D.: "Design of multiple-sensor weigh-in-motion systems," Journal of Automobile Engineering, Proc. I. Mech. E., 204, 1990, pp. 133-144.
[13] Mangeas M., Glaser S., Dolcemascolo V.: "Neural networks estimation of truck static weights by fusing weight-in-motion data," Proceedings of Eurofusion 2000.
[14] Gonzales A., Papagiannakis A.T., O'Brien E.: Evaluation of an Artificial Neural Network Technique Applied to Multiple Sensor Weigh-in-Motion Systems, University College Dublin, Ireland.
[15] Huhtala M.: "Factors Affecting Calibration Effectiveness," Proceedings of the Final Symposium of the Project WAVE, Paris 1999.
[16] Stanczyk D.: "New Calibration Procedure by Axle Rank," Proceedings of the Final Symposium of the Project WAVE, Paris 1999.
This paper was financed by the Polish Ministry of Scientific Research and Information Technology, grant No 4T10C02625.
Sensor Performance Estimation for Multi-Camera Ambient Security Systems: a Review
Lauro SNIDARO and Gian Luca FORESTI
Department of Mathematics and Computer Science, University of Udine, Italy
Abstract. Security is a primary concern in the research and design of tomorrow’s intelligent spaces and multi-camera surveillance systems have generated growing interest recently. Due to their advantages of enlarged surveillance area, robustness, enhanced monitoring performance and relatively low cost, they are very suitable for applications in visual security for both indoor and outdoor environments, such as banks, shopping malls, subway stations and parking lots. In a system that employs multiple sensors, the problem of selecting the most appropriate sensor or set of sensors to perform a certain task often arises. This paper represents a critical review on the available techniques for image and segmentation quality assessment for video surveillance applications. Keywords. Sensor performance, Video surveillance
Introduction Security is a primary concern in the research and design of tomorrow’s intelligent spaces and automatic video surveillance systems are perhaps the technology with the greatest potential. Such systems can perform real-time intrusion detection and/or suspicious event detection in complex environments [1][2]. Current research focuses on multi-sensor systems that allow for enhanced monitoring capabilities and performance for both indoor and outdoor environments, such as banks, shopping malls, subway stations and parking lots [3]. Many aspects of multi-sensor surveillance systems, including system architecture, change detection, image authentication and object tracking, have been investigated and discussed in detail. However, the reliability of the sensors is never explicitly considered. In a system that employs multiple sensors, the problem of selecting the most appropriate sensor or set of sensors to perform a certain task often arises, e.g., selecting the closest sensor for a close-up recording of a target, or one that can provide the most informative content. This is generally called the sensor selection problem and has been addressed in different ways in the literature [16]. If multiple sensors are monitoring the same scene, redundant data (i.e., positions of the tracked objects) can be fused to better perform situation assessment [3][4]. In this case evaluating the performance of the sensor is even more critical since fusing unreliable data may yield dramatic results [5]. Therefore the fusion process necessarily
has to take into account the reliability of the sources, for example, weighting their contribution accordingly. Evaluating the performance of a video sensor can be a difficult task, though, as it depends on the application and on the type of information that needs to be extracted from the sensors. For video sensors, the evaluation of their performance almost inevitably relates to the evaluation of image quality. When dealing specifically with surveillance applications, the evaluation of the results obtained after the source images have been processed (i.e. to perform change detection [1][2]) is another important step to consider to assess the performance of a sensor. Until recently, image quality estimation was mainly considered in the field of image compression and transmission and conducted through subjective tests [6][7], that is, based on the human perception of quality. Lately, a number of techniques have been developed to assess quality automatically, that is, on an objective basis. These measures are meant to be as closely correlated to human perception as possible. An interesting introduction to the problem of image quality and a statistical study on the correlation between objective measures and human perception can be found in [8]. However, most of the work has concentrated on metrics used to estimate the quality of source images corrupted by noise or compression when the flawless original image is available [8]. Objective techniques have been developed in the past, especially for the assessment of the performance of edge extraction algorithms [15]. Only recent work has also focused on the estimation of segmentation quality for multimedia or surveillance applications [9][10][11][13]. When reference segmentation is assumed to be present, this is called “objective relative evaluation” [9]. Only the latest works have considered the possibility of estimating segmentation quality without ground truth [10][11]. This is called "objective standalone evaluation" [9]. This paper represents a brief tutorial on how available techniques for image and segmentation quality estimations may be used for automatic security systems. The reader is referred to [10][11] for further reading.
1. Sensor Performance Evaluation

Even though some techniques may be used in many situations, evaluating a video sensor is basically a matter dependent on the particular application. The general idea of estimating the informative content provided by a camera can in fact be broken down into a myriad of evaluation methods based on the task at hand. Applications can span a wide spectrum involving very different fields, from industrial automation to medicine, robotics, etc. This paper will focus particularly on the evaluation of video sensors for automatic surveillance systems. Even constraining the domain to a specific type of application still leaves several ways to assess the performance of a camera. For example, the sensor selection problem [12], that is, the selection of the closest (and therefore possibly the most informative) sensor, may be solved geometrically if the models of the environment and of the fields of view of the sensors are available. In Figure 1, the trajectory of an object crosses the field of view of several sensors monitoring the area. The hand-off from one sensor to another, for example between a sensor of LAN1 and one of LAN2 in Figure 1, can therefore be solved by simply taking into account the geometry of the fields of view of the cameras and the current position of the tracked object.
Figure 1. Example of multi-camera surveillance system
Cost functions may be developed to model the performance decay of a sensor in detecting an object given the distance from it. This is the simplest method and it does not take into account the actual video streams produced by the cameras. A completely different matter is the ranking of the sensors according to the informative content produced. This problem can be broken down into at least the following categories:
• Image quality evaluation
• Segmentation quality evaluation
The first regards the assessment of the quality of the video signal produced by the sensors prior to any processing step. This involves the estimation of the amount of noise affecting the signal due to the acquisition process or the transmission. Of course, a noisy video stream will hamper the following processing steps (segmentation, tracking, etc.). So the estimation of the type and magnitude of the noise affecting the signal may be useful in order to apply several filtering steps before actually processing it. A typical measure is the Signal to Noise Ratio (SNR), but other measures have been defined in the literature, as will be described in Section 2. The estimation of the segmentation quality is, on the other hand, the assessment of the value of the information extracted from the video by the different processing algorithms. This can be very useful, for example, to tune the parameters of the segmentation algorithms or to choose a specific implementation, if several are available. It could be very important to estimate both image and segmentation quality "on the fly", that is, during the operation of the surveillance system, to adjust its operating parameters to take into account changing illumination conditions, failures of the sensors, noise, etc. In the following sections a concise account of the available techniques will be presented.
2. Image Quality Evaluation The process of assessing the quality of an image can be broken down into two main categories: subjective evaluation and objective evaluation.
2.1. Subjective Image Quality Evaluation

Subjective evaluation is performed by humans and several protocols have been defined to assess in a principled way, for example, the degradation of an image due to compression or noise added by analog transmission [6][7]. A classic approach is to use the Mean Opinion Score (MOS) [14] which indicates the level of satisfaction expressed by a number of test users by taking the average of their opinions in a scale ranging from not recommended to best (Figure 2).

Figure 2. Example of MOS image quality evaluation
This kind of experiment is slow and expensive but should nevertheless be considered during the development of automatic objective techniques. The driving idea behind this is to have the objective measures be as closely correlated to the human perception of quality as possible [8].

2.2. Objective Image Quality Evaluation

There is a large corpus of objective image quality evaluation techniques developed in the past thirty years, from classical measures like contrast to more sophisticated ones involving, for example, the analysis of textures [13]. Most of these techniques are designed to detect a certain type of noise in the image. A defocused or blurred image, for example, does not convey much information since details are missing. An old but simple and still effective way to detect this kind of defect is to compute the edges and then the Tenengrad value of the image as follows:
S(x, y) = \sqrt{ I_x(x, y)^2 + I_y(x, y)^2 } ,

where I_x(x, y) and I_y(x, y) are the two spatial derivatives of image I(x, y), and the Tenengrad value J is

J = \sum_x \sum_y S(x, y)^2 , \qquad S(x, y) \ge T .
The Tenengrad value grows with the strength of the edges in the images, and stronger edges generally mean a crisper image (Figure 3).
Figure 3. Edges and Tenengrad values for a defocused (top left) and focused (top right) image. Stronger edges (higher Tenengrad values) mean crisper images.
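A small sketch of the Tenengrad measure defined above follows (an added illustration, not taken from the paper); it uses Sobel derivatives from scipy.ndimage and a user-chosen gradient threshold T.

    import numpy as np
    from scipy.ndimage import sobel

    def tenengrad(image, threshold=0.0):
        # Sum of squared gradient magnitudes above the threshold T.
        img = image.astype(float)
        gx = sobel(img, axis=1)              # I_x
        gy = sobel(img, axis=0)              # I_y
        s = np.sqrt(gx ** 2 + gy ** 2)       # S(x, y)
        return float(np.sum(s[s >= threshold] ** 2))   # J

Comparing the value returned for a focused and a defocused frame of the same scene reproduces the behaviour illustrated in Figure 3.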
Many methods are available, each with their advantages and disadvantages. Some of the most common techniques are listed in Table 1 and described in [17].

Table 1. Several techniques to detect and assess blurring noise in images. They have similar performances but different drawbacks.

Algorithm                 Disadvantages
FFT                       Computational complexity; no reference to the spatial domain
Tenengrad                 Sensitive to noise; threshold selection
Laplacian                 Computational complexity; threshold selection
Histogram Entropy         No reference to the spatial domain; issues with textures
Grey Level Variance       No reference to the spatial domain
Sum Modulus Difference    Sensitive to noise; issues with textures
3. Segmentation Quality Evaluation

Recently, a lot of interest has been focused on the development of quality evaluation measures for segmentation algorithms [10][11]. Given the relative newness of the topic, the research community has not yet agreed on a common terminology; here the definitions given in [10] will be used.
Figure 4. Source image (left) and individual evaluation of detected objects (right)
Two kinds of measurement can be made to assess quality: • Overall segmentation quality evaluation • Individual segmentation quality evaluation The former computes a global quality index for all the blobs detected in the scene, while the latter expresses a quality figure for each. In Figure 4 (right), individual segmentation quality was computed according to [4]. The quality is expressed as an index ranging from 0 (worst) to 1 (best). The overall evaluation can be considered as an indicator of the performance of the sensor and may be obtained as a function of the individual measurements (i.e., the average). Care should be taken, though, when assessing overall segmentation quality from individual segmentation quality, since the sensor may perform differently for different regions of the image (e.g., due to illumination, shadows, etc.). So averaging the quality of the blobs may not always yield a meaningful index of the performance of the sensor. Another important distinction is between: • Relative segmentation quality estimation • Stand-alone segmentation quality estimation Relative estimation is performed comparing the computed segmentation against a reference one (Figure 5). In [11], the quality value is determined as a weighted sum of three measures:
Relative estimation is performed by comparing the computed segmentation against a reference one (Figure 5). In [11], the quality value is determined as a weighted sum of three measures:

D = \mu D_{color} + \beta D_{hist} + \gamma D_{motion} ,

where D_{color} is a pixel misclassification penalty, D_{hist} is a shape penalty (the contours are compared), and D_{motion} is a motion penalty (the motion vectors are compared).
Figure 5. Computed segmentation (left), reference segmentation (center), misclassified pixels (right)
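To make the relative measure concrete, the sketch below (an added illustration with simplified stand-ins for the penalties of [11]) compares a computed binary object mask against a reference mask: the colour penalty is taken as the misclassified-pixel fraction, the shape penalty as the normalized difference of rough contour lengths, and the motion penalty as the normalized difference of the objects' displacement vectors. The weights and all proxies are assumptions for illustration only.

    import numpy as np
    from scipy.ndimage import binary_erosion

    def perimeter(mask):
        # Rough contour length: number of boundary pixels of a boolean mask.
        return int(np.count_nonzero(mask & ~binary_erosion(mask)))

    def relative_quality(seg, ref, motion_seg, motion_ref,
                         mu=0.5, beta=0.3, gamma=0.2):
        # Simplified D = mu*D_color + beta*D_hist + gamma*D_motion (lower is better).
        d_color = np.count_nonzero(seg != ref) / ref.size
        p_seg, p_ref = perimeter(seg), perimeter(ref)
        d_hist = abs(p_seg - p_ref) / max(p_ref, 1)
        d_motion = (np.linalg.norm(np.asarray(motion_seg, float) -
                                   np.asarray(motion_ref, float)) /
                    (np.linalg.norm(np.asarray(motion_ref, float)) + 1e-9))
        return mu * d_color + beta * d_hist + gamma * d_motion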
Stand-alone evaluation algorithms yield instead quality values in absolute terms. Since no reference segmentation is used, the measures rely on several comparisons
between the pixels of the detected object and those of the background [4]. Other techniques may involve comparing the color histogram of the object or its motion vectors between two consecutive frames [10]. Relative evaluation generally performs better than stand-alone evaluation. This may be due to the fact that relative measures are designed to be as closely correlated to human perception as possible, since the ground truth is available. On the other hand, stand-alone assessment is less correlated to human perception of segmentation quality, but it can be used during the operation of the system.
4. Conclusions This paper represents a very brief tutorial on the problem of sensor performance evaluation for video surveillance applications. Sensor evaluation is extremely important for multi-camera security systems as it could be exploited for the sensor selection task or to regulate the fusion process when redundant information is available. Some of the available techniques for image and segmentation quality assessment have been described herein.
References
[1] C. Regazzoni, V. Ramesh, and G.L. Foresti, "Special issue on video communications, processing, and understanding for third generation surveillance systems," Proceedings of the IEEE, vol. 89, no. 10, 2001.
[2] R.T. Collins, A.J. Lipton, H. Fujiyoshi, and T. Kanade, "Special Section on Video Surveillance," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, August 2001.
[3] G. L. Foresti, C.S. Regazzoni, and P.K. Varshney, Multisensor Surveillance Systems: The Fusion Perspective, Kluwer Academic Publishers, 2003.
[4] L. Snidaro, G. L. Foresti, R. Niu, and P. K. Varshney, "Sensor fusion for video surveillance," Proceedings of the 7th International Conference on Information Fusion, Vol. II, pp. 739-746, Stockholm, Sweden, June 28 - July 1, 2004.
[5] D. L. Hall and J. Llinas, "An introduction to multisensor data fusion," Proceedings of the IEEE, vol. 85, no. 1, pp. 6-23, January 1997.
[6] Methodology for the Subjective Assessment of the Quality of Television Pictures, Recommendation BT.500-7, 1995.
[7] Recommendation P.910 - Subjective Video Quality Assessment Methods for Multimedia Applications, 1996.
[8] I. Avcibas, B. Sankur, and K. Sayood, "Statistical evaluation of image quality measures," Journal of Electronic Imaging, vol. 11, no. 2, pp. 206-223, 2002.
[9] P. L. Correia and F. Pereira, "Objective evaluation of relative segmentation quality," in Proceedings of the IEEE International Conference on Image Processing (ICIP), Vancouver, Canada, September 10-13, 2000, pp. 312-315.
[10] P. L. Correia and F. Pereira, "Objective evaluation of video segmentation quality," IEEE Transactions on Image Processing, vol. 12, no. 2, pp. 186-200, February 2003.
[11] C. E. Erdem, B. Sankur, and A. M. Tekalp, "Performance measures for video object segmentation and tracking," IEEE Transactions on Image Processing, vol. 13, no. 7, pp. 937-951, July 2004.
[12] R.T. Collins, A.J. Lipton, H. Fujiyoshi, and T. Kanade, "A system for video surveillance and monitoring," Proceedings of the IEEE, Vol. 89, pp. 1456-1477, October 2001.
[13] K. Kim and L. Davis, "A fine-structure image/video quality measure using local statistics," Proceedings of the International Conference on Image Processing, 2004.
[14] ITU-T Recommendation P.800, "Methods for subjective determination of transmission quality," August 1996.
[15] J. Fram and E. Deutsch, "On the quantitative evaluation of edge detection schemes and their comparison with human performance," IEEE Trans. Comput., vol. C-24, no. 6, pp. 616-628, June 1975.
[16] J. Manyika and H. Durrant-Whyte, Data Fusion and Sensor Management: A Decentralized Information-Theoretic Approach, Ellis Horwood Series in Electrical and Electronic Engineering, 1994.
[17] E. Krotkov, "Focusing," Intl. Journal of Computer Vision, Vol. 1, No. 3, October 1987, pp. 223-237.
Principles and Methods of Situation Assessment Alan N. STEINBERG1 CUBRC, Inc
Abstract. The goal of the present paper is to present new ideas in estimating and predicting relationships and situations, given uncertain evidence and uncertain models of such relationships and situations. Developments in Situation Theory are extended for application to estimation and prediction problems by (a) generalizing the notion of situation to include any structured part of reality: a single - or multitarget state, including attributes of and relations among entities; and (b) using the structure of situations as means for assigning prior and posterior statistics, both for "attributive" entity states and for "relational" states. The inference process will need to work within the context of an ontology that permits abductive, inductive and deductive inference across the diversity of information relevant to recognizing and characterizing the complex, diverse, and largely unpredictable threat situations of interest. Key challenges are (a) in maintaining consistency both in representation and in confidence assignment and (b) in constraining combinatoric hypothesis proliferation. Keywords. Situation assessment, threat assessment, JDL data fusion model, situation logic, context-sensitive reasoning, relational state estimation, combinatory logic, ontology
1. Objectives and Approach The objectives of the present research are to: • Improve methods for representing, recognizing, discovering and predicting situations; and • Improve methods for inferencing – whether by people, machines or some combination thereof –across information levels; from features, individuals, relationships, situations and impacts (e.g. outcome costs). The approach draws from current developments in ontology, situation and estimation theories. Specific features of the approach include • Fuzzy definition of situations and relationships; • Integration of diverse inference bases: logical/semantic, causal, conventional, etc. • Context-conditioned reasoning with uncertain evidence formulated in terms of “probabilistic infons”.
1 Corresponding author: Alan N. Steinberg, 8151 Needwood Road, #T103, Derwood, MD 20855, USA; e-mail: [email protected]
2. Situation and Threat Assessment Data fusion in general involves the use of multiple data – often from multiple sources – to estimate or predict the state of some aspect of reality. Among data fusion problems are those concerned with estimation/prediction of the state of one or more individuals; i.e. of entities whose states are treated as if they were independent of the states of any other entity. Many formulations of target recognition and target tracking problems make this assumption. These are the province of “Level 1” in the well-known JDL Data Fusion Model [1-3]. Other data fusion problems involve the use of context to infer entity states. Such contexts can include relationships among entities of interest and other entities and of the type of context, or situation, itself. The ability to exploit such relational and situational contexts, of course, presupposes an ability to characterize and recognize relationships and situations. A situation is a partial world state, in some sense of ‘partial’. Devlin provides an attractive working definition for ‘situation’ as ‘a structured part of Reality that is discriminated by some agent’ [4, p. 31, paraphrased]. Such structure can be characterized in terms of the states of entities and of relationships among them. Determining just which entities, which state elements and relationships are parts of a situation or relevant to a situation is an important part of the problems of Situation Semantics and of Situation Assessment (level 2 data fusion). The problem of Situation Semantics is that of defining situations in the abstract; e.g. specifying the characteristics that define a situation as a political situation or as a scary situation or, more specifically, as a tennis match or an epidemic. In the terminology of Peircean semeiotics, Situation Semantics involves abductive processes that determine general characteristics about types of situations and inductive processes that generalize these characteristics to establish an ontology of situations [5]. In contrast, the problem of Situation Assessment is that of recognizing situations; i.e. of classifying concrete situations (whether actual, real-world situations or hypothetical ones) as to situation type. Situation Assessment – like most data fusion – involves primarily deductive processes. Situation Assessment has only recently received serious attention in the military data fusion community, which has hitherto found it both easier to build and to market target identification and tracking technologies. The newfound interest in Situation Assessment reflects changes both in the marketplace and the technology base. The former change is largely motivated by the transition from a focus on traditional military problems to asymmetrical threat problems. The latter change involves developments in the theories of Generalized Belief Propagation [6,7], in Situation Logic and in Machine Learning [8]. Table 1 contrasts the problems of level 1 and level 2 data fusion. As an example, the problem of recognizing and predicting terrorist attacks is much different from that of recognizing or predicting the disposition and activities of battlefield equipment. Often the key indicators of potential, imminent or current threat situations are in the relationships among people and equipment that are not in themselves distinguishable from common, non-threatening entities. 
The goal of the present paper is to present new ideas in estimating and predicting relationships and situations, given uncertain evidence and uncertain models of such relationships and situations.
Table 1. Level 2/3 fusion is more difficult than Level 1 Fusion

Level 1 Fusion (Object Assessment), e.g. ATR or Target Tracking:
• Spatially Localized Problem
• Hypothesis Generation via Validation Gate
• Strongly-Constrained Ontology: only as much as relevant to characterize target type, kinematic and activity state
• High-Confidence Causal Models: typically dominated by physical models, e.g. signature and kinematic models

Level 2/3 Fusion (Situation/Impact Assessment), e.g. Threat Assessment:
• Spatially Distributed Problem
• Hypothesis Generation via Situation/Behavior Model
• Weak Ontological Constraints (i.e. little constraint on what can be relevant evidence): unpredictable high dimensionality
• Weak Causal Models: typically dominated by human/group behavior models, e.g. coordination/collaboration, perception, value and influence models
Situation Assessment – whether implemented by people, an automatic process or some combination thereof – requires the capability to make inferences of the following types:
• Inferring the presence and the states of entities on the basis of relations in which they participate
• Inferring relationships on the basis of other relationships
• Recognition and characterization of extant situations
• Prediction of undetected (e.g. future) situations.

Broadly speaking, relationships to be inferred and exploited in situation assessment can include:
• Logical/Semantic relationships (e.g. definitional, analytic, taxonomic, mereologic)
• Physical relationships (e.g. spatio-temporal, causal, nomic)
• Functional relationships (e.g. structural or organizational role)
• Conventional relationships (e.g. ownership, legal and other societal conventions)
• Cognitive relationships (e.g. sensing, perceiving, believing, fearing).

To be sure, relationships of most of these types are generally not directly observable, but rather must be inferred from the observed attributes of entities, their context and from other relationships.2 Indirect inference of this sort is not unique to Situation Assessment, of course. Inference of relationships and of entity states given relationships is also essential to model-based target recognition, in which the spatial and spectral relationships among components are inferred and these, in turn, are used to infer the state of the constituted entity. Scene understanding extends such analysis to infer such relationships as occlusion, illumination, shadowing, and causal relations such as radiative heating and terrain disturbance. These can be extended to infer functional relations among components (e.g. the tank tread is moved by this drive wheel) and organizational relations (these tanks are following the lead of that one). Additional relationships can include informational and perceptual states (x detects y, x is tracking y, x infers the intent of y, x predicts that y will transition to state z at time t, etc.). Other applications in which relationships are inferred from attributive entity states include target state transition prediction and tracking (e.g. using Kalman filters or Markov random fields), force structure analysis, link and network analysis, etc.3

2 We define a relationship as a function mapping from an n-tuple of entities to a relational state, R(m)(x1,…,xn) = y, m>n; e.g. a four-element relationship, R(4)(w,x,y,z) = w trades x to y in exchange for z.
3 The need to address such issues within the traditional Level 1 applications of target recognition and tracking has occasioned a rethinking of the JDL Data Fusion levels. The suggested revised partitioning of data fusion functions is designed to capture the significant differences in the types of input data, models, outputs, and inferencing appropriate to broad classes of data fusion problems. In particular, level 1 and level 2 data fusion processes are distinguished based on the types of inferences involved. In level 1, entity states are estimated based on inferred attributes of the entities. In level 2, entity states are estimated based on inferred relationships among entities. By such definition, model-based vision, scene understanding and situation assessment fall under level 2 fusion, facilitating the consistent, integrated exploitation of component and contextual evidence across many levels of abstraction and aggregation.

The relative difficulty of the higher-level Situation and Impact/Threat Assessment problems can largely be attributed to the following three factors:
• Weak spatio-temporal constraints on relevant evidence: Evidence relevant to a level-1 estimation problem (e.g. target recognition or tracking) can be assumed to be contained within a small spatio-temporal volume, generally limited by kinematic or thermodynamic constraints. In contrast, many situation and threat assessment problems can involve evidence that is widespread in space and time, with no easily defined constraints.
• Weak ontological constraints on relevant evidence: The types of evidence relevant to threat assessment problems can be very diverse and can contribute to inferences in unexpected ways. This is why much of intelligence analysis – like detective work – is opportunistic, ad hoc and difficult to codify in a systematic methodology. Rather, the methodology in threat assessment is second-order: not to discover instantiations of pre-scripted threat scenarios, but (1) to discover patterns of activity that are consistent with unanticipated threat scenarios and (2) to nominate searches for data that could either confirm or refute such possibilities.
• Weakly-modeled causality: Situation assessment and, particularly, threat assessment often involve inference of human intent and human behavior. Such inference is basic not only to predicting future events (e.g. attack indications and warning) but also to understanding current or past activity. Needless to say, our models of human intent, planning and execution are far less complete and far more fragile than the physical models used in target recognition or target tracking.
3. Probabilistic Situation Logic

In discussing situations and relationships, one of the first things to be realized is that a relationship is not simply a multi-target state. Or, more precisely, a multi-target state of the sort of interest in Situation Assessment cannot in general be inferred from a set of single-target states X = {x1,…, xn}. E.g. from P = 'x is healthy, wealthy and wise',
we can't be expected to infer anything like Q = 'x is married to y' (however, in such cases we can sometimes infer Q → ¬P).

To achieve the additional expressive power needed to reason about relationships and situations under uncertainty, we adapt a Situation Logic model to the problems of estimating and predicting states of entities and of situations. Situation Logic, as developed by Barwise and Perry [9], Devlin [4], et al., employs a second-order predicate calculus, related to the Combinatory Logic of Curry and Feys [10]. The latter represents first-order propositions R(x1,…,xn), involving an m-place predicate R, m>n, as second-order propositions Applies(r,x1,…,xn), employing a single second-order predicate "Applies". This abstraction amounts to the reification (i.e., admission to the ontology) of relations r corresponding intensionally to predicates R (the concept of relation is extended to include attributes, i.e. 1-place relations). There being but one second-order predicate, we can often abbreviate 'Applies(r,x1,…,xn)' as '(r,x1,…,xn)'. This is the basis of Devlin's notion of an Infon, construed as a "piece of information". Under this formulation [4], an infon has the form

σ = (r, x1,…, xn, h, k, p);   (1)
for m-place relations r, m>n, entities xi, locations h, times k and polarities (i.e. truth values) p. Hypotheses also can be represented as sets of infons (r,x1,…,xn,h,k,p). We may read '(r,x1,…,xn,h,k,1)' as "relation r applies to the n-tuple of entities <x1,…,xn> at location h and time k". Similarly, '(r,x1,…,xn,h,k,0)' can be read "attribute or relation r doesn't apply …".

Barwise and Perry [9], Devlin [4], et al., broaden the expressive power of the second-order predicate calculus by a further admission into the working ontology. The new entities are our friends, situations. A new operator '|=' expresses the notion of contextual applicability, so that 's |= σ' can be read as "situation s supports σ" or "σ is true in situation s". This extension allows consistent representation of factual, conditional, hypothetical and estimated information.4

Some situations can be crisply defined; e.g., a chess game, of which the constituent entities and their relevant attributes and relationships are explicitly bounded. On the other hand, we should allow that many situations may have fuzzy boundaries. Fuzziness is present both in abstract situations (e.g. the concepts economic recession or naval battle) and in concrete situations such as occur in the real world (e.g. the 1930s, the Battle of Trafalgar). Both, it would seem, can naturally be characterized via fuzzy membership functions. In this way, we may represent the relevance of individual facts to characterizing a situation or to recognizing an instance of an abstract situation type.5
4 The term 'infon' was coined on analogy with 'electron', 'meson', 'photon', suggesting a discrete "particle" – though not necessarily an elementary, indivisible particle – of information. Note that an infon is not the same as a proposition, which is a claim about the world. An infonic proposition is one in which a set of information is claimed concerning the world or another situation. Such a statement has the form s |= I, where s is a situation and I is a finite set of infons [4, pp. 62f].
5 If we can completely specify both abstract and concrete situations by the use of fuzzy membership functions, then there is no need to include situations as ontological primitives.
One further expansion of situation logic is necessary to make it a useful tool in data fusion. To use infons as vehicles for reasoning under uncertainty, we replace the bipolar truth-value model of [4] with a probabilistic model; i.e. we wish to treat the term p in expression (1) as a continuous density function. This extension allows us to infer hypotheses – stated in terms of sets of probabilistic infons – across uncertain real-world state estimates together with uncertain inference models (e.g. expressed as conditionals among infons) and conditional and hypothetical state estimates (e.g. in counter-factual historical analysis or analysis of fiction).

In summary, we propose that situations – to include abstract situations (or situation types) and concrete situations (or situation tokens) – can be represented as sets of infons. Sensor measurement reports, track reports and fused situation estimates alike can be stated in terms of probabilistic infons. Each such infon σ = (r,x1,…,xn,h,k,p) is a second-order expression of a relationship, stated in terms of the applicability, with probability p, of a relation r to an n-tuple of entities <x1,…,xn> at some place and time. This formulation allows inferencing from one perceived entity in a situation to another and from estimated entities, their attributes and relationships to situations. Attributes of individuals can be treated as 1-place relationships.

Real-world situations can be classified in the same way as we classify real-world individual entities. Indeed, an entity is – by definition – a kind of situation. A situation type or entity type s can be defined in terms of an equivalence class R(s) of relations: 'x is of type s' can be defined as 'r(m) ∈ R(s) ⇒ ∃y1,…,ym−1 [s |= (r, x, y1,…, ym−1, h, k, 1)]'.

The Bowman model for a data fusion node [11, et al.], shown in Figure 1, applies to level 2 and 3 processes as well as to those in level 1. In this model, data association involves generating, evaluating and selecting hypotheses that a set of information (e.g. a set of sensor measurement reports) is the maximal set of information to be used in inference:
• A level 1 hypothesis is a set of measurements, hypothesized to be the maximal set of measurements of a perceived entity;6
• A level 2 hypothesis is a set of perceived entities (whether from individual sensors or from a multi-sensor data fusion process) hypothesized to be the maximal set of entities in a perceived relationship or situation.
Level 1 state estimation (e.g. concerning target identification, location or tracking) typically involves determining infons involving 1-place relationships (or, possibly, n-place relations with n−1 bound or parameterized variables). Level 2 and 3 fusion hypotheses generally concern multi-place relationships.
6 Allowing, in set covering formulations, that a given report may involve measurement of more than one target; e.g. in cases of two targets being detected within a single resolution cell.
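To make the probabilistic-infon representation concrete, the following is a minimal sketch in Python of infons and situations as simple data structures; all class, field and instance names are illustrative choices for this example, not notation or software from the paper, and the polarity field is generalized to a probability as proposed above.

```python
from dataclasses import dataclass, field
from typing import Tuple

@dataclass(frozen=True)
class Infon:
    """A probabilistic infon sigma = (r, x1..xn, h, k, p)."""
    relation: str                  # the relation or attribute r
    args: Tuple[object, ...]       # the n-tuple of entities x1..xn (may nest Infons)
    location: str                  # place h
    time: float                    # time k
    prob: float = 1.0              # probability p replacing the 0/1 polarity

@dataclass
class Situation:
    """A situation as a set of infons; 's |= sigma' becomes a membership/threshold test."""
    name: str
    infons: set = field(default_factory=set)

    def supports(self, sigma: Infon, threshold: float = 0.5) -> bool:
        # s |= sigma: the situation contains a matching infon with sufficient probability
        return any(i.relation == sigma.relation and i.args == sigma.args
                   and i.prob >= threshold for i in self.infons)

# A level-1 (attributive) infon and a nested cognitive infon:
track = Infon("moving", ("vehicle_7",), "grid_D4", 1200.0, 0.92)
belief = Infon("Believes", ("analyst_1", track), "HQ", 1210.0, 0.7)

s = Situation("border_area", {track, belief})
print(s.supports(track))   # True
```

Because the args tuple may itself contain infons, the same structure accommodates the nested cognitive relationships (Believes, Perceives, etc.) discussed later in this section.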
[Figure 1. Data Fusion Node Paradigm. A data fusion node receives data from sensors/sources or prior DF nodes, performs Data Alignment (Common Referencing), Data Association (Hypothesis Generation, Hypothesis Evaluation, Hypothesis Selection) and State Estimation/Prediction under Resource Management controls, and passes results to the user or the next DF node.]
A situation can imply and can be implied by the states and relationships of constituent entities. As in Level 1 inferencing (i.e. with 1-place relations), we can write production rules based on logical, semantic, causal, customary or material (etc.) relationships among predicates of any length. Characteristic inference patterns include the following (in which the infon notation is replaced with more concise predicate calculus notation):

Situational inferences:
P(n)(X(n)) ⇒ S   (2)

Level 1 inferences (e.g. single-target likelihood functions or Markov transition densities):
f(P(1)(x) | Q(1)(x), S)   (3)

Level 2 and 3 inference examples:
f(P(2)(x,y) | Q(1)(x), R(1)(y), S)   (Level 1→2 deduction)   (4)
f(∃y[P(2)(x,y)] | Q(1)(y), S)   (Level 1→2 induction)   (5)
f(P(1)(x) | Q(2)(x,y), S)   (Level 2→1 or Level 3→1 deduction)   (6)
f(P(2)(x,y) | Q(2)(x,y), S)   (Level 2→2, 3→3, 2→3 or 3→2 deduction)7   (7)

7 These types of deduction can include, for example, estimating multi-target likelihood functions or multi-target Markov transition densities.
Such inferences are amenable to the machinery of generalized belief propagation, whereby entity states are inferred conditioned on other entity states (i.e. on other states of the same entities or on the states of other entities) [7]. In such a formulation, belief in a state x_j of an entity X is modeled as

b_X(x_j) = k φ_X(x_j) ∏_{W ∈ N(X)} m_{W,X}(x_j);   (8)
in terms of "local" evidence φ_X(x_j) and evidence passed as "messages" from other entities in the immediate neighborhood N(X) in the graph that represents the set of relationships in the relevant situation:
m_{W,X}(x_j) = Σ_{w_i} φ_W(w_i) ψ_{W,X}(w_i, x_j) ∏_{Y ∈ N(W)\X} m_{Y,W}(w_i).   (9)
This can be restated in terms of relationships on X and its neighbors:
m_{W,X}(x_j) = Σ_{w_i} φ_W(w_i) ∫_r f(w_i, x_j | r(W,X)) f(r(W,X)) ∏_{Y ∈ N(W)\X} m_{Y,W}(w_i).   (10)
Joint belief, then, is given as
b_{W,X}(w_i, x_j) = k ψ_{W,X}(w_i, x_j) φ_W(w_i) φ_X(x_j) ∏_{Y ∈ N(W)\X} m_{Y,W}(w_i) ∏_{Z ∈ N(X)\W} m_{Z,X}(x_j).   (11)

In (8) and (11), k is a normalizing constant such that beliefs sum to 1.
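As a rough numerical illustration of the message passing in equations (8), (9) and (11), the sketch below runs one update on a two-entity graph with discrete states; the potential values, entity names and NumPy-based helper functions are illustrative assumptions for this example, not an implementation from the paper.

```python
import numpy as np

# Two related entities W and X, each with two discrete states.
phi_W = np.array([0.7, 0.3])          # local evidence phi_W(w_i)
phi_X = np.array([0.4, 0.6])          # local evidence phi_X(x_j)
psi_WX = np.array([[0.9, 0.1],        # pairwise compatibility psi_{W,X}(w_i, x_j)
                   [0.2, 0.8]])

def message_W_to_X(incoming=1.0):
    """Equation (9): sum over w_i of phi_W(w_i) psi(w_i, x_j) times other incoming messages."""
    return (phi_W * incoming) @ psi_WX

def belief_X():
    """Equation (8): b_X(x_j) proportional to phi_X(x_j) times the incoming message."""
    b = phi_X * message_W_to_X()
    return b / b.sum()              # k chosen so beliefs sum to 1

def joint_belief_WX():
    """Equation (11) for this two-node graph (no further neighbors)."""
    b = psi_WX * np.outer(phi_W, phi_X)
    return b / b.sum()

print(belief_X())        # marginal belief over X's states
print(joint_belief_WX()) # joint belief over (W, X)
```

In a larger relational graph the same updates would be iterated over all neighbor pairs, which is exactly the generalized belief propagation machinery cited in [7].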
An important feature of this formulation for the present purpose is the fact that infons and situations can be elements of other infons. For example, consider an infon of the form η1 = (Believes, x1, η2, h, k, p), to be interpreted as "with probability p, at place and time h,k, x1 believes that η2", η2 being another infon. Similar nested infons can be created with such predicates R = Perceives, Hypothesizes, Suspects, Doubts that, Wonders whether, etc., or such compounds as "with probability p, at place and time h,k, x1 asked x2 whether x3 reported to x4 that x4 believes that η2". In this way, the Probabilistic Situation Logic representational scheme can be used to characterize and reason about counterfactual and hypothetical situations, their relations to one another, and their relations to reality. Probabilities embedded in recursively nested infons allow these pieces of information to be combined with sensor reports and information of other types.

The second-order formulation also permits inferencing concerning propositions other than assertions. For example, a first-order interrogative "is it true that σ?" can be restated as the assertion "I ask whether σ" or, in infon notation, "(ask whether, σ, h2, k2, p)". Similarly for other modalities, replacing the relation 'ask whether' with 'demand that', 'believe that', 'fear that', 'pretend that', etc.8

8 In our formulation, instances of such modalities (e.g. sensing, perceiving, believing, fearing) are cognitive relationships r and therefore can appear either as the initial argument of an infon or as one of the succeeding n-tuple of arguments.

Infons can be nested recursively: "x perceives that y perceives that σ". Situations, too, may be nested recursively to characterize, for example, my beliefs about your beliefs or my data fusion system's estimate of the product of your data fusion system. One can use such constructs to reason about such perceptual states as one's adversary's state of knowledge and his belief about present and future world state; e.g. our estimate
of the enemy's estimate of the outcome of his available courses of action. In this way, the scheme supports an Operational Net Assessment analysis framework that can reason simultaneously from multiple perspectives.

4. Threat Assessment

A particularly important class of situations is that in which threat activity is either predicted, imminent or occurring.9 Threat assessment includes the following functions:
• Threat Event Prediction: determining likely threat events ("attacks"): who, what, where, when, why, how;
• Indications and Warning: recognition that an attack is imminent or under way;
• Threat entity detection and characterization: identification, attributes, composition, location/track, activity, capability, intent;
• Attack Assessment:
  o Responsible parties (country, organization, individuals, etc.) and attack roles
  o Intended target(s)
  o Intended effect (e.g., physical, political, economic, psychological effects)
  o Threat capability (e.g., weapon and delivery system characteristics)
  o Force composition, coordination & tactics (goal and plan decomposition);
• Consequence Assessment: estimation and prediction of event outcome states and their cost/utility to the responsible parties, to affected parties or to the system user.

Threats can be modeled in terms of potential and actualized relationships between threatening entities (which may or may not be people or human agencies) and threatened entities, or targets (which often include people or their possessions). A threat situation (i.e. a situation involving potential threat events) is one in which there is some high likelihood of certain types of potential events (e.g. attacks) by some agent against threatened entities. Threat events may be intentional or unintentional (e.g. potential natural disasters or human error). Intentional threats can also have unintended consequences.

Following the ontology work of Rogova and Little [13], indicators of threat situations relate to the capability, opportunity and intent of agents to carry out such actions against various targets. The threat ontology decomposes these three elements of the threat situation into sub-elements as illustrated in Figure 2. The Threat Assessment process will need to generate, evaluate and select hypotheses concerning threat situations – i.e. situations in which threat events (attacks) are likely – in terms of entities' capability, intent and opportunity to carry out various actions, to include attacks of various kinds. It will do so by reference to an ontology that decomposes capability, opportunity and intent into indicators that can be recognized in constructing threat hypotheses. By evaluating and selecting such hypotheses, the Threat Assessment system will provide indications, warnings and characterizations of possible, imminent or occurring attacks.
9 A more detailed treatment of this topic is to be found in reference [12].
[Figure 2. Elements of a threat situation hypothesis. The figure decomposes threat indicators into three elements: Capability (design: concept, theory, technology; development: materials, facilities, skills; deployment/delivery), which yields feasible threat types (e.g. CBN, cyber, economic) and threat delivery constraints; Intent, as applicable (high-level objectives, e.g. security, wealth, prestige, revenge; means decomposition, e.g. force, trade, coercion, trickery, third-party or internal pressure; desirable targets and effects, e.g. capture, destruction, demoralization); and Opportunity (target access; target vulnerabilities; opportunity assessment of assessed target vulnerabilities and access; outcome assessment of predicted goal satisfaction; predicted secondary effects, e.g. retaliation, international political or economic reaction). Together these yield the attack opportunity, the attack outcome as perceived by the threat agent, and the attack likelihood.]
Important aspects of the concept are the adaptive, opportunistic generation and evaluation of hypotheses concerning present and future situations and events. Such hypotheses are subject to selection, refinement or resolution via additional data collection, mining, fusion and/or analysis. A notional threat assessment architecture is shown in Figure 3. As shown, the process is adaptive: hypotheses concerning threat situations and threatened events are successively generated, evaluated and selected for response or for further refinement. A secondary feed-back loop refines the models whereby threat situations and events are recognized or predicted. The threat assessment system involves the following processes (a simplified sketch of this loop follows the list):
1. Data Collection: sensor management and data mining to obtain reports of real-world entities, relationships and events. Sensor reports can be considered to have the form of probabilistic infons (r,x1,…,xn,h,k,p), for attributes/relations r, entities xi, locations h, times k and probabilities p.
2. Hypothesis Generation: building candidate threat hypotheses consistent with the available data. Such hypotheses – which also can be stated in the form of probabilistic infons – will be instantiations of situations and events per the threat ontology.
3. Hypothesis Evaluation: ascribing a likelihood score to each candidate hypothesis and requesting additional data expected to either support or refute the hypothesis.
4. Hypothesis Selection: selecting among candidate hypotheses on the basis of global likelihood.
5. Alerting: providing indications of current and predicted threat situations and threat events for Event Prediction, Indications & Warning, Threat Characterization, Attack Assessment and Consequence Assessment.
6. Model Management: building and refining threat models. Model management is typically performed as an off-line task, involving abductive and inductive
processes for pattern explanation and generalization. Advanced learning methods can be employed to develop and validate predictive models of threat activity [8].
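As a rough illustration of steps 2–5 above, the sketch below scores candidate threat hypotheses from capability, opportunity and intent indicators carried by simplified infon-like reports; the report structure, scoring rule, agent/target names and thresholds are simplifying assumptions made for the example, not part of the architecture itself.

```python
from itertools import product

# Reports as simplified infon-like tuples: (relation, entity, probability).
reports = [
    ("has_capability", "group_A", 0.8),
    ("has_opportunity", "group_A", 0.6),
    ("has_intent", "group_A", 0.7),
    ("has_capability", "group_B", 0.4),
]

AGENTS = {"group_A", "group_B"}
TARGETS = {"port", "power_grid"}

def indicator(relation, agent):
    """Probability that an indicator holds for an agent, given the available reports."""
    probs = [p for r, x, p in reports if r == relation and x == agent]
    return max(probs, default=0.0)

def generate_hypotheses():
    """Step 2: build candidate (agent, target) threat hypotheses consistent with the data."""
    return [(a, t) for a, t in product(AGENTS, TARGETS) if indicator("has_capability", a) > 0]

def evaluate(hyp):
    """Step 3: naive scoring as the product of capability, opportunity and intent indicators."""
    agent, _target = hyp
    return (indicator("has_capability", agent)
            * indicator("has_opportunity", agent)
            * indicator("has_intent", agent))

def select(hypotheses, alert_threshold=0.3):
    """Steps 4-5: keep the highest-scoring hypotheses and alert on those above threshold."""
    scored = sorted(((evaluate(h), h) for h in hypotheses), reverse=True)
    return [(s, h) for s, h in scored if s >= alert_threshold]

print(select(generate_hypotheses()))
```

In the full architecture the evaluation step would also nominate additional data collection and the model-management loop would refine the indicator models themselves.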
[Figure 3. Threat assessment functional flow. Sensor source data and external events – represented as (r,x1,…,xn,h,k), with attributes/relations r, entities xi, locations h and times k – feed Data Collection (data mining, sensor management). Hypothesis Generation builds cases consistent with the data using abstract threat models (entity capability, opportunity, entity intent, attack planning & preparation, and attack & consequence models); the resulting hypothesized situations pass to Hypothesis Evaluation (testing existing hypotheses) and Hypothesis Selection (nominating responses), and Alerting reports who, what, where, why, when and how, with probabilities. Model Management generates and refines the threat models in a feedback loop.]
In general, Hypothesis Generation is expected to be the most complex and difficult of the processes in on-line threat analysis, because of the three factors listed above:
1. Weak spatio-temporal constraints on relevant evidence;
2. Weak ontological constraints on relevant evidence;
3. Dominance of relatively poorly-modeled causal processes (specifically, human and group behavior rather than simple physical processes).
This is a major contrast with level 1 data fusion, in which Hypothesis Generation is relatively straightforward. It is expected that threat assessment in general – and particularly threat hypothesis generation – will remain an intensely manual process for some time. Adaptive data collection – seeking evidence to support or refute threat hypotheses – can also be expected to evolve to greater levels of automation, largely driven by anticipated advances in data mining and collection management.

5. Summary

The present paper presents new ideas in estimating and predicting relationships and situations, given uncertain evidence and uncertain models of such relationships and situations. Developments in Situation Theory are extended for application to estimation and prediction problems by
• Generalizing the notion of situation to include any structured part of reality: a single- or multi-target state, including attributes of and relations among entities;
• Using the structure of situations as a means for inferring and assigning prior and posterior statistics, both for "attributive" entity states and for "relational" states. Attributes of individual entities are represented by one-place predicates, relations by >1-place predicates; i.e. attributes are 1-place relations.

Recognizing relationships and situations requires a validated, comprehensive ontology, but one in which the uncertainties in the inference are captured. Additionally, some formal method of situation semantics and situation logic is required. The inferential exploitation of situations for contextually-conditioned estimation of target states and relationships requires some form of inferential calculus, in which uncertainties in the ontology, in sensor/source reports and in the inference process are systematically represented and manipulated. For complex, noisy and poorly-modeled problems such as Threat Assessment, the process will require an integration of abductive, inductive and deductive methods. The ontology must be sufficiently rich and flexible to permit inference across the diversity of information relevant to recognizing and characterizing the complex, diverse, and largely unpredictable threat situations of interest. A key challenge is in maintaining consistency both in representation and in confidence assignment. Another is in constraining what can easily become a combinatoric nightmare.

References
[1] F.E. White, A Model for Data Fusion, Proceedings of the First National Symposium on Sensor Fusion, 1988.
[2] A.N. Steinberg, C.L. Bowman and F.E. White, Revisions to the JDL Model, Joint NATO/IRIS Conference Proceedings, Quebec, October 1998, and in Sensor Fusion: Architectures, Algorithms, and Applications, Proceedings of the SPIE, vol. 3719, 1999.
[3] A.N. Steinberg and C.L. Bowman, Rethinking the JDL data fusion model, Proceedings of the 2004 MSS National Symposium on Sensor and Data Fusion, vol. 1, June 2004.
[4] K. Devlin, Logic and Information, Press Syndicate of the University of Cambridge, 1991.
[5] E.T. Nozawa, Peircean Semeiotic, a new engineering paradigm for automatic and adaptive intelligent systems, http://www.student.nada.kth.se/~tessy/Nozawa.pdf.
[6] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann Publishers, 1988.
[7] J.S. Yedidia, W.T. Freeman and Y. Weiss, Understanding belief propagation and its generalizations, Chapter 8 of Exploring Artificial Intelligence in the New Millennium, ed. G. Lakemeyer and B. Nebel, Morgan Kaufmann Publishers, 2002.
[8] A. Khalil, Computational Learning and Data-Driven Modeling for Water Resource Management and Hydrology, PhD Dissertation, Utah State University, 2005.
[9] J. Barwise and J. Perry, Situations and Attitudes, Bradford Books, MIT Press, 1983.
[10] H. Curry and R. Feys, Combinatory Logic, Volume 1, North-Holland Publishing Company, Amsterdam, 1974.
[11] C.L. Bowman and A.N. Steinberg, A systems engineering approach for implementing data fusion systems, Chapter 16 of Handbook of Multisensor Data Fusion, ed. D.L. Hall and J. Llinas, CRC Press, London, 2001.
[12] A.N. Steinberg, Threat assessment technology development, Proceedings of the Fifth International and Interdisciplinary Conference on Modeling and Using Context (CONTEXT'05), Paris, July 2005.
[13] E. Little and G. Rogova, Theoretical Foundations and Proposed Applications of Threat Ontology (ThrO) to Information Fusion, Final Report, CUBRC contract W7701-011616, June 2004.
Higher Level Fusion for Catastrophic Events
Galina L. ROGOVA
Encompass Consulting
Abstract. The purpose of higher level fusion (situation and impact assessment processes) is to infer and approximate the critical characteristics of the state of the environment in relation to particular goals and information requirements of the decision makers. The result of higher level fusion is a coherent composite picture of the current situation along with prediction of consequences to be used for the enhancement of decision makers' situation awareness and resource management. This paper presents a general methodology of designing higher level fusion processes as applied to a dynamic post-disaster environment. The methodology exploits historical databases and distributed dynamic estimation of object characteristics, relationships, and their behavior to infer contextual understanding of the state of the environment, discovery of the underlying causes of the assessed situation, and prediction of possible consequences.
Introduction

It is widely agreed that the great majority of successful data fusion methods to date have focused on low level (Level 0 and Level 1) fusion related to processing information about a single object of interest (see, e.g. [1]). While effective fusion at the attribute and object levels, producing object identification and characterization, offers real performance gains in many applications, it does not provide for the user situation awareness essential for effective decision making [2]. Situation awareness requires contextual understanding and interpretation of the events and behaviors of interest, which can be achieved by utilizing higher level fusion processes (situation assessment and impact prediction).

The purpose of Higher Level Fusion (HLF) processes is to infer and approximate the critical characteristics of the environment in relation to particular goals, capabilities and policies of the decision makers. HLF utilizes fused data about objects of interest and relationships between them, their behavior, dynamic databases, expert experience, knowledge, and opinion for context processing. The result of HLF is a coherent composite picture of the current situation along with prediction of consequences, which provides decision makers with essential information to help them understand and control the situation and act effectively to mitigate its impact. In the case of multiple decision makers, the Situation and Impact Assessment (SIA) processes have to deliver a consistent current and predicted situational picture relevant to each decision maker's goals and functions.

The main goals of crisis management in a post-disaster environment are to save lives, control the situation, and minimize the effects of the disaster. "Multiple distributed decision makers are searching for the answers to the following questions:
where the problems are, what kind of problems they are, and what the impact of this problem is" [3]. HLF is essential for answering these questions since identification, recognition, and attribution of individual objects are not sufficient for an effective coordinated disaster response. There is a need to convert the fused data about individual objects such as damage to individual buildings, roads, bridges, facilities, fires, and casualties into usable knowledge about current and predicted disaster scenes [2,4]. This paper presents a general approach to designing higher level fusion processing as applied to a dynamic post-disaster environment, partly discussed in [2,5].
1. General Approach

The post-disaster environment has specific characteristics that define requirements for HLF architecture and processes. These characteristics comprise:
• Noisy and uncertain dynamic environment with insufficient a priori statistical information
• Geographically distributed damage, casualties, and resources of first responders
• Geographically distributed uncertain sources of information, often of varying significance, low fidelity, contradictory and redundant
• A large amount of heterogeneous information
• High probability of secondary incidents such as aftershocks and tsunamis in the case of earthquakes, hazmat events, flood, fire, etc.
• Resource and time constraints
• High cost of error
• Multiple decision makers with different goals, functions, and information requirements. Some have tactical missions calling for decisions on direct response to a situation while the others have strategic missions calling for higher level estimation of the situation, impact prediction, and analysis
• Multiple agencies in multiple jurisdictions

These specific domain characteristics call for multi-agent distributed dynamic HLF processes, which have to be scalable, adaptive to resource and time constraints and new and uncertain environments, and reactive to uncertain inputs. These processes also have to accommodate heterogeneous information (both symbolic and numeric), allow for complex distributed system modeling and efficient information sharing, and incorporate qualitative experts' opinions. It is necessary to note that the post-disaster environment characteristics and HLF processing requirements mentioned above are very common to various applications dealing with unintended threat, which makes an approach to building such processing quite generic.

In the disaster environment, the HLF processes continually exploit associated and fused information on single entities, such as casualties, road, building, and facility damage obtained from multiple observer reports, domain knowledge, and the results of domain-specific simulations and models to produce a consistent estimate of the current and predicted state of the environment, which is presented to a user. Figure 1 shows a notional architecture of the HLF processing.
[Figure 1. HLF notional architecture. Level 1 fusion results (objects, attributes, credibility, quality) undergo a quality check, with a delay vs. sensor management trade-off, and consistency/belief-change control before entering reasoning about objects, attributes, aggregates, relationships and their behavior over time within a specific context. The reasoning draws on a formally structured and computationally tractable domain representation (goals, information needs/EEI, hypotheses, actions, domain knowledge), environmental parameters (time of day, weather, geospatial and model parameters) and domain-specific models, and produces dynamic current and predicted situational pictures of the real world, themselves subject to quality checks.]
There are several essential components of fusion processing required for building current and predicted situational pictures:
• Formally structured and computationally tractable domain representation capturing the basic structures of relevant objective reality and users' domain knowledge and requirements, which further serves as a basis for reasoning about the states of the environment.
• Dynamic reasoning procedures about objects, attributes, aggregates, relationships and their behavior over time within a specific context.
• Inter- and intra-fusion-level quality and consistency control procedures.
• Domain-specific models used to support reasoning processes. In the post-disaster environment such models may include dynamic route generation, ambulance dispatch, plume, and hospital operation models.

The remainder of this paper comprises a description of the processes presented in Figure 1 in greater detail.
2. Domain Representation

One of the major challenges of designing the HLF processes is the problem of providing a consistent and comprehensive representation of the domain under consideration. A combination of Cognitive Work Analysis (CWA) and formal ontological analysis of a specific domain is designed to overcome this problem and provide sufficient information about a decision maker's goals, functions, information needs, types of objects, relations between them, and processes to support the domain-specific generation of situational hypotheses and high-level reasoning about these situational hypotheses [6]. Users' goals, functions, and information needs are identified by means of CWA [7]. CWA is a systems-based approach to the analysis, design and
evaluation of an emergency management environment in a post-disaster context. The result of CWA provides answers to the following questions:
• What are the decision makers expecting from a situational picture?
• What information is required for making decisions?
• What active alternative hypotheses about the environment can be expected?
Examples of essential elements of information in the post-disaster environment include regions of casualties, risks of secondary incidents (e.g., hazardous materials spills, fires, floods), areas of impeded transportation, and status of critical facilities (e.g., hospitals, shelters).

The role of a formally structured domain-specific ontology of the environment under consideration is to provide a comprehensively large and metaphysically accurate model of situations, through which specific tasks such as situation assessment, knowledge discovery, or the like, can be more effectively performed, since the information necessary for these decision-making aids is contained within the ontology's structure [8]. The formal ontology framework is necessary to provide a formal structure for ontological analysis of the specific environment and to ensure a certain level of reusability of the designed domain-specific ontology in a different application domain.

A formal ontology of situations comprises two types of items: spatial (situational items) and temporal (processes), together with the relations between and among them. Spatial items, elements of the embedded SNAP ontology, and relations between them are defined by a set of spatial and mereological attributes. The values of these attributes define the state of these items and a corresponding state of the environment. Temporal items, i.e. processes, are elements of the related SPAN ontology, which describe the temporal behavior of the situational items and the dynamics of attributes and relations. Important characteristics of processes are events, representing transitions between states of the environment that define situations. Each relationship characterizing a situation falls into one of two basic categories: inter-class relations and intra-class relations. Intra-class relations (i.e., internal relations) are spatial, temporal, or functional relations that exist within a given set of ontologically similar items, while inter-class relations (i.e., external relations) exist between various items. A more detailed description of a formal ontology of catastrophic events is presented in [8].
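As a rough illustration of this two-part structure, the sketch below models spatial (snap) items and temporal (span) processes as simple classes with example inter- and intra-class relations; all class names, relation definitions and instances are illustrative assumptions for this example, not elements of the ontology in [8].

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class SnapItem:
    """A spatial (situational) item, characterized by spatial/mereological attributes."""
    name: str
    kind: str                        # e.g. 'hospital', 'casualty_cluster'
    location: Tuple[float, float]

@dataclass
class SpanProcess:
    """A temporal item (process) describing how snap items and their relations evolve."""
    name: str
    participants: List[SnapItem]
    start: float
    end: float

def intra_class_relation(a: SnapItem, b: SnapItem, max_dist: float = 1.0) -> bool:
    """Example intra-class (internal) relation: two ontologically similar items that are co-located."""
    dx, dy = a.location[0] - b.location[0], a.location[1] - b.location[1]
    return a.kind == b.kind and (dx * dx + dy * dy) ** 0.5 <= max_dist

def inter_class_relation(process: SpanProcess, item: SnapItem) -> bool:
    """Example inter-class (external) relation: a process involving a given spatial item."""
    return item in process.participants

hospital = SnapItem("county_hospital", "hospital", (0.0, 0.0))
cluster = SnapItem("cluster_12", "casualty_cluster", (0.4, 0.3))
evacuation = SpanProcess("evacuation_3", [cluster, hospital], start=10.0, end=45.0)

print(inter_class_relation(evacuation, hospital))   # True
print(intra_class_relation(hospital, cluster))      # False (different kinds)
```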
3. Situation and Impact Assessment

3.1. Reasoning about Situations

Let Ω be the set of possible states of the environment, Ω_k ⊂ Ω a subset of possible states of the environment relevant to decision maker k, and Pl a plausibility structure on Ω. At each time t, a situational picture relevant to decision maker k can be described as a set of the plausible states of the environment: S_k(t) = {ω_i^k(t) ∈ Ω_k | Pl(ω_i^k(t)) > 0}. It is assumed here that decision makers do not have complete knowledge about all relevant states of the environment (the open world assumption).

Building a situational picture comprises dynamic generation of hypotheses about the states of the environment and assessment of their plausibility via reasoning about situational items and relationships between them within a specific context. In some
cases, assessment of the plausibility of more complex hypotheses may require hierarchical processing, which includes not only reasoning about situational items and relationships between them but also relationships between hypotheses and assessments of the plausibility of lower level hypotheses [9]. Automatic hypothesis generation is the most difficult part of SIA and is not discussed in this paper, in which it is assumed that hypotheses are generated by the users.

The hierarchical structure of emergency management and considerations of hierarchical hypotheses call for reasoning about situations at various levels of granularity. Situation building blocks are described by inter- and intra-class relationships between physical items such as casualties, buildings, and ambulances, or events such as discovery of casualties with a certain type of injury at a certain time or an ambulance and a hospital within a certain time interval. These basic situations are defined as aggregates and are obtained by applying a similarity metric in the feature space. The type of features used for aggregation depends on the information needs of a certain user or a group of users. Context-dependent relationships between aggregates at a certain level of granularity define derived situations at the next level of granularity. Assessment of the plausibility of hypotheses about situational items may include assessments of relationships between hypotheses at lower levels of granularity and the plausibility of these hypotheses.

Derived intra-class situations created by the composition of basic intra-class situations at specific levels of granularity are called elementary situations. Examples of elementary situations are: (1) Communication system situation (e.g., capacity vs. demand; location, boundary, possible causes of the problems; predicted problems), (2) Transportation system situation (e.g., impassable areas), (3) HAZMAT situation (e.g., secondary incidents; location, type, and dynamics of the possible secondary incidents), and (4) Casualty situation (boundary, severity, injury types, dynamics and causes). Relationships between elementary situations within a selected spatio-temporal setting and overall context comprise an overall situational picture.

An important component of situation assessment processing is causal inference aimed at discovery of the underlying causes of observed situational items and their behavior. Discovery of underlying causes of observed situations is the result of abductive reasoning [9,10] ("inference to the best explanation"), which can be initiated by discovery of a "significant deviation from the usual" exhibited by the behavior of attributes and relationships of situational items. The significance of this deviation from the usual is always context dependent, since "significance" is always context specific. For example, in the early post-earthquake phase, reasoning about situations relies on the assumption that casualties and damage are the result of the primary earthquake shock and reported secondary incidents such as fire, flood, and hazmat events. However, some secondary incidents, which might have a devastating impact, can stay unreported (e.g., an unnoticed hazmat incident) and have to be discovered based on observed unusual characteristics of the situational items and their behavior, e.g., discovery of an unusual type of injury.

3.2. Modeling Framework

The modeling framework selected for SIA in the post-disaster environment is a combination of the Probabilistic Argumentation System (PAS) (see, e.g.
[11]) with domain specific models such as a hospital model and a dynamic dispatch/routing model. Following [11], PAS can be described as an approach to non-monotonic
reasoning under uncertainty, combining logic with probability theory for judging hypotheses about the unknown or future world by utilizing given knowledge. Logic is used to find arguments in favor of and against a hypothesis about possible causes or consequences. An argument is a defeasible proof built on uncertain assumptions, that is, a chain of deductions based on assumptions that makes the hypothesis true. Every assumption is linked to an a priori probability that the assumption is true. The probabilities can be understood in a traditional way but can also represent an expert's subjective belief that an assumption is true. The probabilities that the arguments are valid are used to compute the credibility of the hypothesis, which can then be measured by the total probability that it is supported by arguments. The resulting degree of support corresponds to belief in the theory of evidence [12] and is used to determine whether a hypothesis can be accepted, rejected, or whether the available knowledge does not permit a decision to be made.

PAS has many important properties that make it appropriate for reasoning about situations in the post-disaster environment. It exhibits non-monotonicity, provides a mechanism for "cost-bounded approximations" (anytime argumentative reasoning), and supports inferring from given knowledge to a possible explanation for an observed or reported fact.

3.3. Consistency Considerations

An important component of the reasoning about situations is a consistency evaluation step, in which the agents' knowledge is examined for consistency in a Belief Change (BC) process. New information obtained at time t may drive changes to the agents' knowledge utilized in the reasoning processes at time t−1. Two special cases of BC are usually considered in the literature: Belief Revision (BR) and Belief Update (BU) [13]. BR focuses on how agents revise their beliefs when they adopt new belief information in a static environment. BU focuses on how agents should change their beliefs when they realize that the world has changed. In distributed multi-agent dynamic situations both BR and BU can be justified.

In general, the transition from the old to new beliefs is supposed to obey the principle of minimum change and the principle of priority of new information. Basically, these principles suggest that BC retracts some of the old information after obtaining new information to make the agents' beliefs consistent. The principle of priority of incoming information changes the credibility of certain possible worlds and can make them equal to zero after processing new information. The principle of minimum change means minimizing some kind of distance between the old and the new plausibility distributions. In distributed multi-agent dynamic situations, in which uncertain information is coming from different (often unreliable) sources at different times, the priority of incoming information is not totally justified since the chronological sequence of information has nothing to do with its importance. In such cases, the quality of information has to be taken into account while building the BC processes and adjusting the plausibility distribution to accommodate new information. For example, new information can be rejected if it is inconsistent with an absolutely reliable prior plausibility distribution or background knowledge.
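Returning to the PAS formulation of Section 3.2, the sketch below computes the degree of support for a hypothesis as the total probability that at least one supporting argument (a set of uncertain assumptions) holds; the independence of assumptions, the assumption names and the argument sets are simplifying illustrative assumptions, not elements of the framework in [11].

```python
from itertools import product

# A priori probabilities that each uncertain assumption is true.
assumption_prob = {"road_blocked": 0.6, "report_reliable": 0.8, "plume_model_valid": 0.7}

# Arguments in favor of the hypothesis "secondary hazmat incident":
# each argument is a set of assumptions that, if all true, prove the hypothesis.
arguments = [
    {"report_reliable", "plume_model_valid"},
    {"road_blocked", "plume_model_valid"},
]

def degree_of_support(arguments, assumption_prob):
    """Total probability that at least one argument holds (assumptions treated as independent)."""
    names = sorted(assumption_prob)
    support = 0.0
    for truth in product([True, False], repeat=len(names)):
        world = dict(zip(names, truth))
        p_world = 1.0
        for n in names:
            p_world *= assumption_prob[n] if world[n] else 1.0 - assumption_prob[n]
        if any(all(world[a] for a in arg) for arg in arguments):
            support += p_world
    return support

# The degree of support plays the role of belief in the theory of evidence [12].
print(round(degree_of_support(arguments, assumption_prob), 3))
```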
4. Quality Control

First response phase casualty mitigation operations are under severe time and resource constraints, and timely decision making and swift action are required. Waiting may result in unacceptable decision latency, leading to either wasted resources or lost lives. At the same time the cost of false alarms can be very high, since valuable resources might be diverted from the location where it later becomes clear that they are critically needed. Therefore the cost of waiting for additional information, or the cost of additional computation delay to produce information of better quality and reduce false alarms, has to be justified by the benefits of achieving results of better quality. These considerations call for "quality control" processing, which can be achieved by either implicitly modeling the expected utility of making a decision at a certain moment or comparing the quality of information achieved at a certain time with a time-varying threshold [14].

Quality control has to be implemented multiple times throughout the whole dynamic process of building and delivering current and predicted situational pictures to the users to ensure their quality [15]. The information quality consideration plays an important role in transferring information between fusion levels, starting from presenting the results of detection and identification of each object (e.g., each casualty or each damaged facility) to HLF, continuing with considering the result of estimation of situation states for prediction of the impact of the assessed situation, and ending with displaying the results of HLF to the decision makers. The utility of waiting for higher information quality should also be taken into account while information is transferred between different subprocesses within fusion levels. For example, the quality of aggregation has to be considered in the state estimation and impact prediction steps.

The quality criteria to be employed are context specific and may relate to any particular component of quality or to any combination of quality components such as uncertainty, relevance, reliability, and significance. Thus the quality of the level 1 estimations can be assessed based on the compound reliability of the associated and fused reports on casualty or damage and their location uncertainty. In general, the problem of defining information quality criteria requires a clear understanding of how to measure data and information quality from the perspective of information fusion process designers and how different aspects of information quality are interrelated. In certain situations, when decisions based on the resulting state estimations have very serious consequences, a sensor management process can be employed. For example, a highly reliable sensor, perhaps a policeman or structural engineer, can be tasked to observe and evaluate the situation in question.
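A minimal sketch of the second option described above – comparing achieved information quality against a time-varying threshold – is given below; the exponentially decaying threshold, its parameters and the quality numbers are illustrative assumptions only, not values from the paper or from [14].

```python
import math

def quality_threshold(t, q0=0.9, rate=0.05):
    """Time-varying quality threshold: demands high quality early, relaxes as waiting grows costly."""
    return q0 * math.exp(-rate * t)

def decide_or_wait(quality_over_time):
    """Act at the first time step where achieved quality meets the current threshold."""
    for t, q in enumerate(quality_over_time):
        if q >= quality_threshold(t):
            return t, q
    return None  # never acted within the horizon

# Achieved information quality improves as more reports are associated and fused.
achieved = [0.30, 0.45, 0.55, 0.70, 0.78, 0.82]
print(decide_or_wait(achieved))   # acts once quality overtakes the decaying threshold
```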
5. Conclusions

This paper presents an approach to designing a general methodology for HLF and its application to situation and impact assessment in a dynamic post-disaster environment. The SIA methodology described herein utilizes a combination of CWA and ontological analysis of catastrophic events developed within the framework of a formal ontology. This combination allows for comprehensive and formally structured domain representation and provides sufficient information to support high-level reasoning processes. The dynamic situational picture is built by analyzing spatial and temporal relations of the situational entities and entity aggregations at different levels of
granularity, and their dynamics, provided within the overall situational context. The modeling framework selected for SIA is a combination of PAS with domain-specific models. The presented methodology includes multi-step inter-level and intra-level processing information exchange comprising a quality control step, in which information is evaluated to see whether it satisfies certain quality criteria, and a consistency evaluation step, in which the state estimate is examined for consistency in a belief change process. The presented approach to building HLF processing is quite generic and can be used in other applications.
Acknowledgements This work was supported by the AFOSR under award F49620-01-1-0371. The author gratefully acknowledges valuable input from Ann Bisantz, Eric Little, and Peter Scott.
References
[1] M.L. Hinman, Some Computational Approaches for Situation Assessment and Impact Assessment, Fusion 2002, Annapolis, MD, July 8-11, 2002, pp. 687-693.
[2] P. Scott, G. Rogova, Crisis Management in a Data Fusion Synthetic Task Environment, in: Proc. of the FUSION'2004 – 7th Conference on Multisource Information Fusion, 2004.
[3] State of California Emergency Plan, Governor's Office of Emergency Services, 1998.
[4] J. Llinas, Information Fusion for Natural and Man-Made Disasters, in: Proc. Fifth Int. Conf. on Information Fusion, pp. 570-574, Annapolis, MD, USA, 8–11 July 2002.
[5] G. Rogova, P. Scott, C. Lollett, Higher level fusion for post-disaster casualty mitigation operations, in: Proc. of the FUSION'2005 – 8th Conference on Multisource Information Fusion, 2005.
[6] A. Bisantz, G. Rogova, E. Little, On the Integration of Cognitive Work Analysis within Information Fusion Development Methodology, in: Proc. of the Human Factors and Ergonomics Society Annual Meeting, New Orleans, 2004.
[7] A. Rasmussen, J. Pejtersen, L. Goodstein, Cognitive Systems Engineering, Wiley, New York, 1994.
[8] E. Little, G. Rogova, Ontology Meta-Model for Building a Situational Picture of Catastrophic Events, in: Proc. of the FUSION'2005 – 8th Conference on Multisource Information Fusion, 2005.
[9] P. Thagard, C.P. Shelley, Abductive reasoning: Logic, visual thinking, and coherence, in: M.-L. Dalla Chiara et al. (Eds.), Logic and Scientific Methods, Kluwer, Dordrecht, 413-427, 1997.
[10] G. Harman, The Inference to the Best Explanation, Philosophical Review 64, 88-95, 1965.
[11] R. Haenni, J. Kohlas, N. Lehmann, Probabilistic Argumentation Systems, in: J. Kohlas, S. Moral (eds), Handbook of Defeasible Reasoning and Uncertainty Management Systems, Volume 5: Algorithms for Uncertainty and Defeasible Reasoning, Kluwer, 2001.
[12] G. Shafer, A Mathematical Theory of Evidence, Princeton University Press, Princeton, 1976.
[13] N. Friedman, J.Y. Halpern, Modeling Beliefs in Dynamic Systems, Part I: Foundations, Artificial Intelligence 95:257-316, 1997.
[14] G. Rogova, P. Scott, C. Lollett, Distributed Fusion: Learning in multi-agent systems for time critical decision making, in: Data Fusion for Situation Monitoring, Incident Detection, Alert and Response Management, E. Shahbazian, G. Rogova, P. Valen (eds), FOI Press, 123-152, 2005.
[15] J. Llinas, C. Bowman, G. Rogova, A. Steinberg, E. Waltz, and F. White, Revisions to the JDL Data Fusion Model II, in: Proc. of the FUSION'2004 – 7th Conference on Multisource Information Fusion, 2004.
Ontology-Driven Knowledge Integration from Heterogeneous Sources for Operational Decision Making Support
Alexander SMIRNOV, Michael PASHKIN, Nikolai CHILOV, and Tatiana LEVASHOVA
St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences
Abstract. This paper discusses a context-based approach to intelligent support of operational decisions. This approach involves integrating information and knowledge relevant to a decision-making problem, and then solving this problem by taking into account the dynamic nature of the environment. The context is formed as a knowledge-based problem model expressed by a set of constraints. The formalism of object-oriented constraint networks is applied to describe the problem and to solve it as a constraint satisfaction problem. The approach takes into account alternative and reusable problem models and capture of decisions.

Keywords. Ontology management, context management, constraint satisfaction, decision making
Introduction

The goal of intelligent support to operational decision making is to assess the relevance of information to a decision and gain insight in seeking and evaluating possible decision alternatives. Operational decisions must rely on timely, accurate, directly usable, and easily obtainable information provided by the dynamic environment. These decisions can be characterized as ad hoc, quickly accessible, often repeatable, based on past experience, and providing for alternatives. Intelligent support for decision making is critical to successfully achieving established goals and objectives. This is particularly important at the operational level, where the decision maker must transform intentional goals into deliberate decisions.

The paper presents an approach to intelligent support for operational decision making as a follow-up of the Knowledge Source Network (KSNet) approach [1]. The KSNet-approach was a research effort in knowledge logistics as part of the knowledge management activities dealing with intelligent knowledge delivery. The approach presented in this paper inherits the main aspects of the KSNet-approach, such as the ontology model for knowledge representation and integration, and constraint-based problem specification and solving. The KSNet-approach was tested through a number of case studies in areas of e-business [2] and health service logistics [3].

The case study of health service logistics showed that, in real life situations, to find the right solution for a problem, it is not enough to have knowledge describing how to express and how to solve the problem. The problem needs some actual information,
e.g., current location of an object, time, weather conditions, etc. Handling this problem requires an appropriate context model to represent, manipulate, and access context information. Context is defined as any information that can be used to characterize the situation of an entity, where an entity is a person, place, or object that is considered relevant to the interaction between a user and an application, including the user and applications themselves [4]. The main idea behind the approach consists of creating a knowledge-based model of the user problem and then solving it considering the dynamic nature of the environment. Technologies supporting the approach are ontology management for identification and definition of the problem requiring a solution; context management for gathering and organizing contextual information and for capturing decisions; and constraint satisfaction for problem solving.
1. Approach Overview

The approach aims at modeling the user problem (the user being decision makers and other participants involved in the decision making process) and solving it. The concept “problem” is used for either a problem at hand to be solved or a current situation to be described. A well-known model of a decision making process, Simon’s model [5], consists of the phases shown in Table 1. The starting point of decision making is the decision maker’s goal or the problem at hand. The role of information is particularly fundamental at the “Intelligence” and “Design” phases, which deal with problem definition and generation of alternative decisions. At these phases, information acts as a constraint on decisions. The goals and constraints are interchangeable in the role they play in defining problems. The constraints are part of the model, whereas the goals depend on the evaluation. Later it was concluded that “satisfying” decisions are a good choice for decision support.

Table 1. Simon’s model and proposed approach

Simon’s phase names: Intelligence | Design | Choice
Phase content: Problem recognition | Alternatives generation | Efficient alternatives selection
Steps: fixing goals; setting goals; designing alternatives; evaluation & choosing alternatives
Proposed approach steps: abstract context composition; constraint-based generation of efficient alternatives; operational context producing
Following Simon’s model, operational decision support within the approach is considered to consist of two parts: a preliminary stage and a decision making stage. The preliminary stage is responsible for the preparedness of a decision support system to make decisions. Activities carried out at this stage are (i) creation of semantic models for the components of a decision support system (information sources, domain knowledge representing the area of interest, and users), (ii) accumulation of domain knowledge, and (iii) coupling of the domain knowledge with the environment sources (sensors, Web sites, databases, etc.). An information and knowledge representation ontology model has
been chosen. The decision making stage deals with the integration of information and knowledge relevant to the problem, problem modeling, and problem solving. Decision making is a complex process in which a large number of factors can have an effect on a single problem. To take naturally into account the various factors and constraints imposed by the environment, the mechanism of Object-Oriented Constraint Networks (OOCN) [2] is employed. The problem is modeled by a set of constraints. Constraints can provide the expressive power of full first-order logic [6], which tends to be used as the key logic for ontology formalization. Problem solving within the model of a decision making process [5] suggests resolving a problem by selecting a “satisfactory” solution. This suggestion is more easily applied to intricate problems with numerous constraints. Within the proposed approach, the problem expressed by a set of constraints is to be solved by specialized solvers as a Constraint Satisfaction Problem (CSP). A result of CSP solving is one or more satisfactory solutions for the problem modeled. The CSP model consists of three parts: a set of variables; a set of possible values for each variable (its domain); and a set of constraints restricting the values that the variables can simultaneously take. To express the problem by a set of constraints that would be compatible with the ontology model on the one hand and with internal solver representations on the other, the formalism of OOCN is used. Typical ontology modeling primitives are classes, relations, functions, and axioms. A correspondence between the primitives of ontology modeling and OOCN is shown in Table 2.

Table 2. Primitives of the Ontology Model and OOCN

Ontology Model | OOCN
Class | Object
Attribute | Variable
Attribute domain (range) | Domain
Axioms and relations | Constraints
According to the formalism of OOCN, knowledge is described by sets of classes, class attributes, attribute domains, and constraints. The set of constraints consists of constraints describing the “class, attribute, domain” relation; constraints representing structural relations such as the hierarchical relationships “part-of” and “is-a”, class compatibility, associative relationships, and attribute cardinality restrictions; and constraints describing functional dependencies. Within the approach, the user problem is modeled by two types of context: abstract and operational. Abstract context is a knowledge-based model integrating information and knowledge relevant to the problem. Operational context is an instantiation of the abstract context with data provided by information sources.
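As a rough illustration of how an OOCN can be processed as a CSP, the following minimal Python sketch encodes a toy problem with variables (class attributes), domains, and constraints and solves it by simple backtracking. It is only a hypothetical example for this chapter: the attribute names, domains, and constraints are invented for illustration and are not part of the approach described above.

# Minimal illustrative sketch: a toy OOCN expressed as a CSP (hypothetical names).
# Variables correspond to class attributes, domains to attribute ranges,
# constraints to axioms and functional dependencies.

variables = {
    "Route.transport": ["air", "land"],       # attribute of a hypothetical class Route
    "Weather.visibility_km": [1, 5, 10],      # attribute of a hypothetical class Weather
    "Route.available": [True, False],
}

def air_needs_visibility(a):
    # constraint: an air route requires visibility of at least 5 km
    if "Route.transport" in a and "Weather.visibility_km" in a:
        return not (a["Route.transport"] == "air" and a["Weather.visibility_km"] < 5)
    return True

def availability_consistent(a):
    # constraint: an air route must be marked as available
    if "Route.transport" in a and "Route.available" in a:
        return a["Route.available"] or a["Route.transport"] == "land"
    return True

constraints = [air_needs_visibility, availability_consistent]

def backtrack(assignment, names):
    if not names:                              # every variable assigned: a satisfactory solution
        return dict(assignment)
    name, rest = names[0], names[1:]
    for value in variables[name]:
        assignment[name] = value
        if all(c(assignment) for c in constraints):
            result = backtrack(assignment, rest)
            if result is not None:
                return result
        del assignment[name]
    return None

print(backtrack({}, list(variables)))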
2. Decision Support Stages

2.1. Preliminary Stage

The components of the decision support system are modeled as follows: domain knowledge is modeled by the ontology model; the semantics of information sources is described by an information source capabilities model; users are modeled by a user profile
model. All the components are represented in the OOCN formalism. The common representation of the components allows unification of the information and knowledge they provide, enables information sources to be handled in the same manner, and simplifies integration of the information and knowledge. Operational decision making deals with complex problems requiring deep knowledge of the domain. Decision makers do not necessarily have satisfactory knowledge, which, under the time pressure pertinent to operational decision making, can lead to wrong decisions. Because of this, the approach relies on the availability of sufficient domain knowledge and on the support of domain experts, if required. The domain knowledge has to be collected before it can be used in problem solving and decision making. Knowledge collecting includes phases of knowledge representation and integration. Due to the ontology model, the heterogeneous knowledge being collected is represented in a uniform way. An ontology library serves as a repository for the collected knowledge. It provides a vocabulary for knowledge representation and supports the internal knowledge representation formalism for specification of the ontologies it contains. The components of the ontology library are multiple ontologies of two types: domain ontologies and tasks & methods ontologies. A domain ontology represents conceptual knowledge about the domain, and a tasks & methods ontology formalizes the tasks identified for the domain and hierarchies of problem solving methods (taking alternative methods into account). The tasks and methods are represented by classes; the sets of method arguments and argument types are represented by sets of attributes and domains, respectively. Domain and tasks & methods ontologies are interrelated by functional constraints. In order to obtain actual information from the environment, ontologies are linked to information sources that keep track of environment changes. Since information sources and users are represented by the same formalism as the ontologies, domain knowledge and the environment are coupled through the attributes. For this, attributes of the domain ontology and attributes of the representations of information sources and users are linked by associative relationships. The links mean that the attribute of the ontology class gets values provided by the information source or user (Figure 1).

2.2. Decision Making Stage

The starting point for the decision making stage is the user request containing the formulation of the user problem in the form presented by the user. Based on the result of the request recognition, knowledge relevant to the request is searched for within, and extracted from, the ontology library. The found knowledge is integrated. Ontology management techniques are used to carry out the extraction of relevant knowledge, its integration, and consistency checking.
Figure 1. Links between domain knowledge and information sources
Since the user vocabulary (the request vocabulary) and the ontology library vocabulary are supposed to be different, the first step consists in matching the request vocabulary and the vocabulary of the ontology library. Then the terms of the request are searched for throughout the ontology library. The terms found serve as “seeds” for the slicing operation [7]. The purpose of this operation is to extract from the ontology library the knowledge that is believed to be relevant to the request, and consequently to the user problem. The operation assembles knowledge related to the “seeds” based on attribute and constraint inheritance rules. The result of the operation is a set of ontology slices containing pieces of knowledge that surround the “seeds”. Slices that combine knowledge representing alternative methods, or alternative branches of diverse domain ontologies, are considered alternative. The slices are merged so that alternative slices become parts of different pieces of knowledge. The resulting pieces of knowledge are considered relevant to the problem (Figure 2). The operation of merging is supported by (i) the common vocabulary and formalism used for the internal knowledge representation, (ii) a simple representation formalism excluding complex interdependencies that can introduce semantic ambiguity, and (iii) a well-defined area of interest (application domain) ensuring the collection of reliable knowledge provided by experts with a fundamental understanding of the domain.

Figure 2. Set of alternative knowledge relevant to the problem
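To make the slicing operation more tangible, here is a minimal Python sketch that treats the ontology as a graph of related classes and collects the neighbourhood of the “seed” classes. The class names and the depth limit are hypothetical, and real slicing would also apply the attribute and constraint inheritance rules mentioned above.

# Illustrative sketch only: ontology slicing as graph traversal from "seed" classes.
from collections import deque

ontology = {                     # class -> related classes ("is-a", "part-of", associative)
    "Route": ["Transportation line", "Weather conditions"],
    "Transportation line": ["Airport"],
    "Weather conditions": ["Visibility", "Wind"],
    "Airport": [],
    "Visibility": [],
    "Wind": [],
}

def slice_ontology(seeds, max_depth=2):
    """Collect classes reachable from the seed terms within max_depth steps."""
    found, frontier = set(seeds), deque((s, 0) for s in seeds)
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_depth:
            continue
        for neighbor in ontology.get(node, []):
            if neighbor not in found:
                found.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return found

print(slice_ontology(["Route"]))   # one slice of knowledge surrounding the seed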
Due to relations between ontologies and the environment, assigned in the ontology library, the merged knowledge is connected to information sources and users that will provide data. In this sense, users are considered as information sources. Obtaining
information and organizing it in contexts are technologies addressed by context management. Associative relationships between the attributes of an information source and the knowledge describing the problem show what data are needed (Figure 3). If an information source has a complex data structure, a slice of the information source representation is formed, limited to the structure elements needed for the problem solving. If an information source has a simple data structure, the slice is the full representation of the information source. The slices of information sources and the knowledge relevant to the problem are merged again. Ontology-driven knowledge integration makes it possible to apply consistency-checking methods to the integrated knowledge and to create a consistent context. The consistent context is considered an abstract context. Alternative ontology slices make up alternative contexts. The abstract context is an ontology-based problem model supplied with links to information sources. These information sources provide the data needed for the given problem and instantiate the abstract context through redefinition of variable domains. The abstract context with fully or partially redefined domains is an operational context, which is, on the one hand, the problem model along with the problem data and, on the other hand, an OOCN to be processed as a CSP.

Figure 3. Organization of relevant information and knowledge ((i) – input argument; (o) – output argument)
In order to enable the capturing, monitoring, and analysis of decisions and their effects, the contexts representing problem models and respective decisions made are retained. To do this, context versioning techniques, which are a context management issue, are applied. As a result the user is provided with reusable problem models and knowledge of similar situations and decisions made.
Conclusion

The paper describes a context-based approach to operational decision support as a follow-up of the KSNet-approach developed earlier. The approach aims at modeling and solving the decision maker's problem. The problem is represented as an ontology-based context integrating the information and knowledge relevant to it. A constraint-based problem model is used for problem solving. The approach is based on a common knowledge representation of the main components of a decision support system: domain knowledge, information sources, and the decision maker. It incorporates ontology management techniques for the integration of knowledge relevant to the decision maker's problem, context management methods for the integration of information relevant to the problem and for the organization of contexts, and the object-oriented constraint network mechanism for problem solving. The applicability of the approach to a real application domain is illustrated through a case study of portable hospital configuration [3].
Acknowledgement

The paper is based on research carried out as part of Partner Project # 1993P funded by EOARD of the USAF, project # 16.2.35 of the research program "Mathematical Modelling and Intelligent Systems", project # 1.9 of the research program “Fundamental Basics of Information Technologies and Computer Systems” of the Russian Academy of Sciences, the project funded by grant # 05-01-00151 of the Russian Foundation for Basic Research, and the CRDF partner project with US ONR and US AFRL & EOARD.
References

[1] Smirnov, A., Pashkin, M., Chilov, N., Levashova, T., and Haritatos, F. Knowledge Source Network Configuration Approach to Knowledge Logistics. International Journal of General Systems, Taylor & Francis Group, 32 (3), (2003), 251–269.
[2] Smirnov, A., Pashkin, M., Chilov, N., Levashova, T. KSNet-Approach to Knowledge Fusion from Distributed Sources. Computing and Informatics, 22 (2003), 105–142.
[3] Smirnov, A., Pashkin, M., Chilov, N., Levashova, T., Krizhanovsky, A. Fusion-based knowledge logistics in network-centric environment: intelligent support of OOTW operations. Proceedings of the Seventh International Conference on Information Fusion, Stockholm, Sweden, June 28 – July 1 (2004), 487–494.
[4] Dey, A.K., Salber, D., Abowd, G.D. A Conceptual Framework and a Toolkit for Supporting the Rapid Prototyping of Context-Aware Applications. Context-Aware Computing, A Special Triple Issue of Human-Computer Interaction, T.P. Moran, P. Dourish (eds.), Lawrence Erlbaum, 16 (2001), http://www.cc.gatech.edu/fce/ctk/pubs/HCIJ16.pdf.
[5] Simon, H.A. Making management decisions: The role of intuition and emotion. Academy of Management Executive, 1 (1987), 57–64.
[6] Bowen, J. Constraint Processing Offers Improved Expressiveness and Inference for Interactive Expert Systems. International Workshop on Constraint Solving and Constraint Logic Programming, (2002), 93–108.
[7] Chaudhri, V.K., Lowrance, J.D., Stickel, M.E., Thomere, J.F., Wadlinger, R.J. Ontology Construction Toolkit. Technical Note, AI Center Report, SRI Project No. 1633, January 2000.
Evaluation of Information Fusion Techniques Part 1 – System Level Assessment

Erik BLASCH
Air Force Research Laboratory, Dayton, OH

Susan PLANO
Wright State University, Dayton, OH
Abstract: To determine the performance gain of an information fusion system, a designer or integrator must determine whether the design is useful. Usefulness can be established through a system level assessment of whether the fusion system meets specified goals. The chapter demonstrates a fusion evaluation process as a guideline by which an algorithm or system can meet the desired objectives. The assessment requirements include usability goals, fusion metrics, experiment design, and test scenarios.
1. Information Fusion

Information Fusion (IF) strategies are employed to (1) reduce uncertainty, (2) reduce dimensionality, and (3) increase responsiveness [1]. By combining information (multi-database, multi-sensor, multi-look, multi-mode), the uncertainty is minimized, as shown in Figure 1. Additionally, by combining two pieces of information into a single entity, we have reduced the complexity or dimensionality. Transmitting one piece of evidence is faster than transmitting two pieces of evidence. While traditional definitions are valid, another way to address IF is whether or not the desired performance goals are achieved.
Figure 1. Uncertainty reduction through Fusion of information to reduce the covariance
If we address the IF system as a tool, as opposed to an entity unto itself, we must address the context in which the system augments a user’s task. Applications for multisensor IF require insightful analysis of how IF systems will be deployed and
utilized. Increasingly complex, dynamically changing scenarios arise, requiring more intelligent and efficient reasoning strategies. Integral to information reasoning is Decision Making (DM), which requires pragmatic knowledge representation for user interaction [2]. Many IF strategies are embedded within systems; however, the user-IF system must be rigorously evaluated by a standardized method over various environments, changing targets, differing sensor modalities, and IF algorithms [3]. The current fusion model supporting the evaluation and deployment of sensor fusion systems is the User-Fusion model, shown in Figure 2(a). The key here is that the IF system is a tool that augments the user’s perception to assess social, political, and environmental situations. A conventional example is the tracking of targets [4-7].
Figure 2. User Fusion model and the 2004 DFIG model
A useful model is one which represents a real-world system instantiation. The IF community has rallied behind the JDL process model with its revisions and developments [8]. The Data Fusion Information Group (DFIG) assessed the current model, shown in Figure 2(b). Note that the DFIG model separates management into three functions: (1) user, (2) platform or mission, and (3) sensor. The user makes decisions on the data observed. Mission management includes the context (i.e., social and political information), which primes the IF system with a priori information. The third aspect is resource management, which is the control of the sensors gathering the data. The key aspect of fusion system assessment includes many features traditionally overlooked by the community, namely the functionality of the IF system to meet the desired management objectives under real-world constraints. One such case, for simplicity, is a usability analysis. If the IF system is only understood by the engineers, it will never be utilized. In this model (the views expressed in the paper are those of the authors and do not reflect the official position of the DFIG), the goal was to separate IF and management functions. Management functions are divided into sensor control, platform placement, and user selection to meet mission objectives. Level 2 (SA) includes tacit functions which are
inferred from the Level 1 explicit representations of object assessment [9]. Since the unobserved aspects of the SA problem cannot be processed by a computer, user knowledge and reasoning are necessary. Likewise, issues with Level 1 belief conflicts [10] may not always be solvable by fusion algorithms. The current definitions, based on the revised DFIG fusion model [8], include the standard Level 0–5 definitions plus:

• Level 6 − Mission Management (an element of Platform Management): adaptive determination of spatial-temporal control of assets (e.g., airspace operations) and route planning and goal determination to support team decision making and actions (e.g., theater operations) over social, economic, and political constraints.

The chapter details the experiment design (testing), metrics, and a site security example.

2. System Level Assessment

System level assessment includes listing all the possible operating conditions that could affect fusion process outcomes. An assessment is the functional determination of whether a system is properly functioning. Assessment techniques include software checks to determine whether the input/output data streams are working. Likewise, an assessment may involve estimating system value. The key to assessment is whether IF systems meet desired qualitative and quantitative objectives (i.e., those facilitating management functions). An evaluation determines system performance and may include a quantitative estimate of that performance. Before an IF system can be deployed for use, it must go through a rigorous testing phase. Testing includes (1) usability analysis, (2) design of experiments, and (3) objective analysis. For a usability analysis, the interface design must be assessed as to its presentation of information. In the algorithm design phase, the engineer usually outputs the fusion result and determines whether the data is correct. For example, in a tracking scenario [6, 7], the output track can be compared with the true track of the object. For a rigorous test, the algorithm must be robust to changing conditions, and thus a Design of Experiments (DOE) analysis is needed. For example, if tracking a target, the output track can be compared against different runs in which various sensor covariance changes are conducted. If the covariance is low, the tracker will do better than if the covariance is large. The third element, objective analysis, tests whether the system works with real-world data [4]. For example, terrain and real-world sensor models can be used.
Figure 3. Assessment process for an IF system design and transition to use
The assessment levels are analytical, simulation, empirical, and process, as shown in Figure 3. In typical cases, the fusion community only looks at the analytical and simulated results. What is needed is the augmentation of the IF system with empirical (real-world) data and embedding the IF design in the process. The process includes how the IF system will be operated and controlled by users. For our tracking problem, the output of the IF system will be relayed to a commander. The commander would like this data in real time and expects a certain level of accuracy (covariance) and reliability (confidence). Figure 4 shows the information necessary for assessment. Data starts with a signal, transduced by sensors, that affords signal extraction. Once the data is resolved, the IF algorithm must perform signal estimation and recognition. These functions together can be assessed through Measures of Performance (MOPs). However, to determine whether the system meets desired goals, we must also assess the system as to its effectiveness and usefulness. Typically, these measures require user/customer expectations; see [1] (Ch. 13) for an explanation. Hence, the assessment process is actually one of determining the qualitative and quantitative fusion evaluation goals. Qualitative information can be checks to determine whether the system works (T/F). The quantitative measures are shown next.
Figure 4. Assessment measures of the IF system
3. Information Fusion Metrics for Assessment

Metrics are the key to the analysis. While each individual metric can be achieved, it is important to determine a set of metrics that must be assessed simultaneously. As explained in the introduction, many of the metrics are the same ones that the sensor manager is trying to optimize. Dynamic DM requires (1) SA, (2) dynamic responsiveness to changing conditions, and (3) continual evaluation to meet throughput and latency requirements. These three factors are instantiated by an IF system to allow the user to make decisions and management functions to re-plan platforms and control sensors. To afford interactions between future IF designs and users' information needs, metrics are required. The metrics chosen include timeliness, accuracy, throughput, confidence, and cost. These metrics are similar to the standard QoS metrics in communication theory and the human factors literature, as shown in Table 1 [2].

Table 1: Quality of Service (QoS) Metrics for Various Disciplines

Communication: Delay | Probability of Error | Delay Variation | Throughput | Cost
Human Factors: Reaction Time | Confidence | Attention | Workload | Cost
Info Fusion: Timeliness | Confidence | Accuracy | Throughput | Cost
ATR/ID: Acquisition/Run Time | Prob. (Hit), Prob. (FA) | Positional Accuracy | No. Images | Collection Platforms
Track: Update Rate | Prob. of Detection | Covariance | No. Targets | No. Assets
While there is a host of metrics that can be utilized, it is the scenario and the user that determine the key attributes desired from the fusion system. The next section shows an example of a way in which IF aids in analysis as opposed to being the analysis.

4. Information Fusion Example – Site Security

Site security is growing in importance, and increased automation is needed to assist operators with impending threats. In many cases, the customer desires an accurate track and ID on all objects within a vicinity. Additionally, they desire a timely set of information. One way to achieve the goal is to place many human observers in the location of the targets. Another way would be to augment the human observers with Unmanned Aerial Vehicles (UAVs) equipped with visual and infrared sensors. The IR sensors would be able to detect heat, and the visual cameras would be able to recognize targets. While accuracy is typically assessed from the tracker and the confidence in target ID, we seek to determine whether responsiveness and spatial coverage can also be increased. To determine whether an IF design could be employed to meet the user’s needs, we simulate a system of UAVs circling overhead with sensors. The sensor models include operational systems with estimated sensor uncertainties. A set of observers (humans in patrols) with UAVs is simulated. Some realism is used as to incidents of threats, and the fusion system is evaluated as to whether the design objectives are achieved. In the initial evaluation, a track and ID algorithm is used, and robustness is achieved by assessing varying operating conditions (anticipation time and force protection). A response time analysis is conducted to determine whether a decision (reliable target position and ID) is available. For the response time, we look at the ability to
make decisions, where λ is the arrival rate of detected events processed by the user, μ is the movement rate of the adversary, and ρ = λ / μ is the sensor utilization. ρ indicates the ability of the user to act faster and utilize the information before the adversary moves (larger ρ). We assume that λ is Poisson-distributed, so the probability of events is

Pr[E items arrive in T] = Pr[X = E] = ((λT)^E / E!) e^(−λ(t − T))    (1)

where the events detected and processed by the operator over the OODA loop (observe, orient, decide, act) cycles are:

P(E_OP) = e^(−[λ_OO + λ_OD + λ_DA + λ_AO](t − T))    (2)
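As a purely illustrative aid, the short Python sketch below evaluates the reconstructed forms of Eqs. (1) and (2); all rate values and time windows are hypothetical and are not taken from the scenario described later.

import math

def prob_arrivals(E, lam, T, t):
    # Eq. (1): probability that E detected events arrive in the window T
    return (lam * T) ** E / math.factorial(E) * math.exp(-lam * (t - T))

def prob_ooda(lam_oo, lam_od, lam_da, lam_ao, T, t):
    # Eq. (2): events detected and processed by the operator over the OODA-loop cycles
    return math.exp(-(lam_oo + lam_od + lam_da + lam_ao) * (t - T))

print(prob_arrivals(E=3, lam=0.5, T=4.0, t=6.0))
print(prob_ooda(0.2, 0.1, 0.1, 0.05, T=4.0, t=6.0))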
If we look at the analysis postulated, we can determine the effectiveness of the IFS to aid the user in decision making. Since we want proactive and preventive strategies, we assess the capability of the IFS to increase insurgent risk, increase effort, and lower adversary payoff. Using the user-adversary action relationships, we have

ρ = λ T_s = λ / μ    (3)
If ρ > 1, the user is faster than the adversary. If ρ = 1, the user and the adversary act at the same time. In the worst case, if ρ < 1, the user acts slower (or has less data). The evaluations of effort and risk are shown below.

EFFORT: T_effort = ρμ / (1 − ρ)    (4)
  Ex.: ρ = 2 (the user is faster): T = −2, hence the adversary is BLOCKED
  Ex.: ρ = 1/2 (the user is slower): T = 1, hence the adversary ACTS

RISK: R_events = ρ²μ / (1 − ρ)    (5)
  Ex.: ρ = 2 (the user is faster): R = −4, hence a NEGATIVE EFFECT
  Ex.: ρ = 1/2 (the user is slower): R = 2, hence a POSITIVE EFFECT
With ρ > 1, the user is afforded information faster, which allows the decision-making cycle to be reduced and creates a negative effect on adversary action. Let the number of users be n and the number of adversaries be m, each observing the situation. The user has N sensors (i.e., UAVs) and the adversaries have M sensors; then

ρ = [N λ / n] / [M μ / m]    (6)
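The following minimal Python sketch simply evaluates Eq. (6) and applies the interpretation of ρ given above; the sensor counts and rates are hypothetical placeholders, not the values used in the scenario of Section 6.

def utilization(N, lam, n, M, mu, m):
    # Eq. (6): user event-processing rate versus adversary movement rate
    return (N * lam / n) / (M * mu / m)

rho = utilization(N=30, lam=0.1, n=10, M=50, mu=0.2, m=50)
if rho > 1:
    print(f"rho = {rho:.2f}: the user acts faster than the adversary")
elif rho == 1:
    print("rho = 1: the user and the adversary act at the same time")
else:
    print(f"rho = {rho:.2f}: the user acts slower than the adversary")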
In this analysis, we are interested in determining the case for increased use of UAV analysis (N) that affords multiple users (n) to act over a distributed set of threats (M) associated with threat actors (m). For example, if:

N > 1 – the user has more observer UAVs, resulting in a faster decision cycle
n < 1 – the user has fewer actors, resulting in efficient DM
m > 1 – the adversary has more actors / no communication, so the user is less effective in DM
M < 1 – the user prevents the adversary from observation, so the intelligent user-fusion system is more effective
The resulting information utilization over N distributed sensors for users is effectively

ρ = [(N_UAV + N_OP) λ / n] / [M μ / m]    (7)
Hence a combination of an IFS with an operator increases the ability to make event detections in a timely manner (or faster than an adversary target can evade detection).

5. System Performance Evaluation

The goal of any intelligent multisensor system analysis is to obtain a fusion gain. The information fusion gain can be assessed as Measures of Effectiveness (MOEs) or Measures of Performance (MOPs). MOE/Ps can be determined from the system analysis. MOPs include throughput, time, and PD. MOEs include force protection, reduction in casualties, reduction in material loss, and reduced event occurrence. The goal here is to determine force protection given throughput, time, and PD performance. Performance metrics include throughput and timeliness. Throughput can be determined as the average rate, peak rate, and variability of the system in delivering information to the user. The average rate λ (events processed / time) is the average load provided by a source (sensor). The average rate expresses the flow that can be sustained by the sensor over an extended period of time. The peak rate tells the network what type of surge traffic must be coped with, either by dedicating data-rate capacity or by allocating sufficient buffer space to smooth out surges. Variability is measured as the throughput peak rate or source burstiness and is an indication of the extent to which target prioritization can increase efficiency. Delay (or latency) can be measured with time to assessment or delay variation. Given the transfer delay of a network in moving data from a source to a destination and a delay variation, we are also interested in the ability to make quick decisions. For this analysis, we are concerned with the proactive anticipation time:

Anticipation time = [λ − μ] t²    (8)
If we utilize the sensor/user throughput [PD = f(resolution, range)] and timeliness (λ, N), we can determine the MOEs of increased effort (reaction time), reduced risk to threats, distributed detection (PD / area / s), and force protection. Assuming sensor performance λ = PD / r², the force protection risk is [where a negative result is better]:

FP_risk = ρ²μ / (1 − ρ) = (N [PD / r²] / n) / (M [Pr(event) / Area] / m − N [PD / r²] / n)    (9)
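To illustrate how Eqs. (8) and (9), as reconstructed above, can be exercised, here is a small Python sketch; the parameter values are hypothetical and are not intended to reproduce the ρ = 1.782 or FP-Risk = −0.3418 figures reported in the next section.

def anticipation_time(lam, mu, t):
    # Eq. (8): proactive anticipation time
    return (lam - mu) * t ** 2

def fp_risk(rho, mu):
    # Eq. (9), compact form: negative values favor the defender
    return rho ** 2 * mu / (1.0 - rho)

def fp_risk_expanded(N, p_d, r, n, M, pr_event_per_area, m):
    # Eq. (9), expanded form in terms of sensor performance lambda = PD / r^2
    user = N * (p_d / r ** 2) / n
    adversary = M * pr_event_per_area / m
    return user / (adversary - user)

print(fp_risk(rho=1.5, mu=0.2))
print(anticipation_time(lam=0.5, mu=0.2, t=10.0))
print(fp_risk_expanded(N=30, p_d=0.8, r=2.0, n=10, M=50, pr_event_per_area=0.4, m=50))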
6. Scenario Simulation

For the presented scenario, we are interested in a force protection environment. We envision the deployment of multiple UAVs with EO/IR cameras that are used to detect, ID, and track targets of interest. We are also interested in the detection of dismounts (i.e., drivers exiting their car), which requires RFID and audio-ID checkpoint detection. Figure 5a shows the traffic and dismounts over a 22K × 27K ft area. Assumptions for base protection perimeter defense are: events consist of vehicles with people nearby, responsiveness updates with a 50% solution, electric high-persistence UAVs (solar powered, 10k ft, 80 mph, EO/IR, 1 ft AGR), and wireless telemetry for ground processing. We model 30 UAVs, each with an IFS with an event detection arrival rate (λ), and 50 adversary vehicles, each with one sensor, one threat direction leader, and a movement rate (μ). Assuming orthogonal azimuth detection, we determine the PD as a function of the UAV and vehicle movement. Figure 5b shows the deployed UAVs and the IFS timeliness for two simulations: (A) UAVs with EO/IR plus RFID tags and AudioID from random patrols, and (B) random patrols only. Since we utilize N UAVs, we effectively decrease our response time and can act more quickly on event detection. Using the system assessment, ρ = 1.782 and FP-Risk = −0.3418, with a proactive anticipation time of 89 seconds to prevent action (determined from the distance to move toward the protection zones). If PD increases, then the proactive anticipation time for action would increase.
Figure 5. Results from a simulated scenario augmented with real data
7. Conclusions

To determine whether an information system will be effective and efficient, we must assess the capability of the information fusion system to meet user operating goals. We have given a brief overview of fusion management design as it pertains to system evaluation and shown an end-to-end assessment of a fusion system to provide timely, accurate, and robust information to a user for force protection.

References

[1] E. Waltz and J. Llinas, Multisensor and Data Fusion, Artech House, Boston, MA, 1990.
[2] E. Blasch and S. Plano, “Level 5: User refinement to aid the Fusion Process,” Proc. SPIE 5099, April 2003.
[3] E. Blasch, M. Pribilski, et al., “Fusion Metrics for Dynamic Situation Analysis,” Proc. SPIE 5429, Aug. 2004.
[4] E. Shahbazian, et al., “Target Tracking and Identification Issues when using real data,” Fusion00.
[5] W. Koch, Target Tracking, in Advanced Signal Processing Handbook, CRC Press, 2001.
[6] A. Tchamova, J. Dezert, et al., “Target Tracking with Generalized data association based on the general DSm rule of combination,” Fusion04.
[7] D. Angelova & L. Mihaylova, “Joint Tracking and Classification with Particle Filtering and Mixture Kalman Filtering using Kinematic Radar Information,” Digital Signal Processing, 2005.
[8] J. Llinas, C. Bowman, G. Rogova, A. Steinberg, E. Waltz, & F. White, “Revisiting the JDL Data Fusion Model II,” Fusion2004.
[9] S. G. Nikolov, P. R. Hill, D. R. Bull, C. N. Canagarajah, Wavelets for image fusion, in Wavelets in Signal and Image Analysis, A. Petrosian and F. Meyer (Eds.), Comp. Imaging & Vis. Series, Vol. 19, Kluwer Academic Publishers, Dordrecht, The Netherlands, 2001, 213–244.
[10] E. Lefevre, O. Colot, et al., “A generic framework for resolving the conflict in the combination of belief structures,” Fusion00.
Evaluation of Information Fusion Techniques Part 2 – Metrics

Erik BLASCH
Air Force Research Laboratory, Dayton, OH

Abstract: Information fusion evaluation is based on a test design, pragmatic metrics, and experimental performance. For instance, a sensor management system must determine how to point the sensors based on the estimated tracker performance. Tracking performance is a function of data quality, tracker type, and target maneuverability. Many contemporary tracking methods are useful for various operating conditions. To determine nonlinear tracking performance independent of the scenario, we wish to explore metrics that highlight the tracker capability. With the emerging relative track metrics, as opposed to Root-Mean-Square error (RMS) calculations, we explore the Averaged Normalized Estimation Error Squared (ANEES) and the Non-Credibility Index (NCI) to determine tracker quality independent of the data. This paper demonstrates the usefulness of relative metrics to evaluate a model mismatch, or more specifically a bias in the model, using the probabilistic data association filter, the unscented Kalman filter, and the particle filter.
1. Information Fusion Metrics

In a dynamic targeting scenario, there are a host of algorithms that affect performance: sensor registration, measurement-to-track assignment, track-to-track association, sensor management, and, ultimately, the user. A key determinant of target-tracking success is being able to choose the right sensor with the correct measurement-to-track algorithm. Typically, a tracking algorithm is evaluated by the absolute position error (i.e., Root Mean Square (RMS) error) [1]. Using the RMS represents the combined sensor-algorithm performance, as opposed to evaluating the tracker independent of the sensor. In deployable systems, it is important to assess the tracker capability over different operational conditions (target dynamics, sensor quality, and environment obstacles). To evaluate tracker capability independent of Operating Conditions (OCs), we utilize relative metrics [1]. There are many tracking approaches, such as the joint probabilistic data association filter (JPDAF) [2], the Interacting Multiple Model (IMM) [2], the unscented Kalman filter [3], and the particle filter [4] and its variants [5]. Others have used identification information to improve tracker performance [6]. One difficulty for a user is to determine the best approach for a given scenario. To determine the tracker algorithm performance for various OCs, there is a need to study track metrics [7]. Many authors have explored various tracking metrics to determine tracker performance, such as covariance [6]. Drummond investigated various approaches and has compiled a list of absolute metrics. Chong and Mori looked at track association metrics [8]. Another approach is the Cramér-Rao lower bound [4]. One emerging set of metrics is called relative metrics [1]. Relative metrics are statistical estimators that isolate tracker performance independent of the data. The goal of this work was to
explore the use of the relative metrics with nonlinear tracking methods. We chose to apply relative metrics to the linear JPDAF, the suboptimal nonlinear unscented KF, and the optimal nonlinear particle filter. Assessing the relative metrics for nonlinear approaches required some modifications to “estimate” the density information. As shown in the paper, useful results were gained in using the relative metrics to assess tracker quality.

2. Tracking Metrics for Linear Systems

Tracking methods include many opportunities for analysis. Some metrics are listed below:

Metric | Description
Absolute Track Quality | Mean square position, velocity, acceleration error
Relative Track Quality | Mean square kinematic error relative to sensor covariance
Track Life-Time | Total time target is in track
Track Length | Distance over which target is tracked
Relative Track Length | Distance over which target is tracked relative to maneuverability
Track Purity | Percent of associations of dominant track over lifetime
Track Coverage | Area of tracking
Track Density | Number of targets tracked per area
Track Continuity | Number of individual targets associated with a given track
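By way of illustration, a couple of the absolute metrics listed above (track life-time and track purity) can be computed from a per-update association history as in the following Python sketch; the track data and time step are hypothetical.

def track_metrics(associations, dt=1.0):
    """associations: list of target IDs (or None) assigned to one track at each update."""
    in_track = [a for a in associations if a is not None]
    life_time = len(in_track) * dt                      # total time the target is in track
    dominant = max(set(in_track), key=in_track.count) if in_track else None
    purity = in_track.count(dominant) / len(in_track) if in_track else 0.0
    return life_time, dominant, purity

print(track_metrics(["T1", "T1", "T2", "T1", None, "T1"]))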
The relative track metrics include the ANEES, the Non-Credibility Index, and the Credibility Index [1]. The Averaged Normalized Estimation Error Squared (ANEES) is defined by:

ANEES = (1 / (N n)) Σ_{i=1..N} (x_i − x̂_i)^T P_i^(−1) (x_i − x̂_i)    (1)

where (x_i − x̂_i) is the state estimation error, P_i is the error covariance provided by the estimator in the i-th run, n is the state dimension, and N is the number of runs in the MC test. If the estimation error and the estimated covariance match, then the ANEES ≈ 1 and the filter is credible [1]. The credibility test is an assessment of the Chi-square estimator capability, i.e., a check as to whether the sensor is credible. When the ANEES is outside the 95% probability interval, the estimator is not credible. For the vector case, the NCI is the sample average of 10 Log10(ρ) (analogous to SNR), where

Credibility = ρ = [(x_i − x̂_i)^T P_i^(−1) (x_i − x̂_i)] / [(x_i − x̂_i)^T Σ_i^(−1) (x_i − x̂_i)]    (2)

where Σ_i is the actual MSE covariance of the sensor, and

Non-Credibility Index (NCI) = (10 / N) Σ_{i=1..N} Log10(ρ_i)    (3)
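A minimal NumPy sketch of Eqs. (1)-(3) is given below; the simulated estimation errors and filter covariances are placeholder values chosen only to illustrate the computation.

import numpy as np

def anees(errors, covs):
    # Eq. (1): errors has shape (N, n); covs has shape (N, n, n)
    N, n = errors.shape
    nees = [e @ np.linalg.inv(P) @ e for e, P in zip(errors, covs)]
    return float(np.sum(nees)) / (N * n)

def nci(errors, covs, sigma):
    # Eqs. (2)-(3): sigma is the actual (n, n) MSE covariance
    rhos = [(e @ np.linalg.inv(P) @ e) / (e @ np.linalg.inv(sigma) @ e)
            for e, P in zip(errors, covs)]
    return 10.0 / len(rhos) * float(np.sum(np.log10(rhos)))

rng = np.random.default_rng(0)
n, N = 2, 30
sigma = np.diag([4.0, 1.0])                                   # actual MSE covariance
errors = rng.multivariate_normal(np.zeros(n), sigma, size=N)  # simulated estimation errors
covs = np.repeat(np.diag([2.0, 0.5])[None, :, :], N, axis=0)  # optimistic self-assessed covariance
print("ANEES:", anees(errors, covs), "NCI:", nci(errors, covs, sigma))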
If NCI >> 0, P_i is low (optimistic); if NCI << 0, P_i is high (pessimistic). Many estimators/filters provide a self-assessment of their estimation errors: the error covariance. However, are these self-assessments (error covariances) close to the actual MSE, and if so, how close? A credibility test determines how much we can trust these self-assessments. An estimator is Credible if its actual error and self-assessment are statistically equal (e.g., their difference is statistically insignificant). It is Optimistic if its self-assessment is statistically smaller than the actual error and Pessimistic if its self-assessment is statistically larger than the actual error. An estimator is usually not credible because its assumptions / model / approximations are not accurate. In a classic example of estimating the position of a target moving in a circle, Figure 1 shows the value of relative metrics. The two credibility measures (NCI and ANEES) differ with respect to their equitable treatment of optimism versus pessimism. The NCI is a geometric average, with the mean near the mode; hence optimism and pessimism are treated equitably. The ANEES is an arithmetic average and hence severely penalizes optimism.
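As an aside, the 95% credibility check described above can be sketched as follows, assuming (as is standard for the NEES statistic) that N·n·ANEES is chi-square distributed with N·n degrees of freedom; the threshold logic is illustrative only.

from scipy.stats import chi2

def credibility_label(anees_value, N, n, alpha=0.05):
    # 95% probability interval for the ANEES under the chi-square assumption
    dof = N * n
    lo, hi = chi2.ppf(alpha / 2, dof) / dof, chi2.ppf(1 - alpha / 2, dof) / dof
    if anees_value > hi:
        return "optimistic (self-assessed covariance too small)"
    if anees_value < lo:
        return "pessimistic (self-assessed covariance too large)"
    return "credible"

print(credibility_label(1.8, N=30, n=2))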
Figure 1. Example. Circle data, no bias, Extended Kalman filter, Q = 1, R = 10, 30 Monte Carlo runs
Figure 2 shows the case of a maneuvering target, where the credibility metrics show behavior hidden by the absolute metrics. The NCI indicates that track confidence is dependent on target direction. The performance analysis reveals un-modeled system dynamics and sub-optimal covariance estimates.
Figure 2. Linear case for the relative metrics
3. Nonlinear Tracking Using the Optimal Particle Filter (PF)

In data association tracking approaches, tracking is typically provided through position measurements. For linear Gaussian cases, this is acceptable. However, for nonlinear cases the Gaussian assumption is not viable. Of recent interest is the particle filter method, which uses a series of particles to estimate the likelihood of a target location. The sampling approaches approximate the posterior density by a set of samples: the Unscented Kalman Filter (UKF) uses a small number of deterministically chosen samples, while the particle filter uses a large number of random (Monte Carlo) samples, as shown in Figure 3a.
Figure 3. Re-sampling/selection and diversification
Classical Monte Carlo methods for dynamic systems, such as Particle Filters (PFs), are capable of tracking complex nonlinear systems with noisy measurements. PFs approximate the probability distribution with a set of samples or particles. PFs have a number of characteristics that make them attractive: they are nonparametric (they can represent arbitrary distributions), can handle hybrid state spaces, can handle noisy sensing and motion, and can easily be balanced to meet performance objectives, since the number of particles can be adjusted to match the available computation. The design of a PF is based on four ideas. The first idea is the approximation of a continuous support distribution p(x_{0:t} | y_{1:t}) by N discrete samples x_{0:t}^(i), “randomly” drawn from p(x_{0:t} | y_{1:t}), for i = 1, …, N:

p(x_{0:t} | y_{1:t}) ≅ (1/N) Σ_{i=1..N} δ(x_{0:t} − x_{0:t}^(i))    (4)
where the subscript “0:t” or “1:t” indicates the observation interval from 0 or 1 to t and δ(•) is the Dirac delta function. This is the so-called Monte Carlo method. The second idea is importance sampling. In the estimation problem, the posterior distribution p(x_{0:t} | y_{1:t}) is in fact what we want to estimate from the data, and is thus not available for sampling directly. One way to get around this is to approximate the expectation over the unknown distribution p(x_{0:t} | y_{1:t}) by another expectation taken over a known, easy-to-sample distribution q(x_{0:t} | y_{1:t}), called the importance function, also
known as the proposal distribution. When these expectations are approximated by drawing N samples from the importance function (proposal distribution) q(x_{0:t} | y_{1:t}), we have:

E{g_t(x_{0:t})} = E_q{g_t(x_{0:t}) w_t(x_{0:t})} / E_q{w_t(x_{0:t})} ≅ Σ_{i=1..N} g_t(x_{0:t}^(i)) w̃_t(x_{0:t}^(i))    (5)

w̃_t^(i) = w_t^(i) / Σ_{j=1..N} w_t^(j)    (6)
The normalized importance weight w̃_t^(i) of Eq. (6) satisfies Σ_{i=1..N} w̃_t^(i) = 1. The third idea is sequential importance sampling (SIS). Under the assumptions that the underlying state corresponds to a Markov process [i.e., p(x_{0:t}) = Π_{j=1..t} p(x_j | x_{j−1})], that the observations are conditionally independent given the state [i.e., p(y_{1:t} | x_{0:t}) = Π_{j=1..t} p(y_j | x_j)], and that the importance function or proposal distribution is factorable [i.e., q(x_{0:t} | y_{1:t}) = q(x_{0:t−1} | y_{1:t−1}) q(x_t | x_{0:t−1}, y_{1:t−1}), according to Bayes’ rule], then the un-normalized importance weight can be estimated recursively as:

w_t(x_{0:t}) = w_{t−1}(x_{0:t−1}) · p(y_t | x_t) p(x_t | x_{t−1}) / q(x_t | x_{0:t−1}, y_{1:t−1})    (7)
Eq. (7) provides a mechanism to sequentially update the importance weight given the conditional proposal distribution q(x_t | x_{0:t−1}, y_{1:t−1}). Indeed, we can sample from this proposal distribution (i.e., generate N discrete samples x_t^(i) according to q(x_t | x_{0:t−1}, y_{1:t−1})) and evaluate the likelihood and transition probabilities [i.e., p(y_t | x_t) and p(x_t | x_{t−1}), given by the process and measurement models] for these samples. Table 1 lists the algorithm for a generic particle filter, with additional steps discussed below.

Table 1. Generic Particle Filter [Yang and Miller, 2005]
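The table content itself is not reproduced here; as a stand-in, the following minimal Python sketch illustrates a generic SIS particle filter with a resampling step for a simple 1-D random-walk model with Gaussian noise. It is an assumption-laden illustration of Eqs. (4)-(7), not the algorithm of [Yang and Miller, 2005], and the noise parameters are hypothetical.

# Minimal generic particle filter sketch (1-D random walk, Gaussian likelihood).
import numpy as np

def particle_filter(measurements, n_particles=500, q=1.0, r=2.0, seed=0):
    rng = np.random.default_rng(seed)
    particles = rng.normal(0.0, 1.0, n_particles)        # initial samples x_0^(i)
    weights = np.full(n_particles, 1.0 / n_particles)
    estimates = []
    for y in measurements:
        # 1. Propagate through the process model (proposal = transition prior).
        particles = particles + rng.normal(0.0, np.sqrt(q), n_particles)
        # 2. Update weights with the likelihood p(y_t | x_t), Eq. (7).
        weights *= np.exp(-0.5 * (y - particles) ** 2 / r)
        weights /= weights.sum()                          # normalization, Eq. (6)
        # 3. Resample (selection) when the effective sample size is small.
        if 1.0 / np.sum(weights ** 2) < n_particles / 2:
            idx = rng.choice(n_particles, n_particles, p=weights)
            particles, weights = particles[idx], np.full(n_particles, 1.0 / n_particles)
        estimates.append(np.sum(weights * particles))     # posterior mean estimate
    return estimates

print(particle_filter([0.2, 0.7, 1.5, 2.1])[-1])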
To avoid degeneracy, the fourth idea is importance resampling, also called selection, in which samples with low importance weights are eliminated while samples with high importance weights are multiplied, keeping the total population of samples at the same level. Techniques for resampling include sampling importance resampling (SIR), residual resampling, and minimum variance sampling. Since the selection step favors the creation of multiple copies of the “fittest” particles (thus allowing us to track the updated distributions), many “unfit” particles may end up with few or no copies, leading to sample impoverishment. To solve this problem, an additional step is therefore needed to introduce sample diversification after the selection step without affecting the validity of the approximation. A brute-force approach would increase the number of samples. A more refined technique is to implement a Markov chain Monte Carlo (MCMC) step, which moves new particles to areas of more interest in the state space by applying a Markov chain transition kernel, as shown in Figure 3b. Particle filters have computational and representational advantages over other Bayesian techniques. The main problem is that a large number of particles is often needed to maintain a reasonable approximation of the state probability distribution and to detect rapid maneuvers, which requires maintaining and updating large numbers of particles. This is typically not practical due to limited computation. However, small particle sets do not provide reasonable approximations, because they are unlikely to represent quick maneuver changes and their estimates have a high variance. The balance between the number of particles and tracker performance is assessed using the NCI estimator. To address the sample size and tracker performance, we use the relative metrics by approximating P in Eqs. (1)-(3) as a function of the particles.

4. Relative Performance Analysis

We compared the PDAF, UKF, and PF using the relative metrics ANEES and NCI versus RMS. The scenario is presented in Figure 4, which includes a long-run target maneuvering scenario to assess the difficulty of a tracker losing track. In the investigation, the number of particles was varied between 200 and 5000. Figure 5 presents the absolute metrics of the trackers. These absolute metrics mask the performance of the tracker relative to the data quality. Thus, we evaluated the tracker estimation quality, or robustness.
Figure 4. Sampling results
Figure 5. Absolute RMS errors
Next, we show performance results to detect a model mismatch or bias in the estimator for the tracking problem using nonlinear methods. Figures 6-7 demonstrate nonlinear tracking using the NCI and ANEES to determine tracker performance independent of the data.
Figure 6. UKF credibility analysis
Figure 7. PF credibility analysis
5. Discussion & Conclusions

We looked at the standard RMS track metrics, which show the sensor-tracker accuracy; however, we desired to assess tracker performance independent of the data. To do this, we applied the relative track metrics. In a series of simulation experiments, the NCI demonstrated the ability to detect a model mismatch or bias in the results. Also, we were able to implement the NCI in nonlinear tracking systems. When estimating the particle region, there is an issue with the PF re-sampling number versus the relative metric analysis of the MSE over the estimator covariance. We showed sensitivity metrics that address tracker quality independent of measurements, which are useful for nonlinear tracker evaluation. Regions of optimism could spawn particles, and regions of pessimism could reduce particles to save computation. Additionally, the NCI and ANEES could control the spread of particles. Future work will explore (1) extending the tracker sensitivity metrics to multiple target tracking, (2) target maneuverability, and (3) addressing simultaneous track and identification. These metrics assist in fusion evaluation for sensor management cost function optimization and user performance validation.

References
[1] X. R. Li & Z. Zhao, “Evaluation of Estimation Algorithms – Part 1: Local Performance Measures,” submitted to IEEE AES, 2004.
[2] S. Blackman and R. Popoli, Design and Analysis of Modern Tracking Systems, Artech House, Boston, 1999.
[3] S. J. Julier, J. K. Uhlmann, & H. Durrant-Whyte, “A New Approach for Filtering Nonlinear Systems,” Amer. Control Conference, 1995.
[4] B. Ristic, S. Arulampalam, & N. Gordon, Beyond the Kalman Filter: Particle Filters for Tracking Applications, Artech House, 2004.
[5] D. Angelova & L. Mihaylova, “Joint Tracking and Classification with Particle Filtering and Mixture Kalman Filtering using Kinematic Radar Information,” Digital Signal Processing, 2005.
[6] E. Blasch and L. Hong, “Data Association through Fusion of Target Track and Identification Sets,” Fusion00.
[7] R. Rothrock & O. Drummond, “Performance Metrics for Multiple-Sensor, Multiple-Target Tracking,” in SPIE, Vol. 4048, pp. 521–531, 2000.
[8] C. Y. Chong, “Problem Characterization in Tracking/Fusion Algorithm Evaluation,” IEEE AES Mag., 2001.
[9] C. Yang & M. Miller, “Nonlinear Filtering Techniques for GNSS Data Processing,” ION 61st Ann. Mtg., MITRE Corp. & Draper Lab., 27-29 June 2005, Cambridge, MA, pp. 690–703.
Rapid and Reliable Content Based Image Retrieval

Dimo T. DIMOV
Institute of Information Technologies at Bulgarian Academy of Sciences
e-mail: [email protected]
Abstract. The paper concerns an open problem in the area of Content Based Image Retrieval (CBIR). A new constructive method is proposed for effective CBIR indexing, either fast or noise tolerant, that is well suited to the conventional database (DB) specifics of image database (IDB) storage. The method’s consistency is proven by analyzing the most widespread viewpoints on CBIR: the IDB viewpoint and the computer vision one. Different techniques based on the method are briefly described, and results of real tests of these techniques with an Experimental Image Retrieval System are also considered.

Keywords. Content Based Image Retrieval (CBIR), Image DB (IDB), fast and noise tolerant CBIR.
Introduction

Content Based Image Retrieval (CBIR) is a relatively new area of Informatics that covers techniques for automatic (or automated) retrieval of image and/or video objects by features (e.g., color, texture, shape, movement, etc.). Current CBIR systems are often qualified as being in an “early” stage because of the predominantly simple statistics (histograms) of the features considered. However, these simple statistics are usually preferred to more sophisticated structures like contours, trajectories, etc., because of the well-known difficulties encountered in the segmentation of graphical objects, even in cases of low levels of image noise [1, 2, 3]. There are known CBIR systems in world practice, such as TRADEMARK (1992), QBIC (1993), Virage (1995), Photobook (1997), ARTISAN (1999), PictureFinder (2002), etc., but most of these can only be associated with this “early” CBIR. Similarly, while a large number of image databases (IDBs) are available on the Internet, they also need proper retrieval tools [4, 5, 6]. An ordinary CBIR system includes a multimedia database for keeping images and/or video data and for maintaining typical user queries, a graphical user interface for query wording and result visualization, and suitable indexing techniques for feature vector storage as well. These indexing techniques are currently being intensively researched to find an effective method for either fast or noise tolerant CBIR. The paper proposes a new constructive method for effective CBIR indexing well suited to conventional DB specifics. The method’s consistency is proved (Section 1.3) by analyzing both of the most widespread viewpoints on CBIR: the DB viewpoint (Section 1.1) and the image processing and recognition viewpoint (Section 1.2). Three techniques are
considered for method realization (Section 2): image contour analysis, image wavelet analysis, and image Fourier analysis. The techniques are briefly described from a speed (fast retrieval) perspective (Sections 2.2 and 2.3) and from a noise tolerance perspective (Section 2.4). The techniques have been tested (Section 3) on both aspects of interest, performance speed and noise tolerance, via the so-called Experimental Image Retrieval System (EIRS), operating on test IDBs consisting of approximately 4000 to 14000 hallmark images.
1. Basic Formulation

The most general peculiarity of CBIR systems is that they are usually developed on top of conventional DB Management Systems (DBMS), so they inherit all the DBMS advantages and shortcomings. The DBMS methods for fast (indexed) data access are generally optimized for textual and numerical data and are inappropriate for other types of objects, e.g., images and/or video. The well known extensions of these methods, e.g., R-trees, k-D trees, etc. [1, 4, 5], were generally developed for the so-called GIS (Geographic Information Systems) technologies, which is why they do not directly meet CBIR specifics. More sophisticated CBIR approaches, borrowed from pattern recognition areas, generally lead to a so-called sequential access to the images of the IDB, which is also unacceptable, especially in the case of large IDBs.

Another popular approach to speeding up CBIR (from a DB viewpoint) is to pre-annotate the IDB via a structure of textual descriptors (keys), e.g., following the so-called Vienna convention, well known in patent offices' practice. The approach is simple but also inappropriate, because it requires a user-operator's involvement, which either slows down the retrieval or, in the case of large IDBs, prevents it. Nevertheless, this interactive DB support approach is definitely promising for future CBIR systems that will search by logical and/or abstract queries. For instance, recent research [7] proposes a pre-annotation approach applied not to the whole IDB but only to a portion of it (~1%). Definite CBIR hopes are also linked to machine learning approaches, for both short- and long-term learning [3]. Finally, classical computer vision approaches to CBIR should also be mentioned [1, 2, 3, 4].

Actually, CBIR should be considered from at least four viewpoints [1], namely:
• Computer Vision (CV), and more precisely Image Analysis and Pattern Recognition (IAPR);
• Databases (DB) and DB Management Systems (DBMS);
• Artificial Intelligence (AI) and especially Machine Learning (ML), as well as knowledge DBs;
• Graphical User Interfaces (GUIs), as well as multimodal UIs in CBIR that aid the human operator.

Figure 1. Different viewpoints on CBIR

This paper is limited to the first two viewpoints, considered through the following "formula":
CBIR = IAPR + IDB    (1)
i.e., CBIR will be discussed as an interdisciplinary area between IAPR and IDB; see also Figure 1.

1.1. DB & DBMS Viewpoint on CBIR

A given relational table for IDB description (see Figure 2) can be represented as:

\big( (i,\ a_1^{(i)}, a_2^{(i)}, \ldots, a_k^{(i)},\ b_1^{(i)}, \ldots, b_{n-k}^{(i)}),\ 1 \le i \le N \big) \;\equiv\; \big( (\mathrm{ID}) \,\|\, (A_1) \,\|\, (A_2) \,\|\, \cdots \,\|\, (A_k) \,\|\, (B_1) \,\|\, \cdots \,\|\, (B_{n-k}) \big) \qquad (2)
where i is the record identification code (ID) and N = |IDB|. The left (top) side of the above equation gives the rows' interpretation, and the right (bottom) side, where the symbols ∥ separate the columns, gives the columns' interpretation (i.e., same-name features). The columns are split into two groups, {a} and {b}, which can be interpreted as:
• informative {a} and unessential {b} features (from an IAPR viewpoint), or
• coordinates {a} of a k-dimensional (k-D) feature space, and (n−k)-dimensional vector objects {b} therein, n > k (from another IAPR viewpoint), or
• primary {a} and secondary {b} keys (from a DB viewpoint), or
• key fields {a} (primary and/or secondary ones) and ordinary fields {b} for extra information, etc.
For instance, the so-called BLOB fields [8], where IDBs usually keep images, can be considered of type {b}.

Figure 2. A simplified logical scheme of IDB for CBIR

The two basic types of DB access methods of interest are hereinafter referred to as Sequential Access Methods (SAMs) and Index Access Methods (IAMs). SAMs are restricted by the so-called Peano principle, i.e., a given record (j) can be read if, and only if, the preceding record (j−1), 2 ≤ j ≤ N, has already been read; thus the average access time is ~O(N), N = |DB|. IAMs ensure a much faster access time, ~O(log N), and can be built on any field of either {a} or {b} type. To do this, a linear order over the chosen field should be defined. In DBMS terms, this means constructing a separate index file for this field, also called a key field. Each field A (or B) of the table, see Eq. (2), takes a symbolic value, so that it can always realize a key via the natural order S over these values:

j = S(a^{(j)}), \quad 1 \le j < N, \quad a^{(j)} \le a^{(j+1)}, \quad a^{(\cdot)} \in A \qquad (3)
Briefly, using the order S, the DBMS index on the field A returns a pointer (the ID) to the DB object whose feature value is the greatest one less than (or equal to) the given (input) value a.
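To make this correspondence concrete, the following minimal sketch (hypothetical, not the author's EIRS code; table and field names are invented for illustration) shows an IDB table in the sense of Eq. (2), with an indexed key field of type {a}, a BLOB field of type {b}, and the index lookup just described: the pointer (ID) of the object whose key value is the greatest one not exceeding a given input value.

```python
# Hypothetical sketch (not the EIRS implementation): a conventional DB table in the
# sense of Eq. (2), one indexed key field {a} and a BLOB field {b} holding the image,
# plus the IAM-style lookup of Section 1.1 in ~O(log N).
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE idb (id INTEGER PRIMARY KEY, key_a REAL, image BLOB)")
con.execute("CREATE INDEX idx_key_a ON idb (key_a)")   # the IAM index on field A

records = [(1, 0.12, b"...img1..."), (2, 0.47, b"...img2..."), (3, 0.81, b"...img3...")]
con.executemany("INSERT INTO idb VALUES (?, ?, ?)", records)

def lookup(a: float):
    """Pointer (ID) to the object whose key value is the greatest one <= a."""
    return con.execute(
        "SELECT id, key_a FROM idb WHERE key_a <= ? ORDER BY key_a DESC LIMIT 1", (a,)
    ).fetchone()

print(lookup(0.5))   # -> (2, 0.47)
```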
1.2. CBIR from an IAPR Viewpoint

Most commonly, when an image I is input to an IAPR system, the output O is expected to be informative enough about the class C with which the image is affiliated; see Figure 3. The classes C_m, m = 1, 2, …, M, that the system has to recognize are often considered non-crossing subsets of the so-called world C of the system:

C_m \cap C_n = \emptyset \ (m \ne n), \quad C_m \subset C, \ C_n \subset C, \quad C = \bigcup_{m=1}^{M} C_m \qquad (4)
and the recognition algorithm can be expressed by the following test sequence:

m := 1; \ \text{test } (I)R(C_m): \begin{cases} \text{false:} & m := m + 1, \ \text{repeat the test}, \\ \text{true:} & O := m, \ \text{end}. \end{cases} \qquad (5)
where (·)R(·) is a correspondence relation based on our confidence that I ∈ C_m or I ∉ C_m, m = 1, …, M. Usually, the (I)R(C) test is difficult to perform directly because of the possible variety of objects, which is why this relation is properly extended to a triple relation of similarity S(A, B, s), where A and B are two arbitrary objects from the world C and s is their degree (measure) of similarity. Most often, the relation S is represented by a similarity function S:

s = S(A, B), \quad A \in C, \ B \in C, \ s \in (0, 1] \qquad (6)

perhaps defined only for a part of C. Thus, the (I)R(C) test can be set up as:

(I)R(C_m) = \text{true} \iff (\forall n \ne m)\big( s(I, A) \le s(I, B), \ \forall A \in C_n, \ \forall B \in C_m \big) \qquad (7)
which is enough to describe many of the well known IAPR methods, e.g., the Nearest Neighbor (NN) method, the k-NN one, etc. [9]. The definition of an appropriate function S for a concrete application (world C) is a currently open problem of IAPR. Several approximations are usually admissible, e.g., the potential function method known from nonlinear discriminant analysis. Using a metric in the world C, or at least a distance function defined only for the important object pairs therein, is also very popular.
Figure 3. A general IAPR interpretation of CBIR, where: (IA&P) = Image Acquisition & Preprocessing, (S&D) = Segmentation & Decomposition, (RbP) = Recognition by Parts, (CoR) = Composition of Result, (FV) = Final Verification, with a DB of Standards (Examples), and (FBs) = possible Feedbacks.
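As an illustration of the classification test of Eqs. (5)–(7), the following sketch implements the nearest-neighbour rule with a similarity function derived from the Euclidean distance of Eq. (9). The class labels, exemplars, and the particular form of S below are illustrative assumptions, not taken from the paper.

```python
# Hypothetical sketch of the (I)R(Cm) test of Eqs. (5)-(7) via a similarity function
# s = S(A, B) in (0, 1]; here S is a monotone transform of the Euclidean distance.
# The nearest-neighbour rule assigns I to the class holding the most similar exemplar.
import math

def similarity(a, b):
    d = math.dist(a, b)          # Euclidean distance, Eq. (9)
    return 1.0 / (1.0 + d)       # maps distances into (0, 1]

def classify_nn(x, classes):
    """classes: dict mapping a class label to a list of exemplar feature vectors."""
    best_label, best_s = None, -1.0
    for label, exemplars in classes.items():
        s = max(similarity(x, e) for e in exemplars)
        if s > best_s:
            best_label, best_s = label, s
    return best_label, best_s

classes = {"circle-like": [(0.9, 0.1), (0.8, 0.2)], "text-like": [(0.1, 0.9)]}
print(classify_nn((0.85, 0.15), classes))   # -> ('circle-like', ...)
```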
To perform the correspondence and/or similarity tests, information about the input image is necessary. Two basic approaches are used for that, namely:
• Vector representation of I, I ~ (x_1, x_2, …, x_k), as a fixed feature string of length k, where the value x_i of each feature, i = 1, …, k, can be measured (evaluated).
• Structural representation of I, I ~ H = (V, E, …), where the structure H can generally be considered a graph, with V the set of vertices, E the set of edges among them, respective attribute values over V and E, and so on.
Without loss of generality, we can restrict ourselves to the vector representation only, which naturally leads to a representation of the world C as a linear vector space C of features:

C \equiv \{\, X \mid X = (x_1, x_2, \ldots, x_k) \,\} \qquad (8)
with k the number of dimensions (features), and where each object X is represented as a point (a k-D vector). Thus, we reach the often used Euclidean distance (which is also a metric):

D(A, B) = \lVert A - B \rVert, \quad \lVert X \rVert = (x_1^2 + x_2^2 + \cdots + x_k^2)^{1/2}, \quad X = (x_1, x_2, \ldots, x_k), \ A \in C, \ B \in C, \ X \in C \qquad (9)
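For instance, a fixed k-dimensional vector representation of an image and the distance of Eq. (9) might look as follows. The gray-level histogram used here is only an illustrative feature choice, not the feature set of the proposed method.

```python
# Illustrative sketch (not the paper's feature set): an image reduced to a fixed
# k-dimensional feature vector, here a normalized gray-level histogram, so that
# objects become points of the space C of Eq. (8), compared by the metric of Eq. (9).
import numpy as np

def histogram_features(gray_image: np.ndarray, k: int = 16) -> np.ndarray:
    hist, _ = np.histogram(gray_image, bins=k, range=(0, 255))
    return hist / max(hist.sum(), 1)          # normalize, so image size does not matter

rng = np.random.default_rng(0)
img_a = rng.integers(0, 256, size=(64, 64))
img_b = rng.integers(0, 256, size=(64, 64))
d = np.linalg.norm(histogram_features(img_a) - histogram_features(img_b))
print(f"Euclidean distance D(A, B) = {d:.4f}")
```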
Ignoring certain details, we can assume that the chosen k features representing the world C are informative enough and that they can be ordered by importance, for definiteness in descending order. Because of the CBIR context, we consider only objects belonging to the world of images.

1.3. Correspondence between the Two Basic Viewpoints

In this way, the Eq. (1) formulation of CBIR takes on the following more concrete sense:
• CBIR inherits most of the IAPR methods, assuming that the preliminary knowledge for the task being solved is stored in an IDB. The IAPR space of features, Eq. (8), corresponds to the relational table description of the IDB, see Eq. (2).
• The tasks solved (from a DB viewpoint) are: (i) search by example, and (ii) search by a group of examples (category search). They correspond (from an IAPR viewpoint) to: (j) identification (recognition) of an object, and (jj) recognition of objects (association with a class).
• Each key (IDB table column) consists of key fields, each with a fixed length L that may vary for different keys. These fields, whose number N equals the number of records (DB table rows), store numerical values, textual strings, and/or Boolean values. From an IAPR viewpoint, a key is primarily a one-dimensional (1D) array F, of length N = |DB|, whose values are of a fixed precision respecting L. If a given key (i.e., DB object feature) is a primary one, then each of its values x(j), j = 1, …, N, corresponds to a unique (the j-th) object in the DB. If the key is a secondary one, then a given value v of it may correspond to a group of objects {j1, j2, …, jm}, with m their number, i.e., x(j1) = x(j2) = … = x(jm) = v.
• To effectively use a given conventional DB, i.e., to build a plausible CBIR index, it is necessary that the objects of the space C be orderable into a full order of the Eq. (3) type. Formally, this is assured by the described correspondence ⟨feature space⟩ ↔ ⟨DB table⟩; nevertheless, practice also requires IAPR-specific experience.

The task of association with an object class is general for the IAPR area. Usually, decision methods lead, as already mentioned, to a similarity function S and, in particular, to a distance function D among the object pairs. Thus, even when defined for the whole C, either S or D leads, in general, only to a partial order of objects (see Figure 4a), while the index definition needs a full (linear) order relation (see Figure 4b).
Figure 4. A similarity graph defining: (a) a partial order, and (b) a full (linear) order in the object world
From this follows the main idea (hypothesis) of this paper, namely: a possible (and promising) approach to an effective CBIR is to cut off the partial order defined by a given similarity function over the feature space in such a way that a full (linear) order relation is obtained, which is enough for an appropriate DB index to be built.

1.4. Illustrative Examples
Figure 5. Two areas of Mahalanobis distances defined for the objects therein. The full order is obviously impossible.
It is not difficult to show that the similarity S (see Eq. (6)) or the distance D (see Eq. (9)) appears as a full order in the world C only in some particular cases. It is enough to remember that both S and D are symmetric two-place relations, while an order relation, whether full or partial, is anti-symmetric by definition. In other words:
(i) If a metric is defined for the set C, e.g., by Eq. (9), it cannot be directly applied to obtain a full order, except when the "triangle inequality" from the metric definition degenerates into an equality, for instance in the 1D case (k = 1). Thus, to define a key respecting the given metric, it is necessary to appropriately "cut off" the full "graph of distances". From the combinatorially large number of possibilities, we have to choose only one, the key we need. This approach of ordering, object by object, is often called "scanning" of the space C, e.g., R-trees in a 2D C [5]. Here we primarily consider scanning by coordinates (i.e., features), ordering them by importance, also known as hyper-rectangle trees [1].
(ii) If the available distance is not a metric, then it is quite possible that an appropriate cut-off does not exist. The so-called Mahalanobis distance [9] is an example of this; see also Figure 5.
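One possible (illustrative) realization of such a cut-off, in the spirit of the scanning-by-coordinates idea above, is to quantize the importance-ordered features and concatenate them into a single fixed-length composite key; the lexicographic order of these keys is a full (linear) order that any conventional DBMS index can maintain. The sketch below is a simplified assumption about such a key construction, not the exact form used by the proposed method.

```python
# A minimal sketch of the "cut-off" idea of Sections 1.3-1.4 (one possible choice,
# not the only one): quantize the importance-ordered features and concatenate them
# into a single composite key string, inducing a full (linear) order that a
# conventional DBMS index can handle; nearby objects tend to share key prefixes.
def composite_key(features, bits_per_feature=8):
    """features: values in [0, 1), already ordered by descending importance."""
    levels = 1 << bits_per_feature
    parts = []
    for f in features:
        q = min(int(f * levels), levels - 1)          # quantize one coordinate
        parts.append(format(q, "02x"))                # fixed-width hex digits
    return "".join(parts)                             # lexicographic = linear order

keys = sorted(composite_key(v) for v in [(0.10, 0.90), (0.11, 0.20), (0.80, 0.50)])
print(keys)   # -> ['19e6', '1c33', 'cc80'], fully ordered, indexable as text
```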
2. A Promising Approach to the “Rapid and Reliable CBIR” Problem
2.1. About the Images of Interest

Graphic images, which will be the main focus hereinafter, can be considered to be reproduced by a small number of colors or half tones (i.e., gray intensities), whose respective areas are of a sufficient size. Ideally, these images should be centered, i.e., the essential graphics should be in the "middle" of the image frame, while the area near the frame should be filled only with the background color (or half tone). Graphic images are intensively used in patent office activities, e.g., for the registration of companies, firms, etc., and are popularly known as "hallmarks", "trademarks", or simply "marks" [10, 11]; see also Figure 6.
Figure 6. Three examples: (a) a halftone (gray) image, (b) an almost B/W one, and (c) colors converted to gray
Depending on the success with a chosen type of image, the results obtained could, without difficulty, be extended to a larger spectrum of images using the "split and own" (divide and conquer) principle, which is very popular in the IAPR area; see also the module pair (RbP) and (CoR) in Figure 3.
2.2. Three CBIR Techniques for Fast IDB Access

Keeping in mind the cut-off idea (see Section 1.3), three techniques for image key derivation have been proposed, considering both the CBIR necessities and the conventional DBMS limitations:
(T1): a heuristic decomposition of images into contextual contour parts, cf. [12, 13];
(T2): a two-dimensional wavelet transform, cf. [14]; and
(T3): a two-dimensional Fourier transform (FT) with heuristic modifications, cf. [15].
Although applying different recognition principles (and/or feature spaces), the above three techniques are built on the following common statements [13, 15]:
• The search content is the input image itself or a sketch of it.
• The most essential image data are automatically extracted and arranged in a key string of a fixed length (following the chosen technique, T1, T2, or T3).
• The fast access is performed using conventional IAMs of the given DBMS.
• Noise tolerance is treated in the same way by each technique, i.e., in parallel with the processing-speed characteristic of the basic IAM applied.
The proposed techniques are illustrated in Figures 7 and 8 on the example of Figure 6a.
Figure 7. Feature space scanning and IDB keys derivation by the three techniques proposed
No decomposition of the input image is applied, in the sense of Figure 3. The final IDB verification is performed on the basis of the image key, i.e., a lossy-compressed but sufficiently informative representation of the image, reflecting a given user perception of the importance of the image content. Specific problems connected with each technique's invariance against accidental rotation, scaling, and/or translation of the images are solved according to that technique's specifics.
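As a rough illustration of how such a fixed-length key can be derived (a hedged sketch of the T3 idea only; the exact transform, coefficient selection, and quantization in [15] may differ), the low-frequency magnitudes of a 2D Fourier transform can be quantized and packed into a string suitable for a conventional IAM:

```python
# Hedged sketch of a T3-style key (not the exact algorithm of [15]): take the
# low-frequency magnitudes of the 2D Fourier transform, i.e., the "most essential
# image data", quantize them, and pack them into a fixed-length key string that a
# conventional DBMS index can store and order.
import numpy as np

def fourier_key(gray_image: np.ndarray, side: int = 4) -> str:
    spectrum = np.abs(np.fft.fft2(gray_image))
    spectrum[0, 0] = 0.0                          # drop the DC term (mean brightness)
    block = spectrum[:side, :side].ravel()        # low-frequency coefficients
    block = block / (block.max() or 1.0)          # reduce sensitivity to contrast
    quantized = (block * 255).astype(np.uint8)    # fixed precision per coefficient
    return quantized.tobytes().hex()              # fixed-length key string

rng = np.random.default_rng(1)
img = rng.integers(0, 256, size=(64, 64)).astype(float)
print(fourier_key(img))    # 32 hex characters: 16 coefficients, one byte each
```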
2.3. Combination of the Proposed Techniques

The T1 technique is rotationally invariant by definition (because of the 1D FT applied) but is sensitive to noise, while T2/T3 have quite the opposite characteristics. Thus, attempts to improve the proposed method have been focused on modifications of T3, as follows:
Figure 8. Modifications of the T3 technique: T3a, T3b and T3c, after a preliminary Simple Polar Mapping
(T3a): applies a simple polar mapping of the input I before its 2D FT, cf. [15, 16];
(T3b): similar to T3a, but with a vertical 1D FT and a horizontal 1D WT instead of the 2D FT [15];
(T3c): similar to T3b, but a 1D CosFT [17] is used horizontally instead of the 1D WT.
These modifications are illustrated in Figure 8 on the same "ant" image of Figure 6a; the idea behind the preliminary polar mapping is sketched below.
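The effect of the preliminary polar mapping can be sketched as follows (an illustrative assumption about the principle, not the exact mapping of [15, 16]): resampling on a polar grid turns a rotation about the image centre into a cyclic shift along the angle axis, and the magnitude of a 1D Fourier transform taken along that axis is invariant to such shifts.

```python
# Hedged sketch of the T3a idea (a simple polar mapping before the Fourier step).
# Grid sizes and nearest-neighbour sampling are illustrative choices.
import numpy as np

def polar_map(gray: np.ndarray, n_r: int = 32, n_theta: int = 64) -> np.ndarray:
    h, w = gray.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    radii = np.linspace(0, min(cy, cx), n_r)
    angles = np.linspace(0, 2 * np.pi, n_theta, endpoint=False)
    rr, tt = np.meshgrid(radii, angles, indexing="ij")
    ys = np.clip(np.round(cy + rr * np.sin(tt)).astype(int), 0, h - 1)
    xs = np.clip(np.round(cx + rr * np.cos(tt)).astype(int), 0, w - 1)
    return gray[ys, xs]                               # (n_r, n_theta) polar image

def rotation_tolerant_signature(gray: np.ndarray) -> np.ndarray:
    polar = polar_map(gray)
    return np.abs(np.fft.fft(polar, axis=1))          # FFT magnitude along the angle axis

rng = np.random.default_rng(2)
img = rng.integers(0, 256, size=(65, 65)).astype(float)
print(rotation_tolerant_signature(img).shape)         # -> (32, 64)
```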
2.4. Noise Tolerance Interpretation

A given object X = (x_1, x_2, …, x_k) of the feature space C can be considered an erroneous version of another object F = (f_1, f_2, …, f_k), by the formula:

X = F + \varepsilon, \quad \varepsilon = (\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_k), \quad \varepsilon \in C \qquad (10)
where the vector ε is usually called the error vector. One object (e.g., X) can be interpreted as the input object (for recognition), while the other, F, as an object already stored in the IDB; this does not depend on where or when the error occurred. A 2D illustration is given in Figure 9.

Figure 9. Errors' model in the feature space

If a metric is defined for the space C, then the error rate can be evaluated by the norm (module) of ε:

\lVert \varepsilon \rVert = (\varepsilon_1^2 + \varepsilon_2^2 + \cdots + \varepsilon_k^2)^{1/2}, \quad \varepsilon \in C \qquad (11)
On the other hand, it is most often considered statistically that the error probability diminishes polynomially (or even exponentially) as the distance ‖ε‖ to the centre F increases. Thus, we arrive at the definition of an error domain O_err, F ∈ O_err ⊂ C, and even of the domain of most probable errors. For unimodal error probability densities these domains are often single, convex regions; for Gaussian densities they occur as hyper-ellipses, or even spheres when the error is uniformly distributed over the separate coordinates of C. A well known rule for noise suppression (by minimum distance to the original vector F) can be written as:
D(F, F + \varepsilon) \le d_{\min} = \tfrac{1}{2} \min\{\, D(A, B) \mid A, B \in E^n \equiv C, \ A \ne B \,\}, \quad F \in C, \ \varepsilon \in C \qquad (12)
For index conformity, error domains are most often considered rectangular and parallel to the respective coordinate axes, i.e., hyper-rectangles. One more popular IAPR approach, which can be interpreted as a noise tolerance aspect, is feature space squeezing, or, more precisely, reducing the number k of the space's dimensions, e.g., using the Karhunen–Loève approach of principal components. Both of the above approaches are exploited by the techniques (T1, T2 and T3a,b,c), i.e.:
• For a given key of the IDB, the coordinate axes of C are ordered (enumerated) by their importance. The enumeration should be defined in advance, possibly by intuitive rules complying with user preferences.
• Error domains are interpreted as tolerances for the generated keys and represent hyper-rectangles.
The above description reflects the so-called "regular noise" problem. The real image experiments encountered another problem, which we call the "rough artifacts' noise" problem. An example of such noise is given in Figure 6b, where noisy artifacts surround the mark situated in the middle (a piece of newspaper appears under the mark). The approaches necessary to mitigate rough noise are beyond this paper's scope.
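A minimal sketch of this tolerance interpretation is given below; the per-coordinate tolerances, key values, and names are illustrative assumptions, and a real system would push the range conditions down into the DBMS index rather than scan keys in memory.

```python
# Hedged sketch of the hyper-rectangle tolerance idea of Section 2.4: a per-coordinate
# tolerance around each stored key turns the noise-tolerant search into simple range
# conditions, which a conventional index supports. Keys and tolerances are illustrative.
import numpy as np

def within_tolerance(query: np.ndarray, stored: np.ndarray, tol: np.ndarray) -> bool:
    """True if the query lies inside the hyper-rectangle 'stored +/- tol'."""
    return bool(np.all(np.abs(query - stored) <= tol))

stored_keys = {"mark_017": np.array([0.42, 0.10, 0.77]),
               "mark_203": np.array([0.05, 0.55, 0.31])}
tol = np.array([0.05, 0.05, 0.05])                    # per-coordinate tolerance

query = np.array([0.44, 0.08, 0.79])                  # a noisy version of mark_017
hits = [name for name, key in stored_keys.items() if within_tolerance(query, key, tol)]
print(hits)                                           # -> ['mark_017']
```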
3. The Experimental Image Retrieval System (EIRS)

A software system, which we call EIRS, has been designed to support experiments with the proposed method for rapid and reliable CBIR [18]. Considering its possible application area, EIRS remains perhaps closest to the ARTISAN system, which was designed for the needs of the Royal Patent Office [19]. ARTISAN searches for mark images, generally emphasizing the more common shapes (line cuts, circle cuts, ellipses, etc.) as well as their relations (parallelism, closeness, etc.). EIRS primarily searches by arbitrary image examples, generally focusing on processing speed and noise tolerance. As an instrumental (research) software tool, EIRS allows the abovementioned techniques (T1, T2 and T3a,b,c) to be tested concurrently on test IDBs consisting of 4000–14000 images. The retrieval time per image is a few seconds (3–5 s), depending mostly on the access time of the HDD used. The average error rate for regular image noise (produced by accidental linear transforms during image acquisition) is about 2%, i.e., under arbitrarily introduced noise, fewer than 2% of the IDB images fail to win the first position in the similarity list of the respective retrieval experiment. EIRS is written in C++ and operates in a Windows 98/NT/XP environment on IBM PC compatibles.
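The quoted error rate can be reproduced in principle by the following kind of experiment (a hedged sketch, with synthetic keys and Gaussian key noise as placeholders for the real hallmark IDB and the acquisition-induced distortions): for each stored image, query with a noisy copy and count how often the original fails to win the first position in the similarity list.

```python
# Hedged sketch of a top-1 error-rate measurement (data and retrieval function are
# placeholders, not the EIRS code): the fraction of noisy queries whose original
# does not come first in the similarity list.
import numpy as np

def top1_error_rate(keys: np.ndarray, noise_sigma: float, rng) -> float:
    misses = 0
    for i, key in enumerate(keys):
        noisy = key + rng.normal(0.0, noise_sigma, size=key.shape)
        dists = np.linalg.norm(keys - noisy, axis=1)   # similarity list by distance
        if int(np.argmin(dists)) != i:                 # original not in first position
            misses += 1
    return misses / len(keys)

rng = np.random.default_rng(3)
keys = rng.random((1000, 16))                          # 1000 stored keys, k = 16
print(f"top-1 error rate: {top1_error_rate(keys, 0.05, rng):.3%}")
```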
4. Conclusion

The paper proposes, and proves the consistency of, a universal CBIR superstructure over the index access methods of a conventional DB & DBMS used for image storage. This superstructure, which we call "rapid and reliable CBIR", respects user requirements for both processing speed and noise tolerance. From a computer vision viewpoint, it can be interpreted as a specific technology for image recognition backed by an after-the-fact verification against a glossary, i.e., a DB of image examples. It is assumed that the proposed method is not limited to images, but can also be applied to 1D objects (e.g., sounds, music, speech), to time series of images (video clips or short movies), and/or to multimedia. There are no limitations concerning the volume of the respective DB and/or its location, local or on the Internet. The main limitation of the method, namely the "object against background" assumption, is currently being investigated to determine whether it can be relaxed, which, if successful, would enlarge the method's possible applications. Definite hopes in this respect are connected with a near-future combination of the proposed techniques, namely between T1, as image context decomposition by respective sub-object contours, and T2/T3, as techniques more resistant to a larger spectrum of image noise.
Acknowledgements

This research is supported by the following grants of the Institute of Information Technologies at the Bulgarian Academy of Sciences (BAS): (i) Grant #010056/2003 of BAS; (ii) Grant #I-1306/2003 of the National Science Fund (NSF) at the Bulgarian Ministry of Education & Science (MES); (iii) Grant #RC6/2004 of the CICT Development Agency at the Bulgarian Ministry of Transport & Communication; as well as by (iv) the R&D grant of the Greek–Bulgarian Collaboration Project on "Study of Biomedical Data by the Methods of Multiresolution Analysis and Polysplines", 2005–2006, managed by the NSF at the Bulgarian MES.
References
[1] A.W.M. Smeulders, M. Worring, S. Santini, A. Gupta, and R. Jain, Content-Based Image Retrieval at the End of the Early Years, IEEE Trans. on PAMI, 22, 12 (2000), 1349–1380.
[2] J.P. Eakins, Towards intelligent image retrieval, Pattern Rec. J., 35 (2002), 3–14.
[3] N. Vasconselos and M. Kunt, Content-based retrieval from databases: current solutions and future directions, in: Proceedings of ICIP'2001, Oct. 7–10, 2001, Thessaloniki, Greece: IEEE, Vol. 3, 6–9.
[4] Y. Rui, S.H. Thomas, and S.-F. Chang, Image Retrieval: Past, Present, and Future, 1998, 1–17 (http://citeseer.nj.nec.com/cs, keyword 'CBIR', huang97image.pdf).
[5] C. Zaniolo, S. Ceri, C. Faloutsos, R.T. Snodgrass, V.S. Subrahmanian, R. Zicari, Advanced Database Systems, Morgan Kaufmann Publ., Inc., San Francisco, CA, 1997.
[6] P.L. Stanchev, Content Based Image Retrieval Systems, in: Proceedings of CompSysTech'2001, Sofia, Bulgaria, 2001, P.1.1–6.
[7] H.T. Nguyen and A.W.M. Smeulders, Everything Gets Better All the Time, Apart from the Amount of Data, Third Int. Conf. on Image and Video Retrieval (CIVR'2004), Dublin, Ireland, July 21–23, 2004, Proceedings, Lecture Notes in Computer Science 3115, Springer, 2004, 33–41.
[8] K. Reinsdorph, Teach Yourself Borland C/C++ Builder in 14 Days, Borland Press, Indianapolis, 1998.
[9] M. Sonka, V. Hlavac, and R. Boyle, Image Processing, Analysis, and Machine Vision, 2nd edition, PWS Publ. at Brooks-Cole Publ. Co, ITP, Pacific Grove, CA, 1998.
[10] International Hallmarks on Silver, collected by TARDY, Paris, France, 1981, 429–523.
[11] Official bulletin of the Bulgarian Patent Office, ISSN 1310-179X, Index 20311, 10 (2001), (in Bulgarian).
[12] D.T. Dimov, Fast, Shape Based Image Retrieval, in: Proceed. of CompSysTech'2003, June 19–20, Sofia, 2003, 3.8.1–7 (http://ecet.ecs.ru.acad.bg/cst/Docs/proceedings/S3/III-8.pdf).
[13] D.T. Dimov, Fast Image Retrieval by the Tree of Contours' Content, Cybernetics and Information Technologies, BAS, Sofia, 4, 2 (2004), 15–29.
[14] D.T. Dimov, Wavelet Transform Application to Fast Search by Content in Database of Images, in: Proceed. of the IEEE Conference Intelligent Systems'2002, Sept. 10–12, 2002, Varna, Vol. I, 238–243.
[15] D.T. Dimov, Harmonic and Wavelet Transform Combination for Fast Content Based Image Retrieval, International Conf. on PDE Methods in Applied Mathematics and Image Processing, Sep. 7–10, 2004, Sunny Beach, Bulgaria (http://www.math.bas.bg/~kounchev/2004_conference/lectures/ddimov.pdf).
[16] D. Zhang and G. Lu, Enhanced Generic Fourier Descriptors for Object-Based Image Retrieval, Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP'2002), May 13–17, 2002, Orlando, Florida.
[17] E. Chu and A. George, Inside the FFT Black Box: Serial and Parallel Fast Fourier Transform Algorithms, CRC Press, Boca Raton, Florida, 2000.
[18] D.T. Dimov, Experimental Image Retrieval System, Cybernetics and Information Technologies, BAS, Sofia, 2, 2 (2002), 120–122.
[19] J.P. Eakins, K. Shields, and J. Boardman, ARTISAN – a shape retrieval system based on boundary family indexing, 1996, 1–12 (http://citeseer.nj.nec.com/cs, keyword 'CBIR', eakins96artisan.pdf).
Subject Index Agent 24, 27–28, 32, 86, 90, 128, 161, 340, 347–348, 352, 356, 358 Assignment 27, 30, 41–42, 47, 71, 94, 100, 114, 116, 118, 123, 127, 141, 143, 150, 180–183, 186, 216, 237, 307, 339, 350, 375 Belief function 33, 71, 117, 123, 125–127, 129–132, 134–136, 138–146, 149, 156, 164–166, 168–170 Classification 26, 33–35, 38–40, 66, 74, 79, 82, 92, 114, 126, 133, 144–146, 150, 164–169, 194, 219, 222, 231, 234, 254, 268, 271, 274, 278, 281, 307–308, 323–326, 328–330, 336, 374, 383 Command and control 157, 162 Completeness 69–71, 74–76, 116, 236, 326, 328–329 Constant False Alarm Rate (CFAR) 78–80, 82–83, 236–240, 242 Context information 1–5, 7, 18–19, 33, 57–58, 66, 69–70, 73, 78–84, 360 Correlation 8–9, 30, 58–60, 63–66, 68, 179–180, 196–200, 205, 207, 210–211, 224, 226–228, 236–239, 242, 257–258, 262, 319, 332 Data association 9, 41, 47–48, 142, 144, 179–180, 182–183, 260, 267, 344–345, 374–375, 378, 383 Decision making 1, 69–70, 74, 76, 85–86, 98, 125–126, 131, 150–151, 157–162, 166, 179–180, 261, 309–310, 315, 351–354, 357–362, 365–368, 371, 389 Decision support 1, 123, 157–162, 360–361, 365 Dempster–Shafer 30, 81, 84, 92, 98, 101, 114–130, 135–136, 138, 141, 145–156, 180, 183, 233, 235, 278 Denoising 281–282, 285–286, 288–289, 295
DSmT
115, 120–121, 124, 148, 179–186 Entropy 75–76, 117, 132, 183, 185, 198–199, 207, 274–275, 310, 319–320, 335 ESM 92–98, 149, 154–155 Evidence theory 71–72, 74, 76–78, 881, 83, 92, 94, 101, 118, 124–125, 145–150, 153–155, 356, 358 Feature extraction 30, 211, 223, 280 Fusion, Evidential 78, 81, 83 Fusion, Image 146, 212, 221–222, 235, 252, 254–259, 269–280, 296, 306, 322–323, 374 Fusion, Information 1–3, 23–24, 33, 39–40, 47, 54–55, 57, 69–71, 74–75, 77–78, 92, 95, 105, 123–125, 146, 148, 156–157, 161–162, 186, 252, 259–260, 269, 276, 306–307, 313–314, 337, 350, 357–358, 365–366, 370, 372–373, 375 Fusion rules 114, 117–119, 123, 269–270, 273–276 Fusion, sensor 12–13, 32, 126, 134, 211, 219, 307–308, 337, 350, 367 Fusion theories 114, 118–119, 122–123 Fuzzy logic 71–72, 75–77, 117–118, 125, 145–146, 164, 166, 168–170, 180–181, 184–186, 278, 325–326, 330, 339, 343 Game 85–91, 149, 343 Geographic Information System (GIS) 81–82, 187, 190, 277, 385 Global Positioning System (GPS) 31–32, 56–57, 62, 236–237, 313 Ground Moving Target Identification (GMTI) 11–12, 17–23, 39 Hough transform 30, 83, 279, 297–306 Image registration 187–188, 190,
192, 194–196, 198–199, 204–213, 216, 219–223, 227, 235, 322 Information quality 69–70, 76, 278, 357 Interacting Multiple Model (IMM) 21, 23, 33, 48–49, 51–55, 375 Interpolation 60, 193–194, 200–202, 207, 209, 223 Joint Directors of Laboratories (JDL) model 99, 104, 162–163, 339–340, 342, 350, 358, 367, 374 Joint Probabilistic Data Association (JPDA) 16, 182, 375–376 Kalman filter 6–8, 13, 20, 23, 30, 33–34, 36, 39–40, 48, 50–51, 54–55, 180, 183, 312, 314, 328, 342, 374–375, 377–378, 383 Land cover 81–82, 164, 167, 323 Lattice 114–115 Likelihood 6, 9, 15–17, 35–38, 43, 52, 58, 108, 125, 133–135, 138–140, 261–264, 266–267, 321, 328, 345, 347–348, 378–379 Linear programming 41, 47 Modeling 16–17, 24–25, 28–32, 68, 73–74, 90–91, 113, 150, 160, 183, 225, 241, 263, 287, 328, 350, 352, 355, 357–358, 360–361, 365 Monte Carlo 33, 38–39, 53, 236, 239–240, 261, 266–268, 307, 309, 311–312, 314, 377–378, 380 Multispectral 234–235, 255, 276, 315–316, 319–320, 322–323 Nonparametric 106, 108, 110, 112, 262, 325–326, 329, 378 Ontologies 86, 160–163, 339–343, 347–350, 353–365 OODA 158–159, 371 Particle filter 33, 36, 40, 55, 260–261, 263, 265–268, 309–310, 312, 314, 374–375, 378–380, 383 Partial Differential Equations (PDE) 277, 279–283, 395 Probability Density Function (PDF) 5–9, 18–22, 35–36, 118, 124, 149–150, 240, 261–262, 311–312, 350, 365, 394–395 Pignistic probability 71, 131, 133,
138–140, 144, 151, 154, 156, 165, 185 Plausiblility 127, 131, 133–134, 138–139, 144, 185, 354–355 Recognition, Object 188, 211, 219 Recognition, Pattern 125–126, 133–135, 138, 144–146, 158, 187, 197, 207, 220, 276, 385 Recognition, Target 78, 84, 258, 340–342 Regression 106–107, 110–111, 113, 168–169 Relevance 4, 69–71, 75–77, 257, 288, 314, 343, 357, 359 Reliability 25, 30, 49, 69–74, 76–77, 114, 119, 123, 130, 135, 142–143, 148–149, 156, 205, 309, 315, 331–332, 357, 369 Remote sensing 78, 83–84, 106–107, 145–146, 187–190, 204, 207, 212–213, 219, 221–222, 225, 233, 252–254, 256, 258–259, 278, 306, 315, 323 Resource management 27, 99–100, 163, 315, 350–351, 367 Retrodiction 1, 4, 7, 10–11, 19, 23 Segmentation 106, 109–110, 113, 146, 167, 254, 268–269, 271–276, 280–282, 285–286, 288, 295, 301, 306, 331–333, 335–337, 384, 388 Sensor management 1–5, 21–22, 27, 70, 146, 307–310, 313–314, 338, 348–349, 357, 375, 382 Sensor network 24–26, 29, 32, 56–58, 62, 67, 307–309, 311, 313–314 Sensor resolution 1, 14–16, 47 Situation assessment 99–100, 331, 339–342, 351, 354–355, 358 Situation awareness 70, 74, 105, 157, 307–308, 351 Surveillance 1, 3, 9, 11, 21, 24, 28, 32–33, 41, 78–79, 81, 83, 92, 142, 179, 188–189, 211–213, 219, 236, 238, 252–254, 259–260, 311, 322, 331–333, 337 Sythetic Apperture Radar (SAR) 23, 78–84, 92–98, 164–165, 168–169, 188–189, 221–222, 224–233, 277, 279, 281, 287, 289–297, 301–302, 306
Target detection
57, 78–84, 106, 236–242, 254, 258 Threat assessment 99–101, 104–106, 211–213, 219, 222, 233, 339–342, 347–350 Tracking 1, 3–12, 14, 16, 17–23, 32–35, 38–41, 47–49, 53–55, 66, 76, 93, 95, 98, 100, 126, 145, 179–180, 183, 185–186, 205, 211, 219, 254, 260–263, 265–268, 271, 307–309, 311–314, 331, 337, 340–342, 344, 367–369, 374–378, 381–383
Uncertainty 5, 7, 48–49, 58, 67–73, 75–77, 81, 94, 122, 125–126, 131–132, 134–136, 138, 143–146, 148, 156–157, 170, 179, 198–199, 269, 278, 310, 343–344, 356–358, 366 Underwater 56–57, 60, 66–67 Wavelet 106, 110, 146, 204–205, 209–210, 219, 253, 258–259, 264–276, 278, 320–321, 374, 385, 391, 395
Author Index
Alexiev, K. 24
Allard, Y. 78
Angelova, D. 33, 307
Asatryan, D. 106
Averbuch, A. 56
Barhen, J. 56
Blasch, E. 366, 375
Bloch, I. 164
Bonneau, O. 78
Bossé, É. 69, 157
Brasnett, P. 260
Brodsky, B. 106
Bull, D.R. 243, 252, 260, 269, 307
Canagarajah, C.N. 243, 252, 260, 269, 307
Celeste, F. 221
Chilov, N. 359
Corna, P. 277
Couture, J. 99
Dästner, K. 48
Dezert, J. 171, 179
Dimov, D.T. 384
Dixon, T.D. 252
Drozd, A. 187
Elbakary, M.I. 211
Fatone, L. 277
Fernández Canga, E. 243, 252
Florea, M.C. 148
Foresti, G.L. 331
Gajda, J. 324
Garvanov, I. 236
Germain, M. 78
Goretta, O. 221
Grenier, D. 148
Hadzagic, M. 69
Imam, N. 56
Ivanyan, E. 85
Javadyan, A. 85
Jousselme, A.-L. 148
Kabakchiev, C. 236
Kasperovich, I. 187
Kausch, T. 48
Kharytonov, M.M. 315
Koch, W. 1
Konstantinova, P. 179
Korchinsky, V.M. 315
Kumar, B. 187
Kyovtorov, V. 236
Lefebvre, E. v, 69
Levashova, T. 359
Lewis, J.J. 243, 269
Loza, A. 243, 252
Ménard, E. 99
Mihaylova, L. 33, 260, 307
Milisavljević, N. 164
Munro, A. 307
Nikolov, S.G. 243, 252, 269
Nix, A. 307
Noyes, J.M. 252
O’Callaghan, R.J. 269
Opitz, F. 41, 48
Pashkin, M. 359
Plano, S. 366
Pogossian, E. 85
Rogova, G.L. 351
Safaryan, I. 106
Semerdjiev, T. 179
Smarandache, F. 114, 171
Smirnov, A. 359
Snidaro, L. 331
Sroka, R. 324
Steinberg, A.N. 339
Sundareshan, M.K. 211
Tchamova, A. 179
Troscianko, T. 252
Valin, P. 92
Vannoorenberghe, P. 125
Varshney, P.K. 187
Voloshyn, V.I. 315
Vose, M. 56
Wardlaw, M. 56
Xu, M. 187
Zeglen, T. 324
Zirilli, F. 277